I want to create a new column that holds the sum of Lives for the "IN" and "SA" values of the "BillType" column, per TimePeriodId. That way I would have a single "total lives" entry for each TimePeriodId. I have gone through a lot of documentation but cannot figure out how to do this in my case. Code sample:
sa = pd.read_sql(sa_q1, sql_conn)
# convert TimePeriodId to string values
sa['TimePeriodId'] = sa['TimePeriodId'].astype(str)
sa = sa.loc[(sa['BillType'] == 'SA') | (sa['BillType'] == 'IN')]  #.drop(['BillType'], axis = 1)
sa.head(10).to_dict()
# the last line returns the following:
{'TimePeriodId': {1: '201811', 2: '201811', 4: '201812', 5: '201812', 9: '201901', 11: '201901', 13: '201902', 14: '201902', 17: '201903', 18: '201903'},
 'BillType': {1: 'IN', 2: 'SA', 4: 'IN', 5: 'SA', 9: 'SA', 11: 'IN', 13: 'IN', 14: 'SA', 17: 'IN', 18: 'SA'},
 'Lives': {1: 1067, 2: 288028, 4: 1058, 5: 287501, 9: 293560, 11: 1068, 13: 1089, 14: 278850, 17: 1076, 18: 276961}}
Any help would be greatly appreciated!
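For reference, the per-period total described above could be sketched with a pandas groupby. This is only a sketch under assumptions: the sample rows are copied from the dictionary in the question (with a fresh default index rather than the original one), and the column name TotalLives is a hypothetical choice, not something from the original code.

```python
import pandas as pd

# Sample rows copied from the question's sa.head(10).to_dict() output
# (the original index labels are not reproduced here).
sa = pd.DataFrame({
    "TimePeriodId": ["201811", "201811", "201812", "201812", "201901",
                     "201901", "201902", "201902", "201903", "201903"],
    "BillType": ["IN", "SA", "IN", "SA", "SA", "IN", "IN", "SA", "IN", "SA"],
    "Lives": [1067, 288028, 1058, 287501, 293560,
              1068, 1089, 278850, 1076, 276961],
})

# Keep only the two bill types of interest; copy() avoids chained-assignment warnings.
sa = sa[sa["BillType"].isin(["SA", "IN"])].copy()

# Option 1: add a column with the per-TimePeriodId total, repeated on each row.
# "TotalLives" is a hypothetical column name.
sa["TotalLives"] = sa.groupby("TimePeriodId")["Lives"].transform("sum")

# Option 2: collapse to a single row per TimePeriodId.
totals = (
    sa.groupby("TimePeriodId", as_index=False)["Lives"]
    .sum()
    .rename(columns={"Lives": "TotalLives"})
)
print(totals)
```

Option 1 keeps the IN and SA rows and repeats the total on each; Option 2 is closer to "a single total lives entry per TimePeriodId".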
1 Answer
GCT1015
First, find where your blastp executable is installed, and pass that path as the cmd argument to NcbiblastpCommandline.
from Bio.Blast.Applications import NcbiblastpCommandline
blastp_path = r"C:\path\to\blastp.exe"
result = r"C:\Users\Uzytkownik\Desktop\tests\result.xml"
q = r"C:\Users\Uzytkownik\Desktop\tests\fastas\my_example2.faa"
database = r"C:\Users\Uzytkownik\Desktop\tests\my_examplemultif.faa"
blastp_cline = NcbiblastpCommandline(cmd=blastp_path, query=q, db=database, evalue=0.001, outfmt=5, out=result)
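If you now run print(blastp_cline), it should print the full command that will be executed. Double-check that the command works by copying this output and running it from the command line. If it does, then
stdout, stderr = blastp_cline()
should work as well.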
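The same check (build the command, print it, then run it) can also be sketched without the Biopython wrapper, using only the standard library. This is an alternative under assumptions, not the method above: build_blastp_command and run_blastp are hypothetical helper names, and the paths are the placeholders from the answer.

```python
import subprocess


def build_blastp_command(blastp_path, query, db, out, evalue=0.001, outfmt=5):
    """Hypothetical helper: return the blastp call as an argument list."""
    return [
        blastp_path,
        "-query", query,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", str(outfmt),
        "-out", out,
    ]


def run_blastp(cmd):
    """Hypothetical helper: run the command and return (stdout, stderr).

    Raises FileNotFoundError if the executable path is wrong.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.stdout, completed.stderr


# Placeholder paths taken from the answer; adjust them to your machine.
blastp_path = r"C:\path\to\blastp.exe"
result = r"C:\Users\Uzytkownik\Desktop\tests\result.xml"
q = r"C:\Users\Uzytkownik\Desktop\tests\fastas\my_example2.faa"
database = r"C:\Users\Uzytkownik\Desktop\tests\my_examplemultif.faa"

cmd = build_blastp_command(blastp_path, q, database, result)

# Print the full command first so it can be tested by hand in a terminal.
print(" ".join(cmd))

# Once the printed command works from the command line, uncomment:
# stdout, stderr = run_blastp(cmd)
```

Passing the arguments as a list avoids shell quoting problems with Windows paths that contain spaces.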