We were given a file.tsv and need to build a function around it. One requirement is to delete every row in which a column (here called "low_confidence_variant") is True. I am struggling somewhat with this part. Also, are there any optimization suggestions? From the result we need to make a Miami plot. This is what I have done so far. Any tips would be useful:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def read_file(file, chromosome):
    df = pd.read_csv(file, sep='\t', usecols=['chromosome', 'position', 'pval', 'low_confidence_variant'])
    df.drop(['low_confidence_variant'], True)
    df.dropna()
    sub_data = df.replace({'pval': 0}, 1e-274)
    sub_data['log10'] = -np.log10(sub_data['pval'])
    chr_group = sub_data.groupby(['chromosome'])
    chromosome = chr_group.get_group(chromosome)
    return chromosome

df1 = read_file('vitamin_d.females.tsv.gz', 1)
df2 = read_file('vitamin_d.males.tsv.gz', 1)

xa = df2['position']
ya = df2['log10']
xb = df1['position']
yb = df1['log10'] * -1

fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(12, 4))
ax1.scatter(xa, ya, s=1, c="tab:blue")
ax1.set_ylabel('males $\it{-log_{10}(pval)}$')
ax1.set_title('vitamin D (nmol/L)', fontweight='bold')
ax1.axhline(-np.log10(5*10**-8), c ='darkgray', ls='--')
ax2.scatter(xb, yb, s=1, c="tab:blue")
ax2.set_ylabel('females $\it{log_{10}(pval)}$')
ax2.axhline(np.log10(5*10**-8), c ='darkgray', ls='--')
plt.xlabel('Chromosome 1 positions')
plt.subplots_adjust(hspace=.0)
plt.show()
fig.savefig(fname='miami.png', dpi=300, bbox_inches='tight', format='png')
1 Answer
largeQ
I am not entirely sure what you mean.
Say df =
A   B   low_confidence_variant
10  20  True
2   4   False
6   0   False
So after deleting the rows with low_confidence_variant = True, you should have
df =
A   B   low_confidence_variant
2   4   False
6   0   False
Is that right?
If that is what you mean:
### Add below line
df = df[df['low_confidence_variant'] != True]
and delete this line:
### Delete this line from the code
df.drop(['low_confidence_variant'], True)
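What that call does is target the entire column itself rather than the rows where it is True. Note also that drop returns a new DataFrame, so without assigning the result back, df is left unchanged.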
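To make the row filter concrete, here is a minimal sketch that rebuilds the toy table from this answer and applies the same boolean mask. The variable names filtered and filtered_alt are only for illustration, and the ~ variant assumes the column is read as a plain boolean dtype with no missing values, which may not hold for the real file.

```python
import pandas as pd

# Toy data copied from the example table in this answer.
df = pd.DataFrame({
    "A": [10, 2, 6],
    "B": [20, 4, 0],
    "low_confidence_variant": [True, False, False],
})

# Keep only the rows where the flag is not True (the line suggested above).
filtered = df[df["low_confidence_variant"] != True]

# Equivalent form, assuming the column is strictly boolean (no NaN values).
filtered_alt = df[~df["low_confidence_variant"]]

print(filtered)
```

Both forms return a new DataFrame, so the result has to be assigned (here to filtered, or back to df as in the answer) for the filtering to take effect.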