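I believe you need DataFrame.explode together with DataFrame.dropna. explode gives each list element its own row, and dropna removes the NaN rows left behind by empty lists: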
# changed data for a better sample
print(df)
  col1           col2
0    A      [1, 2, 1]
1    A             []
2    B  [3, abc, abc]
3    B          [abc]
4    C             []
df2 = df.explode('col2').dropna(subset=['col2'])
print (df2)
  col1 col2
0    A    1
0    A    2
0    A    1
2    B    3
2    B  abc
2    B  abc
3    B  abc
df2 = df2.groupby('col1')['col2'].value_counts(normalize=True).reset_index(name='%')
print (df2)
  col1 col2         %
0    A    1  0.666667
1    A    2  0.333333
2    B  abc  0.750000
3    B    3  0.250000
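For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the printed sample (the original post only shows the output), and the names `exploded`, `cleaned` and `shares` are just illustrative:

```python
import pandas as pd

# Sample frame rebuilt from the printed output above (an assumption,
# the original construction code is not shown).
df = pd.DataFrame({
    "col1": ["A", "A", "B", "B", "C"],
    "col2": [[1, 2, 1], [], [3, "abc", "abc"], ["abc"], []],
})

# explode: one row per list element; an empty list becomes a single NaN row
exploded = df.explode("col2")

# dropna: discard the NaN rows that came from the empty lists
cleaned = exploded.dropna(subset=["col2"])

# value_counts(normalize=True): share of each value within its col1 group
shares = (
    cleaned.groupby("col1")["col2"]
    .value_counts(normalize=True)
    .reset_index(name="%")
)
print(shares)
```

Note that group C disappears from the result entirely, because all of its lists were empty and its rows were dropped before grouping.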
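Edit: if the data comes from a CSV file, the lists are read back as strings, so convert them to real lists with ast.literal_eval before calling explode: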
import ast
import pandas as pd
df = pd.read_csv('beforeexplode.csv')
df['col2'] = df['col2'].apply(ast.literal_eval)
df2 = df.explode('col2').dropna(subset=['col2'])
print (df2)
     col1     col2
0    dev1  android
1    dev1  android
2    dev3     oscp
2    dev3     gpen
2    dev3      ceh
..    ...      ...
206  dev2     wcag
207  dev2    linux
207  dev2     unix
208  dev2    linux
208  dev2     unix

[460 rows x 2 columns]
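To see why the ast.literal_eval step matters, here is a small sketch that can be run without the real file. The in-memory `csv_text` is a hypothetical stand-in for 'beforeexplode.csv', and its three rows are invented for illustration:

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for 'beforeexplode.csv'; the rows are made up.
csv_text = """col1,col2
dev1,"['android']"
dev3,"['oscp', 'gpen', 'ceh']"
dev2,"[]"
"""
df = pd.read_csv(io.StringIO(csv_text))

# Straight after read_csv each cell is a plain string such as "['android']",
# so explode would leave it untouched.
cell_type_before = type(df.loc[0, "col2"])

# literal_eval safely parses the string into a real Python list
df["col2"] = df["col2"].apply(ast.literal_eval)

# Now explode works per element, and the empty list (dev2) is dropped
df2 = df.explode("col2").dropna(subset=["col2"])
print(df2)
```

ast.literal_eval only accepts Python literals (lists, strings, numbers and so on), which makes it a safer choice than eval for data read from a file.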