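3 Answers
Contributor with 1854 experience points · more than 8 upvotes
Check whether the string 'Oranges' appears in the 'Recent shopping list' column and create a new column 'Oranges Lost' based on the result: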
import numpy as np
df['Oranges Lost'] = np.where(df['Recent shopping list'].str.contains('Oranges'), 'No Change', 'Lost')
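To see what that one-liner produces, here is a minimal sketch on made-up data. The sample rows and the `+` separator are assumptions (the separator is borrowed from the other answer), and `na=False` is an addition so that a missing list counts as "does not contain Oranges".

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    'Previous shopping list': ['Apples+Oranges', 'Oranges+Salmon', 'Apples'],
    'Recent shopping list': ['Apples+Oranges', 'Salmon', 'Apples'],
})

# Boolean mask: True where the recent list mentions 'Oranges'.
# na=False (an addition here) makes missing values count as False.
has_oranges = df['Recent shopping list'].str.contains('Oranges', na=False)

# Map the mask onto the two labels used in the answer.
df['Oranges Lost'] = np.where(has_oranges, 'No Change', 'Lost')
print(df['Oranges Lost'].tolist())
```

Note that the third row is labelled 'Lost' even though 'Oranges' was never on its previous list, because this check only looks at the recent list. If that matters, the previous list has to be compared as well.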
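Contributor with 1799 experience points · more than 9 upvotes
The exact function to use for processing the data depends on the exact output you need for each combination. Hopefully the following gives you enough information to build a solution to your problem: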
# process data so each row contains a list of elements
df['PSL_processed'] = df['Previous shopping list'].str.split('+')
df['RSL_processed'] = df['Recent shopping list'].str.split('+')

def compare_items(x):
    if set(x.PSL_processed) == set(x.RSL_processed):
        return 'No change'
    elif set(x.PSL_processed) - set(x.RSL_processed):
        # a non-empty difference means an item from the previous list is missing
        return 'Lost'
    # add in conditional logic here, to meet specification

df.apply(compare_items, axis=1)
The official documentation for DataFrame.apply() is well written.
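For reference, a minimal end-to-end sketch of the row-wise comparison might look like the following. The sample data is made up, and the 'Gained' label is a hypothetical placeholder for whatever the specification requires when nothing was lost but the lists still differ.

```python
import pandas as pd

# Hypothetical sample data; items are assumed to be separated by '+'.
df = pd.DataFrame({
    'Previous shopping list': ['Apples+Oranges', 'Oranges+Salmon', 'Apples'],
    'Recent shopping list': ['Oranges+Apples', 'Salmon', 'Apples+Tuna'],
})

def compare_items(row):
    # Sets ignore item order, so 'Apples+Oranges' equals 'Oranges+Apples'.
    previous = set(row['Previous shopping list'].split('+'))
    recent = set(row['Recent shopping list'].split('+'))
    if previous == recent:
        return 'No change'
    if previous - recent:
        # Something on the previous list is absent from the recent one.
        return 'Lost'
    # Hypothetical label: nothing lost, but the recent list has extra items.
    return 'Gained'

# axis=1 passes each row to the function as a Series.
df['Status'] = df.apply(compare_items, axis=1)
print(df['Status'].tolist())
```

Because the comparison uses sets, duplicate items and ordering are ignored. If quantities matter, `collections.Counter` would be a closer fit than `set`.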
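Contributor with 1936 experience points · more than 6 upvotes
So Mark's solution does a good job of capturing the difference between the lists: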
# process data so each row contains a list of elements
df['PSL_processed'] = df['Previous shopping list'].str.split()
df['RSL_processed'] = df['Recent shopping list'].str.split()

def compare_items(x):
    return set(x.PSL_processed) - set(x.RSL_processed)
    # add in conditional logic here, to meet specification

df['Products_lost'] = df.apply(compare_items, axis=1)
print(df)
Beyond that, to pick out which of the lost products are fruit and which are fish, I used the following:
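# Fruit and Fish are lists of product names, defined elsewhere
for idx, row in df.iterrows():
    for c in Fruit:
        if c in row['Products_lost']:
            df.loc[idx, 'Fruit lost'] = c  # .ix was removed in pandas 1.0; use .loc
    for c in Fish:
        if c in row['Products_lost']:
            df.loc[idx, 'Fish lost'] = c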
Seems to work well!
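As a possible alternative to the `iterrows()` loop, the category columns can be built with set intersections. This is only a sketch: the contents of `Fruit` and `Fish`, the sample rows, and the helper name `lost_in` are all made up for illustration. Unlike the loop, which keeps only the last matching item per row, this version keeps every match.

```python
import pandas as pd

# Hypothetical category lists; the real ones are not shown in the thread.
Fruit = ['Apples', 'Oranges']
Fish = ['Salmon', 'Tuna']

# Hypothetical sample data, whitespace-separated to match str.split().
df = pd.DataFrame({
    'Previous shopping list': ['Apples Oranges Salmon', 'Oranges Tuna', 'Apples'],
    'Recent shopping list': ['Salmon', 'Oranges', 'Apples'],
})

previous = df['Previous shopping list'].str.split()
recent = df['Recent shopping list'].str.split()

# Items on the previous list that are missing from the recent one.
df['Products_lost'] = [set(p) - set(r) for p, r in zip(previous, recent)]

def lost_in(category):
    # Hypothetical helper: lost items that belong to the given category,
    # sorted so the result is deterministic.
    wanted = set(category)
    return df['Products_lost'].apply(lambda lost: sorted(lost & wanted))

df['Fruit lost'] = lost_in(Fruit)
df['Fish lost'] = lost_in(Fish)
print(df[['Fruit lost', 'Fish lost']])
```

Rows with nothing lost in a category get an empty list rather than NaN, which may or may not be what you want downstream.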