I was able to reproduce the error on synthetic data:

import pandas as pd
from datetime import datetime

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': [datetime.now(), datetime.now(), datetime.now(), datetime.now()],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A1', 'A2', 'A3', 'A4'],
                    'E': ['E1', 'E2', 'E3', 'E4']},
                   index=[0, 1, 2, 3])
df = pd.merge(df1, df2, how='left', on=['A', 'A'])

def getList(row):
    r = []
    if row["A"] == "A1":
        r.append("test-01")
    if row["B"] == "B1":
        r.append("test-02")
    if row["B"] == "B2":
        r.append("test-03")
    return r

df["NEW_COLUMN"] = df.apply(lambda row: getList(row), axis = 1)

Original post:

I want to create a new column in a pandas DataFrame based on multiple conditions. The values of the new column should be lists. However, I get "ValueError: Empty data passed with indices specified." if a list is empty.

def getList(p_row):
    r = []
    if p_row["field1"] > 0:
        r.append("x")
    ...
    return r

df["new_list_field"] = df.apply(lambda row: getList(row), axis = 1)

Full error:

ValueError                                Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4636     try:
-> 4637         blocks = form_blocks(arrays, names, axes)
   4638         mgr = BlockManager(blocks, axes)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in form_blocks(arrays, names, axes)
   4728     if len(object_items) > 0:
-> 4729         object_blocks = _simple_blockify(object_items, np.object_)
   4730         blocks.extend(object_blocks)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in _simple_blockify(tuples, dtype)
   4758     """
-> 4759     values, placement = _stack_arrays(tuples, dtype)
   4760

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in _stack_arrays(tuples, dtype)
   4822     for i, arr in enumerate(arrays):
-> 4823         stacked[i] = _asarray_compat(arr)
   4824

ValueError: could not broadcast input array from shape (2) into shape (195)
2 Answers
人到中年有点甜
Contributed 1895 experience points, received 7+ likes
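I ended up building a list of lists, converting it to a pd.Series(), and then assigning that to the new column. The dictionary key2list returns variable-length lists as its values: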
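new_col_list = []
for _, row in my_df.iterrows():
    # key2list maps each key to a list of variable length
    new_col_list.append(key2list[row[u'key']])
# pass the frame's index so the lists line up with their rows even if the index is not 0..n-1
my_df[u'new_col'] = pd.Series(new_col_list, index=my_df.index)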
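For anyone who wants to try this approach in isolation, here is a minimal, self-contained sketch. The sample rows, the contents of key2list, and the non-default index are made up for illustration; only the names my_df, key2list, 'key' and 'new_col' come from the answer above.

```python
import pandas as pd

# Hypothetical sample data (not from the original post).
my_df = pd.DataFrame({"key": ["a", "b", "c"]}, index=[10, 20, 30])
key2list = {"a": [], "b": ["x"], "c": ["x", "y"]}

# Build one Python list per row, in row order.
new_col_list = [key2list[k] for k in my_df["key"]]

# Wrapping the lists in a Series stores each list as a single cell.
# index=my_df.index keeps every list on its own row, because column
# assignment aligns a Series on index labels rather than on position.
my_df["new_col"] = pd.Series(new_col_list, index=my_df.index)

print(my_df)
```

Empty lists are stored like any other value here, since the column is built directly instead of going through `apply`.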
慕姐4208626
Contributed 1852 experience points, received 7+ likes
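The length of the function's output varies from row to row, and lists of unequal length cannot be assigned to a new pandas column through apply in this way. You can verify the varying lengths like this: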
for idx, row in df.iterrows():
    print(getList(row))
One alternative is to convert the output to a string:
def getListString(row):
    r = ''
    if row["A"] == "A1": r += "test-01"
    if row["B"] == "B1": r += "test-02"
    if row["B"] == "B2": r += "test-03"
    return r
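A possible way to wire the string variant into a DataFrame is sketched below. The comma separator and the snake_case helper names are my own assumptions, not part of the answer; the separator is only there so the individual labels can be split apart again later. The sample frame is a cut-down version of the one in the question.

```python
import pandas as pd

# Cut-down version of the question's frame (columns A and B only).
df = pd.DataFrame({"A": ["A0", "A1", "A2", "A3"],
                   "B": ["B0", "B1", "B2", "B3"]})

def get_list(row):
    """Collect the labels that apply to one row (same rules as the question)."""
    r = []
    if row["A"] == "A1":
        r.append("test-01")
    if row["B"] == "B1":
        r.append("test-02")
    if row["B"] == "B2":
        r.append("test-03")
    return r

def get_list_string(row, sep=","):
    """Hypothetical helper: join the labels into one string per row."""
    return sep.join(get_list(row))

# Each call returns a scalar string, so apply yields a plain Series.
df["NEW_COLUMN"] = df.apply(get_list_string, axis=1)

print(df)
```

Rows with no matching condition end up with an empty string rather than an empty list.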