
How do I split a pandas DataFrame with varying column counts into separate DataFrames?

Python

慕慕森 2022-06-22 16:24:26

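I have a large pandas DataFrame that is made up of a varying number of columns throughout. Here is an example: (image: current DataFrame example). I want to split the DataFrame into multiple DataFrames according to the number of columns it has. Example of the desired output: (image: output). Thanks.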


2 answers

沧海一幻觉

1824 experience points contributed, more than 5 likes received

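If I understand correctly, you want to split one existing DataFrame with n columns into ceil(n/5) DataFrames, each with 5 columns, where the last one holds the remaining n % 5 columns when n is not a multiple of 5.

If that is the case, this will do the job: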

import math

import pandas as pd

max_cols = 5
dt = {"a": [1, 2, 3], "b": [6, 5, 3], "c": [8, 4, 2], "d": [8, 4, 0], "e": [1, 9, 5], "f": [9, 7, 9]}
df = pd.DataFrame(data=dt)

# take consecutive slices of at most max_cols columns each
dfs = [df[df.columns[max_cols * i:max_cols * i + max_cols]]
       for i in range(math.ceil(len(df.columns) / max_cols))]

for el in dfs:
    print(el)

This prints:

   a  b  c  d  e
0  1  6  8  8  1
1  2  5  4  4  9
2  3  3  2  0  5
   f
0  9
1  7
2  9
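The same chunking can be wrapped in a small reusable function. This is only a sketch: the name `split_columns` is hypothetical, and it steps through the column positions with `range` instead of computing `ceil(n / max_cols)` up front.

```python
import pandas as pd


def split_columns(df, max_cols):
    """Split df into consecutive column chunks of at most max_cols columns."""
    if max_cols < 1:
        raise ValueError("max_cols must be at least 1")
    # range() with a step yields the starting position of every chunk;
    # the last slice is simply shorter when the count is not a multiple.
    return [
        df.iloc[:, start:start + max_cols]
        for start in range(0, df.shape[1], max_cols)
    ]


df = pd.DataFrame({"a": [1, 2, 3], "b": [6, 5, 3], "c": [8, 4, 2],
                   "d": [8, 4, 0], "e": [1, 9, 5], "f": [9, 7, 9]})
chunks = split_columns(df, max_cols=5)
print([list(chunk.columns) for chunk in chunks])
```

With the six-column frame above and `max_cols=5`, this gives one chunk with columns `a` to `e` and a second chunk with only `f`.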

2022-06-22

波斯汪

1811 experience points contributed, more than 4 likes received

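If you have a DataFrame with, say, 10 columns, and you want the records with 3 NaN values to go into one result DataFrame and the records with 1 NaN into another, you can do it as follows: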

# count the number of NaNs in each row
num_counts = df.isna().sum(axis='columns')

# group by this number and add each grouped
# dataframe to a dictionary
results = dict()
for key, sub_df in df.groupby(num_counts):
    results[key] = sub_df

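After running this code, results contains subsets of df in which every row has the same number of NaNs (and therefore the same number of non-NaN values).

If you want to write the results to an Excel file, just run the following code: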

with pd.ExcelWriter('sorted_output.xlsx') as writer:
    for key, sub_df in results.items():
        # if you want to avoid the detour of using dictionaries,
        # just replace the previous line by
        # for key, sub_df in df.groupby(num_counts):
        sub_df.to_excel(
            writer,
            sheet_name=f'missing {key}',
            na_rep='',
            inf_rep='inf',
            float_format=None,
            index=True,
            header=True)

Example:

import pandas as pd

# create an example dataframe
df = pd.DataFrame(dict(a=[1, 2, 3, 4, 5, 6], b=list('abbcac')))
# assigning to a new column through .loc leaves the other rows as NaN
df.loc[[2, 4, 5], 'c'] = list('xyz')
df.loc[[2, 3, 4], 'd'] = list('vxw')
df.loc[[1, 2], 'e'] = list('qw')

It looks like this:

Out[58]:
   a  b    c    d    e
0  1  a  NaN  NaN  NaN
1  2  b  NaN  NaN    q
2  3  b    x    v    w
3  4  c  NaN    x  NaN
4  5  a    y    w  NaN
5  6  c    z  NaN  NaN

If you run the code above on this DataFrame, you get a dictionary with the following contents:

0:    a  b  c  d  e
   2  3  b  x  v  w

1:    a  b  c  d    e
   4  5  a  y  w  NaN

2:    a  b    c    d    e
   1  2  b  NaN  NaN    q
   3  4  c  NaN    x  NaN
   5  6  c    z  NaN  NaN

3:    a  b    c    d    e
   0  1  a  NaN  NaN  NaN

The dictionary keys are the number of NaNs in a row, and the values are DataFrames containing only the rows with that number of NaNs.
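If each resulting DataFrame should also lose the columns that are empty for all of its rows, which may be closer to what the question asks for, one possible sketch is to call `dropna(axis="columns", how="all")` on every group. The names `nan_counts` and `compact` are illustrative and not part of the answer above.

```python
import pandas as pd

# rebuild the example dataframe so the snippet runs on its own
df = pd.DataFrame(dict(a=[1, 2, 3, 4, 5, 6], b=list("abbcac")))
df.loc[[2, 4, 5], "c"] = list("xyz")
df.loc[[2, 3, 4], "d"] = list("vxw")
df.loc[[1, 2], "e"] = list("qw")

# group rows by how many values are missing, then drop the columns
# that contain nothing but NaN within each group
nan_counts = df.isna().sum(axis="columns")
compact = {
    key: sub_df.dropna(axis="columns", how="all")
    for key, sub_df in df.groupby(nan_counts)
}

for key, sub_df in compact.items():
    print(f"{key} missing -> columns {list(sub_df.columns)}")
```

Rows with the same NaN count can still be missing different columns, as in the group with 2 NaNs here, so that group keeps all five columns. Grouping on the pattern of missing columns rather than on the count would need a different key.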

2022-06-22
