4 Answers

Contributor with 1890 experience points · more than 9 likes
# Given a DataFrame x:
>>> import pandas as pd
>>> x = pd.DataFrame([['PartNo',12],['Meas1',45],['Meas2',23],['!END',''],['PartNo',13],['Meas1',63],['Meas2',73],['!END',''],['PartNo',12],['Meas1',82],['Meas2',84],['!END','']])
>>> x
         0   1
0   PartNo  12
1    Meas1  45
2    Meas2  23
3     !END
4   PartNo  13
5    Meas1  63
6    Meas2  73
7     !END
8   PartNo  12
9    Meas1  82
10   Meas2  84
11    !END
# Group by the first column (0) and aggregate the values into lists.
# Column 1 of the result then holds one list per key (PartNo, Meas1, Meas2, !END).
# Converting each list to a Series produces a DataFrame with one row per key,
# so the last step is to transpose it.
>>> df = x.groupby(0).agg(lambda x: list(x))[1].apply(lambda x: pd.Series(x)).transpose()
>>> df[['PartNo','Meas1','Meas2']]
0  PartNo  Meas1  Meas2
0      12     45     23
1      13     63     73
2      12     82     84
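
For comparison, here is a minimal sketch of a different route to the same table. It is not the method from the answer above: it numbers the records and then pivots. The helper name `to_wide`, the column names `key`, `value` and `record`, and the assumption that every record starts with a `PartNo` row are all made up for illustration.

```python
import pandas as pd


def to_wide(x: pd.DataFrame) -> pd.DataFrame:
    """Reshape a two-column key/value frame into one row per record."""
    # Drop the '!END' marker rows and give the columns readable names.
    body = x[x[0] != '!END'].rename(columns={0: 'key', 1: 'value'})
    # Assumption: each record begins with a 'PartNo' row, so a running
    # count of 'PartNo' rows yields a record number for every line.
    body['record'] = (body['key'] == 'PartNo').cumsum() - 1
    wide = body.pivot(index='record', columns='key', values='value')
    wide.columns.name = None
    return wide[['PartNo', 'Meas1', 'Meas2']]


x = pd.DataFrame([['PartNo', 12], ['Meas1', 45], ['Meas2', 23], ['!END', ''],
                  ['PartNo', 13], ['Meas1', 63], ['Meas2', 73], ['!END', ''],
                  ['PartNo', 12], ['Meas1', 82], ['Meas2', 84], ['!END', '']])
result = to_wide(x)
print(result)
```

Because the record number is derived from the `PartNo` rows rather than from list positions, this variant may cope better with a record that is missing one measurement; the missing cell would simply come out as NaN.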

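Contributor with 1851 experience points · more than 3 likes
Here is how I would do it. I would parse the file as a plain text file and build records from the fields I need. I would use the '!END' line as the signal that a row is complete, append that row to a list, and finally convert the list to a DataFrame.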
import pandas as pd

filename = 'PartDetail.csv'
with open(filename, 'r') as file:
    LinesFromFile = file.readlines()

RowToWrite = []
for EachLine in LinesFromFile:
    # The attribute name is the text before the first space
    ValuePosition = EachLine.find(" ") + 1
    CurrentAttrib = EachLine[0:ValuePosition - 1]
    # The value is sliced from ValuePosition + 1 up to, but excluding, the
    # final character of the line (normally the newline), then stripped
    if CurrentAttrib == 'PartNo':
        PartNo = EachLine[ValuePosition + 1:len(EachLine) - 1].strip()
    if CurrentAttrib == 'Meas1':
        Meas1 = EachLine[ValuePosition + 1:len(EachLine) - 1].strip()
    if CurrentAttrib == 'Meas2':
        Meas2 = EachLine[ValuePosition + 1:len(EachLine) - 1].strip()
    if EachLine[0:4] == '!END':
        # '!END' closes the record: store the collected values as one row
        RowToWrite.append([PartNo, Meas1, Meas2])
PartsDataDF = pd.DataFrame(RowToWrite, columns=['PartNo', 'Meas1', 'Meas2'])  # Converting to DataFrame
This gives you a cleaner DataFrame with the columns PartNo, Meas1 and Meas2.

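Contributor with 1827 experience points · more than 4 likes
The file is not a CSV file, so parsing it with the csv module cannot produce the correct output. It is not a well-known format, so I would use a custom parser: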
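import pandas as pd

# filename is the path to the input file
with open(filename) as fd:
    data = []
    row = None
    for line in fd:
        line = line.strip()
        if line == '!END':
            row = None  # the next key/value line starts a new record
        else:
            k, v = line.split(None, 1)
            if row is None:
                row = {k: v}
                data.append(row)
            else:
                row[k] = v
# Collect the column names in order of first appearance; a set has no
# defined order, and newer pandas versions refuse a set for `columns`
header = list(dict.fromkeys(k for row in data for k in row))
df = pd.DataFrame(data, columns=header)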
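
To try the custom-parser idea without a file on disk, here is a small sketch that wraps the same logic in a function and feeds it an in-memory sample. The function name `parse_records` and the sample text are hypothetical; the thread does not show the real file, so the "key, whitespace, value" layout is an assumption.

```python
import io

import pandas as pd


def parse_records(lines):
    """Collect key/value lines into one dict per record; '!END' closes a record."""
    data = []
    row = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line == '!END':
            if row:
                data.append(row)
            row = {}
        else:
            # Assumed layout: key, whitespace, value
            key, value = line.split(None, 1)
            row[key] = value
    if row:
        data.append(row)  # keep a trailing record that has no '!END'
    return pd.DataFrame(data)


# Hypothetical sample in the assumed layout
sample = io.StringIO(
    "PartNo 12\nMeas1 45\nMeas2 23\n!END\n"
    "PartNo 13\nMeas1 63\nMeas2 73\n!END\n"
)
df = parse_records(sample)
print(df)
```

The values stay as strings here; something like `df.astype(int)` or `pd.to_numeric` could convert them once the columns are known to be numeric.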

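Contributor with 1853 experience points · more than 6 likes
Based on the information provided, I think you should be able to get what you want with this approach: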
df = df[df[0] != '!END']
out = df.groupby(0).agg(list).T.apply(lambda x: x.explode(), axis=0)
Output:
0  Meas1  Meas2  PartNo
1     45     23      12
1     63     73      13
1     82     84      12
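This groups the original df by the keys PartNo, Meas1 and Meas2 and builds a list of values for each key. Each list is then exploded into a pd.Series, which gives one column per key, with as many rows as there are entries per key (every key should have the same number of entries).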
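
As a runnable illustration, here is a small variation on the grouping idea, under the assumption that every key occurs the same number of times. Instead of exploding the lists, it builds the DataFrame directly from them, which also yields a plain 0..n-1 index rather than a repeated one. The sample data is a shortened copy of the frame used earlier in this thread.

```python
import pandas as pd

df = pd.DataFrame([['PartNo', 12], ['Meas1', 45], ['Meas2', 23], ['!END', ''],
                   ['PartNo', 13], ['Meas1', 63], ['Meas2', 73], ['!END', '']])

# Drop the record terminators, then collect the values of each key into a list.
df = df[df[0] != '!END']
lists = df.groupby(0)[1].agg(list)  # Series: key -> list of values

# Assumption: all lists have equal length; otherwise the constructor raises.
out = pd.DataFrame(lists.to_dict())
out = out[['PartNo', 'Meas1', 'Meas2']]
print(out)
```

If the lists can differ in length, a record-numbering approach is likely the safer choice, since this one relies purely on list positions lining up.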