3 Answers
Contributed 1820 experience points · received more than 2 likes
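Try something like this:
import pandas as pd

# `root` is assumed to be the lxml root element of the XML from the question
# (the .xpath() sibling axes used below require lxml).
rows = []
columns = ['assets', 'rateOfReturn', 'assetClassName', 'assetAmount']
for entry in root.xpath('//assetClassDetails'):
    # Shared values are sibling elements; per-asset values are child elements.
    row = [entry.xpath('preceding-sibling::assets/text()')[0],
           entry.xpath('following-sibling::rateOfReturn/text()')[0],
           entry.xpath('./assetClassName/text()')[0],
           entry.xpath('./assetAmount/text()')[0]]
    rows.append(row)
pd.DataFrame(rows, columns=columns)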
Output:
   assets rateOfReturn assetClassName assetAmount
0  600000          6.3          Bonds      100000
1  600000          6.3       Equities      500000
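Another interesting approach is to use a different library:
import pandas as pd
import pandas_read_xml as pdx

# Per-asset rows, read from the nested assetClassDetails elements.
df1 = pdx.read_xml(r'path\to\myfile.xml', ['portfolio', 'assetClassDetails'])
# Portfolio-level values (assets, rateOfReturn), read from the parent element.
df2 = pdx.read_xml(r'path\to\myfile.xml', ['portfolio'])
pd.concat([df2[['assets', 'rateOfReturn']], df1], axis=1)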
Output:
   assets rateOfReturn assetClassName assetAmount
0  600000          6.3          Bonds      100000
1  600000          6.3       Equities      500000
Contributed 1829 experience points · received more than 6 likes
Another way to use the package mentioned by @JackFleeting might be:
import pandas_read_xml as pdx
from pandas_read_xml import fully_flatten

df = (pdx.read_xml(r'path\to\myfile.xml', ['portfolio'])
      .pipe(fully_flatten))
Flattening expands lists (sibling tags in the XML) into separate rows, and dictionaries (child tags in the XML) into separate columns.
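To make the flattening rule concrete, here is a minimal pure-Python sketch of the same idea. It is not the implementation of `fully_flatten`; the function name `flatten`, the `.` separator in column names, and the hand-written `record` dict (mirroring the portfolio XML) are all assumptions made for illustration.

```python
from itertools import product


def flatten(value, prefix=""):
    """Return a list of flat row dicts for a nested dict/list structure.

    Illustrative sketch only: dicts widen a row (new columns),
    lists multiply rows (one row per item).
    """
    if isinstance(value, dict):
        # Flatten each key separately, then combine the results so that
        # every row carries one value for every column.
        per_key = [
            flatten(child, f"{prefix}.{key}" if prefix else key)
            for key, child in value.items()
        ]
        rows = []
        for combination in product(*per_key):
            merged = {}
            for part in combination:
                merged.update(part)
            rows.append(merged)
        return rows
    if isinstance(value, list):
        # Each list item produces its own row(s).
        return [row for item in value for row in flatten(item, prefix)]
    # Scalar leaf: a single row with a single column.
    return [{prefix: value}]


# Hypothetical nested record shaped like the portfolio XML.
record = {
    "assets": "600000",
    "assetClassDetails": [
        {"assetClassName": "Bonds", "assetAmount": "100000"},
        {"assetClassName": "Equities", "assetAmount": "500000"},
    ],
    "rateOfReturn": "6.3",
}

rows = flatten(record)
for row in rows:
    print(row)
```

The two items in the `assetClassDetails` list become two rows, and each row repeats the portfolio-level `assets` and `rateOfReturn` values. Note that in this sketch an empty list yields no rows at all, so a record containing one would be dropped.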
Contributed 1852 experience points · received more than 1 like
Below is a version that uses no external libraries:
import xml.etree.ElementTree as ET

xml = """
<portfolio>
    <assets>600000</assets>
    <assetClassDetails>
        <assetClassName>Bonds</assetClassName>
        <assetAmount>100000</assetAmount>
    </assetClassDetails>
    <assetClassDetails>
        <assetClassName>Equities</assetClassName>
        <assetAmount>500000</assetAmount>
    </assetClassDetails>
    <rateOfReturn>6.3</rateOfReturn>
</portfolio>
"""

data = []
root = ET.fromstring(xml)
# Portfolio-level values that are repeated on every row.
global_properties = {'assets': root.find('assets').text,
                     'rateOfReturn': root.find('rateOfReturn').text,
                     'type': root.tag}
for asset in root.findall('.//assetClassDetails'):
    # One dict per asset class: child tag -> text.
    entry = {x.tag: x.text for x in asset}
    entry.update(global_properties)
    data.append(entry)
for entry in data:
    print(entry)
Output:
{'assetClassName': 'Bonds', 'assetAmount': '100000', 'assets': '600000', 'rateOfReturn': '6.3', 'type': 'portfolio'}
{'assetClassName': 'Equities', 'assetAmount': '500000', 'assets': '600000', 'rateOfReturn': '6.3', 'type': 'portfolio'}
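Since the other answers end with a DataFrame, the list of dicts above can be loaded into pandas in one step. This is a minimal sketch that does bring in pandas: the `data` literal is copied from the printed output, and the `numeric_columns` list is an assumption about which columns should be numeric.

```python
import pandas as pd

# Rows copied from the printed output above; ElementTree returns all text as str.
data = [
    {'assetClassName': 'Bonds', 'assetAmount': '100000',
     'assets': '600000', 'rateOfReturn': '6.3', 'type': 'portfolio'},
    {'assetClassName': 'Equities', 'assetAmount': '500000',
     'assets': '600000', 'rateOfReturn': '6.3', 'type': 'portfolio'},
]

df = pd.DataFrame(data)

# Assumed numeric columns; convert them from strings to numbers.
numeric_columns = ['assetAmount', 'assets', 'rateOfReturn']
df[numeric_columns] = df[numeric_columns].apply(pd.to_numeric)

print(df)
```

Converting the columns matters for anything arithmetic: without it, summing `assetAmount` would concatenate strings instead of adding numbers.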