How do I convert an XML file into a nice pandas DataFrame?

Suppose I have an XML file like this:

<author type="XXX" language="EN" gender="xx" feature="xx" web="foobar.com">
  <documents count="N">
    <document KEY="e95a9a6c790ecb95e46cf15bee517651" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="bc360cfbafc39970587547215162f0db" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="19e71144c50a8b9160b3f0955e906fce" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="21d4af9021a174f61b884606c74d9e42" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="28a45eb2460899763d709ca00ddbb665" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="a0c0712a6a351f85d9f5757e9fff8946" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="626726ba8d34d15d02b6d043c55fe691" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] ]]>
    </document>
    <document KEY="2cb473e0f102e2e4a40aa3006e412ae4" web="www.foo_bar_exmaple.com"><![CDATA[A large text with lots of strings and punctuations symbols [...] [...] ]]>
    </document>
  </documents>
</author>

Can anyone suggest a better way to approach this problem?

3 Answers

缥缈止盈

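You can also do the conversion by building a dictionary for each element and then passing the collected dictionaries directly to a DataFrame: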

import xml.etree.ElementTree as ET
import pandas as pd

# Contents of test.xml:
# <?xml version="1.0" encoding="utf-8"?>
# <tags>
#   <row Id="1" TagName="bayesian" Count="4699" ExcerptPostId="20258" WikiPostId="20257" />
#   <row Id="2" TagName="prior" Count="598" ExcerptPostId="62158" WikiPostId="62157" />
#   <row Id="3" TagName="elicitation" Count="10" />
#   <row Id="5" TagName="open-source" Count="16" />
# </tags>

root = ET.parse('test.xml').getroot()

# Build one dictionary per <row> element, keeping only the attributes
# that every row has (ExcerptPostId and WikiPostId are sometimes missing).
tags = []
for elem in root:
    tags.append({
        "Id": elem.attrib["Id"],
        "TagName": elem.attrib["TagName"],
        "Count": elem.attrib["Count"],
    })

# A list of dictionaries maps directly onto DataFrame rows.
df_users = pd.DataFrame(tags)
df_users.head()
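The same idea can be adapted to the XML in the question. The following is only a rough sketch, not a definitive solution: `XML_TEXT` is a shortened stand-in for the real file with placeholder CDATA text, and the function name `author_xml_to_frame`, the `author_` column prefix, and the `text` column are my own choices rather than anything from the question.

```python
import xml.etree.ElementTree as ET
import pandas as pd

# Shortened stand-in for the question's XML (assumption: two documents,
# placeholder CDATA text instead of the real large texts).
XML_TEXT = """<author type="XXX" language="EN" gender="xx" feature="xx" web="foobar.com">
  <documents count="2">
    <document KEY="e95a9a6c790ecb95e46cf15bee517651" web="www.foo_bar_exmaple.com"><![CDATA[First text]]></document>
    <document KEY="bc360cfbafc39970587547215162f0db" web="www.foo_bar_exmaple.com"><![CDATA[Second text]]></document>
  </documents>
</author>"""


def author_xml_to_frame(xml_text):
    """Return one DataFrame row per <document> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # Prefix the <author> attributes so that the author's "web" does not
    # collide with each document's own "web" attribute.
    author_attrs = {f"author_{key}": value for key, value in root.attrib.items()}

    rows = []
    for doc in root.iter("document"):
        row = dict(author_attrs)                  # repeat author info on every row
        row.update(doc.attrib)                    # KEY and web of this document
        row["text"] = (doc.text or "").strip()    # CDATA content arrives as plain text
        rows.append(row)

    return pd.DataFrame(rows)


df = author_xml_to_frame(XML_TEXT)
print(df[["KEY", "author_language", "text"]])
```

For a file on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`. All attribute values come back as strings, so numeric columns would still need converting, for example with `pd.to_numeric`.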

2019-09-26
