
Python: building the distinct paths/tree from an XML file

慕哥9229398 2023-11-09 21:17:23
Here is a sample of the XML file:

<?xml version="1.0" encoding="utf-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header />
  <SOAP-ENV:Body>
    <ADD_LandIndex_001>
      <CNTROLAREA>
        <BSR>
          <status>ADD</status>
          <NOUN>LandIndex</NOUN>
          <REVISION>001</REVISION>
        </BSR>
      </CNTROLAREA>
      <DATAAREA>
        <LandIndex>
          <reportId>AMI100031</reportId>
          <requestKey>R3278458</requestKey>
          <SubmittedBy>EN4871</SubmittedBy>
          <submittedOn>2015/01/06 4:20:11 PM</submittedOn>
          <LandIndex>
            <agreementdetail>
              <agreementid>001       4860</agreementid>
              <agreementtype>NATURAL GAS</agreementtype>
              <currentstatus>
                <status>ACTIVE</status>
                <statuseffectivedate>1965/02/18</statuseffectivedate>
                <termdate>1965/02/18</termdate>
              </currentstatus>
              <designatedrepresentative>
              </designatedrepresentative>
            </agreementdetail>
          </LandIndex>
        </LandIndex>
      </DATAAREA>
    </ADD_LandIndex_001>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

I want to store every distinct path in the XML file that leads to text in a list. So I want something like this:

['Envelope/Body/ADD_LandIndex_01/CNTROLAREA/BSR/status', 'Envelope/Body/ADD_LandIndex_01/CNTROLAREA/BSR/LandIndex', ...]

I tried some code, but it does not work. I don't know how to pick out just the last element of a branch, or how to rebuild the whole path from the root when I switch to a different node partway down (i.e. Envelope/Body/ADD_LandIndex_01/DATAAREA...).

import xml.etree.ElementTree as et
import os
import pandas as pd
from re import search

filename = 'file_try.xml'
element_tree = et.parse(filename)
root = element_tree.getroot()
namespace = "{http://schemas.xmlsoap.org/soap/envelope/}"

def remove_namespace(string, namespace):
    if search(namespace, string):
        new_string = string.replace(namespace, '')
    else:
        new_string = string
    return new_string

Can anyone help me?

1 answer

狐的传说


You can adapt this to your actual code, but basically it should look like this:


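from lxml import etree

soap = """[your xml above]"""

# Encode to bytes: lxml rejects str input that carries an encoding declaration.
root = etree.XML(soap.encode())
tree = etree.ElementTree(root)

# Visit every text node and skip the whitespace-only ones.
for target in root.xpath('//text()'):
    if len(target.strip()) > 0:
        # Print the XPath of the element holding the text, minus the SOAP-ENV: prefix.
        print(tree.getpath(target.getparent()).replace('SOAP-ENV:', ''))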

Output:

/Envelope/Body/ADD_LandIndex_001/CNTROLAREA/BSR/status
/Envelope/Body/ADD_LandIndex_001/CNTROLAREA/BSR/NOUN
/Envelope/Body/ADD_LandIndex_001/CNTROLAREA/BSR/REVISION
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/reportId
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/requestKey
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/SubmittedBy
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/submittedOn
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/LandIndex/agreementdetail/agreementid
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/LandIndex/agreementdetail/agreementtype
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/LandIndex/agreementdetail/currentstatus/status
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/LandIndex/agreementdetail/currentstatus/statuseffectivedate
/Envelope/Body/ADD_LandIndex_001/DATAAREA/LandIndex/LandIndex/agreementdetail/currentstatus/termdate



Answered 2023-11-09
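Since the question starts from xml.etree.ElementTree and asks for the paths in a list, a standard-library variant may also be useful. The following is only a minimal sketch, not the answerer's method: the helper names local_name, text_paths and distinct_text_paths are made up for illustration, and the sample is a trimmed copy of the XML in the question. It walks the tree recursively, carries the parent path down to each child (which is what restarts the path from the root on every branch), and keeps one entry per distinct path.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (assumption: shortened for brevity).
sample = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header />
  <SOAP-ENV:Body>
    <ADD_LandIndex_001>
      <CNTROLAREA>
        <BSR>
          <status>ADD</status>
          <NOUN>LandIndex</NOUN>
        </BSR>
      </CNTROLAREA>
      <DATAAREA>
        <LandIndex>
          <reportId>AMI100031</reportId>
        </LandIndex>
      </DATAAREA>
    </ADD_LandIndex_001>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def local_name(tag):
    """Drop the '{namespace-uri}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit('}', 1)[-1]


def text_paths(element, parent_path=''):
    """Yield the slash-separated path of every element whose own text is non-blank."""
    name = local_name(element.tag)
    path = f"{parent_path}/{name}" if parent_path else name
    if element.text and element.text.strip():
        yield path
    # Each child receives the path built so far, so every branch starts at the root.
    for child in element:
        yield from text_paths(child, path)


def distinct_text_paths(root):
    """Return the distinct paths in document order (dict keys keep insertion order)."""
    return list(dict.fromkeys(text_paths(root)))


root = ET.fromstring(sample)
paths = distinct_text_paths(root)
for p in paths:
    print(p)
```

Two differences from the lxml version are worth keeping in mind. getpath() adds positional indices such as [1] and [2] when same-named siblings repeat, whereas this sketch collapses them into a single path. It also looks only at each element's own leading text and ignores tail text, which //text() would include.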