
Stripping a single element in lxml


Cats萌萌 2022-11-01 14:33:46
I need to remove an XML element while keeping its data. The lxml function strip_tags does remove elements, but it works recursively, and I want to strip a single element. I tried the answer from this post, but remove deletes the whole element.

xml = """<groceries>
  One <fruit state="rotten">apple</fruit> a day keeps the doctor away.
  This <fruit state="fresh">pear</fruit> is fresh.
</groceries>"""
tree = ET.fromstring(xml)
for bad in tree.xpath("//fruit[@state='rotten']"):
    bad.getparent().remove(bad)
print(ET.tostring(tree, pretty_print=True))

I want to get:

<groceries>
    One apple a day keeps the doctor away.
    This <fruit state="fresh">pear</fruit> is fresh.
</groceries>

What I get instead:

<groceries>
    This <fruit state="fresh">pear</fruit> is fresh.
</groceries>

I also tried strip_tags:

for bad in tree.xpath("//fruit[@state='rotten']"):
    ET.strip_tags(bad.getparent(), bad.tag)

<groceries>
    One apple a day keeps the doctor away.
    This pear is fresh.
</groceries>

But this strips every fruit element, and I only want to strip the one with state='rotten'.
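The lost sentence in the remove attempt follows from how lxml stores character data: the text after an element's closing tag lives on that element as its tail, so it leaves the tree together with the element. A minimal sketch of this, using a shortened one-line version of the XML from the question (the shortening is an assumption made for readability):

```python
from lxml import etree as ET

# Shortened, single-line variant of the question's document (illustrative only).
xml = ('<groceries>One <fruit state="rotten">apple</fruit> a day. '
       'This <fruit state="fresh">pear</fruit> is fresh.</groceries>')
tree = ET.fromstring(xml)
bad = tree.xpath("//fruit[@state='rotten']")[0]

# .text is the text before the first child; .tail is the text that
# follows the element's own closing tag.
parent_text = tree.text   # 'One '
bad_text = bad.text       # 'apple'
bad_tail = bad.tail       # ' a day. This '

# remove() takes the element out together with its tail, which is why
# the sentence after </fruit> disappears as well.
bad.getparent().remove(bad)
after_remove = ET.tostring(tree, encoding="unicode")
print(after_remove)
```

So keeping the data means copying both bad.text and bad.tail somewhere else before the element is removed.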

1 Answer

ibeautiful

Contributed 1993 experience points, received more than 5 likes

Maybe someone else has a better idea, but here is one possible workaround:


from lxml import etree as ET  # the question's code refers to lxml.etree as ET

bad = tree.xpath(".//fruit[@state='rotten']")[0]  # for simplicity, no for loop in this case
# bad.text is 'apple'; bad.tail is the text that follows </fruit>, so both are needed
txt = (bad.text or "") + (bad.tail or "")
parent = bad.getparent()
# append to the parent's existing text; this is only correct because bad is the first child
parent.text = (parent.text or "") + txt
parent.remove(bad)  # removes only this specific element (its tail goes with it)
print(ET.tostring(tree).decode())

Output:


<groceries>

  One apple a day keeps the doctor away.

  This <fruit state="fresh">pear</fruit> is fresh.

</groceries>
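The workaround above relies on the unwanted element being the first child of its parent and having no children of its own. A more general version can be sketched as follows; `unwrap` and `unwrap_with_strip_tags` are hypothetical helper names, not part of lxml, and the `__unwrap__` marker tag is an assumption that must not clash with a real tag in the document. The second helper reuses strip_tags from the question by giving the single element a temporary unique tag first.

```python
from lxml import etree as ET


def unwrap(element):
    """Hypothetical helper: drop `element` but keep its text, children and tail."""
    parent = element.getparent()
    if parent is None:
        raise ValueError("cannot unwrap the root element")
    previous = element.getprevious()
    children = list(element)
    text = element.text or ""
    tail = element.tail or ""
    if children:
        # The element's tail must end up after its last child.
        children[-1].tail = (children[-1].tail or "") + tail
    else:
        text += tail
    # Leading text goes to the previous sibling's tail, or to the parent's
    # text when the element is the first child.
    if previous is not None:
        previous.tail = (previous.tail or "") + text
    else:
        parent.text = (parent.text or "") + text
    index = parent.index(element)
    parent.remove(element)  # also discards the element's tail, already copied
    for offset, child in enumerate(children):
        parent.insert(index + offset, child)  # moves the child with its tail


def unwrap_with_strip_tags(root, element, marker="__unwrap__"):
    """Hypothetical alternative: rename one element, then strip only that tag."""
    element.tag = marker
    ET.strip_tags(root, marker)


# Shortened, single-line variant of the question's document (illustrative only).
xml = ('<groceries>One <fruit state="rotten">apple</fruit> a day. '
       'This <fruit state="fresh">pear</fruit> is fresh.</groceries>')

tree = ET.fromstring(xml)
for bad in tree.xpath("//fruit[@state='rotten']"):
    unwrap(bad)
result = ET.tostring(tree, encoding="unicode")
print(result)

tree2 = ET.fromstring(xml)
for bad in tree2.xpath("//fruit[@state='rotten']"):
    unwrap_with_strip_tags(tree2, bad)
result2 = ET.tostring(tree2, encoding="unicode")
print(result2)
```

Both variants leave the fresh fruit element untouched. The rename approach is shorter, while the manual one makes the text and tail handling explicit.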


Answered 2022-11-01