Parsing RSS XML with Python and Django

I am trying to parse 3 different RSS feeds. These are the sources:

https://www.nba.com/bucks/rss.xml
http://www.espn.com/espn/rss/ncb/news
http://rss.nytimes.com/services/xml/rss/nyt/ProBasketball.xml

For the most part, all three sources have a similar structure, apart from the URL.

I am trying to parse them into the following Feed object:

    class Feed(Base):
        title = models.CharField(db_index=True, unique=True, max_length=255)
        link = models.CharField(db_index=True, max_length=255)
        summary = models.TextField(null=True)
        author = models.CharField(null=True, max_length=255)
        url = models.CharField(max_length=512, null=True)
        published = models.DateTimeField()
        source = models.ForeignKey(Source, on_delete=models.CASCADE, null=True)

This is the Source object:

    class Source(Base):
        name = models.CharField(db_index=True, max_length=255)
        link = models.CharField(db_index=True, max_length=255, unique=True)

This is the code I use for parsing:

    import logging
    import xml.etree.ElementTree as ET

    import requests
    import maya
    from django.utils import timezone

    from aggregator.models import Feed


    class ParseFeeds:
        @staticmethod
        def parse(source):
            logger = logging.getLogger(__name__)
            logger.info("Starting {}".format(source.name))
            root = ET.fromstring(requests.get(source.link).text)
            items = root.findall(".//item")
            for item in items:
                title = ''
                if item.find('title'):
                    title = item.find('title').text
                link = ''
                if item.find('link'):
                    link = item.find('link').text
                description = ''
                if item.find('description'):
                    description = item.find('description').text
                author = ''
                if item.find('author'):
                    author = item.find('author').text
                published = timezone.now()

While I can parse each of these sources in the Python console, the Feed objects created here end up with all None or default fields. What am I doing wrong here?


1 Answer

慕哥9229398


You should use:

    for item in items:
        title = ''
        if item.find('title') is not None:  # The "is not None" part is critical here.
            title = item.find('title').text
        # And so on ...

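If you try this in a terminal:

    bool(item.find('title'))  # This is False
    item.find('title') is not None  # while this is True

The reason is that the truth value of an ElementTree Element depends on how many child elements it has, not on whether it was found. A <title> element that contains only text has no children, so it evaluates as False even though find() returned it. Whenever you want to check whether something is None, use the "if something is None" (or "is not None") construct.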
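As a rough illustration of the point above, here is a minimal sketch that runs without Django or network access. The SAMPLE_RSS document and the parse_items helper are made up for this example and are not part of the original code. It uses Element.findtext with a default, which is one way to avoid the truthiness check altogether, and it shows that a found element with only text has zero children.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for requests.get(source.link).text.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item>
    <title>Bucks win</title>
    <link>https://example.com/bucks-win</link>
  </item>
</channel></rss>"""


def parse_items(xml_text):
    """Return one dict per <item>, using '' for any missing child tag."""
    root = ET.fromstring(xml_text)
    parsed = []
    for item in root.findall(".//item"):
        parsed.append({
            # findtext returns the default only when the tag is absent.
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "author": item.findtext("author", default=""),
        })
    return parsed


root = ET.fromstring(SAMPLE_RSS)
title_el = root.find(".//item/title")

# The element exists, but it has no child elements, which is why
# "if item.find('title'):" skipped the assignment in the question.
title_found = title_el is not None
title_child_count = len(title_el)

items = parse_items(SAMPLE_RSS)
```

Here title_found is True while title_child_count is 0, and items[0]["author"] falls back to the empty string because the sample item has no <author> tag.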

2021-11-02
