How to detect changed and new items in an RSS feed?

Using feedparser or another Python library to download and parse RSS feeds, how can I reliably detect new and modified items?

So far, I've seen new items appear in feeds with publication dates earlier than the most recent item. I've also seen feed readers display the same item, republished with slightly different content, as separate items. I'm not implementing a feed reader; I just want a reasonable strategy for archiving feed data.

1 answer

It depends on how much you trust the feed source. feedparser exposes an .id attribute on feed entries, populated from the RSS guid or the Atom id, and this value is supposed to be unique within a feed; see, for example, the feedparser Atom docs. Although .id will cover most cases, a source may still publish multiple items with the same identifier (and RSS does not require a guid at all). In that case, you do not have much choice but to hash the contents of the item.
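As a rough sketch of that strategy (not a definitive implementation), the code below keys each entry on its id and keeps a content hash alongside it, so an unseen key means "new" and a known key with a different hash means "modified". The helper names (entry_key, content_hash, classify), the choice of hashed fields (title, link, summary), and the in-memory dict used as the archive are all assumptions made for illustration. Entries are treated as plain dict-like objects, which is how feedparser entries behave.

```python
import hashlib
import json


def content_hash(entry):
    """Hash the fields assumed to define an entry's content (hypothetical choice)."""
    fields = [entry.get("title", ""), entry.get("link", ""), entry.get("summary", "")]
    payload = json.dumps(fields, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def entry_key(entry):
    """Prefer the feed-supplied id, then the link, then fall back to the content hash."""
    return entry.get("id") or entry.get("link") or content_hash(entry)


def classify(entries, archive):
    """Split entries into (new, changed) and update the archive in place.

    archive is a dict mapping entry key -> last seen content hash.
    Publication dates are deliberately ignored, so back-dated items
    are still picked up as new.
    """
    new, changed = [], []
    for entry in entries:
        key = entry_key(entry)
        digest = content_hash(entry)
        if key not in archive:
            new.append(entry)
        elif archive[key] != digest:
            changed.append(entry)
        archive[key] = digest
    return new, changed
```

To use it with a live feed, pass feedparser.parse(url).entries as the entries argument and persist the archive between runs. If the source is known to reuse identifiers for unrelated items, keying on the content hash alone (or on the id and hash together) may be the safer choice, at the cost of treating every edit as a new item.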
