I pull some RSS feeds into the App Engine datastore to serve an iPhone application. I use cron to schedule RSS updates every x minutes. Each task processes only one RSS feed (which contains 15-20 items). I often get warnings about high CPU usage in the App Engine dashboard, so I'm looking for ways to optimize my code.
I am currently using minidom (since it is already available on App Engine), but I suspect it is not very efficient!
Here is the code:
    from datetime import datetime
    from xml.dom import minidom

    from google.appengine.api import urlfetch
    from google.appengine.ext import db

    # Inside the handler method that processes one feed:
    dom = minidom.parseString(urlfetch.fetch(url).content)
    if dom:
        items = []
        for node in dom.getElementsByTagName('item'):
            item = RssItem(
                key_name=self.getText(node.getElementsByTagName('guid')[0].childNodes),
                title=self.getText(node.getElementsByTagName('title')[0].childNodes),
                description=self.getText(node.getElementsByTagName('description')[0].childNodes),
                modified=datetime.now(),
                link=self.getText(node.getElementsByTagName('link')[0].childNodes),
                categories=[self.getText(category.childNodes)
                            for category in node.getElementsByTagName('category')]
            )
            items.append(item)
        # Store all items from this feed in a single batch call.
        db.put(items)

    # Helper method on the same class: concatenates the direct text nodes.
    def getText(self, nodelist):
        rc = ''
        for node in nodelist:
            if node.nodeType == node.TEXT_NODE:
                rc = rc + node.data
        return rc
Nothing much is going on here, yet the script often takes 2-6 seconds of CPU time, which seems excessive for looping over 20 or so elements and reading a few attributes.
So, is something wrong with my code, or is there a more efficient parser (one that works on App Engine) for reading RSS?
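As a point of comparison for the question above, here is a minimal sketch of the same field extraction using xml.etree.ElementTree from the standard library instead of minidom. The `SAMPLE_RSS` string and the `parse_items` function are hypothetical stand-ins (the real code would pass `urlfetch.fetch(url).content` and build `RssItem` entities), and whether this is actually cheaper on App Engine is an assumption that would need measuring.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for urlfetch.fetch(url).content.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example feed</title>
<item>
  <guid>id-1</guid>
  <title>First</title>
  <description>First item</description>
  <link>http://example.com/1</link>
  <category>news</category>
  <category>tech</category>
</item>
<item>
  <guid>id-2</guid>
  <title>Second</title>
  <description>Second item</description>
  <link>http://example.com/2</link>
</item>
</channel></rss>"""


def parse_items(xml_text):
    """Return one dict per <item>, using the same field names as RssItem."""
    root = ET.fromstring(xml_text)
    items = []
    # In RSS 2.0 the <item> elements sit directly under <channel>.
    for node in root.findall('channel/item'):
        items.append({
            # findtext returns the default ('') when the child is missing.
            'key_name': node.findtext('guid', ''),
            'title': node.findtext('title', ''),
            'description': node.findtext('description', ''),
            'link': node.findtext('link', ''),
            'categories': [c.text or '' for c in node.findall('category')],
        })
    return items


if __name__ == '__main__':
    for item in parse_items(SAMPLE_RSS):
        print(item['key_name'], item['title'], item['categories'])
```

Each dict could then be passed to the `RssItem` constructor (plus `modified=datetime.now()`) before the single `db.put(items)` call, so the datastore side stays unchanged and only the parsing step differs.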