How to Parse Namespaces using the Python RSS Parser?

In the last tutorial, we learned how to build a Python-based RSS parser. Building on that tutorial, let’s now look at parsing namespaces and namespace-specific elements.

Getting Ready

For the purpose of this tutorial, we will use the file that we created in the previous tutorial.

Parsing Namespaces

Let’s extend the RSS Aggregator file below.

import feedparser


class WhizRssAggregator():
    feedurl = ""

    def __init__(self, paramrssurl):
        self.feedurl = paramrssurl

    def parse(self):
        thefeed = feedparser.parse(self.feedurl)

        # Feed-level metadata
        print("Getting Feed Data")
        print(thefeed.feed.get("title", ""))
        print(thefeed.feed.get("link", ""))
        print(thefeed.feed.get("description", ""))
        print(thefeed.feed.get("published", ""))

        for thefeedentry in thefeed.entries:
            print(thefeedentry.get("guid", ""))
            print(thefeedentry.get("title", ""))
            print(thefeedentry.get("link", ""))
            print(thefeedentry.get("description", ""))

            # Parsing Namespaces
            for thefeednamespace in thefeed.namespaces:
                if thefeednamespace == "media":
                    # parse for Yahoo Media
                    allmediacontent = thefeedentry.get("media_content", [])
                    for themediacontent in allmediacontent:
                        # print the url attribute of each content tag
                        print(themediacontent["url"])

In the snippet above, the code that follows the Parsing Namespaces comment uses yet another powerful capability of feedparser. Referencing thefeed.namespaces returns a dictionary of the namespaces declared in the RSS XML document, keyed by prefix, and iterating over it yields each prefix in turn. The example assumes that the “media” namespace is declared in the RSS XML document.
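To make the shape of that dictionary concrete, here is a minimal sketch. The sample_namespaces dictionary is a hypothetical stand-in for thefeed.namespaces (the prefixes and URIs are assumptions for illustration), and describe_namespaces and uses_namespace are helper names invented for this example, not part of feedparser.

```python
# Hypothetical stand-in for thefeed.namespaces: a mapping of
# namespace prefix -> namespace URI. An empty prefix is assumed
# here to represent the document's default namespace.
sample_namespaces = {
    "": "http://www.w3.org/2005/Atom",
    "media": "http://search.yahoo.com/mrss/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def describe_namespaces(namespaces):
    """Return one 'prefix -> uri' line per namespace, sorted by prefix."""
    lines = []
    for prefix, uri in sorted(namespaces.items()):
        label = prefix or "(default)"
        lines.append(f"{label} -> {uri}")
    return lines


def uses_namespace(namespaces, prefix):
    """Return True if the given prefix is declared in the mapping."""
    return prefix in namespaces


for line in describe_namespaces(sample_namespaces):
    print(line)
```

With a real parsed feed, you would pass thefeed.namespaces in place of sample_namespaces. A membership test such as uses_namespace(thefeed.namespaces, "media") is also a shorter alternative to looping over every prefix and comparing it by hand.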

The media namespace uses a series of tags to define its content. Using feedparser, you can access the tag defined within a namespace by referencing it as namespace_tagname.

In this example, since we are referencing the content tags defined within the media namespace, you can simply use the get() function with the “media_content” parameter. This returns a list with one item per content tag found in the entry, and each item holds that tag's attributes. You can then iterate over the list and print each attribute. Here, print(themediacontent["url"]) prints the link to the media content, which is the url attribute of the content tag.

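Not every entry carries media content, and a content tag may lack a url attribute, so a slightly more defensive version of the loop can help. The sketch below assumes each entry behaves like a dictionary, as it does in the code above. The media_urls helper and the sample_entry data are hypothetical and exist only for illustration.

```python
def media_urls(entry):
    """Collect the url attribute of every media content item in an entry.

    Entries without media content, and items without a url,
    are skipped instead of raising an error.
    """
    urls = []
    for item in entry.get("media_content", []):
        url = item.get("url")
        if url:
            urls.append(url)
    return urls


# Hypothetical entry data shaped like the items in thefeed.entries.
sample_entry = {
    "title": "Sample entry",
    "media_content": [
        {"url": "https://example.com/a.jpg", "type": "image/jpeg"},
        {"type": "image/png"},  # no url attribute, so it is skipped
    ],
}

print(media_urls(sample_entry))
```

Inside the parse() method, you could call media_urls(thefeedentry) for each entry instead of indexing themediacontent["url"] directly.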
Many RSS documents use more than one namespace. By checking the namespaces dictionary while iterating through the document, you can handle several popular namespaces in the same parser.
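One way to keep this manageable as the number of namespaces grows is a small lookup table that maps each prefix to a handler function. This is only a sketch of that idea, not the approach used in the downloadable files. The handler names, the NAMESPACE_HANDLERS table, and the sample data are all hypothetical, and the “itunes_duration” key is an assumption that follows the namespace_tagname convention described earlier.

```python
def handle_media(entry):
    """Return the url of each media content item in the entry."""
    return [item.get("url", "") for item in entry.get("media_content", [])]


def handle_itunes(entry):
    """Return the itunes duration if present (key name is an assumption)."""
    duration = entry.get("itunes_duration")
    return [duration] if duration else []


# Hypothetical dispatch table: namespace prefix -> handler function.
NAMESPACE_HANDLERS = {
    "media": handle_media,
    "itunes": handle_itunes,
}


def collect_namespace_data(namespaces, entry):
    """Run the matching handler for each declared namespace prefix."""
    results = {}
    for prefix in namespaces:
        handler = NAMESPACE_HANDLERS.get(prefix)
        if handler is not None:  # prefixes without a handler are ignored
            results[prefix] = handler(entry)
    return results


# Hypothetical stand-ins for thefeed.namespaces and one feed entry.
sample_namespaces = {
    "media": "http://search.yahoo.com/mrss/",
    "itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd",
    "dc": "http://purl.org/dc/elements/1.1/",
}
sample_entry = {
    "media_content": [{"url": "https://example.com/clip.mp4"}],
    "itunes_duration": "12:34",
}

print(collect_namespace_data(sample_namespaces, sample_entry))
```

Supporting another namespace then means writing one more handler and adding one more row to the table, while the loop in the parser stays unchanged.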

I hope this was helpful. Have fun coding in Python.

P.S. The files for this tutorial are available for download on GitHub.
