Developing Feeds with RSS and Atom. Chapter 8: Parsing and Using Feeds
Written by Ben Hammersley
This content is excerpted from Chapter 8 of the new book, "Developing Feeds with RSS and Atom", authored by Ben Hammersley, published by O'Reilly, copyright 2005. To learn more, please visit: https://www.oreilly.com/catalog/deveoprssatom/index.html.
Parsing for Programming
The ability to display a feed on a web page is important, no doubt about it, but it's not going to really excite anyone. For real excitement, you need to be able to parse feeds inside your own programs. In this section, we'll look at the two major alternatives, MagpieRSS and the Universal Feed Parser. Both parsers are libraries; both convert feeds into native data structures; and neither cares whether a feed is RSS 1.0, RSS 2.0, or Atom. That, really, is the final word with respect to the Great Battle of the Standards; most of the time, at a programmatic level, no one cares.
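To make the idea of "native data structures" concrete, here is a minimal sketch in Python, using only the standard library, of what such a format-agnostic parser does at its core. It is not how MagpieRSS or the Universal Feed Parser is implemented; the function names parse_feed and atom_link and the shape of the returned dictionary are assumptions made for illustration, and it assumes well-formed XML and the Atom 1.0 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"  # assumes Atom 1.0
RSS1 = "{http://purl.org/rss/1.0/}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def atom_link(entry):
    """Return the href of the entry's alternate link, or '' if none."""
    for link in entry.findall(ATOM + "link"):
        # A missing rel attribute means rel="alternate" in Atom.
        if link.get("rel", "alternate") == "alternate":
            return link.get("href", "")
    return ""


def parse_feed(xml_text):
    """Return {'title': ..., 'items': [{'title': ..., 'link': ...}]}
    for an RSS 1.0, RSS 2.0, or Atom document (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    if root.tag == ATOM + "feed":
        return {
            "title": root.findtext(ATOM + "title", default=""),
            "items": [
                {"title": e.findtext(ATOM + "title", default=""),
                 "link": atom_link(e)}
                for e in root.findall(ATOM + "entry")
            ],
        }

    if root.tag == "rss":
        # RSS 2.0: no namespace, items live inside <channel>.
        ns, channel = "", root.find("channel")
        item_parent = channel
    elif root.tag == RDF + "RDF":
        # RSS 1.0: namespaced, items are siblings of <channel>.
        ns, channel = RSS1, root.find(RSS1 + "channel")
        item_parent = root
    else:
        raise ValueError("unrecognized feed format: %s" % root.tag)

    if channel is None:
        raise ValueError("feed has no channel element")

    return {
        "title": channel.findtext(ns + "title", default=""),
        "items": [
            {"title": i.findtext(ns + "title", default=""),
             "link": i.findtext(ns + "link", default="")}
            for i in item_parent.findall(ns + "item")
        ],
    }
```

Whichever of the three formats goes in, the calling code sees the same dictionary, which is why the program that consumes the feed rarely needs to know which standard it was written in. The real libraries do far more than this, including coping with feeds that are not well-formed.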
PHP: MagpieRSS
The most popular parser in PHP, and arguably the most popular in use on the Web right now, is Kellan Elliott-McCrea's MagpieRSS. As I write this, it stands at version 0.7, a low number indicative of modesty rather than product immaturity. MagpieRSS is a very refined product indeed.
To use MagpieRSS, first download the latest build from its web page at https://sourceforge.net/projects/magpierss/. There is also a weblog at https://laughingmeme.org/magpie_blog/.
Once downloaded, you're presented with a load of READMEs and example scripts, plus five include files.
To install these include files, place them in the same directory as the script that is going to use them.
Using MagpieRSS
MagpieRSS is simple to use and comes well-documented. Included in the distribution is an example script called magpie_simple.php. It looks like Example 8-3.
Running this on my own weblog's RSS 1.0 feed produces a page that looks like Figure 8-3.
Figure 8-3. A very basic display using MagpieRSS
As you can see, it's very straightforward. Taken line by line, the meat of the script goes like this:
define('MAGPIE_DIR', '../');
require_once(MAGPIE_DIR.'rss_fetch.inc');
$url = $_GET['url'];
Here, you tell PHP where Magpie's files are kept: in this case, in the parent directory of the script. Now, invoke the rss_fetch.inc library, retrieve the URL, and place it, as a string, into the variable $url:
if ( $url ) {
$rss = fetch_rss( $url );
echo "Channel: " . $rss->channel['title'] . "<p>";
echo "<ul>";
If a URL was supplied, you pass the contents of $url to fetch_rss(), which retrieves and parses the feed, and then print out a headline for the web page, containing the <title> of the <channel> and the start of an HTML list. (The HTML in this example isn't very compliant, but no matter.)
As you can see, the rest is easy to follow. It simply sets up a loop to run down the feed document and creates a link within an HTML list element from what it finds in the feed. This method of looping through the 15 or so elements in a feed is very typical.
foreach ($rss->items as $item) {
$href = $item['link'];
$title = $item['title'];
echo "<li><a href=$href>$title</a></li>";
}
Once that's done, you can close off the list and get on with other things:
echo "</ul>";
Python: The Universal Feed Parser
Mark Pilgrim's Universal Feed Parser, hosted at https://sourceforge.net/projects/feedparser/, is perhaps the best feed application ever written. It is incredibly well-done and magnificently well-documented. Furthermore, it is released under the GPL and comes with over 2,000 unit tests. Those unit tests alone would spare anyone writing their own parser months of screaming; the question remains, though, why you would write one when the UFP already exists.
It's well-documented, so the following sections will serve only to demonstrate its power.
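As a first taste, here is a rough Python counterpart to the MagpieRSS script above. It is a sketch rather than anything taken from the parser's own documentation: render_feed and fetch_and_render are hypothetical helper names, and the only parts of the library it relies on are feedparser.parse() and the feed and entries members of the result it returns. The feedparser module is a third-party package and must be installed separately.

```python
from html import escape


def render_feed(parsed):
    """Build an HTML fragment from a parsed feed.

    `parsed` is expected to look like the result of feedparser.parse():
    a `feed` mapping with channel-level data and an `entries` list of
    per-item mappings. (render_feed itself is a hypothetical helper.)
    """
    lines = [
        "<p>Channel: %s</p>" % escape(parsed.feed.get("title", "")),
        "<ul>",
    ]
    for entry in parsed.entries:
        # Escape and quote the values so odd characters in a feed
        # cannot break the page markup.
        href = escape(entry.get("link", ""), quote=True)
        title = escape(entry.get("title", ""))
        lines.append('<li><a href="%s">%s</a></li>' % (href, title))
    lines.append("</ul>")
    return "\n".join(lines)


def fetch_and_render(url):
    """Fetch and parse the feed at `url`, then render it as HTML."""
    import feedparser  # third-party; imported here so render_feed stands alone

    return render_feed(feedparser.parse(url))
```

Calling fetch_and_render() with the address of a feed returns the channel title and a linked list of its entries, whether the feed is RSS 1.0, RSS 2.0, or Atom. Unlike the PHP example, this version quotes and escapes the link and title, which is worth doing whenever feed content from someone else ends up on your own page.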
Created: March 27, 2003
Revised: June 3, 2005
URL: https://webreference.com/programing/rss_atom/1