RSS and Atom in Action: Newsfeed Formats | Part 2/Page 2 | WebReference

RSS and Atom in Action: Newsfeed Formats
Part 2

4.5.2 Atom common constructs

Atom defines a number of common constructs, attributes, and elements that are reused throughout the format. The most significant are date, text, and person. Dates are simple. A date construct is defined as an element that contains a date in date-time format as defined by RFC 3339. The text and person constructs require explanation.
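To make the date construct concrete, here is a minimal Python sketch that writes and reads date-time strings in the common RFC 3339 form. The helper names to_rfc3339 and from_rfc3339 are our own, not part of any Atom library, and the parser only handles the usual uppercase layout with whole seconds.

```python
from datetime import datetime, timezone

def to_rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 date-time string."""
    if dt.tzinfo is None:
        raise ValueError("a date construct needs an explicit UTC offset")
    text = dt.isoformat(timespec="seconds")
    # RFC 3339 allows "Z" as shorthand for a zero offset.
    return text[:-6] + "Z" if text.endswith("+00:00") else text

def from_rfc3339(text: str) -> datetime:
    """Parse an RFC 3339 date-time string (hypothetical helper)."""
    # Older versions of datetime.fromisoformat reject "Z", so map it first.
    if text[-1] in "Zz":
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)

updated = datetime(2005, 7, 31, 12, 29, 29, tzinfo=timezone.utc)
print(to_rfc3339(updated))
```

The same instant can be written with any offset, so two strings that look different may still compare equal once parsed.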

Text construct

A text construct is an element that contains text. The way the text is stored is indicated by a type attribute. If the type attribute is "text", the element contains plain text and no markup of any kind. If it's "html", the element contains text and escaped HTML markup. If type is "xhtml", the element contains unescaped XHTML markup in the form of XHTML XML elements and text. Figure 4.4 summarizes this.

[Figure 4.4: Text construct]
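The three type values can be sketched in Python with the standard xml.etree.ElementTree module. The function name text_construct is hypothetical. One detail this sketch adds from the Atom 1.0 specification rather than from the text above: xhtml content is wrapped in a single XHTML div element.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

def text_construct(tag: str, value: str, kind: str = "text") -> ET.Element:
    """Build an Atom text construct such as <title> or <summary> (sketch)."""
    elem = ET.Element(f"{{{ATOM_NS}}}{tag}", {"type": kind})
    if kind in ("text", "html"):
        # Stored as character data; the serializer escapes <, > and &,
        # so any HTML markup ends up escaped in the XML output.
        elem.text = value
    elif kind == "xhtml":
        # Stored as real XML child elements inside one XHTML <div>.
        # The value must be a well-formed XHTML fragment.
        elem.append(ET.fromstring(f'<div xmlns="{XHTML_NS}">{value}</div>'))
    else:
        raise ValueError(f"unsupported text construct type: {kind}")
    return elem

print(ET.tostring(text_construct("title", "<b>Hi</b>", "html"), encoding="unicode"))
print(ET.tostring(text_construct("title", "<b>Hi</b>", "xhtml"), encoding="unicode"))
```

With type "html" the markup travels as escaped text, while with type "xhtml" it becomes part of the XML tree that a parser can walk directly.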

Person construct

Some constructs involve more than one XML element. For example, Atom defines a person construct that is used to represent authors and contributors, such as the <author> element [6] in listing 4.3. As you can see in figure 4.5, a person construct must contain a name and may contain an email address and a URL. Just as you can extend Atom by adding new elements under <feed> and <entry>, you can do the same for a person construct.

[Figure 4.5: Person construct]
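A person construct can be sketched the same way. The helper below is hypothetical and the author details are placeholders. It enforces the required name and adds the optional parts only when they are supplied. Note that in Atom 1.0 the child elements are called <name>, <email>, and <uri>.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = "http://www.w3.org/2005/Atom"

def person_construct(tag: str, name: str,
                     email: Optional[str] = None,
                     uri: Optional[str] = None) -> ET.Element:
    """Build an <author> or <contributor> element (illustrative sketch)."""
    if not name:
        raise ValueError("a person construct must contain a name")
    person = ET.Element(f"{{{ATOM_NS}}}{tag}")
    ET.SubElement(person, f"{{{ATOM_NS}}}name").text = name
    # The email address and URL are optional.
    if email is not None:
        ET.SubElement(person, f"{{{ATOM_NS}}}email").text = email
    if uri is not None:
        ET.SubElement(person, f"{{{ATOM_NS}}}uri").text = uri
    return person

author = person_construct("author", "Example Author", uri="http://example.com/")
print(ET.tostring(author, encoding="unicode"))
```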

4.5.3 The elements of Atom

Figure 4.6 is our standard newsfeed summary diagram for Atom. Special notation in the diagram marks the elements that are common constructs (date, text, and person). Required elements are shaded.

[Figure 4.6: The XML elements that make up an Atom newsfeed]

Some important requirements are not apparent from this Atom format diagram, so let's review them. First, the feed-level requirements:

  • The feed must contain an <id> element. See section 4.5.4 for more about Atom identifiers.
  • The feed must contain a <link> with rel="self" that contains a link to the feed itself. This makes it possible for a program, which may have only a copy of a newsfeed document, to find the URL of the newsfeed.
  • The feed must include a single alternate link, meaning a <link> element with rel="alternate". Typically, a feed's alternate link references an alternate representation of the feed, such as the main page of the web site that provided the feed.
  • The author must be specified at the feed level or in each individual entry.
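The feed-level rules listed above can be expressed as a small checker. This is only a sketch of the rules as stated here, not a full Atom validator, and the function name feed_problems is our own. It relies on one fact from the Atom 1.0 specification: a <link> with no rel attribute is treated as rel="alternate".

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM = "{http://www.w3.org/2005/Atom}"

def feed_problems(feed: ET.Element) -> List[str]:
    """Return the feed-level rules that a parsed <feed> element breaks."""
    problems = []
    if feed.find(ATOM + "id") is None:
        problems.append("missing <id>")
    # A link without a rel attribute counts as an alternate link.
    rels = [link.get("rel", "alternate") for link in feed.findall(ATOM + "link")]
    if "self" not in rels:
        problems.append("missing self link")
    if rels.count("alternate") != 1:
        problems.append("expected exactly one alternate link")
    # The author may appear on the feed or on every single entry.
    entries = feed.findall(ATOM + "entry")
    if feed.find(ATOM + "author") is None and not all(
        entry.find(ATOM + "author") is not None for entry in entries
    ):
        problems.append("author missing at feed level and in at least one entry")
    return problems

feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<id>urn:example:feed</id>"
    '<link rel="self" href="http://example.com/feed.xml"/>'
    '<link href="http://example.com/"/>'
    "<author><name>Example Author</name></author>"
    "</feed>"
)
print(feed_problems(feed))
```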

And now, the entry-level requirements:

  • Each entry must contain an <id> element. See section 4.5.4 for more about Atom identifiers.
  • If the entry does not have a <content> element, it must have an alternate link. An entry's alternate link is its permalink, a permanent link to the entry's web representation.
  • An entry can have multiple alternate links for different languages and content types, but an entry must not contain more than one alternate link for each combination of language and content type.
  • The entry must include a <summary> element if the content is not easily readable, i.e., if there is no <content> element, the <content> element contains something other than text, or the <content> element references content elsewhere.
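The entry-level rules can be checked in the same spirit. Again, this is a sketch under stated assumptions: entry_problems is a hypothetical name, and we approximate "something other than text" as any type other than text, html, or xhtml, which is looser than the specification's exact wording.

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM = "{http://www.w3.org/2005/Atom}"
READABLE_TYPES = {"text", "html", "xhtml"}  # assumption: treated as readable

def entry_problems(entry: ET.Element) -> List[str]:
    """Return the entry-level rules that a parsed <entry> element breaks."""
    problems = []
    if entry.find(ATOM + "id") is None:
        problems.append("missing <id>")
    content = entry.find(ATOM + "content")
    alternates = [link for link in entry.findall(ATOM + "link")
                  if link.get("rel", "alternate") == "alternate"]
    if content is None and not alternates:
        problems.append("no <content> and no alternate link")
    # Only one alternate link per language and content type combination.
    combos = [(link.get("hreflang"), link.get("type")) for link in alternates]
    if len(combos) != len(set(combos)):
        problems.append("duplicate alternate link for one language/type pair")
    needs_summary = (
        content is None
        or content.get("src") is not None  # content lives elsewhere
        or content.get("type", "text") not in READABLE_TYPES
    )
    if needs_summary and entry.find(ATOM + "summary") is None:
        problems.append("missing <summary> for content that is absent, remote, or non-text")
    return problems

entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    "<id>urn:example:1</id>"
    '<link href="http://example.com/1"/>'
    "</entry>"
)
print(entry_problems(entry))
```

The sample entry has a permalink but no <content>, so the only rule it breaks is the one that calls for a <summary>.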

And notice that Atom is extensible. As you can see from figure 4.6, <feed> and <entry> elements may contain extension elements. This allows Atom to be extended in the same ways that RSS 1.0 and RSS 2.0 are. You can add new XML elements as long as they are defined in an XML namespace. Let's discuss some of the key features of the Atom format that set it apart from RSS formats. We'll start with ids.
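To illustrate extensibility, the following sketch adds an extension element to an entry and then picks it out again by namespace. The namespace URI http://example.com/ns/ratings, the ex prefix, and the <rating> element are invented for this example and do not belong to any real extension.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
EXT_NS = "http://example.com/ns/ratings"  # hypothetical extension namespace

# Choose readable prefixes for the serialized output.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("ex", EXT_NS)

entry = ET.Element(f"{{{ATOM_NS}}}entry")
ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = "urn:example:entry-1"
# The extension element sits beside the Atom elements, in its own namespace.
ET.SubElement(entry, f"{{{EXT_NS}}}rating").text = "5"

def extension_elements(parent: ET.Element) -> list:
    """Return the children of parent that are not in the Atom namespace."""
    return [child for child in parent
            if not child.tag.startswith(f"{{{ATOM_NS}}}")]

xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
print([child.tag for child in extension_elements(entry)])
```

A newsfeed reader that does not know the extension can simply skip elements from namespaces it does not recognize.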

