An example of XML processing

To illustrate how to use the XML library, we develop a simple Atom parser in Factor. Atom is an XML-based syndication format, like RSS. For the full version of what we develop here, see the atom1.0 word in basis/syndication. First, we load a file and get a DOM tree for it:
"file.xml" file>xml

No encoding descriptor is needed, because XML files contain sufficient information to auto-detect the encoding. Next, we want to extract information from the tree. To get the title, we can use the following:
"title" tag-named children>string

The tag-named word finds the first tag named title at the top level (directly under the main tag). Then, with that tag on the stack, children>string asserts that its children are a single string and returns that string.

For a slightly more complicated example, we can look at how entries are parsed. To get a sequence of tags with the name entry:
"entry" tags-named

Imagine that, for each of these, we want to get the URL of the entry. In Atom, the URLs are in link tags contained in the entry tag. There can be multiple link tags, but one of them has the attribute rel=alternate, and its href attribute holds the URL. So, given an element of the sequence produced by the code above, we run:
"link" tags-named [ "rel" attr "alternate" = ] find nip

to get the link tag on the stack, and
"href" attr >url

to extract the URL from it.
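
The steps above can be collected into a small vocabulary. The following is only a sketch, not the code from basis/syndication: the vocabulary name atom-sketch, the words entry-url and feed-summary, and the sample-feed constant are hypothetical names introduced for illustration, and the sample document is a simplified feed without the Atom namespace. It uses string>xml instead of file>xml so that it does not depend on a file on disk.

```factor
USING: kernel sequences urls xml xml.data xml.traversal ;
IN: atom-sketch

! Hypothetical sample feed, simplified (no namespace, one entry).
! Attribute values use single quotes so the Factor string needs no escapes.
CONSTANT: sample-feed "<feed><title>Demo</title><entry><link rel='self' href='http://example.com/feed'/><link rel='alternate' href='http://example.com/post'/></entry></feed>"

! Hypothetical helper: URL of the rel=alternate link of one entry tag.
! Assumes such a link exists; if none does, find leaves f on the
! stack and the following attr call would fail.
: entry-url ( entry -- url )
    "link" tags-named [ "rel" attr "alternate" = ] find nip
    "href" attr >url ;

! Hypothetical helper: the feed title plus one URL per entry.
: feed-summary ( xml -- title urls )
    [ "title" tag-named children>string ]
    [ "entry" tags-named [ entry-url ] map ] bi ;

! Usage, for example in the listener:
! sample-feed string>xml feed-summary
```

Keeping entry-url as a separate word mirrors the text: it takes one element of the sequence returned by tags-named, so it can be applied to every entry with map.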