RDFa in XHTML: Syntax and Processing
A collection of attributes and processing rules for extending XHTML to support RDF
W3C Recommendation 14 October 2008
This version:
Latest version:
Previous version:
Diff from previous version:
rdfa-syntax-diff.html
Editors:
Ben Adida, Creative Commons, ben@adida.net
Mark Birbeck, webBackplane, mark.birbeck@webBackplane.com
Shane McCarron, Applied Testing and Technology, Inc., shane@aptest.com
Steven Pemberton, CWI
Please refer to the errata for this document, which may include some normative corrections.
This document is also available in these non-normative formats: PostScript version, PDF version, ZIP archive, and Gzip'd TAR archive.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright W3C (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Abstract
The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely
unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer
structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user's desktop calendar;
a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, location and topic can be published as
easily as the original photo itself, enabling structured search and sharing.
RDFa is a specification for attributes to express structured data in any markup language. This document specifies how to use RDFa with XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't need to repeat significant data in the document content. The underlying abstract representation is RDF [RDF-PRIMER], which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.
The rules for interpreting the data are generic, so that there is no need for different rules for different formats; this allows authors and publishers of data to define their own formats without
having to update software, register formats via a central authority, or worry that two formats may interfere with each other.
RDFa shares some use cases with microformats [MICROFORMATS]. Whereas microformats specify both a syntax for embedding structured data into HTML documents and a vocabulary of specific terms for each microformat, RDFa specifies only a syntax and relies on independent specification of terms (often called vocabularies or taxonomies) by others. RDFa allows terms from multiple independently-developed vocabularies to be freely intermixed and is designed such that the language can be parsed without knowledge of the specific term vocabulary being used.
This document is a detailed syntax specification for RDFa, aimed at:
those looking to create an RDFa parser, and who therefore need a detailed description of the parsing rules;
those looking to recommend the use of RDFa within their organisation, and who would like to create some guidelines for their users;
anyone familiar with RDF, and who wants to understand more about what is happening 'under the hood', when an RDFa parser runs.
For those looking for an introduction to the use of RDFa and some real-world examples, please consult the RDFa Primer.
How to Read this Document
If you are already familiar with RDFa, and you want to examine the processing rules (perhaps to create a parser), then you'll find the Processing Model section of most interest. It contains an overview of each of the processing steps, followed by more detailed sections, one for each rule.
If you are not familiar with RDFa, but you are familiar with RDF, then you might find reading the Syntax Overview useful before looking at the Processing Model, since it gives a range of examples of XHTML mark-up that use RDFa. Seeing some examples first should make reading the processing rules easier.
If you are not familiar with RDF, then you might want to take a look at the section on RDF Terminology before trying to do too much with RDFa. Although RDFa is designed to be easy to author (and authors don't need to understand RDF to use it), anyone writing applications that consume RDFa will need to understand RDF. There is a lot of material about RDF on the web, and a growing range of tools that support RDFa, so all we try to do in this document is provide enough background on RDF to make the goals of RDFa clearer.
And finally, if you are not familiar with either RDFa or RDF, and simply want to add RDFa to your documents, then you may find the RDFa Primer [RDFaPRIMER] to be a better introduction.
Status of this Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable
document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread
deployment. This enhances the functionality and interoperability of the Web.
Members of the public are invited to send comments on this Recommendation to public-rdf-in-xhtml-tf@w3.org (with public archive).
A sample test harness is available. This set of tests is not intended to be exhaustive. Users may find the tests to be useful examples of RDFa usage. An implementation report lists several implementations of this specification tested during the Candidate Recommendation period. A community-maintained Wiki page includes subsequent updates.
This document has been produced jointly by the Semantic Web Deployment Working Group and the XHTML2 Working Group as part of the Semantic Web Activity and the HTML Activity. It contains small editorial changes arising from comments received during the Proposed Recommendation review; see the diff-marked version for details.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the XHTML 2 Working Group; and also maintains a public list of any patent disclosures made in connection with the deliverables of the Semantic Web Deployment Working Group; those pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Table of Contents
1. Motivation
2. Syntax Overview
2.1. The RDFa Attributes
2.2. Examples
3. RDF Terminology
3.1. Statements
3.2. Triples
3.3. URI references
3.4. Plain literals
3.5. Typed literals
3.6. Turtle
3.7. Graphs
3.8. Compact URIs
3.9. XHTML Fragments and RDFa
3.10. A description of RDFa in RDF terms
4. Conformance Requirements
4.1. Document Conformance
4.2. User Agent Conformance
4.3. RDFa Processor Conformance
5. Processing Model
5.1. Overview
5.2. Evaluation Context
5.3. Chaining
5.4. CURIE and URI Processing
5.4.1. Scoping of Prefix Mappings
5.4.2. Converting a CURIE to a URI
5.4.3. General Use of CURIEs in Attributes
5.4.4. Use of CURIEs in Specific Attributes
5.4.5. Referencing Blank Nodes
5.5. Sequence
6. RDFa Processing in detail
6.1. Changing the evaluation context
6.1.1. Setting the current subject
6.2. Completing 'incomplete triples'
6.3. Object resolution
6.3.1. Literal object resolution
6.3.2. URI object resolution
7. CURIE Syntax Definition
8. XHTML+RDFa Definition
9. Metainformation Attributes Module
9.1. Datatypes
9.2. Metainformation Attributes Collection
9.3. @rel/@rev attribute values
A. XHTML+RDFa DTD
A.1. XHTML Metainformation Attributes Module
A.2. XHTML+RDFa Content Model Module
A.3. XHTML+RDFa Driver Module
A.4. SGML Open Catalog Entry for XHTML+RDFa
B. CURIE Datatypes
B.1. XML Schema Definition
B.2. XML DTD Definition
C. Deployment Advice
D. References
D.1. Related Specifications
D.2. Other References
E. Change History
F. Acknowledgments
1. Motivation
This section is informative.
RDF/XML [RDF-SYNTAX] provides sufficient flexibility to represent all of the abstract concepts in RDF [RDF-CONCEPTS]. However, it presents a number of challenges; first it is difficult or impossible to validate documents that contain RDF/XML using XML Schemas or DTDs, which therefore makes it difficult to import RDF/XML into other markup languages. Whilst newer schema languages such as RELAX NG [RELAXNG] do provide a way to validate documents that contain arbitrary RDF/XML, it will be a while before they gain wide support.
Second, even if one could add RDF/XML directly into an XML dialect like XHTML, there would be significant data duplication between the rendered data and the RDF/XML structured data. It would be
far better to add RDF to a document without repeating the document's existing data. For example, an XHTML document that explicitly renders its author's name in the text—perhaps as a byline on a
news site—should not need to repeat this name for the RDF expression of the same concept: it should be possible to supplement the existing markup in such a way that it can also be interpreted
as RDF.
Another reason for aligning the rendered data with the structured data is that it is highly beneficial to express the web data's structure 'in context'; as users often want to transfer structured
data from one application to another, sometimes to or from a non-web-based application, the user experience can be enhanced. For example, information about specific rendered data could be presented
to the user via 'right-clicks' on an item of interest.
In the past, many attributes were 'hard-wired' directly into the markup language to represent specific concepts. For example, in XHTML 1.1 [XHTML11] and HTML [HTML4] there is @cite; the attribute allows an author to add information to a document which is used to indicate the origin of a quote.
However, these 'hard-wired' attributes make it difficult to define a generic process for extracting metadata from any document since a parser would need to know about each of the special
attributes. One motivation for RDFa has been to devise a means by which documents can be augmented with metadata in a general rather than hard-wired manner. This has been achieved by creating a fixed
set of attributes and parsing rules, but allowing those attributes to contain properties from any of a number of the growing range of available RDF vocabularies. The values of those properties are in most cases the information that is already in an author's XHTML document.
RDFa alleviates the pressure on XML format authors to anticipate all the structural requirements users of their format might have, by outlining a new syntax for RDF that relies only on XML
attributes. This specification deals specifically with the use of RDFa in XHTML, and defines an RDF mapping for a number of XHTML attributes, but RDFa can be easily imported into other XML-based
markup languages.
2. Syntax Overview
This section is informative.
The following examples are intended to help readers who are not familiar with RDFa to quickly get a sense of how it works. For a more thorough introduction, please read the RDFa Primer [RDFaPRIMER].
For brevity, in the following examples and throughout this document, assume that the following vocabulary prefixes have been defined:
biblio:
cc:
dbp:
dbr:
dc:
ex:
foaf:
rdf:
rdfs:
taxo:
xhv:
xsd:
2.1. The RDFa Attributes
RDFa in XHTML makes use of a number of XHTML attributes, as well as providing a few new ones. Attributes that already exist in XHTML will have the same meaning as in XHTML, although their syntax may be slightly modified. For example, in XHTML, @rel already defines the relationship between one document and another. However, in XHTML there is no clear way to add new values; RDFa sets out to explicitly solve this problem, and does so by allowing URIs as values. It also introduces the idea of 'compact URIs' (referred to as CURIEs in this document), which allow a full URI value to be expressed succinctly.
The XHTML attributes that are relevant are:
@rel
a whitespace separated list of CURIEs, used for expressing relationships between two resources ('predicates' in RDF terminology);
@rev
a whitespace separated list of CURIEs, used for expressing reverse relationships between two resources (also 'predicates');
@content
a string, for supplying machine-readable content for a literal (a 'plain literal object', in RDF terminology);
@href
a URI for expressing the partner resource of a relationship (a 'resource object', in RDF terminology);
@src
a URI for expressing the partner resource of a relationship when the resource is embedded (also a 'resource object').
The new—RDFa-specific—attributes are:
@about
a URIorSafeCURIE, used for stating what the data is about (a 'subject' in RDF terminology);
@property
a whitespace separated list of CURIEs, used for expressing relationships between a subject and some literal text (also a 'predicate');
@resource
a URIorSafeCURIE for expressing the partner resource of a relationship that is not intended to be 'clickable' (also an 'object');
@datatype
a CURIE representing a datatype, to express the datatype of a literal;
@typeof
a whitespace separated list of CURIEs that indicate the RDF type(s) to associate with a subject.
For a normative definition of these attributes see the XHTML Metainformation Attributes Module.
2.2. Examples
As an XHTML author you will already be familiar with using meta and link to add additional information to your documents:


Page 7




...

RDFa makes use of this concept, enhancing it with the ability to make use of other vocabularies by using compact URIs:
xmlns="http://www.w3.org/1999/xhtml"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"

My home-page



...

Although not widely used, XHTML already supports the use of @rel and @rev on the a element. This becomes more useful in RDFa with the addition of support for different vocabularies:
This document is licensed under a
<a xmlns:cc="http://creativecommons.org/ns#"
   rel="cc:license"
   href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons License</a>.
Not only can URLs in the document be re-used to provide metadata, but so can inline text:
xmlns="http://www.w3.org/1999/xhtml"
xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
Jo's Friends and Family Blog


I'm holding
property="cal:summary"
one last summer Barbecue
,
on September 16th at 4pm.




If some displayed text is different to the actual 'value' it represents, more precise values can be added, which can optionally include datatypes:
xmlns="http://www.w3.org/1999/xhtml"
xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
Jo's Friends and Family Blog


I'm holding

one last summer Barbecue
,
on
property="cal:dtstart"
content="2007-09-16T16:00:00-05:00"
datatype="xsd:dateTime"
September 16th at 4pm
.




In many cases a block of mark-up will contain a number of properties that relate to the same item; it's possible with RDFa to indicate the type of that item:
xmlns="http://www.w3.org/1999/xhtml"
xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
Jo's Friends and Family Blog

typeof="cal:Vevent"
I'm holding

one last summer Barbecue
,
on
datatype="xsd:dateTime">
September 16th at 4pm
.




The metadata features available in XHTML only allow information to be expressed about the document itself. RDFa allows the document to contain metadata information about other documents and
resources:
xmlns="http://www.w3.org/1999/xhtml"
xmlns:biblio="http://example.org/"
xmlns:dc="http://purl.org/dc/elements/1.1/"

Books by Marco Pierre White


I think White's book
'about="urn:ISBN:0091808189"
typeof="biblio:book"
property="dc:title"
Canteen Cuisine
'
is well worth getting since although it's quite advanced stuff, he
makes it pretty easy to follow. You might also like
about="urn:ISBN:1596913614"
typeof="biblio:book"
property="dc:description"
White's autobiography
.


3. RDF Terminology
This section is informative.
The previous section gave examples of typical mark-up in order to illustrate what RDFa in XHTML looks like. But what RDFa in XHTML represents is RDF. In order to author RDFa in XHTML you do not need to understand RDF, although it would certainly help. However, if you are building a system that consumes the RDF output of an RDFa in XHTML document you will almost certainly need to understand RDF. In this section we introduce the basic concepts and terminology of RDF. For a more thorough explanation of RDF, please refer to the RDF Concepts document [RDF-CONCEPTS] and the RDF Syntax document [RDF-SYNTAX].
3.1. Statements
The structured data that RDFa provides access to is a collection of statements. A statement is a basic unit of information that has been constructed in a specific format to make it easier to process. In turn, by breaking large sets of information down into a collection of statements, even very complex metadata can be processed using simple rules.
To illustrate, suppose we have the following set of facts:
Albert was born on March 14, 1879, in Germany. There is a picture of him at
the web address, http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.
This would be quite difficult for a machine to interpret, and it is certainly not in a format that could be passed from one data application to another. However, if we convert the information to a
set of statements it begins to be more manageable. The same information could therefore be represented by the following shorter 'statements':
Albert was born on March 14, 1879.
Albert was born in Germany.
Albert has a picture at http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.
3.2. Triples
To make this information machine-processable, RDF defines a structure for these statements. A statement is formally called a [triple], meaning that it is made up of three components. The first is the subject of the triple, and is what we are making our statements about. In all of these examples the subject is 'Albert'.
The second part of a triple is the property of the subject that we want to define. In the examples here, the properties would be 'was born on', 'was born in', and 'has a picture at'. These are more usually called predicates in RDF.
The final part of a triple is called the object. In the examples here the three objects have the values 'March 14, 1879', 'Germany', and 'http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg'.
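The subject, predicate and object structure described above can be sketched in Python. This is only an illustration, not part of RDF or RDFa: the names Triple, statements and objects_of are hypothetical, and the values are the informal strings from the example, not yet URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: what it is about, which property, and the value."""
    subject: str
    predicate: str
    object: str


# The picture address quoted in the example text above.
PICTURE = "http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg"

# The three informal statements about Albert, one triple each.
statements = [
    Triple("Albert", "was born on", "March 14, 1879"),
    Triple("Albert", "was born in", "Germany"),
    Triple("Albert", "has a picture at", PICTURE),
]


def objects_of(triples, subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]
```

Because every statement has the same three-part shape, a single generic lookup such as objects_of works for any property, which is the point of breaking information into triples.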
3.3. URI references
Breaking complex information into manageable units helps us be specific about our data, but there is still some ambiguity. For example, which 'Albert' are we talking about? If another system has more facts about 'Albert', how could we know whether they are about the same person, and so add them to the list of things we know about that person? If we wanted to find people born in Germany, how could we know that the predicate 'was born in' has the same purpose as the predicate 'birthplace' that might exist in some other system? RDF solves this problem by replacing our vague terms with URI references.
URIs are most commonly used to identify web pages, but RDF makes use of them as a way to provide unique identifiers for concepts. For example, we could identify the subject of all of our statements (the first part of each triple) by using the DBPedia URI for Albert Einstein, instead of the ambiguous string 'Albert':

has the name
Albert Einstein.

was born on
March 14, 1879.

was born in
Germany.

has a picture at
URI references are also used to uniquely identify the objects in metadata statements (the third part of each triple). The picture of Einstein is already a URI, but we could also use a URI to
uniquely identify the country Germany. At the same time we'll indicate that the name and date of birth really are literals (and not URIs), by putting quotes around them:

has the name
Albert Einstein

was born on
March 14, 1879

was born in


has a picture at
URI references are also used to ensure that predicates are unambiguous; now we can be sure that 'birthplace', 'place of birth', 'Lieu de naissance' and so on, all mean the same thing:


"Albert Einstein".


"March 14, 1879".


.


.
3.4. Plain literals
Although URI resources are always used for subjects and predicates, the object part of a triple can be either a URI or a [literal]. In the example triples, Einstein's name is represented by a [plain literal], which means that it is a basic string with no type or language information:


"Albert Einstein"
3.5. Typed literals
Some literals, such as dates and numbers, have very specific meanings, so RDF provides a mechanism for indicating the type of a literal. A [typed literal] is indicated by attaching a URI to the end of a [plain literal], and this URI indicates the literal's datatype. This URI is usually based on datatypes defined in the XML Schema Datatypes specification [XMLSCHEMA]. The following syntax would be used to unambiguously express Einstein's date of birth as a literal of type xsd:date:
"1879-03-14"
^^
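The idea of attaching a datatype URI to a literal can be sketched in Python as follows. The helper name typed_literal is hypothetical and the escaping is deliberately minimal; the only outside fact assumed is that the XML Schema datatype namespace is http://www.w3.org/2001/XMLSchema#.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"  # XML Schema datatype namespace


def typed_literal(lexical: str, datatype_uri: str) -> str:
    """Format a typed literal as "lexical"^^<datatype URI>."""
    # Escape backslashes first, then double quotes, so the quoted
    # string stays well-formed.
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype_uri}>'


# Einstein's date of birth as a literal of type xsd:date.
birth_date = typed_literal("1879-03-14", XSD + "date")
```

The lexical form and the datatype travel together, so a consumer does not have to guess whether "1879-03-14" is a date or just a string.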
3.6. Turtle
RDF itself does not have one set way to express triples, since the key ideas of RDF are the triple and the use of URIs, and not any particular syntax. However, there are a number of mechanisms for expressing triples, such as RDF/XML, Turtle [TURTLE], and of course RDFa. Many discussions of RDF make use of the Turtle syntax to explain their ideas, since it is quite compact. The examples we have just seen are already using this syntax, and we'll continue to use it throughout this document when we need to talk about the RDF that could be generated from some RDFa. Turtle allows long URIs to be abbreviated by using a URI mapping, which can be used to express a compact URI as follows:
@prefix dbp: .
@prefix foaf: .

foaf:name
"Albert Einstein" .

dbp:birthPlace
.
Here 'dbp:' has been mapped to the URI for DBPedia and 'foaf:' has been mapped to the URI for the 'Friend of a Friend' taxonomy.
Any URI in Turtle could be abbreviated in this way. This means that we could also have used the same technique to abbreviate the identifier for Einstein, as well as the datatype indicator:
@prefix dbp: .
@prefix dbr: .
@prefix foaf: .
@prefix xsd: .
dbr:Albert_Einstein dbp:dateOfBirth "1879-03-14"^^xsd:date .
dbr:Albert_Einstein foaf:depiction <http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg> .
When writing examples, you will often see the following URI in the Turtle representation:
<>
This indicates the 'current document', i.e., the document being processed. In reality there would always be a full URI based on the document's location, but this abbreviation serves to make
examples more compact. Note in particular that the whole technique of abbreviation is merely a way to make examples more compact, and the actual triples generated would always use the full URIs.
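The abbreviation mechanism amounts to looking up a prefix and concatenating the mapped URI with the rest of the name. A minimal Python sketch follows; the function name expand and the PREFIXES table are hypothetical, and the three namespace URIs are the ones declared with xmlns in the examples of section 2.2. It ignores the finer points of CURIE processing that the normative sections define.

```python
def expand(curie: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:reference' into a full URI by concatenation."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no mapping for the prefix in {curie!r}")
    return prefixes[prefix] + reference


# Prefix mappings taken from the xmlns declarations in the earlier examples.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cc": "http://creativecommons.org/ns#",
}
```

Because expansion is plain concatenation, the mapped URI has to end with whatever separator the vocabulary uses (such as '/' or '#'); otherwise the prefix and the reference run together.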
3.7. Graphs
A collection of triples is called a graph.
For more information on the concepts described above, see [RDF-CONCEPTS]. RDFa additionally defines the following terms:
3.8. Compact URIs
In order to allow for the compact expression of RDF statements, RDFa allows the contraction of all [URI reference]s into a form called a 'compact URI', or CURIE. A detailed discussion of this mechanism is in section CURIE and URI Processing.
Note that CURIEs are only used in the mark-up and Turtle examples, and will never appear in the generated [triple]s, which are defined in RDF to use [URI reference]s.
Full details on how CURIEs are processed are in the section titled CURIE and URI Processing.
3.9. XHTML Fragments and RDFa
A growing use of embedded metadata is to take fragments of mark-up and move them from one document to another. This may happen through the use of tools, such as drag-and-drop in a browser, or
through snippets of code provided to authors for inclusion in their documents. (A good example of the latter is the licensing fragment provided by Creative Commons.)
However, those involved in creating fragments (either by building tools, or authoring snippets), should be aware that this specification does not say how fragments of XHTML+RDFa should be
processed whilst they are 'outside' of a complete XHTML+RDFa document (although future versions of this or related specifications may do so).
Developers of tools that process fragments, or authors of fragments for manual inclusion, should also bear in mind what will happen to their fragment once it is included in an XHTML+RDFa document,
and are advised to carefully consider the amount of 'context' information that will be needed in order to ensure a correct interpretation of their fragment.
3.10. A description of RDFa in RDF terms
The following is a brief description of RDFa in terms of the RDF terminology introduced here. It may be useful to readers with an RDF background:
The aim of RDFa is to allow a single [RDF graph] to be carried in various types of document mark-up. However, this specification deals only with RDFa in XHTML. An [RDF graph] comprises [node]s linked by relationships. The basic unit of an [RDF graph] is a [triple], in which a subject [node] is linked to an object [node] via a [predicate]. The [subject] [node] is always either a [URI reference] or a [blank node (or bnode)], the [predicate] is always a [URI reference], and the object of a statement can be a [URI reference], a [literal], or a [bnode].
In RDFa, a subject [URI reference] is generally indicated using @about, and predicates are represented using one of @property, @rel, or @rev. Objects which are [URI reference]s are represented using @href, @resource or @src, whilst objects that are [literal]s are represented either with @content or the content of the element in question (with an optional datatype expressed using @datatype).
4. Conformance Requirements
This section is normative.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
Note that all examples in this document are informative, and are not meant to be interpreted as normative requirements.
4.1. Document Conformance
A strictly conforming XHTML+RDFa document is a document that requires only the facilities described as mandatory in this specification. Such a document satisfies the following criteria:
The document MUST conform to the constraints expressed in the schemas in Appendix A - XHTML+RDFa Document Type Definition.
The local part of the root element of the document MUST be html.
The start tag of the root element of the document MUST explicitly contain a default namespace declaration for the XHTML namespace [XMLNS]. The namespace URI for XHTML is defined to be http://www.w3.org/1999/xhtml.
Sample root element

There SHOULD be a @version attribute on the html element with the value "XHTML+RDFa 1.0".
Example of an XHTML+RDFa 1.0 document

html
xmlns="http://www.w3.org/1999/xhtml"
version="XHTML+RDFa 1.0"
xml:lang="en">

Virtual Library


Moved to example.org.




Note that in this example, the XML declaration is included. An XML declaration like the one above is not required in all XML documents.
XHTML document authors SHOULD use XML declarations in all their documents. XHTML document authors MUST use an XML declaration when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding is specified by a higher-level protocol.
XHTML+RDFa documents SHOULD be labeled with the Internet Media Type "application/xhtml+xml" as defined in [RFC3236]. For further information on using media types with XHTML family markup languages, see the informative note [XHTMLMIME].
4.2. User Agent Conformance
A conforming user agent MUST support all of the features required in this specification. A conforming user agent must also support the User Agent conformance requirements as defined in XHTML Modularization [XHTMLMOD] section on "XHTML Family User Agent Conformance".
4.3. RDFa Processor Conformance
A conforming RDFa Processor MUST make available to a consuming application a single [RDF graph] containing all possible triples generated by using the rules in the Processing Model section. This specification uses the term [default graph] to mean all of the triples asserted by a document according to the Processing Model section.
A conforming RDFa Processor MAY make available additional triples that have been generated using rules not described here, but these triples MUST NOT be made available in the [default graph]. (Whether these additional triples are made available in one or more additional [RDF graph]s is implementation-specific, and therefore not defined here.)
Since XHTML+RDFa is based upon XHTML Modularization [XHTMLMOD], and since XHTML Modularization requires that whitespace is preserved, conforming processors must preserve whitespace in both [plain literal]s and [XML literals]. However, it may be the case that the architecture in which a processor operates does not make all whitespace available. It is therefore advisable for authors who would like to make their documents consumable across different processors, to remove any unnecessary whitespace in their mark-up.
5. Processing Model
This section is normative.
This section looks at a generic set of processing rules for creating a set of triples that represent the structured data present in an XHTML+RDFa document. Processing need not follow the DOM traversal technique outlined here, although the effect of following some other manner of processing must be the same as if the processing outlined here were followed. The processing model is explained using the idea of DOM traversal which makes it easier to describe (particularly in relation to the [evaluation context]).
Note that in this section, explanations about the processing model or guidance to implementors are enclosed in sections like this.
5.1. Overview
Parsing a document for RDFa triples is carried out by starting at the document object, and then visiting each of its child elements in turn, in document order, applying processing rules. Processing is recursive in that for each child element the processor also visits each of its child elements, and applies the same processing rules.
(Note that in some environments there will be little difference between starting at the root element of the document, and starting at the document object itself. However, we define it this way
since in some environments important information is present at the document object level which is not present on the root element.)
As processing continues, rules are applied which may generate triples, and may also change the [evaluation context] information that will then be used when processing descendant elements.
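The recursive, document-order visit described above can be sketched in Python. This is an illustration under stated assumptions, not a conforming processor: it uses the standard library's xml.etree.ElementTree, it starts at the root element instead of a document object, the function name visit is hypothetical, and no RDFa rules are applied at each step.

```python
import xml.etree.ElementTree as ET


def visit(element, depth=0, seen=None):
    """Visit an element, then each of its children in document order."""
    if seen is None:
        seen = []
    # A real processor would apply the RDFa rules here, possibly emit
    # triples, and pass an updated evaluation context to the children.
    seen.append((depth, element.tag))
    for child in element:
        visit(child, depth + 1, seen)
    return seen


# A tiny, namespace-free stand-in for an XHTML document.
doc = ET.fromstring(
    "<html><head><title>t</title></head><body><p>x</p></body></html>"
)
order = visit(doc)
```

Passing state down through the recursive call (here only depth) mirrors how the evaluation context flows from an element to its descendants without leaking back up to siblings.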
Note that we don't say anything about what should happen to the triples generated, or whether more triples might be generated during processing than are outlined here. However, to be conformant, an RDFa processor needs to act as if at a minimum the rules in this section are applied, and a single [RDF graph] produced. As described in the RDFa Processor Conformance section, any additional triples generated MUST NOT appear in the [default graph].
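The traversal order described in this overview can be sketched as follows. This is only an illustration, not part of the processing rules: it assumes Python's standard `xml.etree.ElementTree`, which has no separate document object, so the sketch starts at the root element, and the `visit` function and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def visit(element, depth=0, visited=None):
    """Visit an element, then each of its children in document order."""
    if visited is None:
        visited = []
    # A real processor would apply the RDFa processing rules here.
    visited.append((depth, element.tag))
    # Iterating an Element yields its child elements in document order.
    for child in element:
        visit(child, depth + 1, visited)
    return visited


# Hypothetical sample document, used only to show the visiting order.
doc = ET.fromstring(
    "<html><head><title>t</title></head>"
    "<body><div><span/></div><p/></body></html>"
)
order = visit(doc)
```

Each element is handled before its children, so anything a rule changes for an element is in place before its descendants are processed.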
5.2. Evaluation Context
During processing, each rule is applied using information provided by an [evaluation context]. An initial context is created when processing begins, with the following set of values:
The [base]. This will usually be the URL of the document being processed, but it could be some other URL, set by some other mechanism, such as the XHTML base element. The important thing is that it establishes a URL against which relative paths can be resolved.
The [parent subject]. The initial value will be the same as the initial value of [base], but it will usually change during the course of processing.
The [parent object]. In some situations the object of a statement becomes the subject of any nested statements, and this property is used to convey this value. Note that this value may be a bnode, since in some situations a number of nested statements are grouped together on one bnode. In that case the bnode must be set in the containing statement and passed down through this property.
A list of current, in-scope [URI mappings].
A list of [incomplete triple]s. A triple can be incomplete when no object resource is provided alongside a predicate that requires a resource (i.e., @rel or @rev). The triples can be completed when a resource becomes available, which will be when the next subject is specified (part of the process called [chaining]).
The [language]. Note that there is no default language.
During the course of processing new [evaluation context]s are created which are passed to each child element. The rules described below will determine the values of the items in the context. Additionally, some rules will cause new triples to be created by combining information provided by an element with information from the [evaluation context].
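One possible in-memory shape for the [evaluation context] is sketched below. The field names follow the list above, but the class, the helper functions, the use of a dictionary for the URI mappings, the (predicate, direction) pair for an incomplete triple, and the empty initial parent object are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Tuple


@dataclass
class EvaluationContext:
    base: str
    parent_subject: str
    # Assumption: nothing is set for the parent object initially.
    parent_object: Optional[str] = None
    # Assumption: mappings are held as prefix -> URI.
    uri_mappings: Dict[str, str] = field(default_factory=dict)
    # Assumption: an incomplete triple is a (predicate, direction) pair.
    incomplete_triples: List[Tuple[str, str]] = field(default_factory=list)
    # There is no default language.
    language: Optional[str] = None


def initial_context(document_url: str) -> EvaluationContext:
    """Create the initial context: parent subject starts out equal to base."""
    return EvaluationContext(base=document_url, parent_subject=document_url)


def child_context(parent: EvaluationContext, **changes) -> EvaluationContext:
    """Create the context handed to a child element."""
    ctx = replace(parent, **changes)
    # Copy the containers so a child cannot alter its parent's context.
    ctx.uri_mappings = dict(ctx.uri_mappings)
    ctx.incomplete_triples = list(ctx.incomplete_triples)
    return ctx


root = initial_context("http://example.org/doc")
child = child_context(root, parent_subject="http://example.org/doc#me")
child.uri_mappings["foaf"] = "http://xmlns.com/foaf/0.1/"
```

Copying the containers when a child context is created keeps a mapping declared on one element from leaking back to its ancestors or across to its siblings.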
During the course of processing a number of locally scoped values are needed, as follows:
An initially empty list of [URI mapping]s, called the [local list of URI mappings].
An initially empty [list of incomplete triples], called the [local list of incomplete triples].
An initially empty [language] value.
A [recurse] flag. Processing generally continues recursively through the entire tree of elements available. However, if an author indicates that some branch of the tree should be treated as an XML literal, no further processing should take place on that branch, and setting this flag to false would have that effect.
A [skip element] flag, which indicates whether the [current element] can safely be ignored since it has no relevant RDFa attributes. Note that descendant elements will still be processed.
A [new subject] value, which once calculated will set the [parent subject] property in an [evaluation context], as well as being used to complete any [incomplete triple]s, as described in the next section.
A value for the [current object literal], the literal to use when creating triples that have a literal object.
A value for the [current object resource], the resource to use when creating triples that have a resource object.
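The locally scoped values listed above could be grouped into a single record created afresh for each element, as in the sketch below. The class name is hypothetical, and the starting values for the two flags (recurse on, skip element off) and the empty starting values for the subject and object fields are assumptions, since this section only states that the two lists and the language start out empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LocalValues:
    # Initially empty, as stated in the list above.
    local_uri_mappings: Dict[str, str] = field(default_factory=dict)
    local_incomplete_triples: List[Tuple[str, str]] = field(default_factory=list)
    language: Optional[str] = None
    # Assumed starting values: keep descending, and do not skip the element.
    recurse: bool = True
    skip_element: bool = False
    # Assumed to be unset until the processing rules compute them.
    new_subject: Optional[str] = None
    current_object_literal: Optional[str] = None
    current_object_resource: Optional[str] = None


first = LocalValues()
second = LocalValues()
# Marking a branch as an XML literal would stop descent into that branch only.
first.recurse = False
first.local_uri_mappings["db"] = "http://dbpedia.org/"
```

Because `field(default_factory=...)` builds a new container per instance, values set while processing one element do not show up in the record for another.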
5.3. Chaining
RDFa has the notion of [chaining], which aims to combine statements together in as intuitive a way as possible, so as to avoid unnecessary repetition of mark-up. For example, if an author were to add statements as children of an object that was a resource, these statements should be interpreted as being about that resource:

Albert Einstein
1879-03-14
resource="http://dbpedia.org/resource/Germany"
Federal Republic of Germany


In this example we can see that an object resource ('Germany') has become the subject for nested statements. This mark-up also illustrates the basic chaining pattern of 'A has a B has a C' (i.e., Einstein has a birth place of Germany, which has a long name of "Federal Republic of Germany").
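The 'A has a B has a C' pattern can be modelled as a small recursive function, shown here as a sketch rather than as the processing rules themselves. The nested-tuple input format, the `chain` function, the `ex:` predicate names and the Einstein URI are hypothetical; only the Germany URI and the literal values come from the example above. Literals and resources are both plain strings here for brevity.

```python
from typing import List, Tuple, Union

Triple = Tuple[str, str, str]
# A statement pairs a predicate with either a literal (a string) or a
# (resource, nested statements) pair.
Statement = Tuple[str, Union[str, Tuple[str, list]]]


def chain(subject: str, statements: List[Statement]) -> List[Triple]:
    """Emit triples, letting an object resource become the nested subject."""
    triples: List[Triple] = []
    for predicate, obj in statements:
        if isinstance(obj, tuple):
            resource, nested = obj
            triples.append((subject, predicate, resource))
            # The object resource is the subject of the nested statements.
            triples.extend(chain(resource, nested))
        else:
            triples.append((subject, predicate, obj))
    return triples


germany = "http://dbpedia.org/resource/Germany"
triples = chain(
    "http://example.org/einstein",  # hypothetical subject URI
    [
        ("ex:name", "Albert Einstein"),
        ("ex:birthDate", "1879-03-14"),
        ("ex:birthPlace", (germany, [("ex:longName", "Federal Republic of Germany")])),
    ],
)
```

The long name ends up attached to Germany rather than to Einstein, which is the effect chaining is meant to give.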
It's also possible for the subject of nested statements to provide the object for containing statements, essentially the reverse of the example we have just seen. To illustrate, we'll take an example of the type of chaining just described, and show how it could be marked up more efficiently. To start, we mark up the fact that Albert Einstein had both German and American citizenship:




Now, we show the same information, but this time we create an [incomplete triple] from the citizenship part, and then use any number of further subjects to 'complete' that triple, as follows:




In this example, the [incomplete triple] actually gets completed twice, once for Germany and once for the USA, giving exactly the same information as we had in the earlier example:

dbp:citizenship .

dbp:citizenship .
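Completing one [incomplete triple] more than once can be sketched as below. The function name and the (predicate, direction) representation are assumptions, with "forward" standing for a predicate given in @rel and "reverse" for one given in @rev. The Einstein and USA URIs are hypothetical placeholders; the Germany URI is the one used earlier in this section.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def complete_triples(
    parent_subject: str,
    incomplete: List[Tuple[str, str]],
    new_subject: str,
) -> List[Triple]:
    """Complete every pending triple with a newly available subject."""
    triples: List[Triple] = []
    for predicate, direction in incomplete:
        if direction == "forward":
            # Assumed @rel case: the new subject supplies the missing object.
            triples.append((parent_subject, predicate, new_subject))
        else:
            # Assumed @rev case: the roles are swapped.
            triples.append((new_subject, predicate, parent_subject))
    return triples


einstein = "http://example.org/einstein"  # hypothetical URI
pending = [("dbp:citizenship", "forward")]
result: List[Triple] = []
for country in ("http://dbpedia.org/resource/Germany", "http://example.org/usa"):
    # The same pending triple is completed once per nested subject.
    result += complete_triples(einstein, pending, country)
```

The pending list is not consumed by a completion, which is why a single citizenship predicate can yield one triple per country.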
Chaining can sometimes involve elements containing relatively minimal mark-up, for example showing only one resource, or only one predicate. Here the img element is used to carry a picture of Einstein:





When such minimal mark-up is used, any of the resource-related attributes could act as a subject or an object in the chaining:






5.4. CURIE and URI Processing
Since RDFa is ultimately a means for transporting RDF, a key concept is the resource and its manifestation as a URI. Since RDF deals with complete URIs (not relative paths), any relative URIs must be resolved against the base URI when converting RDFa to triples, using the algorithm defined in section 5 of RFC 3986 [URI], Reference Resolution.
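As an illustration of reference resolution, the sketch below uses `urljoin` from Python's standard `urllib.parse` as a stand-in for the RFC 3986 algorithm. The `resolve` wrapper and the example.org base URL are hypothetical, and a conforming processor may need to handle cases that this helper does not.

```python
from urllib.parse import urljoin


def resolve(base: str, reference: str) -> str:
    """Resolve a possibly relative URI reference against the base URI."""
    return urljoin(base, reference)


base = "http://example.org/people/index.html"  # hypothetical base URL

sibling = resolve(base, "einstein")
parent_dir = resolve(base, "../places/germany")
fragment = resolve(base, "#me")
# A reference that is already a complete URI is returned unchanged.
absolute = resolve(base, "http://dbpedia.org/resource/Germany")
```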
Many of the attributes that hold URIs are also able to carry 'compact URIs' or CURIEs. A CURIE is a convenient way to represent a long URI, by replacing a leading section of the URI with a
substitution token. It's possible for authors to define a number of substitution tokens as they see fit; the full URI is obtained by locating the mapping defined by a token from a list of in-scope
tokens, and then simply concatenating the second part of the CURIE onto the mapped value.
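The lookup-and-concatenate step can be sketched as follows. The `expand_curie` function and its error handling are assumptions, and the sketch ignores details such as an empty prefix. The `db` mapping matches the one used in this section's example, and the expanded value is the Germany URI used earlier in this document.

```python
from typing import Dict


def expand_curie(curie: str, mappings: Dict[str, str]) -> str:
    """Expand prefix:reference using the in-scope prefix mappings."""
    # Split at the first colon only; the reference part may contain more.
    prefix, separator, reference = curie.partition(":")
    if not separator:
        raise ValueError("not a CURIE (no colon): " + curie)
    if prefix not in mappings:
        raise KeyError("no in-scope mapping for prefix: " + prefix)
    # The full URI is the mapped value with the reference appended.
    return mappings[prefix] + reference


in_scope = {"db": "http://dbpedia.org/"}
full_uri = expand_curie("db:resource/Germany", in_scope)
```

Raising an error for an unknown prefix is a choice made for this sketch only; how a processor treats such values is governed by the processing rules, not by this example.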
For example, the full URI for Albert Einstein on DBpedia is:
This can be shortened by authors to make the information easier to manage, using a CURIE. The first step is for the author to create a prefix mapping that links a prefix to some leading segment of
the URI. In RDFa these mappings are expressed using the XML namespace syntax:
xmlns:db="http://dbpedia.org/"
...