RELAX NG Compact Syntax
Committee Specification 21 November 2002
This version:
Committee Specification: 21 November 2002
Editor:
James Clark
[email protected]
Copyright © The Organization for the Advancement of
Structured Information Standards [OASIS] 2002. All Rights
Reserved.
This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise explain
it or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to OASIS, except as needed for the
purpose of developing OASIS specifications, in which case the
procedures for copyrights defined in the OASIS Intellectual Property
Rights document must be followed, or as required to translate it into
languages other than English.
The limited permissions granted above are perpetual and will not
be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided
on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE
USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE.
Abstract
This document specifies a compact, non-XML syntax for [RELAX NG].
Status of this Document
This Committee Specification was approved for publication by the
OASIS RELAX NG technical committee. It is a stable document which
represents the consensus of the committee. Comments on this document
may be sent to
[email protected]
A list of known errors in this document is available at
Table of Contents
1. Introduction
2. Syntax
3. Lexical structure
4. Declarations
5. Annotations
  5.1. Initial annotations
  5.2. Documentation shorthand
  5.3. Following annotations
  5.4. Grammar annotations
6. Conformance
  6.1. Validator
  6.2. Structure preserving translator
  6.3. Non-structure preserving translator
Appendixes
A. Formal description
  1. Syntax
  2. Lexical structure
    2.1. Character encoding
    2.2. BOM stripping
    2.3. Newline normalization
    2.4. Escape interpretation
    2.5. Tokenization
B. Compact syntax RELAX NG schema for RELAX NG (Non-Normative)
References
1. Introduction
This specification describes a compact, non-XML syntax for [RELAX NG].
The goals of this syntax are to:
maximize readability;
support all features of RELAX NG; it must be possible
to translate a schema from the XML syntax to the compact syntax and
back without losing significant information;
support separate translation; a RELAX NG schema may be
spread amongst multiple files; it must be possible to represent each
of the files separately in the compact syntax; the representation of
each file must not depend on the other files.
The syntax has similarities to [XQuery Formal Semantics], to
[XDuce] and to the DTD syntax of [XML 1.0].
The body of this document contains an informal description of
the syntax and how it maps onto the XML syntax. Developers should
consult Appendix A for a complete, rigorous description.
Appendix B contains an example in the form of a schema for RELAX NG.
2. Syntax
The following is a summary of the syntax in EBNF. Square
brackets are used to indicate optionality. The reader may find it
helpful to compare this with the syntax in Section 3 of [RELAX NG].
The start symbol is topLevel.
topLevel ::= decl* (pattern | grammarContent*)
decl ::= "namespace" identifierOrKeyword "=" namespaceURILiteral
    | "default" "namespace" [identifierOrKeyword] "=" namespaceURILiteral
    | "datatypes" identifierOrKeyword "=" literal
pattern ::= "element" nameClass "{" pattern "}"
    | "attribute" nameClass "{" pattern "}"
    | pattern ("," pattern)+
    | pattern ("&" pattern)+
    | pattern ("|" pattern)+
    | pattern "?"
    | pattern "*"
    | pattern "+"
    | "list" "{" pattern "}"
    | "mixed" "{" pattern "}"
    | identifier
    | "parent" identifier
    | "empty"
    | "text"
    | [datatypeName] datatypeValue
    | datatypeName ["{" param* "}"] [exceptPattern]
    | "notAllowed"
    | "external" anyURILiteral [inherit]
    | "grammar" "{" grammarContent* "}"
    | "(" pattern ")"
param ::= identifierOrKeyword "=" literal
exceptPattern ::= "-" pattern
grammarContent ::= start
    | define
    | "div" "{" grammarContent* "}"
    | "include" anyURILiteral [inherit] ["{" includeContent* "}"]
includeContent ::= define
    | start
    | "div" "{" includeContent* "}"
start ::= "start" assignMethod pattern
define ::= identifier assignMethod pattern
assignMethod ::= "="
    | "|="
    | "&="
nameClass ::= name
    | nsName [exceptNameClass]
    | anyName [exceptNameClass]
    | nameClass "|" nameClass
    | "(" nameClass ")"
name ::= identifierOrKeyword
    | CName
exceptNameClass ::= "-" nameClass
datatypeName ::= CName
    | "string"
    | "token"
datatypeValue ::= literal
anyURILiteral ::= literal
namespaceURILiteral ::= literal
    | "inherit"
inherit ::= "inherit" "=" identifierOrKeyword
identifierOrKeyword ::= identifier
    | keyword
identifier ::= (NCName - keyword)
    | quotedIdentifier
quotedIdentifier ::= "\" NCName
CName ::= NCName ":" NCName
nsName ::= NCName ":*"
anyName ::= "*"
literal ::= literalSegment ("~" literalSegment)+
literalSegment ::= '"' (Char - ('"' | newline))* '"'
    | "'" (Char - ("'" | newline))* "'"
    | '"""' (['"'] ['"'] (Char - '"'))* '"""'
    | "'''" (["'"] ["'"] (Char - "'"))* "'''"
keyword ::= "attribute"
    | "default"
    | "datatypes"
    | "div"
    | "element"
    | "empty"
    | "external"
    | "grammar"
    | "include"
    | "inherit"
    | "list"
    | "mixed"
    | "namespace"
    | "notAllowed"
    | "parent"
    | "start"
    | "string"
    | "text"
    | "token"
NCName is defined in [XML Namespaces]. Char is defined in [XML 1.0].
In order to use a keyword as an identifier, it must be quoted
with \. It is not necessary to quote a keyword that
is used as the name of an element or attribute or as a datatype
parameter.
The value of a literal is the concatenation of the values of its
constituent literalSegments. A literalSegment is always terminated
only by an occurrence of the same delimiter that began it. The
delimiter used to begin a literalSegment may be either one or three
occurrences of a single or double quote character. Newlines are
allowed only in literalSegments delimited by three quote characters.
The value of a literal segment consists of the characters between its
delimiters. One way to get a literal whose value contains both a
single and a double quote is to divide the literal into multiple
literalSegments so that the single and double quote are in separate
literalSegments. Another way is to use a literalSegment delimited by
three single or double quotes.
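The rule that a literal's value is the concatenation of its segments can be sketched in Python. This is a minimal illustration, not part of the specification: the function name `literal_value` is hypothetical, segments are assumed to be joined with the `~` operator of the compact syntax, and `\x{...}` escapes are assumed to have been interpreted already.

```python
import re

# One literalSegment. The triple-quoted forms are tried first so that '"""'
# is not misread as an empty '""' segment followed by a stray quote.
_SEGMENT = re.compile(
    r'"""(?P<td>.*?)"""'
    r"|'''(?P<ts>.*?)'''"
    r'|"(?P<d>[^"\n]*)"'
    r"|'(?P<s>[^'\n]*)'",
    re.DOTALL,
)


def literal_value(source: str) -> str:
    """Return the value of a literal made of '~'-separated literalSegments."""
    pos, parts, expect_segment = 0, [], True
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        if expect_segment:
            m = _SEGMENT.match(source, pos)
            if m is None:
                raise ValueError(f"expected a literalSegment at offset {pos}")
            # Exactly one named group matched; it holds the segment's value.
            parts.append(next(g for g in m.groups() if g is not None))
            pos = m.end()
            expect_segment = False
        elif source[pos] == "~":
            pos += 1
            expect_segment = True
        else:
            raise ValueError(f"expected '~' at offset {pos}")
    if expect_segment:
        raise ValueError("literal ends where a literalSegment is required")
    return "".join(parts)


# A value containing both kinds of quote, built from two segments.
both_quotes = literal_value("'\"' ~ \"'\"")
```

Splitting the literal into one segment per quote style, as in the last line, is the first of the two techniques described above; a single triple-quoted segment would work equally well.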
Annotations can be specified as described in Section 5.
There is no notion of operator precedence. It is an error
for patterns to combine the "|", "," and "&" operators without using
parentheses to make the grouping explicit. For example,
foo | bar, baz is not allowed; instead, either (foo | bar), baz
or foo | (bar, baz) must be used. A similar
restriction applies to name classes and the use of the "|"
and "-" operators. These
restrictions are not expressed in the above EBNF but they are made
explicit in the BNF in Section 1 of Appendix A.
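The restriction on mixing pattern operators can be checked with a small scan. The sketch below is illustrative only: the function name `mixes_operators` is hypothetical, the input is assumed to be a single pattern (no `|=` or `&=` definitions), and comments, annotations and triple-quoted literals are not handled.

```python
def mixes_operators(pattern: str) -> bool:
    """Return True if '|', ',' and '&' are combined without explicit grouping."""
    stack = [set()]   # operators seen so far at each open nesting level
    quote = None      # the quote character of the literal being skipped
    for ch in pattern:
        if quote is not None:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch in "({":
            stack.append(set())
        elif ch in ")}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            stack.pop()
        elif ch in "|,&":
            stack[-1].add(ch)
            # Two different operators at the same level need parentheses.
            if len(stack[-1]) > 1:
                return True
    return False


ungrouped = mixes_operators("foo | bar, baz")
grouped_left = mixes_operators("(foo | bar), baz")
grouped_right = mixes_operators("foo | (bar, baz)")
```

Parentheses and braces both open a new level here, so the content of an element pattern is checked independently of the pattern that contains it.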
The value of an anyURILiteral specified with include or external is a URI
reference to a grammar in the compact syntax.
3. Lexical structure
Whitespace is allowed between tokens. Tokens are the strings
occurring in double quotes in the EBNF in Section 2, except
that literalSegment, nsName, CName, identifier and quotedIdentifier are
single tokens.
Comments are also allowed between tokens. Comments start with a
# and continue to the end of the line. Comments
starting with ## are treated specially; see Section 5.
A Unicode character with hex code N
can be represented by the escape sequence \x{N}. Using such an
escape sequence is completely equivalent to entering the
corresponding character directly. For example,
element \x{66}\x{6f}\x{6f} { empty }
is equivalent to
element foo { empty }
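As a rough sketch of escape interpretation, the Python function below replaces each `\x{N}` sequence with the character whose code is the hexadecimal number N. The name `interpret_escapes` is hypothetical, and only the simple form described in this section is covered; the complete rules are those given in Appendix A.

```python
import re

# Matches \x{N} where N is one or more hexadecimal digits.
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def interpret_escapes(text: str) -> str:
    """Replace every \\x{N} escape by the character with hex code N."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


expanded = interpret_escapes(r"element \x{66}\x{6f}\x{6f} { empty }")
```

Because the replacement happens on the raw text before tokenization, an escaped character behaves exactly as if it had been typed directly, which is why the escaped form above denotes the same schema as `element foo { empty }`.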
4. Declarations
A datatypes declaration declares a prefix
used in a QName identifying a datatype. For example,
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
element height { xsd:double }
In fact, in the above example, the datatypes declaration is not
required: the xsd prefix is predeclared to the above URI.
A namespace declaration declares a prefix used in a
QName specifying the name of an element or attribute. For
example,
namespace rng = "http://relaxng.org/ns/structure/1.0"
element rng:text { empty }
As in XML, the xml prefix is predeclared.
A default namespace declaration declares
the namespace used for unprefixed names specifying the name
of an element (but not of an attribute). For example,
default namespace = "http://example.com"
element foo { attribute bar { string } }
is equivalent to
namespace ex = "http://example.com"
element ex:foo { attribute bar { string } }
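The difference between element and attribute names can be sketched as a small resolution function. This is an illustration under stated assumptions rather than the specification's algorithm: `resolve_name` is a hypothetical helper, names are modelled as (namespace URI, local name) pairs, and the empty string stands for the absent namespace URI.

```python
from typing import Dict, Tuple


def resolve_name(qname: str, is_element: bool,
                 prefixes: Dict[str, str], default_ns: str) -> Tuple[str, str]:
    """Resolve a name to a (namespace URI, local name) pair."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        if prefix not in prefixes:
            raise LookupError(f"undeclared namespace prefix: {prefix}")
        return (prefixes[prefix], local)
    # Unprefixed names: the default namespace applies to elements only.
    return (default_ns if is_element else "", qname)


# default namespace = "http://example.com"
# element foo { attribute bar { string } }
element_name = resolve_name("foo", True, {}, "http://example.com")
attribute_name = resolve_name("bar", False, {}, "http://example.com")

# namespace ex = "http://example.com"
# element ex:foo { ... }
prefixed_name = resolve_name("ex:foo", True, {"ex": "http://example.com"}, "")
```

Both spellings of the element resolve to the same pair, while the unprefixed attribute stays in no namespace, which is what makes the two schemas above equivalent.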
A default namespace declaration may have a prefix
as well. For example,
default namespace ex = "http://example.com"
is equivalent to
default namespace = "http://example.com"
namespace ex = "http://example.com"
The URI may be empty. This makes the prefix stand for the absent
namespace URI. This is necessary for specifying a name class that
matches any name with an absent namespace URI. For example:
namespace local = ""
element foo { attribute * - local:* { string }* }
is equivalent to
ns="http://example.com">
RELAX NG has the feature that if a file does not specify an
ns attribute then the ns
attribute can be inherited from the including file. To support this
feature, the keyword inherit can be specified in
place of the namespace URI in a namespace declaration. For
example,
default namespace this = inherit
element foo { element * - this:* { string }* }
is equivalent to
In addition, the include and external patterns can specify
inherit = prefix to specify the namespace
to be inherited by the referenced file. For example,
namespace x = "http://www.example.com"
external "foo.rng" inherit = x
is equivalent to
xmlns="http://relaxng.org/ns/structure/1.0"/>
In the absence of an inherit parameter on include or external,
the default namespace will be inherited by the referenced file.
In the absence of a default namespace declaration, a declaration of
default namespace = inherit is assumed.
5. Annotations
The RELAX NG XML syntax allows foreign elements and attributes
to be used to annotate a RELAX NG schema. A schema in the compact
syntax can also have annotations, which will turn into foreign
elements and attributes when the schema is translated into XML syntax.
The way these annotations are specified depends on where the foreign
elements and attributes are to appear in the translated schema. There
is also a special shorthand syntax when the foreign element is a
documentation element as described in [Compatibility].
5.1. Initial annotations
An annotation in square brackets can be inserted immediately
before a pattern, param, nameClass, grammarContent or includeContent. It has
the following syntax:
annotation ::= "[" annotationAttribute* annotationElement* "]"
annotationAttribute ::= name "=" literal
annotationElement ::= name "[" annotationAttribute* (annotationElement | literal)* "]"
Each of the annotationAttributes will turn into attributes on
the corresponding RELAX NG element. Each of the annotationElements
will turn into initial children of the corresponding RELAX NG element,
except in the case where the RELAX NG element cannot have children, in
which case they will turn into following elements.
5.2. Documentation shorthand
Comments starting with ## are used to specify
documentation elements from the
http://relaxng.org/ns/compatibility/annotations/1.0
namespace as described in [Compatibility]. For example,
## Represents a language
element lang {
  ## English
  "en" |
  ## Japanese
  "jp"
}
turns into
xmlns="http://relaxng.org/ns/structure/1.0">
## comments can only be used immediately
before a pattern, nameClass, grammarContent or includeContent.
Multiple ## comments are allowed. Multiple adjacent
## comments without any intervening blank lines are
merged into a single documentation element. Any
## comments must precede any annotation in square brackets.
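The merging of adjacent ## comments can be sketched as a grouping pass over source lines. This is only an illustration: `documentation_blocks` is a hypothetical helper, and the way text is extracted from each comment (dropping the leading `##` and one following space, then joining lines with newlines) is an assumption of this sketch; the formal description governs the exact content.

```python
from typing import Iterable, List


def documentation_blocks(lines: Iterable[str]) -> List[str]:
    """Group adjacent '##' comment lines into one string per block."""
    blocks: List[str] = []
    current: List[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("##"):
            text = stripped[2:]
            # Assumption: a single space after '##' is not part of the text.
            if text.startswith(" "):
                text = text[1:]
            current.append(text)
        elif current:
            # A blank line or any other line ends the current block.
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks


blocks = documentation_blocks([
    "## Represents",
    "## a language",
    "",
    "## Second block",
    "element lang { text }",
])
```

Each string in the result would correspond to one documentation element; the blank line between the second and third comment is what keeps them in separate elements.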
5.3. Following annotations
A pattern or nameClass may be followed by any number of
followAnnotations with the following syntax:
followAnnotation ::= ">>" annotationElement
Each such annotationElement turns into a following sibling of
the RELAX NG element representing the pattern or nameClass.
5.4. Grammar annotations
An annotationElement may be used in any place where
grammarContent or includeContent is allowed. For example,
namespace x = "http://www.example.com"
start = foo
x:entity [ name="picture" systemId="picture.jpeg" notation="jpeg" ]
foo = element foo { empty }
turns into
If the name of such an element is a keyword, then it must be
quoted with \.
6. Conformance
There are three kinds of conformant implementation.
6.1. Validator
A validator conforming to this specification must be able to
determine whether a textual object is a correct RELAX NG Compact
Syntax schema as specified in Appendix A. It must also
be able to determine for any XML document and for any correct RELAX NG
Compact Syntax schema whether the document is valid (as defined in
[RELAX NG]) with respect to the translation of the schema
into XML syntax. It need not be able to output a representation of
the translation of the schema into XML syntax.
The requirements in the preceding paragraph are subject to the
provisions of the second paragraph of Section 8 of [RELAX NG].
6.2. Structure preserving translator
A structure preserving translator must be able to translate any
correct RELAX NG Compact Syntax schema into an XML document whose data
model is strictly equivalent to the translation specified in
Appendix A. For this purpose, two instances of the data model
(as specified in Section 2 of [RELAX NG]) are considered
strictly equivalent if they are identical after applying the
simplifications specified in Sections 4.2, 4.3, 4.4, 4.8, 4.9 and 4.10
of [RELAX NG], with the exception that the base URI in the
context of elements may differ.
Note
The RELAX NG compact syntax is not a representation of the
XML syntax of a RELAX NG schema; rather it is a representation of the
semantics of a RELAX NG schema. Details of the XML syntax that were
judged to be insignificant are not captured in the compact syntax.
For example, in the XML syntax if the name class for an
element or attribute pattern
consists of just a single name, it can be expressed either as a
name attribute or as a name
element; however, in the compact syntax, there is only one way to
express such a name class. The simplifications listed in the previous
paragraph correspond to those syntactic details that are not captured
in the compact syntax.
When comparing two include or externalRef
patterns in the XML source for strict
equivalence, the values of the href attributes are
not compared; instead the referenced XML documents are compared for
strict equivalence.
6.3. Non-structure preserving translator
A non-structure preserving translator must be able to translate
any correct RELAX NG Compact Syntax schema into an XML document whose
data model is loosely equivalent to the translation specified in
Appendix A. For this purpose, two instances of the data model
(as specified in Section 2 of [RELAX NG]) are considered
loosely equivalent if they are such that, after applying all the
simplifications specified in Section 4 of [RELAX NG], one
can be transformed into the other merely by reordering and renaming
definitions. After the simplifications have been applied, the context
of elements is ignored when comparing the two instances.
Note
A validator for the compact syntax can be implemented as a
combination of a non-structure preserving translator for the compact
syntax and a validator for the XML syntax.
A. Formal description
1. Syntax
The compact syntax is specified by a grammar in BNF. The
translation into the XML syntax is specified by annotations in the
grammar.
The start symbol is topLevel.
The BNF description consists of a set of production rules. Each
production rule has a left-hand side and right-hand side separated by
::=. The left-hand side specifies the name of a
non-terminal. The right-hand side specifies a list of one or more
alternatives separated by |. Each alternative
consists of a sequence of terminals and non-terminals. A non-terminal
is specified by a name in italics. A terminal is either a literal
string in quotes or a named terminal specified by a name in bold
italics. An alternative can also be specified as ε, which
denotes an empty sequence of tokens.
Each alternative may be followed by references to one or more
named constraints that apply to that alternative.
The translation into XML syntax is specified by associating a
value with each terminal and non-terminal in the derivation. Each
alternative in the BNF may be followed by an expression in curly
braces, which specifies how to compute the value associated with the
left-hand side non-terminal. Each terminal and non-terminal on the
right-hand side can be labelled with a subscript specifying a variable
name. When that variable name is used within the curly braces, it
refers to the value associated with that terminal or non-terminal. If
an alternative consists of a single terminal or non-terminal, then the
expression in curly braces can be omitted; in this case the value of
the left-hand side is the value of that terminal or
non-terminal.
The result of the translation is not a string containing the XML
representation of a RELAX NG schema, but rather is an instance of the
data model described in Section 2 of [RELAX NG]; this
instance will match the RELAX NG schema for RELAX NG.
A textual object is a correct RELAX NG Compact Syntax schema
if:
it matches the grammar specified in this section,
it satisfies all the constraints specified in this section, and
the result of the translation is a correct RELAX NG schema.
The computation of the value of a non-terminal may make use of
one or more arguments. When the name of such a non-terminal occurs on
the left-hand side of a production, it is followed by an argument list
that declares the formal arguments for the non-terminal; these formal
arguments may be referred to by expressions on the right-hand side,
as, for example, in simpleNameClass. When the name
occurs on the right-hand side of a production, it may be followed by
one or more assignments that specify the actual arguments which will
be bound to the formal arguments during the computation of the value
of the non-terminal. Arguments may be passed down implicitly: if
there is no actual argument corresponding to a particular formal
argument, then the formal argument is bound to the value of the
variable with the same name as the name of the formal argument. In
other words, for any variable x, a default actual
argument of x := x is assumed. For example, see nameClassChoice.
In addition to explicit arguments, every non-terminal implicitly
has an argument that specifies an environment for the interpretation
of a pattern. By default, the implicit environment argument to each
non-terminal is the same as its parent. This may be overridden for a
particular non-terminal by including environment in the
argument list. For example, see topLevel and preamble.
An environment specifies:
a mapping from datatype prefixes to URIs;
a mapping from namespace prefixes to URIs; a namespace
prefix may be mapped to a special value inherit as well as to a URI;
the default namespace; the default namespace is either
a URI or the special value inherit;
the base URI.
The special value inherit is used to indicate that a
namespace URI should be inherited from the referencing schema.
In the initial environment used for the start symbol,
xml is bound as a namespace prefix to
http://www.w3.org/XML/1998/namespace, and
xsd is bound as a datatype prefix to
http://www.w3.org/2001/XMLSchema-datatypes; the
base URI is determined as specified by [RFC 2396].
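An environment of this kind can be modelled as an immutable record whose binding operations return modified copies. The Python sketch below is illustrative only: the class and function names are hypothetical renderings of the bindPrefix, bindDefault, bindDatatypePrefix, lookupPrefix, lookupDefault and lookupDatatypePrefix functions used in this appendix, and representing the special value inherit by a sentinel object is an assumption of the sketch.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


class _Inherit:
    """Sentinel type for the special value inherit."""

    def __repr__(self) -> str:
        return "inherit"


INHERIT = _Inherit()


@dataclass(frozen=True)
class Environment:
    namespaces: Dict[str, object] = field(default_factory=dict)
    datatypes: Dict[str, str] = field(default_factory=dict)
    default_ns: Optional[object] = None   # None: no default has been bound
    base_uri: str = ""


def initial_environment(base_uri: str = "") -> Environment:
    """The environment used for the start symbol."""
    return Environment(
        namespaces={"xml": "http://www.w3.org/XML/1998/namespace"},
        datatypes={"xsd": "http://www.w3.org/2001/XMLSchema-datatypes"},
        base_uri=base_uri,
    )


def bind_prefix(env: Environment, prefix: str, uri: object) -> Environment:
    return replace(env, namespaces={**env.namespaces, prefix: uri})


def bind_default(env: Environment, uri: object) -> Environment:
    return replace(env, default_ns=uri)


def bind_datatype_prefix(env: Environment, prefix: str, uri: str) -> Environment:
    return replace(env, datatypes={**env.datatypes, prefix: uri})


def lookup_prefix(env: Environment, prefix: str) -> object:
    if prefix not in env.namespaces:
        raise LookupError(f"no binding for namespace prefix {prefix!r}")
    return env.namespaces[prefix]


def lookup_default(env: Environment) -> object:
    # An unbound default namespace behaves as inherit.
    return INHERIT if env.default_ns is None else env.default_ns


def lookup_datatype_prefix(env: Environment, prefix: str) -> str:
    if prefix not in env.datatypes:
        raise LookupError(f"no binding for datatype prefix {prefix!r}")
    return env.datatypes[prefix]


# default namespace ex = "http://example.com"
start_env = initial_environment()
declared = bind_default(
    bind_prefix(start_env, "ex", "http://example.com"), "http://example.com"
)
```

Returning a new environment from each binding function mirrors the way the grammar threads the environment from one declaration to the next, and leaves the initial environment untouched.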
The value of an expression is one of the following:
the constants true, false or
inherit;
a string;
a name (a namespace URI/local name pair);
a qualified-name (a prefix/local name pair);
an XML fragment, where an XML fragment is a pair of a
set of zero or more attributes and a content sequence of zero or more
strings and elements, as described in the data model of [
RELAX NG
]; an XML fragment is thus the same kind of thing as
what is matched against a RELAX NG pattern;
an environment.
Each terminal and non-terminal has an associated type identified
by a name. A type is simply a set of values. The value of a terminal
or non-terminal is always a member of the set of values identified by
the name of its type. The name of the type of a terminal or
non-terminal is given following the keyword returns before ::= in the
production rule. Similarly, each argument has a type, which is given
immediately before the name of the argument. The value of a
non-terminal may also be specified to be void; no expression is given
for the value of such a non-terminal, nor will the value of such a
non-terminal be used in any expression.
The following types are all disjoint:
Boolean contains true and false;
Inherit contains inherit;
String contains all strings;
Name contains all names;
Qname contains all qualified-names;
Environment contains all environments;
Xml contains all XML fragments.
It is also useful to identify some subtypes of Xml. One type is
a subtype of another if the set of values of the one type is a subset
of the set of values of the other.
Content contains all XML fragments that have an empty
set of attributes;
Elements contains all XML fragments that have an empty
set of attributes and whose content sequence does not have any string
members; it is a subtype of Content;
Element contains all XML fragments that have an empty
set of attributes and whose content sequence consists of a single
element; it is a subtype of Elements;
Attributes contains all XML fragments that have an
empty content sequence;
Attribute contains all XML fragments that have an
empty content sequence and whose attribute set consists of a single
attribute.
In addition it is useful to have the following union type.
NamespaceURI is the union of String and
Inherit.
Expressions use the following notation:
x denotes the value of the variable named x;
( ) denotes an empty XML fragment;
(x, y) denotes the concatenation of the XML fragments
x and y; the attributes of the
resulting XML fragment consist of the union of the attributes of
x and y, and the content sequence
consists of the concatenation of the content sequences of
x and y (this is the same as the meaning of
the comma operator in the compact syntax);
environment denotes the value of the implicit
environment argument;
true, false and inherit are used
to denote the corresponding special constants;
"xyzzy" denotes a string
consisting of the characters xyzzy;
f(x, . . . ) denotes the
value of the function f applied to the arguments
x, . . . ; the available functions are as
follows:
qName
(String
, String
returns
Qname
returns a
qualified-name with prefix
and local part
prefix
(Qname
returns
String
returns the prefix of the qualified-name
localPart
(Qname
returns
String
returns the local-part of the qualified-name
name
(String
, String
returns
Name
returns a name
with namespace URI
and local name
attribute
(Name
, String
returns
Attribute
returns an XML
fragment consisting of an attribute with name
and
value
element
(Environment
, Name
, Xml
returns
Element
returns an XML fragment consisting of an element with
name
and attributes and children
; the namespace map of the context of the element contains
all the mappings from namespace prefixes to URIs from
except those mappings that map to
inherit
or to the
empty string; the namespace map may contain an additional mapping from
an implementation-dependent prefix to the compatibility annotations
URI; the default namespace of namespace map of the context of the
element is the RELAX NG namespace URI
; the base URI
of the context of the element is not
constrained;
text
(String
returns
Content
returns an XML fragment whose content sequence consists of
if
is not the empty string,
and otherwise the empty XML fragment;
bindPrefix
(Environment
, String
, NamespaceURI
returns
Environment
returns an environment that is the same as
except that it has the prefix
bound to
bindDefault
(Environment
, NamespaceURI
returns
Environment
returns an environment that is the same as
except it has the default namespace
bindDatatypePrefix
(Environment
, String
, String
returns
Environment
returns an environment that is the same
as
except that it has
bound as
a prefix for datatypes to the URI
lookupPrefix
(Environment
, String
returns
String
returns the
binding in the environment
for the prefix
; it is an error if there is no applicable
binding;
lookupDefault
(Environment
returns
NamespaceURI
returns the default namespace of the
environment
, or, if no default has been bound,
inherit
lookupDatatypePrefix
(Environment
, String
returns
String
returns the
binding as a datatype prefix in the environment
for the prefix
; it is an error if there is no
applicable binding;
mapSchemaRef
(Environment
, String
returns
String
returns a
URI;
is a URI reference of a resource containing a
schema in the syntax described by this specification; the returned URI
is the URI of a resource containing the translation of this schema
into RELAX NG XML syntax;
is resolved into an
absolute form as described in section 5.2 of [
RFC 2396
using the base URI from the environment
; the
restriction on the use of fragment identifiers specified in section
4.5 of [
RELAX NG
] applies to
makeNsAttribute
(NamespaceURI
returns
Attributes
returns an empty set if
is
inherit
, and otherwise an attribute whose namespace
URI is the empty string, whose local name is
ns
and whose value is
applyAnnotations
(Xml
, Element
returns
Element
returns an element
whose name is the name of
, whose attributes are
the union of the first member of
and the
attributes of
, and whose children are the
concatenation of the second member of
and the
children of
applyAnnotationsGroup
(Xml
, Elements
returns
Elements
returns the same as
applyAnnotations
) unless
is equal to ( ), in which case it
returns
applyAnnotationsChoice
(Xml
, Elements
returns
Elements
returns the same as
applyAnnotations
) unless
is equal to ( ), in which case it
returns
stringConcat
(String
, String
returns
String
returns a string that is the
concatenation of the strings
and
datatypeAttributes
(String
, String
returns
Attributes
returns a
set of two attributes; both attributes have the empty string as their
namespace URI; one attribute has local name
datatypeLibrary
and value
; the
other attribute has local name
type
and value
documentationElementName
()
returns
Name
returns the name of the
documentation
element
defined in [
Compatibility
], that is, the name with namespace
URI
and
local name
documentation
is a conditional
expression, which denotes
if
is
true
and
if
is
false
denotes
an XML fragment containing an element from the RELAX NG namespace with
local name
foo
, attributes
and
content
; the context of the element is determined
from the implicit environment argument as specified for the element
function above.
topLevel
returns
Element ::=
preamble
topLevelBody
environment
:=
preamble
returns
Environment ::=
environment
decl
preamble
environment
:=
decl
returns
Environment ::=
namespace
namespacePrefix
namespaceURILiteral
Constraint:
xml prefix
Constraint:
xml namespace URI
Constraint:
duplicate declaration
bindPrefix
environment
) }
| "
default
" "
namespace
" "
namespaceURILiteral
Constraint:
xml namespace URI
Constraint:
duplicate declaration
bindDefault
environment
) }
| "
default
" "
namespace
namespacePrefix
namespaceURILiteral
Constraint:
xml prefix
Constraint:
xml namespace URI
Constraint:
duplicate declaration
bindDefault
bindPrefix
environment
),
) }
| "
datatypes
datatypePrefix
literal
Constraint:
xsd prefix
Constraint:
datatypes URI
Constraint:
duplicate declaration
bindDatatypePrefix
environment
) }
namespacePrefix
returns
String ::=
identifierOrKeyword
Constraint:
valid prefix
datatypePrefix
returns
String ::=
identifierOrKeyword
namespaceURILiteral
returns
NamespaceURI ::=
literal
| "
inherit
inherit
topLevelBody
returns
Element ::=
pattern
Constraint:
single element
grammar
grammar
returns
Elements ::=
{ ( ) }
member
grammar
{ (
) }
member
returns
Element ::=
annotatedComponent
annotationElementNotKeyword
annotatedComponent
returns
Element ::=
annotations
component
applyAnnotations
) }
component
returns
Element ::=
start
define
include
div
start
returns
Element ::=
start
assignOp
pattern
define
returns
Element ::=
identifier
assignOp
pattern
assignOp
returns
Attributes ::=
{ ( ) }
| "
|=
attribute
name
("
", "
combine
"), "
choice
") }
| "
&=
attribute
name
("
", "
combine
"), "
interleave
") }
include
returns
Element ::=
include
anyURILiteral
optInherit
optIncludeBody
mapSchemaRef
environment
anyURILiteral
returns
String ::=
literal
Constraint:
any URI
optInherit
returns
Attributes ::=
makeNsAttribute
lookupDefault
environment
)) }
| "
inherit
" "
identifierOrKeyword
makeNsAttribute
lookupPrefix
environment
)) }
optIncludeBody
returns
Elements ::=
{ ( ) }
| "
includeBody
includeBody
returns
Elements ::=
{ ( ) }
includeMember
includeBody
{ (
) }
includeMember
returns
Element ::=
annotatedIncludeComponent
annotationElementNotKeyword
annotatedIncludeComponent
returns
Element ::=
annotations
includeComponent
applyAnnotations
) }
includeComponent
returns
Element ::=
start
define
includeDiv
div
returns
Element ::=
div
" "
grammar
includeDiv
returns
Element ::=
div
" "
includeBody
pattern
returns
Elements ::=
innerPattern
anno
:= ( ))
innerPattern
(Xml
anno
returns
Elements ::=
innerParticle
particleChoice
applyAnnotations
anno
) }
particleGroup
applyAnnotations
anno
) }
particleInterleave
applyAnnotations
anno
) }
annotatedDataExcept
applyAnnotationsGroup
anno
) }
particleChoice
returns
Elements ::=
particle
particle
{ (
) }
particle
particleChoice
{ (
) }
particleGroup
returns
Elements ::=
particle
particle
{ (
) }
particle
particleGroup
{ (
) }
particleInterleave
returns
Elements ::=
particle
particle
{ (
) }
particle
particleInterleave
{ (
) }
particle
returns
Elements ::=
innerParticle
anno
:= ( ))
innerParticle
(Xml
anno
returns
Elements ::=
annotatedPrimary
applyAnnotationsGroup
anno
) }
repeatedPrimary
followAnnotations
{ (
applyAnnotations
anno
),
) }
repeatedPrimary
returns
Element ::=
annotatedPrimary
annotatedPrimary
annotatedPrimary
annotatedPrimary
returns
Elements ::=
leadAnnotatedPrimary
followAnnotations
{ (
) }
annotatedDataExcept
returns
Elements ::=
leadAnnotatedDataExcept
followAnnotations
{ (
) }
leadAnnotatedDataExcept
returns
Element ::=
annotations
dataExcept
applyAnnotations
) }
leadAnnotatedPrimary
returns
Elements ::=
annotations
primary
applyAnnotations
) }
annotations
innerPattern
anno
:=
primary
returns
Element ::=
element
nameClass
isElem
:=
true
pattern
| "
attribute
nameClass
isElem
:=
false
pattern
| "
mixed
" "
pattern
| "
list
" "
pattern
datatypeName
optParams
datatypeName
datatypeValue
datatypeValue
| "
empty
| "
notAllowed
| "
empty
ref
name=
/>
| "
parent
ref
/>
| "
grammar
" "
grammar
| "
external
anyURILiteral
optInherit
mapSchemaRef
/>
dataExcept
returns
Element ::=
datatypeName
optParams
leadAnnotatedPrimary
ref
returns
String ::=
identifier
datatypeName
returns
Attributes ::=
CName
datatypeAttributes
lookupDatatypePrefix
environment
prefix
)),
localPart
)) }
| "
string
datatypeAttributes
("
", "
string
") }
| "
token
datatypeAttributes
("
", "
token
") }
datatypeValue
returns
String ::=
literal
optParams
returns
Elements ::=
{ ( ) }
| "
params
params
returns
Elements ::=
{ ( ) }
param
params
{ (
) }
param
returns
Element ::=
annotations
identifierOrKeyword
literal
applyAnnotations
name=
) }
nameClass
(Boolean
isElem
returns
Elements ::=
innerNameClass
anno
:= ( ))
innerNameClass
(Boolean
isElem
, Xml
anno
returns
Elements ::=
annotatedSimpleNameClass
applyAnnotationsChoice
anno
) }
nameClassChoice
applyAnnotations
anno
) }
annotatedExceptNameClass
applyAnnotationsChoice
anno
) }
nameClassChoice
(Boolean
isElem
returns
Elements ::=
annotatedSimpleNameClass
annotatedSimpleNameClass
{ (
) }
annotatedSimpleNameClass
nameClassChoice
{ (
) }
annotatedExceptNameClass
(Boolean
isElem
returns
Elements ::=
leadAnnotatedExceptNameClass
followAnnotations
{ (
) }
leadAnnotatedExceptNameClass
(Boolean
isElem
returns
Element ::=
annotations
exceptNameClass
applyAnnotations
) }
annotatedSimpleNameClass
(Boolean
isElem
returns
Elements ::=
leadAnnotatedSimpleNameClass
followAnnotations
{ (
) }
leadAnnotatedSimpleNameClass
(Boolean
isElem
returns
Elements ::=
annotations
simpleNameClass
applyAnnotations
) }
annotations
innerNameClass
anno
:=
exceptNameClass
(Boolean
isElem
returns
Element ::=
nsName
leadAnnotatedSimpleNameClass
lookupPrefix
environment
))
| "
" "
leadAnnotatedSimpleNameClass
simpleNameClass
(Boolean
isElem
returns
Element ::=
identifierOrKeyword
isElem
lookupDefault
environment
) : "
")
CName
lookupPrefix
environment
prefix
)))
localPart
nsName
lookupPrefix
environment
))
/>
| "
followAnnotations
returns
Elements ::=
{ ( ) }
| "
>>
annotationElement
followAnnotations
{ (
) }
annotations
returns
Xml ::=
documentations
documentations
annotationAttributes
annotationElements
{ (
, (
)) }
annotationAttributes
returns
Attributes ::=
{ ( ) }
foreignAttributeName
literal
annotationAttributes
Constraint:
duplicate attributes
{ (
attribute
),
) }
foreignAttributeName
returns
Name ::=
prefixedName
Constraint:
xmlns namespace URI
Constraint:
unqualified name
Constraint:
RELAX NG namespace URI
annotationElements
returns
Elements ::=
{ ( ) }
annotationElement
annotationElements
{ (
) }
annotationElement
returns
Element ::=
foreignElementName
annotationAttributesContent
element
environment
) }
foreignElementName
returns
Name ::=
identifierOrKeyword
name
("
",
) }
prefixedName
Constraint:
RELAX NG namespace URI
annotationElementNotKeyword
returns
Element ::=
foreignElementNameNotKeyword
annotationAttributesContent
element
environment
) }
foreignElementNameNotKeyword
returns
Name ::=
identifier
name
("
",
) }
prefixedName
Constraint:
RELAX NG namespace URI
annotationAttributesContent
returns
Xml ::=
nestedAnnotationAttributes
annotationContent
{ (
) }
nestedAnnotationAttributes
returns
Attributes ::=
{ ( ) }
anyAttributeName
literal
nestedAnnotationAttributes
Constraint:
duplicate attributes
{ (
attribute
),
) }
anyAttributeName
returns
Name ::=
identifierOrKeyword
name
("
",
) }
prefixedName
Constraint:
xmlns namespace URI
annotationContent
returns
Content ::=
{ ( ) }
nestedAnnotationElement
annotationContent
{ (
) }
literal
annotationContent
{ (
text
),
) }
nestedAnnotationElement
returns
Element ::=
anyElementName
annotationAttributesContent
element
environment
) }
anyElementName
returns
Name ::=
identifierOrKeyword
name
("
",
) }
prefixedName
prefixedName
returns
Name ::=
CName
Constraint:
annotation inherit
name
lookupPrefix
environment
prefix
)),
localPart
)) }
documentations
returns
Elements ::=
{ ( ) }
documentation
documentations
{ (
element
environment
documentationElementName
(),
text
)),
) }
identifierOrKeyword
returns
String ::=
identifier
keyword
keyword
returns
String ::=
attribute
| "
default
| "
datatypes
| "
div
| "
element
| "
empty
| "
external
| "
grammar
| "
include
| "
inherit
| "
list
| "
mixed
| "
namespace
| "
notAllowed
| "
parent
| "
start
| "
string
| "
text
| "
token
literal
returns
String ::=
literalSegment
literalSegment
literal
stringConcat
) }
Constraint: valid prefix
It is an error if the value of a namespacePrefix is xmlns.
Constraint: xml prefix
It is an error if the value of namespacePrefix is xml and the value
of the namespaceURILiteral is not
http://www.w3.org/XML/1998/namespace.
Constraint: xml namespace URI
It is an error if the value of the namespaceURILiteral is
http://www.w3.org/XML/1998/namespace and the value of the
namespacePrefix is not xml.
Constraint: xsd prefix
It is an error if the value of datatypePrefix is xsd and the value
of the literal is not http://www.w3.org/2001/XMLSchema-datatypes.
Constraint: datatypes URI
It is an error if the value of the literal in a datatypes
declaration is not a syntactically legal value for a datatypeLibrary
as specified in Section 3 of [RELAX NG].
Constraint:
duplicate declaration
It is an error if there
is more than one namespace declaration of a particular prefix, more
than one default namespace declaration or more than one declaration of
a particular datatypes prefix.
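The declaration constraints above (valid prefix, xml prefix, xml namespace URI, xsd prefix and duplicate declaration) can be illustrated with a minimal sketch. The function name check_declarations and the (kind, prefix, uri) tuple model are hypothetical and are not part of this specification; the XML namespace URI constant is assumed to be the standard one. A default namespace declaration without a prefix is modelled with prefix None, and the sketch assumes the prefix-related checks do not apply to it.

```python
# Hypothetical model: each declaration is (kind, prefix, uri), where kind is
# "namespace", "default" (default namespace, prefix may be None) or "datatypes".
XML_NS = "http://www.w3.org/XML/1998/namespace"  # assumed: standard XML namespace URI
XSD_LIB = "http://www.w3.org/2001/XMLSchema-datatypes"


def check_declarations(decls):
    """Return a list of constraint names violated by the declarations."""
    errors = []
    seen_ns = set()   # prefixes bound by namespace declarations
    seen_dt = set()   # prefixes bound by datatypes declarations
    default_count = 0
    for kind, prefix, uri in decls:
        if kind in ("namespace", "default"):
            if kind == "default":
                default_count += 1
                if default_count > 1:
                    errors.append("duplicate declaration: default namespace")
            if prefix is None:
                continue  # assumption: prefix checks need a prefix
            if prefix == "xmlns":
                errors.append("valid prefix")
            if prefix == "xml" and uri != XML_NS:
                errors.append("xml prefix")
            if uri == XML_NS and prefix != "xml":
                errors.append("xml namespace URI")
            if prefix in seen_ns:
                errors.append(f"duplicate declaration: namespace {prefix}")
            seen_ns.add(prefix)
        elif kind == "datatypes":
            if prefix == "xsd" and uri != XSD_LIB:
                errors.append("xsd prefix")
            if prefix in seen_dt:
                errors.append(f"duplicate declaration: datatypes {prefix}")
            seen_dt.add(prefix)
    return errors
```

Applied to the three declarations at the top of the schema in Appendix B, this sketch reports no errors.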
Constraint:
single element
It is an error if a top-level
pattern translates to a sequence of more than one element (which can
happen as the result of the use of annotations).
Constraint:
unqualified name
It is an error if the namespace URI of a
prefixedName
in a
foreignAttributeName
is the empty string.
Constraint: RELAX NG namespace URI
It is an error if the namespace URI of a prefixedName in a
foreignElementName, a foreignElementNameNotKeyword or a
foreignAttributeName is the RELAX NG namespace URI.
Constraint: xmlns namespace URI
It is an error if the namespace URI of a prefixedName in a
foreignAttributeName or an anyAttributeName is
http://www.w3.org/2000/xmlns.
Constraint:
duplicate attributes
It is an error if a
nestedAnnotationAttributes
or an
annotationAttributes
contains two attributes with the same
namespace URI and local name.
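As an illustration only, the attribute-related constraints (unqualified name, RELAX NG namespace URI and duplicate attributes) might be checked as in the following sketch. The function name check_annotation_attributes, the foreign flag and the ((namespace URI, local name), value) representation are assumptions made for this example; the xmlns namespace URI check is left out.

```python
RNG_NS = "http://relaxng.org/ns/structure/1.0"  # RELAX NG namespace URI


def check_annotation_attributes(attrs, foreign=True):
    """attrs is a list of ((namespace_uri, local_name), value) pairs.

    foreign=True models annotationAttributes (names are foreignAttributeName);
    foreign=False models nestedAnnotationAttributes (names are anyAttributeName),
    where only the duplicate check of this sketch applies.
    """
    errors = []
    seen = set()
    for (ns, local), _value in attrs:
        if foreign and ns == "":
            errors.append(f"unqualified name: {local}")
        if foreign and ns == RNG_NS:
            errors.append(f"RELAX NG namespace URI: {local}")
        if (ns, local) in seen:
            # Same namespace URI and local name seen before.
            errors.append(f"duplicate attributes: {{{ns}}}{local}")
        seen.add((ns, local))
    return errors
```

Duplicates are detected on the expanded name, so two attributes written with different prefixes bound to the same namespace URI still count as duplicates.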
Constraint: annotation inherit
It is an error if the namespace URI in the value of a prefixedName
is inherit.
Constraint: any URI
It is an error if the value of the literal used with external or
include does not meet the requirements for the anyURI symbol
specified in Section 3 of [RELAX NG].
2. Lexical structure
This section describes how to transform the textual
representation of a RELAX NG schema in compact syntax into a sequence
of tokens, which can be parsed using the grammar specified in
Section 1.
There are five distinct stages, which are logically consecutive;
the result of each stage is the input to the following stage.
2.1. Character encoding
The textual representation of the RELAX NG schema in compact
syntax may be either a sequence of Unicode characters or a sequence of
bytes. In the latter case, the first stage is to transform the
sequence of bytes to the sequence of characters. The sequence of
bytes may have associated metadata specifying the encoding. One
example of such metadata is the charset parameter in a MIME media
type [RFC 2046]. If there is such metadata, then the specified
encoding is used. Otherwise, the first two bytes of the sequence are
examined. If these are #xFF followed by #xFE or #xFE followed by
#xFF, then an encoding of UTF-16 [Unicode] is used, little-endian in
the former case, big-endian in the latter case. Otherwise an encoding
of UTF-8 [Unicode] is used. It is an error if the sequence of bytes
is not a legal sequence in the selected encoding.
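The encoding selection described above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name decode_schema and its charset parameter (standing in for whatever encoding metadata accompanies the bytes) are assumptions of this example.

```python
from typing import Optional


def decode_schema(data: bytes, charset: Optional[str] = None) -> str:
    """Turn a byte sequence into a character sequence (first stage)."""
    if charset is not None:
        encoding = charset            # metadata wins when present
    elif data[:2] == b"\xff\xfe":
        encoding = "utf-16-le"        # #xFF #xFE: little-endian UTF-16
    elif data[:2] == b"\xfe\xff":
        encoding = "utf-16-be"        # #xFE #xFF: big-endian UTF-16
    else:
        encoding = "utf-8"
    # Strict decoding raises UnicodeDecodeError for byte sequences that are
    # not legal in the selected encoding, which corresponds to the error case.
    return data.decode(encoding)
```

The explicit utf-16-le and utf-16-be codecs are used so that a leading byte order mark survives decoding as the character #xFEFF, leaving its removal to the BOM stripping stage.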
2.2. BOM stripping
If the first character of the sequence is a byte order mark
(#xFEFF), then it is removed.
2.3. Newline normalization
Representations of newlines are normalized to a newline
marker. Specifically, each occurrence of
a #xA character,
a #xD character that is not followed by a #xA character, or
a #xD, #xA character pair
is transformed to a newline marker. The result of this stage is
thus a sequence whose members are Unicode characters and newline
markers.
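A minimal sketch of BOM stripping and newline normalization follows. Representing the newline marker as a distinct sentinel object, and the names NEWLINE, strip_bom and normalize_newlines, are choices made for this example only; the point of the sentinel is that later stages can tell a newline marker apart from a literal #xA or #xD character.

```python
class _NewlineMarker:
    """Sentinel type for the newline marker (hypothetical representation)."""

    def __repr__(self):
        return "NEWLINE"


NEWLINE = _NewlineMarker()


def strip_bom(text):
    """Remove a leading byte order mark (#xFEFF), if any."""
    return text[1:] if text.startswith("\ufeff") else text


def normalize_newlines(text):
    """Return a list whose members are characters and NEWLINE markers."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\r":
            out.append(NEWLINE)
            # A #xD, #xA pair becomes a single marker.
            if i + 1 < len(text) and text[i + 1] == "\n":
                i += 1
        elif ch == "\n":
            out.append(NEWLINE)
        else:
            out.append(ch)
        i += 1
    return out
```

For example, normalize_newlines(strip_bom(text)) applies the two stages in order to an already decoded character sequence.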
2.4. Escape interpretation
In this stage, each escape sequence of the form \x{n}, where n is a
hexadecimal number, is replaced by the character with Unicode code n.
The escape sequence must match the production escapeSequence; the
value computed in the BNF is the Unicode code of the replacement
character. It is an error if the replacement character does not match
the Char production of [XML 1.0]. It is an error if the input
character sequence contains a character sequence matching escapeOpen
that does not start an escapeSequence. After an escape sequence has
been replaced, scanning for escape sequences continues following the
replacement character; thus \x{5C}x{5C} is transformed to \x{5C},
not to \. The replacement for \x{A} or \x{D} is a character, as for
all other escape sequences, not a newline marker. Thus the sequence
that results from this stage can contain #xA and #xD characters as
well as newline markers.
Note
The \ character that opens an escape sequence may be followed by
more than one x. This makes it possible for there to be a reversible
transformation that maps a schema to a form containing only ASCII
characters; the transformation adds an extra x to each existing
escape sequence, and replaces every non-ASCII character by an escape
sequence with exactly one x.
escapeSequence returns Number ::= escapeOpen hexNumber escapeClose
escapeOpen returns Void ::= "\" xs "{"
xs returns Void ::= "x" | "x" xs
escapeClose returns Void ::= "}"
hexNumber returns Number ::= hexDigit
| hexNumber:x hexDigit:d { (x * 16) + d }
hexDigit returns Number ::= "0" { 0 }
| "1" { 1 }
| "2" { 2 }
| "3" { 3 }
| "4" { 4 }
| "5" { 5 }
| "6" { 6 }
| "7" { 7 }
| "8" { 8 }
| "9" { 9 }
| [aA] { 10 }
| [bB] { 11 }
| [cC] { 12 }
| [dD] { 13 }
| [eE] { 14 }
| [fF] { 15 }
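The escape interpretation stage can be sketched as a single left-to-right scan. This is an illustration under stated assumptions: the names interpret_escapes and is_xml_char are hypothetical, the input is taken to be a string or a list whose members are single characters and newline marker objects, and errors are reported by raising ValueError.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def is_xml_char(cp):
    """True if the code point matches the Char production of XML 1.0."""
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def interpret_escapes(seq):
    """Replace each escape sequence by the character it denotes."""
    out = []
    i, n = 0, len(seq)
    while i < n:
        if seq[i] != "\\":
            out.append(seq[i])
            i += 1
            continue
        # escapeOpen: a backslash, one or more "x", then "{".
        j = i + 1
        while j < n and seq[j] == "x":
            j += 1
        if j == i + 1 or j >= n or seq[j] != "{":
            out.append(seq[i])        # ordinary backslash, not an escapeOpen
            i += 1
            continue
        # hexNumber: one or more hexadecimal digits.
        k = j + 1
        while k < n and isinstance(seq[k], str) and seq[k] in HEX_DIGITS:
            k += 1
        if k == j + 1 or k >= n or seq[k] != "}":
            raise ValueError("escapeOpen does not start an escapeSequence")
        cp = int("".join(seq[j + 1:k]), 16)
        if not is_xml_char(cp):
            raise ValueError("replacement character is not a legal XML character")
        out.append(chr(cp))
        # Scanning resumes after the replacement character, so the character
        # just produced is never re-examined as part of a new escape.
        i = k + 1
    return out
```

Because the scan never revisits its own output, "".join(interpret_escapes(r"\x{5C}x{5C}")) yields the six characters \x{5C} rather than a single backslash.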
2.5. Tokenization
In this stage, the sequence of characters and newline markers is
tokenized: it is transformed into a sequence of tokens, where each
token corresponds to a terminal in the grammar in Section 1.
A sequence of characters and newline markers is tokenized by
first finding the longest initial subsequence that:
is one of the literal string terminals occurring in the BNF in
Section 1; or
matches the grammar of one of the named terminals that is referenced
in Section 1 and specified in this section, that is, identifier,
literalSegment, CName, nsName or documentation; or
matches the grammar for separator.
If the longest such initial subsequence matches separator, this
subsequence is discarded. Otherwise, a single token is produced from
this initial subsequence. In either case, the tokenization proceeds
with the rest of the sequence.
It is an error if there is no such initial subsequence.
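The longest-match rule can be illustrated with a deliberately simplified tokenizer. Everything here is a sketch under assumptions: NCName is approximated by an ASCII-only pattern, the punctuation list is not claimed to be the complete set of literal string terminals, only single-quoted and double-quoted literal segments are handled, documentation terminals are not handled, a comment is assumed to be a "#" not followed by a second "#" running to the end of the line, and the newline marker is represented by the "\n" character. The names tokenize, PATTERNS and LITERALS are hypothetical.

```python
import re

KEYWORDS = {
    "attribute", "default", "datatypes", "div", "element", "empty",
    "external", "grammar", "include", "inherit", "list", "mixed",
    "namespace", "notAllowed", "parent", "start", "string", "text", "token",
}
# Assumed, non-exhaustive punctuation terminals.
PUNCTUATION = {"=", "|=", "&=", "{", "}", "(", ")", ",", "|", "&",
               "*", "+", "?", "-", "~", "[", "]", ">>"}
LITERALS = KEYWORDS | PUNCTUATION
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"   # ASCII-only approximation of NCName

# Longer literals come first because regex alternation takes the first hit.
_LITERAL_RX = "|".join(
    re.escape(s) for s in sorted(LITERALS, key=len, reverse=True))

PATTERNS = [
    ("literal", re.compile(_LITERAL_RX)),
    ("identifier", re.compile(r"\\?" + NCNAME)),
    ("CName", re.compile(NCNAME + ":" + NCNAME)),
    ("nsName", re.compile(NCNAME + r":\*")),
    ("literalSegment", re.compile(r'"[^"\n]*"' + "|" + r"'[^'\n]*'")),
    ("separator", re.compile(r"[ \t\n]|#(?!#)[^\n]*")),
]


def tokenize(text):
    """Return (kind, lexeme) pairs using the longest initial match."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_kind, best_end = None, pos
        for kind, rx in PATTERNS:
            m = rx.match(text, pos)
            if m is None or m.end() <= best_end:
                continue
            # identifier is NCName minus keyword (unless quoted with "\").
            if kind == "identifier" and m.group() in KEYWORDS:
                continue
            best_kind, best_end = kind, m.end()
        if best_kind is None:
            raise ValueError(f"no token matches at offset {pos}")
        if best_kind != "separator":      # separators are discarded
            tokens.append((best_kind, text[pos:best_end]))
        pos = best_end
    return tokens
```

With this sketch, the input elements is tokenized as one identifier rather than the keyword element followed by s, and rng:* is a single nsName rather than an identifier followed by punctuation, which is the effect of taking the longest initial subsequence.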
The production rules below use some additional notation. Square
brackets enclose a character class. A character class of the form
[^chars] specifies any legal XML character that does not occur in
chars. A legal XML character is a character that matches the Char
production of [XML 1.0]. A character class of the form [chars],
where chars does not begin with ^, specifies any single character
that occurs in chars. XML hexadecimal character references are used
to denote a single character, as in XML. A newline marker is denoted
by &newline;.
NCName is defined in [XML Namespaces].
The value of a variable bound to a character class is a string
of length 1 containing the character that matched the character class;
if the character class matches a newline marker, then the string
contains the character #xA.
identifier
returns
String ::=
NCName
keyword
| "
NCName
CName
returns
Qname ::=
NCName
NCName
qName
) }
nsName
returns
String ::=
NCName
:*
literalSegment
returns
String ::=
stringNoQuot
| "
stringNoApos
| "
"""
stringNoTripleQuot
"""
| "
'''
stringNoTripleApos
'''
stringNoQuot
returns
String ::=
{ "
" }
| [^
&newline;]
stringNoQuot
stringConcat
) }
stringNoApos
returns
String ::=
{ "
" }
| [^
&newline;]
stringNoApos
stringConcat
) }
stringNoTripleQuot
returns
String ::=
{ "
" }
| [^
stringNoTripleQuot
stringConcat
) }
| "
" [^
stringNoTripleQuot
stringConcat
("
",
) }
| "
""
" [^
stringNoTripleQuot
stringConcat
("
""
",
) }
stringNoTripleApos
returns
String ::=
{ "
" }
| [^
stringNoTripleApos
stringConcat
) }
| "
" [^
stringNoTripleApos
stringConcat
("
",
) }
| "
''
" [^
stringNoTripleApos
stringConcat
("
''
",
) }
documentation
returns
String ::=
documentationLine
documentation
documentationContinuation
stringConcat
) }
documentationLine
returns
String ::=
##
documentationLineContent
documentationContinuation
returns
String ::=
[&newline; ]
indent
documentationLine
stringConcat
) }
indent
returns
Void ::=
| [ ]
indent
documentationLineContent
returns
String ::=
{ "
" }
| "
documentationLineContent
| "
restOfLine
| [^&newline;
restOfLine
stringConcat
) }
restOfLine
returns
String ::=
{ "
" }
| [^&newline; ]
restOfLine
stringConcat
) }
separator
returns
Void ::=
[ &newline;]
| "
" [^&newline;
restOfLine
| "
B. Compact syntax RELAX NG schema for RELAX NG (Non-Normative)
# RELAX NG XML syntax specified in compact syntax.
default namespace rng = "http://relaxng.org/ns/structure/1.0"
namespace local = ""
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
start = pattern
pattern =
  element element { (nameQName | nameClass), (common & pattern+) }
  | element attribute { (nameQName | nameClass), (common & pattern?) }
  | element group|interleave|choice|optional
            |zeroOrMore|oneOrMore|list|mixed { common & pattern+ }
  | element ref|parentRef { nameNCName, common }
  | element empty|notAllowed|text { common }
  | element data { type, param*, (common & exceptPattern?) }
  | element value { commonAttributes, type?, xsd:string }
  | element externalRef { href, common }
  | element grammar { common & grammarContent* }
param = element param { commonAttributes, nameNCName, xsd:string }
exceptPattern = element except { common & pattern+ }
grammarContent =
  definition
  | element div { common & grammarContent* }
  | element include { href, (common & includeContent*) }
includeContent =
  definition
  | element div { common & includeContent* }
definition =
  element start { combine?, (common & pattern+) }
  | element define { nameNCName, combine?, (common & pattern+) }
combine = attribute combine { "choice" | "interleave" }
nameClass =
  element name { commonAttributes, xsd:QName }
  | element anyName { common & exceptNameClass? }
  | element nsName { common & exceptNameClass? }
  | element choice { common & nameClass+ }
exceptNameClass = element except { common & nameClass+ }
nameQName = attribute name { xsd:QName }
nameNCName = attribute name { xsd:NCName }
href = attribute href { xsd:anyURI }
type = attribute type { xsd:NCName }
common = commonAttributes, foreignElement*
commonAttributes =
  attribute ns { xsd:string }?,
  attribute datatypeLibrary { xsd:anyURI }?,
  foreignAttribute*
foreignElement = element * - rng:* { (anyAttribute | text | anyElement)* }
foreignAttribute = attribute * - (rng:*|local:*) { text }
anyElement = element * { (anyAttribute | text | anyElement)* }
anyAttribute = attribute * { text }
References
Normative
Compatibility
James Clark, Makoto MURATA, editors. RELAX NG DTD Compatibility. OASIS, 2001.
RELAX NG
James Clark, Makoto MURATA, editors. RELAX NG Specification. OASIS, 2001.
RFC 2396
T. Berners-Lee, R. Fielding, L. Masinter. RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. IETF (Internet Engineering Task Force), 1998.
Unicode
The Unicode Consortium. The Unicode Standard, Version 3.2 or later.
XML 1.0
Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, and Eve Maler, editors. Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), 2000.
XML Namespaces
Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. W3C (World Wide Web Consortium), 1999.
Non-Normative
Guidelines
James Clark, Kohsuke KAWAGUCHI, editors. Guidelines for using W3C XML Schema Datatypes with RELAX NG. OASIS, 2001.
RFC 2046
N. Freed, N. Borenstein. RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. IETF (Internet Engineering Task Force), 1996.
W3C XML Schema Datatypes
Paul V. Biron, Ashok Malhotra, editors. XML Schema Part 2: Datatypes. W3C (World Wide Web Consortium), 2001.
XDuce
Haruo Hosoya. Regular Expression Types for XML. PhD Thesis. The University of Tokyo, 2000.
XQuery Formal Semantics
Peter Fankhauser et al., editors. XQuery 1.0 Formal Semantics. W3C Working Draft 07 June 2001. W3C (World Wide Web Consortium), 2001.