
6.2. The Simplest Possible Patterns

In their simplest form, patterns may be used as enumerations applied to the lexical space rather than to the value space.

If, for instance, we have a byte value that can only take the values "1," "5," or "15," the classical way to define such a datatype is to use the xs:enumeration facet:

<xs:simpleType name="myByte">
  <xs:restriction base="xs:byte">
    <xs:enumeration value="1"/>
    <xs:enumeration value="5"/>
    <xs:enumeration value="15"/>
  </xs:restriction>
</xs:simpleType>

This is the "normal" way of defining this datatype if it matches the lexical space and the value space of an xs:byte. It gives the flexibility to accept instance documents with values such as "1," "5," and "15," but also "01" or "0000005." One of the particularities of xs:pattern is that it is the only facet that constrains the lexical space rather than the value space. If we have an application that is disturbed by leading zeros, we can use patterns instead of enumerations to define our datatype:

<xs:simpleType name="myByte">
  <xs:restriction base="xs:byte">
    <xs:pattern value="1"/>
    <xs:pattern value="5"/>
    <xs:pattern value="15"/>
  </xs:restriction>
</xs:simpleType>

This new datatype is still derived from xs:byte and has the semantics of a byte. Because several xs:pattern facets in the same restriction are treated as alternatives, its lexical space is now constrained to accept only "1," "5," and "15," leaving out any variation that has the same value but a different lexical representation.
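As a sketch of the difference, assume a hypothetical element named reading that is declared first with the enumeration-based myByte and then with the pattern-based one. The element name is an assumption made for illustration only; the comments in this fragment indicate how each lexical form would be treated:

```xml
<!-- Hypothetical declaration: <xs:element name="reading" type="myByte"/> -->

<!-- Accepted by both the enumeration-based and the pattern-based myByte -->
<reading>1</reading>
<reading>5</reading>
<reading>15</reading>

<!-- Accepted by the enumeration-based type only:
     same values, different lexical forms -->
<reading>01</reading>
<reading>0000005</reading>
<reading>+15</reading>

<!-- Rejected by both: 7 is a valid xs:byte but not a permitted value -->
<reading>7</reading>
```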

TIP: This is an important difference from Perl regular expressions, on which W3C XML Schema patterns are built. A Perl expression such as /15/ matches any string containing "15," while the W3C XML Schema pattern matches only the string equal to "15." The Perl expression equivalent to this pattern is thus /^15$/.
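To approximate the unanchored behavior of the Perl expression /15/, a pattern has to ask for it explicitly. The following is only a sketch: it relies on the "." and "*" meta characters, which are described later in this chapter, and the type name myByteContaining15 is hypothetical:

```xml
<xs:simpleType name="myByteContaining15">
  <xs:restriction base="xs:byte">
    <!-- ".*" matches any run of characters, so this accepts any xs:byte
         whose lexical form contains "15", such as "15", "115", or "015" -->
    <xs:pattern value=".*15.*"/>
  </xs:restriction>
</xs:simpleType>
```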

This example has been carefully chosen to avoid using any of the meta characters used within patterns, which are: ".", "\", "?", "*", "+", "|", "{", "}", "(", ")", "[", and "]". We will see the meaning of these characters later in this chapter; for the moment, we just need to know that each of these characters needs to be "escaped" by a leading "\" to be used as a literal. For instance, to define a similar datatype for a decimal whose lexical space is limited to "1" and "1.5," we write:

<xs:simpleType name="myDecimal">
  <xs:restriction base="xs:decimal">
    <xs:pattern value="1"/>
    <xs:pattern value="1\.5"/>
  </xs:restriction>
</xs:simpleType>
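For comparison, here is a sketch of what happens when the escape is forgotten. The type name myLooseDecimal is hypothetical; the unescaped "." acts as a wildcard that matches any single character, so the type accepts more than was intended:

```xml
<xs:simpleType name="myLooseDecimal">
  <xs:restriction base="xs:decimal">
    <!-- Unescaped ".": accepts "1.5" but also "105", "115", "125", and so on -->
    <xs:pattern value="1.5"/>
  </xs:restriction>
</xs:simpleType>
```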

A common source of errors is escaping "normal" characters, which must not be escaped: we will see later that a leading "\" changes their meaning (for instance, "\d" matches any decimal digit and not the character "d," and "\p{P}" matches any Unicode punctuation character).
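This pitfall can be sketched with the "\d" escape, which matches any single decimal digit. The type names letterD and singleDigit are hypothetical and serve only to contrast the two patterns:

```xml
<!-- Matches only the literal string "d" -->
<xs:simpleType name="letterD">
  <xs:restriction base="xs:string">
    <xs:pattern value="d"/>
  </xs:restriction>
</xs:simpleType>

<!-- "\d" is an escape: it matches any single decimal digit, not "d" -->
<xs:simpleType name="singleDigit">
  <xs:restriction base="xs:string">
    <xs:pattern value="\d"/>
  </xs:restriction>
</xs:simpleType>
```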




Copyright © 2002 O'Reilly & Associates. All rights reserved.