I. Document Type Definition (DTD): A set of rules that defines 1) the legal
set of elements for an XML document, 2) the legal set of attributes for
each element, 3) the legal data content for each element and attribute,
and 4) the order in which subelements must occur and the number of
times each may occur.
A. Originally created as part of SGML (Standard Generalized Markup
Language); a DTD uses SGML syntax rather than XML syntax
B. Advantages
1. Allows you to specify the elements that may appear in an XML
file, the order of the elements, and the number of the elements
2. Allows you to specify the attributes that may appear with an
element and the datatypes for those attributes
C. Disadvantages
1. Cannot specify the datatypes of element values (e.g., int, bool,
string, float)
2. Cannot specify more than one definition for an element. It is
conceivable that you want an element to have different meanings
depending on which element is its parent. For example, a name
element might represent the name of a customer if the parent element
is an account element and might represent the name of a bank if the
parent element is a bank element. You might want a first/last name
representation for the customer name and a title/stock ticker/acronym
representation for the bank name, but you cannot do so because a DTD
allows only one global definition for each element.
3. It is hard to specify that the number of occurrences of an element
can be in a range (e.g., 1-3).
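A common workaround for the range limitation is to list the required occurrences and then mark the remaining ones optional. The sketch below assumes hypothetical account, name, and phone elements (they are not from the bookstore example) and uses the element declaration syntax covered under Element Declarations:

```xml
<!-- Hypothetical elements: an account holds 1 to 3 phone children.
     One phone is required, then up to two more may follow.
     The nested group keeps the content model deterministic. -->
<!ELEMENT account (name, phone, (phone, phone?)?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
```

The declaration grows with the size of the range, which is why ranges are awkward to express in a DTD.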
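D. DTD declaration in an XML file:
<!DOCTYPE root-element SYSTEM "file-reference">
OR
<!DOCTYPE root-element [ declarations ]>
Example: <!DOCTYPE bookstore SYSTEM "bookstore.dtd">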
1. SYSTEM keyword tells processor to fetch an external document
(i.e., the DTD is external)
2. The file reference is a Uniform Resource Identifier (URI). The
example file, bookstore.dtd, resides in the same directory as the
XML file
3. The second type of DTD declaration is called an internal
declaration since the DTD specification is physically included
in the XML file
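As a minimal sketch of the internal form, the document below carries its own DTD between the square brackets. The bookstore, book, and title element names are assumed for illustration and may differ from the sample bookstore files:

```xml
<?xml version="1.0"?>
<!-- Internal declaration: the DTD rules sit between [ and ]. -->
<!-- The element names below are assumed for illustration. -->
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (book*)>
  <!ELEMENT book (title)>
  <!ELEMENT title (#PCDATA)>
]>
<bookstore>
  <book>
    <title>A Sample Title</title>
  </book>
</bookstore>
```

Moving the three <!ELEMENT ...> lines into a separate file named bookstore.dtd and replacing the bracketed block with SYSTEM "bookstore.dtd" would give the equivalent external declaration.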
E. Sample bookstore XML specification and its DTD specification, from
XML Web Development With PHP by Thomas Myer.
1. Notice that the DTD file starts with an XML header, but the header
could be omitted
F. Element Declarations
1. Syntax: <!ELEMENT element-name content-type>
Example:
a. An element name may be declared only once; duplicate declarations
are not allowed
2. Content Types
a. ANY: Allows any content type, including text or elements.
Basically the same as unstructured XML.
b. EMPTY: Contains no content
c. Mixed content: allows parsed character data (#PCDATA), optionally
interspersed with child elements
i. Syntax: <!ELEMENT element-name (#PCDATA)>
or
<!ELEMENT element-name (#PCDATA | child1 | child2 | ... | childn)*>
ii. Parsed character data
1) XML entity references (discussed later) will get
expanded
2) Tags will be recognized by a parser
iii. The * is required whenever child elements are listed alongside
#PCDATA. In this case you lose the ability to sequence the elements
or to limit how often they occur
iv. Example:
<!ELEMENT author (#PCDATA | publisher)*>: allows the
author element to include both parsed character data and
publisher elements. There may be any number of publisher
elements interspersed with character values. For example:
<author>
<publisher>John Wiley</publisher>
<publisher>McGraw Hill</publisher>
A big CS publisher
<publisher>Bantam Books</publisher>
Paperback publisher
</author>
d. Element content
i. Syntax: <!ELEMENT element-name (child-list)>
Example:
ii. The order in which elements appear on the child-list
determines the order in which they must appear in the
XML document
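iii. Specifying the number of occurrences (a child with no suffix
must occur exactly once)
1) *: 0 or more
2) +: 1 or more
3) ?: 0 or 1 (i.e., optional)
4) |: choice; exactly one of the listed elements may appear, and the
order in which the alternatives are listed is irrelevant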
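The sequencing and occurrence rules above can be sketched in one declaration. All element names and values in this example are assumed for illustration:

```xml
<?xml version="1.0"?>
<!-- All element names and values here are assumed for illustration. -->
<!DOCTYPE book [
  <!-- In order: one title, one or more authors, an optional edition,
       exactly one of isbn or issn, then zero or more reviews. -->
  <!ELEMENT book (title, author+, edition?, (isbn | issn), review*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT edition (#PCDATA)>
  <!ELEMENT isbn (#PCDATA)>
  <!ELEMENT issn (#PCDATA)>
  <!ELEMENT review (#PCDATA)>
]>
<book>
  <title>A Sample Title</title>
  <author>First Author</author>
  <author>Second Author</author>
  <isbn>0000000000</isbn>
</book>
```

The document is valid with the edition and review elements left out. Placing an author before the title, or omitting every author, would make it invalid.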
G. Attribute Declarations
1. Syntax: <!ATTLIST element-name attribute-name datatype default-value>
Example:
2. Common Datatype Values
a. CDATA: the following chars must use special forms:
< = &lt;
> = &gt;
& = &amp;
" = &quot;
b. ID: the attribute's value is an identifier, unique within the
document, that identifies the element.
i. Typically used by programs that process a document. Not
typically used by XSLT files
ii. only one ID attribute is allowed per element
iii. ID values must start with a letter or underscore (_)
c. enumerated list: (value1 | value2 | ... | valuen)
i. example:
ii. note that enumerated values do not appear in quotes but
the default value does
iii. enumerated types must be single words--they cannot be
multiple word strings
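The three datatypes above might be combined as in the following sketch. The element names, attribute names, and values are all assumed for illustration; the #REQUIRED and #IMPLIED keywords are described under Default Value:

```xml
<?xml version="1.0"?>
<!-- Element names, attribute names, and values are assumed. -->
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (book+)>
  <!ELEMENT book (#PCDATA)>
  <!-- id: ID type, so each value must be unique in the document.
       note: CDATA, so & and " are written as &amp; and &quot;.
       format: enumerated values are unquoted, the default is quoted. -->
  <!ATTLIST book
    id      ID                      #REQUIRED
    note    CDATA                   #IMPLIED
    format  (hardcover | paperback) "paperback">
]>
<bookstore>
  <book id="b1" note="Smith &amp; Sons &quot;revised&quot; printing"
        format="hardcover">First book</book>
  <book id="_b2">Second book</book>
</bookstore>
```

Giving both books the same id value, or using a format value outside the list, would make the document invalid.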
3. Default Value
a. #REQUIRED: user must always provide a value
b. #IMPLIED: attribute is optional and no default value is
provided if attribute is omitted
c. #FIXED: attribute is optional but if it appears it must take
a fixed value that is provided in the declaration
Example:
d. value: attribute is optional. If the user does not specify it,
then the default value given in the declaration is used
Example:
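The four kinds of default value can be compared side by side in a single attribute list. This is a sketch only: the book element, its attribute names, and the placeholder values are assumptions made for illustration:

```xml
<?xml version="1.0"?>
<!-- The element, attributes, and values are assumed for illustration. -->
<!DOCTYPE book [
  <!ELEMENT book (#PCDATA)>
  <!ATTLIST book
    isbn      CDATA                   #REQUIRED
    edition   CDATA                   #IMPLIED
    currency  CDATA                   #FIXED "USD"
    format    (hardcover | paperback) "paperback">
]>
<!-- isbn must be given. edition is omitted and gets no value.
     currency and format are omitted, so the parser supplies
     currency="USD" and format="paperback". -->
<book isbn="0000000000">A Sample Title</book>
```

Writing currency="EUR" on the book element would be invalid because a #FIXED attribute may only take its declared value.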
H. Entity Declarations: Provide macros that expand into longer text
1. Example:
2. General entities: meant to be used in XML documents only
a. Syntax: <!ENTITY entity-name "replacement text">
b. Usage in an XML Document: &entity-name;
c. Example: &UT; 37996
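A complete document using the UT entity might look like the sketch below. The replacement text and the address element are assumed for illustration, since they are not given in these notes:

```xml
<?xml version="1.0"?>
<!-- The replacement text and the address element are assumed. -->
<!DOCTYPE address [
  <!ELEMENT address (#PCDATA)>
  <!ENTITY UT "University of Tennessee, Knoxville, TN">
]>
<address>&UT; 37996</address>
```

With this declaration, a parser reads the content of the address element as "University of Tennessee, Knoxville, TN 37996".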
3. Parameter entities: meant to be used in DTD documents
a. Syntax: <!ENTITY % entity-name "replacement text">
b. Usage in a DTD document: %entity-name;
c. Example:
d. Useful when the same content will appear in different elements.
Any change to the content requires only one edit as opposed
to multiple edits.
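The reuse described above can be sketched with a parameter entity that holds a shared list of child elements. The contact entity and every element name are hypothetical:

```xml
<!-- Contents of a hypothetical external DTD file. A parameter entity
     reference inside a declaration is only allowed in an external DTD,
     not in an internal declaration. -->
<!ENTITY % contact "street, city, state, zip">

<!-- Both elements reuse the same address fields. -->
<!ELEMENT customer (name, %contact;)>
<!ELEMENT bank (title, %contact;)>

<!ELEMENT name (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
```

Adding a country field to the contact entity would update the customer and bank declarations with a single edit.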
4. External entities: The entities you've seen thus far are
internal. External entities allow you to integrate external
files that contain either replacement strings or non-XML
resources, such as jpg images. We won't discuss external
entities.