DTD for XML

The article was added by Gore Mitrovich on 02/28/2008.

A DTD (Document Type Definition) is a set of rules that defines the structure of XML documents: which elements and attributes may appear, and how they may be nested. The files that store these rules are called DTD documents (referred to as DTDs from here on) and conventionally have the extension .dtd.

To better understand the concept of DTDs, compare them with the creation of tables in a database. When you create a table in a database system, you specify the columns, the data types for different columns, the validation rules for data within columns, and so on. Similarly, you can specify rules that can be used in XML documents, such as tags and attributes, by using a DTD. DTDs can be considered to be rule books for XML documents.

It's not essential for you to create a DTD for your XML documents. However, a DTD can be important to users who need to understand the structure of your XML documents or who need to create an XML document similar to the one you've already created. These users can refer to your DTD document to understand the structure and logic of your XML documents.

When an XML document is associated with a DTD, a validating parser checks the document against the rules specified in the DTD. If the XML document adheres to all the DTD rules, the document is considered valid. Otherwise, the document is invalid, and a validating parser reports the violations.

The components of a DTD are listed below:

  • DOCTYPE declarations. The <!DOCTYPE> declaration contains the information about the location of the DTD.

  • Element declarations. An element is a logical component of a document. Every element that is contained in an XML document must have a corresponding declaration in the DTD. The element declaration is used to validate the elements in the document.

  • Attribute declarations. Attributes represent the characteristics of an element. An element can contain multiple attributes. For each element attribute that is used in an XML document, a corresponding attribute declaration must be specified in the DTD.

  • Content model. The content model is used to describe the content of an element.

  • Entity declarations. Entities are aliases associated with a group of data. These are used in a document to avoid typing long pieces of text repeatedly.

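The general structure of a DTD that is embedded in a document's DOCTYPE declaration is shown below. The name after DOCTYPE must match the root element of the document:

<!DOCTYPE root-element-name [
  element declarations
  attribute declarations
  entity declarations
]>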

Element Declaration

An element declaration specifies a single markup element. Every element used in the XML document must be defined with an element declaration in the corresponding DTD.

The syntax to declare an element is:

<!ELEMENT element-name (element content-type)>

For example, consider a DTD, restaurant.dtd, that is used to define details about restaurants. The details include the following elements:

  • RESTAURANT. Identifies the restaurant

  • NAME. Identifies the name of the restaurant

  • LOCATION. Identifies the location of the restaurant

  • ADDRESS. Identifies the address of the restaurant

  • PHONE. Provides the phone number of the restaurant

  • REMARKS. Used to provide comments about the restaurant

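Each of these elements needs its own declaration. With content-model standing in for the content specification that every complete declaration requires (described under Content Model), the declarations are:

<!ELEMENT RESTAURANT content-model>
<!ELEMENT NAME content-model>
<!ELEMENT LOCATION content-model>
<!ELEMENT ADDRESS content-model>
<!ELEMENT PHONE content-model>
<!ELEMENT REMARKS content-model>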

Attribute Declaration

Attribute declarations define the set of attributes for an element. Every attribute used in the XML document must have a declaration in the corresponding DTD. Not every element needs attributes.

For example, in restaurant.dtd, attributes may be added to the RESTAURANT element.

An attribute TYPE that accepts the values INDIAN, CONTINENTAL, CHINESE, MEXICAN, and MULTICUISINE can be added to the RESTAURANT element using the following declaration:

<!ATTLIST RESTAURANT TYPE
(INDIAN | CONTINENTAL | CHINESE | MEXICAN | MULTICUISINE)
"CONTINENTAL">

The default value for an attribute is enclosed in quotation marks and applies whenever the attribute is omitted. To make the attribute mandatory instead, replace the default value with #REQUIRED, which indicates that the attribute must be supplied each time the element is used in a document. A declaration takes one or the other, not both.
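To see a default value being applied, the following minimal sketch parses a small document with Python's standard-library ElementTree. It assumes the attribute is declared with a default value only (no #REQUIRED) and that the DTD is embedded in the document as an internal subset; the document text itself is a made-up example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with the DTD embedded as an internal subset.
# TYPE is declared with a default value only (no #REQUIRED).
DOC = """<?xml version="1.0"?>
<!DOCTYPE RESTAURANT [
  <!ELEMENT RESTAURANT (#PCDATA)>
  <!ATTLIST RESTAURANT TYPE
    (INDIAN | CONTINENTAL | CHINESE | MEXICAN | MULTICUISINE)
    "CONTINENTAL">
]>
<RESTAURANT>Sensoi</RESTAURANT>"""

root = ET.fromstring(DOC)

# The start tag carries no TYPE attribute, so the parser supplies
# the default from the ATTLIST declaration.
restaurant_type = root.get("TYPE")
print(restaurant_type)
```

The underlying parser (expat) does not validate, so it would not reject a value outside the enumerated list; it only fills in the declared default.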

Content Model

A content model is part of the element declaration and is used to describe the content of the element. There are three different types of content:

  • Data content. This signifies text-based characters and is the most basic type of content. Data content is specified as #PCDATA (parsed character data), which means the parser scans the text for markup and expands entity references in it. CDATA is not a valid content model for an element; it is an attribute type, and unparsed text inside an element is written in a <![CDATA[ ... ]]> section in the document itself.

  • Element content. This specifies the child elements that are contained in the element. In addition, element content specifies which of the child elements are required and the order in which these elements must appear in the document.

  • Mixed content. Mixed content signifies both data and element content, declared in the form (#PCDATA | child-element-name)*.

An element with data is declared as shown:

<!ELEMENT element-name (data-type)>

An element with a child element is declared as shown:

<!ELEMENT element-name (child-element-name)>

Multiple child elements can be separated with a comma. In an XML document, the child elements must appear in the same sequence as they have been declared in the DTD. A question mark (?) after a child element indicates that the element is optional.
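The ordering rule can be illustrated with a short sketch. It assumes a hypothetical content model (NAME, LOCATION, ADDRESS, PHONE, REMARKS?) built from the restaurant elements listed earlier, and it checks the order of child elements with a hand-written regular expression. This is only an illustration of the rule, not how a validating parser is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written regex equivalent of the assumed content model
# (NAME, LOCATION, ADDRESS, PHONE, REMARKS?); not generated from a DTD.
SEQUENCE = re.compile(r"NAME,LOCATION,ADDRESS,PHONE(,REMARKS)?")


def children_match(xml_text):
    """Return True if the root's children follow the declared sequence."""
    root = ET.fromstring(xml_text)
    tags = ",".join(child.tag for child in root)
    return SEQUENCE.fullmatch(tags) is not None


in_order = "<RESTAURANT><NAME/><LOCATION/><ADDRESS/><PHONE/></RESTAURANT>"
swapped = "<RESTAURANT><LOCATION/><NAME/><ADDRESS/><PHONE/></RESTAURANT>"
with_remarks = (
    "<RESTAURANT><NAME/><LOCATION/><ADDRESS/><PHONE/><REMARKS/></RESTAURANT>"
)

print(children_match(in_order))      # True: optional REMARKS omitted
print(children_match(swapped))       # False: NAME and LOCATION out of order
print(children_match(with_remarks))  # True: optional REMARKS present
```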

In the restaurant.dtd, the RESTAURANT element contains all the other elements. The restaurant.dtd, after adding the content model information, is as follows:

<!ELEMENT RESTAURANT (NAME, LOCATION, ADDRESS, PHONE, REMARKS?)>
<!ATTLIST RESTAURANT TYPE (INDIAN | CONTINENTAL | CHINESE | MEXICAN | MULTICUISINE) "CONTINENTAL">
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT LOCATION EMPTY>
<!ATTLIST LOCATION TYPE (SOUTH | NORTH | EAST | WEST) "SOUTH">
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT PHONE (#PCDATA)>
<!ELEMENT REMARKS (#PCDATA)>

The keyword EMPTY can be used as the content-type to specify that the element has no content, neither child elements nor text. The LOCATION element is therefore written as a single empty-element tag, <LOCATION TYPE="SOUTH" />, rather than as a pair of start and end tags.
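One way to inspect what a parser reads from such a DTD is to register declaration handlers on Python's standard-library expat parser. The sketch below assumes the declarations are embedded as an internal subset of a small made-up document, with a REMARKS declaration included so that every element named in the content model is declared. The handler function names are illustrative. Note that expat reports the declarations but does not validate the document against them.

```python
import xml.parsers.expat

DOC = """<?xml version="1.0"?>
<!DOCTYPE RESTAURANT [
  <!ELEMENT RESTAURANT (NAME, LOCATION, ADDRESS, PHONE, REMARKS?)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT LOCATION EMPTY>
  <!ATTLIST LOCATION TYPE (SOUTH | NORTH | EAST | WEST) "SOUTH">
  <!ELEMENT ADDRESS (#PCDATA)>
  <!ELEMENT PHONE (#PCDATA)>
  <!ELEMENT REMARKS (#PCDATA)>
]>
<RESTAURANT>
  <NAME>Sensoi</NAME>
  <LOCATION/>
  <ADDRESS>West End</ADDRESS>
  <PHONE>91-011-6854672</PHONE>
</RESTAURANT>"""

declared_elements = []
declared_attributes = []


def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> declaration.
    declared_elements.append(name)


def on_attlist_decl(elname, attname, type_, default, required):
    # Called once per attribute in an <!ATTLIST ...> declaration.
    declared_attributes.append((elname, attname, default, bool(required)))


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

print(declared_elements)
print(declared_attributes)
```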

Entity Declaration

Entities are used within a document to avoid typing long pieces of repetitive text. Such texts can be assigned an alias, which can further be used in the document. When the document is processed, the alias is replaced by the text specified.

Predefined Entities in XML

Entity Name    Character
&lt;           <
&gt;           >
&amp;          &
&quot;         "
&apos;         '
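As a small illustration of the predefined entities, the sketch below uses Python's standard-library xml.sax.saxutils helpers to replace the five special characters with their entity references and back again. The sample string is made up.

```python
from xml.sax.saxutils import escape, unescape

raw = 'Tom & Jerry\'s <Grill> "Open Late"'

# escape() always replaces &, < and >; the quote characters are
# replaced only when listed in the optional mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# unescape() reverses the substitution using the inverse mapping.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)
```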

Entities are of two types:

  • General entities. A general entity is declared as follows:

    <!ENTITY myaddress " 112 Vasant Enclave New Delhi –57">  

    This is an example of an internal entity, where the text phrase being mapped is in the entity declaration itself. An external entity maps the unique name to a block of text stored outside of the document. A general entity is referenced with & before the entity name and ; after it, for example &myaddress;.

  • Parameter entities. Parameter entities are declared with % before the entity name and referenced as %name;. These entities are similar to general entities but can be used only within the DTD.

Structure of an XML Document

An XML document consists of character data and the markup that describes the data. A sample XML document created based on restaurant.dtd is shown below:

<?xml version="1.0"?>  
<RESTAURANT TYPE="CONTINENTAL">        
<NAME> Sensoi </NAME>        
<LOCATION TYPE="SOUTH" />        
<ADDRESS> West End, Wellingdon Street, New Delhi</ADDRESS>        
<PHONE>91-011-6854672</PHONE>  
</RESTAURANT>
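As a rough illustration of how an application sees this document, the sketch below parses the sample with Python's standard-library ElementTree and walks the elements and attributes. ElementTree does not validate against restaurant.dtd; it only checks that the document is well formed.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0"?>
<RESTAURANT TYPE="CONTINENTAL">
<NAME> Sensoi </NAME>
<LOCATION TYPE="SOUTH" />
<ADDRESS> West End, Wellingdon Street, New Delhi</ADDRESS>
<PHONE>91-011-6854672</PHONE>
</RESTAURANT>"""

root = ET.fromstring(SAMPLE)
print(root.tag, root.attrib)

# Each child element exposes its tag name, attributes, and text.
for child in root:
    text = (child.text or "").strip()
    print(child.tag, child.attrib, repr(text))
```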

An XML document has the following components:

  • XML declaration

  • Elements

  • Attributes

  • Entities

  • Comments

XML Declaration

An XML declaration, when present, is the first statement in an XML document. It identifies the document as an XML document. It can also carry a standalone attribute, which tells the processor whether the document depends on declarations in an external DTD. The XML declaration may include attributes such as version and encoding. For example,

<?xml version="1.0" encoding="UTF-8"?>

The declaration is written between <? and ?>, the same delimiters used by processing instructions, and its name must be in lowercase. Processing instructions are used to pass messages to the application processing the XML document. Unlike the XML declaration, which must be at the very start of the document, processing instructions can be placed anywhere markup is allowed outside of a tag.

The attribute version specifies the version of the XML specification that the document conforms to. The encoding attribute is used to specify the character encoding used by the author. UTF-8 is a variable-length encoding of Unicode in which ASCII characters occupy a single byte and all other characters occupy two to four bytes.
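The effect of the encoding attribute can be sketched as follows: the same text is stored as different bytes under two encodings, and the parser relies on the declaration to decode it. The restaurant name "Café Sensoi" and the helper parse_as are made up for this illustration; the parser is Python's standard-library ElementTree.

```python
import xml.etree.ElementTree as ET

TEMPLATE = '<?xml version="1.0" encoding="{enc}"?><NAME>Caf\u00e9 Sensoi</NAME>'


def parse_as(encoding):
    """Encode the document in `encoding` and parse the resulting bytes."""
    data = TEMPLATE.format(enc=encoding).encode(encoding)
    return ET.fromstring(data).text


utf8_name = parse_as("UTF-8")
latin1_name = parse_as("ISO-8859-1")
print(utf8_name == latin1_name)

# The accented character is one byte in ISO-8859-1 and two in UTF-8.
e_acute_sizes = (
    len("\u00e9".encode("iso-8859-1")),
    len("\u00e9".encode("utf-8")),
)
print(e_acute_sizes)
```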

Elements

Elements are the main components of a markup language and are defined in the DTD. Every XML document must have exactly one root element. A root element describes the function of the document. In the restaurant.dtd example, <RESTAURANT> is the root element. The root element contains the other elements of the XML document.

Elements are specified using tags. A tag is specified within angle brackets (< >). An element can be written as a pair of tags, with a start tag (<element>) and an end tag (</element>). An element can also be written as a single empty-element tag (<element />), which cannot contain any elements or data. Elements that must always be empty are declared with the EMPTY keyword in the DTD.

The text between the start and the end tags is defined as the character data. Character data may be any legal Unicode character except < and &, which must be written as &lt; and &amp;.

Attributes

Attributes provide additional information about the elements. Attributes are embedded in the start tag. An attribute consists of an attribute name and a quoted attribute value. In the preceding sample XML code, the RESTAURANT element contains an attribute TYPE that specifies the cuisine that the restaurant specializes in.

Entities

Entities are used to specify an alias for text that needs to be typed repeatedly. Entities must be declared before they are referenced in the XML document. An example of an entity is as follows:

<!ENTITY Poor "The restaurant has poor customer service">

This entity can be referenced as &Poor;. For example,

<REMARKS> &Poor; </REMARKS>

In an XML document, all entities are declared within a DOCTYPE declaration. The <!DOCTYPE […] > declaration follows the XML declaration. For example,

<?xml version="1.0"?>
<!DOCTYPE RESTAURANT [
  <!ENTITY Poor "The restaurant has poor customer service">
]>
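The replacement of an entity reference can be observed with a short sketch using Python's standard-library ElementTree. The document below is a made-up example that declares the Poor entity in its internal subset and references it, with the closing semicolon, inside a REMARKS element.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE RESTAURANT [
  <!ENTITY Poor "The restaurant has poor customer service">
]>
<RESTAURANT TYPE="CONTINENTAL">
  <NAME>Sensoi</NAME>
  <REMARKS>&Poor;</REMARKS>
</RESTAURANT>"""

root = ET.fromstring(DOC)

# The parser substitutes the declared text for the &Poor; reference.
remarks = root.find("REMARKS").text
print(remarks)
```

Referencing an entity that has not been declared makes the document ill formed, and the parser raises an error instead of returning a tree.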

Comments

The syntax to specify comments in an XML document is:

<!-- Comments -->

For example,

<?xml version="1.0"?>  
<!-- This is a comment -->
<RESTAURANT TYPE="CONTINENTAL">        
<NAME> Sensoi </NAME>        
<LOCATION TYPE="SOUTH" />        
<ADDRESS> West End, Wellingdon Street, New Delhi</ADDRESS>        
<PHONE>91-011-6854672</PHONE>  
</RESTAURANT>
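Comments are not part of the document's data, but a parser can still expose them. The following sketch assumes the standard <!-- ... --> delimiters and uses Python's standard-library minidom to collect the comments that appear at the top level of a shortened, made-up version of the sample document.

```python
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<!-- This is a comment -->
<RESTAURANT TYPE="CONTINENTAL">
<NAME> Sensoi </NAME>
</RESTAURANT>"""

document = minidom.parseString(DOC)

# Top-level nodes of the document: the comment and the root element.
comments = [
    node.data
    for node in document.childNodes
    if node.nodeType == node.COMMENT_NODE
]
print(comments)
```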

Applying Style Sheets to an XML Document

XML is used to organize data that can then be displayed to Web users. A browser shows an unstyled XML document as a plain tree of tags with no formatting. To format an XML document, you can apply a style sheet. Style sheets contain the rules that declare how an XML document must appear, making it more attractive and user-friendly. Several style sheet languages are available; two of these are:

  • CSS (Cascading Style Sheets). These help to manipulate the visibility, positioning, and sizing of elements; colors; and the background, font, text, and spacing of an element.

  • XSL (eXtensible Stylesheet Language). XSL contains an XML vocabulary that specifies the formatting rules and a language to transform XML documents.

A CSS style sheet is attached to an XML document using the following processing instruction, which is placed after the XML declaration and before the root element:

<?xml-stylesheet type="text/css" href="mycsssheet.css"?>

An XSL style sheet is attached to an XML document using the following processing instruction:

<?xml-stylesheet type="text/xsl" href="myxslsheet.xsl"?>
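A style sheet reference is an ordinary processing instruction, so an application can read it like any other node. The sketch below assumes the instruction is written with no space after <? and uses Python's standard-library minidom to list the processing instructions of a minimal made-up document.

```python
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="mycsssheet.css"?>
<RESTAURANT TYPE="CONTINENTAL"/>"""

document = minidom.parseString(DOC)

# Each processing instruction has a target (its name) and a data string.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

The XML declaration on the first line does not appear in the list, because it is not treated as a processing instruction.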
