Monday, March 7, 2011

Untyped and Typed XML in SQL Server

SQL Server 2005 allows you to store XML in two broad ways. Untyped XML is
XML that is stored without having a schema specified for it. Typed XML is XML
that has an associated schema document that defines which structure(s) are
allowed in the XML that is stored.
Untyped XML is easier to use, because you don't need to create a schema
document. It is also more flexible because you can change the structure of
XML documents. A downside of untyped XML is that the data stored in that
column can contain errors that you may avoid with typed XML instead.
If you use typed XML, then each XML document is checked to ensure that it
has the correct structure — for example, before it is stored in a column in a
SQL Server table.
Stored XML, whether it is untyped or typed, cannot exceed 2GB in size per
instance. That means that the data in a column that has the xml datatype
cannot exceed 2GB. For many purposes that size limit won't be a problem.
Using untyped XML
To demonstrate how to use untyped XML, I use the Purchase order document
In the examples in this chapter, you can either type the code or you can use
the T-SQL file that contains the code. To run the code in the file for each step,
highlight the relevant part of the code and then press the F5 key.
To create a PurchaseOrders table and add a single purchase order to it,
follow these steps:
1. Open the SQL Server Management Studio and create a new database
engine query.
The code is included in the UntypedXML.sql file.
2. Create a database called XMLUSE by using this code:
CREATE DATABASE XMLUSE

3. Elect to use the newly created XMLUSE database by using this code:
USE XMLUSE
4. Create a table called PurchaseOrders with two columns.
The PurchaseID column contains an integer value and the
PurchaseOrder column contains an XML document. Use the following
code:
CREATE TABLE PurchaseOrders (PurchaseID int primary
key, PurchaseOrder xml)

5. Add a purchase order to the PurchaseOrders table with the following
code:
INSERT INTO PurchaseOrders VALUES(1, '<?xml
version="1.0"?>
<PurchaseOrder>
<Date>2005/12/31</Date>
<From>Some Fictional Company</From>
<Contact>Fred Smith</Contact>
<ContactPhone>123-456-7890</ContactPhone>
<LineItems>
<LineItem Quantity="3">Some article</LineItem>
<LineItem Quantity="12">Some other
article</LineItem>
<LineItem Quantity="300">Yet another kind of
article</LineItem>
</LineItems>
</PurchaseOrder>')

6. Retrieve the purchase order that you added to the PurchaseOrders
table in Step 5 using the following code:
SELECT PurchaseOrder FROM PurchaseOrders

7. To view the XML document that displays in the grid,
drag the column separator to the right.
8. To view the entire XML document, displayed in a hierarchical way,
click the XML.
 
 
Understanding the XML Schema Definition language
The W3C has defined a Schema Definition language, often called W3C XML
Schema Definition language and abbreviated to XSD. In SQL Server 2005, the
only schema definition language supported is XSD, which replaces the proprietary
XML Data Reduced (XDR) schema language that Microsoft provided in
SQL Server 2000.
An XSD schema can apply the following constraints to associated instance
XML documents:
� The structure of elements in the document.
� The number of similar elements allowed in a place in the XML structure.
For example, you can specify whether an element is optional (occurs
zero or one time) or that it occurs a specified number of times.
� Whether or not a particular element has one or more attributes.
� The values allowed in a particular element or attribute. For example, in
an element to store the date of an invoice, you can specify that the element
stores only valid dates. Similarly, if an element is specified as containing
string data, you won't be allowed to carry out arithmetic on data
contained in it.
These constraints remove important classes of possible problems or variability
in the data you store. This improves data quality at the cost of creating
and applying an XSD schema document.
SQL Server validates the XML to be stored every time that you add or change
data in a column.
W3C XML Schema is a complex topic that deserves a book of its own.
If you have a copy of Visual Studio 2005, you can create an XSD schema for
the purchase order document used in the earlier example (in the "Using
untyped XML" section) by following these steps:
1. Open Visual Studio 2005.
2. Open the PurchaseOrder.xml file in Visual Studio by choosing
File➪Open and navigating to the directory that you stored
PurchaseOrder.xml in.
3. To create an XSD schema, choose XML➪Create Schema.
The XML menu displays in Visual Studio only when you have an XML file
open.
After a pause while Visual Studio creates the XSD schema,
The schema that Visual Studio created is in the PurchaseOrder.xsd file.
For convenience, I show the XSD schema here:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="qualified"
<xs:element name="PurchaseOrder">
<xs:complexType>
<xs:sequence>
<xs:element name="Date" type="xs:string" />
<xs:element name="From" type="xs:string" />
<xs:element name="Contact" type="xs:string" />
<xs:element name="ContactPhone" type="xs:string" />
<xs:element name="LineItems">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="LineItem">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="Quantity"
type="xs:unsignedShort" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Notice that the XSD schema is itself an XML document. The document element
is an xs:schema element. The name of the xs:schema element defines it as
being a W3C XML Schema element. The prefix, xs, is associated with a URL
that is unique to W3C XML Schema in the start tag of the xs:schema element:
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="qualified"
The association of a namespace prefix with a namespace URI is called a
namespace declaration.
As is often the case, the XSD schema document is much longer than the XML
instance document it describes. In this document, the first xs:element
<xs:element name="PurchaseOrder">
tells you that the name of the document element of PurchaseOrder.xml
and other documents of the same class is PurchaseOrder. The rest of the
schema defines the allowed structure of the purchase order.
Now you have an XSD schema that you can use to work with a typed XML
column.
Using typed XML
Typed XML is used in a similar way to untyped XML but you must specify the
schema(s) to be associated with the typed XML before the XML is stored, if
you want SQL Server to automatically validate XML data as you add it to the
database.
You can associate an XSD schema with XML in the following situations:
� Column
� Function parameter
� Variable
The following steps show you how to create a table to hold a typed XML
column. You can omit Steps 1 and 2 if you created the Chapter7 database in
the earlier example (in the "Using untyped XML" section) and left SQL Server
Management Studio open.
1. Open SQL Server Management Studio and click the Database Engine
Query button to create a new database engine query.
2. Create the XMLUSE database by using the following code:
CREATE DATABASE XMLUSE
 
3. Elect to use the newly created XMLUSE database using this code:
USE XMLUSE
4. Create an XML schema collection called PurchaseOrderCollection
in the XMLUSE database using the following code:
CREATE XML SCHEMA COLLECTION PurchaseOrderCollection
AS
'<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="qualified"
<xs:element name="PurchaseOrder">
<xs:complexType>
<xs:sequence>
<xs:element name="Date" type="xs:string" />
<xs:element name="From" type="xs:string" />
<xs:element name="Contact" type="xs:string" />
<xs:element name="ContactPhone" type="xs:string"
/>
<xs:element name="LineItems">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded"
name="LineItem">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="Quantity"
type="xs:unsignedShort" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>'
The CREATE SCHEMA COLLECTION statement specifies the name for
the schema collection and you supply a literal XSD schema document as
the argument.
5. Verify that the schema collection has been created by using the following
code:
SELECT name, create_date
FROM sys.xml_schema_collections

6. Create the TypedPurchaseOrders table by using the following code:
CREATE TABLE TypedPurchaseOrders (PurchaseID int
primary key,
PurchaseOrder xml (PurchaseOrderCollection))
Notice that after you specify the PurchaseID and PurchaseOrder
columns in the normal way, you supply the name of an XML schema collection
to specify which schema is to apply to the PurchaseOrder
column.
7. Add a valid purchase order to the TypedPurchaseOrders table using
the following code:
INSERT INTO TypedPurchaseOrders VALUES(1, '<?xml
version="1.0"?>
<PurchaseOrder>
<Date>2005/12/31</Date>
<From>Some Fictional Company</From>
<Contact>Fred Smith</Contact>
<ContactPhone>123-456-7890</ContactPhone>
<LineItems>
<LineItem Quantity="3">Some article</LineItem>
<LineItem Quantity="12">Some other
article</LineItem>
<LineItem Quantity="300">Yet another kind of
article</LineItem>
</LineItems>
</PurchaseOrder>')
Apart from the change in the name of the table, this syntax is the same
as the untyped XML syntax. After you associate the schema with a
column in Step 6, you simply add the XML to the column in the same
way as with untyped XML.
8. Check that the purchase order was added correctly, using the following
code:
SELECT PurchaseOrder FROM TypedPurchaseOrders

9. Attempt to add an invalid purchase order to the
TypedPurchaseOrders table by using the following code.
Notice that the XML contains two Contact elements, when the schema
specifies that only one is allowed. You see an error.
INSERT INTO TypedPurchaseOrders VALUES(2, '<?xml
version="1.0"?>
<PurchaseOrder>
<Date>2005/12/31</Date>
<From>Some Fictional Company</From>
<Contact>Fred Smith</Contact>
<Contact>Not allowed here.</Contact>
<ContactPhone>123-456-7890</ContactPhone>
<LineItems>
<LineItem Quantity="3">Some article</LineItem>
<LineItem Quantity="12">Some other
article</LineItem>
<LineItem Quantity="300">Yet another kind of
article</LineItem>
</LineItems>
</PurchaseOrder>')
I show you how to store both untyped and typed XML. You also need to
retrieve XML data.
 
wait for the next Article Querying XML using XQuery or XPath in SQL Server

No comments :

Post a Comment