Manage XML Data

This objective may include, but is not limited to: XML datatype, XML indexes, XML schema collections

When you want to work with XML data in SQL Server, you have the xml data type at hand. This one has some useful functions to query and modify XML data, but more of this later. There are two types of the xml data type, namely untyped and typed. The difference is that you can use an XML schema to enforce rules when working with typed xml.

To create an XML schema in SQL Server, you use the CREATE XML SCHEMA COLLECTION order. This looks as follows:

CREATE XML SCHEMA COLLECTION MyOwnSchema
AS N’<?xml version=”1.0” encoding=”utf-16” ?>
<xsd:schema xmlns:xsd=”http://www.w3.org/2001/XMLSchema/”>
 <xsd:element name=”Person”>
   <xsd:complexType>
    <xsd:sequence>
     <xsd:element name=”FirstName” type=”xsd:string” />
     <xsd:element name=”LastName” type=”xsd:string” />
    </xsd:sequence>
   </xsd:complexType>
  </xsd:element>
</xsd:schema>

Now you can enforce the validity of any XML you create against this schema by declaring an xml type as follows:

DECLARE @x XML (CONTENT MyOwnSchema);

I think this is pretty straightforward. Now let’s see the methods the xml type offers us:

  • Query(xquery): performs the given XQuery against the xml instance. Returns an untyped xml instance.
  • Value(xquery, sql_type): performs the given XQuery against an xml instance, and returns a scalar of the specified SQL type.
  • Exist(xquery): performs the given query, and returns 0 if the query returns empty, 1 if not, and NULL if the xml instance is NULL.
  • Modify(xml_dml): performs an XML DML statement on the xml instance.
  • Nodes(xquery) as table_name(column_name): performs the xquery on the xml instance, and returns the matching nodes in a SQL result set.

I’m not the biggest fan of these methods, so please check out the SQL Books Online instead.

The last thing to mention here is the question of XML indexes. SQL Server lets you place indexes to improve the querying of xml data type columns. There are two types of XML indexes available:

  • Primary XML Index: each xml column can have exactly on primary XML index on it. It isn’t the classic SQL index, rather a persisted and preshredded representation of XML data. You can imagine this as the XML stored in the xml data type column is expanded into a standard SQL table. In order to use this type of index, you must have a clustered index on the primary key columns of the table.
  • Secondary XML Index: these are nonclustered indexes created on primary XML indexes. The prerequisite is that you must have a primary XML index on the column. There are three different types of secondary XML indexes:
    • PATH: optimized for XPath and XQuery path expressions. Created on the path and node values of the primary XML index.
    • VALUE: the inverse of a PATH index, in case when the path is unknown.
    • PROPERTY: is good when you need other columns’ data, based on values of the xml column.

2 thoughts on “Manage XML Data

  1. Another very strong and powerful post. I’ve been reading through some of your previous posts and finally decided to drop a comment on this one. I signed up for your newsletter, so please keep up the informative posts!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s