Introduction to XQuery

Whilst a JSON lad myself; the XML data type is commonly used, especially in applications using .Net and dealing with serialization. Data sets provided in XML can be parsed with relative ease using C#; however in SQL Server it can be tricky especially when you want to work in sets. I’ve seen (and previously written) complex string concatenation, CASE statements & even cursors to modify XML with the end result always being a muddle.

There is an easier way that comes in the form of XQuery; a set of functions introduced with the XML data type in SQL Server 2005. XQuery allows for the transformation and modification of XML within queries without any manual parsing. XQuery is one of those “hard to learn – hell to master” style of languages but it is exceeding powerful.

Before we begin; below is our sample data stored in an XML parameter:

DECLARE @SampleXml XML = ( 
    '<Organisations> 
        <Organisation id="1" name="Bristol Data"> 
		    <User id="1"> 
			    <FirstName>Carl</FirstName> 
			    <LastName>Llama</LastName> 
	        </User> 
			<User id="2"> 
			    <FirstName>Agent</FirstName> 
				<LastName>Smith</LastName> 
		    </User> 
		</Organisation> 
		<Organisation id="2" name="Github" /> 
	</Organisations>' 
)

Let’s say our requirement was to get all the users out of this data set; one row per user. In XQuery that would look like this:

SELECT  [UserId] = x.value('(@id)[1]','INT'), 
        [FirstName] = x.value('(FirstName)[1]','VARCHAR(255)'), 
        [LastName] = x.value('(LastName)[1]','VARCHAR(255)')
FROM @SampleXml.nodes('/Organisations/Organisation/User') d(x)

Resulting in:

01_SelectResults

@id denotes an attribute whilst FirstName denotes an XML tag. Note how we have traversed the nodes in the FROM clause in order to only loop through the User entries.

Now let’s say we wanted to change the organisation called “Github” to “Geeklist”; that can be done using:

DECLARE @ReplacementValue VARCHAR(255) = ('Geeklist') 

SET @SampleXml.modify('replace value of (/Organisations/Organisation[@name="Github"]/@name)[1] with sql:variable("@ReplacementValue") ' ) 

SELECT  [OrganisationId] = x.value('(@id)[1]','INT'), 
        [Name] = x.value('(@name)[1]','VARCHAR(255)') 
FROM @SampleXml.nodes('/Organisations/Organisation') d(x)

In the selection we are filtering which record we want to update using [@name=”Github”] (read: attribute should equal “Github”) and specifying which parameter this should change to.

As a final introductory example let’s say we wanted to join some of the data in the XML onto another table so our result set is a combination of the 2. In order to do this we would use:

DECLARE @UserHours TABLE ( UserId INT, Amount INT ) 

INSERT INTO @UserHours (UserId, Amount) 
VALUES (1, 36), (2, 27) 

SELECT  [UserId] = x.value('(@id)[1]','int'), 
        [FirstName] = x.value('(FirstName)[1]','VARCHAR(255)'),
        [LastName] = x.value('(LastName)[1]','VARCHAR(255)'), [Hours] = UH.Amount 
FROM @SampleXml.nodes('/Organisations/Organisation/User') d(x) 
JOIN @UserHours UH ON UH.UserId = x.value('(@id)[1]','int')

Resulting in:

02_JoinResults

If you want to learn more about XQuery you can’t beat this list of articles on Beyond Relational by Jacob.