SQL Server to create XML data rows from a JOINed select statement

Question


I have three tables in SQL Server 2008 that are configured as follows:

EMPLOYEE table

    empid(PK)
    1
    2

connected to EMPLOYEEATTRIBUTES

    dataId(PK) | empId(FK) | attributeid | attributeVal
    10         | 1         | A1          | somevalue1
    20         | 1         | A2          | somevalue2
    30         | 2         | A1          | somevalue3
    40         | 2         | A3          | somevalue4

connected to ATTRIBUTES

    attributeid | attributeName
    A1          | attribute1
    A2          | attribute2
    A3          | attribute3

I need to get XML data in the following format:

    <rows>
      <row empid="1">
        <attribute1>somevalue1</attribute1>
        <attribute2>somevalue2</attribute2>
      </row>
      <row empid="2">
        <attribute1>somevalue3</attribute1>
        <attribute3>somevalue4</attribute3>
      </row>
    </rows>

Does anyone know how to do this?


sql xml sql-server path

Puc Dec 05 '10 at 20:51


3 answers

Wreach · Answer 1 · 2010-12-11T20:35:22+0000

If you want to skip the gory details and just see the answer, look at the SQL query at the bottom of this post.

The main problem here is that none of SQL Server's FOR XML options can generate the dynamic element names that the desired output calls for. So my first answer is to simply return a normal SQL result set and have the client build the XML. That is a very simple streaming transformation. However, this may not be an option for you, so let's press on with generating the XML in SQL Server.

My second thought was to use SQL Server's built-in XQuery support to perform the transformation, like this:
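The client-side approach can be sketched as follows. This is only an illustration, assuming a Python client that has already fetched the plain join result as `(empId, attributeName, attributeVal)` tuples; the function name `rows_to_xml` and the sample values are hypothetical and not part of the original answer.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter


def rows_to_xml(records):
    """Build <rows><row empid="..."><name>value</name>...</row></rows>.

    `records` is an iterable of (empId, attributeName, attributeVal)
    tuples, i.e. the plain result set of the three-table join.
    """
    root = ET.Element("rows")
    # Sort so that all attributes of one employee are adjacent for groupby.
    ordered = sorted(records, key=itemgetter(0, 1))
    for emp_id, group in groupby(ordered, key=itemgetter(0)):
        row = ET.SubElement(root, "row", empid=str(emp_id))
        for _, name, value in group:
            # The element name comes straight from the data. ElementTree
            # does not validate it, so names are assumed to be valid XML
            # names. Text content is escaped automatically on output.
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative data only: the question's rows, with one value changed to
# contain characters that need escaping.
sample = [
    (1, "attribute1", "somevalue1"),
    (1, "attribute2", "somevalue2"),
    (2, "attribute1", "somevalue3"),
    (2, "attribute3", "somevalue4 & <more>"),
]
xml_text = rows_to_xml(sample)
print(xml_text)
```

Because the element name is just a string on the client, the dynamic-name restriction of FOR XML does not apply, and escaping of the values is handled by the XML library rather than by hand.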

    /* WARNING: the following SQL does not work */
    SELECT
      CAST((SELECT * FROM data FOR XML RAW) AS XML)
        .query('
          <rows>
            {
              for $empId in distinct-values(/row/@empId)
              return
                <row empid="{$empId}">
                {
                  for $attr in /row[@empId = $empId]
                  return
                    attribute { "attribute" } { $attr/@attributeValue }
                }
                </row>
            }
          </rows>
        ')

Alas, this does not work. SQL Server complains:

    Msg 9315, Level 16, State 1, Line 25
    XQuery [query()]: Only constant expressions are supported for the name expression of computed element and attribute constructors.

The XQuery implementation seems to have the same limitation as the FOR XML features. So my second answer is, again, to suggest generating the XML on the client side :) But if you insist on generating the XML from SQL, then fasten your seat belts ...

The overall strategy is to abandon SQL Server's native facilities for generating XML. Instead, we are going to build the XML document using string concatenation. If you find this approach offensive, you can stop reading now :)

Start by creating a sample dataset to play with:

    SELECT NULL AS empId INTO employee WHERE 1=0
    UNION SELECT 1
    UNION SELECT 2

    SELECT NULL AS dataId, NULL AS empId, NULL AS attributeId, NULL AS attributeVal
    INTO employeeAttributes WHERE 1=0
    UNION SELECT 10, 1, 'A1', 'someValue1'
    UNION SELECT 20, 1, 'A2', 'someValue2'
    UNION SELECT 30, 2, 'A1', 'someValue3'
    UNION SELECT 40, 2, 'A3', 'someValue4 & <>!'

    SELECT NULL AS attributeId, NULL AS attributeName
    INTO attributes WHERE 1=0
    UNION SELECT 'A1', 'attribute1'
    UNION SELECT 'A2', 'attribute2'
    UNION SELECT 'A3', 'attribute3'

Note that I changed the value of the last attribute in the above example to include some XML-unfriendly characters.

Now write a basic SQL query to perform the necessary joins:

    SELECT e.empId
    , a.attributeName
    , ea.attributeVal
    FROM employee AS e
    INNER JOIN employeeAttributes AS ea ON ea.empId = e.empId
    INNER JOIN attributes AS a ON a.attributeId = ea.attributeId

which gives this result:

    empId  attributeName  attributeVal
    1      attribute1     someValue1
    1      attribute2     someValue2
    2      attribute1     someValue3
    2      attribute3     someValue4 & <>!

Those funny characters in the last attribute value will cause us trouble. Let's change the query to escape them.

    ; WITH cruftyData AS (
      SELECT e.empId
      , a.attributeName
      -- FOR XML RAW yields <row x="..."/> with the value XML-escaped
      , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
      FROM employee AS e
      INNER JOIN employeeAttributes AS ea ON ea.empId = e.empId
      INNER JOIN attributes AS a ON a.attributeId = ea.attributeId
    )
    , data AS (
      SELECT empId
      , attributeName
      -- strip the 8-character prefix <row x=" and the 3-character suffix "/>
      , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
      FROM cruftyData
    )
    SELECT * FROM data

with the results:

    empId  attributeName  attributeVal
    1      attribute1     someValue1
    1      attribute2     someValue2
    2      attribute1     someValue3
    2      attribute3     someValue4 &amp; &lt;&gt;!

This ensures that the attribute values can now be used safely in an XML document. What about the attribute names? They become XML element names, and the rules for XML names are more restrictive than the rules for element content. We will assume that the attribute names are valid XML names. If that is not the case, some scheme will have to be devised to convert the names in the database into valid XML names. This is left as an exercise for the reader :)

The next task is to make sure that the attributes are grouped together for each employee, and that we can tell when we are at the first or last value in a group. Here is the updated query:

    ; WITH cruftyData AS (
      SELECT e.empId
      , a.attributeName
      , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
      FROM employee AS e
      INNER JOIN employeeAttributes AS ea ON ea.empId = e.empId
      INNER JOIN attributes AS a ON a.attributeId = ea.attributeId
    )
    , data AS (
      SELECT empId
      , attributeName
      , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
      FROM cruftyData
    )
    SELECT * FROM data ORDER BY 1, 2
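One possible way to do that conversion on the client is sketched below. It is an assumption-laden illustration, not the author's method: the helper `to_xml_name` is hypothetical, and it deliberately accepts only a conservative ASCII subset of what the XML specification allows in a name.

```python
import re

# Conservative ASCII-only subset of the XML Name production: a letter or
# underscore first, then letters, digits, '.', '-' or '_'. Real XML allows
# many more Unicode characters; colons are excluded here because they
# denote namespace prefixes.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def to_xml_name(raw):
    """Map an arbitrary string to a valid (ASCII) XML element name."""
    name = _INVALID_CHARS.sub("_", raw)
    # A name may not be empty and may not start with a digit, '.' or '-'.
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name


print(to_xml_name("attribute1"))
print(to_xml_name("hire date"))
print(to_xml_name("2nd phone"))
```

Note that this mapping is lossy: two different database names (for example `hire date` and `hire_date`) end up as the same element name, so a real scheme would need a rule for collisions.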

The only change is the addition of the up and down columns to the result set:

    empId  attributeName  attributeVal                down  up
    1      attribute1     someValue1                  2     1
    1      attribute2     someValue2                  1     2
    2      attribute1     someValue3                  2     1
    2      attribute3     someValue4 &amp; &lt;&gt;!  1     2

Now we can identify the first attribute for an employee, because its up value will be 1. The last attribute can be identified in the same way using the down column.

Armed with all of this, we are now ready to tackle the nasty business of building the XML result using string concatenation.

    ; WITH cruftyData AS (
      SELECT e.empId
      , a.attributeName
      , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
      FROM employee AS e
      INNER JOIN employeeAttributes AS ea ON ea.empId = e.empId
      INNER JOIN attributes AS a ON a.attributeId = ea.attributeId
    )
    , data AS (
      SELECT empId
      , attributeName
      , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
      FROM cruftyData
    )
    , xmlData AS (
      SELECT empId
      , up
      -- open the <row> element on the first attribute of each employee
      , CASE WHEN up <> 1 THEN '' ELSE '<row id="' + CAST (empId AS NVARCHAR) + '">' END AS xml1
      , '<' + attributeName + '>' + attributeVal + '</' + attributeName + '>' AS xml2
      -- close the <row> element on the last attribute of each employee
      , CASE WHEN down <> 1 THEN '' ELSE '</row>' END AS xml3
      FROM data
    )
    SELECT xml1, xml2, xml3
    FROM xmlData
    ORDER BY empId, up

with the result:

    xml1          xml2                                                 xml3
    <row id="1">  <attribute1>someValue1</attribute1>
                  <attribute2>someValue2</attribute2>                  </row>
    <row id="2">  <attribute1>someValue3</attribute1>
                  <attribute3>someValue4 &amp; &lt;&gt;!</attribute3>  </row>

All that remains is to concatenate the rows and add the root rows tags. Since T-SQL (as of SQL Server 2008) does not have a string concatenation aggregate function, we resort to using a variable as an accumulator. Here is the final query in all its hacky glory:

    DECLARE @result AS NVARCHAR(MAX)
    SELECT @result = '<rows>'

    ; WITH cruftyData AS (
      SELECT e.empId
      , a.attributeName
      , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
      FROM employee AS e
      INNER JOIN employeeAttributes AS ea ON ea.empId = e.empId
      INNER JOIN attributes AS a ON a.attributeId = ea.attributeId
    )
    , data AS (
      SELECT empId
      , attributeName
      , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
      , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
      FROM cruftyData
    )
    , xmlData AS (
      SELECT empId
      , up
      , CASE WHEN up <> 1 THEN '' ELSE '<row id="' + CAST (empId AS NVARCHAR) + '">' END AS xml1
      , '<' + attributeName + '>' + attributeVal + '</' + attributeName + '>' AS xml2
      , CASE WHEN down <> 1 THEN '' ELSE '</row>' END AS xml3
      FROM data
    )
    SELECT @result = @result + xml1 + xml2 + xml3
    FROM xmlData
    ORDER BY empId, up

    SELECT @result = @result + '</rows>'
    SELECT @result

The XML ends up in the @result variable. You can verify that it is well-formed XML using:

 SELECT CAST(@result AS XML)

The final XML is as follows:

 <rows><row id="1"><attribute1>someValue1</attribute1><attribute2>someValue2</attribute2></row><row id="2"><attribute1>someValue3</attribute1><attribute3>someValue4 &amp; &lt;&gt;!</attribute3></row></rows>

marc_s · Answer 2 · 2010-12-05T21:04:19+0000

You can get close, but you cannot get exactly the desired output.

Using this query:

    SELECT
        EmpID AS '@empid',
        (
            SELECT
                a.AttributeName AS '@name',
                ea.AttributeVal
            FROM dbo.EmployeeAttributes ea
            INNER JOIN dbo.Attributes a ON ea.AttributeId = a.AttributeId
            WHERE ea.EmpID = e.EmpID
            FOR XML PATH ('attribute'), TYPE
        )
    FROM dbo.Employee e
    FOR XML PATH('row'), ROOT('rows')

you will get this output:

    <rows>
      <row empid="1">
        <attribute name="Attribute1">
          <AttributeVal>SomeValue1</AttributeVal>
        </attribute>
        <attribute name="attribute2">
          <AttributeVal>SomeValue2</AttributeVal>
        </attribute>
      </row>
      <row empid="2">
        <attribute name="Attribute1">
          <AttributeVal>SomeValue3</AttributeVal>
        </attribute>
        <attribute name="attribute3">
          <AttributeVal>SomeValue4</AttributeVal>
        </attribute>
      </row>
    </rows>

What you cannot do is make the inner XML nodes have tag names that match the attribute name. You have to use some fixed tag name (such as <attribute> in my example) and then apply the values retrieved from your tables either as attributes on those XML tags (such as the name= attribute in my example) or as XML element values.

As far as I know, there is no way to use AttributeValue as the name of an XML tag ....
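If a client-side step is acceptable, the fixed-tag output can be reshaped into the format the question asks for after it leaves the server. The following is a minimal sketch, assuming a Python client and assuming the stored attribute names are valid XML names; the function `promote_names` is a hypothetical helper, not something SQL Server provides.

```python
import xml.etree.ElementTree as ET


def promote_names(xml_text):
    """Turn <attribute name="n"><AttributeVal>v</AttributeVal></attribute>
    into <n>v</n> for every row of the FOR XML PATH output."""
    root = ET.fromstring(xml_text)
    for row in root.findall("row"):
        for attr in row.findall("attribute"):
            # Rename the fixed <attribute> tag to the value of its name=
            # attribute (assumed to be a valid XML name) and drop name=.
            attr.tag = attr.attrib.pop("name")
            # Lift the text of the <AttributeVal> child into the element.
            attr.text = attr.findtext("AttributeVal")
            # Remove the now redundant child elements.
            for child in list(attr):
                attr.remove(child)
    return ET.tostring(root, encoding="unicode")


# Compact sample in the shape produced by the query above.
server_xml = (
    '<rows><row empid="1">'
    '<attribute name="attribute1"><AttributeVal>somevalue1</AttributeVal></attribute>'
    '<attribute name="attribute2"><AttributeVal>somevalue2</AttributeVal></attribute>'
    '</row></rows>'
)
print(promote_names(server_xml))
```

This keeps the server query simple and moves the one thing FOR XML cannot do, naming an element from data, to a place where it is trivial.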

Stuart Ainsworth · Answer 3 · 2010-12-10T20:41:27+0000

Here's an answer, but the PIVOT command limits you to knowing the names of your attributes in advance. With a little tweaking, you could do this dynamically (try searching for dynamic pivot in SQL Server 2005):

    DECLARE @Employee TABLE ( empid INT )
    DECLARE @EA TABLE
        (
          dataid INT
        , empid INT
        , attributeid CHAR(2)
        , AttributeVal VARCHAR(100)
        )
    DECLARE @Attributes TABLE
        (
          AttributeID CHAR(2)
        , AttributeName VARCHAR(100)
        )

    INSERT INTO @Employee
    VALUES ( 1 ), ( 2 )

    INSERT INTO @EA ( dataid, empid, attributeid, AttributeVal )
    VALUES ( 10, 1, 'A1', 'somevalue1' )
         , ( 20, 1, 'A2', 'somevalue2' )
         , ( 30, 2, 'A1', 'somevalue3' )
         , ( 40, 2, 'A3', 'somevalue4' )

    INSERT INTO @Attributes ( AttributeID, AttributeName )
    VALUES ( 'A1', 'attribute1' )
         , ( 'A2', 'attribute2' )
         , ( 'A3', 'attribute3' )

    SELECT empID as '@empid'
         , attribute1
         , attribute2
         , attribute3
         , attribute4
    FROM ( SELECT e.empid
                , a.AttributeName
                , ea.AttributeVal
           FROM @Employee e
           JOIN @EA ea ON e.empid = ea.empid
           JOIN @Attributes a ON ea.attributeid = a.attributeid
         ) ps
    PIVOT
    ( MIN(AttributeVal) FOR AttributeName IN ( [attribute1], [attribute2], [attribute3], [attribute4] ) ) AS pvt
    FOR XML PATH('row'), ROOT('rows')
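One way to make the column list dynamic is to have the client read the attribute names first and then generate the PIVOT statement from them. The sketch below only builds the query text. It assumes permanent tables named Employee, EmployeeAttributes and Attributes (the table variables above would not be visible to a separately submitted batch), and the helpers `quote_name` and `build_pivot_query` are hypothetical names introduced for illustration.

```python
def quote_name(identifier):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_pivot_query(attribute_names):
    """Return the PIVOT ... FOR XML query text for the given names.

    The names would typically come from a prior query such as
    SELECT AttributeName FROM Attributes (table name assumed).
    """
    if not attribute_names:
        raise ValueError("at least one attribute name is required")
    columns = ", ".join(quote_name(name) for name in attribute_names)
    lines = [
        f"SELECT empid AS '@empid', {columns}",
        "FROM (",
        "    SELECT e.empid, a.AttributeName, ea.AttributeVal",
        "    FROM Employee e",
        "    JOIN EmployeeAttributes ea ON e.empid = ea.empid",
        "    JOIN Attributes a ON ea.attributeid = a.attributeid",
        ") ps",
        f"PIVOT (MIN(AttributeVal) FOR AttributeName IN ({columns})) AS pvt",
        "FOR XML PATH('row'), ROOT('rows')",
    ]
    return "\n".join(lines)


print(build_pivot_query(["attribute1", "attribute2", "attribute3"]))
```

Quoting each name with brackets keeps names containing spaces or reserved words from breaking the generated statement, but the names still have to be valid XML element names for FOR XML PATH to accept them.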
