If you want to skip all the gory details and just see the answer, look at the SQL query at the bottom of this post.
The main problem here is that none of SQL Server's FOR XML options can generate the dynamic element names required by the desired output. So my first answer is to simply return a normal SQL result set and build the XML on the client. That is a very simple streaming transformation. However, it may not be an option for you, so let's carry on and generate the XML in SQL Server.
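For comparison, the client-side route could be as small as the following Python sketch. It assumes the plain result set has already been fetched as (empId, attributeName, attributeVal) tuples ordered by empId, and that attribute names are valid XML names. The function name rows_to_xml and the sample rows are illustrative only.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter


def rows_to_xml(rows):
    """Build <rows><row id=...><name>value</name>...</row></rows> from
    (empId, attributeName, attributeVal) tuples sorted by empId."""
    root = ET.Element("rows")
    for emp_id, group in groupby(rows, key=itemgetter(0)):
        row = ET.SubElement(root, "row", id=str(emp_id))
        for _, name, value in group:
            # The element name comes straight from the data;
            # ElementTree escapes the text content when serializing.
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative rows, standing in for the fetched result set.
sample = [
    (1, "attribute1", "someValue1"),
    (1, "attribute2", "someValue2"),
    (2, "attribute1", "someValue3"),
    (2, "attribute3", "someValue4 & <>!"),
]
print(rows_to_xml(sample))
```

Because the library does the escaping and the grouping is a single pass over sorted rows, none of the string tricks used in the rest of this answer are needed on the client.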
My second thought was to use SQL Server's built-in XQuery functionality to perform the conversion, like this:
SELECT
  CAST((SELECT * FROM data FOR XML RAW) AS XML)
    .query('
      <rows>
        {
          for $empId in distinct-values(/row/@empId)
          return
            <row empid="{$empId}">
            {
              for $attr in /row[@empId = $empId]
              return
                attribute { "attribute" } { $attr/@attributeValue }
            }
            </row>
        }
      </rows>
    ')
Alas, this does not work. SQL Server complains:
Msg 9315, Level 16, State 1, Line 25
XQuery [query()]: Only constant expressions are supported for the name expression of computed element and attribute constructors.
The XQuery implementation seems to have the same limitation as the FOR XML options. So my second answer is, again, to suggest generating the XML on the client side :) But if you insist on generating XML from SQL, then fasten your seat belts...
The overall strategy is to abandon SQL Server's native tools for generating XML. Instead, we are going to build the XML document using string concatenation. If this approach offends you, you can stop reading now :)
Let's start by creating a sample dataset to play with:
SELECT NULL AS empId INTO employee WHERE 1=0
UNION SELECT 1
UNION SELECT 2

SELECT NULL AS dataId, NULL AS empId, NULL AS attributeId, NULL AS attributeVal INTO employeeAttributes WHERE 1=0
UNION SELECT 10, 1, 'A1', 'someValue1'
UNION SELECT 20, 1, 'A2', 'someValue2'
UNION SELECT 30, 2, 'A1', 'someValue3'
UNION SELECT 40, 2, 'A3', 'someValue4 & <>!'

SELECT NULL AS attributeId, NULL AS attributeName INTO attributes WHERE 1=0
UNION SELECT 'A1', 'attribute1'
UNION SELECT 'A2', 'attribute2'
UNION SELECT 'A3', 'attribute3'
Note that I changed the value of the last attribute in the above example to include some XML-unfriendly characters.
Now write a basic SQL query to perform the necessary joins:
SELECT
  e.empId
, a.attributeName
, ea.attributeVal
FROM employee AS e
INNER JOIN employeeAttributes AS ea
  ON ea.empId = e.empId
INNER JOIN attributes AS a
  ON a.attributeId = ea.attributeId
which gives this result:
empId  attributeName  attributeVal
1      attribute1     someValue1
1      attribute2     someValue2
2      attribute1     someValue3
2      attribute3     someValue4 & <>!
Those funny characters in the last attribute value will give us problems. Let's modify the query to escape them.
; WITH cruftyData AS (
  SELECT
    e.empId
  , a.attributeName
  , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
  FROM employee AS e
  INNER JOIN employeeAttributes AS ea
    ON ea.empId = e.empId
  INNER JOIN attributes AS a
    ON a.attributeId = ea.attributeId
)
, data AS (
  SELECT
    empId
  , attributeName
  , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
  FROM cruftyData
)
SELECT * FROM data
with the results:
empId  attributeName  attributeVal
1      attribute1     someValue1
1      attribute2     someValue2
2      attribute1     someValue3
2      attribute3     someValue4 &amp; &lt;&gt;!
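To see why the offsets 9 and LEN(...)-11 work: FOR XML RAW wraps each value as <row x="..."/>, which is eight characters of prefix and three of suffix. Here is a small Python sketch of the same arithmetic. The helper names wrap and strip_wrapper are hypothetical, and xml.sax.saxutils.escape only approximates SQL Server's escaping (an assumption; the server may treat other characters differently).

```python
from xml.sax.saxutils import escape

PREFIX = '<row x="'   # 8 characters, so the value starts at 1-based position 9
SUFFIX = '"/>'        # 3 characters; 8 + 3 = 11 characters of wrapper in total


def wrap(value):
    """Approximate what (SELECT value AS x FOR XML RAW) returns."""
    return PREFIX + escape(value, {'"': "&quot;"}) + SUFFIX


def strip_wrapper(wrapped):
    """Mirror SUBSTRING(wrapped, 9, LEN(wrapped) - 11) with 0-based slicing."""
    start = 9 - 1
    length = len(wrapped) - 11
    return wrapped[start:start + length]


print(strip_wrapper(wrap("someValue1")))        # someValue1
print(strip_wrapper(wrap("someValue4 & <>!")))  # someValue4 &amp; &lt;&gt;!
```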
This ensures that attribute values can now be safely used in an XML document. What about attribute names? The rules for XML element and attribute names are more restrictive than the rules for element content. We will assume that the attribute names are valid XML identifiers. If that is not the case, some scheme will have to be devised to convert the names in the database into valid XML names. That is left as an exercise for the reader :)
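As a starting point for that exercise, here is one possible scheme, sketched in Python. It is deliberately conservative: it accepts only an ASCII subset of what XML allows in names, replaces everything else with an underscore, and prefixes an underscore when the first character cannot start a name. The function to_xml_name is hypothetical, and note that two different database names can collide after conversion.

```python
import re

# Conservative ASCII subset of XML names (no colons, no non-ASCII characters).
VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def to_xml_name(raw):
    """Hypothetical scheme: replace disallowed characters with '_' and
    prefix '_' when the first character cannot start a name."""
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", raw)
    if not cleaned or not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


for name in ["attribute1", "hire date", "2ndPhone"]:
    print(name, "->", to_xml_name(name))
```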
The next task is to make sure that the attributes are grouped together for each employee, and that we can tell when we are at the first or the last value in a group. Here is the updated query:
; WITH cruftyData AS (
  SELECT
    e.empId
  , a.attributeName
  , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
  FROM employee AS e
  INNER JOIN employeeAttributes AS ea
    ON ea.empId = e.empId
  INNER JOIN attributes AS a
    ON a.attributeId = ea.attributeId
)
, data AS (
  SELECT
    empId
  , attributeName
  , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
  FROM cruftyData
)
SELECT * FROM data ORDER BY 1, 2
The only change is the addition of the up and down columns to the result set:
empId  attributeName  attributeVal                down  up
1      attribute1     someValue1                  2     1
1      attribute2     someValue2                  1     2
2      attribute1     someValue3                  2     1
2      attribute3     someValue4 &amp; &lt;&gt;!  1     2
Now we can identify the first attribute for an employee because its up value will be 1. The last attribute can be identified in the same way using the down column.
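If the two ROW_NUMBER() columns feel opaque, the following Python sketch computes the same up and down numbers for rows that are already sorted by empId and attributeName. The function number_rows is a hypothetical stand-in for the window functions, not a translation of how SQL Server evaluates them.

```python
from itertools import groupby
from operator import itemgetter

rows = [  # (empId, attributeName), already sorted by empId, then attributeName
    (1, "attribute1"), (1, "attribute2"),
    (2, "attribute1"), (2, "attribute3"),
]


def number_rows(sorted_rows):
    """Yield (empId, attributeName, down, up), like the two ROW_NUMBER() columns."""
    for emp_id, group in groupby(sorted_rows, key=itemgetter(0)):
        members = list(group)
        count = len(members)
        for up, (_, name) in enumerate(members, start=1):
            down = count - up + 1  # counts from the end of the partition
            yield emp_id, name, down, up


for row in number_rows(rows):
    print(row)
```

A row with up equal to 1 opens a group, and a row with down equal to 1 closes it.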
Armed with all of this, we are now ready for the nasty business of building the XML result using string concatenation.
; WITH cruftyData AS (
  SELECT
    e.empId
  , a.attributeName
  , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
  FROM employee AS e
  INNER JOIN employeeAttributes AS ea
    ON ea.empId = e.empId
  INNER JOIN attributes AS a
    ON a.attributeId = ea.attributeId
)
, data AS (
  SELECT
    empId
  , attributeName
  , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
  FROM cruftyData
)
, xmlData AS (
  SELECT
    empId
  , up
  , CASE WHEN up <> 1 THEN '' ELSE '<row id="' + CAST (empId AS NVARCHAR) + '">' END AS xml1
  , '<' + attributeName + '>' + attributeVal + '</' + attributeName + '>' AS xml2
  , CASE WHEN down <> 1 THEN '' ELSE '</row>' END AS xml3
  FROM data
)
SELECT xml1, xml2, xml3
FROM xmlData
ORDER BY empId, up
with the result:
xml1          xml2                                                 xml3
<row id="1">  <attribute1>someValue1</attribute1>
              <attribute2>someValue2</attribute2>                  </row>
<row id="2">  <attribute1>someValue3</attribute1>
              <attribute3>someValue4 &amp; &lt;&gt;!</attribute3>  </row>
All that remains is to concatenate all the rows and add the root rows tags. Since T-SQL does not have a string-concatenation aggregate function (at least not before STRING_AGG arrived in SQL Server 2017), we resort to using a variable as an accumulator. Here is the final query in all its hacky glory:
DECLARE @result AS NVARCHAR(MAX)
SELECT @result = '<rows>'

; WITH cruftyData AS (
  SELECT
    e.empId
  , a.attributeName
  , (SELECT ea.attributeVal AS x FOR XML RAW) AS attributeValXml
  FROM employee AS e
  INNER JOIN employeeAttributes AS ea
    ON ea.empId = e.empId
  INNER JOIN attributes AS a
    ON a.attributeId = ea.attributeId
)
, data AS (
  SELECT
    empId
  , attributeName
  , SUBSTRING(attributeValXml, 9, LEN(attributeValXml)-11) AS attributeVal
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName DESC) AS down
  , ROW_NUMBER() OVER (PARTITION BY empId ORDER BY attributeName) AS up
  FROM cruftyData
)
, xmlData AS (
  SELECT
    empId
  , up
  , CASE WHEN up <> 1 THEN '' ELSE '<row id="' + CAST (empId AS NVARCHAR) + '">' END AS xml1
  , '<' + attributeName + '>' + attributeVal + '</' + attributeName + '>' AS xml2
  , CASE WHEN down <> 1 THEN '' ELSE '</row>' END AS xml3
  FROM data
)
SELECT @result = @result + xml1 + xml2 + xml3
FROM xmlData
ORDER BY empId, up

SELECT @result = @result + '</rows>'
SELECT @result
The XML ends up in the @result variable. You can check that it is well-formed XML using:
SELECT CAST(@result AS XML)
The final XML is as follows:
<rows><row id="1"><attribute1>someValue1</attribute1><attribute2>someValue2</attribute2></row><row id="2"><attribute1>someValue3</attribute1><attribute3>someValue4 &amp; &lt;&gt;!</attribute3></row></rows>
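The accumulator logic is easy to lose inside the CTEs, so here is a minimal Python sketch of the same concatenation. It assumes the rows arrive ordered by empId and up, with values already escaped, and it uses xml.etree.ElementTree in place of CAST(... AS XML) as the well-formedness check. The hard-coded data list is illustrative.

```python
import xml.etree.ElementTree as ET

# (empId, attributeName, escaped attributeVal, down, up), ordered by empId, up
data = [
    (1, "attribute1", "someValue1", 2, 1),
    (1, "attribute2", "someValue2", 1, 2),
    (2, "attribute1", "someValue3", 2, 1),
    (2, "attribute3", "someValue4 &amp; &lt;&gt;!", 1, 2),
]

result = "<rows>"
for emp_id, name, value, down, up in data:
    xml1 = '<row id="%d">' % emp_id if up == 1 else ""  # open on first attribute
    xml2 = "<%s>%s</%s>" % (name, value, name)
    xml3 = "</row>" if down == 1 else ""                 # close on last attribute
    result += xml1 + xml2 + xml3
result += "</rows>"

root = ET.fromstring(result)  # raises ParseError if the string is not well-formed
print(result)
```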