Parsing SQL Server XML string in varchar field

I have a varchar column in a table that is used to store xml data. Yes, I know there is an xml data type that I should be using, but I think this was set up before the xml data type was available, so varchar is what I have to use for now. :)

The saved data looks something like this:

<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"> <info user="TestUser" /> </xml> 

I need to parse the file name to get the digits between the two underscores, which in this case would be "456". The first part of the file name "should not" vary in length, but the middle part will. I need a solution that still works if the first part changes length (you know it will, because "should not change" always means that it will change).

For what I have now, I use XQuery to pull out the file name, because I figured that is probably better than straight string manipulation. I cast the string to xml for this, but I am not an XQuery expert, so of course I am running into problems. I found an XQuery function (substring-before) but could not get it to work (I am not even sure that function works in SQL Server). There may be an XQuery function that does this, but if so I do not know about it.

So, I get the file name from the table with a query similar to the following:

 select CAST(parms as xml).query('data(/xml/@filename)') as p from Table1 

From this, I assume I could CAST the result back to a string and then use some instring or charindex function to find where the underscores are, wrapping all of that in a substring to pick out the part I need. Without going too far down this path, I am pretty sure I could eventually get it done, but I know there has to be an easier way. This approach would produce a huge unreadable expression in the SQL statement which, even if I moved it into a function, would still be confusing to figure out.

I am sure this is simpler than I am making it, because it seems to be simple string manipulation. Maybe someone can point me in the right direction. Thanks

+6
sql sql-server tsql xquery
3 answers

You can use XQuery for this - just change your statement to:

 SELECT CAST(parms as xml).value('(/xml/@filename)[1]', 'varchar(260)') as p FROM dbo.Table1 

This gives you a VARCHAR(260), long enough to hold any valid file name and path. Now you have a string and can work with it using SUBSTRING, etc.
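As a rough sketch of how the two steps can be combined, the query below extracts the file name with .value() and then names the two underscore positions with CROSS APPLY, so the final SUBSTRING stays readable and does not depend on the length of the first part. The table variable @t and the aliases fName, p1 and p2 are assumptions for illustration; the sample row is the one from the question.

```sql
-- Illustration only: @t stands in for the real table (Table1 in the question).
DECLARE @t TABLE (parms varchar(max));
INSERT INTO @t VALUES
('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');

SELECT SUBSTRING(f.fName, a.p1 + 1, b.p2 - a.p1 - 1) AS myNum
FROM @t AS t
-- Step 1: pull the attribute out as a plain string.
CROSS APPLY (SELECT CAST(t.parms AS xml).value('(/xml/@filename)[1]', 'varchar(260)') AS fName) AS f
-- Step 2: position of the first underscore.
CROSS APPLY (SELECT CHARINDEX('_', f.fName) AS p1) AS a
-- Step 3: position of the second underscore, searching after the first.
CROSS APPLY (SELECT CHARINDEX('_', f.fName, a.p1 + 1) AS p2) AS b;
```

For the sample row, p1 is 7 and p2 is 11, so the SUBSTRING takes three characters starting at position 8. This assumes every file name contains at least two underscores.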

Mark

+5

An easy way to do this is with SUBSTRING and CHARINDEX. Assuming (wisely or not) that the first part of the file name does not change length, but that you still want to use XQuery to locate the file name, here is a short repro that does what you want:

    declare @t table (
      parms varchar(max)
    );
    insert into @t values ('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');

    with T(fName) as (
      select cast(cast(parms as xml).query('data(/xml/@filename)') as varchar(100)) as p
      from @t
    )
    select substring(fName,8,charindex('_',fName,8)-8) as myNum
    from T;

There are sneaky solutions that use other string functions, such as REPLACE and PARSENAME or REVERSE, but none of them is likely to be more efficient or readable. One option to consider is writing a CLR routine that lets you use regular expressions in SQL.
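For reference, one of those sneaky variants might look like the sketch below. It turns the underscores into dots and lets PARSENAME, which splits a dotted name into at most four parts counted from the right, pick out the piece. The variable @fName is an assumption for illustration, and the trick only holds when the file name has exactly two underscores and a single dot before the extension.

```sql
-- Illustration only: @fName holds a file name already extracted from the xml.
DECLARE @fName varchar(260);
SET @fName = '100100_456_484351864768.zip';

-- After REPLACE the value is '100100.456.484351864768.zip'.
-- PARSENAME counts parts from the right: 1 = zip, 2 = 484351864768, 3 = 456, 4 = 100100.
SELECT PARSENAME(REPLACE(@fName, '_', '.'), 3) AS myNum;
```

PARSENAME returns NULL when the string has more than four dot-separated parts, so an extra dot or underscore in the file name breaks this approach silently.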

By the way, if your xml is always this simple, there is no particular reason I can see to use XQuery at all. Here are two queries that will select the right number. The second is safer if you have no control over extra spaces in your xml string or over whether the first part of the file name changes length:

    select substring(parms,23,charindex('_',parms,23)-23) as myNum
    from @t;

    select substring(parms,charindex('_',parms)+1,charindex('_',parms,charindex('_',parms)+1)-charindex('_',parms)-1) as myNum
    from @t;
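One thing the queries above do not handle is a row with fewer than two underscores: CHARINDEX returns 0 when nothing is found, the computed length goes negative, and SUBSTRING raises an error. A minimal sketch of a guard, assuming NULL is an acceptable result for such rows, could look like this. The second sample row and the names pos, p1 and p2 are hypothetical.

```sql
DECLARE @t TABLE (parms varchar(max));
INSERT INTO @t VALUES ('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');
INSERT INTO @t VALUES ('<xml filename="nounderscore.zip" />');  -- hypothetical malformed row

WITH pos AS (
    SELECT parms,
           CHARINDEX('_', parms) AS p1,                              -- first underscore, 0 if none
           CHARINDEX('_', parms, CHARINDEX('_', parms) + 1) AS p2    -- second underscore, 0 if none
    FROM @t
)
SELECT CASE WHEN p1 > 0 AND p2 > p1
            THEN SUBSTRING(parms, p1 + 1, p2 - p1 - 1)
       END AS myNum   -- NULL when two underscores are not found
FROM pos;
```

Like the second query above, this searches the whole xml string, so an underscore elsewhere in the string (for example in the event_dt attribute name) can still be picked up when the file name itself has fewer than two.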
+4

Unfortunately, SQL Server is not a conformant XQuery implementation, but rather a fairly limited subset of a draft version of the XQuery specification. Not only does it lack fn:substring-before, it also lacks fn:index-of, which would let you do it yourself with fn:substring, as well as fn:string-to-codepoints. So, as far as I can tell, you are stuck with SQL here.

+1