Split capitalized words in SQL

Does anyone know how to split out the words that start with a capital letter in a string?

Example:

    DECLARE @var1 varchar(100) = 'OneTwoThreeFour'
    DECLARE @var2 varchar(100) = 'OneTwoThreeFourFive'
    DECLARE @var3 varchar(100) = 'One'

    SELECT @var1 as Col1, <?> as Col2
    SELECT @var2 as Col1, <?> as Col2
    SELECT @var3 as Col1, <?> as Col2

Expected Result:

    Col1                 Col2
    OneTwoThreeFour      One Two Three Four
    OneTwoThreeFourFive  One Two Three Four Five
    One                  One

If this is not possible (or if it takes too long), then a scalar function is also suitable.

+11
sql sql-server tsql
7 answers

Here is a function I created that is similar to the one for removing non-alphabetic characters (see "How to remove all non-alphabetic characters from a string in SQL Server?").

This one uses a case-sensitive (binary) collation to search for a non-space character followed by a capital letter, and then uses the STUFF function to insert a space between the two. This is a scalar UDF, so some people will immediately say that it will be slower than other solutions. To that I say: please test it. This function does not use any table data and loops only as many times as necessary, so it will likely give you very good performance.

    Create Function dbo.Split_On_Upper_Case(@Temp VarChar(1000))
    Returns VarChar(1000)
    AS
    Begin
        Declare @KeepValues as varchar(50)
        -- Any non-space character followed by a capital letter
        Set @KeepValues = '%[^ ][A-Z]%'

        -- The binary collation makes the [A-Z] range match capitals only
        While PatIndex(@KeepValues collate Latin1_General_Bin, @Temp) > 0
            Set @Temp = Stuff(@Temp, PatIndex(@KeepValues collate Latin1_General_Bin, @Temp) + 1, 0, ' ')

        Return @Temp
    End

Call it like this:

    Select dbo.Split_On_Upper_Case('OneTwoThreeFour')
    Select dbo.Split_On_Upper_Case('OneTwoThreeFour')
    Select dbo.Split_On_Upper_Case('One')
    Select dbo.Split_On_Upper_Case('OneTwoThree')
    Select dbo.Split_On_Upper_Case('stackOverFlow')
    Select dbo.Split_On_Upper_Case('StackOverFlow')
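
To see why the collation matters, here is a minimal sketch of a single pass of that PATINDEX/STUFF loop written out by hand. It assumes the pattern is '%[^ ][A-Z]%'; the variable names @s and @pos are illustrative and not part of the function above.

```sql
-- One pass of the PATINDEX/STUFF loop, written out by hand.
-- @s and @pos are illustrative names only.
DECLARE @s varchar(100) = 'OneTwo';

-- Binary collation: [A-Z] matches capitals only, so the first
-- "non-space followed by a capital" pair is 'eT', at position 3.
DECLARE @pos int = PATINDEX('%[^ ][A-Z]%' COLLATE Latin1_General_Bin, @s);

SELECT @pos                        AS MatchPosition,  -- 3
       STUFF(@s, @pos + 1, 0, ' ') AS AfterOnePass;   -- One Two

-- For contrast: under a case-insensitive collation [A-Z] also matches
-- lowercase letters, so the pair 'On' already matches at position 1.
SELECT PATINDEX('%[^ ][A-Z]%' COLLATE Latin1_General_CI_AS, @s) AS CaseInsensitiveMatch;  -- 1
```

Without the case-sensitive collation the loop would keep inserting spaces between ordinary letters, which is why the COLLATE clause is applied on every PATINDEX call.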
+15

Here is the function I just created.

Function

    CREATE FUNCTION dbo.Split_On_Upper_Case
    (
        @String VARCHAR(4000)
    )
    RETURNS VARCHAR(4000)
    AS
    BEGIN
        DECLARE @Char CHAR(1);
        DECLARE @i INT = 1;  -- SUBSTRING positions are 1-based
        DECLARE @OutString VARCHAR(4000) = '';

        WHILE (@i <= LEN(@String))
        BEGIN
            SELECT @Char = SUBSTRING(@String, @i, 1)

            -- Case-sensitive comparison: true when the character equals its own uppercase form
            IF (@Char = UPPER(@Char) Collate Latin1_General_CS_AI)
                SET @OutString = @OutString + ' ' + @Char;
            ELSE
                SET @OutString = @OutString + @Char;

            SET @i += 1;
        END

        -- Drop the space inserted in front of the first capital
        SET @OutString = LTRIM(@OutString);
        RETURN @OutString;
    END

Test Data

    DECLARE @TABLE TABLE (Strings VARCHAR(1000))

    INSERT INTO @TABLE
    VALUES ('OneTwoThree'),
           ('FourFiveSix'),
           ('SevenEightNine')

Query

    SELECT dbo.Split_On_Upper_Case(Strings) AS Vals
    FROM @TABLE

Result set

 โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•— โ•‘ Vals โ•‘ โ• โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฃ โ•‘ One Two Three โ•‘ โ•‘ Four Five Six โ•‘ โ•‘ Seven Eight Nine โ•‘ โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• 
+3

If a single query is required, 26 nested REPLACE calls can be used, one for each uppercase letter, for example:

    SELECT @var1 col1,
           REPLACE(
            REPLACE(
             REPLACE(
              ...
               REPLACE(@var1, 'A', ' A')
              , ...
             , 'X', ' X')
            , 'Y', ' Y')
           , 'Z', ' Z') col2

Not the most beautiful thing, but it will work.
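
Two details are easy to miss with this approach: REPLACE compares using the collation of its input, so under a case-insensitive collation it would also hit lowercase letters, and the first capital gets a leading space as well. A minimal sketch, assuming a case-sensitive collation is forced on the input and limited to the three capitals that occur in the sample value (a full version would need all 26 letters):

```sql
DECLARE @var1 varchar(100) = 'OneTwoThreeFour';

-- Only 'F', 'O' and 'T' are handled here to keep the sketch short.
-- COLLATE on the innermost input makes the comparisons case-sensitive,
-- so the lowercase 'o' in 'Two' and 'Four' is left alone.
-- LTRIM removes the space inserted in front of the first capital.
SELECT @var1 AS Col1,
       LTRIM(
           REPLACE(
            REPLACE(
             REPLACE(@var1 COLLATE Latin1_General_CS_AS, 'F', ' F')
            , 'O', ' O')
           , 'T', ' T')
       ) AS Col2;
```

For the sample value this is intended to return One Two Three Four in Col2.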

EDIT
Just to add another function that does the same thing in a different way from the other answers.

    CREATE FUNCTION splitCapital (@param Varchar(MAX))
    RETURNS Varchar(MAX)
    BEGIN
        Declare @ret Varchar(MAX) = '';
        declare @len int = len(@param);

        WITH Base10(N) AS (
                      SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
            UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
            UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
        ), Chars(nthChar) As (
            -- One row per character of @param (up to 10,000 characters)
            Select TOP(@len)
                   nthChar = substring(@param, u.N + t.N*10 + h.N*100 + th.N*1000 + 1, 1)
                             Collate Latin1_General_CS_AI
            FROM Base10 u
                 CROSS JOIN Base10 t
                 CROSS JOIN Base10 h
                 CROSS JOIN Base10 th
            WHERE u.N + t.N*10 + h.N*100 + th.N*1000 < @len
            ORDER BY u.N + t.N*10 + h.N*100 + th.N*1000
        )
        SELECT @ret += Case nthChar When UPPER(nthChar) Then ' ' Else '' End + nthChar
        FROM Chars;

        -- Drop the space inserted in front of the first capital
        RETURN LTRIM(@ret);
    END

This uses the T-SQL ability to concatenate into a string variable inside a SELECT. I had to use the TOP N trick to force the rows of the Chars CTE into the correct order.

+2

Build a numbers table. There are some great posts on SO that show how to do this. Populate it with values up to the maximum length of your input string. Select the values from 1 to the actual length of the current input string, and cross join this list of numbers to the input string. Use the result to SUBSTRING() out each character. Then you can either compare the resulting list of single-character values against a pre-populated table variable, or convert each character to an integer using ASCII() and keep only those between 65 ("A") and 90 ("Z"). At this point, you have a list of the positions of every uppercase character in your input string.

UNION the maximum length of your input string onto the end of this list. You will see why in just a second. Now you can SUBSTRING() the input variable, starting at the number given by row N and taking a length of (the number given by row N + 1) - (the number given by row N). This is why you have to UNION the extra number onto the end. Finally, concatenate all of these substrings together, separated by spaces, using the algorithm of your choice.

Sorry, I do not have an instance in front of me to try out code. Sounds like a fun one. I think doing this with nested SELECT statements would get convoluted and unmaintainable; better to lay it out as CTEs, IMHO.
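
The steps described above can be sketched with CTEs roughly as follows. This is only one possible reading of the description, not the answerer's code: the names @input, Numbers, Starts and Words are made up for illustration, the numbers table is generated inline and capped at 1000 rows, the sentinel is taken as length + 1 so that the subtraction yields the length of the last word, position 1 is always treated as a word start, and STRING_AGG (SQL Server 2017 or later) is assumed for the final concatenation.

```sql
-- Sketch only: @input and all CTE names are illustrative.
DECLARE @input varchar(100) = 'OneTwoThreeFour';

WITH Numbers(n) AS (
    -- Inline numbers table: 1 .. LEN(@input), at most 1000 rows here
    SELECT TOP (LEN(@input))
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(x)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(x)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(x)
),
Starts(pos) AS (
    -- Position 1 always starts a word, plus every position holding 'A'..'Z'
    SELECT n
    FROM Numbers
    WHERE n = 1
       OR ASCII(SUBSTRING(@input, n, 1)) BETWEEN 65 AND 90
    UNION
    -- Sentinel one past the end, so the last word has an end position
    SELECT LEN(@input) + 1
),
Words(pos, word) AS (
    -- Each word runs from its start to just before the next start;
    -- the sentinel row has no next start, so its word is NULL
    SELECT pos,
           SUBSTRING(@input, pos, LEAD(pos) OVER (ORDER BY pos) - pos)
    FROM Starts
)
SELECT STRING_AGG(word, ' ') WITHIN GROUP (ORDER BY pos) AS Col2
FROM Words
WHERE word IS NOT NULL;
```

For the sample value the word starts are positions 1, 4, 7 and 12 with the sentinel at 16, so the query is intended to return One Two Three Four. On versions without STRING_AGG, the FOR XML PATH('') concatenation used in another answer here would be one alternative.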

+1

I use an iTVF (inline table-valued function). In terms of performance, an inline function behaves like a view.

    CREATE FUNCTION [dbo].[udf_Split_Capitals_In_Str] (@str VARCHAR(8000))
    RETURNS TABLE
    AS
    RETURN
        WITH Tally (n) AS (
            -- 8 x 10 x 10 x 10 = 8000 rows, enough for a VARCHAR(8000) input
            SELECT TOP (LEN (@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
            FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
            CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
            CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
            CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
        )
        SELECT New_Str = STUFF((
            SELECT CASE
                       WHEN SUBSTRING(@str, n, 1) = UPPER(SUBSTRING(@str, n, 1)) Collate Latin1_General_CS_AI
                            AND n > 1
                       THEN ' ' + SUBSTRING(@str, n, 1)
                       ELSE SUBSTRING(@str, n, 1)
                   END
            FROM Tally
            ORDER BY n  -- keep the characters in their original order
            FOR XML PATH ('')), 1, 0, '')
    GO

    /*How To use:*/
    SELECT * FROM dbo.udf_Split_Capitals_In_Str ('HelloWorld')

    /*Cross Apply Example*/
    SELECT T.*, Fixed_Name.New_Str FixedName
    FROM (
        SELECT Id = 1, Name = 'DonaldTrump'
        UNION ALL
        SELECT Id = 2, Name = 'HilaryClinton'
    ) T
    CROSS APPLY dbo.udf_Split_Capitals_In_Str (T.Name) Fixed_Name
0

I know that there are already quite good answers, but if you want to avoid creating a function, you can also use a recursive CTE for this. It is certainly not a clean way to do it, but it works.

    DECLARE @camelcase nvarchar(4000) = 'ThisIsCamelCased'
    ;
    WITH split AS
    (
        SELECT [iteration] = 0
              ,[string] = @camelcase
        UNION ALL
        -- Each recursion inserts one space before the next capital letter
        SELECT [iteration] = split.[iteration] + 1
              ,[string] = STUFF(split.[string], pattern.[index] + 1, 0, ' ')
        FROM split
        CROSS APPLY
        (
            SELECT [index] = PATINDEX(N'%[^ ][A-Z]%' COLLATE Latin1_General_Bin, split.[string])
        ) pattern
        WHERE pattern.[index] > 0
    )
    -- The last iteration holds the fully spaced string
    SELECT TOP (1) [spaced] = split.[string]
    FROM split
    ORDER BY split.[iteration] DESC
    ;

As I said, this is not a pretty way to write a query, but I use things like this when I am writing ad hoc queries and do not want to add new artifacts to the database. You could also use it to create your function as an inline table-valued function, which is always nice.
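
A minimal sketch of that last suggestion, wrapping the same recursive CTE in an inline table-valued function. The function name dbo.Split_On_Upper_Case_Inline is hypothetical, and the pattern is assumed to be '%[^ ][A-Z]%'.

```sql
-- Sketch only: the function name is made up for illustration.
CREATE FUNCTION dbo.Split_On_Upper_Case_Inline (@camelcase nvarchar(4000))
RETURNS TABLE
AS
RETURN
    WITH split AS
    (
        -- Anchor: the unmodified input
        SELECT [iteration] = 0
              ,[string] = @camelcase
        UNION ALL
        -- Each recursion inserts one space before the next capital letter
        SELECT [iteration] = split.[iteration] + 1
              ,[string] = STUFF(split.[string], pattern.[index] + 1, 0, ' ')
        FROM split
        CROSS APPLY
        (
            SELECT [index] = PATINDEX(N'%[^ ][A-Z]%' COLLATE Latin1_General_Bin, split.[string])
        ) pattern
        WHERE pattern.[index] > 0
    )
    -- The last iteration holds the fully spaced string
    SELECT TOP (1) [spaced] = split.[string]
    FROM split
    ORDER BY split.[iteration] DESC;
```

It could then be called with something like SELECT s.spaced FROM dbo.Split_On_Upper_Case_Inline(N'ThisIsCamelCased') AS s, or through CROSS APPLY against a table column. One caveat of this design: a recursive CTE stops with an error after 100 recursions by default, and an OPTION (MAXRECURSION n) hint cannot be placed inside the function, so inputs needing more than 100 inserted spaces would require the hint on the calling query.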

0

Please try this:

    declare @t nvarchar (100) = 'IamTheTestString'
    declare @len int
    declare @Counter int = 1   -- substring positions are 1-based
    declare @Final nvarchar (100) = ''

    set @len = len(@t)

    while (@Counter <= @len)
    begin
        -- ASCII 65..90 is 'A'..'Z': put a space in front of each capital
        set @Final = @Final +
            Case when ascii(substring(@t, @Counter, 1)) >= 65
                  and ascii(substring(@t, @Counter, 1)) <= 90
                 then ' ' + substring(@t, @Counter, 1)
                 else substring(@t, @Counter, 1)
            end
        set @Counter = @Counter + 1
    end

    -- ltrim removes the space added before the first capital
    print ltrim(@Final)
0