How to split column values and add to a temp table


I am using SQL Server 2008 R2.
I have a table tblInstitution as follows:
InstitutionCode  InstitutionDesc
---------------------------------
ABC              Abra Cada Brad
DEF              Def Fede Eeee
GHJ              Gee Hee
I want to split the values in InstitutionDesc into tokens and store them against the institution code:
InstitutionCode  Token  Score
-------------------------------
ABC              Abra   0
ABC              Cada   0
ABC              Brad   0
DEF              Def    0
DEF              Fede   0
DEF              Eeee   0
GHJ              Gee    0
GHJ              Hee    0
Is there a way I can do this in a set-based operation?
I have seen examples where a single column value is split into multiple columns of the same row, but I cannot find an example where the value is split into separate rows. I am not sure what exactly to search for. Is it something to do with CTEs?

Here is a recursive CTE option...
If Object_ID('tempdb..#tblInstitution') Is Not Null Drop Table #tblInstitution;
Create Table #tblInstitution (InstitutionCode Varchar(10), InstitutionDesc Varchar(50));
Insert #tblInstitution (InstitutionCode, InstitutionDesc)
Values ('ABC','Abra Cada Brad'),
('DEF','Def Fede Eeee'),
('GHJ','Gee Hee'),
('KLM','Kappa');
With base As
(
Select InstitutionCode,
LTRIM(RTRIM(InstitutionDesc)) As InstitutionDesc
From #tblInstitution
), recur As
(
Select InstitutionCode,
Left(InstitutionDesc, CharIndex(' ', InstitutionDesc + ' ') - 1) As Token,
Case
When CharIndex(' ', InstitutionDesc) > 0
Then Right(InstitutionDesc, Len(InstitutionDesc) - CharIndex(' ', InstitutionDesc))
Else Null
End As Remaining
From base
Union All
Select InstitutionCode,
Left(Remaining, CharIndex(' ', Remaining + ' ') - 1) As Token,
Case
When CharIndex(' ', Remaining) > 0
Then Right(Remaining, Len(Remaining) - CharIndex(' ', Remaining))
Else Null
End As Remaining
From recur
Where Remaining Is Not Null
)
Select InstitutionCode,
Token,
0 As Score
From recur
Order By InstitutionCode
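
The question also asks for the tokens to end up in a temp table, which the query above only selects. A minimal sketch of one way to do that is to add an INTO clause to the final SELECT. The target name #tblToken is a hypothetical choice, and the sample data is repeated so the script runs on its own:

```sql
-- Sample data from the question
If Object_ID('tempdb..#tblInstitution') Is Not Null Drop Table #tblInstitution;
Create Table #tblInstitution (InstitutionCode Varchar(10), InstitutionDesc Varchar(50));
Insert #tblInstitution (InstitutionCode, InstitutionDesc)
Values ('ABC','Abra Cada Brad'),
       ('DEF','Def Fede Eeee'),
       ('GHJ','Gee Hee');

-- #tblToken is a hypothetical name for the target temp table
If Object_ID('tempdb..#tblToken') Is Not Null Drop Table #tblToken;

With base As
(
    Select InstitutionCode,
           LTRIM(RTRIM(InstitutionDesc)) As InstitutionDesc
    From #tblInstitution
), recur As
(
    -- Anchor: first token plus whatever is left after the first space
    Select InstitutionCode,
           Left(InstitutionDesc, CharIndex(' ', InstitutionDesc + ' ') - 1) As Token,
           Case
               When CharIndex(' ', InstitutionDesc) > 0
               Then Right(InstitutionDesc, Len(InstitutionDesc) - CharIndex(' ', InstitutionDesc))
               Else Null
           End As Remaining
    From base
    Union All
    -- Recursive step: peel the next token off Remaining
    Select InstitutionCode,
           Left(Remaining, CharIndex(' ', Remaining + ' ') - 1) As Token,
           Case
               When CharIndex(' ', Remaining) > 0
               Then Right(Remaining, Len(Remaining) - CharIndex(' ', Remaining))
               Else Null
           End As Remaining
    From recur
    Where Remaining Is Not Null
)
Select InstitutionCode,
       Token,
       0 As Score
Into #tblToken          -- creates the temp table from the result set
From recur;

Select InstitutionCode, Token, Score
From #tblToken
Order By InstitutionCode;
```

If the target table already exists with a fixed definition, an INSERT ... SELECT from the same CTE should work in place of SELECT ... INTO.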

Related

PATINDEX with SOUNDEX

I want to search for a given string using PATINDEX and SOUNDEX.
I have the following table with some sample data:
create table tbl_pat_soundex
(
col_str varchar(max)
);
insert into tbl_pat_soundex values('Smith A Steve');
insert into tbl_pat_soundex values('Steve A Smyth');
insert into tbl_pat_soundex values('A Smeeth Stive');
insert into tbl_pat_soundex values('Steve Smith A');
insert into tbl_pat_soundex values('Smit Steve A');
String to search: 'Smith A Steve'
SELECT col_str,PATINDEX('%Smith%',col_str) [Smith],PATINDEX('%A%',col_str) [A],PATINDEX('%Steve%',col_str) [Steve]
FROM tbl_pat_soundex
Getting Output:
col_str          Smith  A   Steve
---------------------------------
Smith A Steve    1      7   9
Steve A Smyth    0      7   1
A Smeeth Stive   0      1   0
Steve Smith A    7      13  1
Smit Steve A     0      12  6
Expected Output:
col_str          Smith  A   Steve
---------------------------------
Smith A Steve    1      7   9
Steve A Smyth    9      7   1
A Smeeth Stive   3      1   10
Steve Smith A    7      13  1
Smit Steve A     1      12  6
Tried:
SELECT col_str,
PATINDEX('%'+soundex('Smith')+'%',soundex(col_str)) [Smith],
PATINDEX('%'+soundex('A')+'%',soundex(col_str)) [A],
PATINDEX('%'+soundex('Steve')+'%',soundex(col_str)) [Steve]
FROM tbl_pat_soundex
But getting unexpected result:
col_str          Smith  A   Steve
---------------------------------
Smith A Steve    1      0   0
Steve A Smyth    0      0   1
A Smeeth Stive   0      1   0
Steve Smith A    0      0   1
Smit Steve A     1      0   0
Note: I have 100 million records in the table to search.

Here's one option. I'm not sure how it would perform with 100 million records considering everything you need to do, so you'll have to test that.
At a high level, as I understand it, you need to:
Search all words in a string based on the words of another string.
Return the character starting position in the original string where a word equals or sounds like the search word.
You can use DIFFERENCE() for the comparison:
DIFFERENCE compares two different SOUNDEX values, and returns an
integer value. This value measures the degree that the SOUNDEX values
match, on a scale of 0 to 4. A value of 0 indicates weak or no
similarity between the SOUNDEX values; 4 indicates strongly similar,
or even identically matching, SOUNDEX values.
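
As a quick illustration of why comparing word by word helps, here is a small sketch using literals from the question. The expected values in the comments follow from the standard SOUNDEX coding. The last column is left unannotated: the output shown in the question suggests that SOUNDEX over a whole multi-word string only reflects its first word, which would explain why the attempted PATINDEX/SOUNDEX query only ever matched at position 1.

```sql
SELECT SOUNDEX('Smith')              AS sx_smith,          -- S530
       SOUNDEX('Smyth')              AS sx_smyth,          -- S530
       DIFFERENCE('Smith', 'Smyth')  AS diff_smith_smyth,  -- 4 (identical codes)
       SOUNDEX('Smith A Steve')      AS sx_whole_string;   -- compare this with sx_smith
```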
You'll need to split the string on the space ' ', and since you're on 2008 you'll have to roll your own function.
I used the XML function from https://sqlperformance.com/2012/07/t-sql-queries/split-strings for my examples. You'll need to adjust if you have your own or want to use something different:
CREATE FUNCTION dbo.SplitStrings_XML
(
    @List NVARCHAR(MAX),
    @Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
    SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
    FROM
    (
        SELECT x = CONVERT(XML, '<i>'
            + REPLACE(@List, @Delimiter, '</i><i>')
            + '</i>').query('.')
    ) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
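
To see what the function does internally, here is a small sketch that inlines the same logic for one literal value, so it runs without creating the function. The variable names mirror the function's parameters, and the column alias xml_text is only illustrative.

```sql
DECLARE @List NVARCHAR(MAX) = N'Smith A Steve';
DECLARE @Delimiter NVARCHAR(255) = N' ';

-- Step 1: the delimiter is replaced with closing/opening tags,
-- giving the text <i>Smith</i><i>A</i><i>Steve</i>
SELECT N'<i>' + REPLACE(@List, @Delimiter, N'</i><i>') + N'</i>' AS xml_text;

-- Step 2: the text is converted to XML and each <i> node becomes one row
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
    SELECT x = CONVERT(XML, N'<i>'
        + REPLACE(@List, @Delimiter, N'</i><i>')
        + N'</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i);
```

One thing to watch for with this approach: values containing XML special characters such as & or < will make the conversion to XML fail unless they are escaped first.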
I switched to table variables to show the example. With the amount of data you have, I would suggest not doing that and instead creating and using physical tables.
Option 1 - Not dynamic:
DECLARE @tbl_pat_soundex TABLE
(
    [col_str] VARCHAR(MAX)
);
INSERT INTO @tbl_pat_soundex
VALUES ( 'Smith A Steve' )
     , ( 'Steve A Smyth' )
     , ( 'A Smeeth Stive' )
     , ( 'Steve Smith A' )
     , ( 'Smit Steve A' );
SELECT DISTINCT [aa].[col_str]
     , MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
     , MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
     , MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
FROM (
       SELECT [a].[col_str]
            , CASE WHEN DIFFERENCE([b].[item], 'Smith') = 4 THEN
                       CHARINDEX([b].[item], [a].[col_str])
                   ELSE 0
              END AS [Smith]
            , CASE WHEN DIFFERENCE([b].[item], 'A') = 4 THEN
                       CHARINDEX([b].[item], [a].[col_str])
                   ELSE 0
              END AS [A]
            , CASE WHEN DIFFERENCE([b].[item], 'Steve') = 4 THEN
                       CHARINDEX([b].[item], [a].[col_str])
                   ELSE 0
              END AS [Steve]
       FROM @tbl_pat_soundex [a]
       CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
     ) AS [aa];
Using the function, we split the string into individual words.
Then we use a CASE expression to check the DIFFERENCE value.
If the DIFFERENCE value equals 4, we return the CHARINDEX of that word within the original string.
If it doesn't, we return 0.
From there it's a matter of taking the max value of each column per original string:
, MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
, MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
, MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
That gives you your final results.
Option 2 - Dynamic with a pivot:
We'll declare the string we want to search for, split it into words, search for those individual words in the original string, and then pivot the results.
--This example is using global temp tables as it's showing how
--to build a dynamic pivot
IF OBJECT_ID('tempdb..##tbl_pat_soundex') IS NOT NULL
DROP TABLE [##tbl_pat_soundex];
IF OBJECT_ID('tempdb..##tbl_col_str_SearchString') IS NOT NULL
DROP TABLE [##tbl_col_str_SearchString];
CREATE TABLE [##tbl_pat_soundex]
(
[col_str] VARCHAR(MAX)
);
INSERT INTO [##tbl_pat_soundex]
VALUES ( 'Smith A Steve' )
, ( 'Steve A Smyth' )
, ( 'A Smeeth Stive' )
, ( 'Steve Smith A' )
, ( 'Smit Steve A' );
--What are you searching for?
DECLARE @SearchString NVARCHAR(200);
SET @SearchString = N'Smith A Steve';
--We build a table and load it with every combination of the words from the string and the words from the SearchString for easier comparison.
CREATE TABLE [##tbl_col_str_SearchString]
(
    [col_str] NVARCHAR(MAX)
  , [col_str_value] NVARCHAR(MAX)
  , [SearchValue] NVARCHAR(200)
);
--Load that table for comparison
--split our original string into individual words
--also split our search string into individual words and give me all combinations.
INSERT INTO [##tbl_col_str_SearchString] (
    [col_str]
  , [col_str_value]
  , [SearchValue]
)
SELECT DISTINCT [a].[col_str]
     , [b].[item]
     , [c].[item]
FROM [##tbl_pat_soundex] [a]
CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
CROSS APPLY [dbo].[SplitStrings_XML](@SearchString, ' ') [c]
ORDER BY [a].[col_str];
--Then we can easily compare each word and search word for those that match or sound alike using DIFFERENCE()
SELECT [col_str]
     , [col_str_value]
     , [SearchValue]
     , CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match]
FROM ##tbl_col_str_SearchString;
--Then we can pivot on it
--and we will need to make it dynamic since we are not sure what @SearchString could be.
DECLARE @PivotSQL NVARCHAR(MAX);
DECLARE @pivotColumn NVARCHAR(MAX);
SET @pivotColumn = N'[' + REPLACE(@SearchString, ' ', '],[') + N']';
SET @PivotSQL = N'SELECT * FROM (
SELECT [col_str], [SearchValue], CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match] FROM ##tbl_col_str_SearchString
) aa
PIVOT (MAX([Match]) FOR [SearchValue] IN (' + @pivotColumn
+ N')) AS MaxMatch
ORDER BY [MaxMatch].[col_str]
';
--Giving us the final results.
EXEC sp_executesql @PivotSQL;

splitting a string column correctly by spaces

In my query I have several hundred records with strings from the iSeries Message Queue like this:
006 1 AccountSetBalance 0000000000 EQ 2016-03-01-18.45.42.002000 0038882665 _ 123456 12345612345678 17017362 0 0
I need to show in my results the account number part 12345678 and the balance part which is 17017362
I have tried:
SELECT MQ_Message
, SUBSTRING(MQ_Message,92,30) -- = 12345678 17017362 0
, SUBSTRING(MQ_Message,92,8) -- = 12345678
, SUBSTRING(MQ_Message,100, CHarIndex(' ', SUBSTRING('006 1 AccountSetBalance 0000000000 EQ 2016-03-01-18.45.42.002000 0038882665 _ 123456 12345612345678 17017362 0 0',92,20)) )
, CHarIndex(' ', SUBSTRING('006 1 AccountSetBalance 0000000000 EQ 2016-03-01-18.45.42.002000 0038882665 _ 123456 12345612345678 17017362 0 0',99,20))
, CHARINDEX(' ','17017362 0 0')
from outboundMessages WHERE message_Type = '006'
I can get the account easily enough, as the string is fixed length up to the balance. But then I need to split the string returned by SUBSTRING(MQ_Message,92,30) and get the balance part out of it, which is 17017362 here and will vary between 0 and maybe 999999 (in pence!).
I am really stuck trying to get the balance, having tried every combination of CHARINDEX I could think of.
What is the best way to do this?

DECLARE @string NVARCHAR(MAX) = '006 1 AccountSetBalance 0000000000 EQ 2016-03-01-18.45.42.002000 0038882665 _ 123456 12345612345678 17017362 0 0',
        @xml xml
select @xml = cast('<d><q>'+REPLACE(@string,' ','</q><q>')+'</q></d>' as xml)
SELECT n.v.value('q[9]','integer'),
       n.v.value('q[11]','integer')
FROM @xml.nodes('/d') AS n(v);
Result:
----------- -----------
123456 17017362
(1 row(s) affected)
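
Note that the first column above is the ninth token (123456), whereas the question asks for the account number 12345678. In the sample string as shown, that value is the last eight characters of the tenth token. A sketch of a variant under that assumption follows. The column aliases are hypothetical, and the fixed width of 8 is taken from the question's own SUBSTRING(MQ_Message,92,8).

```sql
DECLARE @string NVARCHAR(MAX) = N'006 1 AccountSetBalance 0000000000 EQ 2016-03-01-18.45.42.002000 0038882665 _ 123456 12345612345678 17017362 0 0';
DECLARE @xml XML;

-- Wrap every space-separated token in its own <q> element
SET @xml = CAST('<d><q>' + REPLACE(@string, ' ', '</q><q>') + '</q></d>' AS XML);

SELECT RIGHT(n.v.value('(q[10])[1]', 'varchar(20)'), 8) AS AccountNumber, -- 12345678
       n.v.value('(q[11])[1]', 'int')                   AS BalancePence   -- 17017362
FROM @xml.nodes('/d') AS n(v);
```

If the real messages contain runs of several spaces between fields, each extra space produces an empty q element and shifts the positions, so the token numbers would need adjusting against real data.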

Sqlserver PIVOT to turn a "reconstruct" a flat table into columns - why does this not work?

The system we are using allows a data entry form to be built from multiple user-defined fields, to capture the information required for a particular group of "ORDERS". The entered values are then stored in a database like this:
GUID  OrderGUID  UserDataCode  Value
1     100        OrderName     Breakfast
2     100        OrderDesc     Food you eat before Lunch
3     100        CerealYN      Y
4     100        ToastYN       Y
5     100        ToastDesc     White Bread
6     100        PaperYN       Y
7     100        PaperDesc     The Newsroom
8     101        OrderName     Lunch
9     101        OrderDesc     Food you eat before Dinner
10    101        CerealYN      N
11    101        ToastYN       Y
12    101        ToastDesc     Brown Bread
13    101        PaperYN       Y
14    101        PaperDesc     The MiddayNews
(etc)
(in fact this is an Enterprise Hospital software but I have used simpler examples here)
I would like to use SQL to return this table pivoted like below:
OrderGUID  OrderName  OrderDesc   CerealYN  ToastYN  ToastDesc    ....
100        Breakfast  Food you..  Y         Y        White Bread  ....
101        Lunch      Food you..  N         Y        Brown Bread  ....
I wrote the following SQL based on examples found on the net:
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column
SELECT @ColumnName= ISNULL(@ColumnName + ',','')
       + QUOTENAME([UserDataCode])
FROM (
    SELECT
        [UserDataCode]
    FROM
        [XXX].[dbo].[CV3OrderUserData]
    WHERE OrderGUID = 3000680
) AS Codes;
--Prepare the PIVOT query using the dynamic
SET @DynamicPivotQuery = N'SELECT OrderGUID, ' + @ColumnName + '
FROM
[XXX].[dbo].[CV3OrderUserData]
PIVOT(Max(Value)
FOR UserDataCode IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
--SELECT @DynamicPivotQuery
EXEC sp_executesql @DynamicPivotQuery
However, while it does the pivot as requested and puts the values in the correct new "dynamic" columns, it returns a row for each OrderGUID + Value, i.e.:
OrderGUID  OrderName  OrderDesc   CerealYN  ToastYN
100        Breakfast  null        null      null     ...
100        null       Food you..  null      null     ...
101        null       null        Y         null     ...
etc.
What am I doing wrong?

The problem is that the pivot source includes the GUID column. PIVOT groups by every source column that is not used in the aggregate or the FOR clause, so GUID becomes part of the grouping and each source row ends up in its own output row.
To get the expected output, remove the GUID column from the pivot source query.
Here is a static version; you can convert it to a dynamic version as you already did.
select * from
(
SELECT OrderGUID,UserDataCode,Value
FROM
tst) A
PIVOT(Max(Value)
FOR UserDataCode IN ([OrderName],[OrderDesc],
[CerealYN],[ToastYN],
[ToastDesc],[PaperYN],
[PaperDesc])) AS PVTTable
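
For reference, a dynamic version of the same fix might look like the sketch below. It assumes a temp table named #OrderUserData (a hypothetical stand-in for the real table) loaded with a few of the sample rows from the question. The only essential change from the asker's query is the derived table that leaves GUID out of the pivot source.

```sql
IF OBJECT_ID('tempdb..#OrderUserData') IS NOT NULL DROP TABLE #OrderUserData;
CREATE TABLE #OrderUserData
(
    [GUID] INT, OrderGUID INT, UserDataCode VARCHAR(50), [Value] VARCHAR(100)
);
INSERT #OrderUserData ([GUID], OrderGUID, UserDataCode, [Value])
VALUES (1, 100, 'OrderName', 'Breakfast'),
       (2, 100, 'OrderDesc', 'Food you eat before Lunch'),
       (3, 100, 'CerealYN',  'Y'),
       (8, 101, 'OrderName', 'Lunch'),
       (9, 101, 'OrderDesc', 'Food you eat before Dinner'),
       (10, 101, 'CerealYN', 'N');

DECLARE @ColumnName NVARCHAR(MAX);
DECLARE @DynamicPivotQuery NVARCHAR(MAX);

-- Build the bracketed column list from the distinct codes
SELECT @ColumnName = ISNULL(@ColumnName + ',', '') + QUOTENAME(UserDataCode)
FROM (SELECT DISTINCT UserDataCode FROM #OrderUserData) AS Codes;

-- The derived table Src exposes only the grouping column, the pivot
-- column and the value column; GUID is deliberately left out
SET @DynamicPivotQuery = N'SELECT OrderGUID, ' + @ColumnName + N'
FROM (SELECT OrderGUID, UserDataCode, [Value] FROM #OrderUserData) AS Src
PIVOT (MAX([Value]) FOR UserDataCode IN (' + @ColumnName + N')) AS PVTTable
ORDER BY OrderGUID;';

EXEC sp_executesql @DynamicPivotQuery;
```

With the sample rows above this should return one row per OrderGUID, with one column per distinct UserDataCode.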

count the number of spaces in values in sql server [duplicate]

I need the number of spaces in column values in sql server.
Ex:
column1
------------
aaa bbbb - 1 space
aaa bbb ccc - 2 space
aaa bbb ccc ddd - 3 space
I need the count of spaces like this.
thanks.

SELECT LEN(column1)-LEN(REPLACE(column1, ' ', '')) FROM YourTableName

This will give a different and more accurate result than the other answers because it also counts spaces at the end of the string. This becomes clear when tested on these examples:
DECLARE @a table(column1 varchar(20))
INSERT @a values('b c ')
INSERT @a values('b c')
INSERT @a values('   b c      ')
SELECT
LEN(column1 + ';')-LEN(REPLACE(column1,' ','')) - 1 accurate,
LEN(column1)-LEN(REPLACE(column1,' ', '')) [inaccurate] -- other answers
FROM @a
Result:
accurate  inaccurate
2         1
1         1
10        4
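
The difference comes from LEN ignoring trailing spaces. Another way to sidestep that, sketched here as an alternative rather than as part of the answer above, is DATALENGTH, which returns the number of bytes including trailing spaces. This assumes a varchar column with a single-byte collation; for nvarchar the result would need to be divided by 2.

```sql
DECLARE @a TABLE (column1 VARCHAR(20));
INSERT @a VALUES ('b c ');   -- one inner space, one trailing space
INSERT @a VALUES ('b c');    -- one inner space

SELECT column1,
       DATALENGTH(column1) - DATALENGTH(REPLACE(column1, ' ', '')) AS space_count
FROM @a;
-- 'b c ' gives 2, 'b c' gives 1
```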

Try this one -
DECLARE @t TABLE (txt VARCHAR(50))
INSERT INTO @t (txt)
VALUES
  ('aaa bbbb')
, ('aaa bbb ccc')
, ('aaa bbb ccc ddd')
SELECT txt, LEN(txt) - LEN(REPLACE(txt, ' ', ''))
FROM @t

Here is code for that:
select len('aaa bbb') - len(replace('aaa bbb', ' ', '')) from
tablename
Output:
1
select len('aaa bbb ccc') - len(replace('aaa bbb ccc', ' ', '')) from
tablename
Output:
2
tablename can be any table in your database. Because the strings here are literals, the FROM clause can also be left out altogether.

Splitting (duplicating rows of) an SQL table based on the values in one column

I have a table which is the result of an export and has columns and values like the following:
Name   Aliases          Ranks
Ben    BenA BenB BenC   1 5 3
Jerry  JerryA JerryB    7 3
Aliases and Ranks are separated by a character (in this case CHAR(10)) and they have the same number of entries. But each Name could have a different number of Aliases (and therefore Ranks).
I would like to write a SQL query to give me the following table:
Name   Alias   Rank
Ben    BenA    1
Ben    BenB    5
Ben    BenC    3
Jerry  JerryA  7
Jerry  JerryB  3
How can I do this?

with cte as (
    select Name,
           cast(null as int) as AliasStartPosition,
           cast(0 as int) as AliasEndPosition,
           Aliases + ' ' as Aliases,
           cast(null as int) as RankStartPosition,
           cast(0 as int) as RankEndPosition,
           Ranks + ' ' as Ranks
    from (
        values ('Ben', 'BenA BenB BenC', '1 5 3'),
               ('Jerry', 'JerryA JerryB', '7 3')
    ) t (Name, Aliases, Ranks)
    union all
    select Name,
           AliasEndPosition + 1,
           charindex(' ', Aliases, AliasEndPosition + 1),
           Aliases,
           RankEndPosition + 1,
           charindex(' ', Ranks, RankEndPosition + 1),
           Ranks
    from cte
    where charindex(' ', Aliases, AliasEndPosition + 1) != 0
)
select Name,
       substring(Aliases, AliasStartPosition, AliasEndPosition - AliasStartPosition) as Alias,
       substring(Ranks, RankStartPosition, RankEndPosition - RankStartPosition) as Rank
from cte
where AliasStartPosition is not null
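
The query above splits on a space, while the question says the real separator is CHAR(10). A sketch of the same idea with the delimiter held in a variable follows. The variable @d, the shortened column names and the VARCHAR(8000) casts are assumptions made for this example, and the sample rows are rebuilt with CHAR(10) between the entries.

```sql
DECLARE @d CHAR(1) = CHAR(10);  -- line feed, the separator named in the question

WITH src AS (
    SELECT Name, Aliases, Ranks
    FROM (
        VALUES ('Ben',   'BenA' + CHAR(10) + 'BenB' + CHAR(10) + 'BenC', '1' + CHAR(10) + '5' + CHAR(10) + '3'),
               ('Jerry', 'JerryA' + CHAR(10) + 'JerryB',                 '7' + CHAR(10) + '3')
    ) t (Name, Aliases, Ranks)
), cte AS (
    -- Anchor: append a trailing delimiter so the last entry is terminated too
    SELECT Name,
           CAST(NULL AS INT) AS AliasStart, CAST(0 AS INT) AS AliasEnd,
           CAST(Aliases + @d AS VARCHAR(8000)) AS Aliases,
           CAST(NULL AS INT) AS RankStart, CAST(0 AS INT) AS RankEnd,
           CAST(Ranks + @d AS VARCHAR(8000)) AS Ranks
    FROM src
    UNION ALL
    -- Recursive step: move both cursors to the next delimiter
    SELECT Name,
           AliasEnd + 1, CHARINDEX(@d, Aliases, AliasEnd + 1), Aliases,
           RankEnd + 1,  CHARINDEX(@d, Ranks, RankEnd + 1),   Ranks
    FROM cte
    WHERE CHARINDEX(@d, Aliases, AliasEnd + 1) <> 0
)
SELECT Name,
       SUBSTRING(Aliases, AliasStart, AliasEnd - AliasStart) AS Alias,
       CAST(SUBSTRING(Ranks, RankStart, RankEnd - RankStart) AS INT) AS [Rank]
FROM cte
WHERE AliasStart IS NOT NULL
ORDER BY Name, AliasStart;
```

This relies on Aliases and Ranks having the same number of entries per row, as the question states. Rows with more than 100 entries would also need OPTION (MAXRECURSION n) on the final SELECT.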
