Want to search the string using PATINDEX and SOUNDEX.
I have the following table with some sample data to search the given string using PATINDEX and SOUNDEX.
create table tbl_pat_soundex
(
col_str varchar(max)
);
insert into tbl_pat_soundex values('Smith A Steve');
insert into tbl_pat_soundex values('Steve A Smyth');
insert into tbl_pat_soundex values('A Smeeth Stive');
insert into tbl_pat_soundex values('Steve Smith A');
insert into tbl_pat_soundex values('Smit Steve A');
String to search:- 'Smith A Steve'
SELECT col_str,PATINDEX('%Smith%',col_str) [Smith],PATINDEX('%A%',col_str) [A],PATINDEX('%Steve%',col_str) [Steve]
FROM tbl_pat_soundex
Getting Output:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 7 9
Steve A Smyth 0 7 1
A Smeeth Stive 0 1 0
Steve Smith A 7 13 1
Smit Steve A 0 12 6
Expected Output:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 7 9
Steve A Smyth 9 7 1
A Smeeth Stive 3 1 10
Steve Smith A 7 13 1
Smit Steve A 1 12 6
Tried:
SELECT col_str,
PATINDEX('%'+soundex('Smith')+'%',soundex(col_str)) [Smith],
PATINDEX('%'+soundex('A')+'%',soundex(col_str)) [A],
PATINDEX('%'+soundex('Steve')+'%',soundex(col_str)) [Steve]
FROM tbl_pat_soundex
But getting unexpected result:
col_str Smith A Steve
---------------------------------
Smith A Steve 1 0 0
Steve A Smyth 0 0 1
A Smeeth Stive 0 1 0
Steve Smith A 0 0 1
Smit Steve A 1 0 0
Note: I have 100 Millions of records in the table to search for.
Here's one option, not sure how it would perform with 100 million records considering all that you need to do. You'll have to test that out.
At a high level how I understand this is you basically need
Search all words in a string based on the words of another string
Returning the character starting position in the original string where that word equals or sounds like the search word.
You can use DIFFERENCE() for the comparison:
DIFFERENCE compares two different SOUNDEX values, and returns an
integer value. This value measures the degree that the SOUNDEX values
match, on a scale of 0 to 4. A value of 0 indicates weak or no
similarity between the SOUNDEX values; 4 indicates strongly similar,
or even identically matching, SOUNDEX values.
You'll need to split the string based on the space ' ' and since you're 2008 you'd have to roll your own function.
I used the XML function from here, https://sqlperformance.com/2012/07/t-sql-queries/split-strings, for my examples, you'll obviously need to adjust if you have your own or want to use something different:
CREATE FUNCTION dbo.SplitStrings_XML
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I switched and use table variables to show the example, I would suggest not doing that with the amount of data you have and create and use physical tables.
Option 1 - Not dynamic:
DECLARE #tbl_pat_soundex TABLE
(
[col_str] VARCHAR(MAX)
);
INSERT INTO #tbl_pat_soundex
VALUES ( 'Smith A Steve' )
,( 'Steve A Smyth' )
,( 'A Smeeth Stive' )
,( 'Steve Smith A' )
,( 'Smit Steve A' )
SELECT DISTINCT [aa].[col_str]
, MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
, MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
, MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
FROM (
SELECT [a].[col_str]
, CASE WHEN DIFFERENCE([b].[item], 'Smith') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [Smith]
, CASE WHEN DIFFERENCE([b].[item], 'A') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [A]
, CASE WHEN DIFFERENCE([b].[item], 'Steve') = 4 THEN
CHARINDEX([b].[item], [a].[col_str])
ELSE 0
END AS [Steve]
FROM #tbl_pat_soundex [a]
CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
) AS [aa];
Using the function we split the string into individual words
Then we use a case statement to check the DIFFERENCE value
If that DIFFERENCE value equals 4 we then return the CHARINDEX value of the original word against string.
If doesn't equal we return 0
Then from there it's a matter of getting the max value of each based on the original string:
, MAX([aa].[Smith]) OVER ( PARTITION BY [aa].[col_str] ) AS [Smith]
, MAX([aa].[A]) OVER ( PARTITION BY [aa].[col_str] ) AS [A]
, MAX([aa].[Steve]) OVER ( PARTITION BY [aa].[col_str] ) AS [Steve]
To get you your final results:
Option 2 - Dynamic with a pivot:
We'll declare the string we want to search, split that out and search for those individuals words in the original string and then pivot the results.
--This example is using global temp tables as it's showing how
--to build a dynamic pivot
IF OBJECT_ID('tempdb..##tbl_pat_soundex') IS NOT NULL
DROP TABLE [##tbl_pat_soundex];
IF OBJECT_ID('tempdb..##tbl_col_str_SearchString') IS NOT NULL
DROP TABLE [##tbl_col_str_SearchString];
CREATE TABLE [##tbl_pat_soundex]
(
[col_str] VARCHAR(MAX)
);
INSERT INTO [##tbl_pat_soundex]
VALUES ( 'Smith A Steve' )
, ( 'Steve A Smyth' )
, ( 'A Smeeth Stive' )
, ( 'Steve Smith A' )
, ( 'Smit Steve A' );
--What are you searching for?
DECLARE #SearchString NVARCHAR(200);
SET #SearchString = N'Smith A Steve';
--We build a table we load with every combination of the words from the string and the words from the SearchString for easier comparison.
CREATE TABLE [##tbl_col_str_SearchString]
(
[col_str] NVARCHAR(MAX)
, [col_str_value] NVARCHAR(MAX)
, [SearchValue] NVARCHAR(200)
);
--Load that table for comparison
--split our original string into individual words
--also split our search string into individual words and give me all combinations.
INSERT INTO [##tbl_col_str_SearchString] (
[col_str]
, [col_str_value]
, [SearchValue]
)
SELECT DISTINCT [a].[col_str]
, [b].[item]
, [c].[item]
FROM [##tbl_pat_soundex] [a]
CROSS APPLY [dbo].[SplitStrings_XML]([a].[col_str], ' ') [b]
CROSS APPLY [dbo].[SplitStrings_XML](#SearchString, ' ') [c]
ORDER BY [a].[col_str];
--Then we can easily compare each word and search word for those that match or sound alike using DIFFERNCE()
SELECT [col_str], [col_str_value], [SearchValue], CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match] FROM ##tbl_col_str_SearchString
--Then we can pivot on it
--and we will need to make it dynamic since we are not sure what what #SearchString could be.
DECLARE #PivotSQL NVARCHAR(MAX);
DECLARE #pivotColumn NVARCHAR(MAX);
SET #pivotColumn = N'[' + REPLACE(#SearchString, ' ', '],[') + N']';
SET #PivotSQL = N'SELECT * FROM (
SELECT [col_str], [SearchValue], CASE WHEN DIFFERENCE([col_str_value], [SearchValue]) = 4 THEN CHARINDEX([col_str_value], [col_str]) ELSE 0 END AS [Match] FROM ##tbl_col_str_SearchString
) aa
PIVOT (MAX([Match]) FOR [SearchValue] IN (' + #pivotColumn
+ N')) AS MaxMatch
ORDER BY [MaxMatch].[col_str]
';
--Giving us the final results.
EXEC sp_executesql #PivotSQL
Related
Want to search the string using PATINDEX and SOUNDEX within the WHERE clause or any optimal way.
I have the following table with some sample data to search the given string using PATINDEX and SOUNDEX.
create table tbl_pat_soundex
(
col_str varchar(max)
);
insert into tbl_pat_soundex values('Smith A Steve');
insert into tbl_pat_soundex values('Steve A Smyth');
insert into tbl_pat_soundex values('A Smeeth Stive');
insert into tbl_pat_soundex values('Steve Smith A');
insert into tbl_pat_soundex values('Smit Steve A');
Note: I have 100 Millions of records in the table to search for.
String to search:- 'Smith A Steve'
SELECT col_str
FROM tbl_pat_soundex
WHERE PATINDEX('%Smith%',col_str) >= 1 AND PATINDEX('%A%',col_str) >= 1 AND PATINDEX('%Steve%',col_str) >= 1
Getting Output:
col_str
--------------
Smith A Steve
Steve Smith A
Expected Output:
col_str
----------------
Smith A Steve
Steve A Smyth
A Smeeth Stive
Steve Smith A
Smit Steve A
Tried:
1:
SELECT col_str
FROM tbl_pat_soundex
WHERE PATINDEX('%Smith%',col_str) >= 1 AND
PATINDEX('%A%',col_str) >= 1 AND
PATINDEX('%Steve%',col_str) >= 1
2:
SELECT col_str
FROM tbl_pat_soundex
WHERE PATINDEX('%'+SOUNDEX('Smith')+'%',SOUNDEX(col_str)) >= 1 AND
PATINDEX('%'+SOUNDEX('A')+'%',SOUNDEX(col_str)) >= 1 AND
PATINDEX('%'+SOUNDEX('Steve')+'%',SOUNDEX(col_str)) >= 1
3:
SELECT col_str
FROM tbl_pat_soundex
WHERE DIFFERENCE('Smith',col_str) = 4 AND
DIFFERENCE('A',col_str) =4 AND
DIFFERENCE('Steve',col_str) = 4
4:
--Following was taking huge time(was kept running more than 20 minutes) to execute.
SELECT DISTINCT col_str
FROM tbl_pat_soundex [a]
CROSS APPLY SplitString([a].[col_str], ' ') [b]
WHERE DIFFERENCE([b].Item,'Smith') >= 1 AND
DIFFERENCE([b].Item,'A') >= 1 AND
DIFFERENCE([b].Item,'Steve') >= 1
With such a lot of rows the only hint I can give you is: Change the design. Each name part should live in a separate column...
The following will work, but I promise it will be slow...
--set up a test db
USE master;
GO
CREATE DATABASE shnugo;
GO
USE shnugo;
GO
--your table, I added an ID-column
create table tbl_pat_soundex
(
ID INT IDENTITY --needed to distinguish rows
,col_str varchar(max)
);
GO
--A function, which will return a blank-separated string as a alphabetically sorted list of distinct soundex values separated by /: "Smith A Steve" comes back as /A000/S310/S530/
CREATE FUNCTION dbo.ComputeSoundex(#str VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE #tmpXML XML=CAST('<x>' + REPLACE((SELECT #str AS [*] FOR XML PATH('')),' ','</x><x>') + '</x>' AS XML);
RETURN (SELECT DISTINCT '/' + SOUNDEX(x.value('text()[1]','varchar(max)')) AS [se]
FROM #tmpXML.nodes('/x[text()]') A(x)
ORDER BY se
FOR XML PATH(''),TYPE).value('.','nvarchar(max)') + '/';
END
GO
--Add a column to store a computed soundex-chain permanently
ALTER TABLE tbl_pat_soundex ADD SortedSoundExPattern VARCHAR(MAX);
GO
--We need a trigger to maintain the computed soundex-chain on any insert or update
CREATE TRIGGER RefreshComputeSoundex ON tbl_pat_soundex
FOR INSERT,UPDATE
AS
BEGIN
UPDATE s SET SortedSoundExPattern=dbo.ComputeSoundex(i.col_str)
FROM tbl_pat_soundex s
INNER JOIN inserted i ON s.ID=i.ID;
END
GO
--test data
insert into tbl_pat_soundex(col_str) values
('Smith A Steve')
,('Steve A Smyth')
,('A Smeeth Stive')
,('Steve Smith A')
,('Smit Steve A')
,('Smit Steve') --no A
,('Smit A') --no Steve
,('Smit Smith Robert Peter A') --add noise
,('Shnugo'); --something else entirely
--check the intermediate result
SELECT *
FROM tbl_pat_soundex
/*
+----+---------------------------+-----------------------+
| ID | col_str | SortedSoundExPattern |
+----+---------------------------+-----------------------+
| 1 | Smith A Steve | /A000/S310/S530/ |
+----+---------------------------+-----------------------+
| 2 | Steve A Smyth | /A000/S310/S530/ |
+----+---------------------------+-----------------------+
| 3 | A Smeeth Stive | /A000/S310/S530/ |
+----+---------------------------+-----------------------+
| 4 | Steve Smith A | /A000/S310/S530/ |
+----+---------------------------+-----------------------+
| 5 | Smit Steve A | /A000/S310/S530/ |
+----+---------------------------+-----------------------+
| 6 | Smit Steve | /S310/S530/ |
+----+---------------------------+-----------------------+
| 7 | Smit A | /A000/S530/ |
+----+---------------------------+-----------------------+
| 8 | Smit Smith Robert Peter A | /A000/P360/R163/S530/ |
+----+---------------------------+-----------------------+
| 9 | Shnugo | /S520/ |
+----+---------------------------+-----------------------+
*/
--Now we can start to search:
DECLARE #StringToSearch VARCHAR(MAX)=' A Steve';
WITH SplittedSearchString AS
(
SELECT soundexCode.value('text()[1]','nvarchar(max)') AS SoundExCode
FROM (SELECT CAST('<x>' + REPLACE(dbo.ComputeSoundex(#StringToSearch),'/','</x><x>') + '</x>' AS XML)) A(x)
CROSS APPLY x.nodes('/x[text()]') B(soundexCode)
)
SELECT a.ID,col_str
FROM tbl_pat_soundex a
INNER JOIN SplittedSearchString s On SortedSoundExPattern LIKE '%/' + s.SoundExCode + '/%'
GROUP BY ID,col_str
HAVING COUNT(ID)=(SELECT COUNT(*) FROM SplittedSearchString)
ORDER BY ID
GO
--clean-up
USE master;
GO
DROP DATABASE shnugo;
Short explanation
This is how it works:
The cte will use the same function to return a soundex-chain of alle the input's fragments
The query will then INNER JOIN this with a LIKE test --this will be sloooooow...
The final check is, if the number of hits is the same as number of fragments.
And a final hint: If you want to search for an exact match, but you want to include different writings you can just directly compare the two strings. You might even place an index on the new column SortedSoundExPattern. Due to the way of creation all kinds of "Steven A Smith", "Steeven a Smit" and even in differing order like "Smith Steven A" will produce exactly the same pattern.
In my view, you should try to use dynamic SQL.
For example, you have a table:
create table tbl_pat_soundex
(
id int,
col_str varchar(max)
)
And you have an the following clustered index or any other index(table with over 100 million rows should have some index):
CREATE NONCLUSTERED INDEX myIndex ON dbo.tbl_pat_soundex(id) INCLUDE (col_str)*/
So try to create the following dynamic SQL query based on your logic and execute it. The wish result should look like this:
DECLARE #statement NVARCHAR(4000)
SET #statement = N'
SELECT col_str
FROM tbl_pat_soundex
WHERE col_str like '%Smith%' AND id > 0
UNION ALL
SELECT col_str
FROM tbl_pat_soundex
WHERE col_str like '%Steve%' AND id > 0
UNION ALL
SELECT col_str
FROM tbl_pat_soundex
WHERE
PATINDEX('%Smith%',col_str) >= 1 AND PATINDEX('%A%',col_str) >= 1 AND
PATINDEX('%Steve%',col_str) >= 1
AND id > 0'
Basically, what we do is creating single search queries which will have index seek and then combine all results.
This query will have index seek as we use predicate id > 0(assuming that all ids are greater than 0 or you can write your own negative number):
SELECT col_str
FROM tbl_pat_soundex
WHERE col_str like '%Smith%' AND id > 0
I have this following table with header information:
InformationID Name DispalyName VAlueType
1 Occupation Occupation Text
3 Surgery Surgery Header
4 SurgeryType SurgeryType Text
5 SurgeryDate SurgeryDate Datetime
8 Rehabilitation Rehabilitation Header
9 RehabilitationPT Rehabilitation Physiotherapist Text
10 RehabilitationDate Rehabilitation Date DateTime
11 DischargeSession Discharge Session Number int
12 Height Height decimal
13 Weight Weight decimal
I have another table has all these rows values stored like this:
InfomationID ClientID TextValue NumberValue DateValue
1 XXXX Programmer null null
4 XXXX Left knee surgery null null
5 XXXX null null 2016-08-01 01:00:00
11 XXXX null 6 null
12 XXXX null 164.592 null
Now I need to get those in a sub-query with where ClientID = XXXX
Part of result would be something following:
CFname| CLname| Occupation| SurgeryType | SurgeryDate| DischargeSession
-------------------------------------------------------------------------------
CFName| CLname| Programmer| Left knee surgery| 2016-08-01 | 6
::::
::::
More records
Regards
you would just need to convert the number and date values to varchar.. you can use coalesce to get the first non null value.
your pivot would looks something like this
SELECT *
FROM (
SELECT c.CFName,
c.CLName,
fh.Name,
COALESCE(fv.TextValue,
CAST(fv.NumberValue AS VARCHAR(50)), -- convert numbers to text
CONVERT(VARCHAR(10), fv.DateValue, 101) AS DynamicValue -- convert date to text
-- increase varchar length for time
-- or 101 for different format
FROM ClientInfo c
JOIN FieldValues fv ON c.ClientID = fv.ClientID
JOIN FieldHeaders fh ON fh.InformationID = fv.InformationID
) t
PIVOT (
MAX(DynamicValue)
FOR fh.Name IN ([Occupation], [SurgeryType], [SurgeryDate], [DischargeSession])
) p
to do this dynamically you would need to build your sql and inject the pivot columns into the sql by selecting the values into a nvarchar using STUFF(FOR XML)
DECLARE #Sql NVARCHAR(MAX),
#Columns NVARCHAR(MAX)
SELECT #Columns = STUFF((
SELECT ',' + QUOTENAME(Name)
FROM FieldHeaders
FOR XML PATH('')
), 1, 1, 0)
SET #Sql = N'
SELECT *
FROM (
SELECT c.CFName,
c.CLName,
fh.Name,
COALESCE(fv.TextValue,
CAST(fv.NumberValue AS VARCHAR(50)), -- convert numbers to text
CONVERT(VARCHAR(10), fv.DateValue, 101) AS DynamicValue -- convert date to text
-- increase varchar length for time
-- or 101 for different format
FROM ClientInfo c
JOIN FieldValues fv ON c.ClientID = fv.ClientID
JOIN FieldHeaders fh ON fh.InformationID = fv.InformationID
) t
PIVOT (
MAX(DynamicValue)
FOR fh.Name IN (' + #Columns + ')
) p
'
EXEC(#Sql)
I have a query with the columns 'Name', 'Amount', and 'ReasonId'. I want to sum the amount and put the reasons on one row to keep every name to a single line. There are about 50 distinct ReasonId's so I do not want to name the column the name of the ReasonId's. Instead, I would like to name the columns 'Reason1', 'Reason2', 'Reason3', and 'Reason4'. One single name can have up to 4 different reasons.
I have this:
Name Amount ReasonId
-------------------------
Bob $5 7
Bob $8 6
John $2 8
John $5 9
John $3 9
John $8 4
I want to produce the following:
Name Amount Reason1 Reason2 Reason3 Reason4
-----------------------------------------------------
Bob $13 7 6 NULL NULL
John $18 8 9 4 NULL
One way to do this is to use the dense_rank window function to number the rows, and then use conditional aggregation to put the reason in the correct columns.
I can't see anything that would give the specific order of the reason columns though, maybe there is some column missing that provides the order?
with cte as (
select
name,
reasonid,
amount,
dense_rank() over (partition by name order by reasonid) rn
from your_table
)
select
name,
sum(amount) amount,
max(case when rn = 1 then reasonid end) reason1,
max(case when rn = 2 then reasonid end) reason2,
max(case when rn = 3 then reasonid end) reason3,
max(case when rn = 4 then reasonid end) reason4
from cte
group by name
If you have some column that gives the order you want then change the order by clause used in the dense_rank function.
Sample SQL Fiddle (using PG as MSSQL seems to be offline).
The output from the query above would be:
| name | amount | reason1 | reason2 | reason3 | reason4 |
|------|--------|---------|---------|---------|---------|
| Bob | 13 | 6 | 7 | (null) | (null) |
| John | 18 | 4 | 8 | 9 | (null) |
You could also use a pivot to achieve this; if you know the columns you can enter them in the script, but if not, you can use dynamic sql (there are reasons why you might want to avoid the dynamic solution).
The advantage of this route is that you can enter the column list in a table and then changes to that table will result in changes to your output with change to the script involved. The disadvantages are all those associated with dynamic SQL.
In the interests of variation, here is a dynamic SQL solution using temp tables to hold your data, since a different possibility has been provided:
-- set up your data
CREATE TABLE #MyTab (Name VARCHAR(4), Amount INT, ReasonId INT)
CREATE TABLE #AllPossibleReasons (Id INT,Label VARCHAR(10))
INSERT #AllPossibleReasons
VALUES
(1,'Reason1')
,(2,'Reason2')
,(3,'Reason3')
,(4,'Reason4')
,(5,'Reason5')
,(6,'Reason6')
,(7,'Reason7')
,(8,'Reason8')
,(9,'Reason9')
INSERT #MyTab
VALUES
('Bob',7,7)
,('Bob',8,6)
,('John',2,8)
,('John',5,9)
,('John',3,9)
,('John',8,4)
-----------------------------------------------------------------------------
-- The actual query
DECLARE #ReasonList VARCHAR(MAX) = ''
DECLARE #SQL VARCHAR(MAX)
SELECT #ReasonList = #ReasonList + ',' + QUOTENAME(Label)
FROM #AllPossibleReasons
SET #ReasonList = SUBSTRING(#ReasonList,2,LEN(#ReasonList))
SET #SQL =
'SELECT Name,Value,' + #ReasonList + ' FROM
(SELECT
M.Name,SUM(Amount) AS This, Label, SUM(Total.Value) AS Value
FROM
#MyTab AS M
INNER JOIN #AllPossibleReasons AS Reason ON M.ReasonId = Reason.Id
INNER JOIN(SELECT T.Name, SUM(Amount)Value
FROM #MyTab T GROUP BY T.Name) AS Total ON M.Name = Total.Name
GROUP BY M.Name, Reason.Label) AS Up
PIVOT (SUM(THis) FOR Label IN (' + #ReasonList + ')) AS Pvt'
EXEC (#SQL)
DROP TABLE #AllPossibleReasons
DROP TABLE #MyTab
Working from the information in ListAGG in SQLSERVER, I came up with this somewhat ugly example:
with tbl1 as (
-- Set up initial data set
select 'Bob' name, 5 amount, 7 ReasonId
union all select 'Bob' , 3, 4
union all select 'Bob', 2, 1
union all select 'Brian', 8, 2
union all select 'Bob', 6, 4
union all select 'Brian', 1, 3
union all select 'Tim', 2, 2)
, TBL2 AS ( -- Add a blank to separate the concatenation
SELECT NAME
, AMOUNT
, CAST(ReasonId as varchar) + ' ' ReasonId from tbl1
)
select ta.name
, Total
, ReasonIds from (
(select distinct name, stuff((select distinct '' + t2.ReasonId from tbl2 t2
where t1.name = t2.name
for xml path(''), type).value('.','NVARCHAR(MAX)'),1,0,' ') ReasonIds from tbl2 t1) ta
inner join ( select name, sum(amount) Total from tbl1 group by name) tb on ta.name = tb.name) ;
This converts TBL1 to the following:
name Total ReasonIds
Bob 16 1 4 7
Brian 9 2 3
Tim 2 2
I have below mentioned table :
drn RecNum Name Value
----------------------------------------------
1 1 ad1_pk 1
2 1 ad1_address1 P.O. Box 5036
3 1 ad1_address2 NULL
4 1 ad1_address3 NULL
5 1 ad1_ctyfk 56
6 1 ad1_postalcode 80155-5036
7 1 ad1_active Y
8 1 ad1_irstat A
9 1 ad1_irdata NULL
10 1 ad1_at1fk 1
1 2 ad1_pk 2
2 2 ad1_address1 1871 S. Broadway
3 2 ad1_address2 NULL
4 2 ad1_address3 NULL
5 2 ad1_ctyfk 1
6 2 ad1_postalcode 80210
7 2 ad1_active Y
8 2 ad1_irstat A
9 2 ad1_irdata NULL
10 2 ad1_at1fk 1
I am creating the pivot using the below mentioned query:
declare #var nvarchar(max)
declare #sql nvarchar(max)
set #var = stuff((select distinct ',' + name from temp
for xml path('')),1,1,'') -- **this is giving distinct column list but the order of columns get changed..**
set #sql = 'select * from temp
pivot(max(value) for name in (' + #var + ')) as pvt'
exec sp_executesql #sql
Is there a way to keep the order of the columns unchanged? I want the order of columns listed in #var to be same as in the table.
Add a GROUP BY and an ORDER BY clause (to replace the DISTINCT) where you build your column list as follows:
set #var = stuff((select ',' + min(name) from temp GROUP BY drn ORDER BY drn
for xml path('')),1,1,'')
And don't forget the the necessary aggregation (I've used MIN()). Thanks #Ionic.
This is because you're using a DISTINCT in your SELECT query. If you look at the execution plan, you can see DISTINCT SORT operation. This sorts your result based on the DISTINCT columns you specify, in this case it's Name:
To retain the order, you can try this:
set #var = stuff((
select ',' + name
from(
select
name,
drn,
rn = row_number() over(partition by name order by drn)
from temp
)t
where rn = 1
order by drn
for xml path('')),
1,1,'')
I am using SQL Server 2008 R2.
I have a table tblInstitution as follows
InstitutionCode InstitutionDesc
---------------------------------
ABC Abra Cada Brad
DEF Def Fede Eeee
GHJ Gee Hee
I want to split the values in InstitutionDesc and store it based on the institution code
InstitutionCode Token Score
-------------------------------
ABC Abra 0
ABC Cada 0
ABC Brad 0
DEF Def 0
DEF Fede 0
DEF Eeee 0
GHJ Gee 0
GHJ Hee 0
Is there a way I can do this in a set-based operation?
I have seen examples where a single column value can be split into multiple column values of the same row. But I am not able to find an example where the same column can be split into different rows. I am not sure what exactly should be searched for. Is it something to do with CTEs.
Here is a recursive CTE option...
If Object_ID('tempdb..#tblInstitution') Is Not Null Drop Table #tblInstitution;
Create Table #tblInstitution (InstitutionCode Varchar(10), InstitutionDesc Varchar(50));
Insert #tblInstitution (InstitutionCode, InstitutionDesc)
Values ('ABC','Abra Cada Brad'),
('DEF','Def Fede Eeee'),
('GHJ','Gee Hee'),
('KLM','Kappa');
With base As
(
Select InstitutionCode,
LTRIM(RTRIM(InstitutionDesc)) As InstitutionDesc
From #tblInstitution
), recur As
(
Select InstitutionCode,
Left(InstitutionDesc, CharIndex(' ', InstitutionDesc + ' ') - 1) As Token,
Case
When CharIndex(' ', InstitutionDesc) > 0
Then Right(InstitutionDesc, Len(InstitutionDesc) - CharIndex(' ', InstitutionDesc))
Else Null
End As Remaining
From base
Union All
Select InstitutionCode,
Left(Remaining, CharIndex(' ', Remaining + ' ') - 1) As Token,
Case
When CharIndex(' ', Remaining) > 0
Then Right(Remaining, Len(Remaining) - CharIndex(' ', Remaining))
Else Null
End As Remaining
From recur
Where Remaining Is Not Null
)
Select InstitutionCode,
Token,
0 As Score
From recur
Order By InstitutionCode