Split XML field into multiple delimited values - SQL - sql-server

I have some XML content in a single field; I want to split each xml field in multiple rows.
The XML is something like that:
<env>
<id>id1<\id>
<DailyProperties>
<date>01/01/2022<\date>
<value>1<\value>
<\DailyProperties>
<DailyProperties>
<date>05/05/2022<\date>
<value>2<\value>
<\DailyProperties>
<\env>
I want to put everything in a table as:
ID DATE VALUE
id1 01/01/2022 1
id1 05/05/2022 2
For now I managed to parse the xml value, and I have found something online to get a string into multiple rows (like this), but my string should have some kind of delimiter. I did this:
SELECT
ID,
XMLDATA.X.query('/env/DailyProperties/date').value('.', 'varchar(100)') as r_date,
XMLDATA.X.query('/env/DailyProperties/value').value('.', 'varchar(100)') as r_value
from tableX
outer apply xmlData.nodes('.') as XMLDATA(X)
WHERE ID = 'id1'
but I get all values without a delimiter, as such:
01/10/202202/10/202203/10/202204/10/202205/10/202206/10/202207/10/202208/10/202209/10/202210/10/2022
Or, as in my example:
ID R_DATE R_VALUE
id01 01/01/202205/05/2022 12
I have found out that XQuery has a last() function that return the last value parsed; in my xml example it will return only 05/05/2022, so it should exists something for address the adding of a delimiter. The number of rows could vary, as it could vary the number of days of which I have a value.

Please try the following solution.
I had to fix your XML to make it well-formed.
SQL
DECLARE #tbl TABLE (id INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO #tbl (xmldata) VALUES
(N'<env>
<id>id1</id>
<DailyProperties>
<date>01/01/2022</date>
<value>1</value>
</DailyProperties>
<DailyProperties>
<date>05/05/2022</date>
<value>2</value>
</DailyProperties>
</env>');
SELECT p.value('(id/text())[1]','VARCHAR(20)') AS id
, c.value('(date/text())[1]','VARCHAR(10)') AS [date]
, c.value('(value/text())[1]','INT') AS [value]
FROM #tbl
CROSS APPLY xmldata.nodes('/env') AS t1(p)
OUTER APPLY t1.p.nodes('DailyProperties') AS t2(c);
Output
id
date
value
id1
01/01/2022
1
id1
05/05/2022
2

Yitzhak beat me to it by 2 min. Nonetheless, here's what I have:
--==== XML Data:
DECLARE #xml XML =
'<env>
<id>id1</id>
<DailyProperties>
<date>01/01/2022</date>
<value>1</value>
</DailyProperties>
<DailyProperties>
<date>05/05/2022</date>
<value>2</value>
</DailyProperties>
</env>';
--==== Solution:
SELECT
ID = ff2.xx.value('(text())[1]','varchar(20)'),
[Date] = ff.xx.value('(date/text())[1]', 'date'),
[Value] = ff.xx.value('(value/text())[1]', 'int')
FROM (VALUES(#xml)) AS f(X)
CROSS APPLY f.X.nodes('env/DailyProperties') AS ff(xx)
CROSS APPLY f.X.nodes('env/id') AS ff2(xx);
Returns:
ID Date Value
-------------------- ---------- -----------
id1 2022-01-01 1
id1 2022-05-05 2

Related

Parse XML - Retrieve the Portion Between the Double Quotes

I have the following XML that is in an XML column in SQL Server. I am able to retrieve the data between the tags and list it in table format using the code at the bottom. I can retrieve the values between all the tags except for the one I have in bold below that is in double quotes. I can get the value X just fine but I need to get the 6 that is in between the double quotes in this part: <Organization501cInd organization501cTypeTxt="6">X</Organization501cInd>
WITH XMLNAMESPACES (DEFAULT 'http://www.irs.gov/efile')
SELECT ID, FilingYear, FilingPeriod, FilingType, [FileName]
, Organization501c3Ind = c.value('(//Organization501c3Ind/text())[1]','varchar(MAX)')
, Organization501cInd = c.value('(//Organization501cInd/text())[1]','varchar(MAX)')
, Organization501cTypeTxt = c.value('(//Organization501cTypeTxt/text())[1]','varchar(MAX)')
FROM Form990
CROSS APPLY XMLData.nodes('//Return') AS t(c)
CROSS APPLY XMLData.nodes('//Return/ReturnHeader/Filer') AS t2(c2)
XML:
<ReturnData documentCnt="2">
<IRS990 documentId="IRS990-01" referenceDocumentId="IRS990ScheduleO-01" referenceDocumentName="IRS990ScheduleO ReasonableCauseExplanation" softwareId="19009670">
<PrincipalOfficerNm>CAREY BAKER</PrincipalOfficerNm>
<USAddress>
<AddressLine1Txt>PO BOX 11275</AddressLine1Txt>
<CityNm>TALLAHASSEE</CityNm>
<StateAbbreviationCd>FL</StateAbbreviationCd>
<ZIPCd>32302</ZIPCd>
</USAddress>
<GrossReceiptsAmt>104241</GrossReceiptsAmt>
<GroupReturnForAffiliatesInd>false</GroupReturnForAffiliatesInd>
<Organization501cInd organization501cTypeTxt="6">X</Organization501cInd>
Thoughts?
Without a minimal reproducible example by the OP, shooting from the hip.
SQL
-- DDL and sample data population, start
DECLARE #Form990 TABLE (ID INT IDENTITY PRIMARY KEY, XMLData XML);
INSERT INTO #Form990(XMLData) VALUES
(N'<Return xmlns="http://www.irs.gov/efile" returnVersion="2019v5.1">
<ReturnData documentCnt="2">
<IRS990 documentId="IRS990-01" referenceDocumentId="IRS990ScheduleO-01" referenceDocumentName="IRS990ScheduleO ReasonableCauseExplanation" softwareId="19009670">
<PrincipalOfficerNm>CAREY BAKER</PrincipalOfficerNm>
<USAddress>
<AddressLine1Txt>PO BOX 11275</AddressLine1Txt>
<CityNm>TALLAHASSEE</CityNm>
<StateAbbreviationCd>FL</StateAbbreviationCd>
<ZIPCd>32302</ZIPCd>
</USAddress>
<GrossReceiptsAmt>104241</GrossReceiptsAmt>
<GroupReturnForAffiliatesInd>false</GroupReturnForAffiliatesInd>
<Organization501c3Ind>X</Organization501c3Ind>
<Organization501cInd organization501cTypeTxt="6">X</Organization501cInd>
</IRS990>
</ReturnData>
</Return>');
-- DDL and sample data population, end
WITH XMLNAMESPACES (DEFAULT 'http://www.irs.gov/efile')
SELECT -- ID, FilingYear, FilingPeriod, FilingType, [FileName]
Organization501c3Ind = c.value('(Organization501c3Ind/text())[1]','varchar(MAX)')
, Organization501cInd = c.value('(Organization501cInd/text())[1]','varchar(MAX)')
, Organization501cTypeTxt = c.value('(Organization501cInd/#organization501cTypeTxt)[1]','varchar(MAX)')
FROM #Form990
CROSS APPLY XMLData.nodes('/Return/ReturnData/IRS990') AS t(c)
Output
+----------------------+---------------------+-------------------------+
| Organization501c3Ind | Organization501cInd | Organization501cTypeTxt |
+----------------------+---------------------+-------------------------+
| X | X | 6 |
+----------------------+---------------------+-------------------------+
NOTE: XML element and attribute names are case-sensitive. i.e.: Organization501cTypeTxt will not match an attribute named organization501cTypeTxt.
When extracting attributes you need to use the # accessor in your XPath query. Try something like the following...
WITH XMLNAMESPACES (DEFAULT 'http://www.irs.gov/efile')
SELECT ID, FilingYear, FilingPeriod, FilingType, [FileName],
Organization501cInd = c2.value('(Organization501cInd/text())[1]','varchar(MAX)'),
organization501cTypeTxt = c2.value('(Organization501cInd/#organization501cTypeTxt)[1]','varchar(MAX)')
FROM Form990
CROSS APPLY XMLData.nodes('/ReturnData') AS t(c)
CROSS APPLY t.c.nodes('IRS990') AS t2(c2);

Compare the two tables and update the value in a Flag column

I have two tables and the values like this
`create table InputLocationTable(SKUID int,InputLocations varchar(100),Flag varchar(100))
create table Location(SKUID int,Locations varchar(100))
insert into InputLocationTable(SKUID,InputLocations) values(11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6')
insert into InputLocationTable(SKUID,InputLocations) values(12,'Loc1, Loc2')
insert into InputLocationTable(SKUID,InputLocations) values(13,'Loc4,Loc5')
insert into Location(SKUID,Locations) values(11,'Loc3')
insert into Location(SKUID,Locations) values(11,'Loc4')
insert into Location(SKUID,Locations) values(11,'Loc5')
insert into Location(SKUID,Locations) values(11,'Loc7')
insert into Location(SKUID,Locations) values(12,'Loc10')
insert into Location(SKUID,Locations) values(12,'Loc1')
insert into Location(SKUID,Locations) values(12,'Loc5')
insert into Location(SKUID,Locations) values(13,'Loc4')
insert into Location(SKUID,Locations) values(13,'Loc2')
insert into Location(SKUID,Locations) values(13,'Loc2')`
I need to get the output by matching SKUID's from Each tables and Update the value in Flag column as shown in the screenshot, I have tried something like this code
`SELECT STUFF((select ','+ Data.C1
FROM
(select
n.r.value('.', 'varchar(50)') AS C1
from InputLocation as T
cross apply (select cast('<r>'+replace(replace(Location,'&','&'), ',', '</r><r>')+'</r>' as xml)) as S(XMLCol)
cross apply S.XMLCol.nodes('r') as n(r)) DATA
WHERE data.C1 NOT IN (SELECT Location
FROM Location) for xml path('')),1,1,'') As Output`
But not convinced with output and also i am trying to avoid xml path code, because performance is not first place for this code, I need the output like the below screenshot. Any help would be greatly appreciated.
I think you need to first look at why you think the XML approach is not performing well enough for your needs, as it has actually been shown to perform very well for larger input strings.
If you only need to handle input strings of up to either 4000 or 8000 characters (non max nvarchar and varchar types respectively), you can utilise a tally table contained within an inline table valued function which will also perform very well. The version I use can be found at the end of this post.
Utilising this function we can split out the values in your InputLocations column, though we still need to use for xml to concatenate them back together for your desired format:
-- Define data
declare #InputLocationTable table (SKUID int,InputLocations varchar(100),Flag varchar(100));
declare #Location table (SKUID int,Locations varchar(100));
insert into #InputLocationTable(SKUID,InputLocations) values (11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6'),(12,'Loc1, Loc2'),(13,'Loc4,Loc5'),(14,'Loc1');
insert into #Location(SKUID,Locations) values (11,'Loc3'),(11,'Loc4'),(11,'Loc5'),(11,'Loc7'),(12,'Loc10'),(12,'Loc1'),(12,'Loc5'),(13,'Loc4'),(13,'Loc2'),(13,'Loc2'),(14,'Loc1');
--Query
-- Derived table splits out the values held within the InputLocations column
with i as
(
select i.SKUID
,i.InputLocations
,s.item as Loc
from #InputLocationTable as i
cross apply dbo.fn_StringSplit4k(replace(i.InputLocations,' ',''),',',null) as s
)
select il.SKUID
,il.InputLocations
,isnull('Add ' -- The split Locations are then matched to those already in #Location and those not present are concatenated together.
+ stuff((select ', ' + i.Loc
from i
left join #Location as l
on i.SKUID = l.SKUID
and i.Loc = l.Locations
where il.SKUID = i.SKUID
and l.SKUID is null
for xml path('')
)
,1,2,''
)
,'No Flag') as Flag
from #InputLocationTable as il
order by il.SKUID;
Output:
+-------+------------------------------------+----------------------+
| SKUID | InputLocations | Flag |
+-------+------------------------------------+----------------------+
| 11 | Loc1, Loc2, Loc3, Loc4, Loc5, Loc6 | Add Loc1, Loc2, Loc6 |
| 12 | Loc1, Loc2 | Add Loc2 |
| 13 | Loc4,Loc5 | Add Loc5 |
| 14 | Loc1 | No Flag |
+-------+------------------------------------+----------------------+
For nvarchar input (I have different functions for varchar and max type input) this is my version of the string splitting function linked above:
create function [dbo].[fn_StringSplit4k]
(
#str nvarchar(4000) = ' ' -- String to split.
,#delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,#num as int = null -- Which value in the list to return. NULL returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in #str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest #str length.
,t(t) as (select top (select len(isnull(#str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(#str,''),t,1) = #delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(#delimiter,isnull(#str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(#str,s,l) as item
from l
) a
where rn = #num
or #num is null;
go

how to add a xml node to xml DataColumn and return Xml Datacolumn as string in SQL

I have a table like below. ID column should be added into XML tag as Value with respective ID Column value and return as one XML tag
Ex:
Specification | ID
<car><Name>BMW</Name></car> | 245
<car><Name>Audi</Name></car> | 290
<car><Name>Skoda</Name></car>| 295
Output Should be as
| Specification |
| <car><Name>BMW</Name><ID>245</ID></car><car><Name>Audi</Name><ID>290</ID></car><car><Name>Audi</Name><ID>295</ID></car> |
You could use modify() and insert xml like this
DECLARE #SampleData AS TABLE
(
Specification xml,
Id int
)
INSERT INTO #SampleData
VALUES
('<car><Name>BMW</Name></car>', 245),
('<car><Name>Audi</Name></car>', 290),
('<car><Name>Skoda</Name></car>', 295)
Update #SampleData SET Specification.modify('insert <Id>{sql:column("sd.Id")}</Id> into (/car)[1]')
FROM #SampleData sd
SELECT sd.* FROM #SampleData sd
Demo link: Rextester
Some reference links:
modify xml
insert xml
this is one way I can think of. You can strip the value of the car name using the .node, and reconstruct the XML. Something like,
CREATE TABLE #Temp
(
Specification XML,
ID INT
)
INSERT #Temp ( Specification, ID )
VALUES
('<car><Name>BMW</Name></car>', 245),
('<car><Name>Audi</Name></car>', 290),
('<car><Name>Skoda</Name></car>', 295)
SELECT * FROM
(
SELECT
Val.value('(text())[1]', 'varchar(32)') AS Name,
T.ID
FROM #Temp AS T
CROSS APPLY Specification.nodes('car/Name') AS Id(Val)
) X
FOR XML PATH('Name'), ROOT ('Car')
DROP TABLE #Temp

Try to find explaination for [duplicate]

Table is:
Id
Name
1
aaa
1
bbb
1
ccc
1
ddd
1
eee
Required output:
Id
abc
1
aaa,bbb,ccc,ddd,eee
Query:
SELECT ID,
abc = STUFF(
(SELECT ',' + name FROM temp1 FOR XML PATH ('')), 1, 1, ''
)
FROM temp1 GROUP BY id
This query is working properly. But I just need the explanation how it works or is there any other or short way to do this.
I am getting very confused to understand this.
Here is how it works:
1. Get XML element string with FOR XML
Adding FOR XML PATH to the end of a query allows you to output the results of the query as XML elements, with the element name contained in the PATH argument. For example, if we were to run the following statement:
SELECT ',' + name
FROM temp1
FOR XML PATH ('')
By passing in a blank string (FOR XML PATH('')), we get the following instead:
,aaa,bbb,ccc,ddd,eee
2. Remove leading comma with STUFF
The STUFF statement literally "stuffs” one string into another, replacing characters within the first string. We, however, are using it simply to remove the first character of the resultant list of values.
SELECT abc = STUFF((
SELECT ',' + NAME
FROM temp1
FOR XML PATH('')
), 1, 1, '')
FROM temp1
The parameters of STUFF are:
The string to be “stuffed” (in our case the full list of name with a
leading comma)
The location to start deleting and inserting characters (1, we’re stuffing into a blank string)
The number of characters to delete (1, being the leading comma)
So we end up with:
aaa,bbb,ccc,ddd,eee
3. Join on id to get full list
Next we just join this on the list of id in the temp table, to get a list of IDs with name:
SELECT ID, abc = STUFF(
(SELECT ',' + name
FROM temp1 t1
WHERE t1.id = t2.id
FOR XML PATH (''))
, 1, 1, '') from temp1 t2
group by id;
And we have our result:
Id
Name
1
aaa,bbb,ccc,ddd,eee
This article covers various ways of concatenating strings in SQL, including an improved version of your code which doesn't XML-encode the concatenated values.
SELECT ID, abc = STUFF
(
(
SELECT ',' + name
FROM temp1 As T2
-- You only want to combine rows for a single ID here:
WHERE T2.ID = T1.ID
ORDER BY name
FOR XML PATH (''), TYPE
).value('.', 'varchar(max)')
, 1, 1, '')
FROM temp1 As T1
GROUP BY id
To understand what's happening, start with the inner query:
SELECT ',' + name
FROM temp1 As T2
WHERE T2.ID = 42 -- Pick a random ID from the table
ORDER BY name
FOR XML PATH (''), TYPE
Because you're specifying FOR XML, you'll get a single row containing an XML fragment representing all of the rows.
Because you haven't specified a column alias for the first column, each row would be wrapped in an XML element with the name specified in brackets after the FOR XML PATH. For example, if you had FOR XML PATH ('X'), you'd get an XML document that looked like:
<X>,aaa</X>
<X>,bbb</X>
...
But, since you haven't specified an element name, you just get a list of values:
,aaa,bbb,...
The .value('.', 'varchar(max)') simply retrieves the value from the resulting XML fragment, without XML-encoding any "special" characters. You now have a string that looks like:
',aaa,bbb,...'
The STUFF function then removes the leading comma, giving you a final result that looks like:
'aaa,bbb,...'
It looks quite confusing at first glance, but it does tend to perform quite well compared to some of the other options.
PATH mode is used in generating XML from a SELECT query
1. SELECT
ID,
Name
FROM temp1
FOR XML PATH;
Ouput:
<row>
<ID>1</ID>
<Name>aaa</Name>
</row>
<row>
<ID>1</ID>
<Name>bbb</Name>
</row>
<row>
<ID>1</ID>
<Name>ccc</Name>
</row>
<row>
<ID>1</ID>
<Name>ddd</Name>
</row>
<row>
<ID>1</ID>
<Name>eee</Name>
</row>
The Output is element-centric XML where each column value in the resulting rowset is wrapped in an row element. Because the SELECT clause does not specify any aliases for the column names, the child element names generated are the same as the corresponding column names in the SELECT clause.
For each row in the rowset a tag is added.
2.
SELECT
ID,
Name
FROM temp1
FOR XML PATH('');
Ouput:
<ID>1</ID>
<Name>aaa</Name>
<ID>1</ID>
<Name>bbb</Name>
<ID>1</ID>
<Name>ccc</Name>
<ID>1</ID>
<Name>ddd</Name>
<ID>1</ID>
<Name>eee</Name>
For Step 2: If you specify a zero-length string, the wrapping element is not produced.
3.
SELECT
Name
FROM temp1
FOR XML PATH('');
Ouput:
<Name>aaa</Name>
<Name>bbb</Name>
<Name>ccc</Name>
<Name>ddd</Name>
<Name>eee</Name>
4. SELECT
',' +Name
FROM temp1
FOR XML PATH('')
Ouput:
,aaa,bbb,ccc,ddd,eee
In Step 4 we are concatenating the values.
5. SELECT ID,
abc = (SELECT
',' +Name
FROM temp1
FOR XML PATH('') )
FROM temp1
Ouput:
1 ,aaa,bbb,ccc,ddd,eee
1 ,aaa,bbb,ccc,ddd,eee
1 ,aaa,bbb,ccc,ddd,eee
1 ,aaa,bbb,ccc,ddd,eee
1 ,aaa,bbb,ccc,ddd,eee
6. SELECT ID,
abc = (SELECT
',' +Name
FROM temp1
FOR XML PATH('') )
FROM temp1 GROUP by iD
Ouput:
ID abc
1 ,aaa,bbb,ccc,ddd,eee
In Step 6 we are grouping the date by ID.
STUFF( source_string, start, length, add_string )
Parameters or Arguments
source_string
The source string to modify.
start
The position in the source_string to delete length characters and then insert add_string.
length
The number of characters to delete from source_string.
add_string
The sequence of characters to insert into the source_string at the start position.
SELECT ID,
abc =
STUFF (
(SELECT
',' +Name
FROM temp1
FOR XML PATH('')), 1, 1, ''
)
FROM temp1 GROUP by iD
Output:
-----------------------------------
| Id | Name |
|---------------------------------|
| 1 | aaa,bbb,ccc,ddd,eee |
-----------------------------------
There is very new functionality in Azure SQL Database and SQL Server (starting with 2017) to handle this exact scenario. I believe this would serve as a native official method for what you are trying to accomplish with the XML/STUFF method. Example:
select id, STRING_AGG(name, ',') as abc
from temp1
group by id
STRING_AGG - https://msdn.microsoft.com/en-us/library/mt790580.aspx
EDIT: When I originally posted this I made mention of SQL Server 2016 as I thought I saw that on a potential feature that was to be included. Either I remembered that incorrectly or something changed, thanks for the suggested edit fixing the version. Also, pretty impressed and wasn't fully aware of the multi-step review process that just pulled me in for a final option.
In for xml path, if we define any value like [ for xml path('ENVLOPE') ] then these tags will be added with each row:
<ENVLOPE>
</ENVLOPE>
SELECT ID,
abc = STUFF(
(SELECT ',' + name FROM temp1 FOR XML PATH ('')), 1, 1, ''
)
FROM temp1 GROUP BY id
Here in the above query STUFF function is used to just remove the first comma (,) from the generated xml string (,aaa,bbb,ccc,ddd,eee) then it will become (aaa,bbb,ccc,ddd,eee).
And FOR XML PATH('') simply converts column data into (,aaa,bbb,ccc,ddd,eee) string but in PATH we are passing '' so it will not create a XML tag.
And at the end we have grouped records using ID column.
I did debugging and finally returned my 'stuffed' query to it it's normal way.
Simply
select * from myTable for xml path('myTable')
gives me contents of the table to write to a log table from a trigger I debug.
Declare #Temp As Table (Id Int,Name Varchar(100))
Insert Into #Temp values(1,'A'),(1,'B'),(1,'C'),(2,'D'),(2,'E'),(3,'F'),(3,'G'),(3,'H'),(4,'I'),(5,'J'),(5,'K')
Select X.ID,
stuff((Select ','+ Z.Name from #Temp Z Where X.Id =Z.Id For XML Path('')),1,1,'')
from #Temp X
Group by X.ID

Query to find the record with most matching columns, where the number of columns and names of columns is unknown?

I have two tables, X and Y, with identical schema but different records. Given a record from X, I need a query to find the closest matching record in Y that contains NULL values for non-matching columns. Identity columns should be excluded from the comparison. For example, if my record looked like this:
------------------------
id | col1 | col2 | col3
------------------------
0 |'abc' |'def' | 'ghi'
And table Y looked like this:
------------------------
id | col1 | col2 | col3
------------------------
6 |'abc' |'def' | 'zzz'
8 | NULL |'def' | NULL
Then the closest match would be record 8, since where the columns don't match, there are NULL values. 6 WOULD have been the closest match, but the 'zzz' disqualified it.
What's unique about this problem is that the schema of the tables is unknown besides the id column and the data types. There could be 4 columns, or there could be 7 columns. We just don't know - it's dynamic. All we know is that there is going to be an 'id' column and that the columns will be strings, either varchar or nvarchar.
What is the best query in this case to pick the closest matching record out of Y, given a record from X? I'm actually writing a function. The input is an integer (the id of a record in X) and the output is an integer (the id of a record in Y, or NULL). I'm an SQL novice, so a brief explanation of what's happening in your solution would help me greatly.
There could be 4 columns, or there could be 7 columns.... I'm actually writing a function.
This is an impossible task. Because functions are deterministic, so you cannot have a function that will work on an arbitrary table structure, using dynamic SQL. A stored procedure, sure, but not a function.
However, the below shows you a way using FOR XML and some decomposing of the XML to unpivot rows into column names and values which can then be compared. The technique used here and the queries can be incorporated into a stored procedure.
MS SQL Server 2008 Schema Setup:
-- this is the data table to match against
create table t1 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t1
select 6, 'abc', 'def', 'zzz' union all
select 8, null , 'def', null;
-- this is the data with the row you want to match
create table t2 (
id int,
col1 varchar(10),
col2 varchar(20),
col3 nvarchar(40));
insert t2
select 0, 'abc', 'def', 'ghi';
GO
Query 1:
;with unpivoted1 as (
select n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select (select * from t2 where id=0 for xml path(''), type)) x(xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
), unpivoted2 as (
select x.id,
n.n.value('local-name(.)','nvarchar(max)') colname,
n.n.value('.','nvarchar(max)') value
from (select id,(select * from t1 where id=outr.id for xml path(''), type) from t1 outr) x(id,xml)
cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
)
select TOP(1) WITH TIES
B.id,
sum(case when A.value=B.value then 1 else 0 end) matches
from unpivoted1 A
join unpivoted2 B on A.colname = B.colname
group by B.id
having max(case when A.value <> B.value then 1 end) is null
ORDER BY matches;
Results:
| ID | MATCHES |
----------------
| 8 | 1 |

Resources