Parse non Xml string into Xml during Sql query - sql-server

Say I have this subset of data. All I need as output is John | John | 20. The main issue is that my XmlData is stored in an nvarchar(max) column, and changing the column type would break an unknown number of other applications (at a massive scale, so I cannot simply modify the table design).
Name nvarchar(23) | XmlData nvarchar(max) |
John |<Personal><name>John</name><age>20</age></Personal> |
Suzy |<Personal><name>Suzanne</name><age>24</age></Personal> |
etc...
What I have tried so far is similar to the following, but it fails.
SELECT Name,
[myTable].Value('(Personal[name ="name"]/value/text())[1]', 'nvarchar(100)') as 'XmlName',
[myTable].Value('(Personal[name ="age"]/value/text())[1]', 'nvarchar(100)') as 'XmlAge'
FROM [MyTable]
How can I achieve my goal of the following output?
Name | XmlName | XmlAge |
John | John | 20 |
Suzy | Suzanne | 24 |
etc...

First cast the field to the XML type, then use the value() method:
DECLARE @T TABLE (Name nvarchar(23), XmlData nvarchar(max));
INSERT @T VALUES ('John', '<Personal><name>John</name><age>20</age></Personal>');
SELECT Name,
CAST(XmlData AS XML).value('(Personal/name)[1]', 'nvarchar(100)') AS 'XmlName',
CAST(XmlData AS XML).value('(Personal/age)[1]', 'nvarchar(100)') AS 'XmlAge'
FROM @T;
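If some rows might hold text that is not well-formed XML, a plain CAST will abort the whole query. Below is a minimal sketch of a more defensive variant, assuming SQL Server 2012 or later (where TRY_CAST is available). The second sample row and the aliases x and Doc are made up for illustration:

```sql
-- Sample data: one valid row and one deliberately malformed row (hypothetical)
DECLARE @T TABLE (Name nvarchar(23), XmlData nvarchar(max));
INSERT @T VALUES
    (N'John',   N'<Personal><name>John</name><age>20</age></Personal>'),
    (N'Broken', N'<Personal><name>Oops</Personal>');

-- Convert once per row; TRY_CAST yields NULL instead of raising an error
SELECT t.Name,
       x.Doc.value('(/Personal/name/text())[1]', 'nvarchar(100)') AS XmlName,
       x.Doc.value('(/Personal/age/text())[1]',  'int')           AS XmlAge
FROM @T AS t
CROSS APPLY (SELECT TRY_CAST(t.XmlData AS xml) AS Doc) AS x;
```

Doing the conversion once in CROSS APPLY avoids repeating it for every extracted value. Rows whose text cannot be converted should come back with NULL in both XML columns rather than failing the statement.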

Related

SQL Server 2017 - get column name, datatype and value of table

I thought this was a simple task, but I have been struggling with it for a couple of hours :-(
I want to have the list of column names of a table, together with its datatype and the value contained in the columns, but have no idea how to bind the table itself to get the current value:
DECLARE @TTab TABLE
(
fieldName nvarchar(128),
dataType nvarchar(64),
currentValue nvarchar(128)
)
INSERT INTO @TTab (fieldName,dataType)
SELECT
i.COLUMN_NAME,
i.DATA_TYPE
FROM
INFORMATION_SCHEMA.COLUMNS i
WHERE
i.TABLE_NAME = 'Users'
Expected result:
+------------+----------+---------------+
| fieldName | dataType | currentValue |
+------------+----------+---------------+
| userName | nvarchar | John |
| active | bit | true |
| age | int | 43 |
| balance | money | 25.20 |
+------------+----------+---------------+
In general the answer is: No, this is impossible. But there is a hack using text-based containers like XML or JSON (v2016+):
--Let's create a test table with some rows
CREATE TABLE dbo.TestGetMetaData(ID INT IDENTITY,PreName VARCHAR(100),LastName NVARCHAR(MAX),DOB DATE);
INSERT INTO dbo.TestGetMetaData(PreName,LastName,DOB) VALUES
('Tim','Smith','20000101')
,('Tom','Blake','20000202')
,('Kim','Black','20000303')
GO
--Here's the query
SELECT C.colName
,C.colValue
,D.*
FROM
(
SELECT t.* FROM dbo.TestGetMetaData t
WHERE t.Id=2
FOR XML PATH(''),TYPE
) A(rowSet)
CROSS APPLY A.rowSet.nodes('*') B(col)
CROSS APPLY(VALUES(B.col.value('local-name(.)','nvarchar(500)')
,B.col.value('text()[1]', 'nvarchar(max)'))) C(colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON D.TABLE_SCHEMA='dbo'
AND D.TABLE_NAME='TestGetMetaData'
AND D.COLUMN_NAME=C.colName;
GO
--Clean-Up (careful with real data)
DROP TABLE dbo.TestGetMetaData;
GO
Part of the result
+----------+------------+-----------+--------------------------+-------------+
| colName | colValue | DATA_TYPE | CHARACTER_MAXIMUM_LENGTH | IS_NULLABLE |
+----------+------------+-----------+--------------------------+-------------+
| ID | 2 | int | NULL | NO |
+----------+------------+-----------+--------------------------+-------------+
| PreName | Tom | varchar | 100 | YES |
+----------+------------+-----------+--------------------------+-------------+
| LastName | Blake | nvarchar | -1 | YES |
+----------+------------+-----------+--------------------------+-------------+
| DOB | 2000-02-02 | date | NULL | YES |
+----------+------------+-----------+--------------------------+-------------+
The idea in short:
Using FOR XML PATH(''),TYPE will create an XML representing your SELECT's result set.
The big advantage of this: each XML element carries its column's name.
We can use a CROSS APPLY to get the column's name and value.
Now we can JOIN the metadata from INFORMATION_SCHEMA.COLUMNS.
One hint: All values will be of type nvarchar(max) actually.
The value being a string type might lead to unexpected results due to implicit conversions or might lead into troubles with BLOBs.
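The JSON variant mentioned above could look roughly like this. It is only a sketch, assuming SQL Server 2016+ with database compatibility level 130 or higher (required for OPENJSON). It uses a table variable as a stand-in for the test table, so the join to INFORMATION_SCHEMA.COLUMNS is left out:

```sql
-- Stand-in for dbo.TestGetMetaData (table variable, so nothing needs cleaning up)
DECLARE @TestGetMetaData TABLE (ID int IDENTITY, PreName varchar(100), LastName nvarchar(max), DOB date);
INSERT @TestGetMetaData (PreName, LastName, DOB) VALUES
    ('Tim', 'Smith', '20000101'),
    ('Tom', 'Blake', '20000202');

-- Serialize one row as a single JSON object; keep NULL columns so no key goes missing
DECLARE @json nvarchar(max) =
(
    SELECT t.*
    FROM @TestGetMetaData AS t
    WHERE t.ID = 2
    FOR JSON PATH, WITHOUT_ARRAY_WRAPPER, INCLUDE_NULL_VALUES
);

-- OPENJSON without a WITH clause returns one row per key: [key], [value], [type]
SELECT j.[key]   AS colName,
       j.[value] AS colValue,
       j.[type]  AS jsonType
FROM OPENJSON(@json) AS j;
```

As with the XML approach, every value comes back as a string. Against a real table, colName can be joined to INFORMATION_SCHEMA.COLUMNS in the same way as in the query above.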
UPDATE
The following query wouldn't even need to specify the table's name in the JOIN:
SELECT C.colName
,C.colValue
,D.DATA_TYPE,D.CHARACTER_MAXIMUM_LENGTH,IS_NULLABLE
FROM
(
SELECT * FROM dbo.TestGetMetaData
WHERE Id=2
FOR XML AUTO,TYPE
) A(rowSet)
CROSS APPLY A.rowSet.nodes('/*/@*') B(attr)
CROSS APPLY(VALUES(A.rowSet.value('local-name(/*[1])','nvarchar(500)')
,B.attr.value('local-name(.)','nvarchar(500)')
,B.attr.value('.', 'nvarchar(max)'))) C(tblName,colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON CONCAT(D.TABLE_SCHEMA,'.',D.TABLE_NAME)=C.tblName
AND D.COLUMN_NAME=C.colName;
Why?
Using FOR XML AUTO produces attribute-centric XML. The element's name will be the table's name, while the values reside in attributes.
UPDATE 2
Fully generic function:
CREATE FUNCTION dbo.GetRowWithMetaData(@input XML)
RETURNS TABLE
AS
RETURN
SELECT C.colName
,C.colValue
,D.*
FROM @input.nodes('/*/@*') B(attr)
CROSS APPLY(VALUES(@input.value('local-name(/*[1])','nvarchar(500)')
,B.attr.value('local-name(.)','nvarchar(500)')
,B.attr.value('.', 'nvarchar(max)'))) C(tblName,colName,colValue)
LEFT JOIN INFORMATION_SCHEMA.COLUMNS D ON CONCAT(D.TABLE_SCHEMA,'.',D.TABLE_NAME)=C.tblName
AND D.COLUMN_NAME=C.colName;
--You call it like this (see the extra parentheses!)
SELECT * FROM dbo.GetRowWithMetaData((SELECT * FROM dbo.TestGetMetaData WHERE ID=2 FOR XML AUTO));
As you see, the function does not even have to know anything in advance...

SQL Server split SELECT XML column as arbitrary individual columns

In my application, I have few pre-defined fields for an object and user can define custom fields. I am using XML data type to store the custom fields in a name value format.
e.g. I have Employees table that has FN, LN, Email as pre-defined columns and CustomFields as XML column to hold the user defined fields.
And different rows can contain different custom fields.
e.g. Row 1 -> John, Smith, jsmith@example.com,
<root>
<phone>123-123-1234</phone>
<country>USA</country>
</root>
and then Row 2 -> Smith, John, sjohn@example.com,
<root>
<age>50</age>
<sex>Male</sex>
</root>
And there can be any number of such custom fields defined for different employee records. The format will always be the same
<root><field>value</field></root>
How can I return Phone and Country as columns while selecting Row1 and return Age and Sex as columns while selecting Row2?
Take this temp table for all examples
CREATE TABLE #tbl (ID INT IDENTITY, FirstName VARCHAR(100),LastName VARCHAR(100),eMail VARCHAR(100),CustomFields XML);
INSERT INTO #tbl VALUES
('John','Smith','john.smith@test.com'
,'<root>
<phone>123-123-1234</phone>
<country>USA</country>
</root>')
, ('Jane','Miller','jane.miller@test.com'
,'<root>
<age>50</age>
<sex>Male</sex>
</root>');
Option 1
Assuming that there is a fixed, known set of custom fields.
This allows type-safe reading (age as INT).
All possible columns are returned; unused ones are NULL.
Try this code
SELECT tbl.ID
,tbl.FirstName
,tbl.LastName
,tbl.eMail
,tbl.CustomFields.value('(/root/phone)[1]','nvarchar(max)') AS phone
,tbl.CustomFields.value('(/root/country)[1]','nvarchar(max)') AS country
,tbl.CustomFields.value('(/root/age)[1]','int') AS age
,tbl.CustomFields.value('(/root/sex)[1]','nvarchar(max)') AS sex
FROM #tbl AS tbl
This is the result
+----+-----------+----------+----------------------+--------------+---------+------+------+
| ID | FirstName | LastName | eMail | phone | country | age | sex |
+----+-----------+----------+----------------------+--------------+---------+------+------+
| 1 | John | Smith | john.smith@test.com | 123-123-1234 | USA | NULL | NULL |
+----+-----------+----------+----------------------+--------------+---------+------+------+
| 2 | Jane | Miller | jane.miller@test.com | NULL | NULL | 50 | Male |
+----+-----------+----------+----------------------+--------------+---------+------+------+
Option 2
Assuming you do not know the field names in advance, you cannot name the output columns directly.
But you can use generic names, read the data row-wise and apply PIVOT.
Try this:
SELECT p.*
FROM
(
SELECT tbl.FirstName
,tbl.LastName
,tbl.eMail
,N'Col_' + CAST(ROW_NUMBER() OVER(PARTITION BY tbl.ID ORDER BY (SELECT NULL)) AS NVARCHAR(max)) AS ColumnName
,A.cf.value('local-name(.)','nvarchar(max)') + ':' + A.cf.value('.','nvarchar(max)') AS cf
FROM #tbl AS tbl
CROSS APPLY tbl.CustomFields.nodes('/root/*') AS A(cf)
) AS x
PIVOT
(
MAX(cf) FOR ColumnName IN(Col_1,Col_2,Col_3,Col_4 /*add as many as you need*/)
) AS p
This is the result
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| FirstName | LastName | eMail | Col_1 | Col_2 | Col_3 | Col_4 |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| Jane | Miller | jane.miller@test.com | age:50 | sex:Male | NULL | NULL |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
| John | Smith | john.smith@test.com | phone:123-123-1234 | country:USA | NULL | NULL |
+-----------+----------+----------------------+--------------------+-------------+-------+-------+
Option 3
Assuming you do not know the columns, but you need the columns correctly named.
Attention: be aware that such an approach relies on dynamic SQL, which is never allowed within a VIEW or an inline TVF. This might be a major drawback...
This needs dynamic creation of a statement. I will create the statement of Option 1 but replace the fixed list with a dynamically created list:
DECLARE @DynamicColumns NVARCHAR(MAX)=
(
SELECT ',tbl.CustomFields.value(''(/root/' + A.cf.value('local-name(.)','nvarchar(max)') + ')[1]'',''nvarchar(max)'') AS ' + A.cf.value('local-name(.)','nvarchar(max)')
FROM #tbl AS tbl
CROSS APPLY tbl.CustomFields.nodes('/root/*') AS A(cf)
FOR XML PATH('')
);
DECLARE @DynamicSQL NVARCHAR(MAX)=
' SELECT tbl.ID
,tbl.FirstName
,tbl.LastName
,tbl.eMail'
+ @DynamicColumns +
' FROM #tbl AS tbl;'
EXEC(@DynamicSQL);
The result would be the same as in Option 1, but with a completely dynamic approach.
Cleanup
DROP TABLE #tbl;

Need help in a reverse pivot - Column names become data and then the values in that column

I am looking to pull data from a table and insert the results into a #temp table where the column name is part of the result set. I know I can get the column names from the schema information table but I need the data in one of the columns. There will be only 1 row from the original table, so I am basically doing a reverse STUFF command or reverse Pivot. The result set would be columnName and Value but multiple rows- as many rows as columns
So basically the result set or table will have just 2 columns, one for the column name and one for the value in that column. That is my goal. I know a pivot does this in reverse but can't seem to find a "Reverse pivot". I am using SQL Server 2008.
Any help would be appreciated. Thanks!
Are you able to give a better description of what you're after? For example, more information on the table structures, etc.
Regardless. Please see below an example of using a CROSS APPLY statement to transform a 'Pivot Table' into a flat table.
Data within the pivot table
+----+-----------+----------+----------------+
| Id | FirstName | LastName | Company |
+----+-----------+----------+----------------+
| 1 | Joe | Bloggs | A Company |
| 2 | Jane | Doe | Lost and Found |
+----+-----------+----------+----------------+
SQL statement to turn pivot table to flat table
IF OBJECT_ID('PivotedTable', 'U') IS NOT NULL
DROP TABLE PivotedTable
GO
CREATE TABLE PivotedTable (
Id INT IDENTITY,
FirstName VARCHAR(255),
LastName VARCHAR(255),
Company VARCHAR(255)
)
INSERT PivotedTable (FirstName, LastName, Company)
VALUES ('Joe', 'Bloggs', 'A Company'), ('Jane', 'Doe', 'Lost and Found')
SELECT
FlatTable.ColumnName,
FlatTable.Value
FROM PivotedTable t
CROSS APPLY (
VALUES
('FirstName', FirstName),
('LastName', LastName),
('Company', Company)
) FlatTable (ColumnName, Value)
Output of the query after turning into a flat table
+------------+----------------+
| ColumnName | Value |
+------------+----------------+
| FirstName | Joe |
| LastName | Bloggs |
| Company | A Company |
| FirstName | Jane |
| LastName | Doe |
| Company | Lost and Found |
+------------+----------------+
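SQL Server also has a built-in UNPIVOT operator, which is probably the closest thing to a "reverse pivot". Here is a rough sketch of the same transformation, using a table variable (@PivotedTable, made up for this example) in place of the real table:

```sql
-- Same sample data as above, held in a table variable
DECLARE @PivotedTable TABLE (
    Id INT IDENTITY,
    FirstName VARCHAR(255),
    LastName  VARCHAR(255),
    Company   VARCHAR(255)
);
INSERT @PivotedTable (FirstName, LastName, Company)
VALUES ('Joe', 'Bloggs', 'A Company'), ('Jane', 'Doe', 'Lost and Found');

-- UNPIVOT turns the listed columns into (ColumnName, Value) rows
SELECT u.Id, u.ColumnName, u.Value
FROM (SELECT Id, FirstName, LastName, Company FROM @PivotedTable) AS src
UNPIVOT (Value FOR ColumnName IN (FirstName, LastName, Company)) AS u;
```

Two differences from the CROSS APPLY version are worth keeping in mind: UNPIVOT requires all listed columns to have the same data type and length, and it silently drops rows where the value is NULL. If NULLs must be kept or the column types differ, the CROSS APPLY (VALUES ...) form with explicit casts is the more flexible choice.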

Update column based on other column values

I have a table that stores clients that are then grouped under one client who are considered the 'Head' of the group. So a client is considered a 'Head' if they appear in the 'Group' column, even though they may not be in their own 'Group'. The table may appear as follows:
+--------+-------+------+
| Client | Group | Head |
+--------+-------+------+
| ABC | ABC | Yes |
| DEF | ABC | No |
| GHI | GHI | Yes |
| JKL | MNO | Yes |
| MNO | PQR | Yes |
| PQR | MNO | No |
| STU | STU | Yes |
+--------+-------+------+
Here we can see that 'Head' records for client JKL and PQR are incorrect. What I need is a list of just the clients whose 'Head' column is incorrect and what it should be (Yes/1 or No/0). What is the best way to go about doing this?
Declare @a Table (Client Varchar(10),[Group] Varchar(10),Head Varchar(3))
Insert into @a Values ('ABC','ABC','YES')
Insert into @a Values ('DEF','ABC','No')
Insert into @a Values ('GHI','GHI','YES')
Insert into @a Values ('JKL','MNO','YES')
Insert into @a Values ('MNO','PQR','YES')
Insert into @a Values ('PQR','MNO','No')
Insert into @a Values ('STU','STU','YES')
-- uncomment here to get only wrong ones
--Select * from (
Select a.*, Coalesce(h.RealHead,'No') as RealHead
, CASE WHEN Coalesce(h.RealHead,'No')<>a.Head then 'ERROR' else 'OK' end as Info
FROM @a a
LEFT JOIN (Select Distinct [Group], 'YES' as RealHead from @a) h
ON h.[GROUP]=a.Client -- Join real Heads with clients
-- uncomment here to get only wrong ones
--) s where Info='ERROR'
Use a WHERE clause with a single equals sign (T-SQL has no == operator) and single-quoted string literals:
select * from tableName where Head = 'No';
You can then update the records to what you like:
UPDATE tableName
SET Head = 0
WHERE Head = 'No';
Hope this helps.
declare @t table(Client varchar(50), [Group] varchar(50), Head varchar(50))
insert into @t values( 'ABC','ABC','Yes'),
('DEF','ABC','No'),
('GHI','GHI','Yes'),
('JKL','MNO','Yes'),
('MNO','PQR','Yes'),
('PQR','MNO','No'),
('STU','STU','Yes')
select * from @t t1
where t1.client not in (select distinct [Group] from @t t3 where t3.Head = 'Yes' and t3.Client = t3.[group])
--and Head = 'Yes' --uncomment this line and check the result; let me know if it is not what you need
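For completeness, here is a compact sketch that returns only the mismatching clients together with the value Head should have. It relies on the rule stated in the question (a client is a head exactly when it appears in the Group column). The table variable @Clients and the aliases are illustrative names, not taken from the original schema:

```sql
DECLARE @Clients TABLE (Client varchar(10), [Group] varchar(10), Head varchar(3));
INSERT @Clients VALUES
    ('ABC','ABC','Yes'), ('DEF','ABC','No'), ('GHI','GHI','Yes'),
    ('JKL','MNO','Yes'), ('MNO','PQR','Yes'), ('PQR','MNO','No'),
    ('STU','STU','Yes');

-- ExpectedHead is 'Yes' when the client shows up in any row's Group column
SELECT c.Client,
       c.Head         AS CurrentHead,
       e.ExpectedHead
FROM @Clients AS c
CROSS APPLY (
    SELECT CASE WHEN EXISTS (SELECT 1 FROM @Clients AS g WHERE g.[Group] = c.Client)
                THEN 'Yes' ELSE 'No' END AS ExpectedHead
) AS e
WHERE c.Head <> e.ExpectedHead;
```

With the sample data from the question this should list JKL (should be No) and PQR (should be Yes), which are the two records the question calls incorrect.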

SQL Server 2008 Vertical data to Horizontal

I apologize for submitting another question on this topic, but I've read through many of the answers on this and I can't seem to get it to work for me.
I have three tables I need to join and pull info on. One of the tables is only 3 columns and stores the data vertically. I would like to transpose that data to a horizontal format.
The data will look like this if I just join and pull:
SELECT
a.app_id,
b.field_id,
c.field_name,
b.field_value
FROM table1 a
JOIN table2 b ON a.app_id = b.app_id
JOIN table3 c ON b.field_id = c.field_id --(table3 is a lookup table for field names)
Result:
app_id | field_id | field_name | field_value
-----------------------------------------------------
1234 | 101 | First Name | Joe
1234 | 102 | Last Name | Smith
1234 | 105 | DOB | 10/15/72
1234 | 107 | Mailing Addr | PO BOX 1234
1234 | 110 | Zip | 12345
1239 | 101 | First Name | Bob
1239 | 102 | Last Name | Johnson
1239 | 105 | DOB | 12/01/78
1239 | 107 | Mailing Addr | 1234 N Star Ave
1239 | 110 | Zip | 12456
Instead, I would like it to look like this:
app_id | First Name | Last Name | DOB | Mailing Addr | Zip
--------------------------------------------------------------------------
1234 | Joe | Smith | 10/15/72 | PO BOX 1234 | 12345
1239 | Bob | Johnson | 12/01/78 | 1234 N Star Ave | 12456
In the past, I just resorted to looking up all the field_id's I needed in my data and created CASE statements for each one. The app the users are using contains data for multiple products, and each product contains different fields. Considering the number of products supported and the number of fields for each product (many, many more than the basic example I showed, above) it takes a long time to look them up and write out huge chunks of CASE statements.
I was wondering if there's some cheat-code out there to achieve what I need without having to look up the field_ids and writing things out. I know the PIVOT function is likely what I'm looking for, however, I can't seem to get it to work correctly.
Think you guys could help out?
You can use the PIVOT function to convert your rows of data into columns.
Your original query can be used to retrieve all the data, the only change I would make to it would be to exclude the column b.field_id because this will alter the final display of the result.
If you have a known list of field_name values that you want to turn into columns, then you can hard-code your query:
select app_id,
[First Name], [Last Name], [DOB],
[Mailing Addr], [Zip]
from
(
SELECT
a.app_id,
c.field_name,
b.field_value
FROM table1 a
INNER JOIN table2 b
ON a.app_id = b.app_id
INNER JOIN table3 c
ON b.field_id = c.field_id
) d
pivot
(
max(field_value)
for field_name in ([First Name], [Last Name], [DOB],
[Mailing Addr], [Zip])
) piv;
See SQL Fiddle with Demo.
But if you are going to have an unknown number of values for field_name, then you will need to implement dynamic SQL to get the result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(Field_name)
from Table3
group by field_name, Field_id
order by Field_id
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT app_id,' + @cols + '
from
(
SELECT
a.app_id,
c.field_name,
b.field_value
FROM table1 a
INNER JOIN table2 b
ON a.app_id = b.app_id
INNER JOIN table3 c
ON b.field_id = c.field_id
) x
pivot
(
max(field_value)
for field_name in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo. Both of these will give the result:
| APP_ID | FIRST NAME | LAST NAME | DOB | MAILING ADDR | ZIP |
------------------------------------------------------------------------
| 1234 | Joe | Smith | 10/15/72 | PO Box 1234 | 12345 |
| 1239 | Bob | Johnson | 12/01/78 | 1234 N Star Ave | 12456 |
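On newer versions the column list can be built without the STUFF / FOR XML PATH trick. The following is only a sketch, assuming SQL Server 2017 or later (STRING_AGG is not available on 2008). The table variable @Table3 stands in for the lookup table:

```sql
-- Stand-in for the field-name lookup table
DECLARE @Table3 TABLE (field_id int, field_name nvarchar(50));
INSERT @Table3 VALUES (101, N'First Name'), (102, N'Last Name'), (105, N'DOB');

-- Build the quoted, comma-separated column list in field_id order.
-- The cast to nvarchar(max) avoids STRING_AGG's 8000-byte result limit.
DECLARE @cols nvarchar(max);
SELECT @cols = STRING_AGG(CAST(QUOTENAME(field_name) AS nvarchar(max)), N',')
               WITHIN GROUP (ORDER BY field_id)
FROM @Table3;

SELECT @cols AS PivotColumnList;
```

The resulting @cols string can then be spliced into the dynamic PIVOT statement exactly as in the query above.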
Try this
SELECT
[app_id]
,MAX([First Name]) AS [First Name]
,MAX([Last Name]) AS [Last Name]
,MAX([DOB]) AS [DOB]
,MAX([Mailing Addr]) AS [Mailing Addr]
,MAX([Zip]) AS [Zip]
FROM Table1
PIVOT
(
MAX([field_value]) FOR [field_name] IN ([First Name],[Last Name],[DOB],[Mailing Addr],[Zip])
) T
GROUP BY [app_id]
SQL FIDDLE DEMO
bluefeet's answer was the right one for me, but I needed distinct on the column list:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT Distinct ',' + QUOTENAME(Field_name)
from Table3
group by field_name, Field_id
order by ',' + QUOTENAME(Field_name)
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT app_id,' + @cols + '
from
(
SELECT
a.app_id,
c.field_name,
b.field_value
FROM table1 a
INNER JOIN table2 b
ON a.app_id = b.app_id
INNER JOIN table3 c
ON b.field_id = c.field_id
) x
pivot
(
max(field_value)
for field_name in (' + @cols + ')
) p '
execute sp_executesql @query;
This would solve using group by and MAX function, instead of pivot:
SELECT PK_ID, MAX(PHONE) AS PHONE, MAX(MAIL) AS MAIL
FROM (
SELECT
PK_ID,
CASE
WHEN CONTACT_ALIAS.CONTACT_TYPE = 'COMPANY' THEN CONTACT_ALIAS.CONTACT_VALUE
END AS PHONE ,
CASE
WHEN CONTACT_ALIAS.CONTACT_TYPE = 'BUSINESS' THEN CONTACT_ALIAS.CONTACT_VALUE
END AS MAIL
FROM T_CONTACT_EMPLOYERS CONTACT_ALIAS
WHERE CONTACT_ALIAS.CONTACT_TYPE IN ('COMPANY' , 'BUSINESS')
) TEMP
GROUP BY PK_ID
Use of SQL PIVOT:
SELECT [Id], [FirstName], [LastName], [Email]
FROM
(
SELECT Id, Att_Id, Att_Value FROM VerticalTable
) as source
PIVOT
(
MAX(Att_Value) FOR Att_Id IN ([FirstName], [LastName], [Email])
) as target

Resources