Transact SQL Pivot a Split String into appropriate columns

Transact SQL Pivot a Split String into appropriate columns - sql-server

My company is using a generic logging database among many products. To prevent the need for a lot of cross database queries some info is stored in delimited fields within the generic data columns for the logging.
I'm wanting to write query's on the data, but I'm unsure how to use Pivot/Unpivot to get the data into appropriate columns?
Below is a generic example using static data for what I'm wanting to do, but not sure how to do it. We unfortunately don't have the built in split string function in SqlServer 2016 so dbo.fnSplitString is my written equivalent which works fine.
DECLARE #Columns TABLE (
CustomerNumber VARCHAR(MAX) NOT NULL,
FirstName VARCHAR(MAX) NOT NULL,
LastName VARCHAR(MAX) NOT NULL);
/* This isn't valid SQL ... unsure how to get this to work */
INSERT INTO #Columns PIVOT SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',');
SELECT * FROM #Columns;
Edit:
Using the examples here and https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx I was able to come up with a solution. The split string function also needs to output a position. This was inspired by one of the solutions below just 'OrdinalPosition' needed to be added to the function. The resulting query works.
DECLARE #Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO #Columns select [0], [1], [2] from (SELECT position, splitdata FROM dbo.fnSplitString('STUFF1,STUFF4,STUFF3',',')) split pivot (MAX(splitdata) FOR position in ([0],[1],[2])) piv;
SELECT * FROM #Columns;

Is this what you are looking for:
I just used "item" as column name. Replace it with the name corresponding to your name that is returned from fnSplitString function.
DECLARE #Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName
VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO #Columns
select STUFF1,STUFF2,STUFF3 from (SELECT item FROM
dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')) d
pivot ( max(item) for item in (STUFF1,STUFF2,STUFF3) ) piv;
SELECT * FROM #Columns;

If your function fnSplitString returns a table then simply
INSERT INTO #Columns
SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')
PIVOT
(
MAX(ParsedValue)
FOR OrdinalPosition in ([0], [1], [2], [3])
) x

Related

Not able to parse JSON in synapse SQL with OPENJSON

This question has a reference to my earlier SO thread. Here it is.
In a nutshell, I am trying to parse a JSON input in Synapse SQL.
DECLARE #json nvarchar(max)
SET #json= '{"value": "{\"value\":[{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},
{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},"type": "String"}';
DECLARE #ReplacedDetails nvarchar(max), #ReplacedStringDetails nvarchar(max)
SET #ReplacedDetails = REPLACE(LTRIM(RTRIM(#json)),'\','');
SET #ReplacedStringDetails = REPLACE(#ReplacedDetails,',"type": "String"','');
SELECT #ReplacedStringDetails
CREATE TABLE #ValueTable_15
(
ColumnName varchar(200),
LastUpdatedValue varchar(200)
);
INSERT INTO #ValueTable_15 (ColumnName,LastUpdatedValue)
SELECT TOP(1) j2.[key],TRY_PARSE(j2.[value] as bigint) AS LastUpdatedValue
FROM OPENJSON(#ReplacedStringDetails, '$.value.value') j1
CROSS APPLY OPENJSON(j1.value) j2
ORDER BY LastUpdatedValue DESC;
Then when I am running the above query, I am getting error:
Microsoft][ODBC Driver 17 for SQL Server][SQL Server]JSON text is not properly formatted. Unexpected character 'v' is found at position 13
When I am simply trying to SELECT #ReplaceStringDetails it is giving the expected results.
What I am missing here?
P.S. I replaced $.value.value with simple $.value, but yielding no result.

As noted in the comments it appears that the characters ]}" got removed before ,"type": "String"}. If you correct that edit you should be able to parse the JSON using the normal OPENJSON() table-valued function, e.g.:
DECLARE #json nvarchar(max)
SET #json= '{"value": "{\"value\":[{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"}]}","type": "String"}';
SELECT [embedded].*
FROM OPENJSON(#json) WITH (
[value] nvarchar(max) --nope: AS JSON
) [wrapper]
CROSS APPLY OPENJSON(wrapper.value, '$.value') WITH (
[ERDAT] nvarchar(10)
) [embedded];
Which returns the results...
ERDAT
20210511
20210511
20210511
20210511

SQL Server 2016 - Array into separate columns

I'm tasked with importing data into SQL thats pretty much JSON but not quite . I've used OPENROWSET/OPENJSON to import into a staging table and the data looks like this
What I need to achieve is migrate that to a single table with the following structure
I'm having no success , I even trying updating the data in the staging table to look like this and import but no joy.
My current attempt:
SELECT A.[DATE], A.[VALUE]
FROM OPENJSON(#JSON) AS I
CROSS APPLY (
SELECT *
FROM OPENJSON (#JSON) WITH (
[DATE] NVARCHAR(MAX) '$.DATE',
[VALUE] NVARCHAR(MAX) '$.VALUE'
)
) A OUTPUT
Any recommendations ?

Just use this way:
CREATE TABLE #tmp (
instance NVARCHAR(50),
json NVARCHAR(1000)
)
INSERT #tmp
VALUES
( N'server1.com',
N'[{"date":10000, "value":"6"},{"date":20000, "value":"8"}]'
)
SELECT
t.instance, Date,Value
FROM #tmp t
OUTER APPLY OPENJSON(t.json)
WITH (
Date varchar(200) '$.date' ,
Value VARCHAR(100) '$.value'
)

For your first set of data, you have a doubly-nested JSON array, so you need to use OPENJSON to break open the outer one first:
SELECT
instance
JSON_VALUE(j1.innerArray, '$[0]') AS date,
JSON_VALUE(j1.innerArray, '$[1]') AS value
FROM table t
CROSS APPLY OPENJSON(t.json) WITH (
innerArray nvarchar(max) '$' AS JSON
) j1
For the second version, just change the JSON_VALUE parameters:
JSON_VALUE(j1.innerArray, '$.date') AS date,
JSON_VALUE(j1.innerArray, '$.value') AS value

Original answer:
The reason for the unexpected result is the fact, that the you have nested JSON arrays, but the WITH clause is not correct. You need to use the appropriate WITH clause, like the statement below:
Table:
SELECT *
INTO Data
FROM (VALUES
('server1.com', '[[1613347200, "7"], [1613347205, "8"], [1613347202, "9"]]'),
('server2.com', '[[1613317200, "3"], [1613347215, "2"], [1613347212, "1"]]')
) v (instance, array)
Statement:
SELECT d.instance, j.[date], j.[value]
FROM Data d
OUTER APPLY OPENJSON(d.array) WITH (
[date] numeric(10, 0) '$[0]',
[value] varchar(1) '$[1]'
) j
Result:
instance date value
-----------------------------
server1.com 1613347200 7
server1.com 1613347205 8
server1.com 1613347202 9
server2.com 1613317200 3
server2.com 1613347215 2
server2.com 1613347212 1
Update:
Your second attempt is almost correct. The reason for the NULL values is the fact, that the path part of the columns definitions in the WITH clause is case sensitive:
SELECT d.instance, j.[date], j.[value]
FROM (VALUES
('server1.com', '[{"date":1613347200, "value":"7"}, {"date":1613347200, "value":"8"}]')
) d (instance, array)
OUTER APPLY OPENJSON(d.array) WITH (
[date] numeric(10, 0) '$.date',
[value] varchar(1) '$.value'
) j

Creating a Table Valued Function with dynamic columns

I received help previously on creating a query for listing total sales by month for a list of jobs. However, I cannot put this into a view as a variable needs declared.
I thought I could use a Table Valued Function (which I've never used before) but apparently I need to define the columns of the table which I cannot do if they are to be dynamic (the columns are yyyy-mm).
Can someone suggest how I could approach this problem please?
I am not familiar with creating anything other than views so this is completely new to me. Obviously if using the TVF isn't the best method please advise.
The query:
Declare #SQL varchar(max) = Stuff((Select Distinct ',' + QuoteName(YearMonth) From v_JobSalesByMonth Order by 1 For XML Path('')),1,1,'')
Select #SQL = '
Select [JobID], [TotalJobSales],' + #SQL + '
From (
Select JobID
,TotalJobSales = sum(SalesForMonth) over (Partition By JobID)
,YearMonth
,SalesForMonth
From v_JobSalesByMonth A) A
Pivot (sum(SalesForMonth) For [YearMonth] in (' + #SQL + ') ) p'
Exec(#SQL);
I started to create the function but am confused about how to define the dynamic columns.
CREATE FUNCTION db.tvfnJobSalesByMonthPivot (#JobID INT)
RETURNS #JobSalesByMonth TABLE
(
JobID int PRIMARY KEY NOT NULL,
TotalJobSales decimal(6,2) NULL,
2016-12 (ermmmmm?)
)
Many thanks in advance.
Paul

I am not getting values by passing variable using IN query in SQL

I am passing string values from my code like '12th Standard/Ordinary National Diploma,Higher National Diploma' to SQL query, but I am not getting any values and nothing showing any result.
My SQL query:
declare #qua varchar(250),#final varchar(250),#Qualification varchar(250)
set #Qualification= '12th Standard/Ordinary National Diploma,Higher National Diploma'
set #qua =replace(#Qualification,',',''',''')
set #final= ''''+#qua+''''
select * from mytablename in(#final)
Result: Data is not displaying
Thank you in advance.

Instead do it using a table variable like
declare #tbl table(qual varchar(250));
insert into #tbl
select '12th Standard/Ordinary National Diploma'
union
select 'Higher National Diploma';
select * from mytablename where somecolumn in(select qual from #tbl);

Despite trying to put quote marks in there, you're still only passing a single string to the IN. The string just contains embedded quotes and SQL Server is looking for that single long string.
You also don't seem to be comparing a column for the IN.
Your best bet is to pass in multiple string variables, but if that's not possible then you'll have to write a function that parses a single string into a resultset and use that. For example:
SELECT
Column1, -- Because we never use SELECT *
Column2
FROM
MyTableName
WHERE
qualification IN (SELECT qualification FROM dbo.fn_ParseString(#qualifications))

You can insert all your search criteria in one table and then can easily do a lookup on the main table, example below:
DECLARE #MyTable TABLE (Name VARCHAR(10), Qualification VARCHAR(50))
DECLARE #Search TABLE (Qualifications VARCHAR(50))
INSERT INTO #MyTable VALUES ('User1','12th Standard'), ('User2','Some Education'),
('User3','Ordinary National Diploma'), ('User4','Some Degree'),
('User5','Higher National Diploma')
INSERT INTO #Search VALUES ('12th Standard'),('Ordinary National Diploma'),('Higher National Diploma')
SELECT MT.*
FROM #MyTable MT
INNER JOIN (SELECT Qualifications FROM #Search) S ON S.Qualifications = MT.Qualification

As previous said, you are passing a string with commas, not comma separated values. It needs to be split up into separate values.
You can do this by passing the qualification string into XML which you can use to turn it into separate rows of data.
The IN parameter will then accept the data as separate values.
DECLARE #Qualifications as varchar(150) = '12th Standard/Ordinary National Diploma,Higher National Diploma'
Declare #Xml XML;
SET #Xml = N'<root><r>' + replace(#Qualifications, char(44),'</r><r>') + '</r></root>';
select *
from MyTableName
Where MyTableName.Qualification in
(select r.value('.','varchar(max)') as item
from #Xml.nodes('//root/r') as records(r))

Alternatively you can create a table-valued function that splits according to input like in your case its ',' and then INNER JOIN with the returnColumnname and that particular column that you want to filter
SELECT COLUMNS, . . . .
FROM MyTableName mtn
INNER JOIN dbo.FNASplitToTable(#qualifications, ',') csvTable
ON csvTable.returnColumnName = mtn.somecolumn
Table Valued function might be like:
CREATE FUNCTION dbo.FNASplitToTable (#string varchar(MAX), #splitType CHAR(1))
RETURNS #result TABLE(Value VARCHAR(100))
AS
BEGIN
DECLARE #x XML
SELECT #x = CAST('<A>' + REPLACE(#string, #splitType, '</A><A>') + '</A>' AS XML)
INSERT INTO #result
SELECT LTRIM(t.value('.', 'VARCHAR(100)')) AS inVal
FROM #x.nodes('/A') AS x(t)
RETURN
END
GO

Execute multiple dynamic T-SQL statements and obtain a limited number of unique values while preserving order

I have a SourceTable and a table variable #TQueries containing various T-SQL predicates that target SourceTable.
The expected result is to dynamically generate SELECT statements that return a list of Id's as specified by the predicates in #TQueries. Each dynamically generated SELECT statement also needs to execute in a particular order, and the final set of values needs to be unique and the ordering must be preserved.
Fortunately, there's a limit to how many values need to be retrieved and how many dynamic queries need to be generated. The Id list should contain at most 10 Ids, and we don't expect more than 7 queries.
The following is a sample of this setup, not the actual data/database:
-- Set up some test data, this is quick and dirty just to provide some data to test against
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[SourceTable]') AND type in (N'U'))
BEGIN
-- Create a numbers table, sorta
SELECT TOP 20
IDENTITY(INT,1,1) AS Id,
ABS(CHECKSUM(NewId())) % 100 AS [SomeValue]
INTO [SourceTable]
FROM sysobjects a
END
DECLARE #TQueries TABLE (
[Ordinal] INT,
[WherePredicate] NVARCHAR(MAX),
[OrderByPredicate] NVARCHAR(MAX)
);
-- Simulate SELECTs with different order by that get different data due to varying WHERE clauses and ORDER conditions
INSERT INTO #TQueries VALUES ( 1, N'[Id] IN (6,11,13,7,10,3,15)', '[SomeValue] ASC' ) -- Sort Asc
INSERT INTO #TQueries VALUES ( 2, N'[Id] IN (9,15,14,20,17)', '[SomeValue] DESC' ) -- Sort Desc
INSERT INTO #TQueries VALUES ( 3, N'[Id] IN (20,10,1,16,11,19,9,15,17,6,2,3,13)', 'NEWID()' ) -- Sort Random
My main issue has been avoiding the use of a CURSOR or iterating through the rows one by one. The closest I've come to a set operation that meets this criteria is using a table variable to store the results of each query or a massive CTE.
Suggestions and comments are welcome.

Here's a solution that builds a single statement both to run all the queries and to return the results.
It uses a similar approach as in your answer when iterating over the #TQueries table, i.e. it also uses {...} tokens where column values from #TQuery should go, and it puts the values there with nested REPLACE() calls.
Other than that, it heavily depends on ranking functions, and I'm not sure if doesn't really abuse them. You'd need to test this method before deciding if it's better or worse than the one you've got so far.
DECLARE #QueryTemplate nvarchar(max), #FinalSQL nvarchar(max);
SET #QueryTemplate =
N'SELECT
[Id],
QueryRank = {Ordinal},
RowRank = ROW_NUMBER() OVER (ORDER BY {OrderByPredicate})
FROM [dbo].[SourceTable]
WHERE {WherePredicate}
';
SET #FinalSQL =
N'WITH AllData AS (
' +
SUBSTRING(
(
SELECT
'UNION ALL ' +
REPLACE(REPLACE(REPLACE(#QueryTemplate,
'{Ordinal}' , [Ordinal] ),
'{OrderByPredicate}', [OrderByPredicate]),
'{WherePredicate}' , [WherePredicate] )
FROM #TQueries
ORDER BY [Ordinal]
FOR XML PATH (''), TYPE
).value('.', 'nvarchar(max)'),
11, -- starting just after the first 'UNION ALL '
CAST(0x7FFFFFFF AS int) -- max int; no need to specify the exact length
) +
'),
RankedData AS (
SELECT
[Id],
QueryRank,
RowRank,
ValueRank = ROW_NUMBER() OVER (PARTITION BY [Id] ORDER BY QueryRank)
FROM AllData
)SELECT TOP (#top)
[Id]
FROM RankedData
WHERE ValueRank = 1
ORDER BY
QueryRank,
RowRank
';
PRINT #FinalSQL;
EXECUTE sp_executesql #FinalSQL, N'#top int', 10;
Basically, every subquery gets these auxiliary columns:
QueryRank – a constant value (within the subquery's result set) derived from [Ordinal];
RowRank – a ranking assigned to a row based on the [OrderByPredicate].
The result sets are UNIONed and then every entry of every unique value is again ranked (ValueRank) based on the query ranking.
When pulling the final result set, duplicates are suppressed (by the condition ValueRank = 1), and QueryRank and RowRank are used in the ORDER BY clause to preserve the original row order.
I used EXECUTE sp_executesql #query instead of EXECUTE (#query), because the former allows you to add parameters to the query. In particular, I parametrised the number of results to return (the argument of TOP). But you could certainly concatenate that value into the dynamic script directly, just like other things, if you prefer EXECUTE () over EXECUTE sq_executesql.
If you like, you can try this query at SQL Fiddle. (Note: the SQL Fiddle version replaces the #TQueries table variable with the TQueries table.)

This is what I've managed to piece together cobbled from my original response and improved upon by comments from #AndriyM
DECLARE #sql_prefix NVARCHAR(MAX);
SET #sql_prefix =
N'DECLARE #TResults TABLE (
[Ordinal] INT IDENTITY(1,1),
[ContentItemId] INT
);
DECLARE #max INT, #top INT;
SELECT #max = 10;';
DECLARE #sql_insert_template NVARCHAR(MAX), #sql_body NVARCHAR(MAX);
SET #sql_insert_template =
N'SELECT #top = #max - COUNT(*) FROM #TResults;
INSERT INTO #TResults
SELECT TOP (#top) [Id]
FROM [dbo].[SourceTable]
WHERE
{WherePredicate}
AND NOT EXISTS (
SELECT 1
FROM #TResults AS [tr]
WHERE [tr].[ContentItemId] = [SourceTable].[Id]
)
ORDER BY {OrderByPredicate};';
WITH Query ([Ordinal],[SqlCommand]) AS (
SELECT
[Ordinal],
REPLACE(REPLACE(#sql_insert_template, '{WherePredicate}', [WherePredicate]), '{OrderByPredicate}', [OrderByPredicate])
FROM #TQueries
)
SELECT
#sql_body = #sql_prefix + (
SELECT [SqlCommand]
FROM Query
ORDER BY [Ordinal] ASC
FOR XML PATH(''),TYPE).value('.', 'varchar(max)') + CHAR(13)+CHAR(10)
+N' SELECT * FROM #TResults ORDER BY [Ordinal]';
EXEC(#sql_body);
The basic idea is to use a table variable to hold the results of each query. I create a template for the SQL and replace the values in the template based on what is stored in #TQueries.
Once the entire script is completed I run it with EXEC.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Transact SQL Pivot a Split String into appropriate columns - sql-server

If your function fnSplitString returns a table then simply INSERT INTO #Columns SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',') PIVOT ( MAX(ParsedValue) FOR OrdinalPosition in ([0], [1], [2], [3]) ) x

Related

Not able to parse JSON in synapse SQL with OPENJSON

SQL Server 2016 - Array into separate columns

Creating a Table Valued Function with dynamic columns

I am not getting values by passing variable using IN query in SQL

Execute multiple dynamic T-SQL statements and obtain a limited number of unique values while preserving order

Categories

Resources