I'm tasked with importing data into SQL thats pretty much JSON but not quite . I've used OPENROWSET/OPENJSON to import into a staging table and the data looks like this
What I need to achieve is migrate that to a single table with the following structure
I'm having no success , I even trying updating the data in the staging table to look like this and import but no joy.
My current attempt:
SELECT A.[DATE], A.[VALUE]
FROM OPENJSON(#JSON) AS I
CROSS APPLY (
SELECT *
FROM OPENJSON (#JSON) WITH (
[DATE] NVARCHAR(MAX) '$.DATE',
[VALUE] NVARCHAR(MAX) '$.VALUE'
)
) A OUTPUT
Any recommendations ?
Just use this way:
CREATE TABLE #tmp (
instance NVARCHAR(50),
json NVARCHAR(1000)
)
INSERT #tmp
VALUES
( N'server1.com',
N'[{"date":10000, "value":"6"},{"date":20000, "value":"8"}]'
)
SELECT
t.instance, Date,Value
FROM #tmp t
OUTER APPLY OPENJSON(t.json)
WITH (
Date varchar(200) '$.date' ,
Value VARCHAR(100) '$.value'
)
For your first set of data, you have a doubly-nested JSON array, so you need to use OPENJSON to break open the outer one first:
SELECT
instance
JSON_VALUE(j1.innerArray, '$[0]') AS date,
JSON_VALUE(j1.innerArray, '$[1]') AS value
FROM table t
CROSS APPLY OPENJSON(t.json) WITH (
innerArray nvarchar(max) '$' AS JSON
) j1
For the second version, just change the JSON_VALUE parameters:
JSON_VALUE(j1.innerArray, '$.date') AS date,
JSON_VALUE(j1.innerArray, '$.value') AS value
Original answer:
The reason for the unexpected result is the fact, that the you have nested JSON arrays, but the WITH clause is not correct. You need to use the appropriate WITH clause, like the statement below:
Table:
SELECT *
INTO Data
FROM (VALUES
('server1.com', '[[1613347200, "7"], [1613347205, "8"], [1613347202, "9"]]'),
('server2.com', '[[1613317200, "3"], [1613347215, "2"], [1613347212, "1"]]')
) v (instance, array)
Statement:
SELECT d.instance, j.[date], j.[value]
FROM Data d
OUTER APPLY OPENJSON(d.array) WITH (
[date] numeric(10, 0) '$[0]',
[value] varchar(1) '$[1]'
) j
Result:
instance date value
-----------------------------
server1.com 1613347200 7
server1.com 1613347205 8
server1.com 1613347202 9
server2.com 1613317200 3
server2.com 1613347215 2
server2.com 1613347212 1
Update:
Your second attempt is almost correct. The reason for the NULL values is the fact, that the path part of the columns definitions in the WITH clause is case sensitive:
SELECT d.instance, j.[date], j.[value]
FROM (VALUES
('server1.com', '[{"date":1613347200, "value":"7"}, {"date":1613347200, "value":"8"}]')
) d (instance, array)
OUTER APPLY OPENJSON(d.array) WITH (
[date] numeric(10, 0) '$.date',
[value] varchar(1) '$.value'
) j
Related
So, my first post is less a question and more a statement! Sorry.
I needed to convert delimited strings stored in VarChar table columns to multiple/separate columns for the same record. (It's COTS software; so please don't bother telling me how the table is designed wrong.) After searching the internet ad nauseum for how to create a generic single line call to do that - and finding lots of how not to do that - I created my own. (The name is not real creative.)
Returns: A table with sequentially numbered/named columns starting with [Col1]. If an input value is not provided, then an empty string is returned. If less than 32 values are provided, all past the last value are returned as null. If more than 32 values are provided, they are ignored.
Prerequisites: A Number/Tally Table (luckily, our database already contained 'dbo.numbers').
Assumptions: Not more than 32 delimited values. (If you need more, change "WHERE tNumbers.Number BETWEEN 1 AND XXX", and add more prenamed columns ",[Col33]...,[ColXXX]".)
Issues: The very first column always gets populated, even if #InputString is NULL.
--======================================================================
--SMOZISEK 2017/09 CREATED
--======================================================================
CREATE FUNCTION dbo.fStringToPivotTable
(#InputString VARCHAR(8000)
,#Delimiter VARCHAR(30) = ','
)
RETURNS TABLE AS RETURN
WITH cteElements AS (
SELECT ElementNumber = ROW_NUMBER() OVER(PARTITION BY #InputString ORDER BY (SELECT 0))
,ElementValue = NodeList.NodeElement.value('.','VARCHAR(1022)')
FROM (SELECT TRY_CONVERT(XML,CONCAT('<X>',REPLACE(#InputString,#Delimiter,'</X><X>'),'</X>')) AS InputXML) AS InputTable
CROSS APPLY InputTable.InputXML.nodes('/X') AS NodeList(NodeElement)
)
SELECT PivotTable.*
FROM (
SELECT ColumnName = CONCAT('Col',tNumbers.Number)
,ColumnValue = tElements.ElementValue
FROM DBO.NUMBERS AS tNumbers --DEPENDENT ON ANY EXISTING NUMBER/TALLY TABLE!!!
LEFT JOIN cteElements AS tElements
ON tNumbers.Number = tElements.ElementNumber
WHERE tNumbers.Number BETWEEN 1 AND 32
) AS XmlSource
PIVOT (
MAX(ColumnValue)
FOR ColumnName
IN ([Col1] ,[Col2] ,[Col3] ,[Col4] ,[Col5] ,[Col6] ,[Col7] ,[Col8]
,[Col9] ,[Col10],[Col11],[Col12],[Col13],[Col14],[Col15],[Col16]
,[Col17],[Col18],[Col19],[Col20],[Col21],[Col22],[Col23],[Col24]
,[Col25],[Col26],[Col27],[Col28],[Col29],[Col30],[Col31],[Col32]
)
) AS PivotTable
;
GO
Test:
SELECT *
FROM dbo.fStringToPivotTable ('|Height|Weight||Length|Width||Color|Shade||Up|Down||Top|Bottom||Red|Blue|','|') ;
Usage:
SELECT 1 AS ID,'Title^FirstName^MiddleName^LastName^Suffix' AS Name
INTO #TempTable
UNION SELECT 2,'Mr.^Scott^A.^Mozisek^Sr.'
UNION SELECT 3,'Ms.^Jane^Q.^Doe^'
UNION SELECT 5,NULL
UNION SELECT 7,'^Betsy^^Ross^'
;
SELECT SourceTable.*
,ChildTable.Col1 AS ColTitle
,ChildTable.Col2 AS ColFirst
,ChildTable.Col3 AS ColMiddle
,ChildTable.Col4 AS ColLast
,ChildTable.Col5 AS ColSuffix
FROM #TempTable AS SourceTable
OUTER APPLY dbo.fStringToPivotTable(SourceTable.Name,'^') AS ChildTable
;
No, I have not tested any plan (I just needed it to work).
Oh, yeah: SQL Server 2012 (12.0 SP2)
Comments? Corrections? Enhancements?
Here is my TVF. Easy to expand up to the 32 (the pattern is pretty clear).
This is a straight XML without the cost of the PIVOT.
Example - Notice the OUTER APPLY --- Use CROSS APPLY to Exclude NULLs
Select A.ID
,B.*
From #TempTable A
Outer Apply [dbo].[tvf-Str-Parse-Row](A.Name,'^') B
Returns
The UDF if Interested
CREATE FUNCTION [dbo].[tvf-Str-Parse-Row] (#String varchar(max),#Delimiter varchar(10))
Returns Table
As
Return (
Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
,Pos8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
,Pos9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(#String,#Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
Where #String is not null
)
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[tvf-Str-Parse-Row]('Dog,Cat,House,Car',',')
--Select * from [dbo].[tvf-Str-Parse-Row]('John <test> Cappelletti',' ')
My company is using a generic logging database among many products. To prevent the need for a lot of cross database queries some info is stored in delimited fields within the generic data columns for the logging.
I'm wanting to write query's on the data, but I'm unsure how to use Pivot/Unpivot to get the data into appropriate columns?
Below is a generic example using static data for what I'm wanting to do, but not sure how to do it. We unfortunately don't have the built in split string function in SqlServer 2016 so dbo.fnSplitString is my written equivalent which works fine.
DECLARE #Columns TABLE (
CustomerNumber VARCHAR(MAX) NOT NULL,
FirstName VARCHAR(MAX) NOT NULL,
LastName VARCHAR(MAX) NOT NULL);
/* This isn't valid SQL ... unsure how to get this to work */
INSERT INTO #Columns PIVOT SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',');
SELECT * FROM #Columns;
Edit:
Using the examples here and https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx I was able to come up with a solution. The split string function also needs to output a position. This was inspired by one of the solutions below just 'OrdinalPosition' needed to be added to the function. The resulting query works.
DECLARE #Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO #Columns select [0], [1], [2] from (SELECT position, splitdata FROM dbo.fnSplitString('STUFF1,STUFF4,STUFF3',',')) split pivot (MAX(splitdata) FOR position in ([0],[1],[2])) piv;
SELECT * FROM #Columns;
Is this what you are looking for:
I just used "item" as column name. Replace it with the name corresponding to your name that is returned from fnSplitString function.
DECLARE #Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName
VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO #Columns
select STUFF1,STUFF2,STUFF3 from (SELECT item FROM
dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')) d
pivot ( max(item) for item in (STUFF1,STUFF2,STUFF3) ) piv;
SELECT * FROM #Columns;
If your function fnSplitString returns a table then simply
INSERT INTO #Columns
SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')
PIVOT
(
MAX(ParsedValue)
FOR OrdinalPosition in ([0], [1], [2], [3])
) x
I know we can use LIKE for pattern matching, however, here is what want to do.
I have a table, which has a column, 'Pattern', the values are like:
host1%
%host2
....
I have another table, which has a column, 'Host'. The question is: how can I check whether the values in 'Host' table do not match any patterns in 'Pattern'?
If it is too complex, then a simplified question is: How can I check whether the values in 'Host' do not StartWith any strings in 'Pattern'?
We can use loop, but is there a better way? ideally, it should work for ql server 2008, but latest version will do.
thanks
Use where not exists followed by a subquery which checks each pattern against the current row of the table containing your data. i.e.
where not exists
(
select top 1 1
from #patterns p
where d.datum like p.pattern
)
Full Code for Working Example: SQL Fiddle
declare #patterns table
(
pattern nvarchar(16) not null
)
declare #data table
(
datum nvarchar(16) not null
)
insert #patterns
values ('host1%')
,('%host2')
insert #data
values ('host1234')
, ('234host1')
, ('host2345')
, ('345host2')
select *
from #data d
where not exists
(
select top 1 1
from #patterns p
where d.datum like p.pattern
)
select t1.host
from table_1 t1
left join table_2 t2 on t1.host like t2.pattern
where t2.pattern is null
I have a SourceTable and a table variable #TQueries containing various T-SQL predicates that target SourceTable.
The expected result is to dynamically generate SELECT statements that return a list of Id's as specified by the predicates in #TQueries. Each dynamically generated SELECT statement also needs to execute in a particular order, and the final set of values needs to be unique and the ordering must be preserved.
Fortunately, there's a limit to how many values need to be retrieved and how many dynamic queries need to be generated. The Id list should contain at most 10 Ids, and we don't expect more than 7 queries.
The following is a sample of this setup, not the actual data/database:
-- Set up some test data, this is quick and dirty just to provide some data to test against
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[SourceTable]') AND type in (N'U'))
BEGIN
-- Create a numbers table, sorta
SELECT TOP 20
IDENTITY(INT,1,1) AS Id,
ABS(CHECKSUM(NewId())) % 100 AS [SomeValue]
INTO [SourceTable]
FROM sysobjects a
END
DECLARE #TQueries TABLE (
[Ordinal] INT,
[WherePredicate] NVARCHAR(MAX),
[OrderByPredicate] NVARCHAR(MAX)
);
-- Simulate SELECTs with different order by that get different data due to varying WHERE clauses and ORDER conditions
INSERT INTO #TQueries VALUES ( 1, N'[Id] IN (6,11,13,7,10,3,15)', '[SomeValue] ASC' ) -- Sort Asc
INSERT INTO #TQueries VALUES ( 2, N'[Id] IN (9,15,14,20,17)', '[SomeValue] DESC' ) -- Sort Desc
INSERT INTO #TQueries VALUES ( 3, N'[Id] IN (20,10,1,16,11,19,9,15,17,6,2,3,13)', 'NEWID()' ) -- Sort Random
My main issue has been avoiding the use of a CURSOR or iterating through the rows one by one. The closest I've come to a set operation that meets this criteria is using a table variable to store the results of each query or a massive CTE.
Suggestions and comments are welcome.
Here's a solution that builds a single statement both to run all the queries and to return the results.
It uses a similar approach as in your answer when iterating over the #TQueries table, i.e. it also uses {...} tokens where column values from #TQuery should go, and it puts the values there with nested REPLACE() calls.
Other than that, it heavily depends on ranking functions, and I'm not sure if doesn't really abuse them. You'd need to test this method before deciding if it's better or worse than the one you've got so far.
DECLARE #QueryTemplate nvarchar(max), #FinalSQL nvarchar(max);
SET #QueryTemplate =
N'SELECT
[Id],
QueryRank = {Ordinal},
RowRank = ROW_NUMBER() OVER (ORDER BY {OrderByPredicate})
FROM [dbo].[SourceTable]
WHERE {WherePredicate}
';
SET #FinalSQL =
N'WITH AllData AS (
' +
SUBSTRING(
(
SELECT
'UNION ALL ' +
REPLACE(REPLACE(REPLACE(#QueryTemplate,
'{Ordinal}' , [Ordinal] ),
'{OrderByPredicate}', [OrderByPredicate]),
'{WherePredicate}' , [WherePredicate] )
FROM #TQueries
ORDER BY [Ordinal]
FOR XML PATH (''), TYPE
).value('.', 'nvarchar(max)'),
11, -- starting just after the first 'UNION ALL '
CAST(0x7FFFFFFF AS int) -- max int; no need to specify the exact length
) +
'),
RankedData AS (
SELECT
[Id],
QueryRank,
RowRank,
ValueRank = ROW_NUMBER() OVER (PARTITION BY [Id] ORDER BY QueryRank)
FROM AllData
)SELECT TOP (#top)
[Id]
FROM RankedData
WHERE ValueRank = 1
ORDER BY
QueryRank,
RowRank
';
PRINT #FinalSQL;
EXECUTE sp_executesql #FinalSQL, N'#top int', 10;
Basically, every subquery gets these auxiliary columns:
QueryRank – a constant value (within the subquery's result set) derived from [Ordinal];
RowRank – a ranking assigned to a row based on the [OrderByPredicate].
The result sets are UNIONed and then every entry of every unique value is again ranked (ValueRank) based on the query ranking.
When pulling the final result set, duplicates are suppressed (by the condition ValueRank = 1), and QueryRank and RowRank are used in the ORDER BY clause to preserve the original row order.
I used EXECUTE sp_executesql #query instead of EXECUTE (#query), because the former allows you to add parameters to the query. In particular, I parametrised the number of results to return (the argument of TOP). But you could certainly concatenate that value into the dynamic script directly, just like other things, if you prefer EXECUTE () over EXECUTE sq_executesql.
If you like, you can try this query at SQL Fiddle. (Note: the SQL Fiddle version replaces the #TQueries table variable with the TQueries table.)
This is what I've managed to piece together cobbled from my original response and improved upon by comments from #AndriyM
DECLARE #sql_prefix NVARCHAR(MAX);
SET #sql_prefix =
N'DECLARE #TResults TABLE (
[Ordinal] INT IDENTITY(1,1),
[ContentItemId] INT
);
DECLARE #max INT, #top INT;
SELECT #max = 10;';
DECLARE #sql_insert_template NVARCHAR(MAX), #sql_body NVARCHAR(MAX);
SET #sql_insert_template =
N'SELECT #top = #max - COUNT(*) FROM #TResults;
INSERT INTO #TResults
SELECT TOP (#top) [Id]
FROM [dbo].[SourceTable]
WHERE
{WherePredicate}
AND NOT EXISTS (
SELECT 1
FROM #TResults AS [tr]
WHERE [tr].[ContentItemId] = [SourceTable].[Id]
)
ORDER BY {OrderByPredicate};';
WITH Query ([Ordinal],[SqlCommand]) AS (
SELECT
[Ordinal],
REPLACE(REPLACE(#sql_insert_template, '{WherePredicate}', [WherePredicate]), '{OrderByPredicate}', [OrderByPredicate])
FROM #TQueries
)
SELECT
#sql_body = #sql_prefix + (
SELECT [SqlCommand]
FROM Query
ORDER BY [Ordinal] ASC
FOR XML PATH(''),TYPE).value('.', 'varchar(max)') + CHAR(13)+CHAR(10)
+N' SELECT * FROM #TResults ORDER BY [Ordinal]';
EXEC(#sql_body);
The basic idea is to use a table variable to hold the results of each query. I create a template for the SQL and replace the values in the template based on what is stored in #TQueries.
Once the entire script is completed I run it with EXEC.
I have table with data 1/1 to 1/20 in one column. I want the value 1 to 20 i.e value after '/'(front slash) is updated into other column in same table in SQL Server.
Example:
Column has value 1/1,1/2,1/3...1/20
new Column value 1,2,3,..20
That is, I want to update this new column.
Try this:
UPDATE YourTable
SET Col2 = RIGHT(Col1,LEN(Col1)-CHARINDEX('/',Col1))
Please find the below query also split the string with delimeter.
Select Substring(#String1,0,CharIndex(#delimeter,#String1))
From: http://www.sql-server-helper.com/error-messages/msg-536.aspx
To use function LEFT if not all data is in the form '1/12' you need this in the second line above:
Set Col2 = LEFT(Col1, ISNULL(NULLIF(CHARINDEX('/', Col1) - 1, -1), LEN(Col1)))
SELECT SUBSTRING(ParentBGBU,0,CHARINDEX('-',ParentBGBU,0)) FROM dbo.tblHCMMaster;
I know this question is specific to sql server, but I'm using postgresql and came across this question, so for anybody else in a similar situation, there is the split_part(string text, delimiter text, field int) function.
Maybe something like this:
First some test data:
DECLARE #tbl TABLE(Column1 VARCHAR(100))
INSERT INTO #tbl
SELECT '1/1' UNION ALL
SELECT '1/20' UNION ALL
SELECT '1/2'
Then like this:
SELECT
SUBSTRING(tbl.Column1,CHARINDEX('/',tbl.Column1)+1,LEN(tbl.Column1))
FROM
#tbl AS tbl
SELECT emp.LoginID, emp.JobTitle, emp.BirthDate, emp.ModifiedDate ,
CASE WHEN emp.JobTitle NOT LIKE '%Document Control%' THEN emp.JobTitle
ELSE SUBSTRING(emp.JobTitle,CHARINDEX('Document Control',emp.JobTitle),LEN('Document Control'))
END
,emp.gender,emp.MaritalStatus
FROM HumanResources.Employee [emp]
WHERE JobTitle LIKE '[C-F]%'
Use CHARINDEX. Perhaps make user function. If you use this split often.
I would create this function:
CREATE FUNCTION [dbo].[Split]
(
#String VARCHAR(max),
#Delimiter varCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#String) AS endpos
UNION ALL
SELECT endpos+1, CHARINDEX(#Delimiter,#String,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT 'INT_COLUMN' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
'STRING_COLUMN' = SUBSTRING(#String,stpos,COALESCE(NULLIF(endpos,0),LEN(#String)+1)-stpos)
FROM Split
)
GO