SQLServer JSON column data to a temporary table - sql-server

I am using SQL Server 2016. The column in question contains JSON. It always stores data in the following format:
{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}
So multiple rows will have the same structure with different values.
Is there a way I can retrieve it as a table, or put it into a temporary table? The final output would look like:
question1 | question2 | reference-id|....
123 | 123 | Z6SIPLGKE56
456 | 456 | Z6SWFLGKE56
The end result I am looking for is to export the results to a CSV. I can do this outside of SQL Server, but was wondering whether it is possible with built-in features of SQL Server. (From the searches I have done so far, it seems the available functions such as OPENJSON do not let you do this in one pass.)
UPDATE 1 - since more details were requested by commenters
This is a survey application, so users can design their own surveys. The structure is stored as JSON. As a start, let's assume each survey has a fixed set of questions (e.g. Survey 1 has 5 questions whereas Survey 2 has 10 questions).
Now, let's say two users fill the survey 1. Sample data if visualized in json is as follows:
from user 1:
{"forms-survey-client-reference-id":"RYRT4ZU1ZO","question1":"ans1","question2":"ans2"....}
from user 2
{"forms-survey-client-reference-id":"RYRT4ZU1FE","question1":"asdf","question2":"dfhdsf"....}
So the CSV output for this survey has to be: (ignore the column order)
question1 | question2 | reference-id|....
asdf | dfhdsf | RYRT4ZU1FE
ans1 | ans2 | RYRT4ZU1ZO
Now consider survey 2 has the following structure after submitting data from:
User 1
{"forms-survey-client-reference-id":"RYRT4ZU1ZO","question1":"ans1","question2":"opt1,opt2,opt3"....}
User 2
{"forms-survey-client-reference-id":"RYRT4ABCZO","question1":"ans1","question2":"opt1,opt2"....}
Notice that for question 2 the users have selected multiple answers (checkboxes), and these are stored as a plain comma-separated string (User 1 has selected 3 items and User 2 has selected 2 items).
The CSV output for above should be:
question1 | question2 | reference-id|....
ans1 | opt1,opt2,opt3 | RYRT4ZU1ZO
ans1 | opt1,opt2 | RYRT4ABCZO

Assuming that this is your JSON structure, you can use the following:
DECLARE @json NVARCHAR(4000) = '{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}'
SELECT *
FROM
(
SELECT [key] JsonKey , value JsonValue
FROM OPENJSON (@json)
) X
PIVOT
(
MAX(JsonValue) FOR JsonKey IN ([question1], [question2], [reference-id])
) P
If the structure is not going to be the same for every row, you'll need to create a dynamic pivot.
You can also do this:
DECLARE @json NVARCHAR(4000) = '{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}'
SELECT *
FROM OPENJSON (@json)
WITH ([question1] INT '$."question1"',
[question2] INT '$."question2"',
[reference-id] varchar(100) '$."reference-id"')
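The dynamic pivot mentioned above could look roughly like the following. This is only a sketch under assumptions: the temporary table #SurveyResponse and its columns ResponseId and Answers are hypothetical stand-ins for the real table, and the key list is built with FOR XML PATH because STRING_AGG is not available on SQL Server 2016.

```sql
-- Hypothetical sample table; replace with the real table and JSON column.
CREATE TABLE #SurveyResponse (ResponseId INT IDENTITY(1,1) PRIMARY KEY, Answers NVARCHAR(MAX));
INSERT INTO #SurveyResponse (Answers) VALUES
  (N'{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}')
, (N'{"question1":"456","question2":"456","reference-id":"Z6SWFLGKE56"}');

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build a bracket-quoted, comma-separated list of every distinct JSON key.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(k.[key])
    FROM (SELECT DISTINCT j.[key]
          FROM #SurveyResponse AS r
          CROSS APPLY OPENJSON(r.Answers) AS j) AS k
    ORDER BY k.[key]
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

-- Pivot the key/value rows back into one row per response.
SET @sql = N'
SELECT ' + @cols + N'
FROM (SELECT r.ResponseId, j.[key] AS JsonKey, j.[value] AS JsonValue
      FROM #SurveyResponse AS r
      CROSS APPLY OPENJSON(r.Answers) AS j) AS src
PIVOT (MAX(JsonValue) FOR JsonKey IN (' + @cols + N')) AS p
ORDER BY ResponseId;';

EXEC sys.sp_executesql @sql;

DROP TABLE #SurveyResponse;
```

Note that QUOTENAME returns NULL for inputs longer than 128 characters, so this sketch assumes the JSON keys are short enough to serve as column names.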

One method is with OPENJSON and CROSS APPLY:
DECLARE @JsonTable TABLE(json nvarchar(MAX));
INSERT INTO @JsonTable VALUES
(N'{"question1":"123","question2":"123","reference-id":"Z6SIPLGKE56"}')
, (N'{"question1":"456","question2":"456","reference-id":"Z6SIPLGKE57"}');
SELECT
question1
, question2
, reference_id
FROM @JsonTable
CROSS APPLY OPENJSON(json)
WITH (
question1 int '$.question1'
, question2 int '$.question2'
, reference_id varchar(20) '$."reference-id"'
);
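Since the question also asks about a temporary table, the same CROSS APPLY pattern can feed SELECT ... INTO. The sketch below is an illustration, not part of the answer above: the temporary table name #SurveyAnswers is made up, and the columns are declared as strings rather than int on the assumption that survey answers such as "ans1" or "opt1,opt2" are not numeric.

```sql
-- Sample rows standing in for the real table (illustrative values).
DECLARE @JsonTable TABLE (json NVARCHAR(MAX));
INSERT INTO @JsonTable VALUES
  (N'{"question1":"ans1","question2":"opt1,opt2,opt3","reference-id":"RYRT4ZU1ZO"}')
, (N'{"question1":"ans1","question2":"opt1,opt2","reference-id":"RYRT4ABCZO"}');

-- SELECT ... INTO creates the temporary table from the result set.
SELECT j.question1, j.question2, j.reference_id
INTO #SurveyAnswers
FROM @JsonTable AS t
CROSS APPLY OPENJSON(t.json)
WITH (
    question1    NVARCHAR(4000) '$.question1',
    question2    NVARCHAR(4000) '$.question2',
    reference_id VARCHAR(20)    '$."reference-id"'
) AS j;

SELECT question1, question2, reference_id FROM #SurveyAnswers;

DROP TABLE #SurveyAnswers;
```

From there the rows can be exported with whatever tool is already in use for CSV output.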

Related

Split Single Column into multiple and Load it to a Table or a View

I'm using SQL Server 2008. I have a source table with a few columns (A, B) containing string data that must be split into multiple columns. I already have a function that does the split.
The data from the source table (whose format cannot be modified) is used in a view that is being created. The view needs to expose the already split data for columns A and B of the source table, so it will have extra columns that are not in the source table.
The view, populated from the source table, is then used to merge with the other table.
There are two questions here:
Can I split column A and B from the Source table when creating a View, but do not change the Source Table?
How to use my existing User Defined Function in the View "Select" statement to accomplish this task?
Idea in short:
The string to split is also shown in the example in the commented-out section. In short, I have a Destination table, the vStandardizedData view, and a stored procedure that uses the view data to merge into the tblStandardizedData table. So in my source I have columns A and B that I need to split before loading into the tblStandardizedData table.
There are five objects that I'm working on:
Source File
Destination Table
vStandardizedData View
tblStandardizedData table
Stored procedure that does the merge (Update and Insert) from the vStandardizedData View.
Note: all 5 objects are listed in the order they are supposed to be created and loaded.
Separately from this, there is an existing user-defined function that can split the string, which I was told to use.
Example of the string in column A (column B has the same format data) to be split:
6667 Mission Street, 4567 7rd Street, 65 Sully Pond Park
Desired result:
User-defined function returns a table variable:
CREATE FUNCTION [Schema].[udfStringDelimeterfromTable]
(
@sInputList VARCHAR(MAX) -- List of delimited items
, @Delimiter CHAR(1) = ',' -- delimiter that separates items
)
RETURNS @List TABLE (Item VARCHAR(MAX)) WITH SCHEMABINDING
/*
* Returns a table of strings that have been split by a delimiter.
* Similar to the Visual Basic (or VBA) SPLIT function. The
* strings are trimmed before being returned. Null items are not
* returned so if there are multiple separators between items,
* only the non-null items are returned.
* Space is not a valid delimiter.
*
* Example:
SELECT * FROM [Schema].[udfStringDelimeterfromTable]('abcd,123, 456, efh,,hi', ',')
*
* Test:
DECLARE @Count INT, @Delim CHAR(10), @Input VARCHAR(128)
SELECT @Count = Count(*)
FROM [Schema].[udfStringDelimeterfromTable]('abcd,123, 456', ',')
PRINT 'TEST 1 3 lines:' + CASE WHEN @Count=3
THEN 'Worked' ELSE 'ERROR' END
SELECT @Delim=CHAR(10)
, @Input = 'Line 1' + @Delim + 'line 2' + @Delim
SELECT @Count = Count(*)
FROM [Schema].[udfStringDelimeterfromTable](@Input, @Delim)
PRINT 'TEST 2 LF :' + CASE WHEN @Count=2
THEN 'Worked' ELSE 'ERROR' END
What I'd ask you, is to read this: How to create a Minimal, Complete, and Verifiable example.
In general: if you use your UDF, you'll get the data back as rows. It would be best if your UDF returned each item together with a running number. Otherwise you'll first need to use ROW_NUMBER() OVER(...) to create a part number, from which you build your target column names via string concatenation. Then use PIVOT to get the columns side by side.
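The ROW_NUMBER() plus PIVOT route can be sketched as follows. Because the body of the UDF is not shown, the table variable @SplitRows is a hypothetical stand-in for the rows it would return, and the column names Address1 to Address3 are chosen only for illustration.

```sql
-- Stand-in for the rows a split function would return (hypothetical data).
DECLARE @SplitRows TABLE (ID INT, Item VARCHAR(100));
INSERT INTO @SplitRows VALUES
  (1, '6667 Mission Street'), (1, '4567 7rd Street'), (1, '65 Sully Pond Park')
, (2, 'Other addr1'), (2, 'one more addr');

WITH Numbered AS
(
    -- Number the parts per ID and turn the number into a target column name.
    SELECT ID, Item,
           'Address' + CAST(ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS VARCHAR(10)) AS ColName
    FROM @SplitRows
)
SELECT ID, Address1, Address2, Address3
FROM Numbered
PIVOT (MAX(Item) FOR ColName IN (Address1, Address2, Address3)) AS p
ORDER BY ID;
```

ORDER BY (SELECT NULL) does not guarantee that the parts keep their original sequence, which is why a splitter that returns its own position number is preferable.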
An easier approach could be a string split via XML like in this answer
A quick proof of concept to show the principles:
DECLARE @tbl TABLE(ID INT,YourValues VARCHAR(100));
INSERT INTO @tbl VALUES
(1,'6667 Mission Street, 4567 7rd Street, 65 Sully Pond Park')
,(2,'Other addr1, one more addr, and another one, and even one more');
WITH Casted AS
(
SELECT *
,CAST('<x>' + REPLACE(YourValues,',','</x><x>') + '</x>' AS XML) AS AsXml
FROM @tbl
)
SELECT *
,LTRIM(RTRIM(AsXml.value('/x[1]','nvarchar(max)'))) AS Address1
,LTRIM(RTRIM(AsXml.value('/x[2]','nvarchar(max)'))) AS Address2
,LTRIM(RTRIM(AsXml.value('/x[3]','nvarchar(max)'))) AS Address3
,LTRIM(RTRIM(AsXml.value('/x[4]','nvarchar(max)'))) AS Address4
,LTRIM(RTRIM(AsXml.value('/x[5]','nvarchar(max)'))) AS Address5
FROM Casted
If your values might include forbidden characters (especially <,> and &) you can find an approach to deal with this in the linked answer.
The result
+----+---------------------+-----------------+--------------------+-------------------+----------+
| ID | Address1 | Address2 | Address3 | Address4 | Address5 |
+----+---------------------+-----------------+--------------------+-------------------+----------+
| 1 | 6667 Mission Street | 4567 7rd Street | 65 Sully Pond Park | NULL | NULL |
+----+---------------------+-----------------+--------------------+-------------------+----------+
| 2 | Other addr1 | one more addr | and another one | and even one more | NULL |
+----+---------------------+-----------------+--------------------+-------------------+----------+

Best way to concat 1 to n values into single field from two tables

T-SQL
Imagine two tables looking like this:
Table: students
==============================
| TeacherID | SName |
| 1 | Thompson |
| 1 | Nickles |
| 2 | Cree |
==============================
Table: teacher
====================================================
| TeacherID | TName | + many other fields |
| 1 | Pipers | |
| 2 | Slinger | |
====================================================
The field names are completely arbitrary.
I want to create a query with the following output:
================================================================
| TeacherName | many other fields | Students |
| Pipers | | Thompson,Nickles |
================================================================
Currently I have something like this:
SELECT *
FROM teacher
LEFT JOIN (
SELECT DISTINCT
EL2.teacherID,
STUFF(( SELECT ',' + SName
FROM students
WHERE EL2.teacherID = students.teacherID
FOR XML PATH('')
),1,1,'') AS "Students"
FROM students, teacher EL2) t1
ON t1.teacherID = teacher.teacherID
WHERE t1.Students LIKE '%Thompson%'
This works and gives me what I need. The WHERE clause is there to illustrate that I also absolutely need to be able to filter on whether a teacher has that student, but then put all students that teacher has into the concatenated field.
My question now is whether there is a better way to do this.
I already looked at this:
Concatenate many rows into a single text string?
But it didn't help me much because, first, I couldn't get it to work with two separate tables and, second, I couldn't filter the way I needed.
The SQL Server Management Studio execution plan indicates that the SELECT DISTINCT is very expensive, and others have said that the reliance on XML PATH is not optimal because its behaviour can change.
Be careful with a DISTINCT on names, as you might have two students with the same name! And by the way: GROUP BY is in most cases a better-performing approach to get a distinct list...
You might try something like this:
SELECT t.*
,STUFF(( SELECT ',' + s.SName
FROM students AS s
WHERE t.teacherID = s.teacherID
FOR XML PATH('')
),1,1,'') AS Students
FROM teacher AS t
WHERE EXISTS(SELECT 1 FROM students AS x WHERE x.teacherID=t.teacherID /*AND [PUT YOUR FILTER HERE]*/)
If I understand this correctly, you want to find only teachers who are connected to one given student. And for these teachers you want to list all of their students, correct?
At the end you find a /*AND [PUT YOUR FILTER HERE]*/ placeholder. In this place you should put something like AND x.StudentId=123. This will filter the teachers down to the rows connected with this student only. For these teachers, all students are concatenated...
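As an illustration of the filter placeholder, here is a small self-contained sketch using the sample data from the question. The sample tables have no StudentId column, so the filter uses SName instead; the table variables and the TYPE/.value() wrapper (which keeps characters such as & from being XML-escaped) are additions for this example only.

```sql
DECLARE @teacher  TABLE (TeacherID INT, TName VARCHAR(50));
DECLARE @students TABLE (TeacherID INT, SName VARCHAR(50));
INSERT INTO @teacher  VALUES (1, 'Pipers'), (2, 'Slinger');
INSERT INTO @students VALUES (1, 'Thompson'), (1, 'Nickles'), (2, 'Cree');

SELECT t.TName
      ,STUFF(( SELECT ',' + s.SName
               FROM @students AS s
               WHERE s.TeacherID = t.TeacherID
               FOR XML PATH(''), TYPE
             ).value('.', 'VARCHAR(MAX)'), 1, 1, '') AS Students
FROM @teacher AS t
-- Keep only teachers who have the given student; all of their students are still listed.
WHERE EXISTS (SELECT 1
              FROM @students AS x
              WHERE x.TeacherID = t.TeacherID
                AND x.SName = 'Thompson');
```

Only the teacher Pipers qualifies here, and the Students column contains both of that teacher's students.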
Use FOR XML PATH (see: how FOR XML PATH works):
select
TeacherID,
Tname,
stuff((select ','+s.sname from students s where s.teacherid=t.teacherid
for xml path('')),1,1,'')as students
from
teacher t

How to build search engine for website using sql server

I need some help with creating a simple search engine for a website. The basic idea is that the user enters a string in the search bar, which is compared against key_word in the database to get the results.
Let's say I have the following table in the SQL Server database:
|----|----------|----------------------|
| ID | URL | key_word |
|----|----------|----------------------|
| 1 | url1.com | cat short red NYC |
| 2 | url2.com | tall blue LA |
| 3 | url3.com | skinny NYC green |
| 4 | url4.com | cat black get |
|----|----------|----------------------|
Now, let's say the user wants to search for the string "get red cat from NYC" in the search bar. I want to search for this in the database column 'key_word'.
String key = "get red cat from NYC"
What I have tried:
So far I have the query below to search the database. It works if the user searches for only one word, but with the string 'key' it returns 0 results. I need some ideas to make this a better query.
SELECT *
FROM [SearchTable]
WHERE [key_Word] LIKE '%' + @key + '%';
What I want:
I want to change this SQL Server query so that it returns ID=1,3,4.
In other words, I want to take this string:
String key = "get red cat from NYC"
and first search the database for the word "get". It doesn't show up, so go to the next word. The next word is "red", which shows up in ID=1. The next word is "cat", which shows up in ID=1,4. The next word is "from", which doesn't show up in any rows. The next word is "NYC", which shows up in ID=1,3.
Put all the IDs together and you get ID=1,1,4,1,3.
Then I want to sort them so that ID=1 shows up at the top and ID=3,4 are at the bottom, since they are tied.
I was hoping to do this with only one SQL query, because if I keep connecting to the database the speed will go down too. So I was thinking of some SQL Server functions?
You need a string splitter for this. See this article for some functions:
DECLARE @key VARCHAR(MAX) = 'get red cat from NYC'
SELECT t.ID
FROM tbl t
CROSS APPLY dbo.SplitStrings_XML(t.key_word, ' ') tx
INNER JOIN (
SELECT Item
FROM dbo.SplitStrings_XML(@key, ' ')
)k
ON k.Item = tx.Item
GROUP BY t.ID
ORDER BY COUNT(*) DESC
Here is the SplitStrings_XML function:
CREATE FUNCTION dbo.SplitStrings_XML
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
The above function will not work if your string has illegal XML characters like >, <, and &. You can use another splitter, but the idea stays the same.

Join tables by column names, convert string to column name

I have a table which stores 1 row per survey.
Each survey has about 70 questions, and each column represents 1 question.
SurveyID Q1, Q2 Q3 .....
1 Yes Good Bad ......
I want to pivot this so it reads
SurveyID Question Answer
1 Q1 Yes
1 Q2 Good
1 Q3 Bad
... ... .....
I use CROSS APPLY to achieve this:
SELECT t.[SurveyID]
, x.question
, x.Answer
FROM tbl t
CROSS APPLY
(
select 1 as QuestionNumber, 'Q1' as Question , t.Q1 As Answer union all
select 2 as QuestionNumber, 'Q2' as Question , t.Q2 As Answer union all
select 3 as QuestionNumber, 'Q3' as Question , t.Q3 As Answer) x
This works, but I don't want to repeat it 70 times, so I have this select statement:
select ORDINAL_POSITION
, COLUMN_NAME from INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'mytable'
This gives me the list of columns and the position of each column in the table.
So I hope I can somehow join the 2nd statement with the 1st statement by column name. However, I would be comparing the content of a column with a column header here. Is it doable? Is there another way of achieving this?
Hope you can guide me please?
Thank you
Instead of Cross Apply you should use UNPIVOT for this query....
MS SQL Server 2008 Schema Setup:
CREATE TABLE Test_Table(SurveyID INT, Q1 VARCHAR(10)
, Q2 VARCHAR(10), Q3 VARCHAR(10), Q4 VARCHAR(10))
INSERT INTO Test_Table VALUES
(1 , 'Yes', 'Good' , 'Bad', 'Bad')
,(2 , 'Bad', 'Bad' , 'Yes' , 'Good')
Query 1:
SELECT SurveyID
,Questions
,Answers
FROM Test_Table t
UNPIVOT ( Answers FOR Questions IN (Q1,Q2,Q3,Q4))up
Results:
| SurveyID | Questions | Answers |
|----------|-----------|---------|
| 1 | Q1 | Yes |
| 1 | Q2 | Good |
| 1 | Q3 | Bad |
| 1 | Q4 | Bad |
| 2 | Q1 | Bad |
| 2 | Q2 | Bad |
| 2 | Q3 | Yes |
| 2 | Q4 | Good |
If you need to perform this kind of operation to lots of similar tables that have differing numbers of columns, an UNPIVOT approach alone can be tiresome because you have to manually change the list of columns (Q1,Q2,Q3,etc) each time.
The CROSS APPLY based query in the question also suffers from similar drawbacks.
The solution to this, as you've guessed, involves using meta-information maintained by the server to tell you the list of columns you need to operate on. However, rather than requiring some kind of join as you suspect, what is needed is Dynamic SQL, that is, a SQL query that creates another SQL query on-the-fly.
This is done essentially by concatenating string (varchar) information in the SELECT part of the query, including values from columns which are available in your FROM (and join) clauses.
With Dynamic SQL (DSQL) approaches, you often use system metatables as your starting point. INFORMATION_SCHEMA exists in some SQL Server versions, but you're better off using the Object Catalog Views for this.
A prototype DSQL solution to generate the code for your CROSS APPLY approach would look something like this:
-- Create a variable to hold the created SQL code
-- First, add the static code at the start:
declare @SQL varchar(max) =
' SELECT t.[SurveyID]
, x.question
, x.Answer
FROM tbl t
CROSS APPLY
(
'
-- This syntax will add to the variable for every row in the query results; it's a little like looping over all the rows.
select @SQL +=
'select ' + cast(C.column_id as varchar)
+ ' as QuestionNumber, ''' + C.name
+ ''' as Question , t.' + C.name
+ ' As Answer union all
'
from sys.columns C
inner join sys.tables T on C.object_id=T.object_id
where T.name = 'MySurveyTable'
-- Remove final "union all", add closing bracket and alias
set @SQL = left(@SQL,len(@SQL)-10) + ') x'
print @SQL
-- To also execute (run) the dynamically-generated SQL
-- and get your desired row-based output all at the same time,
-- use the EXECUTE keyword (EXEC for short).
-- The parentheses are required when executing a string variable.
exec (@SQL)
A similar approach could be used to dynamically write SQL for the UNPIVOT approach.
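A sketch of what the dynamic UNPIVOT variant might look like is shown below. The table dbo.SurveyDemo is a hypothetical demo table, the column list is assembled with FOR XML PATH instead of the variable-concatenation trick used above, and it assumes all question columns share the same data type and length, which UNPIVOT requires.

```sql
-- Hypothetical demo table; every question column has the same type and length.
CREATE TABLE dbo.SurveyDemo (SurveyID INT, Q1 VARCHAR(10), Q2 VARCHAR(10), Q3 VARCHAR(10));
INSERT INTO dbo.SurveyDemo VALUES (1, 'Yes', 'Good', 'Bad'), (2, 'Bad', 'Bad', 'Yes');

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Collect every column except the key column, in table order.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(c.name)
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'dbo.SurveyDemo')
      AND c.name <> N'SurveyID'
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

SET @sql = N'SELECT SurveyID, Question, Answer
FROM dbo.SurveyDemo
UNPIVOT (Answer FOR Question IN (' + @cols + N')) AS up;';

PRINT @sql;                    -- inspect the generated statement
EXEC sys.sp_executesql @sql;   -- then run it

DROP TABLE dbo.SurveyDemo;
```

Keep in mind that UNPIVOT skips NULL answers, so unanswered questions would not appear in the output.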

SQL Server Query Problem?

I was using MySQL's concat_ws function to get data from a MySQL database and it was working perfectly. I discovered this query through this SO question. Now I have the same case, but this time the database is SQL Server.
Asking again for SQL Server:
I have a table like this:
id | roll_no | name
---------------------
1 | 111 | Naveed
2 | 222 | Adil
3 | 333 | Ali
If I have data like this:
$fields = array( "id" , "roll_no" ) and $values = array( "1,111", "2,222" );
It means I have to write a sql query to get records from table where (id != 1 and roll_no != 111) and (id != 2 and roll_no != 222). It means 3rd record will be fetched.
I have data like this:
$fields = array( "id" ) and $values = array( "2", "3" );
It means I have to write a sql query to get records from table where (id != 2) and (id != 3). It means 1st record will be fetched.
How to write a general single query to get data from table using above two data arrays?
Thanks
Your question is unclear to me, but from the link you provided to the other SO question, and assuming you just want to adapt it to SQL Server, I guess you could just replace
concat_ws (',', id, roll_no)
by
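CAST(id AS VARCHAR(20)) + ',' + CAST(roll_no AS VARCHAR(20))
(the casts are needed if the columns are numeric, because + would otherwise try to convert ',' to a number)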
It's simple SQL. Do you know the 'where (not) in' clause?
For the first case, if you have a 1-to-1 relation between id and roll_no, you can simply write
select *
from table
where id not in ('1','2') and roll_no not in ('111','222')
For the second it's easier.
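If id and roll_no are not strictly one-to-one, two independent NOT IN lists can exclude more rows than intended. One possible alternative, shown here only as a sketch with a table variable standing in for the real table, is to compare against the excluded pairs with NOT EXISTS and a VALUES list (available from SQL Server 2008 onwards):

```sql
DECLARE @t TABLE (id INT, roll_no INT, name VARCHAR(50));
INSERT INTO @t VALUES (1, 111, 'Naveed'), (2, 222, 'Adil'), (3, 333, 'Ali');

SELECT t.id, t.roll_no, t.name
FROM @t AS t
WHERE NOT EXISTS (
    -- Each row of the VALUES list is one (id, roll_no) pair to exclude.
    SELECT 1
    FROM (VALUES (1, 111), (2, 222)) AS ex (id, roll_no)
    WHERE ex.id = t.id
      AND ex.roll_no = t.roll_no
);
-- Returns the row with id = 3 (Ali).
```

For the single-field case the VALUES list simply has one column, or a plain NOT IN list does the job.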
