Trying to parse JSON data without having access to OPENJSON - sql-server

We have a production database with a compatibility level of 100, which cannot change, and OPENJSON requires level 130. So I'm trying to figure out a little workaround.
Here is an example of the JSON:
SET @json = '{
"sessionid": "XXXXX",
"userid": "XXXX",
"scorm": [{
"name": "variable_name",
"value": "variable_value"
},
{
"name": "variable_name",
"value": "variable_value"
},
... Continues on like this
]
}';
So I am able to get values out of this JSON by doing the following:
SELECT JSON_VALUE(@json, '$.sessionid') as SESSIONID
SELECT JSON_VALUE(@json, '$.scorm[0].value') as FirstValue
But I have no idea how many entries will be inside the scorm: [{}] array, and we need all of them saved into a table. The max there will ever be is probably 25 different values.
So I figured I could do something like:
-- pseudo code
WHILE (@value IS NOT NULL)
BEGIN
    -- get the topmost values
    SELECT @value = JSON_VALUE(@json, '$.scorm[0].value')
    SELECT @name = JSON_VALUE(@json, '$.scorm[0].name')
    -- erase them from the json
    SET @json = JSON_MODIFY(@json, '$.scorm[0]', NULL)
    SET @json = REPLACE(@json, 'null,', '')
    -- insert into the table
    INSERT INTO TABLE (name, value) VALUES (@name, @value)
END
This essentially gets the topmost value from the json array, sets it to NULL, and then manipulates the string to get rid of it, meaning the json array now has a new topmost value.
Then I would continue on like this and insert data into my table for each name/value.
This all seems like a terrible way to do it, and I can't imagine it's good, nor fast.
Am I missing something obvious? Is there some other way I could be doing this (without access to OPENJSON())?
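For what it's worth, one workaround that avoids mutating the string entirely is to walk the array by index, since JSON_VALUE works regardless of compatibility level (only OPENJSON is gated on 130). A minimal sketch, assuming SQL Server 2017+ so the JSON path can be built as an expression; ScormValues is a hypothetical target table:
DECLARE @json nvarchar(max) = N'{"sessionid":"XXXXX","scorm":[{"name":"a","value":"1"},{"name":"b","value":"2"}]}';
DECLARE @i int = 0,
        @name nvarchar(200) = JSON_VALUE(@json, '$.scorm[0].name');
-- walk the array until JSON_VALUE falls off the end and returns NULL
WHILE @name IS NOT NULL
BEGIN
    INSERT INTO ScormValues (name, value)
    VALUES (@name, JSON_VALUE(@json, '$.scorm[' + CAST(@i AS varchar(10)) + '].value'));
    SET @i += 1;
    SET @name = JSON_VALUE(@json, '$.scorm[' + CAST(@i AS varchar(10)) + '].name');
END;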

Related

Full Text Search in OrientDB JSON Data

I have the following data in OrientDB 3.0.27, where some of the values are in a JSON array and some are strings:
{
"#type": "d",
"#rid": "#57:0",
"#version": 2,
"#class": "abc_class",
"user_name": [
"7/1 LIBOR Product"
],
"user_Accountability": [],
"user_Rollout_32_date": [],
"user_Brands": [
"AppNet"
],
"user_lastModificationTime": [
"2019-11-27 06:40:35"
],
"user_columnPercentage": [
"0.00"
],
"user_systemId": [
"06114a87-a099-0c30c60b49c4"
],
"user_lastModificationUser": [
"system"
],
"system_type": "Product",
"user_createDate": [
"2017-10-27 09:58:42"
],
"system_modelId": "bian_model",
"user_parent": [
"a12a41bd-af6f-0ca028af480d"
],
"user_Strategic_32_value": [],
"system_oeId": "06114a87-a099-0c30c60b49c4",
"user_description": [],
"#fieldTypes": "user_name=e,user_Accountability=e,user_Rollout_32_date=e,user_Brands=e,user_lastModificationTime=e,user_columnPercentage=e,user_systemId=e,user_lastModificationUser=e,user_createDate=e,user_parent=e,user_Strategic_32_value=e,user_description=e"
}
I have tried following queries:
select * from `abc_class` where any() = ["AppNet"] limit 2;
select * from `abc_class` where any() like '%a099%' limit 2;
Both of the above queries work since they respect the datatype of the field.
I want to run a contains query that searches ANY field of ANY data type (string, number, JSON array, etc.), more like a full-text search.
select * from `abc_class` where any() like '%AppNet%' limit 2;
The above query doesn't work since the real value is inside a JSON array. I've tried almost everything from the filtering section of the documentation.
How can I achieve full-text search like functionality with the existing data?
EDIT # 1
After doing more research I'm now able to at least convert the array value into a string and then run the LIKE operator on it, like below:
select * from `abc_class` where user_name.asString() like '%LIBOR%'
However, using any().asString() doesn't return any results:
select * from `abc_class` where any().asString() like '%LIBOR%'
If the above query can be enhanced somehow to query any column as a string, then the problem can be resolved.
If all the column values need to be searched, we can create a JSON object of the full row data and convert it into a string.
Then query the string with the LIKE keyword, as follows:
select * from `abc_class` where @this.toJSON().asString() like '%LIBOR%'
If we convert with @this.asString() directly, we get the count of array elements instead of the real data inside the array elements, like below:
abc_class#57:4{system_modelId:model,system_oeId:14f4b593-a57d-4d37ad070a10,system_type:Product,user_lastModificationUser:[1],user_name:[1],user_description:[0],user_Accountability:[0],user_lastModificationTime:[1],user_Rollout_32_date:[0],user_Strategic_32_value:[0],user_createDate:[1],user_Brands:[0],user_parent:[1],user_systemId:[1],user_columnCompletenessPercentage:[1]} v2
Therefore, we need to first convert into JSON and then into a string to query the full record, using @this.toJSON().asString().
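To see why the toJSON() step matters, compare the two conversions side by side (a sketch against the same class); the first renders arrays only as element counts like [1], while the second renders the full JSON text that LIKE can match:
select @this.asString() from `abc_class` limit 1
select @this.toJSON().asString() from `abc_class` limit 1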
References:
https://orientdb.com/docs/last/sql/SQL-Methods.html
https://orientdb.com/docs/last/sql/SQL-Where.html
https://orientdb.com/docs/last/sql/SQL-Syntax.html

Variant column separate json values in file by comma?

This is a basic question: I am trying to break one variant row into multiple columns and I'm running into an error.
Create or replace table App_versions(data variant);
CREATE or Replace FILE FORMAT x_json
TYPE = "JSON"
COMPRESSION = "GZIP"
FILE_EXTENSION = 'json.gz';
COPY INTO App_versions
FROM @~/staged
file_format = 'x_json'
on_error = 'skip_file';
list @~;
SELECT * FROM App_versions limit 10;
Select data:available.value::boolean as avail, data:color.value::string as col, data:name.value::string as title, data:version.value::float as version from App_versions;
Data Stored in Column
[
{
"available": false,
"color": "Indigo",
"name": "Bigtax",
"version": "2.2.9"
},
{
"available": false,
"color": "Khaki",
"name": "Solarbreeze",
"version": "7.00"
}
]
And all of the columns are coming back as NULL values. What am I doing wrong?
I based it off of: https://support.snowflake.net/s/article/json-data-parsing-in-snowflake
If you want each { ... } object to land in its own row, then use the STRIP_OUTER_ARRAY = TRUE file format option. Or you can FLATTEN() the data on the fly after loading. To access multiple objects in a single row without flattening, you have to include an index to specify which object you want -- for example ... select data[0].available::boolean as avail ....
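For instance, the FLATTEN() route might look like this after loading (a sketch reusing the question's table and column names); with STRIP_OUTER_ARRAY = TRUE at load time, each object would already land in its own row and the casts could be applied to data directly:
SELECT
    f.value:available::boolean AS avail,
    f.value:color::string AS col,
    f.value:name::string AS title,
    f.value:version::float AS version
FROM App_versions,
     LATERAL FLATTEN(input => data) f;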

Getting values from json array using an array of object and keys in Python

I'm a Python newbie and I'm trying to write a script that extracts JSON keys by passing the keys dynamically, reading them from a CSV.
First of all, this is my first post and I'm sorry if my questions are banal and the code is incomplete, but it's just pseudocode to illustrate the problem (I hope not to complicate it...).
The following partial code retrieves the values for three keys (group, user, and id or username), but I'd like to load the objects and keys from a CSV to make them dynamic.
Input json
{
"fullname": "The Full Name",
"group": {
"user": {
"id": 1,
"username": "John Doe"
},
"location": {
"x": "1234567",
"y": "9876543"
}
},
"color": {
"code": "ffffff",
"type" : "plastic"
}
}
Python code...
...
url = urlopen(jsonFile)
data = json.loads(url.read())
id = (data["group"]["user"]["id"])
username = (data["group"]["user"]["username"])
...
File.csv is loaded into an array. Each line contains one or more keys:
fullname;
group,user,id;
group,user,username;
group,location,x;
group,location,y;
color,code;
The questions are: can I use a variable containing the object or key to be extracted?
And how can I specify how many keys there are in the keys array to put them into data[ ][ ]... using only one line?
Something like this pseudo code:
...
url = urlopen(jsonFile)
data = json.loads(url.read())
...
keys = line.split(',')
...
# using keys[] to identify the objects and keys
value = (data[keys[0]][keys[1]][keys[2]])
...
But the line value = (data[keys[0]][keys[1]][keys[2]]) would need to match the exact number of keys in each line read from the CSV.
Or must I write some "if" branches like these?:
...
if len(keys) == 3:
value = (data[keys[0]][keys[1]][keys[2]])
if len(keys) == 2:
value = (data[keys[0]][keys[1]])
...
Many thanks!
I'm not sure I completely understand your question, but I would suggest you try playing with pandas. It might be as easy as this:
import pandas as pd
df = pd.read_json(<yourJsonFile>, orient='columns')
name = df.fullname[0]
group_user = df.group.user
group_location = df.group.location
color_type = df.color.type
color_code = df.color.code
(where group_user and group_location will be Python dictionaries).
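If the goal is simply to follow a variable-length key path read from the CSV, a small helper built on functools.reduce avoids the chain of if branches from the question. A sketch (get_by_path is a made-up name):
from functools import reduce

def get_by_path(data, keys):
    # Follow the list of keys into the nested dict, whatever its length.
    return reduce(lambda obj, key: obj[key], keys, data)

doc = {"group": {"user": {"id": 1, "username": "John Doe"}}}
print(get_by_path(doc, ["group", "user", "username"]))  # -> John Doe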

How to transform a JSON array nested inside an object inside another array in Postgres?

I'm using Postgres 9.6 and have a JSON field called credits with the following structure: a list of credits, each with a position and multiple people that can be in that position.
[
{
"position": "Set Designers",
"people": [
"Joe Blow",
"Tom Thumb"
]
}
]
I need to transform the nested people array, which currently holds plain strings representing names, into objects that have name and image_url fields, like this:
[
{
"position": "Set Designers",
"people": [
{ "name": "Joe Blow", "image_url": "" },
{ "name": "Tom Thumb", "image_url": "" }
]
}
]
So far I've only been able to find decent examples of doing this on either the parent JSON array or on an array field nested inside a single JSON object.
This is all I've been able to manage so far, and even it is mangling the result:
UPDATE campaigns
SET credits = (
SELECT jsonb_build_array(el)
FROM jsonb_array_elements(credits::jsonb) AS el
)::jsonb
;
Create an auxiliary function to simplify the rather complex operation:
create or replace function transform_my_array(arr jsonb)
returns jsonb language sql as $$
  select case
    when coalesce(arr, '[]') = '[]' then '[]'
    else jsonb_agg(jsonb_build_object('name', value, 'image_url', ''))
  end
  from jsonb_array_elements(arr)
$$;
With the function the update is not so horrible:
update campaigns
set credits = (
select jsonb_agg(jsonb_set(el, '{people}', transform_my_array(el->'people')))
from jsonb_array_elements(credits::jsonb) as el
)::jsonb
;
Working example in rextester.
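A quick way to sanity-check the helper on its own, with the expected output shown as a comment:
select transform_my_array('["Joe Blow", "Tom Thumb"]'::jsonb);
-- [{"name": "Joe Blow", "image_url": ""}, {"name": "Tom Thumb", "image_url": ""}]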

JSON_MODIFY appends variable containing JSON with escape characters instead of JSON string

I have had some success using JSON_MODIFY to append fieldErrors: [] and its content to the root of another object { data: [] }, so that the resultant object looks something like this: { data: [], fieldErrors: [] }.
The problem is, when I append a @variable, it includes a bunch of escape characters.
I expect this: {"name":"PosTitle","status":"the field is messed up, yo"}
BUT, I get this: ["{\"name\":\"PosTitle\",\"status\":\"the field is messed up, yo\"}"]}
DECLARE
@fieldErrors nvarchar(max) ='{}'
,@jsonResponse nvarchar(max) = '
{
"data": [
{
"PosTitle": "",
"PosCode": "86753",
}
]
}
'
--define the fields that are bad
set @fieldErrors = JSON_MODIFY(JSON_MODIFY(@fieldErrors, '$.name', 'PosTitle'), '$.status', 'the field is messed up, yo')
print @fieldErrors
--RESULT, this looks great:
--{"name":"PosTitle","status":"the field is messed up, yo"}
-- append fieldErrors to the response
set @jsonResponse = JSON_MODIFY(@jsonResponse, 'append $.fieldErrors', @fieldErrors)
print @jsonResponse
--RESPONSE, this includes escape characters
/*
{
"data": [
{
"PosTitle": "",
"PosCode": "86753",
}
]
,"fieldErrors":["{\"name\":\"PosTitle\",\"status\":\"the field is messed up, yo\"}"]}
*/
Why are escape characters being added when fieldErrors is appended to the response?
Please remove the comma at the end of "PosCode": "86753", as the JSON is not valid that way.
To answer your question: you are appending a string stored in @fieldErrors, which is what causes the escape characters to be added.
Instead,
set @jsonResponse = JSON_MODIFY(@jsonResponse, '$.fieldErrors', JSON_QUERY(@fieldErrors)) should yield the results you're looking for.
Note that when you use append you are telling JSON_MODIFY that it is adding a value to an array (which may or may not be what you need, but isn't what you wrote you're expecting).
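If an array of error objects is actually what's wanted, the same JSON_QUERY wrapper also works together with append; a sketch continuing the variables above:
set @jsonResponse = JSON_MODIFY(@jsonResponse, 'append $.fieldErrors', JSON_QUERY(@fieldErrors))
-- yields "fieldErrors":[{"name":"PosTitle","status":"the field is messed up, yo"}] with no escaping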
