Snowflake 'GET': (VARCHAR(16777216), VARCHAR(12))

I have loaded a JSON file onto a Snowflake stage.
Now my goal is to COPY the contents of the file into a relational table.
The table is defined with either VARCHAR or BOOLEAN columns.
{
  "requestRefid": "W2W8P",
  "requestid": "kki8786f1b-03eb",
  "requestTypes": [
    "Do not sell it"
  ],
  "subjectTypes": [
    "Current customer"
  ],
  "firstName": "Dan",
  "lastName": "Murrary",
  "email": "k99008a85ki#gmail.com",
  "phone": "410000869",
  "emailValidation": true,
  "phoneValidation": true,
  "message": "Confirm"
}
Here is the COPY statement that I am using:
copy into TEST."PUBLIC".REQUESTS(REQUESTREFID, REQUESTID, FIRSTNAME, LASTNAME, EMAIL, PHONE, EMAILVALIDATION, PHONEVALIDATION, IDVALIDATION, MESSAGE, CHANNEL)
from (select $1:requestRefid, $1:requestid, $1:firstName, $1:lastName, $1:email, $1:phone, $1:emailValidation, $1:phoneValidation, $1:idValidation, $1:message, $1:channel
      from @sf_tut_stage/sample.json t);
Here is the error that I get:
SQL Error [1044] [42P13]:
SQL compilation error: error line 2 at position 18
Invalid argument types for function 'GET': (VARCHAR(16777216),
VARCHAR(12))
I am able to query the contents of the JSON file in the stage using the following query:
select $1
from @sf_tut_stage/sample.json;
What am I doing wrong?
I also tried adding the following to the COPY statement:
file_format = (format_name = SF_TUT_CSV_FORMAT);
but no luck.
What is the right way to write this statement so that it loads the items within $1 into individual relational table columns?

Your code is trying to use the : syntax to extract a value from the source, but $1 is still just a VARCHAR, which doesn't support that syntax. Try this using PARSE_JSON so that Snowflake knows the value is JSON and can apply the syntax appropriately.
copy into TEST."PUBLIC".REQUESTS(REQUESTREFID, REQUESTID, FIRSTNAME, LASTNAME, EMAIL, PHONE, EMAILVALIDATION, PHONEVALIDATION, IDVALIDATION, MESSAGE, CHANNEL)
from (
    select
        PARSE_JSON($1):requestRefid,
        PARSE_JSON($1):requestid,
        PARSE_JSON($1):firstName,
        PARSE_JSON($1):lastName,
        PARSE_JSON($1):email,
        PARSE_JSON($1):phone,
        PARSE_JSON($1):emailValidation,
        PARSE_JSON($1):phoneValidation,
        PARSE_JSON($1):idValidation,
        PARSE_JSON($1):message,
        PARSE_JSON($1):channel
    from @sf_tut_stage/sample.json t
);
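Alternatively, if you tell Snowflake up front that the staged file is JSON, $1 arrives as a VARIANT and the : syntax works without PARSE_JSON. A minimal sketch, assuming a file format named SF_TUT_JSON_FORMAT (the name is hypothetical):

create or replace file format SF_TUT_JSON_FORMAT type = 'JSON';  -- hypothetical format name

copy into TEST."PUBLIC".REQUESTS(REQUESTREFID, REQUESTID, FIRSTNAME, LASTNAME, EMAIL, PHONE, EMAILVALIDATION, PHONEVALIDATION, IDVALIDATION, MESSAGE, CHANNEL)
from (
    select
        $1:requestRefid, $1:requestid, $1:firstName, $1:lastName, $1:email, $1:phone,
        $1:emailValidation, $1:phoneValidation, $1:idValidation, $1:message, $1:channel
    from @sf_tut_stage/sample.json t
)
file_format = (format_name = SF_TUT_JSON_FORMAT);

This avoids re-parsing the text in every column expression; the CSV format the question tried (SF_TUT_CSV_FORMAT) is exactly why $1 was a plain VARCHAR.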

Here is the way to load JSON data into a relational table:
https://docs.snowflake.net/manuals/user-guide/script-data-load-transform-json.html
You need to use the PARSE_JSON() function.

Related

How to use JsonPath expression with wildcards in MS SQL 2019's Json_Value?

In my SQL Table, I have a column storing JSON with a structure similar to the following:
{
  "type": "Common",
  "items": [
    {
      "name": "landline",
      "number": "0123-4567-8888"
    },
    {
      "name": "home",
      "number": "0123-4567-8910"
    },
    {
      "name": "mobile",
      "number": "0123-4567-9910"
    }
  ]
}
This is the table structure I am using:
CREATE TABLE StoreDp(
    [JsonData] [nvarchar](max),
    [Type] AS (json_value([JsonData], 'lax $.type')) PERSISTED,
    [Items] AS (json_value([JsonData], N'lax $.items[*].name')) PERSISTED
)
Now, when I try to insert the sample JSON (serialized) into the table column [JsonData], I get an error:
JSON path is not properly formatted. Unexpected character '*' is found at position 3.
I was expecting the data to be inserted with [Items] set to "[landline, home, mobile]".
I have validated the JSONPath expression, and it works fine everywhere except SQL Server.
Update: Corrected the SQL server version.
SQL Server cannot shred and rebuild JSON using wildcard paths and JSON_VALUE.
You would have to use a combination of OPENJSON and STRING_AGG, and also STRING_ESCAPE if you want the result to be valid JSON.
SELECT
    (
        SELECT '[' + STRING_AGG('"' + STRING_ESCAPE(j.name, 'json') + '"', ',') + ']'
        FROM OPENJSON(sd.JsonData, '$.items')
        WITH (
            name varchar(20)
        ) j
    ) AS [Items]
FROM StoreDp sd;
You could only do this in a computed column by using a scalar UDF. However, scalar UDFs have major performance implications and should generally be avoided. I suggest you just create a view instead.
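For instance, a minimal sketch of such a view, reusing the OPENJSON query above (the view name is hypothetical):

CREATE VIEW StoreDpView AS  -- hypothetical view name
SELECT
    sd.JsonData,
    JSON_VALUE(sd.JsonData, 'lax $.type') AS [Type],
    (
        SELECT '[' + STRING_AGG('"' + STRING_ESCAPE(j.name, 'json') + '"', ',') + ']'
        FROM OPENJSON(sd.JsonData, '$.items')
        WITH (name varchar(20)) j
    ) AS [Items]
FROM StoreDp sd;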

Snowflake - extract JSON array string object values into pipe separated values

I have a nested JSON array of string objects stored in a VARIANT column of a stage table, and I want to extract a particular value from each string object and populate it as pipe-separated values when more than one object is found. Can someone help me achieve the desired output format?
Sample JSON data
{"issues": [
{
"expand": "",
"fields": {
"customfield_10010": [
"com.atlassian.xxx.yyy.yyyy.Sprint#xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]",
"com.atlassian.xxx.yyy.yyyy.Sprint#abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"
],
"customfield_10011": "obcd",
"customfield_10024": null,
"customfield_10034": null,
"customfield_10035": null,
"customfield_10037": null,
},
"id": "123456",
"key": "SUE-1234",
"self": "xyz"
}]}
I don't have any idea how to separate the string objects inside an array in Snowflake.
Using the query below, I can get the whole string converted into pipe-separated values.
select
    a.value:id::number as ISSUE_ID,
    a.value:key::varchar as ISSUE_KEY,
    array_to_string(a.value:fields.customfield_10010, '|') as CF_10010_Data
from
    ABC.VARIANT_TABLE,
    lateral flatten(input => payload_json:issues) as a;
But I need to extract a particular value from each string object. For example, the id values 1234 and 1239 should be populated pipe-separated as shown below.
ISSUE_ID   ISSUE_KEY   SPRINT_ID
123456     SUE-1234    1234|1239
Any idea on how to get the desired result is much appreciated. Thanks.
It looks like the data within [...] for your sprints is just details about that sprint. I think it would be easiest to actually populate a separate sprints table with data on each sprint, and then join that table to the Sprint ID values parsed from the API response you showed with issues data.
with jira_responses as (
    select
        $1 as id,
        $2 as body
    from (values
        (1, '{"issues":[{"expand":"","fields":{"customfield_10010":["com.atlassian.xxx.yyy.yyyy.Sprint#xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]","com.atlassian.xxx.yyy.yyyy.Sprint#abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"],"customfield_10011":"obcd","customfield_10024":null,"customfield_10034":null,"customfield_10035":null,"customfield_10037":null},"id":"123456","key":"SUE-1234","self":"xyz"}]}')
    )
)
select
    issues.value:id::integer as issue_id,
    issues.value:key::string as issue_key,
    get(split(sprints.value::string, '['), 0)::string as sprint_id
from jira_responses,
    lateral flatten(input => parse_json(body):issues) issues,
    lateral flatten(input => parse_json(issues.value):fields:customfield_10010) sprints
Based on your sample data, the results would look like the following:
ISSUE_ID   ISSUE_KEY   SPRINT_ID
123456     SUE-1234    com.atlassian.xxx.yyy.yyyy.Sprint#xyz456
123456     SUE-1234    com.atlassian.xxx.yyy.yyyy.Sprint#abc123
See Snowflake reference docs below.
"Querying Semi-structured Data"
PARSE_JSON
FLATTEN
SPLIT
GET
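If you specifically want the numeric sprint ids pipe-separated as in your desired output (1234|1239), a hedged sketch on top of the same query would be to pull the id= token out of each sprint string with REGEXP_SUBSTR and aggregate with LISTAGG:

-- assumes the same jira_responses CTE as above
select
    issues.value:id::integer as issue_id,
    issues.value:key::string as issue_key,
    -- grab the digits following "id=" in each sprint string, then pipe-join them per issue
    listagg(regexp_substr(sprints.value::string, 'id=([0-9]+)', 1, 1, 'e', 1), '|')
        within group (order by sprints.index) as sprint_id
from jira_responses,
    lateral flatten(input => parse_json(body):issues) issues,
    lateral flatten(input => parse_json(issues.value):fields:customfield_10010) sprints
group by 1, 2;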

SQL SERVER: Export Query RESULT as JSON Object

I am using Azure SQL Server and trying to export the results of a query in the following format.
Required Query Result:
{ "results": [{...},{...}], "response": 0 }
From this example: https://msdn.microsoft.com/en-us/library/dn921894.aspx
I am using this SQL, but I am not sure how to add another "response" property as a sibling of the root property "results".
Current Query:
SELECT name, surname
FROM emp
FOR JSON AUTO, ROOT('results')
Output of Query:
{ "results": [
{ "name": "John", "surname": "Doe" },
{ "name": "Jane", "surname": "Doe" } ] }
Use FOR JSON PATH instead of FOR JSON AUTO. See the Format Query Results as JSON with FOR JSON (SQL Server) page for several examples, including dot-separated column names and queries from SELECTs; a sketch follows.
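For instance, a hedged sketch of the dot-separated alias technique that answer mentions, reusing the emp columns from the question ([person.name] is an illustrative alias, not from the original):

SELECT name AS [person.name], surname AS [person.surname]
FROM emp
FOR JSON PATH, ROOT('results');
-- yields {"results":[{"person":{"name":"John","surname":"Doe"}},{"person":{"name":"Jane","surname":"Doe"}}]}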
There is no built-in option for this format, so maybe the easiest way would be to manually format the response, something like:
declare @resp nvarchar(20) = '20'
SELECT '{"results":' +
    (SELECT * FROM emp FOR JSON PATH) +
    ', "response": ' + @resp + ' }'
FOR JSON does the harder part (formatting the table); you just need to wrap it.
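With the two sample rows from the question, that concatenation should come out roughly as:
{ "results": [{"name":"John","surname":"Doe"},{"name":"Jane","surname":"Doe"}], "response": 20 }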

Convert column into nested field in destination table on load job in Big Query

I am currently running a job to transfer data from one table to another via a query, but I can't seem to find a way to convert a column into a nested field containing the column as a child field. For example, I have a column customer_id: 3 and I would like to convert it to {"customer": {"id": 3}}. Below is a snippet of my job data.
query = 'select * FROM [' + BQ_DATASET_ID + '.' + Table_name + '] WHERE user="' + user + '"'
job_data = {
    "projectId": PROJECT_ID,
    "jobReference": {
        "projectId": PROJECT_ID,
        "job_id": str(uuid.uuid4())
    },
    "configuration": {
        "query": {
            "query": query,
            "priority": "INTERACTIVE",
            "allowLargeResults": True,
            "destinationTable": {
                "projectId": PROJECT_ID,
                "datasetId": user,
                "tableId": destinationTable
            },
            "writeDisposition": "WRITE_APPEND"
        }
    }
}
Unfortunately, if the "customer" RECORD does not exist in the input schema, it is not currently possible to generate that nested RECORD field with child fields through a query. We have features in the works that will allow schema manipulation like this via SQL, but I don't think it's possible to accomplish this today.
I think your best option today would be an export, transformation to the desired format, and re-import of the data to the desired destination table.
A simple solution is to run:
select customer_id as customer.id ...
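That alias trick appears to rely on BigQuery legacy SQL (note the question's bracketed [dataset.table] syntax). For reference, a minimal sketch of the same idea in standard SQL, where a STRUCT produces the nested field (the table path is hypothetical):

SELECT STRUCT(customer_id AS id) AS customer  -- becomes {"customer": {"id": 3}}
FROM `my_project.my_dataset.my_table`;       -- hypothetical table path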

DELETE FROM ... reporting syntax error at or near "."

I'm trying to delete just one row from my DB, but when I write the command I keep getting a syntax error. Could you tell me where the error is?
These are the commands I've tried:
DELETE FROM database_userprofile WHERE user.username = 'some';
ERROR: syntax error at or near "."
LINE 1: DELETE FROM database_userprofile WHERE user.username = 'some'...
DELETE FROM database_userprofile USING database_user WHERE user.username="some";
ERROR: syntax error at or near "."
LINE 1: ... database_userprofile USING database_user WHERE user.username=...
Hope you can help me
Your query doesn't make any sense.
DELETE FROM database_userprofile WHERE user.username = 'some';
                                       ^^^^
Where'd user come from? It isn't referenced in the query. Is it a column of database_userprofile? If so, you can't write user.username (unless it's a composite type, in which case you would have to write (user).username to tell the parser that; but I doubt it's a composite type).
The immediate cause is that user is a reserved word. You can't use that name without quoting it:
DELETE FROM database_userprofile WHERE "user".username = 'some';
... however, this query still makes no sense; it'll just give a different error:
regress=> DELETE FROM database_userprofile WHERE "user".username = 'some';
ERROR: missing FROM-clause entry for table "user"
LINE 1: DELETE FROM database_userprofile WHERE "user".username = 'so...
My wild guess is that you're trying to do a delete over a join. I'm assuming that you have tables like:
CREATE TABLE "user" (
id serial primary key,
username text not null,
-- blah blah
);
CREATE TABLE database_userprofile (
user_id integer references "user"(id),
-- blah blah
);
and you're trying to do delete with a condition across the other table.
If so, you can't just write user.username. You must use:
DELETE FROM database_userprofile
USING "user"
WHERE database_userprofile.user_id = "user".id
AND "user".username = 'fred';
You'll notice that I've double-quoted "user". That's because it's a keyword and shouldn't really be used for table names or other user-defined identifiers. Double-quoting it forces it to be interpreted as an identifier, not a keyword.
According to the documentation, the syntax for DELETE in PostgreSQL 9.1 is:
[ WITH [ RECURSIVE ] with_query [, ...] ]
DELETE FROM [ ONLY ] table [ * ] [ [ AS ] alias ]
[ USING using_list ]
[ WHERE condition | WHERE CURRENT OF cursor_name ]
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
So you need to specify the table name after the DELETE FROM command, not the database name.
You can delete data only if you are logged into the database.
You got
ERROR: syntax error at or near "."
because in the WHERE section you can only reference the target table or the tables in the using_list.
You may also get this error when copy-pasting a query from Eclipse to pgAdmin. Somehow, a strange symbol may be inserted. To avoid this, paste the query into a plain text editor first (like Notepad), then cut it from there and paste it into pgAdmin.
