Iteration over json array elements in PostgreSQL - arrays

I've got two JSON data row from the column in postgresql database and that looks like this.
{
"details":[{"to":"0:00:00","from":"00:00:12"}]
}
{
"details":[
{"to":"13:01:11","from":"13:00:12"},
{"to":"00:00:12","from":"13:02:11"}
]
}
I want to iterate over details and get only the "from" key values using a query in postgresql.
I want it like
from
00:00:12
13:00:12
13:02:11

Use jsonb_array_elements
select j->>'from' as "from" from t
cross join jsonb_array_elements(s->'details') as j;
Demo

Related

Snowflake select query to return data in json format

I am making a select call on a table and it always returns 1 row. I would like to get the data in json format.
{
"column_name1": "value1",
"column_name2": "value2",
}
Does snowflake query allows anything like this ?
object_construct is the way to go for this.
For example,
select object_construct(*) from t1;

Indexing and querying jsonb nested arrays using Postgres 9.4

I'm currently using postgres (version 9.4.4) to save entire json documents of projects in one table like this (simplified):
CREATE TABLE projects (
id numeric(19,0) NOT NULL,
project bson NOT NULL )
The project json is something like this (OVERLY-simplified):
{
"projectData": {
"guid":"project_guid",
"name":"project_name"
},
"types": [
{
"class":"window",
"provider":"glassland"
"elements":[
{
"name":"example_name",
"location":"2nd floor",
"guid":"1a",
},
{
"name":"example_name",
"location":"3rd floor",
"guid":"2a",
}
]
},
{
"class":"door",
"provider":"woodland"
"elements":[
{
"name":"example_name",
"location":"1st floor",
"guid":"3a",
},
{
"name":"example_name",
"location":"2nd floor",
"guid":"4a",
}
]
}
]
}
I've been reading documentation on operators ->, ->>, #>, #> and so on. I did some tests and successful selects. But I can't manage to index properly, specially nested arrays (types and elements).
Those are some example selects I would like to learn how to optimize (there are plenty like this):
select distinct types->'class' as class, types->'provider' as type
from projects, jsonb_array_elements(project#>'{types}') types;
select types->>'class' as class,
types->>'provider' as provider,
elems->>'name' as name,
elems->>'location' as location,
elems->>'guid' as guid,
from projects, jsonb_array_elements(project#>'{types}') types, jsonb_array_elements(types#>'{elements}') elems
where types->>'class' like 'some_text%' and elems->'guid' <> '""';
Also I have this index:
CREATE INDEX idx_gin ON projects USING GIN (project jsonb_ops);
Both of those selects work, but they don't use te #> operator or any operator that can use the GIN index. I can't create a index btree ( create index idx_btree on tcq.json_test using btree ((obra->>'types')); ) because the size of the value exceeds the limit (for the real json). Also I can't ( or I don't know how to ) create an index for, let's say, guids of elements ( create index idx_btree2 on tcq.json_test using btree((obra->>'types'->>'elements'->>'guid')); ). This produces a syntax error.
I been trying to translate queries to something using #> but things like this:
select count(*)
from projects, jsonb_array_elements(project#>'{types}') types
where types->>'class' = 'window';
select count(*)
from projects
where obra #> '{"types":[{"class":"window"}]}';
produce a different output.
Is there a way to properly index the nested arrays of that json? or to properly select taking advantage of the GIN index?

How to filter contents in linked list using select statement?

A class has a property which holds link of rid's.ie testcaseLink:[#20:0,#20:1].I have a problem in selecting only certain rids from the testcaseLink using select statement.
Looking for something like select testcaseLink.get(#20:0) from tableName.Is there is any method to filter the contents of collections in orientdb
select testCaseLink[#rid=#20:6 OR #rid=#20:8] from tableName where ColumnName='Value' Solved the problem

Convert column into nested field in destination table on load job in Big Query

I am currently running a job to transfer data from one table to another via a query. But I can't seem to find a way convert a column into a nested field containing the column as a child field. For example, I have a column customer_id: 3 and I would like to convert it to {"customer": {"id":3}}. Below is a snippet of my job data.
query='select * FROM ['+ BQ_DATASET_ID+'.'+Table_name+'] WHERE user="'+user+'"'
job_data={"projectId": PROJECT_ID,
'jobReference': {
'projectId': PROJECT_ID,
'job_id': str(uuid.uuid4())
},
'configuration': {
'query': {
'query': query,
'priority': 'INTERACTIVE',
'allowLargeResults': True,
"destinationTable":{
"projectId": PROJECT_ID,
"datasetId": user,
"tableId": destinationTable,
},
"writeDisposition": "WRITE_APPEND"
},
}
}
Unfortunately, if the "customer" RECORD does not exist in the input schema, it is not currently possible to generate that nested RECORD field with child fields through a query. We have features in the works that will allow schema manipulation like this via SQL, but I don't think it's possible to do accomplish this today.
I think your best option today would be an export, transformation to desired format, and re-import of the data to the desired destination table.
a simple solution is to run
select customer_id as customer.id ....

Parse json arrays using HIVE

I have many json arrays stored in a table (jt) that looks like this:
[{"ts":1403781896,"id":14,"log":"show"},{"ts":1403781896,"id":14,"log":"start"}]
[{"ts":1403781911,"id":14,"log":"press"},{"ts":1403781911,"id":14,"log":"press"}]
Each array is a record.
I would like to parse this table in order to get a new table (logs) with 3 fields: ts, id, log.
I tried to use the get_json_object method, but it seems that method is not compatible with json arrays because I only get null values.
This is the code I have tested:
CREATE TABLE logs AS
SELECT get_json_object(jt.value, '$.ts') AS ts,
get_json_object(jt.value, '$.id') AS id,
get_json_object(jt.value, '$.log') AS log
FROM jt;
I tried to use other functions but they seem really complicated.
Thank you! :)
Update!
I solved my issue by performing a regexp:
CREATE TABLE jt_reg AS
select regexp_replace(regexp_replace(value,'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') as valuereg from jt;
CREATE TABLE logs AS
SELECT get_json_object(jt_reg.valuereg, '$.ts') AS ts,
get_json_object(jt_reg.valuereg, '$.id') AS id,
get_json_object(jt_reg.valuereg, '$.log') AS log
FROM ams_json_reg;
I just ran into this problem, with the JSON array stored as a string in the hive table.
The solution is a bit hacky and ugly, but it works and doesn't require serdes or external UDFs
SELECT
get_json_object(single_json_table.single_json, '$.ts') AS ts,
get_json_object(single_json_table.single_json, '$.id') AS id,
get_json_object(single_json_table.single_json, '$.log') AS log
FROM ( SELECT explode (
split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),
'"}","', '"}",,,,"'), ',,,,')
) FROM src_table) single_json_table;
I broke the lines up so that it would be a little easier to read.
I'm using substr() to strip the first and last characters, removing [ and ] . I'm then using regex_replace to match the separator between records in the json array and adding or changing the separator to be something unique that can then be used easily with split() to turn the string into a hive array of json objects which can then be used with explode() as described in the previous solution.
Note, the separator regex used here ( "}"," ) wouldn't work with the original data set...the regex would have to be ( "},\{" ) and the replacement would then need to be "},,,,{" eg..
split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),
'"},\\{"', '"},,,,{"'), ',,,,')
Use explode() function
hive (default)> CREATE TABLE logs AS
> SELECT get_json_object(single_json_table.single_json, '$.ts') AS ts,
> get_json_object(single_json_table.single_json, '$.id') AS id,
> get_json_object(single_json_table.single_json, '$.log') AS log
> FROM
> (SELECT explode(json_array_col) as single_json FROM jt) single_json_table ;
Automatically selecting local only mode for query
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
hive (default)> select * from logs;
OK
ts id log
1403781896 14 show
1403781896 14 start
1403781911 14 press
1403781911 14 press
Time taken: 0.118 seconds, Fetched: 4 row(s)
hive (default)>
where json_array_col is column in jt which holds your array of jsons.
hive (default)> select json_array_col from jt;
json_array_col
["{"ts":1403781896,"id":14,"log":"show"}","{"ts":1403781896,"id":14,"log":"start"}"]
["{"ts":1403781911,"id":14,"log":"press"}","{"ts":1403781911,"id":14,"log":"press"}"]
because get_json_object doesn't support json array string, so you can concat to a json object, like this:
SELECT
get_json_object(concat(concat('{"root":', jt.value), '}'), '$.root')
FROM jt;

Resources