Compare multiple date fields in JSON and use them in where clause - arrays

So I have a text field in my Postgres 10.8 DB (json_array_elements not possible). It has a JSON structure like this:
{
"code_cd": "02",
"tax_cd": null,
"earliest_exit_date": [
{
"date": "2023-03-31",
"_destroy": ""
},
{
"date": "2021-11-01",
"_destroy": ""
},
{
"date": "2021-12-21",
"_destroy": ""
}
],
"enter_date": null,
"leave_date": null
}
earliest_exit_date can also be empty, like this:
{
"code_cd": "02",
"tax_cd": null,
"earliest_exit_date":[],
"enter_date": null,
"leave_date": null
}
Now I want to get back the earliest_exit_date where the date is after current_date and closest to current_date. For the example above, the output has to be: 2021-12-21
Does anyone know how to do this?

If your table has a unique value or an id column, you can use the query below:
Sample table and data structure: dbfiddle
select distinct
id,
min("date") filter (where "date" > current_date) over (partition by id)
from
test t
cross join jsonb_to_recordset(t.data::jsonb -> 'earliest_exit_date') as e("date" date)
order by id
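The filtering logic that query implements (keep only future dates, then take the smallest) can be sketched in plain Python. This is a minimal illustration of the same idea, using a fixed stand-in for current_date rather than the real clock:

```python
import json
from datetime import date

# Sample row mirroring the question's structure.
row = json.loads('''{
  "code_cd": "02",
  "earliest_exit_date": [
    {"date": "2023-03-31", "_destroy": ""},
    {"date": "2021-11-01", "_destroy": ""},
    {"date": "2021-12-21", "_destroy": ""}
  ]
}''')

today = date(2021, 11, 15)  # stand-in for current_date

# Same logic as: min("date") filter (where "date" > current_date)
future = [date.fromisoformat(e["date"])
          for e in row["earliest_exit_date"]
          if date.fromisoformat(e["date"]) > today]
closest = min(future) if future else None  # None mirrors SQL NULL for empty arrays
print(closest)  # 2021-12-21 with the stand-in date above
```

An empty earliest_exit_date array simply yields None, matching the NULL the SQL aggregate would return.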

Related

SQL Server table data to JSON Path result

I am looking for a solution to convert the table results to a JSON path.
I have a table with two columns, as below. Column 1 will always have normal values, but column 2 will have up to 15 values separated by ';' (semicolon).
ID Column1 Column2
--------------------------------------
1 T1 Re;BoRe;Va
I want to convert the above column data into the JSON format below:
{
"services":
[
{ "service": "T1"}
],
"additional_services":
[
{ "service": "Re" },
{ "service": "BoRe" },
{ "service": "Va" }
]
}
I have tried creating something like the below, but cannot get to the exact format that I am looking for:
SELECT
REPLACE((SELECT d.Column1 AS services, d.column2 AS additional_services
FROM Table1 w (nolock)
INNER JOIN Table2 d (nolock) ON w.Id = d.Id
WHERE ID = 1
FOR JSON PATH), '\/', '/')
Please let me know if this is something we can achieve using T-SQL
As I mention in the comments, I strongly recommend you fix and normalise your design. Don't store delimited data in your database; Re;BoRe;Va should be 3 rows, not 1 delimited one. That doesn't mean you can't achieve what you want with your denormalised data, just that your design is flawed and thus needs to be brought up.
One way to achieve what you're after is with some nested FOR JSON calls:
SELECT (SELECT V.Column1 AS service
FOR JSON PATH) AS services,
(SELECT SS.[value] AS service
FROM STRING_SPLIT(V.Column2,';') SS
FOR JSON PATH) AS additional_services
FROM (VALUES(1,'T1','Re;BoRe;Va'))V(ID,Column1,Column2)
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER;
This results in the following JSON:
{
"services": [
{
"service": "T1"
}
],
"additional_services": [
{
"service": "Re"
},
{
"service": "BoRe"
},
{
"service": "Va"
}
]
}
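For reference, the reshaping the nested FOR JSON calls perform can be sketched outside of T-SQL. This is a minimal Python sketch of the same split-and-wrap transformation (row_to_json is a hypothetical helper name, not part of any library):

```python
import json

def row_to_json(column1, column2):
    # Mirrors the T-SQL answer: column1 becomes "services",
    # the semicolon-delimited column2 becomes "additional_services".
    return {
        "services": [{"service": column1}],
        "additional_services": [{"service": s} for s in column2.split(";")],
    }

doc = row_to_json("T1", "Re;BoRe;Va")
print(json.dumps(doc, indent=2))
```

STRING_SPLIT in the T-SQL answer plays the role of `column2.split(";")` here.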

Query JSON Key:Value Pairs in AWS Athena

I have received a data set from a client that is loaded in AWS S3. The data contains unnamed JSON key:value pairs. This isn't my area of expertise, so I was looking for a little help.
The structure of JSON data that I've typically worked with in the past looks similar to this:
{ "name":"John", "age":30, "car":null }
The data that I have received from my client is formatted as such:
{
"answer_id": "cc006",
"answer": {
"101086": 1,
"101087": 2,
"101089": 2,
"101090": 7,
"101091": 5,
"101092": 3,
"101125": 2
}
}
This is survey data, where the key on the left is a numeric customer identifier, and the value on the right is their response to a survey question, i.e. customer "101125" answered the survey with a value of "2". I need to be able to query the JSON data using Athena such that my result set looks similar to:
Cross joining the unnested children against the parent node isn't an issue. What I can't figure out is how to select all of the keys from the "answer" array without specifying the actual key names. Similarly, I want to be able to select all of the values as well.
Is it possible to create a virtual table in Athena that would allow for these results, or do I need to convert the JSON to a format that looks more like the following?
{
"answer_id": "cc006",
"answer": [
{ "key": "101086", "value": 1 },
{ "key": "101087", "value": 2 },
{ "key": "101089", "value": 2 },
{ "key": "101090", "value": 7 },
{ "key": "101091", "value": 5 },
{ "key": "101092", "value": 3 },
{ "key": "101125", "value": 2 }
]
}
EDIT 6/4/2020
I was able to use the code that Theon provided below along with the following table structure:
CREATE EXTERNAL TABLE answer_example (
answer_id string,
answer string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://mybucket/'
That allowed me to use the following query to generate the results that I needed.
WITH Data AS(
SELECT
answer_id,
CAST(json_extract(answer, '$') AS MAP(VARCHAR, VARCHAR)) as answer
FROM
answer_example
)
SELECT
answer_id,
key,
element_at(answer, key) AS value
FROM
Data
CROSS JOIN UNNEST (map_keys(answer)) AS answer (key)
EDIT 6/5/2020
Taking additional advice from Theon's response below, the following DDL and Query simplify this quite a bit.
DDL:
CREATE EXTERNAL TABLE answer_example (
answer_id string,
answer map<string,string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://mybucket/'
Query:
SELECT
answer_id,
key,
element_at(answer, key) AS value
FROM
answer_example
CROSS JOIN UNNEST (map_keys(answer)) AS answer (key)
Cross joining with the keys of the answer property and then picking the corresponding value. Something like this:
WITH data AS (
SELECT
'cc006' AS answer_id,
MAP(
ARRAY['101086', '101087', '101089', '101090', '101091', '101092', '101125'],
ARRAY[1, 2, 2, 7, 5, 3, 2]
) AS answers
)
SELECT
answer_id,
key,
element_at(answers, key) AS value
FROM data
CROSS JOIN UNNEST (map_keys(answers)) AS answer (key)
You could probably do something with transform_keys to create rows of the key value pairs, but the SQL above does the trick.
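The CROSS JOIN UNNEST(map_keys(...)) pattern above is, conceptually, just iterating a map's keys and pairing each with its value. A minimal Python sketch of that unnesting, using a subset of the sample data:

```python
# Unnest a map into (answer_id, key, value) rows, as
# CROSS JOIN UNNEST(map_keys(answer)) does in the Athena query.
record = {
    "answer_id": "cc006",
    "answer": {"101086": 1, "101087": 2, "101125": 2},
}

rows = [(record["answer_id"], k, v) for k, v in record["answer"].items()]
for r in rows:
    print(r)
```

Each map entry becomes its own row, repeated alongside the parent answer_id.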

snowflake pivot attribute values into columns in array of objects

EDIT: I gave bad example data. Updated some details and switched out dummy data for sanitized, actual data.
Source system: Freshdesk via Stitch
Table Structure:
create or replace TABLE TICKETS (
CC_EMAILS VARIANT,
COMPANY VARIANT,
COMPANY_ID NUMBER(38,0),
CREATED_AT TIMESTAMP_TZ(9),
CUSTOM_FIELDS VARIANT,
DUE_BY TIMESTAMP_TZ(9),
FR_DUE_BY TIMESTAMP_TZ(9),
FR_ESCALATED BOOLEAN,
FWD_EMAILS VARIANT,
ID NUMBER(38,0) NOT NULL,
IS_ESCALATED BOOLEAN,
PRIORITY FLOAT,
REPLY_CC_EMAILS VARIANT,
REQUESTER VARIANT,
REQUESTER_ID NUMBER(38,0),
RESPONDER_ID NUMBER(38,0),
SOURCE FLOAT,
SPAM BOOLEAN,
STATS VARIANT,
STATUS FLOAT,
SUBJECT VARCHAR(16777216),
TAGS VARIANT,
TICKET_CC_EMAILS VARIANT,
TYPE VARCHAR(16777216),
UPDATED_AT TIMESTAMP_TZ(9),
_SDC_BATCHED_AT TIMESTAMP_TZ(9),
_SDC_EXTRACTED_AT TIMESTAMP_TZ(9),
_SDC_RECEIVED_AT TIMESTAMP_TZ(9),
_SDC_SEQUENCE NUMBER(38,0),
_SDC_TABLE_VERSION NUMBER(38,0),
EMAIL_CONFIG_ID NUMBER(38,0),
TO_EMAILS VARIANT,
PRODUCT_ID NUMBER(38,0),
GROUP_ID NUMBER(38,0),
ASSOCIATION_TYPE NUMBER(38,0),
ASSOCIATED_TICKETS_COUNT NUMBER(38,0),
DELETED BOOLEAN,
primary key (ID)
);
Note the variant field, custom_fields. It undergoes an unfortunate transformation between the API and Snowflake. The resulting field contains an array of 3 or more objects, each one a custom field. I do not have the ability to change the data format. Examples:
# values could be null
[
{
"name": "cf_request",
"value": "none"
},
{
"name": "cf_related_with",
"value": "none"
},
{
"name": "cf_question",
"value": "none"
}
]
# or values could have a combination of null and non-null values
[
{
"name": "cf_request",
"value": "none"
},
{
"name": "cf_related_with",
"value": "none"
},
{
"name": "cf_question",
"value": "concern"
}
]
# or they could all have non-null values
[
{
"name": "cf_request",
"value": "issue with timer"
},
{
"name": "cf_related_with",
"value": "timer stopped"
},
{
"name": "cf_question",
"value": "technical problem"
}
]
I would essentially like to pivot these into fields in a select query where the name attribute's value becomes a column header. Making the output similar to the following:
+----+------------------+-----------------+-------------------+-----------------------------+
| id | cf_request | cf_related_with | cf_question | all_other_fields |
+----+------------------+-----------------+-------------------+-----------------------------+
| 5 | issue with timer | timer stopped | technical problem | more data about this ticket |
| 6 | hq | laptop issues | some value | more data |
| 7 | a thing | about a thing | about something | more data |
+----+------------------+-----------------+-------------------+-----------------------------+
Is there a function that searches the values of array objects and returns objects with qualifying values? Something like:
select
id,
get_object_where(name = 'category', value) as category,
get_object_where(name = 'subcategory', value) as category,
get_object_where(name = 'subsubcategory', value) as category
from my_data_table
Unfortunately, PIVOT requires an aggregate function. I tried using min and max, but only got null values back. Something similar to this approach would be great if there is another syntax that doesn't require aggregation.
with arr as (
select
id,
cs.value:name col_name,
cs.value:value col_value
from my_data_table,
lateral flatten(input => custom_fields) cs
)
select
*
from arr
pivot(col_name for col_value in ('category', 'subcategory', 'subsubcategory'))
as p (id, category, subcategory, subsubcategory);
It is possible to use the following approach, but it is flawed in that any time a new custom field is added I have to add cases to account for new positions within the array.
select
id,
case
when custom_fields[0]:name = 'cf_request' then custom_fields[0]:value
when custom_fields[1]:name = 'cf_request' then custom_fields[1]:value
when custom_fields[2]:name = 'cf_request' then custom_fields[2]:value
when custom_fields[3]:name = 'cf_request' then custom_fields[3]:value
else null
end cf_request,
case
when custom_fields[0]:name = 'cf_related_with' then custom_fields[0]:value
when custom_fields[1]:name = 'cf_related_with' then custom_fields[1]:value
when custom_fields[2]:name = 'cf_related_with' then custom_fields[2]:value
when custom_fields[3]:name = 'cf_related_with' then custom_fields[3]:value
else null
end cf_related_with,
case
when custom_fields[0]:name = 'cf_question' then custom_fields[0]:value
when custom_fields[1]:name = 'cf_question' then custom_fields[1]:value
when custom_fields[2]:name = 'cf_question' then custom_fields[2]:value
when custom_fields[3]:name = 'cf_question' then custom_fields[3]:value
else null
end cf_question,
created_at
from my_db.my_schema.tickets;
I think you almost had it. You just need to wrap col_value in max() or min() (and pivot on col_name). As you stated, PIVOT needs an aggregate function, and max() or min() works here, since it aggregates over the name/value pairs that you have. If you had two subcategory values, for example, it would pick the min/max value. From your example, that doesn't appear to be an issue, so it will always choose the value you want. I was able to replicate your scenario with this query:
WITH x AS (
SELECT parse_json('[{"name": "category","value": "Bikes"},{"name": "subcategory","value": "Mountain Bikes"},{"name": "subsubcategory","value": "hardtail bikes"}]')::VARIANT as field_var
),
arr as (
select
seq,
cs.value:name::varchar col_name,
cs.value:value::varchar col_value
from x,
lateral flatten(input => x.field_var) cs
)
select
*
from arr
pivot(max(col_value) for col_name in ('category','subcategory','subsubcategory')) as p (seq, category, subcategory, subsubcategory);
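Outside of Snowflake, the FLATTEN-then-PIVOT reshaping amounts to scanning the name/value array once per target column. A minimal Python sketch of that pivot, using the question's sample data (the column list `wanted` is illustrative):

```python
# Pivot an array of {"name": ..., "value": ...} objects into columns,
# the same reshaping FLATTEN + PIVOT(max(col_value) for col_name ...) performs.
custom_fields = [
    {"name": "cf_request", "value": "issue with timer"},
    {"name": "cf_related_with", "value": "timer stopped"},
    {"name": "cf_question", "value": "technical problem"},
]

wanted = ("cf_request", "cf_related_with", "cf_question")
pivoted = {col: next((f["value"] for f in custom_fields if f["name"] == col), None)
           for col in wanted}
print(pivoted)
```

A field missing from the array comes out as None, matching the NULL the SQL pivot would produce.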

update value in list Postgres jsonb

I am trying to update this JSON:
[{"id": "1", "name": "myconf", "icons": "small", "theme": "light", "textsize": "large"},
{"id": 2, "name": "myconf2", "theme": "dark"}, {"name": "firstconf", "theme": "dark", "textsize": "large"},
{"id": 3, "name": "firstconxsf", "theme": "dassrk", "textsize": "lassrge"}]
and this is the table containing that json column :
CREATE TABLE USER_CONFIGURATIONS ( ID BIGSERIAL PRIMARY KEY, DATA JSONB );
Adding a new field is easy; I am using:
UPDATE USER_CONFIGURATIONS
SET DATA = DATA || '{"name":"firstconxsf", "theme":"dassrk", "textsize":"lassrge"}'
WHERE id = 9;
But how do I update a single element where id = 1 or 2?
Step-by-step demo: db<>fiddle
UPDATE users -- 4
SET data = s.updated
FROM (
SELECT
jsonb_agg( -- 3
CASE -- 2
WHEN ((elem ->> 'id')::int IN (1,2)) THEN
elem || '{"name":"abc", "icon":"HUGE"}'
ELSE elem
END
) AS updated
FROM
users,
jsonb_array_elements(data) elem -- 1
) s;
1. Expand the array elements into one row each.
2. If an element has a relevant id, update it with the || operator; if not, keep the original one.
3. Reaggregate the array after updating the JSON data.
4. Execute the UPDATE statement.
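The per-element CASE in that query is a conditional merge: patch matching elements, pass the rest through. A minimal Python sketch of the same step, using a subset of the question's sample data (note the mixed string/integer ids, which the sample JSON actually contains):

```python
import json

data = json.loads('[{"id": "1", "name": "myconf"},'
                  ' {"id": 2, "name": "myconf2"},'
                  ' {"name": "firstconf"}]')

# Same idea as the CASE branch: merge the patch into elements whose
# id is 1 or 2, keep every other element unchanged.
patch = {"name": "abc", "icon": "HUGE"}
updated = [({**e, **patch} if int(e.get("id", -1)) in (1, 2) else e)
           for e in data]
print(json.dumps(updated))
```

`int(e.get("id", -1))` normalises the mixed "1"/2 ids and skips elements with no id at all, mirroring what `(elem ->> 'id')::int IN (1,2)` does in the SQL.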

postgresql json array query

I tried to query my json array using the example here: How do I query using fields inside the new PostgreSQL JSON datatype?
They use the example:
SELECT *
FROM json_array_elements(
'[{"name": "Toby", "occupation": "Software Engineer"},
{"name": "Zaphod", "occupation": "Galactic President"} ]'
) AS elem
WHERE elem->>'name' = 'Toby';
But my Json array looks more like this (if using the example):
{
"people": [{
"name": "Toby",
"occupation": "Software Engineer"
},
{
"name": "Zaphod",
"occupation": "Galactic President"
}
]
}
But I get an error: ERROR: cannot call json_array_elements on a non-array
Is my JSON "array" not really an array? I have to use this JSON string because it's contained in a database, so I would have to tell them to fix it if it's not an array.
Or, is there another way to query it?
I read the documentation, but nothing worked; I kept getting errors.
The JSON array is under the key people, so use my_json->'people' in the function:
with my_table(my_json) as (
values(
'{
"people": [
{
"name": "Toby",
"occupation": "Software Engineer"
},
{
"name": "Zaphod",
"occupation": "Galactic President"
}
]
}'::json)
)
select t.*
from my_table t
cross join json_array_elements(my_json->'people') elem
where elem->>'name' = 'Toby';
The function json_array_elements() unnests the json array and generates all its elements as rows:
select elem->>'name' as name, elem->>'occupation' as occupation
from my_table
cross join json_array_elements(my_json->'people') elem
name | occupation
--------+--------------------
Toby | Software Engineer
Zaphod | Galactic President
(2 rows)
If you are interested in Toby's occupation:
select elem->>'occupation' as occupation
from my_table
cross join json_array_elements(my_json->'people') elem
where elem->>'name' = 'Toby'
occupation
-------------------
Software Engineer
(1 row)
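The navigation in that query (step into the people key, unnest, filter by name, project a field) maps directly onto plain data access. A minimal Python sketch of the same lookup on the sample document:

```python
import json

doc = json.loads('''{
  "people": [
    {"name": "Toby", "occupation": "Software Engineer"},
    {"name": "Zaphod", "occupation": "Galactic President"}
  ]
}''')

# my_json -> 'people' is doc["people"] here; the WHERE elem->>'name' = 'Toby'
# filter and the elem->>'occupation' projection follow.
occupation = next(p["occupation"] for p in doc["people"] if p["name"] == "Toby")
print(occupation)  # Software Engineer
```

The original error came from calling json_array_elements on the whole object; descending to the people key first is exactly the `doc["people"]` step above.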
