How to extract array from JSON String in BigQuery

Hi, I'm working on a table that looks like the one below:
----------------------------------------------------------------------------------------------------------------------------------------------
| user_id | j_games_information
----------------------------------------------------------------------------------------------------------------------------------------------
| hsbdgcy76s |{"data": [{"game_id": "acb", "rewards":[{"no":3,"items":"oils"}]},{"game_id": "bsm", "rewards":[{"no":4,"items":"bombs"}]}]}
----------------------------------------------------------------------------------------------------------------------------------------------
| kslcn6vg76 |{"data": [{"game_id": "ohf", "rewards":[{"no":6,"items":"oils"}]},{"game_id": "dfg", "rewards":[{"no":7,"items":"bombs"}]}]}
----------------------------------------------------------------------------------------------------------------------------------------------
My expected output will be:
-----------------------------------
| user_id | game_ids |
-----------------------------------
| hsbdgcy76s | acb |
-----------------------------------
| hsbdgcy76s | bsm |
-----------------------------------
| kslcn6vg76 | ohf |
-----------------------------------
| kslcn6vg76 | dfg |
-----------------------------------
I tried the following code but this query returned no results. Can anyone help me with this? Thank you!
select user_id, JSON_EXRACT_SCALAR(json_array,"$.game_id") AS game_ids
from table, unnest(json_extract_array(j_games_information,"$.data")) AS json_array

Try replacing json_file with j_games_information:
with mytable as (
select 'hsbdgcy76s' as user_id, '{"data": [{"game_id": "acb", "rewards":[{"no":3,"items":"oils"}]},{"game_id": "bsm", "rewards":[{"no":4,"items":"bombs"}]}]}' as j_games_information union all
select 'kslcn6vg76' as user_id, '{"data": [{"game_id": "ohf", "rewards":[{"no":6,"items":"oils"}]},{"game_id": "dfg", "rewards":[{"no":7,"items":"bombs"}]}]}' as j_games_information
)
select user_id, JSON_EXTRACT_SCALAR(json_array,"$.game_id") AS game_ids
from mytable, unnest(json_extract_array(j_games_information,"$.data")) AS json_array
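To make it clearer what the query above does row by row, here is a minimal Python sketch of the same idea: parse the JSON string, walk the "data" array, and emit one (user_id, game_id) pair per element. It is only an illustration of the logic, not BigQuery code; the function name explode_game_ids is made up for this example.

```python
import json

# Sample rows copied from the question: (user_id, j_games_information)
rows = [
    ("hsbdgcy76s", '{"data": [{"game_id": "acb", "rewards":[{"no":3,"items":"oils"}]},'
                   '{"game_id": "bsm", "rewards":[{"no":4,"items":"bombs"}]}]}'),
    ("kslcn6vg76", '{"data": [{"game_id": "ohf", "rewards":[{"no":6,"items":"oils"}]},'
                   '{"game_id": "dfg", "rewards":[{"no":7,"items":"bombs"}]}]}'),
]


def explode_game_ids(rows):
    """Return one (user_id, game_id) pair per element of the "data" array."""
    result = []
    for user_id, payload in rows:
        # Roughly json_extract_array(payload, "$.data") followed by unnest
        for element in json.loads(payload).get("data", []):
            # Roughly JSON_EXTRACT_SCALAR(element, "$.game_id")
            result.append((user_id, element.get("game_id")))
    return result


game_ids = explode_game_ids(rows)
for pair in game_ids:
    print(pair)
```

Each input row produces as many output rows as its "data" array has elements, which is the same fan-out the unnest in the query performs.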

Related

How to identify valid records based on column values in snowflake

I have a table as below
I want output like below
This means I have a few predefined pairs, for example:
if one employee is coming from both HR_INTERNAL and HR_EXTERNAL, take only that record which is from HR_INTERNAL
if one employee is coming from both SALES_INTERNAL and SALES_EXTERNAL, take only that record which is from SALES_INTERNAL
etc.
Is there a way to achieve this?
I used ROW_NUMBER to rank
ROW_NUMBER() OVER(PARTITION BY "EMPID" ORDER BY SOURCESYSTEM ASC) AS RANK_GID
I just put them in a table like this:
create or replace table predefined_pairs ( pairs ARRAY );
insert into predefined_pairs select [ 'HR_INTERNAL', 'HR_EXTERNAL' ] ;
insert into predefined_pairs select [ 'SALES_INTERNAL', 'SALES_EXTERNAL' ] ;
Then I use the following query to produce the output you wanted:
select s.sourcesystem, s.empid,
CASE WHEN COUNT(1) OVER(PARTITION BY EMPID) = 1 THEN 'ValidRecord'
WHEN p.pairs[0] IS NULL THEN 'ValidRecord'
WHEN p.pairs[0] = s.sourcesystem THEN 'ValidRecord'
ELSE 'InvalidRecord'
END RecordValidity
from source s
left join predefined_pairs p on array_contains( s.sourcesystem::VARIANT, p.pairs ) ;
+-------------------+--------+----------------+
| SOURCESYSTEM | EMPID | RECORDVALIDITY |
+-------------------+--------+----------------+
| HR_INTERNAL | EMP001 | ValidRecord |
| HR_EXTERNAL | EMP001 | InvalidRecord |
| SALES_INTERNAL | EMP002 | ValidRecord |
| SALES_EXTERNAL | EMP002 | InvalidRecord |
| HR_EXTERNAL | EMP004 | ValidRecord |
| SALES_INTERNAL | EMP005 | ValidRecord |
| PURCHASE_INTERNAL | EMP003 | ValidRecord |
+-------------------+--------+----------------+
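For readers who want to trace the CASE expression by hand, here is a small Python sketch that mirrors it as written. The source rows are taken from the output table above (the original table was not shown); the names PREDEFINED_PAIRS and classify are made up for this illustration.

```python
from collections import Counter

# Preferred system first, as in the predefined_pairs table
PREDEFINED_PAIRS = [
    ("HR_INTERNAL", "HR_EXTERNAL"),
    ("SALES_INTERNAL", "SALES_EXTERNAL"),
]

# (sourcesystem, empid) rows, reconstructed from the output shown above
source = [
    ("HR_INTERNAL", "EMP001"),
    ("HR_EXTERNAL", "EMP001"),
    ("SALES_INTERNAL", "EMP002"),
    ("SALES_EXTERNAL", "EMP002"),
    ("HR_EXTERNAL", "EMP004"),
    ("SALES_INTERNAL", "EMP005"),
    ("PURCHASE_INTERNAL", "EMP003"),
]


def classify(source, pairs):
    """Mirror the CASE expression: a row is valid if it is the employee's only
    row, belongs to no pair, or is the first (preferred) member of its pair."""
    rows_per_emp = Counter(empid for _, empid in source)
    out = []
    for system, empid in source:
        # Equivalent of the LEFT JOIN on array_contains
        pair = next((p for p in pairs if system in p), None)
        if rows_per_emp[empid] == 1 or pair is None or pair[0] == system:
            out.append((system, empid, "ValidRecord"))
        else:
            out.append((system, empid, "InvalidRecord"))
    return out


classified = classify(source, PREDEFINED_PAIRS)
for row in classified:
    print(row)
```

Note that this logic, like the query, counts all rows per employee rather than rows per pair, so an employee with only HR_EXTERNAL and SALES_INTERNAL would still have the HR_EXTERNAL row marked invalid.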

Multiple NOT LIKE in sql server

I have a table like
+--------+-------+
| id | name |
+--------+-------+
| 302345 | Name1 |
| 522345 | Name2 |
| 1X2345 | Name3 |
| 2X2345 | Name4 |
| 1X8765 | Name5 |
| 2X2123 | Name6 |
| 502345 | Name7 |
| M62345 | Name8 |
+--------+-------+
I want to select the ids that don't start with the prefixes 30, 1X, 2X. I have more than 20 such prefixes to exclude. I was using NOT LIKE 20 times and am looking to shorten it. From some Stack Overflow questions, I found that we can create a table, store all these values in a column, and use that. But in my case I don't have permission to create a table. Hence I tried the code below, but it gives strange results.
SELECT *
FROM mytable
INNER JOIN (
SELECT '30%' Prefix
UNION ALL
SELECT '50%'
UNION ALL
SELECT '1X%'
) list ON id NOT LIKE prefix
Please suggest an alternative.
Expected output
+--------+-------+
| id | name |
+--------+-------+
| 522345 | Name2 |
+--------+-------+
| 502345 | Name7 |
+--------+-------+
| M62345 | Name8 |
+--------+-------+
You could use a NOT EXISTS with a VALUES construct for all your prefixes.
Something like this:
SELECT *
FROM mytable mt
WHERE NOT EXISTS (SELECT 1
FROM (VALUES('30%'),('50%'),('1X%'),('2X%')/*,...*/)V(expr)
WHERE mt.id LIKE V.expr);
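The same NOT EXISTS pattern can be tried without any database permissions using an in-memory SQLite database from Python. This is a sketch under a few assumptions: it runs on SQLite rather than SQL Server, it puts the prefixes in a CTE (named prefixes here, a made-up name) instead of a derived VALUES table, and it uses only the prefixes stated in the question text (30, 1X, 2X) so that the result matches the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("302345", "Name1"), ("522345", "Name2"), ("1X2345", "Name3"),
        ("2X2345", "Name4"), ("1X8765", "Name5"), ("2X2123", "Name6"),
        ("502345", "Name7"), ("M62345", "Name8"),
    ],
)

# Keep a row only if no prefix pattern matches its id
query = """
WITH prefixes(expr) AS (VALUES ('30%'), ('1X%'), ('2X%'))
SELECT mt.id, mt.name
FROM mytable AS mt
WHERE NOT EXISTS (SELECT 1 FROM prefixes AS p WHERE mt.id LIKE p.expr)
ORDER BY mt.name
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

Unlike the INNER JOIN ... NOT LIKE attempt, which returns a row once for every prefix it fails to match, NOT EXISTS returns each surviving row exactly once.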

Extract into multiple columns from JSON with PostgreSQL

I have a column item_id that contains data in JSON (like?) structure.
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
| id | item_id |
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
| 56711 | {itemID":["0530#2#1974","0538\/2#2#1974","0538\/3#2#1974","0538\/18#2#1974","0539#2#1974"]}" |
| 56712 | {itemID":["0138528#2#4221","0138529#2#4221","0138530#2#4221","0138539#2#4221","0118623\/2#2#4220"]}" |
| 56721 | {itemID":["2704\/1#1#1356"]}" |
| 56722 | {itemID":["0825\/2#2#3349","0840#2#3349","0844\/10#2#3349","0844\/11#2#3349","0844\/13#2#3349","0844\/14#2#3349","0844\/15#2#3349"]}" |
| 57638 | {itemID":["0161\/1#2#3364","0162\/1#2#3364","0163\/2#2#3364"]}" |
| 57638 | {itemID":["109#1#3364","110\/1#1#3364"]}" |
+----------+---------------------------------------------------------------------------------------------------------------------------------------+
I need the last four digits before every comma (if there is one) and the last four digits of the final element, made distinct and separated into individual columns.
The distinct should happen across id as well, so only one result row with id: 57638 is permitted.
Here is a fiddle with a code draft that is not giving the right answer.
The desired result should look like this:
+----------+-----------+-----------+
| id | item_id_1 | item_id_2 |
+----------+-----------+-----------+
| 56711 | 1974 | |
| 56712 | 4220 | 4221 |
| 56721 | 1356 | |
| 56722 | 3349 | |
| 57638 | 3364 | 3365 |
+----------+-----------+-----------+
There can be quite a lot of 'item_id_%' columns in the results.
with the_table (id, item_id) as (
values
(56711, '{"itemID":["0530#2#1974","0538\/2#2#1974","0538\/3#2#1974","0538\/18#2#1974","0539#2#1974"]}'),
(56712, '{"itemID":["0138528#2#4221","0138529#2#4221","0138530#2#4221","0138539#2#4221","0118623\/2#2#4220"]}'),
(56721, '{"itemID":["2704\/1#1#1356"]}'),
(56722, '{"itemID":["0825\/2#2#3349","0840#2#3349","0844\/10#2#3349","0844\/11#2#3349","0844\/13#2#3349","0844\/14#2#3349","0844\/15#2#3349"]}'),
(57638, '{"itemID":["0161\/1#2#3364","0162\/1#2#3364","0163\/2#2#3364"]}'),
(57638, '{"itemID":["109#1#3365","110\/1#1#3365"]}')
)
select id
,(array_agg(itemid)) [1] itemid_1
,(array_agg(itemid)) [2] itemid_2
from (
select distinct id
,split_part(replace(json_array_elements(item_id::json -> 'itemID')::text, '"', ''), '#', 3)::int itemid
from the_table
order by 1
,2
) t
group by id
You can unnest the json array, get the last 4 characters of each element as a number, then do conditional aggregation:
select
id,
max(val) filter(where rn = 1) item_id_1,
max(val) filter(where rn = 2) item_id_2
from (
select
id,
right(val, 4)::int val,
dense_rank() over(partition by id order by right(val, 4)::int) rn
from mytable t
cross join lateral jsonb_array_elements_text(t.item_id -> 'itemID') as x(val)
) t
group by id
You can add more conditional max()s to the outer query to handle more possible values.
Demo on DB Fiddle:
id | item_id_1 | item_id_2
----: | --------: | --------:
56711 | 1974 | null
56712 | 4220 | 4221
56721 | 1356 | null
56722 | 3349 | null
57638 | 3364 | 3365
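As a cross-check of the expected result, the "distinct last four digits per id" step can be sketched in plain Python. This is not PostgreSQL code, just the same logic; the sample rows are the ones from the first answer's VALUES list, and the function name distinct_suffixes is made up for this illustration.

```python
import json

# (id, item_id) rows from the VALUES list above; raw strings keep the JSON "\/" escapes
rows = [
    (56711, r'{"itemID":["0530#2#1974","0538\/2#2#1974","0538\/3#2#1974","0538\/18#2#1974","0539#2#1974"]}'),
    (56712, r'{"itemID":["0138528#2#4221","0138529#2#4221","0138530#2#4221","0138539#2#4221","0118623\/2#2#4220"]}'),
    (56721, r'{"itemID":["2704\/1#1#1356"]}'),
    (56722, r'{"itemID":["0825\/2#2#3349","0840#2#3349","0844\/10#2#3349","0844\/11#2#3349","0844\/13#2#3349","0844\/14#2#3349","0844\/15#2#3349"]}'),
    (57638, r'{"itemID":["0161\/1#2#3364","0162\/1#2#3364","0163\/2#2#3364"]}'),
    (57638, r'{"itemID":["109#1#3365","110\/1#1#3365"]}'),
]


def distinct_suffixes(rows):
    """Map each id to the sorted distinct last-four-digit values of its items."""
    seen = {}
    for row_id, payload in rows:
        for item in json.loads(payload)["itemID"]:
            # Same idea as right(val, 4)::int in the query
            seen.setdefault(row_id, set()).add(int(item[-4:]))
    # Sorting plays the role of dense_rank(): position 0 is item_id_1, and so on
    return {row_id: sorted(values) for row_id, values in seen.items()}


pivoted = distinct_suffixes(rows)
for row_id, values in pivoted.items():
    print(row_id, values)
```

Because each id maps to a list, this form handles any number of distinct values, whereas the SQL version needs one conditional max() per output column.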

Database Joins with single record with multiple columns

Industry Table
+----+--------------+
| ID | IndustryName |
+----+--------------+
| 1 | Auto |
| 2 | Pets |
+----+--------------+
Images Table
+----+------------+--------------+
| ID | IndustryId | ImageURL |
+----+------------+--------------+
| 1 | 1 | URL1 |
| 2 | 1 | URL2 |
| 3 | 1 | URL3 |
+----+------------+--------------+
I want the result of the SELECT query to look as follows:
+----+------------+------------------+
| ID | IndustryId |ImageURLContains |
+----+------------+------------------+
| 1 | Auto |(URL1,URL2,URL3) |
+----+------------+------------------+
And we can have n number of URLS against one IndustryId.
Try it with the STUFF function:
CREATE TABLE Images_Table(ID INT ,IndustryId INT , IMageURL NVARCHAR(50))
INSERT INTO Images_Table
SELECT 1,1,'URL1'UNION ALL
SELECT 2,1,'URL2'UNION ALL
SELECT 3,1,'URL3'
SELECT
ID IndstryId,IndustryName,
STUFF(
(
SELECT ' ,'+ISNULL(IMageURL,'')
FROM Images_Table t2
WHERE t2.IndustryId=t1.ID
ORDER BY t2.Id
FOR XML PATH('')
),1,2,'') ImageURL
FROM (SELECT ID,ISNULL(IndustryName,'') IndustryName FROM Industry_Table) t1
This is the output I get:
IndstryId   IndustryName   ImageURL
----------- -------------- ------------------
1           Auto           URL1 ,URL2 ,URL3
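The grouping that STUFF with FOR XML PATH performs can also be pictured as a plain "collect, then join" step. The Python sketch below illustrates that idea on the sample tables from the question; it is not T-SQL, it joins with a bare comma rather than ' ,', and the function name urls_per_industry is made up for this example.

```python
# Sample data from the question
industries = [(1, "Auto"), (2, "Pets")]                      # (ID, IndustryName)
images = [(1, 1, "URL1"), (2, 1, "URL2"), (3, 1, "URL3")]    # (ID, IndustryId, ImageURL)


def urls_per_industry(industries, images):
    """Return (industry_id, industry_name, comma-separated URLs) per industry."""
    result = []
    for industry_id, name in industries:
        # sorted(images) orders by image ID, like ORDER BY t2.Id in the subquery
        urls = [url for _, ind_id, url in sorted(images) if ind_id == industry_id]
        result.append((industry_id, name, ",".join(urls)))
    return result


combined = urls_per_industry(industries, images)
for row in combined:
    print(row)
```

In this sketch an industry with no images (Pets) gets an empty string, and any number of URLs per industry is handled without changes.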

T-SQL Merging data

I've imported data from an XML file by using SSIS to SQL Server.
The result I got in the database is similar to this:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | NULL | NULL | 100 |
| NULL | SLX | NULL | NULL |
| NULL | NULL | Blah | NULL |
| NULL | NULL | NULL | 100 |
+-------+---------+---------+-------+
My desired result would be:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | SLX | Blah | 100 |
+-------+---------+---------+-------+
Is there a pretty solution to solve this in T-SQL?
I've already tried it with a SELECT MAX(ID) and then a GROUP BY ID, but I'm still stuck with the NULL values. I've also tried it with MERGE, but that failed as well.
Could someone give me a direction where to search further?
You can select MAX on all columns:
SELECT MAX(ID), MAX(NAME), MAX(BRAND), MAX(PRICE)
FROM [TABLE]
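This works because aggregate functions such as MAX skip NULL values. A quick way to see it is an in-memory SQLite database from Python; this is a sketch on SQLite rather than SQL Server, and the table name imported is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE imported (ID INTEGER, Name TEXT, Brand TEXT, Price INTEGER)")
# The four sparse rows from the question; None becomes NULL
conn.executemany(
    "INSERT INTO imported VALUES (?, ?, ?, ?)",
    [
        (2, None, None, 100),
        (None, "SLX", None, None),
        (None, None, "Blah", None),
        (None, None, None, 100),
    ],
)

# MAX ignores NULLs, so each column collapses to its single non-NULL value
merged = conn.execute(
    "SELECT MAX(ID), MAX(Name), MAX(Brand), MAX(Price) FROM imported"
).fetchone()
print(merged)  # (2, 'SLX', 'Blah', 100)
```

Keep in mind that without a GROUP BY this collapses the whole table into one row, so it only fits when all the sparse rows belong to a single logical record.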
