Aggregate query result - database

I have two tables; the second table contains a foreign key that references the first table's primary key.
First table "Houses" (id, title, city, country); second table "Images" (id, name, house_id).
I am implementing the following query:
SELECT * FROM houses INNER JOIN images ON houses.id = images.house_id;
The result is an array of rows with repeated data, identical except for the name field:
[
{
id:1,
title: "house1",
city:"c1",
country:"country2",
name:"image1",
house_id: 2
},
{
id:2,
title: "house1",
city:"c1",
country:"country2",
name:"image2",
house_id: 2
},
{
id:3,
title: "house1",
city:"c1",
country:"country2",
name:"image3",
house_id: 2
},
]
How could I adjust the query to get a result like the following:
[
{
id:2,
title: "house1",
city:"c1",
country:"country2",
imagesNames: ["image1","image2","image3"],
house_id: 2
}
]
Is it doable using knex? I am using a PostgreSQL database.

GROUP BY all of the non-aggregated columns, and aggregate the names. Like:
SELECT h.id, h.title, h.city, h.country
     , array_agg(i.name) AS images_names
     , i.house_id  -- redundant: always equals h.id
FROM houses h
JOIN images i ON h.id = i.house_id
GROUP BY h.id, h.title, h.city, h.country, i.house_id;
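On the knex part of the question: knex has no built-in helper for array_agg, so the usual approach is to pass the aggregate through knex.raw (roughly .groupBy('houses.id', ...) combined with knex.raw('array_agg(images.name) AS images_names')). The grouping pattern itself can be sketched with Python's sqlite3 module, where SQLite's group_concat() stands in for PostgreSQL's array_agg(); table names and sample data follow the question:

```python
# Sketch of the GROUP BY + aggregation pattern on an in-memory SQLite
# database. group_concat() plays the role of PostgreSQL's array_agg().
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE houses (id INTEGER PRIMARY KEY, title TEXT, city TEXT, country TEXT);
    CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT, house_id INTEGER REFERENCES houses(id));
    INSERT INTO houses VALUES (2, 'house1', 'c1', 'country2');
    INSERT INTO images VALUES (1, 'image1', 2), (2, 'image2', 2), (3, 'image3', 2);
""")

# One row per house, with the image names collapsed into a single value.
row = conn.execute("""
    SELECT h.id, h.title, h.city, h.country,
           group_concat(i.name) AS images_names
    FROM houses h
    JOIN images i ON h.id = i.house_id
    GROUP BY h.id, h.title, h.city, h.country
""").fetchone()

print(row)
```

Note that group_concat() returns a delimited string rather than a real array; in PostgreSQL, array_agg() gives you the array type directly.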


SQL Searching on columns generated in output

I have two tables and am writing a query that takes data from both of them, as below:
select distinct on (e.pol) e.pol, ei.bene, ei.status,
(value ->> 'custName1') as name1,
(value ->> 'custName2') as name2
from table1 e, table2 ei
cross join lateral jsonb_array_elements(ei.name_json) as value
where e.pol = ei.pol and value ->> 'custName1' like '%Tes%'
order by e.pol, ei.bene
In this query I am trying to search from a json array "name_json" as follows:
[
{
"custName1": "Tesla",
"custName2": ""
},
{
"custName1": "Gerber",
"custName2": "N"
}
]
I am displaying only the rows that have a distinct policy, together with their respective custName1, and I want to search only on the fields that appear in my output. How do I restrict the search to the columns displayed in the output?
Help is appreciated.
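The filter the question describes, stripped of the SQL, is: keep a row only when some element of the name_json array has a custName1 containing the search term. (In PostgreSQL this corresponds to testing existence over jsonb_array_elements rather than expanding every element into the output.) A minimal sketch in plain Python, using the sample array from the question:

```python
# Keep only the array elements whose custName1 contains the search term,
# mirroring the value ->> 'custName1' LIKE '%Tes%' condition.
import json

name_json = json.loads("""
[
  {"custName1": "Tesla",  "custName2": ""},
  {"custName1": "Gerber", "custName2": "N"}
]
""")

term = "Tes"
matches = [e["custName1"] for e in name_json if term in e["custName1"]]
print(matches)  # ['Tesla']
```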

SQL Server table data to JSON Path result

I am looking for a solution to convert the table results to a JSON path.
I have a table with two columns as below. Column1 will always hold a single value, but Column2 can hold up to 15 values separated by ';' (semicolon).
ID Column1 Column2
--------------------------------------
1 T1 Re;BoRe;Va
I want to convert the above column data in to below JSON Format
{
"services":
[
{ "service": "T1"}
],
"additional_services":
[
{ "service": "Re" },
{ "service": "BoRe" },
{ "service": "Va" }
]
}
I have tried creating something like the below, but cannot get to the exact format that I am looking for
SELECT
REPLACE((SELECT d.Column1 AS services, d.column2 AS additional_services
FROM Table1 w (nolock)
INNER JOIN Table2 d (nolock) ON w.Id = d.Id
WHERE ID = 1
FOR JSON PATH), '\/', '/')
Please let me know if this is something we can achieve using T-SQL.
As I mention in the comments, I strongly recommend you fix your design and normalise it. Don't store delimited data in your database; Re;BoRe;Va should be 3 rows, not 1 delimited value. That doesn't mean you can't achieve what you want with your denormalised data, just that your design is flawed, and so it needs to be pointed out.
One way to achieve what you're after is with some nested FOR JSON calls:
SELECT (SELECT V.Column1 AS service
FOR JSON PATH) AS services,
(SELECT SS.[value] AS service
FROM STRING_SPLIT(V.Column2,';') SS
FOR JSON PATH) AS additional_services
FROM (VALUES(1,'T1','Re;BoRe;Va'))V(ID,Column1,Column2)
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER;
This results in the following JSON:
{
"services": [
{
"service": "T1"
}
],
"additional_services": [
{
"service": "Re"
},
{
"service": "BoRe"
},
{
"service": "Va"
}
]
}
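The same transformation can be sketched in plain Python: split the delimited Column2 and assemble the target shape, then serialise it. This is only a sketch of the structure the FOR JSON query produces, using the sample row from the question:

```python
# Build the target JSON shape from the delimited column value.
import json

column1 = "T1"
column2 = "Re;BoRe;Va"

doc = {
    "services": [{"service": column1}],
    "additional_services": [{"service": s} for s in column2.split(";")],
}
print(json.dumps(doc, indent=2))
```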

How to parse jsonb field into columns without keys in PostgreSQL

I have a table login(id int, meta_skills jsonb), but the jsonb is not stored as simple key-value pairs.
The data in the jsonb field look like
{
"Cat1": [
{
"Skill_1": 2,
"Skill_2": 2,
"Skill_3": 2,
"Skill_4": 2,
"Skill_5": 2
}
],
"Cat2": [
{
"Skill_1": 3,
"Skill_2": 2,
"Skill_3": 3
}
],
"Cat3": [
{
"Skill_1": 2,
"Skill_2": 2,
"Skill_3": 2,
"Skill_4": 2
}
]
}
The skill values are random.
I want to flatten the data into rows of id, category, skill and level.
You have to unnest the nested levels:
SELECT login.id,
u1.category,
u3.skill,
u3.level
FROM login
CROSS JOIN LATERAL jsonb_each(login.meta_skills) AS u1(category,v)
CROSS JOIN LATERAL jsonb_array_elements(u1.v) AS u2(v)
CROSS JOIN LATERAL jsonb_each(u2.v) AS u3(skill, level);
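The three unnesting levels of that query can be replayed in plain Python: iterate the top-level keys (jsonb_each), then each array element (jsonb_array_elements), then each skill/level pair (jsonb_each again). A sketch on a trimmed-down version of the sample data:

```python
# Flatten {category: [{skill: level, ...}], ...} into (category, skill, level) rows.
import json

meta_skills = json.loads("""
{
  "Cat1": [{"Skill_1": 2, "Skill_2": 2}],
  "Cat2": [{"Skill_1": 3}]
}
""")

rows = [
    (category, skill, level)
    for category, elements in meta_skills.items()  # jsonb_each
    for element in elements                        # jsonb_array_elements
    for skill, level in element.items()            # jsonb_each
]
print(rows)  # [('Cat1', 'Skill_1', 2), ('Cat1', 'Skill_2', 2), ('Cat2', 'Skill_1', 3)]
```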

Group by and filter based on sum of a column in google apps script

I am trying to group by vendor and id and take the sum of total weight from the below table:
Vendor Id Weight
AAA 1 1234
AAA 1 121
AAA 2 5182
BBB 1 311
BBB 1 9132
BBB 2 108
In the query below, the variable 'row' holds the input table. The query groups by Vendor and Id and sums Weight:
var res_2 = alasql('SELECT [0] as Vendor,[1] as Id, sum([2]) as Total_Weight FROM ? GROUP BY [0], [1]',[row]);
Result is as follows:
[ { Vendor: 'AAA', Id: '1', Total_Weight: 1355 },
{ Vendor: 'AAA', Id: '2', Total_Weight: 5182 },
{ Vendor: 'BBB', Id: '1', Total_Weight: 9443 },
{ Vendor: 'BBB', Id: '2', Total_Weight: 108 }, ]
My next step is to loop over this array and, for every unique vendor, take the maximum Total_Weight, find the corresponding Id, and push that Vendor and Id to another array.
Hence, the result has to be
[{Vendor: 'AAA', Id: '2'},{Vendor: 'BBB', Id: '1'}]
Can anyone guide me on whether this can be accomplished with a loop, or do I need to modify the query itself? Any suggestions would be appreciated.
I see you put the google-sheets tag here. I think your problem can be solved within that tool:
In the first stage you group using the QUERY formula:
=query(B2:D,"select B, C, sum(D) where B is not null group by C, B order by sum(D) desc")
Since the table is sorted by the sum in descending order, you can use the VLOOKUP function to take the first row for each vendor and build a table:
=ArrayFormula(ifna(vlookup(unique(F5:F),F4:H8,{1,2},false)))
Or you can do both stages together (and use query as a table inside vlookup)
=ArrayFormula(ifna(vlookup(unique(B3:B),query(B3:D,"select B, C, sum(D) where B is not null group by C, B order by sum(D) desc"),{1,2},false)))
The result is an array with 2 columns - vendor and ID corresponding to max sum.
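If you would rather do the second step in code, as the question originally asked, the per-vendor maximum can be taken with one pass over the grouped rows. A sketch in Python (the same logic translates directly to a JavaScript loop in Apps Script), using the grouped output shown above:

```python
# For each vendor, keep the row with the largest Total_Weight,
# then project down to Vendor and Id.
grouped = [
    {"Vendor": "AAA", "Id": "1", "Total_Weight": 1355},
    {"Vendor": "AAA", "Id": "2", "Total_Weight": 5182},
    {"Vendor": "BBB", "Id": "1", "Total_Weight": 9443},
    {"Vendor": "BBB", "Id": "2", "Total_Weight": 108},
]

best = {}
for r in grouped:
    v = r["Vendor"]
    if v not in best or r["Total_Weight"] > best[v]["Total_Weight"]:
        best[v] = r

result = [{"Vendor": r["Vendor"], "Id": r["Id"]} for r in best.values()]
print(result)  # [{'Vendor': 'AAA', 'Id': '2'}, {'Vendor': 'BBB', 'Id': '1'}]
```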

Query JSON Key:Value Pairs in AWS Athena

I have received a data set from a client that is loaded in AWS S3. The data contains unnamed JSON key:value pairs. This isn't my area of expertise, so I was looking for a little help.
The structure of JSON data that I've typically worked with in the past looks similar to this:
{ "name":"John", "age":30, "car":null }
The data that I have received from my client is formatted as such:
{
"answer_id": "cc006",
"answer": {
"101086": 1,
"101087": 2,
"101089": 2,
"101090": 7,
"101091": 5,
"101092": 3,
"101125": 2
}
}
This is survey data, where the key on the left is a numeric customer identifier, and the value on the right is their response to a survey question, i.e. customer "101125" answered the survey with a value of "2". I need to be able to query the JSON data using Athena such that my result set looks similar to:
Cross joining the unnested children against the parent node isn't an issue. What I can't figure out is how to select all of the keys from the "answer" object without specifying the actual key names. Similarly, I want to be able to select all of the values as well.
Is it possible to create a virtual table in Athena that would allow for these results, or do I need to convert the JSON to a format that looks more like the following:
{
"answer_id": "cc006",
"answer": [
{ "key": "101086", "value": 1 },
{ "key": "101087", "value": 2 },
{ "key": "101089", "value": 2 },
{ "key": "101090", "value": 7 },
{ "key": "101091", "value": 5 },
{ "key": "101092", "value": 3 },
{ "key": "101125", "value": 2 }
]
}
EDIT 6/4/2020
I was able to use the code that Theon provided below along with the following table structure:
CREATE EXTERNAL TABLE answer_example (
answer_id string,
answer string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://mybucket/'
That allowed me to use the following query to generate the results that I needed.
WITH Data AS(
SELECT
answer_id,
CAST(json_extract(answer, '$') AS MAP(VARCHAR, VARCHAR)) as answer
FROM
answer_example
)
SELECT
answer_id,
key,
element_at(answer, key) AS value
FROM
Data
CROSS JOIN UNNEST (map_keys(answer)) AS answer (key)
EDIT 6/5/2020
Taking additional advice from Theon's response below, the following DDL and Query simplify this quite a bit.
DDL:
CREATE EXTERNAL TABLE answer_example (
answer_id string,
answer map<string,string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://mybucket/'
Query:
SELECT
answer_id,
key,
element_at(answer, key) AS value
FROM
answer_example
CROSS JOIN UNNEST (map_keys(answer)) AS answer (key)
You can cross join with the keys of the answer property and then pick the corresponding value. Something like this:
WITH data AS (
SELECT
'cc006' AS answer_id,
MAP(
ARRAY['101086', '101087', '101089', '101090', '101091', '101092', '101125'],
ARRAY[1, 2, 2, 7, 5, 3, 2]
) AS answers
)
SELECT
answer_id,
key,
element_at(answers, key) AS value
FROM data
CROSS JOIN UNNEST (map_keys(answers)) AS answer (key)
You could probably do something with transform_keys to create rows of the key value pairs, but the SQL above does the trick.
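What the CROSS JOIN UNNEST(map_keys(answer)) does can be replayed in plain Python: emit one output row per key of the answer map, paired with the element_at lookup for that key. A sketch on a trimmed-down version of the sample record:

```python
# One (answer_id, key, value) row per entry of the answer map.
import json

record = json.loads("""
{"answer_id": "cc006",
 "answer": {"101086": 1, "101087": 2, "101125": 2}}
""")

rows = [
    (record["answer_id"], key, record["answer"][key])  # element_at(answer, key)
    for key in record["answer"]                        # UNNEST(map_keys(answer))
]
print(rows)  # [('cc006', '101086', 1), ('cc006', '101087', 2), ('cc006', '101125', 2)]
```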
