How to get results of multiple aggregations in a single Druid query?

Say I have the following table named t_student_details:
Name     Age  Marks  Sport     City      ... (multiple columns)
=======  ===  =====  ========  ========
Jason    11   45     tennis    New York
Mark     12   42     football  New York
Jessica  11   43     tennis    Orlando
Brad     13   46     tennis    Orlando
...
(multiple rows)
I want to get certain information about the students in a single query. This is what I would do in Postgres:
WITH sports_filter AS (
    SELECT * FROM t_student_details WHERE sport = 'tennis'
)
SELECT JSON_BUILD_OBJECT(
    'max_age', (SELECT MAX(age) FROM sports_filter),
    'min_age', (SELECT MIN(age) FROM sports_filter),
    'city_wise_marks_mean', (SELECT JSON_AGG(mean_items)
                             FROM (SELECT city, AVG(marks)
                                   FROM sports_filter
                                   GROUP BY city) AS mean_items)
);
The result of the above SQL query in Postgres would be
{"max_age": 46,
"min_age": 43,
"city_wise_marks_mean": [{"New York": 45, "Orlando": 44.5}]}
As is evident, I get multiple aggregations about the students who play 'tennis' in a single query. Querying this way also fetches only the necessary data rather than every row.
How do I achieve this using Druid? I don't necessarily need the output response to be in the exact same format, but how do I fetch all these stats in the same query without having to fetch all the details of the students? Is it possible to get all this in a single query using Apache Druid?

As I understand your Postgres query, it should be as simple as using GROUPING SETS:
WITH sports_filter AS (
    SELECT * FROM t_student_details WHERE sport = 'tennis'
)
SELECT City, MIN(Marks), MAX(Marks), AVG(Marks)
FROM sports_filter
GROUP BY CUBE (City)
GROUP BY CUBE (City) is shorthand for GROUP BY GROUPING SETS ( (City), () ).
This should return one row where City is NULL, which holds the overall min/max/avg, and one row for each city.
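
To see the shape of the result described above (one row per city plus one overall row with a NULL city), here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. This is not Druid, and SQLite has no GROUPING SETS or CUBE, so the two grouping sets are emulated with UNION ALL; that substitution is an assumption made only so the example runs anywhere. The data is the four sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_student_details "
    "(name TEXT, age INTEGER, marks INTEGER, sport TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO t_student_details VALUES (?, ?, ?, ?, ?)",
    [
        ("Jason", 11, 45, "tennis", "New York"),
        ("Mark", 12, 42, "football", "New York"),
        ("Jessica", 11, 43, "tennis", "Orlando"),
        ("Brad", 13, 46, "tennis", "Orlando"),
    ],
)

# Stand-in for GROUP BY GROUPING SETS ((city), ()):
# the first SELECT is the (city) set, the second is the empty set ().
query = """
WITH sports_filter AS (
    SELECT * FROM t_student_details WHERE sport = 'tennis'
)
SELECT city, MIN(marks), MAX(marks), AVG(marks)
FROM sports_filter
GROUP BY city
UNION ALL
SELECT NULL, MIN(marks), MAX(marks), AVG(marks)
FROM sports_filter
"""

# Key each result row by city; the overall row has city = None (SQL NULL).
stats = {city: (lo, hi, mean) for city, lo, hi, mean in conn.execute(query)}
for city, values in stats.items():
    print(city, values)
```

The row keyed by None plays the role of the "City is NULL" row: it carries the aggregates over all tennis players, while the other rows carry the per-city values.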

Yes, it should be possible. Druid supports two query languages: Druid SQL and native queries. Druid SQL supports SELECT queries with the following structure:
[ EXPLAIN PLAN FOR ]
[ WITH tableName [ ( column1, column2, ... ) ] AS ( query ) ]
SELECT [ ALL | DISTINCT ] { * | exprs }
FROM { <table> | (<subquery>) | <o1> [ INNER | LEFT ] JOIN <o2> ON condition }
[ WHERE expr ]
[ GROUP BY [ exprs | GROUPING SETS ( (exprs), ... ) | ROLLUP (exprs) | CUBE (exprs) ] ]
[ HAVING expr ]
[ ORDER BY expr [ ASC | DESC ], expr [ ASC | DESC ], ... ]
[ LIMIT limit ]
[ OFFSET offset ]
[ UNION ALL <another query> ]
along with Druid SQL's aggregation functions.

Related

PostgreSQL aggregate over json arrays

I have seen a lot of references to using json_array_elements on extracting the elements of a JSON array. However, this appears to only work on exactly 1 array. If I use this in a generic query, I get the error
ERROR: cannot call json_array_elements on a scalar
Given something like this:
orders
{ "order_id":"2", "items": [{"name": "apple","price": 1.10}]}
{ "order_id": "3","items": [{"name": "apple","price": 1.10},{"name": "banana","price": 0.99}]}
I would like to extract
item    count
apple   2
banana  1
Or
item    total_value_sold
apple   2.20
banana  0.99
Is it possible to aggregate over json arrays like this using json_array_elements?

Use the function jsonb_array_elements() on orders->'items' to flatten the data:
select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem;
It is easy to get the aggregates you want from the flattened data:
select name, count(*), sum(price) as total_value_sold
from (
select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem
) s
group by name;
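
The same flatten-then-aggregate idea can be sketched outside the database. The snippet below is only an illustration in plain Python, not a replacement for the SQL above: it takes the two sample orders from the question as JSON strings, walks each "items" array, and accumulates a count and a total per item name. Parsing prices as Decimal is a choice made here to avoid floating-point rounding in the totals.

```python
import json
from collections import Counter, defaultdict
from decimal import Decimal

# The two sample documents from the question, one JSON string per row.
raw_orders = [
    '{"order_id": "2", "items": [{"name": "apple", "price": 1.10}]}',
    '{"order_id": "3", "items": [{"name": "apple", "price": 1.10},'
    ' {"name": "banana", "price": 0.99}]}',
]

counts = Counter()              # item name -> number of times sold
totals = defaultdict(Decimal)   # item name -> total value sold

for raw in raw_orders:
    # parse_float=Decimal keeps prices exact instead of binary floats.
    order = json.loads(raw, parse_float=Decimal)
    # Flatten: one iteration per array element, like jsonb_array_elements.
    for item in order["items"]:
        counts[item["name"]] += 1
        totals[item["name"]] += item["price"]

for name in sorted(counts):
    print(name, counts[name], totals[name])
```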

Power Bi : Filter a SQL Server table which contains a string

I have 2 tables in SQL Server:
Table Country, which contains the names of the countries:
Table Society, which contains the name of each society and the names of the countries where that society worked:
In Power BI, I have to create a Country filter (US, Germany, France, UK, ...) that filters the Society table:
For example, if I put "US" in the Country filter, my matrix will show Society A and Society B.
If I put "France" in the Country filter, it will show Society B and Society C.
(My first idea was to add binary "IsInThisCountry" fields in SQL Server and then use these fields as filters.)
Something like this:
CASE WHEN country LIKE '%US%' THEN 1 ELSE 0 END 'IsUS'
But the issue is that if I have 50 countries, I will have to create 50 filters.

If you have SQL Server with compatibility level 130 or higher (which provides string_split), you can try something like this in your data model to split the delimited countries in your societies table:
;with countries as (
select 'germany' as country
union all
select 'sweden'
),
societies as (
select 'A' as society, 'germany-sweden' as countries
union all
select 'B', 'sweden'
),
societyByCountry as (
select c.society, value as Country from societies c
cross apply string_split(c.countries, '-') s
)
select c.country, s.society from countries c
inner join societyByCountry s on s.Country = c.country
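
For readers who want to check the logic without a SQL Server instance, here is a minimal Python sketch of the same split-then-join step. It reuses the sample values from the query above ('germany', 'sweden', societies A and B with '-' as the delimiter); the variable names are chosen here for illustration and are not part of any Power BI or SQL Server API.

```python
# Sample data mirroring the CTEs in the query above.
countries = ["germany", "sweden"]
societies = [("A", "germany-sweden"), ("B", "sweden")]

# Equivalent of CROSS APPLY string_split(countries, '-'):
# one (society, country) pair per delimited value.
society_by_country = [
    (society, country)
    for society, delimited in societies
    for country in delimited.split("-")
]

# Equivalent of the INNER JOIN against the countries list.
result = sorted(
    (country, society)
    for society, country in society_by_country
    if country in countries
)

for country, society in result:
    print(country, society)
```

With the pairs in this one-row-per-country shape, a single Country filter can select the matching societies, so no per-country flag column is needed.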

PostgreSQL: Efficiently aggregate array columns as part of a group by

We wish to perform a GROUP BY operation on a table. The original table contains an ARRAY column. Within a group, the content of these arrays should be transformed into a single array with unique elements. No ordering of these elements is required.
Newest PostgreSQL versions are available.
Example original table:
id | fruit | flavors
---: | :----- | :---------------------
| apple | {sweet,sour,delicious}
| apple | {sweet,tasty}
| banana | {sweet,delicious}
Example desired result:
count_total | aggregated_flavors
----------: | :---------------------------
1 | {delicious,sweet}
2 | {sour,tasty,delicious,sweet}
SQL toy code to create the original table:
CREATE TABLE example(id int, fruit text, flavors text ARRAY);
INSERT INTO example (fruit, flavors)
VALUES ('apple', ARRAY [ 'sweet','sour', 'delicious']),
('apple', ARRAY [ 'sweet','tasty' ]),
('banana', ARRAY [ 'sweet', 'delicious']);
We have come up with a solution that requires transforming the arrays to strings:
SELECT COUNT(*) AS count_total,
array
(SELECT DISTINCT unnest(string_to_array(replace(replace(string_agg(flavors::text, ','), '{', ''), '}', ''), ','))) AS aggregated_flavors
FROM example
GROUP BY fruit
However, we think this is not optimal, and it may be problematic because it assumes that the strings contain none of "{", "}", or ",". It feels like there must be functions to combine arrays in the way we need, but we weren't able to find them.
Thanks a lot everyone!

Assuming each record contains a unique id value:
SELECT
fruit,
array_agg(DISTINCT flavor), -- 2
COUNT(DISTINCT id) -- 3
FROM
example,
unnest(flavors) AS flavor -- 1
GROUP BY fruit
1. unnest() the array elements.
2. Group by fruit value: array_agg() for distinct flavors.
3. Group by fruit value: COUNT() for distinct ids within each fruit group.
If the id column is really empty, you could generate the id values, for example with the row_number() window function:
SELECT
*
FROM (
SELECT
*, row_number() OVER () as id
FROM example
) s,
unnest(flavors) AS flavor
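
The unnest-then-aggregate logic can be traced by hand with a small Python sketch. It uses the three sample rows from the question; the ids 1 to 3 are assumed here (standing in for what row_number() would generate), since the original id column is empty.

```python
from collections import defaultdict

# Sample rows; ids are hypothetical, as if generated by row_number().
rows = [
    (1, "apple", ["sweet", "sour", "delicious"]),
    (2, "apple", ["sweet", "tasty"]),
    (3, "banana", ["sweet", "delicious"]),
]

flavors_by_fruit = defaultdict(set)  # like array_agg(DISTINCT flavor)
ids_by_fruit = defaultdict(set)      # like COUNT(DISTINCT id)

for row_id, fruit, flavors in rows:
    ids_by_fruit[fruit].add(row_id)
    # "unnest": visit each array element as if it were its own row.
    for flavor in flavors:
        flavors_by_fruit[fruit].add(flavor)

for fruit in sorted(flavors_by_fruit):
    print(fruit, len(ids_by_fruit[fruit]), sorted(flavors_by_fruit[fruit]))
```

Counting distinct ids rather than rows matters because unnesting multiplies each original row by the length of its array.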

postgreSQL : JOIN and GROUP BY to get the right query result

I am a postgreSQL newbie and I am stuck on the following queries.
The desirable output would be
id | name | address | description | employees
1 | 'company1' | 'asdf' | 'asdf' | [{id: 1, name: 'Mark'}, {id: 2, name: 'Mark'}, {id: 3, name: 'Steve'}, {id: 4, name: 'Mark'}]
2 ...
3 ...
5 | 'company5' | 'asdf' | 'adsf' | []
My current query (which is not working) is:
SELECT companies.* ,employees.*,json_agg(companies_employees.*) as "item"
FROM
companies_employees
JOIN companies ON companies_employees.COMPANY_id = companies.ID
JOIN employees ON companies_employees.EMPLOYEE_id = employees.ID
GROUP BY companies.ID, companies.NAME, companies.ADDRESS,companies.DESCRIPTION,employees.ID, employees.NAME
There are 3 tables:
companies : ID, NAME, ADDRESS, DESCRIPTION
employees : ID, NAME, SALARY, ROLE
companies_employees : EMPLOYEE_ID, COMPANY_ID
(CONSTRAINT companies_employees_employee_fkey FOREIGN KEY(employee_id) REFERENCES employees(id),
CONSTRAINT companies_employees_company_fkey FOREIGN KEY(company_id) REFERENCES companies(id) )
The sample table is here: http://sqlfiddle.com/#!15/27982/29
Maybe "GROUP BY" is not the right thing to use.
Would you please point me in the right direction?
Many thanks in advance

http://sqlfiddle.com/#!15/8849a/1
Your issue is that you are displaying employee data and using that in your group by condition. Those fields have unique values. You want to group only on the company information:
SELECT
companies.*,json_agg(employees.*) as "employees"
FROM
companies_employees
JOIN companies ON companies_employees.COMPANY_id = companies.ID
JOIN employees ON companies_employees.EMPLOYEE_id = employees.ID
GROUP BY
companies.ID,
companies.NAME,
companies.ADDRESS,
companies.DESCRIPTION

How to use group by in SQL Server query?

I have a problem with GROUP BY in SQL Server.
I have this simple SQL statement:
select *
from Factors
group by moshtari_ID
and I get this error :
Msg 8120, Level 16, State 1, Line 1
Column 'Factors.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
This is my result without group by :
and this is error with group by command :
Where is my problem ?

In general, once you start GROUPing, every column listed in your SELECT must be either a column in your GROUP or some aggregate thereof. Let's say you have a table like this:
| ID | Name | City |
| 1 | Foo bar | San Jose |
| 2 | Bar foo | San Jose |
| 3 | Baz Foo | Santa Clara |
If you wanted to get a list of all the cities in your database, and tried:
SELECT * FROM table GROUP BY City
...that would fail, because you're asking for columns (ID and Name) that aren't in the GROUP BY clause. You could instead:
SELECT City, count(City) as Cnt FROM table GROUP BY City
...and that would get you:
| City | Cnt |
| San Jose | 2 |
| Santa Clara | 1 |
...but would NOT get you ID or Name. You can do more complicated things with e.g. subselects or self-joins, but basically what you're trying to do isn't possible as-stated. Break down your problem further (what do you want the data to look like?), and go from there.
Good luck!

When you group, you can select only the columns you group by. Other columns need to be aggregated. This can be done with functions like min(), avg(), count(), ...
Why is this? Because GROUP BY collapses multiple records into one row per group. But what about the columns whose values differ within a group? The DB needs a rule for how to display them, and that rule is aggregation.

You need to apply an aggregate function such as max(), avg(), or count() when using GROUP BY.
For example, this query sums totalAmount for each moshtari_ID:
select moshtari_ID,sum (totalAmount) from Factors group by moshtari_ID;
output will be
moshtari_ID SUM
2 120000
1 200000
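
A minimal runnable sketch of that query, using Python's built-in sqlite3 module in place of SQL Server. The three rows are made up for illustration (the real contents of Factors are not shown in the question) and are chosen so that the sums match the output above; only the ID, moshtari_ID, and totalAmount columns are modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Factors (ID INTEGER, moshtari_ID INTEGER, totalAmount INTEGER)"
)
# Hypothetical rows: two factors for customer 1, one for customer 2.
conn.executemany(
    "INSERT INTO Factors VALUES (?, ?, ?)",
    [(1, 1, 150000), (2, 1, 50000), (3, 2, 120000)],
)

# Every selected column is either grouped (moshtari_ID) or aggregated (SUM).
totals = dict(
    conn.execute(
        "SELECT moshtari_ID, SUM(totalAmount) FROM Factors GROUP BY moshtari_ID"
    )
)
print(totals)
```

Note that SQLite is more permissive than SQL Server about non-aggregated columns in a grouped query, so this sketch shows the valid form of the query rather than reproducing error 8120.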

Try this:
select *
from Factors
Group by ID, date, time, factorNo, trackingNo, totalAmount, createAt, updateAt, bark_ID, moshtari_ID

If you apply a GROUP BY clause, you can only use the grouped columns and aggregate functions in the SELECT list.
Syntax:
SELECT expression1, expression2, ... expression_n,
aggregate_function (aggregate_expression)
FROM tables
[WHERE conditions]
GROUP BY expression1, expression2, ... expression_n
[ORDER BY expression [ ASC | DESC ]];
