I am a PostgreSQL newbie and I am stuck on the following query.
The desired output would be:
id | name | address | description | employees
1 | 'company1' | 'asdf' | 'asdf' | [{id: 1, name: 'Mark'}, {id: 2, name: 'Mark'}, {id: 3, name: 'Steve'}, {id: 4, name: 'Mark'}]
2 ...
3 ...
5 | 'company5' | 'asdf' | 'asdf' | []
My current query (which is not working) is:
SELECT companies.* ,employees.*,json_agg(companies_employees.*) as "item"
FROM
companies_employees
JOIN companies ON companies_employees.COMPANY_id = companies.ID
JOIN employees ON companies_employees.EMPLOYEE_id = employees.ID
GROUP BY companies.ID, companies.NAME, companies.ADDRESS,companies.DESCRIPTION,employees.ID, employees.NAME
There are 3 tables:
companies : ID, NAME, ADDRESS, DESCRIPTION
employees : ID, NAME, SALARY, ROLE
companies_employees : EMPLOYEE_ID, COMPANY_ID
(CONSTRAINT companies_employees_employee_fkey FOREIGN KEY(employee_id) REFERENCES employees(id),
CONSTRAINT companies_employees_company_fkey FOREIGN KEY(company_id) REFERENCES companies(id) )
The sample tables are here: http://sqlfiddle.com/#!15/27982/29
Maybe GROUP BY is not the right thing to use here.
Would you please point me in the right direction?
Many thanks in advance.
http://sqlfiddle.com/#!15/8849a/1
Your issue is that you are selecting employee columns and including them in your GROUP BY condition; those columns have unique values per employee, so each employee ends up in its own group. You want to group only on the company information:
SELECT
companies.*,json_agg(employees.*) as "employees"
FROM
companies_employees
JOIN companies ON companies_employees.COMPANY_id = companies.ID
JOIN employees ON companies_employees.EMPLOYEE_id = employees.ID
GROUP BY
companies.ID,
companies.NAME,
companies.ADDRESS,
companies.DESCRIPTION
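If you also need companies with no employees to show up with an empty list (like company 5 in the desired output), a sketch using LEFT JOINs plus a FILTER clause (PostgreSQL 9.4+), against the same schema:

SELECT companies.*,
       COALESCE(
           json_agg(employees.*) FILTER (WHERE employees.id IS NOT NULL),
           '[]'
       ) AS employees
FROM companies
LEFT JOIN companies_employees ON companies_employees.company_id = companies.id
LEFT JOIN employees           ON companies_employees.employee_id = employees.id
GROUP BY companies.id, companies.name, companies.address, companies.description;

The FILTER clause drops the NULL row produced for companies without employees, and COALESCE turns the resulting NULL aggregate into an empty JSON array.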
I have two tables, person and account.
person has id and name:
+----------+--------+--------------------------------------------+
| Column   | Type   | Modifiers                                  |
|----------+--------+--------------------------------------------|
| id       | bigint | not null generated by default as identity  |
| name     | text   |                                            |
+----------+--------+--------------------------------------------+
and account has id, name, and ids:
+----------+-----------+--------------------------------------------+
| Column | Type | Modifiers |
|----------+-----------+--------------------------------------------|
| id | bigint | not null generated by default as identity |
| name | text | |
| ids | integer[] | |
+----------+-----------+--------------------------------------------+
Indexes:
"account_pkey" PRIMARY KEY, btree (id)
Check constraints:
"account_ids_check" CHECK (array_length(ids, 0) < 4)
I'm storing person ids in ids.
I have two questions:
Can the ids field in the account table be an array of foreign keys pointing to person's id? If yes, how can I do it?
I want to get the id and name of the account, plus the id and name of each person whose id is in ids, like this:
id:
name:
ids: [{id: ,name: },{id: ,name: },{id: ,name: }]
I'm using this query but it gives me an error:
SELECT aa.id,
aa.name,
array(SELECT json_build_object ( 'id', p.id,'name', p.name from person p JOIN account a ON p.id = ANY(a.ids) ) ) as pins
from account aa where aa.id = 1
demo
I refactored your code. I hope it can solve your problem.
Basically, the idea in my code is that one person cannot have more than 4 accounts.
One account belonging to multiple persons does not seem that intuitive to me.
Converting several columns to JSON: https://dba.stackexchange.com/questions/69655/select-columns-inside-json-agg and https://dba.stackexchange.com/questions/90858/postgres-multiple-columns-to-json
begin;
create table person (
    person_id bigserial primary key,
    name text not null,
    account_ct integer default 0,
    constraint account_ct_max4 check (account_ct between 0 and 4)
);
create table account(
account_id serial primary key,
account_type text,
account_limits numeric,
person_id bigint,
constraint person_ref_fkey foreign key (person_id)
references person(person_id) match simple
on update cascade on delete cascade
);
commit;
The idea is that a person can have multiple accounts, but no more than 4. We use a trigger to enforce that, as sketched below.
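A minimal sketch of such a trigger (the function and trigger names are assumptions, and it only handles INSERT and DELETE, not moving an account to another person): it keeps person.account_ct in sync so the account_ct_max4 check constraint rejects a fifth account.

create or replace function bump_account_ct() returns trigger as $$
begin
    if tg_op = 'INSERT' then
        update person set account_ct = account_ct + 1 where person_id = new.person_id;
    elsif tg_op = 'DELETE' then
        update person set account_ct = account_ct - 1 where person_id = old.person_id;
    end if;
    return null;  -- AFTER trigger: the return value is ignored
end;
$$ language plpgsql;

create trigger account_ct_trg
after insert or delete on account
for each row execute procedure bump_account_ct();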
-- to json: one person's account info.
select p.person_id,
p.name,
json_agg(
(select x from (select a.account_id, a.account_type) as x) ) as item
from account a join person p on a.person_id = p.person_id
where p.person_id = 1
group by 1,2;
-- to json: one person's person info; account info is not converted to json.
select account_id,
       account_type,
       json_build_object('id', p.person_id, 'name', p.name)
from account a
join person p on a.person_id = p.person_id
where p.person_id = 1;
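As an aside, if you keep the integer[] column from the question instead of this redesign, the original query can be made to run by moving the FROM clause out of json_build_object and correlating on the outer row's ids (a sketch against the schema from the question):

SELECT aa.id,
       aa.name,
       array(SELECT json_build_object('id', p.id, 'name', p.name)
             FROM person p
             WHERE p.id = ANY(aa.ids)) AS pins
FROM account aa
WHERE aa.id = 1;

Note that PostgreSQL cannot declare an array column as an array of foreign keys, so element-level referential integrity would still need a trigger or the normalized design above.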
Say I have a table with a schema as follows:
id | name | tags |
1 | xyz | [4, 5] |
where tags is an array of references to ids in another table called tags.
Is it possible to join these tags onto the row, i.e. replacing the id numbers with the values for those rows in the tags table, such as:
id | name | tags |
1 | xyz | [[tag_name, description], [tag_name, description]] |
If not, I wonder if this is an issue with the design of the schema?
Example tags table:
create table tags(id int primary key, name text, description text);
insert into tags values
(4, 'tag_name_4', 'tag_description_4'),
(5, 'tag_name_5', 'tag_description_5');
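For reference, a hypothetical definition of the table from the question (the answers below call it my_table), matching the sample row:

create table my_table(id int primary key, name text, tags int[]);
insert into my_table values (1, 'xyz', array[4, 5]);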
You should unnest the column tags, use its elements to join the tags table, and aggregate the columns of that table. You can aggregate arrays into an array:
select t.id, t.name, array_agg(array[g.name, g.description])
from my_table as t
cross join unnest(tags) as tag
join tags g on g.id = tag
group by t.id;
id | name | array_agg
----+------+-----------------------------------------------------------------
1 | xyz | {{tag_name_4,tag_description_4},{tag_name_5,tag_description_5}}
(1 row)
or strings to array:
select t.id, t.name, array_agg(concat_ws(', ', g.name, g.description))
...
or maybe strings inside a string:
select t.id, t.name, string_agg(concat_ws(', ', g.name, g.description), '; ')
...
or, last but not least, as jsonb:
select t.id, t.name, jsonb_object_agg(g.name, g.description)
from my_table as t
cross join unnest(tags) as tag
join tags g on g.id = tag
group by t.id;
id | name | jsonb_object_agg
----+------+------------------------------------------------------------------------
1 | xyz | {"tag_name_4": "tag_description_4", "tag_name_5": "tag_description_5"}
(1 row)
Live demo: db<>fiddle.
Not sure if this is still helpful for anyone, but unnesting the tags is quite a bit slower than letting Postgres work directly from the array. You can rewrite the query as below; this is generally more performant because g.id = ANY(tags) is a simple primary-key index scan without the expansion step:
SELECT t.id, t.name, ARRAY_AGG(ARRAY[g.name, g.description])
FROM my_table AS t
LEFT JOIN tags AS g
ON g.id = ANY(tags)
GROUP BY t.id;
Using T-SQL, how can I select n rows of a non-key, non-indexed column and avoid duplicate results?
Example table:
ID_ | state | customer | memo
------------------------------------------
1 | abc | 123 | memo text xyz
2 | abc | 123 | memo text abc
3 | abc | 456 | memo text def
4 | abc | 456 | memo text rew
5 | abc | 789 | memo text yte
6 | def | 123 | memo text hrd
7 | def | 432 | memo text dfg
I want to select, say, 2 memos for state 'abc' but the returned memos should not be for the same customer.
memo
----
memo text xyz
memo text def
PS: The only select condition available is state (e.g. WHERE state = 'abc').
I have managed to do this in a very inefficient way:
SELECT top 2 MAX(memo)
FROM table
WHERE state = 'abc'
GROUP BY customer
This works fine for a small sample size, but the production table has over 1 billion rows.
You can try the following query at your actual database size. I am not sure of the performance on a table with a billion rows, so you should test it yourself.
SELECT TOP 2 memo
FROM (SELECT memo,
             ROW_NUMBER() OVER (PARTITION BY customer ORDER BY (SELECT 0)) AS RN
      FROM table1 WHERE state = 'abc') T
WHERE RN = 1
You can check the SQL FIDDLE
EDIT: Adding a non-clustered index on state and customer including memo will tremendously improve the performance.
CREATE NONCLUSTERED INDEX [custom_index] ON table
(
[state] ASC,
[customer] ASC
)
INCLUDE ( [memo]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [DATA]
A way to get the n distinct values per state/customer is to get one ID for every group:
SELECT MIN(ID_) ID
FROM Table1
GROUP BY State, customer
(MIN can be substituted with MAX; it's just a way to pick one of the values.)
Then JOIN that back to the table, adding the other condition:
WITH getID AS (
SELECT MIN(ID_) ID
FROM Table1
GROUP BY State, customer
)
SELECT TOP 2
t.ID_, t.State, t.Customer, t.memo
FROM table1 t
INNER JOIN getID g ON t.ID_ = g.ID
WHERE t.state = 'abc'
SQLFiddle demo
If your version of SQL Server doesn't support WITH, the CTE can become a subquery:
SELECT TOP 2
t.ID_, t.State, t.Customer, t.memo
FROM table1 t
INNER JOIN (SELECT MIN(ID_) ID
FROM Table1
GROUP BY State, customer
) g ON t.ID_ = g.ID
WHERE t.state = 'abc'
Another way is to use CROSS APPLY to get the distinct ID
SELECT TOP 2
t.ID_, t.State, t.Customer, t.memo
FROM table1 t
CROSS APPLY (SELECT TOP 1
ID_
FROM table1 t1
WHERE t1.State = t.State AND t1.Customer = t.Customer) c
WHERE t.state = 'abc'
AND c.ID_ = t.ID_;
SQLFiddle demo
I have the following tables:
Table User
UserID Name
1 Om
2 John
3 Kisan
4 Lisa
5 Karel
Table Game
Games Players
Golf 1,3,5
Football 4
I wrote this query:
Select UserId,
Name from User
Where UserID IN
(Select Players from Game where Games='Golf')
Result:
~~~~~~~
0 Rows
The above query does not return any result, while it works well when I directly specify the values for the IN clause:
Select UserId, Name
from User
Where UserID IN (1,3,5)
Result:
~~~~~~~
UserID Name
1 Om
3 Kisan
5 Karel
3 rows
However, when I change the condition in the very first query to 'Football':
Select UserId, Name
from User
Where UserID IN
(Select Players
from Game
where Games='Football')
this returns the following result:
UserID Name
4 Lisa
1 row
How can I work around this so that my very first query returns the right result?
I think I'm going in the wrong direction. Help me out!
This is what you get for storing comma-separated values in a field. Now you have to split it, using, say, a splitstring function, and do something like:
Select User.UserId, User.Name from User
inner join splitstring((Select Players from Game where Games='Golf')) a
on User.UserID = a.Name
But consider changing the design of your "Game" table to:
Games Players
Golf 1
Golf 3
Golf 5
Football 4
Then you can simply do:
Select User.UserId, User.Name
from User inner join Game
on User.UserID = Game.Players
Where Game.Games = 'Golf'
without any additional functions.
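A sketch of that design as DDL (the constraint name is an assumption, and it presumes User.UserID is the primary key; User is bracketed because it is a reserved word):

CREATE TABLE Game (
    Games   VARCHAR(50) NOT NULL,
    Players INT         NOT NULL,
    CONSTRAINT FK_Game_User FOREIGN KEY (Players) REFERENCES [User](UserID)
);

INSERT INTO Game (Games, Players) VALUES
    ('Golf', 1), ('Golf', 3), ('Golf', 5), ('Football', 4);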
Your first query translates to this:
Select UserId, Name
from User
Where UserID IN ('1,3,5')
Notice that it is a single string containing the IDs, not a comma-separated list of values like in your second query.
There are many Split functions out there written for this very scenario.
You can utilize one of them as such:
DECLARE @PlayersCsv NVARCHAR(MAX)
Select @PlayersCsv = Players from Game where Games='Golf'
Select UserId, Name
from User
Where UserID IN (Select Value FROM dbo.Split(@PlayersCsv, ','))
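dbo.Split is not a built-in function; a minimal sketch of one possible implementation is below (on SQL Server 2016+ you could use the built-in STRING_SPLIT instead):

CREATE FUNCTION dbo.Split (@List NVARCHAR(MAX), @Delimiter NCHAR(1))
RETURNS @Result TABLE (Value NVARCHAR(MAX))
AS
BEGIN
    DECLARE @Pos INT = CHARINDEX(@Delimiter, @List);
    WHILE @Pos > 0
    BEGIN
        -- everything before the next delimiter, trimmed
        INSERT INTO @Result (Value) VALUES (LTRIM(RTRIM(LEFT(@List, @Pos - 1))));
        SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List));
        SET @Pos  = CHARINDEX(@Delimiter, @List);
    END
    -- whatever remains after the last delimiter
    INSERT INTO @Result (Value) VALUES (LTRIM(RTRIM(@List)));
    RETURN;
END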
DECLARE @xml AS xml
SET @xml = (SELECT cast('<X>'+(''+replace(players,',' ,'</X><X>')+'</X>') AS xml)
            FROM Game WHERE Games='Golf')
SELECT UserId, Name
FROM User
WHERE UserID IN
    (SELECT N.value('.', 'varchar(10)') as value FROM @xml.nodes('X') as T(N))
SQL Fiddle Results:
| USERID | NAME |
|--------|-------|
| 1 | Om |
| 3 | Kisan |
| 5 | Karel |
I have three tables:
posts:        | id | title | body
tags:         | id | tag_name
posts_x_tags: | post_id | tag_id
posts and tags have a many-to-many relationship.
Is it possible to do a search on posts.body OR posts.title OR tags.tag_name?
This gets me close but returns so many duplicates:
SELECT * FROM posts p
INNER JOIN posts_x_tags x ON p.id = x.post_id
INNER JOIN tags t ON t.id = x.tag_id
AND t.tag LIKE '%a%'
OR p.title LIKE '%a%'
OR p.body LIKE '%a%';
Any help would be much appreciated.
You can use SELECT DISTINCT ... in order to remove duplicates from the result set.
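For example, a sketch against the column names from the question (LEFT JOINs so posts without tags can still match on title or body, and the OR conditions moved into the WHERE clause):

SELECT DISTINCT p.*
FROM posts p
LEFT JOIN posts_x_tags x ON p.id = x.post_id
LEFT JOIN tags t ON t.id = x.tag_id
WHERE t.tag_name LIKE '%a%'
   OR p.title LIKE '%a%'
   OR p.body LIKE '%a%';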