Concatenate array elements on a joined table PostgreSQL

Is it possible to do a one-to-one, element-by-element array concatenation in a query like the one below?
EDIT: The arrays do not always have the same number of elements; for example, array1 may sometimes have 4 elements and array2 8.
drop table if exists a;
drop table if exists b;
create temporary table a as (select 1 as id,array['a','b','c'] as array1);
create temporary table b as (select 1 as id,array['x','y','z'] as array2);
select
a.id,
a.array1,
b.array2,
array_concat--This has to be a 1 to 1 ordered concatenation (see
--example below)
from a
left join b on b.id=a.id
What I would like to obtain here is a paired concatenation of both arrays 1 and 2, like this:
id | array1        | array2        | array_concat
---+---------------+---------------+--------------------
1  | ['a','b','c'] | ['d','e','f'] | ['a-d','b-e','c-f']
2  | ['x','y','z'] | ['i','j','k'] | ['x-i','y-j','z-k']
3  | ...
I tried using unnest, but I can't make it work:
select
a.id,
a.array1,
b.array2,
array_concat
from table a
left join b on b.id=a.id
left join (select a.array1,b.array2, array_agg(a1||b2)
FROM unnest(a.array1, b.array2)
ab (a1, b2)
) ag on ag.array1=a.array1 and ag.array2=b.array2
;
EDIT:
This works for only one table:
SELECT array_agg(el1||el2)
FROM unnest(ARRAY['a','b','c'], ARRAY['d','e','f']) el (el1, el2);
++Thanks to https://stackoverflow.com/users/1463595/%D0%9D%D0%9B%D0%9E
EDIT:
I came to a very close solution, but it mixes up some of the intermediate values once the arrays are concatenated. Nevertheless, I still need a fully correct solution.
The approach I am now using is:
1) Creating one table based on the 2 separate ones
2) aggregating using Lateral:
create temporary table new_table as
SELECT
id,
a.a,
b.b
FROM a a
LEFT JOIN b b on a.id=b.id;
SELECT id,
ab_unified
FROM pair_sources_mediums_campaigns,
LATERAL (SELECT ARRAY_AGG(a||'[-]'||b order by grp1) as ab_unified
FROM (SELECT DISTINCT case when a is null
then 'not tracked'
else a
end as a
,case when b is null
then 'none'
else b
end as b
,rn - ROW_NUMBER() OVER(PARTITION BY a,b ORDER BY rn) AS grp1
FROM unnest(a,b) with ordinality as el (a,b,rn)
) AS sub
) AS lat1
order by 1;

Something like this.
with a_elements (id, element, idx) as (
select a.id,
u.element,
u.idx
from a
cross join lateral unnest(a.array1) with ordinality as u(element, idx)
), b_elements (id, element, idx) as (
select b.id,
u.element,
u.idx
from b
cross join lateral unnest(b.array2) with ordinality as u(element, idx)
)
select id,
array_agg(concat_ws('-', a.element, b.element) order by idx) as elements
from a_elements a
full outer join b_elements b using (id, idx)
group by id;
The join operator using (..) automatically takes the non-null value from the joined tables. This removes the need for an expression such as coalesce(a.id, b.id).
It's not pretty and definitely not efficient for large tables, but it seems to do everything you want.
For arrays that do not have the same number of elements, the unmatched positions in the result contain only the element from the longer array.
For this dataset:
insert into a
(id, array1)
values
(1, array['a','b','c','d']),
(2, array['d','e','f']);
insert into b
(id, array2)
values
(1, array['x','y','z']),
(2, array['m','n','o','p']);
It returns this result:
id | elements
---+----------------
1 | {a-x,b-y,c-z,d}
2 | {d-m,e-n,f-o,p}
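To sanity-check the expected output outside the database, the pairing logic can be modeled in a few lines of Python. This is only a sketch of the logic, not the SQL itself: `pair_elements` is a hypothetical helper name, and it assumes elements are matched by position and that a missing partner is simply skipped, the way `concat_ws` skips NULL.

```python
from itertools import zip_longest


def pair_elements(array1, array2, sep="-"):
    """Pair elements by position; a missing partner is skipped, like concat_ws skips NULL."""
    paired = []
    for left, right in zip_longest(array1, array2):
        # zip_longest pads the shorter array with None, which we drop here.
        parts = [value for value in (left, right) if value is not None]
        paired.append(sep.join(parts))
    return paired


# Same sample data as the INSERT statements above.
rows = {
    1: (["a", "b", "c", "d"], ["x", "y", "z"]),
    2: (["d", "e", "f"], ["m", "n", "o", "p"]),
}
for row_id, (array1, array2) in rows.items():
    print(row_id, pair_elements(array1, array2))
```

For the sample rows this yields the same pairs as the result table above, which makes it a handy reference when testing the query on other inputs.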

I think you are overthinking this. Try this (SQLFiddle):
select
a.id,
a.array1,
b.array2,
array[a.array1[1] || '-' || b.array2[1],
a.array1[2] || '-' || b.array2[2],
a.array1[3] || '-' || b.array2[3]] array_concat
from
a inner join
b on b.id = a.id
;

Related

TSQL Compare tables based on multiple rows same column?

I will try to be as clear as possible on this one, as I have no idea what to do next and would love a push in the right direction.
I'm trying to compare the values within two tables. The tables look like this:
Table1:
Table2
INSERT INTO #table1 ([elementName], [elementValue])
VALUES
('t1','Project'),
('p1','test1'),
('n1','value1'),
('t2','Project'),
('p2','test2'),
('n2','value2'),
('t3','Project'),
('p3','test3'),
('n3','value3'),
('t4',''),
('p4',''),
('n4',''),
('t5',''),
('p5',''),
('n5','')
INSERT INTO #table2 ([elementName], [elementValue])
VALUES
('t1','Project'),
('p1',''),
('n1',''),
('t2','Project'),
('p2','test3'),
('n2','value123'),
('t3','Project'),
('p3',''),
('n3',''),
('t4','Package'),
('p4',''),
('n4',''),
('t5','Project'),
('p5','Testtest'),
('n5','valuevalue')
I used this code to fill the test tables. Normally this is an automated process, and the tables are filled from an XML string.
Furthermore, the numbers in the element names are considered "groups", meaning T1, P1 and N1 belong together.
I would like to compare T1 and P1 etc. from Table1 to any combination of T and P from Table2.
If they match, I would like to overwrite the value of Table1 N1 with the value of the matched N in Table2 (in the example, Table1 N3 would be replaced with Table2 N2).
Besides that, I also want to keep every group in Table1 that is not in Table2,
but also add every group that is in Table2 but not in Table1 in one of the blank spots.
Last but not least, if the T value is filled but the P value is empty, nothing has to be overwritten or changed in Table1.
The expected result would be this:
Table1:
I made the changes bold.
I don't really have an idea where to start on this. I've tried functions such as EXCEPT and INTERSECT, but did not get even close to what I would like to see.

with t1 as (
select * from (values
('t1','Project'),
('p1','test1'),
('n1','value1'),
('t2','Project'),
('p2','test2'),
('n2','value2'),
('t3','Project'),
('p3','test3'),
('n3','value3'),
('t4',''),
('p4',''),
('n4',''),
('t5',''),
('p5',''),
('n5','')
) v([elementName], [elementValue])
),
t2 as (
select * from (values
('t1','Project'),
('p1',''),
('n1',''),
('t2','Project'),
('p2','test3'),
('n2','value123'),
('t3','Project'),
('p3',''),
('n3',''),
('t4','Package'),
('p4',''),
('n4',''),
('t5','Project'),
('p5','Testtest'),
('n5','valuevalue')
) v([elementName], [elementValue])
),
pivoted_t1 as (
select *
from
(select left([elementName], 1) letter, right([elementName], len([elementName]) - 1) number, [elementValue] as value from t1) t1
pivot(min(value) for letter in ([t], [p], [n])) pvt1
),
pivoted_t2 as (
select *
from
(select left([elementName], 1) letter, right([elementName], len([elementName]) - 1) number, [elementValue] as value from t2) t2
pivot(min(value) for letter in ([t], [p], [n])) pvt2
),
amended_values as (
select
pvt1.number,
coalesce(pvt2.t, pvt1.t) as t,
coalesce(pvt2.p, pvt1.p) as p,
coalesce(pvt2.n, pvt1.n) as n,
count(case when pvt1.t = '' and pvt1.p = '' then 1 end) over(order by pvt1.number rows between unbounded preceding and current row) as empty_row_number
from
pivoted_t1 pvt1
left join pivoted_t2 pvt2 on pvt1.t = pvt2.t and pvt1.p = pvt2.p and pvt1.t <> '' and pvt1.p <> ''
),
added_new_values as (
select
a.number,
coalesce(n.t, a.t) as t,
coalesce(n.p, a.p) as p,
coalesce(n.n, a.n) as n
from
amended_values a
left join (
select number, t, p, n, row_number() over (order by number) as row_number
from pivoted_t2 t2
where
t2.t <> ''
and t2.p <> ''
and not exists (select * from pivoted_t1 t1 where t1.t = t2.t and t1.p = t2.p)
) n on n.row_number = a.empty_row_number
)
select
concat([elementName], number) as [elementName],
[elementValue]
from
added_new_values
unpivot ([elementValue] for [elementName] in ([t], [p], [n])) upvt
;

Recursive query SQL Server not working as expected

Thanks in advance for your help. I'm still quite new to MS SQL, and I was wondering why my recursive query below does not return the values I'm expecting. I've done my research, and at the bottom is the code I came up with. Let's say I have the following table:
CategoryID | ParentID | SomeName
-----------+----------+---------
1          | 0        | hmm
2          | 0        | err
3          | 0        | woo
4          | 3        | ppp
5          | 4        | ttt
I'm expecting the query below to return 3, 4, 5. I basically want to get the list of category IDs in the hierarchy below a given category, including that category itself, based on the category ID I pass to the recursive query. Thanks for your assistance.
GO
WITH RecursiveQuery (CategoryID)
AS
(
-- Anchor member definition
SELECT a.CategoryID
FROM [SomeDB].[dbo].[SomeTable] AS a
WHERE a.ParentID = CategoryID
UNION ALL
-- Recursive member definition
SELECT b.CategoryID
FROM [SomeDB].[dbo].[SomeTable] AS b
INNER JOIN RecursiveQuery AS d
ON d.CategoryID = b.ParentID
)
-- Statement that executes the CTE
SELECT o.CategoryID
FROM [SomeDB].[dbo].[SomeTable] AS o
INNER JOIN RecursiveQuery AS d
ON d.CategoryID = 3
GO

If you want the tree from a specific root:
DECLARE @rootCatID int = 3
;WITH LessonsTree (CatID)
AS
(
SELECT a.CategoryID
FROM [EducationDatabase].[dbo].[LessonCategory] AS a
WHERE a.CategoryID = @rootCatID ---<<<
UNION ALL
SELECT b.CategoryID
FROM LessonsTree as t
INNER JOIN [EducationDatabase].[dbo].[LessonCategory] AS b
ON b.ParentID = t.CatID
)
SELECT o.*
FROM LessonsTree t
INNER JOIN [EducationDatabase].[dbo].[LessonCategory] AS o
ON o.CategoryID = t.CatID
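The effect of restricting the anchor to the root can be checked against the sample table from the question. The following is a minimal sketch using Python's built-in `sqlite3` module rather than SQL Server, so the dialect differs slightly (`WITH RECURSIVE`, `?` parameters); the table name `category` and the helper `subtree_ids` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (CategoryID INTEGER, ParentID INTEGER, SomeName TEXT)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, 0, "hmm"), (2, 0, "err"), (3, 0, "woo"), (4, 3, "ppp"), (5, 4, "ttt")],
)


def subtree_ids(connection, root_id):
    """Return the root category and all of its descendants, ordered by ID."""
    query = """
        WITH RECURSIVE tree (CatID) AS (
            -- Anchor: only the requested root.
            SELECT CategoryID FROM category WHERE CategoryID = ?
            UNION ALL
            -- Recursive step: children of rows already in the tree.
            SELECT c.CategoryID
            FROM tree AS t
            JOIN category AS c ON c.ParentID = t.CatID
        )
        SELECT CatID FROM tree ORDER BY CatID
    """
    return [row[0] for row in connection.execute(query, (root_id,))]


print(subtree_ids(conn, 3))  # [3, 4, 5]
```

Because the anchor selects only the root row, the recursion never starts from unrelated categories such as 1 or 2.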

As stated in the comments, the anchor isn't restricted. The easiest solution is to add the criterion to the anchor:
with RecursiveQuery (theID)
AS
(
SELECT a.ParentID --root id=parentid to include it and to prevent an extra trip to LessonCategory afterwards
FROM [LessonCategory] AS a
WHERE a.ParentID = 3 --restriction here
UNION ALL
SELECT b.CategoryID
FROM [LessonCategory] AS b
INNER JOIN RecursiveQuery AS d
ON d.theID = b.ParentID
)
SELECT * FROM RecursiveQuery
Another option is to keep the recursive query general (no restricted anchor) and have it carry the root ID along as well. The query on the CTE can then filter on the root ID. The first option is probably better; this second one is mainly suitable if you are creating some sort of root view.
with RecursiveQuery
AS
(
SELECT a.ParentID theID, a.ParentID RootID
FROM [LessonCategory] AS a
UNION ALL
SELECT b.CategoryID, d.RootID
FROM [LessonCategory] AS b
INNER JOIN RecursiveQuery AS d
ON d.theID = b.ParentID
)
SELECT theID from RecursiveQuery where RootID = 3

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?

I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is to simply list both the Option_A fields and the Option_B fields, and let whatever consumes the result set determine which fields to use.
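To see the two-LEFT-JOIN pattern in action, here is a small sketch that runs the example query against an in-memory SQLite database through Python's `sqlite3` module. The table and column names come from the example above; the sample rows are invented purely for illustration, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Initial_Table (id INTEGER, special_value INTEGER);
    CREATE TABLE Option_A_Table (id INTEGER, initial_id INTEGER, name TEXT);
    CREATE TABLE Option_B_Table (id INTEGER, initial_id INTEGER, name TEXT);
    -- Hypothetical sample data: row 1 should use option A, row 2 option B.
    INSERT INTO Initial_Table VALUES (1, 1234), (2, 42);
    INSERT INTO Option_A_Table VALUES (10, 1, 'from A'), (11, 2, 'from A');
    INSERT INTO Option_B_Table VALUES (20, 1, 'from B'), (21, 2, 'from B');
    """
)

query = """
    SELECT i.id,
           COALESCE(a.id, b.id) AS Option_Id,
           COALESCE(a.name, b.name) AS Option_Name
    FROM Initial_Table i
    LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
    LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
    ORDER BY i.id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 10, 'from A'), (2, 21, 'from B')]
```

Each row matches at most one of the two joins, so COALESCE picks whichever side is populated.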

Just to add: the query can also be constructed dynamically based on conditions.
An example is given below.
DECLARE @a INT = 1235
DECLARE @sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(@a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(@sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/

You can solve this with a union:
select tablea.a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select tablea.a, b
from tablea
join tablec on tablec.a = tablea.a
where b <> 1234

I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
@Logic VARCHAR(50)
) RETURNS @Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF @Logic = '1234'
INSERT @Results
SELECT Content
FROM Table_1
ELSE
INSERT @Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T

I think it is better to think about your query in a different way and treat the two cases as sets.
I believe that if you write two separate queries and then combine them using UNION, the result will perform much better and be more readable.

TSQL optimizing code for NOT IN

I inherited an old SQL script that I want to optimize, but after several attempts I must admit that all of them only produce huge SQL with repetitive blocks. I would like to know if someone can propose better code for the following pattern (see code below). I don't want to use temporary tables (WITH). For simplicity, I only show 3 levels (TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
B.ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.

First off, SELECT DISTINCT already removes duplicates from the result set, so you are overworking the condition. Adding the WITH definitions and nesting their use makes the query harder to follow. The data ultimately all comes from table B, restricted to rows that also have a key match in A. Let's start with just that. And since you are not using (B) Field_Y or (A) Field_X in your result set, leave them out of the mix.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
The WHERE clause includes only the values that qualify for a category you want, and you still get the results per category.
Now, if you actually needed other values such as Field_Y or Field_X, that would call for a different query. However, your TMP_C, TMP_D and TMP_E only ask for the ID and CATEG columns anyway.

This may perform better
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...

Join the table valued function in the query

I have one table, vwuser. I want to join this table with the table-valued function fnuserrank(userID), so I need to cross apply with the table-valued function:
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
For each userID it generates multiple records. I only want the last record for each empid that does not have a Rank of Term(inated). How can I do this?
Data:
HistoryID | empid | Rank | MonitorDate
----------+-------+------+------------
1         | A1    | E1   | 2012-8-9
2         | A1    | E2   | 2012-9-12
3         | A1    | Term | 2012-10-13
4         | A2    | E3   | 2011-10-09
5         | A2    | TERM | 2012-11-9
From this data, the 2nd and 4th records must be selected.

In SQL Server 2005+ you can use this Common Table Expression (CTE) to determine the latest record by MonitorDate that doesn't have a Rank of 'Term':
WITH EmployeeData AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY empId ORDER BY MonitorDate DESC) AS RowNumber
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
WHERE Rank != 'Term'
)
SELECT *
FROM EmployeeData AS ed
WHERE ed.RowNumber = 1;
Note: The statement before this CTE will need to end in a semi-colon. Because of this, I have seen many people write them like ;WITH EmployeeData AS...
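The selection rule itself (latest row per empid, ignoring terminated ranks) can be checked against the sample data with a short Python sketch. This models the logic only and is not the T-SQL: `latest_non_term` is a hypothetical helper, and the case-insensitive comparison is an assumption that mirrors a case-insensitive collation, under which 'Term' and 'TERM' are treated alike.

```python
from datetime import date

# Sample rows from the question: (HistoryID, empid, Rank, MonitorDate).
history = [
    (1, "A1", "E1", date(2012, 8, 9)),
    (2, "A1", "E2", date(2012, 9, 12)),
    (3, "A1", "Term", date(2012, 10, 13)),
    (4, "A2", "E3", date(2011, 10, 9)),
    (5, "A2", "TERM", date(2012, 11, 9)),
]


def latest_non_term(rows):
    """Keep, per empid, the row with the latest MonitorDate whose rank is not 'term'."""
    latest = {}
    for row in rows:
        _history_id, empid, rank, monitor_date = row
        if rank.lower() == "term":  # assumed case-insensitive, see lead-in
            continue
        if empid not in latest or monitor_date > latest[empid][3]:
            latest[empid] = row
    return latest


selected = latest_non_term(history)
print(sorted(row[0] for row in selected.values()))  # [2, 4]
```

This matches the expectation in the question that the 2nd and 4th records are selected.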

You'll have to play with this. Having trouble mocking your schema on sqlfiddle.
Select foo.*
from
(
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
where rank != 'TERM'
) foo
left join
(
SELECT *
FROM vwuser AS b
CROSS APPLY fnuserrank(b.userid)
where rank != 'TERM'
) bar
on foo.empId = bar.empId
and foo.MonitorDate < bar.MonitorDate
where bar.empid is null
I always need to test out left outer joins on dates being higher. The way it works is that you do a left outer join: every row except one per user has one or more rows with a higher monitor date, and that one row is the one you want. I usually use an example from my own code, but I'm on the wrong laptop. To get it working, you can select foo.*, bar.* and look at the results to spot the row you want and make the condition correct.
You could also do this, which is easier to remember
select foo.*
from
(
SELECT *
FROM vwuser AS a
CROSS APPLY fnuserrank(a.userid)
) foo
join
(
select empid, max(monitordate) maxdate
FROM vwuser AS b
CROSS APPLY fnuserrank(b.userid)
where rank != 'TERM'
group by empid
) bar
on foo.empid = bar.empid
and foo.monitordate = bar.maxdate
I usually prefer to use set based logic over aggregate functions, but whatever works. You can tweak it also by caching the results of your TVF join into a table variable.
EDIT:
http://www.sqlfiddle.com/#!3/613e4/17 - I mocked up your TVF here. Apparently sqlfiddle didn't like "go".
select foo.*, bar.*
from
(
SELECT f.*
FROM vwuser AS a
join fnuserrank f
on a.empid = f.empid
where rank != 'TERM'
) foo
left join
(
SELECT f1.empid [barempid], f1.monitordate [barmonitordate]
FROM vwuser AS b
join fnuserrank f1
on b.empid = f1.empid
where rank != 'TERM'
) bar
on foo.empId = bar.barempid
and foo.MonitorDate < bar.barmonitordate
where bar.barempid is null
