pg removing common part in array


Suppose a PostgreSQL query outputs rows with two columns, each an array of type int[][2]:
track0 track1
{{1,2},{5847,5848},{5845,5846}......} {{1,2},{5847,5848},{10716,10715}........}
{{13,14},{1,2},{5847,5848},{284,285}........} {{13,14},{1,2},{5847,5848},{1284,1285}................}
How can we remove the leading sub-arrays common to both columns, except the last one?
In the first row, {1,2} should be removed from both columns.
In the second row, {13,14},{1,2} should be removed from both columns.
Can it be done in plain SQL, or is it necessary to use PL/pgSQL?
I could manage the PL/pgSQL version but would like an SQL solution.

Given your sample, with no gaps in the sequence of equal pairs:
t=# with d(t0,t1) as (values
('{{1,2},{5847,5848},{5845,5846}}'::int[][2],'{{1,2},{5847,5848},{10716,10715}}'::int[][2])
, ('{{13,14},{1,2},{5847,5848},{284,285}}'::int[][2],'{{13,14},{1,2},{5847,5848},{1284,1285}}'::int[][2])
)
, u as (
select *
from d
join unnest(t0) with ordinality t(e0,o0) on true
left outer join unnest(t1) with ordinality t1(e1,o1) on o0=o1
)
, p as (
select *
, case when (
not lead(e0=e1) over w
or not lead(e0=e1,2) over w
or e0!=e1
) AND (o1%2) = 1
then ARRAY[e0,lead(e0) over w] end r0
, case when (
not lead(e0=e1) over w
or not lead(e0=e1,2) over w
or e0!=e1
) AND (o1%2) = 1
then ARRAY[e1,lead(e1) over w] end r1
from u
window w as (partition by t0,t1 order by o0)
)
select t0,t1,array_agg(r0),array_agg(r1)
from p
where r0 is not null or r1 is not null
group by t0,t1
;
t0 | t1 | array_agg | array_agg
---------------------------------------+-----------------------------------------+---------------------------+-----------------------------
{{13,14},{1,2},{5847,5848},{284,285}} | {{13,14},{1,2},{5847,5848},{1284,1285}} | {{5847,5848},{284,285}} | {{5847,5848},{1284,1285}}
{{1,2},{5847,5848},{5845,5846}} | {{1,2},{5847,5848},{10716,10715}} | {{5847,5848},{5845,5846}} | {{5847,5848},{10716,10715}}
(2 rows)
You can skip part of the complication if you add some trick for multidimensional arrays (look here and here).
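For comparison, the intended transformation can also be stated procedurally. This is a minimal Python sketch, not part of the SQL answer above. The function name strip_common_leading is hypothetical, and it assumes each array is given as a list of [a, b] pairs. It counts the leading pairs that are equal in both arrays and drops all of them except the last.

```python
def strip_common_leading(t0, t1):
    """Drop the leading pairs shared by t0 and t1, keeping the last shared pair."""
    common = 0
    for p0, p1 in zip(t0, t1):
        if p0 != p1:
            break
        common += 1
    # Keep the last common pair, so drop one fewer than the common prefix length.
    drop = max(common - 1, 0)
    return t0[drop:], t1[drop:]


# The two sample rows from the question.
rows = [
    ([[1, 2], [5847, 5848], [5845, 5846]],
     [[1, 2], [5847, 5848], [10716, 10715]]),
    ([[13, 14], [1, 2], [5847, 5848], [284, 285]],
     [[13, 14], [1, 2], [5847, 5848], [1284, 1285]]),
]
for t0, t1 in rows:
    print(strip_common_leading(t0, t1))
```

For both sample rows this keeps {5847,5848} plus the differing tail, which matches the array_agg columns in the output above.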

Related

Why is my total not coming up correctly, and how can I fix it?

I'm working on remaking an Access report in SSRS with data from a SQL Server.
In the report I have a matrix, and one of the values is SumOfPieces.
SumOfPieces is in my Query as sum(t1.pieces) as SumOfPieces.
Inside the table I get the correct row values by just using [SumOfPieces], but my total is not adding anything together. For example this is what I am getting:
Product | Facility | Shift/Line | Pieces
BFS | BRWP | A 1 | 65,000
BFS | MHWP | A 2 | 70,000
BFS | MHWP | B 2 | 80,000
________________________________________
Total | | | 70,000
For some reason it's giving me the middle value.
The expression for the total is simply =Sum(fields!SumOfPieces.Value)
I tried different variations of something like this expression =Sum(avg(fields!SumPieces.Value,"Product1")
In Access this is accomplished with queries nested 4-5 deep.
For this field specifically it looks like
Original query t1 with t1.Pieces
Next query on t1 with t1.Pieces summed as t1.SumOfPieces
Next query joins t1 with others
The Access report just uses that SumOfPieces as the row value, and then a sum(SumOfPieces) for the total.
Sample of my Dataset Query:
SELECT
StaveHistorySummary.fk_Inspectors
,StaveHistorySummary.fk_InspectionSites
,StaveHistorySummary.fk_ProductionLines
,StaveHistorySummary.fk_ProductTypes
,StaveHistorySummary.DateMade
,StaveHistorySummary.[TimeStamp]
,StaveHistorySummary.StaveHistoryguid
,InspectionSites.SiteAbbr
,Inspectors.Name
,ProductTypes.Product
,ProductionLines.LineName
,CAST(sum(Millproduction.Pieces) as int) AS SumPieces
,CASE
WHEN SapEdgingInches IS NOT NULL THEN SapEdgingInches
WHEN HeartEdgingInches IS NOT NULL THEN HeartEdgingInches
WHEN BothEdgingInches IS NOT NULL THEN BothEdgingInches
WHEN SawnIncorrInches IS NOT NULL THEN SawnIncorrInches
WHEN EqualizedIncorrInches IS NOT NULL THEN EqualizedIncorrInches
WHEN SawnOKInches IS NOT NULL THEN SawnOKInches
END AS WIDTH
FROM
StaveHistorySummary
INNER JOIN ProductionLines
ON StaveHistorySummary.fk_ProductionLines = ProductionLines.ProductionLines_NDX
INNER JOIN InspectionSites
ON StaveHistorySummary.fk_InspectionSites = InspectionSites.InspectionSites_NDX
INNER JOIN ProductTypes
ON StaveHistorySummary.fk_ProductTypes = ProductTypes.ProductTypes_NDX
INNER JOIN Inspectors
ON StaveHistorySummary.fk_Inspectors = Inspectors.Inspectors_NDX
INNER JOIN MillProduction
ON inspectionsites.inspectionsites_ndx = MillProduction.fk_inspectionsites
AND productionlines.productionlines_ndx = MillProduction.fk_productionlines
AND producttypes.producttypes_ndx = millproduction.fk_producttypes
WHERE (CAST(CAST(stavehistorysummary.DateMade as date) as datetime) BETWEEN '6/16/2019' AND '6/22/2019')
AND (CAST(CAST(MillProduction.DateMade as date) as datetime) BETWEEN '6/16/2019' AND '6/22/2019')
GROUP BY
StaveHistorySummary.fk_Inspectors
,StaveHistorySummary.fk_InspectionSites
,StaveHistorySummary.fk_ProductionLines
,StaveHistorySummary.fk_ProductTypes
,StaveHistorySummary.DateMade
,StaveHistorySummary.[TimeStamp]
,StaveHistorySummary.StaveHistoryguid
,InspectionSites.SiteAbbr
,Inspectors.Name
,ProductTypes.Product
,ProductionLines.LineName
,CASE
WHEN SapEdgingInches IS NOT NULL THEN SapEdgingInches
WHEN HeartEdgingInches IS NOT NULL THEN HeartEdgingInches
WHEN BothEdgingInches IS NOT NULL THEN BothEdgingInches
WHEN SawnIncorrInches IS NOT NULL THEN SawnIncorrInches
WHEN EqualizedIncorrInches IS NOT NULL THEN EqualizedIncorrInches
WHEN SawnOKInches IS NOT NULL THEN SawnOKInches
END

SQL Server - Each GROUP BY expression must contain at least one column that is not an outer reference

I need to identify all records that have MostRecent=-1, OilWell=-1, plus are duplicate records with the same Api, and join these to get the associated CompanyName.
With the query:
SELECT
BLMAPDCONTACT.CompanyName, APD.Api, APD.ID, APD.MostRecent,
APD.Project_Nu, APD.Unit_Lease, APD.Well_Nu, APD.OilWell
FROM
APD
INNER JOIN
BLMAPDCONTACT ON APD.BLM_APD_Cont = BLMAPDCONTACT.OBJECTID
WHERE
(APD.Api IN (SELECT APD.Api
FROM APD AS Tmp
WHERE APD.MostRecent = -1 AND APD.OilWell = -1
GROUP BY APD.Api
HAVING Count(APD.Api) > 1))
ORDER BY
APD.Api DESC;
I get this error:
Each GROUP BY expression must contain at least one column that is not an outer reference.
This error appeared after I added the JOIN clause; without it, it worked.
Example desired output will match on the following records from the APD table:
APD.Api | APD.MostRecent | APD.OilWell
--------------------------------------
123 | -1 | -1
123 | -1 | -1
And not:
APD.Api | APD.MostRecent | APD.OilWell
--------------------------------------
321 | 0 | -1
321 | -1 | -1

Did you try this?
SELECT BLMAPDCONTACT.CompanyName, APD.Api, APD.ID, APD.MostRecent, APD.Project_Nu, APD.Unit_Lease, APD.Well_Nu, APD.OilWell
FROM APD INNER JOIN BLMAPDCONTACT ON APD.BLM_APD_Cont = BLMAPDCONTACT.OBJECTID
WHERE (APD.Api IN
(SELECT tmp.Api
FROM APD As Tmp
WHERE tmp.MostRecent=-1 AND tmp.OilWell=-1
GROUP BY tmp.Api HAVING Count(tmp.Api)>1))
ORDER BY APD.Api DESC;

The aliases and table names are confusing me a bit. If you run the following, do you still get the same error?
SELECT b.CompanyName
, a.Api
, a.ID
, a.MostRecent
, a.Project_Nu
, a.Unit_Lease
, a.Well_Nu
, a.OilWell
FROM APD a INNER JOIN BLMAPDCONTACT b
ON a.BLM_APD_Cont = b.OBJECTID
WHERE a.Api IN (
SELECT tmp.Api
FROM APD As Tmp
WHERE tmp.MostRecent = -1 AND tmp.OilWell = -1
GROUP BY tmp.Api
HAVING Count(tmp.Api) > 1
)
ORDER BY a.Api DESC;
Also, just double check that I've translated tables to aliases correctly.
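The reason the alias matters: in the original query the subquery's FROM clause names the table Tmp, so APD.Api inside it refers to the outer query, and GROUP BY APD.Api groups only by an outer reference. Below is a small sketch of the corrected query run through Python's sqlite3 module. SQLite stands in for SQL Server here, the column list is reduced, and the company name is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BLMAPDCONTACT (OBJECTID INTEGER PRIMARY KEY, CompanyName TEXT);
CREATE TABLE APD (ID INTEGER PRIMARY KEY, Api INTEGER, MostRecent INTEGER,
                  OilWell INTEGER, BLM_APD_Cont INTEGER);
-- Hypothetical sample data mirroring the question's two examples.
INSERT INTO BLMAPDCONTACT VALUES (1, 'Acme Oil');
INSERT INTO APD VALUES (1, 123, -1, -1, 1), (2, 123, -1, -1, 1),
                       (3, 321,  0, -1, 1), (4, 321, -1, -1, 1);
""")

# Every column inside the subquery uses the subquery's own alias (tmp).
query = """
SELECT b.CompanyName, a.Api, a.ID
FROM APD a INNER JOIN BLMAPDCONTACT b ON a.BLM_APD_Cont = b.OBJECTID
WHERE a.Api IN (
    SELECT tmp.Api FROM APD AS tmp
    WHERE tmp.MostRecent = -1 AND tmp.OilWell = -1
    GROUP BY tmp.Api
    HAVING COUNT(tmp.Api) > 1
)
ORDER BY a.Api DESC, a.ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only the two rows with Api 123 come back; Api 321 is excluded because just one of its rows has MostRecent = -1.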

Concatenate array elements on a joined table PostgreSQL

Is it possible to do an element-by-element concatenation of two arrays if I have a query like this?
EDIT: The arrays do not always have the same number of elements;
for example, array1 may have 4 elements and array2 8.
drop table if exists a;
drop table if exists b;
create temporary table a as (select 1 as id,array['a','b','c'] as array1);
create temporary table b as (select 1 as id,array['x','y','z'] as array2);
select
a.id,
a.array1,
b.array2,
array_concat--This has to be a 1 to 1 ordered concatenation (see
--example below)
from a
left join b on b.id=a.id
What I would like to obtain here is a paired concatenation of both arrays 1 and 2, like this:
id array1 array2 array_concat
1 ['a','b','c'] ['d','e','f'] ['a-d','b-e','c-f']
2 ['x','y','z'] ['i','j','k'] ['x-i','y-j','z-k']
3 ...
I tried using unnest but I can't make it work:
select
a.id,
a.array1,
b.array2,
array_concat
from table a
left join b on b.id=a.id
left join (select a.array1,b.array2, array_agg(a1||b2)
FROM unnest(a.array1, b.array2)
ab (a1, b2)
) ag on ag.array1=a.array1 and ag.array2=b.array2
;
EDIT:
This works for only one table:
SELECT array_agg(el1||el2)
FROM unnest(ARRAY['a','b','c'], ARRAY['d','e','f']) el (el1, el2);
++Thanks to https://stackoverflow.com/users/1463595/%D0%9D%D0%9B%D0%9E
EDIT:
I arrived at a solution that is very close, but it mixes up some of the intermediate values once the arrays are concatenated, so I still need a correct solution.
The approach I am now using is:
1) Creating one table based on the 2 separate ones
2) aggregating using Lateral:
create temporary table new_table as
SELECT
id,
a.a,
b.b
FROM a a
LEFT JOIN b b on a.id=b.id;
SELECT id,
ab_unified
FROM pair_sources_mediums_campaigns,
LATERAL (SELECT ARRAY_AGG(a||'[-]'||b order by grp1) as ab_unified
FROM (SELECT DISTINCT case when a is null
then 'not tracked'
else a
end as a
,case when b is null
then 'none'
else b
end as b
,rn - ROW_NUMBER() OVER(PARTITION BY a,b ORDER BY rn) AS grp1
FROM unnest(a,b) with ordinality as el (a,b,rn)
) AS sub
) AS lat1
order by 1;

Something like this.
with a_elements (id, element, idx) as (
select a.id,
u.element,
u.idx
from a
cross join lateral unnest(a.array1) with ordinality as u(element, idx)
), b_elements (id, element, idx) as (
select b.id,
u.element,
u.idx
from b
cross join lateral unnest(b.array2) with ordinality as u(element, idx)
)
select id,
array_agg(concat_ws('-', a.element, b.element) order by idx) as elements
from a_elements a
full outer join b_elements b using (id, idx)
group by id;
The join operator using (..) will automatically take the non-null value from the joined tables. This removes the need to use e.g. coalesce(a.id, b.id).
It's not pretty and definitely not efficient for large tables, but it seems to do all you want.
For arrays that do not have the same number of elements, the unmatched positions contain only the element from the longer array.
For this dataset:
insert into a
(id, array1)
values
(1, array['a','b','c','d']),
(2, array['d','e','f']);
insert into b
(id, array2)
values
(1, array['x','y','z']),
(2, array['m','n','o','p']);
It returns this result:
id | elements
---+----------------
1 | {a-x,b-y,c-z,d}
2 | {d-m,e-n,f-o,p}
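The pairing behaviour of that full outer join on the element index can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the SQL. The function name concat_pairs is hypothetical, and skipping missing elements is meant to mirror how concat_ws ignores NULL arguments.

```python
from itertools import zip_longest


def concat_pairs(array1, array2, sep="-"):
    """Join elements position by position; a missing element is simply skipped."""
    pairs = zip_longest(array1 or [], array2 or [])
    return [sep.join(x for x in pair if x is not None) for pair in pairs]


print(concat_pairs(["a", "b", "c", "d"], ["x", "y", "z"]))
print(concat_pairs(["d", "e", "f"], ["m", "n", "o", "p"]))
```

zip_longest plays the role of the full outer join: positions that exist in only one array still produce an output element.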

I think you are overthinking it; try this (SQLFiddle):
select
a.id,
a.array1,
b.array2,
array[a.array1[1] || '-' || b.array2[1],
a.array1[2] || '-' || b.array2[2],
a.array1[3] || '-' || b.array2[3]] array_concat
from
a inner join
b on b.id = a.id
;

EF6 - Generating unneeded nested queries

I have the following tables:
MAIN_TBL:
Col1 | Col2 | Col3
------------------
A | B | C
D | E | F
And:
REF_TBL:
Ref1 | Ref2 | Ref3
------------------
A | G1 | Foo
D | G1 | Bar
Q | G2 | Xyz
I wish to write the following SQL query:
SELECT M.Col1
FROM MAIN_TBL M
LEFT JOIN REF_TBL R
ON R.Ref1 = M.Col1
AND R.Ref2 = 'G1'
WHERE M.Col3 = 'C'
I wrote the following LINQ query:
from main in dbContext.MAIN_TBL
join refr in dbContext.REF_TBL
on "G1" equals refr.Ref2
into refrLookup
from refr in refrLookup.DefaultIfEmpty()
where main.Col1 == refr.Col1
select main.Col1
And the generated SQL was:
SELECT
[MAIN_TBL].[Col1]
FROM (SELECT
[MAIN_TBL].[Col1] AS [Col1],
[MAIN_TBL].[Col2] AS [Col2],
[MAIN_TBL].[Col3] AS [Col3]
FROM [MAIN_TBL]) AS [Extent1]
INNER JOIN (SELECT
[REF_TBL].[Ref1] AS [Ref1],
[REF_TBL].[Ref2] AS [Ref2],
[REF_TBL].[Ref3] AS [Ref3]
FROM [REF_TBL]) AS [Extent2] ON [Extent1].[Col1] = [Extent2].[Ref1]
WHERE ('G1' = [Extent2].[DESCRIPTION]) AND ([Extent2].[Ref1] IS NOT NULL) AND CAST( [Extent1].[Col3] AS VARCHAR) = 'C') ...
Looks like it is nesting a query within another query, while I just want it to pull from the table. What am I doing wrong?

I may be wrong, but your LINQ query does not seem to do the same thing as your SQL query, especially in the left join clause.
I would go for this if you want something similar to your SQL query:
from main in dbContext.MAIN_TBL.Where(x => x.Col3 == "C")
join refr in dbContext.REF_TBL
on new{n = "G1", c = main.Col1} equals new{n = refr.Ref2, c = refr.Ref1}
into refrLookup
from r2 in refrLookup.DefaultIfEmpty()
select main.Col1
By the way, it doesn't make much sense to left join a table that is not used in the select clause: you will just get duplicate Col1 values if there is more than one related row in the left-joined table.

Taking Count of Subqueries in Full Outer Join

I am working on SQL Server 2012. My query has the following structure.
SELECT A.attributeA
,A.attributeB
,Count(A.*) AS CountA -- I know this is wrong.
,Count(B.*) AS CountB
FROM
(
SELECT ... FROM Foo1
) A
FULL OUTER JOIN
(
SELECT ... FROM Foo2
) B
ON A.attribute1 = B.attribute1
GROUP BY
A.attributeA
,A.attributeB
I want to take the count of all rows from subqueries A and B. How do I do that? Thank you in advance.

Assuming the goal is to just count the non-null records from each side of the join, you can specify the column name (as mentioned in a comment) that you expect to be non-null, often the same column as in your join. For example, since you joined on attribute1:
SELECT A.attributeA
,A.attributeB
,Count(A.attribute1) AS CountA
,Count(B.attribute1) AS CountB
FROM
...
Note that this tells you nothing about when the 2 overlap, if that is part of your goal. For that type of counting, you can use a SUM combined with CASE:
SELECT A.attributeA
,A.attributeB
,Count(A.attribute1) AS CountA
,Count(B.attribute1) AS CountB
,SUM(CASE WHEN A.attribute1 IS NOT NULL AND B.attribute1 IS NOT NULL
THEN 1 ELSE 0
END) as CountAAndBOverlap
FROM
...
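To see why counting a specific column works, recall that COUNT(column) skips NULLs, and a full outer join fills the missing side with NULLs. The sketch below uses Python's sqlite3 module with a hypothetical table named joined that stands in for the already-joined result (columns a_attr and b_attr play the roles of A.attribute1 and B.attribute1). It is an illustration under those assumptions, not the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per row of the full outer join result; NULL marks a missing side.
CREATE TABLE joined (a_attr TEXT, b_attr TEXT);
INSERT INTO joined VALUES ('k1', 'k1'), ('k2', NULL), (NULL, 'k3'), ('k4', 'k4');
""")

count_a, count_b, overlap = conn.execute("""
SELECT COUNT(a_attr),
       COUNT(b_attr),
       SUM(CASE WHEN a_attr IS NOT NULL AND b_attr IS NOT NULL THEN 1 ELSE 0 END)
FROM joined
""").fetchone()
print(count_a, count_b, overlap)
```

With this sample data, three rows have a value on the A side, three on the B side, and two on both.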

If your purpose is just to get the counts of the two subqueries, you can do something like this. I've also generated a common ID for the two subqueries so that I can JOIN them. Lastly, I used INNER JOIN instead of FULL JOIN.
SELECT
CountSubqueryA AS CountA,
CountSubqueryB AS CountB
FROM
(SELECT 1 AS ID,COUNT(*) AS CountSubqueryA FROM Foo1 ) AS A
INNER JOIN
(SELECT 1 AS ID,COUNT(*) AS CountSubqueryB FROM Foo2 ) AS B
ON A.ID=B.ID

I saw your GROUP BY and thought maybe you wanted this. It will create up to 3 groups that look something like this:
==========================
| A | NOT B | 24 |
--------------------------
| NOT A | B | 31 |
--------------------------
| A | B | 69 |
==========================
SELECT
CASE WHEN A.attribute1 IS NOT NULL THEN 'A' ELSE 'NOT A' END,
CASE WHEN B.attribute1 IS NOT NULL THEN 'B' ELSE 'NOT B' END,
COUNT(*)
FROM
(
SELECT ... FROM Foo1
) A
FULL OUTER JOIN
(
SELECT ... FROM Foo2
) B
ON A.attribute1 = B.attribute1
GROUP BY
CASE WHEN A.attribute1 IS NOT NULL THEN 'A' ELSE 'NOT A' END,
CASE WHEN B.attribute1 IS NOT NULL THEN 'B' ELSE 'NOT B' END
