PostgreSQL query to sum all values in a JSON array

I am writing this query for a use case on this table:
| companyId | detailsJson                               |
|-----------|-------------------------------------------|
| 12        | {"dataKeyOne":1.10, "dataKeyTwo":1.20}    |
| 123       | {"dataKeyFour":2.12, "dataKeySeven":1.18} |
| 134       | {}                                        |
| 342       | {}                                        |
My current query and its output are:
select companyId, sum(value::float) as sum
from tableA, jsonb_each_text(detailsJson)
group by companyId;
|companyId | sum|
|----------|-------|
|12 | 2.30 |
|123 | 3.30 |
But when detailsJson is empty, I want those companyId rows to appear with a sum of 0, as shown in the table below:
|companyId | sum|
|----------|-------|
|12 | 2.30 |
|123 | 3.30 |
|134 | 0.0 |
|342 | 0.0 |
How can I achieve this using PostgreSQL?

You need an explicit left join on jsonb_each_text() so that rows whose JSON object is empty are kept, and coalesce() to turn the resulting NULL sum into 0:
select t.companyid,
       coalesce(sum(d.value::float), 0) as sum
from the_table t
  left join jsonb_each_text(t.detailsjson) as d(key, value) on true
group by t.companyid
order by t.companyid;
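The same idea can also be written without a join or GROUP BY, as a correlated scalar subquery. This is only a sketch: it reuses the the_table and detailsjson names from the answer above, and the cast to numeric (instead of float) is my own assumption, chosen to avoid binary floating-point rounding in the sum.

```sql
-- Sketch only: the_table / detailsjson follow the answer above.
-- The subquery returns NULL for an empty JSON object; coalesce maps that to 0.
select t.companyid,
       coalesce(
         (select sum(e.value::numeric)
          from jsonb_each_text(t.detailsjson) as e(key, value)),
         0
       ) as sum
from the_table t
order by t.companyid;
```

Because the aggregate runs inside the subquery, each row of the_table yields exactly one output row, which is why no GROUP BY is needed here.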

Related

Results of join listed in rows vs additional columns?

I have two tables with exactly the same fields and field names. I am trying to inner join them, but I'm having some difficulty getting the results in my desired format.
I know I can do select a.customer, a.id, a.date, a.line, a.product, b.customer, b.id, b.date, b.line, b.product, but instead of having my A data and B data on the same row, I'd like them to be on separate rows, so that each unique line becomes a row.
Table A:
|customer| id | Date | line | Product|
|--------|-----|---------|------|--------|
| 445678 | 123 | 1/1/22 | 10 | 88975 |
| 853652 | 456 | 1/10/22 | 5 | 55876 |
| 845689 | 789 | 1/25/22 | 1 | 45587 |
TABLE B:
|customer| id | Date | line | Product|
|--------|-----|---------|------|--------|
| 445678 | 489 | 1/1/22 | 1 | 87574 |
| 853652 | 853 | 1/10/22 | 12 | 45678 |
| 587435 | 157 | 2/12/22 | 3 | 25896 |
DESIRED RESULTS:
|customer| id | Date | line | Product|
|--------|-----|---------|------|--------|
| 445678 | 123 | 1/1/22 | 10 | 88975 |
| 445678 | 489 | 1/1/22 | 1 | 87574 |
| 853652 | 456 | 1/10/22 | 5 | 55876 |
| 853652 | 853 | 1/10/22 | 12 | 45678 |
My query:
select a.customer, a.id, a.date, a.line, a.product
from data1 a
inner join data2 b
on a.date = b.date
and a.customer = b.customer
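No answer is recorded for this question. One possible approach, offered as a sketch rather than a definitive solution, is to stack the two tables with UNION ALL and keep only the rows that have a partner (same customer and date) in the other table. The table names data1 and data2 come from the query above; the ORDER BY columns are my assumption about the desired ordering.

```sql
-- Rows from data1 that have a matching customer/date in data2 ...
SELECT a.customer, a.id, a.date, a.line, a.product
FROM data1 a
WHERE EXISTS (SELECT 1
              FROM data2 b
              WHERE b.customer = a.customer
                AND b.date = a.date)
UNION ALL
-- ... followed by rows from data2 that have a matching customer/date in data1.
SELECT b.customer, b.id, b.date, b.line, b.product
FROM data2 b
WHERE EXISTS (SELECT 1
              FROM data1 a
              WHERE a.customer = b.customer
                AND a.date = b.date)
ORDER BY customer, id;
```

A join puts matching rows side by side in one row, whereas UNION ALL appends result sets vertically, which is the row-per-line shape shown in the desired results.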

SQL Server: Returning rows with multiple and distinct values

I've been working on this issue for the last day and a half and just can't seem to find another question on here that works for my code.
I have a table here:
Table_D
Policynumber| EntryDate | BI_Limit | Premium
------------------------------------------------------
ABCD100001 | 5/1/16 | 15/30 | 919
ABCD100001 | 5/13/16 | 15/30 | 1008
ABCD100002 | 5/24/16 | 100/300 | 1380
ABCD100003 | 5/30/16 | 25/50 | 1452
ABCD100003 | 6/2/16 | 25/50 | 1372
ABCD100003 | 6/4/16 | 30/60 | 951
ABCD100004 | 6/11/16 | 100/300 | 1038
ABCD100005 | 6/22/16 | 100/300 | 1333
ABCD100005 | 7/2/16 | 50/100 | 1208
ABCD100006 | 7/10/16 | 250/500 | 1345
ABCD100007 | 7/18/16 | 15/30 | 996
from which I'm trying to extract the rows where a policynumber has multiple listings and more than one distinct BI_Limit. So the output should be:
Output
Policynumber | EntryDate | BI_Limit | Premium
---------------------------------------------------
ABCD100003 | 5/30/16 | 25/50 | 1452
ABCD100003 | 6/2/16 | 25/50 | 1372
ABCD100003 | 6/4/16 | 30/60 | 951
ABCD100005 | 6/22/16 | 100/300 | 1333
ABCD100005 | 7/2/16 | 50/100 | 1208
I'm storing Policynumber as VARCHAR(Max), EntryDate as DATE, BI_Limit as VARCHAR(Max), and Premium as INTEGER.
The code that I think should work would be something along the lines of:
SELECT * FROM Table_D
WHERE BI_Limit IN (
SELECT BI_Limit
FROM Table_D
GROUP BY BI_Limit
HAVING COUNT(DISTINCT BI_Limit)>1);
But this returns nothing for me. Can anyone help to show me what I'm doing wrong? Thank you.
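Your query returns nothing because it groups by BI_Limit, so COUNT(DISTINCT BI_Limit) is 1 in every group and the HAVING condition is never true. You could try EXISTS instead: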
select a.*
from Table_D a
where
exists (
select 1
from Table_D b
where a.Policynumber = b.Policynumber
and a.BI_Limit <> b.BI_Limit
)
Alternatively, find the qualifying policy numbers with GROUP BY and join back to Table_D:
SELECT d.*
FROM ( -- find the policy numbers with multiple listings and different BI_Limit values
SELECT PolicyNumber
FROM Table_D
GROUP BY PolicyNumber
HAVING count(*) > 1
AND MIN (BI_Limit) <> MAX (BI_Limit)
) m -- join back to Table_D for the other columns
INNER JOIN Table_D d
ON m.PolicyNumber = d.PolicyNumber
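As a further variation on the answers above, the same MIN/MAX comparison can be done with window functions, so Table_D is not joined to itself. This is a sketch under the assumption that the table and column names are exactly those given in the question; the MinLimit and MaxLimit aliases are hypothetical.

```sql
SELECT x.Policynumber, x.EntryDate, x.BI_Limit, x.Premium
FROM (
    SELECT d.Policynumber, d.EntryDate, d.BI_Limit, d.Premium,
           -- smallest and largest BI_Limit seen for this policy
           MIN(d.BI_Limit) OVER (PARTITION BY d.Policynumber) AS MinLimit,
           MAX(d.BI_Limit) OVER (PARTITION BY d.Policynumber) AS MaxLimit
    FROM Table_D d
) x
-- the two differ only when the policy has at least two distinct BI_Limit values
WHERE x.MinLimit <> x.MaxLimit;
```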

Optimisation of a MSSQL Query - group by multiple columns

Hey guys, I could use some advice. I have the following two tables:
Table Model:
+----------------+---------------+-------------+------------------------+-------------+--------+-----------------+------------------+------------------+------------------------------+------------------------------+----------------------+----------------------+------------------+
| DLTCountryCode | SupplierID | ModelNumber | ModelDescription | Brand | Fedas | MeasurementUnit | MinModelNetPrice | MaxModelNetPrice | MinModelSuggestedRetailPrice | MaxModelSuggestedRetailPrice | MinModelInsteadPrice | MaxModelInsteadPrice | PictureAvailable |
+----------------+---------------+-------------+------------------------+-------------+--------+-----------------+------------------+------------------+------------------------------+------------------------------+----------------------+----------------------+------------------+
| AT | 9120048150008 | 2012266 | xxx | Brand | 115946 | STK | 6.05 | 6.05 | 10.95 | 10.95 | 0 | 0 | 1 |
+----------------+---------------+-------------+------------------------+-------------+--------+-----------------+------------------+------------------+------------------------------+------------------------------+----------------------+----------------------+------------------+
Table ModelColorSizeInventory:
+----------------+---------------+-------------+-----------+------+---------------+----------+-------------------------+
| DLTCountryCode | SupplierID | ModelNumber | ColorCode | Size | ItemNumber | Quantity | InventoryDateTime |
+----------------+---------------+-------------+-----------+------+---------------+----------+-------------------------+
| AT | 9120048150008 | 2012266 | 801 | L | 9008601584968 | 0 | 2017-09-29 11:16:02.347 |
| AT | 9120048150008 | 2012266 | 801 | M | 9008601584951 | 0 | 2017-09-29 11:16:02.347 |
| AT | 9120048150008 | 2012266 | 801 | S | 9008601584944 | 2 | 2017-09-29 11:16:02.347 |
| AT | 9120048150008 | 2012266 | 801 | XL | 9008601584975 | 4 | 2017-09-29 11:16:02.347 |
| AT | 9120048150008 | 2012266 | 801 | XXL | 9008601584982 | 6 | 2017-09-29 11:16:02.347 |
+----------------+---------------+-------------+-----------+------+---------------+----------+-------------------------+
And the following Query:
SELECT dccdm.*, SUM(dccdmcsi.[Quantity]) AS QuantityModel
FROM "Model" AS "dccdm"
LEFT JOIN ModelColorSizeInventory AS dccdmcsi ON dccdm.[ModelNumber] = dccdmcsi.[ModelNumber]
WHERE (
dccdm.ModelNumber IN('2012266')
)
AND dccdmcsi.[Quantity] >0
AND dccdm.[DLTCountryCode]='AT'
GROUP BY dccdm.[DLTCountryCode],dccdm.[SupplierID],dccdm.[ModelNumber],dccdm.[ModelDescription],dccdm.[Brand],dccdm.[Fedas],dccdm.[MeasurementUnit],dccdm.[MinModelNetPrice],dccdm.[MaxModelNetPrice],dccdm.[MinModelSuggestedRetailPrice],dccdm.[MaxModelSuggestedRetailPrice],dccdm.[MinModelInsteadPrice],dccdm.[MaxModelInsteadPrice],dccdm.[PictureAvailable]
This query works as expected: I'm joining ModelColorSizeInventory to get the sum of the quantities of all variants.
One thing that bothers me is the GROUP BY part, because if I omit the GROUP BY clause I get the following error:
Msg 8120, Column 'DLTCountryCode' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Since I'm not that familiar with MSSQL, my question is:
How can I write this query without this complex GROUP BY clause?
The reason behind this question is that listing so many columns in a GROUP BY clause feels wrong in queries like these ... ;)
You could use a "subview" for the grouped amounts, or simply use a subquery in the join, like:
SELECT dccdm.*, ISNULL(dccdmcsi.SumQuantity, 0) AS QuantityModel
FROM dbo.[Model] dccdm
LEFT JOIN (
    SELECT ModelNumber, SUM([Quantity]) AS SumQuantity
    FROM dbo.ModelColorSizeInventory
    GROUP BY ModelNumber
) dccdmcsi ON dccdm.ModelNumber = dccdmcsi.ModelNumber
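Another way to avoid the long GROUP BY, sketched here with some assumptions, is to compute the sum per model in an OUTER APPLY. I am assuming that a model is identified by DLTCountryCode, SupplierID and ModelNumber together (the question only joins on ModelNumber), and I have kept the filters from the original query. The aliases m, i and q are hypothetical.

```sql
SELECT m.*, ISNULL(q.QuantityModel, 0) AS QuantityModel
FROM dbo.[Model] AS m
OUTER APPLY (
    -- one aggregated row per model; SUM is NULL when no variant is in stock
    SELECT SUM(i.[Quantity]) AS QuantityModel
    FROM dbo.ModelColorSizeInventory AS i
    WHERE i.DLTCountryCode = m.DLTCountryCode
      AND i.SupplierID = m.SupplierID
      AND i.ModelNumber = m.ModelNumber
      AND i.[Quantity] > 0
) AS q
WHERE m.ModelNumber IN ('2012266')
  AND m.DLTCountryCode = 'AT';
```

Unlike the original query, where the Quantity > 0 condition in the WHERE clause discards models without stock, this version keeps such models and reports 0 for them.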

Exclude Secondary ID Records from Original SELECT

I'm relatively new to SQL and am running into a lot of issues trying to figure this one out. I've tried using a LEFT JOIN, and dabbled in using functions to get this to work but to no avail.
For every UserID, if there is a NULL DateTermed value, I need to remove all records of that ProductID for that UserID from my SELECT.
I am using SQL Server 2014.
Example Table
+--------------+-------------+---------------+
| UserID | ProductID | DateTermed |
+--------------+-------------+---------------+
| 578 | 2 | 1/7/2017 |
| 578 | 2 | 1/7/2017 |
| 578 | 1 | 1/15/2017 |
| 578 | 1 | NULL |
| 649 | 1 | 1/9/2017 |
| 649 | 2 | 1/11/2017 |
+--------------+-------------+---------------+
Desired Output
+--------------+-------------+---------------+
| UserID | ProductID | DateTermed |
+--------------+-------------+---------------+
| 578 | 2 | 1/7/2017 |
| 578 | 2 | 1/7/2017 |
| 649 | 1 | 1/9/2017 |
| 649 | 2 | 1/11/2017 |
+--------------+-------------+---------------+
Try the following:
SELECT a.userid, a.productid, a.datetermed
FROM yourtable a
LEFT OUTER JOIN (SELECT DISTINCT userid, productid FROM yourtable WHERE
datetermed is null) b
on a.userid = b.userid and a.productid = b.productid
WHERE b.userid is null
This left outer joins every record to the UserID and ProductID pairs that have a NULL date. If you keep only the records that have no match in the joined table (b.userid is null), you are left with the records whose UserID and ProductID pair never has a NULL date.
You can use this WHERE condition:
SELECT
UserID, ProductID, DateTermed
FROM
[YourTableName]
WHERE
CONVERT(VARCHAR(20), UserID) + '-' +
CONVERT(VARCHAR(20), ProductID) NOT IN (
select CONVERT(VARCHAR(20), UserID) + '-' + CONVERT(VARCHAR(20), ProductID)
from
[YourTableName]
where DateTermed is null)
When you concatenate the UserID and the ProductID with a separator, you get a unique value for each pair, which you can then use as a "key" to exclude the pairs that have a NULL value in the DateTermed field. The separator matters: without it, UserID 1 with ProductID 23 and UserID 12 with ProductID 3 would both produce "123".
Hope this helps.
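The same exclusion can be expressed with NOT EXISTS, which compares the two columns directly and needs no string conversion. This is a sketch that reuses the placeholder table name from the answer above; the aliases t and n are hypothetical.

```sql
SELECT t.UserID, t.ProductID, t.DateTermed
FROM [YourTableName] AS t
WHERE NOT EXISTS (
    -- any row for the same user and product with a NULL DateTermed
    SELECT 1
    FROM [YourTableName] AS n
    WHERE n.UserID = t.UserID
      AND n.ProductID = t.ProductID
      AND n.DateTermed IS NULL
);
```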

Where to use Outer Apply

MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
I am getting the same results when LEFT JOIN and OUTER APPLY are used.
LEFT JOIN
SELECT T1.ID,T1.NAME,T2.PERIOD,T2.QTY
FROM MASTER T1
LEFT JOIN DETAILS T2 ON T1.ID=T2.ID
OUTER APPLY
SELECT T1.ID,T1.NAME,TAB.PERIOD,TAB.QTY
FROM MASTER T1
OUTER APPLY
(
SELECT ID,PERIOD,QTY
FROM DETAILS T2
WHERE T1.ID=T2.ID
)TAB
Where should I use LEFT JOIN and where should I use OUTER APPLY?
A LEFT JOIN should be replaced with OUTER APPLY in the following situations.
1. If we want to join two tables based on TOP n results
Suppose we need to select Id and Name from Master and the last two dates for each Id from the Details table. A first attempt with LEFT JOIN:
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which produces the following result:
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This gives the wrong result: the derived table returns only the two latest dates from the whole Details table, irrespective of Id, even though we join on Id afterwards. The proper solution is to use OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
Here is how it works: with LEFT JOIN, the TOP 2 rows are joined to MASTER only after the query inside derived table D has been executed. With OUTER APPLY, the condition WHERE M.ID=D.ID is evaluated inside the applied query, so each Id in Master is joined with its own TOP 2 dates, which gives the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used in place of LEFT JOIN when we need to combine rows from the Master table with the result of a table-valued function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And here is the function:
CREATE FUNCTION FnGetQty
(
@Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=@Id
)
which generates the following result:
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
3. Retain NULL values when unpivoting
Suppose you have the table below:
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE and TODATE into one column, it eliminates NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the result below. Note that the record for Id 3 is missing.
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases APPLY can be used (either CROSS APPLY or OUTER APPLY; they are interchangeable here because the VALUES constructor always returns rows).
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which produces the following result and retains the row where Id is 3:
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
In your example queries the results are indeed the same.
But OUTER APPLY can do more: for each outer row you can produce an arbitrary inner result set. For example, you can join the TOP 1 ... ORDER BY ... row. A LEFT JOIN on its own can't express that.
The computation of the inner result set can reference outer columns (like your example did).
OUTER APPLY is strictly more powerful than LEFT JOIN. This is easy to see because each LEFT JOIN can be rewritten as an OUTER APPLY, just like you did. Its syntax is more verbose, though.
