How can I reconcile dates between two tables - snowflake-cloud-data-platform

I'm looking to solve an issue of potential dates missing from a Snowflake table. What I tried to do is create a table with calendar dates between 2017-2022 (minus weekends) based on a specific ID that I know has all expected dates. I have another table that has IDs where dates are missing and I would like to cross-reference with the first table to see the NULLs.
For example,
Column A    Column B
ID          2017-01-01
ID          2017-01-02
ID          2017-01-03

Column A    Column B
ID          NULL
ID          2017-01-02
ID          NULL
I'm trying to join these two tables to see where the NULLs exist in the second table (rows 1 and 3), however the results I'm getting back are the dates that do exist rather than the NULLs. I tried different joins but it doesn't seem to help.
My sample query:
select distinct id, c.date, a.date
from (select distinct date
from first table
where date between '2017-01-01' and '2022-12-31'
and id = 'ID') as a
left join "second table" c
on c.date = a.date
where c.id = 'id'
and c.date between '2017-01-01' and '2022-12-31'
order by c.date
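A likely cause, sketched under assumptions: the WHERE conditions on c (c.id and c.date) discard exactly the rows where the left join found no match, because c.id and c.date are NULL on those rows, so the query ends up behaving like an inner join. Moving the c.id filter into the ON clause and keeping only the rows where c.date IS NULL should list the missing dates. first_table and second_table are placeholder names for the two tables described above, and the column names follow the sample query.

```sql
-- Sketch only: first_table / second_table are placeholder names.
select a.date as missing_date
from (select distinct date
      from first_table
      where date between '2017-01-01' and '2022-12-31'
        and id = 'ID') as a          -- the ID known to have every expected date
left join second_table c
       on c.date = a.date
      and c.id = 'id'                -- filter on c belongs in ON, not WHERE
where c.date is null                 -- keep only dates with no match in c
order by a.date;
```

Each row returned is a date that exists for the reference ID but has no row for the ID being checked.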

Related

T-SQL: GROUP BY, but while keeping a non-grouped column (or re-joining it)?

I'm on SQL Server 2008, and having trouble querying an audit table the way I want to.
The table shows every time a new ID comes in, as well as every time an ID's Type changes:
Record # ID Type Date
1 ae08k M 2017-01-02:12:03
2 liei0 A 2017-01-02:12:04
3 ae08k C 2017-01-02:13:05
4 we808 A 2017-01-03:20:05
I'd kinda like to produce a snapshot of the status for each ID, at a certain date. My thought was something like this:
SELECT
ID
,max(date) AS Max
FROM
Table
WHERE
Date < 'whatever-my-cutoff-date-is-here'
GROUP BY
ID
But that loses the Type column. If I add the Type column to my GROUP BY, then I'd naturally get duplicate rows per ID, one for each Type it had before the date.
So I was thinking of running a second version of the table (via a common table expression), and left joining that in to get the Type.
On my query above, all I have to join to are the ID & Date. Somehow if the dates are too close together, I end up with duplicate results (like say above, ae08k would show up once for each Type). That or I'm just super confused.
Basically all I ever do in SQL are left joins, group bys, and common table expressions (to then left join). What am I missing that I'd need in this situation...?
Use row_number()
select *
from ( select *
, row_number() over (partition by id order by date desc) as rn
from table
WHERE Date < 'whatever-my-cutoff-date-is-here'
) tt
where tt.rn = 1
I'd kinda like to know how many IDs are of each type, at a certain date.
Well, for that you use COUNT and GROUP BY on Type:
SELECT Type, COUNT(ID)
FROM Table
WHERE Date < 'whatever-your-cutoff-date-is-here'
GROUP BY Type
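Note that the query above counts every audit row before the cutoff, so an ID whose Type changed is counted once for each Type it has had. If the goal is the number of IDs by their latest Type as of the cutoff, one possible sketch combines it with the row_number() approach. AuditTable and the cutoff literal are hypothetical stand-ins for the real table name and date.

```sql
-- Sketch only: AuditTable and the cutoff value are hypothetical.
SELECT latest.[Type], COUNT(*) AS IdCount
FROM (
    SELECT ID, [Type],
           -- rn = 1 marks the most recent row per ID before the cutoff
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC) AS rn
    FROM AuditTable
    WHERE [Date] < '2017-01-03'
) AS latest
WHERE latest.rn = 1
GROUP BY latest.[Type];
```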
Based on your comment under Zohar Peled's answer, you are probably looking for something like this:
; with cte as (select distinct ID from Table where Date < '$param')
select [data].*, [data2].[count]
from cte
cross apply
( select top 1 *
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
order by Table.Date desc
) as [data]
cross apply
( select count(1) as [count]
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
) as [data2]

Multiple Date Ranges Different Rows - Exclusion Logic

I need to find customer transactions that didn't occur while the customer was considered a "premium customer". In this example, the SQL retrieves row #2. How do I avoid retrieving any records, since this customer was premium on the purchase date, as shown in row 1?
SELECT *
FROM TblPremCus c INNER JOIN tblTrans t ON c.CustomerID = t.CustomerID
WHERE t.PurchaseDate NOT BETWEEN c.StartDate AND c.EndDate
Data:
tblPremCus
ROWID CustomerID StartDate EndDate
1 ABC123 1/1/2016 6/16/2016
2 ABC123 9/3/2016 12/21/9999
tblTrans
TransID CustomerID PurchaseDate
T1 ABC123 6/1/16
Expected Result: NONE
My understanding is that you want to get all transactions where the transaction date does not fall in between the dates that customer is 'premium'.
In a case like this, though, it is easier to find transactions that DO fall into one of those ranges, and is a simple change from what you started with:
SELECT * FROM TblPremCus c
INNER JOIN tblTrans t ON c.CustomerID = t.CustomerID
WHERE t.PurchaseDate BETWEEN c.StartDate AND c.EndDate
We've seen that using NOT BETWEEN is not good enough to weed out unwanted transactions, but now that we know which transactions we DON'T want, we can find the ones we DO: simply, any ones not returned by the above query.
SELECT * from TblTrans
WHERE TransID NOT IN (SELECT TransID FROM TblPremCus c
INNER JOIN tblTrans t ON c.CustomerID = t.CustomerID
WHERE t.PurchaseDate BETWEEN c.StartDate AND c.EndDate)
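The same exclusion can also be written with NOT EXISTS, which avoids the one pitfall of NOT IN: if the subquery ever returns a NULL, NOT IN returns no rows at all. This is a sketch of that alternative using the table and column names from the question.

```sql
-- Transactions with no premium period covering the purchase date.
SELECT t.*
FROM tblTrans t
WHERE NOT EXISTS (
    SELECT 1
    FROM TblPremCus c
    WHERE c.CustomerID = t.CustomerID
      AND t.PurchaseDate BETWEEN c.StartDate AND c.EndDate
);
```

With the sample data, T1 falls inside the premium period in row 1, so the query returns nothing, which matches the expected result.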

Merge columns and add values based on duplicate column

I have rows in my SQL Server that I would like to merge based on a duplicate PurchaseDate column. By merging, I would also like to add up the Amount values.
ID CustomerID Amount PurchaseDate TimeStamp
1 113 20 2015-10-01 0x0000000000029817
2 113 30 2015-10-01 0x0000000000029818
Based on the example above, I would like to have a single row where the values for the Amount column are summed up.
ID CustomerID Amount PurchaseDate TimeStamp
2 113 50 2015-10-01 0x0000000000029818
I'm not certain how I should go about this whether I should:
Create a new row with the new values or;
Update the latest added row and add the Amount to that row
But first I'd like to know how to get the rows with duplicate PurchaseDate column values.
UPDATE: I have here a delete script for old values
DELETE FROM Table WHERE ID NOT IN (SELECT MAX(ID) FROM Table GROUP BY CustomerID, PurchaseDate)
I suggest updating the last inserted row:
UPDATE T
SET Amount = X.Amount
FROM Table T INNER JOIN (
-- the derived table needs column aliases so that X.ID and X.Amount resolve
SELECT MAX(ID) AS ID, SUM(Amount) AS Amount
FROM Table
GROUP BY CustomerID, PurchaseDate) X ON T.ID = X.ID
In this case I'd also suggest removing the old values afterwards, using the delete script from the question.
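For the first part of the question (finding the rows that share a PurchaseDate), a minimal sketch is to group on the duplicate key and keep only groups with more than one row. Purchases is a hypothetical table name, since the question does not give the real one.

```sql
-- Sketch only: Purchases is a hypothetical table name.
SELECT CustomerID,
       PurchaseDate,
       COUNT(*)    AS RowCnt,       -- how many rows share this key
       SUM(Amount) AS TotalAmount,  -- the merged amount
       MAX(ID)     AS LatestID      -- the row that would be kept
FROM Purchases
GROUP BY CustomerID, PurchaseDate
HAVING COUNT(*) > 1;
```

Running this before the UPDATE and DELETE gives a preview of which rows would be merged and what the summed Amount would be.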

Create a view with sum total across two tables grouped by date (SQL Server 2008)

I currently have a view like this:
CREATE VIEW dbo.audit
WITH schemabinding
AS
SELECT
CONVERT(date, DateAdded) AS dt,
COUNT_BIG(*) AS cnt
FROM
dbo.Table1
GROUP BY
CONVERT(date, DateAdded)
Which returns:
dt cnt
-----------------
3/13/2015 5000
3/12/2015 1324
I'm trying to get a sum total count from both tables grouped by date into a single view. Is this possible?
i.e.
Table 1 Table 2
dt cnt | dt cnt
3/13/2015 5000 | 3/13/2015 1000
3/12/2015 1324 | 3/12/2015 1
To:
View 1
dt cnt
3/13/2015 6000
3/12/2015 1325
It would be nice to keep this in a single view, as it's just a running total of how many new items got added. Any ideas?
Assuming that there are two views, and depending on the relationship between them (based on the values in the dt columns, View1.dt and View2.dt), you could use an INNER, LEFT, RIGHT or FULL JOIN, like this:
SELECT ISNULL(v1.dt, v2.dt) AS dt, ISNULL(v1.cnt, 0) + ISNULL(v2.cnt, 0) AS cnt
FROM dbo.View1 v1 /*INNER/LEFT/RIGHT*/ FULL JOIN dbo.View2 v2 ON v1.dt = v2.dt
I've used FULL JOIN because I assumed that there are values in the View1.dt column that don't exist in the View2.dt column, and also values in View2.dt that don't exist in View1.dt. In addition, some dt values could exist in both views.
Note: I assume that second view has the same definition but it uses Table2 as data source: FROM dbo.Table2.
Assuming your data is such that there can be days missing from the tables, it's easier to handle the dates by creating a table of dates (one row per day) so that you can join the tables using it, like this:
CREATE VIEW dbo.audit WITH schemabinding AS
select
Dates.Date as dt,
count_big(Table1.date) as ct_1,
count_big(Table2.date) as ct_2
from
dbo.Dates
-- schemabinding requires two-part (schema.object) names for referenced tables
left outer join dbo.Table1 on convert(date, Table1.Date) = Dates.Date
left outer join dbo.Table2 on convert(date, Table2.Date) = Dates.Date
group by
Dates.Date
SQL Fiddle: http://sqlfiddle.com/#!6/bf116/3
If the tables are huge, there might be performance problems, because SQL Server isn't going to use an index on the dates when they are wrapped in a conversion to date, which matters if you have a WHERE clause on the view. If you need something like that, an inline table-valued function might work better, because then you can have variables for the date ranges.
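One thing to check with the two left joins above: when both tables have several rows on the same day, every Table1 row is paired with every Table2 row for that day before counting, so both counts can come out inflated. A sketch that avoids this aggregates each table on its own and then adds the per-day counts together. The view name audit_combined is hypothetical, and the DateAdded column in Table2 is assumed from the question's description of Table1.

```sql
-- Sketch only: audit_combined is a hypothetical name; Table2.DateAdded is assumed.
CREATE VIEW dbo.audit_combined
AS
SELECT combined.dt, SUM(combined.cnt) AS cnt
FROM (
    -- per-day count from the first table
    SELECT CONVERT(date, DateAdded) AS dt, COUNT_BIG(*) AS cnt
    FROM dbo.Table1
    GROUP BY CONVERT(date, DateAdded)
    UNION ALL
    -- per-day count from the second table
    SELECT CONVERT(date, DateAdded) AS dt, COUNT_BIG(*) AS cnt
    FROM dbo.Table2
    GROUP BY CONVERT(date, DateAdded)
) AS combined
GROUP BY combined.dt;
```

Because each table is counted before the results are combined, a day that appears in only one table still shows up, and no separate Dates table is needed for the totals.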
If I understand your question correctly, try something like this:
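CREATE VIEW dbo.audit WITH schemabinding AS
-- Assumes both tables have a DateAdded column.
-- Caveat: the join pairs every Table1 row with every Table2 row for the same day,
-- so the counts are counts of joined pairs, not of rows in each table.
SELECT CONVERT(date, Table1.DateAdded) AS Table1_dt,
COUNT_BIG(Table1.DateAdded) AS Table1_cnt,
CONVERT(date, Table2.DateAdded) AS Table2_dt,
COUNT_BIG(Table2.DateAdded) AS Table2_cnt
FROM dbo.Table1 INNER JOIN dbo.Table2 ON CONVERT(date, Table1.DateAdded) = CONVERT(date, Table2.DateAdded)
GROUP BY CONVERT(date, Table1.DateAdded), CONVERT(date, Table2.DateAdded)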
This solution assumes the same column names in both tables and also the same dates to be selected.

sql server, need to get total item qty for each item for each customer within date range

I have 2 relational tables orders and order_items
orders has
id
customer_name
delivery_date
order_items has
order_id
item
unit
qty
I need to see how much (i.e. Sum(qty)) of each item/unit combination each customer ordered within a specified date range.
The only way I can see to do this is to use C# or VB.NET and first create a DataTable with the distinct item/unit combinations for the date range.
Then I would loop through those item/units, get a total for each customer for them in that date range, and add them to another DataTable.
Is there a way to do this in sql alone?
Yes, there is:
SELECT customer_name, item, unit, SUM(qty) AS totals
FROM orders o INNER JOIN order_items oi
ON o.id = oi.order_id
WHERE o.delivery_date BETWEEN @datefrom AND @dateto
GROUP BY customer_name, item, unit
-- delivery_date is not in the GROUP BY, so sort on the grouped columns instead
ORDER BY customer_name, item, unit
