compare timestamp between two tables - snowflake-cloud-data-platform

I have a table "Summary" that contains "ID", "NAME" and "TIMESTAMP". I want to insert into this table all the rows from my DB table whose timestamp is greater than the MAX(TIMESTAMP) of the "Summary" table.
For example,
Summary table
ID  NAME  TIMESTAMP
-------------------------------
1   A     2018-06-28 15:12:46
2   B     2018-06-28 16:12:46
3   C     2018-06-28 18:12:46
DB table
ID  NAME  TIMESTAMP
-------------------------------
1   D     2018-06-28 15:12:46
2   E     2018-06-28 19:12:46
3   F     2018-06-28 22:12:46
So the MAX(timestamp) of the "Summary" table is "2018-06-28 18:12:46". I need to check whether any row in DB has a timestamp greater than that and, if so, select that row from DB and insert it into the "Summary" table.
Desired table
ID  NAME  TIMESTAMP
-------------------------------
1   A     2018-06-28 15:12:46
2   B     2018-06-28 16:12:46
3   C     2018-06-28 18:12:46
2   E     2018-06-28 19:12:46
3   F     2018-06-28 22:12:46
I have tried this by comparing the two tables but it doesn't work:
SELECT *
FROM DB, SUMMARY
WHERE DB.timestamp >= MAX(SUMMARY.timestamp);

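Starting with your SQL:
SELECT *
FROM DB, SUMMARY
WHERE DB.timestamp >= MAX(SUMMARY.timestamp);
Turn the attempt to get a max into a subquery:
SELECT *
FROM DB
WHERE DB.timestamp >= (SELECT MAX(timestamp) FROM SUMMARY);
BTW, the query you're showing cross joins DB and SUMMARY (the comma in FROM) and then calls MAX() in the WHERE clause, where aggregate functions are not allowed. With the subquery, the optimizer will resolve the max(timestamp) in a separate step to a scalar value, and then run the main query using it for the >= comparison. Use > instead of >= if rows equal to the current maximum should be skipped, as your description suggests.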
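To go from the SELECT to the insert the question asks for, the scalar subquery can feed an INSERT ... SELECT. The following is a minimal sketch, not a Snowflake-tested script: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, lowercase table names summary and db, and a strict > comparison to match the "greater than" wording of the question.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; names mirror the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE summary (id INTEGER, name TEXT, timestamp TEXT);
    CREATE TABLE db      (id INTEGER, name TEXT, timestamp TEXT);
    INSERT INTO summary VALUES
        (1, 'A', '2018-06-28 15:12:46'),
        (2, 'B', '2018-06-28 16:12:46'),
        (3, 'C', '2018-06-28 18:12:46');
    INSERT INTO db VALUES
        (1, 'D', '2018-06-28 15:12:46'),
        (2, 'E', '2018-06-28 19:12:46'),
        (3, 'F', '2018-06-28 22:12:46');
""")

# Copy only the db rows that are newer than the newest summary row.
conn.execute("""
    INSERT INTO summary (id, name, timestamp)
    SELECT id, name, timestamp
    FROM db
    WHERE timestamp > (SELECT MAX(timestamp) FROM summary)
""")

summary_rows = conn.execute(
    "SELECT id, name, timestamp FROM summary ORDER BY timestamp"
).fetchall()
for row in summary_rows:
    print(row)
```

If the summary table can be empty, MAX(timestamp) is NULL and the comparison matches no rows, so nothing is inserted; wrapping the subquery in COALESCE with an early default date is one way to handle that case.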

Related

Error on Group by method

I wrote a query to combine records from multiple tables, named Purchase Order and Purchase Order Item.
[Note: the column names are not the original names; this is just model data.]
The Purchase Order table has order details like this:
id date vendorid totalitems totalqty grossamnt netamnt taxamt
----------------------------------------------------------------------------
1 03/10/17 00001 2 6 12000 13000 1000
The Purchase Order Item table has order details like this:
poid id productcode qty rate tax(%) taxamnt total
--------------------------------------------------------
1 1 12001 3 6000 2.5 500 6500
2 1 12000 3 6000 2.5 500 6500
My query is:
select po.POID, po.SupplierId, po.TotalItems
from PurchaseOrder po, PurchaseOrderItem poi
where po.POID = poi.POID
group by po.POID, po.SupplierId, po.TotalItems
Query returns,
id vendorid totalitems
--------------------------
1 00001 2
1 00001 2
Expected Output is,
id vendorid totalitems
------------------------
1 00001 2
You are using an outdated join method (comma-separated tables in FROM). Have a read here:
ANSI vs. non-ANSI SQL JOIN syntax
You are also joining to another table but never using it:
select po.POID,po.SupplierId,po.TotalItems
from PurchaseOrder po, PurchaseOrderItem poi
where po.POID=poi.POID
group by po.POID, po.SupplierId,po.TotalItems
Can just be:
select po.POID,po.SupplierId,po.TotalItems
from PurchaseOrder po
group by po.POID, po.SupplierId,po.TotalItems
OR
select DISTINCT
po.POID,
po.SupplierId,
po.TotalItems
from PurchaseOrder po

DISTINCT and GROUP BY with SQL Server

I have the following table (SQL Server) and I'm looking for a query to select the last two rows with all fields:
order by created_at
group by / distinct type_id
id type_id some_value created_at
1 B mk2 2016-10-01 00:00:00.000
2 A mbs 2016-10-01 10:02:39.077
3 B sa 2016-10-02 10:03:08.123
4 A xc 2016-10-02 10:03:28.777
5 B q1 2016-10-03 10:04:20.920
6 A tr 2016-10-03 10:04:48.533
7 A 1a 2016-09-30 10:36:26.287
In MySQL it's an easy task, but with SQL Server all fields have to be contained in either an aggregate function or the GROUP BY clause. That results in field combinations that do not exist.
Is there a way to handle this?
Thanks in advance!
Solution
Based on the comment from Andrew Deighton I did this:
SELECT *
FROM (
    SELECT
        id,
        type_id,
        some_value,
        created_at,
        ROW_NUMBER() OVER (PARTITION BY type_id
                           ORDER BY created_at DESC) AS row
    FROM test_sql
) AS ts
WHERE row = 1
-- every remaining row has row = 1, so sort by created_at instead
ORDER BY created_at
Conclusion: No need for GROUP BY and DISTINCT.
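To check the ROW_NUMBER() approach against the sample rows from the question, here is a small sketch. It assumes SQLite 3.25 or later (for window function support) through Python's sqlite3 module rather than SQL Server, and it names the row-number column rn, which is an illustrative choice.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_sql (id INTEGER, type_id TEXT, some_value TEXT, created_at TEXT);
    INSERT INTO test_sql VALUES
        (1, 'B', 'mk2', '2016-10-01 00:00:00.000'),
        (2, 'A', 'mbs', '2016-10-01 10:02:39.077'),
        (3, 'B', 'sa',  '2016-10-02 10:03:08.123'),
        (4, 'A', 'xc',  '2016-10-02 10:03:28.777'),
        (5, 'B', 'q1',  '2016-10-03 10:04:20.920'),
        (6, 'A', 'tr',  '2016-10-03 10:04:48.533'),
        (7, 'A', '1a',  '2016-09-30 10:36:26.287');
""")

# Number the rows inside each type_id from newest to oldest, then keep number 1.
latest = conn.execute("""
    SELECT id, type_id, some_value, created_at
    FROM (
        SELECT id, type_id, some_value, created_at,
               ROW_NUMBER() OVER (PARTITION BY type_id
                                  ORDER BY created_at DESC) AS rn
        FROM test_sql
    ) AS ts
    WHERE rn = 1
    ORDER BY created_at
""").fetchall()
for row in latest:
    print(row)
```

Each type_id contributes exactly one row, so the result holds the newest B row (id 5) and the newest A row (id 6) with all of their columns intact.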

Filter two table rows by using group by with join in SQL Server

I have two tables as Sales and SalesDocument.
Sales Table
RequestId(PrimaryKey) ReqType DateTime
--------- ------- --------
1 Buy 22/10/2015
2 Buy 03/11/2015
3 Sell 10/11/2015
4 Return 11/12/2015
6 Sell 11/12/2015
7 Buy 20/12/2015
Sales Document Table
RequestId(Ref.Key(FK)) ReqDocument ReqDocURL
--------- ----------- ---------
2 Doc1 Http://..Doc1.PDF
3 Doc2 Http://..Doc2.PDF
3 Doc3 Http://..Doc3.PDF
4 Doc1 Http://..Doc1.PDF
4 Doc2 Http://..Doc2.PDF
4 Doc3 Http://..Doc3.PDF
6 Doc2 Http://..Doc2.PDF
6 Doc3 Http://..Doc3.PDF
Now I need to select records from both tables, joined on RequestId, under these conditions:
1) From the first table, get each distinct ReqType (first column: RequestType) and its count (second column: RequestTypeCount) within a date range.
2) From the second table, calculate the total number of requested documents (third column: SumOfReqDocument) per RequestType. RequestType is not in the second table, so it has to be mapped to the first table (Sales) by RequestId to get the sum of documents.
The output should be,
RequestType RequestTypeCount SumOfReqDocument
----------- ---------------- ----------------
Buy 3 1
Sell 2 4
Return 1 3
I tried a SQL query but it does not return the expected result. Please help me with this query or suggest another one.
My query is:
SELECT ReqType as RequestType
    ,count(ReqType) as RequestTypeCount
    ,count(salesDoc.DocumentURL) as SumOfReqDocument
FROM Sales sales
inner join SalesDocument salesDoc
    on sales.RequestId=salesDoc.RequestId
where sales.EndDate >= '2015-10-22 10:34:09.000' AND sales.EndDate <= '2015-12-31 00:00:00.000'
group by sales.ReqType
You may try changing the INNER JOIN to a LEFT JOIN, and using COUNT(DISTINCT RequestId) for RequestTypeCount:
SELECT ReqType as RequestType
,count(DISTINCT sales.RequestId) as RequestTypeCount
,count(salesDoc.ReqDocURL) as SumOfReqDocument
FROM Sales sales
LEFT JOIN SalesDocument salesDoc
ON sales.RequestId=salesDoc.RequestId
WHERE sales.EndDate >= '2015-10-21 10:34:09.000' AND sales.EndDate <= '2015-12-31 00:00:00.000'
group by sales.ReqType
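The effect of the LEFT JOIN plus COUNT(DISTINCT ...) can be checked against the sample data. This is a sketch under stated assumptions: SQLite through Python's sqlite3 module stands in for SQL Server, the date column is given the hypothetical name ReqDate (the question shows DateTime in the table but EndDate in the query), and the dates are stored as ISO strings.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; ReqDate is a hypothetical column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (RequestId INTEGER PRIMARY KEY, ReqType TEXT, ReqDate TEXT);
    CREATE TABLE SalesDocument (RequestId INTEGER, ReqDocument TEXT, ReqDocURL TEXT);
    INSERT INTO Sales VALUES
        (1, 'Buy',    '2015-10-22'),
        (2, 'Buy',    '2015-11-03'),
        (3, 'Sell',   '2015-11-10'),
        (4, 'Return', '2015-12-11'),
        (6, 'Sell',   '2015-12-11'),
        (7, 'Buy',    '2015-12-20');
    INSERT INTO SalesDocument VALUES
        (2, 'Doc1', 'Http://..Doc1.PDF'),
        (3, 'Doc2', 'Http://..Doc2.PDF'),
        (3, 'Doc3', 'Http://..Doc3.PDF'),
        (4, 'Doc1', 'Http://..Doc1.PDF'),
        (4, 'Doc2', 'Http://..Doc2.PDF'),
        (4, 'Doc3', 'Http://..Doc3.PDF'),
        (6, 'Doc2', 'Http://..Doc2.PDF'),
        (6, 'Doc3', 'Http://..Doc3.PDF');
""")

# LEFT JOIN keeps requests without documents; COUNT(DISTINCT ...) undoes the
# row multiplication caused by requests that have several documents.
counts = conn.execute("""
    SELECT s.ReqType                   AS RequestType,
           COUNT(DISTINCT s.RequestId) AS RequestTypeCount,
           COUNT(d.ReqDocURL)          AS SumOfReqDocument
    FROM Sales s
    LEFT JOIN SalesDocument d ON s.RequestId = d.RequestId
    WHERE s.ReqDate >= '2015-10-22' AND s.ReqDate <= '2015-12-31'
    GROUP BY s.ReqType
    ORDER BY s.ReqType
""").fetchall()
for row in counts:
    print(row)
```

COUNT(d.ReqDocURL) skips the NULLs produced by the LEFT JOIN, which is why the two Buy requests without documents add nothing to SumOfReqDocument while still counting toward RequestTypeCount.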

How can I always return Null for a column without updating the column's value in the database?

ID  Name      tuition  num of courses
1   Brandon   4430     6
2   Lisa      2300     3
3   Victoria  null     0
4   Jack      3330     4
The type of the tuition column is money, but I need to return null for it in my select statement without updating the values in the table.
I tried nullif(tuition is not null), but it didn't work.
How can I return results like those in the table below, without updating the table or modifying the data in database?
ID  Name      tuition  num of courses
1   Brandon   null     6
2   Lisa      null     3
3   Victoria  null     0
4   Jack      null     4
If you are returning null for every row, just code the column as:
NULL AS Tuition
Example query:
SELECT Id, Name, NULL as Tuition, NumCourses FROM TheTable
I have created the table and inserted the records as you have shown above.
This is a self-join query: each row is joined to itself, so NULLIF always compares a value with itself and returns NULL.
-- To make sure that the underlying table is not updated, run both queries together.
select TT.Id, TT.Name,
    nullif(TT.Tuition, BT.Tuition) as Tuition, TT.NOCs
from tblTuition TT
join tblTuition BT
    on TT.Id = BT.Id
select * from tblTuition
Whenever you need a column that is always null, you can write it like this:
SELECT NULL AS ABC FROM MYTABLE
The statement above adds an ABC column to your select list with NULL in every row. The same approach works for any constant default value, e.g. if you want 1 in every row, simply use SELECT 1 AS ABC FROM MYTABLE

T-SQL select rows by oldest date and unique category

I'm using Microsoft SQL. I have a table that contains information stored by two different categories and a date. For example:
ID Cat1 Cat2 Date/Time Data
1 1 A 11:00 456
2 1 B 11:01 789
3 1 A 11:01 123
4 2 A 11:05 987
5 2 B 11:06 654
6 1 A 11:06 321
I want to extract one line for each unique combination of Cat1 and Cat2 and I need the line with the oldest date. In the above I want ID = 1, 2, 4, and 5.
Thanks
Have a look at row_number() on MSDN.
SELECT *
FROM (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date_time, id) rn
    FROM mytable
) q
WHERE rn = 1
(run the code on SQL Fiddle)
Quassnoi's answer is fine, but I'm a bit uncomfortable with how it handles dups. It seems to return based on insertion order, but I'm not sure if even that can be guaranteed? (see these two fiddles for an example where the result changes based on insertion order: dup at the end, dup at the beginning)
Plus, I kinda like staying with old-school SQL when I can, so I would do it this way (see this fiddle for how it handles dups):
select t1.*
from my_table t1
left join my_table t2
    on t1.cat1 = t2.cat1
    and t1.cat2 = t2.cat2
    and t1.datetime > t2.datetime
where t2.datetime is null
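The anti-join can be tried against the sample rows from the question. The sketch below assumes SQLite through Python's sqlite3 module instead of SQL Server, a hypothetical column name date_time, and times stored as zero-padded HH:MM strings so that string comparison matches time order.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; date_time is a hypothetical column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER, cat1 INTEGER, cat2 TEXT, date_time TEXT, data INTEGER);
    INSERT INTO my_table VALUES
        (1, 1, 'A', '11:00', 456),
        (2, 1, 'B', '11:01', 789),
        (3, 1, 'A', '11:01', 123),
        (4, 2, 'A', '11:05', 987),
        (5, 2, 'B', '11:06', 654),
        (6, 1, 'A', '11:06', 321);
""")

# A row is the oldest in its (cat1, cat2) group when the self-join finds
# no earlier row in the same group, i.e. the t2 side comes back as NULL.
oldest = conn.execute("""
    SELECT t1.id, t1.cat1, t1.cat2, t1.date_time, t1.data
    FROM my_table t1
    LEFT JOIN my_table t2
           ON t1.cat1 = t2.cat1
          AND t1.cat2 = t2.cat2
          AND t1.date_time > t2.date_time
    WHERE t2.id IS NULL
    ORDER BY t1.id
""").fetchall()
oldest_ids = [row[0] for row in oldest]
print(oldest_ids)
```

Unlike the ROW_NUMBER() version, this form returns every row that ties for the oldest time in a group, so the handling of duplicates is explicit rather than dependent on an extra tiebreaker column.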
