I have the follwoing structure:
Emp PayDate Amount
1 11/23/2010 500
1 11/25/2010 -900
1 11/28/2010 1000
1 11/29/2010 2000
2 11/25/2010 2000
3 11/28/2010 -3000
2 11/28/2010 4000
3 11/29/2010 -5000
I need to get the following result if emp 1 is selected (top 3 dates and their corresponding vals - if they exist - 4th row is always ignored)
PayDate1 Amount1 Paydate2 Amount2 Paydate3 Amount3
11/23/2010 500 11/25/2010 -900 11/28/2010 1000
I need to get the following result if emp 2 is selected
Paydate1 Amount1 Paydate2 Amount2 Paydate3 Amount3
11/25/2010 2000 11/28/2010 4000 NULL NULL
I need to get the following result if emp 3 is selected
Paydate1 Amount1 Paydate2 Amount2 Paydate3 Amount3
11/28/2010 -3000 11/29/2010 -5000
To get the respective data in rows I can run the following query:
select top 3 Paydate, Amount from Table where Emp = #Emp
But how do I get result in a pivoted fashion?
There's an excellent article on Pivots with SQL Server 2005+ here.
CREATE TABLE dbo.Table1
(
Emp int,
PayDate datetime,
Amount int
)
GO
INSERT INTO dbo.Table1 VALUES (1, '11/23/2010',500)
INSERT INTO dbo.Table1 VALUES (1, '11/25/2010',-900)
INSERT INTO dbo.Table1 VALUES (1, '11/28/2010',1000)
INSERT INTO dbo.Table1 VALUES (1, '11/29/2010',2000)
INSERT INTO dbo.Table1 VALUES (2, '11/25/2010',2000)
INSERT INTO dbo.Table1 VALUES (3, '11/28/2010',-3000)
INSERT INTO dbo.Table1 VALUES (2, '11/28/2010',4000)
INSERT INTO dbo.Table1 VALUES (3, '11/29/2010',-5000)
;WITH cte AS
(SELECT Emp, PayDate, Amount, PayDateRowNumber
FROM
(SELECT Emp,
PayDate,
Amount,
ROW_NUMBER() OVER (PARTITION BY Emp ORDER BY PayDate) AS PayDateRowNumber
FROM Table1) AS RankedTable1
WHERE PayDateRowNumber < 4)
SELECT c1.Emp AS Emp, c1.PayDate AS PayDate1
,c1.Amount AS Amount1, c2.PayDate AS PayDate2
,c2.Amount AS Amount2, c3.PayDate AS PayDate3, c3.Amount AS Amount3
FROM cte c1
LEFT JOIN cte c2 ON c2.Emp = c1.Emp AND c2.PayDateRowNumber = 2
LEFT JOIN cte c3 ON c3.Emp = c2.Emp AND c3.PayDateRowNumber = 3
WHERE c1.PayDateRowNumber = 1
Output is:
Some caveats are that it won't aggregate amounts for the same employer/date (though can easily be changed). Also may want to change to review use of ROW_NUMBER() versus RANK() and DENSE_RANK() depending on your definition of "TOP 3"
Related
so I want something as below in my query
select * from table a
where a.id in(select id, max(date) from table a group by id)
I am getting error here , as IN is equivalent to = .
how to do it?
example :
id
date
1
2022-31-01
1
2022-21-03
2
2022-01-01
2
2022-02-01
I need to get only one record based on date(max). The table has more columns than just id and date
so I need to something like this in snowflake
select * from table a
where id in(select id,max(date) from table a group by id)
```-----------------------
All solutions are working , if i select from table .
but i have case statement in view where duplicate records are coming
example :
create or replace view v_test
as
select * from
(
select id,lastdatetime,*,
case when start_date < timestamp and timestamp < end
and move_date = '9999-12-31' then 'Y'
else 'N' end as IND
from table a
) a
so if any one select view where IND= 'Y', more than 1 records are coming
what i want is to select latest records for ID where IND='Y' and max(lastdatetime)
how to incorporate this logic in view?
I think you are trying to get the latest record for each id?
select *
from table a
qualify row_number() over (partition by id order by date desc) = 1
So if we look at your sub-select:
using this "data" for the examples:
with data (id, _date) as (
select column1, to_date(column2, 'yyyy-dd-mm') from values
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01')
)
select id, max(_date)
from data
group by 1;
it gives:
ID
MAX(_DATE)
1
2022-03-21
2
2022-01-02
which makes it seem you want the "the last date, per id"
which can classically (ansi sql) be written:
select d.*
from data as d
join (
select
id,
max(_date) as max_date
from data
group by 1
) as c
on d.id = c.id and d._date = c.max_date
;
ID
_DATE
1
2022-03-21
2
2022-01-02
which gives you "all the rows values". BUT if you have many rows with the same last date, you will get those, in the output.
Another methods is to use a ROW_NUMBER to pick one and only one row, which is the style of answer Mike has given:
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify row_number() over (partition by id order by _date desc) =1 ;
gives:
ID
_DATE
EXTRA
1
2022-03-21
extra_b_double_a
2
2022-01-02
extra_d
now if you want the "all rows of the last day" you method works, albeit the QUALIFY/ROW_NUMBER is faster. You can use RANK
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify dense_rank() over (partition by id order by _date desc) =1 ;
ID
_DATE
EXTRA
1
2022-03-21
extra_b_double_a
1
2022-03-21
extra_b_double_b
2
2022-01-02
extra_d
Now the last thing that it almost seems you are asking for, is "how do find the ID with the most recent data (here 1) and get all rows for that"
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify id = last_value(id) over (order by _date);
Here is an example of how to use the in operator with a subquery:
select * from table1 t1 where t1.id in (select t2.id from table2 t2);
Usage of IN is possible to match on both columns:
select *
from tab AS a
where (a.id, a.date) in (select id, max(date) from tab group by id);
For sample data:
CREATE TABLE tab (id, date)
AS
SELECT column1, to_date(column2, 'yyyy-dd-mm')
FROM VALUES
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01');
Output:
I have a table like this :
Table1:
[Id] [TDate] [Score]
1 1.1.00 50
1 1.1.00 60
2 1.1.01 50
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.1.00 30
3 1.2.00 40
My desired output should be like this:
[ID] [TDate] [Score]
1 1.1.00 60
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.2.00 40
So fare, I have written this:
SELECT DISTINCT TOP 2 Id, TDate, Score
FROM
( SELECT Id, TDate, Score, ROW_NUMBER() over(partition by TDate order by Score) Od
FROM Table1
) A
WHERE A.Od = 1
ORDER BY Score
But it gives me :
[ID] [TDate] [Score]
2 1.1.01 70
3 1.1.00 80
of course I can do this:
"select top 2 ...where ID = 1"
and then:
union
`"Select top 2 ... where ID = 2"`
etc..
but I have a 100,000 of this..
Any way to generalize it to any Id?
Thank you.
WITH TOPTWO AS (
SELECT Id, TDate, Score, ROW_NUMBER()
over (
PARTITION BY TDate
order by SCORE
) AS RowNo
FROM [table_name]
)
SELECT * FROM TOPTWO WHERE RowNo <= 2
Your output doesn't make sense. Let me assume you want two rows per id. Then the query would look like:
SELECT TOP 2 Id, TDate, Score
FROM (SELECT Id, TDate, Score,
ROW_NUMBER() over (partition by id order by Score DESC) as seqnum
FROM Table1
) t
WHERE seqnum <= 2
ORDER BY Score;
Notes:
This assumes that you want two rows per id. Hence, id is in the PARTITION BY.
The WHERE now selects two rows per group in the PARTITION BY.
There is no need for SELECT DISTINCT in the outer query -- at least for this question.
Try this : Make partition by ID and TDate and sort by score in descending order
ROW_NUMBER() over(partition by ID,TDate order by Score DESC) Od
Complete script
WITH CTE AS(
SELECT *,
ROW_NUMBER() over(partition by ID,TDate order by Score DESC) RN
FROM TableName
)
SELECT *
FROM CTE
WHERE RN = 1
Unless I am missing something this can be done with a simple group by
First I prepare a temp table for testing :
declare #table table (ID int, TDate varchar(10), Score int)
insert into #Table values(1, '1.1.00', 50)
insert into #Table values(1, '1.1.00', 60)
insert into #Table values(2, '1.1.01', 50)
insert into #Table values(2, '1.1.01', 70)
insert into #Table values(2, '1.3.01', 40)
insert into #Table values(3, '1.1.00', 80)
insert into #Table values(3, '1.1.00', 30)
insert into #Table values(3, '1.2.00', 40)
Now lets do a select on this table
select ID, TDate, max(Score) as Score
from #table
group by ID, TDate
order by ID, TDate
The result is this :
ID TDate Score
1 1.1.00 60
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.2.00 40
So all you need to do is change #table to your table name and you are done
I have a table containing orders. I would like to select those orders that are a certain number of days apart for a specific client. For example, in the table below I would like to select all of the orders for CustomerID = 10 that are at least 30 days apart from the previous instance. With the starting point to be the first occurrence (07/05/2014 in this data).
OrderID | CustomerID | OrderDate
==========================================
1 10 07/05/2014
2 10 07/15/2014
3 11 07/20/2014
4 11 08/20/2014
5 11 09/21/2014
6 10 09/23/2014
7 10 10/15/2014
8 10 10/30/2014
I would want to select OrderIDs (1,6,8) since they are 30 days apart from each other and all from CustomerID = 10. OrderIDs 2 and 7 would not be included as they are within 30 days of the previous order for that customer.
What confuses me is how to set the "checkpoint" to the last valid date. Here is a little "pseudo" SQL.
SELECT OrderID
FROM Orders
WHERE CusomerID = 10
AND OrderDate > LastValidOrderDate + 30
i came here and i saw #SveinFidjestøl already posted answer but i can't control my self after by long tried :
with the help of LAG and LEAD we can comparison between same column
and as per your Q you are looking 1,6,8. might be this is helpful
SQL SERVER 2012 and after
declare #temp table
(orderid int,
customerid int,
orderDate date
);
insert into #temp values (1, 10, '07/05/2014')
insert into #temp values (2, 10, '07/15/2014')
insert into #temp values (3, 11, '07/20/2014')
insert into #temp values (4, 11, '08/20/2014')
insert into #temp values (5, 11, '09/21/2014')
insert into #temp values (6, 10, '09/23/2014')
insert into #temp values (7, 10, '10/15/2014')
insert into #temp values (8, 10, '10/30/2014');
with cte as
(SELECT orderid,customerid,orderDate,
LAG(orderDate) OVER (ORDER BY orderid ) PreviousValue,
LEAD(orderDate) OVER (ORDER BY orderid) NextValue,
rownum = ROW_NUMBER() OVER (ORDER BY orderid)
FROM #temp
WHERE customerid = 10)
select orderid,customerid,orderDate from cte
where DATEDIFF ( day , PreviousValue , orderDate) > 30
or PreviousValue is null or NextValue is null
SQL SERVER 2005 and after
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY p.orderid),
p.orderid,
p.customerid,
p.orderDate
FROM #temp p
where p.customerid = 10)
SELECT CTE.orderid,CTE.customerid,CTE.orderDate,
prev.orderDate PreviousValue,
nex.orderDate NextValue
FROM CTE
LEFT JOIN CTE prev ON prev.rownum = CTE.rownum - 1
LEFT JOIN CTE nex ON nex.rownum = CTE.rownum + 1
where CTE.customerid = 10
and
DATEDIFF ( day , prev.orderDate , CTE.orderDate) > 30
or prev.orderDate is null or nex.orderDate is null
GO
You can use the LAG() function, available in SQL Server 2012, together with a Common Table Expression. You calculate the days between the customer's current order and the customer's previous order and then query the Common Table Expression using the filter >= 30
with cte as
(select OrderId
,CustomerId
,datediff(d
,lag(orderdate) over (partition by CustomerId order by OrderDate)
,OrderDate) DaysSinceLastOrder
from Orders)
select OrderId, CustomerId, DaysSinceLastOrder
from cte
where DaysSinceLastOrder >= 30 or DaysSinceLastOrder is null
Results:
OrderId CustomerId DaysSinceLastOrder
1 10 NULL
6 10 70
3 11 NULL
4 11 31
5 11 32
(Note that 1970-01-01 is chosen arbitrarily, you may choose any date)
Update
A slighty more reliable way of doing it will involve a temporary table. But the original table tbl can be left unchanged. See here:
CREATE TABLE #tmp (id int); -- set-up temp table
INSERT INTO #tmp VALUES (1); -- plant "seed": first oid
WHILE (##ROWCOUNT>0)
INSERT INTO #tmp (id)
SELECT TOP 1 OrderId FROM tbl
WHERE OrderId>0 AND CustomerId=10
AND OrderDate>(SELECT max(OrderDate)+30 FROM tbl INNER JOIN #tmp ON id=OrderId)
ORDER BY OrderDate;
-- now list all found entries of tbl:
SELECT * FROM tbl WHERE EXISTS (SELECT 1 FROM #tmp WHERE id=OrderId)
#tinka shows how to use CTEs to do the trick, and the new windowed functions (for 2012 and later) are probably the best answer. There is also the option, assuming you do not have a very large data set, to use a recursive CTE.
Example:
declare #customerid int = 10;
declare #temp table
(orderid int,
customerid int,
orderDate date
);
insert into #temp values (1, 10, '07/05/2014')
insert into #temp values (2, 10, '07/15/2014')
insert into #temp values (3, 11, '07/20/2014')
insert into #temp values (4, 11, '08/20/2014')
insert into #temp values (5, 11, '09/21/2014')
insert into #temp values (6, 10, '09/23/2014')
insert into #temp values (7, 10, '10/15/2014')
insert into #temp values (8, 10, '10/30/2014');
with datefilter AS
(
SELECT row_number() OVER(PARTITION BY CustomerId ORDER BY OrderDate) as RowId,
OrderId,
CustomerId,
OrderDate,
DATEADD(day, 30, OrderDate) as FilterDate
from #temp
WHERE CustomerId = #customerid
)
, firstdate as
(
SELECT RowId, OrderId, CustomerId, OrderDate, FilterDate
FROM datefilter
WHERE rowId = 1
union all
SELECT datefilter.RowId, datefilter.OrderId, datefilter.CustomerId,
datefilter.OrderDate, datefilter.FilterDate
FROM datefilter
join firstdate
on datefilter.CustomerId = firstdate.CustomerId
and datefilter.OrderDate > firstdate.FilterDate
WHERE NOT EXISTS
(
SELECT 1 FROM datefilter betweens
WHERE betweens.CustomerId = firstdate.CustomerId
AND betweens.orderdate > firstdate.FilterDate
AND datefilter.orderdate > betweens.orderdate
)
)
SELECT * FROM firstdate
From this it is possible for the following :
TABLE 1
Id | final | Date
------------------
1 236 02-11-14
2 10 07-01-12
3 58 09-02-10
TABLE 2
Id | final | Date
------------------
1 330 02-11-14
2 5 07-01-12
3 100 09-02-10
ADD both Table 1 and Table 2 Sum'd values(column final), and then work out the AVG number from this and create this as another column average, THEN if table2 for example SUM'd original amount (before the AVG) is higher than Table 1 SUM'd amount create another column and print in that column 'Tbl2 has the higher amount' and vise verser if table 1 had the higher amount.
End result Column wise table would look like this :
|tb1_final_amount|tb2_final_amount|Avg_Amount|Top_Score_tbl
|tb1_final_amount|tb2_final_amount|Avg_Amount|Top_Score_tbl
304 435 369.5 tb2 has highest score
This is one way (of many) to do this. You can sum up the two tables and use them as derived tables in a query like so:
select
tb1_final_amount,
tb2_final_amount,
(tb1_final_amount+tb2_final_amount)/2.0 as Avg_Amount,
case
when tb1_final_amount < tb2_final_amount then 'tb2 has highest score'
else 'tb1 has highest score'
end as Top_Score_tbl
from
(select SUM(final) as tb1_final_amount from TABLE1) t1,
(select SUM(final) as tb2_final_amount from TABLE2) t2
This does the trick!:
--SET UP Table1
CREATE TABLE Table1 (ID INT, final INT, [Date] DATETIME)
INSERT Table1 VALUES (1, 236, '20141102')
INSERT Table1 VALUES (2, 10, '20120107')
INSERT Table1 VALUES (3, 58, '20100209')
--SET UP Table2
CREATE TABLE Table2 (ID INT, final INT, [Date] DATETIME)
INSERT Table2 VALUES (1, 330, '20141102')
INSERT Table2 VALUES (2, 5, '20120107')
INSERT Table2 VALUES (3, 100, '20100209')
-- Query
SELECT
SUM(CASE WHEN t.TableName = 'Table1' THEN T.final
ELSE 0
END) AS tb1_final_amount,
SUM(CASE WHEN t.TableName = 'Table2' THEN T.final
ELSE 0
END) AS tb2_final_amount,
AVG(T.final) AS Avg_Amount,
ISNULL((
SELECT
'Table1'
FROM
Table1 T1
WHERE
SUM(CASE WHEN t.TableName = 'Table1' THEN T.final
ELSE 0
END) > SUM(CASE WHEN t.TableName = 'Table2' THEN T.final
ELSE 0
END)
), 'Table2')
FROM
(
SELECT
'Table1' AS TableName,
final
FROM
Table1
UNION ALL
SELECT
'Table2',
final
FROM
Table2
) AS T
I have two tables.
Sales
------
ID Charge VAT
1 100 10
2 200 20
SaleProducts
------------
ID Product Auto SalesID
1 aa True 1
2 bb False 1
I want to get this
SaleOnProduct
-------------
ID Product Charge VAT Total TotalAmount(All of total is plus)
1 aa 100 10 110 220
2 aa 100 10 110 220
How can I do this. Please help me.
declare #Sales table (ID int, Charge int, VAT int)
declare #SaleProducts table (ID int, Product char(2), Auto bit, SalesID int)
insert into #Sales values
(1, 100, 10),
(2, 200, 20)
insert into #SaleProducts values
(1, 'aa', 1, 1),
(2, 'bb', 0, 1)
select
SP.ID,
SP.Product,
S.Charge,
S.VAT,
S.Charge+S.VAT as Total,
sum(S.Charge+S.VAT) over() as TotalAmount
from #Sales as S
inner join #SaleProducts as SP
on S.ID = SP.SalesID
In order to get data from both tables in the same resultset you need to do a join.
Since you want to have a summary row for all sales ((charge+VAT)*numberofSaleProductsRows) of each particular product, you need to use the SUM aggregate funciton, and a GROUP BY clause. All the columns which you need in your resultset and which do not have a specified aggregation needs to be included in the GROUP BY list.
Disclaimer: Untested code
SELECT ID, Product, Charge ,VAT, Charge + VAT as Total,
Sum(Charge + VAT) as TotalAmount
FROM Sales INNER JOIN SaleProducts
ON Sales.ID = SaleProducts.SalesID
GROUP BY ID, Product, Charge, VAT, Charge + VAT
This query do the job: (I supposed SQL Server 2005 or above, otherwise you´ve to change the cte by a temp table)
WITH ReportCte as(
SELECT b.Id IdSale,
a.Id IdProduct,
a.Product,
b.Charge,
b.VAT,
b.Charge+b.VAT Total
FROM [dbo].[SaleProducts] a left join
[dbo].[Sales] b on a.[SalesID] = b.ID)
SELECT a.IdProduct,
a.IdProduct,
a.Charge,
a.VAT,
a.Total,
b.TotalAmount
FROM ReportCte a left join
(select IdSale,SUM(Total) TotalAmount
from ReportCte group by IdSale) b on a.IdSale=b.IdSale
select *, (charge+VAT) as total, SUM(charge+VAT) as totalAmount
from sales
join saleproducts on sales.ID=saleproducts.salesID
group by sales.ID