How to use last_value with group by with count in SQL Server? - sql-server

I have table like:
name | timeStamp | previousValue | newValue
--------+---------------+-------------------+------------
Mark | 13.12.2020 | 123 | 155
Mark | 12.12.2020 | 123 | 12
Tom | 14.12.2020 | 123 | 534
Mark | 12.12.2020 | 123 | 31
Tom | 11.12.2020 | 123 | 84
Mark | 19.12.2020 | 123 | 33
Mark | 17.12.2020 | 123 | 96
John | 22.12.2020 | 123 | 69
John | 19.12.2020 | 123 | 33
I'd like to mix last_value, count (*) and group to get this result:
name | count | lastValue
--------+-----------+-------------
Mark | 5 | 33
Tom | 2 | 534
John | 2 | 69
This part:
select name, count(*)
from table
group by name
returns table:
name | count
--------+---------
Mark | 5
Tom | 2
John | 2
but I have to add the last value for each name.
How to do it?
Best regards!

LAST_VALUE is a windowed function, so you'll need to get that value first, and then aggregate:
WITH CTE AS(
SELECT [name],
[timeStamp], --This is a poor choice for a column's name. timestamp is a (deprecated) synonym of rowversion, and a rowversion is not a date and time value
previousValue,
newValue,
LAST_VALUE(newValue) OVER (PARTITION BY [name] ORDER BY [timeStamp] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lastValue
FROM dbo.YourTable)
SELECT [Name],
COUNT(*) AS [count],
lastValue
FROM CTE
GROUP BY [Name],
lastValue;
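The explicit frame in the query above matters. With an ORDER BY and no frame, the window defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE only looks as far as the current row (and any rows tied with it). A minimal sketch to compare the two, assuming a hypothetical table variable @t and a date column named changedOn (neither name comes from the question):

```sql
-- Hypothetical sample data: a subset of the question's rows, with an
-- assumed column name changedOn instead of [timeStamp].
DECLARE @t TABLE ([name] varchar(20), changedOn date, newValue int);
INSERT INTO @t ([name], changedOn, newValue) VALUES
    ('Tom',  '20201214', 534),
    ('Tom',  '20201211',  84),
    ('John', '20201222',  69),
    ('John', '20201219',  33);

SELECT [name], changedOn, newValue,
       -- default frame: stops at the current row
       LAST_VALUE(newValue) OVER (PARTITION BY [name] ORDER BY changedOn) AS defaultFrame,
       -- explicit frame: sees the whole partition
       LAST_VALUE(newValue) OVER (PARTITION BY [name] ORDER BY changedOn
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS fullFrame
FROM @t
ORDER BY [name], changedOn;
```

With this data, defaultFrame simply echoes each row's own newValue, while fullFrame is 69 on both John rows and 534 on both Tom rows.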

I got a solution that works, but here's another one:
SELECT
[name], COUNT([name]), [lastValue]
FROM (
SELECT
[name], FIRST_VALUE([newValue]) OVER (PARTITION BY [name] ORDER BY TimeStamp DESC ROWS UNBOUNDED PRECEDING) AS [lastValue]
FROM [table]
) xyz GROUP BY [name], [lastValue]
Keep well!

Related

Rank by top customers within each separate month

I am having trouble ranking top customers by month. I created a new Rank column - but how do I break it up by month? Any help plz. Code and tables below:
The logic for ranking is selecting the top two customers per month from the tables. Also wrapped into the code (attempted at least) is renaming the date field and setting it to reflect end of month date only.
SELECT * FROM table1;
UPDATE table1
SET DATE=EOMONTH(DATE) AS MO_END;
ALTER TABLE table1
ADD COLUMN RANK INT AFTER SALES;
UPDATE table1
SET RANK=
RANK() OVER(PARTITION BY cust ORDER BY sales DESC);
LIMIT 2
Starting with
+------+----------+-------+--+
| CUST | DATE | SALES | |
+------+----------+-------+--+
| 36 | 3-5-2018 | 50 | |
| 37 | 3-15-18 | 100 | |
| 38 | 3-25-18 | 65 | |
| 37 | 4-5-18 | 95 | |
| 39 | 4-21-18 | 500 | |
| 40 | 4-45-18 | 199 | |
+------+----------+-------+--+
desired end result
+------+---------+-------+------+--+
| CUST | MO_END | SALES | RANK | |
+------+---------+-------+------+--+
| 37 | 3-31-18 | 100 | 1 | |
| 38 | 3-25-18 | 65 | 2 | |
| 39 | 4-30-18 | 500 | 1 | |
| 40 | 4-45-18 | 199 | 2 | |
+------+---------+-------+------+--+
As a simple selection:
select *
from (
select
table1.*
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking -- partition by month only, so customers are ranked against each other within a month
from table1
) as ranked -- SQL Server requires an alias on a derived table
where ranking < 3
;
If storing is important: I would not use [rank] as a column name as I avoid any words that are used in SQL, maybe [sales_rank] or similar.
with cte as (
select
cust
, sales_rank -- the column being updated must be exposed by the CTE
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking -- rank customers within each month
from table1
)
update cte
set sales_rank = ranking
where ranking < 3
;
There is really no reason to store the end of month, just use that function within the partition of the over() clause.
LIMIT 2 is not something that can be used in SQL Server by the way, and it sure can't be used "per grouping". When you use a "window function" such as rank() or dense_rank() you can use the output of those in the where clause of the next "layer". i.e. use those functions in a subquery (or cte) and then use a where clause to filter rows by the calculated values.
Also note I used dense_rank() to guarantee that no rank numbers are skipped, so that the subsequent where clause will be effective.
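As a rough end-to-end sketch of the "rank in a subquery, filter in the outer query" pattern described above: the table variable @sales and the column name sale_date are assumptions made for this illustration, and the invalid date in the question's sample (4-45-18) is replaced by a made-up valid one.

```sql
-- Hypothetical sample data modelled on the question's table.
DECLARE @sales TABLE (cust int, sale_date date, sales int);
INSERT INTO @sales (cust, sale_date, sales) VALUES
    (36, '20180305',  50),
    (37, '20180315', 100),
    (38, '20180325',  65),
    (37, '20180405',  95),
    (39, '20180421', 500),
    (40, '20180425', 199);  -- assumed date; the original value was not a valid date

SELECT cust, mo_end, sales, sales_rank
FROM (
    SELECT cust,
           EOMONTH(sale_date) AS mo_end,
           sales,
           -- ranks restart for every month because only the month is partitioned on
           DENSE_RANK() OVER (PARTITION BY EOMONTH(sale_date) ORDER BY sales DESC) AS sales_rank
    FROM @sales
) AS ranked
WHERE sales_rank <= 2      -- the per-month equivalent of "LIMIT 2"
ORDER BY mo_end, sales_rank;
```

For this data the query keeps customers 37 and 38 for March and customers 39 and 40 for April. Note that EOMONTH requires SQL Server 2012 or later.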

MS SQL SERVER pivot table aggregation function

I have a question about the application of the aggregation function that used in pivot function.
The table OCCUPATIONS looks like this:
+-----------+------------+
| Name | Occupation |
+-----------+------------+
| Ashley | Professor |
| Samantha | Actor |
| Julia | Doctor |
| Britney | Professor |
| Maria | Professor |
| Meera | Professor |
| Priya | Doctor |
| Priyanka | Professor |
| Jennifer | Actor |
| Ketty | Actor |
| Belvet | Professor |
| Naomi | Professor |
| Jane | Singer |
| Jenny | Singer |
| Kristeen | Singer |
| Christeen | Singer |
| Eve | Actor |
| Aamina | Doctor |
+-----------+------------+
The first column is name and second is occupation.
Now I want to build a pivot table in which each column is one occupation, the names are sorted alphabetically, and NULL is printed when an occupation has no more names.
The output should look like this:
+--------+-----------+-----------+----------+
| Doctor | Professor | Singer | Actor |
+--------+-----------+-----------+----------+
| Aamina | Ashley | Christeen | Eve |
| Julia | Belvet | Jane | Jennifer |
| Priya | Britney | Jenny | Ketty |
| NULL | Maria | Kristeen | Samantha |
| NULL | Meera | NULL | NULL |
| NULL | Naomi | NULL | NULL |
| NULL | Priyanka | NULL | NULL |
+--------+-----------+-----------+----------+
Here the first column is Doctor, second is Professor, third is Singer and fourth is Actor. The code to generate result is
select [Doctor],[Professor],[Singer],[Actor] from (select o.Name,
o.Occupation, row_number() over(partition by o.Occupation order by
o.Name) id from OCCUPATIONS o) as src
pivot
(max(src.Name)
for src.Occupation in ([Doctor],[Professor],[Singer],[Actor])
) as m
But when I replace the derived table
(select o.Name, o.Occupation, row_number() over(partition by o.Occupation order by o.Name) id from OCCUPATIONS o) as src
with plain OCCUPATIONS,
the result is like this:
Priya Priyanka Kristeen Samantha
I understand why this happens: MAX() is taken over each group. However, in the previous query I also use MAX(), and it produces NULL when an occupation has no more names, yet it does not return a single maximum value as I expected; instead it returns every name.
My question is: why does this happen?
Thank you!
Here could be the source of issue:
row_number() over(partition by o.Occupation order by
o.Name) id from OCCUPATIONS o
The Row_Number here you are using is PARTITION BY o.Occupation, so in your PIVOT, it will pivot the records by the occupation group, which means the id is repeating. If you get rid of the PARTITION BY and just keep the Order by part, it should work.
Try this approach:
1. Find the occupation with the most people associated with it.
2. Generate a table with a sequence of numbers from 1 to the number of people found in step 1.
3. Join the table generated in step 2 four times with the original table, each time filtering on a different occupation.
This is the query:
declare @tmp table([Name] varchar(50),[Occupation] varchar(50))
insert into @tmp values
('Ashley','Professor') ,('Samantha','Actor') ,('Julia','Doctor') ,('Britney','Professor') ,('Maria','Professor') ,('Meera','Professor') ,('Priya','Doctor') ,('Priyanka','Professor') ,('Jennifer','Actor') ,('Ketty','Actor') ,('Belvet','Professor') ,('Naomi','Professor') ,('Jane','Singer') ,('Jenny','Singer') ,('Kristeen','Singer') ,('Christeen','Singer') ,('Eve','Actor') ,('Aamina','Doctor')
--this variable holds the occupation that has the most names (rows) in the table
--its row count is the total number of rows in the output table
declare @Occupation_with_max_rows varchar(50)
--populate the @Occupation_with_max_rows variable
select top 1 @Occupation_with_max_rows=Occupation
from @tmp
group by Occupation
order by count(*) desc
--generate the final results by joining the original table 4 times with the sequence table
select D.Name as Doctor,P.Name as Professor,S.Name as Singer,A.Name as Actor
from
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord from @tmp where Occupation = @Occupation_with_max_rows) O
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Doctor') D on O.ord = D.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Professor') P on O.ord = P.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Singer') S on O.ord = S.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Actor') A on O.ord = A.ord
Please find below code which works as expected :
select [Doctor],[Professor],[Singer],[Actor]
from
(
select row_number() over (partition by occupation order by name)[A],name,occupation
from occupations
)src
pivot
(
max(Name)
for occupation in ([Doctor],[Professor],[Singer],[Actor])
)piv;
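On the "why" part of the question: PIVOT implicitly groups by every column of its source that is not used in the aggregate or in the FOR clause. With the row number in the source, MAX() runs once per row number; with plain OCCUPATIONS there is no such column left, so everything collapses into one group. A rough equivalent written with conditional aggregation makes the grouping visible. The table variable @occ and its reduced sample data are assumptions made for this sketch:

```sql
-- Hypothetical subset of the OCCUPATIONS data from the question.
DECLARE @occ TABLE ([Name] varchar(50), Occupation varchar(50));
INSERT INTO @occ ([Name], Occupation) VALUES
    ('Aamina', 'Doctor'), ('Julia', 'Doctor'),
    ('Ashley', 'Professor'), ('Belvet', 'Professor'), ('Britney', 'Professor'),
    ('Jane', 'Singer'),
    ('Eve', 'Actor'), ('Ketty', 'Actor');

SELECT MAX(CASE WHEN Occupation = 'Doctor'    THEN [Name] END) AS Doctor,
       MAX(CASE WHEN Occupation = 'Professor' THEN [Name] END) AS Professor,
       MAX(CASE WHEN Occupation = 'Singer'    THEN [Name] END) AS Singer,
       MAX(CASE WHEN Occupation = 'Actor'     THEN [Name] END) AS Actor
FROM (
    SELECT [Name], Occupation,
           ROW_NUMBER() OVER (PARTITION BY Occupation ORDER BY [Name]) AS rn
    FROM @occ
) AS src
GROUP BY rn        -- this is the grouping PIVOT performs implicitly
ORDER BY rn;
```

Each rn group holds at most one name per occupation, so MAX() just returns that name (or NULL when the occupation has run out). Dropping rn and the GROUP BY leaves a single group and a single row of per-occupation maximums, which is the one-row result shown in the question.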

How can I transform sql table from rows to column

I've table like this in mssql:
LogID | BatchID | Type
240 | abc-def | Error
241 | axc-d4f | Success
and so on.
I want to convert this table to like this :
If I can do this for all rows that will be great otherwise I am happy to filter a table with logid (select * from myTable m where m.LogId = 240)
RowId | LogID | ColName | ColValue
1 | 240 | LogID | 240
2 | 240 | BatchID | abc-def
3 | 240 | Type | Error
I read about PIVOT, but couldn't figure out how can I use it in this scenario.
I am happy with any other kind of solution if it's possible.
Thanks,
Hakoo.
One way is to use CROSS APPLY with a VALUES list:
select
row_number() over
(partition by logid order by logid) as rownum,
logid,col1,col2 from #t t
cross apply
(
values
('logid',cast(logid as varchar(30))),
('batchid',batchid),
('typee',typee)
)b(col1,col2)
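A self-contained variant of the same idea using the column names from the question might look like the sketch below. The table variable @log and the helper column ord are assumptions added for illustration; ord only exists to give ROW_NUMBER a deterministic order.

```sql
-- Hypothetical stand-in for the question's table.
DECLARE @log TABLE (LogID int, BatchID varchar(30), [Type] varchar(30));
INSERT INTO @log (LogID, BatchID, [Type]) VALUES
    (240, 'abc-def', 'Error'),
    (241, 'axc-d4f', 'Success');

SELECT ROW_NUMBER() OVER (PARTITION BY l.LogID ORDER BY v.ord) AS RowId,
       l.LogID,
       v.ColName,
       v.ColValue
FROM @log AS l
CROSS APPLY (VALUES
    -- every value must share one type, so LogID is cast to varchar
    (1, 'LogID',   CAST(l.LogID AS varchar(30))),
    (2, 'BatchID', l.BatchID),
    (3, 'Type',    l.[Type])
) AS v(ord, ColName, ColValue)
ORDER BY l.LogID, v.ord;
```

Adding WHERE l.LogID = 240 before the ORDER BY restricts the output to a single log entry, as mentioned in the question.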

Delete duplicate rows from temp table in SQL

I have a table with the below columns
+-------+------------+------------+
| AssID | QuestionID | AnswerText |
+-------+------------+------------+
| 12 | 34 | Null |
| 12 | 34 | Sample |
| 13 | 35 | null |
| 13 | 35 | test1 |
+-------+------------+------------+
I need to remove answertext null row with same AssId and QuestionID
Final Output needs to be in this format
+-------+------------+------------+
| AssId | QuestionID | AnswerText |
+-------+------------+------------+
| 12 | 34 | Sample |
| 13 | 35 | test1 |
+-------+------------+------------+
Please help me with the delete query
Thanks in advance
Sree
You can use EXISTS to check whether a row with a NULL AnswerText also has a matching row with a non-NULL AnswerText:
DELETE t
FROM MyTABLE t
WHERE t.AnswerText IS NULL
AND EXISTS
(
SELECT *
FROM MyTable m
WHERE m.AssID = t.AssID
AND m.QuestionID = t.QuestionID
AND m.AnswerText IS NOT NULL
)
You can use cte and row_number to delete
;with cte as (
-- order by answertext desc so the non-NULL answer gets RowN = 1 (NULLs sort lowest in SQL Server);
-- order by your id instead if you have a rule for which answer to keep
select *, RowN = Row_number() over (partition by assid, questionid order by answertext desc) from yourtable
)
delete from cte where RowN > 1
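To try the EXISTS approach without touching real data, a small sketch along these lines could be used. The table variable @answers and the extra row for AssID 14 are assumptions added to show that a NULL row with no non-NULL sibling is left alone:

```sql
-- Hypothetical sample data: the question's rows plus one lone NULL row.
DECLARE @answers TABLE (AssID int, QuestionID int, AnswerText varchar(50));
INSERT INTO @answers (AssID, QuestionID, AnswerText) VALUES
    (12, 34, NULL),
    (12, 34, 'Sample'),
    (13, 35, NULL),
    (13, 35, 'test1'),
    (14, 36, NULL);   -- no non-NULL sibling, so it should survive

DELETE a
FROM @answers AS a
WHERE a.AnswerText IS NULL
  AND EXISTS (
        SELECT 1
        FROM @answers AS b
        WHERE b.AssID = a.AssID
          AND b.QuestionID = a.QuestionID
          AND b.AnswerText IS NOT NULL
      );

SELECT AssID, QuestionID, AnswerText
FROM @answers
ORDER BY AssID;
```

After the delete, the remaining rows are (12, 34, Sample), (13, 35, test1) and (14, 36, NULL).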

The highest value from list-distinct

Can anyone help me with query, I have table
vendorid, agreementid, sales
12001 1004 700
5291 1004 20576
7596 1004 1908
45 103 345
41 103 9087
What is the goal?
When agreementid > 1, show the row with the highest sales for each agreement:
vendorid agreementid sales
5291 1004 20576
41 103 9087
Any ideas ?
Thx
Well you could try using a CTE and ROW_NUMBER something like
;WITH Vals AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY AgreementID ORDER BY Sales DESC) RowID
FROM MyTable
WHERE AgreementID > 1
)
SELECT *
FROM Vals
WHERE RowID = 1
This avoids returning multiple records when several rows tie on the highest sales.
If returning ties is acceptable, you could instead try something like
SELECT *
FROM MyTable mt INNER JOIN
(
SELECT AgreementID, MAX(Sales) MaxSales
FROM MyTable
WHERE AgreementID > 1
GROUP BY AgreementID
) MaxVals ON mt.AgreementID = MaxVals.AgreementID AND mt.Sales = MaxVals.MaxSales
SELECT TOP 1 WITH TIES *
FROM MyTable
ORDER BY DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC)
Explanation
We break the table MyTable into partitions by agreementid.
For each partition we construct a ranking of its rows.
If agreementid is greater than 1, the ranking is equivalent to ORDER BY sales DESC.
Otherwise every row in the partition gets the same rank, as if by ORDER BY 0 DESC.
Here is what it looks like:
SELECT *
, SIGN (SIGN (agreementid - 2) + 1) * sales AS x
, DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC) AS rnk
FROM MyTable
+----------+-------------+-------+-------+-----+
| vendorid | agreementid | sales | x | rnk |
+----------+-------------+-------+-------+-----+
| 0 | 0 | 3 | 0 | 1 |
| -1 | 0 | 7 | 0 | 1 |
| 0 | 1 | 3 | 0 | 1 |
| -1 | 1 | 7 | 0 | 1 |
| 41 | 103 | 9087 | 9087 | 1 |
| 45 | 103 | 345 | 345 | 2 |
| 5291 | 1004 | 20576 | 20576 | 1 |
| 7596 | 1004 | 1908 | 1908 | 2 |
| 12001 | 1004 | 700 | 700 | 3 |
+----------+-------------+-------+-------+-----+
Then using TOP 1 WITH TIES construction we leave only rows where rnk equals 1.
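The nested SIGN expression is the least obvious part of the query above. A tiny sketch that evaluates it for a few made-up agreementid values (the VALUES list is an assumption for illustration) shows that it acts as a 0/1 switch:

```sql
SELECT a.agreementid,
       SIGN(a.agreementid - 2)           AS inner_sign,  -- -1 below 2, 0 at 2, 1 above 2
       SIGN(SIGN(a.agreementid - 2) + 1) AS multiplier   -- 0 when agreementid <= 1, otherwise 1
FROM (VALUES (0), (1), (2), (103)) AS a(agreementid);
```

The multiplier is 0 for agreementid 0 and 1, and 1 for agreementid 2 and 103, so multiplying it by sales zeroes out the sort key for every agreement that is not greater than 1.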
you can try like this.
SELECT TOP 1 sales FROM MyTable WHERE agreementid > 1 ORDER BY sales DESC
I really do not know the business logic behind agreement_id > 1. It looks to me you want the max sales (with ties) by agreement id regardless of vendor_id.
First, let's create a simple sample table.
-- Sample table
create table #sales
(
vendor_id int,
agreement_id int,
sales_amt money
);
-- Sample data
insert into #sales values
(12001, 1004, 700),
(5291, 1004, 20576),
(7596, 1004, 1908),
(45, 103, 345),
(41, 103, 9087);
Second, let's solve this problem using a common table expression to get a result set that has each row paired with the max sales by agreement id.
The select statement just applies the business logic to filter the data to get your answer.
-- CTE = max sales for each agreement id
;
with cte_sales as
(
select
vendor_id,
agreement_id,
sales_amt,
max(sales_amt) OVER(PARTITION BY agreement_id) AS max_sales
from
#sales
)
-- Filter by your business logic
select * from cte_sales where sales_amt = max_sales and agreement_id > 1;
This returns the two rows you wanted: vendor 5291 for agreement 1004 and vendor 41 for agreement 103.
