T-SQL OVER clause that has ORDER BY but no PARTITION BY

I am having problems reading code like
SELECT
employeeID as ID,
RANK() OVER (ORDER BY AVG (Salary) DESC) AS Value
FROM Salaries
which supposedly gets the average salary of every employee.
My understanding is that the code should be:
SELECT
employeeID as ID,
RANK() OVER (Partition By employeeID ORDER BY AVG (Salary) DESC) AS Value
FROM Salaries
but the above code works just fine?

The first query does not work for me. It returns Msg 8120 (Column 'Salaries.employeeID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause) until I add GROUP BY employeeID:
SELECT
employeeID as ID,
RANK() OVER (ORDER BY AVG (Salary) DESC) AS Value
FROM Salaries
GROUP BY employeeID
Perhaps, for better understanding, it can be rewritten equivalently as:
;with cte as (
SELECT employeeID, AVG (Salary) as AvgSalary
FROM Salaries
GROUP BY employeeID
)
select employeeID as ID
, RANK() OVER (ORDER BY AvgSalary DESC) as Value
--, AvgSalary
from cte
In this case, the average salary per employee is calculated in the CTE, and the outer query then adds the ranking column Value. Adding PARTITION BY employeeID to the OVER clause:
;with cte as (
SELECT employeeID, AVG (Salary) as AvgSalary
FROM Salaries
GROUP BY employeeID
)
select employeeID as ID
, RANK() OVER (partition by employeeID ORDER BY AvgSalary DESC) as Value
--, AvgSalary
from cte
will lead to Value = 1 for every row in the result set (which is probably not what was intended), because RANK() restarts numbering at 1 for each distinct employeeID, and employeeID is distinct in every row, since the data was aggregated by this column before ranking.
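The difference between the two OVER clauses can be checked on a few rows. The sketch below is not SQL Server: it uses SQLite through Python's sqlite3 module as a stand-in (assuming a SQLite build with window functions, 3.25 or later), and the salary figures are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: two salary rows per employee.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Salaries (employeeID INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Salaries VALUES (?, ?)",
    [(1, 100), (1, 200), (2, 300), (2, 500), (3, 50), (3, 150)],
)

# Same CTE as in the answer; only the window definition varies.
QUERY = """
WITH cte AS (
    SELECT employeeID, AVG(Salary) AS AvgSalary
    FROM Salaries
    GROUP BY employeeID
)
SELECT employeeID AS ID,
       RANK() OVER ({window} ORDER BY AvgSalary DESC) AS Value
FROM cte
ORDER BY employeeID
"""

# No PARTITION BY: all employees are ranked against each other.
global_rank = conn.execute(QUERY.format(window="")).fetchall()

# PARTITION BY employeeID: every partition holds exactly one row.
per_employee = conn.execute(
    QUERY.format(window="PARTITION BY employeeID")
).fetchall()

print(global_rank)   # [(1, 2), (2, 1), (3, 3)]
print(per_employee)  # [(1, 1), (2, 1), (3, 1)]
```

With averages of 150, 400 and 100, the unpartitioned query ranks employee 2 first, while the partitioned one ranks every employee first within a partition of one.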

Related

Distinct columns should have one date per month in SQL Server

My table has columns Name, EmpName, Date. For distinct Name and EmpName values, Date should be only one value per month
For example:
Name  EmpName  Date
----  -------  ---------
abc   emp1     3/19/2018
abc   emp1     3/22/2018  (this record should be rejected)
xyz   emp2     3/15/2018  (valid record)
I wrote something like this
SELECT
name, empname,
ROW_NUMBER() OVER (PARTITION BY YEAR(date), MONTH(date) ORDER BY date DESC)
I got stuck writing a CASE statement
You can use row_number() :
select top (1) with ties t.*
from table t
order by row_number() over (partition by name, empname, year(date), month(date) order by date);
However, based on the sample data, a simple aggregation would also work:
select name, empname, min(date)
from table t
group by name, empname, year(date), month(date);
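Both approaches can be compared on the sample rows. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 rather than SQL Server, so strftime('%Y-%m', ...) replaces YEAR()/MONTH() and a row-number filter replaces TOP (1) WITH TIES. The table name records, the ISO date format, and the extra April row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (Name TEXT, EmpName TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("abc", "emp1", "2018-03-19"),
        ("abc", "emp1", "2018-03-22"),  # same month: should be rejected
        ("xyz", "emp2", "2018-03-15"),
        ("abc", "emp1", "2018-04-02"),  # hypothetical row in a new month
    ],
)

# Window version: number rows per (Name, EmpName, month), keep the first.
windowed = conn.execute(
    """
    WITH numbered AS (
        SELECT Name, EmpName, Date,
               ROW_NUMBER() OVER (
                   PARTITION BY Name, EmpName, strftime('%Y-%m', Date)
                   ORDER BY Date
               ) AS rn
        FROM records
    )
    SELECT Name, EmpName, Date
    FROM numbered
    WHERE rn = 1
    ORDER BY Name, EmpName, Date
    """
).fetchall()

# Aggregation version: earliest date per (Name, EmpName, month).
aggregated = conn.execute(
    """
    SELECT Name, EmpName, MIN(Date)
    FROM records
    GROUP BY Name, EmpName, strftime('%Y-%m', Date)
    ORDER BY 1, 2, 3
    """
).fetchall()

print(windowed)
print(windowed == aggregated)  # True
```

The two results agree here because only Name, EmpName and the date are selected; the window version becomes necessary once other columns of the surviving row must be returned as well.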

SQL LAG Days since last order

Hi, I am trying to create a windowed query in SQL that shows the days since the last order for each customer.
It currently shows the days between each pair of consecutive orders.
What do I need to change in my query so that it only shows the days between the last and the previous order per customer? Right now it shows this for every order the customer made.
Query:
SELECT klantnr,besteldatum,
DATEDIFF(DAY,LAG(besteldatum) OVER(PARTITION BY klantnr ORDER BY besteldatum),besteldatum) AS DaysSinceLastOrder
FROM bestelling
GROUP BY klantnr,besteldatum;
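You can use row_number() to number the rows by besteldatum, latest first, for each klantnr, and keep only the latest row per customer using a derived table (subquery) or common table expression. That row already holds the number of days between the latest two orders.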
derived table version:
select klantnr, besteldatum, DaysSinceLastOrder
from (
select klantnr, besteldatum
, DaysSinceLastOrder = datediff(day,lag(besteldatum) over (partition by klantnr order by besteldatum),besteldatum)
, rn = row_number() over (partition by klantnr order by besteldatum desc)
from bestelling
group by klantnr, besteldatum
) t
where rn = 1
common table expression version:
;with cte as (
select klantnr, besteldatum
, DaysSinceLastOrder = datediff(day,lag(besteldatum) over (partition by klantnr order by besteldatum),besteldatum)
, rn = row_number() over (partition by klantnr order by besteldatum desc)
from bestelling
group by klantnr, besteldatum
)
select klantnr, besteldatum, DaysSinceLastOrder
from cte
where rn = 1
If you want one row per customer, rn = 1 is the proper filter. If you want the n latest rows per customer, use rn <= n.
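The reason the filter has to sit in an outer query is that LAG must see all of a customer's orders before the rows are reduced to the latest one. A minimal sketch of that ordering, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server: julianday() arithmetic replaces DATEDIFF, and the order dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bestelling (klantnr INTEGER, besteldatum TEXT)")
conn.executemany(
    "INSERT INTO bestelling VALUES (?, ?)",
    [
        (1, "2020-01-01"),
        (1, "2020-01-11"),
        (1, "2020-02-10"),
        (2, "2020-03-05"),  # customer with a single order
    ],
)

rows = conn.execute(
    """
    WITH cte AS (
        SELECT klantnr, besteldatum,
               -- days between this order and the previous one
               CAST(julianday(besteldatum)
                    - julianday(LAG(besteldatum) OVER (
                          PARTITION BY klantnr ORDER BY besteldatum))
                    AS INTEGER) AS DaysSinceLastOrder,
               -- 1 = latest order of the customer
               ROW_NUMBER() OVER (
                   PARTITION BY klantnr ORDER BY besteldatum DESC) AS rn
        FROM bestelling
    )
    SELECT klantnr, besteldatum, DaysSinceLastOrder
    FROM cte
    WHERE rn = 1
    ORDER BY klantnr
    """
).fetchall()

print(rows)  # [(1, '2020-02-10', 30), (2, '2020-03-05', None)]
```

A customer with only one order still appears, with a NULL (None) gap, because there is no previous order for LAG to return.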

SQL Server: selecting distinct values per one column

I was wondering if it's possible to filter select results, removing rows that partially overlap.
In the example below, I have thousands of records, but I need the 'week_Date' value to be unique, and in case of duplicates the row with the highest Value should remain.
emplno  project_id  Value  week_Date   week_ActualStart  week_ActualEnd
A0001   project001  100    2015-12-28  2015-12-28        2016-01-03
A0001   project001  60     2015-12-28  2016-01-01        2016-01-03
So only the first row should remain.
I could really use someone's advice
Try something like the following:
;WITH WeekDateCte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY emplno, week_Date ORDER BY Value DESC) RowNo
FROM employee
)
SELECT *
FROM WeekDateCte
WHERE RowNo = 1
For more information, see the documentation for the ROW_NUMBER function.
NOTE: ROW_NUMBER() returns BIGINT.
You can use ROW_NUMBER for this:
SELECT emplno, project_id, Value, week_Date,
week_ActualStart, week_ActualEnd
FROM (
SELECT emplno, project_id, Value, week_Date,
week_ActualStart, week_ActualEnd,
ROW_NUMBER() OVER (PARTITION BY emplno, week_Date
ORDER BY Value DESC) AS rn
FROM mytable) AS t
WHERE t.rn = 1
The query picks the row having the greatest Value per emplno, week_Date slice.
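As a quick check of that behaviour, the sketch below runs the same ROW_NUMBER pattern on the two rows from the question plus one made-up row for a later week. It assumes SQLite via Python's sqlite3 as a stand-in for SQL Server; mytable is the placeholder name used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mytable (
        emplno TEXT, project_id TEXT, Value INTEGER,
        week_Date TEXT, week_ActualStart TEXT, week_ActualEnd TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("A0001", "project001", 100, "2015-12-28", "2015-12-28", "2016-01-03"),
        ("A0001", "project001", 60, "2015-12-28", "2016-01-01", "2016-01-03"),
        # Hypothetical extra row: a different week, so it must survive.
        ("A0001", "project001", 40, "2016-01-04", "2016-01-04", "2016-01-10"),
    ],
)

kept = conn.execute(
    """
    SELECT emplno, Value, week_Date
    FROM (
        SELECT emplno, Value, week_Date,
               ROW_NUMBER() OVER (
                   PARTITION BY emplno, week_Date
                   ORDER BY Value DESC
               ) AS rn
        FROM mytable
    ) AS t
    WHERE t.rn = 1
    ORDER BY week_Date
    """
).fetchall()

print(kept)  # [('A0001', 100, '2015-12-28'), ('A0001', 40, '2016-01-04')]
```

If two rows in the same slice share the highest Value, ROW_NUMBER picks one of them arbitrarily; adding a tie-breaker column to the ORDER BY would make the choice deterministic.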

SQL Server Select where stage/sequence has been missed or is out of sequence

I have a table that has families_id, date, metric_id.
For each families_id, one record is inserted per metric_id (1-10), each with a date.
So there should be 10 records for each families_id, and their dates should follow on from each other in metric_id order. So the date for metric_id 10 should be later than the date for metric_id 6.
En masse, how can I select the records where they have:
Missed a metric_id
The date for the metric_id 6 is before the date for metric_id 2
Use row_number to number each family's rows twice, once in date order and once in metric_id order. If nothing is out of sequence, the two numberings match. Likewise, if no metric_id is missing, metric_id (1, 2, 3, 4, ...) matches its calculated row_number() (also 1, 2, 3, 4, ...).
-- should detect wrongly ordered metric_ids: date order differs from metric_id order
SELECT IQ.* FROM (SELECT families_id, [date], metric_id,
ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY [date]) rn_date,
ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY metric_id) rn_metric FROM YourTable) IQ
WHERE IQ.rn_date != IQ.rn_metric;
-- should detect missed metric_ids: metric_id differs from its position in the sequence
SELECT IQ.* FROM (SELECT families_id, [date], metric_id,
ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY [date]) rn_date,
ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY metric_id) rn_metric FROM YourTable) IQ
WHERE IQ.metric_id != IQ.rn_metric;
Another possibility: detect a metric_id for which a higher metric_id has an earlier date
SELECT y1.families_id, y1.metric_id FROM YourTable y1
WHERE
EXISTS(SELECT 0 FROM YourTable y2 WHERE y1.families_id = y2.families_id
AND
y2.[date] < y1.[date]
AND
y2.metric_id > y1.metric_id)
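The two row-number checks can be tried on a small data set. The sketch below is an illustration under assumptions: it uses SQLite via Python's sqlite3 instead of SQL Server, only metric_ids 1 to 4 instead of 1 to 10, and three hypothetical families (one complete, one missing metric_id 3, one with metric_id 3 dated before metric_id 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (families_id INTEGER, date TEXT, metric_id INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        # Family 1: complete and in order.
        (1, "2020-01-01", 1), (1, "2020-01-02", 2),
        (1, "2020-01-03", 3), (1, "2020-01-04", 4),
        # Family 2: metric_id 3 is missing.
        (2, "2020-01-01", 1), (2, "2020-01-02", 2), (2, "2020-01-04", 4),
        # Family 3: metric_id 3 is dated before metric_id 2.
        (3, "2020-01-01", 1), (3, "2020-01-03", 2), (3, "2020-01-02", 3),
    ],
)

# Shared derived table: one numbering by date, one by metric_id.
NUMBERED = """
    SELECT families_id, date, metric_id,
           ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY date) AS rn_date,
           ROW_NUMBER() OVER (PARTITION BY families_id ORDER BY metric_id) AS rn_metric
    FROM YourTable
"""

out_of_order = [
    r[0] for r in conn.execute(
        f"SELECT DISTINCT families_id FROM ({NUMBERED}) AS IQ "
        "WHERE rn_date != rn_metric ORDER BY families_id"
    )
]
missing = [
    r[0] for r in conn.execute(
        f"SELECT DISTINCT families_id FROM ({NUMBERED}) AS IQ "
        "WHERE metric_id != rn_metric ORDER BY families_id"
    )
]

print(out_of_order)  # [3]
print(missing)       # [2]
```

Two caveats follow from how the checks work: a family that only lacks its highest metric_ids is not flagged by the second check, because the remaining ids still match their row numbers, and rows with identical dates make the date numbering arbitrary.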

Find 2nd highest salary in every department

Hi, I have a schema Employee(Employeeid, Name, departmentid, salary). I want to find the 2nd highest salary in each department.
select departmentid, name, salary
from (select departmentid, name, salary,
rank() over (partition by departmentid order by salary desc) as Rank
from employee) t
where t.Rank = 2;
This does the job, but if there is only 1 employee in a department, that employee's salary is not returned. Can anyone please help me with that?
Try this:
Use the Count() over() analytic function to count the records in each department. When the count is 1, take rank 1 instead.
SELECT departmentid,
NAME,
salary
FROM (SELECT departmentid,
NAME,
salary,
Dense_rank()OVER (partition BY departmentid
ORDER BY salary DESC) AS Rank,
Count(1)OVER(partition BY departmentid) AS cnt
FROM employee)t
WHERE t.rank = 2
OR ( t.rank = 1
AND cnt = 1 )
Note: I have used DENSE_RANK instead of RANK because, when there is a tie for the highest salary, RANK assigns 1 to the tied rows and then skips to 3, so no row ever gets RANK = 2.
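The effect of a tie on the two functions can be seen on a small example. This is a sketch, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server; the employees and salaries are hypothetical, and the aliases rnk and drnk are used in place of Rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (departmentid INTEGER, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (10, "A", 500), (10, "B", 500),  # tie for the highest salary
        (10, "C", 400),
        (20, "D", 300),                  # department with one employee
    ],
)

# RANK: the tied rows get 1, the next row gets 3, so rank 2 never occurs.
with_rank = conn.execute(
    """
    SELECT departmentid, name, salary
    FROM (SELECT departmentid, name, salary,
                 RANK() OVER (PARTITION BY departmentid
                              ORDER BY salary DESC) AS rnk
          FROM employee) AS t
    WHERE t.rnk = 2
    """
).fetchall()

# DENSE_RANK plus a per-department count, as in the answer.
with_dense_rank = conn.execute(
    """
    SELECT departmentid, name, salary
    FROM (SELECT departmentid, name, salary,
                 DENSE_RANK() OVER (PARTITION BY departmentid
                                    ORDER BY salary DESC) AS drnk,
                 COUNT(*) OVER (PARTITION BY departmentid) AS cnt
          FROM employee) AS t
    WHERE t.drnk = 2
       OR (t.drnk = 1 AND t.cnt = 1)
    ORDER BY departmentid
    """
).fetchall()

print(with_rank)        # []
print(with_dense_rank)  # [(10, 'C', 400), (20, 'D', 300)]
```

The RANK query returns nothing for department 10, while the DENSE_RANK query returns the second distinct salary there and falls back to the only salary in department 20.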

Resources