How to select highest common value across groups - sql-server

Suppose I have a set of data with 2 fields - Type and Date. I am interested in finding (if it exists) the max common date across the various types. Is this easier to do in SQL or LINQ?
Given the data below the result should be 2018-02-01, as this is the max common date for all types. If there is no such date then no data is returned.
Type, Date
---------
1,2018-03-01
1,2018-02-01
1,2018-01-01
2,2018-02-01
2,2018-05-01
2,2018-01-01
3,2018-01-01
3,2018-03-01
3,2018-02-01

You could use:
SELECT TOP 1 [Date], COUNT(*) OVER(PARTITION BY Date) AS cnt
FROM tab
ORDER BY cnt DESC, [Date] DESC

This'll work if you have an unlimited or indeterminable number of Types:
CREATE TABLE #Sample ([Type] int, [Date] date);
INSERT INTO #Sample
VALUES
(1,'20180301'),
(1,'20180201'),
(1,'20180101'),
(2,'20180201'),
(2,'20180501'),
(2,'20180101'),
(3,'20180101'),
(3,'20180301'),
(3,'20180201');
GO
WITH EntryCount AS(
SELECT [Type], [Date],
COUNT(*) OVER (PARTITION By [Date]) AS Entries
FROM #Sample)
SELECT MAX(Date)
FROM EntryCount EC
WHERE Ec.Entries = (SELECT COUNT(DISTINCT sq.[Type]) FROM #Sample sq);
GO
DROP TABLE #Sample;
Not sure how quick it'll be either though.

Example
Select Top 1 [Date]
from YourTable
Group By [Date]
Order By count([Type]) desc,[Date] desc
Returns
2018-02-01

This is not going to be very efficient no matter how you slice it, because you have to compare across three groups. Assuming you have 3 types you could use a self join. Something like this.
select MAX(yt.YourDate)
from YourTable yt
join YourTable yt2 on yt2.YourType = 2 and yt.YourDate = yt2.YourDate
join YourTable yt3 on yt3.YourType = 3 and yt.YourDate = yt3.YourDate
where yt.YourType = 1
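
For what it's worth, here is a small sketch of another way to phrase it: group by date and keep only the dates whose distinct type count equals the total number of types. Unlike the TOP 1 queries above, this returns nothing when no date is shared by every type. It runs against SQLite through Python's sqlite3 module purely so it can be tried without a server (that is my assumption, not part of the question); the table name tab is borrowed from the first answer, and in T-SQL you would use TOP 1 instead of LIMIT 1.

```python
import sqlite3

def max_common_date(rows):
    """Return the latest date present for every type, or None if there is none."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab (type INTEGER, date TEXT)")
    con.executemany("INSERT INTO tab VALUES (?, ?)", rows)
    found = con.execute(
        """
        SELECT date
        FROM tab
        GROUP BY date
        -- keep a date only if every distinct type has a row for it
        HAVING COUNT(DISTINCT type) = (SELECT COUNT(DISTINCT type) FROM tab)
        ORDER BY date DESC
        LIMIT 1
        """
    ).fetchone()
    con.close()
    return found[0] if found else None

# Sample data from the question.
sample = [
    (1, "2018-03-01"), (1, "2018-02-01"), (1, "2018-01-01"),
    (2, "2018-02-01"), (2, "2018-05-01"), (2, "2018-01-01"),
    (3, "2018-01-01"), (3, "2018-03-01"), (3, "2018-02-01"),
]
print(max_common_date(sample))  # 2018-02-01
```

COUNT(DISTINCT type) also keeps the comparison honest if the same Type/Date pair appears more than once.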

Related

Loops on SQL Server

I have the following query where I input a date and it give me the result. However, I need to run this for 60 different dates. Instead of running this 1 by 1, is there anyway to automate this so it runs each time on a different date?
IF OBJECT_ID('tempdb..#1') IS NOT NULL DROP TABLE #1
declare @d1 datetime = '2020-02-06'
select distinct [User] into #1
from [X].[dbo].[Table1]
where [status] = 'Success'
and [Date] = @d1;
select count(distinct #1.[User])
from #1
inner join [Y].[dbo].[Table2]
on #1.[User] = [Y].[dbo].[Table2].[User]
where [Date2] between @d1 and @d1+1
and [Checkname] in ('Check1','Check2')
Loops are slow and generally a bad practice in the context of T-SQL. You can use something like this to get the count of users for a batch of dates:
DROP TABLE IF EXISTS #DataSource;
CREATE TABLE #DataSource
(
[Date] DATETIME
,[UsersCount] INT
);
INSERT INTO #DataSource ([Date])
VALUES ('2020-02-06')
,('2020-02-07')
,('2020-02-08');
IF OBJECT_ID('tempdb..#1') IS NOT NULL DROP TABLE #1
select distinct DS1.[Date]
,DS1.[User]
into #1
from [X].[dbo].[Table1] DS1
INNER JOIN #DataSource DS2
ON DS1.[Date] = DS2.[Date]
where DS1.[status] = 'Success';
select #1.[date]
,count(distinct #1.[User])
from #1
inner join [Y].[dbo].[Table2]
on #1.[User] = [Y].[dbo].[Table2].[User]
where [Date2] between #1.[date] and #1.[date] + 1
and [Checkname] in ('Check1','Check2')
GROUP BY #1.[date]
First, I want to say that gotqn's answer is a good answer - however, I think there are a few more things in the original code that can be improved - so here is how I would probably do it:
Assuming the dates are consecutive, use a common table expression to calculate the dates using dateadd and row_number.
Then, use another common table expression to get the list of dates and users from table1,
and then select the date and count of distinct users for each date from that common table expression joined to table2:
DECLARE @StartDate Date = '2020-02-06';
WITH Dates AS
(
SELECT TOP (60) DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY @@SPID) -1, @StartDate) As Date
FROM sys.objects
), CTE AS
(
SELECT t1.[User], t1.[Date]
FROM [X].[dbo].[Table1] AS t1
JOIN Dates
ON t1.[Date] = Dates.[Date]
WHERE [status] = 'Success'
)
SELECT cte.[Date], COUNT(DISTINCT cte.[User])
FROM CTE
JOIN [Y].[dbo].[Table2] As t1
ON CTE.[User] = t1.[User]
AND t1.[Date2] >= CTE.[Date]
AND t1.[Date2] < DATEADD(Day, 1, CTE.[Date])
AND [Checkname] IN ('Check1','Check2')
GROUP BY cte.[Date]
If the dates are not consecutive, you can use a table variable to hold the dates instead of calculating them using a common table expression.
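
To make the set-based idea concrete, here is a rough, self-contained sketch that generates a run of consecutive dates and counts the distinct users per date in a single query. It uses SQLite via Python's sqlite3 so it can be run anywhere, which is my assumption rather than anything from the question: the recursive CTE stands in for the ROW_NUMBER/DATEADD trick, and the table names, column names (user_name, event_date) and sample rows are all made up for illustration.

```python
import sqlite3

def users_per_day(con, start_date, days):
    """Count distinct successful users with a matching check, per date."""
    query = """
        WITH RECURSIVE dates(d, n) AS (
            SELECT ?, 1                       -- first date in the batch
            UNION ALL
            SELECT date(d, '+1 day'), n + 1   -- next consecutive date
            FROM dates WHERE n < ?
        )
        SELECT dates.d, COUNT(DISTINCT t1.user_name)
        FROM dates
        JOIN table1 AS t1
          ON t1.event_date = dates.d AND t1.status = 'Success'
        JOIN table2 AS t2
          ON t2.user_name = t1.user_name
         AND t2.date2 >= dates.d                  -- half-open one-day window
         AND t2.date2 < date(dates.d, '+1 day')
         AND t2.checkname IN ('Check1', 'Check2')
        GROUP BY dates.d
        ORDER BY dates.d
    """
    return con.execute(query, (start_date, days)).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (user_name TEXT, event_date TEXT, status TEXT)")
con.execute("CREATE TABLE table2 (user_name TEXT, date2 TEXT, checkname TEXT)")
con.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    ("alice", "2020-02-06", "Success"),
    ("bob", "2020-02-06", "Success"),
    ("carol", "2020-02-06", "Failed"),
    ("alice", "2020-02-07", "Success"),
    ("dave", "2020-02-07", "Success"),
])
con.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
    ("alice", "2020-02-06 10:00:00", "Check1"),
    ("alice", "2020-02-06 15:00:00", "Check2"),
    ("bob", "2020-02-06 11:00:00", "Check1"),
    ("carol", "2020-02-06 12:00:00", "Check1"),
    ("alice", "2020-02-07 09:00:00", "Check2"),
    ("dave", "2020-02-08 08:00:00", "Check1"),
])
result = users_per_day(con, "2020-02-06", 3)
print(result)  # [('2020-02-06', 2), ('2020-02-07', 1)]
```

Dates with no matching users (2020-02-08 here) simply drop out because of the inner joins; a LEFT JOIN from the dates would keep them with a count of 0.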

Sql Filter table by two dates in order

I have been trying to filter one table by two dates with an order of importance (date2 > date1) as follows:
SELECT
t1.customer, t1.weights, t1.max(t1.date1) as date1, t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
GROUP BY
t1.customer, t1.date2
ORDER BY
t1.customer;
It filters the table correctly by date2 first, but the max(t1.date1) doesn't do what I want. I get duplicate customers that share the same (and correct) date2 but show different date1 values. These duplicate records have one thing in common: the weights value differs between them. What would I need to do to output just the customer records connected to the most recent date1, without taking other columns into consideration?
I am still a noob, help would be greatly appreciated!
Solution for t-sql (all based on the accepted answer):
SELECT * FROM (
SELECT row_number() over(partition by t1.customer order by t1.date1 desc) as rownum, t1.customer, t1.weights, t1.date1 , t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
)t3
where rownum = 1;
If I understood correctly, then instead of GROUP BY logic I would just use a QUALIFY clause on ROW_NUMBER (QUALIFY exists in Teradata and Snowflake, for example, but not in SQL Server) :)
Try the code below and tell me if it's what you needed - what I'm telling it to do is to bring back only one row per customer ID....but where we select the row based on the dates (by sorting them in ascending order) - however, I'm unclear of what you mean by importance of the 2 dates so I may be completely off base here...can you please give an example of input and desired output?
SELECT t1.customer, t1.weights, t1.date1, t1.date2
FROM
(
Select *
FROM table
WHERE Cast(date2 as smalldatetime)='10/29/2017'
) t2
Inner Join table t1
ON t1.customer = t2.customer
AND t1.date2 = t2.date2
Qualify row_number() over(partition by t1.customer order by date2 , date1)=1
Order By t1.customer;
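
Here is a tiny runnable sketch of the "one row per customer, latest date1 wins" pattern, in case it helps to see it on data. It is only an illustration under my own assumptions: it uses SQLite (3.25 or later, for window functions) through Python's sqlite3, the table name weights_log and the rows are invented, and the date2 filter is applied directly instead of through the self join from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE weights_log (customer TEXT, weights REAL, date1 TEXT, date2 TEXT)"
)
con.executemany("INSERT INTO weights_log VALUES (?, ?, ?, ?)", [
    ("acme", 10.0, "2017-10-01", "2017-10-29"),
    ("acme", 12.5, "2017-10-15", "2017-10-29"),  # latest date1 for acme on that date2
    ("acme", 9.0, "2017-10-20", "2017-10-28"),   # different date2, filtered out
    ("zen", 7.0, "2017-10-05", "2017-10-29"),
])
latest = con.execute("""
    SELECT customer, weights, date1, date2
    FROM (
        SELECT w.*,
               -- number each customer's rows, newest date1 first
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY date1 DESC) AS rn
        FROM weights_log AS w
        WHERE date2 = '2017-10-29'
    ) AS ranked
    WHERE rn = 1
    ORDER BY customer
""").fetchall()
print(latest)
# [('acme', 12.5, '2017-10-15', '2017-10-29'), ('zen', 7.0, '2017-10-05', '2017-10-29')]
```

Because the row number is computed per customer only, the differing weights values no longer produce extra rows.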

Display max value rows only

I have the following table (a much shorter version than the real one),
and I want all the rows with max _ values for each _ to be displayed.
How should I do this?
Table Now
Table I want to have
thanks a lot in advance!!
Using the dense_rank function and a derived table would be appropriate for this (please note I used underscores instead of spaces in the column names):
select group_type
,desk_number
,comments
from
(select *
,dense_rank() over(partition by group_type order by desk_number desc) dr
from mytable) t1
where t1.dr = 1
I made a rextester sample that you can try here
Let me know if you have any questions.
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
This answers your question quite well but I will convert it for your convenience <3
SELECT *
FROM table
INNER JOIN
(SELECT comments, MAX([desk number]) AS MaxDesk
FROM table
GROUP BY comments) groupedtable
ON table.[desk number]= groupedtable.MaxDesk
AND table.comments= groupedtable.comments
try this :
WITH CTE
AS
(
SELECT
SeqNo = ROW_NUMBER() OVER(PARTITION BY GroupType ORDER BY CAST(DeskNumber AS INT) DESC),
GroupType,
DeskNumber,
[Comment]
FROM YourTable
)
SELECT
*
FROM CTE WHERE CTE.SeqNo = 1
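
Since the original tables did not survive, here is a hedged sketch with made-up data showing the join-to-grouped-maximum approach. Assumptions on my part: SQLite via Python's sqlite3 instead of SQL Server, the column names from the dense_rank answer (group_type, desk_number, comments), and invented rows. Like DENSE_RANK, and unlike ROW_NUMBER, this keeps every row that ties for the maximum.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE mytable (group_type TEXT, desk_number INTEGER, comments TEXT)"
)
con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("A", 1, "first"),
    ("A", 3, "third"),
    ("A", 3, "also third"),  # ties with the row above for the max in group A
    ("B", 2, "second"),
    ("B", 1, "first"),
])
rows = con.execute("""
    SELECT t.group_type, t.desk_number, t.comments
    FROM mytable AS t
    JOIN (
        -- one row per group holding that group's highest desk number
        SELECT group_type, MAX(desk_number) AS max_desk
        FROM mytable
        GROUP BY group_type
    ) AS g
      ON g.group_type = t.group_type
     AND g.max_desk = t.desk_number
    ORDER BY t.group_type, t.comments
""").fetchall()
print(rows)
# [('A', 3, 'also third'), ('A', 3, 'third'), ('B', 2, 'second')]
```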

SQL - Combine rows

I have rows in a table that looks like this:
[date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart]
2015-04-27, person1, 1,0,NULL,0,NULL
2015-04-27, person1, 0,1,'fd',0,NULL
I would like to combine these rows to:
[date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart]
2015-04-27, person1, 1,1,'fd',0,NULL
The duty, holiday and sick columns are BIT columns.
Is there a way to do this?
The one solution I can come up with is using subqueries, but it consumes a lot of time. A faster solution would be nice.
This is what I have now:
SELECT DISTINCT [name],[date],[region],[cluster]
,CASE WHEN (SELECT SUM(CONVERT(INT,callduty)) FROM planning AS t2
WHERE t1.[Date] = @datum AND t2.[Name] = t1.[name] AND t2.[Date] = t1.[date] ) > 0
THEN 1 ELSE 0 END AS [CallDuty]
,CASE WHEN (SELECT SUM(CONVERT(INT,holiday)) FROM planning AS t2
WHERE t1.[Date] = @datum AND t2.[Name] = t1.[name] AND t2.[Date] = t1.[date] ) > 0
THEN 1 ELSE 0 END AS [Holiday]
FROM planning AS t1
where t1.[Date] = @datum AND t1.[Name] like @naam
group by t1.[date],t1.[name], t1.Region, t1.cluster
order by t1.[name]
You seem to want to group by date and name and select either the maximum or the not null values within each group. MAX aggregate function is suitable for both of these selections:
SELECT [date],[name], MAX(CAST([duty] AS INT)), MAX(CAST([holiday] AS INT)),
MAX([hdaypart]), MAX(CAST([sick] AS INT)), MAX([sdaypart])
FROM mytable
GROUP BY [date],[name]
By looking at your example, I assume that you want to get the maximum values for a specific user.
You could do this using a group by and max
select max([date]),[name],max([duty]),max([holiday]),max([hdaypart]),max([sick]),max([sdaypart])
from yourtable
group by name
This is not really pretty but should perform better than using subqueries.
EDIT:
If you have columns with bit sql types, use
max(cast([bitColumn] as int))
Adding the date column in the group by, as suggested by Giorgos Betsos, the result is
select [date],
[name],
max(cast([duty] as int)),
max(cast([holiday] as int)),
max([hdaypart]),
max(cast([sick] as int)),
max([sdaypart])
from yourtable
group by [date],[name]
declare @t table ([date] date,[name] varchar(10),[duty] varchar(10),[holiday] int,[hdaypart] varchar(10),[sick] int,[sdaypart] int)
insert into @t([date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart])values ('2015-04-27','person1',1,0,NULL,0,NULL),
('2015-04-27','person1',1,0,'fd',0,NULL)
select MAX([date]),MAX([name]),MAX([duty]),MAX([holiday]),MAX([hdaypart]), [sick],[sdaypart] from @t
group by sick,[sdaypart]
OR
select [date],[name],[duty],[holiday],MAX([hdaypart])AS H,[sick],[sdaypart] from @t
group by [date],[name],[duty],[holiday],[sick],[sdaypart]
UNION
select [date],[name],[duty],[holiday],MAX([hdaypart])AS H,[sick],[sdaypart] from @t
group by [date],[name],[duty],[holiday],[sick],[sdaypart]
CREATE TABLE #Combine
(
[date] date,
[name] VARCHAR(10),
[duty] CHAR(1),
[holiday] CHAR(1),
[hdaypart] CHAR(5),
[sick] CHAR(1),
[sdaypart] VARCHAR(10)
)
INSERT INTO #Combine VALUES('2015-04-27', 'person1', '1','0',NULL,'0',NULL),
('2015-04-27', 'person1', '0','1','fd','0',NULL)
SELECT MAX(Date) [date],MAX(name) [name],MAX(Duty) [duty],MAX(holiday) holiday,
MAX(hdaypart) hdaypart,max(sick) sick,max(sdaypart)sdaypart FROM #Combine
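
As a quick check of the GROUP BY plus MAX idea on the two rows from the question, here is a small sketch. It is run on SQLite through Python's sqlite3 (my assumption, for convenience), where the flag columns are plain integers; on SQL Server, BIT columns would need to be cast to INT before MAX can be applied to them.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE planning (
        date TEXT, name TEXT,
        duty INTEGER, holiday INTEGER, hdaypart TEXT,
        sick INTEGER, sdaypart TEXT
    )
""")
# The two rows from the question; None is stored as NULL.
con.executemany("INSERT INTO planning VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("2015-04-27", "person1", 1, 0, None, 0, None),
    ("2015-04-27", "person1", 0, 1, "fd", 0, None),
])
combined = con.execute("""
    SELECT date, name,
           MAX(duty), MAX(holiday), MAX(hdaypart),  -- MAX skips NULLs
           MAX(sick), MAX(sdaypart)                 -- all-NULL group stays NULL
    FROM planning
    GROUP BY date, name
""").fetchall()
print(combined)
# [('2015-04-27', 'person1', 1, 1, 'fd', 0, None)]
```

Grouping by both date and name keeps one combined row per person per day, instead of collapsing a person's whole history into a single row.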

calculate difference between two times in two rows in sql

I am using MSSQL 2008 Standard
I have multiple rows in a select command which are filled with events. For every event I have got a timestamp, now I want to calculate the time between the events:
(number) | event | timestamp | duration
---------+----------------+---------------------+----------
1 | logon | 2012-05-23 10:00:00 |
2 | incomming call | 2012-05-23 10:01:00 |
3 | call ended | 2012-05-23 10:02:00 |
4 | logoff | 2012-05-23 10:04:00 |
(the number column does not exist but it's easier for explanation)
Now the duration cell for the first row should be 1, for the second one also 1 and for the third one 2.
Does anybody know how to achieve this without loops and so on?
Thank you
You need a self join. Since you need to generate an id then something like:
select t1.*, datediff(minute, t1.timestamp, t2.timestamp) from
(select *, row_number() over (order by ...) as rowid from MyTable) t1
inner join
(select *, row_number() over (order by ...) as rowid from MyTable) t2
on t1.rowid = t2.rowid - 1
I found the CTE answer less than desirable because it does not report the first line, and I found the other answers with joins too complex, so I distilled the problem into this snippet.
Here is the code. It uses a CTE whose select creates a sequence number for each row, ordered by timestamp. The outer select then looks up the previous row by sequence number and computes the minutes between the two.
WITH AgentActions AS
(
select ROW_NUMBER() OVER (ORDER BY [TimeStamp]) -- Create an index number ordered by time.
AS [Sequence],
* from AgentInteractions
)
SELECT *,
ISNULL(DATEDIFF(Minute,
(SELECT other.TimeStamp
FROM AgentActions Other
WHERE other.Sequence = AgentActions.Sequence - 1 ),
AgentActions.TimeStamp),
0)
AS MinutesFromLastPoint
FROM AgentActions;
Here is the setup table
CREATE TABLE AgentInteractions
(
[Event] VARCHAR(12) NOT NULL,
[Timestamp] [DateTime] NOT NULL
);
INSERT INTO dbo.AgentInteractions( Event, TimeStamp )
VALUES ( 'Alpha', '1-Jan-2018 3:04:22 PM' ),
( 'Omega', '3-Jan-2018 10:04:22 PM' ),
( 'Beta', '2-Jan-2018 2:04:22 AM' );
This is my current version/solution:
declare @temp table
(
id int,
timestamp datetime,
type nvarchar(255),
skillname nvarchar(255),
event nvarchar(255),
userstatus nvarchar(255)
)
insert into @temp (id, timestamp, type, skillname, event, userstatus)
(
select ROW_NUMBER() over (order by timestamp) as id, * from
(
select TimeStamp, 'Event' as type, SkillName, Event, UserStatus from AgentEvents
where TimeStamp >= '2012-05-22T00:00:00'
and UserName like '%engel%'
union
select TimeStamp, 'Anruf' as type, SkillName, '' as event, '' as status from calls
where TimeStamp >= '2012-05-22T00:00:00'
and UserName like '%engel%'
) as a
)
select t1.*, DATEDIFF(second, t1.timestamp, t2.timestamp) as duration
from @temp t1
left outer join @temp t2 on t1.id = t2.id - 1
Edit: changed inner join to left outer join, otherwise the last row would be lost.
As I understand it, you need to update the duration column.
You can use something like this :
update a set duration = DateDiff(minute, a.timestamp,
(select top 1 b.timestamp from mytable b where b.timestamp > a.timestamp order by b.timestamp asc))
from mytable a
I cannot test it, but just to give you an idea (it may have some syntax errors).
Using the 'top' with the 'order by' clause should do the trick.
(Edited)
I think you better create a trigger
CREATE TRIGGER update_duration ON sometable
INSTEAD OF INSERT
AS
DECLARE @lastDT datetime
BEGIN
SET @lastDT =
(SELECT TOP 1 _timestamp
FROM sometable
ORDER BY _timestamp DESC)
UPDATE sometable
SET duration = DATEDIFF(MINUTE, @lastDT, GETDATE())
END
WITH rows AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Col1) AS rn
FROM dbo.Table_2
)
SELECT mc.col1, DATEDIFF(HOUR, mc.Col1, mp.Col1) as TimeDiffInHours
FROM rows mc
JOIN rows mp
ON mc.rn = mp.rn-1
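
For readers on newer versions: SQL Server 2012 and later have LEAD and LAG, which avoid the self join entirely (this does not help on 2008, which the question uses). Below is a minimal sketch of the same "time until the next event" calculation with LEAD. It is my own illustration and runs on SQLite 3.25+ through Python's sqlite3, so the minute difference is computed from strftime('%s', ...) where T-SQL would use DATEDIFF; the table name events and column name ts are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (event TEXT, ts TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?)", [
    ("logon", "2012-05-23 10:00:00"),
    ("incoming call", "2012-05-23 10:01:00"),
    ("call ended", "2012-05-23 10:02:00"),
    ("logoff", "2012-05-23 10:04:00"),
])
durations = con.execute("""
    SELECT event,
           -- seconds until the next event (by timestamp), as whole minutes;
           -- the last row has no next event, so its duration is NULL
           (CAST(strftime('%s', LEAD(ts) OVER (ORDER BY ts)) AS INTEGER)
            - CAST(strftime('%s', ts) AS INTEGER)) / 60 AS duration
    FROM events
    ORDER BY ts
""").fetchall()
print(durations)
# [('logon', 1), ('incoming call', 1), ('call ended', 2), ('logoff', None)]
```

This matches the durations asked for in the question (1, 1 and 2 minutes), with the final logoff row left empty.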

Resources