Longest gap between test matches in Sachin's career - sql-server

I am new to SQL Server and was practising with Sachin's batting stats (cricket), which I found here (Sachin Batting Statistics). I want to find the longest gap between two Test matches in Sachin's career. So basically I have to filter on Test matches and find the maximum difference in the Start_DateAscending column? I hope that makes sense. A sample table is added below in case the link doesn't help.
EDIT: I created a sample table with different dates. The column is named DateValue. Now I want the code for the maximum difference between any two successive rows in the DateValue column. For example, in this case the answer is 2 years and 17 days, between December 09, 1989 and December 26, 1991.
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
DateValue DATETIME
)
SET DATEFORMAT DMY
SET IDENTITY_INSERT #mytable ON
INSERT INTO #mytable
(ID, DateValue)
SELECT '11', 'Nov 15 1989 12:00AM' UNION ALL
SELECT '59', 'Nov 23 1989 12:00AM' UNION ALL
SELECT '37', 'Dec 09 1989 12:00AM' UNION ALL
SELECT '44', 'Dec 26 1991 12:00AM' UNION ALL
SELECT '55', 'May 31 1993 12:00AM' UNION ALL
SELECT '60', 'May 15 1995 12:00AM' UNION ALL
SELECT '57', 'Jan 12 1996 12:00AM' UNION ALL
SELECT '43', 'Jan 19 1996 12:00AM' UNION ALL
SELECT '49', 'Jan 31 1996 12:00AM' UNION ALL
SELECT '18', 'Oct 17 1997 12:00AM'
Here's a solution I found on this website, but the answer I obtained was 1900-01-01!
SELECT MAX(#mytable.DateValue-h.DateValue) as maxDiff
FROM #mytable
LEFT JOIN #mytable h
ON h.ID=[dbo].#mytable.ID AND #mytable.DateValue>=h.DateValue
WHERE h.DateValue IS NOT NULL

If you're using SQL Server 2012 or above, this SQL will return the biggest gap in days between two tests:
select max(datediff(day, a.TestDate, a.NextTest)) as BiggestGap
from (
select DateValue as TestDate, lead(DateValue) over (order by DateValue) as NextTest
from #mytable m
) a
The first thing this query does (inside the parentheses) is build a table that lists every test match alongside the date of the next one. That is what the inner query provides: it selects all test dates and, using the lead function, the date of the match straight after each test.
The data from that parenthesised select (with the ID column added) looks like this:
ID TestDate NextTest
----------- ----------------------- -----------------------
11 1989-11-15 00:00:00.000 1989-11-23 00:00:00.000
59 1989-11-23 00:00:00.000 1989-12-09 00:00:00.000
37 1989-12-09 00:00:00.000 1991-12-26 00:00:00.000
44 1991-12-26 00:00:00.000 1993-05-31 00:00:00.000
55 1993-05-31 00:00:00.000 1995-05-15 00:00:00.000
60 1995-05-15 00:00:00.000 1996-01-12 00:00:00.000
57 1996-01-12 00:00:00.000 1996-01-19 00:00:00.000
43 1996-01-19 00:00:00.000 1996-01-31 00:00:00.000
49 1996-01-31 00:00:00.000 1997-10-17 00:00:00.000
18 1997-10-17 00:00:00.000 NULL
After that (outside the parentheses), it's simply a case of finding the row with the biggest difference between dates. In SQL Server, it's best to use the datediff function to get the difference between two dates rather than arithmetic operators such as - (as in the example you found), so we use it to get the difference in days for each row. max then picks the largest of those, returning the biggest gap between two matches.
Using the example SQL data provided, the biggest gap is 747 days (approximately 2 years 17 days).
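As a sanity check on the 747-day figure, here is a minimal Python sketch that pairs each date with the next one (the same idea as lead) and takes the maximum difference. The names test_dates and biggest_gap are made up for this illustration and are not part of the SQL answer.

```python
from datetime import date

# Sample dates copied from the #mytable INSERT in the question.
test_dates = [
    date(1989, 11, 15), date(1989, 11, 23), date(1989, 12, 9),
    date(1991, 12, 26), date(1993, 5, 31), date(1995, 5, 15),
    date(1996, 1, 12), date(1996, 1, 19), date(1996, 1, 31),
    date(1997, 10, 17),
]

def biggest_gap(dates):
    """Return (days, start, end) for the largest gap between successive dates."""
    ordered = sorted(dates)
    # Pair each date with its successor, as lead() does; the last date has no partner.
    pairs = zip(ordered, ordered[1:])
    return max(((nxt - cur).days, cur, nxt) for cur, nxt in pairs)

days, start, end = biggest_gap(test_dates)
print(days, start, end)
```

With the sample data this reports the gap between 1989-12-09 and 1991-12-26, which agrees with the SQL result.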

Related

T-SQL Grouping Dynamic Date Ranges

Using MS SQL Server 2019
I have a set of recurring donation records. Each has a First Gift Date and a Last Gift Date. I need to add a GroupedID to these rows so that I can get the full date range, from the earliest FirstGiftDate to the latest LastGiftDate, as long as there is no break of more than 45 days between the recurring donations.
For example, Bob is a long-time supporter. His card has expired multiple times and he has always started a new gift within 45 days, so all of his gifts need a single GroupedID. June, on the other hand, has been donating and her card expires. She doesn't give again for 6 months, but then continues to give after her card expires. June's first gift should get its own GroupedID, and her second and third should be grouped together. The group count should restart with each donor.
My initial attempt was to join the donation table back to itself, aliased as D2. This did give me an indicator of which gifts were within the 45-day mark, but I can't wrap my head around how to then link them. My only thought was to use LEAD and LAG to analyze each scenario and work out the combinations of LEAD and LAG values needed to catch every case, but that doesn't seem as reliable or as scalable as I'd like.
I appreciate any help anyone can give.
My code:
SELECT #Donation.*, D2.*
FROM #Donation
LEFT JOIN #Donation D2 ON #Donation.RecurringGiftID <> D2.RecurringGiftID
AND #Donation.Donor = D2.Donor
AND ABS(DATEDIFF(DAY, #Donation.FirstGiftDate, D2.LastGiftDate)) < 45
Table structure and sample data:
CREATE TABLE #Donation
(
RecurringGiftID int,
Donor nvarchar(25),
FirstGiftDate date,
LastGiftDate date
)
INSERT INTO #Donation
VALUES (1, 'Bob', '2017-02-15', '2018-07-01'),
(15, 'Bob', '2018-08-05', '2019-04-01'),
(32, 'Bob', '2019-04-15', '2022-06-15'),
(54, 'June', '2015-05-01', '2016-05-01'),
(96, 'June', '2016-12-15', '2018-02-01'),
(120, 'June', '2018-03-04', '2020-07-01')
Desired output:
RecurringGiftID Donor FirstGiftDate LastGiftDate GroupedID
--------------- ----- ------------- ------------ ---------
1               Bob   2017-02-15    2018-07-01   1
15              Bob   2018-08-05    2019-04-01   1
32              Bob   2019-04-15    2022-06-15   1
54              June  2015-05-01    2016-05-01   1
96              June  2016-12-15    2018-02-01   2
120             June  2018-03-04    2020-07-01   2
Use LAG() to detect when the current row starts more than 45 days after the previous row's LastGiftDate, then take a cumulative sum of those flags to form the required GroupedID:
select *,
GroupedID = sum(g) over (partition by Donor order by FirstGiftDate)
from
(
select *,
g = case when datediff(day,
lag(LastGiftDate, 1, '19000101') over (partition by Donor
order by FirstGiftDate),
FirstGiftDate)
> 45
then 1
else 0
end
from #Donation
) d
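To see why the flag-and-running-sum approach produces the desired output, here is a rough Python sketch of the same logic applied to the sample rows. The function name assign_group_ids and the tuple layout are assumptions made for this illustration only.

```python
from datetime import date

# Sample rows from the question: (RecurringGiftID, Donor, FirstGiftDate, LastGiftDate).
donations = [
    (1, "Bob", date(2017, 2, 15), date(2018, 7, 1)),
    (15, "Bob", date(2018, 8, 5), date(2019, 4, 1)),
    (32, "Bob", date(2019, 4, 15), date(2022, 6, 15)),
    (54, "June", date(2015, 5, 1), date(2016, 5, 1)),
    (96, "June", date(2016, 12, 15), date(2018, 2, 1)),
    (120, "June", date(2018, 3, 4), date(2020, 7, 1)),
]

def assign_group_ids(rows, max_gap_days=45):
    """Return {RecurringGiftID: GroupedID}, restarting the count for each donor."""
    by_donor = {}
    for row in rows:
        by_donor.setdefault(row[1], []).append(row)

    result = {}
    for donor_rows in by_donor.values():
        donor_rows.sort(key=lambda r: r[2])  # order by FirstGiftDate
        group_id = 0
        prev_last = None  # plays the role of LAG(LastGiftDate)
        for gift_id, _donor, first, last in donor_rows:
            # A donor's first row, or a break longer than the limit, starts a new group.
            if prev_last is None or (first - prev_last).days > max_gap_days:
                group_id += 1
            result[gift_id] = group_id
            prev_last = last
    return result

print(assign_group_ids(donations))
```

The first row per donor always opens group 1, which is what the '19000101' default in the SQL achieves.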

DateTime in sql only comparing the time

ID DateTime Code
---------- -------------- ----------
58 2015-01-01 20:00:00 1111
58 2015-01-11 10:00:00 8523
58 2015-01-11 03:00:00 4555
58 2015-01-19 00:01:00 8888
9 2015-01-01 20:00:00 4444
How do I count the number of codes for a specific ID, ignoring the date, where the time must be between 20:00:00 and 06:00:00?
select count(code) as count from table 1 where ID='58' and DateTime between '20:00:00' and '06:00:00'
the expected output would be
count
3
SELECT count(code) as count
FROM table1
WHERE
ID='58' and
(CAST(DateTime as time) >= '20:00'
or CAST(DateTime as time) <= '06:00')
EDIT: John, I understand the issue. Here is a full solution to handle those cases:
In order to use variables:
DECLARE @HourBegin time = '07:00'
DECLARE @HourEnd time = '17:30'
SELECT count(code) as count
FROM table1
WHERE
ID='58' and
(CAST(DateTime as time) between @HourBegin and @HourEnd or
((CAST(DateTime as time) <= @HourEnd or
CAST(DateTime as time) >= @HourBegin) and
@HourBegin > @HourEnd)
)
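The two cases in that WHERE clause (a window within one day, and a window that wraps past midnight) can be checked with a short Python sketch. The helper names in_window and count_in_window are hypothetical and exist only for this illustration.

```python
from datetime import datetime, time

# Sample rows from the question: (ID, DateTime, Code).
rows = [
    (58, datetime(2015, 1, 1, 20, 0, 0), 1111),
    (58, datetime(2015, 1, 11, 10, 0, 0), 8523),
    (58, datetime(2015, 1, 11, 3, 0, 0), 4555),
    (58, datetime(2015, 1, 19, 0, 1, 0), 8888),
    (9, datetime(2015, 1, 1, 20, 0, 0), 4444),
]

def in_window(t, begin, end):
    """True if time-of-day t lies in [begin, end]; wraps past midnight when begin > end."""
    if begin <= end:
        return begin <= t <= end
    return t >= begin or t <= end

def count_in_window(rows, wanted_id, begin, end):
    """Count rows for one ID whose time-of-day falls inside the window."""
    return sum(
        1
        for row_id, stamp, _code in rows
        if row_id == wanted_id and in_window(stamp.time(), begin, end)
    )

print(count_in_window(rows, 58, time(20, 0), time(6, 0)))   # overnight window
print(count_in_window(rows, 58, time(7, 0), time(17, 30)))  # daytime window
```

For ID 58 the overnight window matches the three rows expected in the question, and the daytime window matches only the 10:00 row.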
Almost the same as the previous answer, but using hours looks nicer to me, and you might need DISTINCT code:
SELECT count(DISTINCT code) as count
FROM table1
WHERE
ID='58' and
(DATEPART(HOUR,DateTime) >= 20
or DATEPART(HOUR,DateTime) < 6)
UPDATED: changed from <= 6 to < 6
Update
This answer applies to MySQL.
When I started writing the answer, the question was tagged mysql and sql-server. The OP edited it in the meantime.
This query should do what you want on MySQL.
SELECT count(code) AS `count`
FROM `table 1`
WHERE ID='58'
AND TIME(`DateTime`) NOT BETWEEN '06:00:01' AND '19:59:59'
The MySQL function TIME() extracts only the time component from a DATETIME value.
In version 5.6.4, MySQL added support for fractional seconds (up to 6 digits) on DATETIME columns. With fractional seconds, the query above wrongly includes entries whose time is greater than 06:00:00 but smaller than 06:00:01 (events that happened during the first second after 6 AM sharp).
For MySQL 5.6.4 and newer, the correct query is:
SELECT count(code) AS `count`
FROM `table 1`
WHERE ID='58'
AND (TIME(`DateTime`) <= '06:00:00' OR '20:00:00' <= TIME(`DateTime`))
I don't know about SQL Server.

SQL Server date function

I need to get the week number of a given date. For example, Jan 1 is week 1, Jan 8 is week 2, and so on. Can anyone help me out, please?
You should try something like this:
DECLARE @Dt datetime
SELECT @Dt='02-21-2008'
SELECT DATEPART( wk, @Dt)
This should return the weeknumbers you want.
SQL Server starts counting from the 1st of January. If you want ISO week numbers, you need to do a bit more scripting. A nice how-to is available at: http://www.rmjcs.com/SQLServer/TSQLFunctions/ISOWeekNumber/tabid/207/Default.aspx
MSDN: DATEPART (Transact-SQL)
In response to Robin's comment:
But i need in such a way, that from
jan 1 to 7, it should return 1, from
jan 8 to 17 it should return 2 like
this.. hope u got my impression
In that case you could also write something like this.
-- subtract 1 from the day of year so that Jan 7 (day 7) still falls in week 1
select ((datepart(dy, '2011-01-01') - 1) / 7) + 1
--returns 1
select ((datepart(dy, '2011-01-02') - 1) / 7) + 1
--returns 1
select ((datepart(dy, '2011-12-31') - 1) / 7) + 1
--returns 53
I don't know how SQL Server 2008 behaves with the iso_week and wk parameters, as I only have a SQL Server 2005 instance available at the moment.
Does this do what you want?
declare @T table (dt datetime)
insert into @T values
('2010-12-31'),
('2011-01-01'),
('2011-01-02'),
('2011-01-03'),
('2011-01-04'),
('2011-01-05'),
('2011-01-06'),
('2011-01-07'),
('2011-01-08')
select
dt,
(datediff(d, dateadd(year, datediff(year, 0, dt), 0), dt) / 7)+1
from @T
Result
dt
----------------------- -----------
2010-12-31 00:00:00.000 53
2011-01-01 00:00:00.000 1
2011-01-02 00:00:00.000 1
2011-01-03 00:00:00.000 1
2011-01-04 00:00:00.000 1
2011-01-05 00:00:00.000 1
2011-01-06 00:00:00.000 1
2011-01-07 00:00:00.000 1
2011-01-08 00:00:00.000 2
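The arithmetic behind that result is a zero-based day offset from January 1, integer-divided by 7, plus 1. A small Python sketch of the same calculation follows; week_of_year is a name invented for this example.

```python
from datetime import date

def week_of_year(d):
    """Week number where Jan 1-7 is week 1, Jan 8-14 is week 2, and so on."""
    # tm_yday is 1-based, so subtract 1 to get the offset from January 1.
    days_since_jan_1 = d.timetuple().tm_yday - 1
    return days_since_jan_1 // 7 + 1

for d in (date(2010, 12, 31), date(2011, 1, 1), date(2011, 1, 7), date(2011, 1, 8)):
    print(d, week_of_year(d))
```

Note that the offset has to be zero-based: dividing the 1-based day of year directly would push day 7 into week 2.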

sql server datepart return

I have a sql query that is grouping rows by calendar week
select count(*),datepart(wk,mydate)
from MyTable
where mydate between '12/26/2010' and '1/8/2011'
group by datepart(wk,mydate)
The date range is two weeks, but three rows come back: Jan 1 is a Saturday and is the only day in the range for which DATEPART returns 1; the other dates return 53 or 2.
I want jan 1 to be grouped with the dates that return a 53, but I want it to be a generic answer not something like CASE WHEN datepart(wk,mydate) = 53 then 1 else datepart(wk,mydate) end because that will work for that specific date range not for other years.
I'm just wondering what a good solution would be
thanks in advance.
We usually treat a date's week as the week of the most recent Sunday on or before it (Sunday is the first day of the week under SQL Server's default us_english DATEFIRST setting). So, for each date, you can ask for the week of that Sunday with the following query:
select count(*),datepart(wk,mydate-DATEPART(dw,mydate)+1)
from MyTable
where mydate between '12/26/2010' and '1/8/2011'
group by datepart(wk,mydate-DATEPART(dw,mydate)+1)
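The idea of snapping each date back to its Sunday can be sketched in Python as below. This is only an illustration: it groups on the Sunday date itself rather than on a week number, and week_start_sunday is a made-up helper name.

```python
from collections import Counter
from datetime import date, timedelta

def week_start_sunday(d):
    """Return the most recent Sunday on or before d (Python: Monday=0 ... Sunday=6)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)

# Every day in the question's range, 2010-12-26 through 2011-01-08.
days = [date(2010, 12, 26) + timedelta(days=i) for i in range(14)]

# One bucket per Sunday-to-Saturday week, regardless of the year boundary.
counts = Counter(week_start_sunday(d) for d in days)
for sunday, n in sorted(counts.items()):
    print(sunday, n)
```

Saturday, Jan 1 2011 lands in the bucket that starts on Sunday, Dec 26 2010, so the two-week range yields exactly two groups.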
Perhaps you can use iso_week instead of wk.
select count(*),datepart(iso_week,mydate)
from MyTable
where mydate between '12/26/2010' and '1/8/2011'
group by datepart(iso_week,mydate)
Sample:
declare @T table (Val datetime)
insert into @T values
('2010-12-30'),
('2010-12-31'),
('2011-01-01'),
('2011-01-02'),
('2011-01-03'),
('2011-01-04'),
('2011-01-05')
select
Val,
datepart(iso_week, Val) as ISO_WEEK
from @T
Result:
Val ISO_WEEK
----------------------- -----------
2010-12-30 00:00:00.000 52
2010-12-31 00:00:00.000 52
2011-01-01 00:00:00.000 52
2011-01-02 00:00:00.000 52
2011-01-03 00:00:00.000 1
2011-01-04 00:00:00.000 1
2011-01-05 00:00:00.000 1
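The ISO week numbers above can be cross-checked with Python's date.isocalendar(), which also exposes the ISO year. That may matter if a query spans more than one year, since grouping on the week number alone could merge week 52 of different years; the sketch below is an illustration, not part of the original answer.

```python
from datetime import date

sample = [
    date(2010, 12, 30), date(2010, 12, 31), date(2011, 1, 1),
    date(2011, 1, 2), date(2011, 1, 3), date(2011, 1, 4), date(2011, 1, 5),
]

# isocalendar() returns (ISO year, ISO week, ISO weekday); indexing works on any Python 3.
iso_pairs = [(d.isocalendar()[0], d.isocalendar()[1]) for d in sample]

for d, (iso_year, iso_week) in zip(sample, iso_pairs):
    print(d, iso_year, iso_week)
```

Jan 1 and Jan 2 2011 belong to ISO week 52 of ISO year 2010, which is why they group with the last days of December.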
Try DateDiff() instead with your start date as the date to compare.

SQL Server: GROUP BY Aggregation semantics with the PIVOT operator

I am on SQL Server 2008 and I have a table containing WA metrics of the following form :
CREATE TABLE #VistitorStat
(
datelow datetime,
datehigh datetime,
name varchar(255),
cnt int
)
Two days worth of data in the table looks like so:
2009-07-25 00:00:00.000 2009-07-26 00:00:00.000 New Visitor 221
2009-07-25 00:00:00.000 2009-07-26 00:00:00.000 Unique Visitors 225
2009-07-25 00:00:00.000 2009-07-26 00:00:00.000 Return Visitors 0
2009-07-25 00:00:00.000 2009-07-26 00:00:00.000 Repeat Visitors 22
2009-07-26 00:00:00.000 2009-07-27 00:00:00.000 New Visitor 263
2009-07-26 00:00:00.000 2009-07-27 00:00:00.000 Unique Visitors 269
2009-07-26 00:00:00.000 2009-07-27 00:00:00.000 Return Visitors 4
2009-07-26 00:00:00.000 2009-07-27 00:00:00.000 Repeat Visitors 38
I want to group by day and pivot the metrics so that each day becomes a single row. The examples of the PIVOT operator that I can find only show aggregation with the SUM and MAX aggregate functions. Presumably I need to convey GROUP BY semantics to the PIVOT operator, but I can't find any clear examples or documentation on how to achieve this. Could someone please post the correct syntax for this query using the PIVOT operator?
If this is not possible with pivot -- can you come up with an elegant way of writing the query ? If not I'll just have to generate the data in transposed form.
Post answer edit:
I have come to the conclusion that the pivot operator is inelegant (so much so that I consider it a syntax hack). I have solved the problem by generating the data in a transposed fashion. I welcome comments.
I'm not sure of the result you want, but this gives one line per day:
CREATE TABLE #VistitorStat
(
datelow datetime,
datehigh datetime,
name varchar(255),
cnt int
)
insert into #VistitorStat
select '2009-07-25 00:00:00.000', '2009-07-26 00:00:00.000', 'New Visitor', 221
union select '2009-07-25 00:00:00.000', '2009-07-26 00:00:00.000', 'Unique Visitors', 225
union select '2009-07-25 00:00:00.000', '2009-07-26 00:00:00.000', 'Return Visitors', 0
union select '2009-07-25 00:00:00.000', '2009-07-26 00:00:00.000', 'Repeat Visitors', 22
union select '2009-07-26 00:00:00.000', '2009-07-27 00:00:00.000', 'New Visitor', 263
union select '2009-07-26 00:00:00.000', '2009-07-27 00:00:00.000', 'Unique Visitors', 269
union select '2009-07-26 00:00:00.000', '2009-07-27 00:00:00.000', 'Return Visitors', 4
union select '2009-07-26 00:00:00.000', '2009-07-27 00:00:00.000', 'Repeat Visitors', 38
select * from #VistitorStat
pivot (
sum(cnt)
for name in ([New Visitor],[Unique Visitors],[Return Visitors], [Repeat Visitors])
) p
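PIVOT has no explicit GROUP BY: it groups implicitly by every column that is neither aggregated nor named in the FOR clause, here datelow and datehigh. The Python sketch below mimics that behaviour on the sample data; pivot_sum is a hypothetical helper written for this illustration.

```python
from collections import defaultdict

# Sample data from the question: (datelow, datehigh, name, cnt).
rows = [
    ("2009-07-25", "2009-07-26", "New Visitor", 221),
    ("2009-07-25", "2009-07-26", "Unique Visitors", 225),
    ("2009-07-25", "2009-07-26", "Return Visitors", 0),
    ("2009-07-25", "2009-07-26", "Repeat Visitors", 22),
    ("2009-07-26", "2009-07-27", "New Visitor", 263),
    ("2009-07-26", "2009-07-27", "Unique Visitors", 269),
    ("2009-07-26", "2009-07-27", "Return Visitors", 4),
    ("2009-07-26", "2009-07-27", "Repeat Visitors", 38),
]

def pivot_sum(rows):
    """Sum cnt per name, grouped by the remaining columns (datelow, datehigh)."""
    table = defaultdict(lambda: defaultdict(int))
    for datelow, datehigh, name, cnt in rows:
        # The leftover columns form the implicit grouping key, as in PIVOT.
        table[(datelow, datehigh)][name] += cnt
    return {key: dict(cols) for key, cols in table.items()}

for key, cols in pivot_sum(rows).items():
    print(key, cols)
```

Each (datelow, datehigh) pair becomes one output row with one value per metric name, which is the one-line-per-day result the query returns.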
