SQL Server - Where clause uses maximum date in data - sql-server

I'm struggling with something I thought would be easy.
I have a table that is updated via an append on most days and has a report date field that shows the date the rows were updated.
I want to join to this table but only pull back the records from the date the table was last updated.
Most of the time I could get away with just looking for yesterday's date, as the table is updated most days:
Where [reportdate] > DATEADD(DAY, -1, GETDATE())
But as it's not always updated daily, I wanted to rule this issue out. Is there any way of returning the max date?
I was trying to figure out max(date), but I can't figure out the grouping. I need to return all the fields. The query below just seems to return the whole table:
SELECT max ([ReportDate]) as reportdate
,[GUID]
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
FROM table
group by guid
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
I could get round it with a temp table that just holds the max report date and then use this as the left part of a join:
SELECT max ([ReportDate]) as reportdate
FROM [DOMCustomers].[dbo].[DCC_Device_Comms_Compiled]
But the SQL is triggered in Excel, so temp tables are problematic (I think).

Is there any way of returning the max date?
Like this:
SELECT *
FROM SomeTable
where ReportDate = (select max(ReportDate) from SomeTable)
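Since the question mentions joining to this table, the same subquery can be combined with a join in a single statement, which avoids the temp table. The following is only a sketch: the table variables, the Customer table, and the DeviceGuid and CustomerName columns are hypothetical stand-ins, not names from the question.

```sql
-- Hypothetical stand-ins for the real tables, so the sketch runs on its own.
DECLARE @Device TABLE (DeviceGuid INT, ReportDate DATE);
DECLARE @Customer TABLE (DeviceGuid INT, CustomerName VARCHAR(20));

INSERT INTO @Device (DeviceGuid, ReportDate) VALUES
(1, '2020-12-29'),
(1, '2020-12-31'),
(2, '2020-12-31');

INSERT INTO @Customer (DeviceGuid, CustomerName) VALUES
(1, 'Alice'),
(2, 'Bob');

-- Join as usual, then keep only the rows stamped with the latest report date.
SELECT c.CustomerName, d.DeviceGuid, d.ReportDate
FROM @Customer AS c
INNER JOIN @Device AS d
    ON d.DeviceGuid = c.DeviceGuid
WHERE d.ReportDate = (SELECT MAX(ReportDate) FROM @Device);
```

With this sample data, only the two rows dated 2020-12-31 come back. Against real tables the DECLARE and INSERT statements would be dropped, leaving one SELECT that can be sent from Excel.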

Here is a conceptual example.
It returns the latest row for each car make.
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, make VARCHAR(20), ReportDate DATETIME);
INSERT INTO @tbl (make, ReportDate) VALUES
('Ford', '2020-12-31'),
('Ford', '2020-10-17'),
('Tesla', '2020-10-25'),
('Tesla', '2020-12-30');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY make ORDER BY ReportDate DESC) AS seq
FROM @tbl
)
)
SELECT * FROM rs
WHERE seq = 1;

Seems like a DENSE_RANK and TOP would work (assuming ReportDate is a date):
SELECT TOP (1) WITH TIES
[ReportDate]
,[GUID]
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
FROM YourTable
ORDER BY DENSE_RANK() OVER (ORDER BY ReportDate DESC);
If ReportDate is a date and time value, and you want everything for the latest date (ignoring time), then replace ReportDate with CONVERT(date,ReportDate) in the ORDER BY.
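As a sketch of that date-and-time variant, the query below uses a made-up table variable with two columns (the sample rows are assumptions for illustration only):

```sql
-- Made-up sample data: two rows on the latest day, at different times.
DECLARE @t TABLE (Make VARCHAR(20), ReportDate DATETIME);

INSERT INTO @t (Make, ReportDate) VALUES
('Ford',  '2020-12-31T08:00:00'),
('Tesla', '2020-12-31T17:30:00'),
('Ford',  '2020-12-30T09:00:00');

-- Ranking on the date part only means every row from the latest day ties for rank 1.
SELECT TOP (1) WITH TIES
       Make, ReportDate
FROM @t
ORDER BY DENSE_RANK() OVER (ORDER BY CONVERT(date, ReportDate) DESC);
```

Both rows from 2020-12-31 are returned, regardless of their time of day.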

Related

How to eliminate overlapping date ranges in SQL?

I am trying to eliminate overlapping date ranges in the data set. A smaller data set that I will be using:
How would I eliminate the highlighted first row of data as it overlaps the other date ranges for that specific id?
This is provided as a basic way to get you started since you are new to SO. You will undoubtedly need to change the logic on what you classify as overlapping.
--your test data...
declare @table table (ID int, BeginTime datetime, EndTime datetime)
insert into @table (ID, BeginTime, EndTime) VALUES
(101,'7/4/2016','9/21/2016'),
(101,'8/8/2016','9/8/2016'),
(101,'9/8/2016','9/21/2016'),
(102,'9/2/2016','9/7/2016'),
(103,'9/22/2016','9/28/2016'),
(103,'9/23/2016','9/28/2016')
/*
In SQL 2012 onward use LEAD and LAG to compare rows to the ones above or below them
Change this logic as you need... based on the limited information for "overlapping"
I just placed a flag where the dates didn't line up perfectly. There are undoubtedly
more cases / better logic you will need.
*/
select
ID,
BeginTime,
EndTime,
case when lead(BeginTime) over (partition by ID order by BeginTime asc) <> EndTime then 'n' else 'y' end as toKeep
from @table
--This is the same logic applied in a CTE so we can update the table
;with cte as(
select
ID,
BeginTime,
EndTime,
case when lead(BeginTime) over (partition by ID order by BeginTime asc) <> EndTime then 'n' else 'y' end as toKeep
from @table)
--Update your table via the CTE
delete from cte where toKeep = 'n'
select * from @table
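If "overlapping" is taken to mean that two ranges for the same ID share any period of time, one common test is that each range begins before the other ends. The sketch below only counts overlaps per row and deletes nothing. The strict `<` comparison (ranges that merely touch do not overlap) and the exclusion of identical ranges are assumptions that may not match what the question needs.

```sql
DECLARE @ranges TABLE (ID INT, BeginTime DATETIME, EndTime DATETIME);

INSERT INTO @ranges (ID, BeginTime, EndTime) VALUES
(101, '20160704', '20160921'),
(101, '20160808', '20160908'),
(101, '20160908', '20160921'),
(102, '20160902', '20160907');

-- For each row, count the other rows with the same ID whose range overlaps it.
SELECT a.ID, a.BeginTime, a.EndTime,
       COUNT(b.ID) AS OverlapCount
FROM @ranges AS a
LEFT JOIN @ranges AS b
    ON  b.ID = a.ID
    AND a.BeginTime < b.EndTime      -- a starts before b ends
    AND b.BeginTime < a.EndTime      -- b starts before a ends
    AND NOT (b.BeginTime = a.BeginTime AND b.EndTime = a.EndTime)  -- skip the row itself
GROUP BY a.ID, a.BeginTime, a.EndTime
ORDER BY a.ID, a.BeginTime;
```

With this sample data, the first row for ID 101 overlaps two other rows, the other two rows for ID 101 overlap one each, and the row for ID 102 overlaps none.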

List all dates within date range in SQL but ignore bank holidays

I'm making a holiday manager.
I have a table with a list of start and end dates for each instance of holiday.
[LeaveID], [EmployeeID], [StartDate], [EndDate]
I also have a calendar table with dates from 2016-2030, listing the usual variations of date format as well as times the factory is shut, including bank holidays, etc.
I'm working on the front end for it now, and they want me to display it in a sort of calendar format, so I will need to mark on each day who has booked time off.
I figure I need to list each date within each date range (start date to end date), then check if each date on the calendar appears on that list.
So basically I need to get a list of dates within a date range.
On top of that, I'd like to be able to compare the list of dates from above to the calendar table, so I can ignore bank holidays when calculating the amount of holiday used for each instance.
Thanks in advance!
To get a list of dates within a date range, you will need a source of numbers from 1 to n. I usually create such a table and call it a Numbers table.
To generate a list of dates within a range, use the following query.
SELECT
DATEADD(DAY, Numbers.Number-1, [StartDate]) Date
FROM
Numbers
WHERE
DATEADD(DAY, Numbers.Number-1, [StartDate]) <= [EndDate]
To create such a table, refer to this question.
If you want to list all dates in the Employee table, just cross join it.
SELECT
e.EmployeeID,
DATEADD(DAY, n.Number-1, e.[StartDate]) Date
FROM
Numbers n, Employee e
WHERE
DATEADD(DAY, n.Number-1, e.[StartDate]) <= e.[EndDate]
As you already have a dates table, you do not need the numbers table mentioned in the other answer. To accomplish what you are after requires a simple SQL Join from your dates table. Depending on how you want to format your final report you can either count up the number of EmployeeIDs returned or group them all into a calendar/table control in your front end on the DateValue.
In the query below you will get at least one DateValue for every date specified in the range (for which you can apply your own filtering such as where Dates.BankHoliday = 0 etc) and more than one where multiple Employees have taken leave:
-- Build some dummy data to run the query against.
declare @Emp table (LeaveID int, EmployeeID int , StartDate datetime, EndDate datetime);
insert into @Emp values
(1,1,'20161101','20161105')
,(2,1,'20161121','20161124')
,(3,2,'20161107','20161109')
,(4,3,'20161118','20161122');
declare @Dates table (DateKey int, DateValue datetime, DateLabel nvarchar(50));
declare @s datetime = '20161025';
with cte as
(
select cast(convert(nvarchar(8),@s,112) as int) as DateKey
,@s as DateValue
,convert(nvarchar(50),@s,103) as DateLabel
union all
select cast(convert(nvarchar(8),DateValue+1,112) as int)
,DateValue+1
,convert(nvarchar(50),DateValue+1,103)
from cte
where DateValue+1 <= '20161205'
)
insert into @Dates
select * from cte;
-- Actually query the data.
-- Define the start and end of your date range to return.
declare @MinStart datetime = (select min(StartDate) from @Emp);
declare @MaxEnd datetime = (select max(EndDate) from @Emp);
select d.DateValue
,e.EmployeeID
from @Dates d
left join @Emp e
on(d.DateValue between e.StartDate and e.EndDate)
where d.DateValue between @MinStart and @MaxEnd
order by d.DateValue
,e.EmployeeID;
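For the second part of the question, counting the leave used per booking while ignoring bank holidays, the same join can be aggregated. This is a rough sketch: the @Calendar table variable stands in for the existing calendar table, and its BankHoliday flag column is an assumed name.

```sql
DECLARE @Leave TABLE (LeaveID INT, EmployeeID INT, StartDate DATE, EndDate DATE);
INSERT INTO @Leave (LeaveID, EmployeeID, StartDate, EndDate) VALUES
(1, 1, '20161223', '20161228');

-- Stand-in for the existing calendar table; BankHoliday is an assumed flag column.
DECLARE @Calendar TABLE (DateValue DATE, BankHoliday BIT);
INSERT INTO @Calendar (DateValue, BankHoliday) VALUES
('20161223', 0), ('20161224', 0), ('20161225', 0),
('20161226', 1), ('20161227', 1), ('20161228', 0);

-- Count only the calendar days inside each leave range that are not bank holidays.
SELECT l.LeaveID, l.EmployeeID,
       COUNT(c.DateValue) AS LeaveDaysUsed
FROM @Leave AS l
LEFT JOIN @Calendar AS c
    ON  c.DateValue BETWEEN l.StartDate AND l.EndDate
    AND c.BankHoliday = 0
GROUP BY l.LeaveID, l.EmployeeID;
```

Here the six-day booking counts as four days of leave. Weekends are still counted in this sketch; a similar flag on the calendar table could exclude them in the same way.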

T-SQL select values grouped by week, zero if no values present for week

I am trying to group values (sales data) by week in SQL Server. For items with no sales in a certain week, I still want to get the week number and year, with a sum of 0.
The sales ledger table has computed columns for year and week number, by which I group.
Right now my Query looks like this:
select ItemNumber, sum(Amount), year, week
from JournalPosition
group by week, year, ItemNumber
order by ItemNumber asc, year desc, week desc
What would be an efficient way to accomplish what I want without having to implement a data warehouse? (A stored procedure or temporary table would be fine for me.)
You need to generate a list of all of the weeks that you want to include in your query and join onto it. You can either store these in a pre-generated table or use a CTE. For the CTE route, the question "How to get the start and end dates of all weeks between two dates in SQL Server?" will help.
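A minimal sketch of the join-onto-a-week-list idea, assuming the weeks are already available in a table (here a hypothetical @Weeks table variable with made-up rows). Cross joining the weeks with the distinct items gives one row per item and week, so weeks without sales show a sum of 0.

```sql
DECLARE @JournalPosition TABLE (ItemNumber INT, Amount DECIMAL(18,2), [year] INT, [week] INT);
INSERT INTO @JournalPosition (ItemNumber, Amount, [year], [week]) VALUES
(1, 10.00, 2016, 1),
(1,  5.00, 2016, 3),
(2,  7.50, 2016, 2);

-- Hypothetical pre-generated list of the weeks to report on.
DECLARE @Weeks TABLE ([year] INT, [week] INT);
INSERT INTO @Weeks ([year], [week]) VALUES (2016, 1), (2016, 2), (2016, 3);

-- Every item paired with every week; missing sales become 0 via COALESCE.
SELECT i.ItemNumber,
       COALESCE(SUM(j.Amount), 0) AS Total,
       w.[year],
       w.[week]
FROM @Weeks AS w
CROSS JOIN (SELECT DISTINCT ItemNumber FROM @JournalPosition) AS i
LEFT JOIN @JournalPosition AS j
    ON  j.ItemNumber = i.ItemNumber
    AND j.[year] = w.[year]
    AND j.[week] = w.[week]
GROUP BY i.ItemNumber, w.[year], w.[week]
ORDER BY i.ItemNumber ASC, w.[year] DESC, w.[week] DESC;
```

This returns six rows (two items by three weeks), with a Total of 0 for item 1 in week 2 and for item 2 in weeks 1 and 3.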
You can use recursive CTE with dates from your table:
declare @StartDate datetime,@EndDate datetime
set @StartDate=(select convert(varchar,min(Year),102) from JournalPosition)
set @EndDate=(select dateadd(day,-1,dateadd(year,2,convert(varchar,max(Year),102))) from JournalPosition)
print @StartDate
print @EndDate
;with CTE as (
select @StartDate as StartDate, DATEPART(week,@StartDate) as WeekNumber, DATEPART(year,@StartDate) as YearNumber
union all
select DATEADD(week, 1, StartDate), DATEPART(week,DATEADD(WEEK, 1, StartDate)), DATEPART(year,DATEADD(week, 1, StartDate))
from CTE
where DATEADD(week, 1, StartDate) <= @EndDate
)
select ItemNumber, isnull(sum(Amount),0), CTE.YearNumber, datepart(week,CTE.StartDate)
from JournalPosition
full join CTE
on JournalPosition.week=datepart(week,CTE.StartDate) and JournalPosition.year=CTE.YearNumber
group by CTE.YearNumber, datepart(week,CTE.StartDate), ItemNumber
order by 3 desc, 4 desc, 1 asc
option (maxrecursion 32767);
But maybe it's better not to use recursion (see http://www.sqlservercentral.com/Forums/Topic779830-338-1.aspx).

SQL running sum for an MVC application

I need a faster method to calculate and display a running sum.
It's an MVC telerik grid that queries a view that generates a running sum using a sub-query. The query takes 73 seconds to complete, which is unacceptable. (Every time the user hits "Refresh Forecast Sheet", it takes 73 seconds to re-populate the grid.)
The query looks like this:
SELECT outside.EffectiveDate
[omitted for clarity]
,(
SELECT SUM(inside.Amount)
FROM vCI_UNIONALL inside
WHERE inside.EffectiveDate <= outside.EffectiveDate
) AS RunningBalance
[omitted for clarity]
FROM vCI_UNIONALL outside
"EffectiveDate" on certain items can change all the time... New items can get added, etc. I certainly need something that can calculate the running sum on the fly (when the Refresh button is hit). Stored proc or another View...? Please advise.
Solution: (one of many, this one is orders of magnitude faster than a sub-query)
Create a new table with all the columns in the view except for the RunningTotal col. Create a stored procedure that first truncates the table, then INSERT INTO the table using SELECT all columns, without the running sum column.
Use update local variable method:
DECLARE @Amount DECIMAL(18,4)
SET @Amount = 0
UPDATE TABLE_YOU_JUST_CREATED SET RunningTotal = @Amount, @Amount = @Amount + ISNULL(Amount,0)
Create a task agent that will run the stored procedure once a day. Use the TABLE_YOU_JUST_CREATED for all your reports.
Take a look at this post
Calculate a Running Total in SQL Server
If you have SQL Server 2012 (code-named Denali) or later, you can use the new windowed functions.
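A minimal sketch of the windowed approach, using a made-up @Ledger table variable in place of vCI_UNIONALL. It needs a version of SQL Server that supports ORDER BY inside an aggregate's OVER clause.

```sql
-- Made-up sample rows standing in for the view from the question.
DECLARE @Ledger TABLE (EffectiveDate DATE, Amount DECIMAL(18,4));

INSERT INTO @Ledger (EffectiveDate, Amount) VALUES
('2012-01-01', 100),
('2012-01-02', -30),
('2012-01-05',  50);

-- Running total: sum of all rows up to and including the current one, by date.
SELECT EffectiveDate,
       Amount,
       SUM(Amount) OVER (ORDER BY EffectiveDate
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningBalance
FROM @Ledger
ORDER BY EffectiveDate;
```

The RunningBalance column comes out as 100, 70 and 120 for the three sample rows. If several rows share an EffectiveDate, ROWS accumulates them one at a time in an arbitrary order among the ties, whereas RANGE would give tied rows the same total.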
In SQL Server 2008 R2, I suggest using a recursive common table expression.
A small problem with the CTE is that, for a fast query, you need an identity column without gaps (1, 2, 3, ...). If you don't have such a column, you have to create a temporary table or table variable with such a column and move your data there.
The CTE approach will be something like this:
declare @Temp_Numbers table (RowNum int, Amount <your type>, EffectiveDate datetime)
insert into @Temp_Numbers (RowNum, Amount, EffectiveDate)
select row_number() over (order by EffectiveDate), Amount, EffectiveDate
from vCI_UNIONALL
-- you can also use identity
-- declare @Temp_Numbers table (RowNum int identity(1, 1), Amount <your type>, EffectiveDate datetime)
-- insert into @Temp_Numbers (Amount, EffectiveDate)
-- select Amount, EffectiveDate
-- from vCI_UNIONALL
-- order by EffectiveDate
;with
CTE_RunningTotal
as
(
select T.RowNum, T.EffectiveDate, T.Amount as Total_Amount
from @Temp_Numbers as T
where T.RowNum = 1
union all
select T.RowNum, T.EffectiveDate, T.Amount + C.Total_Amount as Total_Amount
from CTE_RunningTotal as C
inner join @Temp_Numbers as T on T.RowNum = C.RowNum + 1
)
select C.RowNum, C.EffectiveDate, C.Total_Amount
from CTE_RunningTotal as C
option (maxrecursion 0)
There may be some questions about duplicate EffectiveDate values; it depends on how you want to handle them. Do you want them to be ordered arbitrarily, or do you want them to have an equal Amount?

How can I optimize a SQL query that performs a count nested inside a group-by clause?

I have a charting application that dynamically generates SQL Server queries to compute values for each series on a given chart. This generally works quite well, but I have run into a particular situation in which the generated query is very slow. The query looks like this:
SELECT
[dateExpr] AS domainValue,
(SELECT COUNT(*) FROM table1 WHERE [dateExpr]=[dateExpr(maintable)] AND column2='A') AS series1
FROM table1 maintable
GROUP BY [dateExpr]
ORDER BY domainValue
I have abbreviated [dateExpr] because it's a combination of CAST and DATEPART functions that convert a datetime field to a string in the form of 'yyyy-MM-dd' so that I can easily group by all values in a calendar day. The query above returns both those yyyy-MM-dd values as labels for the x-axis of the chart and the values from the data series "series1" to display on the chart. The data series is supposed to count the number of records that fall into that calendar day that also contain a certain value in [column2]. The "[dateExpr]=[dateExpr(maintable)]" expression looks like this:
CAST(DATEPART(YEAR,dateCol) AS VARCHAR)+'-'+CAST(DATEPART(MONTH,dateCol) AS VARCHAR) =
CAST(DATEPART(YEAR,maintable.dateCol) AS VARCHAR)+'-'+CAST(DATEPART(MONTH,maintable.dateCol) AS VARCHAR)
with an additional term for the day (omitted above for the sake of space). That is the source of the slowness of the query, but I don't know how to rewrite the query so that it returns the same result more efficiently. I have complete control over the generation of the query, so if I could find more efficient SQL that returned the same results, I could modify the query generator appropriately. Any pointers would be greatly appreciated.
I haven't tested it, but I think it can be done with:
SELECT
[dateExpr] AS domainValue,
SUM (CASE WHEN column2='A' THEN 1 ELSE 0 END) AS series1
FROM table1 maintable
GROUP BY [dateExpr]
ORDER BY domainValue
The fastest way to do this would be to use calendar tables. Create a SQL table with an entry for every month for as many years ahead as you need. Then select from that calendar table, joining in the entries from table1 that have dates between the start and end date for the month. Then, if your clustered index is on the dateCol in table1, the query will run very quickly.
EDIT: Example query. This assumes a months table exists with two columns, StartDate and EndDate, where EndDate is midnight on the first day of the next month. The clustered index on the months table should be on StartDate.
SELECT
months.StartDate,
COUNT(*) AS [Count]
FROM months
INNER JOIN table1
ON table1.dateCol >= months.StartDate AND table1.dateCol < months.EndDate
GROUP BY months.StartDate;
With Calendar As
(
Select DateAdd(d, DateDiff(d, 0, Min( dateCol ) ), 0) As [date]
From Table1
Union All
Select DateAdd(d, 1, [date])
From Calendar
Where [date] <= (
Select Max( DateAdd(d, DateDiff(d, 0, dateCol) + 1, 0) )
From Table1
)
)
Select C.date, Count(Table1.PK) As Total
From Calendar As C
Left Join Table1
On Table1.dateCol >= C.date
And Table1.dateCol < DateAdd(d, 1, C.date )
And Table1.column2 = 'A'
Group By C.date
Option (Maxrecursion 0);
Rather than try to force the display format in SQL, you should do that in your report or chart generator. However, what you can do in the SQL is to strip the time portion from the datetime values as I've done in my solution.
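As a short sketch of stripping the time portion without building strings, the query below casts to the date type (available from SQL Server 2008) and counts the matching rows with a conditional sum. The sample rows are made up for illustration.

```sql
-- Made-up sample rows standing in for table1.
DECLARE @table1 TABLE (dateCol DATETIME, column2 CHAR(1));

INSERT INTO @table1 (dateCol, column2) VALUES
('2012-03-01T09:15:00', 'A'),
('2012-03-01T16:40:00', 'B'),
('2012-03-01T18:05:00', 'A'),
('2012-03-02T11:00:00', 'B');

-- One row per calendar day; series1 counts only the rows where column2 = 'A'.
SELECT CAST(dateCol AS date) AS domainValue,
       SUM(CASE WHEN column2 = 'A' THEN 1 ELSE 0 END) AS series1
FROM @table1
GROUP BY CAST(dateCol AS date)
ORDER BY domainValue;
```

With these rows, 2012-03-01 gets a count of 2 and 2012-03-02 gets 0. Days with no rows at all do not appear, which is the gap the calendar approach fills.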
