I have a table that essentially stores revisions of certain datasets (i.e. edits back through time if you like). The date of the revision is unimaginatively stored as a datetime field named revision.
Each new revision is taken daily, at midnight.
If this table were left to populate, there would be a set of rows for every single day, where each set of rows shares the same revision datetime field, e.g. 2010-10-31 00:00:00.
I would like to implement a stored procedure as a job that runs daily and prunes the revisions in this table, based on criteria similar to the way the Time Machine feature works in OS X. Namely, the revisions to be kept should be:
one revision each day in the past week (since weekly revision below)
one revision each week for every week after this (the midnight Monday morning revision)
one revision each month for revisions beyond a year old (the midnight revision of the first day of the month)
So, for example, right now I would expect to see the following revisions:
2010-10-30 (daily)
2010-10-29 (daily)
2010-10-28 (daily)
2010-10-27 (daily)
2010-10-26 (daily)
2010-10-25 (daily/weekly)
2010-10-18 (weekly)
2010-10-11 (weekly)
2010-10-04 (weekly)
2010-09-27 (weekly)
...
That is of course bearing in mind my data set isn't yet a year old.
So, what's the best and most concise DELETE FROM to achieve this? Thanks!
Taking each case:
Last week: simply filter with Revision < DATEADD(day, -8, GETDATE()), because nothing newer than that needs processing. (This allows for the time component and assumes the clean-up runs after midnight.)
Between a week ago and a year ago: use DATEPART(weekday, revision) <> 2. This depends on @@DATEFIRST (2 is Monday under the us_english default of 7).
More than a year ago: use DATEPART(day, revision) <> 1.
So, something like this (untested):
DELETE
MyTable
WHERE
Revision < DATEADD(day, -8, GETDATE())
AND
(
(Revision > DATEADD(year, -1, GETDATE()) AND DATEPART(weekday, Revision) <> 2)
OR -- Must be a year or more ago
(Revision <= DATEADD(year, -1, GETDATE()) AND DATEPART(day, Revision) <> 1)
)
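The weekday test above changes meaning with @@DATEFIRST. A minimal sketch of one way around that, assuming the fact that 1900-01-01 (datetime value 0) was a Monday, so DATEDIFF(day, 0, Revision) % 7 = 0 identifies Mondays under any session setting. The table variable @Revisions, its sample rows and the fixed @Now are hypothetical stand-ins so the classification can be inspected before any DELETE is run:

```sql
-- Hypothetical sample data; @Now stands in for GETDATE() so the result is repeatable.
DECLARE @Now datetime = '2010-10-31T00:05:00';
DECLARE @Revisions TABLE (Revision datetime NOT NULL);

INSERT @Revisions (Revision)
VALUES ('20101025'), ('20101020'), ('20101018'),
       ('20091001'), ('20090914'), ('20090902');

SELECT Revision,
       CASE
           -- Newer than the daily cut-off: always kept.
           WHEN Revision >= DATEADD(day, -8, @Now) THEN 'keep (daily)'
           -- Within the last year: keep Mondays only (independent of DATEFIRST).
           WHEN Revision > DATEADD(year, -1, @Now)
                THEN CASE WHEN DATEDIFF(day, 0, Revision) % 7 = 0
                          THEN 'keep (weekly)' ELSE 'delete' END
           -- A year old or more: keep the first of the month only.
           WHEN DATEPART(day, Revision) = 1 THEN 'keep (monthly)'
           ELSE 'delete'
       END AS Verdict
FROM @Revisions
ORDER BY Revision DESC;
```

Once the verdicts look right, the same predicates can be moved into the WHERE clause of the DELETE.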
We have a (for us) large data stream of about 2 GB/day. This data has to be incorporated, structured and cleansed into a destination database (this part is solved).
Until now we have been dropping and re-creating a table every day for the whole dataset of the current month, and then unioning it into a materialized table of the previous months' data.
Besides being very inelegant, by the end of the month this query unfortunately becomes quite massive and drains the server for up to 1.5 hours daily.
I thought I would write a query parameterized by date, grabbing all the data for DateMax
(
declare @ReportDateMax datetime
set @ReportDateMax = dateadd(dd, -1, cast(getdate() as date))
)
However, that is not good enough.
The problem is the following:
1. on Monday we need to download the data for Friday
2. on Tuesday for Monday
3. on Wednesday for Tuesday
4. on Thursday for Wednesday
5. on Friday for Thursday
This means that I cannot just download the information for yesterday, because then on Monday I would not get anything.
Also, ideally, if we only load daily data, I would like to run a select query twice a day for the previous reporting day.
Desired solution:
Design a query which selects the previous working day (public holidays are irrelevant here).
I have found a solution eventually in another post.
In case somebody else is interested:
SELECT DATEADD(DAY, CASE DATENAME(WEEKDAY, GETDATE())
WHEN 'Sunday' THEN -2
WHEN 'Monday' THEN -3
ELSE -1 END, DATEDIFF(DAY, 0, GETDATE()))
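DATENAME(WEEKDAY, ...) returns names in the session language, so the CASE above only matches under an English language setting. A minimal sketch of a language-independent variant, assuming again that datetime value 0 (1900-01-01) was a Monday; the five RunDate values are hypothetical test inputs standing in for GETDATE():

```sql
-- Hypothetical run dates: Fri 2017-09-22 through Tue 2017-09-26.
SELECT D.RunDate,
       DATEADD(day,
               CASE DATEDIFF(day, 0, D.RunDate) % 7   -- 0 = Monday ... 6 = Sunday
                    WHEN 0 THEN -3                      -- Monday -> previous Friday
                    WHEN 6 THEN -2                      -- Sunday -> previous Friday
                    ELSE -1                             -- any other day -> yesterday
               END,
               D.RunDate) AS PreviousWorkingDay
FROM (VALUES (CAST('20170922' AS datetime)),
             (CAST('20170923' AS datetime)),
             (CAST('20170924' AS datetime)),
             (CAST('20170925' AS datetime)),
             (CAST('20170926' AS datetime))) AS D(RunDate);
```

The Saturday, Sunday and Monday rows should all resolve to Friday 2017-09-22, which is the behaviour the CASE in the original query aims for.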
Sorry if the Title is confusing but it's hard to explain what I'm after in one phrase.
I'm currently producing a report based on the production for the week. I start off my CTE construction with the following to get the days Monday to Friday of the current week:
WITH
cte_Date AS
(
SELECT
CAST(DateTime AS date) AS Date
FROM
( VALUES
(GETDATE()
)
, (DATEADD(day,-1,GETDATE()))
, (DATEADD(day,-2,GETDATE()))
, (DATEADD(day,-3,GETDATE()))
, (DATEADD(day,-4,GETDATE()))
, (DATEADD(day,-5,GETDATE()))
, (DATEADD(day,-6,GETDATE())) ) AS LastSevenDays(DateTime)
WHERE
DATENAME(weekday, DateTime) = 'Monday'
UNION ALL
SELECT
DATEADD(day,1,Date)
FROM
cte_Date
WHERE
DATENAME(weekday,Date) <> 'Friday'
)
This is working fine. I have made the report available to users so they can run it anytime however sometimes nobody is available to run it last thing Friday. This means they don't get to see the full production for Friday and then the following week the CTE days change.
I'm trying to keep this a one-click affair so rather than introduce date parameters I proposed to the users that we adjust the query such that if they run the report before midday on "Monday" then it will show them last week's figures and they were happy with this (me and my big mouth). I put Monday in quotes because what we really mean of course is the first production day of the week.
My primary data table (which we'll call MyData) has a datetime field named DateTime (really!) that I can reference to determine the first day of production for the week.
One final caveat: Due to the layout of the report the users insisted that they always want to see the five days Monday to Friday, even if there is no production on a given day. (Consequently I do a LEFT JOIN from cte_Date to all other tables required.) So to be clear, right now as I'm typing this it's 11:45am local time on Tuesday and yesterday happened to be a public holiday here so running the report now should return Monday to Friday last week, but running it in 20 minutes time should return Monday to Friday this week.
Please help, my poor brain is getting twisted trying to figure it out.
There are a few different ways you can tackle this, but they all boil down to the same thing: you need a way of figuring out whether it's before or after 12pm on the first working day of the current week, then you need to get the Monday of the current "production week".
Let's just say, for simplicity's sake, you have some sort of table that contains public holidays (or non-production days). To find out whether it's the first day of the current production week, you basically just have to add the number of days in a row since the start of the week that have been public holidays.
Then you need to figure out whether it's before or after 12pm of that day.
If it's before you want last week's Monday-Friday. If it's after, you want this week's Monday-Friday.
Here's one way you might do this:
DECLARE @NonProductionDays TABLE (NPD DATE UNIQUE NOT NULL); -- Public holiday table.
INSERT @NonProductionDays (NPD) VALUES ('2017-09-25');
DECLARE @i INT = -- You don't need a variable for this, but just to keep things simple...
(
SELECT COUNT(*) -- Extract number of public holidays in a row this week before current date.
FROM @NonProductionDays AS N
WHERE DATEDIFF(WEEK, 0, N.NPD) = DATEDIFF(WEEK, 0, GETDATE())
AND N.NPD <= GETDATE()
AND (DATENAME(WEEKDAY, N.NPD) = 'Monday' OR EXISTS (SELECT 1 FROM @NonProductionDays AS N2 WHERE N2.NPD = DATEADD(DAY, -1, N.NPD)))
);
SELECT D = CAST(DATEADD(DAY, T.N, DATEADD(WEEK, DATEDIFF(HOUR, DATEADD(DAY, @i, '1900-01-01 12:00:00'), GETDATE()) / 24 / 7, '1900-01-01')) AS DATE)
FROM (VALUES (0), (1), (2), (3), (4)) AS T(N);
/*
Breaking this down:
X = DATEADD(DAY, @i, '1900-01-01 12:00:00')
-- Adds the number of NPD days this week to '1900-01-01 12:00:00'
-- So, for example, X would be '1900-01-02 12:00:00' this week
Y = DATEDIFF(HOUR, X, GETDATE()) / 24 / 7
-- The number of weeks between X and now, by taking the number of hours and dividing by 24 then by 7
-- The division is necessary to compare the hour.
-- So, for example, as of 11am on the September 26 2017, you'd get 6142.
-- As of 12pm on September 26 2017, you'd get 6143.
Z = DATEADD(WEEK, Y, '1900-01-01')
-- Just adds Y weeks to 1900-01-01, which was a Monday. This tells you the Monday of the current "production week".
-- So, for example, as of 11am on September 26 2017, you'd get '2017-09-18 00:00:00.000'.
-- As of 12pm on September 26 2017, you'd get '2017-09-25 00:00:00.000'.
Then we cast this as a date and add 0/1/2/3/4 days to it to get Monday, Tuesday, Wednesday, Thursday and Friday of the current "production week".
*/
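The midday switch-over described in the breakdown can be checked with fixed inputs. This is only a sketch: @i is set by hand to 1 (Monday 2017-09-25 being the one non-production day) and the two RunAt values are hypothetical stand-ins for GETDATE():

```sql
-- @i and RunAt are fixed, hypothetical inputs so the result does not depend on the clock.
DECLARE @i int = 1;   -- one non-production day (Monday) at the start of the week

SELECT T.RunAt,
       CAST(DATEADD(week,
                    -- Whole weeks between midday of the first production day (offset from 1900) and RunAt.
                    DATEDIFF(hour, DATEADD(day, @i, '19000101 12:00:00'), T.RunAt) / 24 / 7,
                    '19000101') AS date) AS ProductionWeekMonday
FROM (VALUES (CAST('2017-09-26T11:00:00' AS datetime)),
             (CAST('2017-09-26T12:00:00' AS datetime))) AS T(RunAt);
```

The 11:00 row should give 2017-09-18 and the 12:00 row 2017-09-25, matching the values quoted in the comment block.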
I'm not sure this is the most efficient approach, but after a week of tossing it about in my brain this is what I came up with. I approached the problem from the opposite direction to that suggested by @ZLK.
My existing logic was already giving me the Monday of this week so in a subquery I looked for the first production record after Monday, stripped off the time with a DATEDIFF and made it midday with a DATEADD. I was then able to compare the current Date/Time with midday of the first production day to determine whether to reduce the date by one week.
I replaced this SELECT clause:
SELECT
CAST(DateTime AS date) AS Date
with this one:
SELECT -- Monday this week if it's after midday on the first production day otherwise Monday last week
DATEADD(week,IIF(GETDATE()>=DATEADD(hour,12,(
SELECT DATEDIFF(day,0,MIN(DateTime))
FROM MyData
WHERE CAST(MyData.DateTime AS date) >= CAST(LastSevenDays.DateTime AS date)
)),0,-1),CAST(LastSevenDays.DateTime AS date)) AS Date
To cater for the case where a new week has commenced but the operator runs the report before production starts I carefully arranged the boolean condition inside my IIF clause so that the empty result set from the subquery would mean the test returned FALSE and the operator would still see last week's figures.
(@ZLK, thanks for your input. You did help my thinking a bit, but I don't think your answer should be marked as correct. What I've come up with here is what I was originally requesting and didn't require the use of a static table.)
My requirement is to find the business week-ending date (not the calendar week) for a DATE column from the sales table in MSSQL.
Using different techniques I was able to find the [calendar] week-ending (and week-starting) dates corresponding to the DATE in the table.
Since our business week ends on Wednesday [DOW 3 or 4, depending on when the week started], I tried deducting a number of days from the week-ending dates to pull them back to Wednesday. The idea works fairly well, with one flaw: it only works as long as the date in the table is later than DOW 3 or 4. Any suggestions?
SELECT DateAdd(wk, DateDiff(wk, 0, Recons_Sales_Details.Recons_Date), 0) + 2
You need to look into SET DATEFIRST to do this:
SET DATEFIRST 4 --4 is Thursday week start
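If changing the session's DATEFIRST is not an option, a different technique is to anchor the arithmetic on a known Wednesday instead. The sketch below is not the answer's method: it assumes the business week ends on Wednesday and uses datetime value 2 (1900-01-03, a Wednesday) as the anchor, rounding each date up to the next multiple of 7 days from it. The three sample dates are hypothetical:

```sql
-- Hypothetical sample dates; integer 2 is 1900-01-03, a Wednesday.
SELECT S.Recons_Date,
       -- Days since the anchor Wednesday, rounded up to a whole number of weeks.
       DATEADD(day, (DATEDIFF(day, 2, S.Recons_Date) + 6) / 7 * 7, 2) AS BusinessWeekEnding
FROM (VALUES (CAST('20170925' AS datetime)),                      -- a Monday
             (CAST('20170927' AS datetime)),                      -- a Wednesday
             (CAST('20170928' AS datetime))) AS S(Recons_Date);   -- a Thursday
```

The Monday and Wednesday rows should map to 2017-09-27, and the Thursday row to the following Wednesday, 2017-10-04.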
Greetings StackOverflow Wizards.
SQL datetime calculations always give me trouble. I am trying to determine if an employee's hire date fell between the last payday of that month and the first of the next month (i.e. did they get a paycheck in their hire month?).
Knowns:
I know our paydays are every other Friday.
I know 01/02/1970 was a payday, and that date precedes the longest active employee we have.
I know the hire date of each active employee (pulled from table).
I know (can calculate) the first of the month following the hire date.
What I cannot seem to wrap my head around is how to use that seed date (01/02/1970) with datediff, dateadd, datepart, etc. to determine if there is a pay date between the hire date in question and the first of the following month.
In pseudo-code, here is what I'm trying to do:
declare @seedDate datetime = '01/02/1970' -- Friday - Payday seed date from which to calculate
declare @hireDate datetime = '09/26/2008' -- this date will actually be pulled from ServiceTotal table
declare @firstOfMonth datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, @hireDate) + 1, 0) -- the first of the month following @hireDate
declare @priorPayDate datetime -- calculate the friday payday immediately prior to @firstOfMonth
if @priorPayDate BETWEEN @hireDate AND @firstOfMonth
begin
-- do this
end
else
begin
-- do that
end
Using the hard-coded @hireDate above, and the @seedDate to determine every-other-Friday paydays, I know that there was a payday on 9/19/2008 and not another one until 10/03/2008, so the boolean above would be FALSE, and I will "do that" rather than "do this." How do I determine the value of @priorPayDate?
In all my databases where there is a lot going on with dates, I create a table with columns for date, day, weekday, month, week number, day of month, and so on. I then use a procedural programming language or a bunch of handwritten SQL to populate this table with every day for a large range of years, say 1970 to 2200.
I pack this table 100% and index it heavily. You can then simply join any date to this table and do complex date work with a simple WHERE clause. So basically you pre-calculate a helper table. In your case, you could add a "Fridays since seed" column to the date helper table.
Hope that makes sense.
Taking a DATEDIFF in days between your @seedDate and @firstOfMonth gives you a total number of days; taking that modulo the number of days between pay periods (14) gives the number of days from the last payday to @firstOfMonth. You'll run into problems when the 1st is itself a payday (e.g. next month), which makes a CASE expression necessary:
DECLARE @priorPayDate DATETIME
SET @priorPayDate = CASE
WHEN DATEDIFF(dd, @seedDate, @firstOfMonth) % 14 = 0
THEN DATEADD(dd, -14, @firstOfMonth)
ELSE DATEADD(dd, -(DATEDIFF(dd, @seedDate, @firstOfMonth) % 14), @firstOfMonth)
END
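As a worked check of the modulo approach, here is the same calculation run end to end with the dates from the question. The helper variable @daysSincePayday and the Branch column are illustrative names that do not appear in the original:

```sql
DECLARE @seedDate datetime = '19700102';      -- known Friday payday
DECLARE @hireDate datetime = '20080926';      -- sample hire date from the question
DECLARE @firstOfMonth datetime = DATEADD(month, DATEDIFF(month, 0, @hireDate) + 1, 0);

-- Days elapsed since the most recent payday on or before @firstOfMonth (hypothetical helper).
DECLARE @daysSincePayday int = DATEDIFF(day, @seedDate, @firstOfMonth) % 14;

DECLARE @priorPayDate datetime =
    DATEADD(day,
            CASE WHEN @daysSincePayday = 0 THEN -14 ELSE -@daysSincePayday END,
            @firstOfMonth);

SELECT @firstOfMonth AS FirstOfMonth,
       @priorPayDate AS PriorPayDate,
       CASE WHEN @priorPayDate BETWEEN @hireDate AND @firstOfMonth
            THEN 'do this' ELSE 'do that' END AS Branch;
```

For these inputs the first of the month is 2008-10-01, which is 12 days after a payday, so the prior payday works out to 2008-09-19. That is before the hire date, so the "do that" branch is taken, as the question expects.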
I have a database "DB" and I want to delete records from the table "MyRecords" that are in the future (with "RecordDates" later than today, to guard against a date change on the system) and less than x days from today.
That "x" may vary. I also want to be able to run the same kind of command with months and years.
I am using SQL Server and C#.
To delete records which are in the future and less than n days in the future you could use T-SQL such as this:
DELETE FROM DB.table WHERE
(date > GETDATE() AND date < DATEADD(day, 5, GETDATE()));
In this case n is 5.
You can see all the arguments to DATEADD() at DATEADD (Transact-SQL)
This query will delete all records that are later than today's date, but less than 30 days in the future. You could replace "30" with a variable so you could determine how many days in the future to delete.
DELETE FROM Table
WHERE
TABDate > GETDATE() and TABDate < DATEADD(day, 30, GETDATE())
UPDATE
To delete all records less than 30 days in the past, you would change the query to look like this:
DELETE FROM Table
WHERE
TABDate > DATEADD(day, -30, GETDATE()) AND TABDate < GETDATE()
Also note that all these examples call GETDATE(), which returns a time component as well as a date. So whenever you see a condition like < GETDATE(), you are not just deleting records dated before, say, 2011-09-29; you are deleting all records dated before '2011-09-29 17:30'. Be aware of that if your table's dates contain times as well.
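One way to sidestep the time-of-day issue is to compute the bounds from midnight and keep the day count in a variable. The sketch below makes its own assumptions: "in the future" is read as "from tomorrow 00:00", and the table variable @MyRecords with its RecordDate column is a hypothetical stand-in for the real table:

```sql
DECLARE @n int = 5;                                                   -- number of days ahead
DECLARE @Today datetime = CAST(CAST(GETDATE() AS date) AS datetime);  -- today at 00:00

-- Hypothetical stand-in for the real table.
DECLARE @MyRecords TABLE (RecordDate datetime NOT NULL);
INSERT @MyRecords (RecordDate)
VALUES (DATEADD(hour, 18, @Today)),               -- later today: kept
       (DATEADD(hour, 33, @Today)),               -- tomorrow 09:00: deleted
       (DATEADD(hour, 5 * 24 + 23, @Today)),      -- day 5 at 23:00: deleted
       (DATEADD(day, 6, @Today));                 -- day 6 at 00:00: kept

DELETE FROM @MyRecords
WHERE RecordDate >= DATEADD(day, 1, @Today)        -- from tomorrow 00:00
  AND RecordDate <  DATEADD(day, @n + 1, @Today);  -- up to, not including, day n + 1

SELECT RecordDate FROM @MyRecords ORDER BY RecordDate;
```

Swapping day for month or year in the upper bound's DATEADD call gives the month and year variants asked about in the question.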
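You can use the query DELETE FROM DB.table WHERE date > now() or WHERE date > unix_timestamp(), depending on how you are storing your dates (i.e. date-time vs. timestamp). Note that now() and unix_timestamp() are MySQL functions; on SQL Server the equivalent of now() is GETDATE().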