Stored procedure to archive data older than 6 months (180 days) - sql-server

We are trying to create a stored procedure to archive data older than 6 months (180 days) from our production database into a new archive database.
We also want to delete the archived rows from the production database.
We are thinking of using a WHILE loop, but we want to archive only 10,000 rows a day, and we need to schedule it to run daily.
Can you please share your experience?
Thanks

Maybe DELETE with an OUTPUT ... INTO clause would work for you? I found something useful here: https://msdn.microsoft.com/en-us/library/ms177564.aspx
USE AdventureWorks2012;
GO
DECLARE @MyTableVar TABLE
(
ProductID INT NOT NULL
);
-- Delete at most 10,000 rows older than 180 days and capture their keys.
DELETE TOP (10000) ph
OUTPUT DELETED.ProductID INTO @MyTableVar
FROM Production.ProductProductPhoto AS ph
WHERE DATEDIFF(DAY, ph.YourDay, GETDATE()) > 180;
--Display the results of the table variable.
SELECT *
FROM @MyTableVar;
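The copy-then-delete step described in the question can be sketched end to end. The following is only an illustration in Python with SQLite standing in for SQL Server (SQLite has no DELETE ... OUTPUT); the table names `orders` and `orders_archive` and the column `created` are hypothetical. It moves at most one batch per call inside a single transaction, which is what a daily scheduled job would run.

```python
import sqlite3
from datetime import datetime, timedelta


def archive_batch(conn, cutoff, batch_size=10000):
    """Move up to batch_size of the oldest rows older than cutoff into the archive."""
    with conn:  # one transaction: the copy and the delete commit together or not at all
        conn.execute(
            "INSERT INTO orders_archive "
            "SELECT * FROM orders WHERE created < ? ORDER BY created, id LIMIT ?",
            (cutoff, batch_size),
        )
        # Remove from the source every row that now exists in the archive.
        cur = conn.execute(
            "DELETE FROM orders WHERE id IN (SELECT id FROM orders_archive)"
        )
        return cur.rowcount


# Hypothetical demo data: eight rows aged 50, 100, ..., 400 days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created TEXT)")
now = datetime(2024, 1, 1)
rows = [(i, (now - timedelta(days=i * 50)).isoformat()) for i in range(1, 9)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

cutoff = (now - timedelta(days=180)).isoformat()
moved = archive_batch(conn, cutoff, batch_size=3)
print(moved)
```

Calling the function once per day with the real batch size gives the "10,000 rows a day" behaviour without a WHILE loop; calling it repeatedly until it returns 0 drains the backlog.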

Related

Snowflake show tables not accessed in last 20 days

I need to clean up my databases in Snowflake.
We have around 40 databases, and each database has more than 100 tables. Some are loaded every day; others are not loaded but are used every day.
However, lots of tables have been added for testing and other purposes (by many developers and users).
Now we are working on cleaning up unused tables.
We have the query_history table, which gives us information about queries run in the past. It has fields such as database, warehouse, and user, but not table.
Is there any way to write a query that gives us the names of tables not used (by either DDL or DML) in the last 10 days?
select obj.value:objectName::string objName
, max(query_start_time) as QUERY_DATE_TIME
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
group by 1
order by QUERY_DATE_TIME desc;
The information schema has a tables view with a last_altered column. Will that work for you? It will not tell you when a table was last accessed, only when it was last altered. Other than this, there is no easy way to get this information from Snowflake at this time. I also needed this feature; I think we should request it.
select table_schema,
table_name,
last_altered
from information_schema.tables
where table_type = 'BASE TABLE'
and last_altered < dateadd( 'DAY', -10, current_timestamp() )
order by table_schema,
table_name;

T-SQL, SSRS: Set up automatic daily Inserts into Table

I'm using SQL Server 2012.
SSMS 11.0.6248.0.
I want to create an automated way of Inserting data [using a T-SQL insert statement] into a SQL Server table before users start using the system [third-party business system] each morning.
I do a lot of SSRS reporting and creating subscriptions; know how to do inserts using T-SQL, and I am familiar with stored procedures, but I have not had to automate something like this strictly within SQL Server.
Can I make this happen on a schedule - strictly in the SQL Server realm [i.e. using SSRS ... or a stored procedure ... or a function]?
Example Data to read:
Declare @t Table
(
DoctorName Varchar(1),
AppointmentDate Date,
Appointments Int
)
Insert Into @t select 'A','2018-10-23', 5
Insert Into @t select 'B','2018-10-23', 5
Insert Into @t select 'C','2018-10-23', 5
Insert Into @t select 'D','2018-10-23', 5
Insert Into @t select 'E','2018-10-23', 5
Insert Into @t select 'F','2018-10-23', 5
Insert Into @t select 'G','2018-10-23', 5
Insert Into @t select 'H','2018-10-23', 5
Insert Into @t select 'I','2018-10-23', 5;
Select * From @t
The value in Appointments changes through the day as Doctors see patients. Patients may cancel. Patients may walk in. Typically, at the end of the day Doctors end up seeing more patients than they have scheduled at the start of the day. [I set the number at 5 for all Doctors at the start of the above day].
I want to capture the data as it is at the start of each day - before the Clinic opens and the numbers change - and store it in another Table for historic reporting.
I hope this simplified example clarifies what I want to do.
I would appreciate any suggestions on how I might best go about doing this.
Thanks!
This sounds like a job for the SQL Server Agent. A more specific suggestion will require a more detailed description of what you're doing (with sample data, preferably).
You can use SSIS to create a package that you can then schedule. Since you are familiar with stored procedures, you would create your SP first, then in SSIS add an Execute SQL Task to the Control Flow and configure it according to your needs.
If that doesn't work for you, you could create an application that runs on a timer and executes your SP. However, since you want to stay in the SQL realm, SSIS is the place to look.
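Whichever scheduler ends up running it, the scheduled statement itself is a single INSERT ... SELECT into a history table. Here is a rough sketch of that step in Python with SQLite as a stand-in for SQL Server; the table names `appointments` and `appointment_history` are hypothetical. The NOT EXISTS guard is an added assumption so that an accidental second run on the same morning does not duplicate or overwrite the snapshot.

```python
import sqlite3


def snapshot_appointments(conn, snapshot_date):
    """Copy current appointment counts into the history table, at most once per date."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO appointment_history (snapshot_date, doctor_name, appointments) "
            "SELECT ?, doctor_name, appointments FROM appointments "
            "WHERE NOT EXISTS ("
            "  SELECT 1 FROM appointment_history WHERE snapshot_date = ?)",
            (snapshot_date, snapshot_date),
        )
        return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE appointments (doctor_name TEXT, appointments INTEGER)")
conn.execute(
    "CREATE TABLE appointment_history "
    "(snapshot_date TEXT, doctor_name TEXT, appointments INTEGER)"
)
conn.executemany(
    "INSERT INTO appointments VALUES (?, ?)", [("A", 5), ("B", 5), ("C", 5)]
)

first = snapshot_appointments(conn, "2018-10-23")   # taken before the clinic opens
conn.execute("UPDATE appointments SET appointments = 8 WHERE doctor_name = 'A'")
second = snapshot_appointments(conn, "2018-10-23")  # rerun later the same day
print(first, second)
```

In SQL Server the same INSERT ... SELECT would live in the stored procedure that the scheduled step calls.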

Delete few entries from a table SQL Server

I have 42,715,078 entries in one of my tables, and I would like to delete the TOP 42,715,000 rows (I want to keep just 78 entries).
Does anyone know how I can do that?
PS: I don't want to delete the table, just the entries in it.
Probably your best bet is to select out the 78 rows you want to keep into a temporary table, then truncate the table and insert them back in.
SELECT * INTO #temp FROM TableName WHERE <Condition that gets you the 78 rows you want>
Or if you don't have a specific 78 rows
SELECT TOP 78 * INTO #temp FROM TableName
Then
TRUNCATE TABLE TableName
And last but not least
INSERT INTO TableName
SELECT * FROM #temp
Doing it this way should be considerably faster depending on what condition you use to get the 78 rows and you avoid bloating the log as TRUNCATE is only minimally logged.
We have an activity log that we truncate once a month. (We keep the monthly backups, so we can get back to any old data if we want to.) If your table is growing every month and you want to keep it small like we do with ours, you can set up a SQL Agent Job to run each month.
We only remove 5000 rows at a time to keep the load off the database, so this job runs every two minutes for an hour. That gives it enough time to remove all the oldest rows without locking the database.
DECLARE @LastDate DateTime -- We remove the oldest rows by month
DECLARE @NumberOfRows INT -- Number of rows to delete per run
-- Set the date to the current date minus 3 months.
SET @LastDate = DATEADD(MM, -3, GETDATE())
-- Since it runs on the first Saturday of each month, this code gets it
-- back to the first of the month.
SET @LastDate = CAST(CAST(DATEPART(YYYY, @LastDate) AS varchar) + '-' + CAST(DATEPART(MM, @LastDate) AS varchar) + '-01' AS DATETIME)
-- We use 5000.
SET @NumberOfRows = 5000
DELETE TOP (@NumberOfRows) FROM MyTable WHERE Created < @LastDate
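The same idea can be written as a loop that commits after every small batch instead of being rescheduled every two minutes. The sketch below uses Python with SQLite in place of SQL Server; the table `activity_log` and its `created` column are hypothetical names, and the subquery with LIMIT replaces DELETE TOP, which SQLite does not have. The helper also shows the "three months back, first of the month" cutoff from the script above.

```python
import sqlite3
from datetime import date


def first_of_month_back(today, months=3):
    """Return the first day of the month that lies `months` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def purge_in_batches(conn, cutoff, batch_size=5000):
    """Delete rows older than cutoff in small batches; return the total removed."""
    total = 0
    while True:
        with conn:  # commit per batch so locks are held only briefly
            cur = conn.execute(
                "DELETE FROM activity_log WHERE rowid IN ("
                "  SELECT rowid FROM activity_log WHERE created < ? LIMIT ?)",
                (cutoff, batch_size),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_log (created TEXT)")
old_rows = [("2023-10-%02d" % day,) for day in range(1, 13)]   # 12 old rows
new_rows = [("2024-01-%02d" % day,) for day in range(1, 4)]    # 3 recent rows
conn.executemany("INSERT INTO activity_log VALUES (?)", old_rows + new_rows)

cutoff = first_of_month_back(date(2024, 2, 3))
removed = purge_in_batches(conn, cutoff.isoformat(), batch_size=5)
print(cutoff, removed)
```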
I got it.
DELETE TOP (42715000)
FROM <tablename>
WHERE <condition>
It worked so well!

Question about Crystal Reports + SQL Server stored procedure with GROUP BY

I'm writing a report in Crystal Reports XI Developer that runs a stored procedure in a SQL Server 2005 database. The record set returns a summary from a log table, grouped by day and hour.
Currently my query looks something like this:
SELECT
sum(colA) as "Total 1",
day(convert(smalldatetime, convert(float, Timestamp) / 1440 - 1)) as "Date",
datepart(hh, convert(smalldatetime, convert(float, Timestamp) / 1440 - 1)) as "Hour"
`etc...`
GROUP BY
Day, Hour
Ignore the date insanity, I think the system designers were drinking heavily when they worked out how to store their dates.
My problem is this: since there are not always records from each hour of the day, then I end up with gaps, which is understandable, but I'd like Crystal to be able to report on the entire 24 hours regardless of whether there is data or not.
I know I can change this by putting the entire query in a WHILE loop (within the stored procedure) and doing queries on the individual hours, but something inside me says that one query is better than 24.
I'd like to know if there's a way to have Crystal iterate through the hours of the day as opposed to iterating through the rows in the table as it normally does.
Alternatively, is there a way to format the query so that it includes the empty hour rows without killing my database server?
Here's how I solved this problem:
Create a local table in your SQL-Server. Call it "LU_Hours".
This table will have 1 integer field (called "Hours") with 24 rows. The values would be 0 through 23, to match what DATEPART(hh, ...) returns.
Right join this onto your existing query.
You might need to tweak this to make sure the nulls of empty hours are handled to your satisfaction.
You could use a WITH clause to create the 24 hours, then OUTER JOIN it.
-- DATEPART(hh, ...) returns 0 through 23, so the hours list uses the same range.
WITH hours AS (
SELECT 0 AS hour
UNION ALL
SELECT 1 AS hour
...
SELECT 23 AS hour
)
SELECT *
FROM hours h
LEFT OUTER JOIN [your table] x ON h.hour = datepart(hh, convert(smalldatetime, convert(float, x.Timestamp) / 1440 - 1))
This SQL would need to be added to a Command.
Group as necessary in the SQL or the report.
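To see the gap filling work on real data, here is a small runnable sketch in Python with SQLite standing in for SQL Server. The `log` table, with a precomputed `hour` column and `col_a`, is a hypothetical simplification of the poster's table. The hours are numbered 0 to 23, the range that DATEPART(hh, ...) returns, and are generated with a recursive CTE rather than 24 UNIONed SELECTs; SQL Server also supports recursive CTEs, written without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (hour INTEGER, col_a INTEGER)")
# Only three distinct hours have any data.
conn.executemany(
    "INSERT INTO log VALUES (?, ?)", [(0, 4), (9, 10), (9, 5), (17, 7)]
)

query = """
WITH RECURSIVE hours(hour) AS (
    SELECT 0
    UNION ALL
    SELECT hour + 1 FROM hours WHERE hour < 23
)
SELECT h.hour, COALESCE(SUM(l.col_a), 0) AS total
FROM hours AS h
LEFT JOIN log AS l ON l.hour = h.hour
GROUP BY h.hour
ORDER BY h.hour
"""
totals = conn.execute(query).fetchall()

for hour, total in totals:
    print(hour, total)
```

Every hour appears exactly once in the result, with a total of 0 where the log has no rows, so the report can iterate over rows as usual.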

Ensuring index is used on Informix DATETIME column

Say I have a table on an Informix DB:
create table password_audit (
username CHAR(20),
old_password CHAR(20),
new_password CHAR(20),
update_date DATETIME YEAR TO FRACTION);
I need the update_date field to be in milliseconds (or seconds maybe - same question applies) because there will be multiple updates of the password on the same day.
Say, I have a nightly batch job that wants to retrieve all records from the password_audit table for today.
To increase performance, I want to put an index on the update_date column. If I do this:
CREATE INDEX pw_idx ON password_audit(update_date);
and run this SQL:
SELECT *
FROM password_audit
WHERE DATE(update_date) = mdy(?,?,?)
(where ?, ?, ? are the month, day and year passed in by my batch job)
then I don't think my index will be used - is that right?
I think I need to create an index something like this:
CREATE INDEX pw_idx ON password_audit(DATE(update_date));
- is that right?
Because you are forcing the server to convert two values to DATE, not DATETIME, it probably won't use an index.
You would do best to generate the SQL as:
SELECT *
FROM password_audit
WHERE update_date
BETWEEN DATETIME(2010-08-02 00:00:00.00000) YEAR TO FRACTION(5)
AND DATETIME(2010-08-02 23:59:59.99999) YEAR TO FRACTION(5)
That's rather verbose. Alternatively, and maybe slightly more easily:
SELECT *
FROM password_audit
WHERE update_date >= DATETIME(2010-08-02 00:00:00.00000) YEAR TO FRACTION(5)
AND update_date < DATETIME(2010-08-03 00:00:00.00000) YEAR TO FRACTION(5)
Both of these should be able to use the index on the update_date column. You can experiment with dropping some of the trailing zeroes from the literals, but I don't think you'll be able to remove them all - but see what the SET EXPLAIN ON output tells you.
Depending on your server version, you might need to run UPDATE STATISTICS after creating the index before the optimizer uses it at all; that is more of a problem on older (say 10.00 and earlier) versions of Informix than on the current (11.10 and later) versions.
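For the batch job, the two bounds of the half-open range can be computed from the run date and passed as parameters. The following sketch uses Python with SQLite as a stand-in for Informix; timestamps are stored as fixed-width text here, which is an assumption of the demo and not how Informix stores DATETIME values. It shows only the shape of the query: the predicate compares the bare column against two constants, with no function wrapped around update_date.

```python
import sqlite3
from datetime import date, datetime, time, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"  # fixed-width, so text order matches time order


def day_bounds(day):
    """Return the [start, end) bounds of one calendar day as formatted strings."""
    start = datetime.combine(day, time.min)
    end = start + timedelta(days=1)
    return start.strftime(FMT), end.strftime(FMT)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE password_audit (username TEXT, update_date TEXT)")
conn.execute("CREATE INDEX pw_idx ON password_audit(update_date)")
conn.executemany(
    "INSERT INTO password_audit VALUES (?, ?)",
    [
        ("alice", "2010-08-01 23:59:59.999999"),
        ("bob", "2010-08-02 00:00:00.000000"),
        ("carol", "2010-08-02 23:59:59.999999"),
        ("dave", "2010-08-03 00:00:00.000000"),
    ],
)

start, end = day_bounds(date(2010, 8, 2))
todays = [
    row[0]
    for row in conn.execute(
        "SELECT username FROM password_audit "
        "WHERE update_date >= ? AND update_date < ? ORDER BY update_date",
        (start, end),
    )
]
print(todays)
```

Using an exclusive upper bound at the next midnight avoids having to guess how many fractional digits the last instant of the day needs.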
I didn't see 'date_to_accounts_ni' defined in your password_audit table.
What datatype/length is it?
Your first index on password_audit.update_date is adequate. Why would you want to index (DATE(update_date))?
