how to track a deletion in the future? - sql-server

I have a table in my Datamart; users are supposed to only read from that table. I checked, and the row count of the table went down, meaning someone, an SP, or a job deleted records, and I sense it is not the first time.
My question: what is the least invasive and simplest way of tracking this? I do not want to prevent it; I want to get the person's name, or the SP/job name, and the exact time it happened.
I am using: Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64): Business Intelligence Edition (64-bit) on Windows NT 6.2 (Build 9200: ) (Hypervisor)
I have the 'simple' recovery model, and I assume tracking past deletions is really challenging, so I am happy with capturing this information in the future.

You can use an AFTER DELETE trigger on your table and log the deleted values into a history log table, along with user information, deletion time, etc., as follows:
CREATE TRIGGER dbo.myTableDeleteTrigger
ON dbo.myTable
AFTER DELETE
AS
    INSERT INTO myTableHistory (
        -- columns from myTable
        DeletedDate,
        DeletedByUserId
    )
    SELECT
        -- deleted column values from the Deleted pseudo-table
        GETDATE(),
        USER_ID()
    FROM Deleted
GO
I have used a similar trigger to log data changes for INSERT, UPDATE, and DELETE statements on a SQL Server table, as explained in the referenced tutorial.
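To make the template concrete, here is a minimal self-contained sketch, assuming a hypothetical dbo.myTable with columns Col1 and Col2 (the column names and types are placeholders, not from the question). SUSER_SNAME() is used rather than USER_ID(), since the goal is to capture the login name of whoever issued the DELETE:

-- Hypothetical history table; Col1/Col2 stand in for the real columns.
CREATE TABLE dbo.myTableHistory (
    HistoryId     int IDENTITY(1,1) PRIMARY KEY,
    Col1          int,
    Col2          varchar(100),
    DeletedDate   datetime      NOT NULL,
    DeletedByUser nvarchar(128) NOT NULL
)
GO
CREATE TRIGGER dbo.myTableDeleteTrigger
ON dbo.myTable
AFTER DELETE
AS
    INSERT INTO dbo.myTableHistory (Col1, Col2, DeletedDate, DeletedByUser)
    SELECT d.Col1,
           d.Col2,
           GETDATE(),
           SUSER_SNAME()  -- login name of the session that deleted the rows
    FROM Deleted AS d
GO

Note that a delete issued by a SQL Agent job runs under the Agent service account, so logging APP_NAME() alongside the login can help identify the job or application responsible.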

The code below will generate the triggers for every table in your database.
Note: you can exclude the tables you don't want to track in the CTE.
;WITH CTE AS (
    SELECT TAB.name
    FROM sys.objects TAB
    WHERE TAB.type = 'U'
)
SELECT '
GO
CREATE TRIGGER [dbo].[TRG_' + NAME + '_LOG] ON [dbo].[' + NAME + ']
FOR UPDATE, DELETE
AS
INSERT INTO LOG_' + NAME + '
    (LOG_DTE, ' + (SELECT STUFF((SELECT ', ' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = NAME FOR XML PATH('')), 1, 1, '')) + ')
SELECT GETDATE(), ' + (SELECT STUFF((SELECT ', ' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = NAME FOR XML PATH('')), 1, 1, '')) + '
FROM deleted
PRINT ''AFTER TRIGGER FIRED''
' FROM CTE
ORDER BY name
OFFSET 81 ROWS FETCH NEXT 571 ROWS ONLY -- adjust or drop this clause to control which tables are scripted
But first you need to create a log table for every tracked table, named with the prefix Log_ plus the actual table name.
Ex: if you have a table Employee (EID int, EName varchar(250)),
you need to have a table like
Log_Employee (LOG_EID int identity, LOG_DTE datetime, EID int FK, EName varchar(250))
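As a concrete sketch of that convention for the Employee example (the column types are carried over from the example above; the exact DDL is an assumption, not from the original answer):

CREATE TABLE dbo.Employee (
    EID   int          NOT NULL PRIMARY KEY,
    EName varchar(250) NULL
)
GO
-- Log table: identity key and log date first, then a copy of every Employee column.
CREATE TABLE dbo.Log_Employee (
    LOG_EID int IDENTITY(1,1) PRIMARY KEY,
    LOG_DTE datetime NOT NULL DEFAULT GETDATE(),
    EID     int          NOT NULL,
    EName   varchar(250) NULL
)
GO
-- And this is roughly what the generator above would emit for Employee:
CREATE TRIGGER [dbo].[TRG_Employee_LOG] ON [dbo].[Employee]
FOR UPDATE, DELETE
AS
INSERT INTO LOG_Employee (LOG_DTE, EID, EName)
SELECT GETDATE(), EID, EName
FROM deleted
PRINT 'AFTER TRIGGER FIRED'
GO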

Related

How do I set up a daily archive job in SQL Server to keep my DB small and quick?

I have a DB in SQL Server, and one of the tables receives a large amount of data every day (100,000+ rows). The data is reported on, but my client only needs it for 7 days. On the odd occasion he will require access to historic data, but this can be ignored.
I need to ensure that my primary lookup table stays as small as it can (so that my queries are as quick as possible), and any data older than 7 days goes into a secondary (archiving) table within the same database. Data feeds into my primary table consistently throughout the day from a variety of data sources.
How would I go about performing this? I managed to get to the code below by using other questions, but I am now receiving an error:
Msg 8101, Level 16, State 1, Line 12
An explicit value for the identity column in table 'dbo.Archived Data Import' can only be specified when a column list is used and IDENTITY_INSERT is ON.
Below is my current code:
DECLARE @NextIDs TABLE(IndexID int primary key)
DECLARE @7daysago datetime
SELECT @7daysago = DATEADD(d, -7, GETDATE())

WHILE EXISTS(SELECT 1 FROM [dbo].[Data Import] WHERE [Data Import].[Receive Date] < @7daysago)
BEGIN
    BEGIN TRAN

    INSERT INTO @NextIDs(IndexID)
    SELECT TOP 10000 IndexID FROM [dbo].[Data Import] WHERE [Data Import].[Receive Date] < @7daysago

    INSERT INTO [dbo].[Archived Data Import]
    SELECT *
    FROM [dbo].[Data Import] AS a
    INNER JOIN @NextIDs AS b ON a.IndexID = b.IndexID

    DELETE a
    FROM [dbo].[Data Import] AS a
    INNER JOIN @NextIDs AS b ON a.IndexID = b.IndexID

    DELETE FROM @NextIDs

    COMMIT TRAN
END
What am I doing wrong here? I'm using SQL Server 2012 Express, so I cannot use partitioning (which would be ideal).
Beyond this, how do I turn this into a daily recurring task? Any help would be much appreciated.
An explicit value for the identity column in table 'dbo.Archived Data Import' can only be specified when a column list is used and IDENTITY_INSERT is ON
So... set IDENTITY_INSERT on. Also, use DELETE ... OUTPUT INTO ... rather than SELECT -> INSERT -> DELETE:
DECLARE @7daysago datetime
SELECT @7daysago = DATEADD(d, -7, GetDate());

SET IDENTITY_INSERT [dbo].[Archived Data Import] ON;

WITH CTE as (
    SELECT TOP 10000 *
    FROM [dbo].[Data Import]
    WHERE [Data Import].[Receive Date] < @7daysago)
DELETE CTE
OUTPUT DELETED.id, DELETED.col1, DELETED.col2, ...
INTO [dbo].[Archived Data Import] (id, col1, col2, ....);
Beyond this, how do I turn this into a daily recurring task?
Use conversation timers and activated procedures. See Scheduling Jobs in SQL Server Express.
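For reference, a minimal sketch of that pattern, assuming placeholder queue, service, and procedure names, and that Service Broker is enabled on the database:

-- Assumes ALTER DATABASE YourDb SET ENABLE_BROKER has been run.
CREATE QUEUE SchedulerQueue
CREATE SERVICE SchedulerService ON QUEUE SchedulerQueue ([DEFAULT])
GO
-- Activated procedure: does the work, then re-arms the timer for 24 hours.
CREATE PROCEDURE dbo.OnSchedulerTimer
AS
BEGIN
    DECLARE @handle uniqueidentifier, @msgtype sysname

    RECEIVE TOP (1) @handle = conversation_handle, @msgtype = message_type_name
    FROM SchedulerQueue

    IF @msgtype = N'http://schemas.microsoft.com/SQL/ServiceBroker/DialogTimer'
    BEGIN
        -- do the work here, e.g. the DELETE ... OUTPUT INTO batch above
        BEGIN CONVERSATION TIMER (@handle) TIMEOUT = 86400  -- fire again in 24 hours
    END
END
GO
ALTER QUEUE SchedulerQueue
WITH ACTIVATION (
    STATUS = ON,
    PROCEDURE_NAME = dbo.OnSchedulerTimer,
    MAX_QUEUE_READERS = 1,
    EXECUTE AS OWNER)
GO
-- Kick off the first timer.
DECLARE @handle uniqueidentifier
BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE SchedulerService
    TO SERVICE 'SchedulerService'
    WITH ENCRYPTION = OFF
BEGIN CONVERSATION TIMER (@handle) TIMEOUT = 86400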
Without seeing your table definitions, I am going to assume that your archive table has the same definition as your current table. Am I right in assuming that you have an identity column on Archived Data Import.IndexID? If so, switch it to be a plain int large enough to hold the expected values.
In order to schedule this, you will need to create a .bat file that runs this procedure, and schedule it with Windows Task Scheduler.
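A sketch of such a procedure and the one-line batch file, assuming hypothetical column names (IndexID, col1, col2) and instance/database names that you would replace with your own:

CREATE PROCEDURE dbo.ArchiveDataImport
AS
BEGIN
    DECLARE @7daysago datetime
    SELECT @7daysago = DATEADD(d, -7, GETDATE());

    SET IDENTITY_INSERT [dbo].[Archived Data Import] ON;

    -- Archive one batch; loop this statement if more than 10000 rows can age out per day.
    WITH CTE AS (
        SELECT TOP 10000 *
        FROM [dbo].[Data Import]
        WHERE [Receive Date] < @7daysago)
    DELETE CTE
    OUTPUT DELETED.IndexID, DELETED.col1, DELETED.col2
    INTO [dbo].[Archived Data Import] (IndexID, col1, col2);

    SET IDENTITY_INSERT [dbo].[Archived Data Import] OFF;
END
GO
-- The .bat file for Windows Task Scheduler then needs a single line, e.g.:
-- sqlcmd -S .\SQLEXPRESS -d YourDatabase -E -Q "EXEC dbo.ArchiveDataImport"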

SQL trigger for audit table getting out of sync

I recently created a SQL trigger to replace a very expensive query I used to run, in order to reduce the number of updates my database does each day.
Before I perform an update, I check how many updates have already occurred for the day; this used to be done by querying:
SELECT COUNT(*) FROM Movies WHERE DateAdded = Date.Now
Well, my database has over 1 million records and this query is run about 1-2k times a minute, so you can see why I wanted to take a new approach.
So I created an audit table and set up a SQL trigger to update this table when any INSERT or UPDATE happens on the Movies table. However, I'm noticing the audit table is getting out of sync by a few hundred every day (the audit table count is higher than the actual number of updates in the Movies table). As this does not pose a huge issue, I'm just curious what could be causing this, or how to go about debugging it?
SQL Trigger:
ALTER TRIGGER [dbo].[trg_Audit]
ON [dbo].[Movies]
AFTER UPDATE, INSERT
AS
BEGIN
    UPDATE Audit SET [count] = [count] + 1 WHERE [date] = CONVERT(date, GETDATE())

    IF @@ROWCOUNT = 0
        INSERT INTO audit ([date], [count]) VALUES (GETDATE(), 1)
END
The above trigger fires after an UPDATE or INSERT on the Movies table and tries to increment the count in the Audit table; if no row exists for today (IF @@ROWCOUNT = 0) it creates one. Any help would be much appreciated! Thanks.
Something like this should work:
create table dbo.Movies (
A int not null,
B int not null,
DateAdded datetime not null
)
go
create view dbo.audit
with schemabinding
as
select CONVERT(date,DateAdded) as dt,COUNT_BIG(*) as cnt
from dbo.Movies
group by CONVERT(date,DateAdded)
go
create unique clustered index IX_MovieCounts on dbo.audit (dt)
This is called an indexed view. The advantage is that SQL Server takes responsibility for maintaining the data stored in this view, and it's always right.
Unless you're on Enterprise/Developer edition, you'd query the audit view using the NOEXPAND hint:
SELECT * from audit with (noexpand)
This has the advantages that
a) You don't have to write the triggers yourself now (SQL Server does actually have something quite similar to triggers behind the scenes),
b) It can now cope with multi-row inserts, updates and deletes, and
c) You don't have to write the logic to cope with an update that changes the DateAdded value.
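To see the view at work, a quick hypothetical session against the tables defined above:

-- Insert a row; SQL Server maintains the indexed view automatically.
INSERT INTO dbo.Movies (A, B, DateAdded) VALUES (1, 2, GETDATE())

-- On non-Enterprise editions, NOEXPAND reads the materialized data.
SELECT cnt
FROM dbo.audit WITH (NOEXPAND)
WHERE dt = CONVERT(date, GETDATE())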
Rather than incrementing the count by 1, you should be incrementing it by the number of rows that have changed, e.g.:
UPDATE Audit
SET [count] = [count] + (SELECT COUNT(*) FROM INSERTED)
WHERE [date] = CONVERT(date, GETDATE())

IF @@ROWCOUNT = 0
    INSERT INTO audit ([date], [count])
    VALUES (GETDATE(), (SELECT COUNT(*) FROM INSERTED))

How to compare two schemas and generate a script that transforms one schema to the other? [duplicate]

When releasing database code to non-development databases, I use this approach: I create a release SQL*Plus script that runs multiple create table/view/sequence/package/etc. statements in sequence. I also have to create a rollback script which performs drops and other statements, should they be needed during deployment or later use. But it is quite annoying to always create the rollback scripts manually. E.g., when I put
alter table table_a add column some_column number(5);
into the release script, I have to put
alter table table_a drop column some_column;
into the rollback script, and vice versa.
Is there a way to optimize (or semi-optimize) this? Maybe there are some Java/Python/etc. libraries that can parse DDL statements into logical parts?
Maybe there are better approaches for releasing/rolling back PL/SQL code?
DBMS_METADATA_DIFF and a few metadata queries can automate this process.
This example demonstrates six types of changes: 1) adding a column, 2) incrementing a sequence, 3) dropping a table, 4) creating a table, 5) changing a view, and 6) allocating an extent.
--1) Add a column.
create table user1.add_column(id number);
create table user2.add_column(id number);
alter table user2.add_column add some_column number(5);

--2) Increment a sequence.
create sequence user1.increment_sequence nocache;
select user1.increment_sequence.nextval from dual;
select user1.increment_sequence.nextval from dual;
create sequence user2.increment_sequence nocache;
select user2.increment_sequence.nextval from dual;

--3) Drop a table (exists only in USER1).
create table user1.drop_table(id number);

--4) Create a table (exists only in USER2).
create table user2.create_table(id number);

--5) Change a view.
create view user1.change_view as select 1 a from dual;
create view user2.change_view as select 2 a from dual;

--6) Allocate an extent.
create table user1.allocate_extent(id number);
create table user2.allocate_extent(id number);
insert into user2.allocate_extent values(1);
rollback;
You are correct that DBMS_METADATA_DIFF does not work for CREATE or DROP. Trying to diff an object that only exists in one schema will generate an error message like this:
ORA-31603: object "EXTRA_TABLE" of type TABLE not found in schema "USER1"
ORA-06512: at "SYS.DBMS_METADATA", line 7944
ORA-06512: at "SYS.DBMS_METADATA_DIFF", line 712
However, dropping and adding objects may be easy to script with the following:
--Dropped objects
select 'DROP '||object_type||' USER1.'||object_name v_sql
from
(
select object_name, object_type from dba_objects where owner = 'USER1'
minus
select object_name, object_type from dba_objects where owner = 'USER2'
);
V_SQL
-----
DROP TABLE USER1.DROPPED_TABLE
--Added objects
select dbms_metadata.get_ddl(object_type, object_name, 'USER2') v_sql
from
(
select object_name, object_type from dba_objects where owner = 'USER2'
minus
select object_name, object_type from dba_objects where owner = 'USER1'
);
V_SQL
-----
CREATE TABLE "USER2"."CREATED_TABLE"
( "ID" NUMBER
) SEGMENT CREATION DEFERRED
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING
TABLESPACE "USERS"
The alters can be handled with a SQL statement like this:
select object_name, object_type, dbms_metadata_diff.compare_alter(
object_type => object_type,
name1 => object_name,
name2 => object_name,
schema1 => 'USER2',
schema2 => 'USER1',
network_link1 => 'MYSELF',
network_link2 => 'MYSELF') difference
from
(
select object_name, object_type from dba_objects where owner = 'USER1'
intersect
select object_name, object_type from dba_objects where owner = 'USER2'
) objects;
OBJECT_NAME OBJECT_TYPE DIFFERENCE
----------- ----------- ----------
ADD_COLUMN TABLE ALTER TABLE "USER2"."ADD_COLUMN" DROP ("SOME_COLUMN")
ALLOCATE_EXTENT TABLE -- ORA-39278: Cannot alter table with segments to segment creation deferred.
CHANGE_VIEW VIEW -- ORA-39308: Cannot alter attribute of view: SUBQUERY
INCREMENT_SEQUENCE SEQUENCE ALTER SEQUENCE "USER2"."INCREMENT_SEQUENCE" RESTART START WITH 3
Some notes about these results:
ADD_COLUMN works as expected.
ALLOCATE_EXTENT is probably a false positive; I doubt you care about deferred segment creation. It is very unlikely to affect your system.
CHANGE_VIEW does not work at all. But as with the previous metadata queries, there should be a relatively easy way to build this script using DBA_VIEWS.
INCREMENT_SEQUENCE works too well. Most of the time an application does not care about the sequence values, but sometimes when things get out of sync you need to change them. This RESTART START WITH syntax can be very helpful: you don't need to drop and re-create the sequence, or mess with the INCREMENT BY multiple times. This syntax is not in the 12c manual; in fact, I cannot find it anywhere on Google. It looks like this package is using undocumented features.
Some other notes:
The package can be very slow sometimes.
If network links on the server are a problem, you will need to run it through a local instance with links to both servers.
There may be false positives; sometimes it returns a row with just a space in it.
It is possible to fully automate this process. But given the issues above, and my experience with all such automated tools, you should not trust it 100%.

How do I create and use a stored procedure for SQL Server which will delete the data in all tables in a database with VS 2010?

I've been using the Entity Framework with ASP.NET MVC and I'm looking for an easy and fast way to drop all of the information in the database. It takes quite a while to delete all of the information from the entities object and then save the changes to the database (probably because there are a lot of many-to-many relationships), and I think it should be really fast to just remove all of the information with a stored procedure, but I'm not sure how to go about this. How do I create and use a stored procedure for SQL Server which will delete the data in all tables in a database with VS 2010? Also, if I do this, will the command be compatible with other versions of SQL Server? (I'm using 2008 on my testing computer, but when I upload it, I'm not sure if my hosting company uses 2008 or 2005.)
Thanks!!
This solution will work well in terms of deleting all your data in your database's tables.
You can create this stored proc right within Visual Studio on your SQL Server 2008 development server. It'll work well in any version of SQL Server (2000+).
CREATE PROC NukeMyDatabase
AS
--order is important here. delete data in FK'd tables first.
DELETE Foo
DELETE Bar
TRUNCATE TABLE Baz
I prefer TRUNCATE TABLE, as it's faster. It'll depend on your data model, as you can't issue a TRUNCATE TABLE on a table referenced by a foreign key constraint (i.e. parent tables).
You could then call this stored proc using Entity Framework after adding it to your .edmx:
myContext.NukeMyDatabase();
I recently faced a similar problem in that I had to clear over 200 tables that were interlinked through many foreign key constraints.
The critical issue, as p.campbell pointed out, is determining the correct order of DELETE statements.
The foreign key constraints between tables essentially represent a hierarchy. If table 3 is dependent on table 2, and table 2 is dependent on table 1, then table 1 is the root and table 3 is the leaf.
In other words, if you're going to delete from these three tables, you have to start with the table that nothing else depends on and work your way up. That is the intent of this code:
DECLARE @sql VARCHAR(MAX)
SET @sql = ''

;WITH c AS
(
    SELECT
        parent_object_id AS org_child,
        parent_object_id,
        referenced_object_id,
        1 AS Depth
    FROM sys.foreign_keys
    UNION ALL
    SELECT
        c.org_child,
        k.parent_object_id,
        k.referenced_object_id,
        Depth + 1
    FROM c
    INNER JOIN sys.foreign_keys k
        ON c.referenced_object_id = k.parent_object_id
    WHERE c.parent_object_id != k.referenced_object_id
),
c2 AS (
    SELECT
        OBJECT_NAME(org_child) AS ObjectName,
        MAX(Depth) AS Depth
    FROM c
    GROUP BY org_child
    UNION ALL
    SELECT
        OBJECT_NAME(object_id),
        0 AS Depth
    FROM sys.objects o
    LEFT OUTER JOIN c
        ON o.object_id = c.org_child
    WHERE c.org_child IS NULL
      AND o.type = 'U'
)
SELECT @sql = @sql + 'DELETE FROM ' + CAST(ObjectName AS VARCHAR(100))
    + ';' + CHAR(13) + CHAR(10) /** for readability in PRINT statement */
FROM c2
ORDER BY Depth DESC

PRINT @sql
/** EXEC (@sql) **/
exec sp_MSForEachTable 'truncate table ?';
But I would recommend a different approach: take a backup of the empty database and simply restore this backup before each run. Even better, have no database at all and have your application be capable of deploying the database itself, using a schema version upgrade set of scripts.
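For the backup/restore approach, a minimal sketch (the database name and backup path are placeholders):

-- One-time: back up the empty database.
BACKUP DATABASE MyAppDb TO DISK = 'C:\Backups\MyAppDb_empty.bak'

-- Before each run (from master): restore it, overwriting the current contents.
ALTER DATABASE MyAppDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE
RESTORE DATABASE MyAppDb FROM DISK = 'C:\Backups\MyAppDb_empty.bak' WITH REPLACE
ALTER DATABASE MyAppDb SET MULTI_USER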

Create SQL job that verifies daily entry of data into a table?

I'm writing my first SQL query to run specifically as a SQL Job, and I'm a little out of my depth. I have a table within a SQL Server 2005 database which is populated each day with data from various buildings. To monitor the system better, I am attempting to write a SQL Job that will run a query (or stored procedure) to verify the following:
- At least one row of data appears each day per building
My question has two main parts:
How can I verify that data exists for each building? While there is a "Building" column, I'm not sure how to verify each one. I need the query/SP to fail unless all locations have reported in. Do I need to create a control table for the query/SP to compare against (as the number of buildings reporting in can change)?
How do I make this query fail so that the SQL Job fails? Do I need to wrap it in some sort of error-handling code?
Table:
Employee RawDate Building
Bob 2010-07-22 06:04:00.000 2
Sally 2010-07-22 01:00:00.000 9
Jane 2010-07-22 06:04:00.000 12
Alex 2010-07-22 05:54:00.000 EA
Vince 2010-07-22 07:59:00.000 30
Note that the building column has at least one non-numeric value. The range of buildings that report in changes over time, so I would prefer to avoid hard-coding of building values (a table that I can update would be fine).
Should I use a cursor or dynamic SQL to run a looping SELECT statement that checks for each building based on a control table listing each currently active building?
Any help would be appreciated.
Edit: spelling
You could create a stored procedure that checks for missing entries. The procedure could call raiserror to make the job fail. For example:
if OBJECT_ID('CheckBuildingEntries') is null
    exec ('create procedure CheckBuildingEntries as select 1')
go
alter procedure CheckBuildingEntries(
    @check_date datetime)
as
declare @missing_buildings int

select @missing_buildings = COUNT(*)
from Buildings as b
left join YourTable as yt
    on yt.Building = b.name
    and dateadd(dd,0, datediff(dd,0,yt.RawDate)) =
        dateadd(dd,0, datediff(dd,0,@check_date))
where yt.Building is null

if @missing_buildings > 0
begin
    raiserror('OMG!', 16, 0)
end
go
An example scheduled job running at 4AM to check yesterday's entries:
declare @yesterday datetime
set @yesterday = dateadd(day, -1, GETDATE())
exec CheckBuildingEntries @yesterday
If an entry was missing, the job would fail. You could set it up to send you an email.
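For the email, one option is Database Mail (available in SQL Server 2005), called from the job's failure path or from the procedure itself; the profile and recipient below are placeholders:

EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'MyMailProfile',
    @recipients   = 'dba@example.com',
    @subject      = 'Missing building entries',
    @body         = 'At least one building did not report in yesterday.'

Alternatively, attach an operator notification to the SQL Agent job's failure action.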
Test tables:
create table Buildings (id int identity, name varchar(50))
create table YourTable (Employee varchar(50), RawDate datetime,
Building varchar(50))
insert into Buildings (name)
select '2'
union all select '9'
union all select '12'
union all select 'EA'
union all select '30'
insert into YourTable (Employee, RawDate, Building)
select 'Bob', '2010-07-22 06:04:00.000', '2'
union all select 'Sally', '2010-07-22 01:00:00.000', '9'
union all select 'Jane', '2010-07-22 06:04:00.000', '12'
union all select 'Alex', '2010-07-22 05:54:00.000', 'EA'
union all select 'Vince', '2010-07-22 07:59:00.000', '30'
Recommendations:
- Do use a control table for the buildings - you may find that one already exists, if you use the Object Explorer in SQL Server Management Studio.
- Don't use a cursor or dynamic SQL to run a loop - use set-based commands instead, possibly something like the following:
SELECT BCT.Building, COUNT(YDT.Building) Build
FROM dbo.BuildingControlTable BCT
LEFT JOIN dbo.YourDataTable YDT
ON BCT.Building = YDT.Building AND
CAST(FLOOR( CAST( GETDATE() AS FLOAT ) - 1 ) AS DATETIME ) =
CAST(FLOOR( CAST( YDT.RawDate AS FLOAT ) ) AS DATETIME )
GROUP BY BCT.Building
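To surface only the problem rows, the same query can be filtered with a HAVING clause; any result then indicates a building that failed to report yesterday:

SELECT BCT.Building
FROM dbo.BuildingControlTable BCT
LEFT JOIN dbo.YourDataTable YDT
    ON BCT.Building = YDT.Building AND
       CAST(FLOOR( CAST( GETDATE() AS FLOAT ) - 1 ) AS DATETIME ) =
       CAST(FLOOR( CAST( YDT.RawDate AS FLOAT ) ) AS DATETIME )
GROUP BY BCT.Building
HAVING COUNT(YDT.Building) = 0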
