Creating a history table without using triggers - sql-server

I have a table A with 3000 records and 25 columns. I want a history table, called Table A history, that holds all the changes (updates and deletes) so that I can look them up on any day. I usually use cursors. I have now thought of using triggers, which I was not asked to do. Do you have any other suggestions? Many thanks!

If you're using T-SQL / SQL Server and you can't use triggers, which are the only sure way to capture every change, you could use a stored procedure that is scheduled in a job to run every x amount of time. The stored procedure would use a MERGE statement over the two tables to pick up new records or changes. I would not suggest this if you need every single change without question, because any intermediate changes made between two runs are lost.
CREATE TABLE dbo.TableA (id INT, Column1 nvarchar(30))
CREATE TABLE dbo.TableA_History (id INT, Column1 nvarchar(30), TimeStamp DateTime)
(this code isn't production, just the general idea)
Put the following code inside a stored procedure and run it from a SQL Server Agent job with a schedule on it.
MERGE INTO dbo.TableA_History
USING dbo.TableA
ON TableA_History.id = TableA.id AND TableA_History.Column1 = TableA.Column1
WHEN NOT MATCHED BY TARGET THEN
INSERT (id,Column1,TimeStamp) VALUES (TableA.id,TableA.Column1,GETDATE());
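So basically, if the record either doesn't exist in the history table or doesn't match (meaning a column changed), the record is inserted into the history table. Note that rows deleted from TableA are not recorded by this statement.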

It is possible to create history without triggers in some cases, even if you are not using SQL Server 2016 and system-versioned tables are not available.
In some cases, when you can identify for sure which routines are modifying your table, you can create history using OUTPUT INTO clause.
For example,
INSERT INTO [dbo].[MainTable]
OUTPUT inserted.[]
,...
,'I'
,GETUTCDATE()
,@CurrentUserID
INTO [dbo].[HistoryTable]
SELECT *
FROM ... ;
In routines, when you are using MERGE I like that we can use $action:
Is available only for the MERGE statement. Specifies a column of type
nvarchar(10) in the OUTPUT clause in a MERGE statement that returns
one of three values for each row: 'INSERT', 'UPDATE', or 'DELETE',
according to the action that was performed on that row.
It's very handy that we can add the user who is modifying the table. With triggers you need to use session context or a session variable to pass the user. With a system-versioned table you need to add an additional column to the main table in order to log the user, as versioning only logs the columns of the current table (at least for now).
So, basically, it depends on your data and application. If you have many sources of CRUD operations on the table, a trigger is the most secure way. If your table is very big and heavily used, MERGE is not a good choice, as it may cause blocking and harm performance.
In our databases we are using all of the methods depending on the situation:
triggers for legacy
system-versioning for new development
direct OUTPUT in the history, when sure that data is modified only by given set of routines
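As a rough illustration of the OUTPUT approach combined with $action, here is a minimal sketch. The tables dbo.MainTable, dbo.StagingTable and dbo.HistoryTable, their columns, and the @CurrentUserID value are all hypothetical and only serve the example; it assumes every modification of the main table goes through this routine.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.MainTable    (Id INT PRIMARY KEY, Column1 NVARCHAR(30));
CREATE TABLE dbo.StagingTable (Id INT PRIMARY KEY, Column1 NVARCHAR(30));
CREATE TABLE dbo.HistoryTable
(
    ActionType   NVARCHAR(10),   -- 'INSERT', 'UPDATE' or 'DELETE' from $action
    Id           INT,
    OldColumn1   NVARCHAR(30),   -- NULL for inserts
    NewColumn1   NVARCHAR(30),   -- NULL for deletes
    ChangedAtUtc DATETIME,
    ChangedBy    INT
);
GO

-- Assumed to be supplied by the calling routine.
DECLARE @CurrentUserID INT = 42;

MERGE dbo.MainTable AS tgt
USING dbo.StagingTable AS src
    ON tgt.Id = src.Id
-- NULL-safe comparison is omitted here to keep the sketch short.
WHEN MATCHED AND tgt.Column1 <> src.Column1 THEN
    UPDATE SET Column1 = src.Column1
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Column1) VALUES (src.Id, src.Column1)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- One history row per affected row, written in the same statement.
OUTPUT $action,
       COALESCE(inserted.Id, deleted.Id),
       deleted.Column1,
       inserted.Column1,
       GETUTCDATE(),
       @CurrentUserID
INTO dbo.HistoryTable (ActionType, Id, OldColumn1, NewColumn1, ChangedAtUtc, ChangedBy);
```

Because the history rows are written by the same statement that changes the data, nothing is missed as long as no other code path touches dbo.MainTable.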

Related

Is there any way to update the modified date time automatically in SQL Server?

Is there any way to update the modified date time automatically in SQL Server?
I do not want to use triggers. I also want to avoid providing the value from the application when calling the SQL query.
Is there any support for this in SQL or in Dapper, etc.?
If you want to keep track of the changes in the database, you can use a feature called
System-Versioned Temporal Table, as explained here.
Using a temporal table, you will be able to query the current state of a row as usual, in addition to being able to query the full history of that row.
It's very handy if you are interested in keeping a history of data changes.
I was able to solve the problem using a temporal table. I am not sure whether this is an elegant solution. Here is how I solved it.
Create Table:
CREATE TABLE extable4 (PriKey int PRIMARY KEY, ColValue varchar(200)
, [ModifiedDateTime] datetime2 (2) GENERATED ALWAYS AS ROW START
, [ModifiedExpiryDateTime] datetime2 (2) GENERATED ALWAYS AS ROW END HIDDEN
, PERIOD FOR SYSTEM_TIME (ModifiedDateTime,[ModifiedExpiryDateTime])
) ;
Insert a record without providing a value for ModifiedDateTime.
insert into extable4(PriKey,ColValue) values(1,'Ver 1');
ModifiedDateTime is populated with the system time.
update extable4 set ColValue='Ver 1.1' where PriKey=1;
ModifiedDateTime updated now. :)
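The table above declares the period columns but does not turn system versioning on, so only the current row is kept. If the full history is also wanted, a sketch along the following lines may help. It assumes SQL Server 2016 or later, and the table and column names are made up for the example.

```sql
-- Hypothetical names; requires SQL Server 2016 or later.
CREATE TABLE dbo.ExTableVersioned
(
    PriKey    INT NOT NULL PRIMARY KEY CLUSTERED,
    ColValue  VARCHAR(200),
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ExTableVersioned_History));

INSERT INTO dbo.ExTableVersioned (PriKey, ColValue) VALUES (1, 'Ver 1');
UPDATE dbo.ExTableVersioned SET ColValue = 'Ver 1.1' WHERE PriKey = 1;

-- Current row plus every earlier version of it.
SELECT PriKey, ColValue, ValidFrom, ValidTo
FROM dbo.ExTableVersioned FOR SYSTEM_TIME ALL
WHERE PriKey = 1
ORDER BY ValidFrom;
```

With versioning on, the engine moves the old row version into the history table on every UPDATE and DELETE, so no trigger or application code is involved.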

Trigger to log inserted/updated/deleted values SQL Server 2012

I'm using SQL Server 2012 Express and since I'm really used to PL/SQL it's a little hard to find some answers to my T-SQL questions.
What I have: about 7 tables with distinct columns and an additional one for logging inserted/updated/deleted values from the other 7.
Question: how can I create one trigger per table so that it stores the modified data in the Log table, considering I can't use Change Data Capture because I'm using the SQL Server Express edition?
Additional info: there are only two columns in the Logs table that I need help filling; they should hold the altered data from all the columns merged together, as in the example below:
CREATE TABLE USER_DATA
(
ID INT IDENTITY(1,1) NOT NULL,
NAME NVARCHAR(25) NOT NULL,
PROFILE INT NOT NULL,
DATE_ADDED DATETIME2 NOT NULL
)
GO
CREATE TABLE AUDIT_LOG
(
ID INT IDENTITY(1,1) NOT NULL,
USER_ALTZ NVARCHAR(30) NOT NULL,
MACHINE SYSNAME NOT NULL,
DATE_ALTERERED DATETIME2 NOT NULL,
DATA_INSERTED XML,
DATA_DELETED XML
)
GO
The columns I need help filling are the last two (DATA_INSERTED and DATA_DELETED). I'm not even sure if the data type should be XML, but when someone either
INSERTS or UPDATES (new values only), all data inserted/updated in all the columns of USER_DATA should be merged somehow into DATA_INSERTED.
DELETES or UPDATES (old values only), all data deleted/updated in all the columns of USER_DATA should be merged somehow into DATA_DELETED.
Is it possible?
Use the inserted and deleted Tables
DML trigger statements use two special tables: the deleted table and
the inserted tables. SQL Server automatically creates and manages
these tables. You can use these temporary, memory-resident tables to
test the effects of certain data modifications and to set conditions
for DML trigger actions. You cannot directly modify the data in the
tables or perform data definition language (DDL) operations on the
tables, such as CREATE INDEX. In DML triggers, the inserted and
deleted tables are primarily used to perform the following: Extend
referential integrity between tables. Insert or update data in base
tables underlying a view. Test for errors and take action based on the
error. Find the difference between the state of a table before and
after a data modification and take actions based on that difference.
And
OUTPUT Clause (Transact-SQL)
Returns information from, or expressions based on, each row affected
by an INSERT, UPDATE, DELETE, or MERGE statement. These results can be
returned to the processing application for use in such things as
confirmation messages, archiving, and other such application
requirements. The results can also be inserted into a table or table
variable. Additionally, you can capture the results of an OUTPUT
clause in a nested INSERT, UPDATE, DELETE, or MERGE statement, and
insert those results into a target table or view.
Just posting because this is what solved my problem. In the comments to my post, user @SeanLange told me to use an "audit", which I didn't know existed.
Googling it led me to this Stack Overflow answer, where the first link is a procedure that creates triggers and "shadow" tables that do more or less what I needed (it doesn't merge all values into one column, but it does the job).
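For the original requirement (all changed values merged into the two XML columns), a trigger built on the inserted and deleted tables could look roughly like the sketch below. It assumes the USER_DATA and AUDIT_LOG tables from the question exist in the dbo schema; the trigger name and the XML element names are made up for the example.

```sql
-- Sketch only: one trigger like this would be needed per audited table.
CREATE TRIGGER dbo.TR_USER_DATA_AUDIT
ON dbo.USER_DATA
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Nothing to log when the statement affected no rows.
    IF NOT EXISTS (SELECT 1 FROM inserted)
       AND NOT EXISTS (SELECT 1 FROM deleted)
        RETURN;

    -- One audit row per statement; each XML value holds all affected rows.
    INSERT INTO dbo.AUDIT_LOG
        (USER_ALTZ, MACHINE, DATE_ALTERERED, DATA_INSERTED, DATA_DELETED)
    SELECT
        LEFT(SUSER_SNAME(), 30),   -- USER_ALTZ is only NVARCHAR(30)
        HOST_NAME(),
        SYSDATETIME(),
        -- New values: filled on INSERT and UPDATE, NULL on DELETE.
        (SELECT * FROM inserted FOR XML RAW('row'), ROOT('inserted'), TYPE),
        -- Old values: filled on UPDATE and DELETE, NULL on INSERT.
        (SELECT * FROM deleted FOR XML RAW('row'), ROOT('deleted'), TYPE);
END;
```

Storing the rows as XML keeps AUDIT_LOG independent of the shape of each audited table, at the cost of making the logged values harder to query than dedicated shadow tables.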

Use one SQL Server trigger that listens on updates of multiple tables

I have a Microsoft SQL Server 2012 database with multiple tables.
All tables contain the same two columns, DataRowModified (type datetime) and DataRowLastAuthor (type nvarchar(MAX)). And no, I can't move those columns into a separate table; it's a requirement that each table directly contains those columns.
I wrote the trigger below for the table Events to automatically update the values of those two columns whenever a row gets updated:
CREATE TRIGGER [dbo].[Trigger_Events_UpdateMetadata]
ON [dbo].[Events]
FOR UPDATE
AS
BEGIN
UPDATE [dbo].[Events]
SET [DataRowModified] = GETDATE(),
[DataRowLastAuthor] = ORIGINAL_LOGIN()
WHERE [Id] IN (SELECT [Id] FROM INSERTED)
END
Now my question is whether I have to copy (and rename) this trigger for every table I have to use it with, or can I somehow write a global trigger that works on all (or a specified set of) tables? It has to know in which table/row the update happened though, because it has to modify it.
What would be the easiest way to implement an automatically maintained LastAuthor and LastModificationDate column into many tables as described?
A trigger in SQL Server is always bound to a single table - you cannot have "global" triggers or triggers attached to multiple tables at once.
If you need a trigger on your 50 tables, you need to write 50 triggers, one for each table. There is no way around this.
The only way to avoid this would be to update those columns in your database layer of your application, so that those values would already be present when you save your row of data. Things like Entity Framework allow such "bulk operations" on multiple entities to e.g. update a last modified date and last user to modify the entity.
No, but multiple triggers could invoke the same stored procedure.
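Since one trigger per table is unavoidable, the repetitive part can at least be generated. The following is a sketch under stated assumptions: every target table has columns named Id, DataRowModified and DataRowLastAuthor, and the trigger naming pattern follows the one in the question. It is not a tested deployment script.

```sql
DECLARE @schema SYSNAME, @table SYSNAME, @trigger SYSNAME, @sql NVARCHAR(MAX);

-- Every user table that has all three expected columns.
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.name, t.name
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE (SELECT COUNT(*)
           FROM sys.columns AS c
           WHERE c.object_id = t.object_id
             AND c.name IN (N'Id', N'DataRowModified', N'DataRowLastAuthor')) = 3;

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @schema, @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @trigger = N'Trigger_' + @table + N'_UpdateMetadata';

    -- Skip tables that already have a trigger with this name.
    IF OBJECT_ID(QUOTENAME(@schema) + N'.' + QUOTENAME(@trigger), N'TR') IS NULL
    BEGIN
        SET @sql = N'CREATE TRIGGER ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@trigger) + N'
ON ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N'
FOR UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE t
    SET DataRowModified = GETDATE(),
        DataRowLastAuthor = ORIGINAL_LOGIN()
    FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N' AS t
    WHERE t.Id IN (SELECT Id FROM inserted);
END;';

        -- CREATE TRIGGER must be the first statement in its batch,
        -- which the dynamic SQL call guarantees.
        EXEC sys.sp_executesql @sql;
    END;

    FETCH NEXT FROM table_cursor INTO @schema, @table;
END;

CLOSE table_cursor;
DEALLOCATE table_cursor;
```

The generated triggers are still separate objects, one per table; the script only removes the copy-and-rename work.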

Stored procedure to generate a unique id column

Good day
I have a situation where two users are saving data to the same database and there are primary key conflicts.
Is it possible to write a stored procedure or trigger which will generate a unique identity by adding two columns.
For instance: I have table2 related to table1 by Table1ID. Increment and seed is 1 for both.
If I had to add a row to table2 I would like the autogenerated ID number to be added to a text column thereby making it unique. So the ID would be something like JoeSoap5.
If you want to generate something unique, you can use the built-in function NEWID(). Type and execute the following code:
SELECT NEWID()
If you need to insert a record into the second table whenever a record is inserted into the first, you can implement this with triggers. In SQL Server you can use an AFTER INSERT trigger or an INSTEAD OF INSERT trigger (there is no BEFORE INSERT trigger). In general, this is a piece of code that is executed after, or in place of, the insert into your first table.
You don't specify your SQL Server version.
SQL 2012 introduces the concept of a sequence - http://msdn.microsoft.com/en-us/library/ff878091.aspx - which would allow you to do just what you want.
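A minimal sketch of the sequence idea, assuming SQL Server 2012 or later. The sequence, table and column names are hypothetical, and the computed column is just one way to build a value shaped like JoeSoap5.

```sql
-- Hypothetical names; requires SQL Server 2012 or later.
CREATE SEQUENCE dbo.Table2Seq AS INT START WITH 1 INCREMENT BY 1;
GO

CREATE TABLE dbo.Table2
(
    -- The number comes from the sequence instead of an IDENTITY column.
    Table2ID   INT NOT NULL
               CONSTRAINT DF_Table2_Table2ID DEFAULT (NEXT VALUE FOR dbo.Table2Seq)
               CONSTRAINT PK_Table2 PRIMARY KEY,
    UserName   NVARCHAR(50) NOT NULL,
    Table1ID   INT NOT NULL,
    -- Text key made of the user name followed by the generated number.
    DisplayKey AS (UserName + CAST(Table2ID AS NVARCHAR(11)))
);

INSERT INTO dbo.Table2 (UserName, Table1ID) VALUES (N'JoeSoap', 1);

SELECT Table2ID, DisplayKey FROM dbo.Table2;
```

A sequence is a standalone object, so the same one can feed several tables, which an IDENTITY column cannot do.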

INSERT INTO vs SELECT INTO

What is the difference between using
SELECT ... INTO MyTable FROM...
and
INSERT INTO MyTable (...)
SELECT ... FROM ....
?
From BOL [ INSERT, SELECT...INTO ], I know that using SELECT...INTO will create the insertion table on the default file group if it doesn't already exist, and that the logging for this statement depends on the recovery model of the database.
Which statement is preferable?
Are there other performance implications?
What is a good use case for SELECT...INTO over INSERT INTO ...?
Edit: I already stated that I know that SELECT INTO... creates a table where one doesn't exist. What I want to know is this: SQL includes this statement for a reason, so what is it? Is it doing something different behind the scenes when inserting rows, or is it just syntactic sugar on top of a CREATE TABLE and INSERT INTO?
They do different things. Use INSERT when the table exists. Use SELECT INTO when it does not.
Yes. INSERT with no table hints is normally fully logged. SELECT INTO is minimally logged when the database uses the simple or bulk-logged recovery model.
In my experience SELECT INTO is most commonly used with intermediate data sets, like #temp tables, or to copy out an entire table like for a backup. INSERT INTO is used when you insert into an existing table with a known structure.
EDIT
To address your edit, they do different things. If you are making a table and want to define the structure use CREATE TABLE and INSERT. Example of an issue that can be created: You have a small table with a varchar field. The largest string in your table now is 12 bytes. Your real data set will need up to 200 bytes. If you do SELECT INTO from your small table to make a new one, the later INSERT will fail with a truncation error because your fields are too small.
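One way the truncation problem can be reproduced is sketched below. It assumes the new table's column width is inferred from a narrow source, here a 12-character literal; the temp table names are made up for the example.

```sql
-- SELECT INTO infers the column type from the source expression:
-- a 12-character literal produces a varchar(12) column.
SELECT 'twelve chars' AS Name
INTO #InferredCopy;

-- This later insert fails with a truncation error,
-- because the value is longer than 12 characters.
INSERT INTO #InferredCopy (Name)
VALUES ('a considerably longer string than the column allows');

-- Declaring the structure up front avoids the surprise.
CREATE TABLE #DeclaredCopy (Name VARCHAR(200));

INSERT INTO #DeclaredCopy (Name)
VALUES ('twelve chars'),
       ('a considerably longer string than the column allows');
```

When the source is a real column, SELECT INTO copies that column's declared type, so the new table is only as wide as the table it was copied from.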
Which statement is preferable? Depends on what you are doing.
Are there other performance implications? If the table is a permanent table, you can create indexes at the time of table creation, which has implications for performance both negatively and positively. SELECT INTO does not recreate the indexes that exist on the source tables, and thus subsequent use of the new table may be slower than it needs to be.
What is a good use case for SELECT...INTO over INSERT INTO ...? SELECT INTO is used if you may not know the table structure in advance. It is faster to write than a CREATE TABLE and an INSERT statement, so it is used to speed up development at times. It is often faster to use when you are creating a quick temp table to test things or a backup table of a specific query (maybe records you are going to delete). It should be rare to see it used in production code that will run multiple times (except for temp tables), because it will fail if the table already exists.
It is sometimes used inappropriately by people who don't know what they are doing, and they can cause havoc in the database as a result. I strongly feel it is inappropriate to use SELECT INTO for anything other than a throwaway table (a temporary backup, a temp table that will go away at the end of the stored procedure, etc.). Permanent tables need real thought as to their design, and SELECT INTO makes it easy to avoid thinking about anything, even something as basic as which columns and which data types.
In general, I prefer the use of CREATE TABLE and an INSERT statement: you have more control and it is better for repeatable processes. Further, if the table is a permanent table, it should be created from a separate CREATE TABLE script (one that is in source control), as permanent objects should not, in general, be created in code that inserts, deletes, updates or selects from a table. Object changes should be handled separately from data changes because objects have implications beyond the needs of a specific insert/update/select/delete. You need to consider the best data types, think about FK constraints, PKs and other constraints, consider auditing requirements, think about indexing, etc.
Each statement has a distinct use case. They are not interchangeable.
SELECT...INTO MyTable... creates a new MyTable where one did not exist before.
INSERT INTO MyTable...SELECT... is used when MyTable already exists.
The primary difference is that SELECT INTO MyTable will create a new table called MyTable with the results, while INSERT INTO requires that MyTable already exists.
You would use SELECT INTO only in the case where the table didn't exist and you wanted to create it based on the results of your query. As such, these two statements really are not comparable. They do very different things.
In general, SELECT INTO is used more often for one off tasks, while INSERT INTO is used regularly to add rows to tables.
EDIT:
While you can use CREATE TABLE and INSERT INTO to accomplish what SELECT INTO does, with SELECT INTO you do not have to know the table definition beforehand. SELECT INTO is probably included in SQL because it makes tasks like ad hoc reporting or copying tables much easier.
Actually, SELECT ... INTO not only creates the table but will fail if it already exists, so basically the only time you would use it is when the table you are inserting into does not exist.
In regards to your EDIT:
I personally mainly use SELECT ... INTO when I am creating a temp table. That to me is the main use. However I also use it when creating new tables with many columns with similar structures to other tables and then edit it in order to save time.
I only want to cover the second point of the question, the one related to performance, because nobody else has covered it. SELECT INTO is a lot faster than INSERT INTO when it comes to tables with large datasets. I prefer SELECT INTO when I have to read a very large table. INSERT INTO for a table with 10 million rows may take hours, while SELECT INTO will do this in minutes. As far as losing indexes on the new table is concerned, you can recreate the indexes with a query and still save a lot of time compared to INSERT INTO.
SELECT INTO is typically used to generate temp tables or to copy another table (data and/or structure).
In day to day code you use INSERT because your tables should already exist to be read, UPDATEd, DELETEd, JOINed etc. Note: the INTO keyword is optional with INSERT
That is, applications won't normally create and drop tables as part of normal operations unless it is a temporary table for some scope limited and specific usage.
A table created by SELECT INTO will have no keys or indexes or constraints unlike a real, persisted, already existing table
The 2 aren't directly comparable because they have almost no overlap in usage
SELECT INTO creates a new table for you and then inserts records into it from the source table. The newly created table has the same structure as the source table. If you try to use SELECT INTO for an existing table, it will produce an error, because it will try to create a new table with the same name.
INSERT INTO requires the table to exist in your database before you insert rows into it.
The simple difference between SELECT INTO and INSERT INTO is:
--> SELECT INTO doesn't need an existing table. If you want to copy the data of table A, you just type SELECT * INTO [tablename] FROM A. Here, tablename must be a new table; it will be created with the same structure as table A.
--> INSERT INTO does need an existing table: INSERT INTO [tablename] SELECT * FROM A;
Here tablename is an existing table.
SELECT INTO is usually more popular for copying data, especially backup data.
You can use either as per your requirement; it is entirely the developer's choice which one to use in a given scenario.
Performance-wise, INSERT INTO is fast.
References :
https://www.w3schools.com/sql/sql_insert_into_select.asp
https://www.w3schools.com/sql/sql_select_into.asp
The other answers are all great/correct (the main difference is whether the DestTable exists already (INSERT), or doesn't exist yet (SELECT ... INTO))
You may prefer to use INSERT (instead of SELECT ... INTO) if you want to be able to COUNT(*) the rows that have been inserted so far.
Using SELECT COUNT(*) ... WITH (NOLOCK) is a simple, crude technique that may help you check the "progress" of the INSERT; this is helpful if it's a long-running insert (as seen in this answer).
[If you use...]
INSERT DestTable SELECT ... FROM SrcTable
...then your SELECT COUNT(*) from DestTable WITH (NOLOCK) query would work.
SELECT INTO for large datasets may be good only for a single user using a single connection to the database to do a bulk operation. I do not recommend using
SELECT * INTO table
as this creates one big transaction and takes a schema lock to create the object, preventing other users from creating objects or accessing system objects until the SELECT INTO operation completes.
As a proof of concept, open 2 sessions. In the first session, try to
select into temp table from a huge table
and in the second session try to
create a temp table
and check the locks, the blocking and how long the second session takes to create the temp table object. My recommendation: it is always good practice to use CREATE TABLE and an INSERT statement, and if minimal logging is needed, to use trace flag 610.

Resources