When a record is updated in a SQL Server table, how does the db engine physically execute such a request: is it INSERT + DELETE or UPDATE operation?
As we know, the performance of a database and any statements depends on many variables. But I would like to know if some things can be generalized.
Is there a threshold (table size, query length, # records affected...) after which the database switches to one approach or the other upon UPDATEs?
If there are times when SQL Server physically performs an insert/delete when a logical update is requested, is there a system view or metric that would show this? I.e., if there is a running total of all the inserts, updates and deletes the database engine has performed since it was started, then I could figure out how the database behaved after I issue a single UPDATE.
Is there any difference in the UPDATE statement's behavior depending on the SQL Server version (2008, 2012, ...)?
Many thanks.
Peter
An UPDATE on a base table without triggers is always a physical UPDATE. SQL Server has no such threshold. You can look up usage statistics in, for example, sys.dm_db_index_usage_stats.
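A quick way to look at those counters is the sketch below; note that the DMV is reset on instance restart and counts statements per index rather than physical row operations, so it is only a rough proxy:

    -- Per-index counts of seeks, scans and updates recorded since the last restart
    SELECT OBJECT_NAME(s.object_id) AS table_name,
           i.name                   AS index_name,
           s.user_updates
    FROM sys.dm_db_index_usage_stats AS s
    JOIN sys.indexes AS i
        ON i.object_id = s.object_id
       AND i.index_id  = s.index_id
    WHERE s.database_id = DB_ID();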
Update edits the existing row. If it was insert/delete, then you'd get update failures for duplicate keys.
Insert, update and delete can also each be permissioned separately. So a user could be allowed to update records but not to insert or delete them, which also points to UPDATE not being implemented as a delete plus an insert.
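For illustration, a sketch of those independent permissions (the table and user names are made up):

    -- UPDATE can be granted while INSERT and DELETE are denied
    GRANT UPDATE ON dbo.Customers TO some_app_user;
    DENY  INSERT, DELETE ON dbo.Customers TO some_app_user;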
I need some light here. I am working with SQL Server 2008.
I have a database for my application. Each table has a trigger that stores all changes in another database (on the same server), in one single table, 'tbSysMasterLog'. Yes, the application's log is stored in another database.
The problem is that before any insert/update/delete command on the application database, a transaction is started, and therefore the table in the log database is locked until the transaction is committed or rolled back. So anyone else who tries to write to any other table of the application will be blocked.
So...is there any way possible to disable transactions on a particular database or on a particular table?
You cannot turn off the transaction log; everything gets logged. You can set the recovery model to "Simple", which will limit the amount of log data kept after the records are committed.
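For reference, that setting is changed like this (the database name is a placeholder):

    -- Switch the recovery model to Simple
    ALTER DATABASE MyAppDb SET RECOVERY SIMPLE;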
" the table of the log database is locked": why that?
Normally you log changes by inserting records. The insert of records should not lock the complete table, normally there should not be any contention in insertion.
If you do more than inserts, perhaps you should consider changing that. Perhaps you should look at the indices defined on log, perhaps you can avoid some of them.
It sounds from the question that you have an explicit BEGIN TRANSACTION at the start, and that you are logging to the other database prior to the COMMIT TRANSACTION.
Normally you do not need explicit transactions in SQL Server.
If you do need explicit transactions, you could put the data to be logged into variables, commit the transaction, and then insert it into your log table.
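A minimal sketch of that pattern, assuming a hypothetical dbo.tbAccounts table in the application database (only tbSysMasterLog comes from the question; the other names and columns are made up):

    DECLARE @ToLog TABLE (Id int, OldBalance money, NewBalance money);

    BEGIN TRANSACTION;
        UPDATE dbo.tbAccounts
        SET Balance = Balance - 10
        OUTPUT inserted.Id, deleted.Balance, inserted.Balance
            INTO @ToLog (Id, OldBalance, NewBalance)
        WHERE Id = 42;
    COMMIT TRANSACTION;

    -- Log after the commit, so tbSysMasterLog is not held locked
    -- for the duration of the application transaction
    INSERT INTO LogDb.dbo.tbSysMasterLog (RecordId, OldValue, NewValue)
    SELECT Id, OldBalance, NewBalance FROM @ToLog;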
Normally inserts are fast and can happen in parallel without locking. There are certain things, like identity columns, that require ordering, but they are very lightweight structures and can be avoided by generating GUIDs so that inserts are non-blocking; for something like your log table, though, a primary key identity column gives you a clear sequence that is probably helpful in working out the order of events.
Obviously, if you log after the transaction, the log entries may not be in the same order as the transactions occurred, due to the different times that transactions take to commit.
We normally log into individual tables with a name similar to the master table, e.g. FooHistory or AuditFoo.
There are other options. A very lightweight method is to use a trace, which is what is used for performance tuning: it gives you a copy of every statement run on the database (including triggers), and you can log it to a different database server. It is a good idea to log to a different server if you are tracing a heavily used server, since the volume of data is massive if you are tracing, say, 1,000 simultaneous sessions.
https://learn.microsoft.com/en-us/sql/tools/sql-server-profiler/save-trace-results-to-a-table-sql-server-profiler?view=sql-server-ver15
You can also trace to a file and then load it into a table (better performance), and script up starting, stopping and loading traces.
The load on the server that receives the trace log is minimal, and I have never had a locking problem on a server receiving a trace, so I am pretty sure that you are doing something to cause the locks.
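Loading a trace file into a table can be as simple as the following sketch (the file path and target table are placeholders):

    -- Import a saved trace file into a table for analysis
    SELECT *
    INTO dbo.TraceResults
    FROM sys.fn_trace_gettable(N'D:\Traces\MyTrace.trc', DEFAULT);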
I have a system in place where a SQL Server 2008 R2 database is being mirrored to a second server. The database on the first server has change tracking turned on, and on the second server we create a database snapshot of the mirror database in order to pull data for an ETL system with the change tracking functions. Both databases have an isolation level of Read Committed.
Occasionally, when we pull the change tracking information from the database snapshot using the CHANGETABLE function, there is a row marked as an insert or update ('I' or 'U'), but the row is not in the base table; it has been deleted. When I check the first server, the CHANGETABLE function shows only a 'D' row for the same primary key. It appears that the snapshot was taken right at the time the delete happened on the base table, but before the information was tracked by change tracking.
My questions are:
Is the mechanism for capturing change tracking in the change tracking internal tables asynchronous?
EDIT:
The mechanism for change tracking is synchronous and a part of the transaction making the change to the table. I found that information here:
http://technet.microsoft.com/en-us/magazine/2008.11.sql.aspx
So that leaves me with these questions.
Why am I seeing the behavior I outlined above? If change tracking is synchronous why would I get inserts/updates from change tracking that don't have a corresponding row in the table?
Would changing the isolation level to snapshot isolation help alleviate the above situation?
EDIT:
After reading more about it, snapshot isolation is recommended so that you can wrap your call to CHANGE_TRACKING_CURRENT_VERSION() and the call to CHANGETABLE(CHANGES ...) in one transaction, so that everything stays consistent. The database snapshot should be doing that for me, since it is as of a point in time. I pass the value from CHANGE_TRACKING_CURRENT_VERSION() into the CHANGETABLE(CHANGES ...) function.
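For reference, the recommended pattern looks roughly like this sketch (the table name, its Id key and the stored last-sync version are assumptions; snapshot isolation must already be enabled on the database with ALLOW_SNAPSHOT_ISOLATION ON):

    DECLARE @last_sync_version bigint = 0;   -- version saved from the previous ETL run (assumed)

    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRANSACTION;
        -- Read the current version and the changes inside the same transaction
        DECLARE @current_version bigint = CHANGE_TRACKING_CURRENT_VERSION();

        SELECT ct.SYS_CHANGE_OPERATION, ct.Id
        FROM CHANGETABLE(CHANGES dbo.MyTrackedTable, @last_sync_version) AS ct;
    COMMIT TRANSACTION;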
Let me know if you need any more information!
Thanks
Chris
I am building a VoIP switch and I am going to be doing an insert using a SQL stored procedure.
I need to update the user table "balance field" each time I update the history table. Due to it being a switch I can have hundreds of updates each second.
I wanted to know the best way to update the field without deadlocks and without incorrect data.
I will be using MS SQL Server 2012.
Partition the user table into evenly sized partitions; SQL Server 2012 allows up to 15,000 of them. That way the updates are distributed over many allocation units instead of just one. Then add the WITH (ROWLOCK) hint to the update query.
To kick off the actual update you could use a trigger.
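To sketch that idea (all object names below are assumptions; only the notion of a balance field and a history table comes from the question):

    -- Trigger on the (assumed) history table that keeps the balance in sync
    CREATE TRIGGER trg_CallHistory_Balance
    ON dbo.CallHistory
    AFTER INSERT
    AS
    BEGIN
        SET NOCOUNT ON;

        UPDATE u
        SET u.Balance = u.Balance - i.TotalCost
        FROM dbo.Users AS u WITH (ROWLOCK)
        JOIN (SELECT UserId, SUM(Cost) AS TotalCost
              FROM inserted
              GROUP BY UserId) AS i
          ON i.UserId = u.UserId;
    END;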
I have a big table for storing articles (more than 500 million records), so I use the distributed partitioned view feature of SQL Server 2008 across 3 servers.
Select and insert operations work fine, but delete and update actions take a long time and never complete.
In the Processes tab of Activity Monitor, I see the Wait Type field is "PREEMPTIVE_OLEDBOPS" for the update command.
Any idea what the problem is?
Note: I think the problem is with MSDTC, because the update command is not shown in SQL Profiler on the second server, but when I check the MSDTC status on that server, the status column is Update (active).
What is most likely happening is that all the data from the other server is pulled over to the machine where the query is running before the filter of your update statement is applied. This can happen when you use 4-part naming. Possible solutions are:
Make sure each member table has a correct check constraint which defines the minimum and maximum value of its partition (see the sketch after this list). Without this, partition elimination is not going to work properly.
Call a stored procedure with 4-part naming on the other server to do the update.
Use OPENQUERY() to connect to the other server.
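For the first point, a hedged sketch of such a constraint on one member table (the table name, column and range boundaries are assumptions):

    -- Range constraint that allows the optimizer to eliminate this member table
    ALTER TABLE dbo.Articles_Part1 WITH CHECK
    ADD CONSTRAINT CK_Articles_Part1_Range
    CHECK (ArticleID BETWEEN 1 AND 200000000);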
To serve 500 million records your server seems adequate. A setup with table partitioning and a sliding window is probably a more cost-effective way of handling the volume.
Is there a way of finding all the rows that have been updated by a single statement? SQL Server itself must be tracking this, as it could roll back the update if required. I'm interested in finding all the changed rows because I'm getting a performance hit from update triggers.
I have some large (2M-10M row) tables in SQL Server, and I'm adding audit triggers to track when records are updated and by what; the trouble is that this is killing performance. Most of the updates against the table touch 20,000+ rows, and they are now taking 5-10 times longer than previously.
I've thought of some options
1) Ditch triggers entirely and add the audit fields to every update statement, but that relies on everyone's code being changed.
2) Use before/after checksum values on the fields and then use them to update the changed rows a second time, still a performance hit.
Has anyone else solved this problem?
An UPDATE trigger already has the records affected by an update statement in the inserted and deleted pseudo-tables. You can select their primary key columns into a preliminary audit table serving as a queue, and move the more complicated calculations into a separate job.
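A minimal sketch of such a queue-only trigger (dbo.BigTable, its Id key and dbo.AuditQueue are assumed names):

    -- Only record which rows changed; heavier audit work happens later in a job
    CREATE TRIGGER trg_BigTable_AuditQueue
    ON dbo.BigTable
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        INSERT INTO dbo.AuditQueue (RowId, ChangedAt)
        SELECT i.Id, SYSDATETIME()
        FROM inserted AS i;
    END;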
Another option is the OUTPUT clause for the UPDATE statement, which was introduced in SQL Server 2005. (updated after comment by Philip Kelley)
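A hedged sketch of the OUTPUT approach, reusing the same assumed table and column names:

    DECLARE @Changed TABLE (Id int, OldStatus int, NewStatus int);

    UPDATE dbo.BigTable
    SET Status = 2
    OUTPUT inserted.Id, deleted.Status, inserted.Status
        INTO @Changed (Id, OldStatus, NewStatus)
    WHERE Status = 1;

    -- Every row touched by that single UPDATE, with before/after values
    SELECT * FROM @Changed;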
SQL Server knows how to roll back because it has the transaction log. That is not something you can find in the data tables.
You can try to add a timestamp (rowversion) column to your rows, then save the "current" timestamp and update all the rows. The changed rows should be all the rows with a timestamp greater than your saved "current" value. This will help you find the changed rows, but not what changed them.
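Roughly, that idea looks like this (the table, DbVersion column and Status filter are all assumptions; note that rows changed by other sessions in the meantime would also qualify):

    -- One-time: add a rowversion column
    ALTER TABLE dbo.BigTable ADD DbVersion rowversion;
    GO

    DECLARE @before binary(8) = @@DBTS;   -- highest rowversion handed out so far
    UPDATE dbo.BigTable SET Status = 2 WHERE Status = 1;

    -- Rows whose version moved past @before were changed after the marker was taken
    SELECT Id
    FROM dbo.BigTable
    WHERE DbVersion > @before;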
You can use Change Tracking or Change Data Capture. These are technologies built into the Engine for tracking changes and are leveraging the Replication infrastructure (log reader or table triggers). Both are only available in SQL Server 2008 or 2008 R2 and CDC requires Enterprise Edition licensing.
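Enabling Change Tracking, for example, is a small amount of DDL (the database and table names are placeholders):

    -- Turn on Change Tracking at the database level, then per table
    ALTER DATABASE MyAppDb
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

    ALTER TABLE dbo.BigTable
    ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);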
Anything else you'd try to do would ultimately boil down to either one of:
reading the log for changes (which is only doable by Replication, including Change Data Capture, otherwise the Engine will recycle the log before you can read it)
track changes in triggers (which is what Change Tracking would use)
track changes in application
There just isn't any free lunch. If audit is a requirement, then the overhead of auditing has to be taken into consideration and capacity planning must be done accordingly. All data audit solutions induce significant overhead, so an increase in operating cost by a factor of 2x, 4x or even 10x is not unheard of.