I need to update multiple databases with a few simple SQL statements. The databases are configured in SQL Server using 'Linked Servers', and the SQL Server versions are mixed (SQL 2008, SQL 2005, and SQL 2000). I intend to write a stored procedure in one of the databases, but I would like to do so using a transaction to make sure that each database gets updated consistently.
Which of the following is the most accurate:
Will a single BEGIN/COMMIT TRANSACTION work to guarantee that all statements across all databases are successful?
Will I need multiple BEGIN TRANSACTIONS for each individual set of commands on a database?
Are transactions even supported when updating remote databases, or would I need to execute a remote SP with embedded transaction support?
Note that I don't care about any kind of cross-database referential integrity; I'm just trying to update multiple databases at the same time from a single stored procedure if possible.
Any other suggestions are welcome as well. Thanks!
It is possible. You can use an explicit BEGIN DISTRIBUTED TRANSACTION in your controlling procedure, or simply start a normal transaction and rely on DTC to elevate it to a distributed transaction the moment you go across a linked server; this happens automatically. See Transact-SQL Distributed Transactions in MSDN.
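For illustration, a minimal sketch of what the controlling procedure could look like; SRV2, RemoteDb and the table/column names are placeholders, not anything from the question:

    -- A minimal sketch, not the poster's actual code: SRV2, RemoteDb and the
    -- table/column names are placeholders.
    CREATE PROCEDURE dbo.usp_UpdateAllServers
        @BatchId int
    AS
    BEGIN
        -- Any runtime error rolls back the whole distributed transaction.
        SET XACT_ABORT ON;

        BEGIN DISTRIBUTED TRANSACTION;

        UPDATE dbo.LocalTable
        SET    Status = 'Processed'
        WHERE  BatchId = @BatchId;

        -- The four-part name goes through the linked server; from this point on
        -- the transaction is a DTC (distributed) transaction.
        UPDATE SRV2.RemoteDb.dbo.RemoteTable
        SET    Status = 'Processed'
        WHERE  BatchId = @BatchId;

        COMMIT TRANSACTION;
    END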
However I must warn you that it's a slippery slope. The number of failures and the amount of downtime increase dramatically as soon as you bring DQ (distributed queries) into the picture. If you have 99.5% uptime servers (i.e. 43 hours of downtime a year) and your query touches 5 servers, your availability becomes 97.5% (216 hours of downtime a year). With 10 servers it becomes 95% uptime (428 hours of downtime a year). Things like managing an OS patch deployment, an engine service pack upgrade or application maintenance (think index rebuilds and such) become a nightmare to orchestrate and coordinate.
The way to go is to decouple the servers, use something like Service Broker instead of DQ.
You should be able to accomplish #1 using a distributed transaction. You will need DTC active and you'll need to use BEGIN DISTRIBUTED TRANSACTION along with ROLLBACK TRANSACTION and COMMIT TRANSACTION within your stored procedure.
Dealing with the DTC can bring up a lot of gotchas, so good luck :)
I wanted to see if you guys are utilizing marked transactions in your TFS backup scenario. Are there any drawbacks or gotchas to consider for this?
If I use the TFS Power Tools to create a backup plan, the following is created for me:
Tables and Stored Procedures needed for marked transactions
Scheduled Jobs
Maintenance Plans for Full, Differential, and Transaction Logs
The Backup/Restore Power Tool relies on SQL marked transactions to keep consistency across the TFS (and dependency products) databases. Source: http://intovsts.net/tag/tfs-power-tools/
Before inserting named marks into the transaction log, consider the following (Source: MSDN):

Because transaction marks consume log space, use them only for transactions that play a significant role in the database recovery strategy.

After a marked transaction commits, a row is inserted in the logmarkhistory table in msdb.

If a marked transaction spans multiple databases on the same database server or on different servers, the marks must be recorded in the logs of all the affected databases.
That kind of settles the matter of marked transactions in my backup plan. Especially since the TFS databases use the full recovery model and the tool relies on it, there isn't much choice. :)
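For anyone unfamiliar with the mechanism, a marked transaction in T-SQL looks roughly like this; the transaction name, mark description and table are placeholders, not anything taken from the Power Tools scripts:

    -- The mark is recorded in the log of every database the transaction modifies
    -- (see the MSDN points above), and a row is added to msdb..logmarkhistory.
    BEGIN TRANSACTION TfsMark WITH MARK 'Nightly TFS backup mark';

        UPDATE dbo.VersionStamp SET LastMarked = GETDATE();

    COMMIT TRANSACTION TfsMark;

    -- Later, each affected database's log can be restored to the same point:
    -- RESTORE LOG MyDb FROM DISK = '...' WITH STOPATMARK = 'TfsMark';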
Inside a transaction that has a savepoint, I have to do a join with a table that is on a linked server. When I try to do it, I get the error message:
“Cannot use SAVE TRANSACTION within a distributed transaction”
The remote table data rarely changes; it is almost fixed. Is it possible to tell SQL Server to exclude this table from the transaction? I've tried a (NOLOCK) hint, but it isn't possible to use this hint for a table on a linked server.
Does anyone know of a workaround? I'm using the old SQL Server 2000.
One thing that you could do is to make a local copy of the remote table before you start the transaction. I know that this may sound like a lot of overhead, but remote joins are frequently a performance problem anyway and the SOP fix for that is also to make a local copy.
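A rough sketch of that approach, with LINKEDSRV, RemoteDb and the table names as placeholders:

    -- Copy the (rarely changing) remote rows into a local temp table first...
    SELECT *
    INTO   #RemoteLookup
    FROM   LINKEDSRV.RemoteDb.dbo.RemoteLookup;

    -- ...so the transaction with the savepoint only touches local objects and
    -- never has to be promoted to a distributed transaction.
    BEGIN TRAN;
    SAVE TRAN BeforeUpdate;

    UPDATE t
    SET    t.SomeColumn = r.SomeColumn
    FROM   dbo.LocalTable AS t
    JOIN   #RemoteLookup  AS r
           ON r.Id = t.Id;

    COMMIT TRAN;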
According to this link, the ability to use SAVEPOINTs in a distributed transaction was dropped in SQL Server 7.0.
To allow application migration from Microsoft SQL Server 6.5 when savepoints inside distributed transactions are in use, Microsoft SQL Server 2000 Service Pack 1 introduces a trace flag that allows a savepoint within a distributed transaction. The trace flag is 8599 and can be turned on during the SQL Server startup or within an individual session (that is, prior to enabling a distributed transaction with a BEGIN DISTRIBUTED TRANSACTION statement) by using the DBCC TRACEON command. When trace flag 8599 is set to ON, SQL Server allows you to use a savepoint within a distributed transaction.
So unfortunately, you may either have to drop the bounding ACID transaction, or change the SPROC on the remote server so that it doesn't use SAVEPOINTs.
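If the trace flag route from the quoted KB article is acceptable (SQL Server 2000 SP1 or later), the session-level version would look something like this; the savepoint name is made up, and the behaviour should be tested carefully since the flag exists precisely because this isn't supported by default:

    -- Turn the flag on for this session, before the distributed transaction starts.
    DBCC TRACEON (8599);

    BEGIN DISTRIBUTED TRANSACTION;

    SAVE TRANSACTION BeforeRemoteJoin;

    -- ... the join against the linked server table goes here ...

    COMMIT TRANSACTION;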
On a side note: although I see you have tagged this SQL Server 2000, it's worth pointing out that SQL Server 2008 has the remote proc trans option for this.
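For reference, that option is set at the server level with sp_configure (there is also a session-level SET REMOTE_PROC_TRANSACTIONS); a quick sketch:

    -- When enabled, remote stored procedure calls made inside a local transaction
    -- are automatically protected by a DTC (distributed) transaction.
    EXEC sp_configure 'remote proc trans', 1;
    RECONFIGURE;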
In this case, if the remote table is not too large, I would copy it to a temp table. If possible, include any filtering to get the number of rows to a minimum. Then you can proceed normally. Another option, since the data changes rarely, is to copy the data to a permanent table and check whether anything has changed, to avoid sending too much data over the network every time you run the transaction. You could pull over only the recent changes.
If you wish to handle the transaction at the UI level and you have Visual Studio 2008/.NET Framework 3.5 or later, then you can wrap your logic with the TransactionScope class. If you don't have any front ends and you are working only with SQL Server, kindly ignore my answer...
Does anyone know how the underlying replication model in SQL Server works? Does it essentially depend on UTC datetime values to determine if something is new, or does it keep a table of all the changes (like a table of tableID+rowID pairs that have changed)?
I am building my own "replication" system and was planning on using the dates to know what to replicate. Then I started wondering what would happen if the date got off on the computer for some reason. The obvious choice is to keep a log of the changes as you go and, once you replicate those changes, remove them from the log. But that's a lot of extra work compared to just checking dates.
I figure that if SQL Server replication works by just checking dates, then that should be good enough for me.
Any wisdom here?
thanks
As a transaction occurs in SQL Server, it is written to the transaction log along with information pertinent to the transaction.
SQL Server replication uses this transaction log to determine which transactions have not yet been processed and to move them to the subscriber. There is a lot more going on under the hood to keep track of the intersection between transactions, publications, subscriptions, etc., but I will leave that to the MSDN documentation on SQL Server replication: http://msdn.microsoft.com/en-us/library/ms151198.aspx
Moving on to your point about building your own replication system:
Do not build your own replication system. There are too many complications involved; you will spend many, many days getting it right. You will be much better off using the features that ship with SQL Server.
SQL Server replication methods are pretty impressive out of the box.
If you outline what causes you to think in terms of building your own replication system, we can help you figure out how to use existing items to provision what you need.
Also, read up as much as you can here to get an idea of what it can do for you http://msdn.microsoft.com/en-us/library/ms151198.aspx
SQL Server has a Log Reader Agent job that is aptly named. Replication reads the transaction log and applies the appropriate transactions to the subscribing databases.
For one thing, SQL Server (and it's not the only one) supports multiple replication algorithms.
You can find details here about the ones implemented in SQL Server 2008. Read the X Replication Overview first, then follow How X Replication Works for more details.
Is it true that better concurrency can be achieved in Oracle databases than in MS SQL Server databases? In particular in an OLTP scenario, such as an ERP system?
I've overheard an SAP consultant making this claim, referring to Oracle locking techniques like row locking and multi-version read consistency and the redo log.
Out of the box, Oracle will have higher transaction throughput, but this is because it defaults to MVCC. SQL Server defaults to blocking selects on uncommitted updates, but it can be changed to MVCC as well, so that difference should basically go away. See Read Committed Isolation Level.
See Enabling Row Versioning-Based Isolation Levels.
When the ALLOW_SNAPSHOT_ISOLATION database option is set ON, the instance of the Microsoft SQL Server Database Engine does not generate row versions for modified data until all active transactions that have modified data in the database complete. If there are active modification transactions, SQL Server sets the state of the option to PENDING_ON. After all of the modification transactions complete, the state of the option is changed to ON. Users cannot start a snapshot transaction in that database until the option is fully ON. The database passes through a PENDING_OFF state when the database administrator sets the ALLOW_SNAPSHOT_ISOLATION option to OFF.
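For completeness, enabling the row-versioning based levels comes down to a pair of database options; MyDb is a placeholder:

    -- Allow SNAPSHOT transactions in the database.
    ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- Make the default READ COMMITTED level use row versioning (MVCC-like reads);
    -- this switch needs the database to have no other active connections.
    ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;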
He/She was probably referring to the facts that:
In Oracle readers do not block writers and writers do not block readers
Oracle does not maintain a list of row locks so there is no significant overhead in locking and locks never escalate to the table level.
Starting with SQL Server 2005 this is no longer the case: you can enable snapshot isolation and your writers will not block your readers, just like in Oracle.
Sql Server has row locking, several different transaction isolation levels, and a transaction log that can be replayed.
Maybe he's referring to Access, which does not have these.
Or maybe he believes Oracle uses better defaults. He might have a better argument there, but with either DBMS, if you're talking ERP, you'd better have a DBA who knows enough about the system to keep it tuned properly.
We are experiencing some very annoying deadlock situations in a production SQL Server 2000 database.
The main setup is the following:
SQL Server 2000 Enterprise Edition.
Server is coded in C++ using ATL OLE Database.
All database objects are accessed through stored procedures.
All UPDATE/INSERT stored procedures wrap their internal operations in a BEGIN TRAN ... COMMIT TRAN block.
I collected some initial traces with SQL Profiler, following several articles on the Internet like this one (ignore that it refers to SQL Server 2005 tools; the same principles apply). From the traces it appears to be a deadlock between two UPDATE queries.
We have taken some measures that may have reduced the likelihood of the problem happening:
SELECT WITH (NOLOCK). We have changed all the SELECT queries in the stored procedures to use WITH (NOLOCK). We understand the implications of dirty reads, but the data being queried is not that important since we do a lot of automatic refreshes and under normal conditions the UI will have the right values.
READ UNCOMMITTED. We have changed the transaction isolation level in the server code to READ UNCOMMITTED.
Reduced transaction scope. We have reduced the time a transaction is held in order to minimize the probability of a deadlock taking place.
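For reference, the first two measures amount to changes along these lines (dbo.Orders is just a placeholder table):

    -- Dirty reads are acceptable for this data, per the reasoning above.
    SELECT OrderId, Status
    FROM   dbo.Orders WITH (NOLOCK);

    -- Equivalent connection-level setting used by the server code:
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;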
We are also questioning the fact that we have a transaction inside the majority of the stored procedures (the BEGIN TRAN ... COMMIT TRAN block). In this situation my guess is that the transaction isolation level is SERIALIZABLE, right? And what if we also have a transaction isolation level specified in the source code that calls the stored procedure: which one applies?
This is a processing intensive application and we are hitting the database a lot for reads (bigger percentage) and some writes.
If this were a SQL Server 2005 database I could go with Geoff Dalgas's answer on a deadlock issue concerning Stack Overflow, if that is even applicable to the issue I am running into. But upgrading to SQL Server 2005 is not, at the present time, a viable option.
As these initial attempts failed, my question is: how would you go on from here? What steps would you take to reduce or even avoid the deadlocks, and what commands/tools should I use to better expose the problem?
A few comments:
The isolation level explicitly specified in your stored procedure overrides the isolation level of the caller.
If sp_getapplock is available on 2000, I'd use it (see the sketch after these comments):
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/30/855.aspx
In many cases the serializable isolation level increases the chance that you get a deadlock.
A good resource for 2000:
http://www.code-magazine.com/article.aspx?quickid=0309101&page=1
Also some of Bart Duncan's advice might be applicable:
http://blogs.msdn.com/bartd/archive/2006/09/09/747119.aspx
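A minimal sketch of the sp_getapplock suggestion, assuming it is available on your build of SQL Server 2000; the resource name is made up:

    BEGIN TRAN;

    DECLARE @rc int;

    -- Serialize the critical section on an arbitrary, application-defined
    -- resource name instead of relying on row/table locks.
    EXEC @rc = sp_getapplock
        @Resource    = 'OrderProcessing',
        @LockMode    = 'Exclusive',
        @LockOwner   = 'Transaction',
        @LockTimeout = 10000;        -- milliseconds

    IF @rc >= 0
    BEGIN
        -- ... do the work that used to deadlock ...
        COMMIT TRAN;                 -- the app lock is released with the transaction
    END
    ELSE
        ROLLBACK TRAN;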
In addition to Alex's answer:
Eyeball the code to see if tables are being accessed in the same order. We did this recently and reordered the code to always access parent then child. The system had grown, the code and features were more complex, and there were more users: we simply started getting deadlocks.
See if transactions can be shortened (e.g. start later, finish earlier, less processing).
Identify which code you'd like not to fail and use SET DEADLOCK_PRIORITY LOW in the other code (sketched after this list).
We've used this (SQL 2005 has more options here) to make sure that some code will never be deadlocked and sacrificed other code.
If you have a SELECT at the start of the transaction to prepare some data, consider HOLDLOCK (maybe UPDLOCK) to keep it locked for the duration. We use this occasionally to stop writes on the table by other processes.
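A sketch of the last two points; the table, column and variable names are hypothetical:

    -- In the code you are willing to sacrifice as the deadlock victim:
    SET DEADLOCK_PRIORITY LOW;

    -- In the transaction that reads data up front and must keep it stable:
    DECLARE @Qty int;

    BEGIN TRAN;

    SELECT @Qty = Quantity
    FROM   dbo.Stock WITH (UPDLOCK, HOLDLOCK)   -- hold the lock for the duration
    WHERE  ProductId = 42;

    -- ... work that relies on @Qty not changing underneath us ...

    COMMIT TRAN;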
The reason for the deadlocks in my setup turned out, after all, to be the indexes. We were using (generated by default) non-clustered indexes for the primary keys of the tables. Changing to clustered indexes fixed the problem.
My guess would be that you are experiencing deadlocks, either:
Because your DML statements (probably UPDATEs) are escalating to table locks, or
Different stored procedures are accessing the same tables in transactions but in a different order.
To address this, I would first examine the stored procedures and make sure that the modification statements have the indexes that they need.
Note: this applies to both the target tables and the source tables (despite NOLOCK, an UPDATE's source tables will get locks as well). Check the query plans for scans in user stored procedures. Unlike batch or bulk operations, most user queries and DML statements work on small subsets of the table rows and so should not lock the entire table.
Second, I would check the stored procedures to ensure that all data access within a stored procedure is done in a consistent order (Parent -> Child is usually preferred).
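To illustrate the consistent-order point with a hypothetical pair of procedures (none of these names come from the question):

    -- Both procedures touch Parent before Child, so neither can end up waiting
    -- on a lock the other holds in the opposite order.
    CREATE PROCEDURE dbo.usp_UpdateA @Id int AS
    BEGIN
        BEGIN TRAN;
        UPDATE dbo.Parent SET ModifiedAt = GETDATE() WHERE Id = @Id;
        UPDATE dbo.Child  SET ModifiedAt = GETDATE() WHERE ParentId = @Id;
        COMMIT TRAN;
    END
    GO

    CREATE PROCEDURE dbo.usp_UpdateB @Id int AS
    BEGIN
        BEGIN TRAN;
        UPDATE dbo.Parent SET IsDirty = 1 WHERE Id = @Id;   -- Parent first here too
        UPDATE dbo.Child  SET IsDirty = 1 WHERE ParentId = @Id;
        COMMIT TRAN;
    END
    GO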