I'm in a scenario where I have two distinct databases and I want to commit different changes to both databases in such a way that if one of the commits fails, the other will also fail.
If I have a 'Transaction A' on 'Database A' and a 'Transaction B' on 'Database B', how can I bind the two transactions together so that both succeed or fail atomically?
I can't figure out a way to do this. It's easy to roll back 'Transaction B' if 'Transaction A' fails, but if 'Transaction B' fails when 'Transaction A' is already committed, I'm screwed.
I would like to know if there is a technology that handles this in a specific database product, or better yet, a generic pattern for this scenario that could be applied to any transactional system, such as binding a database transaction to a transactional message queue.
You need to use a distributed transaction manager (which, in the simplest case, your program itself can act as).
X/Open XA is the most widely used standard for managing distributed transactions, and the big four databases all support it natively.
The underlying protocol is called "two-phase commit" (2PC), and most commercial RDBMSs have a working implementation of it in one form or another.
Here are some references:
SQL Server: http://msdn.microsoft.com/en-us/library/aa754091(v=bts.10).aspx
Oracle: http://download.oracle.com/docs/cd/E14072_01/server.112/e10595/ds_txns003.htm
MySQL: http://dev.mysql.com/doc/refman/5.0/en/xa.html
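To get a feel for what the protocol looks like at the SQL level, here is a minimal sketch of a single branch of a distributed transaction using MySQL's XA statements (the transaction id and table are made up for illustration). A transaction manager runs phase one against every participating database and only issues the phase-two commit once every branch has prepared successfully:

-- do the work inside an XA branch ('txn1' is an arbitrary transaction id)
XA START 'txn1';
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- hypothetical table
XA END 'txn1';

-- phase one: the database durably promises it will be able to commit
XA PREPARE 'txn1';

-- phase two: issued only after *every* branch has prepared successfully;
-- if any branch failed to prepare, issue XA ROLLBACK 'txn1' everywhere instead
XA COMMIT 'txn1';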
In .NET you can get this behavior with nested TransactionScopes:

using (TransactionScope scopeExternal = new TransactionScope())
{
    // Both inner scopes join the ambient (outer) transaction; once a second
    // durable resource enlists, System.Transactions escalates it to a
    // distributed transaction coordinated by MSDTC.
    using (TransactionScope scope1
        = new TransactionScope(TransactionScopeOption.Required))
    {
        // ... operations for 1st DB
        scope1.Complete();
    }

    using (TransactionScope scope2
        = new TransactionScope(TransactionScopeOption.Required))
    {
        // ... operations for 2nd DB
        scope2.Complete();
    }

    scopeExternal.Complete();
}
If the work in either scope1 or scope2 fails, the resulting exception prevents scopeExternal from committing, and the changes to both databases are rolled back. (The inner scopes must use TransactionScopeOption.Required so that they join the ambient transaction; with Suppress they would commit independently and the outer scope could not undo them.)
I have no idea about JTA. To understand the overall scenario, please follow this link: How to maintain acid property of 3 sequential transaction of three diffrenet databases. Based on the suggestions from that post, I have to use distributed transactions. I am using an Apache Tomcat server.
But as I said, I have no idea about JTA. My problem is that I have more than 15 database connections, and the respective database is connected based on some condition, so I can't create a hibernate.cfg.xml, session factories, and entities for each database.
So my question is: can I use JTA with plain JDBC? If so, please provide some links or examples.
Yes, you can use JTA with plain JDBC. The general idea is that instead of using the JDBC Connection object to declare the transaction boundary, you use the TransactionManager object provided by the JTA implementation to declare the transaction boundary.
For example, in the case of the Bitronix Transaction Manager, declaring a transaction boundary across several database connections can be done with the following code:
import java.sql.Connection;
import bitronix.tm.BitronixTransactionManager;
import bitronix.tm.TransactionManagerServices;
import bitronix.tm.resource.jdbc.PoolingDataSource;

// one XA-aware pooling data source per database
PoolingDataSource derbyDataSource1 = new PoolingDataSource();
derbyDataSource1.setClassName("org.apache.derby.jdbc.EmbeddedXADataSource");
derbyDataSource1.setUniqueName("derby1");
derbyDataSource1.setMaxPoolSize(3); // maxPoolSize is mandatory for Bitronix pools
derbyDataSource1.getDriverProperties().setProperty("databaseName", "database1");
derbyDataSource1.init();

PoolingDataSource derbyDataSource2 = new PoolingDataSource();
derbyDataSource2.setClassName("org.apache.derby.jdbc.EmbeddedXADataSource");
derbyDataSource2.setUniqueName("derby2");
derbyDataSource2.setMaxPoolSize(3);
derbyDataSource2.getDriverProperties().setProperty("databaseName", "database2");
derbyDataSource2.init();

BitronixTransactionManager btm = TransactionManagerServices.getTransactionManager();
btm.begin();
try {
    Connection c1 = derbyDataSource1.getConnection();
    Connection c2 = derbyDataSource2.getConnection();
    // use c1 and c2 to execute statements against their corresponding DBs as usual
    c1.close();
    c2.close();
    btm.commit(); // two-phase commit across both databases
} catch (Exception ex) { // begin()/commit() throw JTA exceptions, not just SQLException
    ex.printStackTrace();
    btm.rollback(); // undo the work on both databases
}
Say that a method only reads data from a database and does not write to it. Is it always the case that such methods don't need to run within a transaction?
In many databases, a read request that is not inside an explicit transaction implicitly creates a transaction of its own to run in.
In a SQL database you may want to use a transaction if you are running multiple SELECT statements and you don't want changes from other transactions to show up in one SELECT but not an earlier one. A transaction running at the SERIALIZABLE transaction isolation level will present a consistent view of the data across multiple statements.
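As a sketch of that in T-SQL (the table name is hypothetical), both SELECTs below see the same consistent state, because the serializable transaction keeps other writers from changing the data read between them:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM orders;   -- hypothetical table
SELECT SUM(total) FROM orders; -- guaranteed consistent with the COUNT above
COMMIT TRANSACTION;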
No. If you don't read at a specific isolation level, you might not get enough guarantees. For example, rows might disappear or new rows might appear between two reads.
This is true even for a single statement:
select * from Tab
except select * from Tab
This query can actually return rows in case of concurrent modifications because it scans the table twice.
SQL Server: There is an easy way to get fast, nonblocking, nonlocking, consistent reads: Enable snapshot isolation and read in a snapshot transaction. AFAIK Oracle has this capability as well. Postgres too.
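For SQL Server, enabling and using snapshot isolation looks roughly like this (MyDb and the table are placeholder names):

ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
-- reads see a consistent snapshot as of the start of the transaction,
-- taking no shared locks and blocking no writers
SELECT * FROM orders;
COMMIT TRANSACTION;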
The purpose of a transaction is to commit or roll back a set of operations performed on a database. If you are just selecting values and making no change to the data, there is no need for a transaction.
I did not find explicit SQLite commands for locking before inserting or updating rows in a table. Does SQLite handle the locking mechanism on its own?
The pager module described at http://sqlite.org/lockingv3.html handles the locking mechanism. But I am not sure if there are any commands that the user can use to explicitly lock tables. Please advise.
Thanks
As far as I know there are no dedicated SQLite commands to control locking. However, you can get SQLite to lock the database by opening a transaction. For instance:
BEGIN IMMEDIATE TRANSACTION;
...
COMMIT TRANSACTION;
BEGIN EXCLUSIVE TRANSACTION;
...
COMMIT TRANSACTION;
If you read the documentation I linked, you should get a better idea of the difference between IMMEDIATE and EXCLUSIVE transactions.
It might be worth noting that locks in SQLite apply to the whole database and not to individual tables, unlike the LOCK TABLE statement in other SQL databases.
SQLite does whatever locking is necessary in order to implement the transaction scheme that your SQL statements describe. In particular, if you don't describe any then you get auto-commit behavior, with a lock held for the duration of each statement and then dropped as the statement finishes. Should you need longer transactions (often true!) then you ask for them explicitly with BEGIN TRANSACTION (often shortened to BEGIN) and finish with COMMIT TRANSACTION (or ROLLBACK TRANSACTION). The transaction handling is frequently wrapped for you by your language interface (as this makes it considerably easier to get right, coupling the transaction lifetime to a code block or method call) but at the base level, it comes down to BEGIN/COMMIT/ROLLBACK.
In short, you've got transactions. Locks are used to implement transactions. You don't have raw locks (which is a good thing; they're rather harder to get right than you might think from first glance).
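As a small sketch of the explicit style (hypothetical table), the two updates below either both commit or, if anything fails in between, can both be undone:

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT TRANSACTION;
-- on error, issue ROLLBACK TRANSACTION instead of COMMIT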
I am using TransactionScope to ensure that data is written to the database correctly. However, I may have a need to select some data (from another page) while the transaction is running. Would that be possible? I'm very much a noob when it comes to databases.
I am using LinqToSQL and SQL Server 2005(dev)/2008(prod).
Yes, it is possible to still select data from a database while a transaction is running.
Data not affected by your transaction (for instance, rows in a table which are not being updated) can usually be read from other transactions. (In certain situations SQL Server will take a table lock that stops reads on all rows in the table, but those situations are unusual and most often a symptom of something else going on in your query or on the server.)
You need to look into Transaction Isolation Levels since these control exactly how this behaviour will work.
Here is the C# code to set the isolation level of a transaction scope.
TransactionOptions options = new TransactionOptions();
options.IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted;

using (TransactionScope sc = new TransactionScope(TransactionScopeOption.Required, options))
{
    // Code within transaction
}
In general, depending on the transaction isolation level specified on a transaction (or any table hints like NOLOCK), you get different levels of data locking that protect the rest of your application from activity tied up in your transaction. With a transaction isolation level of READ UNCOMMITTED, for example, you can see the writes within another transaction as they occur. This allows dirty reads but also prevents (most) locks on data.
The other end of the scale is an isolation level like SERIALIZABLE, which ensures that your transaction's activity is entirely isolated until it has committed.
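For reference, a dirty read can be requested in T-SQL either per session or per table (the table name is hypothetical):

-- per session
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM orders;

-- or per table, via a hint
SELECT * FROM orders WITH (NOLOCK);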
In addition to the already provided advice, I would strongly recommend you look into snapshot isolation models. There is a good discussion at Using Snapshot Isolation. Enabling READ_COMMITTED_SNAPSHOT on the database can alleviate a lot of contention problems, because readers are no longer blocked by writers. Since default reads are performed under the read committed isolation level, this simple database option switch has immediate benefits and requires no changes in the app.
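Turning it on is a one-time option switch (MyDb is a placeholder name):

ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
-- from now on, READ COMMITTED reads use row versions instead of shared locks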
There is no free lunch, so this comes at a price, in this case additional load on tempdb; see Row Versioning Resource Usage.
If however you are using explicit isolation levels, and especially if you use the default TransactionScope Serializable mode, then you'll have to review your code to enforce the more benign ReadCommitted isolation level. If you don't know what isolation level you use, it means you use ReadCommitted.
Yes, by default a TransactionScope will lock the tables involved in the transaction. If you need to read while a transaction is taking place, enter another TransactionScope whose TransactionOptions specify IsolationLevel.ReadUncommitted:
TransactionOptions options = new TransactionOptions();
options.IsolationLevel = IsolationLevel.ReadUncommitted;

using (var scope = new TransactionScope(
    TransactionScopeOption.RequiresNew,
    options))
{
    // read the database
    scope.Complete(); // mark the read-only scope as complete
}
With a LINQ-to-SQL DataContext:
// db is DataContext
db.Transaction =
db.Connection.BeginTransaction(System.Data.IsolationLevel.ReadUncommitted);
Note that there is a difference between System.Transactions.IsolationLevel and System.Data.IsolationLevel. Yes, you read that correctly.
If I have several SPs
SP1
SP2
some_inline_queries
how do I ensure that they all run as one uninterrupted unit, without interference from other threads?
Is it possible to do this at the SQL Server level?
edit:
Let's say we have a main script with 3 major actions:
sp1 scans table t1 to generate a random string that is unique to column c1 in t1;
sp2 does some fairly intensive stuff;
some statement inserts the random string returned by sp1 into c1 of table t1. So if I run many instances of this main script simultaneously, I need all values in t1.c1 to be distinct when all the scripts have finished running.
Are you running them inside a transaction? I'm not sure what you mean by "interruption", but they would be safe assuming they are within a:
Begin Transaction MyTranNameHere
exec sp1
exec sp2
some statement
Commit Transaction MyTranNameHere
If I'm understanding you right, you want to set the transaction isolation level to SERIALIZABLE. Assuming MSSQL has true serializability (it might not; true serializability is rarely needed and often quite costly to implement), this will guarantee that even if you execute many transactions at once, the end result will be identical to executing them one at a time in some order (though which order is usually non-deterministic). Be careful, though: there are often subtle "bugs", be they actual bugs or misfeatures, in database SERIALIZABLE implementations, since this stuff is really tricky to get right. Especially nasty is the fact that some MVCC-based databases (Oracle and PostgreSQL use MVCC, and I know the Postgres lists were recently discussing these issues with their DBMS) don't really implement SERIALIZABLE perfectly, going instead for what should be called SNAPSHOT isolation. This gives 99% of the benefits of SERIALIZABLE with minimal performance hit, but that's no good if you fall into that 1%.
If SERIALIZABLE is not an option, either because it doesn't do what you want or for some other reason, you can always have each SP take an exclusive lock before doing its dirty work. This might lead to deadlocks or timeouts and require other tuning elsewhere, so it's kind of an ugly option, but it should do the job.
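One way to take such an exclusive lock in SQL Server is sp_getapplock (assuming an application-level lock fits the scenario; the resource name below is made up). Taken inside the transaction, the lock is released automatically on commit or rollback:

BEGIN TRANSACTION;
-- serialize all instances of the script on an arbitrary resource name
EXEC sp_getapplock @Resource = 'main_script_lock',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction';
EXEC sp1;
EXEC sp2;
-- some statement
COMMIT TRANSACTION; -- releases the applock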