Reference to the database I am looking for concurrent transaction problems and their solutions.
You can search for "transaction isolation level" and choose one of the many production documentation (for example PostgreSQL https://www.postgresql.org/docs/12/transaction-iso.html.).
Main Concurrent transactions issues are:
Dirty Read
Nonrepeatable Read
Phantom Read
Serialization Anomaly
Related
MSDN describes the JET transaction isolation for its OLEDB provider as follows:
Jet supports five levels of nesting in transactions. The only
supported mode for transactions is Read Committed. Setting lesser
levels of transactional separation implies Read Committed. Setting
higher levels will cause StartTransaction to fail.
Jet supports only single-phase commit.
MSDN describes Read Committed as follows:
Specifies that shared locks are held while the data is being read to
avoid dirty reads, but the data can be changed before the end of the
transaction, resulting in nonrepeatable reads or phantom data. This
option is the SQL Server default.
My questions are:
What is single-phase commit? What consequence does this have for transactions and isolation?
Would the Read Committed isolation level as described above be suitable for my requirements here?
What is the best way to achieve a Serializable transaction isolation using Jet?
By question number:
Single-phase commit is used where all of your data is in one database -- the activity of the transaction is committed atomically and you're done. If you have a logical transaction which needs to be spread across multiple storage engines (like a relational database for metadata and some sort of document store for a big blob) you can use a transaction manager to coordinate the activities so that the work is persisted in both or neither, if both products support two phase commit. They are just telling you that they don't support two-phase commit, so the product is not suitable for distributed transactions.
Yes, if you check the condition in the UPDATE statement itself; otherwise you might have problems.
They seem to be suggesting that you can't.
As an aside, I worked for decades as a consultant in quite a variety of environments. More than once I was engaged to migrate people off of Jet because of performance problems. In one case a simple "star" type query was running for two minutes because it was joining on the client rather than letting the database do it. As a direct query against the database it was sub-second. In another case there was a report which took 72 hours to run through Jet, which took 2 minutes when run directly against the database. If it generally works OK for you, you might be able to deal with such situations by using stored procedures where Jet is causing performance pain.
I am copying some data from sql server to firebird over the network. Because of integrity I need to use transactions, but I am transferring about 9k of rows. Has reading in opened transaction some negative influence on reading costs, against reading in non transaction mode?
"Non-Transaction mode" simply doesn't exists, there is always a transaction, whether you declare it or not. The question is really whether is there any difference between reading in an implicit transaction vs. reading in an explicit transaction. And the short answer is no, there is no difference.
There may be some difference if you use an explicit transaction under a higher isolation level, other than the default READ_COMMITTED. It also depends whether you do anything else in an explicit transaction, but all these details cannot be infered from the frugal information in your post.
The default transaction isolation level is READ COMMITTED. It will lock the table for others while querying it.
MSDN on transaction isolation level:
http://msdn.microsoft.com/en-us/library/ms173763.aspx
I've had the issue of an occasional strange error message concerning locking deadlock. This even happened to stackoverflow - see this great article by Jeff Atwood. I strongly recommend switching to 'read committed snapshot' which solved the error + my performance issues.
http://www.codinghorror.com/blog/2008/08/deadlocked.html
Hallo,
I want to get data into a database on a multicore system with ative WAL using JDBC. I was thinking about spawning multiple threads in my application to insert data parallely.
If the application has multiple threads I will have to increase the isolation level to Repeatable Read which on MVCC-databases should be mapped to Snapshot isolation.
If I were using one thread I wouldn't need isolation levels. As far as I know most Snapshot isolation databases analyze the write sets of all transaction that could have a conflict and then rollback all but one of the real conflict transactions. More specific I'm talking about Oracle, InnoDB and PostgreSQL.
1.) Is this analyzing of the write sets expensive?
2.) Is it a good idea to multithread the inserts for a higher total throughput? Real conflict are nearly impossible because of the application layer feeding the threads conflict free stuff. But the database shall be a safety net.
Oracle does not support Repeatable Read. It supports only Read Committed and Serializable. I might be mistaken, but setting an isolation level of Repeatable Read for Oracle might result in a transaction with an isolation level of Serializable. In short, you are left to mercy of the database support for the isolation levels that you desire.
I cannot speak for InnoDB and PostgreSQL, but the same would apply if they do not support the required isolation levels. The database could automatically upgrade the isolation level to a higher level to meet the desired isolation characteristics. You ought to rethink this approach, if your application's desired isolation level has to be Repeatable Read.
The problem like you've rightly inferred is that optimistic locking will possibly result in transaction rollbacks, if a conflict is detected. Oracle does so by reporting the ORA-08177 SQL error. Since this error is reported when two threads will access the same data range, it could be avoided if the threads work against data sets involving different data ranges. You will have to ensure that this is the case when dividing work across threads.
I think the limiting factor here will be disk IO, not the overhead of moving to Repeatable Read.
Even a single thread may be able to max out the disks on the DB server especially with the amount of DB logging required on insert / update. Are you sure that's not already the case?
Also, in any multi-user system, you probably want to be running with Repeatable Read isolation anyway (Postgres only supports this and serializable). So, I don't think of this as adding any "overhead" above what I would normally see.
Are there any issues using SNAPSHOT isolation to read data consistently for viewing without locking, blocking or dirty/phantom reads, while a separate process is processing continuous incoming data in serializable transactions?
We need readers (guaranteed read-only: web data sync, real-time monitoring views, etc) to be able to read consistent data, without being blocked, or blocking the updates. We were using SNAPSHOT for everything, but had too many consistency failures so switched the updating process to SERIALIZABLE.
I've read about but am not totally clear as to the impacts of using different isolation levels concurrently. I've seen the lock compatibility matrix, and read various info. It seems ok, but I'd really appreciate some wise guidance from people with practical experience about any major pitfalls.
Are there any issues using Snapshot isolation for the readers while SERIALIZABLE transactions are writing? Are there circumstances it will block a SERIALIZABLE transaction? Is there a benefit to using SNAPSHOT vs READ COMMITTED (with READ_COMMITTED_SNAPSHOT ON)?
Thanks, any assistance greatly appreciated :-)
Reads performed under SNAPSHOT isolation level read any modified data from the version store. As such they are affected only by writes. Writes behave identically under all isolation levels. Therefore SNAPSHOT reads behave the same way no matter the isolation level of the concurent transactions.
READ_COMMITTED_SNAPSHOT ON makes READ COMMITTED act as SNAPSHOT. In effect, it is SNAPSHOT: the READ_COMMITTED_SNAPSHOT was provided as a quick way to port applications to SNAPSHOT w/o code changes. So everything said on the first paragraph applies.
I understand that an Isolation level of Serializable is the most restrictive of all isolation levels. I'm curious though what sort of applications would require this level of isolation, or when I should consider using it?
Ask yourself the following question: Would it be bad if someone were to INSERT a new row into your data while your transaction is running? Would this interfere with your results in an unacceptable way? If so, use the SERIALIZABLE level.
From MSDN regarding SET TRANSACTION ISOLATION LEVEL:
SERIALIZABLE
Places a range lock on the data set,
preventing other users from updating
or inserting rows into the data set
until the transaction is complete.
This is the most restrictive of the
four isolation levels. Because
concurrency is lower, use this option
only when necessary. This option has
the same effect as setting HOLDLOCK on
all tables in all SELECT statements in
a transaction.
So your transaction maintains all locks throughout its lifetime-- even those normally discarded after use. This makes it appear that all transactions are running one at a time, hence the name SERIALIZABLE. Note from Wikipedia regarding isolation levels:
SERIALIZABLE
This isolation level specifies that
all transactions occur in a completely
isolated fashion; i.e., as if all
transactions in the system had
executed serially, one after the
other. The DBMS may execute two or
more transactions at the same time
only if the illusion of serial
execution can be maintained.
The SERIALIZABLE isolation level is the highest isolation level based on pessimistic concurrency control where transactions are completely isolated from one another.
The ANSI/ISO standard SQL 92 covers the following read phenomena when one transaction reads data, which is changed by second transaction:
dirty reads
non-repeatable reads
phantom reads
and Microsoft documentation extends with the following two:
lost updates
missing and double reads caused by row updates
The following table shows the concurrency side effects enabled by the different isolation levels:
So, the question is what read phenomena are allowed by your business requirements and then to check if your hardware environment can handle stricter concurrency control?
Note, something very interesting about the SERIALIZABLE isolation level - it is the default isolation level specified by the SQL standard. In the context of SQL Server of course, the default is READ COMMITTED.
Also, the official documentation about Transaction Locking and Row Versioning Guide is a great place where a lot of aspects are covered and explained.
Try accounting. Transactions in accounts are inherently serializable if you want to have proper account values AND adhere to things like credit limits.
It behaves in a way that when you try to update a row, It simply blocks the updation process until the transaction is completed.