Database atomicity and consistency

What is the difference between atomicity and consistency? It looks to me as if both are saying the same thing in different words.
Atomicity
All tasks of a transaction are performed or none of them are. There are no partial transactions. For example, if a transaction starts updating 100 rows, but the system fails after 20 updates, then the database rolls back the changes to these 20 rows.
Consistency
The transaction takes the database from one consistent state to another consistent state. For example, in a banking transaction that debits a savings account and credits a checking account, a failure must not cause the database to credit only one account, which would lead to inconsistent data.
It looks like atomicity is a subset of consistency; if so, shouldn't it be CID (consistency, isolation, durability), with no atomicity?

Atomicity is indeed saying that each transaction is either all or nothing, meaning that either all or none of its actions are executed and that there are no partial operations.
However, consistency talks about ensuring that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof
(taken from Wikipedia).
That basically means that only valid states are written to the database, and that a transaction will either be executed if it doesn't violate the data consistency or rolled back if it does.
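For example, here is a rough T-SQL sketch of the debit/credit transfer from the question; the table names, the account id, and the CHECK constraint mentioned in the comments are all made up for illustration:

BEGIN TRY
    BEGIN TRAN;
    -- move 100 from savings to checking: both updates happen or neither does
    UPDATE savings  SET balance = balance - 100 WHERE account_id = 42;
    UPDATE checking SET balance = balance + 100 WHERE account_id = 42;
    COMMIT;   -- atomicity: both rows change together
END TRY
BEGIN CATCH
    -- consistency: e.g. a CHECK (balance >= 0) violation undoes both updates
    IF @@TRANCOUNT > 0 ROLLBACK;
END CATCH;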
Hope that clears things up for you.

A simple explanation for consistency: if a field in the database has type Integer, it should accept only integer values and nothing else. If you try to store some other type in this field, consistency is violated, and the transaction will roll back.

Atomicity:
Take a batch of statements, say 100 of them (they could be inserts). If any statement fails while processing, every statement that already ran should be reverted, which means the database goes back to its original state. For example (JDBC-style):
// conn is a java.sql.Connection; the Statements were created from it
conn.setAutoCommit(false);         // autocommit = false
try {
    statementOne.executeUpdate();
    statementTwo.executeUpdate();
    statementThree.executeUpdate();
    conn.commit();                 // commit only after every statement succeeded
} catch (SQLException e) {
    conn.rollback();               // any failure undoes all of the statements
}
Consistency:
The data you insert into the database must satisfy the defined constraints, cascades, and triggers. For example, if the table has a primary key constraint, the data you are planning to insert must satisfy that constraint.
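A sketch of that with a made-up table: the second insert violates the primary key, so the whole transaction rolls back and the table never reaches an invalid state.

BEGIN TRY
    BEGIN TRAN;
    INSERT INTO orders (order_id, customer) VALUES (1, N'Alice');
    INSERT INTO orders (order_id, customer) VALUES (1, N'Bob');   -- PK violation
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
END CATCH;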
Isolation:
If two processes are running against the database, say one reading and the other writing data, the reading process should see only committed data, not in-memory, uncommitted changes.
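A rough two-session sketch of that, with a hypothetical accounts table:

-- Session 1: the writer changes a row but has not committed yet
BEGIN TRAN;
UPDATE accounts SET balance = 0 WHERE id = 1;

-- Session 2: a reader at the default READ COMMITTED level either waits for the
-- writer to finish or (with RCSI) gets the last committed value; it never sees
-- the uncommitted balance of 0.
SELECT balance FROM accounts WHERE id = 1;

-- Session 1: the new value becomes visible to readers only after this
COMMIT;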
Durability:
Once a transaction is committed to the database, the data should stay that way; it should not be affected by a power failure, a system crash, or anything else.

Related

Row locking behaviour while updating

In Oracle databases I can start a transaction and update a row without committing. Selecting this row in another session still returns the current ("old") value.
How to get this behaviour in SQL Server? Currently, the row is locked until the transaction ends. WITH (NOLOCK) inside the select statement gives the new value from the uncommitted transaction, which is potentially dangerous.
Starting the transaction without committing:
BEGIN TRAN;
UPDATE test SET val = 'Updated' WHERE id = 1;
This works:
SELECT * FROM test WHERE id = 2;
This waits for the transaction to be committed:
SELECT * FROM test WHERE id = 1;
With Read Committed Snapshot Isolation (RCSI), row versions are kept in a version store, so readers can read the version of a row that existed at the time the statement started, before any changes were made and while a transaction is still open, without taking shared locks on rows or pages and without blocking writers or other readers. From this post by Paul White:
To summarize, locking read committed sees each row as it was at the time it was briefly locked and physically read; RCSI sees all rows as they were at the time the statement began. Both implementations are guaranteed to never see uncommitted data.
One cost, of course, is that if you read a prior version of the row, it can change (even many times) before you're done doing whatever it is you plan to do with it. If you're making important decisions based on some past version of the row, it may be the case that you actually want an isolation level that forces you to wait until all changes have been committed.
Another cost is that the version store is not free... it requires space and I/O in tempdb, so if tempdb is already a bottleneck on your system, this is something worth testing.
(In SQL Server 2019, with Accelerated Database Recovery, the version store shifts to the user database, which increases database size but mitigates some of the tempdb contention.)
Paul's post goes on to explain some other risks and caveats.
In almost all cases, this is still way better than NOLOCK, IMHO. Lots of links about the dangers there (and why RCSI is better) here:
I'm using NOLOCK; is that bad?
And finally, from the documentation (adding one clarification from the comments):
When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. That is, the SQL Server Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions. A user-defined function can return data that was committed after the time the statement containing the UDF began.

When the READ_COMMITTED_SNAPSHOT database option is set OFF, which is the default setting (on-prem, but not in Azure SQL Database), read committed isolation uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. Both implementations meet the ISO definition of read committed isolation.
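To switch a database to this mode, the statement looks roughly like the following; 'YourDatabase' is a placeholder, and the change generally needs a moment when no other connections hold the database open (or a termination option such as WITH ROLLBACK IMMEDIATE):

ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Afterwards, the blocked query from the question returns the last committed
-- value instead of waiting for the open transaction:
SELECT * FROM test WHERE id = 1;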

MSSQL how to properly Lock rows and insert?

I want to insert two rows in 2 different tables but want to roll back the transaction if some preconditions on the second table are met.
Does it work in .NET if I simply start a transaction scope and execute a SQL query to check data in the second table before executing the insert statements? If so, what isolation level should I use?
I don't want it to lock the whole tables, as there are going to be many inserts. A UNIQUE constraint is not an option, because what I want to guarantee is that no more than 2 rows in the 2nd table have the same value (an FK to a PK column of table 1).
Thanks
Yes, you can execute a SQL query to check data in the second table before executing the insert statements.
FYI, the default is Serializable. From MSDN:
The lowest isolation level, ReadUncommitted, allows many transactions to operate on a data store simultaneously and provides no protection against data corruption due to interruptive transactions. The highest isolation level, Serializable, provides a high degree of protection against interruptive transactions, but requires that each transaction complete before any other transactions are allowed to operate on the data.
The isolation level of a transaction is determined when the transaction is created. By default, the System.Transactions infrastructure creates Serializable transactions. You can determine the isolation level of an existing transaction using the IsolationLevel property of a transaction.
Given your requirement, I do not think you want to use Serializable, since it is the least friendly level for high-volume multi-user systems: it causes the most blocking.
You need to decide on the amount of protection that is required. At a minimum, you should look into READ UNCOMMITTED, READ COMMITTED, and REPEATABLE READ. The following answer goes over isolation levels in detail; from that, you can decide what level of protection is sufficient for your requirement.
Transaction isolation levels relation with locks on table
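As an illustration of the check-then-insert pattern (the table and column names here are invented), one option is to take an update/range lock on just the rows for the key being checked, so concurrent inserts for that key block while the rest of the table stays available:

DECLARE @parentId int = 1, @payload nvarchar(50) = N'example', @cnt int;

BEGIN TRAN;

-- UPDLOCK + HOLDLOCK takes a range lock for this parent_id only, so another
-- transaction cannot sneak a third row in between the check and the insert
SELECT @cnt = COUNT(*)
FROM child WITH (UPDLOCK, HOLDLOCK)
WHERE parent_id = @parentId;

IF @cnt < 2
BEGIN
    INSERT INTO child (parent_id, payload) VALUES (@parentId, @payload);
    COMMIT;
END
ELSE
    ROLLBACK;   -- precondition failed: this parent already has 2 rows

This relies on an index on parent_id so the range lock covers only that key; without one, the lock can end up broader than intended.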

How to prevent interim identity holes in SQL Server

Is there a way (using config + transaction isolation levels) to ensure that there are no interim holes in a SQL Server IDENTITY column? Persistent holes are OK. The situation I am trying to avoid is when one query returns a hole but a subsequent similar query returns a row that was not yet committed when the query had been run the first time.
Your question is one of isolation levels and has nothing to do with IDENTITY; the same problem applies to the visibility of any update/insert. The first query can return results that include an uncommitted row in one and only one situation: if you use dirty reads (READ UNCOMMITTED). If you do, then you deserve all the inconsistent results you'll get and deserve no help.
If you want to see stable results between two consecutive reads, you must have a transaction that encompasses both reads and use the SERIALIZABLE isolation level or, better, a row-versioning-based isolation level like SNAPSHOT. My recommendation would be to enable SNAPSHOT and use it. See Using Snapshot Isolation.
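A sketch of what that looks like; the database and table names are placeholders:

-- One-time setup: allow snapshot transactions in the database
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Both reads inside one snapshot transaction see the same committed state,
-- so a row committed in between cannot show up in the second read only:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT MAX(id)  FROM YourTable;
SELECT COUNT(*) FROM YourTable;
COMMIT;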
All I need is the promise that inserts to a table are committed in order of identity values they claim.
I hope you read this again and realize the impossibility of the request ('promise ... commit..'). You can't ask for a guarantee that something will succeed before it has finished. What you're asking for eventually boils down to not allocating a new identity before the previously allocated one has committed successfully; in other words, full serialization of all insert transactions.

Is it correct to say that data reading operations need not run inside transactions?

Say that a method only reads data from a database and does not write to it. Is it always the case that such methods don't need to run within a transaction?
In many databases, a read request that is not part of an explicit transaction implicitly creates a transaction to run the request.
In a SQL database you may want to use a transaction if you are running multiple SELECT statements and you don't want changes from other transactions to show up in one SELECT but not an earlier one. A transaction running at the SERIALIZABLE transaction isolation level will present a consistent view of the data across multiple statements.
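For example (the table is hypothetical; the point is only that both statements run inside one transaction at that level):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
SELECT COUNT(*)   FROM orders;   -- first read
SELECT SUM(total) FROM orders;   -- second read sees the same set of rows
COMMIT;   -- the range locks taken by the reads are held until here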
No. If you don't read at a specific isolation level, you might not get enough guarantees. For example, rows might disappear or new rows might appear.
This is true even for a single statement:
SELECT * FROM Tab
EXCEPT SELECT * FROM Tab
This query can actually return rows in case of concurrent modifications because it scans the table twice.
SQL Server: There is an easy way to get fast, nonblocking, nonlocking, consistent reads: Enable snapshot isolation and read in a snapshot transaction. AFAIK Oracle has this capability as well. Postgres too.
The purpose of a transaction is to roll back or commit the operations done to a database; if you are just selecting values and making no changes to the data, there is no need for a transaction.

Why are rollbacks needed?

Why are rollbacks so important?
Is it to prevent data (like data in a SQL DB) from being in an inconsistent state?
If so, how come the data "store" (the SQL DB or whatever) makes it possible in the first place to end up in a corrupt state?
Are there data storage mechanisms that don't have a need for "rollbacks"?
Rollbacks are important in case any kind of error appears during database operation. They can really save the day when the database server crashes or a critical exception is thrown in an application that modifies the contents of the DB. When a significant DB operation is performed (e.g. updates, inserts, etc.) and the process breaks in the middle, it would be very hard to trace which operations succeeded, and using the DB afterwards would be very complicated.
The "store" itself does not generally have a built-in mechanism for consistency control - this is exactly why we use rollbacks and transactions. This can be perceived as a sort of 'live backup' mechanism.
There are cases when you need to insert/update data in many related tables; if you didn't have transactional logic, any error somewhere in the middle of the process could leave the data inconsistent.
Simple example. Say you need to insert both order header data into an orders table and order lines into a lines table. You insert the order header, read the identity, and start inserting order lines, but this second insert fails for whatever reason. The only reliable way to recover from this situation is to roll back the first insert, either explicitly (when your connection to the DB is alive) or implicitly (when the link has gone down).
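A rough T-SQL version of that example, with made-up table names:

BEGIN TRY
    BEGIN TRAN;
    INSERT INTO orders (customer) VALUES (N'ACME');
    DECLARE @orderId int = SCOPE_IDENTITY();        -- the identity just generated
    INSERT INTO order_lines (order_id, product, qty)
        VALUES (@orderId, N'Widget', 3);            -- if this insert fails ...
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;                    -- ... the header insert is undone too
END CATCH;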
