What exactly does SaveChanges() do in EF6? - sql-server

I'm trying to understand transactions in Entity Framework 6. I've searched a lot, but I'm still confused.
Take a look at this:
Dim transaction = context.Database.BeginTransaction()
Using transaction
    '...
    context.Entry(entity1).State = EntityState.Added
    context.SaveChanges()
    '...
    context.Entry(entity2).State = EntityState.Added
    context.SaveChanges()
    '...
    context.Entry(entity3).State = EntityState.Added
    context.SaveChanges()
    transaction.Commit()
    'Or transaction.Rollback()
End Using
Now, what exactly does SaveChanges() do, and how does it differ from the commit?
Does it begin a new (maybe internal) transaction for each insert and then commit it?
I read https://msdn.microsoft.com/en-us/data/dn456843.aspx.
That was what I understood from: "In all versions of Entity Framework, whenever you execute SaveChanges() to insert, update or delete on the database the framework will wrap that operation in a transaction. This transaction lasts only long enough to execute the operation and then completes. When you execute another such operation a new transaction is started."

My understanding is that SaveChanges() takes all the changes made to the entities (especially where there are relationships with cascading deletes, or where an item that has been deleted is reinserted) and sorts the operations so they are carried out in the correct order.
For example, if you have a table with a unique constraint, and you have deleted one entity with a unique value on the constrained column and reinserted another entity with the same value, the operations are carried out in the correct order so the underlying DBMS doesn't throw a unique constraint exception. The same goes for non-auto-incremented primary keys and a variety of other things, although hopefully you get the gist of it.
The entities are stored in a graph with the relationships as edges, so the framework can sort the graph and perform the operations in the correct order.
This is carried out by the ChangeTracker. I know this from working with / building my own entity tracker using the source code from the awesome IQToolkit.
I also understand that this is carried out in a single transaction, if the underlying DBMS supports it.
Also, in your example, you only need to call SaveChanges() once, not after each change to an entity.
You also don't need to create an explicit transaction and commit it, as SaveChanges does this internally, unless you need to roll back the transaction due to some external factor.
EDIT
To explicitly answer your questions in bold:
"Now what exactly does SaveChanges() Do? and how does it differ from the commit??"
It sorts the SQL commands generated by each change made to the entities and executes them, in a single transaction, in an order that will not violate any relationship or field constraints set up within the database. As it uses its own transaction, you don't need to wrap the operation in a new transaction and commit it, unless you have a reason to roll the operations back due to some external factor.
It differs from Commit in that Commit will commit any changes made during a transaction, while SaveChanges creates its own transaction around the updates and commits that transaction. When you open an explicit transaction as in your example, EF detects it and enlists SaveChanges in the outer transaction instead of creating a new one, so you can still cancel everything if required.
"Does it begin a new (maybe internal) transaction for each insert and then commit it?"
No, it wraps all the changes from a single SaveChanges call in one internal transaction and commits them together (or, as above, enlists them in your outer transaction).
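As a minimal C# sketch of the pattern (the MyContext class, People set, and Person entity are hypothetical names, not from the question):

using (var context = new MyContext())
using (var transaction = context.Database.BeginTransaction())
{
    try
    {
        context.People.Add(new Person { Name = "Adam" });
        context.SaveChanges(); // enlists in the outer transaction; nothing visible yet

        context.People.Add(new Person { Name = "Bob" });
        context.SaveChanges(); // same transaction, still uncommitted

        transaction.Commit();  // both inserts become permanent together
    }
    catch
    {
        transaction.Rollback(); // discards everything since BeginTransaction
        throw;
    }
}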

Related

SQLAlchemy: Should I commit an update between two queries?

In a single session, if I want to:
foo = session.query(Foo).filter_by(id=1).one()        # make a query on Foo table to get one instance
foo.value = "updated"                                 # update this instance (attribute name is a placeholder)
# session.commit() or not?
foo_again = session.query(Foo).filter_by(id=1).one()  # make the same query on Foo table
Will I get the same result from these two queries? That is to say, is it necessary to commit the update before querying the table again within a single session?
Thanks!
It is not necessary to commit prior to making the query again. As a general principle, updates within a transaction (session) will be visible to subsequent queries in that same transaction, even prior to a commit.
Having said that, doing the exact same query twice within a transaction might be a "code smell". It's worth considering: since the updated object is already in memory, is it really necessary to query it again?
Also, depending on the database isolation level, the second query is not guaranteed to return the same result set as the first one. This can happen if another transaction modifies the data prior to the second query.
It's not necessary to commit in between, as changes made within a transaction are visible to subsequent queries in that same transaction.
You can just put the commit at the end, although I am not sure whether multiple commits will affect runtime.

NHibernate SaveOrUpdate: how to tell whether a row was updated or not (like RowCount)

I am updating a column in a SQL table, and I want to check whether it was updated successfully or was already updated so my query didn't do anything, the way we get @@ROWCOUNT in SQL Server.
In my case, I want to update a column named lockForProcessing: if it is already set, my query would not affect any rows, which means someone else is already processing the item; otherwise, I would process it.
If I understand you correctly, your problem is a multithreading / concurrency problem, where the same table may be updated simultaneously.
You may want to have a look at the NHibernate documentation:
Chapter 11. Transactions And Concurrency
The ISession is not threadsafe!
The entity is not stored the moment session.SaveOrUpdate() is executed, but typically after transaction.Commit().
Stored and committed are two different things.
The entity is stored after any session.Flush(). Depending on the IsolationLevel, the entity won't be seen by other transactions.
The entity is committed after a transaction.Commit(). A commit also flushes.
Maybe all you need to do is choose the right IsolationLevel when beginning transactions and then read the table row to get the current value:
using (var transaction = session.BeginTransaction(IsolationLevel.Serializable))
{
    var entity = session.Get<MyEntity>(id); // read your row (entity type and id are placeholders)
    transaction.Commit();
}
Maybe it is easier to create some locking or pipeline mechanism in your application code, though. Without knowing more about who is accessing the database (other transactions, sessions, processes?), it is hard to answer more precisely.
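Another option worth sketching: NHibernate can run a native SQL update and report how many rows it affected, which behaves like @@ROWCOUNT. A minimal sketch, assuming a table MyTable with id and lockForProcessing columns (hypothetical names):

using (var session = sessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    // ExecuteUpdate() returns the number of rows the statement affected
    int affected = session.CreateSQLQuery(
            "UPDATE MyTable SET lockForProcessing = 1 " +
            "WHERE id = :id AND lockForProcessing = 0")
        .SetParameter("id", itemId)
        .ExecuteUpdate();

    transaction.Commit();

    if (affected == 0)
    {
        // someone else already locked the item; skip it
    }
    else
    {
        // we acquired the lock and can process the item
    }
}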

database atomicity consistency

What is the difference between atomicity and consistency? It looks to me as if both are saying the same thing in different words.
Atomicity
All tasks of a transaction are performed or none of them are. There are no partial transactions. For example, if a transaction starts updating 100 rows, but the system fails after 20 updates, then the database rolls back the changes to these 20 rows.
Consistency
The transaction takes the database from one consistent state to another consistent state. For example, in a banking transaction that debits a savings account and credits a checking account, a failure must not cause the database to credit only one account, which would lead to inconsistent data.
And it looks like atomicity is a subset of consistency; then it should be CID (consistency, isolation, durability), not ACID.
Atomicity is indeed saying that each transaction is either all or nothing, meaning that either all or none of its actions are executed and that there are no partial operations.
However, consistency talks about ensuring that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof (taken from Wikipedia).
That basically means that only valid states are written to the database, and that a transaction will either be executed if it doesn't violate the data consistency or rolled back if it does.
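To make the banking example concrete, here is a minimal ADO.NET sketch (the Savings and Checking tables, connectionString, and accountId are assumptions for illustration):

using System.Data.SqlClient;

// Atomicity: both updates happen or neither does. Consistency: constraints
// such as CHECK (Balance >= 0) can reject the transaction, so only valid
// states are ever committed.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        try
        {
            using (var debit = new SqlCommand(
                "UPDATE Savings SET Balance = Balance - 100 WHERE AccountId = @id",
                connection, transaction))
            {
                debit.Parameters.AddWithValue("@id", accountId);
                debit.ExecuteNonQuery();
            }

            using (var credit = new SqlCommand(
                "UPDATE Checking SET Balance = Balance + 100 WHERE AccountId = @id",
                connection, transaction))
            {
                credit.Parameters.AddWithValue("@id", accountId);
                credit.ExecuteNonQuery();
            }

            transaction.Commit();   // all or nothing
        }
        catch
        {
            transaction.Rollback(); // no partial transfer is ever visible
            throw;
        }
    }
}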
Hope that clears things up for you.
A simple explanation of consistency: if a field's type in the database is Integer, it should accept only Integer values and nothing else. If you try to store another type in that field, consistency is violated and the transaction will roll back.
Atomicity:
Take a bunch of statements, say 100 statements (which could be inserts). If any of them fails while processing, the statements already executed should be reverted, which means the database should go back to its original state.
connection.setAutoCommit(false);
try {
    // statement one, two, three ...
    connection.commit();   // commit only if every statement succeeded
} catch (SQLException e) {
    connection.rollback(); // any failure reverts all statements
}
Consistency:
The data you are trying to insert must satisfy the rules defined on the database: constraints, cascades, triggers, and so on. For example, if the table has a primary key constraint, the data you are planning to insert must satisfy that constraint.
Isolation:
If two processes are running against the database, say one reading while the other is writing, the reading process should see only committed data, never uncommitted in-memory data.
Durability:
Once a transaction's data is committed to the database, it should stay that way; it should not be affected by a power failure, a system crash, or anything else.

Are updates within an entity group always visible to reads within the group after commit returns?

I have a question about the examples in this article:
http://code.google.com/appengine/articles/transaction_isolation.html
Suppose I put Adam and Bob in the same entity group and modify the operation getTallPeople to only check the height of Adam and Bob (i.e. access only entities in the entity group). Now, if I execute the following statements:
begin transaction
updatePerson (update Adam's height to 74 inches)
commit transaction
begin transaction
getTallPeople
commit transaction
Can I be sure that getTallPeople will always return both Adam and Bob? I.e. if entity/index updates have not completed, will the second transaction wait until they have? Also, would the behavior be the same without using a transaction for getTallPeople?
Thanks for your help!
Yes. For getTallPeople to be called within a transaction, it must use an "ancestor" filter in its query to limit its results to members of the group. If it does so, both the index data it uses to determine the results and the entities it fetches based on those results will be strongly consistent with the committed results of the previous transaction. This is also true without the explicit transaction if the query uses an ancestor filter and you're using the HR datastore. (The HR datastore has been the default for a while, so you're probably using it.)
If getTallPeople performs a query without an ancestor filter and you're using the HR datastore, it will use the global index data, which is only guaranteed to be eventually consistent across the dataset. In this case, the query might see index data for the group prior to the previous transaction, even though the previous transaction has already committed.
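The article's examples use the original App Engine SDK; as an illustration of the same idea in C#, here is a sketch of an ancestor query with the Google.Cloud.Datastore.V1 client (the project, kind, and property names are hypothetical):

using System;
using Google.Cloud.Datastore.V1;

DatastoreDb db = DatastoreDb.Create("my-project");
Key groupKey = db.CreateKeyFactory("Family").CreateKey("smiths");

// The ancestor filter limits the query to a single entity group,
// which is what makes the results strongly consistent with
// previously committed transactions.
Query query = new Query("Person")
{
    Filter = Filter.And(
        Filter.HasAncestor(groupKey),
        Filter.GreaterThanOrEqual("height", 72))
};

foreach (Entity person in db.RunQuery(query).Entities)
{
    Console.WriteLine(person["name"].StringValue);
}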

SQL Server 2005 - manage concurrency on tables

I've got this process in an ASP.NET application:
Start a connection
Start a transaction
Insert a lot of values into a table "LoadData" with the SqlBulkCopy class, with a column that contains a specific LoadId.
Call a stored procedure that:
reads the table "LoadData" for the specific LoadId;
for each line, does a lot of calculations, which involves reading dozens of tables, and writes the results into a temporary (#temp) table (a process that lasts several minutes);
deletes the lines in "LoadData" for the specific LoadId.
Once everything is done, write the result into the result table.
Commit the transaction, or roll back if something fails.
My problem is that if two users start the process, the second one has to wait until the first has finished (because the insert seems to put an exclusive lock on the table), and my application sometimes hits a timeout (and the users are not happy to wait :) ).
I'm looking for a way to let the users run everything in parallel, as there is no interaction between them except for the last step: writing the result. I think what is blocking me is the inserts / deletes in the "LoadData" table.
I checked the other transaction isolation levels but it seems that nothing could help me.
What would be perfect would be to be able to remove the exclusive lock on the "LoadData" table (is it possible to force SQL Server to lock only rows and not the table?) once the insert is finished, but without ending the transaction.
Any suggestion?
Look up the READ_COMMITTED_SNAPSHOT database option (and SNAPSHOT isolation) in Books Online.
Transactions should cover small and fast-executing pieces of SQL / code. They have a tendency to be implemented differently on different platforms. They will lock tables and then escalate the lock as the modifications grow, thus locking out other users from querying or updating the same row / page / table.
Why not forget the transaction and handle processing errors in another way? Is your data integrity truly being secured by the transaction, or can you do without it?
If you're sure that there is no issue with concurrent operations except the last part, why not start the transaction just before those last statements (whichever they are that DO require isolation) and commit immediately after they succeed? Then all the upfront operations will not block each other; a sketch of that shape follows.
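A minimal C# sketch of that arrangement (the ProcessLoad and WriteResult procedure names, the loadDataTable DataTable, and the cleanup of leftover LoadData rows on failure are assumptions left to the reader):

using System.Data;
using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // 1) Bulk load outside any explicit transaction: each user writes
    //    rows tagged with their own LoadId, so runs don't conflict.
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "LoadData";
        bulkCopy.WriteToServer(loadDataTable);
    }

    // 2) Long-running calculations (stored procedure), still untransacted.
    using (var process = new SqlCommand("ProcessLoad", connection))
    {
        process.CommandType = CommandType.StoredProcedure;
        process.CommandTimeout = 0; // the calculation can take minutes
        process.Parameters.AddWithValue("@LoadId", loadId);
        process.ExecuteNonQuery();
    }

    // 3) Only the final write runs inside a short transaction.
    using (var transaction = connection.BeginTransaction())
    using (var write = new SqlCommand("WriteResult", connection, transaction))
    {
        write.CommandType = CommandType.StoredProcedure;
        write.Parameters.AddWithValue("@LoadId", loadId);
        write.ExecuteNonQuery();
        transaction.Commit(); // locks are held only for this brief step
    }
}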
