How does the DataContext handle concurrency? - sql-server

I wonder how the DataContext handles concurrency violations.
For example -
Two users fetch some data from the database; one of them changes a row and commits, and then the other user tries to commit their own changes, so a ChangeConflictException should occur. But how does the DataContext know that the data has changed?
Does it fetch the data again and compare? Or is there some database notification mechanism?

Concurrency control can be done by using either a TimeStamp (rowversion) column in the database or the UpdateCheck attribute in LINQ-to-SQL:
MSDN Concurrency Overview (LINQ-to-SQL)
MSDN How to manage change conflicts (LINQ-to-SQL)

The original values of all columns that are marked for update checking are included in the WHERE clause of the generated UPDATE statement. If the update affected 1 row, all is well; if it affected 0 rows (e.g., somebody else changed one of those values, so the WHERE clause could no longer locate the row), you get the ChangeConflictException.
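As a rough sketch of what that generated statement looks like (the table and column names here are hypothetical):
UPDATE [dbo].[Customers]
SET [Name] = @p2
WHERE [ID] = @p0    -- primary key
  AND [Name] = @p1  -- original value of each UpdateCheck column
-- If @@ROWCOUNT is 0, another writer changed the row first and the
-- DataContext raises ChangeConflictException during SubmitChanges.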

Yes, in effect it compares the data again to validate concurrency, although it does so via the UPDATE statement's WHERE clause rather than a separate read.
LINQ to SQL employs optimistic concurrency control, which means L2S checks the state of the data as opposed to locking it. You can specify which columns L2S should use to determine whether the data has changed; by default it compares every column.
See Understanding LINQ to SQL (9) Concurrent Conflict for an in-depth discussion.
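As a hedged sketch of the knobs involved (the Customer entity, its columns, and the db DataContext are invented for illustration), the UpdateCheck attribute controls which columns participate in the check, and a conflict surfaces as a ChangeConflictException you can resolve:
// requires: using System.Data.Linq; using System.Data.Linq.Mapping;
[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public int ID { get; set; }
    // Default behavior: original value always included in the WHERE clause.
    [Column(UpdateCheck = UpdateCheck.Always)]
    public string Name { get; set; }
    // Opted out: never part of the concurrency check.
    [Column(UpdateCheck = UpdateCheck.Never)]
    public string Notes { get; set; }
}
// Elsewhere, with db as your DataContext:
try
{
    db.SubmitChanges(ConflictMode.ContinueOnConflict);
}
catch (ChangeConflictException)
{
    foreach (ObjectChangeConflict conflict in db.ChangeConflicts)
        conflict.Resolve(RefreshMode.KeepChanges); // re-read DB values, keep our edits
    db.SubmitChanges(); // retry with refreshed original values
}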

Related

Is rowversion a transactionally-consistent value to capture table data changes?

If an ETL process attempts to detect data changes on system-versioned tables in SQL Server by selecting rows whose rowversion column falls within a rowversion "delta window", e.g.:
where row_version >= #previous_etl_cycle_rowversion
and row_version < #current_etl_cycle_rowversion
... and the values for #previous_etl_cycle_rowversion and #current_etl_cycle_rowversion are selected from a logging table whose newest rowversion gets appended to it at the start of each ETL cycle via:
insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select @@DBTS
... is it possible that a rowversion of a record falling within a given "delta window" (bounded by the 2 @@DBTS values) could be missed/skipped due to rowversion's behavior vis-à-vis transactional consistency? - i.e., is it possible that rowversion would be reflected on a basis of "eventual" consistency?
I'm thinking of a case where, say, 1000 records are updated within a single transaction and somehow @@DBTS is "ahead" of a record's committed rowversion, yet that specific version of the record is not yet readable...
(For the sake of scoping the question, please exclude any cases of deleted records or immediately consecutive updates on a given record within such a large batch transaction.)
If you make sure to avoid row-versioning isolation levels for the queries that read the change windows, you shouldn't miss many rows. Under READ COMMITTED SNAPSHOT or SNAPSHOT ISOLATION, an updated but uncommitted row would not appear in your query.
But you can also miss rows that got updated after you queried @@DBTS. That's usually not a big deal, as they'll show up in the next window; but if you have a row that is constantly updated, you may miss it for a long time.
But why use rowversion at all? If these are temporal tables, you can query the history table directly. And Change Tracking is better and easier than using rowversion, as it tracks deletes and, optionally, column changes. The feature was literally built to replace the need to do this manually, which "usually involved a lot of work and frequently involved using a combination of triggers, timestamp columns, new tables to store tracking information, and custom cleanup processes".
Under SNAPSHOT isolation, it turns out the proper function for bounding delta windows, one that keeps them contiguous without skipping rowversion values attached to long-running transactions, is MIN_ACTIVE_ROWVERSION() rather than @@DBTS.
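A minimal sketch of that marker pattern, reusing the question's logging table (the source table, its row_version column, and the two window variables are assumed to be declared elsewhere):
-- At the start of each ETL cycle, log the low bound for the next cycle.
-- MIN_ACTIVE_ROWVERSION() returns the lowest rowversion still active, so
-- rows touched by in-flight transactions stay inside a future window
-- instead of being skipped, unlike @@DBTS.
insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select min_active_rowversion();
-- Then read the delta window between the two most recent markers:
select *
from dbo.source_table
where row_version >= @previous_etl_cycle_rowversion
  and row_version < @current_etl_cycle_rowversion;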

NHibernate .SaveOrUpdate: how to tell whether a row was updated or not (i.e., the row count)

I am updating a column in a SQL table and I want to check whether it was updated successfully, or whether it had already been updated and my query didn't do anything, the way @@ROWCOUNT tells us in SQL Server.
In my case, I want to update a column named lockForProcessing: if the row is already locked, my query should affect no rows, meaning someone else is already processing it; otherwise I process it.
If I understand you correctly, your problem is a multi-threading/concurrency problem, where the same table may be updated simultaneously.
You may want to have a look at Chapter 11. Transactions And Concurrency.
The ISession is not threadsafe!
The entity is not stored the moment session.SaveOrUpdate() is executed, but typically only after transaction.Commit().
"Stored" and "committed" are two different things:
The entity is stored after any session.Flush(). Depending on the IsolationLevel, the stored entity won't be seen by other transactions.
The entity is committed after a transaction.Commit(). A commit also flushes.
Maybe all you need to do is choose the right IsolationLevel when beginning transactions and then read the table row to get the current value:
using (var transaction = session.BeginTransaction(IsolationLevel.Serializable))
{
    // IsolationLevel lives in System.Data; MyEntity and id are placeholders.
    var current = session.Get<MyEntity>(id); // Read your row
    transaction.Commit();
}
Maybe it is easier to create some locking or pipeline mechanism in your application code though. Without knowing more about who is accessing the database (other transactions, sessions, processes?) it is hard to answer more precisely.
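If all you really need is the affected-row count, one hedged alternative (the Job entity and its properties are invented here) is a DML-style HQL update; IQuery.ExecuteUpdate() returns the number of rows touched, much like @@ROWCOUNT:
// Attempt to take the lock only if nobody else holds it.
int affected = session.CreateQuery(
        "update Job j set j.LockForProcessing = true " +
        "where j.Id = :id and j.LockForProcessing = false")
    .SetParameter("id", jobId)
    .ExecuteUpdate(); // number of rows actually updated
if (affected == 1)
{
    // We acquired the lock, so this thread processes the row.
}
else
{
    // Zero rows affected: someone else is already processing it.
}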

SQL Change tracking SYS_CHANGE_COLUMNS

We are running SQL 2008 R2 and have started exploring change tracking as our method for identifying changes to export to our data warehouse. We are only interested in specific columns.
We are identifying the changes on a replicated copy of the source database. If we query the change table on the source server, any specific column update is available and the SYS_CHANGE_COLUMNS is populated.
However, on the replicated copy the changes are being tracked, but the SYS_CHANGE_COLUMNS field is always NULL for an update change.
"Track columns updated" is set to true on the subscriber.
Is this due to the way replication works? Is it performing whole-row updates, meaning you cannot get column-level changes on a subscriber?
Any help or alternative approaches would be much appreciated.
Thanks
I realize this is an old question, but since I've happened across it I figure I may as well provide an answer for others who come later.
SYS_CHANGE_COLUMNS is NULL when every column is "updated". "Updated" here doesn't necessarily mean the value changed; it just means the column was touched by the DML statement. So "update t set c = c" would mean column c was "updated".
Inserts and deletes will therefore always have a SYS_CHANGE_COLUMNS value of NULL, since the whole row is affected by an insert or a delete. And most replication technologies perform an update by setting every column to the value of the column on the replication source. A replication "update" therefore touches every column, and so the SYS_CHANGE_COLUMNS value will always be NULL.
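For anyone decoding the mask on the publisher side, a hedged sketch (dbo.MyTable and MyColumn are placeholders): SYS_CHANGE_COLUMNS can be tested per column with CHANGE_TRACKING_IS_COLUMN_IN_MASK, and a NULL mask simply means "treat every column as changed":
-- @last_sync_version would come from CHANGE_TRACKING_CURRENT_VERSION()
-- captured at the end of the previous sync.
select ct.SYS_CHANGE_VERSION,
       ct.SYS_CHANGE_OPERATION,
       CHANGE_TRACKING_IS_COLUMN_IN_MASK(
           COLUMNPROPERTY(OBJECT_ID('dbo.MyTable'), 'MyColumn', 'ColumnId'),
           ct.SYS_CHANGE_COLUMNS) as MyColumnChanged
from CHANGETABLE(CHANGES dbo.MyTable, @last_sync_version) as ct;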

NHibernate session.flush() fails but makes changes

We have a SQL Server database table that consists of a user id, some numeric value (e.g. a balance), and a version column.
We have multiple threads updating this table's value column in parallel, each in its own transaction and session (we're using a session-per-thread model). Since we want every logical transaction to be applied, each thread does the following:
1. Load the current row (mapped to a type).
2. Make the change to the value, based on the old value (e.g. add 50).
3. session.Update(obj)
4. session.Flush() (since we're optimistic, we want to make sure we had the correct version value prior to the update)
5. If step 4 (the flush) threw StaleStateException, refresh the object (with LockMode.Read) and go back to step 1.
We only do this a certain number of times per logical transaction; if we can't commit it after X attempts, we reject the logical transaction.
Each such thread commits periodically, e.g. after 100 successful logical transactions, to keep commit-induced I/O at manageable levels. Meaning: we have a single database transaction (per thread) with multiple flushes, at least one per logical change.
What's the problem here, you ask? Well, on commit we see changes belonging to failed logical transactions.
Specifically, if the value was 50 when we went through step 1 (for the first time), and we tried to update it to 100 (but failed since, e.g., another thread changed it to 70), then the value of 50 is committed for this row. Obviously this is incorrect.
What are we missing here?
Well, I don't have a ton of experience here, but one thing I remember reading in the documentation is that if an exception occurs, you are supposed to immediately roll back the transaction and dispose of the session. Perhaps your issue is related to the session being in an inconsistent state?
Also, calling Update in your code here is not necessary. Since you loaded the object in that session, it is already being tracked by NHibernate.
If you want to make your changes anyway, why do you bother with row versioning? It sounds like you should get the same result if you simply always update the data and let the last transaction win.
As to why the update becomes permanent, it depends on what the SQL statements for the version check/update look like and on your transaction control, which you left out of the code example. If you turn on the Hibernate SQL logging it will probably become obvious how this is happening.
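A hedged sketch of the rollback-and-dispose pattern the documentation describes (sessionFactory, Account, accountId and maxAttempts are placeholders): once a flush fails, the session must be thrown away, so the retry loop opens a fresh session and transaction per attempt:
// requires: using NHibernate;
for (int attempt = 0; attempt < maxAttempts; attempt++)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        try
        {
            var account = session.Get<Account>(accountId);
            account.Balance += 50;   // the logical change
            tx.Commit();             // flush happens here; version is checked
            break;                   // success, stop retrying
        }
        catch (StaleObjectStateException)
        {
            tx.Rollback();           // discard the attempt...
        }                            // ...and the using blocks dispose the session
    }
}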
I'm not an NHibernate guru, but the answer seems simple.
When NHibernate loads an object, it expects it not to change in the DB for as long as it's in the NHibernate session cache.
As you mentioned, you have a multi-threaded app.
This is what happens:
1. The 1st thread loads an entity.
2. The 2nd thread loads the same entity.
3. The 1st thread changes the entity.
4. The 2nd thread changes the entity and then finds out that the entity it loaded has been changed by something else; afraid of clobbering the 1st thread's changes, it throws an exception to make the programmer aware of that.
You are missing a locking mechanism. I can't tell much about how to apply that properly and elegantly; maybe a transaction would help.
We had similar problems when we used NHibernate and raw ADO.NET concurrently (luckily, just for querying, at least in production code). All we had to do was force updating the DB on insert/update so we could actually query something through full-text search for some specific entities.
We also hit StaleStateException in integration tests when we used raw ADO.NET to reset the DB. The NHibernate session was alive through a bunch of tests, but every test tried to clean up the DB without NHibernate being aware of it.
Here is the documentation on exceptions in the session:
http://nhibernate.info/doc/nhibernate-reference/best-practices.html

Synchronize DataSet

What is the best approach to synchronizing a DataSet with data in a database? Here are the parameters:
We can't simply reload the data because it's bound to a UI control which a user may have configured (it's a tree grid that they may expand/collapse)
We can't use a change flag (like an UpdatedTimeStamp column) in the database because changes don't always flow through the application (e.g. a DBA could update a field with a SQL statement)
We cannot use an update trigger in the database because it's a multi-user system
We are using ADO.NET DataSets
Multiple fields of a given row can change
I've looked at the DataSet's Merge capability, but this doesn't seem to keep the notion of an "ID" column. I've looked at the DiffGram capability, but the issue there is that DiffGrams seem to be generated from changes within the same DataSet rather than from changes that occurred on some external data source.
I've been running from this solution for a while, but the approach I know would work (with a lot of inefficiency) is to build a separate DataSet and then iterate all rows, applying changes, field by field, to the DataSet to which the UI is bound.
Has anyone had a similar scenario? What did you do to solve the problem? Even if you haven't run into a similar problem, any recommendation for a solution is appreciated.
Thanks
DataSet.Merge works well for this if you have a primary key defined for each DataTable; the DataSet will raise change events to any data-bound GUI controls.
If your table is small you can just re-read all of the rows and merge periodically; otherwise, limiting the set to be read with a timestamp is a good idea - just tell the DBAs to follow the rules and update the timestamp ;-)
Another option - which is a bit of work - is to keep a changed-row queue (timestamp, row ID) using a trigger or stored procedure, and base the refresh queries off the timestamp in the queue; this will be more efficient if the base table has a lot of rows, allowing you (via an inner join on the queue records) to pull only the rows changed since the last poll time. A minimal sketch of the merge-by-key approach follows.
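Here the "Items" table, "ID" column, and the re-read helper are hypothetical, and current is the DataSet bound to the grid:
// requires: using System.Data;
var items = current.Tables["Items"];
items.PrimaryKey = new[] { items.Columns["ID"] }; // Merge matches rows on this key
DataSet latest = LoadSnapshotFromDatabase();      // hypothetical full or delta re-read
current.Merge(latest, false, MissingSchemaAction.Ignore); // false: incoming DB values win
current.AcceptChanges();                          // rows updated in place, bindings intact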
I think it would be easier to store a list of the nodes the user has expanded (assuming you can uniquely identify each one), then re-load the data, re-bind it to the tree view, and re-expand the previously expanded nodes.
