Pessimistic offline locking using Entity Framework - sql-server

First I'd like to describe the mechanism of the locking solution I'd like to implement. Basically, an item can be opened in read or edit mode, but if a user opens the item in edit mode, no other user should be able to open it in edit mode at the same time. An 'item' here means a case in a customer service application.
In order to do this I came up with the following: the table will contain a flag which indicates whether an item is checked out for editing, and an 'end time' until which that flag is considered valid. The default duration is 3 minutes; if no user interaction happens during this time, the flag can be ignored the next time a user tries to open the same item.
On the UI side, I use jQuery to monitor whether the user is active. If he or she is, a periodic AJAX call extends the time frame so he or she can continue working on the item. When the user saves the item, the flag is removed. The end time is necessary to handle situations where the browser crashes, or where the user goes off to drink a coffee and leaves the item open for an hour.
So, the question. :) When a user opens the item in edit mode, first I have to read the flag and end-time values for the item, and if I find them valid (the flag is not set, or it is set but has expired), I have to update them with new values.
What transaction isolation level should I use for this in EF, if any? Or should I write stored procedures that handle the select and update in a single transaction? If so, what kind of locking method should I use?
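To make the design concrete, here is a minimal sketch of the entity I have in mind (the property names are just for illustration):

    public class CaseItem
    {
        public int Id { get; set; }

        // True while a user has the item open in edit mode.
        public bool IsCheckedOut { get; set; }

        // The checkout only counts until this time; the periodic AJAX
        // call pushes it forward while the user stays active.
        public DateTime? CheckoutExpiresAt { get; set; }
    }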

You are describing pessimistic locking; there is really no debate on that. There are detailed instructions on what you want to do in the excellent MVC/EF tutorial: http://www.asp.net/mvc/tutorials/getting-started-with-ef-using-mvc/handling-concurrency-with-the-entity-framework-in-an-asp-net-mvc-application
There's a chapter early on about pessimistic concurrency.

Optimistic locking is still OK in this case. You can use a timestamp / rowversion column and your flag together. The flag handles your application logic - only a single user can edit the record - and the timestamp avoids a race condition when setting the flag, because only a single thread will be able to read the record and write it back. Any other thread that reads the record concurrently and saves it after the first thread will get a concurrency exception.
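A minimal sketch of that combination, assuming code-first EF and the hypothetical CaseItem entity from the question; the [Timestamp] attribute (System.ComponentModel.DataAnnotations) maps the property to a SQL Server rowversion column, and MyDbContext stands in for your context type:

    public class CaseItem
    {
        public int Id { get; set; }
        public bool IsCheckedOut { get; set; }
        public DateTime? CheckoutExpiresAt { get; set; }

        [Timestamp]
        public byte[] RowVersion { get; set; }   // rowversion / timestamp column
    }

    // Try to set the flag; only one concurrent caller can win.
    public bool TryCheckOut(MyDbContext db, int id)
    {
        var item = db.CaseItems.Find(id);
        if (item.IsCheckedOut && item.CheckoutExpiresAt > DateTime.UtcNow)
            return false;                      // someone else holds a valid checkout

        item.IsCheckedOut = true;
        item.CheckoutExpiresAt = DateTime.UtcNow.AddMinutes(3);
        try
        {
            db.SaveChanges();                  // UPDATE ... WHERE RowVersion matches
            return true;
        }
        catch (DbUpdateConcurrencyException)   // System.Data.Entity.Infrastructure
        {
            return false;                      // another thread set the flag first
        }
    }

The losing thread simply treats the concurrency exception as "the item is already being edited".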
If you don't want to use a timestamp, a different transaction isolation level alone will not help you, because the isolation level doesn't force queries to lock records in the way you need. You must write the SQL query manually and use the UPDLOCK hint to lock the record while querying, and then execute the update. You can do this in a stored procedure.
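A sketch of the manual approach from ADO.NET (connectionString and itemId assumed in scope; table and column names are the hypothetical ones from above, with CheckoutExpiresAt assumed NOT NULL with a default in the past; the same two statements could equally live in a stored procedure):

    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            // UPDLOCK: read the row and lock it against writers and other
            // UPDLOCK readers until the transaction ends.
            var check = new SqlCommand(
                @"SELECT IsCheckedOut, CheckoutExpiresAt
                  FROM CaseItems WITH (UPDLOCK, ROWLOCK)
                  WHERE Id = @id", conn, tx);
            check.Parameters.AddWithValue("@id", itemId);

            bool free;
            using (var r = check.ExecuteReader())
            {
                r.Read();
                free = !r.GetBoolean(0) || r.GetDateTime(1) < DateTime.UtcNow;
            }

            if (free)
            {
                var set = new SqlCommand(
                    @"UPDATE CaseItems
                      SET IsCheckedOut = 1,
                          CheckoutExpiresAt = DATEADD(MINUTE, 3, GETUTCDATE())
                      WHERE Id = @id", conn, tx);
                set.Parameters.AddWithValue("@id", itemId);
                set.ExecuteNonQuery();
            }

            tx.Commit();
        }
    }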

The answer below is not a good way to implement pessimistic concurrency; you should not implement this at the application level. The RDBMS has better tools for this.
If you are locking a row in the db, this is by definition pessimistic.
Since you are controlling the pessimistic concurrency at the application level, I don't think it matters which transaction scope EF uses; EF will automatically start a db-level transaction when you call SaveChanges.
To prevent multiple threads from executing the lock / unlock from your app, you can lock the section of code that queries and updates, like so:
    public class MyClassThatManagesConcurrency
    {
        // Static so that every instance contends on the same lock object.
        private static readonly object _lock = new object();

        public void MyMethodThatManagesConcurrency()
        {
            lock (_lock)
            {
                // query for the data
                // determine if the item should be unlocked
                // dbContext.SaveChanges();
            }
        }
    }
With the above, no two threads will ever execute the code inside the lock section at the same time. However, I am not sure why this is necessary: if all you are doing is reading the object and unlocking it when its time has expired, and two threads enter the method at the same time, the item will become unlocked either way.
On the other hand, if your db row for this object has a timestamp column (not a datetime column, but a column for versioning rows) and two threads enter the method at the same time, the second will receive a concurrency exception. But unless you are versioning rows at the db level, I don't think you need to do any locking.
Reply to comment
OK, I get it now; you are right. But you are still locking at the application level, which means it should not matter which db transaction EF chooses. To prevent two users from unlocking the same object, use the C# lock block I posted above.

Related

Implement pessimistic locking

I'm interested in how I can implement pessimistic locking, with very specific behavior.
(The reason I tagged the question with Sybase+Oracle+MSSQL, is because I'd be happy with a solution or "that's impossible!" for any one of them)
What I want is this:
1 - be able to lock a row (so that the process can later update it, but no other process can lock the row)
2 - when another process tries to lock the same row, it should get a notification that the record is locked - I don't want this process to hang (I believe a simple timeout can be used here)
3 - when another process tries to read the record, it should be able to read it as it currently is in the database (but I don't want to use dirty reads).
The above 3 requirements are currently solved by the application using shared memory, performing the record locking outside the database. I'd like to move the locking into the database.
So far, I'm having conflicts between #1 and #3 - if I lock the record by doing an 'update ...' that sets a field to its same value, then a 'select' from another process hangs.
Edit:
I'm having some luck now with the snapshot isolation level on MSSQL. I can do both the locking and the reads without using dirty reads.
The reason I don't want to use dirty reads is that if a report is running, it might read multiple tables and issue multiple queries. Snapshot gives me a consistent snapshot of the database; with a dirty read, I'd have mismatched data if there were any updates in the middle.
I think Oracle has snapshot as well, so now I'm most interested in Sybase.
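For reference, a sketch of the MSSQL side from ADO.NET (connectionString assumed in scope), assuming snapshot isolation has been enabled on the database with ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON; IsolationLevel here is System.Data.IsolationLevel:

    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        // Every query in this transaction sees one consistent snapshot of
        // the database and is not blocked by writers holding row locks.
        using (var tx = conn.BeginTransaction(IsolationLevel.Snapshot))
        {
            var cmd = new SqlCommand("SELECT * FROM tab WHERE id = 1234", conn, tx);
            using (var reader = cmd.ExecuteReader())
            {
                // ... read report data; further queries in this
                // transaction see the same snapshot ...
            }
            tx.Commit();
        }
    }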
In Oracle you can use select for update nowait to lock a record.
select * from tab where id=1234 for update nowait;
If another process tries to execute the same statement, it gets an exception:
ORA-00054: resource busy and acquire with NOWAIT specified
until the first process (session) performs a commit or rollback.
Normally, Oracle doesn't permit dirty reads.
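From C#, the ORA-00054 error can be turned into a "record is locked" notification instead of a hang; a sketch assuming the ODP.NET provider (Oracle.ManagedDataAccess.Client), where OracleException.Number carries the ORA error code, and conn is an open OracleConnection:

    var tx = conn.BeginTransaction();   // commands on conn enlist automatically
    try
    {
        var cmd = new OracleCommand(
            "select * from tab where id = 1234 for update nowait", conn);
        using (var reader = cmd.ExecuteReader())
        {
            // ... row is now locked until tx.Commit() or tx.Rollback() ...
        }
    }
    catch (OracleException ex)
    {
        if (ex.Number == 54)            // ORA-00054: resource busy, NOWAIT
        {
            tx.Rollback();
            // notify the caller that the record is locked
        }
        else throw;
    }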
Your described conflict between #1 and #3 is a logical one: you can either let the database do dirty reads OR you block the reads. If you could read the locked row, that would be a dirty read by definition. This has nothing to do with the specific database system you use!
So if you want it that way: yes, what you want is impossible in all 3 systems, because it violates the definition of "dirty read".

Elegant solution to write to a database as well as an RFID card in a somewhat atomic way

Assume you want to write some data to an RFID card and some data to a database.
Theoretically, the connection to the database might break at any time, and the connection between the card and the card reader might break at any time too.
Your goal is to design this in a somewhat atomic way, so that you end up with either both writes or neither.
Is there any elegant approach to solve this?
Any hints are appreciated :)
(I'm using MS SQL Server and the Windows Smart Card Module (Winscard.dll) in case it makes any difference)
Insert a record before you start your update of the RFID card, then update this record when complete. This way you'll also be able to spot half-completed updates and act accordingly:
1. Insert a record into the database with a status of 'incomplete'. Give it a timestamp, a unique id, and any other pertinent information you require to uniquely describe the RFID write attempt.
2. Write to the RFID card.
3. Update the record inserted in step 1 with a 'completed' or 'failure' status.
Then have a process which resolves incomplete records after a timeout period has elapsed (either by retrying, or whatever else you want to do).
Do not try to hold a database transaction open while you communicate with your RFID card, as this will:
- give you unwanted (and potentially lengthy) locks on your database
- roll back upon failure, which will remove any evidence that you tried to do something to your RFID card
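A sketch of that sequence (I'm using Dapper's Execute on an open SqlConnection purely to keep the sketch short - any data-access code works; WriteToRfidCard and payload stand in for your Winscard-based call and data):

    var attemptId = Guid.NewGuid();

    // 1. Record the attempt and commit immediately; no transaction stays open.
    conn.Execute(
        @"INSERT INTO RfidWriteLog (Id, Status, CreatedAt)
          VALUES (@Id, 'Incomplete', GETUTCDATE())",
        new { Id = attemptId });

    // 2. Talk to the card with no database locks held.
    bool cardOk = WriteToRfidCard(payload);

    // 3. Record the outcome.
    conn.Execute(
        "UPDATE RfidWriteLog SET Status = @Status WHERE Id = @Id",
        new { Id = attemptId, Status = cardOk ? "Completed" : "Failure" });

A background job can then sweep rows still marked 'Incomplete' after a timeout, which catches crashes between steps 1 and 3.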
Define 'somewhat'. A truly atomic approach would require the smartcard module to participate in a distributed transaction, and I'm not aware of any smartcard API and/or provider supporting transactions.
Anything less than atomic means not atomic, and you can play games of retry-and-status until the cows come home, but unless the smartcard write is idempotent you have zero chance of correctness in the presence of failures. If the smartcard write is idempotent, then it is simply a matter of retry and ack logic: write to the DB with status 'Pending', commit, then write to the smartcard, then write to the DB with status 'Done'. On a crash you need a process that scans for status 'Pending' and retries. Since the smartcard write is idempotent, the retry is safe. And yes, Using Tables as Queues is relevant here.
You'll want to update the database first, and then predicate the RFID update on the success of the database update. The database update should be wrapped in a transaction to make it atomic as well.
Of course you'll want to roll back the database update if the subsequent RFID update fails. Therefore I would consider including a transaction audit with the database update. If the RFID update fails, you can then return to a consistent state by invoking a rollback routine for your DB update.
Hope this helps!

database record locking

I have a server application and a database. Multiple instances of the server can run at the same time, but all data comes from the same database (on some servers it is PostgreSQL, in other cases MS SQL Server).
In my application, there is a process that can take hours to perform. I need to ensure that this process is only executed one at a time: if one server is processing, no other server instance can process until the first one has completed.
The process depends on one table (let's call it 'ProcessTable'). What I do is: before any server starts the hour-long process, it sets a boolean flag in ProcessTable indicating that the record is 'locked' and being processed (not all records in this table are processed/locked, so I need to specifically mark each record needed by the process). When the next server instance comes along while the previous instance is still processing, it sees the boolean flags and throws an exception.
The problem is that two server instances might both be activated at nearly the same time, and when both check ProcessTable there may not be any flags set yet: both servers are in the process of setting the flags, but since neither transaction has committed, neither process sees the locking done by the other. Because the locking mechanism itself may take a few seconds, there is a window of opportunity in which two servers can still process at the same time.
It appears that what I need is a single record in my 'Settings' table storing a boolean flag called 'LockInProgress'. Before a server can even lock the needed records in ProcessTable, it must first make sure it has the right to do the locking by checking the 'LockInProgress' column in the Settings table.
So my question is: how do I prevent two servers from modifying that LockInProgress column in the Settings table at the same time? Or am I going about this in the wrong manner?
Please note that I need to support both PostgreSQL and MS SQL Server, as some servers use one database and some use the other.
Thanks in advance...
How about obtaining a lock on the record first, and then updating the record to show "locked"? This prevents the 2nd instance from acquiring the lock, so its update of the record fails.
The point is to make the lock and the update a single atomic step.
Make a stored procedure that hands out the lock, and run it under the 'serializable' isolation level. This will guarantee that one and only one process can get at the resource at any given time.
Note that this means the second process trying to get the lock will block until the first process releases it. Also, if you have to acquire multiple locks in this manner, make sure the design of the process guarantees that the locks will be acquired and released in the same order; this avoids deadlock situations where two processes hold resources while waiting for each other to release locks.
Unless you can't deal with your other processes blocking, this would probably be easier to implement and more robust than attempting to implement 'test and set' semantics.
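A sketch of the hand-out-the-lock idea from C# (SQL Server flavour; conn assumed open and the 'ProcessLock' row assumed to exist; I've added an UPDLOCK hint so two callers arriving together block rather than deadlock on their shared read locks):

    using (var tx = conn.BeginTransaction(IsolationLevel.Serializable))
    {
        var read = new SqlCommand(
            @"SELECT LockInProgress FROM Settings WITH (UPDLOCK)
              WHERE SettingsKey = 'ProcessLock'", conn, tx);
        bool alreadyLocked = (bool)read.ExecuteScalar();

        if (!alreadyLocked)
        {
            new SqlCommand(
                @"UPDATE Settings SET LockInProgress = 1
                  WHERE SettingsKey = 'ProcessLock'", conn, tx).ExecuteNonQuery();
        }

        tx.Commit();   // a second caller blocks on the SELECT until here

        if (alreadyLocked)
            throw new InvalidOperationException("Another instance is processing.");
    }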
I've been thinking about this, and I think this is the simplest way of doing things; I just execute a command like this:
update settings set settingsValue = '333' where settingsKey = 'ProcessLock' and settingsValue = '0'
'333' would be a unique value which each server process generates (based on date/time, server name, a random value, etc.).
If no other process has locked the table, then settingsValue will be '0', and that statement will change it.
If another process has already locked the table, then the statement becomes a no-op and nothing gets modified.
I then immediately commit the transaction.
Finally, I re-query the table for the settingsValue, and if it is the value I wrote, our lock succeeded and we continue on; otherwise an exception is thrown, etc. When we're done with the lock, we reset the value back to '0'.
Since I'm using the SERIALIZABLE transaction isolation level, I can't see this causing any issues... please correct me if I'm wrong.
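In outline, with ADO.NET (conn assumed open; the token generation is up to you):

    string token = Guid.NewGuid().ToString();   // unique per server process

    var take = new SqlCommand(
        @"UPDATE settings SET settingsValue = @token
          WHERE settingsKey = 'ProcessLock' AND settingsValue = '0'", conn);
    take.Parameters.AddWithValue("@token", token);
    take.ExecuteNonQuery();
    // ... commit here ...

    var check = new SqlCommand(
        "SELECT settingsValue FROM settings WHERE settingsKey = 'ProcessLock'", conn);
    bool gotLock = (string)check.ExecuteScalar() == token;
    // if gotLock: run the process, then UPDATE the value back to '0'

As a side note, ExecuteNonQuery returns the number of rows affected, so checking for a return value of 1 tells you whether you won the lock without the follow-up SELECT.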

Two threads adding new rows at the same time - how to prevent?

In my application, I have a couple of threads that execute some logic.
At the end, they add a new row to some table.
Before adding the new row, they check whether a previous entry with the same details already exists. If one is found, they update it instead of adding.
The problem is when thread A does the check and sees that no previous entity with the same details exists, but just before it adds the new row, thread B searches the DB for the same entity. Thread B sees that no such entity exists, so it adds a new row too.
The result is two rows with the same data in the table.
Note: no table key is violated, because each thread gets the next sequence value just before adding the row, and the table key is an ID unrelated to the data.
Even if I change the table key to a combination of the data, that would prevent two rows with the same data, but it would cause a DB error when the second thread tries to add its row.
Thank you in advance for the help, Roy.
You should be using a queue, possibly a blocking queue. Threads A and B (producers) would add objects to the queue, and another thread C (consumer) would poll the queue, remove the oldest object, and persist it to the DB. This prevents the problem of A and B wanting to persist equal objects at the same time.
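A sketch of that arrangement in C#, using BlockingCollection from System.Collections.Concurrent and Task.Run from System.Threading.Tasks (MyRow, UpsertRow and someRow are placeholders for your data type, persistence logic and data):

    var queue = new BlockingCollection<MyRow>();

    // The single consumer owns all writes to the table, so the
    // check-then-insert-or-update can never race with itself.
    Task.Run(() =>
    {
        foreach (var row in queue.GetConsumingEnumerable())
            UpsertRow(row);
    });

    // Producers (threads A and B) simply enqueue:
    queue.Add(someRow);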
You speak of "rows" so presumably this is a SQL database?
If so, why not just use transactions?
(Unless the threads are sharing a database connection, in which case a mutex might help, but I would prefer to give each thread a separate connection.)
I would recommend avoiding locking in the client layer. Synchronized only works within one process; later you may scale so that your threads run across several JVMs, or indeed several machines.
I would enforce uniqueness in the DB. As you suggest, this will cause an exception for the second inserter; catch that exception and do an update, if that's the business logic you need.
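A sketch of that catch-and-update, assuming SQL Server, where a unique violation surfaces as a SqlException with error number 2627 (constraint) or 2601 (index); InsertRow and UpdateExistingRow are placeholders:

    try
    {
        InsertRow(data);
    }
    catch (SqlException ex)
    {
        if (ex.Number == 2627 || ex.Number == 2601)   // duplicate key
            UpdateExistingRow(data);
        else
            throw;
    }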
But consider this argument:
Sometimes either of the following sequences may occur:
A inserts values VA, then B updates them to VB.
B inserts VB, then A updates them to VA.
If the two threads are racing, either outcome, VA or VB, is equally valid. So you can't distinguish the second case from "A inserts VA and B just fails"!
So in fact there may be no need for the "fail and then update" case.
I think this is a job for SQL constraints, namely UNIQUE on the set of columns that hold the data, plus the appropriate error handling.
Most database frameworks (Hibernate in Java, ActiveRecord etc. in Ruby) have a form of optimistic locking. What this means is that you execute each operation on the assumption that it will work without conflict. In the case where there is a conflict, you check for it atomically at the point where you do the database operation, throw an exception or return an error code, and retry the operation in your client code after re-querying, etc.
This is usually implemented using a version number on each record. When a database operation is done, the row is read (including the version number), the client code updates the data, then saves it back to the database with a WHERE clause specifying the primary key ID AND the version number being the same as when it was read. If it is different, another process has updated the row, and the operation should be retried - usually by re-reading the record and applying the operation again to the new data from the other process.
In the case of adding, you would also want a unique index on the table, so the database refuses the duplicate insert, and you can handle that in the same code.
Pseudo code would look something like
    do {
        read row from database
        if no row {
            result_code = insert new row with data
        } else {
            result_code = update row with data (where version matches)
        }
    } while result_code == conflict_code
The benefit of this is that you don't need complicated synchronization/locking in your client code - each thread just executes in isolation and uses the database as the consistency check (something it is very quick at, and good at). Because you're not locking some shared resource for every operation, the code can run much faster.
It also means you can run multiple separate operating-system processes to split the load, and/or scale the operation over multiple servers, without any code changes to handle conflicts.
You need to wrap the calls that check for and write the row in a critical section or mutex.
With a critical section, only one thread at a time can execute the check-and-write, so two threads can't write at once.
With a mutex, the first thread locks the mutex, performs its operations, then unlocks it. The second thread attempts to do the same, but its lock attempt blocks until the first thread releases the mutex.
Specific implementations of critical-section or mutex functionality depend on your platform.
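In .NET, for example, the lock statement covers threads within one process, and a named Mutex (System.Threading) extends that to multiple processes on the same machine - though, as noted above, neither spans machines. A minimal sketch (the mutex name is arbitrary):

    using (var mutex = new Mutex(false, @"Global\AddRowMutex"))
    {
        mutex.WaitOne();            // blocks until no other holder remains
        try
        {
            // check for an existing row, then insert or update
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }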
You need to perform the act of checking for existing rows and then updating / adding rows inside a single transaction.
When you perform your check you should also acquire an update lock on those records, to indicate that you are going to write to the database based on the information that you have just read, and that no-one else should be allowed to change it.
In pseudo T-SQL (for Microsoft SQL Server):
    BEGIN TRANSACTION
    SELECT id FROM MyTable WITH (UPDLOCK) WHERE SomeColumn = @SomeValue
    -- Perform your update here
    COMMIT TRANSACTION
The update lock won't prevent people from reading those records, but it will prevent them from writing anything that might change the output of your SELECT.
Multithreading is always mind-bending ^^.
The main thing is to delimit the critical resources and critical operations.
Critical resource: your table.
Critical operation: not just the add, but the whole check-then-add procedure.
You need to lock access to your table from the beginning of the check until the end of the add.
If a thread attempts to do the same while another is checking/adding, it waits until that thread finishes its operation. As simple as that.

NHibernate session.flush() fails but makes changes

We have a SQL Server database table that consists of a user id, some numeric value (e.g. a balance), and a version column.
We have multiple threads updating this table's value column in parallel, each in its own transaction and session (we're using a session-per-thread model). Since we want every logical transaction to be applied, each thread does the following:
1. load the current row (mapped to a type).
2. make the change to the value, based on the old value (e.g. add 50).
3. session.Update(obj)
4. session.Flush() (since we're optimistic, we want to make sure we had the correct version value prior to the update)
5. if step 4 (the flush) threw a StaleStateException, refresh the object (with LockMode.Read) and go to step 1.
6. we only do this a certain number of times per logical transaction; if we can't apply it after X attempts, we reject the logical transaction.
Each such thread commits periodically, e.g. after 100 successful logical transactions, to keep commit-induced I/O at manageable levels - meaning we have a single database transaction (per batch) with multiple flushes, at least one per logical change.
What's the problem here, you ask? Well, on commit we see changes from failed logical transactions.
Specifically, if the value was 50 when we went through step 1 (for the first time), and we tried to update it to 100 but failed (because e.g. another thread changed it to 70), then the value of 50 is committed for this row. Obviously this is incorrect.
What are we missing here?
Well, I don't have a ton of experience here, but one thing I remember reading in the documentation is that if an exception occurs, you are supposed to immediately roll back the transaction and dispose of the session. Perhaps your issue is related to the session being in an inconsistent state?
Also, calling Update in your code is not necessary: since you loaded the object in that session, it is already being tracked by NHibernate.
If you want your changes applied regardless, why do you bother with row versioning? It sounds like you would get the same result by simply always updating the data and letting the last transaction win.
As to why the failed update becomes permanent, that depends on what the SQL statements for the version check/update look like, and on your transaction control, which you left out of the code example. If you turn on the NHibernate SQL logging, it will probably become obvious how this is happening.
I'm not an NHibernate guru, but the answer seems simple.
When NHibernate loads an object, it expects it not to change in the db for as long as it's in the NHibernate session cache.
As you mentioned, you have a multithreaded app.
This is what happens:
1. 1st thread loads an entity.
2. 2nd thread loads the same entity.
3. 1st thread changes the entity.
4. 2nd thread changes the entity and finds out that the loaded entity has been changed by something else; afraid that it would clobber the changes the 1st thread made, it throws an exception to make the programmer aware of that.
You are missing a locking mechanism. I can't tell much about how to apply that properly and elegantly; maybe a transaction would help.
We had similar problems when we used NHibernate and raw ADO.NET concurrently (luckily just for querying, at least in production code). All we had to do was force updating the db on insert/update, so we could actually query something through full-text search for some specific entities.
We also hit StaleStateException in integration tests that used raw ADO.NET to reset the db. The NHibernate session stayed alive through a bunch of tests, but every test tried to clean up the db without NHibernate being aware of it.
Here is the documentation for exceptions in the session:
http://nhibernate.info/doc/nhibernate-reference/best-practices.html
