Do nested Objectify transactions remain atomic, or do they work at all? (google-app-engine)

I have a quick question about Objectify. This may be in the documentation, but I haven't found anything, so I'm asking here to be safe.
I have a backend using Objectify that I rushed out. I have an event plan that is made up of activities. Currently, when I delete an event plan, all of the logic for deleting its individual activities is written inside the event plan's delete method.
What I'm wondering is: if I call the activity's delete method from the event plan's delete method (assuming that is allowed), is the whole operation atomic?
Sample (this is just pseudocode; case and method names may be wrong):
// inside event plan dao
public void delete(EventPlan eventPlan) {
    final Objectify ofy = Objectify.beginTransaction();
    try {
        final ActivityDAO activityDao = new ActivityDAO();
        for (final Activity activity : eventPlan.getActivities()) {
            activityDao.delete(activity);
        }
        ofy.getTxn().commit();
    } finally {
        if (ofy.getTxn().isActive()) {
            ofy.getTxn().rollback();
        }
    }
}
// inside activity dao
public void delete(Activity activity) {
    final Objectify ofy = Objectify.beginTransaction();
    try {
        // do some logic in here, delete activity and commit txn
    } finally {
        // check and rollback as normal
    }
}
Is this safe to do? The reason the code is so mangled right now is that I didn't realize the entity group issue: certain things in the activity were not in the same entity group as the activity itself. After fixing this I put all of the logic in the event plan delete, and the method is becoming unmanageable. Is it OK to break it down into smaller pieces, or will that break atomicity?
Thank you.

Nested transactions do not happen in a single atomic chunk. There is not really any such thing as a nested transaction - the transactions in your example are all in parallel, with different Objectify (DatastoreService) objects.
Your inner transactions will complete transactionally. Your outer transaction doesn't really do anything. Each inner delete is in its own transaction - it is still perfectly possible for the first Activity to be successfully deleted even though the second Activity is not deleted.
If your goal is to delete a group of entities all-or-nothing style, look into using task queues. You can delete the first Activity and enqueue a task to delete the second transactionally, so you are guaranteed that either the Activity is deleted AND the task is enqueued, or neither. Then, in the task, you can do the same with the second, and so on. Since tasks are retried if they fail, you can control the behavior to be sort of like a transaction. The one thing to beware of is that other requests may see the partially deleted series while the process is running.
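The chained-task idea can be sketched in plain Java. This is only an in-memory model under stated assumptions: `ChainedDelete`, `Deleter`, and `maxAttempts` are hypothetical names, the `Deque` stands in for a task queue, and nothing here calls the App Engine or Objectify APIs. In the real datastore, the delete and the enqueue of the next task would share one transaction.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

class ChainedDelete {
    /** Hypothetical hook so a transient failure can be simulated. */
    interface Deleter {
        void delete(Set<String> store, String key) throws Exception;
    }

    /** Returns true once every key in the chain has been deleted. */
    static boolean run(Set<String> store, List<String> keys, Deleter deleter, int maxAttempts) {
        Deque<Integer> queue = new ArrayDeque<>(); // stand-in for a task queue
        if (!keys.isEmpty()) queue.add(0);
        int attempts = 0;
        while (!queue.isEmpty() && attempts < maxAttempts) {
            int index = queue.peek(); // the task stays queued until it succeeds
            attempts++;
            try {
                deleter.delete(store, keys.get(index));
            } catch (Exception e) {
                continue; // a failed task is simply retried
            }
            // Assumption: in a real datastore this step is transactional with the delete.
            queue.remove();
            if (index + 1 < keys.size()) queue.add(index + 1);
        }
        return queue.isEmpty();
    }
}
```

While the chain is running, a reader of `store` can observe some keys gone and others still present, which is the partial-visibility caveat mentioned above.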

If he removes the inner transaction will the outer transaction still do nothing?

Related

Concurrency exception in Entity Framework when loading and deleting objects

I've got an EF class mapped to a SQL Server table. I have the following very simple Entity Framework code (using the ASP.NET Boilerplate repository wrapper over the EF DBContext):
var objectsToDelete = repository.GetAll().Where(r => r.parentId == param1).ToList();
foreach (var obj in objectsToDelete) {
    repository.Delete(obj); // doesn't actually update the database yet
}
...
repository.SaveChanges();
Note that the GetAll call will immediately read from the database.
Surprisingly to me, this is a real concurrency headache when multiple web service calls try to run this code simultaneously with the same param1 value (imagine that the user performs the exact same action twice in quick succession). To cut a long story short, a context switch can occur between getting the IDs and calling SaveChanges. So this is what happens (you can change this order around quite a bit without affecting the result):
1. Thread #1 starts a SQL transaction and starts running the above code.
2. Thread #2 starts a SQL transaction and starts running the above code (using a different DBContext).
3. Thread #1 gets the objects to delete.
4. Context switch.
5. Thread #2 gets the objects to delete - they will have the same IDs that #1 read.
6. Thread #2 marks the objects as deleted and calls SaveChanges, which issues the SQL DELETE statements.
7. Thread #2 commits its transaction and it's done.
8. Thread #1, using its (now stale) list, marks those same objects as deleted and calls SaveChanges, which issues the SQL DELETE statements.
9. Exception! Entity throws this:
Store update, insert, or delete statement affected an unexpected number of rows (0).
Now, even if I turn the transaction isolation level all the way up to Serializable, this doesn't help. "Serializable" specifies that statements cannot read data that has been modified but not yet committed by other transactions. But the problem occurred in step 5, and at that point nothing had been modified. Ah, but Serializable also specifies that no other transactions can modify data that has been read by the current transaction until the current transaction completes. Okay, so in the above sequence Thread #2 will block at step 6. But that just shifts the problem without addressing it. Whichever thread is the second one to attempt the delete will encounter the 'unexpected rowcount' exception.
I can't see any prevention - only treatment. It seems that I have to either swallow the exception or turn off Entity Framework's validation-on-save. And since I'm using ASP.NET Boilerplate, that doesn't seem to be very convenient. I’m puzzled that something so apparently simple and typical isn’t easier to control. What am I missing?
var objectsToDelete = repository.GetAll().Where(r => r.parentId == param1).ToList();
if (objectsToDelete.Count != 0)
{
    repository.RemoveRange(objectsToDelete);
}
...
repository.SaveChanges();
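One possible treatment for the race described in the question is to make the delete idempotent, so that "row already gone" counts as success rather than an error. The sketch below is an in-memory Java model, not Entity Framework code: `IdempotentDelete` and `deleteAll` are hypothetical names, and a `Map` stands in for the table.

```java
import java.util.List;
import java.util.Map;

class IdempotentDelete {
    /** Returns how many rows this caller actually removed; 0 is not an error. */
    static int deleteAll(Map<Integer, String> table, List<Integer> staleIds) {
        int removed = 0;
        for (Integer id : staleIds) {
            // remove() returns null when another caller already deleted the row
            if (table.remove(id) != null) removed++;
        }
        return removed;
    }
}
```

If two callers read the same list of IDs, the first removes the rows and the second removes nothing, and neither fails. With a real ORM the equivalent would be to catch or disable the affected-row-count check, which is the "treatment" the question mentions.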

Conditional associated record deletion in afterDelete()

I have the following setup:
Models:
Team
Task
Change
TasksTeam
TasksTeam is a hasManyThrough, that associates teams to tasks. Change is used to record changes in the details of tasks, including when teams are attached/detached (i.e. through records in TasksTeam).
TasksTeam also cascades deletes of Task. If a task is deleted, all related team associations should also be deleted.
When a TasksTeam is deleted, it means a team has left a task, and I'd like to record a Change for that. I'm using the TasksTeam afterDelete() to record teams leaving. In the TasksTeam beforeDelete I save the data to $this->predelete so it'll be available in the afterDelete().
Here is the non-working code in TasksTeam:
public function afterDelete(){
    $team_id = $this->predelete['TasksTeam']['team_id'];
    $task_role_id = $this->predelete['TasksTeam']['task_role_id'];
    $task_id = $this->predelete['TasksTeam']['task_id'];
    // Wanted: only record a change if the task isn't deleted
    if($this->Task->exists($task_id)){
        $this->Task->Change->removeTeamFromTask($task_id, $team_id, $task_role_id);
    }
    return true;
}
Problem:
When a task is deleted, the delete cascades to TasksTeam correctly. However, a change will be recorded even if the Task is deleted. From another answer to something similar on SO, I think the reason is that the callbacks are called before Model:del(), meaning the task hasn't yet been deleted when it hits TasksTeam afterDelete()
Question
How can I successfully save a Change only if the task isn't deleted?
Thanks in advance.
If the callbacks are getting called before the actual delete, I see maintaining an assoc. array of flags with task IDs as keys, or a set of task IDs, which are added when afterDelete is called on Task. Then you could create a method in Task, such as isDeleting or similar, which queries the array, to tell you if the task is in the process of being deleted.
Using the suggestion from @James Dunne, I ended up adding a tinyint field called is_deleted to the Task model and simply set it to true in the Task beforeDelete(). I then check this flag and only save a Change if it is false. It seems wasteful to add a field for something that is only touched just before the record is deleted, but for my purposes it works fine. I think a "real solution" would involve the Cake Events System, avoiding the need for chained callbacks.
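The "set of task IDs being deleted" suggestion can be sketched without a database column. This is a language-neutral illustration written in Java, not CakePHP code: `TaskDeletionTracker` and its method names are hypothetical, and the sketch assumes the flag is set before the cascade runs, as in the is_deleted approach.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TaskDeletionTracker {
    private final Set<Integer> deleting = new HashSet<>();
    final List<String> changes = new ArrayList<>(); // stand-in for Change records

    void beforeTaskDelete(int taskId) { deleting.add(taskId); }    // mark before the cascade runs
    void afterTaskDelete(int taskId)  { deleting.remove(taskId); } // clear once the task is gone

    boolean isDeleting(int taskId) { return deleting.contains(taskId); }

    /** Called when a team/task association row is removed. */
    void afterTasksTeamDelete(int taskId, int teamId) {
        // Record a Change only when the team left a task that still exists.
        if (!isDeleting(taskId)) {
            changes.add("team " + teamId + " left task " + taskId);
        }
    }
}
```

The set lives only for the duration of the request, so nothing is persisted for a record that is about to disappear.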

Nested transaction on google app engine datastore

I want all deletes to execute all-or-nothing.
1. If nothing is changed, will the group of deletes be atomic?
2. If I remove the outer transaction, will something change?
3. If I remove only the inner transaction, will the group be atomic?
4. What if I replace the for loop with a batch delete and leave only the outer transaction?
// inside event plan dao
public void delete(EventPlan eventPlan) {
    final Objectify ofy = Objectify.beginTransaction();
    try {
        final ActivityDAO activityDao = new ActivityDAO();
        for (final Activity activity : eventPlan.getActivities()) {
            activityDao.delete(activity);
        }
        ofy.getTxn().commit();
    } finally {
        if (ofy.getTxn().isActive()) {
            ofy.getTxn().rollback();
        }
    }
}
// inside activity dao
public void delete(Activity activity) {
    final Objectify ofy = Objectify.beginTransaction();
    try {
        // do some logic in here, delete activity and commit txn
    } finally {
        // check and rollback as normal
    }
}
If you use Objectify 3.1, then all transactions are XG transactions, which can operate on at most 5 different entity groups. That is, if your Activities do not have a common parent (which would put them in the same entity group), then you can delete at most five in one transaction.
1. No, you are using parallel transactions (one outer, multiple inner).
2. No, the outer transaction has no operations executed in it, so it does nothing. There are multiple inner transactions (one per loop iteration), each doing its own delete.
3. Yes, you must perform all operations within one transaction for them to be atomic. If you remove the inner transactions you are on the right path. However, the Entity Group transaction limit still applies: all entities touched within a transaction must belong to the same Entity Group, or (since XG is enabled by default) to at most five different Entity Groups (see above). Note that if you don't explicitly put an entity in an Entity Group (by setting a parent), then every entity gets its own Entity Group.
4. Yes, a batch delete is better than a delete in a loop (for efficiency), but all transaction rules in point 3 still apply.
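The "one transaction, no inner transactions" shape from point 3 can be sketched as follows. This is a minimal in-memory model, not the Objectify API: `FakeStore`, `Txn`, and `EventPlanDeleter` are hypothetical names, and the entity group limits described above are not modeled. The point it illustrates is that the helper receives the caller's transaction instead of beginning its own.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** In-memory stand-in for a datastore (assumption: not a real client library). */
class FakeStore {
    final Set<String> keys = new HashSet<>();

    /** Buffers deletes and applies them together on commit. */
    class Txn {
        private final List<String> pendingDeletes = new ArrayList<>();
        private boolean active = true;

        void delete(String key) {
            if (!active) throw new IllegalStateException("transaction closed");
            pendingDeletes.add(key);
        }

        void commit() {
            keys.removeAll(pendingDeletes); // all buffered deletes land together
            active = false;
        }

        void rollback() {
            pendingDeletes.clear(); // nothing was applied
            active = false;
        }

        boolean isActive() { return active; }
    }

    Txn begin() { return new Txn(); }
}

class EventPlanDeleter {
    /** Joins the caller's transaction instead of opening a new one. */
    static void deleteActivity(FakeStore.Txn txn, String activityKey) {
        if (activityKey == null) throw new IllegalArgumentException("null key");
        txn.delete(activityKey);
    }

    static void deletePlan(FakeStore store, List<String> activityKeys) {
        FakeStore.Txn txn = store.begin();
        try {
            for (String key : activityKeys) {
                deleteActivity(txn, key);
            }
            txn.commit();
        } finally {
            if (txn.isActive()) txn.rollback(); // a failure leaves every key in place
        }
    }
}
```

The per-activity logic still lives in its own small method, so the event plan delete stays manageable without giving up atomicity.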

Pessimistic offline locking using Entity framework

First I'd like to describe the mechanism of a locking solution I'd like to implement. Basically an item can be opened in read or write mode. However, if a user opens the item in write mode, no other user should be able to open it in edit mode. The item is a case in a customer service application.
In order to do this I came up with the following: the table will contain a flag which indicates whether an item is checked out for edit, and an 'end time' until which this flag is valid. The default duration is 3 minutes; if no user interaction happens during this time, the flag can be ignored the next time a user tries to open the same item.
On the UI side, I use jQuery to monitor whether a user is active. If he or she is, a periodic AJAX call extends the time frame so he or she can continue working on the item. When the user saves the item, the flag will be removed. The end time is necessary to handle situations when the browser crashes or when the user goes to drink a coffee and leaves the item open for an hour.
So, the question. :) If a user opens the item in edit mode, first I have to read the flag and time values for the item, and if I find the item available (flag is not set, or set but no longer valid because of the time), I have to update them with new values.
What kind of transaction level should I use for this in EF, if any? Or should I write stored procedures to handle the select & update in a transaction? If so, what kind of locking method should I use?
You are describing pessimistic locking, there is really no debate on that. There are detailed instructions on what you want to do in the excellent MVC/EF tutorial http://www.asp.net/mvc/tutorials/getting-started-with-ef-using-mvc/handling-concurrency-with-the-entity-framework-in-an-asp-net-mvc-application
There’s a chapter early on about pessimistic.
Optimistic locking is still OK in this case. You can use a timestamp / rowversion column and your flag together. The flag handles your application logic (only a single user can edit the record), and the timestamp avoids the race condition when setting the flag, because only a single thread will be able to read the record and write it back. If any other thread reads the record concurrently and saves it after the first thread, it will get a concurrency exception.
If you don't want to use a timestamp, a different transaction isolation level will not help you, because the isolation level doesn't force queries to lock records. You must manually write a SQL query that uses the UPDLOCK hint to lock the record when querying, and then execute the update. You can do this in a stored procedure.
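The flag-plus-rowversion idea can be modeled as a compare-and-set on a versioned row. The sketch below is an in-memory Java illustration, not Entity Framework or SQL Server code: `CaseLock`, `Row`, and `LEASE_MILLIS` are hypothetical names, and a failed `compareAndSet` plays the role of the concurrency exception a losing thread would receive.

```java
import java.util.concurrent.atomic.AtomicReference;

class CaseLock {
    static final long LEASE_MILLIS = 3 * 60 * 1000L; // the 3-minute default from the question

    /** Immutable snapshot of the row: lock owner, lock expiry, row version. */
    static final class Row {
        final String owner;
        final long expiresAt;
        final long version;

        Row(String owner, long expiresAt, long version) {
            this.owner = owner;
            this.expiresAt = expiresAt;
            this.version = version;
        }
    }

    private final AtomicReference<Row> row = new AtomicReference<>(new Row(null, 0L, 0L));

    /** Takes or extends the lock; returns false if someone else holds a valid lock. */
    boolean tryCheckout(String user, long now) {
        Row seen = row.get();
        boolean available = seen.owner == null
                || seen.expiresAt <= now
                || seen.owner.equals(user); // the owner's heartbeat extends the lease
        if (!available) return false;
        Row next = new Row(user, now + LEASE_MILLIS, seen.version + 1);
        // Succeeds only if nobody changed the row since it was read.
        return row.compareAndSet(seen, next);
    }

    /** Clears the flag on save; only the current owner may release. */
    boolean release(String user) {
        Row seen = row.get();
        if (!user.equals(seen.owner)) return false;
        return row.compareAndSet(seen, new Row(null, 0L, seen.version + 1));
    }
}
```

Two threads that read the same free row both build a `next` value, but only one `compareAndSet` wins, which is the race the timestamp column is meant to close.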
The answer below is not a good way to implement pessimistic concurrency. You should not implement this at the application level. The RDBMS have better tools for this.
If you are locking a row in the db, this is by definition pessimistic.
Since you are controlling the pessimistic concurrency at the application level, I don't think it matters which transaction scope EF uses. EF will automatically start a db-level transaction when you SaveChanges.
To prevent multiple threads from executing the lock / unlock from your app, you can lock the section of code that queries & updates like so:
public class MyClassThatManagesConcurrency
{
    // Static so that every instance (and thread) contends on the same lock object.
    private static readonly object _lock = new object();

    public void MyMethodThatManagesConcurrency()
    {
        lock (_lock)
        {
            // query for the data
            // determine if item should be unlocked
            // dbContext.SaveChanges();
        }
    }
}
With the above, no 2 threads will ever execute code inside the lock section at the same time. However, I am not sure why this is necessary. If all you are doing is reading the object and unlocking it when time has expired, and 2 threads enter the method at the same time, either way, the item will become unlocked.
On the other hand, if your db row for this object has a timestamp column (not a datetime column but a column for versioning rows), and 2 threads enter the method at the same time, the second will receive a concurrency exception. But unless you are versioning rows at the db level, I don't think you need to do any locking.
Reply to comment
Ok I get it now, you are right. But you are still locking at the application level, which means it should not matter which db transaction ef chooses. To prevent 2 users from unlocking the same object, use the C# lock block I posted above.

Diagnosing Deadlocks in SQL Server 2005

We're seeing some pernicious, but rare, deadlock conditions in the Stack Overflow SQL Server 2005 database.
I attached the profiler, set up a trace profile using this excellent article on troubleshooting deadlocks, and captured a bunch of examples. The weird thing is that the deadlocking write is always the same:
UPDATE [dbo].[Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
The other deadlocking statement varies, but it's usually some kind of trivial, simple read of the posts table. This one always gets killed in the deadlock. Here's an example
SELECT
    [t0].[Id], [t0].[PostTypeId], [t0].[Score], [t0].[Views], [t0].[AnswerCount],
    [t0].[AcceptedAnswerId], [t0].[IsLocked], [t0].[IsLockedEdit], [t0].[ParentId],
    [t0].[CurrentRevisionId], [t0].[FirstRevisionId], [t0].[LockedReason],
    [t0].[LastActivityDate], [t0].[LastActivityUserId]
FROM [dbo].[Posts] AS [t0]
WHERE [t0].[ParentId] = @p0
To be perfectly clear, we are not seeing write / write deadlocks, but read / write.
We have a mixture of LINQ and parameterized SQL queries at the moment. We have added with (nolock) to all the SQL queries. This may have helped some. We also had a single (very) poorly-written badge query that I fixed yesterday, which was taking upwards of 20 seconds to run every time, and was running every minute on top of that. I was hoping this was the source of some of the locking problems!
Unfortunately, I got another deadlock error about 2 hours ago. Same exact symptoms, same exact culprit write.
The truly strange thing is that the locking write SQL statement you see above is part of a very specific code path. It's only executed when a new answer is added to a question -- it updates the parent question with the new answer count and last date/user. This is, obviously, not that common relative to the massive number of reads we are doing! As far as I can tell, we're not doing huge numbers of writes anywhere in the app.
I realize that NOLOCK is sort of a giant hammer, but most of the queries we run here don't need to be that accurate. Will you care if your user profile is a few seconds out of date?
Using NOLOCK with Linq is a bit more difficult as Scott Hanselman discusses here.
We are flirting with the idea of using
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
on the base database context so that all our LINQ queries have this set. Without that, we'd have to wrap every LINQ call we make (well, the simple reading ones, which is the vast majority of them) in a 3-4 line transaction code block, which is ugly.
I guess I'm a little frustrated that trivial reads in SQL 2005 can deadlock on writes. I could see write/write deadlocks being a huge issue, but reads? We're not running a banking site here, we don't need perfect accuracy every time.
Ideas? Thoughts?
Are you instantiating a new LINQ to SQL DataContext object for every operation or are you perhaps sharing the same static context for all your calls?
Jeremy, we are sharing one static datacontext in the base Controller for the most part:
private DBContext _db;
/// <summary>
/// Gets the DataContext to be used by a Request's controllers.
/// </summary>
public DBContext DB
{
    get
    {
        if (_db == null)
        {
            _db = new DBContext() { SessionName = GetType().Name };
            //_db.ExecuteCommand("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED");
        }
        return _db;
    }
}
Do you recommend we create a new context for every Controller, or per Page, or .. more often?
According to MSDN:
http://msdn.microsoft.com/en-us/library/ms191242.aspx
When either the READ COMMITTED SNAPSHOT or ALLOW SNAPSHOT ISOLATION database options are ON, logical copies (versions) are maintained for all data modifications performed in the database. Every time a row is modified by a specific transaction, the instance of the Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a link list. The newest row value is always stored in the current database and chained to the versioned rows stored in tempdb.
For short-running transactions, a version of a modified row may get cached in the buffer pool without getting written into the disk files of the tempdb database. If the need for the versioned row is short-lived, it will simply get dropped from the buffer pool and may not necessarily incur I/O overhead.
There appears to be a slight performance penalty for the extra overhead, but it may be negligible. We should test to make sure.
Try setting this option and REMOVE all NOLOCKs from code queries unless it’s really necessary. NOLOCKs or using global methods in the database context handler to combat database transaction isolation levels are Band-Aids to the problem. NOLOCKS will mask fundamental issues with our data layer and possibly lead to selecting unreliable data, where automatic select / update row versioning appears to be the solution.
ALTER Database [StackOverflow.Beta] SET READ_COMMITTED_SNAPSHOT ON
NOLOCK and READ UNCOMMITTED are a slippery slope. You should never use them unless you understand why the deadlock is happening first. It would worry me that you say, "We have added with (nolock) to all the SQL queries". Needing to add WITH NOLOCK everywhere is a sure sign that you have problems in your data layer.
The update statement itself looks a bit problematic. Do you determine the count earlier in the transaction, or just pull it from an object? AnswerCount = AnswerCount+1 when a question is added is probably a better way to handle this. Then you don't need a transaction to get the correct count and you don't have to worry about the concurrency issue that you are potentially exposing yourself to.
One easy way to get around this type of deadlock issue without a lot of work and without enabling dirty reads is to use "Snapshot Isolation Mode" (new in SQL 2005) which will always give you a clean read of the last unmodified data. You can also catch and retry deadlocked statements fairly easily if you want to handle them gracefully.
The OP's question was why this problem occurred. This post hopes to answer that while leaving possible solutions to be worked out by others.
This is probably an index-related issue. For example, let's say the table Posts has a non-clustered index X which contains the ParentId and one (or more) of the fields being updated (AnswerCount, LastActivityDate, LastActivityUserId).
A deadlock would occur if the SELECT cmd does a shared-read lock on index X to search by the ParentId and then needs to do a shared-read lock on the clustered index to get the remaining columns while the UPDATE cmd does a write-exclusive lock on the clustered index and need to get a write-exclusive lock on index X to update it.
You now have a situation where A locked X and is trying to get Y whereas B locked Y and is trying to get X.
Of course, we'll need the OP to update his posting with more information regarding what indexes are in play to confirm if this is actually the cause.
I'm pretty uncomfortable about this question and the attendant answers. There's a lot of "try this magic dust! No that magic dust!"
I can't see anywhere that you've analyzed the locks that are taken, and determined what exact type of locks are deadlocked.
All you've indicated is that some locks occur -- not what is deadlocking.
In SQL 2005 you can get more info about what locks are being taken out by using:
DBCC TRACEON (1222, -1)
so that when the deadlock occurs you'll have better diagnostics.
Are you instantiating a new LINQ to SQL DataContext object for every operation or are you perhaps sharing the same static context for all your calls? I originally tried the latter approach, and from what I remember, it caused unwanted locking in the DB. I now create a new context for every atomic operation.
Before burning the house down to catch a fly with NOLOCK all over, you may want to take a look at that deadlock graph you should've captured with Profiler.
Remember that a deadlock requires (at least) 2 locks. Connection 1 has Lock A, wants Lock B - and vice-versa for Connection 2. This is an unsolvable situation, and someone has to give.
What you've shown so far is solved by simple locking, which Sql Server is happy to do all day long.
I suspect you (or LINQ) are starting a transaction with that UPDATE statement in it, and SELECTing some other piece of info before hand. But, you really need to backtrack through the deadlock graph to find the locks held by each thread, and then backtrack through Profiler to find the statements that caused those locks to be granted.
I expect that there's at least 4 statements to complete this puzzle (or a statement that takes multiple locks - perhaps there's a trigger on the Posts table?).
Will you care if your user profile is a few seconds out of date?
Nope - that's perfectly acceptable. Setting the base transaction isolation level is probably the best/cleanest way to go.
Typical read/write deadlock comes from index order access. Read (T1) locates the row on index A and then looks up projected column on index B (usually clustered). Write (T2) changes index B (the cluster) then has to update the index A. T1 has S-Lck on A, wants S-Lck on B, T2 has X-Lck on B, wants U-Lck on A. Deadlock, puff. T1 is killed.
This is prevalent in environments with heavy OLTP traffic and just a tad too many indexes :). Solution is to make either the read not have to jump from A to B (ie. included column in A, or remove column from projected list) or T2 not have to jump from B to A (don't update indexed column).
Unfortunately, linq is not your friend here...
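The index-order deadlock described above can be checked on paper with a tiny wait-for graph. The sketch below is a simplified Java model, not a SQL Server diagnostic: `WaitForGraph` and its methods are hypothetical names, and it assumes each resource has exactly one holder, which ignores the fact that shared locks can have several.

```java
import java.util.HashMap;
import java.util.Map;

class WaitForGraph {
    private final Map<String, String> holder = new HashMap<>();   // resource -> transaction holding it
    private final Map<String, String> waitsFor = new HashMap<>(); // transaction -> resource it wants

    void hold(String txn, String resource) { holder.put(resource, txn); }
    void want(String txn, String resource) { waitsFor.put(txn, resource); }

    /** Follows "wants a resource held by" links; true if they lead back to start. */
    boolean deadlocked(String start) {
        String txn = start;
        for (int hops = 0; hops <= waitsFor.size(); hops++) {
            String wanted = waitsFor.get(txn);
            if (wanted == null) return false; // this transaction is not waiting
            String owner = holder.get(wanted);
            if (owner == null) return false;  // the resource is free
            if (owner.equals(start)) return true;
            txn = owner;
        }
        return false;
    }
}
```

With T1 holding index A and wanting the clustered index, and T2 holding the clustered index and wanting index A, `deadlocked("T1")` is true. If the read is covered by index A (an included column), T1 never wants the clustered index and the cycle disappears.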
@Jeff - I am definitely not an expert on this, but I have had good results with instantiating a new context on almost every call. I think it's similar to creating a new Connection object on every call with ADO. The overhead isn't as bad as you would think, since connection pooling will still be used anyway.
I just use a global static helper like this:
public static class AppData
{
    /// <summary>
    /// Gets a new database context
    /// </summary>
    public static CoreDataContext DB
    {
        get
        {
            var dataContext = new CoreDataContext
            {
                DeferredLoadingEnabled = true
            };
            return dataContext;
        }
    }
}
and then I do something like this:
var db = AppData.DB;
var results = from p in db.Posts where p.ID == id select p;
And I would do the same thing for updates. Anyway, I don't have nearly as much traffic as you, but I was definitely getting some locking when I used a shared DataContext early on with just a handful of users. No guarantees, but it might be worth giving a try.
Update: Then again, looking at your code, you are only sharing the data context for the lifetime of that particular controller instance, which basically seems fine unless it is somehow getting used concurrently by multiple calls within the controller. In a thread on the topic, ScottGu said:
Controllers only live for a single request - so at the end of processing a request they are garbage collected (which means the DataContext is collected)...
So anyway, that might not be it, but again it's probably worth a try, perhaps in conjunction with some load testing.
Q. Why are you storing the AnswerCount in the Posts table in the first place?
An alternative approach is to eliminate the "write back" to the Posts table by not storing the AnswerCount in the table but to dynamically calculate the number of answers to the post as required.
Yes, this will mean you're running an additional query:
SELECT COUNT(*) FROM Answers WHERE post_id = @id
or more typically (if you're displaying this for the home page):
SELECT p.post_id,
p.<additional post fields>,
a.AnswerCount
FROM Posts p
INNER JOIN AnswersCount_view a
ON <join criteria>
WHERE <home page criteria>
but this typically results in an INDEX SCAN and may be more efficient in the use of resources than using READ ISOLATION.
There's more than one way to skin a cat. Premature de-normalisation of a database schema can introduce scalability issues.
You definitely want READ_COMMITTED_SNAPSHOT set to on, which it is not by default. That gives you MVCC semantics. It's the same thing Oracle uses by default. Having an MVCC database is so incredibly useful, NOT using one is insane. This allows you to run the following inside a transaction:
Update USERS Set FirstName = 'foobar';
//decide to sleep for a year.
meanwhile without committing the above, everyone can continue to select from that table just fine. If you are not familiar with MVCC, you will be shocked that you were ever able to live without it. Seriously.
Setting your default to read uncommitted is not a good idea. You will undoubtedly introduce inconsistencies and end up with a problem that is worse than what you have now. Snapshot isolation might work well, but it is a drastic change to the way Sql Server works and puts a huge load on tempdb.
Here is what you should do: use try-catch (in T-SQL) to detect the deadlock condition. When it happens, just re-run the query. This is standard database programming practice.
There are good examples of this technique in Paul Nielson's Sql Server 2005 Bible.
Here is a quick template that I use:
-- Deadlock retry template
declare @lastError int;
declare @numErrors int;
set @numErrors = 0;
LockTimeoutRetry:
begin try;
    -- The query goes here
    return; -- this is the normal end of the procedure
end try begin catch
    set @lastError = @@error
    if @lastError = 1222 or @lastError = 1205 -- Lock timeout or deadlock
    begin;
        if @numErrors >= 3 -- We hit the retry limit
        begin;
            raiserror('Could not get a lock after 3 attempts', 16, 1);
            return -100;
        end;
        -- Wait and then try the transaction again
        waitfor delay '00:00:00.25';
        set @numErrors = @numErrors + 1;
        goto LockTimeoutRetry;
    end;
    -- Some other error occurred
    declare @errorMessage nvarchar(4000), @errorSeverity int
    select @errorMessage = error_message(),
           @errorSeverity = error_severity()
    raiserror(@errorMessage, @errorSeverity, 1)
    return -100
end catch;
One thing that has worked for me in the past is making sure all my queries and updates access resources (tables) in the same order.
That is, if one query updates in order Table1, Table2 and a different query updates it in order of Table2, Table1 then you might see deadlocks.
Not sure if it's possible for you to change the order of updates since you're using LINQ. But it's something to look at.
Will you care if your user profile is a few seconds out of date?
A few seconds would definitely be acceptable. It doesn't seem like it would be that long, anyways, unless a huge number of people are submitting answers at the same time.
I agree with Jeremy on this one. You ask if you should create a new data context for each controller or per page - I tend to create a new one for every independent query.
I'm building a solution at present which used to implement the static context like you do, and when I threw tons of requests at the beast of a server (million+) during stress tests, I was also getting read/write locks randomly.
As soon as I changed my strategy to use a different data context at LINQ level per query, and trusted that SQL server could work its connection pooling magic, the locks seemed to disappear.
Of course I was under some time pressure, so trying a number of things all around the same time, so I can't be 100% sure that is what fixed it, but I have a high level of confidence - let's put it that way.
You should implement dirty reads.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
If you don't absolutely require perfect transactional integrity with your queries, you should be using dirty reads when accessing tables with high concurrency. I assume your Posts table would be one of those.
This may give you so-called "dirty reads", which is when your query acts upon data from a transaction that hasn't been committed.
We're not running a banking site here, we don't need perfect accuracy every time
Use dirty reads. You're right in that they won't give you perfect accuracy, but they should clear up your dead locking issues.
Without that, we'd have to wrap every LINQ call we make (well, the simple reading ones, which is the vast majority of them) in a 3-4 line transaction code block, which is ugly
If you implement dirty reads on "the base database context", you can always wrap your individual calls using a higher isolation level if you need the transactional integrity.
So what's the problem with implementing a retry mechanism? There will always be the possibility of a deadlock occurring, so why not have some logic to identify it and just try again?
Won't at least some of the other options introduce performance penalties that are taken all the time when a retry system will kick in rarely?
Also, don't forget some sort of logging when a retry happens so that you don't get into that situation of rare becoming often.
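A retry wrapper with logging, along the lines suggested here, might look like the following in application code. This is a sketch under stated assumptions: `DeadlockRetry`, `DeadlockException`, and `withRetry` are hypothetical names, and a real data-access library reports a deadlock victim in its own way (for SQL Server, error number 1205), so the exception type would need to be mapped from that.

```java
import java.util.concurrent.Callable;

class DeadlockRetry {
    /** Hypothetical marker for "this statement was chosen as the deadlock victim". */
    static class DeadlockException extends RuntimeException {
        DeadlockException(String message) { super(message); }
    }

    /** Runs work, retrying up to maxRetries times when a deadlock is reported. */
    static <T> T withRetry(Callable<T> work, int maxRetries, long delayMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return work.call();
            } catch (DeadlockException e) {
                if (attempt >= maxRetries) throw e; // retry limit reached, give up
                // Log every retry so that "rare" turning into "often" gets noticed.
                System.err.println("deadlock, retry " + (attempt + 1) + " of " + maxRetries);
                Thread.sleep(delayMillis);
            }
        }
    }
}
```

Only the deadlock case is retried; any other exception propagates immediately, so unrelated failures are not hidden.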
Now that I see Jeremy's answer, I think I remember hearing that the best practice is to use a new DataContext for each data operation. Rob Conery's written several posts about DataContext, and he always news them up rather than using a singleton.
http://blog.wekeroad.com/2007/08/17/linqtosql-ranch-dressing-for-your-database-pizza/
http://blog.wekeroad.com/mvc-storefront/mvcstore-part-9/ (see comments)
Here's the pattern we used for Video.Show (link to source view in CodePlex):
using System.Configuration;
namespace VideoShow.Data
{
    public class DataContextFactory
    {
        public static VideoShowDataContext DataContext()
        {
            return new VideoShowDataContext(ConfigurationManager.ConnectionStrings["VideoShowConnectionString"].ConnectionString);
        }
        public static VideoShowDataContext DataContext(string connectionString)
        {
            return new VideoShowDataContext(connectionString);
        }
    }
}
Then at the service level (or even more granular, for updates):
private VideoShowDataContext dataContext = DataContextFactory.DataContext();
public VideoSearchResult GetVideos(int pageSize, int pageNumber, string sortType)
{
    // Query through the instance field declared above.
    var videos =
        from video in dataContext.Videos
        where video.StatusId == (int)VideoServices.VideoStatus.Complete
        orderby video.DatePublished descending
        select video;
    return GetSearchResult(videos, pageSize, pageNumber);
}
I would have to agree with Greg so long as setting the isolation level to read uncommitted doesn't have any ill effects on other queries.
I'd be interested to know, Jeff, how setting it at the database level would affect a query such as the following:
Begin Tran
Insert into Table (Columns) Values (Values)
Select Max(ID) From Table
Commit Tran
It's fine with me if my profile is even several minutes out of date.
Are you retrying the read after it fails? It's certainly possible when firing a ton of random reads that a few will hit when they can't read. Most of the applications that I work with have very few writes compared to the number of reads, and I'm sure the reads are nowhere near the number you are getting.
If implementing "READ UNCOMMITTED" doesn't solve your problem, then it's tough to help without knowing a lot more about the processing. There may be some other tuning option that would help this behavior. Unless some MSSQL guru comes to the rescue, I recommend submitting the problem to the vendor.
I would continue to tune everything; how is the disk subsystem performing? What is the average disk queue length? If I/Os are backing up, the real problem might not be these two queries that are deadlocking; it might be another query that is bottlenecking the system. You mentioned a query taking 20 seconds that has been tuned; are there others?
Focus on shortening the long-running queries, I'll bet the deadlock problems will disappear.
Had the same problem, and could not use "IsolationLevel = IsolationLevel.ReadUncommitted" on TransactionScope because the server doesn't have DTC enabled (!).
This is what I did, with an extension method:
public static void SetNoLock(this MyDataContext myDS)
{
    myDS.ExecuteCommand("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED");
}
So, for selects that use tables with critical concurrency, we enable the "nolock" like this:
using (MyDataContext myDS = new MyDataContext())
{
    myDS.SetNoLock();
    // var query = from ...my dirty queries here...
}
Suggestions are welcome!
