During the course of our application login there are several queries ran, all around validating the login. In evaluating them I noticed that one of the queries is run without the NOLOCK hint.
There does not seem to be any particular danger of dirty read because the data would hardly ever change.
Thinking about it from an attempted DOS type attack by somebody attempting failed logins over and over again I am suggesting that the lack of NOLOCK lowers our threshold for failure.
I believe it is an extremely unlikely result of a DOS attack (I think the web server would go first) but adding NOLOCK should make it go from unlikely to impossible.
So, am I being excessive or trivial?
Having NOLOCKs or not is the least of your worries with a DoS attempt against your server.
I wouldn't sweat it.
If, as you say, the data rarely changes, having the NOLOCKs there probably don't hurt.
Yes, you are being excessively trival.
If you are exposed to DOS attacks, NOLOCK on the SQL authorization call is the least of your worries. Implement some DOS detection, failure tracking+throttle, even some planned pauses that wouldn't effect the user but would slow down an attack...
The NOLOCK could potentially improve your performance if you call that query frequently especially in broader transactions.
Consider the nature of the dirty read - would the time window where that can occur be truely critical? e.g. where you add or remove someone from an authorized role.
In the add scenario a dirty read would fail on that attempt. (access denied)
In the remove scenario a dirty read would work on that attempt. (access granted)
If the data changes through manual operation e.g. human interaction - their margin for "latency" is typically much higher/indeterminate than your databases!
Also a very rare case, but still: Just at the moment somebody deactivates the user to prevent them from logging in, NOLOCK lets them in. Could be a rogue user/hacker/employee who needs to be locked out immediately?
You would have to be concerned about this particular scenario to forgo the performance advantage of NOLOCK.
You would protect your table far better by looking at the permissions to it. A table like this should not allow any direct access, all access should be from stored procs and the permissions set on them instead.
Related
I recently came up with a case that makes me wonder if I'm a newbie or something trivial has escaped to me.
Suppose I have a software to be run by many users, that uses a table. When the user makes login in the app a series of information from the table appears and he has just to add and work or correct some information to save it. Now, if the software he uses is run by many people, how can I guarantee is he is the only one working with that particular record? I mean how can I know the record is not selected and being worked by 2 or more users at the same time? And please I wouldn't like the answer use “SELECT FOR UPDATE... “
because for what I've read it has too negative impact on the database. Thanks to all of you. Keep up the good work.
This is something that is not solved primarily by the database. The database manages isolation and locking of "concurrent transactions". But when the records are sent to the client, you usually (and hopefully) closed the transaction and start a new one when it comes back.
So you have to care yourself.
There are different approaches, the ones that come into my mind are:
optimistic locking strategies (first wins)
pessimistic locking strategies
last wins
Optimistic locking: you check whether a record had been changed in the meanwhile when storing. Usually it does this by having a version counter or timestamp. Some ORMs and frameworks may help a little to implement this.
Pessimistic locking: build a mechanism that stores the information that someone started to edit something and do not allow someone else to edit the same. Especially in web projects it needs a timeout when the lock is released anyway.
Last wins: the second person storing the record just overwrites the first changes.
... makes me wonder if I'm a newbie ...
That's what happens always when we discover that very common stuff is still not solved by the tools and frameworks we use and we have to solve it over and over again.
Now, if the software he uses is runed by many people how can I guarantee is he
is the only one working with that particular record.
Ah...
And please I wouldn't like the answer use “SELECT FOR UPDATE... “ because for
what I've read it has too negative impact on the database.
Who cares? I mean, it is the only way (keep a lock on a row) to guarantee you are the only one who can change it. Yes, this limits throughput, but then this is WHAT YOU WANT.
It is called programming - choosing the right tools for the job. IN this case impact is required because of the requirements.
The alternative - not a guarantee on the database but an application server - is an in memory or in database locking mechanism (like a table indicating what objects belong to what user).
But if you need to guarantee one record is only used by one person on db level, then you MUST keep a lock around and deal with the impact.
But seriously, most programs avoid this. They deal with it either with optimistic locking (second user submitting changes gets error) or other programmer level decisions BECAUSE the cost of such guarantees are ridiculously high.
Oracle is different from SQL server.
In Oracle, when you update a record or data set the old information is still available because your update is still on hold on the database buffer cache until commit.
Therefore who is reading the same record will be able to see the old result.
If the access to this record though is a write access, it will be a lock until commit, then you'll have access to write the same record.
Whenever the lock can't be resolved, a deadlock will pop up.
SQL server though doesn't have the ability to read a record that has been locked to write changes, therefore depending which query you're running, you might lock an entire table
First you need to separate queries and insert/updates using a data-warehouse database. Which means you could solve slow performance in update that causes locks.
The next step is to identify what is causing locks and work out each case separately.
rebuilding indexes during working hours could cause very nasty locks. Push them to after hours.
I've worked in a very large organization where they had to use NOLOCK on most queries - because data was commonly updated via ETL processes during the day and locking the application for 40% of the working day was certainly not an option.
Out of habit, in my next place I went on to automatically use NOLOCK everywhere. But since reading warnings and risks, I've been gradually undoing this, to have no table hints specified and let SQL Server do it's thing.
However, I'm still not comfortable I'm doing the right thing. In the place where we used NOLOCK, I never once saw data get doubled up, or corrupt data.. I was there for many years. Ever since removing NOLOCK I am running into the obvious obstacle of rowlocks slowing / pausing queries which gives the illusion that my DB is slow or flakey. When in actuality it's just somebody running a lengthy save somewhere (the ability for them to do so is a requirement of the application).
I would be interested to hear from anyone who has actually experienced data corruption or data duplication from NOLOCK in practise, rather than people who are going by what they've read about it on the internet. I would be especially appreciative if anybody can provide replication steps to see this happen. I am trying to gauge just how risky is it, and do the risks outweigh the obvious benefit of being able to run reports parallel to updates?
I've seen corruptions, duplication and bad results with NOLOCK. Not once, not seldom. Every deployment that relied on NOLOCK and I had a chance to look at had correctness issues. True that many (most?) were not aware and did not realize, but the problems were there, always.
You have to realize that NOLOCK problems do no manifest as hard corruptions (the kind DBCC CHECKDB would report), but 'soft' corruption. And the problems are obvious only on certain kind of workloads, mostly on analytic types (aggregates). They would manifest as an incorrect value in a report, a balance mismatch in a ledger, a wrong department headcount and similar. These problems are visible as problems only when carefully inspected by a qualified human. And they may well vanish mysteriously on a simple Refresh of the page. So you may well have all these problems and not be aware of them. Your users might accept that 'sometimes the balance is wrong, just ask for the report again and will be OK' and never report you the issue.
And there are some workloads that are not very sensitive to NOLOCK issues. If you display 'posts' and 'comments' you won't see much of NOLOCK issues. Maybe the 'unanswered count' is off by 2, but who will notice?
Ever since removing NOLOCK I am running into the obvious obstacle of rowlocks slowing / pausing queries
I would recommend evaluating SNPASHOT isolation models (including READ_COMMITTED_SNAPSHOT). You may get a free lunch.
I see you've read a lot about it, but allow me to point you to a very good explanation on the dangers of using NOLOCK (that's it READ UNCOMMITTED isolation level): SQL Server NOLOCK Hint & other poor ideas.
Apart from this, I'll make some citations and comments. The worst part of NOLOCK is this:
It creates “incredibly hard to reproduce” bugs.
The problem is that when you read uncommited data, most of the time is commited, so everything is alright. But it will randomly fail if the transaction is not comitted. And that doesn't usually happen. Right? Nope: first, a single error is a very bad thing (your customer don't like it). And second, things can get much worse, LO:
The issue is that transactions do more than just update the row. Often they require an index to be updated OR they run out of space on the data page. This may require new pages to be allocated & existing rows on that page to be moved, called a PageSplit. It is possible for your select to completely miss a number of rows &/or count other rows twice. More info on this in the linked article
So, that means that even if the uncommited transaction you've read is committed, you can still read bad data. And, this will happen at random times. That's ugly, very ugly!
What about corruption?
As Remus Rusanu said, it's not "hard" but "soft" corruption. And it affects specially aggregates, because you're reading what you shouldn't when updating them. This can lead for example to a wrong account balance.
Haven't you heard of big LOB apps that have procedures to rebuild account balances? Why? They should be correctly updated inside transactions! (That can be acceptable if the balances are rebuilt at critical moments, for example while calcultaing taxes).
What can I do without corrupting data (and thus is relatively safe)?
Let's say it's "quite safe" to read uncommited data when you're not using it to update other existing data on the DB. I.e. if you use NOLOCK only for reporting purposes (without write-back) you're on the "quite safe" side. The only "tiny trouble" is that the report can show the wrong data, but, at least, the data in the DB will keep consistent.
To consider this safe depends on the prupose of what you're reading. If it's something informational, which is not going to be used to make decissions, that's quite safe (for example it's not very bad to have some errors on a report of the best customers, or the most sold products). But if you're getting this information to make decissions, things can be much worse (you can make a decission on a wrong basis!)
A particular experience
I worked on the development of a 'crowded' application, with some 1,500 users which used NOLOCK for reading data, modifying it an updating it on the DB (a HHRR/TEA company). And (apparently) there were no problems. The trick was that each employee read "atomic data" (an employee's data) to modify it, and it was nearly impossible that two people read and modified the same data at the same time. Besides this "atomic data" didn't influence any aggregate data. So everything was fine. But from time to time there were problems on the reporting area, which read "aggregated data" with NOLOCK. So, the critical reports had to be scheduled for moments where noone was working in the DB. The small deviations on non-critical reports was overlooked and admittable.
Now you know it. You have no excuses. You decide, to NOLOCK or not to NOLOCK
After performing an insert/update/delete, is it necessary to query the database to check if the action was performed correctly?
Edit:
I accepted an answer and would like to use it to convince management.
However, the management insists that there is a possibility that an insert/update/delete request could be corrupted in transmission (but wouldn't the network checksum fail?), and that I'm supposed to check if each transaction was performed correctly. Perhaps they're hinging on the fact that the checksum of a damaged packet can collide with the original packet's checksum. I think they're stretching it too far, and in most likelihood wouldn't do it for my own projects. Nonetheless, I am just a junior programmer and have no say.
Shouldn't be. Commercial database inserts/updates/deletes (and all db transactions) follow the ACID principle.
Wiki Quote:
In computer science, ACID (atomicity,
consistency, isolation, durability) is
a set of properties that guarantee
database transactions are processed
reliably.
If you have the feeling that you need to double check the success of your transactions then the problem most likely lies elsewhere in your architecture.
This isn't necessary - if the query completes successfully then the modification has been performed - if the query fails for whatever reason then the entire action will be rolled back for the query that failed if multiple queries are executed in a batch.
Depending on the isolation level that is being used, it's wholly possible that your modification is either superceded by modifications made by another query running 'at the same time' - whether this is important is down to what you're expecting to happen in this circumstance.
You shouldn't.
You can use SQL (or your programming platforms) built in error handling mechanism to see if there was errors so you can notify user that something bad happened, but otherwise all DB transactions follow ACID (as mentioned by Paul) which means that if something in batch fails, all batch is rolled back.
In an environment with a SQL Server failover cluster or mirror, how do you prefer to handle errors? It seems like there are two options:
Fail the entire current client request, and let the user retry
Catch the error in your DAL, and retry there
Each approach has its pros and cons. Most shops I've worked with do #1, but many of them also don't follow strict transactional boundaries, and seem to me to be leaving themselves open for trouble in the event of failure. Even so, I'm having trouble talking them into #2, which should also result in a better user experience (one catch is the potentially long delay while the failover happens).
Any arguments one way or the other would be appreciated. If you use the second approach, do you have a standard wrapper that helps simplify implementation? Either way, how do you structure your code to avoid issues such as those related to the lack of idempotency in the command that failed?
Number 2 could be an infinite loop. What if it's network related, or the local PC needs rebooted, or whatever?
Number 1 is annoying to users, of course.
If you only allow access via a web site, then you'll never see the error anyway unless the failover happens mid-call. For us, this is unlikely and we have failed over without end users realising.
In real life you may not have nice clean DAL on a web server. You may have an Excel sheet connecting (most financials) or WinForms where the connection is kept open, so you only have the one option.
Fail over should only take a few seconds anyway. If the DB recovery takes more than that, you have bigger issues anyway. And if it happens often enough to have to think about handling it, well...
In summary, it will happen that rarely that you want to know and number 1 would be better. IMHO.
In a financial system I am currently maintaining, we are relying on the rollback mechanism of our database to simulate the results of running some large batch jobs - rolling back the transaction or committing it at the end, depending on whether we were doing a test run.
I really cannot decide what my opinion is. In a way I think this is great, because then there is no difference between the simulation and a live run - on the other hand it just feels kind of icky, like e.g. depending on code throwing exceptions to carry out your business logic.
What is your opinion on relying on database transactions to carry out your business logic?
EXAMPLE
Consider an administration system having 1000 mortgage deeds in a database. Now the user wants to run a batch job that creates the next term invoice on each deed, using a couple of advanced search criteria that decides which deeds are to be invoiced.
Before she actually does this, she does a test run (implemented by doing the actualy run but ending in a transaction rollback), creating a report on which deeds will be invoiced. If it looks satisfactory, she can choose to do the actual run, which will end in a transaction commit.
Each invoice will be stamped with a batch number, allowing us to revert the changes later if it is needed, so it's not "dangereous" to do the batch run. The users just feel that it's better UX to be able to simulate the results first.
CLARIFICATION
It is NOT about testing. We do have test and staging environments for that. It's about a regular user using our system wanting to simulate the results of a large operation, that may seem "uncontrollable" or "irreversible" even though it isn't.
CONCLUSION
Doesn't seem like anyone has any real good arguments against our solution. As always, context means everything, so in the context of complex functional requirements exceeding performance requirements, using db rollback to implement batch job simulations seems a viable solution.
As there is no real answer to this question, I am not choosing an answer - instead I upvoted those who actually put forth an argument.
I think it's an acceptable approach, as long as it doesn't interfere with regular processing.
The alternative would be to build a query that displays the consequences for review, but we all have had the experience of taking such an approach and not quite getting it right; or finding that the context changed between query and execution.
At the scale of 1000 rows, it's unlikely the system load is burdensome.
Before she actually does this, she does a test run (implemented by doing the actualy run but ending in a transaction rollback), creating a report on which deeds will be invoiced. If it looks satisfactory, she can choose to do the actual run, which will end in a transaction commit.
That's wrong, prone to failure, and must be hell on your database logs. Unless you wrap your simulation and the actual run in a single transaction (which, judging by the timeline necessary to inspect 1000 deeds, would lead to a lot of blocked users) then there's no guaranteed consistency between test run and real run. If somebody changed data, added rows, etc. then you could end up with a different result - defeating the entire purpose of the test run.
A better pattern to do this would be for the test run to tag the records, and the real run to pick up the tagged records and process them. Or, if you have a thick client app, you can pull down the records to the client, show the report, and - if approved - push them back up.
We can see what the user needs to do, quite a reasonable thing. I mean how often do we get a regexp right first time? Refining a query till it does exactly what you want is not unusual.
The business consequences of not catching errors may be quite high, so doing a trial run makes sense.
Given a blank sheet of paper I'm sure we can devise an clean implementation expressed in formal behaviours of the system rather than this somewhat back-door appraoch.
How much effort would I put into fixing that now? Depends on whether the current approach is actually hurting. We can imagine that in a heaviliy used system it could lead to contention in the database.
What I wrote about the PRO FORMA environment in that bank I worked in was also entirely a user thing.
I'm not sure exactly what you're asking here. Taking you literally
What is your opinion on relying on
database transactions to carry out
your business logic?
Well, that's why we have transactions. We do rely on them. We hit an error and abort a transaction and rely on work done in that transaction scope to be rolled-back. So exploiting the transactional beahviours of our systems is a jolly good thing, and we'd need to hand-roll the same thing ourselves if we didn't.
But I think your question is about testing in a live system and relying on not commiting in order to do no damage. In an ideal world we have a live system and a test system and we don't mess with live systems. Such ideals are rarely seen. Far more common is "patch the live system. testing? what do you mean testing?" So in fact you're ahead of the game compared with some.
An alternative is to have dummy data in the live system, so that some actions can actually run all the way through. Again, error prone.
A surprisingly high proportion of systems outage are due to finger trouble, it's the humans who foul up.
It works - as you say. I'd worry about the concurrency of the system since the transaction will hold locks, possibly many locks. It means that your tests will hold up any live action on the system (and any live action operations will hold up your tests). It is generally better to use a test system for testing. I don't like it much, but if the risks from running a test but forgetting to roll it back are not a problem, and neither is the interference, then it is an effective way of attaining a 'what if' type calculation. But still not very clean.
When I was working in a company that was part of the "financial system", there was a project team that had decided to use the production environment to test their conversion procedure (and just rollback instead of commit at the end).
They almost got shot for it. With afterthought, it's a pity they weren't.
Your test environments that you claim you have are for the IT people's use. Get a similar "PRO-FORMA" environment of which you can guarantee your users that it is for THEIR use exclusively.
When I worked in that bank, creating such a PRO-FORMA environment was standard procedure at every year closure.
"But it's not about testing, it's about a user wanting to simulate the results of a large job."
Paraphrased : "it's not about testing, it's about simulation".
Sometimes I wish Fabian Pascal was still in business.
(Oh, and in case anyone doesn't understand : both are about "not perpetuating the results".)