After performing an insert/update/delete, is it necessary to query the database to check if the action was performed correctly?
Edit:
I accepted an answer and would like to use it to convince management.
However, the management insists that there is a possibility that an insert/update/delete request could be corrupted in transmission (but wouldn't the network checksum fail?), and that I'm supposed to check if each transaction was performed correctly. Perhaps they're hinging on the fact that the checksum of a damaged packet can collide with the original packet's checksum. I think they're stretching it too far, and in most likelihood wouldn't do it for my own projects. Nonetheless, I am just a junior programmer and have no say.
Shouldn't be. Commercial database inserts/updates/deletes (and all db transactions) follow the ACID principle.
Wiki Quote:
In computer science, ACID (atomicity,
consistency, isolation, durability) is
a set of properties that guarantee
database transactions are processed
reliably.
If you have the feeling that you need to double check the success of your transactions then the problem most likely lies elsewhere in your architecture.
This isn't necessary - if the query completes successfully then the modification has been performed - if the query fails for whatever reason then the entire action will be rolled back for the query that failed if multiple queries are executed in a batch.
Depending on the isolation level that is being used, it's wholly possible that your modification is either superceded by modifications made by another query running 'at the same time' - whether this is important is down to what you're expecting to happen in this circumstance.
You shouldn't.
You can use SQL (or your programming platforms) built in error handling mechanism to see if there was errors so you can notify user that something bad happened, but otherwise all DB transactions follow ACID (as mentioned by Paul) which means that if something in batch fails, all batch is rolled back.
Related
I understand, in a fuzzy sort of way, how regular ACID transactions work. You perform some work on a database in such a way that the work is not confirmed until some kind of commit flag is set. The commit part is based on some underlying assumption (like a single disk block write is atomic). In the event of a catastrophic error, you can just clear out the uncommitted data in the recovery phase.
How do distributed transactions work? In some of the MS documentation I have read that you can somehow perform a transaction across databases and filesystems (among other things).
This technology could be (and probably is) used for installers, where you want the program to be fully installed or fully absent. You simply begin a transaction at the start of the installer. Next you could connect to the registry and filesystem, making the changes that define the installation. When the job is done, simply commit, or rollback if the installation fails for some reason. The registry and filesystem are automatically cleaned for you by this magical distributed transaction coordinator.
How is it possible that two disparate systems can be transacted upon in this fashion? It seems to me that it is always possible to leave the system in an inconsistent state, where the filesystem has committed its changes and the registry has not. I think in MSDTC it is even possible to perform a transaction across the network.
I have read http://blogs.msdn.com/florinlazar/archive/2004/03/04/84199.aspx, but it feels like only the beginning of the explanation, and that step 4 should be expanded considerably.
Edit: From what I gather on http://en.wikipedia.org/wiki/Distributed_transaction, it can be accomplished by a two-phase commit (http://en.wikipedia.org/wiki/Two-phase_commit). After reading this, I'm still not understanding the method 100%, it seems like there is a lot of room for error between the steps.
About "step 4":
The transaction manager coordinates
with the resource managers to ensure
that all succeed to do the requested
work or none of the work if done, thus
maintaining the ACID properties.
This of course requires all participants to provide the proper interfaces and (error-free) implementations. The interface looks like vaguely this:
public interface ITransactionParticipant {
bool WouldCommitWork();
void Commit();
void Rollback();
}
The Transaction manager at commit-time queries all participants whether they are willing to commit the transaction. The participants may only assert this if they are able to commit this transaction under all allowable error conditions (validation, system errors, etc). After all participants have asserted the ability to commit the transaction, the manager sends the Commit() message to all participants. If any participant instead raises an error or times out, the whole transaction aborts and individual members are rolled back.
This protocol requires participants to have recorded their whole transaction content before asserting their ability to commit. Of course this has to be in a special local transaction log structure to be able to recover from various kinds of failures.
I recently came up with a case that makes me wonder if I'm a newbie or something trivial has escaped to me.
Suppose I have a software to be run by many users, that uses a table. When the user makes login in the app a series of information from the table appears and he has just to add and work or correct some information to save it. Now, if the software he uses is run by many people, how can I guarantee is he is the only one working with that particular record? I mean how can I know the record is not selected and being worked by 2 or more users at the same time? And please I wouldn't like the answer use “SELECT FOR UPDATE... “
because for what I've read it has too negative impact on the database. Thanks to all of you. Keep up the good work.
This is something that is not solved primarily by the database. The database manages isolation and locking of "concurrent transactions". But when the records are sent to the client, you usually (and hopefully) closed the transaction and start a new one when it comes back.
So you have to care yourself.
There are different approaches, the ones that come into my mind are:
optimistic locking strategies (first wins)
pessimistic locking strategies
last wins
Optimistic locking: you check whether a record had been changed in the meanwhile when storing. Usually it does this by having a version counter or timestamp. Some ORMs and frameworks may help a little to implement this.
Pessimistic locking: build a mechanism that stores the information that someone started to edit something and do not allow someone else to edit the same. Especially in web projects it needs a timeout when the lock is released anyway.
Last wins: the second person storing the record just overwrites the first changes.
... makes me wonder if I'm a newbie ...
That's what happens always when we discover that very common stuff is still not solved by the tools and frameworks we use and we have to solve it over and over again.
Now, if the software he uses is runed by many people how can I guarantee is he
is the only one working with that particular record.
Ah...
And please I wouldn't like the answer use “SELECT FOR UPDATE... “ because for
what I've read it has too negative impact on the database.
Who cares? I mean, it is the only way (keep a lock on a row) to guarantee you are the only one who can change it. Yes, this limits throughput, but then this is WHAT YOU WANT.
It is called programming - choosing the right tools for the job. IN this case impact is required because of the requirements.
The alternative - not a guarantee on the database but an application server - is an in memory or in database locking mechanism (like a table indicating what objects belong to what user).
But if you need to guarantee one record is only used by one person on db level, then you MUST keep a lock around and deal with the impact.
But seriously, most programs avoid this. They deal with it either with optimistic locking (second user submitting changes gets error) or other programmer level decisions BECAUSE the cost of such guarantees are ridiculously high.
Oracle is different from SQL server.
In Oracle, when you update a record or data set the old information is still available because your update is still on hold on the database buffer cache until commit.
Therefore who is reading the same record will be able to see the old result.
If the access to this record though is a write access, it will be a lock until commit, then you'll have access to write the same record.
Whenever the lock can't be resolved, a deadlock will pop up.
SQL server though doesn't have the ability to read a record that has been locked to write changes, therefore depending which query you're running, you might lock an entire table
First you need to separate queries and insert/updates using a data-warehouse database. Which means you could solve slow performance in update that causes locks.
The next step is to identify what is causing locks and work out each case separately.
rebuilding indexes during working hours could cause very nasty locks. Push them to after hours.
Whenever I read something about NoSQL distributed databases they mention the CAP theorem and that it means that in a partitioned system you can either have full consistency, full availability, or a little bit of both, but never both entirely.
What is not really clear to me is what type of consistency they are talking about:
Is it consistency in data freshness, where some clients may get older data than others?
Or is it consistency in the sense that transactions may complete only partially and this may bring the data in an inconsistent state?
The second interpretation sounds quite dangerous to me and not really acceptable. The first interpretation sounds acceptable but how can you prevent that a client that requests a set of data is not served with partly outdated data and partly fresh data?
How dangerous is it to only offer partial consistency and what are the possible negative effects?
Consistency in distributed databases is a huge problem, and it means both of your options: stale data in some places, and partially completed transactions. I'm not going to write an essay about it because it is a huge problem and the solutions are not easy. However, here are some key phrases.
Eventual Consistency is the solution to this, but implementing it sounds like a big job. The key to the implementation is Idempotent Messages. Lets say a complete transaction involves updating data on machines A, B, and C. How do you actually do that? You start sending messages around the place, and keep sending them until you receive an acknowledgement of receipt and successful processing. You may send the message to B twice either because B never got the message, or because B's ack never got received. If you sent it twice because you never got the ack, then B had better do the right thing when it gets it again (which may be to ignore it), and send you an ack so you stop bothering it.
This is a pretty good article, it looks like, and its from a NoSQL point of view. There are loads of links about Idempotent Messages hidden in any search engine, so I'll let you root around.
Final note: Pat Helland who worked on Distributed Databases for many years (at Microsoft and Google among other places) eventually came to the conclusion that consistency for Distributed DBs was impossible, and that you'd better settle for Eventual Consistency via Idempotent Messages.
All transaction managers (Atomikos, Bitronix, IBM WebSphere TM etc) save some "transaction logs" into 'tranlogs' folder to file system.
When something terrible happens and server gets down sometimes tranlogs become broken.
They require some manual recovery procedure.
I've been told that by simply clearing broken tranlogs folder I risk to have an inconsistent state of resources that participated in transactions.
As a "dumb" developer I feel more comfortable with simple concepts. I want to think that distributed transaction management should be alike the regular transaction management:
If something went wrong at any party (network, app error, timeout) - I expect the whole multi-resource transaction not to be committed in any part of it. All leftovers should be cleaned up sooner or later automatically.
If transaction managers fails (file system fault, power supply fault) - I expect all the transactions under this TM to be rollbacked (apparently, at DB timeout level).
File storage for tranlogs is optional if I don't want to have any automatic TX recovery (whatever it would mean).
Questions
Why can't I think like this? What's so complicated about 2PC?
What are the exact risks when I clear broken tranlogs?
If I am wrong and I really need all the mess with 2PC file system state. Don't you feel sick about the fact that TX manager can actually break storage state in an easy and ugly manner?
When I was first confronted with 2 phase commit in real life in 1994 (initially on a larger Oracle7 environment), I had a similar initial reaction. What a bloody shame that it is not generally possible to make it simple. But looking back at algorithm books of university, it become clear that there is no general solution for 2PC.
See for instance how to come to consensus in a distributed environment
Of course, there are many specific cases where a 2PC commit of a transaction can be resolved more easy to either complete or roll back completely and with less impact. But the general problem stays and can not be solved.
In this case, a transaction manager has to decide at some time what to do; a transaction can not remain open forever. Therefor, as an ultimate solution they will always need to have go back to their own transaction logs, since one or more of the other parties may not be able to reliably communicate status now and in the near future. Some transaction managers might be more advanced and know how to resolve some cases more easily, but the need for an ultimate fallback stays.
I am sorry for you. Fixing it generally seems to be identical to "Falsity implies anything" in binary logic.
Summarizing
On Why can't I think like this? and What's so complicated about 2PC: See above. This algorithmetic problem can't be solved universally.
On What are the exact risks when I clear broken tranlogs?: the transaction manager has some database backing it. Deleting translogs is the same problem in general relational database software; you loose information on the transactions in process. Some db platforms can still have somewhat or largely integer files. For background and some database theory, see Wikipedia.
On Don't you feel sick about the fact that TX manager can actually break storage state in an easy and ugly manner?: yes, sometimes when I have to get a lot of work done by the team, I really hate it. But well, it keeps me having a job :-)
Addition: to 2PC or not
From your addition I understand that you are thinking whether or not to include 2PC in your projects.
In my opinion, your mileage may vary. Our company has as policy for 2PC: avoid it whenever possible. However, in some environments and especially with legacy systems and complex environments such a found in banking you can not work around it. The customer requires it and they may be not willing to allow you to perform a major change in other infrastructural components.
When you must do 2PC: do it well. I like a clean architecture of the software and infrastructure, and something that is so simple that even 5 years from now it is clear how it works.
For all other cases, we stay away from two phase commit. We have our own framework (Invantive Producer) from client, to application server to database backend. In this framework we have chosen to sacrifice elements of ACID when normally working in a distributed environment. The application developer must take care himself of for instance atomicity. Often that is possible with little effort or even doesn't require thinking about. For instance, all software must be safe for restart. Even with atomicity of transactions this requires some thinking to do it well in a massive multi user environment (for instance locking issues).
In general this stupid approach is very easy to understand and maintain. In cases where we have been required to do two phase commit, we have been able to just replace some plug-ins on the framework and make some changes to client-side code.
So my advice would be:
Try to avoid 2PC.
But encapsulate your transaction logic nicely.
Allowing to do 2PC without a complete rebuild, but only changing things where needed.
I hope this helps you. If you can tell me more about your typical environments (size in #tables, size in GB persistent data, size in #concurrent users, typical transaction mgmt software and platform) may be i can make some additions or improvements.
Addition: Email and avoiding message loss in 2PC
Regarding whether suggesting DB combining with JMS: No, combining DB with JMS is normally of little use; it will itself already have some db, therefor the original question on transaction logs.
Regarding your business case: I understand that per event an email is sent from a template and that the outgoing mail is registered as an event in the database.
This is a hard nut to crack; I've been enjoying doing security audits and one of the easiest security issues to score was checking use of email.
Email - besides not being confidential and tampersafe in most situations like a postcard - has no guarantees for delivery and/or reading without additional measures. For instance, even when email is delivered directly between your mail transfer agent and the recipient, data loss can occur without the transaction monitor being informed. That even gets worse when multiple hops are involved. For instance, each MTA has it's own queueing mechanism on which a "bomb can be dropped" leading to data loss. But you can also think of spam measures, bad configuration, mail loops, pressing delete file by accident, etc. Even when you can register the sending of the email without any loss of transaction information using 2PC, this gives absolutely no clue on whether the email will arrive at all or even make it across the first hop.
The company I work for sells a large software package for project-driven businesses. This package has an integrated queueing mechanism, which also handles email events. Typically combined in most implementation with Exchange nowadays. A few months we've had a nice problem: transaction started, opened mail channel, mail delivered to Exchange as MTA, register that mail was handled... transaction aborted, since Oracle tablespace full. On the next run, the mail was delivered again to Exchange, again abort, etc. The algorithm has been enhanced now, but from this simple example you can see that you need all endpoints to cooperate in your 2PC, even when some of the endpoints are far away in an organisation receiving and displaying your email.
If you need measures to ensure that an email is delivered or read, you will need to supplement it by additional measures. Please pick one of application controls, user controls and process controls from literature.
In a financial system I am currently maintaining, we are relying on the rollback mechanism of our database to simulate the results of running some large batch jobs - rolling back the transaction or committing it at the end, depending on whether we were doing a test run.
I really cannot decide what my opinion is. In a way I think this is great, because then there is no difference between the simulation and a live run - on the other hand it just feels kind of icky, like e.g. depending on code throwing exceptions to carry out your business logic.
What is your opinion on relying on database transactions to carry out your business logic?
EXAMPLE
Consider an administration system having 1000 mortgage deeds in a database. Now the user wants to run a batch job that creates the next term invoice on each deed, using a couple of advanced search criteria that decides which deeds are to be invoiced.
Before she actually does this, she does a test run (implemented by doing the actualy run but ending in a transaction rollback), creating a report on which deeds will be invoiced. If it looks satisfactory, she can choose to do the actual run, which will end in a transaction commit.
Each invoice will be stamped with a batch number, allowing us to revert the changes later if it is needed, so it's not "dangereous" to do the batch run. The users just feel that it's better UX to be able to simulate the results first.
CLARIFICATION
It is NOT about testing. We do have test and staging environments for that. It's about a regular user using our system wanting to simulate the results of a large operation, that may seem "uncontrollable" or "irreversible" even though it isn't.
CONCLUSION
Doesn't seem like anyone has any real good arguments against our solution. As always, context means everything, so in the context of complex functional requirements exceeding performance requirements, using db rollback to implement batch job simulations seems a viable solution.
As there is no real answer to this question, I am not choosing an answer - instead I upvoted those who actually put forth an argument.
I think it's an acceptable approach, as long as it doesn't interfere with regular processing.
The alternative would be to build a query that displays the consequences for review, but we all have had the experience of taking such an approach and not quite getting it right; or finding that the context changed between query and execution.
At the scale of 1000 rows, it's unlikely the system load is burdensome.
Before she actually does this, she does a test run (implemented by doing the actualy run but ending in a transaction rollback), creating a report on which deeds will be invoiced. If it looks satisfactory, she can choose to do the actual run, which will end in a transaction commit.
That's wrong, prone to failure, and must be hell on your database logs. Unless you wrap your simulation and the actual run in a single transaction (which, judging by the timeline necessary to inspect 1000 deeds, would lead to a lot of blocked users) then there's no guaranteed consistency between test run and real run. If somebody changed data, added rows, etc. then you could end up with a different result - defeating the entire purpose of the test run.
A better pattern to do this would be for the test run to tag the records, and the real run to pick up the tagged records and process them. Or, if you have a thick client app, you can pull down the records to the client, show the report, and - if approved - push them back up.
We can see what the user needs to do, quite a reasonable thing. I mean how often do we get a regexp right first time? Refining a query till it does exactly what you want is not unusual.
The business consequences of not catching errors may be quite high, so doing a trial run makes sense.
Given a blank sheet of paper I'm sure we can devise an clean implementation expressed in formal behaviours of the system rather than this somewhat back-door appraoch.
How much effort would I put into fixing that now? Depends on whether the current approach is actually hurting. We can imagine that in a heaviliy used system it could lead to contention in the database.
What I wrote about the PRO FORMA environment in that bank I worked in was also entirely a user thing.
I'm not sure exactly what you're asking here. Taking you literally
What is your opinion on relying on
database transactions to carry out
your business logic?
Well, that's why we have transactions. We do rely on them. We hit an error and abort a transaction and rely on work done in that transaction scope to be rolled-back. So exploiting the transactional beahviours of our systems is a jolly good thing, and we'd need to hand-roll the same thing ourselves if we didn't.
But I think your question is about testing in a live system and relying on not commiting in order to do no damage. In an ideal world we have a live system and a test system and we don't mess with live systems. Such ideals are rarely seen. Far more common is "patch the live system. testing? what do you mean testing?" So in fact you're ahead of the game compared with some.
An alternative is to have dummy data in the live system, so that some actions can actually run all the way through. Again, error prone.
A surprisingly high proportion of systems outage are due to finger trouble, it's the humans who foul up.
It works - as you say. I'd worry about the concurrency of the system since the transaction will hold locks, possibly many locks. It means that your tests will hold up any live action on the system (and any live action operations will hold up your tests). It is generally better to use a test system for testing. I don't like it much, but if the risks from running a test but forgetting to roll it back are not a problem, and neither is the interference, then it is an effective way of attaining a 'what if' type calculation. But still not very clean.
When I was working in a company that was part of the "financial system", there was a project team that had decided to use the production environment to test their conversion procedure (and just rollback instead of commit at the end).
They almost got shot for it. With afterthought, it's a pity they weren't.
Your test environments that you claim you have are for the IT people's use. Get a similar "PRO-FORMA" environment of which you can guarantee your users that it is for THEIR use exclusively.
When I worked in that bank, creating such a PRO-FORMA environment was standard procedure at every year closure.
"But it's not about testing, it's about a user wanting to simulate the results of a large job."
Paraphrased : "it's not about testing, it's about simulation".
Sometimes I wish Fabian Pascal was still in business.
(Oh, and in case anyone doesn't understand : both are about "not perpetuating the results".)