Safely deleting a Django model from the database using a transaction - database

In my Django application, I have code that deletes a single instance of a model from the database. There is a possibility that two concurrent requests could both try to delete the same model at the same time. In this case, I want one request to succeed and the other to fail. How can I do this?
The problem is that when deleting an instance with delete(), Django doesn't return any information about whether the command was successful or not. This code illustrates the problem:
b0 = Book.objects.get(id=1)
b1 = Book.objects.get(id=1)
b0.delete()
b1.delete()
Only one of these two delete() commands actually deleted the object, but I don't know which one. No exceptions are thrown and nothing is returned to indicate the success of the command. In pure SQL, the command would return the number of rows deleted and if the value was 0, I would know my delete failed.
I am using PostgreSQL with the default Read Committed isolation level. My understanding of this level is that each command (SELECT, DELETE, etc.) sees a snapshot of the database, but that the next command could see a different snapshot of the database. I believe this means I can't do something like this:
# I believe this won't work
@commit_on_success
def view(request):
    try:
        book = Book.objects.get(id=1)
        # Possibility that the instance is deleted by the other request
        # before we get to the next delete()
        book.delete()
    except ObjectDoesNotExist:
        # Already been deleted
        pass
Any ideas?

You can put the constraint right into the SQL DELETE statement by using QuerySet.delete instead of Model.delete:
Book.objects.filter(pk=1).delete()
This will never issue the SELECT query at all, just something along the lines of:
DELETE FROM Book WHERE id=1;
That handles the race condition of two concurrent requests deleting the same record at the same time, but it doesn't let you know whether your delete got there first. For that you would have to get the raw cursor (which django lets you do), .execute() the above DELETE yourself, and then pick up the cursor's rowcount attribute, which will be 0 if you didn't wind up deleting anything.
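A minimal sketch of that raw-cursor approach, assuming the Book model's table is app_book (adjust the table name to your schema):

from django.db import connection

def delete_book(book_id):
    with connection.cursor() as cursor:
        # Issue the DELETE directly so we can read the affected-row count
        cursor.execute("DELETE FROM app_book WHERE id = %s", [book_id])
        return cursor.rowcount  # 0 means the other request got there first

If delete_book returns 0, you know your request lost the race.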

Related

SalesForce DML set-based operations and atomic transactions

I have just begun to read about Salesforce APEX and its DML. It seems you can do bulk updates by constructing a list, adding items to be updated to the list, and then issuing an update myList call.
Does such an invocation of update create an atomic transaction, so that if for any reason an update to one of the items in the list should fail, the entire update operation is rolled back? If not, is there a way to wrap an update in an atomic transaction?
Your whole execution context is an atomic transaction. By the time your Apex code runs, SF has already started one, whether the entry point is a Visualforce button click, a trigger or anything else. If you hit a validation error, a null pointer exception, an attempt to divide by zero, an uncaught thrown exception, etc. - the whole thing will be rolled back.
update myList; works in "all or nothing" mode. If one of the records fails (validation rule, required field missing, etc.) - you'll get an exception. You can wrap it in a try-catch block, but still - the whole list just failed to save.
If you need "save what you can" behavio(u)r - read up about the Database.update() version of this call. It takes an optional allOrNone parameter that lets you do exactly that.
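A rough sketch of the partial-success form, assuming myList holds the records being updated:

// Second argument false = allOrNone off: save what you can
Database.SaveResult[] results = Database.update(myList, false);
for (Database.SaveResult sr : results) {
    if (!sr.isSuccess()) {
        // Inspect the per-record errors for anything that didn't save
        System.debug(sr.getErrors()[0].getMessage());
    }
}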
Last but not least, for complex scenarios (insert an account, insert its contacts, one of the contacts fails but you had that in a try-catch so the account saved OK - so what now, do you have to manually delete it? Weak...) you have the Database.setSavepoint() and Database.rollback() calls.
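A minimal savepoint sketch for that account-plus-contacts case (variable names are illustrative):

Savepoint sp = Database.setSavepoint();
try {
    insert acct;
    insert contacts;       // if this throws, undo the account as well
} catch (DmlException e) {
    Database.rollback(sp); // everything since the savepoint is discarded
}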
https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/langCon_apex_dml_database.htm
https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/langCon_apex_transaction_control.htm
https://salesforce.stackexchange.com/questions/9410/rolling-back-dml-operation-in-apex-method

Conditional SQL block evaluated even when it won't be executed

I'm working on writing a migration script for a database, and am hoping to make it idempotent, so we can safely run it any number of times without fear of it altering the database (/ migrating data) beyond the first attempt.
Part of this migration involves removing columns from a table, but inserting that data into another table first. To do so, I have something along these lines.
IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
       AND name = 'ColumnToBeDropped')
BEGIN
    CREATE TABLE MigrationTable (
        Id int,
        ColumnToBeDropped varchar
    );

    INSERT INTO MigrationTable
        (Id, ColumnToBeDropped)
    SELECT Id, ColumnToBeDropped
    FROM TableToBeModified;
END
The first time through, this works fine, since the column still exists. However, on subsequent attempts, it fails because the column no longer exists. I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem, or is there another, still potentially "validity enforced" option?
I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem
Yes. There are several scenarios in which you would want to push off the parsing/validation because of dependencies elsewhere in the script. I will even sometimes put things into an EXEC when there are no current problems, to ensure that there won't be any as either the rest of the script or the environment changes after the current rollout script was developed. As a minor bonus, it helps break things up visually.
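For example, the inner block from the question could be wrapped so it is only compiled when the IF branch actually runs (a sketch based on the original snippet):

IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
       AND name = 'ColumnToBeDropped')
BEGIN
    EXEC('
        CREATE TABLE MigrationTable (
            Id int,
            ColumnToBeDropped varchar
        );
        INSERT INTO MigrationTable (Id, ColumnToBeDropped)
        SELECT Id, ColumnToBeDropped
        FROM TableToBeModified;
    ');
END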
While there can be permissions issues related to breaking ownership chaining when using Dynamic SQL, that is rarely a concern for a rollout script, and not a problem I have ever run into.
If we are not sure whether the script will work, especially when migrating a database, we can test it first. For a script that changes data, I will run it inside BEGIN TRAN, check that the result is what I expected, and then either COMMIT TRAN or ROLLBACK TRAN, which discards the changes.
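A sketch of that manual check, reusing the tables from the question:

BEGIN TRAN;

INSERT INTO MigrationTable (Id, ColumnToBeDropped)
SELECT Id, ColumnToBeDropped
FROM TableToBeModified;

-- Inspect the result before deciding
SELECT COUNT(*) AS MigratedRows FROM MigrationTable;

-- If the output looks right:
COMMIT TRAN;
-- otherwise:
-- ROLLBACK TRAN;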

'tail -f' a database table

Is it possible to effectively tail a database table such that when a new row is added an application is immediately notified with the new row? Any database can be used.
Use an ON INSERT trigger.
You will need to check the specifics of how to call external applications with the values contained in the inserted record, or else write your 'application' as a SQL procedure and have it run inside the database.
It sounds like you will want to brush up on databases in general before you paint yourself into a corner with your command-line approaches.
Yes, if the database is a flat text file and appends are done at the end.
Yes, if the database supports this feature in some other way; check the relevant manual.
Otherwise, no. Databases tend to be binary files.
I am not sure, but this might work for primitive / flat-file databases. As far as I understand (and I could be wrong), modern database files are not stored as plain text, so reading a newly added row would not work with that command.
I would imagine most databases allow for write triggers, and you could have a script that triggers on write that tells you some of what happened. I don't know what information would be available, as it would depend on the individual database.
There are a few options here, some of which others have noted:
Periodically poll for new rows. With the way MVCC works though, it's possible to miss a row if there were two INSERTS in mid-transaction when you last queried.
Define a trigger function that will do some work for you on each insert. (In Postgres you can call a NOTIFY command that other processes can LISTEN to; see the sketch after this answer.) You could combine a trigger with writes to an unpublished_row_ids table to ensure that your tailing process doesn't miss anything. (The tailing process would then delete IDs from the unpublished_row_ids table as it processed them.)
Hook into the database's replication functionality, if it provides any. This should have a means of guaranteeing that rows aren't missed.
I've blogged in more detail about how to do all these options with Postgres at http://btubbs.com/streaming-updates-from-postgres.html.
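As a sketch of the trigger-plus-NOTIFY option, assuming a hypothetical events table with an id column:

-- Fire a notification carrying the new row's id on every insert
CREATE FUNCTION notify_new_event() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('new_event', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_insert_notify
    AFTER INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE notify_new_event();

-- The tailing process runs LISTEN new_event; and blocks until a notification arrives.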
tail on Linux appears to be using inotify to tell when a file changes - it probably uses similar filesystem notifications frameworks on other operating systems. Therefore it does detect file modifications.
That said, tail performs an fstat() call after each detected change and will not output anything unless the size of the file increases. Modern DB systems use random file access and reuse DB pages, so it's very possible that an inserted row will not cause the backing file size to change.
You're better off using inotify (or similar) directly, and even better off if you use DB triggers or whatever mechanism your DBMS offers to watch for DB updates, since not all file updates are necessarily row insertions.
I was just in the middle of posting the same exact response as glowcoder, plus another idea:
The low-tech way to do it is to have a timestamp field, and have a program run a query every n minutes looking for records where the timestamp is greater than that of the last run. The same concept can be done by storing the last key seen if you use a sequence, or even adding a boolean field "processed".
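For instance, the polling query might look something like this (the table, column and bind-parameter names are made up):

-- Remember the newest created_at seen on the previous run and pass it back in
SELECT *
FROM incoming_orders
WHERE created_at > :last_seen_timestamp
ORDER BY created_at;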
With Oracle you can select a pseudo-column called 'rowid' that gives a unique identifier for the row in the table, and rowids are ordinal... new rows get assigned rowids that are greater than any existing rowids.
So, first select max(rowid) from table_name
I assume that one cause for the raised question is that there are many, many rows in the table... so this first step will be taxing the db a little and take some time.
Then, select * from table_name where rowid > 'whatever_that_rowid_string_was'
You still have to run the query periodically, but it is now a quick and inexpensive query.

Is there an automatic way to generate a rollback script when inserting data with LINQ2SQL?

Let's assume we have a bunch of LINQ2SQL InsertOnSubmit statements against a given DataContext. If the SubmitChanges call is successful, is there any way to automatically generate a list of SQL commands (or even LINQ2SQL statements) that could undo everything that was submitted at a later time? It's like executing a rollback even though everything worked as expected.
Note: The destination database will either be Oracle or SQL Server, so if there is specific functionality for both databases that will achieve this, I'm happy to use that as well.
Clarification:
I do not want the "rollback" to happen automatically as soon as the inserts have successfully completed. I want to have the ability to "undo" the INSERT statements via DELETE (or some other means) up to 24 hours (for example) after the original program finished inserting data. We can ignore any possible referential integrity issues that may come up.
Assume a Table A with two columns: Id (autogenerated unique id) and Value (string)
If the LINQ2SQL code performs two inserts
INSERT INTO A (Value) VALUES ('a') -- Creates new row with Id = 1
INSERT INTO A (Value) VALUES ('z') -- Creates new row with Id = 2
<< time passes>>
At some point later I would want to be able "undo" this by executing
DELETE FROM A Where Id = 1
DELETE FROM A Where Id = 2
or something similar. I want to be able to generate the DELETE statements to match the INSERT ones. Or use some functionality that would let me capture a transaction and perform a rollback later.
We cannot just 'reset the database' to a certain point in time either as other changes not initiated by our program could have taken place since.
It is actually quite easy to do this, because you can pass in a SqlConnection into the LINQ to SQL DataContext on construction. Just run this connection in a transaction and roll that transaction back as soon as you're done.
Here's an example:
string output;
using (var connection = new SqlConnection("your conn.string"))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        using (var context = new YourDataContext(connection))
        {
            // This next line is needed in .NET 3.5.
            context.Transaction = transaction;
            var writer = new StringWriter();
            context.Log = writer;
            // *** Do your stuff here ***
            context.SubmitChanges();
            output = writer.ToString();
        }
        transaction.Rollback();
    }
}
I am always required to provide a RollBack script to our QA team for testing before any change script can be executed in PROD.
Example: Files are sent externally with a bunch of mappings between us, the recipient and other third parties. One of these third parties wants to change, on an agreed date, the mappings between the three of us.
The exec script would maybe update some existing records, delete some now-redundant ones and insert some new ones - scope_identity used in subsequent relational setup, etc.
If, for some reason, after we have all executed our changes and the file transport is fired up just like in UAT, we see errors not encountered in UAT, we might collectively make the decision to roll back the changes. Hence the rollback script.
SQL has this info from the moment you BEGIN TRAN until you COMMIT TRAN or ROLLBACK TRAN. I guess your question is the same as mine - can you output that info as a script?
Why do you need this?
Maybe you should explore the flashback possibilities of Oracle. It makes it possible to travel back in time.
It makes it possible to reset the content of a table or a database to how it once was at a specific moment in time (or at a specific system change number).
See: http://www.oracle.com/technology/deploy/availability/htdocs/Flashback_Overview.htm
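As a rough illustration of the flashback syntax (assuming the feature is enabled and the undo data is still available; A is the example table from the question):

-- See the rows as they were an hour ago
SELECT * FROM A AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR);

-- Or rewind the table itself (requires ALTER TABLE A ENABLE ROW MOVEMENT)
FLASHBACK TABLE A TO TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR);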

NHibernate session.flush() fails but makes changes

We have a SQL Server database table that consists of user id, some numeric value, e.g. balance, and a version column.
We have multiple threads updating this table's value column in parallel, each in its own transaction and session (we're using a session-per-thread model). Since we want all logical transactions to go through, each thread does the following:
load the current row (mapped to a type).
make the change to the value, based on old value. (e.g. add 50).
session.update(obj)
session.flush() (since we're optimistic, we want to make sure we had the correct version value prior to the update)
if step 4 (flush) threw a StaleStateException, refresh the object (with LockMode.Read) and go to step 1
We only do this a certain number of times per logical transaction; if we can't apply it after X attempts, we reject the logical transaction.
Each such thread commits periodically, e.g. after 100 successful logical transactions, to keep commit-induced I/O at manageable levels. Meaning: we have a single database transaction (per thread) with multiple flushes, at least one per logical change.
What's the problem here, you ask? Well, on commit we see changes from failed logical transactions.
Specifically, if the value was 50 when we went through step 1 (for the first time), and we tried to update it to 100 (but failed because e.g. another thread changed it to 70), then the value of 50 is committed for this row. Obviously this is incorrect.
What are we missing here?
Well, I do not have a ton of experience here, but one thing I remember reading in the documentation is that if an exception occurs, you are supposed to immediately roll back the transaction and dispose of the session. Perhaps your issue is related to the session being in an inconsistent state?
Also, calling update in your code here is not necessary. Since you loaded the object in that session, it is already being tracked by NHibernate.
If you want to make your changes anyway, why do you bother with row versioning? It sounds like you should get the same result if you simply always update the data and let the last transaction win.
As to why the update becomes permanent, it depends on what the SQL statements for the version check/update look like and on your transaction control, which you left out of the code example. If you turn on the Hibernate SQL logging it will probably become obvious how this is happening.
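A minimal sketch of the roll-back-and-retry pattern that first suggestion describes - one transaction per attempt, a fresh session after any failure, and illustrative entity and variable names throughout:

for (int attempt = 0; attempt < maxAttempts; attempt++)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        try
        {
            var account = session.Get<Account>(accountId);
            account.Balance += 50;   // change based on the freshly loaded value
            tx.Commit();             // flush + version check happen here
            break;                   // success, stop retrying
        }
        catch (StaleStateException)
        {
            tx.Rollback();           // discard the failed attempt entirely
            // loop around and retry with a brand-new session
        }
    }
}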
I'm not an NHibernate guru, but the answer seems simple.
When NHibernate loads an object, it expects it not to change in the db for as long as it's in the NHibernate session cache.
As you mentioned, you have a multi-threaded app.
This is what happens:
1st thread loads an entity
2nd thread loads an entity
1st thread changes the entity
2nd thread changes the entity and finds out that the loaded entity has been changed by something else; afraid that it might screw up the changes the 1st thread made, it throws an exception to make the programmer aware of that.
You are missing a locking mechanism. I can't tell you much about how to apply that properly and elegantly. Maybe a transaction would help.
We had similar problems when we used NHibernate and raw ADO.NET concurrently (luckily, just for querying, at least in production code). All we had to do was force the db to be updated on insert/update so we could actually query some specific entities through full-text search.
We had StaleStateException in integration tests when we used raw ADO.NET to reset the db. The NHibernate session was alive through a bunch of tests, but every test tried to clean up the db without NHibernate being aware of it.
Here is the documentation on exceptions in the session:
http://nhibernate.info/doc/nhibernate-reference/best-practices.html

Resources