CakePHP - Model Transactions for insert / update / delete

I have a table with several fields, two of which are "startdate" and "enddate", which mark the validity period of the record. If I insert a new record, it cannot overlap with other records in terms of start date and end date.
Hence, on insertion of a new record, I may need to adjust the "startdate" and "enddate" values of pre-existing records so they don't overlap with the new record. Similarly, any pre-existing records that overlap 100% with the new record will need to be deleted.
My table is an InnoDB table, which I know supports such transactions.
Are there any examples which show the use of insert / update / delete using transactions (all must succeed in order for any one of them to be committed)?
I don't know how to do this. Most examples only show the use of saveAssociated(), which I'm not sure is capable of catering for delete operations.
Thanks
Kevin

Perhaps you could use the beforeSave callback to search for the preexisting records and delete them before saving your new record.
from the docs:
Place any pre-save logic in this function. This function executes immediately after model data has been successfully validated, but just before the data is saved. This function should also return true if you want the save operation to continue.

I think you're looking to do Transactions: http://book.cakephp.org/2.0/en/models/transactions.html
That should allow you to run your queries - you start a transaction, perform any required actions, and then commit or roll back based on the outcome. Although, given your description, I'd think doing some reads and adjusting your data before committing anything might be a better approach. Either way, transactions aren't a bad idea!
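For illustration, here's a minimal sketch of that shape in plain SQL (MySQL/InnoDB). The table name "periods" and the literal dates are hypothetical; in CakePHP you would run the equivalent queries between the datasource's begin() and commit()/rollback() calls:

START TRANSACTION;
-- delete pre-existing records fully covered by the new range
DELETE FROM periods
WHERE startdate >= '2014-06-01' AND enddate <= '2014-06-30';
-- trim records overlapping the start of the new range
UPDATE periods SET enddate = '2014-05-31'
WHERE startdate < '2014-06-01' AND enddate >= '2014-06-01';
-- trim records overlapping the end of the new range
-- (a record spanning the whole range would need splitting in two; omitted here)
UPDATE periods SET startdate = '2014-07-01'
WHERE startdate <= '2014-06-30' AND enddate > '2014-06-30';
-- insert the new record
INSERT INTO periods (startdate, enddate) VALUES ('2014-06-01', '2014-06-30');
COMMIT; -- or ROLLBACK if any step fails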

Related

Add DATE column to store when last read

We want to know which rows in a certain table are used frequently, and which are never used. We could add an extra column for this, but then we'd get an UPDATE for every SELECT, which sounds expensive. (The table contains 80k+ rows, some of which are used very often.)
Is there a better and perhaps faster way to do this? We're using an old version of Microsoft's SQL Server.
This kind of logging/tracking is classically the application server's job. If you want to build your own tracking architecture, do it in your own application layer.
In any case, you will need an application server for this. You are not going to update a tracking field in the same transaction as the SELECT, are you? What about rollbacks? So you need some manager that first runs the SELECT and then writes the tracking information. And what is the point of saving the tracking information together with the entity info by sending it back to the DB? Save it into a file on the application server.
You could update the column in the table as you suggested, but if it were me I'd log the event to another table, e.g. the id of the record, a datetime, the userid (maybe IP address, browser version, etc.), and just about anything else I could capture that was even possibly relevant. (For example, 6 months from now your manager decides not only does s/he want to know which records were used the most, s/he wants to know which users are using the most records, or what time of day that usage pattern occurs, etc.)
This type of information can be useful for things you've never even thought of down the road, and if it starts to grow large you can always roll up and prune the table to a smaller one if performance becomes an issue. When possible, I log everything I can. You may never use some of this information, but you'll never wish you didn't have it available down the road, and it will be impossible to re-create historically.
In terms of making sure the application doesn't slow down, you may want to 'select' the data from within a stored procedure that also issues the logging command, so that the client is not doing two round trips (one for the select, one for the update/insert).
Alternatively, if this is a web application, you could use an async AJAX call to issue the logging action, which wouldn't slow down the user's experience at all.
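As a sketch of the logging-table-plus-stored-procedure idea (T-SQL; the table, procedure, and column names here are hypothetical):

CREATE TABLE AccessLog (
    record_id INT NOT NULL,
    user_id   INT NULL,
    accessed  DATETIME NOT NULL DEFAULT GETDATE()
);
GO
-- one round trip: log the read, then return the row
CREATE PROCEDURE GetProduct @id INT, @user_id INT
AS
BEGIN
    INSERT INTO AccessLog (record_id, user_id) VALUES (@id, @user_id);
    SELECT * FROM Products WHERE id = @id;
END
GO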
Adding a new column to track SELECTs is not good practice, because it may affect database performance, and database performance is one of the major concerns in database server administration.
Instead, you can use a very good database feature called auditing; it is easy to set up and puts less stress on the database.
Search for "database auditing for SELECT statements" to find more information.
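For instance, on SQL Server 2008 or later, a SELECT audit could be set up roughly like this (a sketch only; the audit names and file path are hypothetical, and edition/version requirements apply):

-- created in master; writes audit records to files
CREATE SERVER AUDIT SelectAudit TO FILE (FILEPATH = 'C:\AuditLogs\');
GO
ALTER SERVER AUDIT SelectAudit WITH (STATE = ON);
GO
-- created in the target database
CREATE DATABASE AUDIT SPECIFICATION SelectAuditSpec
FOR SERVER AUDIT SelectAudit
ADD (SELECT ON OBJECT::dbo.myTable BY public)
WITH (STATE = ON);
GO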
Use another table as a key/value pair with two columns (e.g. id_selected, times) to store the ids of the records you select in your standard table, and increment the times value by 1 every time the records are selected.
To do this you'd have to do a mass insert/update of the selected ids from your select query into the counting table. As a quick example:
SELECT id, stuff1, stuff2 FROM myTable WHERE stuff1 = 'somevalue';

INSERT INTO countTable (id_selected, times)
SELECT id, 1 FROM myTable mt WHERE mt.stuff1 = 'somevalue' # or just build a list of ids as VALUES from your last result
ON DUPLICATE KEY UPDATE times = times + 1;
The ON DUPLICATE KEY syntax is right off the top of my head and is MySQL-specific. For conditionally inserting or updating in MSSQL you would need to use MERGE instead (available from SQL Server 2008 onward).

How to update and add new records at the same time

I have a table that contains more than a million records (products).
Daily, I need to update existing records and/or add new ones.
Instead of doing it one by one (which takes a couple of hours), I managed to use SqlBulkCopy to work with batches of records and got my inserts done in a matter of seconds, but it can handle only new inserts. So I am thinking about creating a staging table that contains both the new and the old records, and then using that temporary table (on the SQL end) to update/add to the main table.
Any advice on how I can perform that update?
One of the better ways to handle this is with the MERGE command in SQL. MSSQLTips has a good tutorial on it; it can be a bit trickier to use than some of the other commands.
Also, due to locking, you may want to break this up into multiple smaller transactions, unless you know you can tolerate blocking during the update.
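A hedged sketch of what that MERGE might look like (T-SQL, SQL Server 2008+; the table and column names are hypothetical):

MERGE Products AS target
USING ProductsStaging AS src
    ON target.ProductID = src.ProductID
WHEN MATCHED THEN
    UPDATE SET target.Name = src.Name, target.Price = src.Price
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID, Name, Price)
    VALUES (src.ProductID, src.Name, src.Price);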
We handle this situation in our code the way you described: we have a temp table, then run an update where the ID in the temp table matches the table to be updated, then run an insert for the rows whose ID does not yet exist in the table to be updated (sketched below). We normally do this for updates to library/program settings, though, so it is only run infrequently, on smaller tables. Performance may not be up to par for that many records, or daily runs.
The main "gotcha" I've encountered with this method is that for the update, we did a comparison to make sure at least one of several fields changed before actually running the update. (Our initial reason for this was to avoid overwriting some defaults, which could affect server behavior. Your reason for this might be performance, if your temp table could contain records that haven't actually changed). We encountered a case where we did actually want to update one of the defaults, but our old script didn't catch that. So if you do any comparisons to determine which products you want to update, make sure it is either complete from the start, or document well any fields you don't compare, and why.
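For reference, the update-then-insert pattern described above might look like this (T-SQL; names hypothetical):

-- update rows that already exist in the main table
UPDATE p
SET p.Name = s.Name, p.Price = s.Price
FROM Products p
JOIN ProductsStaging s ON s.ProductID = p.ProductID;
-- insert rows that don't exist yet
INSERT INTO Products (ProductID, Name, Price)
SELECT s.ProductID, s.Name, s.Price
FROM ProductsStaging s
LEFT JOIN Products p ON p.ProductID = s.ProductID
WHERE p.ProductID IS NULL;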

LINQ to SQL object versioning

I'm trying to create a LINQ to SQL class that represents the "latest" version of itself.
Right now, the table that this entity represents has a single auto-incrementing ID, and I was thinking that I would add a version number to the primary key. I've never done anything like this, so I'm not sure how to proceed. I would like to be able to abstract the idea of the object's version away from whoever is using it. In other words, you have an instance of this entity that represents the most current version, and whenever any changes are submitted, a new copy of the object is stored with an incremented version number.
How should I proceed with this?
If you can avoid keeping a history, do. It's a pain.
If a complete history is unavoidable (regulated financial and medical data or the like), consider adding history tables. Use a trigger to 'version' into the history tables. That way, you're not dependent on your application to ensure a version is recorded - all inserts/updates/deletes are captured regardless of the source.
If your app needs to interact with historical data, make sure it's readonly. There's no sense capturing transaction histories if someone can simply change them.
If your concern is concurrent updates, consider using a record change timestamp. When both User A and User B view a record at noon, they fetch the record's timestamp. When User A updates the record, her timestamp matches the record's so the update goes through and the timestamp is updated as well. When User B updates the record five minutes later, his timestamp doesn't match the record's so he's warned that the record has changed since he last viewed it. Maybe it's automatically reloaded...
Whatever you decide, I would avoid inter-mingling current and historic data.
Trigger resources per comments:
MSDN
A SQL Team Introduction
Stackoverflow's Jon Galloway describes a general data-change logging trigger
The keys to an auditing trigger are the virtual tables 'inserted' and 'deleted'. These tables contain the rows affected by an INSERT, UPDATE, or DELETE. You can use them to audit changes. Something like:
CREATE TRIGGER tr_TheTrigger
ON [YourTable]
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    -- 'inserted' has rows for INSERTs and UPDATEs
    IF EXISTS (SELECT * FROM inserted)
    BEGIN
        -- this is an insert or update
        -- your actual action will vary, but something like this:
        INSERT INTO [YourTable_Audit]
        SELECT * FROM inserted;
    END
    -- 'deleted' also has rows for UPDATEs, so rule those out
    IF EXISTS (SELECT * FROM deleted) AND NOT EXISTS (SELECT * FROM inserted)
    BEGIN
        -- this is a delete; mark [YourTable_Audit] as required,
        -- e.g. capture the old row images:
        INSERT INTO [YourTable_Audit]
        SELECT * FROM deleted;
    END
END
GO
The best way to proceed is to stop and seriously rethink your approach.
If you are going to keep different versions of the "object" around, then you are better off serializing it into an xml format and storing that in an XML column with a field for the version number.
There are serious considerations when trying to maintain versioned data in SQL Server, mostly revolving around application maintenance.
UPDATE per comment:
Those considerations include the inability to remove a field or change a field's data type in future "versions". New fields are required to be nullable or, at the very least, to have a default value stored in the DB; as such, you will not be able to use them in a unique index or as part of the primary key.
In short, the only thing your application can do is expand, provided the expansion can be ignored by previous layers of code.
This is the classic problem of backwards compatibility that desktop software makers have struggled with for years, and it is the reason you might want to stay away from this approach.

SQL Server 2000: Is there a way to tell when a record was last modified?

The table doesn't have a last updated field and I need to know when existing data was updated. So adding a last updated field won't help (as far as I know).
SQL Server 2000 does not keep track of this information for you.
There may be creative / fuzzy ways to guess what this date was depending on your database model. But, if you are talking about 1 table with no relation to other data, then you are out of luck.
You can't check for changes without some sort of audit mechanism; you are looking to extract information that has not been collected. If you just need to know when a record was added or edited, adding a datetime field that gets updated via a trigger when the record is updated would be the simplest choice.
If you also need to track when a record has been deleted, then you'll want to use an audit table and populate it from triggers with a row when a record has been added, edited, or deleted.
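A sketch of the datetime-plus-trigger idea (T-SQL; the table name and key column are hypothetical):

ALTER TABLE YourTable ADD LastUpdated DATETIME NULL;
GO
CREATE TRIGGER tr_YourTable_LastUpdated
ON YourTable
FOR UPDATE
AS
BEGIN
    -- stamp the modified rows; assumes a key column named ID
    -- (direct trigger recursion is off by default, so this won't re-fire itself)
    UPDATE t
    SET LastUpdated = GETDATE()
    FROM YourTable t
    JOIN inserted i ON i.ID = t.ID;
END
GO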
You might try a log viewer; this basically just lets you look at the transactions in the transaction log, so you should be able to find the statement that updated the row in question. I wouldn't recommend this as a production-level auditing strategy, but I've found it to be useful in a pinch.
Here's one I've used; it's free and (only) works w/ SQL Server 2000.
http://www.red-gate.com/products/SQL_Log_Rescue/index.htm
You can add a timestamp field to that table and update that timestamp value with an update trigger.
OmniAudit is a commercial package which implements auditing across an entire database.
A free method would be to write a trigger for each table which adds entries to an audit table when fired.

Editing database records by multiple users

I have designed database tables (normalised, on an MS SQL Server) and created a standalone Windows front end for an application that will be used by a handful of users to add and edit information. We will add a web interface to allow searching across our production area at a later date.
I am concerned that if two users start editing the same record then the last to commit the update would be the 'winner' and important information may be lost. A number of solutions come to mind but I'm not sure if I am going to create a bigger headache.
Do nothing and hope that two users are never going to be editing the same record at the same time. - Might never happen, but what if it does?
The editing routine could store a copy of the original data as well as the updates and then compare them when the user has finished editing. If they differ, show the user and confirm the update. - Would require two copies of the data to be stored.
Add a last updated DATETIME column and check that it matches when we update; if not, show the differences. - Requires a new column in each of the relevant tables.
Create an editing table that registers when users start editing a record, which will be checked to prevent other users from editing the same record. - Would require careful thought about program flow to prevent deadlocks and records staying locked if a user crashes out of the program.
Are there any better solutions or should I go for one of these?
If you expect infrequent collisions, Optimistic Concurrency is probably your best bet.
Scott Mitchell wrote a comprehensive tutorial on implementing that pattern:
Implementing Optimistic Concurrency
A classic approach is as follows:
add a boolean field, "locked", to each table.
set this to false by default.
when a user starts editing, you do this:
lock the row (or the whole table if you can't lock the row)
check the flag on the row you want to edit
if the flag is true then
inform the user that they cannot edit that row at the moment
else
set the flag to true
release the lock
when saving the record, set the flag back to false
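A sketch of that flag pattern in T-SQL (names hypothetical), taking the row lock with the UPDLOCK hint mentioned below:

DECLARE @id INT;
SET @id = 42; -- the row being edited

BEGIN TRAN;
DECLARE @locked BIT;
-- lock the row while we inspect the flag
SELECT @locked = locked
FROM YourTable WITH (UPDLOCK, ROWLOCK)
WHERE id = @id;

IF @locked = 1
    PRINT 'Row is being edited by another user.';
ELSE
    UPDATE YourTable SET locked = 1 WHERE id = @id; -- claim it
COMMIT;

-- when saving the record later:
-- UPDATE YourTable SET locked = 0 WHERE id = @id;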
@Mark Harrison: SQL Server does not support that syntax (SELECT ... FOR UPDATE).
The SQL Server equivalent is the SELECT statement hint UPDLOCK.
See SQL Server Books Online for more information.
- First, create a field (update_time) that stores when each record was last updated.
- When any user selects a record, save the select time.
- On save, compare the select time with the update_time field: if update_time > select_time, another user has updated this record after it was selected.
SELECT FOR UPDATE and equivalents are good provided you hold the lock for a microscopic amount of time, but for a macroscopic amount (e.g. the user has the data loaded and hasn't pressed 'save') you should use optimistic concurrency as above. (Which I always think is misnamed - it's more pessimistic than 'last writer wins', which is usually the only other alternative considered.)
Another option is to test that the values in the record that you are changing are still the same as they were when you started:
SELECT
    customer_nm,
    customer_nm AS customer_nm_orig
FROM demo_customer
WHERE customer_id = @p_customer_id;

(display the customer_nm field and let the user change it)

UPDATE demo_customer
SET customer_nm = @p_customer_nm_new
WHERE customer_id = @p_customer_id
  AND customer_nm = @p_customer_nm_old;

IF @@ROWCOUNT = 0
    RAISERROR('Update failed: data changed', 16, 1);
You don't have to add a new column to your table (and keep it up to date), but you do have to create more verbose SQL statements and pass new and old fields to the stored procedure.
It also has the advantage that you are not locking the records - because we all know that records will end up staying locked when they should not be...
The database will do this for you. Look at "select ... for update", which is designed just for this kind of thing. It will give you a write lock on the selected rows, which you can then commit or roll back.
For me, the best way is to have a lastupdate column (timestamp datatype).
On select and on update, just compare this value.
Another advantage of this solution is that you can use this column to track when the data last changed.
I don't think it is good if you just create a column like isLock to check for updates.
