Better alternative to INSTEAD OF INSERT, UPDATE trigger - sql-server

I work for a security contracting company, and I have a table in my database that stores adjusted hours/times for worked POs. For each date that is worked, the finance department will enter the billable start and end time for the vendor, and the billable start and end time for the client. Most of the time, the vendor and client billable times are equal, except in rare circumstances.
The way our system works, the vendor start and end time always updates the client start and end time, unless the client start and end time differs from the vendor start and end time. To store this information, I have a table with these columns:
JobID
PostNumber
StartTime (vendor start time)
ClosingTime (vendor end time)
ClientStartTime
ClientClosingTime
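In DDL terms, the table looks roughly like this (the column types and the key shown here are approximate):

CREATE TABLE dbo.WorkOrderDetailAdjustment (
    JobID             int      NOT NULL,
    PostNumber        int      NOT NULL,
    StartTime         datetime NOT NULL,  -- vendor start time
    ClosingTime       datetime NOT NULL,  -- vendor end time
    ClientStartTime   datetime NULL,
    ClientClosingTime datetime NULL,
    CONSTRAINT PK_WorkOrderDetailAdjustment PRIMARY KEY (JobID, PostNumber)
);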
Since the vendor and client start and end times usually need to stay synced with each other, I figured that I would use a trigger to handle it. Because I need access to the current values to compare them against the new values, it seems that an INSTEAD OF INSERT, UPDATE trigger is the way to go, except that I'm not crazy about triggers in the first place, and relying on a trigger to perform all of my inserts and updates into this table makes me nervous. Maybe it's an irrational fear I have, but I usually try to stay away from triggers when I can. In this circumstance however, it seems like the best option.
Here is my trigger, which should make the logic clear:
ALTER TRIGGER [dbo].[<UpdateAdjustedHours>]
ON [dbo].[<AdjustedHoursTable>]
INSTEAD OF INSERT, UPDATE
AS
BEGIN
    IF NOT EXISTS (SELECT CurrentValues.JobID
                   FROM WorkOrderDetailAdjustment CurrentValues
                   INNER JOIN Inserted NewValues
                       ON CurrentValues.JobID = NewValues.JobID
                      AND CurrentValues.PostNumber = NewValues.PostNumber)
    BEGIN
        INSERT INTO WorkOrderDetailAdjustment
        SELECT Inserted.JobID,
               Inserted.PostNumber,
               Inserted.StartTime,
               Inserted.ClosingTime,
               ISNULL(Inserted.ClientStartTime, Inserted.StartTime),
               ISNULL(Inserted.ClientClosingTime, Inserted.ClosingTime)
        FROM Inserted
    END
    ELSE
    BEGIN
        UPDATE CurrentValues
        SET CurrentValues.StartTime = ISNULL(NewValues.StartTime, CurrentValues.StartTime),
            CurrentValues.ClosingTime = ISNULL(NewValues.ClosingTime, CurrentValues.ClosingTime),
            CurrentValues.ClientStartTime =
                CASE WHEN DATEDIFF(SECOND, CurrentValues.ClientStartTime, CurrentValues.StartTime) != 0
                     THEN ISNULL(NewValues.ClientStartTime, CurrentValues.ClientStartTime)
                     ELSE NewValues.StartTime
                END,
            CurrentValues.ClientClosingTime =
                CASE WHEN DATEDIFF(SECOND, CurrentValues.ClientClosingTime, CurrentValues.ClosingTime) != 0
                     THEN ISNULL(NewValues.ClientClosingTime, CurrentValues.ClientClosingTime)
                     ELSE NewValues.ClosingTime
                END
        FROM WorkOrderDetailAdjustment CurrentValues
        INNER JOIN Inserted NewValues
            ON CurrentValues.JobID = NewValues.JobID
           AND CurrentValues.PostNumber = NewValues.PostNumber
    END
END
What I'm wondering is if there's a better way to do this. I'm open to all suggestions, but if possible I would like to keep this at the database level. I'm also open to hearing that I chose the best possible way of doing this, but for some reason I doubt it.
Thank you for your help!

One thing you could do, if it makes sense for your operation, is to add a persisted computed column to your table. (I'm inferring from your description that you actually want this value stored. If not, you don't need the persisted option.)
It could be something like coalesce(client_start_time, vendor_start_time).
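For example, a minimal sketch (the table name and the new column name are made up):

-- The effective client time falls back to the vendor time unless overridden.
ALTER TABLE dbo.adjusted_hours
    ADD effective_client_start_time
        AS COALESCE(client_start_time, vendor_start_time) PERSISTED;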
Alternatively, you could just create a view that did the same thing. I'm not crazy about views, as they're easy to abuse, but they're not always bad.
Either way, you'd only have to enter client_start_time when it was actually different from vendor_start_time. And you could avoid using a trigger.

I would suggest handling it in the business logic layer of your application and not in the database.
We went through a painful time of having business rules placed "everywhere": in the application back end, and in the database in stored procedures and triggers. We redesigned our approach and split the applications into strict layers. Development and maintenance is much smoother now. In my opinion the task you described, keeping the billable data synchronized, is part of the business logic and should be implemented there.

Related

Way to persist function result as a constant

I needed to create a function today which will always return the exact same value on the specific database it's executed on. It may or may not be the same across databases, which is why it has to be able to load it from a table the first time it's required.
CREATE FUNCTION [dbo].[PAGECODEGET] ()
RETURNS nvarchar(6)
AS
BEGIN
    DECLARE @PageCode nvarchar(6) = (SELECT PCO_IDENTITY FROM PAGECODES WHERE PCO_PAGE = 'SWTE' AND PCO_TAB = 'RECORD')
    RETURN @PageCode
END
The PCO_IDENTITY field is a SQL identity field, so once the record is inserted for the first time, it's always going to return the same result thereafter.
My question is, is there any way to persist this value to something equivalent to a C# readonly variable?
From a performance point of view I know SQL Server will optimise the plan etc., but from a best practice point of view I'm thinking there may possibly be a better way of doing it.
We use a mix of SQL Servers, but the lowest is 2008 R2 in case there's a version specific solution.
I'm afraid there's no such thing in SQL Server as the global variable you suggest.
As you've pointed out, the function will potentially return different results on another database, depending on a variety of factors, such as when the row was inserted, what other values exist in the table already etc. - basically, the PCO_IDENTITY value for this row cannot be relied upon to be consistent.
A few observations:
I don't see how getting this value occasionally is really going to be a performance bottleneck. I don't think best practices cover this, as selecting a value from a table is as basic as you can get.
If this is part of another larger query, you will probably get better performance by using a join to the PAGECODES table directly, rather than potentially running this function for every row
However, if you are really worried:
There are objects in the database which are persistent: tables. When you first insert this value, retrieve the PCO_IDENTITY value and create a new table with just that value in it, which you can join to in your queries. Seems a bit of a waste for one value, doesn't it? (Note you could also make a view, but how would that be any better performing than the function you started with?)
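For example (a sketch; the new table's name is made up):

-- Capture the value once into a dedicated one-row table you can join against.
SELECT PCO_IDENTITY AS PageCode
INTO dbo.PageCodeConstant
FROM PAGECODES
WHERE PCO_PAGE = 'SWTE' AND PCO_TAB = 'RECORD';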
You could force these values into a row with a specific PCO_IDENTITY value, using IDENTITY_INSERT. That way the value is consistent, and you know what it is - you could hard code it in your queries. (NB: Turn IDENTITY_INSERT off again afterwards, and other rows inserted into this table will continue to be automatically generated again)
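A sketch of that (the hard-coded id is arbitrary, and I'm assuming PCO_PAGE and PCO_TAB are the only other required columns):

SET IDENTITY_INSERT PAGECODES ON;
-- Pin the row to a known identity value so it is the same on every database.
INSERT INTO PAGECODES (PCO_IDENTITY, PCO_PAGE, PCO_TAB)
VALUES (999999, 'SWTE', 'RECORD');
SET IDENTITY_INSERT PAGECODES OFF;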
TL;DR: How you are doing it is probably fine. I suspect you are trying to optimise something that isn't a problem. As always - if in doubt, try out a few approaches and measure.

Displaying queries in CakePHP without running them

Is there any way to run a model command such as $this->MyModel->saveAll($rows) but without it actually performing an action on the database, just displaying all the queries it would run, the way it does when one of the queries has an error?
Yes you can; have a look at "Transactions": http://book.cakephp.org/2.0/en/models/transactions.html
// get the datasource and store it in a local variable
$ds = $this->MyModel->getDataSource();
// begin a "transaction"
$ds->begin();
// do your saving
$this->MyModel->saveAll($rows); // you can add more queries here, that's what transactions are all about! :)
// rollback; in a normal situation you would check if the save was successful and commit()/rollback() depending on the situation.
$ds->rollback();
Please note: auto increment fields WILL increment, due to the fact that MySQL or any other database engine will "reserve" these IDs while doing the transaction in order to prevent duplicate IDs. This shouldn't be of any concern, but when you are debugging and you are remembering an ID, it could give you a headache if it's Monday morning (been there, done that)... ;-)

How to perform this update in Go? (Google App Engine)

I need to update a datastore entity in a way that will not be broken by multiple concurrent users doing the same thing.
I understand that I can't use SQL for updating the datastore but I'm not sure what else would work.
This is how I would achieve it in an RDBMS using SQL:
-- Account.Balance = current balance
-- Account.Rate = increase per second
-- Account.CheckDate = the last time the balance was checked and updated
-- so we need to find the number of seconds since the last check,
-- update the balance by rate*seconds, then update the check datetime
UPDATE Account
SET Account.Balance = Account.Balance + (DATEDIFF(S, Account.CheckDate, GETDATE()) * Account.Rate),
    Account.CheckDate = GETDATE()
I know that I can wrap all the operations in a single transaction, but how can I ensure that the update is not miscalculated because of multiple users without using a single update operation like the SQL shown?
You can probably see that several operations like:
1. Read entity
2. Update values
3. Save entity
might fail because of several users doing the same thing.
I'm guessing there are several possible ways to achieve this and I'm looking for the one which would work best for this and future requirements.
Transactions may be used to atomically execute a sequence of datastore operations.
My solution follows:
I eventually realised that as long as I update the balance and the check date at the same time, all will be fine. Concurrent updates will not break anything :)
But I thought I'd post it anyway!
I'm still happy to see better solutions though...

How to avoid circular relationship in SQL-Server?

I am creating a self-related table:
Table Item columns:
ItemId int - PK;
Amount money - not null;
Price money - a computed column using a UDF that retrieves a value according to the item's ancestors' Amount;
ParentItemId int - nullable, a reference to another ItemId in this table.
I need to avoid a loop, meaning an item cannot become a parent of its own ancestors: if ItemId = 2 has ParentItemId = 1, then ItemId = 1 having ParentItemId = 2 shouldn't be allowed.
I don't know what should be the best practice in this situation.
I think I should add a check constraint (CK) that gets a scalar value from a UDF, or something along those lines.
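Roughly what I have in mind (a sketch; the function and constraint names are invented, and a check like this is only evaluated for the rows being modified, so it is not a complete concurrency guarantee):

CREATE FUNCTION dbo.fn_ItemHasCycle (@ItemId int)
RETURNS bit
AS
BEGIN
    -- Walk up the parent chain; if we ever arrive back at @ItemId, there is a cycle.
    DECLARE @Current int = (SELECT ParentItemId FROM dbo.Item WHERE ItemId = @ItemId);
    DECLARE @Steps int = 0;
    WHILE @Current IS NOT NULL AND @Steps < 1000  -- hard stop as a safety net
    BEGIN
        IF @Current = @ItemId RETURN 1;
        SET @Current = (SELECT ParentItemId FROM dbo.Item WHERE ItemId = @Current);
        SET @Steps += 1;
    END
    RETURN 0;
END
GO

-- Referencing ParentItemId in the expression makes the constraint fire when it changes.
ALTER TABLE dbo.Item
    ADD CONSTRAINT CK_Item_NoCycle
        CHECK (ParentItemId IS NULL OR dbo.fn_ItemHasCycle(ItemId) = 0);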
EDIT:
Another option is to create an INSTEAD OF trigger and, in one transaction, update the ParentItemId field and then select the Price field for the affected row (@@RowIdentity); if that fails, cancel the transaction. But I would prefer a validating UDF.
Any ideas are sincerely welcomed.
Does this definitely need to be enforced at the database level?
I'm only asking as I have databases like this (where a table similar to this one acts like a folder) and I only make sure that the correct parent/child relationships are set up in the application.
Checks like this are not easy to implement, and the possible solutions can cause a lot of bugs and problems harder than the initial one. Usually it is enough to validate user input and to prevent infinite loops when reading the data.
If your application uses stored procedures and no ORM, then I would choose to implement this logic in an SP. Otherwise, handle it in other layers, not in the DB.
How big of a problem is this, in real life? It can be expensive to detect these situations (using a trigger, perhaps). In fact, it's likely going to cost you a lot of effort, on each transaction, when only a tiny subset of all your transactions would ever cause this problem.
Think about it first.
A simple trick is to force the ParentItemId to be less than the ItemId. This prevents loop closure in this simple context.
However, there's a down side - if you need for some reason to delete/insert a parent, you may need to delete/insert all of its children in order as well.
Equally, hierarchies need to be inserted in order, and you may not be able to reassign a parent.
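A sketch of that rule as a check constraint (the constraint name is invented):

-- A parent must always have a smaller id than any of its children,
-- which makes a cycle impossible by construction.
ALTER TABLE dbo.Item
    ADD CONSTRAINT CK_Item_ParentBeforeChild
        CHECK (ParentItemId IS NULL OR ParentItemId < ItemId);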
Tested and works just great:
CREATE TRIGGER Item_UPDATE
ON Item
FOR INSERT, UPDATE
AS
BEGIN
    BEGIN TRY
        -- Price is the UDF-backed computed column that walks the ancestor chain,
        -- so selecting it forces the UDF to run; if the new ParentItemId has
        -- introduced a cycle, that evaluation fails and lands in the CATCH block.
        SELECT Price FROM INSERTED
    END TRY
    BEGIN CATCH
        RAISERROR('This item cannot be specified with this parent.', 16, 1)
        ROLLBACK TRANSACTION;
    END CATCH
END
GO

What should be returned when inserting into SQL?

A few months back, I started using a CRUD script generator for SQL Server. The default insert statement that this generator produces SELECTs the inserted row at the end of the stored procedure. It does the same for the UPDATE too.
The previous way (and the only other way I have seen online) is to just return the newly inserted Id back to the business object, and then have the business object update the Id of the record.
Having an extra SELECT is obviously an additional database call, and more data is being returned to the application. However, it allows additional flexibility within the stored procedure, and allows the application to reflect the actual data in the table.
The additional SELECT also increases the complexity when wanting to wrap the insert/update statements in a transaction.
I am wondering what people think is the better way to do it, and I don't mean the implementation of either method. Just which is better: return just the Id, or return the whole row?
We always return the whole row on both an Insert and Update. We always want to make sure our client apps have a fresh copy of the row that was just inserted or updated. Since triggers and other processes might modify values in columns outside of the actual insert/update statement, and since the client usually needs the new primary key value (assuming it was auto generated), we've found it's best to return the whole row.
The select statement will have some sort of an advantage only if the data is generated in the procedure. Otherwise the data that you have inserted is generally available to you already, so there's no point in selecting and returning it again, IMHO. If it's just for the id, you can get it with SCOPE_IDENTITY(), which returns the last identity value created for the insert in the current session and scope.
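For example (a minimal sketch; the table is hypothetical):

-- Insert, then read back the identity value generated in this scope.
INSERT INTO dbo.Orders (CustomerName) VALUES ('Acme');
SELECT SCOPE_IDENTITY() AS NewOrderId;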
Based on my prior experience, my knee-jerk reaction is to just return the freshly generated identity value. Everything else the application is inserting, it already knows--names, dollars, whatever. But a few minutes reflection and reading the prior 6 (hmm, make that 5) replies, leads to a number of “it depends” situations:
At the most basic level, what you inserted is what you’d get – you pass in values, they get written to a row in the table, and you’re done.
Slightly more complex than that is when there are simple default values assigned during an insert statement. “DateCreated” columns that default to the current datetime, or “CreatedBy” that default to the current SQL login, are a prime example. I’d include identity columns here, since not every table will (or should) contain them. These values are generated by the database upon table insertion, so the calling application cannot know what they are. (It is not unknown for web server clocks to not be synchronized with database server clocks. Fun times…) If the application needs to know the values just generated, then yes, you’d need to pass those back.
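For instance (a hypothetical table illustrating those kinds of defaults):

CREATE TABLE dbo.Widget (
    WidgetId    int IDENTITY(1,1) PRIMARY KEY,           -- generated by the database
    Name        nvarchar(50) NOT NULL,                    -- supplied by the application
    DateCreated datetime NOT NULL DEFAULT GETDATE(),      -- database server's clock
    CreatedBy   sysname  NOT NULL DEFAULT SUSER_SNAME()   -- current SQL login
);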
And then there are situations where additional processing is done within the database before data is inserted into the table. Such work might be done within stored procedures or triggers. Once again, if the application needs to know the results of such calculations, then the data would need to be returned.
With that said, it seems to me the main issue underlying your decision is: how much control/understanding do you have over the database? You say you are using a tool to automatically generate your CRUD procedures. Ok, that means that you do not have any elaborate processing going on within them, you’re just taking data and loading it on in. Next question: are there triggers (of any kind) present that might modify the data as it is being written to the tables? Extend that to: do you know whether or not such triggers exists? If they’re there and they matter, plan accordingly; if you do not or cannot know, then you might need to “follow up” on the insert to see if changes occurred. Lastly: does the application care? Does it need to be informed of the results of the insert action it just requested, and if so, how much does it need to know? (New identity value, date time it was added, whether or not something changed the Name from “Widget” to “Widget_201001270901”.)
If you have complete understanding and control over the system you are building, I would only put in as much as you need, as extra code that performs no useful function impacts performance and maintainability. On the flip side, if I were writing a tool to be used by others, I’d try to build something that did everything (so as to increase my market share). And if you are building code where you don't really know how and why it will be used (application purpose), or what it will in turn be working with (database design), then I guess you'd have to be paranoid and try to program for everything. (I strongly recommend not doing that. Pare down to do only what needs to be done.)
Quite often the database will have a mechanism that gives you the ID of the last inserted item without having to do an additional select. For example, MS SQL Server has the @@IDENTITY function. You can pass this back to your application as an output parameter of your stored procedure and use it to update your data with the new ID. MySQL has something similar (LAST_INSERT_ID()).
INSERT INTO mytable (col1, col2)
OUTPUT INSERTED.*
VALUES ('value1', 'value2')
With this clause, returning the whole row does not require an extra SELECT and performance-wise is the same as returning only the id.
"Which is better" totally depends on your application needs. If you need the whole row, return the whole row, if you need only the id, return only the id.
You may add an extra setting to your business object which can trigger this option and return the whole row only if the object needs it:
IF @return_whole_row = 1
    INSERT INTO mytable (col1, col2)
    OUTPUT INSERTED.*
    VALUES ('value1', 'value2')
ELSE
    INSERT INTO mytable (col1, col2)
    OUTPUT INSERTED.id
    VALUES ('value1', 'value2')
I don't think I would in general return an entire row, but it could be a useful technique.
If you are code-generating, you could generate two procs (one which calls the other, perhaps) or parametrize a single proc to determine whether to return it over the wire or not. I doubt the DB overhead is significant (single row, got to have a PK lookup), but the data on the wire from DB to client could be significant when all added up, and if it's just discarded in 99% of the cases, I see little value. Having an SP which returns different things with different parameters is a potential problem for clients, of course.
I can see where it would be useful if you have logic in triggers or calculated columns which are managed by the database, in which case, a SELECT is really the only way to get that data back without duplicating the logic in your client or the SP itself. Of course, the place to put any logic should be well thought out.
Putting ANY logic in the database is usually a carefully-thought-out tradeoff which starts with the minimally invasive and maximally useful things like constraints, unique constraints, referential integrity, etc and growing to the more invasive and marginally useful tools like triggers.
Typically, I like logic in the database when you have multi-modal access to the database itself, and you can't force people through your client assemblies, say. In this case, I would still try to force people through views or SPs which minimize the chance of errors, duplication, logic sync issues or misinterpretation of data, thereby providing as clean, consistent and coherent a perimeter as possible.
