Can SQL Server's MERGE statement update its merge_search_condition? - sql-server

I use MERGE for Upsert implementation.
This code might be executed from two different application instances concurrently.
The code:
MERGE foo as TARGET
USING (SELECT
#Field1 as [field1],
#Field2 as [field2],
#SuperId as [super_id]) as SOURCE
ON TARGET.super_id = SOURCE.super_id
WHEN MATCHED THEN
UPDATE SET
TARGET.[field1] = SOURCE.[field1],
TARGET.[field2] = SOURCE.[field2]
WHEN NOT MATCHED BY TARGET THEN
INSERT (
[field1],
[field2],
[super_id])
VALUES (
SOURCE.[field1],
SOURCE.[field2],
SOURCE.[super_id])
OUTPUT
inserted.common_id;
Can it somehow update the super_id field? Because that's exactly what I have. There was super_id 1 at the time of August, 23rd. And on September, 12th there was 2.
I suspect that one machine was mergin on id = 1, and the second one on id = 2. Something went horribly wrong and id got replaced, because there's no rows with super_id = 1.
No other code accesses this table to somehow address the super_id field. One other way is for somebody to manually set the super_id field by their own hands. That is being investigated.

Related

Update a column only when having a different value

Update table1 set Name='Deepak' where id=1 and Name !='Deepak'
Does adding a condition on name column improves the performance considering that id has clustered index and Name has non clustered and there is around 60% probability of getting '0 rows updated' message after running the above query.
Reasons to Only Update If Different
Reduced locks
Prevents unnecessary activity if you have triggers or certain configs of SQL Server Replication
Preserve audit trail columns like LastModifiedDateTime
Only Update If Different Using EXCEPT
Most people's main complaint would probably be extra query complexity, but I find using EXCEPT makes this process super easy. EXCEPT is ideal because it handles any data type and NULL values without issue
UPDATE Table1
SET Col1 = #NewVal1
,Col2 = #NewVal2
....
,LastModifiedBy = #UserID
,LastModifiedDateTime = GETDATE()
WHERE EXISTS (
SELECT Col1,Col2
EXCEPT
SELECT #NewVal1,#NewVal2
)

SQL Server select for update

I am struggling to find a SQL Server replacement for select for update that works.
I have a master table that contains a column which is used for next order number. The application does a select from update on this row, reads the current value (while locked) adds one to this value and then updates the row, then uses the number it received. This process works perfectly on all databases I've tried but for SQL Server which does not seem to have any process for selecting data for exclusive use.
How do I do a locked read and update of something like a next order number from a sequence table is SQL Server?
BTW, I know I can use things like IDENTITY cols and stuff, to do this, but in this case I must read from this existing column. Get the value and inc it, and do it in a safe locked manner to avoid 2 users getting the same value.
UPDATE::
Thank you, that works for this case :)
DECLARE #Output char(30)
UPDATE scheme.sysdirm
SET #Output = key_value = cast(key_value as int)+1
WHERE system_key='OPLASTORD'
SELECT #Output
I have one other place I do something similar. I read and lock a stock record too.
SELECT STOCK
FROM PRODUCT
WHERE ID = ? FOR UPDATE.
I then do some validation and the do
UPDATE PRODUCT SET STOCK = ?
WHERE ID=?
I can't just use your above method here, as the value I update is based on things I do from the stock I read. But I need to ensure no one else can mess with the stock while I do this. Again, easy on other DB's with SELECT FOR UPDATE... is there a SQL Server workaround?? :)
You can simple do an UPDATE that also reads out the new value into a SQL Server variable:
DECLARE #Output INT
UPDATE dbo.YourTable
SET #Output = YourColumn = YourColumn + 1
WHERE ID = ????
SELECT #Output
Since it's an atomic UPDATE statement, it's safe against concurrency issues (since only one connection can get an update locks at any one given time). A potential second session that wants to get the incremented value at the same time will have to wait until the first one completes, thus getting the next value from the table.
As an alternative you can use the OUTPUT clause of the UPDATE statement, although this will insert into a table variable.
Create table YourTable
(
ID int,
YourColumn int
)
GO
INSERT INTO YourTable VALUES (1, 1)
GO
DECLARE #Output TABLE
(
YourColumn int
)
UPDATE YourTable
SET YourColumn = YourColumn + 1
OUTPUT inserted.YourColumn INTO #Output
WHERE ID = 1
SELECT TOP 1 YourColumn
FROM #Output
**** EDIT
If you want to ensure that no-one can change the data after you have read it, you can use a repeatable read. You should be aware that any reads of any tables you do will be locked for Update (pessimistic locking) and may cause Deadlocking. You can also sue the SELECT ... FROM TABLE (UPDLOCK) hint within a transaction.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT STOCK
FROM PRODUCT
WHERE ID = ?
.....
...
UPDATE Product
SET Stock = nnn
WHERE ID = ?
COMMIT TRANSACTION

SQL Server 2005 SELECT TOP 1 from VIEW returns LAST row

I have a view that may contain more than one row, looking like this:
[rate] | [vendorID]
8374 1234
6523 4321
5234 9374
In a SPROC, I need to set a param equal to the value of the first column from the first row of the view. something like this:
DECLARE #rate int;
SET #rate = (select top 1 rate from vendor_view where vendorID = 123)
SELECT #rate
But this ALWAYS returns the LAST row of the view.
In fact, if I simply run the subselect by itself, I only get the last row.
With 3 rows in the view, TOP 2 returns the FIRST and THIRD rows in order. With 4 rows, it's returning the top 3 in order. Yet still top 1 is returning the last.
DERP?!?
This works..
DECLARE #rate int;
CREATE TABLE #temp (vRate int)
INSERT INTO #temp (vRate) (select rate from vendor_view where vendorID = 123)
SET #rate = (select top 1 vRate from #temp)
SELECT #rate
DROP TABLE #temp
.. but can someone tell me why the first behaves so fudgely and how to do what I want? As explained in the comments, there is no meaningful column by which I can do an order by. Can I force the order in which rows are inserted to be the order in which they are returned?
[EDIT] I've also noticed that: select top 1 rate from ([view definition select]) also returns the correct values time and again.[/EDIT]
That is by design.
If you don't specify how the query should be sorted, the database is free to return the records in any order that is convenient. There is no natural order for a table that is used as default sort order.
What the order will actually be depends on how the query is planned, so you can't even rely on the same query giving a consistent result over time, as the database will gather statistics about the data and may change how the query is planned based on that.
To get the record that you expect, you simply have to specify how you want them sorted, for example:
select top 1 rate
from vendor_view
where vendorID = 123
order by rate
I ran into this problem on a query that had worked for years. We upgraded SQL Server and all of a sudden, an unordered select top 1 was not returning the final record in a table. We simply added an order by to the select.
My understanding is that SQL Server normally will generally provide you the results based on the clustered index if no order by is provided OR off of whatever index is picked by the engine. But, this is not a guarantee of a certain order.
If you don't have something to order off of, you need to add it. Either add a date inserted column and default it to GETDATE() or add an identity column. It won't help you historically, but it addresses the issue going forward.
While it doesn't necessarily make sense that the results of the query should be consistent, in this particular instance they are so we decided to leave it 'as is'. Ultimately it would be best to add a column, but this was not an option. The application this belongs to is slated to be discontinued sometime soon and the database server will not be upgraded from SQL 2005. I don't necessarily like this outcome, but it is what it is: until it breaks it shall not be fixed. :-x

TSQL Stored Proc to copy records (with a twist!)

I am trying to write a Stored Procedure in SQL Server (2005) to do something that sounds simple, but is actually proving to be more difficult that I thought.
I have a table with 30 columns and 50,000 rows.
The number of records is fixed, but users can edit the fields of existing records.
To save them having to re-key repetitive data, I want to give them the ability to select a record, and specify a range of IDs to copy those details to.
The SP I'm trying to write will take 3 parameters: The source record primary key, and the lower and upper primary keys of the range of records that the data will be copied into.
Obviously the PKs of the destination records remain unchanged.
So I figured the SP needs to do a SELECT - to get all the data to be copied, and an UPDATE - to write the data into the specified destination records.
I just don't know how to store the results of the SELECT to slot them into the UPDATE.
A temp table wouldn't help - selecting from that would be just the same as selecting from the table!
What I need is a variable that is effectively a single record, so I can go something like:
#tempRECORD = SELECT * FROM SOURCETABLE WHERE ID = #sourcePK
UPDATE SOURCETABLE
SET FIELD1 = #tempRECORD.FIELD1,
FIELD2 = #tempRECORD.FIELD2,
...
FIELD30 = #tempRECORD.FIELD30
WHERE ID >= #LOWER_id AND ID <= #UPPER_id
But I don't know how, or if you even can.
I'm also open to any other clever way I haven't even thought of!
Thanks guys!
So I figured the SP needs to do a SELECT - to get all the data to be copied, and an UPDATE - to write the data into the specified destination records.
What you need is the T-SQL-specific extension to UPDATE, UPDATE ... FROM:
UPDATE T
SET
Field1 = source.Field1
, Field2 = source.Field2
, Field3 = source.Field3
FROM
(SELECT * FROM T AS source_T WHERE source_T.ID = #sourcePK) as source
WHERE
T.ID BETWEEN #LOWER_Id AND #UPPER_Id
Note that this ability to put a FROM clause in an UPDATE statement is not standard ANSI SQL, so I don't know how this would be done in other RDBMSs.
I am pretty sure this ain't the easiest way to do it, but it should work without any problems:
DECLARE #tempField1 varchar(255)
DECLARE #tempField2 varchar(255)
...
DECLARE #tempField30 varchar(255)
SELECT #tempField1 = FIELD1, #tempField2 = FIELD2, ... ,#tempField30 = FIELD30 FROM SOURCETABLE WHERE ID = #sourcePK
UPDATE SOURCETABLE
SET FIELD1 = #tempField1,
FIELD2 = #tempField2,
...
FIELD30 = #tempField30
WHERE ID >= #LOWER_id AND ID <= #UPPER_id
You would need to edit the tempField variables so that they have the right type.

SQL Server best way to calculate datediff between current row and next row?

I've got the following rough structure:
Object -> Object Revisions -> Data
The Data can be shared between several Objects.
What I'm trying to do is clean out old Object Revisions. I want to keep the first, active, and a spread of revisions so that the last change for a time period is kept. The Data might be changed a lot over the course of 2 days then left alone for months, so I want to keep the last revision before the changes started and the end change of the new set.
I'm currently using a cursor and temp table to hold the IDs and date between changes so I can select out the low hanging fruit to get rid of. This means using #LastID, #LastDate, updates and inserts to the temp table, etc...
Is there an easier/better way to calculate the date difference between the current row and the next row in my initial result set without using a cursor and temp table?
I'm on sql server 2000, but would be interested in any new features of 2005, 2008 that could help with this as well.
Here is example SQL. If you have an Identity column, you can use this instead of "ActivityDate".
SELECT DATEDIFF(HOUR, prev.ActivityDate, curr.ActivityDate)
FROM MyTable curr
JOIN MyTable prev
ON prev.ObjectID = curr.ObjectID
WHERE prev.ActivityDate =
(SELECT MAX(maxtbl.ActivityDate)
FROM MyTable maxtbl
WHERE maxtbl.ObjectID = curr.ObjectID
AND maxtbl.ActivityDate < curr.ActivityDate)
I could remove "prev", but have it there assuming you need IDs from it for deleting.
If the identity column is sequential you can use this approach:
SELECT curr.*, DATEDIFF(MINUTE, prev.EventDateTime,curr.EventDateTime) Duration FROM DWLog curr join DWLog prev on prev.EventID = curr.EventID - 1
Hrmm, interesting challenge. I think you can do it without a self-join if you use the new-to-2005 pivot functionality.
Here's what I've got so far, I wanted to give this a little more time before accepting an answer.
DECLARE #IDs TABLE
(
ID int ,
DateBetween int
)
DECLARE #OID int
SET #OID = 6150
-- Grab the revisions, calc the datediff, and insert into temp table var.
INSERT #IDs
SELECT ID,
DATEDIFF(dd,
(SELECT MAX(ActiveDate)
FROM ObjectRevisionHistory
WHERE ObjectID=#OID AND
ActiveDate < ORH.ActiveDate), ActiveDate)
FROM ObjectRevisionHistory ORH
WHERE ObjectID=#OID
-- Hard set DateBetween for special case revisions to always keep
UPDATE #IDs SET DateBetween = 1000 WHERE ID=(SELECT MIN(ID) FROM #IDs)
UPDATE #IDs SET DateBetween = 1000 WHERE ID=(SELECT MAX(ID) FROM #IDs)
UPDATE #IDs SET DateBetween = 1000
WHERE ID=(SELECT ID
FROM ObjectRevisionHistory
WHERE ObjectID=#OID AND Active=1)
-- Select out IDs for however I need them
SELECT * FROM #IDs
SELECT * FROM #IDs WHERE DateBetween < 2
SELECT * FROM #IDs WHERE DateBetween > 2
I'm looking to extend this so that I can keep at maximum so many revisions, and prune off the older ones while still keeping the first, last, and active. Should be easy enough through select top and order by clauses, um... and tossing in ActiveDate into the temp table.
I got Peter's example to work, but took that and modified it into a subselect. I messed around with both and the sql trace shows the subselect doing less reads. But it does work and I'll vote him up when I get my rep high enough.

Resources