Quick Summary: I have a function that pulls data from table X. I'm running an UPDATE on table X with a CROSS APPLY on that function, and during the update the function doesn't appear to return the updated data.
The real-world scenario is much more complicated, but here's a sample of what I'm seeing.
Table
create table BO.sampleData (id int primary key, data1 int, val int)
Function
create function BO.getPrevious(
    @id int
)
returns @info table (
    id int, val int
)
as
begin
    declare @val int
    declare @prevRow int = @id - 1
    -- grab data from previous row
    insert into @info
    select @id, val
    from BO.sampleData where id = @prevRow
    -- if previous row doesn't exist, return 3*prev row id
    if @@rowcount = 0
        insert into @info values (@id, @prevRow * 3)
    return
end
Issue
Populate some sample data:
delete BO.sampleData
insert into BO.sampleData values (10, 20, 0)
insert into BO.sampleData values (11, 22, 0)
insert into BO.sampleData values (12, 24, 0)
insert into BO.sampleData values (13, 26, 0)
insert into BO.sampleData values (14, 28, 0)
select * from BO.sampleData
id data1 val
----------- ----------- -----------
10 20 0
11 22 0
12 24 0
13 26 0
14 28 0
Update BO.sampleData using a CROSS APPLY on BO.getPrevious (which accesses data from BO.sampleData):
update t
set t.val = ca.val
from bo.sampleData t
cross apply BO.getPrevious(t.id) ca
where t.id = ca.id
Problem
I'm expecting the row with id 10 to have the value 27 (since there is no row 9, the function will return 9*3). For id 11, I assumed it would look in 10 (which just got updated with 27) and set its val to 27 -- and this would cascade down the rest of the table. But what I get is:
id data1 val
----------- ----------- -----------
10 20 27
11 22 0
12 24 0
13 26 0
14 28 0
I'm guessing this isn't allowed/supported -- the function doesn't have access to the updated data yet? Or have I got something wrong with the syntax? In the real scenario I'm researching, the function is much more complex and does some child table lookups, aggregates, etc. before returning a result. But this represents the basics of what I'm seeing -- the function that queries BO.sampleData doesn't seem to have access to the updated values of BO.sampleData within the CROSS APPLY during the UPDATE.
Any ideas welcomed.
Thanks to @Martin Smith for identifying the issue -- i.e. "Halloween Protection". Now that my issue has a name, I did some research and found the following article which mentions this specific scenario in SQL Server:
... update plans consist of two parts: a read cursor that identifies
the rows to be updated and a write cursor that actually performs the
updates. Logically speaking, SQL Server must execute the read cursor
and write cursor of an update plan in two separate steps or phases.
To put it another way, the actual update of rows must not affect the
selection of which rows to update.
Emphasis mine. It makes sense now. The CROSS APPLY is happening over the read cursor where all of the values are still zero.
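If the cascading behavior is what you actually want, one option is to run the update one row at a time in id order, so that each statement's read phase sees the values written by the previous statement. This is only a minimal sketch, assuming the BO.sampleData table and BO.getPrevious function defined above; the loop variable @id is my own addition, and a row-by-row loop may be slow on large tables.

```sql
-- Sketch: one UPDATE statement per row, in ascending id order.
-- Each statement completes before the next one reads, so the
-- function sees the value written for the previous row.
declare @id int = (select min(id) from BO.sampleData)

while @id is not null
begin
    update t
    set t.val = ca.val
    from BO.sampleData t
    cross apply BO.getPrevious(t.id) ca
    where t.id = @id

    -- move to the next id; min() yields NULL when no rows remain,
    -- which ends the loop
    select @id = min(id) from BO.sampleData where id > @id
end

select * from BO.sampleData
```

With the sample data, every row should end up with val = 27: row 10 gets 9*3, and each later row copies the value of its predecessor.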
The returned data always comes from @info.
For input id = 11, the function executes:
insert into @info
select @id, val -- here @id = 11 and val = 0
from BO.sampleData where id = 10
So @info holds (11, 0): the val is read from the row with id = 10 in BO.sampleData, which is still 0 when the function runs. The cross apply then matches that row to id = 11, so id = 11 is set to 0.
Everything behaves exactly as your UDF is written. Nothing inside the UDF ever sees val = 27 for id = 10; keep in mind that @info is the table that gets returned.
Related
I am trying to puzzle out a trigger in a SQL Server database. I am a student working on a summer project so I am no pro at this but can easily learn it.
This is a simplified version of my database table sorted by rank:
ID as primary key
ID | RANK
--------------
2 | NULL
1 | 1
3 | 2
4 | 3
7 | 4
The objective for me right now is to have the ability to insert/delete/update the rank and maintain incremental order of ranks in the database without any missing numbers in available positions along with no duplicates.
/* Insert new row */
INSERT INTO TABLE (ID, RANK) VALUES (6, 4)
/* AFTER INSERT */
ID | RANK
--------------
2 | NULL
1 | 1
3 | 2
4 | 3
6 | 4 <- new
7 | 5 <- notice how the rank increased to make room for the new row
I think doing this in a trigger is the most efficient/easiest way; although I may be wrong.
As an alternative to a trigger, I have made a temporary solution that uses front-end code to run updates on each row when any rank is changed.
If you know how or if a trigger could do this please share.
EDIT: Added scenarios
The rank being inserted would always take its assigned number. Everything that is greater than or equal to the one being inserted would increase.
The rank causing the trigger will always have priority to claim its number while everything else will have rank increased to accommodate.
If rank is the highest number then the trigger would ensure that the number is +1 of the max.
This may work for you. Let me know.
DROP TABLE dbo.test
CREATE TABLE dbo.test (id int, ranke int)
INSERT INTO test VALUES (2, NULL)
INSERT INTO test VALUES (1, 1)
INSERT INTO test VALUES (3, 2)
INSERT INTO test VALUES (4, 3)
INSERT INTO test VALUES (7, 4)
GO
CREATE TRIGGER t_test
ON test
AFTER INSERT
AS
UPDATE test set ranke += 1 WHERE ranke >= (select max(ranke) from inserted) and id <> (select max(id) from inserted)
GO
INSERT INTO test values (6,4)
INSERT INTO test values (12,NULL)
SELECT * FROM test
Temptable has columns RunningTotal and ClientCount. We also have a variable @RunningTotal declared and set to 0.
Can someone please explain what this line does?
UPDATE Temptable
SET @RunningTotal = RunningTotal = @RunningTotal + ClientCount
Never seen this construct before, but it seems to work like this.
It fills column RunningTotal with a cumulative total of ClientCount.
Say we start with a table with just ClientCount filled in:
CREATE TABLE dbo.Temptable (ClientCount int, RunningTotal int)
INSERT INTO Temptable (ClientCount) VALUES (5), (4), (6), (2)
SELECT * FROM Temptable
ClientCount RunningTotal
----------- ------------
5 NULL
4 NULL
6 NULL
2 NULL
And then run the update statement:
DECLARE @RunningTotal int = 0
UPDATE Temptable SET @RunningTotal = RunningTotal = @RunningTotal + ClientCount
SELECT * FROM Temptable
ClientCount RunningTotal
----------- ------------
5 5
4 9
6 15
2 17
As you can see, each value of RunningTotal is the sum of all ClientCount values from the current and any preceding records.
The downside is that you have no control over the order in which the records are processed, which makes me wonder whether this is a recommended approach in a production environment.
Please check here for a deeper discussion:
Calculate a Running Total in SQL Server
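To make the ordering concern concrete: on SQL Server 2012 or later, a windowed SUM lets you state the order explicitly instead of relying on the order in which the UPDATE happens to visit rows. The sketch below is not the code from the question. It assumes a hypothetical ordering column Id, which the original Temptable does not have, and uses a table variable @Ordered purely for illustration.

```sql
-- Hypothetical table: the Id column is an assumption added for ordering.
DECLARE @Ordered TABLE (Id int PRIMARY KEY, ClientCount int)
INSERT INTO @Ordered (Id, ClientCount) VALUES (1, 5), (2, 4), (3, 6), (4, 2)

-- Running total in a defined order (requires SQL Server 2012+).
SELECT Id,
       ClientCount,
       SUM(ClientCount) OVER (ORDER BY Id ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM @Ordered
ORDER BY Id
```

For these four rows the running totals come out as 5, 9, 15 and 17, matching the result shown above, but here the order is guaranteed by the ORDER BY inside the OVER clause.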
I was answering another question and ran into a strange outcome - the output of a product aggregate (without CLR) was different when used in a SELECT vs UPDATE.
This is simplified from the original question to minimally reproduce the problem:
GroupKey RowIndex A
----------- ----------- -----------
25 1 5
25 2 6
25 3 NULL
26 1 3
26 2 4
26 3 NULL
The goal is for each group key to update the A column of each row with a RowIndex = 3 to the product of the A columns of each row with RowIndex IN (1, 2), so this would produce the following changes:
GroupKey RowIndex A
----------- ----------- -----------
25 3 30
26 3 12
So this is the code I used:
UPDATE T SET
    A = Products.Product
FROM @Table T
INNER JOIN (
    SELECT
        GroupKey,
        EXP(SUM(LOG(A))) AS Product
    FROM @Table
    WHERE RowIndex IN (1, 2)
    GROUP BY
        GroupKey
) Products
    ON Products.GroupKey = T.GroupKey
WHERE T.RowIndex = 3
SELECT * FROM @Table WHERE RowIndex = 3
Which then produced the off-by-one results:
GroupKey RowIndex A
----------- ----------- -----------
25 3 29
26 3 12
If I just run the sub-query, I see the correct values.
GroupKey Product
----------- ----------------------
25 30
26 12
Here's the full script to make it easy to play with. I can't figure out where the off-by-one is coming from.
DECLARE @Table TABLE (GroupKey INT, RowIndex INT, A INT)
INSERT @Table VALUES (25, 1, 5), (25, 2, 6), (25, 3, NULL), (26, 1, 3), (26, 2, 4), (26, 3, NULL)
SELECT * FROM @Table
SELECT
    GroupKey,
    EXP(SUM(LOG(A))) AS Product
FROM @Table
WHERE RowIndex IN (1, 2)
GROUP BY
    GroupKey
UPDATE T SET
    A = Products.Product
FROM @Table T
INNER JOIN (
    SELECT
        GroupKey,
        EXP(SUM(LOG(A))) AS Product
    FROM @Table
    WHERE RowIndex IN (1, 2)
    GROUP BY
        GroupKey
) Products
    ON Products.GroupKey = T.GroupKey
WHERE T.RowIndex = 3
SELECT * FROM @Table WHERE RowIndex = 3
Here are some references I came across:
Non-CLR Aggregate: http://michaeljswart.com/2011/03/the-aggregate-function-product/
Original question: Set one row fields as a multiplication of 2 others
I'd say that this cute "PRODUCT" aggregate is inherently unreliable if you want to work with ints - EXP and LOG are only defined against the float type and so we get rounding errors creeping in.
Why they're not consistently appearing, I couldn't say, except to suggest that different queries may cause changes in evaluation orders.
As a simpler example of how this can go wrong:
select CAST(EXP(LOG(5)) as int)
Can produce 4. EXP and LOG together will produce a value that is just less than 5, but of course when converting to int, SQL Server always truncates rather than applying any rounding.
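If you do stay with the EXP/LOG trick, one way to sidestep the truncation is to round the float result before it is converted to int. This is a minimal sketch using the sample values from the question; the table variable @T is just for illustration, and the approach only works for strictly positive values, since LOG fails on zero and negative inputs.

```sql
DECLARE @T TABLE (GroupKey int, RowIndex int, A int)
INSERT @T VALUES (25, 1, 5), (25, 2, 6), (26, 1, 3), (26, 2, 4)

SELECT GroupKey,
       -- ROUND turns a value just below the true product (e.g. 29.999...)
       -- into the nearest whole number before the cast truncates it
       CAST(ROUND(EXP(SUM(LOG(A))), 0) AS int) AS Product
FROM @T
GROUP BY GroupKey
```

With these rows the query should return 30 for group 25 and 12 for group 26, regardless of whether the float result lands slightly above or below the exact product.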
I have one table (Stock_ID, Stock_Name). I want to write a stored procedure in SQL Server with Stock_ID running number with a format like xxxx/12 (xxxx = number start from 0001 to 9999; 12 is the last 2 digits of current year).
My scenario is that if the year changes, the running number will be reset to 0001/13.
What do you intend to do when you hit more than 9999 in a single year? It may sound impossible, but over the years I've had to deal with many "it will never happen" data-related design mess-ups from code-first, design-later developers. These are major pains, depending on how many places you need to fix these items, which are usually primary keys and foreign keys used all over.
This looks like a system requirement to SHOW the data this way, but it is the developer's responsibility to design the internals of the application. The way you store it and the way you display it don't need to be identical. I'd split that into two columns, using an int for the number portion and a tinyint for the 2-digit year portion. You can use a computed column for quick and easy display (persist it and index it if necessary), where you pad with leading zeros and add the slash. Throw in a check constraint on the year portion to make sure it stays within a reasonable range. You can make the number portion an identity and just have a job reseed it every New Year's Eve so that numbering restarts at 1.
try it out:
--drop table YourTable
--create the basic table
CREATE TABLE YourTable
(YourNumber int identity(1,1) not null
,YourYear tinyint not null
,YourData varchar(10)
,CHECK (YourYear>=12 and YourYear<=25) --optional check constraint
)
--add the persisted computed column
ALTER TABLE YourTable ADD YourFormattedNumber AS ISNULL(RIGHT('0000'+CONVERT(varchar(10),YourNumber),4)+'/'+RIGHT(CONVERT(varchar(10),YourYear),2),'/') PERSISTED
--make the persisted computed column the primary key
ALTER TABLE YourTable ADD CONSTRAINT PK_YourTable PRIMARY KEY CLUSTERED (YourFormattedNumber)
sample data:
--insert rows in 2012
insert into YourTable values (12,'aaaa')
insert into YourTable values (12,'bbbb')
insert into YourTable values (12,'cccc')
--new years eve job run this
DBCC CHECKIDENT (YourTable, RESEED, 0)
--insert rows in 2013
insert into YourTable values (13,'aaaa')
insert into YourTable values (13,'bbbb')
select * from YourTable order by YourYear,YourNumber
OUTPUT:
YourNumber YourYear YourData YourFormattedNumber
----------- -------- ---------- -------------------
1 12 aaaa 0001/12
2 12 bbbb 0002/12
3 12 cccc 0003/12
1 13 aaaa 0001/13
2 13 bbbb 0002/13
(5 row(s) affected)
to handle the possibility of more than 9999 rows per year try a different computed column calculation:
CREATE TABLE YourTable
(YourNumber int identity(9998,1) not null --<<<notice the identity starting point, so it hits 9999 quicker for this simple test
,YourYear tinyint not null
,YourData varchar(10)
)
--handles more than 9999 values per year
ALTER TABLE YourTable ADD YourFormattedNumber AS ISNULL(RIGHT(REPLICATE('0',CASE WHEN LEN(CONVERT(varchar(10),YourNumber))<4 THEN 4 ELSE 1 END)+CONVERT(varchar(10),YourNumber),CASE WHEN LEN(CONVERT(varchar(10),YourNumber))<4 THEN 4 ELSE LEN(CONVERT(varchar(10),YourNumber)) END)+'/'+RIGHT(CONVERT(varchar(10),YourYear),2),'/') PERSISTED
ALTER TABLE YourTable ADD CONSTRAINT PK_YourTable PRIMARY KEY CLUSTERED (YourFormattedNumber)
sample data:
insert into YourTable values (12,'aaaa')
insert into YourTable values (12,'bbbb')
insert into YourTable values (12,'cccc')
DBCC CHECKIDENT (YourTable, RESEED, 0) --new years eve job run this
insert into YourTable values (13,'aaaa')
insert into YourTable values (13,'bbbb')
select * from YourTable order by YourYear,YourNumber
OUTPUT:
YourNumber YourYear YourData YourFormattedNumber
----------- -------- ---------- --------------------
9998 12 aaaa 9998/12
9999 12 bbbb 9999/12
10000 12 cccc 10000/12
1 13 aaaa 0001/13
2 13 bbbb 0002/13
(5 row(s) affected)
This might help:
DECLARE @tbl TABLE(Stock_ID INT,Stock_Name VARCHAR(100))
INSERT INTO @tbl
SELECT 1,'Test'
UNION ALL
SELECT 2,'Test2'
DECLARE @ShortDate VARCHAR(2)=RIGHT(CAST(YEAR(GETDATE()) AS VARCHAR(4)),2)
;WITH CTE AS
(
    SELECT
        CAST(ROW_NUMBER() OVER(ORDER BY tbl.Stock_ID) AS VARCHAR(4)) AS RowNbr,
        tbl.Stock_ID,
        tbl.Stock_Name
    FROM
        @tbl AS tbl
)
SELECT
    REPLICATE('0', 4-LEN(RowNbr))+CTE.RowNbr+'/'+@ShortDate AS YourColumn,
    CTE.Stock_ID,
    CTE.Stock_Name
FROM
    CTE
From memory, this is a way to get the next id:
-- two-digit current year; YEAR() returns an int, so convert before slicing
declare @yy char(2) = right(convert(char(4), year(getdate())), 2)
declare @maxid int
-- next id for the current year; isnull makes it 1 when the year has no rows yet
select @maxid = isnull(max(convert(int, substring(Stock_Id, 1, 4))), 0) + 1
from table -- placeholder: substitute the real table name
where substring(Stock_Id, 6, 2) = @yy
declare @nextid varchar(7)
select @nextid = right('0000' + convert(varchar(10), @maxid), 4) + '/' + @yy
I have the following problem. I have a table with a few hundred thousand records, which has the following identifiers (for simplicity)
MemberID SchemeName BenefitID BenefitAmount
10 ABC 1 10000
10 ABC 1 2000
10 ABC 2 5000
10 A.B.C 3 11000
What I need to do is to convert this into a single record that looks like this:
MemberID SchemeName B1 B2 B3
10 ABC 12000 5000 11000
The problem of course being that I need to differentiate by SchemeName, and for most records this won't be a problem, but for some SchemeName wouldn't be captured properly. Now, I don't particularly care if the converted table uses "ABC" or "A.B.C" as scheme name, as long as it just uses 1 of them.
I'd love to hear your suggestions.
Thanks
Karl
(Using SQL Server 2008)
based on the limited info in the original question, give this a try:
DECLARE @YourTable table(MemberID int, SchemeName varchar(10), BenefitID int, BenefitAmount int)
INSERT INTO @YourTable VALUES (10,'ABC' ,1,10000)
INSERT INTO @YourTable VALUES (10,'ABC' ,1,2000)
INSERT INTO @YourTable VALUES (10,'ABC' ,2,5000)
INSERT INTO @YourTable VALUES (10,'A.B.C',3,11000)
INSERT INTO @YourTable VALUES (11,'ABC' ,1,10000)
INSERT INTO @YourTable VALUES (11,'ABC' ,1,2000)
INSERT INTO @YourTable VALUES (11,'ABC' ,2,5000)
INSERT INTO @YourTable VALUES (11,'A.B.C',3,11000)
INSERT INTO @YourTable VALUES (10,'mnp',3,11000)
INSERT INTO @YourTable VALUES (11,'mnp' ,1,10000)
INSERT INTO @YourTable VALUES (11,'mnp' ,1,2000)
INSERT INTO @YourTable VALUES (11,'mnp' ,2,5000)
INSERT INTO @YourTable VALUES (11,'mnp',3,11000)
SELECT
    MemberID, REPLACE(SchemeName,'.','') AS SchemeName
    ,SUM(CASE WHEN BenefitID=1 THEN BenefitAmount ELSE 0 END) AS B1
    ,SUM(CASE WHEN BenefitID=2 THEN BenefitAmount ELSE 0 END) AS B2
    ,SUM(CASE WHEN BenefitID=3 THEN BenefitAmount ELSE 0 END) AS B3
FROM @YourTable
GROUP BY MemberID, REPLACE(SchemeName,'.','')
ORDER BY MemberID, REPLACE(SchemeName,'.','')
OUTPUT:
MemberID SchemeName B1 B2 B3
----------- ----------- ----------- ----------- -----------
10 ABC 12000 5000 11000
10 mnp 0 0 11000
11 ABC 12000 5000 11000
11 mnp 12000 5000 11000
(4 row(s) affected)
It looks like PIVOT can help.
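A rough sketch of what a PIVOT version might look like, using the four sample rows from the question. The table variable @Benefits is an illustrative name, and stripping the dots with REPLACE is borrowed from the answer above, so it only covers scheme names that differ by punctuation.

```sql
-- Sample rows copied from the question; @Benefits is an illustrative name.
DECLARE @Benefits table (MemberID int, SchemeName varchar(10), BenefitID int, BenefitAmount int)
INSERT INTO @Benefits VALUES
    (10, 'ABC',   1, 10000),
    (10, 'ABC',   1, 2000),
    (10, 'ABC',   2, 5000),
    (10, 'A.B.C', 3, 11000)

SELECT MemberID, SchemeName, [1] AS B1, [2] AS B2, [3] AS B3
FROM (
    -- strip the dots so 'A.B.C' and 'ABC' fall into the same group
    SELECT MemberID,
           REPLACE(SchemeName, '.', '') AS SchemeName,
           BenefitID,
           BenefitAmount
    FROM @Benefits
) AS src
PIVOT (
    SUM(BenefitAmount) FOR BenefitID IN ([1], [2], [3])
) AS p
ORDER BY MemberID, SchemeName
```

For the sample rows this should give a single row: 10, ABC, 12000, 5000, 11000. Unlike the CASE-based query, PIVOT leaves NULL rather than 0 where a member has no rows for a benefit.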
The scheme name issue will have to be dealt with manually, since the names can differ in arbitrary ways. First and foremost, this indicates a problem with how you are allowing data entry: you should not have these duplicate scheme names.
However, since you do, I think the best thing is to create a cross-reference table with two columns, something like recordedscheme and controllingscheme. Select the distinct scheme names to build a list of possible values and insert them into the first column. Go through the list and decide which scheme name you want to use for each one (most will be the same as the recorded name). Once this is done, you can join to this table in the query. This will work for the current dataset; however, you need to fix whatever is causing the scheme names to get duplicated before going further. You will also want to set things up so that when a scheme name is added, your table is populated with the new name in both columns. Then, if it later turns out that a new one is a duplicate, all you have to do is write a quick update to the second column showing which one it really is, and you are done.
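The cross-reference idea could be sketched roughly like this. All names here (@SchemeXref, RecordedScheme, ControllingScheme, @Benefits) are hypothetical, the mapping rows are filled in by hand for the sample data, and table variables are used only to keep the example self-contained; in practice the cross-reference would be a permanent table.

```sql
-- Hypothetical cross-reference: every recorded spelling maps to one controlling name.
DECLARE @SchemeXref table (
    RecordedScheme    varchar(10) PRIMARY KEY,
    ControllingScheme varchar(10) NOT NULL
)
INSERT INTO @SchemeXref VALUES ('ABC', 'ABC'), ('A.B.C', 'ABC')

-- Sample rows copied from the question.
DECLARE @Benefits table (MemberID int, SchemeName varchar(10), BenefitID int, BenefitAmount int)
INSERT INTO @Benefits VALUES
    (10, 'ABC',   1, 10000),
    (10, 'ABC',   1, 2000),
    (10, 'ABC',   2, 5000),
    (10, 'A.B.C', 3, 11000)

-- Group by the controlling name instead of the recorded one.
SELECT b.MemberID,
       x.ControllingScheme AS SchemeName,
       SUM(CASE WHEN b.BenefitID = 1 THEN b.BenefitAmount ELSE 0 END) AS B1,
       SUM(CASE WHEN b.BenefitID = 2 THEN b.BenefitAmount ELSE 0 END) AS B2,
       SUM(CASE WHEN b.BenefitID = 3 THEN b.BenefitAmount ELSE 0 END) AS B3
FROM @Benefits AS b
INNER JOIN @SchemeXref AS x
    ON x.RecordedScheme = b.SchemeName
GROUP BY b.MemberID, x.ControllingScheme
```

Because this is an inner join, any scheme name missing from the cross-reference table would silently drop out of the result, which is why the table needs a row for every recorded name.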
The alternative is to actually update the bad scheme names in the data set to the correct one. Depending on how many records you have to update and in how many tables, this might be a performance issue. This too is only good for querying the data right now and doesn't address how to fix the data going forward.