TSQL - How to prevent optimisation in this query - sql-server

I have a query analogous to:
update x
set x.y = (
select sum(x2.y)
from mytable x2
where x2.y < x.y
)
from mytable x
the point being, I'm iterating over rows and updating a field based on a subquery over those fields which are changing.
What I'm seeing is the subquery is being executed for each row before any updates occur, so the changed values for each row are not being picked up.
How can I force the subquery to be re-evaluated for each row of the update?
Is there a suitable table hint or something?
As an aside, I was doing the below and it did work, however since modifying my query somewhat (for logic purposes, not to try and solve this issue) this trick no longer works :(
declare #temp int
update x
set #temp = (
select sum(x2.y)
from mytable x2
where x2.y < x.y
),
x.y = #temp
from mytable x
I'm not particularly concerned about performance, this is a background task run over a few rows

It looks like task is incorrect or other rules should apply.
Let's see on example. Let's say you have values 4, 1, 2, 3, 1, 2
Sql will update rows based on original values. I.e. during single update statement newly calculated values is NOT mixing with original values:
-- only original values used
4 -> 9 (1+2+3+1+2)
1 -> null
2 -> 2 (1+1)
3 -> 6 (1+2+1+2)
1 -> null
2 -> 2 (1+1)
Based on your request you wants that update of each rows will count newly calculated values. (Note, that SQL does not guarantees the sequence in which rows will be processed.)
Let's do this calculation by processing rows from top to bottom:
-- from top
4 -> 9 (1+2+3+1+2)
1 -> null
2 -> 1 (1)
3 -> 4 (1+1+2)
1 -> null
2 -> 1 (1)
Do the same in other sequence - from bottom to top:
-- from bottom
4 -> 3 (2+1)
1 -> null
2 -> 1 (1)
3 -> 5 (2+2+1)
1 -> null
2 -> 2 (1+1)
How you can see your expected result is inconsistent. To make it right you need to correct the calculation rule - for instance define strong sequence of the rows to process (date, id, ...)
Also, if you want to do some recursive processing look at the common_table_expression:
http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx

Related

Using the window function "last_value", when the values of the sorted field are same, the value snowflake returns is not the last value

As we all known, the window function "last_value" returns the last value within an ordered group of values.
In the following example, group by field "A" and sort by field "B" in positive order.
In the group of "A = 1", the last value is returned, which is, the C value 4 when B = 2.
However, in the group of "A = 2", the values of field "B" are the same.
At this time, instead of the last value, which is, the C value 4 in line 6, the first C value 1 in B = 2 is returned.
This puzzles me why the last value within an ordered group of values is not returned when I encounter the value I want to use for sorting.
Example
row_number
A
B
C
LAST_VALUE(C) IGNORE NULLS OVER (PARTITION BY A ORDER BY B ASC)
1
1
1
2
4
2
1
1
1
4
3
1
1
3
4
4
1
2
4
4
5
2
2
1
1
6
2
2
4
1
This puzzles me why the last value within an ordered group of values is not returned when I encounter the value I want to use for sorting.
For partition A equals 2 and column B, there is a tie:
The sort is NOT stable. To achieve stable sort a column or a combination of columns in ORDER BY clause must be unique.
To ilustrate it:
SELECT C
FROM tab
WHERE A = 2
ORDER BY B
LIMIT 1;
It could return either 1 or 4.
If you sort by B within A then any duplicate rows (same A and B values) could appear in any order and therefore last_value could give any of the possible available values.
If you want a specific row, based on some logic, then you would need to sort by all columns within the group to reflect that logic. So in your case you would need to sort by B and C
Good day Bill!
Right, the sorting is not stable and it will return different output each time.
To get stable results, we can run something like below
select
column1,
column2,
column3,
last_value(column3) over (partition by column1 order by
column2,column3) as column2_last
from values
(1,1,2), (1,1,1), (1,1,3),
(1,2,4), (2,2,1), (2,2,4)
order by column1;

Update column based on another column's value which is also being updated

I wish to update column A with null. And I also wish to update column B with null, but only if it's value matched the original value of column A.
Update MyTable
Set A=Null,
B=Case When B=A Then Null Else B End
Will the above statement work? Is the value of A within the Case statement already Null when evaluated? And does the order of the updates matter?
Any way to write this better and avoid setting B=B when it's not really required? I need to do this within 1 statement as the actual query is long and complex involving many joins and sub-queries.
You query should be fine if you just place B before A like this becuase the order does matter. It is possible that it will also work like you wrote it, but I doubt it.
UPDATE MyTable
SET B = CASE WHEN A = B then null else B END,
A = NULL
The order of the columns being updated does not matter: in fact, you will also get the original value of A (that is, the one you would get from the deleted virtual table if you were writing a trigger), despite of the value you are assigning to it during the update.
You can easily test this with this script, which contains two UPDATEs with a different order of the updated columns:
CREATE TABLE Test
(
Id INT PRIMARY KEY NOT NULL,
X INT,
Y INT
)
INSERT Test (Id, X, Y)
VALUES (1, 2, 3)
INSERT Test (Id, X, Y)
VALUES (4, 5, 6)
UPDATE Test
SET X = X + 1,
Y = X
WHERE Id = 1
UPDATE Test
SET Y = X,
X = X + 1
WHERE Id = 4
SELECT *
FROM Test
Here the SELECT would return this:
Id X Y
1 3 2
4 6 5
As you can see, the updated value of Y is always the same of the original X, despite the different order of the UPDATEs.

ms sql table adding rows whenever level changes by more than 1 so that every row has difference of 1 in start_level and end_level

(This is my first stack overflow question. So please let me know suggestions for posing a better question, if you cannot understand.)
I have a table of around 500 people(users) who are going up the stairs from floor x (0=x, max(y) = 50). A person can climb zero/one or many levels in a single go which corresponds to a single row of the table along with the time taken to do so in seconds.
I want to find average time taken to go from floor a to a+1 where a is any of the floor number. To do so I intend to divide every row of the mentioned table into rows which have start_level+1= end_level. Duration will be divided equally as shown in EXPECTED OUTPUT TABLE for user b.
GIVEN TABLE INPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 5 27 b
5 6 3 c
EXPECTED OUTPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 3 27/3 b
3 4 27/3 b
4 5 27/3 b
5 6 3 c
Note: level jumps are in integers only.
After getting expected output, I can simply create a column sum(duration)/count(distinct users) at a start_level level to get average time taken to get one floor above from each floor.
Any help is appreciated.
You can use a Numbers table to "create" the incremental steps. Here's my setup:
CREATE TABLE #floors
(
[start_level] INT,
[end_level] INT,
[duration] DECIMAL(10, 4),
[user] VARCHAR(50)
)
INSERT INTO #floors
([start_level],
[end_level],
[duration],
[user])
VALUES (1,1,10,'a'),
(1,2,5,'a'),
(2,5,27,'b'),
(5,6,3,'c')
Then, using a Numbers table and some LEFT JOIN/COALESCE logic:
-- Create a Numbers table
;WITH Numbers_CTE
AS (SELECT TOP 50 [Number] = ROW_NUMBER()
OVER(
ORDER BY (SELECT NULL))
FROM sys.columns)
SELECT [start_level] = COALESCE(n.[Number], f.[start_level]),
[end_level] = COALESCE(n.[Number] + 1, f.[end_level]),
[duration] = CASE
WHEN f.[end_level] = f.[start_level] THEN f.[duration]
ELSE f.[duration] / ( f.[end_level] - f.[start_level] )
END,
f.[user]
FROM #floors f
LEFT JOIN Numbers_CTE n
ON n.[Number] BETWEEN f.[start_level] AND f.[end_level]
AND f.[end_level] - f.[start_level] > 1
Here are the logical steps:
LEFT JOIN the Numbers table for cases where end_level >= start_level + 2 (this has the effect of giving us multiple rows - one for each incremental step)
new start_level = If the LEFT JOIN "completes": take Number from the Numbers table, else: take the original start_level
new end_level = If the LEFT JOIN "completes": take Number + 1, else: take the original end_level
new duration = If end_level = start_level: take the original duration (to avoid divide by 0), else: take the average duration over end_level - start_level

Updating multiple rows with respect to a change in one row

I am using PostgreSQL and let's say I have a tasks table to keep track of task items. Tasks table is as follows;
Id Name Index
7 name A 1
5 name B 2
6 name C 3
3 name D 4
Index column in tasks table stores the sort order of the tasks. Therefore I will output the tasks with respect to index in increasing order.
So When I change Task D(id = 3)' s index into 2 the new indexes should be as below;
Id Name Index
7 name A 1
5 name B 3
6 name C 4
3 name D 2
or when I change Task A(id = 7)' s index into 4 the new indexes should be as below;
Id Name Index
7 name A 4
5 name B 2
6 name C 3
3 name D 1
What I think is updating all row's index values one by one is pretty inefficient.
So what is the most efficient way to update all index values when I change one of the indexes in my Tasks table?
Edit :
First of all sorry for the confusion. What I am asking is not a simple exchanging two row indexes. If you look at the examples when I change Task D's index in to 2 more than one rows change. So when Task D is 2, Task B becomes 3 and Task C becomes 4.
For instance;
It is like when you drag Task D and drop below Task A so that it's index becomes 2 and B and C's index increases by 1.
SQL Fiddle
What you are doing is exchanging two row's indexes. So it is necessary to store the index value of the first updated one in a temp variable and setting it temporarily to a special value to avoid a unique index collision, that is, if the index is unique. If the index is not unique that step is unnecessary.
begin;
create temp table t as
select
(
select index
from tasks
where id = 3
) as index,
(
select id
from tasks
where index = 2
) as id
;
update tasks
set index = -1
where id = (select id from t)
;
update tasks
set index = 2
where id = 3
;
update tasks
set index = (select index from t)
where id = (select id from t)
;
drop table t;
commit;
The following assumes the index column (as well as id) is unique:
with swapped as (
select n1.id as id1,
n1.name as name1,
n1.index as index1,
n2.id as id2,
n2.name as name2,
n2.index as index2
from names n1
join names n2 on n2.index = 2 -- this is the value of the "new index"
where n1.id = 3 -- this is the id of the row where the index should be changed to the new value
)
update names as n
set index = case
when n.id = s.id1 then s.index2
when n.id = s.id2 then s.index1
end
from swapped s
where n.id in (s.id1, s.id2);
The CTE first creates a single row with the ids of the two rows to be swapped and then the update just compares the ids of the target table with those from the CTE, swapping the values.
SQLFiddle example: http://sqlfiddle.com/#!15/71dc2/1

Changing indices and order in arrays

I have a struct mpc with the following structure:
num type col3 col4 ...
mpc.bus = 1 2 ... ...
2 2 ... ...
3 1 ... ...
4 3 ... ...
5 1 ... ...
10 2 ... ...
99 1 ... ...
to from col3 col4 ...
mpc.branch = 1 2 ... ...
1 3 ... ...
2 4 ... ...
10 5 ... ...
10 99 ... ...
What I need to do is:
1: Re-order the rows of mpc.bus, such that all rows of type 1 are first, followed by 2 and at last, 3. There is only one element of type 3, and no other types (4 / 5 etc.).
2: Make the numbering (column 1 of mpc.bus, consecutive, starting at 1.
3: Change the numbers in the to-from columns of mpc.branch, to correspond to the new numbering in mpc.bus.
4: After running simulations, reverse the steps above to turn up with the same order and numbering as above.
It is easy to update mpc.bus using find.
type_1 = find(mpc.bus(:,2) == 1);
type_2 = find(mpc.bus(:,2) == 2);
type_3 = find(mpc.bus(:,2) == 3);
mpc.bus(:,:) = mpc.bus([type1; type2; type3],:);
mpc.bus(:,1) = 1:nb % Where nb is the number of rows of mpc.bus
The numbers in the to/from columns in mpc.branch corresponds to the numbers in column 1 in mpc.bus.
It's OK to update the numbers on the to, from columns of mpc.branch as well.
However, I'm not able to find a non-messy way of retracing my steps. Can I update the numbering using some simple commands?
For the record: I have deliberately not included my code for re-numbering mpc.branch, since I'm sure someone has a smarter, simpler solution (that will make it easier to redo when the simulations are finished).
Edit: It might be easier to create normal arrays (to avoid woriking with structs):
bus = mpc.bus;
branch = mpc.branch;
Edit #2: The order of things:
Re-order and re-number.
Columns (3:end) of bus and branch are changed. (Not part of this question)
Restore original order and indices.
Thanks!
I'm proposing this solution. It generates a n x 2 matrix, where n corresponds to the number of rows in mpc.bus and a temporary copy of mpc.branch:
function [mpc_1, mpc_2, mpc_3] = minimal_example
mpc.bus = [ 1 2;...
2 2;...
3 1;...
4 3;...
5 1;...
10 2;...
99 1];
mpc.branch = [ 1 2;...
1 3;...
2 4;...
10 5;...
10 99];
mpc.bus = sortrows(mpc.bus,2);
mpc_1 = mpc;
mpc_tmp = mpc.branch;
for I=1:size(mpc.bus,1)
PAIRS(I,1) = I;
PAIRS(I,2) = mpc.bus(I,1);
mpc.branch(mpc_tmp(:,1:2)==mpc.bus(I,1)) = I;
mpc.bus(I,1) = I;
end
mpc_2 = mpc;
% (a) the following mpc_tmp is only needed if you want to truly reverse the operation
mpc_tmp = mpc.branch;
%
% do some stuff
%
for I=1:size(mpc.bus,1)
% (b) you can decide not to use the following line, then comment the line below (a)
mpc.branch(mpc_tmp(:,1:2)==mpc.bus(I,1)) = PAIRS(I,2);
mpc.bus(I,1) = PAIRS(I,2);
end
% uncomment the following line, if you commented (a) and (b) above:
% mpc.branch = mpc_tmp;
mpc.bus = sortrows(mpc.bus,1);
mpc_3 = mpc;
The minimal example above can be executed as is. The three outputs (mpc_1, mpc_2 & mpc_3) are just in place to demonstrate the workings of the code but are otherwise not necessary.
1.) mpc.bus is ordered using sortrows, simplifying the approach and not using find three times. It targets the second column of mpc.bus and sorts the remaining matrix accordingly.
2.) The original contents of mpc.branch are stored.
3.) A loop is used to replace the entries in the first column of mpc.bus with ascending numbers while at the same time replacing them correspondingly in mpc.branch. Here, the reference to mpc_tmp is necessary so ensure a correct replacement of the elements.
4.) Afterwards, mpc.branch can be reverted analogously to (3.) - here, one might argue, that if the original mpc.branch was stored earlier on, one could just copy the matrix. Also, the original values of mpc.bus are re-assigned.
5.) Now, sortrows is applied to mpc.bus again, this time with the first column as reference to restore the original format.

Resources