T-SQL: simple recursion exceeding max recursion depth - sql-server

I have a table my_table of the form
rowNumber number ...
1 23
2 14
3 15
4 25
5 19
6 21
7 19
8 37
9 31
...
1000 28
and I want to find the maximum length of an increasing consecutive sequence of the column number. For this example, it will be 3:
14, 15, 25
My idea is to calculate such length for each number:
rowNumber number ... length
1 23 1
2 14 1
3 15 2
4 25 3
5 19 1
6 21 2
7 19 1
8 37 2
9 31 1
...
and then take the maximum. To calculate length, I wrote the following query that is using recursion:
with enhanced_table as (
select *
,1 length
from my_table
where rowNumber = 1
union all
select b.*
,case when b.number > a.number
then a.length + 1
else 1 -- the run restarts when the value does not increase
end length
from enhanced_table a, my_table b
where b.rowNumber = a.rowNumber + 1
)
select max(length)
from enhanced_table
So, I'm trying to start from rowNumber = 1 and add all other rows consecutively by recursion. I'm getting the error "The maximum recursion 100 has been exhausted before statement completion".
My question is: should I find a way to increase maximum iterations allowed on the server (given that the query is simple, I think there won't be a problem to run 1000 iterations), or find another approach?
Also, isn't 100 iterations too low of a threshold?
Thank you!

There has to be some default threshold, and that is what Microsoft chose. It's to prevent infinite loops. Besides, looping doesn't perform well in SQL Server and goes against its set-based structure.
You can specify the max recursion you want to set for the individual query. This overrides the default.
select max(length)
from enhanced_table
option (maxrecursion 1000)
Note that option (maxrecursion 0) means unlimited recursion, and so it can cause an infinite loop.
REFERENCE
An incorrectly composed recursive CTE may cause an infinite loop. For
example, if the recursive member query definition returns the same
values for both the parent and child columns, an infinite loop is
created. To prevent an infinite loop, you can limit the number of
recursion levels allowed for a particular statement by using the
MAXRECURSION hint and a value between 0 and 32,767 in the OPTION
clause of the INSERT, UPDATE, DELETE, or SELECT statement. This lets
you control the execution of the statement until you resolve the code
problem that is creating the loop. The server-wide default is 100.
When 0 is specified, no limit is applied. Only one MAXRECURSION value
can be specified per statement
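Since the answer points out that looping goes against SQL Server's set-based design, here is a minimal non-recursive sketch of the same calculation. It assumes SQL Server 2012 or later (for LAG and a windowed running SUM) and the my_table(rowNumber, number) layout from the question; the names flagged, grouped, is_start and run_id are made up for illustration.

```sql
-- Assumes SQL Server 2012+ and my_table(rowNumber, number) as in the question.
with flagged as (
    select rowNumber,
           number,
           -- 1 marks the start of a new run: the first row (LAG is NULL)
           -- or any row whose value did not increase
           case when number > lag(number) over (order by rowNumber)
                then 0 else 1 end as is_start
    from my_table
),
grouped as (
    select rowNumber,
           -- the running count of run starts gives every run its own id
           sum(is_start) over (order by rowNumber
                               rows unbounded preceding) as run_id
    from flagged
)
select max(run_length) as max_length
from (
    select count(*) as run_length
    from grouped
    group by run_id
) r;
```

For the nine sample rows shown in the question this yields 3 (the run 14, 15, 25), and no MAXRECURSION hint is needed because nothing recurses.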

If you want to hold the maxrecursion value in a variable declared at the beginning of the query, note that the hint only accepts a literal, so you have to build the query dynamically, something like:
-- @maxrec is assumed to be an INT variable declared earlier
DECLARE @Query NVARCHAR(MAX)
SET @Query = N'
;WITH foo AS (
...
)
SELECT * FROM foo
OPTION (MAXRECURSION ' + CAST(@maxrec AS NVARCHAR(10)) + N');'
and then execute it using EXEC.
You could also refer to this answer: Maxrecursion parameter
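To make the dynamic approach concrete, here is a small self-contained sketch. The CTE nums, the limit of 1000 and the 500-row count are hypothetical values chosen only for illustration; sys.sp_executesql is used in place of a plain EXEC, which would work as well.

```sql
-- Hypothetical example values; adjust to your own query.
DECLARE @maxrec INT = 1000;

DECLARE @Query NVARCHAR(MAX) = N'
;WITH nums AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 500   -- 499 recursion levels
)
SELECT MAX(n) AS last_n FROM nums
OPTION (MAXRECURSION ' + CAST(@maxrec AS NVARCHAR(10)) + N');';

-- The hint value is now a literal inside the generated statement.
EXEC sys.sp_executesql @Query;
```

Because @maxrec is an INT, casting it into the string does not open the query to injection; the same would not hold for arbitrary text concatenated into the statement.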

Related

SQL Server script not working as expected

I have this little script that shall return the first number in a column of type int which is not used yet.
SELECT t1.plu + 1 AS plu
FROM tovary t1
WHERE NOT EXISTS (SELECT 1 FROM tovary t2 WHERE t2.plu = t1.plu + 1)
AND t1.plu > 0;
this returns the unused numbers like
3
11
22
27
...
The problem is, that when I make a simple select like
SELECT plu
FROM tovary
WHERE plu > 0
ORDER BY plu ASC;
the results are
1
2
10
20
...
Why isn't the first script returning some of the free numbers, like 4, 5, 6 and so on?
Compiling a formal answer from the comments.
Credit to Larnu:
It seems what the OP really needs here is an (inline) Numbers/Tally (table) which they can then use a NOT EXISTS against their table.
Sample data
create table tovary
(
plu int
);
insert into tovary (plu) values
(1),
(2),
(10),
(20);
Solution
The tally table is isolated in a common table expression, First1000, which produces the numbers 1 to 1000. The number of generated rows can be scaled up as needed by adding more cross joins.
with First1000(n) as
(
select row_number() over(order by (select null))
from ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) a(n) -- 10^1
cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) b(n) -- 10^2
cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) c(n) -- 10^3
)
select top 20 f.n as Missing
from First1000 f
where not exists ( select 'x'
from tovary
where plu = f.n)
order by f.n;
The top 20 in the query above limits the output, and the order by makes sure those are the 20 smallest missing numbers. This gives:
Missing
-------
3
4
5
6
7
8
9
11
12
13
14
15
16
17
18
19
21
22
23
24
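The question asks for the first number that is not used yet, so the same tally idea can be reduced to a single value. This is only a sketch that assumes the tovary(plu) sample table above; the column alias FirstFree is made up for illustration.

```sql
-- Assumes the tovary(plu) table from the sample data above.
with First1000(n) as
(
    select row_number() over (order by (select null))
    from ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) a(n)
    cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) b(n)
    cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) c(n)
)
-- smallest tally number that has no matching plu
select min(f.n) as FirstFree
from First1000 f
where not exists ( select 1 from tovary t where t.plu = f.n );
```

With the sample rows 1, 2, 10, 20 this returns 3. If every number from 1 to 1000 were taken it would return NULL, which signals that the tally range needs to be enlarged.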

t-sql loop through all rows and sum amount from column until value is reached

I have a table containing the below test data:
I now would like to fill a restaurant with 12 seating spaces.
This should result in:
Basically, I need to loop from top to bottom through all rows and add the AmountPersons until I have filled the restaurant.
In this example:
(first few rows: AmountPersons) 3+1+2+4 = 10
UserId 52 can't be added because they reserved for 3 persons, which would result in 13 occupied places and there are only 12 available.
In the next row it notices a reservation for 1. This can be added to the previous 10 we already found.
NewTotal is now 11.
UserId 79 and 82 can't be added because we'd exceed the capacity again.
UserId 95 reserved for 1, this one can be added and we now have all places filled.
This is the result I get from the cursor I use, but I'm stuck now. Please help.
The while loop I have in the cursor basically stops when the next value would be higher than 12. But that is not correct.
Because you want to skip rows, you need a recursive CTE. But it is tricky -- because you may not have a group following your rules that adds up to exactly 12.
So:
with tn as (
select t.*, row_number() over (order by userid) as seqnum
from t
),
cte as (
select userId, name, amountPersons as total, 1 as is_included, seqnum
from tn
where seqnum = 1
union all
select tn.userId, tn.name,
(case when tn.amountPersons + cte.total <= 12
then tn.amountPersons + cte.total
else cte.total
end),
(case when tn.amountPersons + cte.total <= 12
then 1
else 0
end) as is_included,
tn.seqnum
from cte join
tn
on tn.seqnum = cte.seqnum + 1
where cte.total < 12
)
select cte.*
from cte
where is_included = 1;
Here is a db<>fiddle.
Note that if you change "I" to a larger value, then it is not included and the number of occupied seats is 11, not 12.

Convert Frequency Table Back to Non-Frequency Table (ungroup-ing)

In SQL Server, I have the following table (snippet) which is the source data I receive (I cannot get the raw table it was generated from).
Gradelevel | YoS | Inventory
4 | 0 | 4000
4 | 1 | 3500
4 | 2 | 2000
The first row of the table is saying for grade level 4, there are 4,000 people with 0 years of service (YoS).
I need to find the median YoS for each Grade level. This would be easy if the table wasn't given to me aggregated up to the Gradelevel/YoS level with a sum in the Inventory column, but sadly I'm not so lucky.
What I need is to ungroup this table such that I have a new table where the first record is in the table 4,000 times, the next record 3,500 times, the next 2,000, etc (the inventory column would not be in this new table). Then I could take the percent_disc() of the YoS column by grade level and get the median. I could also then use other statistical functions on YoS to glean other insights from the data.
So far I've looked at unpivot (doesn't appear to be a candidate for my use case), CTEs (can't find an example close to what I'm trying to do), and a function which iterates through the above table, inserting the number of rows indicated by the value in inventory into a new table, which becomes the 'ungrouped' table I can run statistical analyses on. I believe the last approach is the best option available to me, but all the examples I've seen iterate over and focus on a single column from a table. I need to iterate through each row, then use the gradelevel and yos values to insert [inventory] number of times before moving on to the next row.
Is anyone aware of:
A better way to do this other than the iteration/cursor method?
How to iterate through a table to accomplish my goal? I've been reading Is there a way to loop through a table variable in TSQL without using a cursor? but am having a hard time figuring out how to apply that iteration to my use case.
Edit 10/3, here is the looping code I got working which produces the same as John's cross apply. Pro is any statistical function can then be run on it, con is it is slow.
--this table will hold our row (non-frequency) based inventory data
DROP TABLE IF EXISTS #tempinv
CREATE TABLE #tempinv(
amcosversionid INT NULL, -- not populated by the loop below, so it must allow NULL
pp NVARCHAR(3) NOT NULL,
gl INT NOT NULL,
yos INT NOT NULL
)
-- to transform the inventory frequency table to a row based inventory we need to iterate through it
DECLARE @MyCursor CURSOR, @pp AS NVARCHAR(3), @gl AS INT, @yos AS INT, @inv AS int
BEGIN
SET @MyCursor = CURSOR FOR
SELECT payplan, gradelevel, step_yos, SUM(inventory) AS inventory
FROM
mytable
GROUP BY payplan, gradelevel, step_yos
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @pp, @gl, @yos, @inv
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @i int
SET @i = 1
--insert into our new table for each number of people in inventory
WHILE @i<=@inv
BEGIN
INSERT INTO #tempinv (pp,gl,yos) VALUES (@pp,@gl,@yos)
SET @i = @i + 1
END
FETCH NEXT FROM @MyCursor
INTO @pp, @gl, @yos, @inv
END
-- release the cursor once all rows have been processed
CLOSE @MyCursor
DEALLOCATE @MyCursor
END;
One option is to use a CROSS APPLY in concert with an ad-hoc tally table. This will "expand" your data into N rows. Then you can perform any analysis you want.
Example
Select *
From YourTable A
Cross Apply (
Select Top ([Inventory]) N=Row_Number() Over (Order By (Select NULL))
From master..spt_values n1, master..spt_values n2
) B
Returns
Grd Yos Inven N
4 0 4000 1
4 0 4000 2
4 0 4000 3
4 0 4000 4
4 0 4000 5
...
4 0 4000 3998
4 0 4000 3999
4 0 4000 4000
4 1 3500 1
4 1 3500 2
4 1 3500 3
4 1 3500 4
...
4 1 3500 3499
4 1 3500 3500
4 2 2000 1
4 2 2000 2
4 2 2000 3
...
4 2 2000 1999
4 2 2000 2000
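Since the original goal was the median YoS per grade level, the expanded rows can be fed straight into PERCENTILE_DISC. This is a sketch only: it assumes SQL Server 2012 or later, the YourTable(Gradelevel, YoS, Inventory) layout from the question, and the names expanded and MedianYoS are made up for illustration.

```sql
-- Assumes SQL Server 2012+ and YourTable(Gradelevel, YoS, Inventory).
with expanded as (
    select A.Gradelevel, A.YoS
    from YourTable A
    cross apply (
        -- emit one row per person counted in Inventory
        select top (A.Inventory) 1 as one
        from master..spt_values n1, master..spt_values n2
    ) B
)
-- PERCENTILE_DISC is a window function here, so DISTINCT collapses
-- the repeated per-row results to one row per grade level
select distinct
       Gradelevel,
       percentile_disc(0.5) within group (order by YoS)
           over (partition by Gradelevel) as MedianYoS
from expanded;
```

For the three sample rows (4000, 3500 and 2000 people with 0, 1 and 2 years of service) the median for grade level 4 comes out as 1, because the 4750th of 9500 people falls in the YoS = 1 group.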

Update column from table starting from 0

I have a SQL Server Docs Table with two fields, Idx1 and Idx2:
Idx1 Idx2
0 23
1 34
2 12
4 1
5 21
7 45
8 50
9 3
10 9
... ...
Note that numbers in Idx1 column are unique, they are never repeated.
And now I am trying to re-number Idx1 column starting from 0, that is, 0,1,2,3,4,... and so on.
The expected result should be:
Idx1 Idx2
0 23
1 34
2 12
3 1
4 21
5 45
6 50
7 3
8 9
... ...
I have tried below and it works:
DECLARE @myVar int
SET @myVar = -1 -- start at -1 so that the first row gets 0
UPDATE
Docs
SET
@myVar = Idx1 = @myVar + 1
but I am worried about the order in which SQL Server numbers the rows. I would like to explicitly order them first by the Idx1 column and then re-number them according to this order.
NOTE: I am using SQL Server 2008
There's no need to play around with variables. You could make a subquery and apply the ordering inside it to be certain the numbering follows an explicit order. This is an alternative, more modern approach, which will also work in other database engines that support window functions.
Use the ROW_NUMBER window function, available since SQL Server 2005 (so it works on 2008), to create a temporary sequence column (for the duration of the query) based on the ORDER BY argument. Then subtract 1 from it to make it start from 0.
UPDATE docs
SET idx1 = t.rn
FROM (
SELECT idx1, row_number() over (order by idx1) - 1 as rn
FROM docs
) t
WHERE docs.idx1 = t.idx1
Why not simply use the row_number() function:
with t as (
select *, (row_number() over (order by idx1))-1 seq
from docs
)
update t
set t.idx1 = t.seq
from t inner join
docs t1
on t1.idx1 = t.idx1;

Find the average of a given result set

My table looks like the one below.
I am doing average for total table. I am getting 14. It is fine.
declare @Table table (Student Varchar(10), Score int)
insert into @Table
select 'A',10
union all
select 'B',20
union all
select 'A',10
union all
select 'C',20
union all
select 'B',10
select avg(cast(Score as float)) AvgScore from @Table
AvgScore
--------
14
select Student, avg(cast(Score as float)) AvgScore from @Table group by Grouping sets(Student,())
Student AvgScore
------------------
A 10
B 15
C 20
NULL 14
If I take the average of the averages, (10+15+20)/3, I do not get 14.
How can I overcome this?
Am I doing the mathematics incorrectly?
Can anyone give me a brief explanation?
Thanks in advance.
The total average is calculated over all the data, so:
(10 + 20 + 10 + 20 + 10) / 5 = 70 / 5 = 14
Everything is OK. You are trying to calculate an average of averages, (10+15+20)/3, which only equals the overall average when every group has the same number of rows.
Look at this example:
A - 1
A - 1
A - 1
A - 1
B - 20
Average is (1+1+1+1+20) / 5 and NOT (1+20)/2
The problem is that you lose information between the two steps of the calculation. Your original calculation is a simple average over all five rows.
After the reduction you only have the per-student averages 10, 15 and 20.
The weight of each of these values is different: two scores contribute to A and two to B, but only one to C. This information, which is essential for calculating the overall average, is lost. To get the proper average you also need to store the weight of each average, meaning the number of source values behind it. This would be:
Student Value Weight
A 10 2
B 15 2
C 20 1
A weight is simply the count of values for each student. You can extract that easily in one query.
Now your final average calculation should look like this:
(10*2 + 15*2 + 20*1) / (2 + 2 + 1) = 70 / 5 = 14
Selecting the values you need should look something like this:
SELECT Student, AVG(CAST(Score as float)) AvgScore, COUNT(*) Weight
FROM @Table
GROUP BY Grouping sets(Student,())
The rest of the path should be clear: multiply each average by its weight, sum the products, and divide by the sum of the weights.
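The weighted calculation described above can be sketched in one batch. It reuses the sample rows from the question; the CTE name per_student and the alias WeightedAvg are made up for illustration.

```sql
declare @Table table (Student varchar(10), Score int);
insert into @Table values ('A',10),('B',20),('A',10),('C',20),('B',10);

with per_student as (
    -- one row per student: the average and how many scores it is based on
    select Student,
           avg(cast(Score as float)) as AvgScore,
           count(*) as Weight
    from @Table
    group by Student
)
-- weight each average by its row count, then divide by the total count
select sum(AvgScore * Weight) / sum(Weight) as WeightedAvg
from per_student;
```

This returns 14, matching the plain average over all five rows, because (10*2 + 15*2 + 20*1) / 5 = 70 / 5.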
