SQL Server calculations based on aggregates inside recursive CTEs - sql-server

Could you please help me with this dilemma?
I tried to simplify the example as much as I could, but basically, what I want is to somehow use an aggregate of the previous results of a recursive query inside the next level of the recursive query (hope that makes sense).
I tried using window functions (max() over()), but inside the recursive part those seem to look at only the current row (this seems to be better explained here: recursive cte with ranking functions).
I also tried referencing the 'r' CTE more than once, however that seems illegal.
Do you have any other ideas of how I can do this?
I need to do this in SQL and not T-SQL. The reason is that I actually have a working version in T-SQL using loops, but the performance of that is pretty poor for what I'm trying to do. I'm hoping a pure SQL solution will work much faster.
I'm using SQL Server 2012.
Thanks!
--this works, however it's not recursive and I don't know in advance how many "levels" there will be:
;with t as (
select 1 a, 1 b union all
select 2 a, 1 b union all
select 3 a, 1 b
), r as (
select a, b, 1 lvl
from t
)
select *
from r
union all --we took the "union all" outside the CTE, which means it's not recursive anymore
select a + max(a) over(partition by b) a, --this now works as expected and returns "a + 3" on all cases
b, lvl-1
from r
where lvl > 0
--this doesn't work:
;with t as (
select 1 a, 1 b union all
select 2 a, 1 b union all
select 3 a, 1 b
), r as (
select a, b, 1 lvl
from t
union all
select a + max(a) over(partition by b) a, --this returns the "max" over only the current row instead of doing the partition from what I expect to be the "previous step"
b, lvl-1
from r
where lvl > 0
)
select *
from r
--this also fails:
;with t as (
select 1 a, 1 b union all
select 2 a, 1 b union all
select 3 a, 1 b
), r as (
select a, b, 1 lvl
from t
union all
select a + (select max(a) from r r2 where r2.b = r.b) a, --this raises an error because the recursive member references 'r' a second time
b, lvl-1
from r
where lvl > 0
)
select *
from r

Related

how to generate individual rows for each character between two delimiters in a string

I have a data set with square brackets.
CREATE TABLE Testdata
(
SomeID INT,
String VARCHAR(MAX)
)
INSERT Testdata SELECT 1, 'S0000X-T859XX[DEFGH]'
INSERT Testdata SELECT 1, 'T880XX-T889XX[DS]'
INSERT Testdata SELECT 2, 'V0001X-Y048XX[DS]'
INSERT Testdata SELECT 2, 'Y0801X-Y0889X[AB]'
I need to get output like the one below:
SomeId String
1 S0000XD-T859XXD
1 S0000XE-T859XXE
1 S0000XF-T859XXF
1 S0000XG-T859XXG
1 S0000XH-T859XXH
1 T880XXD-T889XXD
1 T880XXS-T889XXS
2 V0001XD-Y048XXD
2 V0001XS-Y048XXS
2 Y0801XA-Y0889XA
2 Y0801XB-Y0889XB
I would appreciate it if anyone could help with this.
You don't need a function here, and certainly no need for loops. A tally table will make short work of this. First you need a tally table. I keep one as a view on my system. It is nasty fast!!!
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
GO
You don't have to use a view like this; you could just use the CTEs directly in your code.
This is some rather ugly string manipulation but since your data is not properly normalized you are kind of stuck there. Please try this query. It produces what you said you expect as output.
select *
, NewOutput = left(td.String, charindex('-', td.String) - 1) + SUBSTRING(td.String, CHARINDEX('[', td.String) + t.N, 1) +
left(substring(td.String, charindex('-', td.String), len(td.String)), charindex('[', substring(td.String, charindex('-', td.String), len(td.String))) - 1) + SUBSTRING(td.String, CHARINDEX('[', td.String) + t.N, 1)
from TestData td
join cteTally t on t.N <= CHARINDEX(']', td.String) - CHARINDEX('[', td.String) - 1
order by td.String
, t.N
I am posting this because I did it.
select distinct *
,[base]+substring(splitter,number,1)
from
(
select SomeID
-- split your column into a base plus a splitter column
,[base] = left(string,charindex('[',string)-1)
,splitter = substring(string, charindex('[',string)+1,len(string) - charindex('[',string)-1)
from
(
-- converted your insert into a union all
SELECT 1 SomeID, 'S0000X-T859XX[DEFGH]' string
union all
SELECT 1, 'T880XX-T889XX[DS]'
union all
SELECT 2, 'V0001X-Y048XX[DS]'
union all
SELECT 2, 'Y0801X-Y0889X[AB]'
) a
) inside
cross apply (Select number from master..spt_values where number>0 and number <=len(splitter)) b -- this is similar to a tally table using an internal SQL table
Since you didn't include any attempt to solve this, I will assume that you are not stuck on any particular technique, and are looking for a high-level approach.
I would solve this by first creating a Table-valued function that takes the id and the string as parameters, and creates an output table by looping through the characters in between the square-brackets, so that the function's output looks like this
ID String Character
1 S0000X-T859XX D
1 S0000X-T859XX E
1 S0000X-T859XX F
..
2 V0001X-Y048XX D
etc...
Then you simply write a query that joins the raw table to the function on the ID and the portion of the String without the brackets.
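As a rough illustration of the approach described above, a multi-statement table-valued function might look like the sketch below. The function name dbo.SplitBracketChars, its parameter names, and the output columns Ch and NewString are hypothetical. It assumes each string contains exactly one well-formed [...] section and a single hyphen. It uses CROSS APPLY rather than a plain join, because the function takes each row's values as parameters.

```sql
-- Hypothetical helper: returns one row per character found between '[' and ']'
CREATE FUNCTION dbo.SplitBracketChars (@SomeID INT, @String VARCHAR(MAX))
RETURNS @Result TABLE (ID INT, String VARCHAR(MAX), Ch CHAR(1))
AS
BEGIN
    DECLARE @open  INT = CHARINDEX('[', @String);
    DECLARE @close INT = CHARINDEX(']', @String);
    DECLARE @pos   INT = @open + 1;

    -- Loop only when an opening bracket exists; stop before the closing bracket
    WHILE @open > 0 AND @pos < @close
    BEGIN
        INSERT @Result (ID, String, Ch)
        VALUES (@SomeID, LEFT(@String, @open - 1), SUBSTRING(@String, @pos, 1));
        SET @pos += 1;
    END;

    RETURN;
END;
GO

-- Append the character to both halves of the range, e.g. S0000XD-T859XXD
SELECT td.SomeID,
       REPLACE(f.String, '-', f.Ch + '-') + f.Ch AS NewString
FROM Testdata td
CROSS APPLY dbo.SplitBracketChars(td.SomeID, td.String) f
ORDER BY td.SomeID, NewString;
```

The tally-based answers above avoid the loop entirely, so this version is mainly useful when the procedural form is easier to follow.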

T-SQL Recursive with two initial statements

I have no idea how to write this recursion in SQL. How do I handle a CTE when I have two initial values?
Below is a simple example:
a1 = 2
a2 = 3
an = a(n-1)*a(n-2)
I tried to write something like the code below, but unfortunately I don't know how to finish it:
with recur(n,results) as
(
select 1,2
union all
select 2,3
union all
select
/*how to write this pattern?*/
where n<
)
select * from recur
Do you have any idea?
It seems you want to generate the Fibonacci numbers using a recursive CTE.
Try something like this:
WITH CTE AS (
SELECT 1 AS N, 2 AS A, 3 AS B
UNION ALL
SELECT N+1 AS N, B AS A, A+B AS B
FROM CTE
WHERE N<10
)
SELECT A FROM CTE
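Note that the recurrence in the question multiplies the two previous terms, whereas the query above adds them. A minimal sketch of the multiplicative version, using the same trick of carrying two consecutive terms in every row, could look like this. The column names a and nxt and the cut-off at 8 terms are assumptions made for illustration. BIGINT is used because the values grow very quickly.

```sql
-- Sketch: a1 = 2, a2 = 3, a(n) = a(n-1) * a(n-2)
WITH recur (n, a, nxt) AS (
    -- Anchor: the row for n = 1 carries a1 and the following term a2
    SELECT 1, CAST(2 AS BIGINT), CAST(3 AS BIGINT)
    UNION ALL
    -- Shift the window: the old "next" becomes current, the product becomes the new "next"
    SELECT n + 1, nxt, a * nxt
    FROM recur
    WHERE n < 8
)
SELECT n, a
FROM recur
ORDER BY n;   -- the first terms are 2, 3, 6, 18, 108
```

Raising the limit beyond 8 makes the recursive member compute a term that no longer fits in BIGINT, so a larger range would need a different data type.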

Why was the DelimitedSplit8k UDF written with 2X (Cartesian product) in SQL Server?

I asked this question about writing a fast inline table-valued function in SQL Server.
The code in the answer works, but I'm asking about one part of it.
It is clear to me that he wanted to create many numbers (1,1,1,1,1,...) and then turn them into sequential numbers (1,2,3,4,5,6,...).
In this part:
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2(N) AS (SELECT 1 FROM E1 a, E1 b)
,E4(N) AS (SELECT 1 FROM E2 a, E2 b)
SELECT * FROM e4 --10000 rows
He created 10000 rows.
This function is widely used and hence my question:
Question :
Why didn't he (Jeff Moden) use :
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2(N) AS (SELECT 1 FROM E1 a, E1 b , E1 c , E1 d)
SELECT * FROM E2 -- ALSO 10000 rows !!!
But chose to split it into E2 and E4?
Although I am not Jeff Moden and do not know his reasoning, I find it likely that he simply used a known pattern for number generation which he himself calls Itzik Ben Gan's cross joined CTE method in this Stack Overflow answer.
The pattern goes like this:
WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
E02(N) AS (SELECT 1 FROM E00 a, E00 b),
E04(N) AS (SELECT 1 FROM E02 a, E02 b),
E08(N) AS (SELECT 1 FROM E04 a, E04 b),
...
In order to adapt the method for his string splitting function, he apparently found it more convenient to modify the initial CTE to be ten rows instead of two and to cut down the number of cross joining CTEs to two to just cover the 8000 rows necessary for his solution.
Heh... just ran across this and thought I'd answer.
Andriy M answered it exactly right. It was very much modeled after Itzik Ben-Gan's great original BASE 2 code and, yes, I changed it (as have many others) to Base 10 code just to cut down on the number of cCTEs (Cascading CTEs). The latest code that I and many others use cuts down on the number of cCTEs even further. It also uses the VALUES operator to cut down on the bulk of the code, although there's no performance advantage in doing so.
WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N)) --10 rows
,E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d)
SELECT * FROM e4 --10000 rows
;
There are a great many other places where the need for such an on-the-fly creation of a sequence is required. Some need to start the sequence at 0 and others at 1. There's also a much larger range of values needed and, to be honest, I got tired of meticulously writing out code similar to the above so I did what Mr. Ben-Gan and many others have done. I wrote an iTVF called "fnTally". I don't normally use Hungarian Notation for functions but I had two reasons for using the "fn" prefix. 1) is because I still maintain a physical Tally Table and so the function needed to be named differently and 2) I can tell people at work "If you had used the 'eff-n' Tally function I told you about, you wouldn't have this problem" without it actually being an HR violation. ;-)
Just in case anyone should need such a thing, here's the code I wrote for my version of an fnTally function. There's a tiny bit of trade off in allowing it to start at 0 or 1 performance wise but it's worth the extra flexibility, to me anyways. And, yes... you could reduce the number of cCTEs in it by doing 12 CROSS JOINs in the 2nd and final cCTE. I just didn't go that route. You could without harm.
Also note that I still use the SELECT/UNION ALL method to form the first 10 pseudo-rows because I still do a lot of work with folks on 2005 and was stuck using 2005 myself until about 6 months ago. Full documentation is included in the code.
CREATE FUNCTION [dbo].[fnTally]
/**********************************************************************************************************************
Purpose:
Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Trillion.
As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.
Usage:
--===== Syntax example (Returns BIGINT)
SELECT t.N
FROM dbo.fnTally(@ZeroOrOne,@MaxN) t
;
Notes:
1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.
Refer to the following URLs for how it works and introduction for how it replaces certain loops.
http://www.sqlservercentral.com/articles/T-SQL/62867/
http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers
2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertible to the BIT data-type
will cause the sequence to start at 1.
3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.
5. If @MaxN is negative or NULL, a "TOP" error will be returned.
6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 1 Billion. If a larger
number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with
that many values, you should consider using a different tool. ;-)
7. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending
sort is required, use code similar to the following. Performance will decrease by about 27% but it's still
very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.
If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.
DECLARE @MaxN BIGINT;
SELECT @MaxN = 1000;
SELECT DescendingN = @MaxN-N+1
FROM dbo.fnTally(1,@MaxN);
8. There is no performance penalty for sorting "N" in ascending order because the output is explicitly sorted by
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
Revision History:
Rev 00 - Unknown - Jeff Moden
- Initial creation with error handling for @MaxN.
Rev 01 - 09 Feb 2013 - Jeff Moden
- Modified to start at 0 or 1.
Rev 02 - 16 May 2013 - Jeff Moden
- Removed error handling for @MaxN because of exceptional cases.
Rev 03 - 22 Apr 2015 - Jeff Moden
- Modify to handle 1 Trillion rows for experimental purposes.
**********************************************************************************************************************/
(@ZeroOrOne BIT, @MaxN BIGINT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN WITH
E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1) --10E1 or 10 rows
, E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d) --10E4 or 10 Thousand rows
,E12(N) AS (SELECT 1 FROM E4 a, E4 b, E4 c) --10E12 or 1 Trillion rows
SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.
UNION ALL
SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E12 -- Values from 1 to @MaxN
;

Selecting a set of rows more than once

Is there a simple, concise way to select the same set of rows repeated based on a count held in a variable, without using a loop?
For instance, suppose SELECT a, b, c FROM foo WHERE whatsit = something returns
a b c
--- --- ----
1 2 3
...and I have @count with 3 in it. Is there a reasonable way without a loop to get:
a b c
--- --- ----
1 2 3
1 2 3
1 2 3
? Order doesn't matter, and I don't need to know which group any given row belongs to. I actually only need this for one row (as above), and a solution that only works for one row would do the trick, but I assume if we can do it for one, we can do it for any number.
Try with a Recursive CTE
WITH cte
AS (SELECT 1 AS id,a,b,c
FROM tablename
UNION ALL
SELECT id + 1,a,b,c
FROM cte
WHERE id < 3) --@count
SELECT a,b,c
FROM cte
Another way to do it is with a cross join:
SELECT a, b, c
FROM Table1
CROSS JOIN (SELECT number
FROM master.dbo.spt_values
WHERE type = 'P'
AND number BETWEEN 1 AND 3) T
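The cross join above hard-codes 3. A hedged sketch that drives the repetition from the @count variable instead, using an on-the-fly tally built from cascading CTEs rather than spt_values, might look like this. The foo CTE is only a stand-in for the real table and the CTE names are assumptions. As written it supports counts up to 10,000.

```sql
DECLARE @count INT = 3;

WITH foo (a, b, c) AS (SELECT 1, 2, 3),   -- stand-in for the real table
     E1 (N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v (N)),   -- 10 rows
     E4 (N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d),                                   -- 10,000 rows
     Tally (N) AS (SELECT TOP (@count) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT f.a, f.b, f.c
FROM foo f
CROSS JOIN Tally;   -- each source row is repeated @count times
```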
I don't know of a way you could do this without a loop or dynamic SQL. I think a union is all that I can come up with
select q1.a,q1.b,q1.c
from (
SELECT a, b, c FROM foo
union all
SELECT a, b, c FROM foo
union all
SELECT a, b, c FROM foo ) q1
order by q1.a

How to Pivot one Row into One Column

Can someone please help me out.
I've looked around and can't find something similar to what I need to do. Basically,
I have a table that needs to be pivoted. It comes from a flat file that loads all columns as one comma-delimited column. I will need to break out the columns into their respective order before the pivot, and I've got procedures that do this beautifully. However, the crux of this table is that I need to edit the headers before I can continue.
I need help to pivot the information in the first column and put it another table I've created. Therefore, I need this
ID Column01
1 Express,,,Express,,,HyperMakert,,WebStore,Web
To End up like this....
New_ID New_Col
1 Express
2
3
4 Express
5
6
7 HyperMarket
8
9 WebStore
10 Web
Please note that I need to include the '' blank columns from the original row.
I looked at the links below, but they were not helpful:
SQL Server : Transpose rows to columns
Efficiently convert rows to columns in sql server
Mysql query to dynamically convert rows to columns
There are many methods of splitting a string in SQL Server that you can find on the web; some are really complicated, but some are simple. I like using a dynamic query. It's short and simple (I'm not sure about the performance, but I believe it would not be too bad):
declare @s varchar(max)
-- save the Column01 string/text into @s variable
select @s = Column01 from test where ID = 1
-- build the query string
set @s = 'select row_number() over (order by current_timestamp) as New_ID, c as New_Col from (values ('''
+ replace(@s, ',', '''),(''') + ''')) v(c)'
insert newTable exec(@s)
go
select * from newTable
Sqlfiddle Demo
The values() clause above acts as a kind of anonymous table. Here is a simple example of such usage (so that you can understand it better). The anonymous table in the following example has just 1 column; the table name is v and the column name is c. Each row has just 1 cell and should be wrapped in a pair of parentheses (). The rows are separated by commas and follow after values. Here is the code:
-- note about the outside (...) wrapping values ....
select * from (values ('a'),('b'),('c'), ('d')) v(c)
The result will be:
c
------
1 a
2 b
3 c
4 d
Just try running that code and you'll understand how useful it is.
You may want to use a tally table here. See http://www.sqlservercentral.com/articles/T-SQL/62867/
declare @parameter varchar(4000)
set @parameter = 'Express,,,Express,,,HyperMakert,,WebStore,Web'
set @parameter = ',' + @parameter + ',' -- add commas
;with -- a CTE that follows another statement must be preceded by a semicolon
e1 as(select 1 as N union all select 1), -- 2 rows
e2 as(select 1 as N from e1 as a, e1 as b), -- 4 rows
e3 as(select 1 as N from e2 as a, e2 as b), -- 16 rows
e4 as(select 1 as N from e3 as a, e3 as b), -- 256 rows
e5 as(select 1 as N from e4 as a, e4 as b), -- 65536 rows
tally as (select row_number() over(order by N) as N from e5
)
select
substring(@parameter, N+1, charindex(',', @parameter, N+1) - N-1)
from tally
where
N < len(@parameter)
and substring(@parameter, N, 1) = ','
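The question also asks for a New_ID column. One way to extend the tally approach is sketched below, under the assumption that numbering the items by their position in the string is what is wanted. The TOP clause on the tally is an addition of this sketch: it keeps N below the string length so that SUBSTRING never receives a negative length.

```sql
DECLARE @parameter VARCHAR(4000) = 'Express,,,Express,,,HyperMakert,,WebStore,Web';
SET @parameter = ',' + @parameter + ',';   -- wrap in commas so every item is delimited on both sides

WITH e1 AS (SELECT 1 AS N UNION ALL SELECT 1),        -- 2 rows
     e2 AS (SELECT 1 AS N FROM e1 AS a, e1 AS b),     -- 4 rows
     e3 AS (SELECT 1 AS N FROM e2 AS a, e2 AS b),     -- 16 rows
     e4 AS (SELECT 1 AS N FROM e3 AS a, e3 AS b),     -- 256 rows
     e5 AS (SELECT 1 AS N FROM e4 AS a, e4 AS b),     -- 65536 rows
     tally AS (SELECT TOP (LEN(@parameter) - 1)
                      ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
               FROM e5)
SELECT ROW_NUMBER() OVER (ORDER BY N) AS New_ID,      -- item position within the string
       SUBSTRING(@parameter, N + 1, CHARINDEX(',', @parameter, N + 1) - N - 1) AS New_Col
FROM tally
WHERE SUBSTRING(@parameter, N, 1) = ','               -- each comma marks the start of an item
ORDER BY New_ID;
```

Empty items come back as empty strings, so the blank entries from the original row are kept.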
