Table function for each record in a query - sql-server

I have a function that has two outputs ...
dbo.func1(@code) -> Table(out1, out2)
This function is costly and takes a long time to calculate these two outputs.
I have a query like this:
SELECT code, name,
(SELECT out1 FROM dbo.func1(code)), (SELECT out2 FROM dbo.func1(code))
FROM MyInnerJoinedTablesResult
But this way my costly function is called twice. I want to call it once for each record selected from my table, with the results in two columns of the same row (not in duplicated rows), something like:
SELECT code, name,
(out1 in func1), (out2 in func2)
FROM MyInnerJoinedTablesResult

You need to use CROSS APPLY, which evaluates the table function for each row and exposes both output columns:
SELECT code, name, func.out1, func.out2
FROM MyInnerJoinedTablesResult
cross apply dbo.func1(code) as func

Related

Max Recursion Exhausted before Statement Completion

I know this has been asked and answered a few times here, but I can't seem to find the answer to my specific problem. Here's the recursive query:
CTE as (
SELECT
ZipCode
,Age
,[Population]
,Deaths
,DeathRate
,Death_Proportion
,DeathProbablity
,SurvivalProbablity
,PersonsAlive
FROM ProbabilityTable
WHERE Age = 0
UNION ALL
SELECT
p.ZipCode
,p.Age
,p.[Population]
,p.Deaths
,p.DeathRate
,p.Death_Proportion
,p.DeathProbablity
,p.SurvivalProbablity
,LAG(c.PersonsAlive,1) OVER(PARTITION BY p.ZipCode ORDER BY p.Age) * p.SurvivalProbablity
FROM ProbabilityTable p
INNER JOIN CTE c
ON p.ZipCode = c.ZipCode
and p.Age = c.Age
WHERE p.Age < 86
)
In the ProbabilityTable, PersonsAlive is set to 100,000 when Age = 0. What I'm looking to do with the recursive CTE is multiply the previous value of PersonsAlive by the current SurvivalProbablity to calculate the PersonsAlive of that Age. Age goes up to 85, which is why I have my termination clause set at 86.
I've tried tweaking the recursive part of the query a number of times (and also setting PersonsAlive to 100,000 in the anchor part) but I can't figure it out. This is my first attempt at a recursive query and even with some course work it's not clicking for me.
EDIT
Here is the updated code that actually runs:
CTE as (
SELECT
ZipCode
,Age
,[Population]
,Deaths
,DeathRate
,Death_Proportion
,DeathProbablity
,SurvivalProbablity
,PersonsAlive
FROM ProbabilityTable
WHERE Age = 0
UNION ALL
SELECT
p.ZipCode
,p.Age
,p.[Population]
,p.Deaths
,p.DeathRate
,p.Death_Proportion
,p.DeathProbablity
,p.SurvivalProbablity
,LAG(c.PersonsAlive,1) OVER(PARTITION BY p.ZipCode ORDER BY p.Age) * p.SurvivalProbablity
FROM ProbabilityTable p
INNER JOIN CTE c
ON p.ZipCode = c.ZipCode
and p.Age = c.Age + 1
WHERE p.Age < 6
)
And here are the results it returns:
What I want the results to be for PersonsAlive is as follows:
So with each iteration of the CTE, it needs to reference the previous row of PersonsAlive and the current row of SurvivalProbability to calculate PersonsAlive
It's hard to test this without your raw data but I think your issue is you're lagging over the previous row, causing your frame of reference to be 2 rows back.
When you're using a recursive CTE, you already have access to the previous row, via CTE c. When you do LAG(c.PersonsAlive,1) you're actually telling it to look at PersonsAlive from 2 rows back from the current row (lagging 1 row back from the previous row).
On the first recursive pass there is only 1 row back, so the LAG() function returns NULL by default because no row exists 2 rows back at that point. This is why every row in your results has NULL for the PersonsAlive column, except for the first row (the anchor row from the first half of your UNION ALL clause). So if you remove the LAG() function and instead just do c.PersonsAlive * p.SurvivalProbablity, you should get all of the expected PersonsAlive values.
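To make the intended arithmetic concrete, here is a minimal sketch in Python (not T-SQL) of the recurrence the fixed recursive member computes: each age takes the previous age's PersonsAlive and multiplies it by the current age's survival probability. The function name and the sample probabilities are hypothetical and are not taken from the question's data.

```python
from typing import List


def persons_alive(survival: List[float], start: float = 100_000.0) -> List[float]:
    """Return PersonsAlive per age, where survival[i] is the probability at age i.

    Age 0 is the anchor row and keeps the starting cohort unchanged,
    mirroring the anchor member of the CTE.
    """
    alive = [start]
    for age in range(1, len(survival)):
        # Equivalent of c.PersonsAlive * p.SurvivalProbablity with p.Age = c.Age + 1
        alive.append(alive[-1] * survival[age])
    return alive


# Hypothetical survival probabilities for ages 0..3 (illustration only).
result = persons_alive([1.0, 0.5, 0.5, 0.25])
print(result)
```

Each value depends only on the one immediately before it, which is exactly what the join to CTE c already provides, so no extra LAG() is needed inside the recursive member.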
That being said, a recursive CTE seems like overkill here and you probably can just use the LAG() window function in a static call on your ProbabilityTable like so:
SELECT
ZipCode,
Age,
[Population],
Deaths,
DeathRate,
Death_Proportion,
DeathProbablity,
SurvivalProbablity,
ISNULL(LAG(PersonsAlive,1) OVER (PARTITION BY ZipCode ORDER BY Age), PersonsAlive) AS PersonsAlive
FROM ProbabilityTable
As I mentioned, I can't really test this, so please let me know if you run into any issues, and I'll help you accordingly.
Recursive CTEs are good for tree-like problems, e.g. when you need to compare multiple child rows to their parent, or interact with multiple levels of the tree simultaneously. Window functions like LAG() allow you to interact with any single row at a time relative to the current row. Your problem seems to be the latter kind.

Returning from a join the first result of one column based on a second column

I need some help to improve part of my query. The query is returning the correct data, I just need to exclude some extra information that I don't need.
I believe that one of the main parts that will change is:
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
In this part, if I have, for example, 2 matching FK_ID_TBL_FILE_NAMES values, it will return 2 results from TBL_DATA_TYPE_RO_BODY.
The data that I have is (I excluded some extra columns):
If I have 2 or more equal MAG values for the same field "ONLY_FILE_NAME", I should return only the first one (I don't care about the others). I believe that this is a simple case for GROUP BY, but I am having trouble doing the GROUP BY on the join.
My ideas:
Use SELECT TOP
Use FIRST_VALUE
What I have (note the 2 last lines):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
1608039|808|3269|44:00.0|RO_Mass_Load_4b
What I would like to have (note the last line):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
Note that the mag field is coming from my JOIN.
Ideas? Any help?
In case you want to see it, the whole query is:
SELECT TW.CURRENT_MEASUREMENT as Cycle_Current_Measurement,
TW.REF_MEASUREMENT as Cycle_Ref_Measurement,
CONVERT(REAL,TT.CURRENT_TEMP) as Cycle_Current_Temp,
CONVERT(REAL,TT.REF_TEMP) as Cycle_Ref_Temp,
TP.TYPE as Cycle_Type, TB.FREQUENCY as Freq,
TB.MAGNITUDE as Mag,
TB.PHASE as Phase,
VMI.TIME_FORMATTED as Date,
VMI.ID_TBL_FILE_NAMES as IdFileNames, VMI.ID_TBL_DATA_TYPE_RO_HEADER as IdHeader, VMI.*
FROM VW_MAIN_INFO VMI
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
LEFT JOIN TBL_POINTS_AND_CYCLES TP ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TP.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_MEASUREMENT TW ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TW.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_TEMP TT ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TT.FK_ID_TBL_DATA_TYPE_RO_HEADER
Try something like this. PARTITION BY works like GROUP BY: it defines the groups within which ROW_NUMBER assigns integers that increase by 1. ORDER BY tells ROW_NUMBER which rows get the lower numbers, so in this example the earliest date in each group will have RID = 1. Then wrap it in a subquery and select only the rows that have RID = 1:
select *
from (select RID = row_number() over (partition by tb.Magnitude order by vmi.time_formatted)
from ...<rest of your query>) a
where a.RID = 1
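As a rough, runnable illustration of the same pattern, the sketch below loads the six sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and keeps only the row numbered 1 within each Mag group. The table name readings and its column names are made up for this example, and it assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The original question targets SQL Server, so treat this only as a demonstration of the ROW_NUMBER idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the joined result in the question.
conn.execute(
    "CREATE TABLE readings "
    "(freq INTEGER, mag INTEGER, phase INTEGER, date TEXT, only_file_name TEXT)"
)
rows = [
    (1608039, 767, 3234, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 781, 3371, "44:00.0", "RO_Mass_Load_4b"),
    (1608039, 788, 3138, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 797, 3326, "44:00.0", "RO_Mass_Load_4b"),
    (1608039, 808, 3117, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 808, 3269, "44:00.0", "RO_Mass_Load_4b"),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT freq, mag, phase, date, only_file_name
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY mag ORDER BY date) AS rid
    FROM readings AS r
) AS a
WHERE a.rid = 1   -- keep only the first row of each mag group
ORDER BY mag
"""
deduped = conn.execute(query).fetchall()
for row in deduped:
    print(row)
```

The duplicated Mag value 808 collapses to the row with the earlier date, which matches the desired output shown in the question.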

Make a new array with items derived from another array

Given a PostgreSQL ARRAY of items of one type, how can I create a new array where each item is derived from the items in the initial array?
Example: I have an array of INTERVAL values. I want a new array where each item is a NUMERIC(10, 1) that is the total number of seconds in the corresponding INTERVAL value.
I know how to convert one INTERVAL value:
foo=> SELECT '00:01:20.000'::INTERVAL AS duration_interval;
duration_interval
-------------------
00:01:20
(1 row)
foo=> SELECT extract(EPOCH FROM date_trunc('second', '00:01:20.000'::INTERVAL))
::NUMERIC(10, 1) AS duration_seconds;
duration_seconds
------------------
80.0
(1 row)
The array does not exist in a table – this is a value returned from another function call – so the conversion code needs to operate on it as an array.
How can I convert an array of INTERVAL values to an array of corresponding NUMERIC values?
You need to unnest() the array, do the conversion and then aggregate back into an array.
Assuming you want to do this on a real table with a primary key:
SELECT pk, array_agg(extract(epoch from dur_int)::numeric(10,1)
ORDER BY ordinality) AS duration_seconds
FROM my_table, unnest(duration_interval) WITH ORDINALITY d(dur_int)
GROUP BY pk;
If you have a single array, such as the result from a function call:
SELECT array_agg(extract(epoch from dur_int)::numeric(10,1)
ORDER BY ordinality) AS duration_seconds
FROM unnest(function(...)) WITH ORDINALITY d(dur_int);
Note that you need the WITH ORDINALITY clause when unnesting the array. This will add a column ordinality to the result such that every row has two columns: (dur_int interval, ordinality bigint). When putting the array back again with seconds instead of an interval, you order the rows by the ordinality column. That way you ensure that the order in the resulting array of seconds is the same as in the original array of intervals. (In general, SQL row sources have no specific ordering, the server may present rows in any order it prefers.)
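The role of the ordinality column can be mimicked outside the database. The following is only a Python analogy of the unnest / convert / re-aggregate steps, not PostgreSQL code: enumerate stands in for WITH ORDINALITY, the reversal simulates a row source that comes back in an arbitrary order, and the final sort plays the part of ORDER BY ordinality inside array_agg. All variable names are hypothetical.

```python
from datetime import timedelta

# Stand-in for the INTERVAL[] value returned by the function call.
intervals = [timedelta(minutes=1, seconds=20), timedelta(seconds=30)]

# unnest(...) WITH ORDINALITY: pair each element with its 1-based position.
unnested = [(ordinality, value) for ordinality, value in enumerate(intervals, start=1)]

# A SQL row source has no guaranteed order; reversing simulates that.
arbitrary_order = list(reversed(unnested))

# array_agg(... ORDER BY ordinality): restore the original order, then convert
# each interval to seconds with one decimal place.
duration_seconds = [
    round(value.total_seconds(), 1)
    for _, value in sorted(arbitrary_order, key=lambda pair: pair[0])
]
print(duration_seconds)
```

Without the sort on the position, the converted values would come out in whatever order the rows happened to arrive, which is the situation the ORDER BY ordinality clause guards against.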
If you have access to the function and you are not breaking other uses of it, you might be better off by changing the function such that you can use its result directly.
If there is a primary key then @Patrick's answer is enough. If not, then use row_number to aggregate on:
with i(i) as (values
(array['00:01:20.000','00:00:30.000']::interval[]),
(array['00:02:10.000','00:01:30.000']::interval[])
)
select array_agg(extract(epoch from a)::numeric(10,1))
from (
select i, row_number() over() as r
from i
) s, unnest(i) a (a)
group by r
;
array_agg
--------------
{80.0,30.0}
{130.0,90.0}

How can I prevent a function from being called twice in Select statement

Let's say I have a function that returns a number, dbo.somefunction(@id INT). Then I have another function that returns a bit, dbo.returnsbit(@id INT).
This is what my view looks like
SELECT dbo.somefunction(id) AS returnIntValue
FROM sometable
Ultimately this is what I want to do
SELECT dbo.somefunction(id) AS returnIntValue, dbo.returnsbit(returnIntValue) As BitValue
FROM sometable
The trouble is, the second function (dbo.returnsbit) uses the returned value from the first function (dbo.somefunction). Of course I can do dbo.returnsbit(dbo.somefunction(id)), but that means the first function is called twice resulting in increased overhead.
Quite often, if you use user-defined functions in queries that process many rows, the performance of the query drops significantly just because the server has to call the function for each row. In that case, optimizing the query to call two functions per row instead of three will not change the overall performance much; it will be poor in any case.
If you have a very complex function, where each call takes a long time and you expect to call it for a limited number of rows, then it makes sense to reduce the overall number of calls.
In any case, I found this question interesting enough to check my guesses and write my findings.
If you are using SQL Server 2005 or later you can use CROSS APPLY.
SELECT
CA.returnIntValue
,dbo.returnsbit(CA.returnIntValue) As BitValue
FROM
sometable
CROSS APPLY
(
SELECT dbo.somefunction(id) AS returnIntValue
) AS CA
I have checked on SQL Server 2008 that this variant indeed calls somefunction only once per row.
Sometimes it is important not just from the performance point of view. You may get different results if the function with side effects is called more times than needed. See example below.
Here is how to confirm that CROSS APPLY calls somefunction only once per row.
Create a table with few rows
CREATE TABLE [dbo].[Numbers]([Number] [int] NOT NULL);
INSERT INTO [dbo].[Numbers] ([Number])
VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
Create first function. It will return a random number. A different number each time it is called. It will not use the parameter, but it doesn't matter for this example.
CREATE VIEW [dbo].[ViewRnd]
AS
SELECT CAST(CRYPT_GEN_RANDOM(4) AS int) AS rnd
GO
CREATE FUNCTION [dbo].[somefunction]
(
@id int
)
RETURNS int
AS
BEGIN
DECLARE @Result int;
SELECT @Result = rnd FROM dbo.ViewRnd;
RETURN @Result;
END
GO
Create a second function. For this example it will simply return the given parameter.
CREATE FUNCTION [dbo].[somefunction2]
(
@id int
)
RETURNS int
AS
BEGIN
DECLARE @Result int;
SELECT @Result = @id;
RETURN @Result;
END
First query with two calls per row:
SELECT
dbo.somefunction(dbo.Numbers.Number) AS f1
, dbo.somefunction2(dbo.somefunction(dbo.Numbers.Number)) AS f2
FROM dbo.Numbers
;
Result set:
f1 f2
-1111498263 -1481060640
1678801669 230929974
1377897182 -1527788053
1786076194 -301754441
734901522 1385475384
-636644847 -1076939672
1551114591 -385251162
32984627 -1214863465
2075259001 -1450159610
-2063202107 -1023434184
You can see that values f1 and f2 in each row are different, which means that random number was generated twice for each row, i.e. the function somefunction was called twice for each row.
Second query with one call per row:
SELECT
CA.f1
, dbo.somefunction2(CA.f1) AS f2
FROM
dbo.Numbers
CROSS APPLY
(
SELECT dbo.somefunction(dbo.Numbers.Number) AS f1
) CA
;
Result set:
f1 f2
-963307489 -963307489
450369380 450369380
1193334688 1193334688
1480723291 1480723291
-1666937401 -1666937401
1001969991 1001969991
-1142557574 -1142557574
-1891218324 -1891218324
-102288163 -102288163
1575326336 1575326336
You can see that values f1 and f2 in each row are the same, which means that the function somefunction was called only once per row.
You can do this:
SELECT returnIntValue, dbo.returnsbit(returnIntValue) As BitValue
FROM
(SELECT dbo.somefunction(id) AS returnIntValue
FROM sometable) t
I guess dbo.somefunction may still be invoked twice on each row depending on the query plan the optimizer happens to pick, as Damien_The_Unbeliever noted.

Single Subquery returns multiple rows in Oracle

select distinct a.person.name, b.title,b.director.name
from movie_roles a, movie b
where a.person.name=
( select b.director.name
from movie b, movie_roles a
where b.director.name=a.person.name)
and b.movieID=a.movie.movieID;
I keep getting an error saying that a single-row subquery returns multiple rows in Oracle.
Can anyone help me solve this problem?
It's self-explanatory. In the following line
where a.person.name= ( select b.director.name from movie b, movie_roles a where b.director.name=a.person.name)
you get more than one result so you cannot use "=". Try
where a.person.name IN( select b.director.name from movie b, movie_roles a where b.director.name=a.person.name)
