SQL Server : Create Function Wrong Syntax near 'Begin'

SQL Server : Create Function Wrong Syntax near 'Begin' - sql-server

I have a problem figuring out how to get rid of an error. It says there is wrong Syntax near the Begin statement. I assume it means before, but I do not know what. I've tried many different declarations of the function but did not get it to work.
I've table that is feeded a line in every step of a process, for multiple processes. The function should take a process name (unit) and time and should result all lines for that process from start to end.
Executing the sql without a function works fine.
CREATE FUNCTION [GetFullCIP]
(
#pTime DATETIME2,
#pName NVARCHAR(50)
)
RETURNS TABLE
AS
BEGIN
DECLARE #cipid int
SELECT TOP(1) #cipid=unit_id FROM [dbo].[md_units] WHERE unit=#pName
DECLARE #stop Datetime2;
DECLARE #start Datetime2;
--start
SELECT TOP (1) #start=[begin_date] FROM [dbo].[log] WHERE [operation_id]=1 AND unit_id=#cipid AND [begin_date] <=#pTime ORDER BY [cip_id] DESC
--stop
SELECT TOP (1) #stop=[inserted_date] FROM [dbo].[log] WHERE [operation_id]=99 AND unit_id=#cipid AND [inserted_date]>=#pTime ORDER BY [cip_id] ASC
RETURN (SELECT * FROM [dbo].[log] WHERE unit_id=#cipid AND [begin_date]>=#start AND [inserted_date]<=#stop)
END
GO
I read that i should give the return table a name, like #resdata. I tried that and at the end write
SET #resdata=(SELECT ...) but that doesnt work, by than it does not know #resdata anymore.
Thx in advance

As I mentioned, I would use an inline table-value function. This is untested, due to no sample data or expected results, but is a literal translation of the ml-TVF you have posted.
CREATE FUNCTION dbo.[GetFullCIP] (#pTime datetime2(7), #pName nvarchar(50))
RETURNS table
AS RETURN
SELECT L.* --Replace this with your explicit columns
FROM dbo.log L
JOIN dbo.md_units MU ON L.unit_id = MU.unit_id
WHERE MU.Unit = #pName
AND L.begin_date >= (SELECT TOP (1) sq.begin_date
FROM dbo.log sq
WHERE sq.operation_id = 1
AND sq.unit_id = MU.unit_id
AND sq.begin_date <= #pTime
ORDER BY sq.cip_id DESC)
AND L.inserted_date <= (SELECT TOP (1) sq.inserted_date
FROM dbo.log sq
WHERE sq.operation_id = 99
AND sq.unit_id = MU.unit_id
AND sq.inserted_date >= #pTime
ORDER BY sq.cip_id ASC)
GO

Related

Can I use a variable inside cursor declaration?

Is this code valid?
-- Zadavatel Login ID
DECLARE #ZadavatelLoginId nvarchar(max) =
(SELECT TOP 1 LoginId
FROM
(SELECT Z.LoginId, z.Prijmeni, k.spojeni
FROM TabCisZam Z
LEFT JOIN TabKontakty K ON Z.ID = K.IDCisZam
WHERE druh IN (6,10)) t1
LEFT JOIN
(SELECT ko.Prijmeni, k.spojeni, ko.Cislo
FROM TabCisKOs KO
LEFT JOIN TabKontakty K ON K.IDCisKOs = KO.id
WHERE druh IN (6, 10)) t2 ON t1.spojeni = t2.spojeni
AND t1.Prijmeni = t2.Prijmeni
WHERE
t2.Cislo = (SELECT CisloKontOsoba
FROM TabKontaktJednani
WHERE id = #IdKJ))
-- Pokud je řešitelský tým prázdný
IF NOT EXISTS (SELECT * FROM TabKJUcastZam WHERE IDKJ = #IdKJ)
BEGIN
DECLARE ac_loginy CURSOR FAST_FORWARD LOCAL FOR
-- Zadavatel
SELECT #ZadavatelLoginId
END
ELSE BEGIN
I am trying to pass the variable #ZadavatelLoginId into the cursor declaration and SSMS keeps telling me there is a problem with the code even though it is working.
Msg 116, Level 16, State 1, Procedure et_TabKontaktJednani_ANAFRA_Tis_Notifikace, Line 575 [Batch Start Line 7]
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS
Can anyone help?

I do not see anything in your posted query that could trigger the specific message that you listed. You might get an error if the subquery (SELECT CisloKontOsoba FROM TabKontaktJednani WHERE id = #IdKJ) returned more than one value, but that error would be a very specific "Subquery returned more than 1 value...".
However, as written, your cursor query is a single select of a scalar, which would never yield anything other than a single row.
If you need to iterate over multiple user IDs, but wish to separate your selection query from your cursor definition, what you likely need is a table variable than can hold multiple user IDs instead of a scalar variable.
Something like:
DECLARE #ZadavatelLoginIds TABLE (LoginId nvarchar(max))
INSERT #ZadavatelLoginIds
SELECT t1.LoginId
FROM ...
DECLARE ac_loginy CURSOR FAST_FORWARD LOCAL FOR
SELECT LoginId
FROM #ZadavatelLoginIds
OPEN ac_loginy
DECLARE #LoginId nvarchar(max)
FETCH NEXT FROM ac_loginy INTO #LoginId
WHILE ##FETCH_STATUS = 0
BEGIN
... Send email to #LoginId ...
FETCH NEXT FROM ac_loginy INTO #LoginId
END
CLOSE ac_loginy
DEALLOCATE ac_loginy
A #Temp table can also be used in place of the table variable with the same results, but the table variable is often more convenient to use.
As others have mentioned, I believe that your login selection query is overly complex. Although this was not the focus of your question, I still suggest that you attempt to simplify it.
An alternative might be something like:
SELECT Z.LoginId
FROM TabKontaktJednani KJ
JOIN TabCisKOs KO ON KO.Cislo = KJ.CisloKontOsoba
JOIN TabCisZam Z ON Z.Prijmeni = KO.Prijmeni
JOIN TabKontakty K ON K.IDCisZam = Z.ID
WHERE KJ.id = #IdKJ
AND K.druh IN (6,10)
The above is my attempt to rewrite your posted query after tracing the relationships. I did not see any LEFT JOINS that were not superseded by other conditions that forced them into effectively being inner joins, so the above uses inner joins for everything. I have assumed that the druh column is in the TabKontakty table. Otherwise I see no need for that table. I do not guarantee that my re-interpretation is correct though.

How about you create a #temp table for each sub query since the problem is coming up due to the sub queries?
CREATE TABLE #TEMP1
(
LoginID nvarchar(max)
)
CREATE TABLE #TEMP2
(
ko.Prijmeni nvarchar(max),
k.spojeni nvarchar(max),
ko.Cislo nvarchar(max)
)

(SOLVED) - First iteration of WHILE loop runs out of memory despite manual reconstruction of query succeeding

Environment: SQL Server 2019 (v15).
I have a large query that uses too much space when run as a single SELECT statement. When I try to run it, I get the following error:
Could not allocate a new page for database 'TEMPDB' because of insufficient disk space in filegroup 'DEFAULT'.
However, the problem breaks down naturally into a dozen or so pieces, so I wrote a WHILE loop to iterate through each piece and insert into a results table. Unfortunately, the first iteration of the WHILE loop also returns the same memory error. All the WHILE loop is doing is changing a few values in the WHERE clause.
The key thing confusing me here, is that when I manually run one iteration of the INSERT statement, absent all looping logic, it works perfectly.
Manually coding the first iteration to use the first institution_name just works, so I don't think the joins here are going wrong and causing the memory error.
WITH my_cte AS
(
SELECT [columns]
FROM mytable a
INNER JOIN bigtable b ON a.institution_name = b.institution_name
AND a.personID = b.personID
WHERE a.institution_name = 'ABC'
AND b.institution_name = 'ABC'
)
INSERT INTO results (personID, institution_name, ...)
SELECT personID, institution_name, [some aggregations]
FROM my_cte
GROUP BY personID, institution_name;
The version with the WHILE loop fails. I need to run the query with different values for institution_name.
Here I show three different values but even just the first iteration fails.
DECLARE #INSTITUTION varchar(10)
DECLARE #COUNTER int
SET #COUNTER = 0
DECLARE #LOOKUP table (temp_val varchar(10), temp_id int)
INSERT INTO #LOOKUP (temp_val, temp_id)
VALUES ('ABC', 1), ('DEF', 2), ('GHI', 3)
WHILE #COUNTER < 3
BEGIN
SET #COUNTER = #COUNTER + 1
SELECT #INSTITUTION = temp_val
FROM #LOOKUP
WHERE temp_id = #COUNTER;
WITH my_cte AS
(
SELECT [columns]
FROM mytable a
INNER JOIN bigtable b ON a.institution_name = b.institution_name
AND a.personID = b.personID
WHERE a.institution_name = #INSTITUTION
AND b.institution_name = #INSTITUTION
)
INSERT INTO results (personID, institution_name, ...)
SELECT personID, institution_name, [some aggregations]
FROM my_cte
GROUP BY personID, institution_name
END
As I write this question, I have quite literally just copy-pasted the insert statement a dozen times, changed the relevant WHERE clause, and run it without errors. Could it be some kind of datatype issue where the query can properly subset if a string literal is put in the WHERE column, but the lookup on my temporary table is failing due to the datatype? I notice that mytable.institution_name is varchar(10) while bigtable.institution_name is nvarchar(10). Setting the temp table to use nvarchar(10) didn't fix it either.

Searching for multiple patterns in a string in T-SQL

In t-sql my dilemma is that I have to parse a potentially long string (up to 500 characters) for any of over 230 possible values and remove them from the string for reporting purposes. These values are a column in another table and they're all upper case and 4 characters long with the exception of two that are 5 characters long.
Examples of these values are:
USFRI
PROME
AZCH
TXJS
NYDS
XVIV. . . . .
Example of string before:
"Offered to XVIV and USFRI as back ups. No response as of yet."
Example of string after:
"Offered to and as back ups. No response as of yet."
Pretty sure it will have to be a UDF but I'm unable to come up with anything other than stripping ALL the upper case characters out of the string with PATINDEX which is not the objective.

This is unavoidably cludgy but one way is to split your string into rows, once you have a set of words the rest is easy; Simply re-aggregate while ignoring the matching values*:
with t as (
select 'Offered to XVIV and USFRI as back ups. No response as of yet.' s
union select 'Another row AZCH and TXJS words.'
), v as (
select * from (values('USFRI'),('PROME'),('AZCH'),('TXJS'),('NYDS'),('XVIV'))v(v)
)
select t.s OriginalString, s.Removed
from t
cross apply (
select String_Agg(j.[value], ' ') within group(order by Convert(tinyint,j.[key])) Removed
from OpenJson(Concat('["',replace(s, ' ', '","'),'"]')) j
where not exists (select * from v where v.v = j.[value])
)s;
* Requires a fully-supported version of SQL Server.

build a function to do the cleaning of one sentence, then call that function from your query, something like this SELECT Col1, dbo.fn_ReplaceValue(Col1) AS cleanValue, * FROM MySentencesTable. Your fn_ReplaceValue will be something like the code below, you could also create the table variable outside the function and pass it as parameter to speed up the process, but this way is all self contained.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION fn_ReplaceValue(#sentence VARCHAR(500))
RETURNS VARCHAR(500)
AS
BEGIN
DECLARE #ResultVar VARCHAR(500)
DECLARE #allValues TABLE (rowID int, sValues VARCHAR(15))
DECLARE #id INT = 0
DECLARE #ReplaceVal VARCHAR(10)
DECLARE #numberOfValues INT = (SELECT COUNT(*) FROM MyValuesTable)
--Populate table variable with all values
INSERT #allValues
SELECT ROW_NUMBER() OVER(ORDER BY MyValuesCol) AS rowID, MyValuesCol
FROM MyValuesTable
SET #ResultVar = #sentence
WHILE (#id <= #numberOfValues)
BEGIN
SET #id = #id + 1
SET #ReplaceVal = (SELECT sValue FROM #allValues WHERE rowID = #id)
SET #ResultVar = REPLACE(#ResultVar, #ReplaceVal, SPACE(0))
END
RETURN #ResultVar
END
GO

I suggest creating a table (either temporary or permanent), and loading these 230 string values into this table. Then use it in the following delete:
DELETE
FROM yourTable
WHERE col IN (SELECT col FROM tempTable);
If you just want to view your data sans these values, then use:
SELECT *
FROM yourTable
WHERE col NOT IN (SELECT col FROM tempTable);

SQL - Add new column with outputs as values

Just wondering how I might go about adding the ouputted results as a new column to an exsisting table.
What I'm tryng to do is extract the date from a string which is in another column. I have the below code to do this:
Code
CREATE FUNCTION dbo.udf_GetNumeric
(
#strAlphaNumeric VARCHAR(256)
)
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE #intAlpha INT
SET #intAlpha = PATINDEX('%[^0-9]%', #strAlphaNumeric)
BEGIN
WHILE #intAlpha > 0
BEGIN
SET #strAlphaNumeric = STUFF(#strAlphaNumeric, #intAlpha, 1, '' )
SET #intAlpha = PATINDEX('%[^0-9]%', #strAlphaNumeric )
END
END
RETURN ISNULL(#strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
The issue is that I want the result to be placed in a new column in an exsisting table. I have tried the below code but no luck.
ALTER TABLE [Data_Cube_Data].[dbo].[DB_Test]
ADD reportDated nvarchar NULL;
insert into [DB].[dbo].[DB_Test](reportDate)
SELECT
(SELECT dbo.udf_GetNumeric(FileNamewithDate) from [DB].[dbo].[DB_Test])

The syntax should be an UPDATE, not an INSERT, because you want to update existing rows, not insert new ones:
UPDATE Data_Cube_Data.dbo.DB_Test -- you don't need square bracket noise
SET reportDate = dbo.udf_GetNumeric(FileNamewithDate);
But yeah, I agree with the others, the function looks like the result of a "how can I make this object the least efficient thing in my entire database?" contest. Here's a better alternative:
-- better, set-based TVF with no while loop
CREATE FUNCTION dbo.tvf_GetNumeric
(#strAlphaNumeric varchar(256))
RETURNS TABLE
AS
RETURN
(
WITH cte(n) AS
(
SELECT TOP (256) n = ROW_NUMBER() OVER (ORDER BY ##SPID)
FROM sys.all_objects
)
SELECT output = COALESCE(STRING_AGG(
SUBSTRING(#strAlphaNumeric, n, 1), '')
WITHIN GROUP (ORDER BY n), '')
FROM cte
WHERE SUBSTRING(#strAlphaNumeric, n, 1) LIKE '%[0-9]%'
);
Then the query is:
UPDATE t
SET t.reportDate = tvf.output
FROM dbo.DB_Test AS t
CROSS APPLY dbo.tvf_GetNumeric(t.FileNamewithDate) AS tvf;
Example db<>fiddle that shows this has the same behavior as your existing function.

The function
As i mentioned in the comments, I would strongly suggest rewriting the function, it'll perform terribly. Multi-line table value function can perform poorly, and you also have a WHILE which will perform awfully. SQL is a set based language, and so you should be using set based methods.
There are a couple of alternatives though:
Inlinable Scalar Function
SQL Server 2019 can inline function, so you could inline the above. I do, however, assume that your value can only contain the characters A-z and 0-9. if it can contain other characters, such as periods (.), commas (,), quotes (") or even white space ( ), or your not on 2019 then don't use this:
CREATE OR ALTER FUNCTION dbo.udf_GetNumeric (#strAlphaNumeric varchar(256))
RETURNS varchar(256) AS
BEGIN
RETURN TRY_CONVERT(int,REPLACE(TRANSLATE(LOWER(#strAlphaNumeric),'abcdefghigclmnopqrstuvwxyz',REPLICATE('|',26)),'|',''));
END;
GO
SELECT dbo.udf_GetNumeric('abs132hjsdf');
The LOWER is there in case you are using a case sensitive collation.
Inline Table Value Function
This is the better solution in my mind, and doesn't have the caveats of the above.
It uses a Tally to split the data into individual characters, and then only reaggregate the characters that are a digit. Note that I assume you are using SQL Server 2017+ here:
DROP FUNCTION udf_GetNumeric; --Need to drop as it's a scalar function at the moment
GO
CREATE OR ALTER FUNCTION dbo.udf_GetNumeric (#strAlphaNumeric varchar(256))
RETURNS table AS
RETURN
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT TOP (LEN(#strAlphaNumeric))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4)
SELECT STRING_AGG(CASE WHEN V.C LIKE '[0-9]' THEN V.C END,'') WITHIN GROUP (ORDER BY T.I) AS strNumeric
FROM Tally T
CROSS APPLY (VALUES(SUBSTRING(#strAlphaNumeric,T.I,1)))V(C);
GO
SELECT *
FROM dbo.udf_GetNumeric('abs132hjsdf');
Your table
You define reportDated as nvarchar; this means nvarchar(1). Your function, however, returns a varchar(256); this will rarely fit in an nvarchar(1).
Define the column properly:
ALTER TABLE [dbo].[DB_Test] ADD reportDated varchar(256) NULL;
If you've already created the column then do the following:
ALTER TABLE [dbo].[DB_Test] ALTER COLUMN reportDated varchar(256) NULL;
I note, however, that the column is called "dated", which implies a date value, but it's a (n)varchar; that sounds like a flaw.
Updating the column
Use an UPDATE statement. Depending on the solution this would one of the following:
--Scalar function
UPDATE [dbo].[DB_Test]
SET reportDated = dbo.udf_GetNumeric(FileNamewithDate);
--Table Value Function
UPDATE DBT
SET reportDated = GN.strNumeric
FROM [dbo].[DB_Test] DBT
CROSS APPLY dbo.udf_GetNumeric(FileNamewithDate);

Optimizing sql server scalar-valued function

Here is my question,
I have a view calling another view. And that second view has a scalar function which obviously runs for each row of the table. For only 322 rows, it takes around 30 seconds. When I take out the calculated field, it takes 1 second.
I appreciate if you guys give me an idea if I can optimize the function or if there is any other way to increase the performance?
Here is the function:
ALTER FUNCTION [dbo].[fnCabinetLoad] (
#site nvarchar(15),
#cabrow nvarchar(50),
#cabinet nvarchar(50))
RETURNS float
AS BEGIN
-- Declare the return variable here
DECLARE #ResultVar float
-- Add the T-SQL statements to compute the return value here
SELECT #ResultVar = SUM(d.Value)
FROM
(
SELECT dt.*,
ROW_NUMBER()
OVER (PARTITION BY dt.tagname ORDER BY dt.timestamp DESC) 'RowNum'
FROM vDataLog dt
WHERE dt.Timestamp BETWEEN dateadd(minute,-15,getdate()) AND GetDate()
) d
INNER JOIN [SKY_EGX_CONFIG].[dbo].[vPanelSchedule] AS p
ON p.rpp = left(d.TagName,3) + substring(d.TagName,5,5)
+ substring(d.TagName,11,8)
AND right(p.pole,2) = substring(d.TagName,23,2)
AND p.site = #site
AND p.EqpRowNumber = #cabrow
AND p.EqpCabinetName= #cabinet
WHERE d.RowNum = 1
AND Right(d.TagName, 6) = 'kW Avg'
RETURN #ResultVar
END

Scalar-valued functions have atrocious performance. Your function looks like an excellent candidate for an inline table-valued function that you can CROSS APPLY:
CREATE FUNCTION [dbo].[fnCabinetLoad]
(
#site nvarchar(15),
#cabrow nvarchar(50),
#cabinet nvarchar(50)
)
RETURNS TABLE
AS RETURN
SELECT SUM(d.Value) AS [TotalLoad]
FROM
(
SELECT dt.*, ROW_NUMBER() OVER (PARTITION BY dt.tagname ORDER BY dt.timestamp DESC) 'RowNum'
FROM vDataLog dt
WHERE dt.Timestamp BETWEEN dateadd(minute,-15,getdate()) AND GetDate()) d INNER JOIN [SKY_EGX_CONFIG].[dbo].[vPanelSchedule] AS p
ON p.rpp = left(d.TagName,3) + substring(d.TagName,5,5) + substring(d.TagName,11,8)
AND right(p.pole,2) = substring(d.TagName,23,2)
AND p.site = #site
AND p.EqpRowNumber = #cabrow
AND p.EqpCabinetName= #cabinet
WHERE d.RowNum = 1
AND Right(d.TagName, 6) = 'kW Avg'
In your view:
SELECT ..., cabinetLoad.TotalLoad
FROM ... CROSS APPLY dbo.fnCabinetLoad(.., .., ..) AS cabinetLoad

My understanding is the returned result set is 322 rows, but if the vDataLog table is significantly larger, I would run that subquery first and dump that result set into a table variable. Then, you can use that table variable instead of a nested query.
Otherwise, as it stands now, I think the joins are being done on all rows of the nested query and then you're stripping them off with the where clause to get the rows you want.

You really don't need a function and get rid of nested view(very poor performant)! Encapsulate the entire logic in a stored proc to get the desired result, so that instead of computing everything row by row, it's computed as a set. Instead of view, use the source table to do the computation inside the stored proc.
Apart from that, you are using the functions RIGHT, LEFT AND SUBSTRING inside your code. Never have them in WHERE OR JOIN. Try to compute them before hand and dump them into a temp table so that they are computed once. Then index the temp tables on these columns.
Sorry for the theoretical answer, but right now code seems a mess. It needs to go through layers of changes to have decent performance.

Turn the function into a view.
Use it by restraining on the columns site, cabrow and cabinet and Timestamp. When doing that, try storing GetDate() and dateadd(minute,-15,getdate()) on a variable. I think not doing so can prevent you from taking advantage on any index on Timestamp.
SELECT SUM(d.Value) AS [TotalLoad],
dt.Timestamp,
p.site,
p.EqpRowNumber AS cabrow,
p.EqpCabinetName AS cabinet
FROM
( SELECT dt.*,
ROW_NUMBER() OVER (PARTITION BY dt.tagname ORDER BY dt.timestamp DESC)'RowNum'
FROM vDataLog dt) d
INNER JOIN [SKY_EGX_CONFIG].[dbo].[vPanelSchedule] AS p
ON p.rpp = left(d.TagName,3) + substring(d.TagName,5,5) + substring(d.TagName,11,8)
AND right(p.pole,2) = substring(d.TagName,23,2)
WHERE d.RowNum = 1
AND d.TagName LIKE '%kW Avg'