What has better performance on SQL Server - sql-server

I want to compare a number of values (up to ten) with a function that will return the smallest value of them.
My colleague wrote the function like:
set @smallest = null
if @smallest is null or @date0 < @smallest
begin
    set @smallest = @date0
end
if @smallest is null or @date1 < @smallest
begin
    set @smallest = @date1
end
... (repeating 10 times)
Apart from the fact that the if statements could be written more cleverly (the null check is unnecessary after the first comparison), I was wondering whether creating an in-memory indexed table and letting the function return the first value would be more efficient.
Is there any documentation that I could read for this?

creating an in-memory indexed table
There is no point having an index on 10 records. Create a derived table (will sit in memory) as shown below, then run MIN across the table:
select @smallest = MIN(Adate)
from (
    select @date0 Adate union all
    select @date1 union all
    select @date2 union all
    -- ....
    select @date9) X
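As a variation on the derived table above, here is a minimal sketch that uses a table value constructor instead of UNION ALL (this assumes SQL Server 2008 or later; the three sample dates are made up for illustration). MIN ignores NULLs, which matches the behaviour of the original chain of if statements:

```sql
-- Hypothetical sample values; @date1 is deliberately NULL
DECLARE @date0 datetime = '2024-03-01',
        @date1 datetime = NULL,
        @date2 datetime = '2024-01-15',
        @smallest datetime;

-- One row per variable, then a plain aggregate over the derived table
SELECT @smallest = MIN(X.Adate)
FROM (VALUES (@date0), (@date1), (@date2)) AS X(Adate);

SELECT @smallest AS Smallest;  -- 2024-01-15 00:00:00.000
```

The NULL input is simply skipped by the aggregate, so no explicit null check is needed.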

Related

SQL - Add new column with outputs as values

Just wondering how I might go about adding the outputted results as a new column to an existing table.
What I'm trying to do is extract the date from a string which is in another column. I have the below code to do this:
Code
CREATE FUNCTION dbo.udf_GetNumeric
(
    @strAlphaNumeric VARCHAR(256)
)
RETURNS VARCHAR(256)
AS
BEGIN
    DECLARE @intAlpha INT
    SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
    BEGIN
        WHILE @intAlpha > 0
        BEGIN
            SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
            SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
        END
    END
    RETURN ISNULL(@strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
The issue is that I want the result to be placed in a new column in an existing table. I have tried the below code but no luck.
ALTER TABLE [Data_Cube_Data].[dbo].[DB_Test]
ADD reportDated nvarchar NULL;
insert into [DB].[dbo].[DB_Test](reportDate)
SELECT
(SELECT dbo.udf_GetNumeric(FileNamewithDate) from [DB].[dbo].[DB_Test])
The syntax should be an UPDATE, not an INSERT, because you want to update existing rows, not insert new ones:
UPDATE Data_Cube_Data.dbo.DB_Test -- you don't need square bracket noise
SET reportDate = dbo.udf_GetNumeric(FileNamewithDate);
But yeah, I agree with the others, the function looks like the result of a "how can I make this object the least efficient thing in my entire database?" contest. Here's a better alternative:
-- better, set-based TVF with no while loop
CREATE FUNCTION dbo.tvf_GetNumeric
(@strAlphaNumeric varchar(256))
RETURNS TABLE
AS
RETURN
(
    WITH cte(n) AS
    (
        SELECT TOP (256) n = ROW_NUMBER() OVER (ORDER BY @@SPID)
        FROM sys.all_objects
    )
    SELECT output = COALESCE(STRING_AGG(
        SUBSTRING(@strAlphaNumeric, n, 1), '')
        WITHIN GROUP (ORDER BY n), '')
    FROM cte
    WHERE SUBSTRING(@strAlphaNumeric, n, 1) LIKE '%[0-9]%'
);
Then the query is:
UPDATE t
SET t.reportDate = tvf.output
FROM dbo.DB_Test AS t
CROSS APPLY dbo.tvf_GetNumeric(t.FileNamewithDate) AS tvf;
Example db<>fiddle that shows this has the same behavior as your existing function.
The function
As I mentioned in the comments, I would strongly suggest rewriting the function; it'll perform terribly. Multi-statement functions can perform poorly, and you also have a WHILE loop, which will perform awfully. SQL is a set-based language, so you should be using set-based methods.
There are a couple of alternatives though:
Inlinable Scalar Function
SQL Server 2019 can inline scalar functions, so you could inline the above. I do, however, assume that your value can only contain the characters A-z and 0-9. If it can contain other characters, such as periods (.), commas (,), quotes (") or even white space ( ), or you're not on 2019, then don't use this:
CREATE OR ALTER FUNCTION dbo.udf_GetNumeric (@strAlphaNumeric varchar(256))
RETURNS varchar(256) AS
BEGIN
    RETURN TRY_CONVERT(int,REPLACE(TRANSLATE(LOWER(@strAlphaNumeric),'abcdefghijklmnopqrstuvwxyz',REPLICATE('|',26)),'|',''));
END;
GO
SELECT dbo.udf_GetNumeric('abs132hjsdf');
The LOWER is there in case you are using a case sensitive collation.
Inline Table Value Function
This is the better solution in my mind, and doesn't have the caveats of the above.
It uses a Tally to split the data into individual characters, and then reaggregates only the characters that are digits. Note that I assume you are using SQL Server 2017+ here:
DROP FUNCTION udf_GetNumeric; --Need to drop as it's a scalar function at the moment
GO
CREATE OR ALTER FUNCTION dbo.udf_GetNumeric (@strAlphaNumeric varchar(256))
RETURNS table AS
RETURN
    WITH N AS (
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL)) N(N)),
    Tally AS(
        SELECT TOP (LEN(@strAlphaNumeric))
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
        FROM N N1, N N2, N N3, N N4)
    SELECT STRING_AGG(CASE WHEN V.C LIKE '[0-9]' THEN V.C END,'') WITHIN GROUP (ORDER BY T.I) AS strNumeric
    FROM Tally T
         CROSS APPLY (VALUES(SUBSTRING(@strAlphaNumeric,T.I,1)))V(C);
GO
SELECT *
FROM dbo.udf_GetNumeric('abs132hjsdf');
Your table
You define reportDated as nvarchar; this means nvarchar(1). Your function, however, returns a varchar(256); this will rarely fit in an nvarchar(1).
Define the column properly:
ALTER TABLE [dbo].[DB_Test] ADD reportDated varchar(256) NULL;
If you've already created the column then do the following:
ALTER TABLE [dbo].[DB_Test] ALTER COLUMN reportDated varchar(256) NULL;
I note, however, that the column is called "dated", which implies a date value, but it's a (n)varchar; that sounds like a flaw.
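To see the default length of 1 in action, here is a small sketch (the variable name and sample value are made up for illustration). A variable declared as nvarchar without a length silently keeps only the first character; inserting an over-long value into a column declared that way would typically fail with a truncation error instead.

```sql
-- nvarchar with no length in a declaration means nvarchar(1)
DECLARE @v nvarchar = N'20180816';

SELECT @v      AS StoredValue,   -- only the first character, N'2', survives
       LEN(@v) AS StoredLength;  -- 1
```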
Updating the column
Use an UPDATE statement. Depending on the solution, this would be one of the following:
--Scalar function
UPDATE [dbo].[DB_Test]
SET reportDated = dbo.udf_GetNumeric(FileNamewithDate);
--Table Value Function
UPDATE DBT
SET reportDated = GN.strNumeric
FROM [dbo].[DB_Test] DBT
     CROSS APPLY dbo.udf_GetNumeric(DBT.FileNamewithDate) GN;

How do I use @@RowCount in a stored procedure, against rows in another table to work out the percentage?

Firstly, may I state that I'm aware of the ability to, e.g., create a new function, declare variables for rowcount1 and rowcount2, run a stored procedure that returns a subset of rows from a table, then determine the entire rowcount for that same table, assign it to the second variable and then 1 / 2 x 100....
However, is there a cleaner way to do this which doesn't result in numerous running of things like this stored procedure? Something like
select (count(*stored procedure name*) / select count(*) from table) x 100) as Percentage...
Sorry for the crap scenario!
EDIT: Someone has asked for more details. Ultimately, and to cut a very long story short, I wish to know what people would consider the quickest and most processor-concise method there would be to show the percentage of rows that are returned in the stored procedure, from ALL rows available in that table. Does that make more sense?
The code in the stored procedure is below:
SET @SQL = 'SELECT COUNT (DISTINCT c.ElementLabel), r.FirstName, r.LastName, c.LastReview,
CASE
WHEN c.LastReview < DateAdd(month, -1, GetDate()) THEN ''OUT of Date''
WHEN c.LastReview >= DateAdd(month, -1, GetDate()) THEN ''In Date''
WHEN c.LastReview is NULL THEN ''Not Yet Reviewed'' END as [Update Status]
FROM [Residents-'+@home_name+'] r
LEFT JOIN [CarePlans-'+@home_name+'] c ON r.PersonID = c.PersonID
WHERE r.Location = '''+@home_name+'''
AND CarePlanType = 0
GROUP BY r.LastName, r.FirstName, c.LastReview
HAVING COUNT(ELEMENTLABEL) >= 14
Thanks
Ant
I could not tell from your question if you are attempting to get the count and the result set in one query. If it is ok to execute the SP and separately calculate a table count then you could store the results of the stored procedure into a temp table.
CREATE TABLE #Results(ID INT, Value INT)
INSERT #Results EXEC myStoreProc @Parameter1, @Parameter2
-- Multiply by 100.0 first: COUNT(*) / COUNT(*) alone is integer division and would return 0
SELECT
Result = 100.0 * (SELECT COUNT(*) FROM #Results) / (SELECT COUNT(*) FROM [table])
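One detail worth checking in any percentage calculation like this is integer division: dividing one COUNT(*) by another yields an int, so a subset divided by the full count truncates to 0 before the multiplication happens. A quick sketch with made-up row counts (600 of 4600) shows the difference:

```sql
-- Hypothetical counts standing in for the two COUNT(*) results
DECLARE @subset int = 600,
        @total  int = 4600;

SELECT (@subset / @total) * 100  AS IntegerMath,  -- 0, because 600 / 4600 truncates to 0
       100.0 * @subset / @total  AS DecimalMath;  -- roughly 13.04
```

Putting the 100.0 literal first forces the whole expression into decimal arithmetic.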

How to extract every 7 characters of an nvarchar into another table?

I have an nvarchar(200) called ColumnA in Table1 that contains, for example, the value:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
I want to extract every 7 characters into Table2, ColumnB and end up with all of these values below.
ABCDEFG
BCDEFGH
CDEFGHI
DEFGHIJ
EFGHIJK
FGHIJKL
GHIJKLM
HIJKLMN
IJKLMNO
JKLMNOP
KLMNOPQ
LMNOPQR
MNOPQRS
NOPQRST
OPQRSTU
PQRSTUV
QRSTUVW
RSTUVWX
STUVWXY
TUVWXYZ
[Not the real table and column names.]
The data is being loaded to Table1 and Table2 in an SSIS Package, and I'm puzzling whether it is better to do the string handling in TSQL in a SQL Task or parse out the string in a VB Script Component.
[Yes, I think we're the last four on the planet using VB in Script Components. I cannot persuade the other three that this C# thing is here to stay. Although, maybe it is a perfect time to go rogue.]
You can use a recursive CTE calculating the offsets step by step and substring().
WITH
cte
AS
(
SELECT 1 n
UNION ALL
SELECT n + 1 n
FROM cte
WHERE n + 1 <= len('ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 7 + 1
)
SELECT substring('ABCDEFGHIJKLMNOPQRSTUVWXYZ', n, 7)
FROM cte;
If you have a physical numbers table, this is easy. If not, you can create a tally-on-the-fly:
DECLARE @string VARCHAR(100)='ABCDEFGHIJKLMNOPQRSTUVWXYZ';
--We create the tally using ROW_NUMBER against any table with enough rows.
WITH Tally(Nmbr) AS
(SELECT TOP(LEN(@string)-6) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT Nmbr
      ,SUBSTRING(@string,Nmbr,7) AS FragmentOf7
FROM Tally
ORDER BY Nmbr;
The idea in short:
The tally returns a list of numbers from 1 to n (n=LEN(@string)-6). This number is used in SUBSTRING to define the starting position.
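The same tally idea can be applied to every row of a table rather than a single variable. The following is only a sketch: the table variables @Table1 and @Table2 stand in for the real tables (whose names were not given), the second sample row is made up, and TOP (200) is chosen to match the nvarchar(200) column.

```sql
-- Stand-ins for the real source and target tables
DECLARE @Table1 TABLE (ColumnA nvarchar(200));
DECLARE @Table2 TABLE (ColumnB nvarchar(7));

INSERT INTO @Table1 (ColumnA)
VALUES (N'ABCDEFGHIJKLMNOPQRSTUVWXYZ'), (N'SHORT');

-- Numbers 1..200, enough start positions for an nvarchar(200) value
WITH Tally(Nmbr) AS
(
    SELECT TOP (200) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM master..spt_values
)
INSERT INTO @Table2 (ColumnB)
SELECT SUBSTRING(t.ColumnA, n.Nmbr, 7)
FROM @Table1 AS t
JOIN Tally AS n
  ON n.Nmbr <= LEN(t.ColumnA) - 6;  -- values shorter than 7 characters produce no rows

SELECT ColumnB FROM @Table2;
```

Note that LEN ignores trailing spaces, so values padded with blanks would yield fewer fragments than their stored length suggests.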
You can do it with T-SQL like this:
DECLARE C CURSOR LOCAL FOR SELECT [ColumnA] FROM [Table1]
OPEN C
DECLARE @Val nvarchar(200);
FETCH NEXT FROM C into @Val
WHILE @@FETCH_STATUS = 0 BEGIN
    DECLARE @I INTEGER;
    SELECT @I = 1;
    WHILE @I <= LEN(@Val)-6 BEGIN
        PRINT SUBSTRING(@Val, @I, 7)
        SELECT @I = @I + 1
    END
    FETCH NEXT FROM C into @Val
END
CLOSE C
DEALLOCATE C
Script Component solution
Assuming that the input Column name is Column1
Add a script component
Open the script component configuration form
Go to Inputs and Outputs Tab
Click on the Output icon and set the Synchronous Input property to None
Add an Output column (example outColumn1)
In the Script editor, use code similar to the following in the row processing function:
Dim idx As Integer = 0
' >= so that the final 7-character fragment is also emitted
While Row.Column1.Length >= idx + 7
    Output0Buffer.AddRow()
    Output0Buffer.outColumn1 = Row.Column1.Substring(idx, 7)
    idx += 1
End While

Insert values in temp table from same table and different variables

I have this code, that works, but I want to insert in the temp table the same values (DateTime and Value) from another variable (UBB_PreT_Line_LA.If_TotalInFeddWeight) present in the same table ([Runtime].[dbo].[History]). Then, I show the result in SQL Report Builder 3.0 in a table.
SET NOCOUNT ON
DECLARE @fechaItem DATETIME;
DECLARE @fechaFinTotal DATETIME;
SET @fechaItem = DateAdd(hh,7,@Fecha)
SET @fechaFinTotal = DateAdd(hh,23,@Fecha)
SET NOCOUNT OFF
DECLARE @tblTotales TABLE
(
    VALOR_FECHA DATETIME,
    VALOR_VALUE float
)
WHILE @fechaItem < @fechaFinTotal
BEGIN
    DECLARE @fechaFin DATETIME;
    SET @fechaFin = DATEADD(minute, 15, @fechaItem );
    INSERT INTO @tblTotales
    SELECT
        MAX( [DateTime] ),
        MAX( [Value] )
    FROM [Runtime].[dbo].[History]
    WHERE
        [DateTime] >= @fechaItem
        AND [DateTime] <= @fechaFin
        AND (History.TagName='UBB_PreT_Belt_PF101A.Time_Running')
    SET @fechaItem = @fechaFin;
END
SELECT TOP 64 VALOR_FECHA as Fecha,VALOR_VALUE as Valor
FROM @tblTotales
order by Valor ASC
What I want, is to join in a single query the result I get in these two tables, with the same query in which only the variable that is queried changes.
The purpose is to create a unique Dataset in Report Builder to display in a single table, the data of the two tables of the image. The 15 minute interval is because I just want to show the variation of the values every 15 minutes.
I have modified the code (Image_02), and with the Query Designer of the Report Builder I have obtained what is shown in the Image_03. The final goal would be to have the data of the second variable, in two more columns on the right (Fecha_Ton and Valor_Ton). How can I do it?
If I've understood your question correctly, I think that this query replaces your code entirely (and adds the second value):
declare @sample table (Datetime datetime not null, Value int not null,
    TagName varchar(50) not null)
insert into @sample (DateTime, Value, TagName) values
('2018-08-16T10:14:00',6,'UBB_PreT_Belt_PF101A.Time_Running'),
('2018-08-16T10:08:00',8,'UBB_PreT_Belt_PF101A.Time_Running'),
('2018-08-16T10:23:00',7,'UBB_PreT_Belt_PF101A.Time_Running'),
('2018-08-16T10:07:00',7,'UBB_PreT_Line_LA.If_TotalInFeddWeight')
declare @Fecha datetime
set @Fecha = '20180816'
select
    MAX(DateTime),
    MAX(CASE WHEN TagName='UBB_PreT_Line_LA.If_TotalInFeddWeight' THEN Value END) as Fed,
    MAX(CASE WHEN TagName='UBB_PreT_Belt_PF101A.Time_Running' THEN Value END) as Running
from
    @sample
where
    DateTime >= DATEADD(hour,7,@Fecha) and
    DateTime < DATEADD(hour,23,@Fecha) and
    TagName in ('UBB_PreT_Line_LA.If_TotalInFeddWeight',
                'UBB_PreT_Belt_PF101A.Time_Running')
group by DATEADD(minute,((DATEDIFF(minute,0,DateTime)/15)*15),0)
order by MAX(DateTime) asc
Results:
                        Fed         Running
----------------------- ----------- -----------
2018-08-16 10:14:00.000 7           8
2018-08-16 10:23:00.000 NULL        7
(You may want two separate dates following the same pattern using CASE as the values)
You shouldn't be building your data up row by agonising row [1]; you should find a way (such as that above) to express what the entire result set should look like as a single query. Let SQL Server itself decide whether it's going to do that by searching through the rows in date order, etc.
[1] There may be circumstances where you end up having to do this, but exhaust any likely set-based options first.
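The GROUP BY expression in the query above is what replaces the 15-minute loop: it rounds each timestamp down to the start of its quarter-hour by counting whole minutes since date 0 (1900-01-01), integer-dividing by 15, and multiplying back. A small sketch with one of the sample timestamps (the variable name is made up for illustration):

```sql
DECLARE @t datetime = '2018-08-16T10:23:00';

-- Integer division by 15 drops the remainder, so 10:23 falls into the 10:15 bucket
SELECT DATEADD(minute, (DATEDIFF(minute, 0, @t) / 15) * 15, 0) AS BucketStart;
-- 2018-08-16 10:15:00.000
```

Every row whose timestamp lies between 10:15:00 and 10:29:59 produces the same BucketStart, which is why the rows group together.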

Performance issue with larger resultsets MSSQL

I currently have a stored procedure in MSSQL where I execute a SELECT-statement multiple times based on the variables I give the stored procedure. The stored procedure counts how many results are going to be returned for every filter a user can enable.
The stored procedure isn't the issue; I transformed the select statement from the stored procedure into a regular select statement, which looks like:
DECLARE @contentRootId int = 900589
DECLARE @RealtorIdList varchar(2000) = ';880;884;1000;881;885;'
DECLARE @publishSoldOrRentedSinceDate int = 8
DECLARE @isForSale BIT= 1
DECLARE @isForRent BIT= 0
DECLARE @isResidential BIT= 1
--...(another 55 variables)...
--Table to be returned
DECLARE @resultTable TABLE
(
    variableName varchar(100),
    [value] varchar(200)
)
-- Create table based of inputvariable. Example: turns ';18;118;' to a table containing two ints 18 AND 118
DECLARE @RealtorIdTable table(RealtorId int)
INSERT INTO @RealtorIdTable SELECT * FROM dbo.Split(@RealtorIdList,';') option (maxrecursion 150)
INSERT INTO @resultTable ([value], variableName)
SELECT [Value], VariableName FROM(
    Select count(*) as TotalCount,
    ISNULL(SUM(CASE WHEN reps.ForRecreation = 1 THEN 1 else 0 end), 0) as ForRecreation,
    ISNULL(SUM(CASE WHEN reps.IsQualifiedForSeniors = 1 THEN 1 else 0 end), 0) as IsQualifiedForSeniors,
    --...(A whole bunch more SUM(CASE)...
    FROM TABLE1 reps
    LEFT JOIN temp t on
        t.ContentRootID = @contentRootId
        AND t.RealEstatePropertyID = reps.ID
    WHERE
        (EXISTS(select 1 from @RealtorIdTable where RealtorId = reps.RealtorID))
        AND (@SelectedGroupIds IS NULL OR EXISTS(select 1 from @SelectedGroupIdtable where GroupId = t.RealEstatePropertyGroupID))
        AND (ISNULL(reps.IsForSale,0) = ISNULL(@isForSale,0))
        AND (ISNULL(reps.IsForRent, 0) = ISNULL(@isForRent,0))
        AND (ISNULL(reps.IsResidential, 0) = ISNULL(@isResidential,0))
        AND (ISNULL(reps.IsCommercial, 0) = ISNULL(@isCommercial,0))
        AND (ISNULL(reps.IsInvestment, 0) = ISNULL(@isInvestment,0))
        AND (ISNULL(reps.IsAgricultural, 0) = ISNULL(@isAgricultural,0))
        --...(Around 50 more of these WHERE-statements)...
) as tbl
UNPIVOT (
    [Value]
    FOR [VariableName] IN(
        [TotalCount],
        [ForRecreation],
        [IsQualifiedForSeniors],
        --...(All the other things i selected in above query)...
    )
) as d
select * from @resultTable
The combination of a realtor ID and content ID gives me a default set of X records. When I choose a combination that gives me ~4600 records, the execution time is around 250ms. When I execute the statement with a combination that gives me ~600 records, the execution time is about 20ms.
I would like to know why this is happening. I tried removing all SUM(CASE in the select, I tried removing almost everything from the WHERE-clause, and I tried removing the JOIN. But I keep seeing the huge difference between the resultset of 4600 and 600.
Table variables can perform worse when the number of records is large. Consider using a temporary table instead. See When should I use a table variable vs temporary table in sql server?
Also, consider replacing the UNPIVOT with alternative SQL code. Writing your own T-SQL gives you more control and may even improve performance. See for example PIVOT, UNPIVOT and performance
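One common hand-written alternative to UNPIVOT is CROSS APPLY with a VALUES list. The sketch below is only an illustration: the table variable @counts and its numbers are made up, standing in for the single row of aggregates produced by the inner query. Whether it is actually faster for this workload would need to be measured.

```sql
-- Hypothetical one-row aggregate result, as the inner SELECT would produce
DECLARE @counts TABLE (TotalCount int, ForRecreation int, IsQualifiedForSeniors int);
INSERT INTO @counts VALUES (4600, 120, 35);

-- Turn each column into a (name, value) row
SELECT v.VariableName, v.[Value]
FROM @counts AS c
CROSS APPLY (VALUES
    ('TotalCount',            c.TotalCount),
    ('ForRecreation',         c.ForRecreation),
    ('IsQualifiedForSeniors', c.IsQualifiedForSeniors)
) AS v(VariableName, [Value]);
```

Unlike UNPIVOT, this form keeps rows whose value is NULL, so add a WHERE v.[Value] IS NOT NULL filter if the original behaviour is required.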
