SQL Server Optimization: Function in SELECT calculated before or after WHERE Clause?

I was just wondering: in SQL Server, in a statement like this:
SELECT A.Field1, dbo.someFunction(A.IdentifierID) AS Field
FROM Table A
WHERE A.IdentifierID = 1000
will it call someFunction for all the rows in the table, or will it call it once?
Thanks!

It will be called for every row of the result.

The number of calls depends on the resulting rows, i.e., if the query outputs 100 rows then it will be called 100 times, not once for every row in the table.
But if it were in the WHERE clause, then it would be evaluated for each row.

It will call it for every row, but since you are selecting a specific one (or so it seems), it will be called once in your example.

As vlad commented, you should try it and see. There are of course two options:
1. It could call it just once, knowing IdentifierID will always be 1000 (but that might not be the right thing to do...).
2. It could call it for each row; maybe the function has side effects?
I strongly suspect #2.

Compliant SQL servers are supposed to appear as if they apply the WHERE clause before they apply the SELECT clause. So it should appear to evaluate the function at most once for each row that satisfies WHERE A.IdentifierID = 1000.
But database engines are free to optimize however they like, as long as they give you the same results the standards require. So in your case, since you're selecting a single ID number, it might only have to evaluate the function once.
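If you want to check empirically, one option is to look at the function's execution count (a sketch, assuming SQL Server 2016 or later, where sys.dm_exec_function_stats is available, and that the UDF is dbo.someFunction):
SELECT OBJECT_NAME(fs.object_id) AS function_name,
       fs.execution_count          -- how many times the scalar UDF actually ran
FROM sys.dm_exec_function_stats AS fs
WHERE fs.object_id = OBJECT_ID('dbo.someFunction');
Comparing execution_count before and after running your query shows whether the engine invoked the function once or once per qualifying row.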

Related

How to match a substring exactly in a string in SQL server?

I have a column workId in my table which has values like:
W1/2009/12345, G2/2018/2345
Now a user wants to get this particular id, G2/2018/2345. I am using the LIKE operator in my query as below:
select * from u_table as s where s.workId like '%2345%'
It is giving me both of the above-mentioned workIds. I tried the following query:
select * from u_table as s where s.workId like '%2345%' and s.workId not like '_2345'
This query is also giving me the same result.
Could anyone please provide me with the correct query? Thanks!
Why not use the existing delimiters to match your criteria?
select *
from u_table
where concat('/', workId, '/') like concat('%/', '2345', '/%');
Ideally, of course, your 3 separate values would be 3 separate columns; delimiting multiple values in a single column goes against first normal form and prevents the optimizer from performing an efficient index seek, forcing a scan of all rows every time, which hurts performance and concurrency.
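If restructuring isn't possible, here is a minimal sketch for pulling the segments apart in place (my own addition, assuming every workId has exactly three '/'-delimited parts and contains no '.' characters; PARSENAME numbers parts from the right):
select workId,
       parsename(replace(workId, '/', '.'), 3) as part1,  -- e.g. 'G2'
       parsename(replace(workId, '/', '.'), 2) as part2,  -- e.g. '2018'
       parsename(replace(workId, '/', '.'), 1) as part3   -- e.g. '2345'
from u_table;
With the last segment isolated you can filter with exact equality (part3 = '2345') rather than a double-wildcard LIKE.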

Using subquery for filtering with IN throws wrong data [duplicate]

As always, there will be a reasonable explanation for my surprise, but till then....
I have this query
delete from Photo where hs_id in (select hs_id from HotelSupplier where id = 142)
which executes just fine (later I found out that the entire Photo table was empty).
But here is the strange thing: there is no field hs_id in HotelSupplier; it is called hs_key!
So when I execute the last part
select hs_id from HotelSupplier where id = 142
separately (select that part of the query with the mouse and hit F5), I get an error, but when I use it in the IN clause, it doesn't!
I wonder if this is normal behaviour?
It is taking the value of hs_id from the outer query.
It is perfectly valid to have a query that doesn't project any columns from the selected table in its select list.
For example
select 10 from HotelSupplier where id = 142
would return a result set with as many rows as matched the where clause and the value 10 for all rows.
Unqualified column references are resolved from the closest scope outwards, so this just gets treated as a correlated subquery.
The result of this query will be to delete all rows from Photo where hs_id is not null, as long as HotelSupplier has at least one row where id = 142 (so that the subquery returns at least one row).
It might be a bit clearer if you consider what the effect of this is
delete from Photo where Photo.hs_id in (select Photo.hs_id)
This is of course equivalent to
delete from Photo where Photo.hs_id = Photo.hs_id
By the way this is far and away the most common "bug" that I personally have seen erroneously reported on Microsoft Connect. Erland Sommarskog includes it in his wishlist for SET STRICT_CHECKS ON
It's a strong argument for keeping column names consistent between tables. As @Martin says, the SQL syntax allows column names to be resolved from the outer query when there's no match in the inner query. This is a boon when writing correlated subqueries, but can trip you up sometimes (as here).
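A defensive habit that makes this class of bug fail fast is to alias every table and qualify every column in the subquery. A sketch of the delete as it was presumably intended (assuming the matching column really is HotelSupplier.hs_key):
delete p
from Photo as p
where p.hs_id in
    (select hs.hs_key             -- fully qualified: a typo here is a compile error
     from HotelSupplier as hs
     where hs.id = 142);
With every column qualified, writing hs.hs_id by mistake raises an "Invalid column name" error instead of silently binding to the outer Photo table.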

SQL Server function optimization (takes too long)

I have designed a function that works with an SSRS report. I have a drop down parameter that lists multiple items and only one can be selected. This drop down gets its data from a query/data set, and I added one line of data that says 'All' in it. So the dropdown will look like this:
Item1
Item2
Item3
All
And then in the function, I make one small change in the where clause:
...where (@parameterName = 'All' or table.name = @parameterName)
The problem with this is that table.name has about 50000 rows of data. When the user selects 'All' in the drop down, I would have thought that since the first condition in the brackets is true, the next condition (after the 'or') would not even need to be evaluated. But it causes the query to run for 5-20 minutes and still produce no result after that long. If I simply change the where clause to
...where (@parameterName = 'All')
the function runs in less than a second when the user selects 'All' from the drop down.
I implement a similar concept with another filter, but I guess because the table that parameter uses is much smaller (about 90 rows), it doesn't take long.
Is there basically a way to have an optional parameter that is not expensive to evaluate?
EDIT: I will add that the parameter is declared as nvarchar(max). Will changing this to something smaller help the query?
What you have there is a catch-all query. Consider adding OPTION (RECOMPILE) to the end of your statement. This'll force the engine to recreate the plan each time it runs the query, meaning it won't use poor choices based on a previous run where your variable has a value like 'Item1'.
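A minimal sketch of the pattern (dbo.reportTable and its columns are placeholders for whatever the report function actually queries):
select t.name, t.amount
from dbo.reportTable as t
where (@parameterName = 'All' or t.name = @parameterName)
option (recompile);  -- build a fresh plan using the actual parameter value
With OPTION (RECOMPILE), when @parameterName = 'All' the optimizer can simplify the whole predicate away at compile time; when a specific item is passed, it can seek an index on name instead of reusing a one-size-fits-all plan.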

TSQL Prevent "FAST N" output by default

When I execute a T-SQL query or a stored procedure that returns a large number of records (1M+), by default it begins to display results while the query is still executing.
Is there a way to prevent this and postpone returning the results until query execution is complete?
If you add a column like the following to the returned dataset, it almost certainly will not be able to do a FAST(N):
.., MAX(prevColumn) OVER(PARTITION BY 1) As Dummy, ...
Where prevColumn is any other column that you are already returning, especially if it's not an indexed column.
Using ORDER BY or an aggregate function in the SELECT will also help.
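For example, a minimal sketch (dbo.bigTable and its columns are placeholders):
select t.id,
       t.payload,
       max(t.id) over (partition by 1) as dummy  -- window aggregate over the full set: blocking
from dbo.bigTable as t
order by t.payload;  -- a sort with no supporting index is also blocking
Both the windowed MAX and the unsupported ORDER BY force the engine to consume the entire result before emitting the first row, so a FAST N style streaming plan is ruled out.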

What makes a SQL statement sargable?

By definition (at least from what I've seen) sargable means that a query is capable of having the query engine optimize the execution plan that the query uses. I've tried looking up the answers, but there doesn't seem to be a lot on the subject matter. So the question is, what does or doesn't make an SQL query sargable? Any documentation would be greatly appreciated.
For reference: Sargable
The most common thing that will make a query non-sargable is to include a field inside a function in the where clause:
SELECT ... FROM ...
WHERE Year(myDate) = 2008
The SQL optimizer can't use an index on myDate, even if one exists. It will literally have to evaluate this function for every row of the table. Much better to use:
WHERE myDate >= '20080101' AND myDate < '20090101'
Some other examples:
Bad: Select ... WHERE isNull(FullName,'Ed Jones') = 'Ed Jones'
Fixed: Select ... WHERE ((FullName = 'Ed Jones') OR (FullName IS NULL))
Bad: Select ... WHERE SUBSTRING(DealerName,1,4) = 'Ford'
Fixed: Select ... WHERE DealerName Like 'Ford%'
Bad: Select ... WHERE DateDiff(mm,OrderDate,GetDate()) >= 30
Fixed: Select ... WHERE OrderDate < DateAdd(mm,-30,GetDate())
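When the expression genuinely can't be rewritten, a common workaround (not from the examples above) is to index a computed column so the function's result becomes seekable. A sketch, assuming a dbo.Orders table with OrderDate and OrderID columns:
alter table dbo.Orders
    add OrderYear as year(OrderDate);  -- deterministic computed column, so it can be indexed
create index IX_Orders_OrderYear on dbo.Orders (OrderYear);
select OrderID
from dbo.Orders
where OrderYear = 2008;  -- now answerable with an index seek
The filter is still logically a function of OrderDate, but the precomputed, indexed column gives the optimizer a sorted structure it can seek into.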
Don't do this:
WHERE Field LIKE '%blah%'
That causes a table/index scan, because the LIKE value begins with a wildcard character.
Don't do this:
WHERE FUNCTION(Field) = 'BLAH'
That causes a table/index scan.
The database server will have to evaluate FUNCTION() against every row in the table and then compare it to 'BLAH'.
If possible, do it in reverse:
WHERE Field = INVERSE_FUNCTION('BLAH')
This will run INVERSE_FUNCTION() against the parameter once and will still allow use of the index.
In this answer I assume the database has sufficient covering indexes. There are enough questions about this topic.
A lot of the time the sargability of a query is determined by the tipping point of the related indexes. The tipping point defines the difference between seeking and scanning an index while joining one table or result set onto another. A seek is of course much faster than scanning a whole table, but when you have to seek a lot of rows, a scan could make more sense.
So among other things a SQL statement is more sargable when the optimizer expects the number of resulting rows of one table to be less than the tipping point of a possible index on the next table.
You can find a detailed post and example here.
For an operation to be considered sargable, it is not sufficient for it to merely be able to use an existing index. In the example above, adding a function call against an indexed column in the where clause would still most likely take some advantage of the defined index: it will "scan", i.e. retrieve all values from that column (index), and then eliminate the ones that do not match the filter value provided. That is still not efficient enough for tables with a high number of rows.
What really defines sargability is the query's ability to traverse the b-tree index using the binary search method, which relies on half-set elimination of the sorted items. In SQL Server, this shows up in the execution plan as an "index seek".
