Weird Coldfusion cfqueryparam - sql-server

SELECT DISTINCT Table3.ID
FROM Table1
INNER JOIN Table2 ON Table1.thisID = Table2.thisID
INNER JOIN Table3 ON Table2.ID = Table3.ID
WHERE ( Table1.ID IN
(
<cfqueryparam cfsqltype="cf_sql_integer"
value="#idlist#" list="yes">
)
)
AND Table2.ID IN
(
<cfqueryparam cfsqltype="cf_sql_integer"
value="#idlist2#" list="yes">
)
AND Table3.active=1
ORDER BY Table3.ID
When I run the above code it takes 11 to 15 seconds. If I remove the cfqueryparam, and just use the idlist2 variable, the query only takes 32 milliseconds.
Is this an issue with cfqueryparam, or am I doing something incorrect?

SQL performance can drop precipitously with long lists in an IN clause. If you can reduce the length of the lists, your query performance will likely improve.
When you use cfqueryparam, the values are passed to SQL Server as a list of parameters. When you do NOT use cfqueryparam, the list of values is hardcoded into the query string. This allows SQL Server's query execution plan to be optimized for that specific list of values. It also allows the plan to be cached from one execution to the next, so subsequent identical queries can execute very fast, as often happens during debugging and testing.
If this is a dynamic query, where the list of values changes each time the query is run, then you want to make sure to use cfqueryparam so that SQL Server isn't caching an execution plan for each one-time hardcoded query.
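As a rough illustration of the difference (a sketch, not a capture of what ColdFusion actually sends; the ID values and the parameter names @P1 to @P4 are made up for the example), the two forms look something like this from SQL Server's side:

```sql
-- Hard-coded list: the literal values are part of the statement text,
-- so the optimizer can build the plan for exactly these values.
SELECT DISTINCT Table3.ID
FROM Table1
INNER JOIN Table2 ON Table1.thisID = Table2.thisID
INNER JOIN Table3 ON Table2.ID = Table3.ID
WHERE Table1.ID IN (1, 2)
  AND Table2.ID IN (10, 20)
  AND Table3.active = 1
ORDER BY Table3.ID;

-- Parameterized list: one parameter per list element (assumed shape).
-- The statement text stays the same when only the values change.
EXEC sp_executesql
    N'SELECT DISTINCT Table3.ID
      FROM Table1
      INNER JOIN Table2 ON Table1.thisID = Table2.thisID
      INNER JOIN Table3 ON Table2.ID = Table3.ID
      WHERE Table1.ID IN (@P1, @P2)
        AND Table2.ID IN (@P3, @P4)
        AND Table3.active = 1
      ORDER BY Table3.ID',
    N'@P1 int, @P2 int, @P3 int, @P4 int',
    @P1 = 1, @P2 = 2, @P3 = 10, @P4 = 20;
```

Running both forms in SQL Server Management Studio with the actual execution plan enabled is one way to check whether the plans really differ for your data.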
Furthermore, cfqueryparam gives you a LOT of protection against SQL Injection attacks. From a security aspect, I recommend that all values being passed into a query should use cfqueryparam.
Finally, try running the query in SQL Server Management Studio and click the Show Actual Execution Plan button. It can help you determine if adding one or more indexes on your tables would help the execution time.
See also: the 'Missing Index' feature of SQL Server Management Studio.

Related

Does SQL Server execute a CTE expression if it is not used?

I'm trying to debug a query that is performing slowly. It has several with expressions which are left joined. When I remove the joins, it speeds up considerably.
Original query:
;with CTE as
(
Select *
from table1
)
SELECT *
FROM table2
LEFT JOIN CTE ON table2.CTEID
Better performing query:
;with CTE as
(
Select *
from table1
)
SELECT *
FROM table2
In the above, does it not execute CTE since it is not joined, or does it execute it regardless?
My guess is probably not: the query optimizer is pretty smart about not executing unnecessary work. Every query is different, and the query optimizer uses statistics about your actual data to decide how to evaluate it, so the only way to know for sure is to get SQL Server to tell you how it evaluated your query.
To do this, execute your query in SQL Server Management Studio with 'Include Actual Execution Plan' enabled and you will see clearly how it evaluated the query.
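Besides the graphical plan, a quick check is to turn on I/O statistics and see which tables are reported as read. A minimal sketch, using the table names from the question:

```sql
-- Report logical/physical reads per table in the Messages tab.
SET STATISTICS IO ON;

;WITH CTE AS
(
    SELECT *
    FROM table1
)
SELECT *
FROM table2;

SET STATISTICS IO OFF;
```

If table1 does not show up in the statistics output, the unused CTE was not evaluated for that run.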

Sql Server strange execution plan choices

I have a query on SQL Server 2012 SP3 which is built dynamically through an application. I have noticed a case where it runs slowly due to an inefficient execution plan, and I am trying to figure out the problem.
In this case the query that is being built has the following form
Select some columns from
(SELECT TOP 1 1 AS NEW FROM tr) AS AL
JOIN
(select some columns from a view join some tables
where column = 'a' or column = 'b' or column = 'c'...) t5
ON 1=1 WHERE [t5].[ROW_NUMBER] BETWEEN 0+1 AND 0+20 ORDER BY [t5].[ROW_NUMBER]
The outer select is used for pagination. The inner select, labeled t5, always runs fast when executed alone. However, combined with the outer select for pagination, it can be very slow depending on the number of values chosen in its WHERE clause and how selective it is (how few rows it fetches).
I have tried to change the query to improve performance, but when I do this I ruin the performance of the queries built by the application that are not selective (that fetch many rows).
From what I see, the execution plan depends on the values selected in the WHERE clause. Is there a way to help SQL Server choose the right execution plan so that it avoids reading useless rows?
I would appreciate any suggestion.

TSQL Join, Query Processing order and storage

Table structure:
CREATE TABLE dbo.Transactions
(
actid INT NOT NULL, --Account ID
tranid INT NOT NULL, -- Transaction ID
val MONEY NOT NULL, --- Transaction value
CONSTRAINT PK_Transactions PRIMARY KEY(actid, tranid)
);
The following inefficient query tries to determine the running balance after each transaction
SELECT
T1.actid, T1.tranid, T1.val,
SUM(T2.val) AS balance
FROM
dbo.Transactions AS T1
JOIN
dbo.Transactions AS T2 ON T2.actid = T1.actid
AND T2.tranid <= T1.tranid
GROUP BY
T1.actid, T1.tranid, T1.val;
I am not sure how the join is processed in this query. Is the join treated as a subquery, where the join is executed for each group (T1.actid, T1.tranid, T1.val)? Does that mean that if there are 10K transactions, 10K joined data sets are created by this query?
Open your query in SSMS, highlight it, and press Ctrl + L to view the estimated execution plan. This will show you how SQL Server plans to execute the query and will sometimes suggest indexes, etc.
It means you will get exactly the number of rows that satisfy the join.
Each row in T1 is processed and brings in the rows from T2 that satisfy the join conditions.
The join can be processed as a loop, hash, or merge join. Typically the optimizer will use hash.
The best thing to do is just run it. The output should tell a story.
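To see how the three physical join strategies mentioned above behave on this query, one option is to force each in turn with a query hint and compare the plans. This is a diagnostic sketch only, not a recommendation to keep hints in production code:

```sql
-- Force nested loops for every join in the statement.
-- Swap in OPTION (HASH JOIN) or OPTION (MERGE JOIN) to compare;
-- without a hint, the optimizer picks the strategy itself.
SELECT
    T1.actid, T1.tranid, T1.val,
    SUM(T2.val) AS balance
FROM dbo.Transactions AS T1
JOIN dbo.Transactions AS T2
    ON T2.actid = T1.actid
   AND T2.tranid <= T1.tranid
GROUP BY
    T1.actid, T1.tranid, T1.val
OPTION (LOOP JOIN);
```

Comparing the actual execution plans and run times of the hinted variants against the unhinted query shows which strategy the optimizer chose and what the alternatives would cost.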
The ONLY way to know is by 'studying' the query plan.
FYI: it seems to me your query is equivalent to
SELECT
T1.actid, T1.tranid, T1.val,
balance = (SELECT SUM(T2.val)
FROM dbo.Transactions AS T2
WHERE T2.actid = T1.actid
AND T2.tranid <= T1.tranid)
FROM
dbo.Transactions AS T1
To be honest, I prefer 'this' version because it looks more readable to me; I'm also expecting this version to be slightly 'leaner' as there is less need for sorting, but only actual testing will tell. It's sometimes surprising to see what the optimizer does behind the scenes! Again, the query plan will show.
Therefore, run both queries and compare the resulting query plans; those should give you an idea of their relative cost. Keep in mind that "cost" isn't always directly correlated with "time", so you might also want to check which one runs faster on your hardware and under 'typical load'. Also keep in mind that caching, for example, may have an effect here!
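A third form worth adding to that comparison, assuming SQL Server 2012 or later, is a windowed aggregate. This is a different technique from either query above, not a restatement of them, but it should return the same balances because (actid, tranid) is the primary key and so the ordering within each account is unique:

```sql
-- Running balance per account using a window function.
-- Each row sums val over all rows of the same account
-- up to and including the current transaction.
SELECT
    actid, tranid, val,
    SUM(val) OVER (PARTITION BY actid
                   ORDER BY tranid
                   ROWS UNBOUNDED PRECEDING) AS balance
FROM dbo.Transactions;
```

As with the other versions, the query plan and a timing run on realistic data are the way to confirm whether it is actually cheaper.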

Nested pass-through queries?

I have an ODBC connection to a SQL Server database, and because I'm returning large record sets with my queries, I've found that it's faster to run pass-through queries than native Access queries.
But I'm finding it hard to write and organize my queries because, as far as I know, I can't save several different pass-through queries and join them in another pass-through query. I have read-only access to this database, so I can't save stored procedures in SQL Server and then reference them in the pass-through.
For example, suppose I want to get only those entries with the maximum value of o_version from the following query:
select d.o_filename,d.o_version,parent.o_projectname
from dms_doc d
left join
dms_proj p
on
d.o_projectno=p.o_projectno
left join
dms_proj parent
on
p.o_parentno=parent.o_projectno
where
p.o_projectname='ABC'
and
lower(left(right(d.o_filename,4),3))='xls'
and
charindex('xyz',lower(d.o_filename))=0
I want to get only those entries with the maximum value of d.o_version. Ordinarily I would save this as a query called, e.g., abc, and then write another query abcMax:
select * from abc
inner join
(select o_filename,o_projectname,max(o_version) as maxVersion from abc
group by o_filename,o_projectname) abc2
on
abc.o_filename=abc2.o_filename
and
abc.o_projectname=abc2.o_projectname
where
abc.o_version=abc2.maxVersion
But if I can't store abc as a query that can be used in the pass-through query abcMax, then not only do I have to copy the entire body of abc into abcMax several times, but if I make any changes to the content of abc, then I need to make them to every copy that's embedded in abcMax.
The alternative is to write abcMax as a regular Access query that calls abc, but that will reduce the performance because the query is now being handled by ACE instead of SQL Server.
Is there any way to nest stored pass-through queries in Access? Or is creating stored procedures in SQL Server the only way to accomplish this?
If you have (or can get) permission to create temporary tables on the SQL Server then you might be able to use them to some advantage. For example, you could run one pass-through query to create a temporary table with the results from the first query (vastly simplified, in this example):
CREATE TABLE #abc (o_filename NVARCHAR(50), o_version INT, o_projectname NVARCHAR(50));
INSERT INTO #abc SELECT o_filename, o_version, o_projectname FROM dms_doc;
and then your second pass-through query could just reference the temporary table
select * from #abc
inner join
(select o_filename,o_projectname,max(o_version) as maxVersion from #abc
group by o_filename,o_projectname) abc2
on
#abc.o_filename=abc2.o_filename
and
#abc.o_projectname=abc2.o_projectname
where
#abc.o_version=abc2.maxVersion
When you're finished you can run a pass-through query to explicitly delete the temporary table
DROP TABLE #abc
or SQL Server will delete it for you automatically when your connection to the SQL Server closes.
For anyone still needing this info:
Pass-through queries allow the use of common table expressions (CTEs), as can also be used with Oracle SQL. The result is similar to creating multiple select queries, but much faster and more efficient, and without the clutter and confusion of "stacked" select queries, since you can see all the underlying queries in one view.
Example:
With Prep AS (
SELECT A.name,A.city
FROM Customers AS A
)
SELECT P.city, COUNT(P.name) AS clients_per_city
FROM Prep AS P
GROUP BY P.city
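Applied to the question's own queries, this approach might look like the following sketch: the body of abc is written once as a CTE and referenced twice, so a change to abc only has to be made in one place. The SQL is assembled from the two queries in the question and has not been run against the real schema:

```sql
WITH abc AS (
    -- Body of the original "abc" query, unchanged.
    SELECT d.o_filename, d.o_version, parent.o_projectname
    FROM dms_doc d
    LEFT JOIN dms_proj p
        ON d.o_projectno = p.o_projectno
    LEFT JOIN dms_proj parent
        ON p.o_parentno = parent.o_projectno
    WHERE p.o_projectname = 'ABC'
      AND LOWER(LEFT(RIGHT(d.o_filename, 4), 3)) = 'xls'
      AND CHARINDEX('xyz', LOWER(d.o_filename)) = 0
)
-- Body of the original "abcMax" query, now reading from the CTE.
SELECT abc.*
FROM abc
INNER JOIN (
    SELECT o_filename, o_projectname, MAX(o_version) AS maxVersion
    FROM abc
    GROUP BY o_filename, o_projectname
) abc2
    ON abc.o_filename = abc2.o_filename
   AND abc.o_projectname = abc2.o_projectname
WHERE abc.o_version = abc2.maxVersion;
```

Because this is a single SELECT statement, it needs only read access and can be saved as one pass-through query.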

How to force SQL Server to process CONTAINS clauses before WHERE clauses?

I have a SQL query that uses both standard WHERE clauses and full text index CONTAINS clauses. The query is built dynamically from code and includes a variable number of WHERE and CONTAINS clauses.
In order for the query to be fast, it is very important that the full text index be searched before the rest of the criteria are applied.
However, SQL Server chooses to process the WHERE clauses before the CONTAINS clauses, which causes table scans and makes the query very slow.
I'm able to rewrite this using two queries and a temporary table. When I do so, the query executes 10 times faster. But I don't want to do that in the code that creates the query because it is too complex.
Is there a way to force SQL Server to process the CONTAINS before anything else? I can't force a plan (USE PLAN) because the query is built dynamically and varies a lot.
Note: I have the same problem on SQL Server 2005 and SQL Server 2008.
You can signal your intent to the optimiser like this
SELECT
*
FROM
(
SELECT *
FROM
WHERE
CONTAINS
) T1
WHERE
(normal conditions)
However, SQL is declarative: you say what you want, not how to do it. So the optimiser may decide to ignore the nesting above.
You can force the derived table with CONTAINS to be materialised before the classic WHERE clause is applied. I won't guarantee performance.
SELECT
*
FROM
(
SELECT TOP 2000000000
*
FROM
....
WHERE
CONTAINS
ORDER BY
SomeID
) T1
WHERE
(normal conditions)
Try doing it with 2 queries without temp tables:
SELECT *
FROM table
WHERE id IN (
SELECT id
FROM table
WHERE contains_criterias
)
AND further_where_classes
As I noted above, this is NOT as clean a way to "materialize" the derived table as the TOP clause that @gbn proposed, but a loop join hint forces an order of evaluation, and has worked for me in the past (admittedly usually with two different tables involved). There are a couple of problems though:
The query is ugly
you still don't get any guarantees that the other WHERE parameters don't get evaluated until after the join (I'll be interested to see what you get)
Here it is though, given that you asked:
SELECT OriginalTable.XXX
FROM (
SELECT XXX
FROM OriginalTable
WHERE
CONTAINS XXX
) AS ContainsCheck
INNER LOOP JOIN OriginalTable
ON ContainsCheck.PrimaryKeyColumns = OriginalTable.PrimaryKeyColumns
AND OriginalTable.OtherWhereConditions = OtherValues
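For reference, the two-query, temporary-table approach that the question reports as roughly 10 times faster could be sketched as follows. All object names here (dbo.Documents, DocID, Body, Status) and the search term are hypothetical placeholders, and a full-text index on Body is assumed:

```sql
-- Step 1: run only the full-text search and keep the matching keys.
SELECT d.DocID
INTO #hits
FROM dbo.Documents AS d
WHERE CONTAINS(d.Body, N'"search term"');

-- Step 2: apply the ordinary WHERE conditions to the matches only.
SELECT d.*
FROM dbo.Documents AS d
INNER JOIN #hits AS h
    ON h.DocID = d.DocID
WHERE d.Status = 1;

-- Clean up the temporary table.
DROP TABLE #hits;
```

Splitting the work this way guarantees the order of evaluation, which the single-statement rewrites above can only encourage.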
