Does a cfquery become a prepared statement as long as there's at least one cfqueryparam? Or are there other conditions?
What happens when the ORDER BY clause or FROM clause is dynamic? Would every unique combination become a separate prepared statement?
And what happens when we're doing a cfloop around an INSERT, with every value cfqueryparam'ed, and invoking the cfquery with a different number of iterations each time?
Any potential problems with too many prepared statements?
How does the DB handle prepared statements? Will they be converted into something similar to stored procedures?
Under what circumstances should we not use prepared statements?
Thank you!
I can answer some parts of your question:
A query will become a preparedStatement as long as there is at least one <cfqueryparam>. In the past I have added a
where 1 = <cfqueryparam value="1">
to queries which didn't have any dynamic parameters, in order to get them run as preparedStatements.
Most DBs handle preparedStatements similarly to stored procedures, just held temporarily rather than long-term; however, the details are likely to be DB-specific.
Assuming you are using the drivers supplied with ColdFusion, if you turn on the 'Log Activity' checkbox in the advanced panel of the DataSource setup, you'll get very detailed information about how CF is interacting with the DB, including when it is creating a new preparedStatement and when it is re-using one. I'd recommend trying this out for yourself, as so many factors are involved (DB setup, driver, CF version, etc.). If you do use the DB logging, restart CF before running your test code, so you can see it creating the prepared statements; otherwise you'll just see it re-using statements by ID, without seeing what those statements are.
In addition, if you are asking about execution plans, then there is more involved than just the number of PreparedStatements generated. It is a huge topic and very database-dependent. I do not have a DBA's grasp on it, but I can answer a few of the questions about MS SQL.
What happens when the ORDER BY clause or FROM clause is dynamic? Would
every unique combination become a separate prepared statement?
The base SQL is different, so you will end up with a separate execution plan for each unique ORDER BY clause.
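For example, these two statements are identical except for the ORDER BY column, so each would get its own plan cache entry (the table, column, and parameter names are made up for illustration):
-- Hypothetical example: same base query, two different ORDER BY clauses,
-- resulting in two separate execution plans in the cache
SELECT Id, Name FROM dbo.Widgets WHERE CategoryId = @cat ORDER BY Name
SELECT Id, Name FROM dbo.Widgets WHERE CategoryId = @cat ORDER BY Id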
And what happens when we're doing a cfloop around an INSERT, with every
value cfqueryparam'ed, and invoking the cfquery with a different number
of iterations each time?
MS SQL should reuse the same plan for all iterations because only the parameters change.
The sys.dm_exec_cached_plans view is very useful for seeing what plans are cached and how often they are reused.
SELECT p.usecounts, p.cacheobjtype, p.objtype, t.text
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t
ORDER BY p.usecounts DESC
To clear the cache first, use DBCC FLUSHPROCINDB. Obviously do not use it on a production server.
DECLARE @ID int
SET @ID = DB_ID(N'YourTestDatabaseName')
DBCC FLUSHPROCINDB( @ID )
According to this SO answer, a view should provide the same performance as using the same query directly:
Is querying over a view slower than executing SQL directly?
I have a view where this is not true.
This query targeting a view
SELECT
*
FROM
[Front].[vw_Details] k
WHERE
k.Id = 970435
Takes 10 seconds to complete. Copying the query out of the view and adding WHERE k.Id = 970435 to it completes in less than 1 second. The view is nothing special: 4 LEFT JOINs and a few CASE expressions to clean up data.
How can I figure out what the problem is, or what information do I need to add to this question to make it answerable?
Update 1:
SQL Server Version: 12.0.4436.0
Query plan for view: https://pastebin.com/RY40Ab0k
Query plan for select: https://pastebin.com/gwahhgpu
Your query plan is no longer visible, but if you look in the plan you will most likely see a warning triangle complaining about the cardinality estimate and/or an implicit conversion. What it means is that you are joining tables in a way where your keys are hard for the SQL engine to guess.
It is instant when you run the query directly, probably because the engine doesn't need to guess what the size of your key is.
For example, with
k.Id = 970435
SQL Server already knows that it is looking for 970435, a 6-digit number.
It can eliminate every key that doesn't start with 9 and doesn't have 6 digits. :)
However, in a view, it has to build the plan in a way that accounts for the unknown, because it doesn't know what kind of key it may hold.
See the Microsoft query processing architecture guide for various examples and scenarios that may help you:
https://learn.microsoft.com/en-us/sql/relational-databases/query-processing-architecture-guide?view=sql-server-ver15
If you are always looking for an int, one workaround is to force the type with a CAST or CONVERT. It may cause a performance penalty depending on your data, but it is a trick in the toolbox to tell SQL Server not to build the query plan as if the key were varchar(max) or something along those lines.
SELECT *
FROM [Front].[vw_Details] k
WHERE TRY_CONVERT(INT,k.Id) = 970435
Use a stored procedure to return results; stored procedures use indexes, whereas views often don't.
Or
use a table function and query the table function.
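If you go the table-function route, a minimal sketch might look like this: the view's SELECT moves inside an inline table-valued function so the Id filter is part of the plan from the start. The function name and the table/column names below are placeholders, not from the question:
-- Hypothetical inline table-valued function replacing the view
CREATE FUNCTION [Front].[fn_Details] (@Id INT)
RETURNS TABLE
AS
RETURN
(
    SELECT d.Id, d.Name, x.ExtraInfo   -- stand-ins for the view's columns
    FROM dbo.Details d
    LEFT JOIN dbo.Extras x ON x.DetailId = d.Id
    WHERE d.Id = @Id                   -- filter pushed inside the function
);
GO
-- Usage:
SELECT * FROM [Front].[fn_Details](970435);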
I'm using SQL Server and I want to benefit from query plan reuse. I found this document, but it remains unclear to me whether the plan for my query is being reused or not.
declare @su dbo.IntCollection -- TABLE (Value int not null)
insert into @su values (1),(2),(3) -- ... about 500 values
update mt
set mt.MyField = getutcdate()
from MyTable mt
join @su vsu on mt.Id = vsu.Value -- Clustered PK, int
Technically the text of the batch differs from run to run, as different values are being inserted into @su, but the text of the update query remains the same. If I were using .NET I would just pass a table-valued parameter to the SQL command, but I'm using Python, and it looks like there is no way to pass a table parameter from my program.
Question 1: does the plan for the update query get reused? Or does the optimizer see that the text of the batch is different and not analyze the individual queries within it? In other words, is it the same as
update MyTable
set MyField = getutcdate()
where Id in (1, 2, 3 ...)
Question 2: I can force the SQL text to remain the same between calls by introducing a stored procedure with a table parameter, but will I benefit from it?
Question 3: how can I identify, for a given query, whether its plan was reused or computed again?
Question 4: should I worry about all of the above in my specific case? After all, it is just an update of a table on a bunch of IDs...
Just answers to your questions..
Question 1: does the plan for the update query get reused? Or does the optimizer see that the text of the batch is different and not analyze the individual queries within it? In other words, is it the same as
Both of your update statements are treated as new queries, since SQL Server calculates a hash of the query text, and any simple change means it will not match the old hash.
Question 2: I can force the SQL text to remain the same between calls by introducing a stored procedure with a table parameter, but will I benefit from it?
This sounds like a good approach to me, rather than a bunch of INs.
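A minimal sketch of that procedure, reusing the dbo.IntCollection type from the question (the procedure name is made up):
-- Hypothetical procedure wrapping the update; a table-valued
-- parameter must be declared READONLY
CREATE PROCEDURE dbo.UpdateMyTableByIds
    @su dbo.IntCollection READONLY
AS
BEGIN
    UPDATE mt
    SET mt.MyField = GETUTCDATE()
    FROM MyTable mt
    JOIN @su vsu ON mt.Id = vsu.Value;
END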
Question 3: how can I identify, for a given query, whether its plan was reused or computed again?
select ec.usecounts
from sys.dm_exec_cached_plans ec
cross apply sys.dm_exec_sql_text(ec.plan_handle) txt
where txt.text like '%your query text%'
Question 4: should I worry about all of the above in my specific case? After all, it is just an update of a table on a bunch of IDs...
It seems to me you are worrying too much. There are many rules that enforce query plan reuse, as pointed out in the white paper you referred to, so most of the time the query plan will be reused.
I would start worrying about plan reusability only when I see high SQL Compilations/sec coupled with Batch Requests/sec.
Taken from the answer here: https://dba.stackexchange.com/questions/19544/how-badly-do-sql-compilations-impact-the-performance-of-sql-server
SQL Compilations/sec is a good metric, but only when coupled with Batch Requests/sec. By itself, compilations per sec doesn't really tell you much.
You are seeing 170. If batch requests per sec is only 200 (a little exaggerated for effect) then yes, you need to get to the bottom of the cause (most likely an overuse of ad hoc querying and single-use plans). But if your batch requests per sec is measuring about 5000, then 170 compilations per sec is not bad at all. It's a general rule of thumb that Compilations/sec should be 10% or less of total Batch Requests/sec.
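Both counters can be read from SQL Server itself via sys.dm_os_performance_counters. Note the values are cumulative totals since startup, so sample twice and take the difference over the interval:
-- Snapshot of the two counters (cumulative since startup)
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('SQL Compilations/sec', 'Batch Requests/sec');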
Here is a simpler version of one of the SELECT statements from my procedure:
select
...
from
...
where
((@SearchTextList IS NULL) OR
(SomeColumn IN (SELECT SomeRelatedColumn From #SearchTextListTable)))
@SearchTextList is just a varchar variable that holds a comma-separated list of strings. #SearchTextListTable is a single-column temp table that holds the search text values.
This query takes 30 seconds to complete, which is a performance issue in my application.
If I get rid of the first condition (i.e. if I remove the OR condition), it takes just ONE second.
select
...
from
...
where
SomeColumn IN (SELECT SomeRelatedColumn From #SearchTextListTable)
Can somebody please explain why there is this much difference?
What's going on internally in the SQL Server engine?
Thanks.
Since you said that the SQL is fast when you don't have the OR specified, I assume the table has an index on SomeColumn and the number of rows in #SearchTextListTable is small. When that is the case, SQL Server can decide to use the index for searching the rows.
If you specify the OR clause, so the query is like this:
((@SearchTextList IS NULL) OR
(SomeColumn IN (SELECT SomeRelatedColumn From #SearchTextListTable)))
SQL Server can't create a plan where the index is used, because plans are cached and must also be usable when @SearchTextList is NULL.
There are usually two ways to improve this: either use dynamic SQL or recompile the plan for each execution.
To get the plan recompiled, just add OPTION (RECOMPILE) to the end of the query. Unless this query is executed really often, that should be an OK solution. The downside is that it causes slightly higher CPU usage because the plans can't be re-used.
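Applied to the query from the question (the column list is elided as in the original):
select
...
from
...
where
((@SearchTextList IS NULL) OR
(SomeColumn IN (SELECT SomeRelatedColumn From #SearchTextListTable)))
option (recompile)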
The other option is to create dynamic SQL and execute it with sp_executesql. Since at that point you know whether @SearchTextList is NULL, you can just omit the SomeColumn IN ... condition when it's not needed. Be aware of SQL injection in this case: don't just concatenate the variable values into the SQL string, but use variables in the SQL and pass them as parameters to sp_executesql.
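A sketch of the dynamic SQL route (the column list is again elided as in the question). A temp table created in the calling scope is visible inside sp_executesql, so only the optional condition needs to be appended:
-- Build the SQL with only the pieces that apply, then execute it
DECLARE @sql nvarchar(max) = N'select ... from ... where 1 = 1'

IF @SearchTextList IS NOT NULL
    SET @sql += N' and SomeColumn IN (SELECT SomeRelatedColumn FROM #SearchTextListTable)'

EXEC sp_executesql @sql  -- pass any scalar values via the @params argument, never by concatenation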
If you only have this one optional condition in the SQL, you could also make two separate procedures for the two cases and execute the appropriate one from the original procedure.
I have a simple SELECT statement with a couple of columns referenced in the WHERE clause. Normally I do these simple ones in the VB code (set up a Command object, set CommandType to text, set CommandText to the SELECT statement). However, I'm seeing timeout problems. We've optimized just about everything we can with our tables, etc.
I'm wondering if there'd be a big performance hit just because I'm doing the query this way, versus creating a simple stored procedure with a couple of params. I'm thinking maybe the inline code forces SQL Server to do extra work compiling, creating a query plan, etc., which wouldn't occur if I used a stored procedure.
An example of the actual SQL being run:
SELECT TOP 1 * FROM MyTable WHERE Field1 = @Field1 ORDER BY ID DESC
A well formed "inline" or "ad-hoc" SQL query - if properly used with parameters - is just as good as a stored procedure.
But this is absolutely crucial: you must use properly parametrized queries! If you don't - if you concatenate together your SQL for each request - then you don't benefit from these points...
Just like with a stored procedure, upon first execution a query execution plan must be found, and that execution plan is then cached in the plan cache.
That query plan is reused over and over again if you call your inline parametrized SQL statement multiple times, and the "inline" SQL query plan is subject to the same cache eviction policies as the execution plan of a stored procedure.
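For example, a properly parametrized client command for the query above typically reaches the server as something like the call below (assuming Field1 is an int), so every execution with a different value maps to the same cached plan:
-- What a parametrized command looks like on the server side
exec sp_executesql
    N'SELECT TOP 1 * FROM MyTable WHERE Field1 = @Field1 ORDER BY ID DESC',
    N'@Field1 int',
    @Field1 = 42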
Just from that point of view - if you really use properly parametrized queries - there's no performance benefit for a stored procedure.
Stored procedures have other benefits (like being a "security boundary" etc.), but just raw performance isn't one of their major plus points.
It is true that the db has to do the extra work you mention, but that should not result in a big performance hit (unless you are running the query very, very frequently..)
Use SQL Profiler to see what is actually getting sent to the server. Use Activity Monitor to see if there are other queries blocking yours.
Your query couldn't be simpler. Is Field1 indexed? As others have said, there is no performance hit associated with "ad-hoc" queries.
For where to put your queries, this is one of the oldest debates in tech. I would argue that your requests "belong" to your application. They will be versioned with your app, tested with your app, and should disappear when your app disappears. Putting them anywhere other than in your app is walking into a world of pain. But for goodness' sake, use .sql files, compiled as embedded resources.
Inline query:
A SELECT statement that is part of the FROM clause of another statement is called an inline query.
Cannot take parameters.
Not a database object.
Procedure:
Can take parameters.
Is a database object.
Can be reused globally if the same action needs to be performed.
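For example (illustrative table names):
-- Inline query: the derived-table SELECT inside the FROM clause
SELECT t.CustomerId, t.Total
FROM (SELECT CustomerId, SUM(Amount) AS Total
      FROM dbo.Orders
      GROUP BY CustomerId) t
WHERE t.Total > 1000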
I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT type and beta.id is an INT type. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using WHERE id = @id. The function returns approximately 11000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in Activity Monitor.
Look at the query plan. My guess is that there is a table scan or more in the execution plan. This will cause huge amounts of I/O for the few records you get in the result.
You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can also give you hints on how to improve your query, for instance by suggesting indexes.
If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Engine Tuning Advisor as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Make sure there aren't too many queries in the trace - it takes a long time to build its advice.
I'm not an expert on it, but have had some luck with it in the past.
Do you need to use a function? Can you re-write the entire thing into a stored procedure in which you pass in @ID as a parameter?
Even if your table has indexes, passing @ID as a variable to the WHERE clause can prevent them from being used, potentially greatly increasing the amount of time the query takes to run.
The reason the indexes may not be used is because the Query Optimizer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch, only one pass is made over the Transact-SQL code, preventing the Query Optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot re-write the SQL.
It might also be possible, since this just started happening, that the indexes have become fragmented and need to be rebuilt.
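A quick way to check fragmentation, and rebuild if needed (the table name is a placeholder):
-- Check fragmentation of one table's indexes
SELECT i.name, s.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'LIMITED') s
JOIN sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id

-- Rebuild all indexes on the table if fragmentation is high
ALTER INDEX ALL ON dbo.MyTable REBUILD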
I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested: put the results in a temp table and join on that.
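In sketch form, using the query from the question:
-- Materialize the function's output first, then join on the temp table
SELECT * INTO #stuff FROM [DifferentDatabase].dbo.fnGetStuff(@id)

SELECT @id,
    alpha.A, alpha.B, alpha.C,
    beta.X, beta.Y, beta.Z,
    alpha.P, alpha.Q
FROM #stuff beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id

DROP TABLE #stuff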
Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id
-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be the source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]? I think you'll find a bit more scalability in joining to a flatter, indexed table than to a costly function.
Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan, look for a "SCAN" on an index or a table. That is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc)