I am using an Azure SQL database in my project, and the same set of queries is executed very frequently.
Recently I received a performance recommendation saying that non-parameterized queries are causing performance issues, and suggesting that I execute the following statement in my database:
ALTER DATABASE [TestDB] SET PARAMETERIZATION FORCED
I have read that forced parameterization may improve the performance of certain databases by reducing the frequency of query compilations and recompilations.
I also know that stored procedures are executable code that is automatically cached and shared among users, which can prevent recompilations.
Please help me with the questions listed below.
1) Would turning on forced parameterization for the database work better than turning the frequently used queries into stored procedures?
2) Is it safe to enable forced parameterization on my database?
1) Would turning on forced parameterization for the database work better than turning the frequently used queries into stored procedures?
No. Forced parameterization is a workaround for applications that don't properly parameterize their queries. It's better to use parameters for frequently-run queries, and hard-coded values where you want the plan to be based on the individual value.
E.g.:
select *
from Orders
where CustomerId = @customerId
  and Active = 1;
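If the application layer is hard to change, one hedged way to get parameterized plans from ad-hoc T-SQL is sp_executesql; this is only a sketch using the table and column from the example above, and the value 42 is an arbitrary placeholder:

-- Minimal sketch: the statement text stays constant, so its plan is compiled
-- once and reused for every @customerId value passed in.
EXEC sp_executesql
    N'select * from Orders where CustomerId = @customerId and Active = 1',
    N'@customerId int',
    @customerId = 42;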
It's hard to say whether it will work better without further testing, but if the advisor is telling you that enabling forced parameterization will improve performance, then you should definitely try it. Here's why:
You can apply this recommendation quickly and easily by clicking on the Apply command. Once you apply this recommendation, it will enable forced parameterization within minutes on your database and it starts the monitoring process which approximately lasts for 24 hours. After this period, you will be able to see the validation report that shows CPU usage of your database 24 hours before and after the recommendation has been applied. SQL Database Advisor has a safety mechanism that automatically reverts the applied recommendation in case a performance regression has been detected.
More info here:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-advisor#parameterize-queries-recommendations
Related
I have a few base tables that get inserted into and updated twice a week. I created indexes on these tables a long time ago.
I'm applying logic on top of these tables in a stored procedure (without any parameters) and creating a final output table.
I'm scheduling this stored procedure twice a week using a SQL Server Agent job.
It is running slowly now (50 minutes), whereas if I run the stored procedure manually it runs faster (15 - 18 minutes).
Do I have to drop the indexes whenever an insert or update happens on the base tables and recreate them afterwards?
If so, do I have to do it every week?
What is the effect on the performance of SQL Server Agent jobs?
Indexes do require maintenance, but how often depends entirely on how much data is changed and how those changes are ordered. You can google around for any number of scripts to check your index fragmentation and to defragment the indexes. Usually, even for larger databases, weekly or nightly maintenance is more than enough.
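As a starting point, a hedged sketch of checking fragmentation with sys.dm_db_index_physical_stats (the 5% / 30% thresholds below are the commonly cited rules of thumb, not hard limits):

-- Lists each index in the current database with its fragmentation and a
-- rough suggestion; LIMITED mode keeps the inspection itself cheap.
SELECT  OBJECT_NAME(ips.object_id)          AS table_name,
        i.name                              AS index_name,
        ips.avg_fragmentation_in_percent,
        CASE WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
             WHEN ips.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
             ELSE 'OK' END                  AS suggested_action
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN    sys.indexes AS i
        ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;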
Anyway, the fact that the execution time differs depending on how you run it points to two possible causes:
Parameterization, or the SET properties used by the connection.
If your procedure uses parameters and you run the script manually, giving the parameters values as you go, then SQL Server knows exactly which values you're using and can optimize the query execution on the spot to use the correct indexes. If your agent calls the procedure with the same parameters, the process is different: SQL Server may not know which values are being used, so it has to use covering indexes or, worse yet, full table scans (reading all the data in the whole table, rendering the indexes useless) to make sure it finds all the relevant data for the query. Google "SQL Server parameterization" and you can find out more.
The SET properties, on the other hand, control specific session options that are applied automatically when you connect directly to the database via Management Studio. When you use an agent job, that may not be the case. This can also result in a different plan which takes far more time.
Both of these cases depend on your database settings and the way your procedure works, so we have to guess here.
But typically, you need to set the following properties at the beginning of the script in an agent job to match the session properties used in your regular Management Studio session:
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
All of the terms here can be googled, and I suggest you do so. Those articles can explain these things far better than I have time for here, especially given that - no disrespect intended - you're relatively new to SQL Server, so explaining these things with suitable terminology here is difficult. :)
I have a few "inefficient" queries that I am trying to debug on Azure SQL (v12). The problem I have is that after the query executes for the first time (taking many seconds), Azure appears to cache the query / execution plan. I have done some research, and several people have suggested that adding and removing a column will clear the cache, but this doesn't seem to work. If I leave the server alone for a few hours / overnight and re-run the query, it takes its usual time to execute, but once again the cache is then in place - this makes it very hard to optimise my query. Does anyone know how to force Azure SQL not to cache my queries / execution plans?
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE is designed to help with this problem.
https://learn.microsoft.com/en-us/sql/t-sql/statements/alter-database-scoped-configuration-transact-sql?view=sql-server-2017
This is closest to the DBCC FREEPROCCACHE you have in SQL Server but is scoped to a database instead of the server instance. This does not prevent caching of query plans - it just invalidates the current cache entries.
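A usage sketch (run it in the database whose cache you want to clear; it needs ALTER permission on the database):

-- Invalidates all cached plans for the current database only - the Azure SQL
-- counterpart of the instance-scoped DBCC FREEPROCCACHE.
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;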
Please note that Query Store is there to help you in Azure SQL (it is on by default). It stores a history of plan choices and per-plan performance. So, if a prior plan in that history performs better for your application, you can force it using SSMS if you'd prefer the query optimizer to pick that plan each time your query compiles. One common reason for what you are seeing is parameter sensitivity in the plan choice: the optimizer uses the passed parameter value to generate the query plan, assuming it represents a common pattern for that query. If that value is actually not close to a common value (in terms of how frequently it occurs in the table), you can sometimes compile and cache a plan that is not the better one on average for your application.
Query store has an overview here:
https://learn.microsoft.com/en-us/sql/relational-databases/performance/monitoring-performance-by-using-the-query-store?view=sql-server-2017
Note that Azure SQL also has an automated mechanism that tries to force prior plans if it notices a performance regression. It is somewhat conservative, however, so it may not kick in for every single regression until it sees an obvious pattern over time. So, while you can force things in SSMS, you can also potentially just wait (assuming this is the issue you were seeing).
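If you do want to force a plan yourself from T-SQL rather than SSMS, a hedged sketch (the query_id and plan_id values below are placeholders you would look up first in the Query Store views):

-- Find candidate query/plan pairs recorded by Query Store...
SELECT qsq.query_id, qsp.plan_id, qsqt.query_sql_text
FROM sys.query_store_query      AS qsq
JOIN sys.query_store_plan       AS qsp  ON qsp.query_id       = qsq.query_id
JOIN sys.query_store_query_text AS qsqt ON qsqt.query_text_id = qsq.query_text_id;

-- ...then pin the plan you prefer (1, 1 are placeholder ids).
EXEC sp_query_store_force_plan @query_id = 1, @plan_id = 1;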
Setup:
Using SQL Server 2008 R2.
We've got a stored procedure that has been intermittently running very long. I'd like to test a theory that parameter sniffing is causing the query engine to choose a bad plan.
Question:
How can I copy the query's execution plans from one database to another (test) database?
Note:
I'm fully aware that this may not be a parameter sniffing issue. However, I'd like to go through the motions of creating a test plan and using it, if at all possible. Therefore please do not ask me to post code and/or table schema, since this is irrelevant at this time.
Plans are not portable; they bind to object IDs. You can use plan guides, but they are strictly tied to the database. What you have to do is test on a restored backup of the same database. On a restored backup you can use a plan guide. But for the results to be relevant, the physical characteristics of the machines should be similar (CPUs, RAM, disks).
Normally, though, one does not need to resort to such shenanigans as copying the plans. Looking at the actual execution plans, all the answers are right there.
Have you tried using the OPTIMIZE FOR hint? With it you can tune your procedure more easily, and without the risk that a plan copied from another database would be inappropriate due to differences between those databases (if copying the plan were even possible).
http://www.mssqltips.com/sqlservertip/1354/optimize-parameter-driven-queries-with-sql-server-optimize-for-hint/
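A hedged sketch of the hint (the procedure, table, and pinned value 42 are all illustrative, not taken from the question): the plan is compiled for the stated value rather than for whatever value happens to be sniffed on the first execution.

-- Hypothetical procedure showing OPTIMIZE FOR; swap in your own object names.
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId INT
AS
BEGIN
    SELECT *
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR (@CustomerId = 42));
    -- OPTIMIZE FOR UNKNOWN (SQL Server 2008+) instead builds the plan from
    -- average density statistics rather than a specific value.
END;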
I apologize in advance for not having all of the specifics available, but the machine is building an index (probably for a good while still) and is almost completely unresponsive.
I've got a table on SQL Server 2005 with a good number of columns, maybe 20, but a mammoth number of rows (tens of millions, more likely hundreds of millions). In order to simplify the amount of JPA work I'd need to do to access it, I created a view that contained the bits I was interested in. The view was created as:
CREATE VIEW vwTable AS
SELECT bigtable.ID, bigtable.external_identification, mediumtable.hostname,
       CONVERT(VARCHAR, bigtable.datefield, 121) AS datefield
FROM schema.bigtable JOIN schema.mediumtable ON bigtable.joinID = mediumtable.ID;
When I want to select from the view, I do:
SELECT * FROM vwTable WHERE external_identification = 'some string';
This works just fine in SQL Management Studio. The external_identification column has a non-unique, non-clustered index in bigtable. This also worked just fine for our remotely executing Java program in our test environment. Now that we're a day or two away from production, the code has been changed a bit (although the fundamental JPA NamedQuery is still straightforward), but we have a new SQL Server installation on new hardware; the test version was on a 32-bit single-core machine, the new hardware is 64-bit multi-core.
Whenever I try to run the code that uses this view on the new hardware, it either hangs indefinitely on the first call of this query or times out if I have a timeout specified. After doing some digging, something like:
SELECT status, command, wait_type, last_wait_type FROM sys.dm_exec_requests;
confirmed that the query was running, but showed it in the state:
suspended, SELECT, CXPACKET, CXPACKET
for as long as I cared to wait for it. Whenever I ran the exact same query from within the Management Studio, it completed immediately. So I did some research, and found out this is due to waiting on some kind of concurrent operation to start/finish. In an attempt to circumvent that, I set the server-wide MAXDOP to 1 (disabled concurrency). After that, the query still hangs, but the sys.dm_exec_requests would show:
suspended, SELECT, PAGEIOLATCH_SH, PAGEIOLATCH_SH
This indicates that it's some sort of HD/scanning issue. While certainly the machine is less responsive than I'd expect for newer hardware, I wouldn't expect this query (even over the view) to require much scanning, since the column I'm searching by is indexed in the underlying table and it works if I run it locally. But just because I'm out of ideas and under the gun, I'm adding indexes to the view; first I have to add the unique clustered index (over ID) before I can attempt to add the non-unique non-clustered index over external_identification.
I'm the only one using this database; when I select from sys.dm_exec_requests the only two results are the query I'm actively inspecting and the select from sys.dm_exec_requests query. So it's not like it's under legitimately heavy, or even at all concurrent, load.
But I suspect I'm grasping at straws. I'm no DBA, and every time I have to interact with SQL Server outside of querying it, it baffles my intuition. Does anyone have any ideas why a query executed remotely would immediately go into a suspended state while the same query run locally executes immediately?
Wow, this one caught me straight out of left field. It turns out that, by default, the MSSQL JDBC driver sends its String datatypes as Unicode, which the table/view might not be defined to match. In our case, the columns and indexes were not Unicode, so SQL Server would perform a full table scan for each lookup.
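A hedged T-SQL illustration of the effect (vwTable and the literal come from the question; whether the seek actually survives depends on the column's data type and collation, so treat this as an illustration rather than a guaranteed repro):

DECLARE @unicodeParam NVARCHAR(50);
DECLARE @ansiParam    VARCHAR(50);
SET @unicodeParam = N'some string';  -- what the JDBC driver sends by default (Unicode)
SET @ansiParam    = 'some string';   -- what the VARCHAR index column expects

SELECT * FROM vwTable WHERE external_identification = @unicodeParam; -- implicit conversion, may scan
SELECT * FROM vwTable WHERE external_identification = @ansiParam;    -- matching type, can seek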
In our test environment, the table was small enough that this didn't matter, so I was tricked into thinking it worked fine. In retrospect, I'm glad it didn't -- I can't stand it when computers give the illusion of inconsistency.
When I added this little parameter to the end of my JDBC connection string:
jdbc:sqlserver://[IP]:1433;databaseName=[db];sendStringParametersAsUnicode=false
things immediately and magically started working. Sorry for the slightly misleading question (I barely even mentioned JPA), but I had no idea what the cause was and really did believe it was something SQL Server side. Task Manager didn't report heavy CPU/Memory usage while the query was suspended, so I just thought it was idling even though it was really under heavy disk usage.
More info about MSSQL JDBC and Unicode can be found where I stumbled across the solution, at http://server.pramati.com/blog/2010/06/02/perfissues-jdbcdrivers-mssqlserver/ . Thanks, Ed, for that detailed shot in the dark -- it may not have been the problem, but I certainly learned a lot (and fast!) about MSSQL's gritty parts!
It is likely that the query run in SSMS and the query run by your application are using different query plans - from the wait types you're seeing in dm_exec_requests, it sounds like the plan created for the application is doing a table scan, whereas the plan for SSMS is using an index seek.
This is possible because the SSMS and application database connections likely use different connection options, some of which are used as a key to the database plan cache.
You can find out which options your application is using by running a default SQL Server Profiler trace against the server; the first command after the connection is created will be a number of SET... options:
SET DATEFORMAT dmy
SET ANSI_NULLS ON
...
I suspect this list will be different between your application and your SSMS connection - a common candidate is SET ARITHABORT {ON|OFF}, since that forms part of the key of the cached plan.
If you run the SET... commands in an SSMS window before executing the query, the same (bad) plan as is being used by the application should then be picked up.
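For example, a hedged repro sketch: most client libraries (including JDBC and ADO.NET) connect with ARITHABORT OFF, while SSMS defaults to ON, so this alone is often enough to pull the application's cached plan instead of compiling a fresh one.

SET ARITHABORT OFF;  -- mimic the application connection's setting
SELECT * FROM vwTable WHERE external_identification = 'some string';
SET ARITHABORT ON;   -- restore the SSMS default afterwards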
Assuming this demonstrates the problem, the next step is to work out how to prevent the bad plan getting into cache. It's difficult to give generic instructions about how to do this, since there are a few possible causes.
It's a bit of a scattergun approach (there are other, more targeted ways to attempt to resolve this problem, but they require a more detailed understanding of the issue than I have now), but one thing to try is to add OPTION (RECOMPILE) to the end of your query - this forces a new plan to be generated for every execution, and should prevent the bad plan being reused:
SELECT * FROM vwTable WHERE external_identification = 'some string' OPTION (RECOMPILE);
Assuming you can replicate the bad performance in SSMS using the steps above, you should be able to test this there.
Beware that this can have negative performance consequences if the query is executed very frequently (since each recompilation requires CPU) - this depends on the workload of your application and will need testing.
A couple of other thoughts:
Check the schemas between the test and production systems; this might be as simple as a missing index from one of the tables in the production database, although given that SSMS queries perform OK this is unlikely.
You should re-enable parallelism by removing the server-wide MAXDOP = 1 setting, since this will limit the performance of your system overall; the problem is almost certainly the query plan, not parallelism (see the sketch after this list).
You also need to beware of the consequences of adding indexes to the view - doing so effectively materialises the view, which (given the size of the table) will carry a lot of storage overhead, and the indexes will also need to be maintained when INSERT/UPDATE/DELETE statements run against the base tables. Indexing the view is probably unnecessary given that (from SSMS) you know it's possible for the query to perform well.
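A minimal sketch of resetting that server-wide setting (assumes sysadmin rights; 0 is the default, meaning SQL Server chooses the degree of parallelism itself):

EXEC sp_configure 'show advanced options', 1;   -- MAXDOP is an advanced option
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 0;
RECONFIGURE;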
Related
LINQ-to-SQL vs stored procedures?
I have heard a lot of talk back and forth about the advantages of stored procedures being precompiled. But what are the actual performance differences between LINQ and stored procedures for selects, inserts, updates, and deletes? Has anyone run any tests to see if there is any major difference? I'm also curious whether a greater number of transactions makes a difference.
My guess is that LINQ statements get cached after the first transaction and performance is probably going to be nearly identical. Thoughts?
LINQ should be close in performance, but I disagree with the statement above that says LINQ is faster. It can't be faster; it could possibly be just as fast, though, all other things being equal.
I think the difference is that a good SQL developer, who knows how to optimize and uses stored procedures, is always going to have a slight edge in performance. If you are not strong on SQL, let LINQ figure it out for you, and your performance is most likely going to be acceptable. If you are a strong SQL developer, use stored procedures to squeeze out a bit of extra performance if your app requires it.
It certainly is possible, if you write terrible SQL, to code up some stored procedures that execute slower than LINQ would, but if you know what you are doing, stored procedures and a DataReader can't be beat.
Stored procedures are faster than LINQ queries because they can take full advantage of SQL Server's features.
When a stored procedure is executed again, the database uses the cached execution plan to execute it,
while a LINQ query is compiled each and every time.
Hence, a LINQ query takes more time to execute than a stored procedure.
Stored procedures are also a better way of writing complex queries compared to LINQ.
LINQ queries can (and should) be precompiled as well. I don't have any benchmarks to share with you, but I think everyone should read this article for reference on how to do it. I'd be especially interested to see a comparison of precompiled LINQ queries to sprocs.
There is not much difference, except that LINQ can degrade when you have a lot of data and need some database tuning.
LINQ2SQL queries will not perform any differently from any other ad-hoc parameterized SQL query, other than the possibility that the generator may not optimize the query in the best fashion.
The common perception is that stored procedures perform better than ad-hoc SQL queries. However, this is false:
SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans.
-- SQL Server Books Online
Given the above, and the fact that LINQ generates ad-hoc queries, my conclusion is that there is no performance difference between stored procedures and LINQ. And I am also apt to believe that SQL Server wouldn't move backwards in terms of query performance.
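To see this for yourself, a hedged check against the plan cache (it only reads DMVs; what shows up depends on whatever has run on the instance): ad-hoc statements and stored procedures both appear as cached plans with reuse counts.

-- Ad-hoc ('Adhoc'), parameterized ('Prepared') and stored procedure ('Proc')
-- plans all live in the same cache and are all reusable.
SELECT cp.objtype, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared', 'Proc')
ORDER BY cp.usecounts DESC;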
LINQ should be used at the business logic layer, on top of views created in SQL Server or Oracle. LINQ lets you insert another layer for business logic, maintenance of which is in the hands of coders rather than SQL developers. It will definitely not perform as well as SQL because it's not precompiled, and you can do lots of different things in stored procedures.
But you can definitely add a programming layer and segregate the business logic from the core SQL tables and database objects using LINQ.
See LINQ-to-SQL vs stored procedures for help - I think that post has most of the info you need.
Unless you are trying to get every millisecond out of your application, whether to use a stored procedure or LINQ may need to be determined by what you expect developers to know and by maintainability.
Stored procedures will be fast, but when you are actively developing an application you may find that the ease of using LINQ is a positive, as you can change your query and the anonymous type created from LINQ very quickly.
Once you are done writing the application and you know what you need, and start to look at optimizing it, then you can look at other technologies and if you have good unit testing then you should be able to compare different techniques and determine which solution is best.
You may find this comparison of various ways for .NET 3.5 to interact with the database useful.
http://toomanylayers.blogspot.com/2009/01/entity-framework-and-linq-to-sql.html
I don't think I would like having my database layer in compiled code. It should be a separate layer, not combined. As I develop and make use of Agile, I am constantly changing the database design, and the process goes very fast. Adding columns, removing columns, or creating new tables in SQL Server is as easy as typing into Excel. Normalizing or de-normalizing a table is also pretty fast at the database level. Now with LINQ I would also have to change the object representation every time I make a change to the database, or live with it not truly reflecting how the data is stored. That is a lot of extra work.
I have heard that LINQ entities can shelter your application from database changes, but that doesn't make sense. The database design and application design need to go hand in hand. If I normalize several tables or do some other redesign of the database, I wouldn't want a LINQ object model that no longer reflects the actual database design.
And what about the advantage of tweaking a view or stored procedure? You can do that directly at the database level without having to re-compile code and release it to production. If I have a view that shows data from several tables and I decide to change the database design, all I have to do is change that view. All my code remains the same.
Consider a database table with a million entries, joined to another table with a million entries... do you honestly think that doing this on the webserver (be it in LINQ or ad-hoc SQL) is going to be faster or more efficient than letting SQL Server do it on the database?
For simple queries, LINQ is obviously better as it will be pre-compiled, giving you the advantage of type checking, etc. However, for any intensive database operations, report building, or bulk data analysis, stored procedures will win hands down.