NHibernate.Exceptions.GenericADOException Timeout from code but not from DB - sql-server

I have an NHibernate query (which is populating an ExtJS grid).
It fires 2 queries off to the database: one to get the record count (for paging purposes) and the other to get the top N rows to fill the grid with.
From code, I'm consistently getting an exception on the SELECT count(*) statement.
NHibernate.Exceptions.GenericADOException:
Failed to execute multi criteria[SQL:
SELECT count(*) as y0_ FROM RecordView this_ inner join ProcessesView
process1_ on this_.ProcessId=process1_.Id inner join BusinessModelsView
businessmo3_ on process1_.BusinessModelId=businessmo3_.Id inner join BatchesView
batch2_ on this_.BatchId=batch2_.Id WHERE this_.ProcessId = ?;
] ---> System.Data.SqlClient.SqlException: Timeout expired.
The timeout period elapsed prior to completion of the operation or the server
is not responding.
However, if I take that exact query, drop it into an SSMS window and run it, it executes in under 1 second.
Is NHibernate doing anything "funny" under the hood here? Are there execution plan/cache issues? I'm at a complete loss as to why this is occurring.

Whenever I encountered this error, the reason was locking (never performance). There were two sessions opened (accidentally). Both started a transaction, and one of them locked the table.
The problem could be a session that was not disposed, or an "unintended" singleton holding an open session.
This answer is not as straightforward as I would like, but I am sure about the direction, because I experienced the same (and was guilty).
BTW: as Oskar Berggren found out from you, the 30-second timeout would be related to <property name="command_timeout">30</property>. I am sure that if you set it to 60, 120, ... it will still not be enough, because of the lock.
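One way to check the locking theory is to ask SQL Server, while the NHibernate call is hanging, which requests are blocked and by whom. A minimal sketch using the standard dynamic management views, run from a separate SSMS window (it assumes the login has VIEW SERVER STATE permission):

```sql
-- List requests that are currently blocked and the session blocking each one.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text      AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If the count query appears in the result, blocking_session_id identifies the session holding the lock, which would point to a second open session or transaction rather than a slow plan.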

Your two queries are not handled in the same way by SQL Server:
Your NH query was compiled on its first execution, based on table statistics and on the first value of the parameter. The generated query plan is then reused for all subsequent calls, without considering the parameter value.
Your SQL query (where, I guess, you replace the ? with an actual value) gets a separate compilation for each value, based on statistics and on the value.
Your first NH compilation might have produced a query plan that is effective for the first value, but not in the general case.
First, I would suggest that:
You count on a projection (say, on the main table id), as it is slightly more efficient than count(*), allowing the DB to work only on indexes when possible.
You check that you are not missing any index necessary to your query.
You check that all your table statistics are up to date.
If this does not improve execution time, this post offers some options (recompile might be the right one):
Query executed from Nhibernate is slow, but from ADO.NET is fast
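To test whether a per-value plan makes the difference, the parameterised form of the count query can be run in SSMS with a recompile hint. This is only a sketch: the int type of ProcessId and the value 42 are assumptions, and the local variable merely stands in for the parameter NHibernate sends.

```sql
-- Assumption: ProcessId is an int; 42 is a placeholder value.
DECLARE @processId int = 42;

SELECT count(*) AS y0_
FROM RecordView this_
INNER JOIN ProcessesView process1_ ON this_.ProcessId = process1_.Id
INNER JOIN BusinessModelsView businessmo3_ ON process1_.BusinessModelId = businessmo3_.Id
INNER JOIN BatchesView batch2_ ON this_.BatchId = batch2_.Id
WHERE this_.ProcessId = @processId
OPTION (RECOMPILE);  -- compile a fresh plan using the current value
```

If this version is consistently fast while the application call is not, a stale cached plan becomes a more likely suspect than the query text itself.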

Related

Clustered DB - Stado - Slow first query

Using PostgreSQL in a clustered database (stado) on two nodes. I managed to configure stado coordinator and nodes agents successfully but when I try running a heavy query, the first time it takes too long to show results then after that it was fast.
When I restart the server it goes slow again. It's like stado does some caching or something. I thought the problem was because of stado initialization and thus configured agents but still the problem exists! Any ideas?
EDIT
Query:
SELECT id,position,timestamp
FROM table t1
WHERE id <> 0
AND ST_Intersects(ST_Buffer_Meters(ST_SetSRID(
ST_MakePoint(61.4019, 15.218205), 4326), 1160006), position)
AND timestamp BETWEEN '2013-10-01' AND '2014-01-01';
Explain:
Step: 0
_______
Target: CREATE UNLOGGED TABLE "TMPTT7_1" ( "XCOL1" INT) WITHOUT OIDS
SELECT: SELECT count(*) AS "XCOL1" FROM "t1" WHERE "t1"."timestamp" BETWEEN '2013-10-01' AND '2014-01-01' AND ("t1"."id"<>0) AND ST_Intersects(ST_Buffer_Meters(ST_SetSRID(
ST_MakePoint(61.4019, 15.218205), 4326), 1160006), "t1"."position")
Step: 1
_______
Select: SELECT SUM("XCOL1") AS "EXPRESSION6" FROM "TMPTT7_1"
Drop:
TMPTT7_1
Two reasons.
Caching, obviously. When a query is executed the first time with cold cache, obviously the cache is populated. That goes for system cache as well as database cache, both work together, at least in standard Postgres. Can make a huge difference.
Query plan caching, possibly. To a much lesser degree. If you run the same query in a single session repeatedly, plans for PL/pgSQL functions for instance are cached.
Depending on your type of connection to the database, there may also be network latency, which may be higher for the first call.
Caching in memory is the reason, that is correct. A good tip for this type of situation is to "warm-up" the database each time you restart it with a script that runs the query (or a similar query that still accesses the same data). In some cases I have seen instances where several "warm-up" queries are run after any type of restart, then users still have a good experience. You will still have to wait for the warm-up query to finish after a restart, but at least it will not be a user waiting for that.
The other possibility is that you are running a non-indexed query; you should check for that. If it is indexed and accesses a reasonable amount of data by a key, then it should be fast (even without the warm-up, for most queries). This is a very common problem and easy to miss. Use the Postgres EXPLAIN command; it will show you how the query is being performed against the database (i.e., with an index or without).
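Both suggestions above (cold cache and missing index) can be checked with one command. A sketch, assuming it is run directly against one node's PostgreSQL instance rather than through the stado coordinator, and that the table is t1 as in the Explain output:

```sql
-- ANALYZE executes the query and reports real timings;
-- BUFFERS reports how many pages came from cache ("hit") versus disk ("read").
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, "position", "timestamp"
FROM t1
WHERE id <> 0
  AND ST_Intersects(ST_Buffer_Meters(ST_SetSRID(
      ST_MakePoint(61.4019, 15.218205), 4326), 1160006), "position")
  AND "timestamp" BETWEEN '2013-10-01' AND '2014-01-01';
```

Running it once right after a restart and once again afterwards should show whether the difference is mostly in the "read" counts (cold cache) and whether the plan uses an index scan or a sequential scan.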

Intermittent slow query on SQL Server 2008

I am developing a system which periodically (4-5 times daily) runs a select statement, that normally takes less than 10 seconds but periodically has taken up to 40 minutes.
The database is on Windows Server 2008 + SQL Server 2008 R2; both 64bit.
There is a service on the machine running the database which polls the database and generates values for records which require it. These records are then periodically queried using a multi-table join select from a service on a second machine, written in C++ (VS 2010) using the MFC CRecordset class to extract the data. An example of the query causing the problem is shown below.
SELECT DISTINCT "JobKeysFrom"."Key" AS "KeyFrom","KeysFrom"."ID" AS "IDFrom",
"KeysFrom"."X" AS "XFrom","KeysFrom"."Y" AS "YFrom","JobKeysTo"."Key" AS "KeyTo",
"KeysTo"."ID" AS "IDTo","KeysTo"."X" AS "XTo","KeysTo"."Y" AS "YTo",
"Matrix"."TimeInSeconds","Matrix"."DistanceInMetres","Matrix"."Calculated"
FROM "JobKeys" AS "JobKeysFrom"
INNER JOIN "JobKeys" AS "JobKeysTo" ON
("JobKeysFrom"."Key"<>"JobKeysTo"."Key") AND
("JobKeysFrom"."JobID"=531) AND
("JobKeysTo"."JobID"=531)
INNER JOIN "Keys" AS "KeysFrom" ON
("JobKeysFrom"."Key"="KeysFrom"."Key") AND ("JobKeysFrom"."Status"=4)
INNER JOIN "Keys" AS "KeysTo" ON
("JobKeysTo"."Key"="KeysTo"."Key") AND ("JobKeysTo"."Status"=4)
INNER JOIN "Matrix" AS "Matrix" ON
("Matrix"."IDFrom"="KeysFrom"."ID") AND ("Matrix"."IDTo"="KeysTo"."ID")
ORDER BY "JobKeysFrom"."Key","JobKeysTo"."Key"
I have tried the following
checked the indexes and all seem correct and they are active and are being used according to the query
the design advisor comes back with no suggestions
I have tried defragging the indexes and data
rebuilt the database from scratch by exporting the data and reimporting it in a new database.
ran the profiler on it and found that when it goes wrong it seems to do many millions (up to 100 million) of reads rather than a few hundred thousand.
ran the database on a different server
During the time it is running the query, I can run exactly the same query in the management studio window and it will be back to running in 10 seconds. The problem does not seem to be lock, deadlock, CPU, disk or memory related as it has done it when the machine running the database was only running this one query. The server has 4 processors and 16 gb of memory to run it in. I have also tried upgrading the disks to much faster ones and this had no effect.
It seems to me that it is almost as though the database receives the query, starts to process it and then goes to sleep for 40 minutes or runs the query without using the indexes.
When it takes a long time it will eventually finish and send the query results (normally about 70-100000 records) back to the calling application.
Any help or suggestions would be gratefully received, many thanks
This sounds very much like parameter sniffing.
When a stored procedure is invoked and there is no existing execution plan in the cache matching the set options for the connection a new execution plan will be compiled using the parameter values passed in on that invocation.
Sometimes this will happen when the parameters passed are atypical (e.g. have unusually high selectivity) so the generated plan will not be suitable for most other invocations with different parameters. For example it may choose a plan with index seeks and bookmark lookups which is fine for a highly selective case but poor if it needs to be done hundreds of thousands of times.
This would explain why the number of reads goes through the roof.
Your SSMS connection will likely have different SET ... options, so it will not get handed the same problematic plan from the cache when you execute the stored procedure inside SSMS.
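The SET options of each connection can be compared directly. A small sketch using sys.dm_exec_sessions; a frequently seen difference is ARITHABORT, which SSMS turns on by default:

```sql
-- Compare connection-level SET options between the application and SSMS sessions.
SELECT session_id,
       program_name,
       arithabort,
       ansi_nulls,
       ansi_padding,
       ansi_warnings,
       concat_null_yields_null,
       quoted_identifier
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;
```

Sessions whose option values differ do not share cached plans, which is why the same statement can be slow from the service and fast from SSMS at the same moment.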
You can use the following to get the plan for the slow session
select p.query_plan, *
from sys.dm_exec_requests r
cross apply sys.dm_exec_query_plan(r.plan_handle) p
where r.session_id = <session_id>
Then compare with the plan for the good session.
If you do determine that parameter sniffing is at fault you can use OPTIMIZE FOR hints to avoid it choosing the bad plan.
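As an illustration of the hint syntax only: the posted query uses a literal JobID, so the parameterised form below is hypothetical, as is the int type of @JobID.

```sql
-- Hypothetical parameterised lookup against the JobKeys table from the question.
DECLARE @JobID int = 531;

SELECT jk.[Key]
FROM JobKeys AS jk
WHERE jk.JobID = @JobID
  AND jk.Status = 4
OPTION (OPTIMIZE FOR (@JobID UNKNOWN));  -- plan for average selectivity, not the sniffed value
-- Alternative: OPTION (OPTIMIZE FOR (@JobID = 531)) to plan for one representative value.
```

UNKNOWN makes the optimizer use average density statistics, which trades the best possible plan for a typical value against protection from a plan built for an atypical one.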
Check that you don't have a maintenance task running that is rebuilding indexes, or that your database statistics are somehow invalid when the query is executed.
This is exactly the sort of thing one would expect to see if the query is not using your indexes, which is usually because either the indexes are not accessible to the query at the point it runs or because the statistics are invalid and make the optimiser believe that your large table(s) only have a few rows in them and the query would run faster with a full table scan than using indexed access.

SQL Server lock/hang issue

I'm using SQL Server 2008 on Windows Server 2008 R2, all sp'd up.
I'm getting occasional issues with SQL Server hanging with the CPU usage at 100% on our live server. It seems all the wait time on SQL Server when this happens is given to SOS_SCHEDULER_YIELD.
Here is the Stored Proc that causes the hang. I've added the "WITH (NOLOCK)" in an attempt to fix what seems to be a locking issue.
ALTER PROCEDURE [dbo].[MostPopularRead]
AS
BEGIN
SET NOCOUNT ON;
SELECT
c.ForeignId , ct.ContentSource as ContentSource
, sum(ch.HitCount * hw.Weight) as Popularity
, (sum(ch.HitCount * hw.Weight) * 100) / @Total as Percent
, @Total as TotalHits
from
ContentHit ch WITH (NOLOCK)
join [Content] c WITH (NOLOCK) on ch.ContentId = c.ContentId
join HitWeight hw WITH (NOLOCK) on ch.HitWeightId = hw.HitWeightId
join ContentType ct WITH (NOLOCK) on c.ContentTypeId = ct.ContentTypeId
where
ch.CreatedDate between @Then and @Now
group by
c.ForeignId , ct.ContentSource
order by
sum(ch.HitCount * hw.HitWeightMultiplier) desc
END
The stored proc reads from the table "ContentHit", which is a table that tracks when content on the site is clicked (it gets hit quite frequently, anything from 4 to 20 hits a minute). So it's pretty clear that this table is the source of the problem. There is a stored proc that is called to add hit tracks to the ContentHit table. It's pretty trivial: it just builds up a string from the params passed in, which involves a few selects from some lookup tables, followed by the main insert:
BEGIN TRAN
insert into [ContentHit]
(ContentId, HitCount, HitWeightId, ContentHitComment)
values
(@ContentId, isnull(@HitCount,1), isnull(@HitWeightId,1), @ContentHitComment)
COMMIT TRAN
The ContentHit table has a clustered index on its ID column, and I've added another index on CreatedDate since that is used in the select.
When I profile the issue, I see the Stored proc executes for exactly 30 seconds, then the SQL timeout exception occurs. If it makes a difference the web application using it is ASP.NET, and I'm using Subsonic (3) to execute these stored procs.
Can someone please advise how best I can solve this problem? I don't care about reading dirty data...
EDIT:
The MostPopularRead stored proc is called very infrequently. It's called on the home page of the site, but the results are cached for a day. The pattern of events that I am seeing is that when I clear the cache, multiple requests come in for the home page, and they all hit the stored proc because the result hasn't yet been cached. SQL Server then maxes out, and this can only be resolved by restarting the SQL Server process. When I do this, usually the proc will execute OK (in about 200 ms) and put the data back in the cache.
EDIT 2:
I've checked the execution plan, and the query looks quite sound. As I said earlier when it does run it only takes around 200ms to execute. I've added MAXDOP 1 to the select statement to force it to use only one CPU core, but I still see the issue. When I look at the wait times I see that XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER are taking up a massive amount of wait time.
EDIT 3:
I previously thought that this was related to Subsonic, our ORM, but having switched to ADO.NET, the error still occurs.
The issue is likely concurrency, not locking. SOS_SCHEDULER_YIELD occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.
How often is [MostPopularRead] SP called and how long does it take to execute?
The aggregation in your query might be rather CPU-intensive, especially if there are lots of data and/or ineffective indexes. So, you might end up with high CPU pressure - basically, a demand for CPU time is too high.
I'd consider the following:
Check what other queries are executing while CPU is 100% busy? Look at sys.dm_os_waiting_tasks, sys.dm_os_tasks, sys.dm_exec_requests.
Look at the query plan of [MostPopularRead], try to optimize the query. Quite often an ineffective query is the root cause of a performance problem, and query optimization is much more straightforward than other performance improvement techniques.
If the query plan is parallel and the query is often called by multiple clients simultaneously, forcing a single-threaded plan with the MAXDOP 1 hint might help (heavy use of parallel plans is usually indicated by SOS_SCHEDULER_YIELD and CXPACKET waits).
Also, have a look at this paper: Performance tuning with wait statistics. It gives a pretty good summary of different wait types and their impact on performance.
P.S. It is easier to use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED before a query instead of adding (nolock) to each table.
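The two suggestions above (a session-level isolation level instead of per-table NOLOCK, and a MAXDOP 1 hint) could be combined as in the sketch below. The posted procedure does not show how @Then and @Now are set, so the one-day window here is an assumption, and the percentage columns are left out for brevity.

```sql
-- Applies to every table read in this session, like NOLOCK on each table.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Assumption: a one-day reporting window.
DECLARE @Now  datetime = GETDATE();
DECLARE @Then datetime = DATEADD(day, -1, @Now);

SELECT c.ForeignId,
       ct.ContentSource,
       SUM(ch.HitCount * hw.Weight) AS Popularity
FROM ContentHit ch
JOIN [Content] c     ON ch.ContentId = c.ContentId
JOIN HitWeight hw    ON ch.HitWeightId = hw.HitWeightId
JOIN ContentType ct  ON c.ContentTypeId = ct.ContentTypeId
WHERE ch.CreatedDate BETWEEN @Then AND @Now
GROUP BY c.ForeignId, ct.ContentSource
ORDER BY Popularity DESC
OPTION (MAXDOP 1);  -- force a serial plan for this statement only
```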
Remove the NOLOCK hint.
Open a query in SSMS, run SET STATISTICS IO ON and run the query in the procedure. Let it finish and post the IO stats messages here. Then post the table definitions and all indexes defined on them. Then somebody will be able to reply with the proper indexes you need.
As with all SQL performance problems, the text of the query is largely irrelevant without the complete schema definition.
A guesstimate covering index would be:
create index ContentHitCreatedDate
on ContentHit (CreatedDate)
include (HitCount, ContentId, HitWeightId);
Update
XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER: you can safely ignore all these waits. They show up because they represent threads parked and waiting to dispatch XEvents, Service Broker or internal SQL thread pool work items. As they spend most of their time parked and waiting, they get accounted for unrealistic wait times. Ignore them.
If you believe ContentHit to be the source of your problem, you could add a Covering Index
CREATE INDEX IX_CONTENTHIT_CONTENTID_HITWEIGHTID_HITCOUNT
ON dbo.ContentHit (ContentID, HitWeightID, HitCount)
Take a look at the Query Plan if you want to be certain about the bottleneck in your query.
With default settings, SQL Server may use all cores/CPUs for a single query (the max DoP setting under the server's advanced properties; DoP = degree of parallelism), which can lead to 100% CPU even if only one core is actually waiting for some I/O.
If you search the net or this site you will find resources explaining it better than I can (such as monitoring your I/O even though you see a CPU-bound problem).
On one server we couldn't change the application, which had a bad query that locked up all resources (CPU), but by setting DoP to half the number of cores we managed to keep the server from grinding to a halt. The effect of the queries being less parallel was negligible in our case.
--
Dom
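For reference, the server-wide setting described above can be changed with sp_configure. A sketch assuming a 4-core server, so half the cores is 2; this affects every query on the instance and is best tried on a test server first:

```sql
-- 'max degree of parallelism' is an advanced option, so expose those first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Assumption: 4 cores, so limit each query to 2.
EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;
```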
Thanks to all who posted, I got some great SQL Server perf tuning tips.
In the end we ran out of time to resolve this mystery. We found a more efficient way to collect this information and cache it in the database, so this solved the problem for us.

how can I test performance in Sql Server Mgmt Studio without outputting data?

Using SQL Server Management Studio.
How can I test the performance of a large select (say 600k rows) without the results window impacting my test? All things being equal it doesn't really matter, since the two queries will both be outputting to the same place. But I'd like to speed up my testing cycles and I'm thinking that the output settings of SQL Server Management Studio are getting in my way. Output to text is what I'm using currently, but I'm hoping for a better alternative.
I think this is impacting my numbers because the database is on my local box.
Edit: Had a question about doing WHERE 1=0 here (thinking that the join would happen but no output), but I tested it and it didn't work -- not a valid indicator of query performance.
You could do SET ROWCOUNT 1 before your query. I'm not sure it's exactly what you want but it will avoid having to wait for lots of data to be returned and therefore give you accurate calculation costs.
However, if you add Client Statistics to your query, one of the numbers is Wait time on server replies which will give you the server calculation time not including the time it takes to transfer the data over the network.
You can SET STATISTICS TIME ON to get a measurement of the time on server. And you can use the Query/Include Client Statistics (Shift+Alt+S) on SSMS to get detail information about the client time usage. Note that SQL queries don't run and then return the result to the client when finished, but instead they run as they return results and even suspend execution if the communication channel is full.
The only context under which a query completely ignores sending the result packets back to the client is activation. But then the time to return the output to the client should be also considered when you measure your performance. Are you sure your own client will be any faster than SSMS?
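A minimal sketch of the server-side measurement described above. The table name dbo.LargeTable is a placeholder; the timing and IO figures appear on the Messages tab in SSMS.

```sql
SET STATISTICS TIME ON;  -- reports CPU time and elapsed time per statement
SET STATISTICS IO ON;    -- reports logical and physical reads per table

SELECT * FROM dbo.LargeTable;  -- placeholder for the query under test

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

CPU time and logical reads are largely independent of how fast the client consumes rows, so they are more stable for comparing two versions of a query than elapsed time alone.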
SET ROWCOUNT 1 will stop processing after the first row is returned which means unless the plan happens to have a blocking operator the results will be useless.
Taking a trivial example
SELECT * FROM TableX
The cost of this query in practice will heavily depend on the number of rows in TableX.
Using SET ROWCOUNT 1 won't show any of that. Irrespective of whether TableX has 1 row or 1 billion rows it will stop executing after the first row is returned.
I often assign the SELECT results to variables to be able to look at things like logical reads without being slowed down by SSMS displaying the results.
SET STATISTICS IO ON
DECLARE @name nvarchar(35),
@type nchar(3)
SELECT @name = name,
@type = type
FROM master..spt_values
There is a related Connect Item request Provide "Discard results at server" option in SSMS and/or TSQL
The best thing you can do is to check the Query Execution Plan (press Ctrl+L) for the actual query. That will give you the best guesstimate for performance available.
I'd think that the where clause of WHERE 1=0 is definitely happening on the SQL Server side, and not Management Studio. No results would be returned.
Is your DB engine on the same machine that you're running Mgmt Studio on?
You could :
Output to Text or
Output to File.
Close the Query Results pane.
That'd just move the cycles spent on drawing the grid in Mgmt Studio. Perhaps Results to Text would be more performant on the whole. Hiding the pane would save Mgmt Studio the cycles of drawing the data. The data is still being returned to Mgmt Studio, though, so it really isn't saving a lot of cycles.
How can you test performance of your query if you don't output the results? Speeding up the testing is pointless if the testing doesn't tell you anything about how the query is going to perform. Do you really want to find out this dog of a query takes ten minutes to return data after you push it to prod?
And of course it's going to take some time to return 600,000 records. It will in your user interface as well; it will probably take longer than in your query window because the data has to go across the network.
There are plenty of more correct answers here, but I assume the real question is the one I asked myself when I stumbled upon this question:
I have a query A and a query B on the same test data. Which is faster? And I want to check quick and dirty. For me the answer is - temp tables (overhead of creating temp table here is easy to ignore). This is to be done on perf/testing/dev server only!
Query A:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS (to clear cached data pages from the buffer pool)
SELECT * INTO #temp1 FROM ...
Query B
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
SELECT * INTO #temp2 FROM ...

SQL Server query taking up 100% CPU and runs for hours

I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT type and beta.id is an INT type. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using a WHERE id = @id. The function returns approximately 11000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in the activity manager.
Look at the query plan. My guess is that there are one or more table scans in the execution plan. This will cause huge amounts of I/O for the few records you get in the result.
You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can for instance also give you hints on how to improve your query by suggesting indexes.
If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Tuning Advisor as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Make sure there are not too many queries in the trace, as it takes a long time to build advice.
I'm not an expert on it, but have had some luck with it in the past.
Do you need to use a function? Can you rewrite the entire thing as a stored procedure into which you pass the @ID as a parameter?
Even if your table has indexes, passing the @ID as a variable to the WHERE clause can greatly increase the time the query takes to run.
The reason the indexes may not be used is because the Query Analyzer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch file, only one pass is made of the Transact-SQL code, preventing the Query Optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot re-write the SQL.
It might also be possible, since this just started happening, that the indexes have become fragmented and need to be rebuilt.
I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested. Put the results in a temp table and join on that.
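The temp table workaround mentioned in the question and in this answer might look like the sketch below. The int type and value of @id are assumptions, and the cast to BIGINT is an optional extra so that the join compares matching types (alpha.id is BIGINT, beta.id is INT).

```sql
-- Assumption: @id is an int; 1 is a placeholder value.
DECLARE @id int = 1;

-- Materialise the function result once (about 11000 rows per the question).
SELECT CAST(f.id AS bigint) AS id, f.X, f.Y, f.Z
INTO #beta
FROM [DifferentDatabase].dbo.fnGetStuff(@id) AS f;

-- Join the view against the temp table instead of the function.
SELECT @id,
       alpha.A, alpha.B, alpha.C,
       beta.X, beta.Y, beta.Z,
       alpha.P, alpha.Q
FROM #beta AS beta
INNER JOIN vwSomeData AS alpha ON beta.id = alpha.id;

DROP TABLE #beta;
```

A temp table has its own statistics, so the optimizer knows roughly how many rows it is joining, which it often cannot estimate well for a function in another database.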
Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id
-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be a source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]. I think you'll find a bit more scalability in joining to a more flattened table with indexes than a costly function.
Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan instead of executing the query. Look for a "SCAN" on an index or a table; that is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc.).
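Put together, the sequence could look like this in SSMS. SET SHOWPLAN_ALL has to be the only statement in its batch, hence the GO separators (GO is an SSMS/sqlcmd batch separator, not T-SQL). The literal 1 is a placeholder for the real id value.

```sql
SET SHOWPLAN_ALL ON;
GO

-- Not executed while SHOWPLAN_ALL is ON; only the estimated plan rows are returned.
SELECT alpha.A, alpha.B, alpha.C,
       beta.X, beta.Y, beta.Z,
       alpha.P, alpha.Q
FROM [DifferentDatabase].dbo.fnGetStuff(1) beta  -- 1 is a placeholder id
INNER JOIN vwSomeData alpha ON beta.id = alpha.id;
GO

SET SHOWPLAN_ALL OFF;
GO
```

In the output, the PhysicalOp column shows operators such as Table Scan or Clustered Index Scan, and EstimateRows shows how many rows the optimizer expects at each step.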

Resources