I'm trying to delete one single record from the database.
Code is very simple:
SELECT * FROM database.tablename
WHERE SerialNbr = x
This gives me the one record I'm looking for. It has that SerialNbr plus a number of IDs that are foreign keys to other tables. I've already taken care of all the foreign key constraints, so the next statement is able to start running.
After that the code is followed by:
DELETE FROM tablename
WHERE SerialNbr = x
This should be a relatively simple and quick query, I would think. However, it has now run for 30 minutes with no results. It isn't complaining about foreign key problems or anything like that; it's just taking a very, very long time to process. Is there anything I can do to speed this up, or am I just stuck waiting? Something seems wrong if deleting one single record takes this long.
I am using Microsoft SQL Server 2008.
It's not taking a long time to delete the row, it's waiting in line for its turn to access the table. This is called blocking, and is a fundamental part of how databases work. Essentially, you can't delete that row if someone else has a lock on it - they may be reading it and want to be sure it doesn't change (or disappear) before they're done, or they may be trying to update it (to an unsatisfactory end, of course, if you wait it out, since once they commit your delete will remove it anyway).
Check the SPID for the window where you're running the query. If you have to, stop the current instance of the query, then run this:
SELECT @@SPID;
Make note of that number, then try to run the DELETE again. While it's sitting there taking forever, check for a blocker in a different query window:
SELECT blocking_session_id FROM sys.dm_exec_requests WHERE session_id = <that spid>;
Take the number there, and issue something like:
DBCC INPUTBUFFER(<the blocking session id>);
This should give you some idea about what the blocker is doing (you can get other information from sys.dm_exec_sessions etc). From there you can decide what you want to do about it - issue KILL <the spid>;, wait it out, go ask the person what they're doing, etc.
You may need to repeat this process multiple times, e.g. sometimes a blocking chain can be several sessions deep.
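For reference, a consolidated sketch of those steps as a single query (the column list is only a suggestion, adjust as needed):
-- Who is blocked, by whom, and some context about the blocker
SELECT r.session_id AS blocked_spid,
       r.blocking_session_id AS blocking_spid,
       r.wait_type,
       r.wait_time,
       s.login_name,
       s.host_name,
       s.program_name
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s ON s.session_id = r.blocking_session_id
WHERE r.blocking_session_id <> 0;
Run it from a third window while the DELETE is stuck; the blocked_spid row matching your @@SPID points you straight at the blocker.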
What I think is happening is that there is some kind of performance hit (blocking) on the database or on that particular table. You can see it by simply running sp_who2 and then killing the offending SPIDs. Be careful when running KILL, because the SPID you see might not be your own query.
Delete From Database.Tablename
where SerialNbr=x
Related
I was running a proc inside a cursor. After a lot of successful iterations, I got this:
Transaction (Process ID 104) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I am not posting the full details, so I don't expect a fine grain debugging answer. The facts:
I am sure no one else (including myself in another session) was using the proc, as I was developing it, and
this transaction was "stuck" while doing a SELECT (I saw the running query in sys.dm_exec_requests).
If I am not mistaken about my two points, is it ever possible to have a deadlock? Wouldn't a deadlock require all of the sessions involved to be doing write operations on the resources, which would create a cycle in the resource request graph? I could understand a timeout error on a SELECT, but I cannot understand a deadlock. What am I missing?
An update:
I abandoned further debugging because I noticed that an index I thought existed didn't. When it was created, the performance was OK.
However, in the hope of keeping this useful and eventually arriving at an answer, here are some more things I investigated, some facts, and thoughts on the comments:
First, the SQL Server version is 2008. I understand this is no longer supported, but I am in no position to make recommendations, much less upgrade the server.
I found Jeroen Mostert's comment interesting. How far back is "the past"? In sys.dm_os_waiting_tasks I noticed the session being blocked by itself multiple times with wait type CXPACKET. I did some searching around, but OPTION (MAXDOP 1) did not solve the problem. However, remember the index that did not exist, which would cause abysmal performance. Could it be that the parallelism itself was legitimate, but the operations were simply too many? Still, I also saw a huge dm_exec_requests.wait_time. So, even though the query was bad, I am led to believe that there were strange (dead)locks around.
If an answer/comment comes up with specific queries/steps to do to trace the problem, I'll be happy to recreate it.
It is possible for a SELECT to cause a deadlock if someone else is using the table.
This example is ripped almost 100% from Brent Ozar's video on deadlocks, but changed one command to a SELECT.
To start with, create two tables
CREATE TABLE Lefty (ID int PRIMARY KEY)
CREATE TABLE Righty (ID int PRIMARY KEY)
INSERT INTO Lefty (ID) VALUES (1)
INSERT INTO Righty (ID) VALUES (2)
Then open two windows in SSMS. In the first put this code (don't run it yet)
BEGIN TRAN
UPDATE Lefty SET ID = 5
SELECT * FROM Righty
COMMIT TRAN
In the second window put in this code (also don't run it yet).
BEGIN TRAN
UPDATE Righty SET ID = 5
UPDATE Lefty SET ID = 5
COMMIT TRAN
Now, in the first window, run the first two statements (the BEGIN TRAN and the UPDATE of Lefty).
That transaction starts and now holds a lock on Lefty.
In the second window, run the whole transaction. It sits there waiting for your first window, and will wait forever.
In the first window, go back and run the SELECT * FROM Righty and COMMIT TRAN. 5, 4, 3, 2, 1 Boom deadlock - because the second window already had a lock on the table and therefore the SELECT in the first window couldn't run (and the second window couldn't run because the first had a lock on a table it needed).
(I'd like to reiterate - this is Brent Ozar's demo not mine! I'm just passing it on. Indeed, I recommend them).
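If you want to see exactly what deadlocked after the fact, here is a sketch (assuming the default system_health Extended Events session is running with its ring_buffer target, which it is out of the box on SQL Server 2008) that pulls the recorded deadlock reports:
-- Each row is one xml_deadlock_report event captured by system_health
SELECT XEventData.XEvent.query('.') AS DeadlockReport
FROM (
    SELECT CAST(st.target_data AS xml) AS TargetData
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s ON s.address = st.event_session_address
    WHERE s.name = 'system_health'
      AND st.target_name = 'ring_buffer'
) AS Data
CROSS APPLY TargetData.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(XEvent);
The XML it returns shows the victim, the other participants, and the resources they were waiting on.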
For some strange reason, when trying to access the last 100 records of a table, SQL Server Management Studio sits and spins and takes forever to return the results. Selecting the first 100 records comes back really fast (about 1 second). Any idea what might be going on? A row lock or something else?
That really seems strange.
Thanks.
It sounds like another SPID has an open transaction holding locks on the table you're trying to read.
In another SSMS window, try running DBCC OPENTRAN (look up the options if this is a higher-volume system).
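For example, a minimal sketch ('YourDatabase' is a placeholder):
DBCC OPENTRAN ('YourDatabase') WITH TABLERESULTS, NO_INFOMSGS;
It reports the oldest active transaction in that database, including its SPID and start time, which you can then chase down.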
EDIT
+1 to @Martin's comment: add a NOLOCK hint to your query as a quick-and-dirty way to test.
SELECT ID
FROM MyTable WITH (nolock)
I'm using SQL Server 2008 on Windows Server 2008 R2, all sp'd up.
I'm getting occasional issues with SQL Server hanging with CPU usage at 100% on our live server. It seems all the wait time on SQL Server when this happens is given to SOS_SCHEDULER_YIELD.
Here is the Stored Proc that causes the hang. I've added the "WITH (NOLOCK)" in an attempt to fix what seems to be a locking issue.
ALTER PROCEDURE [dbo].[MostPopularRead]
AS
BEGIN
SET NOCOUNT ON;
SELECT
c.ForeignId , ct.ContentSource as ContentSource
, sum(ch.HitCount * hw.Weight) as Popularity
, (sum(ch.HitCount * hw.Weight) * 100) / @Total as [Percent]
, @Total as TotalHits
from
ContentHit ch WITH (NOLOCK)
join [Content] c WITH (NOLOCK) on ch.ContentId = c.ContentId
join HitWeight hw WITH (NOLOCK) on ch.HitWeightId = hw.HitWeightId
join ContentType ct WITH (NOLOCK) on c.ContentTypeId = ct.ContentTypeId
where
ch.CreatedDate between @Then and @Now
group by
c.ForeignId , ct.ContentSource
order by
sum(ch.HitCount * hw.HitWeightMultiplier) desc
END
The stored proc reads from the table "ContentHit", which is a table that tracks when content on the site is clicked (it gets hit quite frequently - anything from 4 to 20 hits a minute). So it's pretty clear that this table is the source of the problem. There is a stored proc that is called to add hit tracks to the ContentHit table; it's pretty trivial, it just builds up a string from the params passed in, which involves a few selects from some lookup tables, followed by the main insert:
BEGIN TRAN
insert into [ContentHit]
(ContentId, HitCount, HitWeightId, ContentHitComment)
values
(@ContentId, isnull(@HitCount,1), isnull(@HitWeightId,1), @ContentHitComment)
COMMIT TRAN
The ContentHit table has a clustered index on its ID column, and I've added another index on CreatedDate since that is used in the select.
When I profile the issue, I see the stored proc executes for exactly 30 seconds, then the SQL timeout exception occurs. If it makes a difference, the web application using it is ASP.NET, and I'm using Subsonic (3) to execute these stored procs.
Can someone please advise how best I can solve this problem? I don't care about reading dirty data...
EDIT:
The MostPopularRead stored proc is called very infrequently - it's called on the home page of the site, but the results are cached for a day. The pattern of events that I am seeing is: when I clear the cache, multiple requests come in for the home page, and they all hit the stored proc because the result hasn't yet been cached. SQL Server then maxes out, and the only fix is to restart the SQL Server process. When I do this, usually the proc will execute OK (in about 200 ms) and put the data back in the cache.
EDIT 2:
I've checked the execution plan, and the query looks quite sound. As I said earlier, when it does run it only takes around 200 ms to execute. I've added MAXDOP 1 to the select statement to force it to use only one CPU core, but I still see the issue. When I look at the wait times I see that XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER are taking up a massive amount of wait time.
EDIT 3:
I previously thought that this was related to Subsonic, our ORM, but having switched to ADO.NET, the error still occurs.
The issue is likely concurrency, not locking. SOS_SCHEDULER_YIELD occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.
How often is [MostPopularRead] SP called and how long does it take to execute?
The aggregation in your query might be rather CPU-intensive, especially if there is a lot of data and/or the indexes are ineffective. So you might end up with high CPU pressure - basically, the demand for CPU time is higher than what is available.
I'd consider the following:
Check what other queries are executing while the CPU is 100% busy. Look at sys.dm_os_waiting_tasks, sys.dm_os_tasks, sys.dm_exec_requests.
Look at the query plan of [MostPopularRead], try to optimize the query. Quite often an ineffective query is the root cause of a performance problem, and query optimization is much more straightforward than other performance improvement techniques.
If the query plan is parallel and the query is often called by multiple clients simultaneously, forcing a single-thread plan with MAXDOP=1 hint might help (abundant use of parallel plans is usually indicated by SOS_SCHEDULER_YIELD and CXPACKET waits).
Also, have a look at this paper: Performance tuning with wait statistics. It gives a pretty good summary of different wait types and their impact on performance.
P.S. It is easier to use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED before a query instead of adding (nolock) to each table.
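For reference, a sketch of both suggestions applied together (table names taken from the question; treat it as illustrative, not a drop-in fix):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- session-wide dirty reads instead of per-table NOLOCK hints
SELECT c.ForeignId, SUM(ch.HitCount * hw.Weight) AS Popularity
FROM ContentHit ch
JOIN HitWeight hw ON ch.HitWeightId = hw.HitWeightId
JOIN [Content] c ON ch.ContentId = c.ContentId
GROUP BY c.ForeignId
OPTION (MAXDOP 1);  -- force a serial plan if parallelism is the suspect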
Remove the NOLOCK hint.
Open a query in SSMS, run SET STATISTICS IO ON and run the query in the procedure. Let it finish and post the IO statistics messages here. Then post the table definitions and all indexes defined on them. Then somebody will be able to reply with the proper indexes you need.
As with all SQL performance problems, the text of the query is largely irrelevant without the complete schema definition.
A guesstimate covering index would be:
create index ContentHitCreatedDate
on ContentHit (CreatedDate)
include (HitCount, ContentId, HitWeightId);
Update
XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER: you can safely ignore all these waits. They show up because they represent threads parked and waiting to dispatch XEvents, Service Broker or internal SQL thread pool work items. As they spend most of their time parked and waiting, they get accounted for unrealistic wait times. Ignore them.
If you believe ContentHit to be the source of your problem, you could add a covering index:
CREATE INDEX IX_CONTENTHIT_CONTENTID_HITWEIGHTID_HITCOUNT
ON dbo.ContentHit (ContentID, HitWeightID, HitCount)
Take a look at the Query Plan if you want to be certain about the bottleneck in your query.
By default, SQL Server uses all cores/CPUs for every query (the "max degree of parallelism" setting under advanced server properties; DoP = Degree of Parallelism), which can lead to 100% CPU even if only one core is actually waiting for some I/O.
If you search the net or this site, you will find resources explaining it better than I can (for example, why you should monitor your I/O even though you appear to have a CPU-bound problem).
On one server we couldn't change the application, which had a bad query that tied up all the CPU, but by setting DoP to half the number of cores we kept the server from grinding to a halt. The effect of the queries being less parallel was negligible in our case.
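For reference, a sketch of changing the server-wide setting (the value 4 is just an example for an 8-core box; pick what fits your hardware):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;  -- e.g. half the cores
RECONFIGURE;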
--
Dom
Thanks to all who posted, I got some great SQL Server perf tuning tips.
In the end we ran out of time to resolve this mystery - we found a more efficient way to collect this information and cache it in the database, which solved the problem for us.
I have a table which was always updatable before, but suddenly I can no longer update any of the columns in the table. I can still query the whole table and the results come back very fast, but the moment I try to update a column in the table, the update query simply stalls and does nothing.
I tried using
select req_transactionUOW
from master..syslockinfo
where req_spid = -2
to see if some orphaned transaction was locking the table, but it returned no results.
I can't seem to find signs of my table being locked, but I simply cannot update it. Any clues as to how to fix the table, or whatever state it is in?
Could you please issue this query:
SELECT COUNT(*)
FROM mytable WITH (UPDLOCK, READPAST)
which will skip the locked records and make sure it returns the same number of records as
SELECT COUNT(*)
FROM mytable
You may need to repeat it with every index on the table forced, to make sure that no index resource is locked as well.
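For example, a sketch with one index forced (IX_mytable_somecol is a hypothetical index name; substitute each of yours in turn):
SELECT COUNT(*)
FROM mytable WITH (INDEX(IX_mytable_somecol), UPDLOCK, READPAST);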
When you say "times out", does it hit the client timeout? For example, the default .net command timeout is 30 seconds. I would suggest increasing this to a very large value or running the update in SQL tools (by default no timeout set).
Other than that, an update will finish at some point, or error and roll back: are you leaving enough time?
There's also blocking, the last index rebuild, the last statistics update, triggers, an accidental cross join, MDF or LDF file growth, poor IO, OS paging, etc. And have you restarted the SQL instance or the server to rule out environmental issues and kill all other connections?
There simply isn't enough information to make a judgement right now, sorry.
I'm guessing this isn't a permissions issue as you're not getting an error.
The closest thing I've seen to this before is when the indexes on the table had become corrupt. Have you tried dropping the indexes and recreating them? Try them one by one at first.
If you suspect locking, one of the first things I would do would be to run sp_lock. It will give you a list of all of the current locks held. You can use the DB_NAME and OBJECT_NAME functions to get the names that correspond to the dbid and ObjId columns.
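If you prefer the DMVs, here is a sketch that resolves the names in one step instead of looking up dbid/ObjId by hand:
SELECT l.request_session_id,
       l.resource_type,
       DB_NAME(l.resource_database_id) AS database_name,
       OBJECT_NAME(l.resource_associated_entity_id, l.resource_database_id) AS object_name,
       l.request_mode,
       l.request_status
FROM sys.dm_tran_locks AS l
WHERE l.resource_type = 'OBJECT';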
Have you got any triggers on the table?
If so it could be that the trigger is failing so preventing the update.
Can you update other tables? If not (or anyway, if you like), you could check whether the transaction log is full (if you use the full recovery model), or whether the partition your transaction log resides on is full. If SQL Server is unable to write to the transaction log, I think you could experience this behaviour.
DBCC is your friend here: DBCC SQLPERF(LOGSPACE) shows you how much of your log (as a percentage) is used. If it is at or close to 100%, this might be your issue.
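For example, a sketch ('YourDatabase' is a placeholder):
DBCC SQLPERF(LOGSPACE);
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'YourDatabase';  -- tells you what, if anything, is preventing log reuse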
Just my two pennies worth.
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause.
I've been all up and down the Management Studio options, but can't find a confirm option. I know other tools for other databases have it.
I'd suggest that you always write the SELECT statement with the WHERE clause first and execute it, to actually see which rows your DELETE command will delete. Then just execute the DELETE with the same WHERE clause. The same applies to UPDATEs.
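For example (table and column names are just placeholders):
-- First, see exactly which rows would go away
SELECT * FROM dbo.Orders WHERE OrderId = 12345;
-- Only once that returns what you expect, run the delete with the same WHERE clause
DELETE FROM dbo.Orders WHERE OrderId = 12345;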
Under Tools>Options>Query Execution>SQL Server>ANSI, you can enable the Implicit Transactions option which means that you don't need to explicitly include the Begin Transaction command.
The obvious downside of this is that you might forget to add a Commit (or Rollback) at the end, or worse still, your colleagues will add Commit at the end of every script by default.
You can lead the horse to water...
You might suggest that they always take an ad-hoc backup before they do anything (depending on the size of your DB) just in case.
Try using a BEGIN TRANSACTION before you run your DELETE statement.
Then you can choose to COMMIT or ROLLBACK same.
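For example, a sketch of that pattern (the table name is a placeholder):
BEGIN TRANSACTION;
DELETE FROM dbo.Orders WHERE OrderId = 12345;
SELECT @@ROWCOUNT AS RowsDeleted;  -- sanity-check the number of affected rows
-- If it looks right:  COMMIT TRANSACTION;
-- If it doesn't:      ROLLBACK TRANSACTION;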
In SSMS 2005, you can enable this option under Tools|Options|Query Execution|SQL Server|ANSI ... check SET IMPLICIT_TRANSACTIONS. That will require a commit to affect update/delete queries for future connections.
For the current query, go to Query|Query Options|Execution|ANSI and check the same box.
This page also has instructions for SSMS 2000, if that is what you're using.
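The per-session T-SQL equivalent, if you'd rather script it than click through the options (a sketch; the table name is a placeholder):
SET IMPLICIT_TRANSACTIONS ON;
DELETE FROM dbo.Orders WHERE OrderId = 12345;  -- this silently opens a transaction
-- Nothing is permanent until you explicitly COMMIT (or you can ROLLBACK)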
As others have pointed out, this won't address the root cause: it's almost as easy to paste a COMMIT at the end of every new query you create as it is to fire off a query in the first place.
First, this is what audit tables are for. If you know who deleted all the records, you can either restrict their database privileges or deal with them from a personnel perspective. The last person who did this at my office is currently on probation. If she does it again, she will be let go. You have responsibilities if you have access to production data, and ensuring that you cause no harm is one of them. This is a personnel problem as much as a technical problem.
You will never find a way to prevent people from making dumb mistakes (the database has no way to know if you meant delete table a or delete table a where id = 100, and a confirm prompt will get hit automatically by most people). You can only try to reduce them by making sure the people who run this code are responsible, and by putting policies in place to help them remember what to do. Employees who have a pattern of behaving irresponsibly with your business data (particularly after they have been given a warning) should be fired.
Others have suggested the kinds of things we do to prevent this from happening. I always embed a select in a delete that I'm running from a query window to make sure it will delete only the records I intend. All our code on production that changes, inserts or deletes data must be enclosed in a transaction. If it is being run manually, you don't run the rollback or commit until you see the number of records affected.
Example of delete with embedded select
delete a
--select a.*
from table1 a
join table2 b on a.id = b.id
where b.somefield = 'test'
But even these techniques can't prevent all human error. A developer who doesn't understand the data may run the select and still not understand that it is deleting too many records. Running in a transaction may mean you have other problems when people forget to commit or rollback and lock up the system. Or people may put it in a transaction and still hit commit without thinking just as they would hit confirm on a message box if there was one. The best prevention is to have a way to quickly recover from errors like these. Recovery from an audit log table tends to be faster than from backups. Plus you have the advantage of being able to tell who made the error and exactly which records were affected (maybe you didn't delete the whole table but your where clause was wrong and you deleted a few wrong records.)
For the most part, production data should not be changed on the fly. You should script the change and check it on dev first. Then on prod, all you have to do is run the script with no changes rather than highlighting and running little pieces one at a time. Now, in the real world this isn't always possible, as sometimes you are fixing something broken only on prod that needs to be fixed now (for instance, when none of your customers can log in because critical data got deleted). In a case like this, you may not have the luxury of reproducing the problem first on dev and then writing the fix. When you have these types of problems, the fix on prod should be done only by DBAs, database analysts, configuration managers, or others who are normally responsible for production data, not by a developer. Developers in general should not have access to prod.
That is why I believe you should always:
1. Use stored procedures that are tested on a dev database before deploying to production
2. Select the data before deletion
3. Screen developers using an interview and performance evaluation process :)
4. Base performance evaluation on how many database tables they do/do not delete
5. Treat production data as if it were poisonous and be very afraid
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause
Probably the only solution will be to replace someone with someone else ;). Otherwise they will always find a workaround.
Alternatively, restrict that person's database access, provide them with a stored procedure that takes the parameter used in the WHERE clause, and grant them access to execute only that stored procedure.
Put on your best Trogdor and Burninate until they learn to put in the WHERE clause.
The best advice is to get the muckety-mucks that are mucking around in the database to use transactions when testing. It goes a long way towards preventing "whoops" moments. The caveat is that now you have to tell them to COMMIT or ROLLBACK because for sure they're going to lock up your DB at least once.
Lock it down:
REVOKE delete rights on all your tables.
Put in an audit trigger and audit table.
Create parameterized delete SPs and only grant execute rights on an as-needed basis (see the sketch below).
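A rough sketch of that last point (all names here are hypothetical):
CREATE PROCEDURE dbo.DeleteOrderById
    @OrderId int
AS
BEGIN
    SET NOCOUNT ON;
    DELETE FROM dbo.Orders
    WHERE OrderId = @OrderId;  -- the WHERE clause is baked in, so no accidental full-table delete
END
GO
REVOKE DELETE ON dbo.Orders FROM SomeRole;
GRANT EXECUTE ON dbo.DeleteOrderById TO SomeRole;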
Isn't there a way to give users the results they need without providing raw access to SQL? If you at least had a separate entry box for "WHERE", you could default it to "WHERE 1 = 0" or something.
I think there must be a way to back these out of the transaction journaling, too. But probably not without rolling everything back, and then selectively reapplying whatever came after the fatal mistake.
Another ugly option is to create a trigger to write all DELETEs (maybe over some minimum number of records) to a log table.
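A rough sketch of such a trigger (all table and column names are hypothetical):
CREATE TABLE dbo.Orders_DeleteAudit
(
    OrderId   int,
    DeletedAt datetime DEFAULT GETDATE(),
    DeletedBy sysname  DEFAULT SUSER_SNAME()
);
GO
CREATE TRIGGER trg_Orders_AuditDelete
ON dbo.Orders
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- 'deleted' holds every row removed by the statement; you could wrap this in an
    -- IF (SELECT COUNT(*) FROM deleted) >= some threshold if you only care about big deletes
    INSERT INTO dbo.Orders_DeleteAudit (OrderId)
    SELECT OrderId FROM deleted;
END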