Performance slow of Delete query in SQL Server - sql-server

I have an application which is using Entity framework for DB operations. In an one table when performing the delete operation it takes more than 3 minutes. But other similar tables doesn't take much time. I have debugged the code and find out there is no issue with the code.But executing the query in the sql server took much time.
Any troubleshooting steps/root cause for this issue ?
My table is as below,
Id (PK,uniqueidentifier,not null)
FirstValue(real,not null)
SecondValue(real,not null)
ThirdValue(real,not null)
LastValue(int,not null)
Config_Id(FK,uniqueidentifier,not null)
Query Execution Plan

Something isn't adding up here, we're not seeing the full picture...
There are a multitude of things which can slow down deletes (usually):
deleting a lot of records (which we know isn't the case here)
many indexes (which I suspect IS the case here)
deadlocks and blocking (is this a development or production database?)
triggers
cascade delete
transaction log needing to grow
many foreign keys to check (I suspect this might also be happening)
Can you please give us a screenshot of the "View Dependencies" feature in SSMS? To get this, right click on the table in the object explorer and select View Dependencies.
Also, can you open up a query on the master database, run the following queries and post the results:
SELECT name, value, value_in_use, minimum, maximum, [description], is_dynamic, is_advanced
FROM sys.configurations WITH (NOLOCK)
where name in (
'backup compression default',
'clr enabled',
'cost threshold for parallelism',
'lightweight pooling',
'max degree of parallelism',
'max server memory',
'optimize for ad hoc workloads',
'priority boost',
'remote admin connections'
)
ORDER BY name OPTION (RECOMPILE);
SELECT DB_NAME([database_id]) AS [Database Name],
[file_id], [name], physical_name, [type_desc], state_desc,
is_percent_growth, growth,
CONVERT(bigint, growth/128.0) AS [Growth in MB],
CONVERT(bigint, size/128.0) AS [Total Size in MB]
FROM sys.master_files WITH (NOLOCK)
ORDER BY DB_NAME([database_id]), [file_id] OPTION (RECOMPILE);

Related

Deleting larges amounts of data from SQL without growing the transaction log

OK I know this has been asked a lot and I thought I had found the answer but it's not working.
I need to delete a large amount of data from some SQL tables. Copying the data I want to keep and truncating or deleting the old table is not an option.
The database is set to simple logging.
I'm only deleting 3,000 rows.
I've tried indies a BEGIN/END Transaction and without that and have a CHECKPOINT command
My understanding was doing this would cause the transaction log to not grow but I'm still getting a 100+ gig transaction log.
I'm looking for a way to delete and not grow the transaction log.
I understand that it's to roll things back if needed but I don't need to I just want to delete and not have the log filled up.
kindly check VLf count also check Log_reuse in database.
SELECT [name] AS 'Database Name',
COUNT(li.database_id) AS 'VLF Count',
SUM(li.vlf_size_mb) AS 'VLF Size (MB)',
SUM(CAST(li.vlf_active AS INT)) AS 'Active VLF',
SUM(li.vlf_active*li.vlf_size_mb) AS 'Active VLF Size (MB)',
COUNT(li.database_id)-SUM(CAST(li.vlf_active AS INT)) AS 'Inactive VLF',
SUM(li.vlf_size_mb)-SUM(li.vlf_active*li.vlf_size_mb) AS 'Inactive VLF Size (MB)'
FROM sys.databases s
CROSS APPLY sys.dm_db_log_info(s.database_id) li
GROUP BY [name]
ORDER BY COUNT(li.database_id) DESC;
check why log reused in database which give description to resolve issue.
SELECT name, log_reuse_wait_desc FROM sys.databases;

SQL Server: How to retrieve last 12 hours of logs using fn_dblog?

Having accidentally nullified a column in MS SQL 2012, I'm looking at how to use fn_dblog for the first time. I had previously backed up the table, and deleted it this morning. I am using full recovery mode (code below for anyone in the future who would like to find out):
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'model' ;
GO
Is it possible to rollback a DROP TABLE transaction that was committed within the past 12 hours?
I found this transaction which seems to be exactly what I want. But only the last 3431 rows are found:
SELECT [Current LSN],
[Operation],
[Transaction ID],
[Parent Transaction ID],
[Begin Time],
[Transaction Name],
[Transaction SID]
FROM fn_dblog(NULL, NULL)
WHERE [Operation] = 'LOP_BEGIN_XACT'
How can I return earlier transactions using this query?
I am in unfamiliar territory here. What else should I be thinking of?
How do I know if logs exist and haven't been truncated?
Is it easier to reinstate a table vs a column deletion?
What are the dangers of using fn_fblog? On a blog post I found this (https://raresql.com/2013/04/15/sql-server-undocumented-function-fn_dblog/):
"No doubt fn_dblog is one of the helpful undocumented functions but do not use this function in the production server unless otherwise required." What is the reason for this?
=== EDIT ===
On a side note, a very helpful article is here:
A very helpful introduction to MS SQL logging:
http://www.sqlshack.com/reading-sql-server-transaction-log/

Linked Server Query Runs But Doesn't Finish?

June 29, 2010 - I had an un-committed action from a previous delete statement. I committed the action and I got another error about conflicting primary id's. I can fix that. So morale of the story, commit your actions.
Original Question -
I'm trying to run this query:
with spd_data as (
select *
from openquery(IRPROD,'select * from budget_user.spd_data where fiscal_year = 2010')
)
insert into [IRPROD]..[BUDGET_USER].[SPD_DATA_BUD]
(REC_ID, FISCAL_YEAR, ENTITY_CODE, DIVISION_CODE, DEPTID, POSITION_NBR, EMPLID,
spd_data.NAME, JOB_CODE, PAY_GROUP_CODE, FUND_CODE, FUND_SOURCE, CLASS_CODE,
PROGRAM_CODE, FUNCTION_CODE, PROJECT_ID, ACCOUNT_CODE, SPD_ENC_AMT, SPD_EXP_AMT,
SPD_FB_ENC_AMT, SPD_FB_EXP_AMT, SPD_TUIT_ENC_AMT, SPD_TUIT_EXP_AMT,
spd_data.RUNDATE, HOME_DEPTID, BUD_ORIG_AMT, BUD_APPR_AMT)
SELECT REC_ID, FISCAL_YEAR, ENTITY_CODE, DIVISION_CODE, DEPTID, POSITION_NBR, EMPLID,
spd_data.NAME, JOB_CODE, PAY_GROUP_CODE, FUND_CODE, FUND_SOURCE, CLASS_CODE,
PROGRAM_CODE, FUNCTION_CODE, PROJECT_ID, ACCOUNT_CODE, SPD_ENC_AMT, SPD_EXP_AMT,
SPD_FB_ENC_AMT, SPD_FB_EXP_AMT, SPD_TUIT_ENC_AMT, SPD_TUIT_EXP_AMT,
spd_data.RUNDATE, HOME_DEPTID, lngOrig_amt, lngAppr_amt
from spd_data
left join Budgets.dbo.tblAllPosDep on project_id = projid
and job_code = jcc and position_nbr = psno
and emplid = empid
where OrgProjTest = 'EQUAL';
Basically I'm selecting a table from IRPROD (an oracle db), joining it with a local table, and inserting the results back on IRPROD.
The problem I'm having is that while the query runs, it never stops. I've let it run for an hour and it keeps going until I cancel it. I can see on a bandwidth monitor on the SQL Server data going in and out. Also, if I just run the select part of the query it returns the results in 4 seconds.
Any ideas why it's not finishing? I've got other queryies setup in a similar manner and do not have any problems (granted those insert from local tables and not a remote table).
You didn't included any volume metrics. But I would recommend to use a temporary table to gather the results.
Then you should try to insert the first couple of rows. If this succeeds you'll have a strong indicator that everything is fine.
Try to break down each insert task by project_id or emplid to avoid large transactions logs.
You should also think about crafting a bulk batch process.
If you run just the select without the insert, how many records are returned? Does the data look right or are there multiple records due to the join?
Are there triggers on the table you are inserting into? If you are returning many records and triggers are on the table that are designed to run row-byrow this could be slowing things down. You are also sending to another server, so the network pipeline could be what is slowing you down. Maybe it would be better to send the budget data to the Oracle server and do the insert from there rather than from the SQL Server.

Next log backup - will it contain any transactions?

I'm trying to efficiently determine if a log backup will contain any data.
The best I have come up with is the following:
DECLARE #last_lsn numeric(25,0)
SELECT #last_lsn = last_log_backup_lsn
FROM sys.database_recovery_status WHERE database_id = DB_ID()
SELECT TOP 1 [Current LSN] FROM ::fn_dblog(#last_lsn, NULL)
The problem is when there are no transactions since the last backup, fn_dblog throws error 9003 with severity 20(!!) and logs it to the ERRORLOG file and event log. That makes me nervous -- I wish it just returned no records.
FYI, the reason I care is I have hundreds of small databases that can have activity at any time of day, but are typically used 8 hours/day. That means 2/3 of my log backups are empty. Those extra thousands of files can have a measurable impact on the time required for both off-site backup and recovering from a disaster.
I figured out an answer that works for my particular application. If I compare the results of the following two queries, I can determine if any activity has occurred on the database since the last log backup.
SELECT MAX(backup_start_date) FROM msdb..backupset WHERE type = 'L' AND database_name = DB_NAME();
SELECT MAX(last_user_update) FROM sys.dm_db_index_usage_stats WHERE database_id = DB_ID() AND last_user_update IS NOT NULL;
If I run
SELECT [Current LSN] FROM ::fn_dblog(null, NULL)
It seems to return my current LSN at the top that matches the last log backup.
What happens if you change the select from ::fn_dblog to a count(*)? Does that eliminate the error?
If not, maybe select the log records into a temp table (top 100 from ::fn_dblog(null, NULL), ordering by a date, if there is one) and then query that.

CPU utilization by database?

Is it possible to get a breakdown of CPU utilization by database?
I'm ideally looking for a Task Manager type interface for SQL server, but instead of looking at the CPU utilization of each PID (like taskmgr) or each SPID (like spwho2k5), I want to view the total CPU utilization of each database. Assume a single SQL instance.
I realize that tools could be written to collect this data and report on it, but I'm wondering if there is any tool that lets me see a live view of which databases are contributing most to the sqlservr.exe CPU load.
Sort of. Check this query out:
SELECT total_worker_time/execution_count AS AvgCPU
, total_worker_time AS TotalCPU
, total_elapsed_time/execution_count AS AvgDuration
, total_elapsed_time AS TotalDuration
, (total_logical_reads+total_physical_reads)/execution_count AS AvgReads
, (total_logical_reads+total_physical_reads) AS TotalReads
, execution_count
, SUBSTRING(st.TEXT, (qs.statement_start_offset/2)+1
, ((CASE qs.statement_end_offset WHEN -1 THEN datalength(st.TEXT)
ELSE qs.statement_end_offset
END - qs.statement_start_offset)/2) + 1) AS txt
, query_plan
FROM sys.dm_exec_query_stats AS qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) AS st
cross apply sys.dm_exec_query_plan (qs.plan_handle) AS qp
ORDER BY 1 DESC
This will get you the queries in the plan cache in order of how much CPU they've used up. You can run this periodically, like in a SQL Agent job, and insert the results into a table to make sure the data persists beyond reboots.
When you read the results, you'll probably realize why we can't correlate that data directly back to an individual database. First, a single query can also hide its true database parent by doing tricks like this:
USE msdb
DECLARE #StringToExecute VARCHAR(1000)
SET #StringToExecute = 'SELECT * FROM AdventureWorks.dbo.ErrorLog'
EXEC #StringToExecute
The query would be executed in MSDB, but it would poll results from AdventureWorks. Where should we assign the CPU consumption?
It gets worse when you:
Join between multiple databases
Run a transaction in multiple databases, and the locking effort spans multiple databases
Run SQL Agent jobs in MSDB that "work" in MSDB, but back up individual databases
It goes on and on. That's why it makes sense to performance tune at the query level instead of the database level.
In SQL Server 2008R2, Microsoft introduced performance management and app management features that will let us package a single database in a distributable and deployable DAC pack, and they're promising features to make it easier to manage performance of individual databases and their applications. It still doesn't do what you're looking for, though.
For more of those, check out the T-SQL repository at Toad World's SQL Server wiki (formerly at SQLServerPedia).
Updated on 1/29 to include total numbers instead of just averages.
SQL Server (starting with 2000) will install performance counters (viewable from Performance Monitor or Perfmon).
One of the counter categories (from a SQL Server 2005 install is:)
- SQLServer:Databases
With one instance for each database. The counters available however do not provide a CPU % Utilization counter or something similar, although there are some rate counters, that you could use to get a good estimate of CPU. Example would be, if you have 2 databases, and the rate measured is 20 transactions/sec on database A and 80 trans/sec on database B --- then you would know that A contributes roughly to 20% of the total CPU, and B contributes to other 80%.
There are some flaws here, as that's assuming all the work being done is CPU bound, which of course with databases it's not. But that would be a start I believe.
Here's a query that will show the actual database causing high load. It relies on the query cache which might get flushed frequently in low-memory scenarios (making the query less useful).
select dbs.name, cacheobjtype, total_cpu_time, total_execution_count from
(select top 10
sum(qs.total_worker_time) as total_cpu_time,
sum(qs.execution_count) as total_execution_count,
count(*) as number_of_statements,
qs.plan_handle
from
sys.dm_exec_query_stats qs
group by qs.plan_handle
order by sum(qs.total_worker_time) desc
) a
inner join
(SELECT plan_handle, pvt.dbid, cacheobjtype
FROM (
SELECT plan_handle, epa.attribute, epa.value, cacheobjtype
FROM sys.dm_exec_cached_plans
OUTER APPLY sys.dm_exec_plan_attributes(plan_handle) AS epa
/* WHERE cacheobjtype = 'Compiled Plan' AND objtype = 'adhoc' */) AS ecpa
PIVOT (MAX(ecpa.value) FOR ecpa.attribute IN ("dbid", "sql_handle")) AS pvt
) b on a.plan_handle = b.plan_handle
inner join sys.databases dbs on dbid = dbs.database_id
I think the answer to your question is no.
The issue is that one activity on a machine can cause load on multiple databases. If I have a process that is reading from a config DB, logging to a logging DB, and moving transactions in and out of various DBs based on type, how do I partition the CPU usage?
You could divide CPU utilization by the transaction load, but that is again a rough metric that may mislead you. How would you divide transaction log shipping from one DB to another, for instance? Is the CPU load in the reading or the writing?
You're better off looking at the transaction rate for a machine and the CPU load it causes. You could also profile stored procedures and see if any of them are taking an inordinate amount of time; however, this won't get you the answer you want.
With all said above in mind.
Starting with SQL Server 2012 (may be 2008 ?) , there is column database_id in sys.dm_exec_sessions.
It gives us easy calculation of cpu for each database for currently connected sessions. If session have disconnected, then its results have gone.
select session_id, cpu_time, program_name, login_name, database_id
from sys.dm_exec_sessions
where session_id > 50;
select sum(cpu_time)/1000 as cpu_seconds, database_id
from sys.dm_exec_sessions
group by database_id
order by cpu_seconds desc;
Take a look at SQL Sentry. It does all you need and more.
Regards,
Lieven
Have you looked at SQL profiler?
Take the standard "T-SQL" or "Stored Procedure" template, tweak the fields to group by the database ID (I think you have to used the number, you dont get the database name, but it's easy to find out using exec sp_databases to get the list)
Run this for a while and you'll get the total CPU counts / Disk IO / Wait etc. This can give you the proportion of CPU used by each database.
If you monitor the PerfMon counter at the same time (log the data to a SQL database), and do the same for the SQL Profiler (log to database), you may be able to correlate the two together.
Even so, it should give you enough of a clue as to which DB is worth looking at in more detail. Then, do the same again with just that database ID and look for the most expensive SQL / Stored Procedures.
please check this query:
SELECT
DB_NAME(st.dbid) AS DatabaseName
,OBJECT_SCHEMA_NAME(st.objectid,dbid) AS SchemaName
,cp.objtype AS ObjectType
,OBJECT_NAME(st.objectid,dbid) AS Objects
,MAX(cp.usecounts)AS Total_Execution_count
,SUM(qs.total_worker_time) AS Total_CPU_Time
,SUM(qs.total_worker_time) / (max(cp.usecounts) * 1.0) AS Avg_CPU_Time
FROM sys.dm_exec_cached_plans cp
INNER JOIN sys.dm_exec_query_stats qs
ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE DB_NAME(st.dbid) IS NOT NULL
GROUP BY DB_NAME(st.dbid),OBJECT_SCHEMA_NAME(objectid,st.dbid),cp.objtype,OBJECT_NAME(objectid,st.dbid)
ORDER BY sum(qs.total_worker_time) desc

Resources