I am running a delete statement:
DELETE FROM TransactionEntries
WHERE SessionGUID = @SessionGUID
The actual execution plan of the delete is:
Execution Tree
--------------
Clustered Index Delete(
OBJECT:([GrobManagementSystemLive].[dbo].[TransactionEntries].[IX_TransactionEntries_SessionGUIDTransactionGUID]),
WHERE:([TransactionEntries].[SessionGUID]=[@SessionGUID])
)
The table is clustered by SessionGUID, so the 240 rows are physically together.
The table has no triggers on it.
The operation takes:
Duration: 11821 ms
CPU: 297
Reads: 14340
Writes: 1707
The table contains 11 indexes:
1 clustered index (SessionGUID)
1 unique (primary key) index
9 other non-unique, non-clustered indexes
How can I figure out why this delete operation is performing 14,340 reads and takes 11 seconds?
the Avg. Disk Read Queue Length reaches 0.8
the Avg. Disk sec/Read never exceeds 4ms
the Avg. Disk Write Queue Length reaches 0.04
the Avg. Disk sec/Write never exceeds 4ms
What are the other reads for? The execution plan gives no indication of what it's reading.
Update:
EXECUTE sp_spaceused TransactionEntries
TransactionEntries
Rows 6,696,199
Data: 1,626,496 KB (249 bytes per row)
Indexes: 7,303,848 KB (1117 bytes per row)
Unused: 91,648 KB
============
Reserved: 9,021,992 KB (1380 bytes per row)
With 1,380 bytes per row and 240 rows, that's only about 330 KB to be deleted.
It's counterintuitive that deleting 330 KB can be so difficult.
Update Two: Fragmentation
Name Scan Density Logical Fragmentation
============================= ============ =====================
IX_TransactionEntries_Tran... 12.834 48.392
IX_TransactionEntries_Curr... 15.419 41.239
IX_TransactionEntries_Tran... 12.875 48.372
TransactionEntries17 98.081 0.0049325
TransactionEntries5 12.960 48.180
PK_TransactionEntries 12.869 48.376
TransactionEntries18 12.886 48.480
IX_TranasctionEntries_CDR... 12.799 49.157
IX_TransactionEntries_CDR... 12.969 48.103
IX_TransactionEntries_Tra... 13.181 47.127
I defragmented TransactionEntries17
DBCC INDEXDEFRAG (0, 'TransactionEntries', 'TransactionEntries17')
since INDEXDEFRAG is an "online operation" (i.e. it only holds IS (Intent Shared) locks). I was going to manually defragment the others next, until business operations called, saying that the system was dead, and they switched to doing everything on paper.
What say you: can 50% fragmentation and only 12% scan density cause horrible index scan performance?
As @JoeStefanelli points out in comments, it's the extra non-clustered indexes.
You are deleting 240 rows from the table.
This equates to 2,640 index rows (240 rows × 11 indexes), 240 of which include all fields in the table.
Depending on how wide they are and how many included fields you have, this could equate to all the extra read activity you are seeing.
The non-clustered index rows will definitely NOT be grouped together on disk, which will increase delays.
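To see how much of the table's footprint each index contributes (and therefore how much extra work every delete implies), a query along these lines can help - a sketch, assuming you run it in the database that owns the table:

```sql
-- Per-index size and row count for the table being deleted from;
-- every one of these indexes must be maintained by the DELETE.
SELECT i.name                 AS index_name,
       i.type_desc,
       ps.row_count,
       ps.used_page_count * 8 AS used_kb
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.TransactionEntries')
ORDER BY used_kb DESC;
```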
I think the indexing might be the likeliest culprit, but I wanted to throw out another possibility. You mentioned no triggers, but are there any tables that have a foreign key relationship to this table? They would have to be checked to make sure no referencing records exist, and if you have cascade delete turned on, those records would have to be deleted as well.
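Checking for referencing foreign keys is quick - a sketch, run in the same database:

```sql
-- Foreign keys pointing at TransactionEntries; each one is validated
-- (or cascaded) for every row you delete.
SELECT fk.name                           AS fk_name,
       OBJECT_NAME(fk.parent_object_id)  AS referencing_table,
       fk.delete_referential_action_desc AS on_delete
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID('dbo.TransactionEntries');
```

If a referencing table has no index on its foreign key column, every delete scans it, which can also show up as unexplained reads.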
Having banged my head on many a SQL performance issue, my standard operating procedure for something like this is to:
1. Back up the data
2. Delete one of the indexes on the table in question
3. Measure the operation
4. Restore the DB
Repeat with step 2 until step 3 shows a drastic change. That's likely your culprit.
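A less destructive variant of step 2: instead of dropping the index and restoring the database, disable the nonclustered index and rebuild it afterwards (a sketch - do not disable the clustered index, as that makes the table inaccessible):

```sql
-- Disable one nonclustered index; its definition is kept but its data is dropped
ALTER INDEX TransactionEntries17 ON dbo.TransactionEntries DISABLE;

-- ... run and measure the DELETE here ...

-- Bring the index back
ALTER INDEX TransactionEntries17 ON dbo.TransactionEntries REBUILD;
```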
Related
I'm running 5 nodes in one DC of Cassandra 3.10.
As I'm trying to maintain those nodes, I'm running this daily on every node:
nodetool repair -pr
and weekly
nodetool repair -full
This is the only table I'm having difficulties with:
Table: user_tmp
SSTable count: 4
Space used (live): 366.71 MiB
Space used (total): 366.71 MiB
Space used by snapshots (total): 216.87 MiB
Off heap memory used (total): 5.28 MiB
SSTable Compression Ratio: 0.4690289976332873
Number of keys (estimate): 1968368
Memtable cell count: 2353
Memtable data size: 84.98 KiB
Memtable off heap memory used: 0 bytes
Memtable switch count: 1108
Local read count: 62938927
Local read latency: 0.324 ms
Local write count: 62938945
Local write latency: 0.018 ms
Pending flushes: 0
Percent repaired: 76.94
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 4.51 MiB
Bloom filter off heap memory used: 4.51 MiB
Index summary off heap memory used: 717.62 KiB
Compression metadata off heap memory used: 76.96 KiB
Compacted partition minimum bytes: 51
Compacted partition maximum bytes: 654949
Compacted partition mean bytes: 194
Average live cells per slice (last five minutes): 2.503074492537404
Maximum live cells per slice (last five minutes): 179
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Dropped Mutations: 19 bytes
Percent repaired is never above 80% for this table on this node and one other, but on the rest it is above 85%. RF is 3, and the compaction strategy is SizeTieredCompactionStrategy.
gc_grace_seconds is set to 10 days, and somewhere in that period I get a write timeout on exactly this table; but the consumer that got the timeout is immediately replaced by another one, and everything keeps going like nothing happened. It's like a one-time write timeout.
My question is: do you maybe have a suggestion for a better repair strategy? I'm kind of a noob and every suggestion is a big win for me, plus anything else for this table.
Maybe repair -inc instead of repair -pr?
The nodetool repair command in Cassandra 3.10 defaults to running incremental repair. There have been some major issues with incremental repair, and it's currently not recommended by the community to run it. Please see this article for some great insight into repair and the issues with incremental repair: http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html
I would recommend, as do many others, to run:
nodetool repair -full -pr
Please be aware that you need to run repair on every node in your cluster. This means that if you run repair on one node per day, you can have at most 7 nodes (since with the default gc_grace you should aim to finish repair within 7 days). You also have to rely on nothing going wrong during a repair, since you would have to restart any failing jobs.
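For example, a staggered schedule on a 7-night rotation might look like this (a sketch; each node gets its own weekday):

```shell
# crontab on node1 - full primary-range repair every Monday at 02:00
# (node2 uses Tuesday, node3 Wednesday, and so on)
0 2 * * 1  nodetool repair -full -pr
```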
This is why tools like Reaper exist. It solves these issues with ease: it automates repair and makes life simpler. Reaper runs scheduled repairs and provides a web interface to make administration easier. I would highly recommend using Reaper for routine maintenance and nodetool repair for unplanned activities.
Edit: Link http://cassandra-reaper.io/
I have a slight problem with a PostgreSQL database with exploding read times.
Background info:
two tables, both with only 4 columns: uuid (uuid), timestamp (bigint), type (text) and value (double) in one; values (double[]) in the other. (Yes, I thought about combining them into one table... the decision on that isn't in my hands.)
Given that only a fairly small amount of the held data is needed for each "projectrun", I'm already copying the needed data into tables dedicated to each projectrun. Now the interesting part starts when I try to read the data:
CREATE TABLE fake_timeseries1
(
"timestamp" bigint,
uuid uuid,
value double precision,
type text COLLATE pg_catalog."default"
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE fake_timeseries1
OWNER to user;
CREATE INDEX fake_timeseries1_timestamp_idx
ON fake_timeseries1 USING btree
(timestamp)
TABLESPACE pg_default;
ALTER TABLE fake_timeseries1
CLUSTER ON fake_timeseries1_timestamp_idx;
From that temporary table I do:
"SELECT * FROM table_name WHERE timestamp BETWEEN ? AND ? ;"
Simple enough, should work rather fast, right? Wrong.
At the moment I'm testing with small batches (only x*40k rows, returning 25% of them).
For 10k rows it takes "only" 6 sec, for 20k already 34 sec, and for 40k rows (out of a mere 160k) it already takes 3 minutes per table... 6 minutes for a mere 6 MB of data. (Yes, we are on a gigabit line, so there's probably no bottleneck there.)
I already tried using an index and clustering on timestamp, but that slows it down even more. Interestingly, not the creation of the temporary tables, but the reading of the data.
What could I do to speed up the read process? It needs to be able to read those 10-50k rows in less than 5 minutes (preferably less than 1 minute) from a table that holds not 160k rows, but rather tens of millions.
What could be responsible for a simple Select being as slow as creating the whole table in the first place? (3 mins read vs. 3.5 mins create).
Thank you in advance.
As requested, an EXPLAIN ANALYZE (for 20k out of 80k rows):
[
  {
    "Execution Time": 27.501,
    "Planning Time": 0.514,
    "Plan": {
      "Filter": "((\"timestamp\" >= '1483224970970'::bigint) AND (\"timestamp\" <= '1483232170970'::bigint))",
      "Node Type": "Seq Scan",
      "Relation Name": "fake_timeseries1",
      "Alias": "fake_timeseries1",
      "Actual Rows": 79552,
      "Rows Removed by Filter": 0,
      "Actual Loops": 1
    },
    "Triggers": []
  }
]
The real execution time was 34.047 seconds.
UPDATE:
I continued testing with different test data sets. The following is an analyze from a significantly larger test set, where I read only 0.25% of the data... still using a seq scan. Does anyone have an idea?
[
{
"Execution Time": 7121.59,
"Planning Time": 0.124,
"Plan": {
"Filter": "((\"timestamp\" >= '1483224200000'::bigint) AND (\"timestamp\" <= '1483233200000'::bigint))",
"Node Type": "Seq Scan",
"Relation Name": "fake_forecast",
"Alias": "fake_forecast",
"Actual Rows": 171859,
"Rows Removed by Filter": 67490381,
"Actual Loops": 1
},
"Triggers": []
}
]
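Two quick things to try when a range predicate like this keeps seq-scanning - a sketch, run in the same session (fake_forecast as in the plan above):

```sql
-- Make sure the planner has fresh statistics for the table
ANALYZE fake_forecast;

-- See buffer hits/reads alongside timings for the slow query
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM fake_forecast
WHERE "timestamp" BETWEEN 1483224200000 AND 1483233200000;

-- Temporarily discourage seq scans to check whether an index plan exists at all
SET enable_seqscan = off;
```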
UPDATE: After even more testing, on a second PostgreSQL database, it seems that I have somehow hit a hard cap.
Whatever I do, the maximum I can get out of those two tables is 3.3k rows per second. And that's only if I hit the sweet spot of requesting 20-80k rows in a large batch, which takes 6 and 24 seconds respectively, even on a DB on my own machine.
Is there nothing that can be done (except better hardware) to speed this up?
We have an application that writes logs into Azure SQL tables. The structure of the table is the following.
CREATE TABLE [dbo].[xyz_event_history]
(
[event_history_id] [uniqueidentifier] NOT NULL,
[event_date_time] [datetime] NOT NULL,
[instance_id] [uniqueidentifier] NOT NULL,
[scheduled_task_id] [int] NOT NULL,
[scheduled_start_time] [datetime] NULL,
[actual_start_time] [datetime] NULL,
[actual_end_time] [datetime] NULL,
[status] [int] NOT NULL,
[log] [nvarchar](max) NULL,
CONSTRAINT [PK__crg_scheduler_event_history] PRIMARY KEY NONCLUSTERED
(
[event_history_id] ASC
)
)
The table is stored as a clustered index on the scheduled_task_id column (non-unique).
CREATE CLUSTERED INDEX [IDX__xyz_event_history__scheduled_task_id] ON [dbo].[xyz_event_history]
(
[scheduled_task_id] ASC
)
The event_history_id is generated by the application; it's a random (not sequential) GUID. The application creates, updates, and removes old entities from the table. The log column usually holds 2-10 KB of data, but it can grow up to 5-10 MB in some cases. The items are usually accessed by PK (event_history_id), and the most frequent sort order is event_date_time desc.
The problem we see after we lowered the performance tier of the Azure SQL database to "S3" (100 DTUs) is crossing transaction log rate limits. It can be clearly seen in the sys.dm_exec_requests view: there will be records with the wait type LOG_RATE_GOVERNOR (msdn).
Occurs when DB is waiting for quota to write to the log.
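The throttled sessions can be watched live with a sketch like:

```sql
-- Requests currently waiting on the log rate governor
SELECT session_id, command, wait_type, wait_time, total_elapsed_time
FROM sys.dm_exec_requests
WHERE wait_type = 'LOG_RATE_GOVERNOR';
```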
The operations I've noticed that cause a big impact on the log rate are deletions from xyz_event_history and updates to its log column. The updates are made in the following fashion.
UPDATE xyz_event_history
SET [log] = COALESCE([log], '') + @log_to_append
WHERE event_history_id = @id
The recovery model for Azure SQL databases is FULL and can not be changed.
Here are the physical index statistics; there are many pages that crossed the 8K-per-row limit.
TableName AllocUnitTp PgCt AvgPgSpcUsed RcdCt MinRcdSz MaxRcdSz
xyz_event_history IN_ROW_DATA 4145 47.6372868791698 43771 102 7864
xyz_event_history IN_ROW_DATA 59 18.1995058067705 4145 11 19
xyz_event_history IN_ROW_DATA 4 3.75277983691623 59 11 19
xyz_event_history IN_ROW_DATA 1 0.914257474672597 4 11 19
xyz_event_history LOB_DATA 168191 97.592290585619 169479 38 8068
xyz_event_history IN_ROW_DATA 7062 3.65090190264393 43771 38 46
xyz_event_history IN_ROW_DATA 99 22.0080800593032 7062 23 23
xyz_event_history IN_ROW_DATA 1 30.5534964170991 99 23 23
xyz_event_history IN_ROW_DATA 2339 9.15620212503089 43771 16 38
xyz_event_history IN_ROW_DATA 96 8.70488015814184 2339 27 27
xyz_event_history IN_ROW_DATA 1 34.3711391153941 96 27 27
xyz_event_history IN_ROW_DATA 1054 26.5034840622683 43771 28 50
xyz_event_history IN_ROW_DATA 139 3.81632073140598 1054 39 39
xyz_event_history IN_ROW_DATA 1 70.3854707190511 139 39 39
Is there a way to reduce transaction log usage?
How does SQL Server log update transactions such as the one above? Is it just "old" plus "new" value? (That would conceivably make frequently appending small pieces of data quite inefficient in terms of transaction log size.)
UPDATE (April, 20):
I've made some experiments with the suggestions in the answers and was impressed by the difference that INSERT instead of UPDATE makes.
As per the following MSDN article about SQL Server transaction log internals (https://technet.microsoft.com/en-us/library/jj835093(v=sql.110).aspx):
Log records for data modifications record either the logical operation
performed or they record the before and after images of the modified
data. The before image is a copy of the data before the operation is
performed; the after image is a copy of the data after the operation
has been performed.
This automatically makes the scenario with UPDATE ... SET X = X + 'more' highly inefficient in terms of transaction log usage: it requires a "before image" capture.
I've created a simple test suite to compare the original way of appending data to the "log" column with the approach where we just insert each new piece of data into a new table. The results I got were quite astonishing (at least for me, a not-too-experienced-with-SQL-Server guy).
The test is simple: 5,000 times, append a 1,024-character chunk of log - just 5 MB of text as the result (not as much as one might think).
FULL recovery mode, SQL Server 2014, Windows 10, SSD
                 UPDATE        INSERT
Duration         07:48 (!)     00:02
Data file grow   ~8 MB         ~8 MB
Tran. log grow   ~218 MB (!)   0 MB (why?!)
Just 5,000 updates that each add 1 KB of data can hang SQL Server for 8 minutes (wow!) - I didn't expect that!
I think original question is resolved at this point, but the following ones raised:
Why does the transaction log growth look linear (not quadratic, as we could expect when simply capturing "before" and "after" images)? From the diagram we can see that "items per second" grows proportionally to the square root, which is what we'd expect if the overhead grows linearly with the number of items inserted.
Why, in the case of inserts, does the transaction log appear to have the same size as before any inserts at all?
I took a look at the transaction log (with Dell's Toad) for the case with inserts, and it looks like only the last 297 items are in there; conceivably the transaction log got truncated, but why, if it's in FULL recovery mode?
UPDATE (April, 21).
DBCC LOGINFO output for the INSERT case, before and after. The physical size of the log file matches the output: exactly 1,048,576 bytes on disk.
Why does it look like the transaction log stands still?
RecoveryUnitId FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
0 2 253952 8192 131161 0 64 0
0 2 253952 262144 131162 2 64 0
0 2 253952 516096 131159 0 128 0
0 2 278528 770048 131160 0 128 0
RecoveryUnitId FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
0 2 253952 8192 131221 0 128 0
0 2 253952 262144 131222 0 128 0
0 2 253952 516096 131223 2 128 0
0 2 278528 770048 131224 2 128 0
For those who are interested, I've recorded "sqlservr.exe" activity using Process Monitor. I can see the file being overwritten again and again; it looks like SQL Server treats old log items as no longer needed for some reason: https://dl.dropboxusercontent.com/u/1323651/stackoverflow-sql-server-transaction-log.pml.
UPDATE (April, 24). It seems I've finally started to understand what is going on there and want to share it with you. The reasoning above is true in general, but has a serious caveat that also produced the confusion about the strange transaction log re-use with INSERTs.
A database will behave as if in SIMPLE recovery mode until the first full
backup is taken (even though it's in FULL recovery mode).
We can treat the numbers and diagram above as valid for SIMPLE recovery mode, and I had to redo my measurements for real FULL recovery; they are even more astonishing.
                 UPDATE        INSERT
Duration         13:20 (!)     00:02
Data file grow   8 MB          11 MB
Tran. log grow   55.2 GB (!)   14 MB
You are violating one of the basic tenets of normal form with the log field. The log field seems to be holding an appended sequence of info related to the primary key. The fix is to stop doing that.
1. Create a table xyz_event_history_LOG(event_history_id, log_sequence#, log).
2. Stop doing updates to the log field in [xyz_event_history]; instead, do inserts into xyz_event_history_LOG.
The amount of data in your transaction log will decrease GREATLY.
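A minimal sketch of that shape (the column types mirror the original table; log_sequence and the parameter names are assumptions):

```sql
CREATE TABLE [dbo].[xyz_event_history_LOG]
(
    [event_history_id] [uniqueidentifier] NOT NULL,
    [log_sequence]     [int]              NOT NULL,
    [log]              [nvarchar](max)    NULL,
    CONSTRAINT [PK__xyz_event_history_LOG]
        PRIMARY KEY CLUSTERED ([event_history_id], [log_sequence])
);

-- Instead of UPDATE ... SET [log] = [log] + @log_to_append:
INSERT INTO [dbo].[xyz_event_history_LOG] ([event_history_id], [log_sequence], [log])
VALUES (@id, @next_sequence, @log_to_append);
```

Reading the full log back is then a SELECT ... ORDER BY log_sequence plus concatenation on the application side.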
The transaction log contains all the changes to a database in the order they were made, so if you update a row multiple times you will get multiple entries for that row. It does store the entire value, old and new, so you are correct that multiple small updates to a large data type such as nvarchar(max) would be inefficient. You would be better off storing the updates separately if they are only small values.
I have a table mytable with some columns including the column datekey (which is a date and has an index), a column contents which is a varbinary(max), and a column stringhash which is a varchar(100). The stringhash and the datekey together form the primary key of the table. Everything is running on my local machine.
Running
SELECT TOP 1 * FROM mytable where datekey='2012-12-05'
returns 0 rows and takes 0 seconds.
But if I add a datalength condition:
SELECT TOP 1 * FROM mytable where datekey='2012-12-05' and datalength(contents)=0
it runs for a very long time and does not return anything before I give up waiting.
My question:
Why? How do I find out why this takes such a long time?
Here is what I checked so far:
When I click "Display estimated execution plan" it also takes a very long time and does not return anything before I give up waiting.
If I do
SELECT TOP 1000 datalength(contents) FROM mytable order by datalength(contents) desc
it takes 7 seconds and returns a list 4228081, 4218689 etc.
exec sp_spaceused 'mytable'
returns
rows reserved data index_size unused
564019 50755752 KB 50705672 KB 42928 KB 7152 KB
So the table is quite large at 50 GB.
Running
SELECT TOP 1000 * FROM mytable
takes 26 seconds.
The sqlservr.exe process is around 6 GB which is the limit I have set for the database.
It takes a long time because your query needs DATALENGTH to be evaluated for every row it scans before the first matching record can be returned.
If the DATALENGTH of the field (or whether it contains any value) is something you're likely to query repeatedly, I would suggest an additional indexed field (perhaps a persisted computed field) holding the result, and searching on that.
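A sketch of that idea (the column and index names are made up):

```sql
-- Persisted computed column holding the LOB length, so queries never touch the LOB
ALTER TABLE mytable
    ADD contents_length AS DATALENGTH(contents) PERSISTED;

-- Index it so the zero-length check becomes a seek instead of a scan
CREATE INDEX IX_mytable_contents_length ON mytable (contents_length);

-- The original query can then be rewritten as:
SELECT TOP 1 * FROM mytable
WHERE datekey = '2012-12-05' AND contents_length = 0;
```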
This old MSDN blog post seems to agree with @MartW's answer that DATALENGTH is evaluated for every row. But it's good to understand what is really meant by "evaluated" and what the real root of the performance degradation is.
As mentioned in the question, the size of each value in the contents column may be large. That means every value bigger than ~8 KB is stored in special LOB storage. Taking into account the size of the other columns, it's clear that most of the space occupied by the table is taken up by this LOB storage, i.e. around 50 GB.
Even if the length of the contents column has already been evaluated for every row, as demonstrated in the post linked above, the value itself is still stored in the LOB. So the engine still needs to read parts of the LOB storage to execute the query.
If the LOB storage isn't in RAM at the time of query execution, we need to read it from disk, which is of course much slower than reading from RAM. Also, the reads of LOB parts are probably more random than sequential, which is slower still, as it tends to increase the total number of blocks that have to be read from disk.
At the moment it probably won't be using the primary key, because the stringhash column comes before the datekey column in it. Try adding an additional index that contains just the datekey column. Once that index is created, if it's still slow you could also try an index hint such as (assuming the new index is named IX_datekey):
SELECT TOP 1 * FROM mytable WITH (INDEX(IX_datekey)) WHERE datekey='2012-12-05' AND datalength(contents)=0
You could also create a separate length column that's updated either in your application or in an insert/update trigger.
I have a large transaction table in SQL Server which is used to store about 400-500 records each day. What data type should I use for my PK column? The PK column stores numeric values, for which integer seems suitable, but I'm afraid it will exceed the maximum value for an integer since I add so many records every day.
I am currently using integer data type for my PK column.
With the type INT, starting at 1, you get over 2 billion possible rows - that should be more than sufficient for the vast majority of cases. With BIGINT, you get roughly 9.2 quintillion (a 9.2 with 18 zeros - 9.2 billion billions) - enough for you??
If you use an INT IDENTITY starting at 1, and you insert a row every second, you need 66.5 years before you hit the 2 billion limit... so with 400-500 rows per day, it will take centuries before you run out of possible values. Take 1,000 rows per day - you should be fine for 5,883 years - good enough?
If you use a BIGINT IDENTITY starting at 1, and you insert one thousand rows every second, you need a mind-boggling 292 million years before you hit the 9.2 quintillion limit...
Read more about it (with all the options there are) in the MSDN Books Online.
I may be wrong here, as maths has never been my strong point, but if you use bigint, the max value is 2^63-1 (9,223,372,036,854,775,807).
So if you divide that by, say, 500, to get roughly the number of days' worth of records, you get about 18,446,744,073,709,551 days' worth of 500 new records.
Divide again by 365 and you get about 50,539,024,859,478 years' worth of 500 records a day.
So: (((2^63 - 1) / 500) / 365)
If that's not me being stupid, then that's a lot of days :-)
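The arithmetic holds up; as a quick sanity check in T-SQL:

```sql
-- (((2^63 - 1) / 500) / 365): years' worth of 500 records per day
SELECT (CAST(9223372036854775807 AS float) / 500) / 365 AS years_at_500_per_day;
-- roughly 50,539,024,859,478 years
```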