SQL Server performance issue with multiple threads - sql-server

First I will describe the details of my scenario and then ask my question:
I am running an instance of SQL Server 2017 with the following properties (from SSMS):
product: SQL Server Express Edition (64-bit)
platform: NT x64
version: 10.0.2531.0
memory: 3980MB
processors: 4 (logical) 2 (physical)
collation: SQL_Latin_General_CP1_CI_AS
The instance is configured as follows:
recovery model is set to simple and I shrink the log file
via DBCC SHRINKFILE before I run my queries
Max server memory is set to 2147483647 and min to default
All connections to the instance are set to have an isolation level of read uncommitted
There are three tables whose total size is ~3.2GB upon which I perform selects and a single table, which is initialized to be empty, to which I perform inserts.
All the tables have clustered indexes on their relevant columns.
I execute a single stored procedure which performs all the operations on the above tables. The access pattern of the selects is random, i.e. each time the procedure is called it accesses random rows in the big tables (five rows in total across all tables per invocation) and adds a new row to the small table, so there are no collisions between the inserts.
I measure the execution time of the stored procedure via DATEDIFF inside the body of the stored procedure (a minimal sketch of this kind of timing appears below).
I run two alternative scenarios:
a single thread from my app calls the sp N times and receives an average time of 12 sec
4 threads from my app concurrently call the sp N/4 times each and receive average times of 0.8 sec, 3 sec, 5 sec, and 90 sec
All threads in all scenarios perform all their queries successfully
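For reference, a minimal sketch of that kind of in-procedure timing (the procedure name and the body placeholder are illustrative, not the actual procedure):
CREATE PROCEDURE dbo.usp_TimedDemo
AS
BEGIN
    DECLARE @start DATETIME2 = SYSDATETIME();

    -- ... the five random selects against the big tables and the single insert go here ...

    -- elapsed time measured entirely inside the procedure body
    SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS ElapsedMs;
END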
My hypothesis & question(s):
It seems logical that the single thread outperforms threads that run concurrently, due to the increased memory pollution caused by multiple threads accessing such big tables. But how come some concurrent threads outperform the single thread even though they are running with a more polluted memory? (I would expect the single thread to outperform ALL the concurrent threads.)
Is there a locking issue here even though the isolation level is read uncommitted?

Related

Concurrent executions of stored procedure

I have to load 3 billion rows from one database into another with some processing. I have a stored procedure which does the processing and loads the data into the target.
To speed up the process I'm using a primary key column value in source table as a parameter to the stored procedure.
When running the stored procedure in different sessions, it does not perform as expected.
Please let me know how to improve the performance with concurrent executions in different sessions.
For example, if I have IDs from 1 to 300000, I'm passing parameter ranges such as 1 to 1000, 1000 to 2000, 2000 to 3000, ... to the stored procedure.
exec sp1 1,1000----session1
exec sp1 1000,2000----session2
exec sp1 2000,3000---session3
......
...
If I run just one process it finishes fast, but if I run multiple processes it consumes more time.
This is not a full solution, just a couple of pieces of advice:
You are right to use batches to move data from one table to another. Many small transactions are better than one huge transaction.
Do not do it in parallel. Most likely your code blocks on the source or destination table, and my experience says not to run this in parallel. Write one loop, fetch the batches one by one, and move the data (see the sketch below).
Test batch sizes. Usually one batch of 1 million rows is better than 1000 batches of 1000 rows.
Yes, it can take a lot of time. You may need to add some code that allows users to keep working while the transfer is in progress; hiding the data behind a view, for example.
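A minimal sketch of that single-loop, range-based batch move (the table names, column names, and batch size are assumptions for illustration only):
DECLARE @BatchSize INT = 1000000;
DECLARE @LastID INT = 0;
DECLARE @MaxID INT = (SELECT MAX(ID) FROM dbo.SourceTable);

WHILE @LastID < @MaxID
BEGIN
    -- move one ID range per iteration; each iteration commits as its own small transaction
    INSERT INTO dbo.TargetTable (ID, Payload)
    SELECT ID, Payload
    FROM dbo.SourceTable
    WHERE ID > @LastID AND ID <= @LastID + @BatchSize;

    SET @LastID = @LastID + @BatchSize;
END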

SQL Server 2008 R2 Commit performance issue

I have a large SP that contains a primary outer cursor (I know, I know, but I had to examine each row to determine what to do with it, e.g. write to child tables, exception tables, reports, etc.) and some inner cursors. Within these cursors are calls to other SPs that do the actual INSERTs, UPDATEs, and DELETEs.
At the top of the outermost cursor I do a BEGIN TRAN, and at the end of the cursor, before I loop back to the top, I do a COMMIT TRAN, committing all the work (parent and children) for the outermost cursor's row.
This is a standalone process that runs with no users accessing the target DB as it happens during a software upgrade.
I have a debug statement that displays the duration in milliseconds to process the outer cursor. At most of our clients where I run this, that duration is pretty consistent throughout the whole process; however, at one client that duration gets progressively slower. Additionally, it appears that the last COMMIT at this client site takes 84 seconds to process, whereas at the others it does not impact the duration at all (same average time).
The code is identical between the clients. The isolation level is identical.
The sp_configure options are nearly identical.
The client's DBMS is on a virtual server with good-to-average SQLIO times to the data and log files. Testing a SELECT * INTO [table] FROM a million-row table only took 3 seconds, so writes and auto-committing seem OK.
Thoughts or ideas to diagnose further?
The issue was a poorly performing index used by one of the called SPs - it was doing a table scan, which was getting progressively slower as the seek times increased.

Release unused memory - SQL Server 2014 memory-optimized table

My system
Microsoft SQL Server 2014 (SP1-CU4) (KB3106660) - 12.0.4436.0 (X64)
Dec 2 2015 16:09:44 Copyright (c) Microsoft Corporation Enterprise
Edition: Core-based Licensing (64-bit) on Windows NT 6.3 (Build
9600: ) (Hypervisor)
I use two memory-optimized tables, table1 and table2 (each 27 GB in size).
I drop table1:
IF OBJECT_ID('table1') IS NOT NULL
BEGIN
    DROP TABLE [dbo].[table1]
END
Afterwards, the SQL Server "Memory Usage By Memory Optimized Objects" report shows:
Table Name = table2, Table Used Memory = 26582.50, Table Unused Memory = 26792.69
How can I run the SQL Server garbage collector manually? Is this possible or not?
I need the "Table Unused Memory" to be released, because another process always gives this error:
"There is insufficient system memory in resource pool 'Pool' to run this query."
Thank you
Data for memory optimized tables is held in data & delta files.
A delete statement will not remove the data from the data file but will insert a delete record into the delta file, hence your storage continues to be large.
The data & delta files are maintained in pairs known as checkpoint file pairs (CFPs). Over time, closed CFPs are merged, based upon a merge policy, from multiple CFPs into one merged target CFP.
A background thread evaluates all closed CFPs using a merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The evaluation of merge policy is done periodically and also when a checkpoint is closed.
You can force merge the files using stored procedure sys.sp_xtp_merge_checkpoint_files following a checkpoint.
EDIT
Run statement:
SELECT
container_id,
internal_storage_slot,
file_type_desc,
state_desc,
inserted_row_count,
deleted_row_count,
lower_bound_tsn,
upper_bound_tsn
FROM
sys.dm_db_xtp_checkpoint_files
ORDER BY
file_type_desc,
state_desc
Then find the rows with status UNDER CONSTRUCTION and make a note of
the lower and upper transaction id.
Now execute:
EXEC sys.sp_xtp_merge_checkpoint_files 'myDB',1003,1004;
where 1003 and 1004 is the lower and upper transaction id.
To completely remove the files you will have to:
Run Select statement from above
Run EXEC sys.sp_xtp_merge_checkpoint_files from above
Perform a Full Backup
CHECKPOINT
Backup the Log
EXEC sp_xtp_checkpoint_force_garbage_collection;
Checkpoint
Exec sp_filestream_force_garbage_collection 'MyDb' to remove files marked as Tombstone
You may need to run steps 3 - 7 twice to completely get rid of the files.
See The DBA who came to tea article
CFP's go through the following stages:
•PRECREATED – A small set of CFPs are kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized with data file size of 128MB and delta file size of 8 MB but contain no data. The number of CFPs is computed as the number of logical processors or schedulers with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables
•UNDER CONSTRUCTION – Set of CFPs that store newly inserted and possibly deleted data rows since the last checkpoint.
•ACTIVE - These contain the inserted/deleted rows from previous closed checkpoints. These CFPs contain all required inserted/deleted rows required before applying the active part of the transaction log at the database restart. We expect that size of these CFPs to be approximately 2x of the in-memory size of memory-optimized tables assuming merge operation is keeping up with the transactional workload.
•MERGE TARGET – CFP stores the consolidated data rows from the CFP(s) that were identified by the merge policy. Once the merge is installed, the MERGE TARGET transitions into ACTIVE state
•MERGED SOURCE – Once the merge operation is installed, the source CFPs are marked as MERGED SOURCE. Note that the merge policy evaluator may identify multiple merges, but a CFP can only participate in one merge operation.
•REQUIRED FOR BACKUP/HA – Once the merge has been installed and the MERGE TARGET CFP is part of durable checkpoint, the merge source CFPs transition into this state. CFPs in this state are needed for operational correctness of the database with memory-optimized table. For example, to recover from a durable checkpoint to go back in time. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range.
•IN TRANSITION TO TOMBSTONE – These CFPs are not needed by the in-memory OLTP engine and can be garbage collected. This state indicates that these CFPs are waiting for the background thread to transition them to the next state, TOMBSTONE.
•TOMBSTONE – These CFPs are waiting to be garbage collected by the filestream garbage collector. Please refer to FS Garbage Collection for details

SQL Server Delete Lock issue

I have a SQL Server database where I am deleting rows from three tables A,B,C in batches with some conditions through a SQL script scheduled in a SQL job. The job runs for 2 hours as the tables have a large amount of data. While the job is running, my front end application is not accessible (giving timeout error) since the application inserts and updates data in these same tables A,B,C.
Is it possible for the front end application to run in parallel without any issues while the SQL script is running? I have checked for locks on the tables and SQL Server is acquiring page locks. Can the Read Committed Snapshot or Snapshot isolation levels, or converting page locks to row locks, help here? Need advice.
Split the operation into two phases. In the first phase, collect the primary keys of the rows to delete:
create table #TempList (ID int);

insert #TempList
select ID
from YourTable
-- add here the conditions that identify the rows to delete
In the second phase, use a loop to delete those rows in small batches:
while 1 = 1
begin
    delete top (1000)
    from YourTable
    where ID in (select ID from #TempList);

    if @@rowcount = 0
        break;
end
The smaller batches will allow your front end applications to continue in between them.
I suspect that SQL Server at some point escalates to a table lock, and this means that the table is inaccessible for both reading and updating.
To optimize locking and concurrency when dealing with large deletes, use batches. Start with 5000 rows at a time (to prevent lock escalation) and monitor how it behaves and whether it needs further tuning up or down. 5000 is a "magic number": low enough that the lock manager doesn't consider escalating to a table lock, and large enough for performance.
Whether timeouts will happen or not depends on other factors as well, but this will surely reduce them, if not eliminate them altogether. If the timeouts happen on read operations, you should be able to get rid of them. Another approach, of course, is to increase the command timeout value on the client.
Snapshot (optimistic) isolation is an option as well, READ COMMITTED SNAPSHOT more precisely, but it won't help with updates from other sessions. Also, beware of version store (in tempdb) growth. It's best if you combine it with the proposed batch approach to keep the transactions small.
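If you go the optimistic route, READ COMMITTED SNAPSHOT is enabled per database; a minimal sketch, assuming a database named YourDb:
-- requires exclusive access to the database; WITH ROLLBACK IMMEDIATE kicks out other sessions
ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;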
Also, switch to bulk-logged recovery for the duration of the delete if the database is normally in full recovery. But switch back as soon as it finishes, and take a backup.
Almost forgot -- if it's the Enterprise edition of SQL Server, partition your table; then you can just switch the partition out, which is almost instantaneous, and the clients will never notice it.
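A rough sketch of the partition-switch idea (the partition number and table names are assumptions; the staging table must have the same structure as the partitioned table and live on the same filegroup as the switched partition):
-- move the partition's rows out of the main table; this is a metadata-only operation
ALTER TABLE dbo.BigTable SWITCH PARTITION 3 TO dbo.BigTable_Staging;

-- the switched-out rows can now be discarded without touching dbo.BigTable
TRUNCATE TABLE dbo.BigTable_Staging;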

Intermittent slow query on SQL Server 2008

I am developing a system which periodically (4-5 times daily) runs a select statement, that normally takes less than 10 seconds but periodically has taken up to 40 minutes.
The database is on Windows Server 2008 + SQL Server 2008 R2; both 64bit.
There is a service on the machine running the database which polls the database and generates values for records which require it. These records are then periodically queried using a multi-table join SELECT from a service on a second machine, written in C++ (VS 2010), using the MFC CRecordset class to extract the data. An example of the query causing the problem is shown below.
SELECT DISTINCT "JobKeysFrom"."Key" AS "KeyFrom","KeysFrom"."ID" AS "IDFrom",
"KeysFrom"."X" AS "XFrom","KeysFrom"."Y" AS "YFrom","JobKeysTo"."Key" AS "KeyTo",
"KeysTo"."ID" AS "IDTo","KeysTo"."X" AS "XTo","KeysTo"."Y" AS "YTo",
"Matrix"."TimeInSeconds","Matrix"."DistanceInMetres","Matrix"."Calculated"
FROM "JobKeys" AS "JobKeysFrom"
INNER JOIN "JobKeys" AS "JobKeysTo" ON
("JobKeysFrom"."Key"<>"JobKeysTo"."Key") AND
("JobKeysFrom"."JobID"=531) AND
("JobKeysTo"."JobID"=531)
INNER JOIN "Keys" AS "KeysFrom" ON
("JobKeysFrom"."Key"="KeysFrom"."Key") AND ("JobKeysFrom"."Status"=4)
INNER JOIN "Keys" AS "KeysTo" ON
("JobKeysTo"."Key"="KeysTo"."Key") AND ("JobKeysTo"."Status"=4)
INNER JOIN "Matrix" AS "Matrix" ON
("Matrix"."IDFrom"="KeysFrom"."ID") AND ("Matrix"."IDTo"="KeysTo"."ID")
ORDER BY "JobKeysFrom"."Key","JobKeysTo"."Key"
I have tried the following
checked the indexes; they all seem correct, they are active, and they are being used according to the query plan
the design advisor comes back with no suggestions
I have tried defragging the indexes and data
rebuilt the database from scratch by exporting the data and reimporting it in a new database.
ran the profiler on it and found that when it goes wrong it seems to do many millions (up to 100 million) of reads rather than a few hundred thousand.
ran the database on a different server
During the time it is running the query, I can run exactly the same query in a Management Studio window and it will be back to running in 10 seconds. The problem does not seem to be lock, deadlock, CPU, disk or memory related, as it has happened when the machine running the database was only running this one query. The server has 4 processors and 16 GB of memory to run it in. I have also tried upgrading the disks to much faster ones and this had no effect.
It seems to me that it is almost as though the database receives the query, starts to process it, and then either goes to sleep for 40 minutes or runs the query without using the indexes.
When it takes a long time it will eventually finish and send the query results (normally about 70,000-100,000 records) back to the calling application.
Any help or suggestions would be gratefully received, many thanks
This sounds very much like parameter sniffing.
When a stored procedure is invoked and there is no existing execution plan in the cache matching the set options for the connection, a new execution plan will be compiled using the parameter values passed in on that invocation.
Sometimes this will happen when the parameters passed are atypical (e.g. have unusually high selectivity) so the generated plan will not be suitable for most other invocations with different parameters. For example it may choose a plan with index seeks and bookmark lookups which is fine for a highly selective case but poor if it needs to be done hundreds of thousands of times.
This would explain why the number of reads goes through the roof.
Your SSMS connection will likely have different SET ... options, so it will not be handed the same problematic plan from the cache when you execute the stored procedure inside SSMS.
You can use the following to get the plan for the slow session:
select p.query_plan, *
from sys.dm_exec_requests r
cross apply sys.dm_exec_query_plan(r.plan_handle) p
where r.session_id = <session_id>
Then compare with the plan for the good session.
If you do determine that parameter sniffing is at fault, you can use OPTIMIZE FOR hints to avoid it choosing the bad plan.
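For illustration, a cut-down version of the query above with such a hint (the @JobID variable is an assumption standing in for however the JobID value reaches the query; OPTIMIZE FOR UNKNOWN is the alternative when no single representative value exists):
DECLARE @JobID INT = 531;

SELECT DISTINCT JobKeysFrom.[Key] AS KeyFrom, JobKeysTo.[Key] AS KeyTo
FROM JobKeys AS JobKeysFrom
INNER JOIN JobKeys AS JobKeysTo
    ON JobKeysFrom.[Key] <> JobKeysTo.[Key]
WHERE JobKeysFrom.JobID = @JobID
  AND JobKeysTo.JobID = @JobID
OPTION (OPTIMIZE FOR (@JobID = 531));
-- or: OPTION (OPTIMIZE FOR UNKNOWN);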
Check that you don't have a maintenance task running that is rebuilding indexes, or that your database statistics are somehow invalid when the query is executed.
This is exactly the sort of thing one would expect to see if the query is not using your indexes. That is usually because either the indexes are not accessible to the query at the point it runs, or because the statistics are invalid and make the optimiser believe that your large table(s) only have a few rows in them, so that the query would run faster with a full table scan than with indexed access.
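If stale statistics are the suspect, a quick hedged check and refresh might look like this (the dbo schema and the FULLSCAN option are assumptions; the table names come from the query above):
-- when were the statistics on the joined tables last updated?
SELECT OBJECT_NAME(s.object_id) AS TableName,
       s.name AS StatName,
       STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id IN (OBJECT_ID('dbo.JobKeys'), OBJECT_ID('dbo.Keys'), OBJECT_ID('dbo.Matrix'));

-- refresh them if they look stale
UPDATE STATISTICS dbo.JobKeys WITH FULLSCAN;
UPDATE STATISTICS dbo.Keys WITH FULLSCAN;
UPDATE STATISTICS dbo.Matrix WITH FULLSCAN;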
