I have three tables:
(table_a) has an AFTER INSERT trigger that inserts rows from another table (table_b) from a linked server into a local table (table_c).
Whenever a lot number is inserted into (table_a), the trigger inserts from (table_b) into (table_c) rows that contain the same column value as the lot number.
This seems to slow down, and sometimes freeze, operations on my server. I found that inserting from a table on the local server runs fine, so I suspect the problem is caused by inserting from a linked server.
How can I improve insert speed?
Triggers should always be written to be as fast as possible to avoid exactly this problem. Intensive operations of any kind should ideally not be carried out within a trigger. That applies doubly when the operation involves contacting other servers as that can easily end up taking real time.
Instead, queue the action and process your queue outside the trigger using a service or agent.
A queue might look like a record in a table which gets flagged once processed. The record needs to contain enough information for the service to carry out the related actions, which could be contained within a stored procedure.
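As a rough sketch of that idea, assuming a hypothetical queue table dbo.lot_queue, a lot_number column on all three tables, a some_value column on table_b and table_c, and a linked server named [LinkedServer] with a database [RemoteDb] (none of these names come from the question), it could look something like this:

```sql
-- Hypothetical queue table: one row per inserted lot number awaiting processing.
CREATE TABLE dbo.lot_queue (
    queue_id     INT IDENTITY(1,1) PRIMARY KEY,
    lot_number   VARCHAR(50) NOT NULL,
    queued_at    DATETIME2   NOT NULL DEFAULT SYSUTCDATETIME(),
    processed_at DATETIME2   NULL          -- NULL means "not processed yet"
);
GO

-- The trigger only records the work to do; it never contacts the linked server.
CREATE TRIGGER dbo.trg_table_a_queue ON dbo.table_a
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.lot_queue (lot_number)
    SELECT i.lot_number
    FROM inserted AS i;
END;
GO

-- Called on a schedule (for example from a SQL Server Agent job),
-- outside the transaction of the original INSERT into table_a.
CREATE PROCEDURE dbo.process_lot_queue
AS
BEGIN
    SET NOCOUNT ON;

    -- Fix the set of pending rows so rows queued meanwhile wait for the next run.
    DECLARE @batch TABLE (queue_id INT PRIMARY KEY, lot_number VARCHAR(50));

    INSERT INTO @batch (queue_id, lot_number)
    SELECT queue_id, lot_number
    FROM dbo.lot_queue
    WHERE processed_at IS NULL;

    -- The slow remote read now happens here instead of inside the trigger.
    INSERT INTO dbo.table_c (lot_number, some_value)
    SELECT b.lot_number, b.some_value
    FROM [LinkedServer].[RemoteDb].dbo.table_b AS b
    WHERE b.lot_number IN (SELECT lot_number FROM @batch);

    -- Flag the queue rows as processed.
    UPDATE q
    SET processed_at = SYSUTCDATETIME()
    FROM dbo.lot_queue AS q
    JOIN @batch AS x ON x.queue_id = q.queue_id;
END;
GO
```

The GO lines are batch separators for SSMS or sqlcmd. The trigger stays cheap because it only writes one small local row per inserted lot number.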
I have been working on offloading data from a very large table (close to 400 million records) in a SQL Server 2016 environment. I have been doing so by updating a column in the table used as a delete flag. The procedure also offloads the data into a separate table for archiving purposes. The flag is to be used by a procedure at the end of every month to delete the rows in the main table that are flagged for deletion.
My goal is to have an efficient procedure with minimal blocking as the table in question is used quite frequently.
From what I have been reading online, the best way to deal with large updates and inserts is batching, to avoid blocking as much as possible. However, marking the main table and inserting into the archive table need to happen within a transaction to be sure one does not complete without the other; otherwise the end-of-month deletion may end up deleting rows that were never archived.
Is batching the best course of action when done inside a transaction since the transaction uses an exclusive lock anyway?
Is there a better method I could be using?
I'm in a very difficult situation with search, SQL Server and routine operations.
Every two hours a routine program runs, and for at least 5 minutes the search implemented in another program doesn't work: the routine empties the main table where all the data is stored and, after some processing, refills it with new data.
Is there an optimal way to handle these 5 minutes?
We tried swapping or renaming the tables, but the relations are lost and the missing data is deleted by cascade.
The best solution would keep the data always reachable.
(The table has around 3.5 million rows.)
One way to do this would be to ensure that the routine operation cleans the table and repopulates using a transaction. Then in your code for searching, you'll need to set your TRANSACTION ISOLATION LEVEL to SNAPSHOT (https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server). This allows you to search for the data that was in there while the clean/refill operation is in progress.
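A minimal sketch of that setup, assuming a hypothetical database SearchDb, a main table dbo.main_table with id and title columns, and a staging table dbo.staging_table holding the new data (all of these names are made up for illustration):

```sql
-- One-time setup: allow snapshot isolation in the database.
ALTER DATABASE SearchDb SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

-- Routine job: clean and refill inside a single transaction.
-- DELETE is used rather than TRUNCATE, because TRUNCATE takes a schema
-- modification lock that would still block the readers.
BEGIN TRANSACTION;
    DELETE FROM dbo.main_table;

    INSERT INTO dbo.main_table (id, title)
    SELECT id, title
    FROM dbo.staging_table;
COMMIT TRANSACTION;
GO

-- Search session: reads the last committed version of the rows,
-- so it keeps returning the old data until the job commits.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

SELECT id, title
FROM dbo.main_table
WHERE title LIKE N'%search term%';
```

Row versions are kept in tempdb while the refill runs, so with a few million rows it is worth watching tempdb usage during the job.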
The following statement takes at least 4 seconds:
INSERT INTO [SomeSmallTable]
SELECT * FROM ComplexView
WHERE [Date] = convert(datetime, '23/09/2020',103)
However, if we only run the SELECT part without the INSERT INTO, it takes less than half a second:
SELECT *
FROM ComplexView
WHERE [Date] = convert(datetime, '23/09/2020',103)
The view selects fewer than 200 rows, and the table called "SomeSmallTable" holds only a few rows. I think this issue started when we updated the view called "ComplexView". ComplexView is based on other views (some of which are themselves based on other views), as well as some tables.
I tried to refresh all views using sp_refreshview, but to no avail.
How can we determine the cause of this issue and hopefully solve it?
[EDIT]
My reply to some comments:
#Dale K: I can't post the execution plans. I think they are way too complex, and not relevant, as they are identical for both statements, with or without the INSERT part, except for the Table Insert operator. But I did see that the INSERT costs 100%. For some reason SQL Server has trouble inserting the view results into the table.
#Panagiotis Kanavos: Nobody but me is using the database. It's a copy of our client's database and I'm working on it on my local machine.
#gotqn: SomeSmallTable is a regular table, not a table variable or temporary table. However, it is created when a user opens a specific form in our application, and deleted when the user closes this form.
#Arvo: SomeSmallTable has no keys and no triggers. The view returns less than 200 rows which are inserted in this table, and before these are inserted the table is empty.
I followed the steps in the accepted answer, and eventually compared the current "ComplexView" with the previous version, and found out what caused this issue.
Checking the execution plan is the first step, as others have said. Given that the INSERT (rather than the query) is causing the delay, you could troubleshoot that further. Here are some things you can try:
Try using Statistics IO to find out more, as answered here.
Attempt an INSERT using static data (e.g. INSERT INTO [SomeSmallTable] VALUES (1, 2, '...etc');). This will tell you if the issue is any INSERT statement, or when inserting from a view specifically.
Check how much data the view is returning. 4s may or may not be reasonable, depending on how many rows are being inserted.
Check the table design to see how it is using primary keys, foreign keys, composite keys, indexes, triggers, etc. Some of these features optimise a table's design for selecting, but make insertion slower as a trade-off. A good answer about this can be found here.
If you know it's not a load issue (because you're the only one using this database), check whether something else might be restricting resources on the machine you're using (other resource-intensive tasks, any other queries happening at the same time, scheduled jobs within SQL Server, etc.) You can use SQL Server Profiler to watch the queries in real time.
If slow performance is not limited to this particular query, then there are other general design considerations you can look into.
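For the first two checks above, something along these lines might be a starting point. The statement is the one from the question; the temp table #view_rows is a hypothetical addition that separates the cost of reading the view from the cost of writing to the table:

```sql
-- Report reads per table and CPU/elapsed time in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

INSERT INTO [SomeSmallTable]
SELECT * FROM ComplexView
WHERE [Date] = CONVERT(datetime, '23/09/2020', 103);

-- Materialise the view result first, then insert from the temp table.
-- If the second statement is fast, the slowness lies in how the view is
-- evaluated as part of the INSERT plan, not in the table itself.
SELECT *
INTO #view_rows
FROM ComplexView
WHERE [Date] = CONVERT(datetime, '23/09/2020', 103);

INSERT INTO [SomeSmallTable]
SELECT * FROM #view_rows;

DROP TABLE #view_rows;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads reported for the plain SELECT and for the INSERT ... SELECT should show whether the two statements really do the same amount of work.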
We have a stored procedure which loads order details about an order. We always want the latest information about an order, so order details for the order are regenerated every time, when the stored procedure is called. We are using SQL Server 2016.
Pseudo code:
DELETE by clustered index based on order identifier
INSERT into the table, based on a huge query containing information about order
When multiple end users execute the stored procedure concurrently, blocking occurs on the orderdetails table. The second caller waits until the first is done, the third waits for the second, and so on, so the time to generate the order details increases as time goes by. This happens especially for big orders with more than 100k detail rows, or even 1 or 2 million, because a table-level lock is taken.
The approach we took
We partitioned the table on the last digit of the order identifier to allow concurrent orderdetails loading. This improves performance the first time order details are loaded, as there are no deletes. But from the second time onwards, the INSERT in the first session blocks the DELETE in other sessions. The other sessions are blocked until the first session has finished its INSERT.
We are considering creating a separate orderdetails table for every order to avoid these concurrency issues.
Question
Can you please suggest an approach that supports this concurrent DELETE and INSERT scenario?
We solved the contention issue by using a temporary table for the order details. We found that the huge queries had a long SELECT time, and this was contributing to longer table-level locks on the orderdetails table.
So, we first loaded the data into a temporary table, #orderdetail, and only then ran the DELETE and INSERT against the orderdetail table.
As the orderdetail table is already partitioned, the DELETEs were faster and the INSERTs were happening in parallel. The INSERT was also very fast here, as it is a simple table scan of the #orderdetail table.
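Roughly, the pattern might be sketched as below. The procedure name, the column names, and dbo.order_source (standing in for the "huge query") are assumptions, and the explicit transaction around the DELETE and INSERT is a choice made for this sketch rather than something stated above:

```sql
CREATE PROCEDURE dbo.reload_orderdetail
    @order_id INT
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;   -- roll back the whole transaction on any error

    -- 1. Run the expensive query into a session-private temp table.
    --    No locks are taken on dbo.orderdetail during this step.
    SELECT s.order_id, s.line_no, s.amount
    INTO #orderdetail
    FROM dbo.order_source AS s          -- stands in for the huge query
    WHERE s.order_id = @order_id;

    -- 2. Keep the locking part short: delete and reinsert from the temp table.
    BEGIN TRANSACTION;
        DELETE FROM dbo.orderdetail
        WHERE order_id = @order_id;

        INSERT INTO dbo.orderdetail (order_id, line_no, amount)
        SELECT order_id, line_no, amount
        FROM #orderdetail;
    COMMIT TRANSACTION;
END;
```

The locks on orderdetail are now held only for the duration of a plain DELETE and a plain INSERT, not for the duration of the complex SELECT.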
You can take a look at the Hekaton engine (In-Memory OLTP). It is available even in SQL Server Standard Edition if you are using SP1.
If this is too complicated to implement due to hardware or software limitations, you can try experimenting with the isolation levels of the database. Sometimes, queries that read huge amounts of data are blocked by, or even become deadlock victims of, queries that modify parts of that data. Ask yourself whether you need to guarantee that the data read by the user is valid, or whether you can afford, for example, some dirty reads.
I have reports that perform some time consuming data calculations for each user in my database, and the result is 10 to 20 calculated new records for each user. To improve report responsiveness, a nightly job was created to run the calculations and dump the results to a snapshot table in the database. It only runs for active users.
So with 50k users, 30k of which are active, the job "updates" 300k to 600k records in the large snapshot table. It currently deletes all previous records for a given user, then inserts the new set. There is no PK on the table; only a business key is used to group the sets of data.
So my question is, when removing and adding up to 600k records every night, are there techniques to optimize the table to handle this? For instance, since the data can be recreated on demand, is there a way to disable logging for the table as these changes are made?
UPDATE:
One issue is that I cannot do this in a batch because of the way the script works: it examines one user at a time, so it looks at a user, deletes the previous 10-20 records, and inserts a new set of 10-20 records. It does this over and over. I am worried that the transaction log will run out of space or that other performance issues could occur. I would like to configure the table to not worry about data preservation or other items that could slow it down. I cannot drop the indexes and all that, because people are accessing the table while it is being updated.
It's also worth noting that indexing could potentially speed up this bulk update rather than slow it down, because UPDATE and DELETE statements still need to be able to locate the affected rows in the first place, and without appropriate indexes it will resort to table scans.
I would, at the very least, consider a non-clustered index on the column(s) that identify the user, and (assuming you are using 2008) consider the MERGE statement, which can definitely avoid the shortcomings of the mass DELETE/INSERT method currently employed.
According to The Data Loading Performance Guide (MSDN), MERGE is minimally logged for inserts with the use of a trace flag.
I won't say too much more until I know which version of SQL Server you are using.
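To illustrate the index and MERGE suggestions, here is a sketch for a single user. The table dbo.user_snapshot, its columns (user_id, metric_key, metric_value, all assumed NOT NULL), and the staging table #new_snapshot holding the freshly calculated rows are hypothetical:

```sql
-- Lets DELETE/UPDATE/MERGE find one user's rows without scanning the table.
CREATE NONCLUSTERED INDEX ix_user_snapshot_user
    ON dbo.user_snapshot (user_id, metric_key);
GO

DECLARE @user_id INT = 42;   -- the user currently being processed

MERGE dbo.user_snapshot AS tgt
USING #new_snapshot AS src
    ON  tgt.user_id    = src.user_id
    AND tgt.metric_key = src.metric_key
-- Only touch rows whose value actually changed.
WHEN MATCHED AND tgt.metric_value <> src.metric_value THEN
    UPDATE SET tgt.metric_value = src.metric_value
-- New rows for this user.
WHEN NOT MATCHED BY TARGET THEN
    INSERT (user_id, metric_key, metric_value)
    VALUES (src.user_id, src.metric_key, src.metric_value)
-- Old rows of this user that are no longer produced; the user_id filter
-- keeps the DELETE from removing every other user's rows.
WHEN NOT MATCHED BY SOURCE AND tgt.user_id = @user_id THEN
    DELETE;
```

When most values are unchanged from night to night, this writes far fewer rows than deleting and reinserting the whole set.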
This is called a bulk insert. You have to drop all indexes on the destination table and send the insert commands in large packs (hundreds of INSERT statements) separated by ;
Another way is to use the BULK INSERT statement (http://msdn.microsoft.com/en-us/library/ms188365.aspx), but it involves dumping the data to a file first.
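For reference, a BULK INSERT call might look like the following. The target table, the file path, and the CSV layout (comma-separated, one header row) are assumptions for illustration only:

```sql
BULK INSERT dbo.user_snapshot
FROM 'C:\data\snapshot.csv'      -- hypothetical path; must be readable by the SQL Server service
WITH (
    FIELDTERMINATOR = ',',       -- column separator in the file
    ROWTERMINATOR   = '\n',      -- one record per line
    FIRSTROW        = 2,         -- skip the header row
    BATCHSIZE       = 10000      -- commit every 10,000 rows
);
```

Adding the TABLOCK option can help with logging overhead, but it locks the whole table during the load, which may not be acceptable if other sessions read the table at the same time.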
See also: Bulk Insert Sql Server millions of record
It really depends on many things:
speed of your machine
size of the records being processed
network speed
etc.
Generally it is quicker to add records to a "heap" or an un-indexed table. So dropping all of your indexes and re-creating them after the load may improve your performance.
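A sketch of the drop-and-recreate approach, assuming a hypothetical nonclustered index ix_user_snapshot_user on dbo.user_snapshot and a staging table dbo.snapshot_staging that holds the rows to load:

```sql
-- Drop the index so the load does not have to maintain it row by row.
DROP INDEX ix_user_snapshot_user ON dbo.user_snapshot;

-- Load the data into the now un-indexed table.
INSERT INTO dbo.user_snapshot (user_id, metric_key, metric_value)
SELECT user_id, metric_key, metric_value
FROM dbo.snapshot_staging;

-- Recreate the index once, after all rows are in place.
CREATE NONCLUSTERED INDEX ix_user_snapshot_user
    ON dbo.user_snapshot (user_id);
```

Queries that rely on the index will fall back to scans while it is missing, so this only suits a load window in which nobody else depends on the table.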
Partitioning the table may bring performance benefits if you partition by active and inactive users (although the data set may be a little small for this).
Make sure you measure how much time each tweak adds to or removes from your load, and work from there.