Can SQL Server 2016 Rearrange Order of Multiple Queries in One Transaction? - sql-server

I experienced a very strange occurrence relating to a multi-query transaction. After SQL Server was updated from 2008 to 2016 (with no warning from our host), we started dropping data after it was posted to the API. The weird thing is, some of the data arrived, and some didn’t.
To protect integrity, the queries are all combined in one transaction. The records can be created and then updated at a later time. They are formatted like this:
DELETE FROM table_1 WHERE parentID = 123 AND col2 = 321;
DELETE FROM table_2 WHERE parentID = 123 AND col2 = 321;
-- etc
INSERT INTO table_1 (parentID, col2, etc) VALUES (123, 321, 123456);
INSERT INTO table_2 (parentID, col2, etc) VALUES (123, 321, 654321);
-- etc
There could be hundreds of lines being executed. Due to design, the records in question do not have unique IDs, so the most performant way to execute the queries was to first delete the matching records, then re-insert them. Looping through the records and checking for existence is the only other option (as far as I know), and that is expensive with that many records.
Anyway, I was struggling to find a reason for this data loss, which seemed random. I had logs of the SQL queries, so I know they were formatted correctly and had all the data intact. Finally, the only thing left I could think of was to separate the DELETE queries into their own transaction and execute it first*. That seems to have fixed the problem.
Q. Does anyone know if these queries could be executed in a different order from the one in which they were presented? Do you see a better way I could be writing these transactions?
* I don't necessarily like this solution, because the delete queries were the main reason I wanted a transaction in the first place. If an error occurs during the second transaction, then all the older matching records have been deleted, but the newer versions are never saved. Living on the edge...
P.S. One other problem I had, and this is probably due to my ignorance of the platform: when I tried to bracket these queries with BEGIN TRAN; and COMMIT TRAN;, any following queries in the same thread got hung up for about 20-30 seconds immediately after this script finished. What am I doing wrong? Do I actually need these statements if all the queries are being executed at once?
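(For reference, a minimal sketch of the explicit-transaction pattern described above, assuming T-SQL; the SET XACT_ABORT and TRY/CATCH parts are additions, not part of the original script. A BEGIN TRAN that is never matched by a COMMIT or ROLLBACK, e.g. because an error skipped the COMMIT, keeps holding its locks and will block subsequent queries, which could explain a hang like the one described.)

SET XACT_ABORT ON;  -- roll the whole transaction back on any run-time error
BEGIN TRY
    BEGIN TRAN;
    DELETE FROM table_1 WHERE parentID = 123 AND col2 = 321;
    DELETE FROM table_2 WHERE parentID = 123 AND col2 = 321;
    INSERT INTO table_1 (parentID, col2, etc) VALUES (123, 321, 123456);
    INSERT INTO table_2 (parentID, col2, etc) VALUES (123, 321, 654321);
    COMMIT TRAN;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN;  -- release locks instead of leaving the transaction open
    THROW;  -- re-raise the original error to the caller
END CATCH;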

We could use a bit more information, such as whether there is a unique constraint on your table that ignores duplicate inserts (e.g. IGNORE_DUP_KEY).
If the data is missing, it could be that an insert failed; a failed insert registers an entry under the Profiler event "User Error Message" in the "Errors and Warnings" event class. Create a trace filtered to this login only and check each statement for any user errors raised in the trace.
If you have other processes running (other applications or threads), it is possible that after you inserted the records, another process deleted those rows without your knowledge. In that case, you might want to set up a trigger to log all update and delete actions on the table and see which user is performing them. In short, if you think you have lost data, then either the command was not executed, it executed with an error, or the rows were deleted by another process after execution.
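As a sketch of that audit-trigger idea (the audit table and all names here are hypothetical; adapt them to the real schema):

-- Hypothetical audit table capturing who changed or deleted rows in table_1.
CREATE TABLE dbo.table_1_audit
(
    auditID    int IDENTITY PRIMARY KEY,
    parentID   int,
    col2       int,
    actionType char(1),                        -- 'U' = update, 'D' = delete
    actionUser sysname   DEFAULT SUSER_SNAME(),
    actionTime datetime2 DEFAULT SYSDATETIME()
);
GO
CREATE TRIGGER trg_table_1_audit
ON dbo.table_1
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- "deleted" holds the old row images; "inserted" is non-empty only for updates.
    INSERT INTO dbo.table_1_audit (parentID, col2, actionType)
    SELECT d.parentID, d.col2,
           CASE WHEN EXISTS (SELECT 1 FROM inserted) THEN 'U' ELSE 'D' END
    FROM deleted AS d;
END;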

Related

Slow insert: select from view

The following statement takes at least 4 seconds:
INSERT INTO [SomeSmallTable]
SELECT * FROM ComplexView
WHERE [Date] = convert(datetime, '23/09/2020',103)
However, if we only run the SELECT part without the INSERT INTO, it takes less than half a second:
SELECT *
FROM ComplexView
WHERE [Date] = convert(datetime, '23/09/2020',103)
The view selects less than 200 rows, and the table called "SomeSmallTable" holds only a few rows. I think this issue started when we updated the view called "ComplexView". ComplexView is based on other views (some of which are themselves based on further views), as well as on some tables.
I tried to refresh all views using sp_refreshview, but to no avail.
How can we determine the cause of this issue and hopefully solve it?
[EDIT]
My reply to some comments:
@Dale K: I can't post the execution plans; I think they're way too complex, and not relevant, as they are identical for both statements, with or without the INSERT part, except for the Table Insert operator. But I did see that the INSERT costs 100%. For some reason SQL has trouble inserting the view results into the table.
@Panagiotis Kanavos: Nobody but me is using the database. It's a copy of our client's database and I'm working on it on my local machine.
@gotqn: SomeSmallTable is a regular table, so no table variable or temporary table. However, it is created when a user opens a specific form in our application, and deleted when the user closes that form.
@Arvo: SomeSmallTable has no keys and no triggers. The view returns less than 200 rows, which are inserted into this table, and before they are inserted the table is empty.
I followed the steps in the accepted answer, and eventually compared the current "ComplexView" with the previous version, and found out what caused this issue.
Checking the execution plan is the first step, as others have said. Given that the INSERT (rather than the query) is causing the delay, you could troubleshoot that further. Here are some things you can try:
Try using STATISTICS IO to find out more, as answered here (see the sketch after this list).
Attempt an INSERT using static data (e.g. INSERT INTO [SomeSmallTable] VALUES (1, 2, '...etc');). This will tell you whether the issue lies with INSERT statements in general, or specifically with inserting from this view.
Check how much data the view is returning. 4s may or may not be reasonable, depending on how many rows are being inserted.
Check the table design to see how it is using primary keys, foreign keys, composite keys, indexes, triggers, etc. Some of these features optimise a table's design for selecting, but make insertion slower as a trade-off. A good answer about this can be found here.
If you know it's not a load issue (because you're the only one using this database), check whether something else might be restricting resources on the machine you're using (other resource-intensive tasks, any other queries happening at the same time, scheduled jobs within SQL Server, etc.) You can use SQL Server Profiler to watch the queries in real time.
If slow performance is not limited to this particular query, then there are other general design considerations you can look into.
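For the first two suggestions, a minimal sketch (the static column names and values at the end are placeholders; replace them to match SomeSmallTable's real columns):

-- Compare I/O and timing for the SELECT alone vs. the full INSERT...SELECT.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT * FROM ComplexView
WHERE [Date] = CONVERT(datetime, '23/09/2020', 103);

INSERT INTO [SomeSmallTable]
SELECT * FROM ComplexView
WHERE [Date] = CONVERT(datetime, '23/09/2020', 103);

-- Isolate the INSERT itself by inserting static data.
INSERT INTO [SomeSmallTable] (Col1, Col2)  -- hypothetical column names
VALUES (1, 'test');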

Why does postgres lock one table when inserting into another

My source table, called Event, sits in a different database and has millions of rows. Each event can have an action of DELETE, UPDATE or NEW.
We have a Java process that goes through these events in the order they were created, applies all sorts of rules, and then inserts the results into multiple tables for lookup, analysis, etc.
I am using JdbcTemplate's batchUpdate to delete and upsert to the Postgres DB in sequential order right now, but I'd like to be able to parallelise this too. Each batch is 1,000 entities to be inserted/upserted or deleted.
However, even running sequentially, Postgres locks queries somehow, and I don't know much about how or why.
Here is some of the code:
entityService.deleteBatch(deletedEntities);
indexingService.deleteBatch(deletedEntities);
...
entityService.updateBatch(allActiveEntities);
indexingService.updateBatch(....);
Each of these services does its inserts/deletes into a different table. They are all in one transaction, though.
The following query
SELECT
    activity.pid,
    activity.usename,
    activity.query,
    blocking.pid AS blocking_id,
    blocking.query AS blocking_query
FROM pg_stat_activity AS activity
JOIN pg_stat_activity AS blocking
    ON blocking.pid = ANY (pg_blocking_pids(activity.pid));
returns
Query being blocked: "insert INTO ENTITY (reference, seq, data) VALUES($1, $2, $3) ON CONFLICT ON CONSTRAINT ENTITY_c DO UPDATE SET data = $4",
Blocking query: delete from ENTITY_INDEX where reference = $1
There are no foreign key constraints between these tables. And we do have indexes to support the queries we run as part of the processing.
Why would activity on a completely different table block this one? And how can we go about resolving this?
Your query is misleading.
What it shows as “blocking query” is really the last statement that ran in the blocking transaction.
It was probably a previous statement in the same transaction that caused entity (or rather a row in it) to be locked.
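To see which lock requests are actually waiting, rather than just the last statement of the blocking session, you can query pg_locks directly; a minimal sketch:

-- Ungranted lock requests, with the locked relation and the waiting session's query.
SELECT l.pid,
       l.relation::regclass AS locked_table,
       l.mode,
       a.query
FROM pg_locks AS l
JOIN pg_stat_activity AS a ON a.pid = l.pid
WHERE NOT l.granted;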

How to reduce blocking during concurrent DELETE & INSERT to a single table in SQL Server

We have a stored procedure which loads order details for an order. We always want the latest information about an order, so the order details are regenerated every time the stored procedure is called. We are using SQL Server 2016.
Pseudo code:
DELETE by clustered index based on order identifier
INSERT into the table, based on a huge query containing information about order
When multiple end-users execute the stored procedure concurrently, blocking occurs on the orderdetails table. Once the first caller is done, the second caller proceeds, followed by the third. So the time to generate the orderdetails grows as callers queue up. This happens especially with big orders containing more than 100k (or 1-2 million) detail rows, as a table-level lock is taken.
The approach we took
We partitioned the table based on the last digit of the order identifier for concurrent orderdetails loading. This improves performance the first time orderdetails are loaded, as there are no deletes. But from the second time onwards, the INSERT in the first session blocks the other sessions' DELETEs. The other sessions are blocked until the first session is done with its INSERT.
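(For context, a hypothetical sketch of what such last-digit partitioning might look like; all names and the column list are illustrative, not the real schema:)

-- Ten partitions, one per final digit of the order identifier.
CREATE PARTITION FUNCTION pf_LastDigit (tinyint)
    AS RANGE LEFT FOR VALUES (0, 1, 2, 3, 4, 5, 6, 7, 8);

CREATE PARTITION SCHEME ps_LastDigit
    AS PARTITION pf_LastDigit ALL TO ([PRIMARY]);

CREATE TABLE dbo.orderdetails
(
    orderID   int NOT NULL,
    lastDigit AS CONVERT(tinyint, orderID % 10) PERSISTED NOT NULL,
    detail    nvarchar(200) NULL
) ON ps_LastDigit (lastDigit);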
We are considering creating a separate orderdetails table for every order to avoid these concurrency issues.
Question
Can you please suggest an approach that will support the concurrent DELETE & INSERT scenario?
We solved the contention issue by using a temporary table for orderdetails. We found that the huge queries were taking a long time to SELECT, and that this time was contributing to long table-level locks on the orderdetails table.
So we first loaded the data into a temporary table, #orderdetail, and then did the DELETE and INSERT on the orderdetail table.
As the orderdetail table is already partitioned, the DELETEs were faster and the INSERTs happened in parallel. The INSERT itself was also very fast, as it is a simple table scan of the #orderdetail table.
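In outline, the staging pattern might look like this (a sketch; the column names and source query are placeholders, and @orderID stands in for the stored procedure's parameter):

DECLARE @orderID int = 123;   -- in the real procedure this is a parameter

-- 1. Materialise the huge query into a temp table first, so the slow
--    SELECT holds no locks on the real orderdetail table.
SELECT orderID, col1, col2    -- placeholder columns
INTO #orderdetail
FROM dbo.order_source         -- placeholder for the real multi-join query
WHERE orderID = @orderID;

-- 2. The DELETE and INSERT against the partitioned table are now short and fast.
BEGIN TRAN;
DELETE FROM dbo.orderdetail WHERE orderID = @orderID;
INSERT INTO dbo.orderdetail (orderID, col1, col2)
SELECT orderID, col1, col2 FROM #orderdetail;
COMMIT TRAN;

DROP TABLE #orderdetail;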
You could take a look at the Hekaton engine (In-Memory OLTP). It is available even in SQL Server Standard Edition if you are using SP1.
If this is too complicated to implement due to hardware or software limitations, you can try playing with the isolation levels of the database. Sometimes, queries that read huge amounts of data are blocked by, or even become deadlock victims of, queries that modify parts of that data. Ask yourself whether you need to guarantee that the data read by the user is valid, or whether you can afford, for example, some dirty reads.
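If some staleness or dirty reads are acceptable, the switches look like this (a sketch; the database name is a stand-in, and READ_COMMITTED_SNAPSHOT changes behaviour database-wide, so test before enabling it):

-- Option A: readers see the last committed row version instead of blocking.
ALTER DATABASE MyOrdersDb   -- hypothetical database name
    SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Option B: allow dirty reads for a single query that can tolerate them.
SELECT COUNT(*) FROM dbo.orderdetail WITH (NOLOCK);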

Generate insert statements with foreign key constraints

I have an issue where an entire table's worth of data was deleted. It is a child table; it contains its own primary key, a foreign key to its parent, and some other data.
I tried using Merge, generated from a stored procedure I found here:
https://github.com/readyroll/generate-sql-merge
This generates a giant MERGE statement for your whole table. That worked OK for a while, but I've since found that records from a parent table have also been deleted, and MERGE doesn't handle that too well.
I've tried rewriting it, but I'm getting bogged down in it and it feels like something somebody else will have done before.
What I'd really like is a way to generate thousands of insert statements with an existence check above each one, saying:
IF NOT EXISTS (select PK from ChildTable where ID = <about to be inserted>) AND EXISTS (select FK from ParentTable where ID = <about to be inserted>)
INSERT RECORD
OUTPUT PK TO LOG TABLE
There's about 20,000 records, so it's really something I don't want to do by hand, and because the delete event happened several times over a few months, I need to generate the data from several different databases to recreate the whole picture.
I'd like to keep the inserted IDs in a log table, so I can tell what's been inserted, and so the data could be restored to a pre-script state for any reason.
Any advice on my approach would also be welcome.
Thanks :)
Long story short: I tried a few ways to fix this, and the best was to generate INSERT statements for the table using SQL Server's Generate Scripts feature and bring them into a temp table.
Because I only wanted to import 90% of the data, excluding specific records based on a few conditions, I originally thought I should wrap each INSERT in an IF, but 20,000 IFs broke down when SQL Server tried to build a query plan.
Instead, I inserted all the records, without a filter, into a temp table. I then deleted all the records I didn't want from this table with a couple of DELETE statements.
Lastly, I inserted all the remaining data from the temp table into the actual table it was originally missing from.
This worked perfectly, and SQL Server Management Studio was able to run it without crashing. It was also a lot clearer what I was doing, and I didn't have to add lots of complicated IFs in a string builder.
I also used OUTPUT on the INSERT INTO to log all the record IDs I'd inserted, giving the audit trail I needed.
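The final insert-with-audit step might have looked something like this (a sketch; every table and column name here is a stand-in for the real schema):

-- #restore already holds the scripted rows, trimmed by the DELETE statements.
INSERT INTO dbo.ChildTable (ID, ParentID, SomeData)
OUTPUT inserted.ID INTO dbo.RestoreLog (InsertedID)  -- audit trail of restored IDs
SELECT r.ID, r.ParentID, r.SomeData
FROM #restore AS r
WHERE NOT EXISTS (SELECT 1 FROM dbo.ChildTable AS c WHERE c.ID = r.ID)         -- skip rows already present
  AND EXISTS     (SELECT 1 FROM dbo.ParentTable AS p WHERE p.ID = r.ParentID); -- only rows whose parent survives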
As for the string builder: the query was so long and complicated, and Excel was moaning about a 256-character limit, that I ended up using 20+ columns to build up my query after concatenating 26 columns of data. When I used the auto-fill formula drag feature in Excel, it would crash my machine going across so many records, and I have a pretty grunty machine!
Using Generate Scripts rather than the string builder also had the benefit of not altering the data in any way. It was purely like-for-like, so weird characters, new lines, etc. were no issue.

Sql Server rows being auto inserted without a trigger

I have a table in SQL Server for which, if I delete a row, a new row is inserted with the same data and userid as the one I deleted. There are no triggers on this table. In addition, I did a search of all database objects that reference this table, and there are no triggers anywhere in the database that reference it, only some stored procedures, none of which contain any code that would cause this behavior.
To be clear, if I run this query:
delete from my_table where id = 1
the row with an id of 1 will be deleted, but a new row will be inserted that has the same userid and date as the deleted row. No application code is involved; just a straight SQL DELETE statement run directly against the database causes this.
What else besides a trigger could be causing this to happen? I've never encountered something like this before.
It took me a long time, but I discovered this was being caused by a "rogue" LINQ-to-SQL DLL that was still running in spite of its parent app being killed.
The good news is, there isn't some weird non-trigger way to insert rows on delete in SQL, so we can all resume our normal lives now, knowing all is as it was.
