I have a table that has millions of rows.
I accidentally wrote an update query against that table without a WHERE clause and clicked Execute.
It started executing. After two seconds I realized the query was wrong and clicked the 'Stop' button in SQL Server Management Studio. The query execution stopped; all of this happened within 7 seconds.
Now I am curious to know whether any rows were affected and, if so, which ones.
How can I find out?
A single update statement will not update only some of the rows. It's all rows or none.
This is the atomicity in the ACID properties, which SQL Server respects well.
Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.
The commit happens at the end of the statement, so when you cancel, there is no commit and the partial changes are rolled back.
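One way to guard against this kind of accident is to run the statement inside an explicit transaction, check the affected row count, and only then decide whether to commit. The following is a minimal sketch; the table and column names (`dbo.Orders`, `Status`) are hypothetical placeholders and do not come from the question.

```sql
-- Hypothetical table and column names, for illustration only.
BEGIN TRANSACTION;

UPDATE dbo.Orders
SET    Status = 'Closed';   -- WHERE clause forgotten

-- @@ROWCOUNT reports how many rows the previous statement touched.
SELECT @@ROWCOUNT AS RowsAffected;

-- The count is far larger than expected, so undo everything.
ROLLBACK TRANSACTION;
```

If the count had matched expectations, `COMMIT TRANSACTION` would make the change permanent instead.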
For example, if I have a task that's inserting rows into a table while another task is truncating the same table, what happens?
I'm asking because I have a task that runs every minute and inserts rows into a table, and a lambda, also running every minute, that reads and truncates the same table. I know Snowflake tasks and EventBridge don't fire exactly on the minute, so I haven't run into this issue yet, but I think it will happen eventually.
How does Snowflake handle this?
The concept is the same as in other SQL engines: locks are placed on the resources involved.
In the Snowflake world, INSERT will have PARTITION level locking, because most of the INSERT statements write only new partitions.
Please see the below doc:
https://docs.snowflake.com/en/sql-reference/transactions.html#resource-locking
If the INSERT query is submitted before the TRUNCATE, then the TRUNCATE has to wait until the INSERT query finishes. The two cannot run at the same time on the same resource.
See the screenshot below, the first query was the INSERT, which was HOLDING the PARTITION level lock, while the second query was the TRUNCATE, which was in the WAITING state:
The table will be locked by the first transaction that runs and subsequent transactions will be queued until the preceding transaction(s) complete.
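To see which transaction currently holds a lock and which one is queued, Snowflake's `SHOW LOCKS` command can be run from another session while both statements are in flight. A minimal sketch; the account-wide form assumes a role with sufficient privileges (typically ACCOUNTADMIN).

```sql
-- Locks held or awaited by the current user's transactions.
SHOW LOCKS;

-- Locks across all users in the account (needs a privileged role).
SHOW LOCKS IN ACCOUNT;
```

The `status` column of the output should show HOLDING for the statement that got there first and WAITING for the one queued behind it.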
BTW (and this may be the point of your question) having two processes like this operate independently doesn’t seem like a good design - as the lambda process seems to be logically dependent on the task.
I've been running a delete query on one of our databases. The query has been running for about 7 hours and now I need to cancel it. If I cancel it, will it cause a rollback? And if so, is there a way to cancel the query without causing a rollback?
Thanks
Yes, it will cause a rollback. You can expect the rollback to take even more time than the original delete (because a: rollbacks are always single-threaded and b: rollbacks are logged as well). The same applies if you restart the instance. SQL Server will always do its best to return the database to a transactionally consistent state.
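Once a rollback is under way, its progress can usually be watched rather than guessed. A rough sketch follows; the session id 54 is a made-up example, and `KILL ... WITH STATUSONLY` only reports on sessions that were terminated with `KILL`, so it may return an error for a query cancelled from the client.

```sql
-- 54 is a hypothetical session id; look up the real one first.
KILL 54 WITH STATUSONLY;

-- Requests currently rolling back, with SQL Server's own progress estimate.
SELECT session_id,
       command,
       percent_complete,
       estimated_completion_time   -- estimated milliseconds remaining
FROM   sys.dm_exec_requests
WHERE  command LIKE '%ROLLBACK%';
```

Both figures are estimates and can stay at 0 for long stretches, so treat them as a hint rather than a countdown.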
To add to my own answer :)
The best way to delete a large amount of data from a table is to do it in smaller chunks, such as 5000 rows at a time, each in its own transaction. I know it's too late now, but keep it in mind for next time.
The exception is deleting all the rows from the table, in which case TRUNCATE TABLE is the fastest method.
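A chunked delete along those lines might look like the sketch below. The table `dbo.AuditLog`, the column `CreatedAt`, and the cutoff date are hypothetical; the batch size of 5000 is the starting point suggested above and should be tuned after testing.

```sql
-- Hypothetical table and filter; adjust the batch size after testing.
DECLARE @BatchSize int = 5000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- In autocommit mode each DELETE is its own transaction,
    -- so cancelling only rolls back the batch in flight.
    DELETE TOP (@BatchSize)
    FROM   dbo.AuditLog
    WHERE  CreatedAt < '20150101';

    -- Capture the count immediately; the loop ends when nothing is left.
    SET @Rows = @@ROWCOUNT;
END;
```

Because every batch commits on its own, stopping the loop after hours of work loses at most one batch instead of triggering one very long rollback.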
If I cancel it will it cause a rollback?
Yes, and it likely takes longer than it was running so far.
And if so, is there a way to cancel the query without causing a rollback?
No. Without the rollback, the data would be left in an inconsistent state.
I have a SQL Server database where I am deleting rows from three tables A,B,C in batches with some conditions through a SQL script scheduled in a SQL job. The job runs for 2 hours as the tables have a large amount of data. While the job is running, my front end application is not accessible (giving timeout error) since the application inserts and updates data in these same tables A,B,C.
Is it possible for the front end application to run in parallel without any issues while the SQL script is running? I have checked the locks on the tables, and SQL Server is acquiring page locks. Would Read Committed Snapshot or Snapshot isolation, or converting the page locks to row locks, help here? I need advice.
Split the operation in two phases. In the first phase, collect the primary keys of rows to delete:
create table #TempList (ID int);
insert #TempList (ID)
select ID
from YourTable
-- where <conditions that identify the rows to delete>
In the second phase, use a loop to delete those rows in small batches:
while 1=1
begin
    delete top (1000)
    from YourTable
    where ID in (select ID from #TempList)
    -- stop once a pass deletes nothing
    if @@rowcount = 0
        break
end
The smaller batches will allow your front end applications to continue in between them.
I suspect that SQL Server at some point escalates to table lock, and this means that the table is inaccessible, both for reading and updating.
To optimize locking and concurrency when dealing with large deletes, use batches. Start with 5000 rows at a time (to prevent lock escalation), monitor how it behaves, and tune the size up or down as needed. 5000 is a "magic number": it is low enough that the lock manager doesn't consider escalating to a table lock, and large enough to perform well.
Whether timeouts happen depends on other factors as well, but this will surely reduce them, if not eliminate them altogether. If the timeouts happen on read operations, you should be able to get rid of them. Another approach, of course, is to increase the command timeout value on the client.
Snapshot (optimistic) isolation is an option as well, READ COMMITTED SNAPSHOT more precisely, but it won't help with updates from other sessions. Also, beware of version store (in tempdb) growth. Best if you combine it with the proposed batch approach to keep the transactions small.
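For reference, READ COMMITTED SNAPSHOT is switched on per database rather than per session. A minimal sketch, assuming a placeholder database name `YourDatabase`:

```sql
-- YourDatabase is a placeholder name.
-- WITH ROLLBACK IMMEDIATE rolls back open transactions and disconnects
-- other sessions, so run this in a maintenance window.
ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

-- Confirm the setting.
SELECT name, is_read_committed_snapshot_on
FROM   sys.databases
WHERE  name = 'YourDatabase';
```

After the change, readers under the default READ COMMITTED level see the last committed row version instead of waiting on the delete's locks; writers still block each other.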
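Also, if the database normally uses the full recovery model, you could switch to bulk-logged recovery for the duration of the delete, although DELETE is fully logged under every recovery model, so the benefit may be small. If you do switch, switch back as soon as the delete finishes and take a backup.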
Almost forgot -- if it's the Enterprise edition of SQL Server, partition your table; then you can just switch the partition out. It's almost instantaneous and the clients will never notice it.
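Switching a partition out could look roughly like this. All object names and the partition number are hypothetical, and the sketch assumes the table is already partitioned so that the rows to remove sit in one partition.

```sql
-- Hypothetical names. dbo.BigTable_Staging must be empty, have the same
-- columns and indexes as dbo.BigTable, and live on the same filegroup
-- as the partition being switched out.
ALTER TABLE dbo.BigTable
SWITCH PARTITION 1 TO dbo.BigTable_Staging;

-- The rows now belong to the staging table and can be removed cheaply.
TRUNCATE TABLE dbo.BigTable_Staging;
```

The switch is a metadata operation, which is why it finishes almost immediately regardless of how many rows the partition holds.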
I happened to execute a query similar to this one:
update table1
set data=(select data from table1 where key1=val1 and key2=val2)
which was supposed to update only one row, but since I missed the second WHERE clause, I guess it started to update every row in the table, which contains a few million rows.
The correct query would have taken about 0 seconds and would be:
update table1
set data=(select data from table1 where key1=val1 and key2=val2)
where key1=val1 and key2=val3
After a few seconds, I realized it took too long and stopped it.
The database is set to the full recovery model and runs on SQL Server 2008 R2.
The question is, what was the effect of this query? My hope is that there would be no effect since the query was stopped before completion and SQL Server rolled back the changes automatically. Is that correct?
If not, how do I roll back the database to its state at a particular point in time (right before I did the unfortunate update)?
(I saw this question: If I stop a long running query, does it rollback? but it is different in that it performs several changes as opposed to just one.)
(And yes, I do have very recent backups, but given the size of the DB I would prefer not to have to restore from backup)
If your command to cancel came in time, it was rolled back in its entirety. DML statements are always all or nothing. You should probably check the data to make sure that your cancel did arrive in time. It might have arrived in the last millisecond or so after the transaction was already committed.
In an ASP.NET application, I have the following process:
Start a connection
Start a transaction
Insert into a table "LoadData" a lot of values with the SqlBulkCopy class with a column that contains a specific LoadId.
Call a stored procedure that :
Reads the table "LoadData" for the specific LoadId.
For each line, does a lot of calculations, which involves reading dozens of tables, and writes the results into a temporary (#temp) table (a process that lasts several minutes).
Deletes the lines in "LoadData" for the specific LoadId.
Once everything is done, write the result in the result table.
Commit transaction or rollback if something fails.
My problem is that if two users start the process, the second one has to wait until the first has finished (because the insert seems to put an exclusive lock on the table), and my application sometimes times out (and the users are not happy to wait :) ).
I'm looking for a way to let the users run everything in parallel, as there is no interaction between them except in the last step: writing the result. I think that what is blocking me is the inserts / deletes in the "LoadData" table.
I checked the other transaction isolation levels but it seems that nothing could help me.
The ideal solution would be to release the exclusive lock on the "LoadData" table once the insert is finished, but without ending the transaction (is it possible to force SQL Server to lock only rows and not the table?).
Any suggestion?
Look up the READ_COMMITTED_SNAPSHOT database option and SET TRANSACTION ISOLATION LEVEL SNAPSHOT in Books Online.
Transactions should cover small and fast-executing pieces of SQL / code. They have a tendency to be implemented differently on different platforms. They take locks and then expand them as the modifications grow, locking other users out of querying or updating the same row / page / table.
Why not forget the transaction and handle processing errors in another way? Is your data integrity truly being secured by the transaction, or can you do without it?
If you're sure that there is no issue with concurrent operations except in the last part, why not start the transaction just before those last statements (whichever ones DO require isolation) and commit immediately after they succeed? Then all the upfront read operations will not block each other.
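As a rough illustration of that idea, the transaction can be opened only around the final writes. Every object name here (`dbo.ResultTable`, `#Results`, the `Result` column) is hypothetical, and the sketch assumes the long calculations have already filled the temporary table outside any transaction.

```sql
-- Hypothetical object names; @LoadId stands for the caller's load id.
DECLARE @LoadId int = 42;

-- The long-running reads and calculations happen before this point,
-- outside any transaction, and leave their output in #Results.
CREATE TABLE #Results (LoadId int, Result decimal(18, 4));

BEGIN TRY
    BEGIN TRANSACTION;

    -- Only these short statements hold locks until commit.
    INSERT INTO dbo.ResultTable (LoadId, Result)
    SELECT LoadId, Result
    FROM   #Results;

    DELETE FROM dbo.LoadData
    WHERE  LoadId = @LoadId;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;   -- SQL Server 2012+; use RAISERROR on older versions
END CATCH;
```

The trade-off is that the bulk insert into "LoadData" is no longer undone automatically on failure, so a failed run would need its own cleanup by LoadId.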