I was testing possible issues with a query when the connection is lost or times out. To do the test, I run a query on a fresh connection and, just seconds later, I kill the program or disconnect the network. Then I check the impact of the query.
I believe that if a query that is not inside an explicit transaction fails for any reason, its effects are rolled back. Of course, this should hold for operations like DELETE, INSERT, and UPDATE, and for DDL statements too. Implicit transactions are OFF in the database.
My theory held true except when I ran a SELECT INTO statement. Here is a sample query I tried:
SELECT * INTO test_table FROM audit
It failed due to a socket read timeout, but I later found that even though no records were inserted, the new table test_table had been created (empty).
After browsing the docs for a while, I found that according to the official documentation this is expected behavior. That's understandable. But the problem for me is that I can't simply retry the query, because the table already exists.
I guess the fix is to wrap such statements in an explicit transaction.
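Something like this minimal sketch is what I have in mind - wrapping the statement in an explicit transaction, so that if the connection dies before the commit, SQL Server rolls back the open transaction and the CREATE TABLE done by SELECT INTO is undone as well:

BEGIN TRANSACTION;

-- the table creation and the inserts both happen inside the transaction
SELECT * INTO test_table FROM audit;

COMMIT TRANSACTION;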
To help me finish this feature: am I going the right way? And are there any other SQL statements that can cause similar behavior?
Thanks in advance.
Edit:
Since I got suggestions on how to fix this, I am wondering now if there are any other SQL statements that can cause similar behavior.
Related
So I have been struggling with handling transactions in SSIS. My requirement is to achieve transactions without enabling the MSDTC service, and I have partially achieved that, but I just hit another error which feels like one of the many bugs in SSIS. I used an Execute SQL Task and explicitly issued BEGIN TRAN and COMMIT/ROLLBACK TRAN in my package, and the package is otherwise working fine. All the tables are enclosed in a sequence container. I have a case where the output from one table goes into two different tables, and that's where the problem is. The funny part is that even when the package fails, I still see inserts in only those two tables. The package is shown in the attached image; I have disabled two tables there. Those two tables take input from Frholdsum, and the inserts remain even if the package fails and there is no data in the FDR holdssum tables. Microsoft never ceases to amaze me :(
Set RetainSameConnection on your ConnectionManager to true.
https://munishbansal.wordpress.com/2009/04/01/how-to-retain-same-data-connection-across-multiple-tasks-in-ssis/
It works fine if I explicitly write delete statements after the rollback runs, like this:
rollback tran; delete from dbo.UCOP_ENDOW_INVEST; delete from dbo.ucop_fdr_attrib;
I shouldn't have to do this though :(
I was reading an article the other day that showed how to run SQL UPDATEs, INSERTs, or DELETEs in a what-if type scenario. I don't remember the parameter they talked about, and now I can't find the article. Not sure if I was dreaming.
Anyway, does anyone know if there is a parameter in SQL 2008 that lets you try an insert, update, or delete without actually committing it? It would actually log or show you what it would have updated, and then you remove the parameter and run it for real once it behaves as you expect.
I don't know of a SQL 2008-specific feature, but with any SQL server that supports transactions you can do this:
Start a transaction ("BEGIN TRANSACTION" in TSQL)
The rest of your INSERT/UPDATE/DELETE/whatever code
(optional) Some extra SELECT statements and such if needed to output the result of the above actions, if the default output from step 2 (things like "X rows affected") is not enough
Rollback the transaction ("ROLLBACK TRANSACTION" in TSQL)
(optional) Repeat the testing code to show how things are without the code in step 2 having run
For example:
BEGIN TRANSACTION
-- make changes
DELETE people WHERE name LIKE 'X%'
DELETE people WHERE name LIKE 'D%'
EXEC some_proc_that_does_more_work
-- check the DB state after the changes
SELECT COUNT(*) FROM people
-- undo
ROLLBACK TRANSACTION
-- confirm the DB state without the changes
SELECT COUNT(*) FROM people
(you might prefer to do the optional "confirm" step before starting the transaction rather than after rolling it back, but I've always done it this way around as it keeps the two likely-to-be-identical sections of code together for easier editing)
If you use something like this rather than something SQL 2008-specific, the technique should be transferable to other RDBMSs too (just update the syntax if needed).
OK, finally figured it out. I've confused this with another project I was working on with PowerShell. PowerShell has a "whatif" parameter that can be used to show you what files would be removed before they are removed.
My apologies to those who have spent time trying to find an answer to this post, and my thanks to those of you who have responded.
I believe you're talking about BEGIN TRANSACTION
BEGIN TRANSACTION starts a local transaction for the connection issuing the statement. Depending on the current transaction isolation level settings, many resources acquired to support the Transact-SQL statements issued by the connection are locked by the transaction until it is completed with either a COMMIT TRANSACTION or ROLLBACK TRANSACTION statement. Transactions left outstanding for long periods of time can prevent other users from accessing these locked resources, and also can prevent log truncation.
Do you perhaps mean SET NOEXEC ON ?
When SET NOEXEC is ON, SQL Server compiles each batch of Transact-SQL statements but does not execute them. When SET NOEXEC is OFF, all batches are executed after compilation.
Note that this won't warn/indicate things like key violations.
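For reference, the usual pattern looks something like this (dbo.people is just a placeholder table for illustration):

SET NOEXEC ON;
GO

-- this batch is parsed and compiled (so syntax errors are reported) but not executed
UPDATE dbo.people SET name = 'test' WHERE id = 1;
GO

SET NOEXEC OFF;
GO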
Toad for SQL Server has a "Validate SQL" feature that checks queries for wrong table/column names, etc. Maybe you are talking about some new feature in SSMS 2008 similar to that...
I'm more than seven years late to this particular party but I suspect the feature in question may also have been the OUTPUT clause. Certainly, it can be used to implement whatif functionality similar to Powershell's in a t-sql stored procedure.
https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql
Use this in each INSERT/UPDATE/DELETE/MERGE query so the stored procedure can output a meaningful result set of the changes it makes, e.g. outputting the table name and action performed as the first two columns, followed by all the altered columns.
Then simply roll back the changes if a @whatif parameter is set to 1, or commit them if @whatif is set to 0.
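A rough sketch of that pattern (the procedure, table, and column names here are made up for illustration):

CREATE PROCEDURE dbo.Purge_People
    @whatif bit = 1
AS
BEGIN
    BEGIN TRANSACTION;

    -- OUTPUT returns a result set describing exactly which rows were affected
    DELETE FROM dbo.people
    OUTPUT 'dbo.people' AS table_name, 'DELETE' AS action, deleted.*
    WHERE name LIKE 'X%';

    IF @whatif = 1
        ROLLBACK TRANSACTION;  -- report the changes without keeping them
    ELSE
        COMMIT TRANSACTION;
END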
I am very much confused.
I have a transaction in ReadCommitted Isolation level. Among other things I am also updating a counter value in it, something similar to below:
Update tblCount set counter = counter + 1
My application is a desktop application, and this transaction occurs quite frequently and concurrently. We recently noticed that sometimes the counter value doesn't get updated, i.e. an increment is missed. We also insert one record on each counter update, so we are sure the record was inserted but somehow the counter failed to update. This happens about once in 2,000 simultaneous transactions.
I seriously doubt it is a lost-update anomaly I am facing, because if you look at the command above, it just updates the counter from its own value: if I have started a transaction and the transaction has reached this statement, it should have locked the row. This should not cause a lost update, and yet it's happening somehow.
Could it be that this UPDATE command works in two parts? That is, does it first read the counter value (without taking an exclusive lock) and then write the new calculated value (at which point it does take an exclusive lock)?
Please help, I have got really confused.
The update command does not work in two parts. It only works in one.
There's something else going on, and my first guess would be that your transaction is rolling back for another reason. Out of those 2,000 transactions, for example, one may be rolling back - especially if you're doing a ton of things concurrently - and it didn't succeed at all.
That update may not have been what caused the problem, either - you may have deadlocks involved due to other transactions, and they may be failing before the update command (or during the update command).
I'd zoom out and ask questions about the transaction's error handling. Are you doing everything in try/catch blocks? Are you capturing error levels when transactions fail? If not, you'll need to capture a trace with Profiler to find out what's going on.
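For example, a minimal T-SQL sketch of that kind of error capture (dbo.ErrorLog is a hypothetical logging table):

BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE tblCount SET counter = counter + 1;
    -- ... the accompanying INSERT goes here ...

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- record the failure so it is not lost silently
    INSERT INTO dbo.ErrorLog (ErrorNumber, ErrorMessage, LoggedAt)
    VALUES (ERROR_NUMBER(), ERROR_MESSAGE(), SYSDATETIME());
END CATCH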
Are you sure that the SQL is always succeeding? What I mean is, could it be something like an occasional lock timeout? Are you handling SQL exceptions in your .NET code in a way that will make you aware of them (i.e. a pop-up message or a log entry)?
I have transactional replication running between two databases. I fear they have fallen slightly out of sync, but I don't know which records are affected. If I knew, I could fix it manually on the subscriber side.
SQL Server is giving me this message:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
I've looked around to try to find out what table, or even better what record is causing the issue, but I can't find that information anywhere.
The most detailed data I've found so far is:
Transaction sequence number: 0x0003BB0E000001DF000600000000, Command ID: 1
But how do I find the table and row from that? Any ideas?
This gives you the table the error is against
use distribution
go
select * from dbo.MSarticles
where article_id in (
select article_id from MSrepl_commands
where xact_seqno = 0x0003BB0E000001DF000600000000)
And this will give you the command (and the primary key (ie the row) the command was executing against)
exec sp_browsereplcmds
@xact_seqno_start = '0x0003BB0E000001DF000600000000',
@xact_seqno_end = '0x0003BB0E000001DF000600000000'
I'll answer my own question with a workaround I ended up using.
Unfortunately, I could not figure out which table was causing the issue through the SQL Server replication interface (or the Event Log for that matter). It just didn't say.
So the next thing I thought of was, "What if I could get replication to continue even though there is an error?" And lo and behold, there is a way. In fact, it's easy. There is a special Distribution Agent profile called "Continue on data consistency errors." If you enable that, then these types of errors will just be logged and passed on by. Once it is through applying the transactions and potentially logging the errors (I only encountered two), then you can go back and use RedGate SQL Data Compare (or some other tool) to compare your two databases, make any corrections to the subscriber and then start replication running again.
Keep in mind, for this to work, your publication database will need to be "quiet" during the part of the process where you diff and fix the subscriber database. Luckily, I had that luxury in this case.
If your database is not prohibitively large, I would stop replication, re-snapshot and then re-start replication. This technet article describes the steps.
If it got out of sync due to a user accidentally changing data on the replica, I would set the necessary permissions to prevent this.
This replication article is worth reading.
Use this query to find out the article that is out of sync:
USE [distribution]
select * from dbo.MSarticles
where article_id IN (SELECT article_id from MSrepl_commands
where xact_seqno = 0x0003BB0E000001DF000600000000)
Of course, if you check the error when the replication fails, it also tells you which record is at fault, so you could extract that data from the core system and just insert it on the subscriber.
This is better than skipping errors, since SQL Data Compare will lock the table for the comparison, and if you have millions of rows this can take a long time to run.
Tris
Changing the profile to "Continue on data consistency errors" won't always work. Obviously it reduces or suppresses the error, but you won't get complete, correct data: it skips the rows on which an error occurs, so you end up with inaccurate data.
The following checks resolved my problem:
Check that all the replication SQL Server Agent jobs are running, and start them if they are not.
In my case, the agent had been stopped because a DBA had killed a session a few hours earlier due to a blocking issue.
After a very short time, all data in the subscription was updated and there were no further errors in Replication Monitor.
In my case, none of the queries above returned anything.
This error usually occurs when a particular record does not exist on the subscriber, but an UPDATE or DELETE command for that same record was executed on the publisher and then replicated to the subscriber.
Because the record does not exist on the subscriber, replication throws a "row not found" error.
To resolve this error and get replication back to its normal running state:
We can check with the following query whether the command at the publisher was an UPDATE or a DELETE statement:
USE [distribution]
SELECT *
FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id = 1
AND xact_seqno = 0x00099979000038D6000100000000
We can get the article_id from the query above, which can then be passed to the proc below:
EXEC sp_browsereplcmds
@article_id = 813,
@command_id = 1,
@xact_seqno_start = '0x00099979000038D60001',
@xact_seqno_end = '0x00099979000038D60001',
@publisher_database_id = 1
The query above will tell you whether it was an UPDATE or a DELETE statement.
In case of a DELETE statement:
The command can be deleted directly from msrepl_commands so that replication won't keep retrying it:
DELETE FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id =1
AND xact_seqno = 0x00099979000038D6000100000000
In case of an UPDATE statement:
You need to insert the missing record manually from the publisher DB into the subscriber DB.
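A rough sketch of what that manual fix might look like, assuming a linked server named [PUBLISHER] and a hypothetical table dbo.MyTable with primary key Id (use the key value reported by sp_browsereplcmds):

INSERT INTO dbo.MyTable (Id, Col1, Col2)
SELECT Id, Col1, Col2
FROM [PUBLISHER].[PublisherDb].dbo.MyTable
WHERE Id = 12345;
-- if Id is an identity column, wrap this in SET IDENTITY_INSERT dbo.MyTable ON/OFF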
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause.
I've been all up and down the mgmt studio options, but can't find a confirm option. I know other tools for other databases have it.
I'd suggest that you always write a SELECT statement with the WHERE clause first and execute it to see exactly which rows your DELETE command will remove. Then just execute the DELETE with the same WHERE clause. The same applies to UPDATEs.
Under Tools>Options>Query Execution>SQL Server>ANSI, you can enable the Implicit Transactions option which means that you don't need to explicitly include the Begin Transaction command.
The obvious downside of this is that you might forget to add a Commit (or Rollback) at the end, or worse still, your colleagues will add Commit at the end of every script by default.
You can lead the horse to water...
You might suggest that they always take an ad-hoc backup before they do anything (depending on the size of your DB) just in case.
Try using a BEGIN TRANSACTION before you run your DELETE statement.
Then you can choose to COMMIT or ROLLBACK same.
In SSMS 2005, you can enable this option under Tools|Options|Query Execution|SQL Server|ANSI ... check SET IMPLICIT_TRANSACTIONS. That will require a commit to affect update/delete queries for future connections.
For the current query, go to Query|Query Options|Execution|ANSI and check the same box.
This page also has instructions for SSMS 2000, if that is what you're using.
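With that box checked, a session behaves something like this (dbo.people is a placeholder table for illustration):

SET IMPLICIT_TRANSACTIONS ON;

DELETE FROM dbo.people WHERE name LIKE 'X%';  -- a transaction is started implicitly

-- nothing is permanent until you end the transaction yourself
ROLLBACK TRANSACTION;  -- or COMMIT TRANSACTION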
As others have pointed out, this won't address the root cause: it's almost as easy to paste a COMMIT at the end of every new query you create as it is to fire off a query in the first place.
First, this is what audit tables are for. If you know who deleted all the records, you can either restrict their database privileges or deal with them from a job-performance perspective. The last person who did this at my office is currently on probation; if she does it again, she will be let go. You have responsibilities if you have access to production data, and ensuring that you cause no harm is one of them. This is a people problem as much as a technical problem.
You will never find a way to prevent people from making dumb mistakes (the database has no way to know if you meant delete table a or delete table a where id = 100, and a confirm prompt will get clicked through automatically by most people). You can only try to reduce them by making sure the people who run this code are responsible and by putting policies in place to help them remember what to do. Employees who have a pattern of behaving irresponsibly with your business data (particularly after they have been given a warning) should be fired.
Others have suggested the kinds of things we do to prevent this from happening. I always embed a select in a delete that I'm running from a query window to make sure it will delete only the records I intend. All our code on production that changes, inserts or deletes data must be enclosed in a transaction. If it is being run manually, you don't run the rollback or commit until you see the number of records affected.
Example of delete with embedded select
delete a
--select a.* from
from table1 a
join table2 b on a.id = b.id
where b.somefield = 'test'
But even these techniques can't prevent all human error. A developer who doesn't understand the data may run the select and still not understand that it is deleting too many records. Running in a transaction may mean you have other problems when people forget to commit or rollback and lock up the system. Or people may put it in a transaction and still hit commit without thinking just as they would hit confirm on a message box if there was one. The best prevention is to have a way to quickly recover from errors like these. Recovery from an audit log table tends to be faster than from backups. Plus you have the advantage of being able to tell who made the error and exactly which records were affected (maybe you didn't delete the whole table but your where clause was wrong and you deleted a few wrong records.)
For the most part, production data should not be changed on the fly. You should script the change and check it on dev first. Then on prod, all you have to do is run the script with no changes, rather than highlighting and running little pieces one at a time. Now, in the real world this isn't always possible, as sometimes you are fixing something broken only on prod that needs to be fixed now (for instance, when none of your customers can log in because critical data got deleted). In a case like this, you may not have the luxury of reproducing the problem first on dev and then writing the fix. When you have these types of problems, you may need to fix directly on prod, and the fix should be done only by DBAs, database analysts, configuration managers, or others who are normally responsible for data on prod, not by a developer. Developers in general should not have access to prod.
That is why I believe you should always:
1. Use stored procedures that are tested on a dev database before deploying to production
2. Select the data before deletion
3. Screen developers using an interview and performance evaluation process :)
4. Base performance evaluation on how many database tables they do/do not delete
5. Treat production data as if it were poisonous and be very afraid
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause
Probably the only solution will be to replace that someone with someone else ;). Otherwise they will always find a workaround.
Failing that, restrict that person's database access, provide them with a stored procedure that takes the value used in the WHERE clause as a parameter, and grant them permission to execute only that stored procedure.
Put on your best Trogdor and Burninate until they learn to put in the WHERE clause.
The best advice is to get the muckety-mucks that are mucking around in the database to use transactions when testing. It goes a long way towards preventing "whoops" moments. The caveat is that now you have to tell them to COMMIT or ROLLBACK because for sure they're going to lock up your DB at least once.
Lock it down:
REVOKE delete rights on all your tables.
Put in an audit trigger and audit table.
Create parameterized delete SPs and only grant execute rights on an as-needed basis (see the sketch below).
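A minimal sketch of such a parameterized delete procedure (the table, column, and role names are hypothetical):

CREATE PROCEDURE dbo.People_DeleteById
    @Id int
AS
BEGIN
    SET NOCOUNT ON;
    -- the WHERE clause is baked in, so a missing predicate can't wipe the table
    DELETE FROM dbo.people WHERE id = @Id;
END
GO

GRANT EXECUTE ON dbo.People_DeleteById TO data_entry_role;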
Isn't there a way to give users the results they need without providing raw access to SQL? If you at least had a separate entry box for "WHERE", you could default it to "WHERE 1 = 0" or something.
I think there must be a way to back these out of the transaction journaling, too. But probably not without rolling everything back, and then selectively reapplying whatever came after the fatal mistake.
Another ugly option is to create a trigger to write all DELETEs (maybe over some minimum number of records) to a log table.
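A rough sketch of such a trigger (the table, audit table, and the 10-row threshold are made up for illustration):

CREATE TABLE dbo.people_delete_log (
    deleted_at datetime2 NOT NULL DEFAULT SYSDATETIME(),
    id int,
    name varchar(100)
);
GO

CREATE TRIGGER dbo.trg_people_delete_log
ON dbo.people
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- only log bulk deletes, e.g. anything over 10 rows at once
    IF (SELECT COUNT(*) FROM deleted) > 10
        INSERT INTO dbo.people_delete_log (id, name)
        SELECT id, name FROM deleted;
END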