Database Keeps Going into Recovery Pending State - sql-server

I have a SQL Server database that has been running perfectly fine on my machine for about 6 months. A couple of days ago, out of nowhere, it became inaccessible (Recovery Pending).
I did a bunch of Googling and have tried the following things to fix the issue, but thus far restoring it from a previous backup is the only thing that seems to work.
I have tried (from SSMS and SQLCMD):
ALTER DATABASE mydatabase SET EMERGENCY
ALTER DATABASE mydatabase set single_user
DBCC CHECKDB (mydatabase, REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS;
ALTER DATABASE mydatabase set multi_user
Step #3 errors out with: "cannot open mydatabase is already open and can only have one user at a time"
Second try:
EXEC sp_resetstatus 'mydatabase';
ALTER DATABASE mydatabase SET EMERGENCY
DBCC CHECKDB ('mydatabase')
ALTER DATABASE mydatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DBCC CHECKDB ('mydatabase', REPAIR_ALLOW_DATA_LOSS)
ALTER DATABASE mydatabase SET MULTI_USER
Step #5 errors out with the same error.
My question is: what could be causing this in the first place, and how can I fix it properly without having to do a restore twice a day?

"Database is already open and can only have one user at a time" is error number 924. The complete error message looks like this:
An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
Msg 924, Level 14, State 1, Line 1 Database ‘db_name’ is already open and can only have one user at a time.
Severity level 14 covers security-related errors such as a denied permission. Here it means the database cannot be opened because another session is already using its single available connection.
Use the sp_who or sp_who2 stored procedure to find the sessions that are active in the database, then use the KILL command to end them.
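As a rough sketch of that advice, the query below lists the sessions currently attached to the database so you can decide which one to end. It assumes SQL Server 2012 or later (where sys.dm_exec_sessions exposes database_id) and reuses the placeholder name mydatabase from the question; the session ID 53 in the commented KILL line is purely hypothetical.

```sql
-- List every session whose current database is mydatabase (placeholder name).
SELECT session_id, login_name, host_name, program_name, status
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID(N'mydatabase');

-- To end one of the sessions reported above, uncomment the line below and
-- replace 53 (a made-up session ID) with a value returned by the query.
-- KILL 53;
```

Running this from a connection whose current database is master avoids your own session showing up as the one holding the single-user slot.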
I also found this thread useful: How to fix Recovery Pending State in SQL Server Database?

what could be causing this in the first place and how can I fix it properly without having to do a restore
The most likely cause is a hardware or driver problem with your hard disk.

In my case, I had databases set up on my local machine, but their files were on an external drive mapped to that machine. The external drive is normally connected to my docking station all the time, but I had to disconnect it, and after I reconnected it the databases stored on the external drive went into Recovery Pending mode.
What helped me was to set the database offline in Microsoft SQL Server Management Studio by right-clicking the database and choosing Tasks - Take Offline. The status of the database changes to Offline. After that, bring the database online again by right-clicking the database and choosing Tasks - Bring Online.
The database was successfully recovered without any issues. But if the cause is different, these steps may not help.
Take the database offline
Bring the database back online
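The same two steps can probably be scripted instead of clicked through. This is a minimal sketch, assuming the placeholder database name mydatabase and that the underlying files are reachable again (for example, the external drive has been reconnected); if they are not, the SET ONLINE statement is expected to fail with an error describing why.

```sql
USE master;
GO
-- Take the database offline, rolling back any open transactions.
ALTER DATABASE mydatabase SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO
-- Bring it back online; this makes SQL Server retry opening the files
-- and run recovery.
ALTER DATABASE mydatabase SET ONLINE;
GO
```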

Related

Can't Run DBCC CHECKDB on master DB - Azure Files

Storing SQL Server database files on new Azure Files share. Cannot run full / comprehensive CHECKDB against these databases - I think this has something to do with user account not having permissions to create snapshots. As a result, I offloaded these checks to an alternate server where I can also test .baks. Everything works fine except for the master db, which registers corruption when you restore it as a user db and run CHECKDB against it (https://www.itprotoday.com/my-master-database-really-corrupt), even though it's not corrupt.
Questions:
1) Has anyone run into the same problem running CHECKDB on SQL db files stored on an Azure Files share? Is there a workaround?
2) What's an alternative to running CHECKDB on master if I cannot run it in PROD? Can I somehow restore master to another SQL instance and check it there?
Error when I execute DBCC CHECKDB (master) in PROD:
Msg 5030, Level 16, State 12, Line 4
The database could not be exclusively locked to perform the operation.
Msg 7926, Level 16, State 1, Line 4
Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.
Message when I run DBCC CHECKDB on user db in PROD:
DBCC CHECKDB will not check SQL Server catalog or Service Broker consistency because a database snapshot could not be created or because WITH TABLOCK was specified.
Please reference this Azure Support document: Error message when you run any of the DBCC CHECK commands in SQL Server: "The database could not be exclusively locked to perform the operation"
In Microsoft SQL Server, you may receive an error message when you run any of the following DBCC commands:
DBCC CHECKDB
DBCC CHECKTABLE
DBCC CHECKALLOC
DBCC CHECKCATALOG
DBCC CHECKFILEGROUP
The error message contains the following text:
Msg 5030, Level 16, State 12, Line 1 The database could not be exclusively locked to perform the operation.
Msg 7926, Level 16, State 1, Line 1
Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.
Cause:
This problem occurs if the following conditions are true:
At least one other connection is using the database against which you run the DBCC CHECK command.
The database contains at least one file group that is marked as read-only.
Starting with SQL Server 2005, DBCC CHECK commands create and use an internal database snapshot for consistency purposes when the command performs any checks. If a read-only file group exists in the database, the internal database snapshot is not created. To continue to perform the checks, the DBCC CHECK command tries to acquire an EX database lock. If other users are connected to this database, this attempt to acquire an EX lock fails. Therefore, you receive an error message.
Resolution
To resolve this problem, follow these steps instead of running the DBCC CHECK command against the database:
Create a database snapshot of the database for which you want to perform the checks. For more information about how to create a database snapshot, see the "Create a Database Snapshot (Transact-SQL)" topic in SQL Server Books Online.
Run the DBCC CHECK command against the database snapshot.
Drop the database snapshot after the DBCC CHECK command is completed.
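The three resolution steps might look roughly like the following for a user database. Everything named here is an assumption for illustration: the database mydatabase, its single data file with logical name mydatabase_data, the snapshot name mydatabase_check, and the path D:\Snapshots. A database with several data files needs one entry per file in the ON clause, and user-created snapshots cannot be made of master, model, or tempdb.

```sql
-- 1. Create a snapshot of the database to be checked.
CREATE DATABASE mydatabase_check
ON ( NAME = mydatabase_data,                              -- logical data file name (assumed)
     FILENAME = N'D:\Snapshots\mydatabase_check.ss' )     -- sparse file path (assumed)
AS SNAPSHOT OF mydatabase;
GO
-- 2. Run the consistency check against the snapshot, not the source database.
DBCC CHECKDB (N'mydatabase_check') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
-- 3. Drop the snapshot once the check has finished.
DROP DATABASE mydatabase_check;
GO
```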
The linked document describes the problem and its resolution in more detail.
Updates:
For the system databases it does not use database snapshots, but it will hold table locks.
You can also refer to this blog post: Checkdb giving error for master database:
Mike Walsh explains the error in more detail there.
Hope this helps.

SQL Server detected a logical consistency-based I/O error

I am using SharePoint Foundation 2010. I got error 824 in the event logs while running the regularly scheduled job that backs up the databases.
WSS_Logging shows the error below:
"SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xa691e24a; actual: 0xb68ce671). It occurred during a read of page (1:6095) in database ID 9 at offset 0x00000002f9e000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\WSS_Logging.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). "
Please help..
From MSDN:
What does this error mean:
This error indicates that Windows reports that the page is successfully read from disk, but SQL Server has discovered something wrong with the page. This error is similar to error 823 except that Windows did not detect the error. This usually indicates a problem in the I/O subsystem, such as a failing disk drive, disk firmware problems, faulty device driver, and so on.
Simply put, run CHKDSK and see if you get any errors:
CHKDSK [volume[[path]filename]] [/F] [/V] [/R] [/X] [/I] [/C] [/L[:size]]
Also change the PAGE_VERIFY option to CHECKSUM if you haven't already.
Read below for more details from the MSDN link.
What are the next steps:
Look for Hardware Failure
Run hardware diagnostics and correct any problems. Also examine the Microsoft Windows system and application logs and the SQL Server error log to see whether the error occurred because of hardware failure. Fix any hardware-related problems that are contained in the logs.
If you have persistent data corruption problems, try to swap out different hardware components to isolate the problem. Check to make sure that the system does not have write-caching enabled on the disk controller. If you suspect write-caching to be the problem, contact your hardware vendor.
Finally, you might find it useful to switch to a new hardware system. This switch may include reformatting the disk drives and reinstalling the operating system.
Restore from Backup
If the problem is not hardware-related and a known clean backup is available, restore the database from the backup.
Consider changing the databases to use the PAGE_VERIFY CHECKSUM option.
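A short sketch of checking and changing the page verification setting follows; yourdatabasename is a placeholder. Note that switching to CHECKSUM does not rewrite existing pages: a checksum is only added to a page the next time that page is written to disk.

```sql
-- Show the current page verification setting (NONE, TORN_PAGE_DETECTION or CHECKSUM).
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE name = N'yourdatabasename';

-- Switch the database to checksum-based page verification.
ALTER DATABASE yourdatabasename SET PAGE_VERIFY CHECKSUM;
```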
Run:
DBCC CHECKDB (yourdatabasename)
CHECKDB will report errors against the affected tables.
You can then repair each of those tables with DBCC CHECKTABLE('tablename1', REPAIR_ALLOW_DATA_LOSS), which, as the option name says, may discard damaged data:
USE yourdatabasename;
GO
ALTER DATABASE yourdatabasename
SET single_user;
GO
DBCC CHECKTABLE('tablename1', REPAIR_ALLOW_DATA_LOSS)
GO
DBCC CHECKTABLE('tablename2', REPAIR_ALLOW_DATA_LOSS)
GO
ALTER DATABASE yourdatabasename
SET MULTI_USER;
GO
USE dbreckitInventory;
GO
ALTER DATABASE dbreckitInventory
SET single_user;
GO
DBCC CHECKTABLE('tblpurchasedetails', REPAIR_ALLOW_DATA_LOSS)
GO
DBCC CHECKTABLE('TblSalesDetails', REPAIR_ALLOW_DATA_LOSS)
GO
ALTER DATABASE dbreckitInventory
SET MULTI_USER;
GO

Database Recovery Pending - SQL Server 2014

My server shut down because of an electrical failure. When I opened my database in SSMS afterwards, it was in the Recovery Pending state.
I checked my error log, and the messages are:
4 transactions rolled forward in database 'POSDW' (14:0). This is an
informational message only. No user action is required.
restoreHkDatabase: DbId 14, Msg 41313, Level 16, State 1, The C
compiler encountered a failure. The exit code was 2.
[ERROR] Database ID: [14] 'POSDW'. Failed to load XTP checkpoint.
Error code: 0x82000009.
(d:\sql12_main_t\sql\ntdbms\hekaton\sqlhost\sqlmin\hkhostdb.cpp : 3126
- 'RecoverHkDatabase') Error: 41313, Severity: 16, State: 1.
I already tried taking the database offline, but when I bring it back online I get an error.
Can you guys help me.
Thanks
It looks like corruption. You can try one of the following options:
Restore from existing backup
Put the database into emergency mode and run DBCC CHECKDB. Depending on the results, you can see whether you can restore the damaged pages from an existing backup (in some cases) or run DBCC CHECKDB.
If you have functional replica of the data take the data from there.
Hope this helps.
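The emergency-mode check suggested above could be sketched as follows, using the database name POSDW from the error log. No repair option is specified here, so the check only reports what it finds. Whether emergency mode is reachable at all for a database whose XTP checkpoint failed to load is not something this sketch can promise; treat it as a first diagnostic step only.

```sql
-- Make the database readable by sysadmin members despite the failed recovery.
ALTER DATABASE POSDW SET EMERGENCY;
GO
-- Report consistency errors without attempting any repair.
DBCC CHECKDB (N'POSDW') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
```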
You were able to move the database files while they were in "Recovery Pending" mode because that status means SQL Server couldn't open the database files for some reason, which also means it never locked them.
Setting a database to "offline" is similar because it unlocks the underlying files (that's why you can move database files while it's offline).
I'm pretty sure that if you tried setting all these databases to "online" you'd get the error message you mention or something similar.
ALTER DATABASE MyDB SET ONLINE
You can use the above statement to try and activate databases that are "offline" or are in "Recovery Pending" mode. If there's some kind of problem, SQL Server will let you know at that moment.
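Before trying SET ONLINE on each database, it may help to see which ones are actually affected. A minimal sketch using the sys.databases catalog view:

```sql
-- List every database that is not fully online, together with its state
-- (for example RECOVERY_PENDING, SUSPECT, OFFLINE or EMERGENCY).
SELECT name, state_desc, user_access_desc
FROM sys.databases
WHERE state_desc <> N'ONLINE'
ORDER BY name;
```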
You can also read this thread: How to fix Recovery Pending State in SQL Server Database?

SQL Server production server - all databases are in recovery pending state

All the databases in my SQL Server production server are in recovery pending state. I tried to execute different queries but they were of no use. Please help me as production work has been stopped at client side.
Tried to execute alter commands - but show error as following:
Msg 5120, Level 16, State 101, Line 1 Unable to open the physical file
"G:\Data\MSSQL\Database.mdf". Operating system error 3: "3(The system
cannot find the path specified.)". File activation failure. The
physical file name "G:\Data\MSSQL\Data\Database_log.ldf" may be
incorrect. Msg 945, Level 14, State 2, Line 1 Database 'Database'
cannot be opened due to inaccessible files or insufficient memory or
disk space. See the SQL Server errorlog for details
Msg 5069, Level 16, State 1, Line 1
Recovery pending means that for some reason SQL cannot run restart recovery on the database. Usually this is because the log is missing or corrupt.
Check to see if you can find the Database.mdf and Database_log.ldf files in the folder specified.
Check your system has not run out of disk space.
This could be caused by a hard drive failure. You may need to restore your last full backup, then any differentials, and then the log backups up until the point where the log error occurred.
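The checks and the restore sequence above might be sketched like this. The database name [Database] comes from the error message; every backup path and file name under X:\Backups is hypothetical, and the number of differential and log backups will differ in practice.

```sql
-- Where does SQL Server expect the files to be, and what state are they in?
SELECT name, physical_name, state_desc
FROM sys.master_files
WHERE database_id = DB_ID(N'Database');
GO

-- Restore chain: full backup, latest differential, then log backups in order.
-- NORECOVERY keeps the database in the restoring state so more backups can be applied.
RESTORE DATABASE [Database]
    FROM DISK = N'X:\Backups\Database_full.bak'   -- hypothetical path
    WITH NORECOVERY, REPLACE;
RESTORE DATABASE [Database]
    FROM DISK = N'X:\Backups\Database_diff.bak'   -- hypothetical path
    WITH NORECOVERY;
RESTORE LOG [Database]
    FROM DISK = N'X:\Backups\Database_log1.trn'   -- hypothetical path
    WITH NORECOVERY;
-- After the last usable log backup, run recovery and bring the database online.
RESTORE DATABASE [Database] WITH RECOVERY;
GO
```

If the original G: path no longer exists, the data and log files would also need a WITH MOVE clause on the full restore to place them on a volume that does.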
See similar issue here
My team has encountered this error many times for my clients, and I know it is not easy to manage on a production server. In your case the first error is 5120. We have seen this error when the database is in read-only mode.
To fix this, you can run the code below:
USE [master]
GO
ALTER DATABASE [SQLAuthority] SET READ_WRITE WITH NO_WAIT
GO
After fixing 5120, you can proceed to fix "databases are in recovery pending state".
Recovery Pending – If the SQL Server knows that database recovery needs to be run but something is preventing it from starting, the Server marks the db in ‘Recovery Pending’ state. This is different from the SUSPECT state because it cannot be said that recovery is going to fail – it just hasn’t started yet.
Execute the following set of queries:
ALTER DATABASE [DBName] SET EMERGENCY;
GO
ALTER DATABASE [DBName] SET SINGLE_USER;
GO
DBCC CHECKDB ([DBName], REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS;
GO
ALTER DATABASE [DBName] SET MULTI_USER;
GO
Note: You can also read the Microsoft Warning on DBCC CHECKDB REPAIR ALLOW DATA LOSS.
It might be because of one of the following possible causes:
Permissions
Find your SQL Server instance in the services list and double-click it, then select the Log On tab. It is this log on account that must have sufficient permissions to write to the temporary backup folder location. Check the permissions on the temporary backup folder by right-clicking it in Windows Explorer, selecting Properties, then navigating to the Security tab. Make sure that the account SQL Server is using has explicit read/write permissions for this folder.
Mapped Drives
Use a fully qualified UNC path instead of a mapped drive letter.
Lack Of Domain Trust
You can resolve this issue by ensuring that a trust between the two domains is established. You may also need to configure the SQL Server service account with pass-through authentication between the domains.
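For the permissions check, the service account can also be read from inside SQL Server rather than from the Services console. This is a small sketch, assuming a version that provides the sys.dm_server_services view (SQL Server 2008 R2 SP1 or later) and a login with VIEW SERVER STATE permission:

```sql
-- Show which Windows account each SQL Server service on this instance runs under.
SELECT servicename, service_account, status_desc, startup_type_desc
FROM sys.dm_server_services;
```

The account shown for the database engine service is the one that needs read/write access to the folders holding the database and backup files.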
See here for more on recovering the database.
Execute these queries to fix SQL server database in recovery pending state:
ALTER DATABASE [DBName] SET EMERGENCY
GO
ALTER DATABASE [DBName] SET single_user
GO
DBCC CHECKDB ([DBName], REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS
GO
ALTER DATABASE [DBName] SET multi_user
GO
EMERGENCY mode marks the SQL Server database as READ_ONLY, disables logging, and limits access to members of the sysadmin fixed server role. This method can resolve many corruption-related problems and bring the database back to an accessible state, at the risk of losing data. After a successful repair, the database comes out of EMERGENCY mode automatically.

SQL Job Agent DB Restore fails with error #6107: Only user processes can be killed

We have a SQL Server Agent job that runs in the "wee hours" to restore our local database (FooData) from a production backup.
First, the database is set to SINGLE_USER mode and any open processes are killed. Second, the database is restored.
But the 3rd step fails occasionally with Error 6107: "Only User Processes Can Be Killed"
This happens about once or twice a week at seemingly random intervals. Here is the code for step 3 where the failure occasionally occurs:
USE master;
go
exec msdb.dbo.KillSpids FooData;
go
ALTER DATABASE FooData SET MULTI_USER;
go
Does anybody have any ideas what might be occurring to cause this error? I'm thinking there might be some automated process starting up during step 3 or possibly some user trying to log in during that time? I'm not a DBA, so I'm guessing at this point, although I believe that a user should not be able to log in while the DB is in SINGLE_USER mode.
A user probably isn't logged in. The system is probably performing some task. The output of exec sp_who or sp_who2 will show what sessions are open. Any SPID below 50 is a system process, and cannot be killed with KILL. The only way to stop them is to stop the SQL Server service or issue a SHUTDOWN command (which does the same thing).
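One way to avoid error 6107 is to attempt KILL only on sessions that SQL Server itself flags as user processes. The sketch below is not the contents of the asker's msdb.dbo.KillSpids procedure (which is not shown in the question); it is a hypothetical stand-in that assumes SQL Server 2012 or later for the database_id column.

```sql
USE master;
GO
DECLARE @sql nvarchar(max) = N'';

-- Build one KILL statement per user session connected to FooData,
-- skipping system sessions and the session running this script.
SELECT @sql += N'KILL ' + CAST(session_id AS nvarchar(10)) + N';'
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID(N'FooData')
  AND is_user_process = 1
  AND session_id <> @@SPID;

-- Run the generated statements (does nothing if no sessions matched).
EXEC sys.sp_executesql @sql;
GO
```

Filtering on is_user_process is a more direct test than comparing the SPID against 50, because it relies on how the server classifies the session rather than on its number.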
I found the answer to my problem by changing one line of code which worked like a charm.
As mentioned in the original question, the 'KillSpids' line is used in Step 1 of the job (along with SET SINGLE_USER). The 'KillSpids' call made sense in Step 1 because there may be unwanted processes still active on the database.
The 'KillSpids' line was then added again into Step 3, but it was unnecessary, and was also causing the 6107 error.
I replaced the 'KillSpids' line with the one shown below. Setting the freshly restored database to single user mode takes care of the concern that a user might try to log in before all the job steps have been completed. Here is the updated code:
USE master;
go
ALTER DATABASE [FooData] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
go
ALTER DATABASE FooData SET MULTI_USER;
go

Resources