I am using SQL Server Service Broker on SQL Server 2008 for scaleout with SignalR v2.1.2. It was recently discovered that we are producing 50k+ errors per day in our DB logs. After some research, I found 3 orphaned Service Broker queues from December. Error example:
2016-02-27 23:58:01.79 spid30s The activated proc '[dbo].[SqlQueryNotificationStoredProcedure-2ffbddba-6ddc-4ad0-88b4-45a405e975e0]' running on queue 'MY_SIGNALR_DB.dbo.SqlQueryNotificationService-2ffbddba-6ddc-4ad0-88b4-45a405e975e0' output the following: 'Could not find stored procedure 'dbo.SqlQueryNotificationStoredProcedure-2ffbddba-6ddc-4ad0-88b4-45a405e975e0'.'
These queues were created in December and were NOT dropped for some reason. The corresponding stored procedures were apparently dropped as expected. The DB produces an error every 5 seconds for each queue (which works out to roughly 50k per day across the 3 queues). Each queue DOES contain a message.
Questions:
What can cause this?
Are there additional SignalR settings that can be implemented to ensure these are cleaned up?
Is this a bug in SQL Server Service Broker?
Is there a document which describes SignalR's expected behavior with regards to Queues and their expiration?
Thank you for your time.
These are leftover from SqlDependency. The implementation of SqlDependency.Start() creates a just-in-time service, queue and activated procedure (see the reference source). This has some issues, and even a simple Visual Studio debugging session can leave stranded queues/activated procedures.
You can clean up these left-over service/queue/procedures as they happen, or you can choose to use the lower level SqlNotificationRequest class and handle the service/queue deployment on your own. Pick your poison.
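As a rough sketch of the clean-up option (this is not from the original answer, so adapt and test it first): the orphaned objects follow the SqlQueryNotificationService-&lt;guid&gt; naming pattern seen in the error above, and a script along these lines can generate the DROP statements for queues whose activated procedure no longer exists:
-- Sketch: find Service Broker queues whose activated procedure has been
-- dropped (the SqlDependency leftovers) and build DROP commands for the
-- associated service and the queue. Inspect @cmd before running it.
DECLARE @cmd nvarchar(max) = N'';

SELECT @cmd = @cmd
    + N'DROP SERVICE ' + QUOTENAME(s.name) + N'; '
    + N'DROP QUEUE ' + QUOTENAME(SCHEMA_NAME(q.schema_id)) + N'.' + QUOTENAME(q.name) + N'; '
FROM sys.service_queues AS q
JOIN sys.services AS s ON s.service_queue_id = q.object_id
WHERE q.name LIKE 'SqlQueryNotificationService-%'
  AND q.activation_procedure IS NOT NULL
  AND OBJECT_ID(q.activation_procedure) IS NULL;   -- the procedure is already gone

PRINT @cmd;                  -- review the generated statements first
-- EXEC sp_executesql @cmd;  -- then execute them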
Related
I'm working on a .NET application migration from Oracle to SQL Server database. The application was developed in the 2000s by a third party, so we intend to modify it as little as possible in order to avoid introducing new bugs.
I replaced the Oracle references with SqlClient ones (OracleConnection to SqlConnection, OracleTransaction to SqlTransaction, etc.) and everything worked fine. However, I'm having trouble with a logic that tries to reconnect to the DB in case of errors.
If a problem occurs when trying to read/write to the database, method TryReconnect is called. This method checks whether the Oracle exception number is 3114 or 12571; if so, it tries to reopen the connection.
I checked these error codes:
ORA-03114: Not Connected to Oracle
ORA-12571: TNS: packet writer failure
I searched for the equivalent error codes for SQL Server but I couldn't find them. I checked the MSSQL and .NET SqlClient documentation but I'm not sure that any of those is equivalent to ORA-3114 and ORA-12571.
Can somebody help me decide which error numbers should be checked in this logic? I thought about checking for codes 0 (I saw it happen when I stopped the database to force an error and test this) and -2 (Timeout expired), but I'm not really sure about it.
The behavior is different, so you can't base your SQL Server retry logic on Oracle semantics. For starters, SqlConnection will retry connections even in the old System.Data.SqlClient library. Its replacement, Microsoft.Data.SqlClient, includes configurable retry logic aimed at handling connections to cloud databases, e.g. an on-prem application connecting to Azure SQL. This retry logic is on by default in the current RTM version, 3.0.0.
You can also look at high-level resiliency libraries like Polly, a very popular resiliency package that implements recovery strategies like retries with backoff, circuit breakers etc. This article describes Cadru.Polly which contains strategies for handling several SQL Server transient faults. You could use this directly or you can handle the transient error numbers described in that article:
The strategies and the error numbers they handle:
SqlServerTransientExceptionHandlingStrategy: 40501, 49920, 49919, 49918, 41839, 41325, 41305, 41302, 41301, 40613, 40197, 10936, 10929, 10928, 10060, 10054, 10053, 4221, 4060, 12015, 233, 121, 64, 20
SqlServerTransientTransactionExceptionHandlingStrategy: 40549, 40550
SqlServerTimeoutExceptionHandlingStrategy: -2
NetworkConnectivityExceptionHandlingStrategy: 11001
Polly allows you to combine policies and specify different retry strategies for them, e.g.:
Using a cached response in some cases (lookup data?)
Retrying with backoff (even with random delays) in other cases (deadlocks?). Random delays can be very useful if you run into failures because too many concurrent operations cause deadlocks or timeouts; without them, all failing requests would retry at the same time, causing yet another failure.
Using a circuit breaker to switch to a different service or server.
You could create an Oracle strategy so you can use Polly throughout your projects and handle all recoverable failures, not just database retries.
The situation is closely related to "SQL Service Broker - communication scenario - migration from SQL 2008 R2 to SQL 2014" and to "SQL Service Broker -- one central SQL and more satellite SQLs... beginner wants to understand details".
After migration from SQL Server 2008 R2 Standard Ed. to SQL Server 2014 Standard Ed., the same code does not work. The firewall was set to allow the communication.
The sys.transmission_queue (on both the sender and receiver servers) stays empty, and the GenericQueue (my identifier for the queue) receives the messages. However, the procedure attached (on the receiving SQL Server) by:
ALTER QUEUE [GenericQueue]
WITH ACTIVATION (
STATUS = ON,
MAX_QUEUE_READERS = 1,
PROCEDURE_NAME = [usp_CentralActivation],
EXECUTE AS OWNER);
is not activated. I put a log message inside to have tangible proof -- the procedure is not called.
I do not observe any error message or indication -- or I do not know where to look for one. How can I find out what the problem is? What information should I post here to help find the reason?
The code that installs the Service Broker objects is generated from templates, and apart from the machine identification (IP address as a string) the exact same code worked nicely on SQL Server 2008 R2.
Could the EXECUTE AS OWNER be the reason? Who is the OWNER?
A shot in the dark. Try running this:
EXECUTE AS USER = 'dbo';
If this fails, then the database owner SID is invalid locally (happens frequently after a restore or file copy). The solution is trivial:
ALTER AUTHORIZATION ON DATABASE::<dbname> TO [sa];
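A quick way to see whether the owner SID is actually the problem (my own check, not part of the original answer; substitute your database name) is to look for a login that matches it:
-- Sketch: a NULL owner_login here means the database owner SID does not
-- map to any login on this server, which breaks EXECUTE AS OWNER.
SELECT d.name  AS database_name,
       d.owner_sid,
       sp.name AS owner_login            -- NULL = orphaned owner SID
FROM sys.databases AS d
LEFT JOIN sys.server_principals AS sp ON sp.sid = d.owner_sid
WHERE d.name = N'MyBrokerDb';            -- hypothetical database name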
If you still have problems then have a look at Understanding Queue Monitors. Look in sys.dm_broker_queue_monitors and check that 1) the queue is present and 2) the state is RECEIVES_OCCURRING.
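A query along these lines (my sketch of that check; replace the database name with yours) shows whether a monitor exists for GenericQueue and which state it is in:
-- Sketch: list the queue monitors in the receiving database; the state
-- column should show RECEIVES_OCCURRING while activation is firing.
SELECT DB_NAME(qm.database_id)                  AS database_name,
       OBJECT_NAME(qm.queue_id, qm.database_id) AS queue_name,
       qm.state,
       qm.last_activated_time,
       qm.tasks_waiting
FROM sys.dm_broker_queue_monitors AS qm
WHERE DB_NAME(qm.database_id) = N'MyBrokerDb';  -- hypothetical database name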
I am trying to set up merge replication using web synchronization between a publishing SQL Server 2012 Standard and a subscribing SQL Server 2012 Express. After following the instructions provided on TechNet, I am stuck on this:
Source: Merge Process(Web Sync Server)
Number: -2147200985
Message: The subscription to publication 'MyMergePublication' has expired or does not exist.
I already verified that the SSL certificates are good and that I can browse to the publishing machine's URL https://mycomputer/replisapi.dll and get the expected output. I already verified that the snapshot was set up, and I took a giant hammer and used an administrator account to run the pool identity - which is really bad security-wise, but I wanted to rule out security as the thing tripping me up.
To further the mystery, when I try and fail to sync, the publisher acknowledges that a new subscriber has been registered, but it cannot get the snapshot at all, and thus the subscriber database is still empty.
On the replication monitor there is no failed synchronization history, nor any errors; all it has to say is that the subscriber is uninitialized, and no more.
Turning up the verbosity of the merge agent, I saw some SQL being executed; I tried running that SQL myself and found that this call was failing with the same error:
{call sys.sp_MSgetreplicainfo(?,?,?,?,?,?,?,90)}
I called it with only the 3 mandatory parameters supplied and it would fail - despite the fact that a prior call to sp_helpmergepublication does return a row for that publication. Oddly, the output of sp_helpmergepublication does not match what I configured for the subscription (e.g. it says the web URL is null, while viewing the properties correctly shows the web URL being set). Not sure whether that is significant.
The body of sp_MSgetreplicainfo contains a call to another system sproc that I cannot run for some reason (it says not found), so I'm not sure what is actually going on here.
Any clues would be greatly appreciated.
We are running a website on a VPS with SQL Server 2008 R2 x64. We are being bombarded with 17886 errors - namely:
The server will drop the connection, because the client driver has sent multiple requests while the session is in single-user mode. This error occurs when a client sends a request to reset the connection while there are batches still running in the session, or when the client sends a request while the session is resetting a connection. Please contact the client driver vendor.
This causes SQL statements to return corrupt results. I have tried pretty much all of the suggestions I have found on the net, including:
with MARS, and without
with pooling, and without
with async=true, and without
We only have one database and it is absolutely multi-user.
Everything has been installed recently, so it is up to date. The errors may be correlated with high CPU (though not exclusively, according to the monitors I have seen), and also with high request rates from search engines. However, high CPU/request rates shouldn't cause SQL connections to reset - at worst we should see high response times or IIS refusing to send a response.
Any suggestions? I am only a developer, not a DBA - do I need a DBA to solve this problem?
Not sure, but some of your queries might be blocking each other on the server.
The next time you detect this error:
Open Management Studio (on the server, install it if necessary)
Open a new query window
Run sp_who2
Check the BlkBy column, which is short for "Blocked By". If there is any data in that column you have a blocking problem (normally that column should be completely empty).
If you do see blocking then we can continue with the next steps. But right now please check that.
To fix the error above, "MultipleActiveResultSets=True" needs to be added to the connection string.
via Event ID 17886 MSSQLServer – The server will drop the connection
I would create an event log task to email you whenever 17886 is thrown. Then go immediately to the DB, execute sp_who2, get the BlkBy spid, and run DBCC INPUTBUFFER on it. Hopefully the EventInfo column will give you something a bit more tangible to go on.
sp_who2
DBCC INPUTBUFFER(62)
GO
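If you want the alert-and-email part automated, one way (my sketch, not from the linked article; it assumes Database Mail is configured and an operator named 'DBA Team' already exists) is a SQL Agent alert on error 17886:
USE msdb;
GO
-- Sketch: fire a SQL Agent alert whenever error 17886 is written to the
-- event log and e-mail an existing operator.
EXEC dbo.sp_add_alert
     @name = N'Error 17886 - connection dropped',
     @message_id = 17886,
     @enabled = 1,
     @include_event_description_in = 1;     -- include the event text in the e-mail

EXEC dbo.sp_add_notification
     @alert_name = N'Error 17886 - connection dropped',
     @operator_name = N'DBA Team',           -- hypothetical operator
     @notification_method = 1;               -- 1 = e-mail
GO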
Use an "Instance Per Request" strategy in your DI registration code and your problem will be solved.
Most probably you are using dependency injection. During web development you have to take into account the possibility of concurrent requests. Therefore you have to make sure every request gets new instances from the DI container, otherwise you will run into concurrency issues. Don't be cheap by using ".SingleInstance" for services and contexts.
Enabling MARS will probably decrease the number of errors, but the errors that remain will be less clear. Enabling MARS is almost never the solution; do not use it unless you know what you're doing.
I'd like to use SQL Server 2008 Service Broker to log the progress of a long-running (up to about 30 minutes) transaction that is dynamically created by a stored procedure. I have two goals:
1) To get real-time logging of the dynamically-created statements that make up the transaction so that the progress of the transaction can be monitored remotely,
2) To be able to review the steps that made up the transaction up to a point where a failure may have occurred requiring a rollback.
I cannot simply PRINT (or RAISERROR(msg,0,0)) to the console because I want to log the progress messages to a table (and have that log remain even if the stored procedure rolls back).
But my understanding is that messages cannot be received from the queue until the sending thread commits (the outer transaction). Is this true? If so, what options do I have?
It is true that you cannot read messages from the service queue until the transaction is committed.
You could try some other methods:
use a SQL CLR procedure to send a .NET remoting message to a .NET app that receives the messages and then logs them.
use a SQL CLR procedure to write a text or other log file to disk.
Some other method... (one purely T-SQL possibility is sketched below)
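For instance (my own sketch, not part of the original answer; dbo.ProcessLog and the step messages are hypothetical), you can exploit the fact that table variables are not affected by ROLLBACK: collect progress rows in one and copy them to a permanent log table after the transaction has been undone:
-- Sketch: @progress keeps its rows even if the transaction rolls back,
-- because table variables are not transactional.
DECLARE @progress table (
    logged_at datetime2      NOT NULL DEFAULT SYSDATETIME(),
    message   nvarchar(4000) NOT NULL);

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO @progress (message) VALUES (N'step 1: building staging data');
    -- ... first dynamically created statement runs here ...

    INSERT INTO @progress (message) VALUES (N'step 2: merging into target');
    -- ... next statement runs here ...

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
END CATCH;

-- The collected messages are still available after a rollback.
INSERT INTO dbo.ProcessLog (logged_at, message)   -- hypothetical log table
SELECT logged_at, message
FROM @progress;
Note this only covers the post-mortem goal (reviewing the steps after a failure); for real-time monitoring from another connection you would still need something that escapes the transaction, such as the SQL CLR options above.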
Regards
AJ