Why can sys.conversation_endpoints be empty? - sql-server

I have started using Service Broker. After reading the article https://www.itprotoday.com/sql-server/managing-service-broker-conversations I tried to reuse a conversation by selecting an existing one from sys.conversation_endpoints:
select top 1
    @Handle = CEP.conversation_handle
from sys.conversation_endpoints CEP with (nolock)
where CEP.far_service = 'EventService'
    and CEP.state = 'CO'
    and CEP.is_initiator = 1
order by CEP.lifetime desc
and it worked nicely in staging. But after releasing to production, a problem was found with selecting from sys.conversation_endpoints - sometimes it was empty, even with NOLOCK, although there were a lot of records when selecting from a monitoring script. After spending several hours on Google I couldn't find an answer for how this can happen. Please help me understand how this can happen so I can avoid it.
PS Microsoft SQL Server 2017 (RTM-CU17) (KB4515579) - 14.0.3238.1 (X64)

The problem is with the user who calls my stored procedure: he does not have permission to see some metadata info.
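For anyone hitting the same thing: rows in catalog views such as sys.conversation_endpoints are only visible for objects the caller owns or has been granted some permission on, so a low-privilege caller can legitimately see an empty result. A minimal sketch of a fix (the database and user names are placeholders for your own):

```sql
USE MyServiceBrokerDb;
GO
-- Without VIEW DEFINITION (or ownership), catalog views only return rows
-- for securables the caller has permissions on, so the SELECT above can
-- come back empty for that user even though rows exist.
GRANT VIEW DEFINITION TO AppUser;
```

A narrower alternative is to have the procedure run under an account that already has the needed visibility, e.g. via EXECUTE AS.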

Related

SQL Azure Request Limits and a possible connection leak

I have an interesting problem going on. I recently moved 2 SQL databases to SQL Azure for a client and all seemed to be going well...at first. Mid-morning I get a spike of error emails for various things, but a few common ones:
-The request limit for the database is 90 and has been reached.
-Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
-A transport-level error has occurred when receiving results from the server.
There's obviously some database related issues going on with the move to Azure, or the existing code in general. The errors that seemed to happen the most were request limit and timeouts. Once they started, they never seemed to stop. And I don't think there were many users using the site today. It almost seemed like the connection continued to try to connect on a different thread in the background if this makes any sense. This is in reference to the "The timeout period elapsed prior to completion of the operation or the server is not responding." I would get an error email, I'd check the page it referenced myself, and it would load immediately. I checked with the user who threw the error and they reported everything was fine. Strange. Yet I continued every few minutes to get the same error email.
I currently have them on the S1 Tier which limits the requests to 90 concurrently. I did some digging and found the following SQL query:
select * from sys.dm_exec_connections
I ran this, and it showed I had over 90 active connections, some of which were opened some time ago. This was strange to me as the site was currently not being used (it's really late at night and I know no one is using the site). I wanted to end all the connections so I came up with the following query:
DECLARE @sessionId int
DECLARE @SQL nvarchar(1000)
DECLARE @clientIP nvarchar(50)
set @clientIP = 'XX.XX.XX.XX'
select @sessionId = min( session_id ) from sys.dm_exec_connections where client_net_address = @clientIP
while @sessionId is not null
begin
    -- varchar(10) rather than varchar(4): session ids can exceed four digits
    SET @SQL = 'KILL ' + CAST(@sessionId as varchar(10))
    EXEC (@SQL)
    select @sessionId = min( session_id ) from sys.dm_exec_connections where session_id > @sessionId and client_net_address = @clientIP
end
I tried running this command, but the connections came right back. I went on the web server and manually stopped the site in IIS, ran the KILL command again but the connections remained. I put up the app_offline file and took the site down for about a half hour to see if any lingering connections would drop, but they didn't. And I still continued to get error emails for pages I KNEW were not accessible because I stopped the Site AND app pool. I went on the server and manually stopped the w3wp process and ran SQL KILL statements to kill the connections. They finally went away! I put the app back online and hit a single page. I kept running the above query to see the active connections and sure enough every time I ran the query the active connection count kept creeping up. It stops around 102 as of right now. And that's me as a user hitting a single page. I'm guessing this isn't normal? Does this indicate connections are lingering out there and not being dropped or closed?
I just made code changes recently, adding Entity Framework. Wherever I'm grabbing data through EF, I'm doing so with a using statement on the context. The rest of the app is fairly old and uses TableAdapters. I see that in some places it follows the same pattern with using statements; in other places Dispose is being called. I haven't had a chance to track down all the usages yet. Is this a good place to start looking? Does anyone have any suggestions on how to track this 'leak' down? I'm not super knowledgeable with SQL, so any help would be greatly appreciated!

SQL Server Agent - SSIS Package - Error 0x80131904 - Timeout expired

There's been a string of random occurrences of the following error in the SQL Server Agent scheduled jobs lately that I have been unable to find a solution to.
The error occurs infrequently, but usually once a week for a daily scheduled job, but in any number of different jobs and not always the same one. Each job shares the fact that it executes an SSIS package from the same server that is running the job. It also always runs for almost exactly 30 seconds elapsed time, which I guess is the timeout threshold. I'm not sure why it would timeout if the server is just connecting to its own SSIS catalog. Also of note is that it never actually gets to the point where it executes the SSIS package, and this occurs regardless of which package is trying to be executed.
During my research I came across many people suggesting that simply updating SQL Server 2012 to the latest CU* or SP2 would solve the problem. However, upgrading the server to SP2 has not.
One solution tried (which admittedly was ugly) was to simply have a single retry upon failure of the job step, which actually did solve the problem in about 30% of the cases.
I would welcome anyone who has experience with this error, or anyone who has any suggestions.
The error message is as follows:
Date 16/07/2014 6:00:11 AM
Log Job History ({$jobname})
Step ID 1
Server {$productionserver}
Job Name {$jobname}
Step Name {$stepname}
Duration 00:00:31
Sql Severity 0
Sql Message ID 0
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0
Message
Executed as user: {$user}.
Microsoft (R) SQL Server Execute Package Utility Version 11.0.5058.0 for 64-bit Copyright (C) Microsoft Corporation. All rights reserved.
Started: 6:00:11 AM Failed to execute IS server package because of error 0x80131904.
Server: {$productionserver},
Package path: {$packagepath},
Environment reference Id: NULL.
Description: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Source: .Net SqlClient Data Provider
Started: 6:00:11 AM Finished: 6:00:42 AM
Elapsed: 31.122 seconds. The package execution failed. The step failed.
Try this:
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding
And this
https://connect.microsoft.com/SQLServer/feedback/details/783291/ssis-package-fails-to-start-application-lock-timeout-in-ssisdb-catalog-create-execution
Looks like it's a known bug.
Check what else is/was running on the instance at the time of the package failures (e.g. a database integrity check or similarly intensive operation).
The SQL Agent is timing out talking to its own SSIS catalog (a 30 second timeout). It's not actually executing the packages, so it's nothing to do with the packages themselves and everything to do how busy the instance is at the time of the execution.
(Answering this question since it comes up in a Google search)
I know this is an older question, but I'm having the same problem and it doesn't have an accepted answer.
The job fails in 1.5 seconds so I believe it is NOT a timeout issue.
I can confirm 0x80131904 is (or can be) a permissions issue. I had my SSIS package running under a SQL Agent job just fine with sysadmin and network admin privileges. When I switched it to an account with fewer permissions, I got this error.
For me, the problem was that I was not assigning permissions in all the correct places. I had already set Read/Execute permissions in the Project Properties. Then (this is the step I didn't do) I had to assign Read permissions on the folder containing the Projects and Environments.
Hope this helps someone.
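The folder-level grant described above can also be scripted against the SSISDB catalog. A sketch, assuming the folder name and the SSISDB database user are placeholders for your own:

```sql
USE SSISDB;
GO
DECLARE @folder_id BIGINT = (SELECT folder_id
                             FROM catalog.folders
                             WHERE name = 'MyFolder');        -- placeholder folder
DECLARE @principal_id INT = DATABASE_PRINCIPAL_ID('MyAgentUser'); -- placeholder user

-- object_type 1 = folder, permission_type 1 = READ
-- (values per the SSISDB catalog.grant_permission documentation)
EXEC catalog.grant_permission
      @object_type     = 1
    , @object_id       = @folder_id
    , @principal_id    = @principal_id
    , @permission_type = 1;
```
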
We have experienced this error when attempting to start several SSIS packages at the same instant. Service packs were supposed to fix it, but have not. We have implemented a staggered schedule for SSIS packages so only one package is starting at any given moment.
We also experienced the same bug. As a workaround, we created the following stored procedure. If you put this into a job that runs, e.g., every 10 minutes, it makes sure that if there are random failures, the job gets restarted continuously until there is an occurrence without a timeout failure.
USE [msdb]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_StartTimedOutJob]
AS
DECLARE @jobid NVARCHAR(100)
    , @jobname NVARCHAR(250)
    , @stepname NVARCHAR(250)
    , @varMail VARCHAR(MAX)
DECLARE cJobs CURSOR FOR
-- CTE selects all jobs that are currently not running and orders them by most recent
WITH CTE_NotRunning AS (
SELECT S.job_id
, S.step_name
, S.[message]
, rownum = ROW_NUMBER() OVER (PARTITION BY S.job_id ORDER BY S.run_date DESC, S.run_time DESC)
FROM msdb.dbo.sysjobhistory AS S
LEFT OUTER JOIN (SELECT DISTINCT ja.job_id
FROM msdb.dbo.sysjobactivity ja
LEFT JOIN msdb.dbo.sysjobhistory jh ON ja.job_history_id = jh.instance_id
JOIN msdb.dbo.sysjobs j ON ja.job_id = j.job_id
JOIN msdb.dbo.sysjobsteps js
ON ja.job_id = js.job_id
AND ISNULL(ja.last_executed_step_id,0)+1 = js.step_id
WHERE
ja.session_id = (
SELECT TOP 1 session_id FROM msdb.dbo.syssessions ORDER BY agent_start_date DESC
)
AND start_execution_date is not null
AND stop_execution_date is NULL) AS R
ON S.job_id = R.job_id
WHERE R.job_id IS NULL)
-- only select the jobs into the cursor set for which the most recent job had a timeout issue
SELECT job_id
, step_name
FROM CTE_NotRunning
WHERE [message] LIKE '%0x80131904%time%out%' -- error message that corresponds to timed out jobs, error code: 0x80131904
AND rownum = 1
OPEN cJobs
FETCH NEXT FROM cJobs
INTO @jobid, @stepname
WHILE @@FETCH_STATUS = 0
BEGIN
    -- for each of the timed out jobs in the cursor, start the job again from the step that caused the timeout
    SET @jobname = (SELECT [name] FROM msdb.dbo.sysjobs WHERE job_id = @jobid)
    EXECUTE dbo.sp_start_job @job_id = @jobid, @step_name = @stepname
    -- advance the cursor; without this fetch the loop would never terminate
    FETCH NEXT FROM cJobs
    INTO @jobid, @stepname
END
CLOSE cJobs
DEALLOCATE cJobs
GO
I had this exact same issue. SQL Agent was running SSIS jobs perfectly fine, then suddenly I came across this error. I spent about an hour looking for a fix online, and found out the server admin had installed new Windows updates.
I simply restarted the Server (which hosts the SSIS catalog and SQL Server/Agent). After server restart jobs ran fine again.
Hope server restart works for the next person that goes through this.
Sometimes this kind of error occurs when the package has been deployed twice under SQL Server Integration Services Catalogs. You may also have changed the package name, but other related auto-generated configuration values remain unique, such as the environment reference ID.
So if you have a scheduled job, you will need to create a new one and point it at the newly deployed package.
Good luck
I had the same problem and error message on SQL Server 2017.
My problem was with the SSISDB database, which had grown too big and needed maintenance (there was no more space available). After cleaning up the SSISDB database, the jobs ran well again on this server.

Why does SQL Server say "Starting Up Database" in the event log, twice per second?

I have a SQL Server [2012 Express with Advanced Services] database, with not much in it. I'm developing an application using EF Code First, and since my model is still in a state of flux, the database is getting dropped and re-created several times per day.
This morning, my application failed to connect to the database the first time I ran it. On investigation, it seems that the database is in "Recovery Pending" mode.
Looking in the event log, I can see that SQL Server has logged:
Starting up database (my database)
...roughly twice per second all night long. (The event log filled up, so I can't see beyond yesterday evening).
Those "information" log entries stop at about 6am this morning, and are immediately followed by an "error" log entry saying:
There is insufficient memory in resource pool 'internal' to run this query
What the heck happened to my database?
Note: it's just possible that I left my web application running in "debug" mode overnight - although without anyone "driving" it I can't imagine that there would be much database traffic, if any.
It's also worth mentioning that I have a full-text catalog in the database (though as I say, there's hardly any actual content in the DB at present).
I have to say, this is worrying - I would not be happy if this were to happen to my production database!
With AUTO_CLOSE ON, the database will be closed as soon as there are no connections to it, and re-opened (running recovery, albeit a fast-paced one) every time a connection is established. So you were seeing the message because every 2 seconds your application would connect to the database. You probably always had this behavior and never noticed it before. Now that your database has crashed, you investigated the log and discovered this problem. While it is good that you now know and will likely fix it, this does not address your real problem, namely the availability of the database.
So now you have a database that won't come out of recovery; what do you do? You restore from your last backup and apply your disaster recovery plan. Really, that's all there is to it. And there is no alternative.
If you want to understand why the crash happened (it can be any of about 1 myriad reasons...) then you need to contact CSS (Product Support). They have the means to guide you through investigation.
If you want to turn off this message in the event log:
Just go to SQL Server Management Studio,
Right-click on your database
Select Options (from the left panel)
Look in the "Automatic" section, and change "Auto Close" to "False"
Click OK
That's all :)
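For reference, the same change can be made in T-SQL (replace MyDatabase with your database name):

```sql
-- Equivalent of the SSMS steps above
ALTER DATABASE MyDatabase SET AUTO_CLOSE OFF;

-- Verify the setting took effect
SELECT name, is_auto_close_on
FROM sys.databases
WHERE name = 'MyDatabase';
```
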
I had a similar problem with a SQL Express database stuck in recovery. After investigating the log, it transpired that the database was starting up every couple of minutes. Running the script
select name, state_desc, is_auto_close_on from sys.databases where name = 'mydb'
revealed that auto close was set to on.
So it appears that the database is always in recovery but is actually coming online for a brief moment before going offline again because there are no client connections.
I solved this with following script.
Declare @state varchar(20)
while 1=1
begin
    Select @state = state_desc from sys.databases where name = 'mydb';
    If @state = 'ONLINE'
    Begin
        Alter database MyDb
        Set AUTO_CLOSE OFF;
        Print 'Online'
        break;
    End
    waitfor delay '00:00:02'
end

How do I find what code is consuming my SQL Server connection pool?

I have rewritten the below based on the answers.
I have a website that causes HIGH CPU issues on the database server, to the point where the server becomes unavailable. Recycling the app pool fixes the issue. According to the server administrator, the tool at http://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en shows there are threads that have been active for about an hour.
The interactions with the database are very simple and worked prior to us adding web forms routing to the application.
They only consists of code like this throughout the application.
Yes, this code is not perfect, but it's not this code that is the issue, as prior to us adding routing there were no problems.
private string GetPublishedParagraphs()
{
string query, paragraphs = "";
try
{
m_sql_connection = new SqlConnection(m_base_page.ConnectionString());
query = "select * from PublishedParagraphs where IDDataContent_page='" + m_IDDataContent_page + "'";
SqlDataAdapter da = new SqlDataAdapter(query, m_sql_connection);
DataSet ds = new DataSet();
da.Fill(ds, "paragraph");
if (ds.Tables["paragraph"].Rows.Count > 0)
paragraphs = (string)ds.Tables["paragraph"].Rows[0]["paragraphs"];
ds.Dispose();
da.Dispose();
}
finally
{
m_sql_connection.Close();
}
paragraphs = paragraphs.Replace("™", "™");
return paragraphs;
}
The connection string looks like:
server_name; User ID=server_user; Password=server_password
We have meticulously checked that every call to the database Open() is followed by a Close().
We have checked that there are no open connections by watching them while running the application locally; the connection count does not increase, as seen via:
SELECT SPID,
STATUS,
PROGRAM_NAME,
LOGINAME=RTRIM(LOGINAME),
HOSTNAME,
CMD
FROM MASTER.DBO.SYSPROCESSES
WHERE DB_NAME(DBID) = 'TEST' AND DBID != 0
(However, if we don't Close connections, there is a leak)
The difference between our application from when it worked is the addition of asp.net routing via web forms. This calls the database too, but again closes connections after they are open.
We are not sure what else we can check for.
Any ideas fellow programmers?
ANSWER
We found the problem via Query Profiler. This showed us a query with high usage. Tracing the query back to the code showed an infinite loop calling the database over and over. It was difficult to find as the loop was initiated by a bot calling a page on the website that no longer existed.
In the code you are showing, the ds and da Dispose() calls should go in the finally block. Better yet, use a using () {} block, which ensures object disposal.
The pattern of building your own query string isn't just a gaping security hole; it is also very inefficient. Use a stored procedure and a parameter instead.
The query for processes is overly restrictive. If you have a resource issue that is causing connections to be refused, it won't be limited to a single database. About the only thing I would restrict is the current command --> where spid != @@spid
REALLY need some error messages and context - where are they being seen? Tell us more and we can help!
Good luck!
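To illustrate the stored-procedure suggestion using the query from the question (the procedure name is made up, and the parameter type is an assumption):

```sql
-- Parameterized replacement for the concatenated query in the question.
-- The parameter is typed as nvarchar here because the original code quotes
-- the value in the WHERE clause; adjust the type to match the real column.
CREATE PROCEDURE dbo.GetPublishedParagraphs
    @IDDataContent_page nvarchar(50)
AS
BEGIN
    SET NOCOUNT ON;
    SELECT paragraphs
    FROM dbo.PublishedParagraphs
    WHERE IDDataContent_page = @IDDataContent_page;
END
```

The application would then call this through a SqlCommand with CommandType.StoredProcedure and a SqlParameter, instead of concatenating the value into the SQL text.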
First, great additional information! Thanks for the followup.
I would suggest that if you're so sure the code you have posted has nothing to do with the problem, you remove it from the question. However, the problems aren't merely a matter of being "imperfect". Proper disposal of memory-intensive objects - ones the original developers considered intensive enough to warrant a Dispose() method, and ones that interact with the database - is not a small issue while you are having unexplained database problems, in my opinion anyway.
I did some googling and found this. While I wouldn't go and say that this is the problem, it did get me to thinking. When "threads that are active for about an hour", is that being measured on the db server or on the web server? I'm not familiar with the tool, but are you able to post logs from this tool?
On the webserver, are you able to monitor the routing code's actions? Is the routing code written/set up in such a way as to protect against infinite loops - see the question and answers here.
In the earlier version of my answer, I said that looking only at connections for a particular database was too restrictive for your task. The clarifications to your question do not indicate that you have corrected this query. I would suggest:
SELECT
is_being_blocked = sp.blocked
, sp.cpu
, DB_NAME ( dbid )
, sp.status
, LOGINAME=RTRIM(sp.LOGINAME)
, sp.HOSTNAME
, sp.Hostprocess
, sp.CMD
FROM SYSPROCESSES sp
WHERE spid != @@SPID
ORDER BY
sp.blocked ASC
, sp.cpu DESC
Logs - what are the SQL Server Logs saying in the time span 10 minutes before and 10 minutes after you restart the web app?
Have you tried to reproduce this, and is the issue repeatable in development?
Please tell us what the below statement means in terms of your application - an error message or other: "the server becomes unavailable"
I highly suggest that you start up a trace of SQL Server using Profiler. Given what you are describing in this question, this is what I would trace, saving to a table (on another SQL Server) or to a file (on another machine, NOT the SQL Server box). This trace is for finding a problem that is severely hampering production; it's not something that you would want running on a regular basis.
I would capture these events
* Errors and Warnings - all of them
* Security Audit
** Audit Login
** Audit Logout
* Sessions
** Existing Sessions
* TSQL
** SQL: Stmt Starting
** SQL: Stmt Completed
** Prepare SQL
** Exec Prepared SQL
I wouldn't use any filters other than the presets.
Have you tried running the sp_who2 query in SQL Server Management Studio to see how many active database connections there are? The code looks fine.
You might want to change the scope of the m_sql_connection variable from a class field to a local variable. Perhaps that could be your issue?
What do you mean by "running out of application pool"? Do you mean the connection pool?
If your database seems to be getting overworked, it could also be because a user has free rein over your m_IDDataContent_page variable. This data access code is vulnerable to SQL injection.

Determine which user deleted a SQL Server database?

I have a SQL Server 2005 database that has been deleted, and I need to discover who deleted it. Is there a way of obtaining this user name?
Thanks, MagicAndi.
If there has been little or no activity since the deletion, then the out-of-the-box trace may be of help. Try running:
DECLARE @path varchar(256)
SELECT @path = path
FROM sys.traces
where id = 1
SELECT *
FROM fn_trace_gettable(@path, 1)
[In addition to the out-of-the-box trace, there is also the less well-known 'black box' trace, which is useful for diagnosing intermittent server crashes. This post, SQL Server’s Built-in Traces, shows you how to configure it.]
I would first ask everyone who has admin access to the Sql Server if they deleted it.
The best way to retrieve the information is to restore the latest backup.
Now to discuss how to avoid such problems in the future.
First, make sure your backup process is running correctly and frequently. Take a transaction log backup every 15 minutes, or every half hour if it is a highly transactional database. Then the most you lose is half an hour's worth of work. Practice restoring the database until you can easily do it under stress.
In SQL Server 2008 you can add DDL triggers (not sure if you can do this in 2005) which allow you to log who did changes to structure. It might be worth your time to look into this.
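For what it's worth, DDL triggers are available in SQL Server 2005 as well. A minimal sketch of a server-level trigger that records who drops a database (the audit table name and its location are made up for illustration):

```sql
-- Assumes an audit table the trigger can write to; adjust name and location.
CREATE TABLE master.dbo.DdlAudit
(
    EventTime datetime NOT NULL DEFAULT GETDATE(),
    LoginName sysname  NOT NULL,
    EventData xml      NOT NULL
);
GO
-- Fires whenever any database on the instance is dropped and records
-- the login responsible along with the full event details.
CREATE TRIGGER trg_AuditDropDatabase
ON ALL SERVER
FOR DROP_DATABASE
AS
BEGIN
    INSERT INTO master.dbo.DdlAudit (LoginName, EventData)
    VALUES (ORIGINAL_LOGIN(), EVENTDATA());
END;
```
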
Do NOT allow more than two people admin access to your production database - a dba and a backup person for when the dba is out. These people should load all changes to the database structure and code and all of the changes should be scripted out, code reviewed and tested first on QA. No unscripted, "run by the seat of your pants" code should ever be run on prod.
Here is a bit more precise T-SQL:
SELECT DatabaseID,NTUserName,HostName,LoginName,StartTime
FROM
sys.fn_trace_gettable(CONVERT(VARCHAR(150),
( SELECT TOP 1
f.[value]
FROM sys.fn_trace_getinfo(NULL) f
WHERE f.property = 2
)), DEFAULT) T
JOIN sys.trace_events TE ON T.EventClass = TE.trace_event_id
WHERE TE.trace_event_id =47 AND T.DatabaseName = 'delete'
-- 47 Represents event for deleting objects.
This can be used whether or not you know the database/object name. The result set shows the DatabaseID, NTUserName, HostName, LoginName, and StartTime for each delete event.
