IIS 7/MSSQL Server debugging - sql-server

I am developing a WCF service and so far have added only a few simple interfaces. Testing the service in the VS debug environment, all is well.
When I published the service to IIS 7 for further testing, all was fine, at least after I added db_datareader and db_datawriter permissions to the relevant databases for user IIS APPPOOL\ASP.NET v4.0.
Now the service is failing silently. Calling the service through a browser, I get the following message: "The server encountered an error processing the request. See server logs for more details." Fiddler says this is a result of an HTTP 400 error message (Bad Request).
I rolled back the code to return a hard-coded value, so I am sure that the issue is at the DB level, not the IIS 7 server installation.
The problem is that the Event Viewer shows no meaningful error messages. This despite the fact that all the code is surrounded by try/catch, with any exception caught going to the event viewer.
There is one message stating "Starting up database 'ReportServer$SQLEXPRESSTempDB'", but as far as I can tell that appears every 10 minutes, without reference to any attempts to access the database. Just to make sure, I gave the .NET 4 user R/W permission to access that DB as well.
In addition, I don't see any messages in the SQL Server Logs.
How can I debug this?

The problem was that db_datareader and db_datawriter permissions are not enough to run stored procedures. I upgraded the permissions and all was good.
The only thing I don't understand is why the exception wasn't written to the event log.
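For anyone landing here with the same symptom: db_datareader and db_datawriter cover reading and writing tables and views, but neither role includes EXECUTE on stored procedures. The answer does not say which permission was finally granted, so the following is only a minimal sketch that assumes a schema-level EXECUTE grant on dbo is enough. The helper names are hypothetical; the generated statement would still have to be run by someone with sufficient rights in the target database.

```python
def quote_name(identifier: str) -> str:
    # Bracket-quote a SQL Server identifier, doubling any closing bracket.
    return "[" + identifier.replace("]", "]]") + "]"


def grant_execute_statement(principal: str, schema: str = "dbo") -> str:
    # Build a schema-level GRANT EXECUTE statement for a database principal.
    # Assumption: the principal already exists as a user in the database.
    return (
        f"GRANT EXECUTE ON SCHEMA::{quote_name(schema)} "
        f"TO {quote_name(principal)};"
    )


if __name__ == "__main__":
    # The app pool identity named in the question.
    print(grant_execute_statement(r"IIS APPPOOL\ASP.NET v4.0"))
```

Granting at schema level avoids listing every procedure, at the cost of also covering procedures added later; a per-procedure grant is the tighter alternative.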

Related

What SQL user is used by TFS to send alerts?

We are running into a few issues with our TFS installation (TFS 2013 Update 4, SQL 2014 Standard) as a result of email alerts. Most notably, Work Items cannot be created, because this triggers an email.
Any time a process or user attempts to create a Work Item, the error
TF30040: The database is not correctly configured. Contact your Team Foundation Server administrator.
is received. Further, when I check the Event Viewer on the server, I can see the error and it reports that the inner exception is:
Exception Message: The EXECUTE permission was denied on the object 'sp_send_dbmail', database 'msdb', schema 'dbo'. (type SqlException)
I have worked with the DBA and we have enabled Email Alerts on the server. We have verified that, in general, the alerts work by using the test button on the administration console. I can also set up a check-in alert through the web interface and receive said alerts without issue. This seems to be specifically affecting Work Item creation alerts (which apparently are just automatically and irrevocably enabled).
Presumably, we could correct this by giving appropriate permissions to use that stored procedure. To do so, we need to know what user to give permissions to. So far we have tried giving execute permissions to my AD user, the service account used by the build service, and the Network Service account (which appears to be the TFS Service Account).
There is no indication in any error message as to what user is being used to execute that procedure. So, my question: What SQL user is used to send alerts when creating Work Items?
Edit:
For the record, this started working of its own accord. We decided Monday to call Microsoft to get this fixed. Before that happened, failed builds magically created some work items (on Tuesday, a full day after we gave up), and we are now able to create work items. Everyone involved states not doing anything. We are baffled, but in a good way.
I'm going to advise you that a DBA should not be making changes to the TFS databases. I suggest opening a ticket with MSFT and getting assistance from the product support group.

SQL Server error log entry : Error: 17806, Severity: 20, State: 14

I have had this error in my log for a few weeks. I searched a lot but couldn't find a useful answer.
I closed the SQL Server port on the public IP, but I still have the problem.
Error: 17806, Severity: 20, State: 14.
SSPI handshake failed with error code 0x8009030c, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The Windows error code indicates the cause of failure. The logon attempt failed [CLIENT: 10.10.3.25]
Time raised: 27 Jan 2015 2:23 PM
The error was raised while this system was off.
The Scenario –
A couple of separate, individual Windows IDs started generating these errors while attempting connections; all other Windows logins were working properly. The connections were initially happening through applications, but the errors also occurred through sqlcmd. When logged in to the server locally with the offending IDs, the connections to SQL would succeed.
The Troubleshooting process –
Check all the regular SSPI issues. I won’t bore you with the details, as they are easily searchable.
A relatively easy way of checking the “easy” authentication issues, if possible and appropriate, is to log in to the SQL Server locally with the offending ID, fire up sqlcmd, and connect to the server via sqlcmd -Sservername,port -E (specifying the port forces TCP/IP instead of LPC, thereby bringing the network into the equation).
Verify whether the login is trying to use NTLM or Kerberos (there are many ways to do this, but the simplest is to see whether there are any other KERBEROS connections on the machine):
SELECT DISTINCT auth_scheme FROM sys.dm_exec_connections
If Kerberos is in use, there are a few additional things to verify related to SPNs. Since only NTLM was in use on this server, I skipped that.
Determine whether the accounts were excluded from connecting to the machine over the network by a group policy or some other AD setting.
After all of these checked out OK, I began to try to figure out what the error code 0x8009030c meant. It turns out the description is fairly obvious: SEC_E_LOGON_DENIED. This description was so helpful I thought about making this server into a boat anchor but, luckily for my employer, the server room is located many miles away and has armed guards.
Since I knew we could log on locally to the SQL Server with the ID that SQL was rejecting with logon denied, something else was trying to make my life miserable.
We didn’t have logon failure security auditing turned on, so I had no way of getting a better error description. As luck would have it, though, this would prove instrumental in finding the root cause. To get a better error message, I found this handy KB article detailing the steps needed to put Netlogon into debug mode.
Say hello to my new best friend! — nltest.exe
After downloading nltest & using it to enable netlogon debugging on the SQL Server, I got this slightly better message in the netlogon.log file
06/15 14:15:39 [LOGON] SamLogon: Network logon of DOMAIN\USER from Laptop Entered
06/15 14:15:39 [CRITICAL] NlPrintRpcDebug: Couldn’t get EEInfo for I_NetLogonSamLogonEx: 1761 (may be legitimate for 0xc0000064)
06/15 14:15:39 [LOGON] SamLogon: Network logon of DOMAIN\USER from Laptop Returns 0xC0000064
The error code 0xC0000064 maps to “NO_SUCH_USER”.
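Looking up the status code by hand gets tedious once netlogon.log grows. As a rough sketch, assuming the log lines keep the shape shown above, the failed logons can be pulled out and named with a few lines of Python. The function name and the lookup table are illustrative; the table is a small subset of well-known NTSTATUS logon codes, not a complete list.

```python
import re

# A few well-known NTSTATUS codes for failed logons (subset, not exhaustive).
NTSTATUS_NAMES = {
    0xC0000064: "STATUS_NO_SUCH_USER",
    0xC000006A: "STATUS_WRONG_PASSWORD",
    0xC000006D: "STATUS_LOGON_FAILURE",
    0xC0000072: "STATUS_ACCOUNT_DISABLED",
    0xC0000234: "STATUS_ACCOUNT_LOCKED_OUT",
}

# Matches the "Returns" lines written by Netlogon debug logging, e.g.
#   ... [LOGON] SamLogon: Network logon of <account> from <client> Returns 0xC0000064
RETURN_LINE = re.compile(
    r"\[LOGON\] SamLogon: Network logon of (?P<account>\S+) "
    r"from (?P<client>\S+) Returns (?P<status>0x[0-9A-Fa-f]+)"
)


def failed_logons(log_text):
    # Return (account, client, status_hex, status_name) for each failed logon.
    results = []
    for line in log_text.splitlines():
        match = RETURN_LINE.search(line)
        if match is None:
            continue
        status = int(match.group("status"), 16)
        if status == 0:
            continue  # 0x0 means the logon succeeded
        name = NTSTATUS_NAMES.get(status, "UNKNOWN")
        results.append(
            (match.group("account"), match.group("client"), f"0x{status:08X}", name)
        )
    return results


if __name__ == "__main__":
    sample = (
        r"06/15 14:15:39 [LOGON] SamLogon: Network logon of DOMAIN\USER "
        r"from Laptop Returns 0xC0000064"
    )
    for entry in failed_logons(sample):
        print(entry)
```

Codes missing from the table are reported as UNKNOWN rather than guessed, so the table can be extended as new codes show up.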
Since I was currently logged in to the server with the ID that was returning no such user, something else was obviously wrong, and luckily at this point I knew it wasn’t SQL.
Running “set log” on the server (which lists the environment variables whose names start with “log”, including LOGONSERVER) revealed that a local DC (call it DC1) was servicing the local logon request.
After asking our AD guys about DC1 and its synchronization status, as well as whether the user actually existed there, everything still looked OK.
After looking around a bit more I discovered this gem of a command for nltest to determine which DC will handle a logon request
C:\>nltest /whowill:Domain Account
[16:32:45] Mail message 0 sent successfully (\MAILSLOT\NET\GETDC579)
[16:32:45] Response 0: DC2 D:Domain A:Account (Act found)
The command completed successfully
Even though this command returned “act found”, it was returning from DC2. (I don’t exactly understand why the same account would authenticate against two different DCs depending on whether it is a local desktop login or a SQL login, but apparently it can.)
After asking the AD guys about DC2, the light bulbs apparently went off for them, as that server actually exists behind a different set of firewalls, in a totally different location. While DC2 would return a ping, the console wouldn’t allow logons for some reason. After a quick reboot of DC2, and some magic AD pixie dust (I am not an AD admin, if it wasn’t totally obvious from my newfound friend nltest), the Windows IDs that were having trouble started authenticating against DC3 and our SSPI errors went away.
Interesting tidbit — During troubleshooting, I found that this particular SQL Server was authenticating accounts against at least 5 different DCs. Some of this might be expected since there are different domains at play, but I haven’t heard a final answer from the AD guys about whether it should work that way.
The solution
Reboot the misbehaving DC. Of course, there may be other ways to fix this, such as redirecting requests to a different DC without a reboot, but since it was misbehaving anyway and the AD experts wanted to reboot, we went with that. A reboot of SQL would likely have solved this problem too, but I hate reboot fixes; the issues always seem to come back!

Merge replication uninitialized subscription is expired or does not exist

I am trying to set up a merge replication using web synchronization between a publishing SQL Server 2012 standard and subscribing SQL Server 2012 Express. After following the instructions provided at Technet, I am stuck on this:
Source: Merge Process(Web Sync Server)
Number: -2147200985
Message: The subscription to publication 'MyMergePublication' has expired or does not exist.
I already verified that the SSL certificates are good and that I can browse to the publishing machine's URL https:\\mycomputer\replisapi.dll and get the expected output. I also verified that the snapshot was set up, and I took a giant hammer and used an administrator account as the pool identity, which is really bad security-wise, but I wanted to confirm that it was not security that was tripping me up.
To further the mystery, when I try and fail to sync, the publisher acknowledges that a new subscriber has been registered, but it cannot get the snapshot at all and thus subscriber database is still empty.
In the replication monitor, there is no failed synchronization history and there are no errors; all it has to say is that the subscriber is uninitialized, and no more.
Turning up the verbosity of the merge agent, I saw some SQL being executed. I tried running the same SQL myself and found that this call was failing with the same error:
{call sys.sp_MSgetreplicainfo(?,?,?,?,?,?,?,90)}
I called it with only the 3 mandatory parameters supplied and it would fail. That is even though the prior call to sp_helpmergepublication does return a row for that publication. Oddly, the content returned by sp_helpmergepublication does not match what I configured for the subscription (e.g. it says the web URL is null, while viewing the properties correctly shows the web URL being set). I am not sure that is significant.
The body of sp_MSgetreplicainfo contains a call to another system sproc that I cannot run for some reason (it says not found), so I'm not sure what is actually going on here.
Any clues would be greatly appreciated.

How to determine the most restrictive SQL server security permissions a program can use and still function?

PROBLEM BACKGROUND
Sorry if this is a bit tedious to read, but please bear with me.
I have been tasked to determine the most restrictive security permissions...or rather investigate if more restrictive security settings can be configured for the SQL server login our program uses, yet still function as normal.
Currently the program runs as a Windows service configured to log on using a Windows user account that has been configured in SQL server with trusted auth. The login used has been assigned a db_owner role and the service works fine like that.
So to narrow the permissions for this user I removed the db_owner rights and assigned it to the db_datareader and db_datawriter roles. Unfortunately this causes a problem and when I start up the service I get an error dialogue displaying:
Error 1053: the service did not respond to the start or control request in a timely fashion.
and in the event viewer under the System events are logged:
event 7009 (timeout waiting for..to connect)
event 7000 (the service did not respond to the start or control request)
My problem is the code base is really large and I'm not sure what exactly to look for that would require db_owner permissions (it sets permissions maybe?).
QUESTION
What should I be looking for in a program that executes SQL that would cause it to require db_owner permissions?
In case the first question is too general: is there an easy way/any tools I can use to figure out what a Windows service is trying to do during start-up 'SQL wise' if I get system error events logged:
event 7009 (Timeout (30000 milliseconds) waiting for the ... service to connect)
event 7000 (The service did not respond to the start or control request in a timely fashion).
BTW I tried running profiler with all audit events selected, but still get nothing logged when starting the service.
This is such a broad question without knowing the architecture of your service and how it communicates with SQL Server. Are you using in-line SQL? Stored Procedures?
I think you'd best tackle this issue by starting from the service's code and tracing the execution path from start-up to see what is being executed against SQL Server.
Alternatively, if you are using stored procedures, you may want to script them all out into a file and search for common T-SQL commands that db_datareader and db_datawriter do not permit, such as CREATE, DROP, ALTER.
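The keyword search suggested above is easy to script. The sketch below is only a starting point: it assumes the procedures have already been scripted out to text, it looks for just the three keywords named in the answer, and it strips only trailing "--" comments (block comments and string literals are not handled). The function name is hypothetical.

```python
import re

# Keywords named in the answer; extend the alternation as needed.
DDL_PATTERN = re.compile(r"\b(CREATE|DROP|ALTER)\b", re.IGNORECASE)


def find_ddl_statements(sql_text):
    # Return (line_number, keyword, stripped_line) for every keyword hit.
    hits = []
    for line_number, line in enumerate(sql_text.splitlines(), start=1):
        code = line.split("--", 1)[0]  # drop a trailing single-line comment
        for match in DDL_PATTERN.finditer(code):
            hits.append((line_number, match.group(1).upper(), line.strip()))
    return hits


if __name__ == "__main__":
    script = (
        "CREATE PROCEDURE dbo.LoadData AS\n"
        "BEGIN\n"
        "    SELECT 1; -- drop nothing here\n"
        "    alter table dbo.Staging add Flag int;\n"
        "END\n"
    )
    for hit in find_ddl_statements(script):
        print(hit)
```

Every scripted procedure begins with its own CREATE PROCEDURE header, so those hits are expected noise; the interesting ones are the matches inside procedure bodies.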

ELMAH SQL Error Handler database not available- what happens to logging?

I'm testing ELMAH and have deliberately turned off the database connection for the ELMAH log in my application to see what will happen in production if the DB isn't available.
It seems that ELMAH can't trap its own errors: the AXD file isn't available when the SQL database log fails.
What is the intended behavior of ELMAH if the database isn't available?
How can I diagnose my errors if this occurs?
"It seems that ELMAH can't trap its own errors"
ELMAH does trap its own errors to some extent. If the ErrorLogModule encounters an exception while attempting to log the error then the exception resulting from logging is sent to the standard .NET Framework trace facility. See line 123 from 1.0 sources. See also the following walk-through from the ASP.NET documentation for getting the standard .NET Framework tracing working with ASP.NET tracing:
Walkthrough: Integrating ASP.NET Tracing with System.Diagnostics Tracing
"the AXD file isn't available when the SQL database log fails."
That is correct. SQL Server database connectivity must be functional to view errors stored in a SQL Server database when using SqlErrorLog.
"What is the intended behavior of ELMAH if the database isn't available?"
If, for example, the SQL Server database is down, a SqlException will occur during logging. ELMAH will then send the SqlException object content to the standard .NET Framework trace facility.
"How can I diagnose my errors if this occurs?"
The best option here is to also enable e-mailing of errors alongside logging. If the database is down, chances are good that the mail gateway is up and you will still get notified of errors. The errors will, in effect, get logged in some mailbox(es). This has the added advantage that if the mail gateway is ever down, chances are that the database will be up and errors will get logged there. If both are down, however, you will need to seriously review your production infrastructure and possibly add separate health monitoring for your system.
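The behavior described in this answer (try every sink, and send a sink's own failure to a trace facility instead of letting it escape) can be illustrated in a few lines. This is not ELMAH code, and ELMAH itself is configured through web.config modules rather than written like this; it is a language-neutral sketch in Python where the sink names, the logger name, and the function are all made up for illustration.

```python
import logging

# Stands in for the "standard trace facility" mentioned above.
trace = logging.getLogger("error_log.trace")


def log_error(error, sinks):
    # Offer the error to every sink; report sink failures to the trace logger.
    # `sinks` is a list of (name, callable) pairs.
    # Returns the names of the sinks that accepted the error.
    delivered = []
    for name, sink in sinks:
        try:
            sink(error)
            delivered.append(name)
        except Exception as sink_failure:
            # A broken sink must not hide the original error from the others.
            trace.warning("sink %s failed: %r", name, sink_failure)
    return delivered


if __name__ == "__main__":
    def database_sink(error):
        raise ConnectionError("database is down")  # simulated outage

    mailbox = []
    sinks = [("database", database_sink), ("mail", mailbox.append)]
    print(log_error(ValueError("boom"), sinks))
```

With the database sink failing, the error still reaches the mail sink and the database failure itself lands in the trace log, which mirrors the two-sink arrangement recommended above.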
Not really sure about ELMAH but expected behaviour of such logging frameworks is not to throw any exceptions if something goes wrong with them. I.e. if ELMAH's database is down I'd assume it will just not log the errors to database.
As suggested above you can/should use alternative sinks - email or flat file.
You can always use the xml file option to log your errors.
I think you're mixing up contexts a bit.
ELMAH's behavior if the database isn't available is to not log errors to the database. If an exception is thrown on the server or if you raise an exception via an ErrorSignal, ELMAH is going to let that exception pass through to either a yellow screen or a custom errors page (your setting.)
Since Errors.axd page is only accessible to those that should be seeing it (ideally,) it is okay to present that error to the user.
The bottom line is that if the errors database is down you can't diagnose errors. For us, if that were the case, we'd have bigger problems since the error database sits with the production database.
I would also advocate against using XML logging for your primary logging source. SQL server is going to give you the best performance without having to manage the files. With XML logging that is not the case.
