Database Mail registry settings won't stay configured - sql-server

I have a strange one for you. I'm maintaining several databases prior to a migration. One of them is a 2008R2 instance. This instance has multiple errors in the logs (the infrastructure has been poorly maintained), so I set up a bunch of alerts (16-25) and tried using Database Mail to send them. But the mail registry settings keep resetting and preventing it from working. I can't tell if someone is maliciously going in behind me and reverting the settings in the registry (this is possible in the poisonous environment I'm working in) or whether it's some kind of obscure problem.
Just to confirm... I've created the same alerts with the same mail settings on the 2017 instances that I'm also monitoring with no problem. Equally, on the 2008R2 instance, I can successfully set the Database Mail parameters, send myself a test email AND execute a job, sending a 'completed' email using the same Database Mail profile and user via an Operator.
Setting the parameters using xp_instance_regwrite or sp_set_sqlagent_properties didn't work either, although I realised early on that the parameters weren't sticking because of a lack of admin rights on the server, so I got the infrastructure guys to give me access. I then:
logged in to the server
shut down the Agent (it isn't doing anything at all)
configured the registry settings (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.<instance>\SQLServerAgent\UseDatabaseMail = 1, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.<instance>\SQLServerAgent\DatabaseMailProfile = <my-mail-profile>)
restarted the Agent.
I then confirmed from SSMS that the 'Mail session' parameters (Enable mail profile, Mail system and Mail profile) were correctly set. A day later, the log is full of errors, I have no emails and all of the Agent properties are empty and greyed out!
Anyone seen this before?

Related

How to Delegate Credentials through double hop to SQL Server?

What I am trying to do:
We have a Task Scheduler that kicks off an EXE, which in the course of its runtime, will connect to SQL Server.
So that would be:
taskServer.myDomain triggers the Task Scheduler action
taskServer.myDomain exe runs locally
taskServer.myDomain initiates a connection to sqlServer.myDomain
The scheduled task is associated with a service account (svc_user) that is set to run with highest privilege, run whether the user is logged in or not, and store credentials for access to non-local resources.
The actual behavior
What we are seeing is the Task Scheduler is indeed running as svc_user. It triggers the EXE as expected, and the EXE is also running as svc_user. When the EXE initiates a connection to SQL Server, it errors on authentication.
Looking at the Event Viewer we can see the failure trying to initialize the connection to SQL
Exception Info: System.Data.SqlClient.SqlException
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(System.Data.ProviderBase.DbConnectionPoolIdentity, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SqlCredential, System.Object, System.String, System.Security.SecureString, Boolean, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SessionData, System.Data.ProviderBase.DbConnectionPool, System.String, Boolean, System.Data.SqlClient.SqlAuthenticationProviderManager)
And then looking at the SQL Server logs we can see the root of the issue
Logon,Unknown,Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. Reason: Could not find a login matching the name provided.
The connection initialized by the EXE to SQL Server is trying to authenticate as ANONYMOUS LOGON.
What I have tried
Background
This issue popped up when our IT team started deploying a GPO lockdown in our environments. So in order to get to this point, we first had to add some GPO exceptions to allow the svc_user to:
log on locally
log on as batch job
Progress?
This is where we started being able to capture the ANONYMOUS LOGON error in SQL Server. From there we tried a handful of other GPO exceptions including
Allow Credential Save
Enable computer and user accounts to be trusted for delegation
The actual issue?
So it would appear that this is a double hop delegation issue. Which eventually led me here and then via the answer, here and here.
So I tried adding GPO policies to allow delegating fresh credentials using the WSMAN/* protocol + wildcard.
Two issues with this:
the Fresh credentials refer to prompted credentials while the EXE is running as a service during off-hours and inheriting the credentials from the TaskScheduler
the WSMAN protocol appears to be used for remote PowerShell sessions (via the original question in the serverfault post) and not SQL Service connections.
So, I added the protocol MSSQLSvc/* to the enabled delegation and tried all permutations of Fresh, Saved and Default delegation. (This was all done in Local Computer Policy -> Computer Configuration -> Administrative Templates -> system -> Credentials Delegation)
Where it gets weird
We have another server, otherServer.myDomain, which we setup with the same TaskSchedule. It is setup with the same GPO memberships, but seems to be able to successfully connect to SQL Server. AFAIK, the servers are identical as far as setup and configuration.
The Present
I have done a bit more digging into anywhere I could think that might offer clues as to how I can feed the credentials through or where they might be falling through. Including watching the traffic between the taskServer and the sqlServer as well as otherServer and sqlServer.
I was able to see NTLM challenges coming from the sqlServer to the taskServer/otherServer.
In the case of taskServer, the NTLM response only has a workstationString=taskServer
On otherServer, the NTLM response has workstationString=otherServer, domainString=myDomain, and userString=svc_user.
Question
What is the disconnect between hop 1 (task scheduler to EXE) and hop 2 (EXE to SQL on sqlServer)? And why does this behavior not match between taskServer and otherServer?
So I finally have an update/solution for this post.
The crux of the issue was a missing SPN. The short answer:
Add an SPN for sqlServer associated with the service account SQL services are running as (not the svc_user)
example: SetSPN -S MSSQLSvc/sqlServer.myDomain myDomain\svc_sql_user
Add another SPN like above but w/ the sql service port
example: SetSPN -S MSSQLSvc/sqlServer.myDomain:1433 myDomain\svc_sql_user
Set the SQL service user account to allow delegation like so

Why my Windows service only establishes connection with database when SQL Server Service runs under Local System account?

My windows service is using integrated authentication and running under Local System account and got the below exception.
The target principal name is incorrect. Cannot generate SSPI context.
The SQL Server Service is running under domain admin user e.g. "domain\administrator". If I change the SQL Server Service to run under Local System account then it fixes the above error.
Can anyone explain why it's happening like this? We have an InstallShield wizard which installs our application on client side i don't know how we can handle this behavior through the wizard. Also changing the user for SQL Server Service is not realistic as well because the client may not allow it.
Note: Once when my windows service works fine and I revert the SQL Server run under the admin account my service runs fine. I guess there are some permissions are set to the local system account.
Before it, I ran the Kerberos which generated the following script to run and fixed the issue. After this it was not required to change the user for SQL Server Service.
SetSPN -d "MSSQLSvc/FQDN" "domain\machine$"
SetSPN -s "MSSQLSvc/FQDN" "domain\administrator"
Please explain why it's happening and what is the best way to handle the situation?
When running under the Local System account, sql-server registers an spn for every service it controls automatcially up to active-directory, and attempts to unregister them when the service shuts down. The Local System account has the ability to communicate over the network as the computer account and thus can indicate to Active Directory as to when to make changes about itself and the SPN SQL Service wants to register. When you change the SQL Server account over to an AD domain user account, the Local System account immediately loses it's ability to control this; therefore you must manually delete the existing SPNs previously registered for that SQL service by Local System before registering new SPNs. You should now notice why its nice that the SQL server script helpfully calls for a deletion of the old SPN followed by the registration of a new one in order to prevent issues. When this isn't done properly - you'll get an authentication error when the kerberos clients obtain a ticket for the old invalid SPN - because it was never deleted and any Kerberos-aware service will always reject a ticket for a wrong SPN. After you make SPN changes, always be sure to restart the SQL Server service and right after that if you’re testing with a user have that user log out and log back in. This answers your main question here.
Please see this Microsoft document for further reading on the subject: Register a Service Principal Name for Kerberos Connections. There's also a very good youtube video on this exact problem, that's where I learned about it and how to resolve it. Ignore "SSRS" in the title, I've watched the entirety and the guidance applies to any and all services by SQL which have SPNs.
You had a secondary question at the very end of your question regarding what is the best way to handle the situation. If you're talking about solving it programmatically that would be very difficult to answer as all environments are different in some way and you will come across SQL instances running in all sorts of different security contexts. In an online forum like this you would probably get different answers from different people. If this were your only question, I think it would get closed by the moderators for "being primarily opinion-based" and likely to attract spam answers. I would suggest you incorporate some kind of guidance about the problem in some form of a Readme file that you should package with the InstallShield wizard.
Side note: I think you should add the kerberos tag to this question - as SPNs are relevant to Kerberos only - and not to any other authentication protocol.

What SQL user is used by TFS to send alerts?

We are running into a few issues with our TFS installation (TFS 2013 Update 4, SQL 2014 Standard) as a result of email alerts. Most notably, Work Items cannot be created, because this triggers an email.
Any time a process or user attempts to create a Work Item, the error
TF30040: The database is not correctly configured. Contact your Team Foundation Server administrator.
is received. Further, when I check the Event Viewer on the server, I can see the error and it reports that the inner exception is:
Exception Message: The EXECUTE permission was denied on the object 'sp_send_dbmail', database 'msdb', schema 'dbo'. (type SqlException)
I have worked with the DBA and we have enabled Email Alerts on the server. We have verified that, in general, the alerts work by using the test button on the administration console. I can also set up a check-in alert through the web interface and receive said alerts without issue. This seems to be specifically affecting Work Item creation alerts (which apparently are just automatically and irrevocably enabled).
Presumably, we could correct this by giving appropriate permissions to use that stored procedure. To do so, we need to know what user to give permissions to. So far we have tried giving execute permissions to my AD user, the service account used by the build service, and the Network Service account (which appears to be the TFS Service Account).
There is no indication in any error message as to what user is being used to execute that procedure. So, my question: What SQL user is used to send alerts when creating Work Items?
Edit:
For the record, this started working of its own accord. We decided Monday to call Microsoft to get this fixed. Before that happened, failed builds magically created some work items (on Tuesday, a full day after we gave up), and we are now able to create work items. Everyone involved states not doing anything. We are baffled, but in a good way.
I'm going to advise you that a DBA should not be making changes to the TFS databases. I suggest opening a ticket with MSFT and getting assistance from the product support group.

How to determine the most restrictive SQL server security permissions a program can use and still function?

PROBLEM BACKGROUND
Sorry if this is a bit tedious to read, but please bear with me.
I have been tasked to determine the most restrictive security permissions...or rather investigate if more restrictive security settings can be configured for the SQL server login our program uses, yet still function as normal.
Currently the program runs as a Windows service configured to log on using a Windows user account that has been configured in SQL server with trusted auth. The login used has been assigned a db_owner role and the service works fine like that.
So to narrow the permissions for this user I removed the db_owner rights and assigned it to the db_datareader and db_datawriter roles. Unfortunately this causes a problem and when I start up the service I get an error dialogue displaying:
Error 1053: the service did not respond to the start or control request in a timely fashion.
and in the event viewer under the System events are logged:
event 7009 (timeout waiting for..to connenct)
event 7000 (the service did not respond to the start or control )
My problem is the code base is really large and I'm not sure what exactly to look for that would require db_owner permissions (it sets permissions maybe?).
QUESTION
What should I be looking for in a program that executes SQL that would cause it to require db_owner permissions?
In case the first question is too general: is there an easy way/any tools I can use to figure out what a Windows service is trying to do during start-up 'SQL wise' if I get system error events logged:
event 7009 (Timeout (30000 milliseconds) waiting for the ... service to connect)
event 7000 (The service did not respond to the start or control request in a timely fashion).
BTW I tried running profiler with all audit events selected, but still get nothing logged when starting the service.
This is such a broad question without knowing the architecture of your service and how it communicates with SQL Server. Are you using in-line SQL? Stored Procedures?
I think you'd best tackle this issue by starting from the service's code and tracing the execution path from the start and see what is being executed on/against SQL Server.
Alternatively, if you are using stored procedures, you may want to script them all out into a file and search on some common T-SQL commands limited to a db_owner, such as CREATE, DROP, ALTER.

Problems with sending notification emails from SQL Server 2008

I have a problem setting up email notifications in SQL Server 2008.
I am using SQL Server 2008 to build a data cube. This process consists of several jobs, that are all scheduled to run each night at a specific time. To see whether a job was running without any failure, it is possible to set up email notification.
(source: smilinginthesun.de)
(source: smilinginthesun.de)
Unfortunately, no email is sent even though I think I have set up database mail correctly.
All guides I can found just explains, how to set up database mail (eg. http://blog.sqlauthority.com/2008/08/23/sql-server-2008-configure-database-mail-send-email-from-sql-database/), but thats it.
If I sent a mail to test functionality (via test email button or from T-SQL script) it works just fine and the email arrives. But when using the notification feature in a job, it doesn't even create a sysmail_mailitems or sysmail_log entry.
I have also send an email using sp_notify_operator, successfully. So the operator set up seems to work, too. (Thanks, Joe Stefanelli, for the hint.)
There is a great workaround written for SQL Server 2005, that works for me, too: http://www.howtogeek.com/howto/database/sending-automated-job-email-notifications-in-sql-server-with-smtp/, but why isn't it possible to just use that build-in functionality?
Does anyone knows if there is anything else to do, to make that email notification task work properly? Thanks in advise.
I am late but I saw your post having the same problem.
The solution is to enable a mail profile for alerts.
For this:
Right click on "SQL Server Agent"
Then "Properties".
Then go to "Alert System" section.
Tick the "Enable mail profile" box.
Then "OK".
Restart the SQL Server Agent service.
To send email notifications from agent jobs in SQL Server Agent, the alarm system must be activated. We did that. After restarting the agent, the Database Mail system should be used. Unfortunately, this does not work. The agent was still trying to reach a Mapi profile (as in SQL 2000).
The following registry value wasn't set: "UseDatabaseMail"=dword:00000001
After setting it to 1, the database mail system was used.
You can do that in SQL server, directly:
EXEC master.dbo.xp_instance_regwrite N'HKEY_LOCAL_MACHINE', N'SOFTWARE\Microsoft\MSSQLServer\SQLServerAgent', N'UseDatabaseMail', N'REG_DWORD', 1
I've tried what ZeuG. indicated and still didn't have success... then I restarted the SQL Agent. That's what made the job email notification work for me.
For me, SQL email worked fine when I right-clicked Database Mail and said Send Test Email. However, after setting up alerts for all severities and running RAISERROR (N'Test alert!', 20, 1) WITH LOG;, no email came through. The issue is the service account that SQL Server Agent runs under. Apparently, the virtual account NT Service\SQLSERVERAGENT (or whatever your default may be) does not have permissions to access objects on the network and this includes e-mail.
To fix, change the service to use a local account under the Users group, restart SQL Server agent, and then try running a command that throws an error like the RAISERROR statement above. You should now receive the email.

Resources