PowerShell Script
New-Cluster -Name "DI-XXX-YY-CLUSTER" -Node "di-XXX-YY-db1","di-XXX-YY-db2" -NoStorage -StaticAddress 172.17.XX.YYY
Set-ClusterQuorum -NodeAndFileShareMajority "\\DI-XXX-YY-WS1\ClusterQuorum"
Invoke-Command -ComputerName "DI-XXX-YY-WS1" -ScriptBlock { mkdir c:\Quorum}
Invoke-Command -ComputerName "DI-XXX-YY-WS1" -ScriptBlock { New-SmbShare -Name "Quorum" -Path "c:\Quorum" -FullAccess "didevtest.local\DI-XXX-YY-CLUSTE"}
Add-ClusterNode -Cluster "DI-XXX-YY-CLUSTER" -Name "di-XXX-YY-db2" -NoStorage
Server Manager on the second node (di-XXX-YY-db2) shows a warning:
Incomplete communication with DI-XXX-YY-CLUSTER. The following nodes
or cluster roles might be offline or have connectivity issues
Server Manager->All Servers
The Server Manager refresh fails on the second node (di-XXX-YY-db2)
Windows error log entries
The Kerberos client received a KRB_AP_ERR_MODIFIED error from the
server di-XXX-XX-db1$. The target name used was
MSServerClusterMgmtAPI/DI-XXX-XX-CLUSTER.didevtest.local. This
indicates that the target server failed to decrypt the ticket provided
by the client. This can occur when the target server principal name
(SPN) is registered on an account other than the account the target
service is using. Ensure that the target SPN is only registered on the
account used by the server. This error can also happen if the target
service account password is different than what is configured on the
Kerberos Key Distribution Center for that target service. Ensure that
the service on the server and the KDC are both configured to use the
same password. If the server name is not fully qualified, and the
target domain (DIDEVTEST.LOCAL) is different from the client domain
(DIDEVTEST.LOCAL), check if there are identically named server
accounts in these two domains, or use the fully-qualified name to
identify the server.
DCOM was unable to communicate with the computer
DI-XXX-XX-CLUSTER.didevtest.local using any of the configured
protocols; requested by PID 14d4
(C:\Windows\system32\ServerManager.exe).
You are creating a Windows Server Failover Cluster (WSFC), not an FCI. FCI is the clustered instance of SQL Server.
That said, check networking (including DNS), firewall, and most importantly, AD. If the WSFC is not coming online, it could be any of these things. Make sure that the CNO is precreated or the account creating the WSFC has rights to create objects in AD. If the object is there but not in DNS, similar issue - make sure DNS is right.
Also, why are you running Add-ClusterNode? The WSFC is being formed with both nodes in New-Cluster.
Check the logs and Event Viewer. They will give you a clue as to why things are messed up.
One NIC is fine if it's virtualized. There are cases where you would have two NICs (always in physical). Do you have two NICs in one server but not the other?
Also, read all of the text rather than just going by the yellow/green/blue. Sometimes the problem is in the notes.
That said, again, go check SPNs and DNS. Look for things like duplicate or stale DNS records or duplicate SPNs.
You can search for "KRB_AP_ERR_MODIFIED cluster" on the web to see quite a few different solutions, but most are DNS related (including what I mentioned).
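To act on the SPN and DNS advice, a few read-only checks can be run from an elevated PowerShell prompt on one of the nodes. This is only a sketch: the cluster and domain names are the placeholders from the question, and which check exposes the problem depends on your environment.

```powershell
# Placeholder names taken from the question; substitute your own.
$clusterName = 'DI-XXX-YY-CLUSTER'
$domain      = 'didevtest.local'
$fqdn        = "$clusterName.$domain"

# 1. Which account holds the SPN named in the Kerberos error?
#    It should be registered only on the cluster name object (CNO).
setspn -Q "MSServerClusterMgmtAPI/$fqdn"

# 2. Search the domain for duplicate SPNs.
setspn -X

# 3. List every SPN registered on the CNO computer account.
setspn -L $clusterName

# 4. Check what the cluster name resolves to. A stale or duplicate A record
#    that points at the wrong host can also produce KRB_AP_ERR_MODIFIED.
Resolve-DnsName -Name $fqdn -Type A

# 5. Compare that with the address the cluster itself holds.
Get-ClusterResource | Where-Object { $_.ResourceType -eq 'IP Address' } |
    Get-ClusterParameter -Name Address
```

If the DNS answer in step 4 does not match the address in step 5, or step 1 returns an account other than the CNO, that mismatch is the first thing to correct.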
What I am trying to do:
We have a Task Scheduler that kicks off an EXE, which in the course of its runtime, will connect to SQL Server.
So that would be:
taskServer.myDomain triggers the Task Scheduler action
taskServer.myDomain exe runs locally
taskServer.myDomain initiates a connection to sqlServer.myDomain
The scheduled task is associated with a service account (svc_user) that is set to run with highest privilege, run whether the user is logged in or not, and store credentials for access to non-local resources.
The actual behavior
What we are seeing is the Task Scheduler is indeed running as svc_user. It triggers the EXE as expected, and the EXE is also running as svc_user. When the EXE initiates a connection to SQL Server, it errors on authentication.
Looking at the Event Viewer we can see the failure trying to initialize the connection to SQL
Exception Info: System.Data.SqlClient.SqlException
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(System.Data.ProviderBase.DbConnectionPoolIdentity, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SqlCredential, System.Object, System.String, System.Security.SecureString, Boolean, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SessionData, System.Data.ProviderBase.DbConnectionPool, System.String, Boolean, System.Data.SqlClient.SqlAuthenticationProviderManager)
And then looking at the SQL Server logs we can see the root of the issue
Logon,Unknown,Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. Reason: Could not find a login matching the name provided.
The connection initialized by the EXE to SQL Server is trying to authenticate as ANONYMOUS LOGON.
What I have tried
Background
This issue popped up when our IT team started deploying a GPO lockdown in our environments. So in order to get to this point, we first had to add some GPO exceptions to allow the svc_user to:
log on locally
log on as batch job
Progress?
This is where we started being able to capture the ANONYMOUS LOGON error in SQL Server. From there we tried a handful of other GPO exceptions including
Allow Credential Save
Enable computer and user accounts to be trusted for delegation
The actual issue?
So it would appear that this is a double hop delegation issue. Which eventually led me here and then via the answer, here and here.
So I tried adding GPO policies to allow delegating fresh credentials using the WSMAN/* protocol + wildcard.
Two issues with this:
the Fresh credentials refer to prompted credentials while the EXE is running as a service during off-hours and inheriting the credentials from the TaskScheduler
the WSMAN protocol appears to be used for remote PowerShell sessions (via the original question in the serverfault post) and not SQL Service connections.
So, I added the protocol MSSQLSvc/* to the enabled delegation and tried all permutations of Fresh, Saved and Default delegation. (This was all done in Local Computer Policy -> Computer Configuration -> Administrative Templates -> system -> Credentials Delegation)
Where it gets weird
We have another server, otherServer.myDomain, which we set up with the same scheduled task. It has the same GPO memberships, but it is able to connect to SQL Server successfully. AFAIK, the servers are identical in setup and configuration.
The Present
I have done a bit more digging anywhere I could think of that might offer clues as to how I can feed the credentials through, or where they might be falling through. This included watching the traffic between taskServer and sqlServer, as well as between otherServer and sqlServer.
I was able to see NTLM challenges coming from the sqlServer to the taskServer/otherServer.
In the case of taskServer, the NTLM response only has a workstationString=taskServer
On otherServer, the NTLM response has workstationString=otherServer, domainString=myDomain, and userString=svc_user.
Question
What is the disconnect between hop 1 (task scheduler to EXE) and hop 2 (EXE to SQL on sqlServer)? And why does this behavior not match between taskServer and otherServer?
So I finally have an update/solution for this post.
The crux of the issue was a missing SPN. The short answer:
Add an SPN for sqlServer associated with the service account SQL services are running as (not the svc_user)
example: SetSPN -S MSSQLSvc/sqlServer.myDomain myDomain\svc_sql_user
Add another SPN like above but w/ the sql service port
example: SetSPN -S MSSQLSvc/sqlServer.myDomain:1433 myDomain\svc_sql_user
Set the SQL service user account to allow delegation like so
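One way to confirm the result afterwards is to query the SPNs and the delegation flag from PowerShell. This is a sketch, assuming the account names used in this answer and that the ActiveDirectory RSAT module is available; it only reads values and changes nothing.

```powershell
# Hypothetical names from the answer; substitute your own.
$sqlHost    = 'sqlServer.myDomain'
$sqlAccount = 'svc_sql_user'

# Confirm each SPN exists and see which account holds it.
setspn -Q "MSSQLSvc/$sqlHost"
setspn -Q "MSSQLSvc/${sqlHost}:1433"

# List everything registered on the SQL Server service account.
setspn -L "myDomain\$sqlAccount"

# Inspect the delegation flag on the account (requires the ActiveDirectory module).
Import-Module ActiveDirectory
Get-ADUser -Identity $sqlAccount -Properties ServicePrincipalName, TrustedForDelegation |
    Select-Object SamAccountName, TrustedForDelegation, ServicePrincipalName
```

Each `setspn -Q` call should report exactly one account, and it should be the one the SQL Server service runs as.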
We've a Windows Event Collector in DOMAIN1. DOMAIN1 and DOMAIN2 have a two-way transitive forest trust. Events from sources in D1 are forwarding fine to the WEC in D1.
D2 is set up to communicate with the same subscription manager FQDN over HTTP on port 5985 (Server=http://server1.domain1.com:5985/wsman/SubscriptionManager/WEC,Refresh=60), using source-initiated event collection. Port 5985 is open and reachable from D2 machines to the WEC in D1.
Machines in D2 are getting this in their Eventlog-ForwardingPlugin Operational logs
The forwarder is having a problem communicating with subscription manager at address http://wec1.domain1.com:5985/wsman/SubscriptionManager/WEC. Error code is 2150858909 and Error Message is <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault" Code="2150858909" Machine="server1.domain2.com"><f:Message>WinRM cannot process the request. The following error with errorcode 0xc0000413 occurred while using Kerberos authentication: An unknown security error occurred.
Possible causes are:
-The user name or password specified are invalid.
-Kerberos is used when no authentication method and no user name are specified.
-Kerberos accepts domain user names, but not local user names.
-The Service Principal Name (SPN) for the remote computer name and port does not exist.
-The client and remote computers are in different domains and there is no trust between the two domains.
After checking for the above issues, try the following:
-Check the Event Viewer for events related to authentication.
-Change the authentication method; add the destination computer to the WinRM TrustedHosts configuration setting or use HTTPS transport.
Note that computers in the TrustedHosts list might not be authenticated.
-For more information about WinRM configuration, run the following command: winrm help config. </f:Message></f:WSManFault>.
[eventlog][1]
I don't know enough about Kerberos to know whether tickets from D2 can be used in D1, or somehow made to work there. Does anyone have any ideas? I can't find much about this exact issue with WEF.
thanks
[1]: https://i.stack.imgur.com/VVF0Y.png
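A few checks run from one of the DOMAIN2 source machines might help narrow this down. This is a sketch, using the collector name from the subscription manager URL above. Note that the forwarder authenticates with the machine account, while these commands run as the logged-on user, so a success here does not rule out a problem specific to the computer account.

```powershell
# Name taken from the question; substitute your own.
$collector = 'server1.domain1.com'

# 1. Is the WinRM HTTP port reachable from the DOMAIN2 source?
Test-NetConnection -ComputerName $collector -Port 5985

# 2. Does the name resolve to the collector from inside DOMAIN2?
Resolve-DnsName -Name $collector

# 3. Can a Kerberos service ticket for the collector be obtained across the trust?
klist get "HTTP/$collector"

# 4. Does a WinRM request authenticated with Kerberos succeed?
Test-WSMan -ComputerName $collector -Authentication Kerberos
```

If step 3 fails while steps 1 and 2 succeed, the problem is more likely in the trust or SPN lookup than in the network path.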
I've a GPO that won't work in Azure AD. I need to create multiple GPOs to map network drives. I've put the GPO right under the domain
I've mapped the drive, and targeted at a security group. Tried with an OU first, but that didn't work either.
So did I place the GPO wrong, or did I map the drive wrong? The client has a dynamic IP and its DNS servers are the IPs of the servers.
When I run gpupdate on the client, it seems that the server is unreachable:
Let me know if you need additional information
• Your procedure for mapping the network drive is correct, but the error in the screenshot you posted shows that the AD/DNS servers are unreachable, which is why the group policies cannot be retrieved and applied from the domain controller. Check connectivity from the client to the DC/Group Policy server as follows:
A) Check whether SYSVOL replication is working correctly. DFSR (Distributed File System Replication) is used for SYSVOL replication; to confirm this, run the command below and check its result:
'dfsrmig.exe /getglobalstate' --> If the result shows 3 (ELIMINATED), then it is OK.
B) Then check which DC holds the FSMO roles. Run the command below and check whether the IP and hostname it reports match the DNS server configured in the client's IP settings:
'netdom query fsmo'
C) Once the above is done, check that replication between the DCs is working correctly by running the commands below one at a time and analyzing their results:
Dcdiag /v >c:\dcdiag1.log
Repadmin /showrepl
Repadmin /syncall /APeD
D) Ensure that the 'gpt.ini' file exists on your DC under '\\domain.local\SysVol\domain.local\Policies\{Policy_GUID}\'. If it does not, essential files on your GPO server may be corrupted, and the policy should be reset. Also ensure that your DNS server or DC is reachable and can be pinged with the commands below, and flush the DNS resolver cache on the client computers:
ping <hostname of DC>
nslookup <hostname of DC>
ipconfig /flushdns (on client systems)
Lastly, ensure that your DC and domain are accessible over RPC by running the command below:
'nltest /dsgetdc:<domain name>'
If all of the above commands return positive results, check your client's network and domain settings for issues, since everything is then correct on the DC side.
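The client-side part of these checks can be gathered into one PowerShell pass. This is a sketch under stated assumptions: `domain.local` and `dc01.domain.local` are hypothetical names, and the port list covers only LDAP, SMB and the RPC endpoint mapper.

```powershell
# Hypothetical names; replace with your own domain and DC.
$domain = 'domain.local'
$dc     = 'dc01.domain.local'

# 1. Does the client resolve the DC, and to which address?
Resolve-DnsName -Name $dc -Type A

# 2. Can the client find DCs through the SRV records that AD publishes?
Resolve-DnsName -Name "_ldap._tcp.dc._msdcs.$domain" -Type SRV

# 3. Are LDAP (389), SMB (445) and the RPC endpoint mapper (135) reachable?
foreach ($port in 389, 445, 135) {
    Test-NetConnection -ComputerName $dc -Port $port |
        Select-Object ComputerName, RemotePort, TcpTestSucceeded
}

# 4. Is the SYSVOL policy folder readable from the client?
Test-Path -Path "\\$domain\SysVol\$domain\Policies"

# 5. Re-run policy processing and write a report of what was applied.
gpupdate /force
gpresult /h "$env:TEMP\gpresult.html" /f
```

If step 1 or 2 fails, the client is most likely using DNS servers that do not host the AD zone, which would explain why gpupdate cannot reach the server.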
I have hosted my WebApp on server 1 and my database on server 2
But I'm getting following error
Communication with the underlying transaction manager has failed.
I googled and found a post saying that this is a DTC (Distributed Transaction Coordinator) issue.
I enabled DTC on server 2 (the DB server) and added a firewall exception for it.
But I still get the same error.
Here is the full stack trace
Message: System.Transactions.TransactionManagerCommunicationException: Communication with the underlying transaction manager has failed. ---> System.Runtime.InteropServices.COMException: The MSDTC transaction manager was unable to pull the transaction from the source transaction manager due to communication problems. Possible causes are: a firewall is present and it doesn't have an exception for the MSDTC process, the two machines cannot find each other by their NetBIOS names, or the support for network transactions is not enabled for one of the two transaction managers. (Exception from HRESULT: 0x8004D02B)
at System.Transactions.Oletx.IDtcProxyShimFactory.ReceiveTransaction(UInt32 propgationTokenSize, Byte[] propgationToken, IntPtr managedIdentifier, Guid& transactionIdentifier, OletxTransactionIsolationLevel& isolationLevel, ITransactionShim& transactionShim)
at System.Transactions.TransactionInterop.GetOletxTransactionFromTransmitterPropigationToken(Byte[] propagationToken)
Kindly advise.
We had the exact same situation, and more than once. Each time, it was one of the following:
The IP address in DNS for the server is outdated (as the error message says: "the two machines cannot find each other by their NetBIOS names"). You can check whether this is the case by running ping servername from one server to the other in the command prompt. If the ping by name fails and the ping by IP succeeds (or the ping by name returns the wrong IP), then you should ask the system admins to take a look at DNS/DHCP.
The servers are created as an image of preconfigured server (for example, if you are working with virtual machines, and instead of doing a fresh install for each of the servers, you simply clone the image). This is a problem because DTC has an internal "Identifier" - and in case of image cloning both your installations now have same DTC ID, and won't be able to communicate with each other. The solution is to simply uninstall and install the DTC again.
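Both cases can be checked or fixed from an elevated PowerShell prompt. This is a sketch, assuming `server2` is the other machine's name (hypothetical). The reinstall step returns MSDTC to its defaults, so network DTC access settings may need to be applied again afterwards.

```powershell
# Hypothetical peer name; substitute your own.
$peer = 'server2'

# 1. What does the name resolve to, and does it match the server's real address?
Resolve-DnsName -Name $peer
Test-Connection -ComputerName $peer -Count 2

# 2. Reinstall MSDTC on a cloned server so that it gets a fresh identity.
Stop-Service -Name MSDTC
msdtc -uninstall
msdtc -install
Start-Service -Name MSDTC
```

Run the first part in both directions, since each transaction manager has to be able to find the other by name.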
Hope it helps.
Things to check:
Have you done this configuration on both servers?
Are both servers members of the same domain?
Have you checked the event log?
I had the same problem while connecting to a remote SQL Server.
The solution in my case was to add "enlist=false" to the connection string.
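As an illustration of where the keyword goes, the connection string can be built with `SqlConnectionStringBuilder`. This is a sketch for Windows PowerShell 5.1; the server and database names are hypothetical. Keep in mind that with Enlist set to false the connection no longer joins the ambient transaction, so its work is not committed or rolled back together with the enclosing TransactionScope.

```powershell
# Windows PowerShell 5.1 sketch; server and database names are hypothetical.
Add-Type -AssemblyName System.Data

$builder = New-Object System.Data.SqlClient.SqlConnectionStringBuilder
$builder['Data Source']         = 'server2'
$builder['Initial Catalog']     = 'MyDatabase'
$builder['Integrated Security'] = $true
$builder['Enlist']              = $false   # do not join the ambient transaction

# Print the resulting connection string.
$builder.ConnectionString
```

This avoids the MSDTC error by avoiding the distributed transaction altogether, so it only fits cases where the SQL work does not need to be part of that transaction.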
I was missing quite a lot of things:
No authentication (as the DB server and the app server are not within the same AD domain)
A Windows Firewall rule enabling msdtc.exe
A rule on the firewall between the DMZ and the internal zone allowing TCP 135 and 1024-65535 in both directions. The link tells you how to restrict the firewall policy to a few ports only.
Short and long server names in the hosts file or on a shared DNS server, e.g. 192.168.1.1 app1 as well as 192.168.1.1 app1.domain.local
On the other hand based on this link my setup doesn't require:
Allow Remote Clients
Allow Remote Administration
Enable XA Transactions (required prior to Windows Server 2003 SP1)
Solved after adding remote IP\machine name to files on server:
hosts, lmhosts
in folder
C:\Windows\System32\drivers\etc
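Adding the hosts entry can be scripted. This is a sketch to run from an elevated prompt; the address and names are illustrative values, not ones taken from this answer.

```powershell
# Run elevated. The address and names below are illustrative values.
$hostsPath = "$env:SystemRoot\System32\drivers\etc\hosts"
$entry     = '192.168.1.1    app1 app1.domain.local'

# Append the entry only if the address is not already listed.
if (-not (Select-String -Path $hostsPath -Pattern '^\s*192\.168\.1\.1\s' -Quiet)) {
    Add-Content -Path $hostsPath -Value $entry
}

# Confirm that the short name now resolves.
Test-Connection -ComputerName 'app1' -Count 1
```

Static entries like this go stale when addresses change, so they are best treated as a workaround until DNS is corrected.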
One of our servers displayed this error after the Virtual Machine (VM) controlling our Domain Controller froze. Several related communication problems also started to pop up (like failed password resets). Resetting the frozen VM fixed the issue.
Lots of helpful answers already given.
One problem for me was the presence of invalid (Cyrillic) characters in the computer name.
And there is also a way to validate the connection between two servers (or between a server and a computer) using a small tool from Microsoft called DTCPing.
Is this even a valid question? I have a .NET Windows app that is using MSDTC and it is throwing an exception:
System.Transactions.TransactionManagerCommunicationException: Network access for Distributed Transaction Manager (MSDTC) has been disabled. Please enable DTC for
network access in the security configuration for MSDTC using the Component Services Administrative tool ---> System.Runtime.InteropServices.COMException (0x8004D024): The transaction manager has disabled its support for remote/network
transactions. (Exception from HRESULT: 0x8004D024) at System.Transactions.Oletx.IDtcProxyShimFactory.ReceiveTransaction(UInt32
propgationTokenSize, Byte[] propgationToken, IntPtr managedIdentifier,
Guid& transactionIdentifier, OletxTransactionIsolationLevel&
isolationLevel, ITransactionShim& transactionShim)....
I followed the Kbalertz guide to enable MSDTC on the PC on which the app is installed, but the error still occurs.
I was wondering if this was a database issue? If so, how can I resolve it?
Use this for Windows Server 2008 R2 and Windows Server 2012 R2:
Click Start, click Run, type dcomcnfg and then click OK to open Component Services.
In the console tree, click to expand Component Services, click to expand Computers, click to expand My Computer, click to expand Distributed Transaction Coordinator and then click Local DTC.
Right click Local DTC and click Properties to display the Local DTC Properties dialog box.
Click the Security tab.
Select the "Network DTC Access" checkbox.
Finally, select the "Allow Inbound" and "Allow Outbound" checkboxes.
Click Apply, then OK.
A message will pop up about restarting the service.
Click OK, and that's all.
Reference : https://msdn.microsoft.com/en-us/library/dd327979.aspx
Note: Sometimes the network firewall on the Local Computer or the Server could interrupt your connection so make sure you create rules to "Allow Inbound" and "Allow Outbound" connection for C:\Windows\System32\msdtc.exe
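On Windows Server 2012 and later, the firewall rules can be created from an elevated PowerShell prompt. This is a sketch: the rule display names are arbitrary, and either option should be enough on its own.

```powershell
# Option 1: enable the built-in rule group that ships with Windows.
Enable-NetFirewallRule -DisplayGroup 'Distributed Transaction Coordinator'

# Option 2: explicit program rules (the display names are arbitrary).
$msdtc = "$env:SystemRoot\System32\msdtc.exe"
New-NetFirewallRule -DisplayName 'MSDTC inbound'  -Direction Inbound  -Program $msdtc -Action Allow
New-NetFirewallRule -DisplayName 'MSDTC outbound' -Direction Outbound -Program $msdtc -Action Allow
```

The NetSecurity cmdlets are not available on Windows Server 2008 R2; there, `netsh advfirewall firewall set rule group="Distributed Transaction Coordinator" new enable=yes` does the same as option 1.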
Do you even need MSDTC? The escalation you're experiencing is often caused by creating multiple connections within a single TransactionScope.
If you do need it then you need to enable it as outlined in the error message. On XP:
Go to Administrative Tools -> Component Services
Expand Component Services -> Computers ->
Right-click -> Properties -> MSDTC tab
Hit the Security Configuration button
I've found that the best way to debug is to use the Microsoft tool called DTCPing.
Copy the file to both the server (DB) and the client (Application server/client pc)
Start it at the server and the client
At the server: fill in the client netbios computer name and try to setup a DTC connection
Restart both applications.
At the client: fill in the server netbios computer name and try to setup a DTC connection
I've had my fair share of problems in our old company network, and I've got a few tips:
if you get the error message "Gethostbyname failed", it means the computer cannot find the other computer by its NetBIOS name. The server could, for instance, resolve and ping the client, but that works at the DNS level, not at the NetBIOS lookup level. Using WINS servers or editing the LMHOSTS file (dirty) will solve this problem.
if you get an "Access Denied" error, the security settings don't match. Compare the Security tab of the MSDTC properties on both machines and make the server and client match. One other thing to look at is the RestrictRemoteClients value. Depending on your OS version and, more importantly, the Service Pack, this value can be different.
Other connection problems:
The firewall between the server and the client must allow communication over port 135. More importantly, it must be possible to initiate the connection from both sides (I had a lot of problems with the firewall people in my company because they assumed only the server would open a connection to that port).
The protocol returns a random port to connect to for the real transaction communication. Firewall people don't like that, they like to restrict the ports to a certain range. You can restrict the RPC dynamic port generation to a certain range using the keys as described in How to configure RPC dynamic port allocation to work with firewalls.
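The registry values described in that article can be set from an elevated PowerShell prompt. This is a sketch: the 5000-5100 range is an illustrative assumption rather than a recommendation, `server2` is a hypothetical peer name, and a reboot is needed before the range takes effect.

```powershell
# Run elevated on each machine. The port range is an illustrative assumption.
$key = 'HKLM:\SOFTWARE\Microsoft\Rpc\Internet'
if (-not (Test-Path -Path $key)) {
    New-Item -Path $key | Out-Null
}

# Restrict RPC dynamic ports to the given range.
New-ItemProperty -Path $key -Name 'Ports' -PropertyType MultiString -Value '5000-5100' -Force | Out-Null
New-ItemProperty -Path $key -Name 'PortsInternetAvailable' -PropertyType String -Value 'Y' -Force | Out-Null
New-ItemProperty -Path $key -Name 'UseInternetPorts' -PropertyType String -Value 'Y' -Force | Out-Null

# Check that the RPC endpoint mapper on the other machine is reachable.
Test-NetConnection -ComputerName 'server2' -Port 135
```

The same range then has to be opened on the firewall in both directions, alongside port 135.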
In my experience, if the DTCPing is able to setup a DTC connection initiated from the client and initiated from the server, your transactions are not the problem any more.
Can also see here on how to turn on MSDTC from the Control Panel's services.msc.
On the server where the trigger resides, you need to turn the MSDTC service on. You can do this by clicking START > SETTINGS > CONTROL PANEL > ADMINISTRATIVE TOOLS > SERVICES. Find the service called 'Distributed Transaction Coordinator', RIGHT CLICK on it and select Start.
MSDTC must be enabled on both systems, both server and client.
Also, make sure that there isn't a firewall between the systems that blocks RPC.
DTCTest is a nice little app that helps you to troubleshoot any other problems.
@Dan,
Do I not need msdtc enabled for transactions to work?
Only distributed transactions - Those that involve more than a single connection. Make doubly sure you are only opening a single connection within the transaction and it won't escalate - Performance will be much better too.
MSDTC can be configured with the MsDtc PowerShell module, e.g.:
# Import the module
Import-Module -Name MsDtc
# Set the DTC config (splatted as a hashtable)
$dtcNetworkSetting = @{
    DtcName                           = 'Local'
    AuthenticationLevel               = 'NoAuth'
    InboundTransactionsEnabled        = $true
    OutboundTransactionsEnabled       = $true
    RemoteClientAccessEnabled         = $true
    RemoteAdministrationAccessEnabled = $true
    XATransactionsEnabled             = $false
    LUTransactionsEnabled             = $true
}
Set-DtcNetworkSetting @dtcNetworkSetting
# Restart the MsDtc service so the new settings take effect
Get-Service -Name MsDtc | Restart-Service
Run on each of the machines that will be supporting the distributed transactions (i.e. where the MSDTC service is running).
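The same module can be used to check the result afterwards. This is a sketch, assuming `server2` (a hypothetical name) is the other machine taking part in the transaction; `Test-Dtc` may need additional firewall and remoting configuration before it succeeds.

```powershell
# Show the settings now in effect for the local DTC instance.
Get-DtcNetworkSetting -DtcName 'Local'

# Optionally test DTC connectivity to the peer (hypothetical name 'server2').
Test-Dtc -LocalComputerName $env:COMPUTERNAME -RemoteComputerName 'server2' -Verbose
```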