Release Management 2015 randomly hangs during acceptance/approval/validation/deployment steps - ms-release-management

We had been using RM for a while with no significant issues until we upgraded from RM 2013 to RM 2015. At the same time, we migrated the RM server to a different virtual machine.
Ever since that upgrade, we have an intermittent issue where RM will completely hang for 10-15 minutes whenever a release is accepted, approved or validated. This is what happens when RM hangs:
1) Multiple deployment agents will disconnect from RM because they can't connect to the server.
2) The RM fat client becomes unresponsive
3) The RM website becomes unresponsive
4) If the hang occurs during the deployment step, the deployment often fails because the agent cannot talk to the server
Things we have tried:
1) Per Microsoft recommendation, we turned off recycling the app pool on the server
2) We made sure all of our servers are accessing the drop location using UNC paths and not HTTPS
3) Increased Look for Packages to Deploy timeout
This issue does not seem to be tied to any particular release template or even a release. It can happen to any of them at any time. So far, the only pattern we have been able to identify is that it only happens when someone accepts, approves or validates a deployment. Even then, it does not occur every time.
When RM hangs, both the server and the agents log errors like this:
Timestamp: 1/6/2016 11:36:09 AM
Message: Root element is missing.: \r\n\r\n at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
at Microsoft.TeamFoundation.Release.Common.ExtensionMethods.XmlExtensionMethods.ToXDocument(String value, Boolean preserveWhitespace)
at Microsoft.TeamFoundation.Release.Data.Model.ModelFactory.TransformXmlToModel[T](T model, String xml)
at Microsoft.TeamFoundation.Release.Data.Model.ModelFactory.Load[T](Int32 id)
at Microsoft.TeamFoundation.Release.DeploymentAgent.Services.Deployer.DeploymentEventFetcherBase.DeployNextComponent()
Category: General
Priority: -1
EventId: 0
Severity: Error
Title:
Machine: [Redacted]
Application Domain: DeploymentAgent.exe
Process Id: 1880
Process Name: C:\Program Files\Microsoft Visual Studio 12.0\Release Management\bin\DeploymentAgent.exe
Win32 Thread Id: 2436
Thread Name:
and this:
Timestamp: 1/6/2016 11:36:09 AM
Message: Error while converting string to XDocument: [Root element is missing.] [ at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
at Microsoft.TeamFoundation.Release.Common.ExtensionMethods.XmlExtensionMethods.ToXDocument(String value, Boolean preserveWhitespace)].
Value is:
Category: General
Priority: -1
EventId: 0
Severity: Error
Title:
Machine: [Redacted]
Application Domain: DeploymentAgent.exe
Process Id: 1880
Process Name: C:\Program Files\Microsoft Visual Studio 12.0\Release Management\bin\DeploymentAgent.exe
Win32 Thread Id: 2436
Thread Name:
Extended Properties:
After about 10-15 minutes, RM will recover on its own and be usable again. It might proceed with no problems on a few more releases, or it might hang again on the very next approval/acceptance/validation gate.
Any help troubleshooting would be greatly appreciated.

Joe, you can enable verbose logs for the RM service by following the steps in https://blogs.msdn.microsoft.com/visualstudioalm/2013/12/12/how-to-debug-release-management-components/. Follow the section named "Enable Services Logs on Release Management Server".
You can then mail the logs to rm_customer_queries_at_microsoft_dot_com for faster resolution.

Related

SQL Server DTSWizard.exe Import and Export Wizard clr.dll Access Violation

Earlier today I was attempting to import data into a local SQL Server instance using the built-in SQL Server Import and Export Wizard [DTSWizard.exe], which I've recently done many times. Today was not one of them. Right after I click the initial 'Next', where the Source Data Sources are enumerated, DTSWizard.exe crashes. Luckily, it does generate Application Events in Event Viewer.
Application: DTSWizard.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an internal error in the .NET Runtime at IP 745B1AB3 (74590000) with exit code 80131506.
Faulting application name: DTSWizard.exe, version: 15.0.2000.168, time stamp: 0x60d2af25
Faulting module name: clr.dll, version: 4.8.4400.0, time stamp: 0x60b90414
Exception code: 0xc0000005
Fault offset: 0x00021ab3
Faulting process id: 0x2644
Faulting application start time: 0x01d798859bcd5141
Faulting application path: C:\Program Files (x86)\Microsoft SQL Server Management Studio 18\Common7\IDE\CommonExtensions\Microsoft\SSIS\150\Binn\DTSWizard.exe
Faulting module path: C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll
Report Id: f299f3ae-8e51-4bf8-a4b0-33bee2d36804
Faulting package full name:
Faulting package-relative application ID:
Fault bucket 1468275927648085676, type 1
Event Name: APPCRASH
Response: Not available
Cab Id: 0
Problem signature:
P1: DTSWizard.exe
P2: 15.0.2000.168
P3: 60d2af25
P4: clr.dll
P5: 4.8.4400.0
P6: 60b90414
P7: c0000005
P8: 00021ab3
P9:
P10:
Attached files:
C:\ProgramData\Microsoft\Windows\WER\Temp\WERF2CF.tmp.dmp
C:\ProgramData\Microsoft\Windows\WER\Temp\WERF438.tmp.WERInternalMetadata.xml
C:\ProgramData\Microsoft\Windows\WER\Temp\WERF448.tmp.xml
C:\ProgramData\Microsoft\Windows\WER\Temp\WERF446.tmp.csv
C:\ProgramData\Microsoft\Windows\WER\Temp\WERF467.tmp.txt
These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportArchive\AppCrash_DTSWizard.exe_9a495bf0850f18fc52df16157df1d27143632_d6e9ae04_912b0408-c4d8-4634-820f-fde40457c5a6
Analysis symbol:
Rechecking for solution: 0
Report Id: f299f3ae-8e51-4bf8-a4b0-33bee2d36804
Report Status: 268435456
Hashed bucket: 99ad5a02b5c660c124605d2d4bb8a6ac
CabGuid: 0
I'm using the latest version of SQL Server Management Studio, build 15.0.18386.0, which I reinstalled with no change. I also applied all cumulative updates for .NET Framework 4.8, which brought clr.dll to file version 4.8.4400.0, timestamped 6/3/2021 1:25PM, and tried the .NET Framework Repair Tool, which didn't seem to make a difference. According to Windows Update, no hotfixes or patches were applied between last week, when the last successful import ran, and now.
In my experience, this could be related to anti-virus/anti-malware software. You might need to whitelist the software.

Ubuntu mssql server Corruption detected in persistent registry: \SystemRoot\security.hiv

I started having this issue today on our production sql server. I have tried a variety of different fixes proposed online. We are using MSSQL server 2017 (14.0.3257.3-13). I'm out of ideas on what could be causing the server to crash. Below is the recent crash log.
This program has encountered a fatal error and cannot continue running at Sat Feb 1 14:21:21 2020
The following diagnostic information is available:
Reason: 0x00000007
Status: 0xc000014c
Message: Corruption detected in persistent registry: \SystemRoot\security.hiv.
Stack Trace:
000000006b137250
000000006b1345bf
000000006b1347a3
000000006b1337d3
000000006b1326f2
000000006b175c31
Process: 8815 - sqlservr
Thread: 8819 (application thread 0x4)
Instance Id: e5a2f812-0426-4d92-b9b2-1db1e60d957c
Crash Id: 60073e70-4042-4275-9fcd-a05ae84d26f5
Build stamp: 9726a6583fe7826f57b03fd1c7adf12bebe7692cb64630fccb0541c06820af4d
Distribution: Ubuntu 16.04.6 LTS
Processors: 9
Total Memory: 8589934592 bytes
Timestamp: Sat Feb 1 14:21:21 2020
Last errno: 2
Last errno text: No such file or directory
Thank you for the ideas, Toret.
I have faced the same issue, but I solved it just by deleting the security.hiv file.
rm /var/opt/mssql/.system/system/security.hiv
After that the mssql-server service started normally.
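Deleting the hive is irreversible, so a more cautious variant of the same fix is to move the file aside first. The helper below is only a sketch: the function name move_aside and the .bak timestamp suffix are illustrative and not part of any mssql tooling, and the path in the usage comment is the one quoted in the answer above.

```shell
# Hypothetical helper: move a file aside with a timestamp suffix
# instead of deleting it. Prints the backup path on success.
move_aside() {
    target=$1
    # Refuse to continue if there is nothing to move.
    [ -e "$target" ] || { echo "not found: $target" >&2; return 1; }
    backup="$target.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$target" "$backup" && printf '%s\n' "$backup"
}

# Example (run as a user allowed to write to the directory,
# and stop the mssql-server service first):
#   move_aside /var/opt/mssql/.system/system/security.hiv
```

If the service still fails to start afterwards, the original file can be restored by renaming the backup to its old name.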
I worked through multiple solutions proposed online, and none of them worked. Some of the things I tried:
Upgrading mssql-server to latest version.
Repairing missing files or dependencies.
Changing access permissions to the directory.
Elevating access permissions for the mssql user.
Changing user access to root for the .hiv files located in the mssql .system/system folder
The only way for me to get it working was to:
Delete all the folders manually from /var/opt/mssql/ except for the data folder.
Re-link python from 3.5 to 2.7 (commands below).
Downgrade mssql-server to Microsoft SQL Server 2017 14.0.3192.2.
Run sudo /opt/mssql/bin/mssql-conf setup
Python re-link:
sudo rm /usr/bin/python
sudo ln -s /usr/bin/python[version] /usr/bin/python
After that everything worked again.
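For the python re-link step, it can help to record what the link pointed to before replacing it, so the change is easy to undo. This is a sketch under assumptions: the function name relink is made up for illustration, and it relies on the -n option of ln, which GNU and BSD ln provide but POSIX does not require.

```shell
# Hypothetical helper: point LINK at TARGET, printing the old target first
# when LINK is already a symlink.
relink() {
    link=$1
    target=$2
    if [ -L "$link" ]; then
        echo "previous target: $(readlink "$link")"
    fi
    # -s: symbolic, -f: replace an existing entry, -n: do not follow LINK
    ln -sfn -- "$target" "$link"
}

# Example, assuming the interpreters live in /usr/bin (needs root):
#   relink /usr/bin/python /usr/bin/python2.7
```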

Configuring MSSQL Server on ubuntu - Cannot open or read the persistent registry: \SystemRoot\security.hiv

I'm using the following guide to install MSSQL server on my ubuntu 16.04 machine
https://learn.microsoft.com/en-us/sql/linux/quickstart-install-connect-ubuntu?view=sql-server-2017
when I'm running:
sudo /opt/mssql/bin/mssql-conf setup
no matter what kind of SQL Server edition I choose, I'm getting the following error:
Confirm the SQL Server system administrator password:
Configuring SQL Server...
This program has encountered a fatal error and cannot continue running at Mon Apr 1 16:06:07 2019
The following diagnostic information is available:
Reason: 0x00000007
Message: Cannot open or read the persistent registry: \SystemRoot\security.hiv.
Process: 19600 - sqlservr
Thread: 19604 (application thread 0x4)
Instance Id: 7ebfcf27-db60-460d-afd3-6d852b70069e
Crash Id: d99ba388-d323-43f3-b758-e116f42bb2e8
Build stamp: 70437f6583b8ef39b1ef70539ef84690980315dc7a4436c9c40015f28610e4aa
Distribution: Ubuntu 16.04.6 LTS
Processors: 8
Total Memory: 16673366016 bytes
Timestamp: Mon Apr 1 16:06:07 2019
Ubuntu 16.04.6 LTS
Capturing core dump and information to /var/opt/mssql/log...
Hint: You are currently not seeing messages from other users and the system.
Users in the 'systemd-journal' group can see all messages. Pass -q to
turn off this notice.
No journal files were opened due to insufficient permissions.
Hint: You are currently not seeing messages from other users and the system.
Users in the 'systemd-journal' group can see all messages. Pass -q to
turn off this notice.
No journal files were opened due to insufficient permissions.
/usr/bin/tail: cannot open '/var/log/syslog' for reading: Permission denied
Attempting to capture a dump with paldumper
Captured a dump with paldumper
Core dump and information are being compressed in the background. When
complete, they can be found in the following location:
/var/opt/mssql/log/core.sqlservr.04_01_2019_16_06_07.19600.tbz2
Initial setup of Microsoft SQL Server failed. Please consult the ERRORLOG
in /var/opt/mssql/log for more information.
I also found this post, where someone appears to have had a similar problem, but sadly it has no solution.
Does anyone know how to solve my problem?
Thank you
Edit:
After implementing the answer I got another error:
Confirm the SQL Server system administrator password:
Configuring SQL Server...
Initial setup of Microsoft SQL Server failed. Please consult the ERRORLOG in /var/opt/mssql/log for more information
To clean up the mess I had in the log folder, I decided to delete it completely using
sudo rm -rf /var/opt/mssql/log
and re-ran the setup. Apparently that solved my last problem, and finally:
Setup has completed successfully. SQL Server is now starting.
You'll find further information in
/var/opt/mssql/log
Mine said:
{
"reason": "0x00000007",
"processName": "sqlservr",
"pid": "5773",
"instanceId": "d7df749c-50e6-4f3b-b894-2aa7c743f33d",
"crashId": "281e772a-5946-4349-aa9e-671cd0a3772c",
"threadId": "5777",
"libosThreadId": "0x4",
"buildStamp": "70437f6583b8ef39b1ef70539ef84690980315dc7a4436c9c40015f28610e4aa",
"message": "Cannot open or read the persistent registry: \\SystemRoot\\lsa.hiv.",
"last_errno": "13",
"last_errno_text": "Permission denied",
"distribution": "Ubuntu 16.04.6 LTS",
"processors": "4",
"total_memory": "16732037120",
"timestamp": "Fri Apr 12 22:02:44 2019"
}
So I ran locate to see where "systemroot" is located:
locate security.hiv
/var/opt/mssql/.system/system/security.hiv
I didn't know which permissions should be applied, so I just gave read and write access to "others".
Then I did the same with
lsa.hiv
licensing.hiv
Then I re-ran
sudo /opt/mssql/bin/mssql-conf setup
After that, SQL Server starts, and the permissions for "others" are gone again.
By the way, you can run SQL Server without the service, and it works even if the service fails:
/opt/mssql/bin/sqlservr
In my case (#Mine) it was not only licensing.hiv.
My /var/opt/mssql/.system/instance_id was somehow destroyed and there were more files with owner root.
After deleting /var/opt/mssql/.system/instance_id and changing all root.root files to mssql.mssql (chown mssql.mssql /var/opt/mssql/.system/system/*), I was able to rerun "/opt/mssql/bin/mssql-conf setup"
Afterwards mssql ran fine again.
A very good hint was that "/var/opt/mssql/.system/instance_id" runs on its own.
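Before changing ownership, it may be useful to see which files are actually owned by someone other than the service account. The sketch below wraps find with its standard -user test; the function name list_not_owned_by is illustrative, and the directory and user in the usage comment are the ones mentioned in the answers above.

```shell
# Hypothetical helper: print every entry under DIR that is not owned by OWNER.
# OWNER may be a user name or a numeric user ID.
list_not_owned_by() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}

# Example (may need root to read the directory):
#   list_not_owned_by /var/opt/mssql/.system mssql
```

An empty result suggests ownership is not the problem, in which case the permission bits or a damaged file are more likely candidates.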

Linux MS SQL Server evaluation expired, can't upgrade to developer

When I installed MS SQL Server for Linux half a year ago, there was no way to choose between evaluation and developer. Now the evaluation period has expired and I can't install a developer version. I don't care about any of my databases and I have tried to remove SQL Server before installing it again. The installation is fine but when I run mssql-conf setup I get the following:
Configuring SQL Server...
Error: The evaluation period has expired.
This program has encountered a fatal error and cannot continue running.
The following diagnostic information is available:
Reason: 0x00000001
Signal: SIGSEGV - Segmentation fault (11)
Stacktrace: 0000564434051ee7 00007f9892387b20 00005644340236c2
000056443404a8db 000056443404a059
Process: 7228 - sqlservr
Thread: 7253 (application thread 0x1060)
Instance Id: 357ebf86-214d-4100-b14f-cb62b380917e
Crash Id:
Build stamp: 3db4cdd88f9bbf816f82e0ab6e17825a0a0f8b2ef98a5c67b521be0ed19c297c
/opt/mssql/lib/mssql-conf/invokesqlservr.sh: line 15: 7227 Aborted sudo -EH -u mssql /bin/bash -c "$CMDLINE"
Setup has completed successfully. SQL Server is now starting.
The last line is wrong; SQL Server does not start.
I assume that uninstallation leaves some trace of my old evaluation that is detected when I try to set up the new developer installation. Does anyone know if there is a way to get rid of whatever is blocking the new install?
TIA,
Gunnar
Shane's comment answered the question. Replacing the repository according to https://learn.microsoft.com/en-us/sql/linux/quickstart-install-connect-suse did the trick!

TFSBuild / Release Manager: The directory name is invalid

I have a continuous integration build that used to run fine, but it began giving me the following exception on the build agent:
Process each ConfigurationsToRelease
Release the build
Run the Release Management build Process for the current configuration:
Exception:
Exception Message: The directory name is invalid (type Win32Exception)
Exception Stack Trace:
Server stack trace:
at System.Diagnostics.Process.StartWithCreateProcess(ProcessStartInfo startInfo)
at Microsoft.TeamFoundation.Build.Workflow.Activities.InvokeProcess.ProcessWrapper.Start()
at Microsoft.TeamFoundation.Build.Workflow.Activities.InvokeProcess.InvokeProcessInternal.RunCommand (AsyncState state)
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[]
args, Object server, Object[]& outArgs
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg,
IMessageSink replySink)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
at System.Func`2.EndInvoke(IAsyncResult result)
at System.Activities.AsyncCodeActivity`1.System.Activities.IAsyncCodeActivity.FinishExecution
(AsyncCodeActivityContext context, IAsyncResult result)
at
System.Activities.AsyncCodeActivity.CompleteAsyncCodeActivityData.CompleteAsyncCodeActivityWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager)
This is using the default build template, and the failure seemed to start at random. The release never actually reaches Release Management, and no exception or rollback is recorded there. The build seems to die at the point where it should be calling Release Management.
I've checked the drop folder and everything is there as it should be. Permissions are still correct. I don't know what folder it's looking for.
Has anyone had any experience with this or any ideas of where to begin looking?
Install the RM Client on the build server.
Somehow, the RM Client had gotten removed from the build server in between releases. Going to have a talk with IT about that.
Thank you @Daniel Mann for catching that.
Update:
After upgrading to Release Management 2015, I received the same error. The ReleaseTfvcTemplate.12.xaml was updated. Make sure to copy the new template to your BuildProcessTemplates (or update your custom templates) from "C:\Program Files (x86)\Microsoft Visual Studio 14.0\Release Management\Client\bin"
After upgrading to RM 2015 I had to update my build templates as described in the comment above by abest.
The only changes are the version number in the registry paths.
