WPF application crashes when run in Jenkins service but not as a regular user

I am automating some software testing using Jenkins on our in-house test software. It is written in C# with WPF. I am using the version of the program that has no frontend, but it still makes calls to WPF in order to start the service. The test application is launched through Python scripts using subprocess.Popen.
When running under the Jenkins slave process I get the following error:
Unhandled Exception: System.ComponentModel.Win32Exception: The operation completed successfully
at MS.Win32.UnsafeNativeMethods.RegisterClassEx(WNDCLASSEX_D wc_d)
...
From my research, it seems that something is going wrong with the Windows atom tables. The weird thing is that I don't run into the same issue when running the Python scripts as a local user; it only crashes when the Jenkins service runs the scripts.
Is there some limitation on the local atom table for Windows services?
Is Jenkins hogging atom table entries?
Update 1:
I did some more research on the crash, and some resources said that with WPF there might be a Windows handle leak, so I inserted some PowerShell calls to check how many active handles there are on the system.
It doesn't look like there is a handle leak from any of the software under test. I am seeing around 50-60k handles while Jenkins is running the test scripts, which is consistent with what I was seeing while running them under my user account. It looks like Jenkins is interfering with WPF's ability to construct the program (the full call stack for the error points at the constructor called from Main()), but I have no idea why it would be doing that.
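The exact PowerShell calls are not shown above. As a rough sketch of how such a check could be made from the Python scripts, the helper below asks PowerShell to sum the Handles column of Get-Process. The function name `total_handles` and the injectable `run` parameter are illustrative assumptions, not part of the original scripts.

```python
import subprocess

# PowerShell expression that sums the Handles column over all processes.
PS_COMMAND = "(Get-Process | Measure-Object -Property Handles -Sum).Sum"


def total_handles(run=subprocess.run):
    """Return the system-wide handle count reported by PowerShell.

    `run` is injectable (hypothetical convenience) so the parsing can be
    exercised on a machine without PowerShell.
    """
    result = run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True,
        text=True,
        check=True,
    )
    # The sum is printed as a plain number followed by a newline.
    return int(float(result.stdout.strip()))
```

Calling `total_handles()` before and after each test step and logging the difference would show whether the count grows over a run.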
Update 2:
Some extra information, since I think it is relevant. Here is the full call stack:
Unhandled Exception: System.ComponentModel.Win32Exception: The operation completed successfully
at MS.Win32.UnsafeNativeMethods.RegisterClassEx(WNDCLASSEX_D wc_d)
at MS.Win32.HwndWrapper..ctor(Int32 classStyle, Int32 style, Int32 exStyle, Int32 x, Int32 y, Int32 width, Int32 height, String name, IntPtr parent, HwndWrapperHook[] hooks)
at System.Windows.Threading.Dispatcher..ctor()
at System.Windows.Threading.Dispatcher.get_CurrentDispatcher()
at System.Windows.Application..ctor()
at <TestingApplication>.ScriptHost.App..ctor()
at <TestingApplication>.ScriptHost.App.Main()
So what appears to be happening is the following:
Jenkins calls the python scripts. They perform setup and all pre-condition work such as bringing up the software under test
Python calls the TestingApplication through Popen
The executable starts and attempts to construct the application
WPF checks to see if there is a Dispatcher already on the thread, there is not
WPF attempts to create a Dispatcher since one doesn't already exist
Crash because a Dispatcher cannot be created
Again, this only happens when launched under the Jenkins service user.
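To make this failure easier to spot from the calling script, the Popen launch can capture the child's output and look for the exception text. This is a minimal sketch, assuming the unhandled exception message reaches the child's stderr; the function name `launch_and_check` and the timeout value are hypothetical.

```python
import subprocess


def launch_and_check(cmd, timeout=60):
    """Run `cmd` and return (exit_code, stderr_text, dispatcher_failed)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        _, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not leave a hung child behind on the build agent.
        proc.kill()
        _, err = proc.communicate()
    # The crash in the question is a Win32Exception raised from RegisterClassEx.
    failed = "Win32Exception" in err and "RegisterClassEx" in err
    return proc.returncode, err, failed
```

Logging `err` in the Jenkins console output keeps the full call stack next to the build that produced it.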

So here's what I did to fix the problem:
The software under test also uses WPF to create its displays, so I wanted to see whether I was running out of resources by having too many things open. There are 5 SUT applications. Luckily, one of the pieces of software we run is a console that I don't need open when running tests through Jenkins. By not launching the console application, I freed up enough resources for the test application to run.
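In the setup script, that change could look roughly like the sketch below. The executable names are made up, and the check relies on the assumption that Jenkins exports JENKINS_URL to build steps (it does so when the Jenkins URL is configured); any other marker that distinguishes a Jenkins run would work the same way.

```python
import os

# Hypothetical names for the applications under test.
SUT_APPS = ["app_a.exe", "app_b.exe", "app_c.exe", "app_d.exe", "console.exe"]
# Applications that are not needed for unattended runs.
OPTIONAL_APPS = {"console.exe"}


def apps_to_launch(env=None):
    """Return the applications to start, dropping optional ones under Jenkins."""
    env = os.environ if env is None else env
    # Assumption: JENKINS_URL is present only when Jenkins runs the script.
    if "JENKINS_URL" in env:
        return [app for app in SUT_APPS if app not in OPTIONAL_APPS]
    return list(SUT_APPS)
```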
What this doesn't answer is:
Why do I have fewer resources under a service than as a regular user?
Is this because of Jenkins/Python or because of Windows?
Is there a workaround for this problem?
Those questions could be asked separately, so I am posting this answer to close the question and to make the fix available to anyone with a similar problem. If you don't have the luxury of closing extra applications, I would suggest asking those questions yourself.

Related

Packaged shell extension killing application

I have a WPF application. To give it an identity to consume UWP APIs, I've added sparse package support. This installs / uninstalls / updates (we're not using MSIX) with my WPF application fine, and my app is running with an identity. It shows in task manager with a Package Name listed on my process.
Now I'm attempting to add context menu support following Microsoft's docs.
I've created a shell extension which will show when opening the context menu for any file and folder, which is pretty much a copy of their sample with different GetIcon(), GetTitle() and Invoke() implementations for IExplorerCommand.
I'm specifying this in the AppxManifest.xml (values anonymised):
<desktop4:Extension Category="windows.fileExplorerContextMenus">
<desktop4:FileExplorerContextMenus>
<desktop5:ItemType Type="*">
<desktop5:Verb Id="MyFileCommand" Clsid="file-guid"/>
</desktop5:ItemType>
<desktop5:ItemType Type="Directory">
<desktop5:Verb Id="MyFolderCommand" Clsid="folder-guid"/>
</desktop5:ItemType>
</desktop4:FileExplorerContextMenus>
</desktop4:Extension>
<com:Extension Category="windows.comServer">
<com:ComServer>
<com:SurrogateServer DisplayName="SSVerbHandler">
<com:Class Id="file-guid" Path="my-shell-extension.dll" ThreadingModel="STA"/>
</com:SurrogateServer>
<com:SurrogateServer DisplayName="SSVerbHandler">
<com:Class Id="folder-guid" Path="my-shell-extension.dll" ThreadingModel="STA"/>
</com:SurrogateServer>
</com:ComServer>
</com:Extension>
This works: my context menu entry is listed and performs the action as expected. But here's the issue: each time the context menu is opened for the first time, it kills the already running instance of my WPF application. By first time, I mean restarting explorer.exe and right-clicking on a file or folder.
My gut feeling is that this is related to the UWP side of things, because originally it would always kill my application when right-clicking to open a context menu. With a little trial and error, I solved that by configuring multi-instance support in my AppxManifest.xml:
<Package
...
xmlns:desktop4="http://schemas.microsoft.com/appx/manifest/desktop/windows10/4"
xmlns:iot2="http://schemas.microsoft.com/appx/manifest/iot/windows10/2"
IgnorableNamespaces="uap mp desktop4 iot2">
...
<Applications>
<Application Id="App"
...
desktop4:SupportsMultipleInstances="true"
iot2:SupportsMultipleInstances="true">
...
</Application>
</Applications>
...
</Package>
I'm hoping someone can suggest any troubleshooting ideas as I'm now struggling.
I've sprinkled some ol' trusty MessageBox functions within the shell extension in DllMain, DllCanUnloadNow and DllGetClassObject. But for that first load attempt, no message boxes are shown, no context menu item listed and my application is still killed.
I've poked around in the event viewer hoping to see any errors or warnings listed recently, plus in Applications and Service Logs\Microsoft\Windows\Appx* and Applications and Service Logs\Microsoft\Windows\AppModel-Runtime. Nothing has jumped out at me.
According to this SO answer, if there's an error with the shell extension itself, it may not show. That does fit, but I'm sceptical, given that it always works on subsequent attempts and given the previously solved UWP killing issue.
In the scenario where my application stops, using these PowerShell commands I get:
$process = Start-Process .\MyApp.exe -PassThru -Wait
$process.ExitCode
1
I do have crash logging in my app, but nothing gets logged. This is the only place Environment.Exit(1) is called within the WPF application.
I did try the silent process exit monitoring in GFlags on the WPF application, but I couldn't seem to trigger it except when I manually closed the application (Ignore Self Exits was unchecked for testing). I'm not sure if that's because the exit code is 1.
When the error occurs with Process Monitor running, I can see lots of ThreadExit events before a final ProcessExit with exit code 1. Would that imply it's exiting cleanly, with my application itself returning 1?
It's also worth mentioning that I've lived in managed / .NET land for the last decade or so. I don't have much experience with C++ (or unmanaged languages in general) or UWP, and this is the first time I've attempted to write a shell extension.

Selenium WebDriver without a Test Runner?

I'm not sure if this question is going to be closed for being too novice, but I thought I'd give it a shot anyway.
I am currently working on a Selenium automation framework which, though seemingly well built, runs its code by spawning threads. (The framework is proprietary, so I'm unable to share the code.)
Instead of using a test framework like JUnit or TestNG to run "tests", this framework uses a threaded approach. That is, the methods that read the datasheet, instantiate and execute the drivers, report the results, etc. are executed by starting a thread, the class of which is instantiated at various places in the code at runtime.
My concern is this: although it runs fine locally and produces reports and what have you, it is unable to pass or fail a "test", because it does not operate through a test runner.
Therefore, when this is put on a build pipeline, no "tests" would be executed, as there are no "tests" so to speak, and it loses its value for CI/CD as far as reporting build pipeline success or failure is concerned.
Am I justified/unjustified in my concerns? Why? And is there a workaround for this? At what ROI?
Resources or links shall be welcomed and beer shall be owed!! :-)
Cheers,
Danesh
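One possible workaround for the concern above, sketched here in Python rather than in the framework's own language, is to have the threaded code collect a pass/fail outcome per unit of work and turn the combined result into a process exit code that a pipeline step can act on. The names `run_checks` and `exit_code` are hypothetical, and this is only an illustration of the idea, not a substitute for a real test runner.

```python
import threading


def run_checks(checks):
    """Run each named check in its own thread and collect pass/fail results."""
    results = {}
    lock = threading.Lock()

    def worker(name, func):
        try:
            func()
            outcome = True
        except Exception:
            # Any exception (including a failed assertion) marks the check as failed.
            outcome = False
        with lock:
            results[name] = outcome

    threads = [
        threading.Thread(target=worker, args=(name, func))
        for name, func in checks.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def exit_code(results):
    """Return 0 when every check passed and 1 otherwise (or when nothing ran)."""
    return 0 if results and all(results.values()) else 1
```

A pipeline step that ends with `sys.exit(exit_code(results))` fails the build whenever any check failed, which is the signal a test runner would otherwise provide.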

Application.Dispatcher.UnhandledException and CurrentDomain.UnhandledException

I'm investigating a strange hang in a multi-threaded WPF application that happens randomly, and more readily on production machines than on development ones.
I have added Application.Dispatcher.UnhandledException and CurrentDomain.UnhandledException event handlers to log each and every exception and to keep the application from crashing, as it is a system shell.
I suspect the reason for the hang is that I'm handling those events and not allowing the application to crash. What makes me say this is that when I monitored the hung application in Windows Task Manager, I noticed CPU load, probably caused by background tasks in the application, and those background tasks were writing to the logs as expected.
The logger class writes the logs to the disk asynchronously using BeginInvoke.
The big problem is that every time the application hangs on a production machine, I find nothing in the log files.
Now I have two questions:
1. What is the firing order of those events, Application.Dispatcher.UnhandledException and CurrentDomain.UnhandledException?
2. Is there any way I can find out where my code is failing?
Note: I use a lot of Thread.Sleep in background tasks, as I manage the threads myself. I think I went wrong with this approach, and I'm considering rewriting all of my background tasks from scratch.
Any help will be much appreciated.
The order of these events is:
1. Application.DispatcherUnhandledException
2. CurrentDomain.UnhandledException
If you want to find out from the exception where your code is failing, check its StackTrace property. If you also deploy the PDB files with your application, the StackTrace will include line numbers.
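The same idea, logging the full stack trace from a last-chance handler and flushing it before anything else can go wrong, can be sketched in Python. This is an analogue and not the WPF API: `sys.excepthook` and `threading.excepthook` stand in for the two .NET events, and the logger name and function names are illustrative assumptions.

```python
import logging
import sys
import threading
import traceback

log = logging.getLogger("crash")


def _log_exception(exc_type, exc_value, exc_tb, origin):
    """Write the full traceback to the log and flush it synchronously."""
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    log.error("Unhandled exception in %s:\n%s", origin, text)
    # Flush now so the record is not lost if the process hangs or dies next.
    for handler in log.handlers:
        handler.flush()


def install_hooks():
    """Route unhandled exceptions from the main thread and worker threads to the log."""
    sys.excepthook = lambda t, v, tb: _log_exception(t, v, tb, "main thread")
    threading.excepthook = lambda args: _log_exception(
        args.exc_type,
        args.exc_value,
        args.exc_traceback,
        getattr(args.thread, "name", "unknown thread"),
    )
```

Writing and flushing inside the handler, rather than queuing the record for an asynchronous writer, is the design choice that matters here: a queued record can be lost if the process stops before the writer runs.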

Problem in process hooking

I have a process (say, MyProcessA). I hooked an exe and injected my DLL (MyDll.dll) into the process space of MyProcessA, so that even if it creates any number of child processes, they will be hooked as well. I have no problem hooking and injecting the DLL into the process. I have hooked all file- and process-dependent functions, but somehow I am not able to completely hook a setup program (any application setup). I suspect I am missing some process-related APIs, or it might be a UAC problem. Currently I am using CreateProcess (A and W), NtCreateProcess, and ShellExecute (A and W). What could be the problem?
I suspect that the answer is related to the Windows Installer service. I'm guessing that your hooks wouldn't catch any interactions with a service, which, even if launched as a result of Firefox's setup, is going to be created by a different system process. I haven't had much experience with Windows Installer, but the documentation here should have more details than you could possibly wish for, given the time to find it.
UAC might also cause you issues, but you should be able to rule that out by launching the hooking code with administrative privileges to start with.
Is this research for uni? Either way good luck, it sounds like an interesting problem.

How do I launch a WPF app from command.com? I'm getting a FontCache error

I know this is not ideal, but my constraint is that I have a legacy application written in Clipper.
I want to launch a new, WinForms/WPF application from inside the application (to ease transition). This legacy application written in Clipper launches using:
SwpRunCmd("C:\MyApp\MyBat.bat",0)
The batch file contains something like this command:
C:\PROGRA~1\INTERN~1\iexplore "http://QASVR/MyApp/AppWin/MyCompany.MyApp.AppWin.application#MyCompany.MyApp.AppWin.application"
It is launching a WinForms/WPF app that we deploy via ClickOnce. Everything went well until we introduced WPF into the application; until then, we were able to launch easily from the legacy application.
Since we have introduced WPF, however, we have the following behavior. If we launch via the Clipper application first, we get an exception when launching the application. The error text is:
The type initializer for 'System.Windows.FrameworkElement' threw an exception.
at System.Windows.FrameworkElement..ctor()
at System.Windows.Controls.Panel..ctor()
at System.Windows.Controls.DockPanel..ctor()
at System.Windows.Forms.Integration.AvalonAdapter..ctor(ElementHost hostControl)
at System.Windows.Forms.Integration.ElementHost..ctor()
at MyCompany.MyApp.AppWin.Main.InitializeComponent()
at MyCompany.MyApp.AppWin.Main..ctor(String[] args)
at MyCompany.MyApp.AppWin.Program.Main(String[] args)
The type initializer for 'System.Windows.Documents.TextElement' threw an exception.
at System.Windows.FrameworkElement..cctor()
The type initializer for 'System.Windows.Media.FontFamily' threw an exception.
at System.Windows.Media.FontFamily..ctor(String familyName)
at System.Windows.SystemFonts.get_MessageFontFamily()
at System.Windows.Documents.TextElement..cctor()
The type initializer for 'MS.Internal.FontCache.Util' threw an exception.
at MS.Internal.FontCache.Util.get_WindowsFontsUriObject()
at System.Windows.Media.FontFamily.PreCreateDefaultFamilyCollection()
at System.Windows.Media.FontFamily..cctor()
Invalid URI: The format of the URI could not be determined.
at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
at System.Uri..ctor(String uriString, UriKind uriKind)
at MS.Internal.FontCache.Util..cctor()
If we launch the application via the URL (in IE) or via the icon on the desktop first, we do not get the exception and application launches as expected.
The neat thing is that whatever we launch with first determines whether the app will launch at all. So, if we launch with legacy first, it breaks right away and we can't get the app to run even if we launch with the otherwise successful URL or icon. To get it to work, we have to log out, log back in, and start it from the URL or icon.
If we first use the URL or the icon, we have no problem launching from the legacy application from that point forward (until we log out and come back in).
One other piece of information is that we are able to simulate the problem in the following fashion. If we enter a command prompt using "cmd.exe" and execute a statement to launch from a URL, we are successful. If, however, we enter a command prompt using "command.com" and we execute that same statement, we experience the breaking behavior.
We assume it is because the legacy application in Clipper uses the equivalent of command.com to create the shell to spawn the other app. We have tried a bunch of hacks like having command.com run cmd.exe or psexec and then executing, but nothing seems to work.
We have some ideas for workarounds (like making the app launch on startup so we force the successful launch from a URL, making all subsequent launches successful), but they all are sub-optimal even though we have a great deal of control over our workstations.
To reduce the chance that this is related to permissions, we have given the launching account administrative rights (as well as non-administrative rights in case that made a difference).
Any ideas would be greatly appreciated. Like I said, we have some workarounds, but I would love to avoid them.
Thanks!
It sounds like the Presentation Font Cache service has trouble starting when the app is launched in this way.
If you have control over the client environment, you could try setting the Windows Presentation Font Cache startup to automatic instead of manual.
This is a shot in the dark made with incomplete information:
command.com and cmd.exe are quite different. AFAIK, command.com exists for legacy compatibility, so applications you run from it will run differently. I can't test anything to complete my post, because I believe that command.com runs in 16-bit mode, and 64-bit versions of Windows (which I'm running) don't support that mode anymore, so no more command.com for me.
That being said, there should be no difference when trying to run 32-bit applications (including managed applications).
I'm not aware of the limitations of your environment, but some things you may try are:
Rename your .bat to .cmd to make sure it starts with cmd.exe rather than command.com
Make your .bat start the program using the start console command
Have a non-WPF program invoke your WPF one with a saner environment
The problem is that the windir environment variable is not set when using command.com.
So, in your case, adding the line set windir=C:\Windows to the beginning of the bat file will solve the problem (assuming that your Windows installation is in C:\Windows).
An additional issue might be that the host application is running command.com in compatibility mode. The best approach is to list all the environment variables after running cmd.exe (using the set command) and compare them with the output of the set command run from your bat file.
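The comparison of the two set listings can be automated. The sketch below assumes each listing has been saved as text (for example with set > cmd_env.txt in each shell); the function names are hypothetical. Variable names are upper-cased before comparing, because Windows treats them as case-insensitive.

```python
def parse_set_output(text):
    """Turn the output of the `set` command into a dict with upper-cased names."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip blank lines and anything that is not NAME=value.
        if sep and name:
            env[name.strip().upper()] = value
    return env


def missing_variables(working, broken):
    """Names defined in the working shell but absent from the broken one."""
    return sorted(set(working) - set(broken))
```

Any name this reports, such as WINDIR, is a candidate to set explicitly at the top of the batch file.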
