Issue
In my current project I am implementing some trivial automated tests for a video call application which runs on Android, iOS (with Unity) and Windows (WPF). To create those automated tests I decided to use Appium for several reasons. The most important one was that there is a framework which makes it easy to automate Unity apps [1]. During the login process the system's default browser is embedded in the current view of the application. The user has to enter their e-mail address and password. This is the point where I am struggling. I try to send the following string:
String eMail = "system-administrator#e-mail.de";
But Appium types the following text into the text field of the embedded browser within the WPF client:
szstemßadministrator#eßmail.de
(This is because the German keyboard is configured as the default in the system settings. The result would be different if another layout were the default.)
I figured out that the .NET driver for Appium was designed for US keyboards [2]. So I thought the best way would be to send Unicode characters [3]. Furthermore, I tried to normalize the string before sending it, using java.text.Normalizer [4]. But even if I set the recommended desired capabilities [5][6], this does not affect the result described above. It looks like there is no solution for WPF available yet [7-11].
At the moment I have implemented a method that replaces the characters affected by the German layout with their US keyboard equivalents (e.g. 'z' -> 'y', 'y' -> 'z' or '-' -> '/'), using StringBuilder#replace. But this feels wrong and is very system dependent. Another workaround would be to configure shortcuts on the system, switch the keyboard layout during the test and switch it back afterwards [12]. But this feels wrong too, because the test should not rely on system settings or change them. Maybe my research was not sufficient and there is a way to force Appium to send the given string exactly as it appears in the code.
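For illustration, the character-replacement workaround can be sketched in a few lines. This is only a minimal sketch in Python, not the Java method from the project: the table covers just the swaps mentioned above (y and z exchanged, '-' sent as '/'), plus the upper-case y/z pair, which is an assumption. The names GERMAN_TO_US_KEYS and to_us_key_sequence are hypothetical.

```python
# Maps the character that should appear in the text field to the character
# that has to be sent when the driver assumes a US layout but the machine
# uses a German (QWERTZ) layout. Only a handful of keys are covered here.
GERMAN_TO_US_KEYS = {
    "z": "y",
    "y": "z",
    "Z": "Y",  # assumption: upper-case letters are swapped the same way
    "Y": "Z",
    "-": "/",
}

_TRANSLATION_TABLE = str.maketrans(GERMAN_TO_US_KEYS)


def to_us_key_sequence(text):
    """Return the string to send so that `text` is typed on a German layout."""
    return text.translate(_TRANSLATION_TABLE)
```

Calling to_us_key_sequence("system-administrator") gives the string that would have to be passed to the driver instead of the original one. The sketch also shows why the approach is system dependent: the table is only valid for one specific layout.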
Question
Which way could be the best to solve the issue described above?
Implement a method which replaces the characters, if necessary?
Create and use shortcuts to switch the keyboard of the system, if necessary?
Another one?
Sources
[1] Java Image Recognition
[2] Appium: Issue#380
[3] Convert string to unicode
[4] Appium: Multi-lingual Support
[5] Appium Send keys() function sending Chinese characters as part of English strings
[6] Appium: Desired Capabilities
[7] Force keyboard layout for Selenium2 sendkeys method
[8] convert at symbol ("#") to CharSequence
[9] How to send "special characters" with Python Actions Send_keys?
[10] Appium: Issue#215
[11] Appium: Issue#507
[12] Appium: WPF keyboard workaround
A quick question about what seems to me to be strange behavior.
I'm developing an app which, for the moment, is meant to run fully locally, with no network access at all.
Ever since I introduced some 3D graphics, I have been seeing this in the debugger log window (in light sky blue)
Please note that such (apparent?) CDN access was fully absent before the addition of a 3d scene.
Could someone tell me what Apple is requesting from a CDN and, most importantly, how I can prevent such accesses, which have not been explicitly authorized or configured by the end user?
Thank you
In this case "CDN" does not refer to a Content Delivery Network, but rather to Apple's CoreDisplay framework. The following command will show that these logs come from the framework:
strings /System/Library/Frameworks/CoreDisplay.framework/CoreDisplay | grep "client setup_"
Is there a way to detect input areas such as textboxes and checkboxes within an application? I want to label each input area with a number so I can jump between input fields with AHK using my keyboard.
For example: Once the script is activated and active window is Google Chrome, Chrome could have its address bar labeled #1. When I press "1", the cursor will be directed to that area.
I'm basically trying to create a workaround for applications that are not very keyboard friendly.
Most Windows applications use standard Windows elements.
For these...
https://autohotkey.com/docs/commands/WinGet.htm - with the ControlList parameter, gets a list of all standard controls.
For those:
https://autohotkey.com/docs/commands/ControlGet.htm - can get the type of control, and
https://autohotkey.com/docs/commands/ControlGetPos.htm - can get position and dimensions of the control.
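Once the control names and positions have been collected this way, numbering them for keyboard access is a small sorting problem. The following is a rough Python sketch of that step only, not AutoHotkey code: the Control tuple, the label_controls function and the sample coordinates are hypothetical, and the row grouping is a simple heuristic.

```python
from typing import NamedTuple


class Control(NamedTuple):
    name: str    # hypothetical control identifier, e.g. "Edit1"
    x: int       # left edge, in window coordinates
    y: int       # top edge, in window coordinates
    width: int
    height: int


def label_controls(controls, row_height=20):
    """Number controls in reading order: top to bottom, then left to right.

    The y coordinate is divided into bands of `row_height` pixels so that
    controls falling into the same band are treated as one row.
    """
    ordered = sorted(controls, key=lambda c: (c.y // row_height, c.x))
    return {str(number): c for number, c in enumerate(ordered, start=1)}
```

With controls at (10, 40), (200, 42) and (10, 80), the first two land in the same band and are numbered left to right before the third. Pressing a number key would then look up the matching entry and move focus to it.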
Some can also be controlled through COM: https://gist.github.com/kheybot/7026077#automation-of-office-applications
Commandline and console programs can sometimes be communicated with directly, using the standard streams (STDIN, STDOUT, STDERR, LPTn, PRN, NUL), or you can communicate with the terminal that displays the program using COM or WSH:
https://gist.github.com/kheybot/7026077#interact-with-command-line
This is important for a lot of legacy data-entry programs.
Browsers (eg Chrome), unfortunately, can't use these heavyweight components, because there may be far too many on a page, but there are other options for communicating with them, such as COM, DDE, etc to communicate with the DOM:
https://gist.github.com/kheybot/7026077#browser-automation
For a web browser, I'd be inclined to go for a hybrid approach, combining AHK-handling of the web browser's input areas (address bar, etc) with a Greasemonkey/Tampermonkey script to handle input fields within the web page itself - the Javascript will be better able to handle input areas using the DOM than any screen-scraping software could. There's also the possibility of using a functional-testing suite like Selenium for automation, and using the browser's plug-in functionality to write an extension to handle its UI.
This would mean that you now have TWO programming problems, of course...
Java applications, Flash applications, HTML5 applications, some graphic design software, and just about all computer games are essentially just graphics, with no way of externally identifying controls.
For these, you have to use basic screen scraping techniques: http://www.autohotkey.com/docs/commands/ImageSearch.htm and http://www.autohotkey.com/docs/commands/PixelSearch.htm to identify specific areas, which can only really be done by individually programming the specific control.
One option for generic detection, though, is to have something that detects shadows (drop shadows, buttonized components, etc) and allows you to tab between and send a click to the components detected that way. Unfortunately, modern flat design means this won't always work, so you could also try searching for flat-colored rectangles... except sometimes they have curved corners. Because graphic designers hate people.
At this point, you will hopefully see that what you have here is an infinite rabbithole of fractal complexity.
You can make a simple ControlGet solution which doesn't work for a lot of applications you would use regularly... or you can create a hybrid approach that targets many applications individually, while also trying to have a generic solution for unrecognized apps.
If you are creating this for your own use, I'd say aim for making it work with the apps you know and use regularly, and that should be enough.
If you're writing it as accessibility software for others to use, I'd say aim for having it user-configurable for each application: let them control what input element they want to click, and in what order, because auto-detection will never work perfectly, and will only rarely pick the ideal solution.
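A user-configurable setup of that kind could be as simple as a table per application. The sketch below is in Python purely for illustration; the window title fragments, control names and the resolve_target function are all hypothetical, and a real script would load the table from a file the user can edit.

```python
# Hypothetical per-application configuration: a fragment of the window title
# mapped to the controls the user wants to reach, in the order they chose.
USER_CONFIG = {
    "Notepad": ["Edit1"],
    "Data Entry": ["Edit3", "Edit1", "Button2"],
}


def resolve_target(window_title, key, config=USER_CONFIG):
    """Return the control assigned to number key `key` in the active window.

    Returns None when the application is not configured or the key has no
    control assigned, so the caller can fall back to generic detection.
    """
    if not key.isdigit():
        return None
    for title_fragment, ordered_controls in config.items():
        if title_fragment in window_title:
            index = int(key) - 1
            if 0 <= index < len(ordered_controls):
                return ordered_controls[index]
            return None
    return None
```

The point of the design is that the order comes from the user rather than from auto-detection, so pressing "1" in the configured data-entry program goes to the field they use first, not the one that happens to be top-left.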
The answer is yes, if the number of check boxes and their position in the application is fixed and you know on which machine the automation takes place.
Please research ImageSearch on how to locate them from screenshots.
If you know the X/Y position of the checkbox in the window, you can also use PixelGetColor to check if a check is visible or not.
You should also examine your application with the included Window Spy tool (AutoIt3 Window Spy). This program shows you what it can see in the application window.
To get your labelling, check out the Gui commands. If you make your GUI transparent and don't give it focus, you can write labels on top of the application.
I have a GUI application that is written using the Win32 API, and we need to launch a new GUI application when the user selects certain command menu items.
We decided to write the new application in PyQt and launch the PyQt application using the Python C API.
Everything is working fine except that the Parent window, through which we launch the PyQt Application, is not responding to some of the events when PyQt application is open. Once we close the PyQt Application it starts responding again to the key events.
I guess that once the PyQt GUI application is launched, the messages are somehow not passed to the parent window.
Inspecting with Spy++ I've found the following result:
Receives messages for:
- ALT key
- F1, F2 keys
- Mouse events
Does NOT receive messages for:
- CTRL key
- All other Fn keys
- All letter keys
- SHIFT, CAPS keys
Any thoughts on how to solve this problem would be appreciated.
I believe what you are trying to do -- operate two separate GUIs within a single process -- is not supported by any major operating system. A while back, I searched for a long time for ways to do this and never came up with any advice except "don't".
I'm surprised that missing keys are the only problem you have. I recommend finding a different solution before you discover more trouble (unless you can find some good evidence that this is at least supposed to work).
Could you perhaps spawn a new process to run the Qt event loop instead? Since you already have python embedded in the main process, this should be fairly easy--use python's built-in IPC to handle the communication between processes.
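As a rough sketch of that idea, the host can start a second Python interpreter and talk to it over its standard input and output. Everything here is an assumption for illustration: the child below is only a stand-in that echoes commands, where a real child would create the QApplication and run the Qt event loop, and the function names and the "quit" command are hypothetical.

```python
import subprocess
import sys

# Stand-in for the PyQt program. A real child would build its windows and run
# the Qt event loop; this one only acknowledges each command it receives.
CHILD_SOURCE = r"""
import sys
for line in sys.stdin:
    command = line.strip()
    if command == "quit":
        break
    print("ack " + command, flush=True)
"""


def start_gui_process():
    """Start the child interpreter with pipes for line-based messages."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def send_command(process, command):
    """Send one command line to the child and return its one-line reply."""
    process.stdin.write(command + "\n")
    process.stdin.flush()
    return process.stdout.readline().strip()


def stop_gui_process(process):
    """Ask the child to exit and return its exit code."""
    process.stdin.write("quit\n")
    process.stdin.flush()
    process.stdin.close()
    exit_code = process.wait(timeout=10)
    process.stdout.close()
    return exit_code
```

Because the child has its own process and its own message loop, the parent window's keyboard handling is left alone. Pipes are just one option; the multiprocessing module or a local socket would serve the same purpose.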
One solution is to build the QtWinMigrate module to create a QWinHost which supports parenting to a native HWND but unfortunately it is not part of the PyQt distribution.
You can find some sources here: https://github.com/glennra/PyQtWinMigrate.
This is what had to be done for Python integration in 3ds Max by Blur Studio. I am currently studying the C++ source code of QWinWidget to see if I can work out an alternative solution using Win32 calls.
I know that we can find the screen resolution of the client's monitor.
Is it possible to find out whether the type of device is Monitor or Projector?
If I want my web-based silverlight client to work only in Monitors and not on Projectors or vice versa, is it possible to enforce that?
The following SO question deals with a similar matter in the case of Java applets.
Detect Display Type (Projector) from within the browser
So what's the case with Silverlight?
I don't think even Windows knows that. Most of the time it's the display driver and only on laptops. So, I don't know of any easy way of doing that. You could use encrypted DRM to enforce HDCP but even then...no go more than likely. Silverlight is basically VB .net or C# so perhaps try to find an example in those languages.
EDIT: I did some more looking around and found no real API that provides a way to detect an output's type (monitor or projector).
Edit: In addition to the bounty, we're willing to pay $250 to have this bug fixed in the Firefox/Gecko codebase. Here is a simple test project (Visual Studio 2008 C#) that reproduces the problem.
Edit #2 we're willing to pay $600 to have this bug fixed. See above for sample project that reproduces the problem.
We have a Firefox (Gecko) ActiveX control on our C# Windows Form to display HTML.
When this Firefox ActiveX control is on our form, about 2-3% of our key presses don't make it through. Or rather, a different Windows message is sent:
We hold down the TAB key to tab through 3 regular WinForms text boxes. It will behave correctly 97% of the time. Spy++ tells us WM_KEYDOWN message is sent properly:
normal behavior http://judahhimango.com/images/normaltab.jpg
But randomly, maybe 2-3% of the time, the tab key (or other key) isn't processed right. Spy++ tells us WM_CHAR is being sent instead:
odd behavior http://judahhimango.com/images/screwytab.png
When the odd behavior occurs, either the key is not processed at all, or it is processed incorrectly (such as inserting a '\t' character into a textbox that doesn't support tab characters).
This only occurs if the Firefox ActiveX control is on our form.
Our question is: does Firefox/Gecko engine install some kind of keyboard hook that might cause these side effects? Or better yet, how do we fix this problem?
The WM_CHAR message is generated by TranslateMessage call, so a good place to start looking would be the TranslateMessage calls in the Gecko source code.
In the first example code you provided the function is imported only by two libraries - mozctl.dll and xul.dll. Since you claim that the same error happens also with GeckoFX we can take mozctl.dll out of the equation. That leaves us with xul.dll, so given the Gecko source code I would suggest to look into widget\src\windows\nsToolkit.cpp. I am not sure if the code is run if the engine is embedded, but if it is then the library starts a whole new message pump in different thread, which is bound to break.
Unfortunately I can't run or compile the code on my machine (Windows 7 x64 w/o the Mozilla ActiveX control installed), so I can't verify any of this with a debugger. Hope it helps someone to track it down further.
The root problem is that when Mozilla is embedded in another application, it incorrectly pumps Windows messages when it dispatches internal events. Mozilla uses an event system to coordinate across threads or to schedule deferred processing on a thread (see nsIThread, nsIEventTarget). If you embed a web page with a lot of active XMLHTTPRequests, for example, Mozilla will use its event dispatching interface to dispatch events back to javascript and it will pump windows messages as a side effect. Once Mozilla events are fully dispatched, it goes back to the main event loop.
When Mozilla pumps windows messages, it doesn't include the extra processing done by the application's event loop - IsDialogMessage(), TranslateMessage(), PreTranslateMessage(), or any other pre-processing are skipped when Mozilla gets into this state. Symptoms therefore include tab key presses getting inserted as a character instead of being used for dialog navigation, keyboard hotkeys being sporadically ignored, or custom message pre-processing being sporadically skipped. For example, the Outlook 2007/2010 "Compose" screen sporadically loses keystrokes because it relies on custom message pre-processing to handle keyboard input.
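The effect can be illustrated with a toy model. The Python sketch below is not Gecko or Win32 code; it only simulates two dispatch paths, one that pre-processes messages the way a host loop calling IsDialogMessage() would, and one that dispatches directly the way a nested pump does. All names in it are hypothetical.

```python
from collections import deque


def application_dispatch(message, log):
    """Host loop: pre-process first, so TAB becomes dialog navigation."""
    if message == "TAB":
        log.append("focus-next-control")
    else:
        log.append("char:" + message)


def nested_dispatch(message, log):
    """Nested pump: no pre-processing, so every key arrives as a character."""
    log.append("char:" + message)


def run_loop(messages, nested_positions):
    """Dispatch all messages; those at `nested_positions` are taken by the
    nested pump instead of the host loop."""
    queue = deque(messages)
    log = []
    position = 0
    while queue:
        message = queue.popleft()
        if position in nested_positions:
            nested_dispatch(message, log)
        else:
            application_dispatch(message, log)
        position += 1
    return log
```

Running run_loop(["a", "TAB", "TAB"], {2}) handles the first TAB as navigation and the second as a plain character, which mirrors the sporadic behavior described above: the outcome depends only on which pump happened to pick the message up.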
See https://bugzilla.mozilla.org/show_bug.cgi?id=582790 for a patch that addresses the problem.
I have Snoop Free and PSM Anti-Keylogger.
One of them detected firefox trying to install a Keyboard Hook.
Mozilla/Firefox file xul.dll attempt at installing at keyboard hook.
DENIED.
I noticed that you have implemented all of the interoperability yourself. Can you try this with the GeckoFX project and see if you get the same error? I use this project at work and haven't encountered any issues yet.