Context:
I've been reading the code of some GUI libraries on 64-bit Linux.
I've always just used such libraries (or written headless applications). Now it is time to move on and complete my understanding.
Question:
I am not sure how the system knows when one clicks a button in a GUI app.
It seems that poll/select/epoll helps, but I don't get the whole picture.
Here is what I think:
When the GUI is created, it knows where each button's pixels are, so it attaches each of them to an event handler (epoll...), OR registers just one callback to react to any click in this app.
When I click a button, epoll calls the callback for this application, which manages the click events; the callback iterates over the list of buttons to find the one that was hit.
Of course, there are optimisations, like dividing the screen into multiple squares, for example, and many other things.
But am I correct? Is this the logic under the hood? Is X11 more involved?
Thanks
Ok, after your comment: I'll bite:
No, you are not right.
But how does that increase your knowledge now?
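For what it's worth, the short version is this: the X server sends a ButtonPress event to your application over its socket connection; poll/select/epoll inside the toolkit's main loop only reports that this socket has data to read, and the toolkit itself does the coordinate-to-widget hit-testing. Here is a minimal Xlib sketch of that flow (assuming Xlib is available and you link with -lX11; the hard-coded button rectangle is purely illustrative, a real toolkit walks its widget tree instead):

```c
/* Minimal Xlib sketch (not a toolkit): shows where poll() and the
 * click-to-widget mapping actually happen.
 * Build with: cc click.c -lX11 */
#include <stdio.h>
#include <poll.h>
#include <X11/Xlib.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);          /* socket to the X server */
    if (!dpy) return 1;

    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 300, 200, 0, 0, 0xffffff);
    XSelectInput(dpy, win, ButtonPressMask | ExposureMask);
    XMapWindow(dpy, win);
    XFlush(dpy);

    /* A "button" is just a rectangle the application itself remembers. */
    int bx = 20, by = 20, bw = 100, bh = 30;

    struct pollfd pfd = { .fd = ConnectionNumber(dpy), .events = POLLIN };

    for (;;) {
        /* poll()/epoll only says "the X socket is readable"; it knows
         * nothing about buttons. */
        poll(&pfd, 1, -1);

        while (XPending(dpy)) {
            XEvent ev;
            XNextEvent(dpy, &ev);               /* read one decoded event */
            if (ev.type == ButtonPress) {
                int x = ev.xbutton.x, y = ev.xbutton.y;
                /* The toolkit (here: us) does the hit-testing. */
                if (x >= bx && x < bx + bw && y >= by && y < by + bh)
                    printf("button clicked\n");
            }
        }
    }
}
```

So epoll never knows about buttons; it only watches the file descriptor of the X connection, and everything above that point is the toolkit's job.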
Related
Is there a way to detect input areas such as textboxes and checkboxes within an application? I want to label each input area with a number so I can jump between input fields with AHK using my keyboard.
For example: Once the script is activated and active window is Google Chrome, Chrome could have its address bar labeled #1. When I press "1", the cursor will be directed to that area.
I'm basically trying to create a workaround for applications that are not very keyboard friendly.
Most Windows applications use standard Windows elements.
For these...
https://autohotkey.com/docs/commands/WinGet.htm - with the ControlList parameter, gets a list of all standard controls.
For those:
https://autohotkey.com/docs/commands/ControlGet.htm - can get the type of control, and
https://autohotkey.com/docs/commands/ControlGetPos.htm - can get position and dimensions of the control.
Some can also be controlled through COM: https://gist.github.com/kheybot/7026077#automation-of-office-applications
Command-line and console programs can sometimes be communicated with directly, using the standard streams (STDIN, STDOUT, STDERR, LPTn, PRN, NUL), or you can communicate with the terminal that displays the program using COM or WSH:
https://gist.github.com/kheybot/7026077#interact-with-command-line
This is important for a lot of legacy data-entry programs.
Browsers (e.g. Chrome), unfortunately, can't use these heavyweight components, because there may be far too many elements on a page, but there are other options for communicating with them, such as COM, DDE, etc. to talk to the DOM:
https://gist.github.com/kheybot/7026077#browser-automation
For a web browser, I'd be inclined to go for a hybrid approach, combining AHK handling of the web browser's input areas (address bar, etc.) with a Greasemonkey/Tampermonkey script to handle input fields within the web page itself; the JavaScript will be better able to handle input areas using the DOM than any screen-scraping software could. There's also the possibility of using a functional-testing suite like Selenium for automation, and using the browser's plug-in functionality to write an extension to handle its UI.
This would mean that you now have TWO programming problems, of course...
Java applications, Flash applications, HTML5 applications, some graphic design software, and just about all computer games are essentially just graphics, with no way of externally identifying controls.
For these, you have to use basic screen scraping techniques: http://www.autohotkey.com/docs/commands/ImageSearch.htm and http://www.autohotkey.com/docs/commands/PixelSearch.htm to identify specific areas, which can only really be done by individually programming the specific control.
One option for generic detection, though, is to have something that detects shadows (drop shadows, buttonized components, etc.) and allows you to tab between and send a click to the components detected that way. Unfortunately, modern flat design means this won't always work, so you could also try searching for flat-colored rectangles... except sometimes they have curved corners. Because graphic designers hate people.
At this point, you will hopefully see that what you have here is an infinite rabbithole of fractal complexity.
You can make a simple ControlGet solution which doesn't work for a lot of applications you would use regularly... or you can create a hybrid approach that targets many applications individually, while also trying to have a generic solution for unrecognized apps.
If you are creating this for your own use, I'd say aim for making it work with the apps you know and use regularly, and that should be enough.
If you're writing it as accessibility software for others to use, I'd say aim for having it user-configurable for each application: let them control what input element they want to click, and in what order, because auto-detection will never work perfectly, and will only rarely pick the ideal solution.
The answer is yes, if the number of checkboxes and their positions in the application are fixed and you know which machine the automation will run on.
Please research ImageSearch to learn how to locate them from screenshots.
If you know the X/Y position of the checkbox in the window, you can also use PixelGetColor to check whether a check is visible or not.
You should also examine your application with the included AutoIt Spy. This program shows you what it can see in the application window.
To get your labelling, check out the Gui commands. If you make your GUI transparent and prevent it from taking focus, you can write labels on top of the application.
My app has an annoying behavior that is causing problems for my customers.
The app has several points where I need to show a modal Dialog so users can fill in some fields and then close it, and the system follows its natural path.
At certain moments this works fine: the dialog is shown, the user interacts with it, closes it, and so on.
But at other moments (same code) the dialog doesn't appear automatically. The user needs to perform some external action on the device (such as changing its orientation, touching the center of the screen, or making a scroll gesture), an action that isn't intuitive at that moment.
This behavior makes users think my app has frozen.
It is clear to me that the dialog was called; it simply wasn't drawn on the screen.
I tried to read up on this problem and looked at similar questions, without success.
I guess the cause is related to the EDT.
In short: how can I show a modal Dialog without breaking the EDT rules?
And more specifically, how can I resolve this problem?
When I request that a dialog be displayed on the screen, I want it to really appear in 100% of cases. Today it works randomly.
Additional information:
My app still uses Java 5.
Do you recommend migrating to Java 8?
======= Additional information (1) ===========
This problem depends strongly on the device model.
On a Moto G3 (Android 6) the problem is the exception; it rarely occurs.
On my Galaxy Note 8 it is the opposite: it always occurs.
On a Lenovo Vibe5 (Android 6) it occurs frequently.
I added this information in case it helps complete the picture of the problem.
Additional question:
Is it possible to write a snippet that I can use as a template
to show a modal Dialog without breaking any EDT rule?
Turn on the EDT violation detection tool in the simulator, which should detect such issues. Inspect potentially problematic Dialog calls and post them specifically if you don't know how to fix them.
Java 8 is unrelated, although migrating a project is non-trivial.
This scenario has come up before, and I'm wondering if there is any way to determine where the actively running form is executing within the code. The problem is that I've inherited a very large application which I'm not totally familiar with yet, and I have it running through VS.NET 2010. I might have a particular screen up and think, "geez, it would be nice if I could start debugging when I do 'x'".
If this were a simple form with some buttons, I wouldn't even bother asking here; I'm not that much of a novice. But the time-consuming task is when I look at a tabbed screen in a large multi-project solution with drag-and-drop capabilities, right-click options, etc., and have to spend 5-10 minutes tracking down where to place a breakpoint to debug.
What I'm wondering is whether there is a way to have the WinForms app running via the IDE and do 'something' that tells VS.NET to break into the code on the next action (obviously without a breakpoint, because I don't know where to place one yet). This would save me a ton of time trying to track down which event is occurring in a not-so-simple form or series of forms.
I hope this makes sense...
Thanks!
Yes, that's somewhat possible. When you use Debug + Break All, the odds are 99.9% that you don't break into code that's part of your project. A WinForms app is normally idle, pumping the message loop and waiting for Windows to tell it that something happened. You'll break at the Application.Run() statement.
The trick is to then use Debug + Step Over. The program resumes running as normal. Then give a UI command (do 'x' in your question) and the debugger will break at the first statement of real code, typically at the start of the event handler for that command. It isn't exactly guaranteed that this code will be relevant; you might break at a MouseMove event handler, for example. So YMMV.
I have a WPF app that uses a non-WPF vendor library. My app does not receive any events that the library fires. I've been told that this is because I need a message pump.
In another (very similar) question, the accepted answer suggested using System.Windows.Threading.Dispatcher.Run().
When I add that call, however, my window won't pop up; the app is effectively backgrounded and I have to shut it down with Task Manager.
I'm really stumped here, and I'm not even sure how to investigate it. Any help would be greatly appreciated.
You already have one if you use WPF; there's no other way it could get any Windows notifications. Every WPF app starts life with a call to Application.Run() on the main thread. It is usually well hidden, auto-generated in the obj\Debug\App.g.cs source file. Application.Run() in turn calls Dispatcher.Run().
Your vendor is correct that without a message loop many COM components go catatonic. But since you already have one, you need to look for the problem elsewhere. Don't use the component on worker threads.
Basically, I have two applications that run sequentially (the second is started by the first, and the first exits immediately after). I'd like to pass ownership of a window created by the first application to the second application. The actual contents of the window don't need to be passed along; it's just being drawn into by DirectX.
Alternatively, but less desirably: is it possible to at least disable the window closing/opening animation, so that it at least looks like the desired effect is achieved?
(This is in C, using the vanilla Win32 API.)
Instead of a separate application, make a DLL that will be loaded by the first application and run within it.
I suspect that you're going to run into problems because the WindowProc function is located in the memory address space of the program that you're closing.
Also, a quick look at the second remark at the bottom of the documentation for RegisterClass doesn't seem to offer up much hope.
The only workaround that I can suggest for what you've described is to not close the first application until the second application is finished with the window in question.
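A rough sketch of that workaround, assuming the first application can simply keep running (possibly hidden) and pumping messages until the child exits; how the HWND is handed to the second application (command line, shared memory, etc.) is left out here:

```c
/* Sketch: the first app hides its window, launches the second app, keeps
 * pumping messages so the window (and its WindowProc) stays valid, and only
 * exits after the child has terminated.  Error handling omitted. */
#include <windows.h>

void RunChildThenExit(HWND hwnd, const wchar_t *childExe)
{
    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi;

    ShowWindow(hwnd, SW_HIDE);                 /* or leave it visible */

    if (!CreateProcessW(childExe, NULL, NULL, NULL, FALSE,
                        0, NULL, NULL, &si, &pi))
        return;

    /* Keep dispatching messages while waiting for the child process to
     * finish with the window. */
    for (;;) {
        DWORD r = MsgWaitForMultipleObjects(1, &pi.hProcess, FALSE,
                                            INFINITE, QS_ALLINPUT);
        if (r == WAIT_OBJECT_0)
            break;                             /* child exited */

        MSG msg;
        while (PeekMessageW(&msg, NULL, 0, 0, PM_REMOVE)) {
            TranslateMessage(&msg);
            DispatchMessageW(&msg);
        }
    }

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    DestroyWindow(hwnd);                       /* now it is safe to go away */
}
```

MsgWaitForMultipleObjects lets the first process keep servicing its WindowProc while it waits, which is exactly why the window stays usable for the child.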
You can use API hooking to make your DLL capture the Windows API calls sent to the application window and respond as if your DLL were the Windows DLL.
For more information about hooking, check:
Hooks Overview
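If the hooking meant here is the Windows hook mechanism from that overview (SetWindowsHookEx), the rough shape is a hook procedure exported from a DLL that gets injected into the thread owning the window. The DLL and function names below are made up for illustration:

```c
/* hookdll.c -- sketch of a WH_CALLWNDPROC hook DLL (names are illustrative).
 * The DLL is injected into the thread that owns the target window, so the
 * hook procedure runs inside that process. */
#include <windows.h>

static HHOOK g_hook;   /* the hhk argument to CallNextHookEx is ignored on
                          current Windows versions, so this is mostly cosmetic */

LRESULT CALLBACK CallWndHook(int code, WPARAM wParam, LPARAM lParam)
{
    if (code == HC_ACTION) {
        const CWPSTRUCT *cwp = (const CWPSTRUCT *)lParam;
        /* inspect or react to cwp->hwnd / cwp->message / cwp->wParam here */
        (void)cwp;
    }
    return CallNextHookEx(g_hook, code, wParam, lParam);
}

/* Called by the controlling application with the id of the thread that owns
 * the window (GetWindowThreadProcessId) and this DLL's own module handle. */
__declspec(dllexport) BOOL InstallHook(DWORD threadId, HINSTANCE dllModule)
{
    g_hook = SetWindowsHookExW(WH_CALLWNDPROC, CallWndHook, dllModule, threadId);
    return g_hook != NULL;
}
```

The controlling application would LoadLibrary this DLL, resolve InstallHook with GetProcAddress, and pass the thread id from GetWindowThreadProcessId(hwnd, NULL); whether that is enough to truly keep the window alive after its creator exits is a separate problem, as noted above.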