Autohotkey - How to detect all input areas/checkboxes in an application? - screen-scraping

Is there a way to detect input areas such as textboxes and checkboxes within an application? I want to label each input area with a number so I can jump between input fields with AHK using my keyboard.
For example: Once the script is activated and active window is Google Chrome, Chrome could have its address bar labeled #1. When I press "1", the cursor will be directed to that area.
I'm basically trying to create a workaround for applications that are not very keyboard friendly.

Most Windows applications use standard windows elements.
For these...
https://autohotkey.com/docs/commands/WinGet.htm - with the ControlList parameter, gets a list of all standard controls.
For those:
https://autohotkey.com/docs/commands/ControlGet.htm - can get the type of control, and
https://autohotkey.com/docs/commands/ControlGetPos.htm - can get position and dimensions of the control.
Some can also be controlled through COM: https://gist.github.com/kheybot/7026077#automation-of-office-applications
Commandline and console programs can sometimes be communicated with directly, using the standard streams (STDIN, STDOUT, STDERR, LPTn, PRN, NUL), or you can communicate with the terminal that displays the program using COM or WSH:
https://gist.github.com/kheybot/7026077#interact-with-command-line
This is important for a lot of legacy data-entry programs.
Browsers (eg Chrome), unfortunately, can't use these heavyweight components, because there may be far too many on a page, but there are other options for communicating with them, such as COM, DDE, etc to communicate with the DOM:
https://gist.github.com/kheybot/7026077#browser-automation
For a web browser, I'd be inclined to go for a hybrid approach, combining AHK-handling of the web browser's input areas (address bar, etc) with a Greasemonkey/Tampermonkey script to handle input fields within the web page itself - the Javascript will be better able to handle input areas using the DOM than any screen-scraping software could. There's also the possibility of using a functional-testing suite like Selenium for automation, and using the browser's plug-in functionality to write an extension to handle its UI.
This would mean that you now have TWO programming problems, of course...
Java applications, Flash applications, HTML5 applications, some graphic design software, and just about all computer games are essentially just graphics, with no way of externally identifying controls.
For these, you have to use basic screen scraping techniques: http://www.autohotkey.com/docs/commands/ImageSearch.htm and http://www.autohotkey.com/docs/commands/PixelSearch.htm to identify specific areas, which can only really be done by individually programming the specific control.
One option for generic detection, though, is to have something that detects shadows (drop shadows, buttonized components, etc) and allows you to tab between and send a click to the components detected that way. Unfortunately, modern flat design meant this won't always work, so you could also try searching for flat-colored rectangles... except sometimes they have curved corners. Because graphic designers hate people.
At this point, you will hopefully see that what you have here is an infinite rabbithole of fractal complexity.
You can make a simple ControlGet solution which doesn't work for a lot of applications you would use regularly... or you can create a hybrid approach that targets many applications individually, while also trying to have a generic solution for unrecognized apps.
If you are creating this for your own use, I'd say aim for making it work with the apps you know and use regularly, and that should be enough.
If you're writing it as accessibility software for others to use, I'd say aim for having it user-configurable for each application: let them control what input element they want to click, and in what order, because auto-detection will never work perfectly, and will only rarely pick the ideal solution.

The answer is yes, if the number of check boxes and their position in the application is fixed and you know on which machine the automation takes place.
Please research ImageSearch on how to locate them from screenshots.
If you know the X/Y position of the checkbox in the window, you can also use PixelGetColor to check if a check is visible or not.
You should also examine your application with the included AutoIt Spy. This program shows you, what it can see in the application window.
To get your labelling, checkout the Gui commands. If you make you gui transparent and don't give focus, you can write labels on top of the application.

Related

How do I avoid MDIParent form from resizing

I am designing a Windows Form app. I have an MDIParent form that loads in a maximized state, and loads its child forms in a maximized state as well. However, when I open an OpenFileDialog, or any datareader object, the MDIParent shrinks to a smaller size with all its forms and controls.
This solution Opening child form is causing mdiform to change size and shrink does not apply/work in my situation.
Also this solution https://support.microsoft.com/en-nz/help/967173/restoring-a-maximized-or-minimized-mdi-parent-form-causes-its-height-t did not work for me.
Some background: I have seen this behaviour in almost all my WinForm applications but I have never been keen to sort it out. I was able to narrow down to the causes as highlighted above when I started investigate it. Some posts are describing it as a windows bug, but it has existed for as long as the screen resolutions started going above 1024 (VS 2010) for my case. I hoped it is not just a windows bug...
I hoped it is not just a windows bug...
Feature, not a bug, but it is not one that Winforms programmers like very much. Notable is that there have been several questions about mystifying window shrinkage in the past few months. I think it is associated with the release of Win10 Fall Creators edition. It has deep changes to the legacy Win32 api layer and they've caused plenty of upheaval.
In your specific case, the "feature" is enabled by a shell extension. They get injected into your process when you use OpenFileDialog. The one that does this is very, very evil and does something that a shell extension must absolutely never do. It calls SetProcessDPIAware(). Notable is that it might have been written in WPF, it has a very sneaky backdoor to declare itself dpiAware. Just loading the PresentationCore assembly is enough. But not otherwise limited to WPF code, any code can do this and that might have been undetected for a long time.
One way to chase down this evil extension is by using SysInternals' AutoRuns utility. It lets you selectively disable extensions. But there is also a programmer's way, you can debug this in VS.
Use Project > Properties > Debug tab > tick the "Enable native code debugging" checkbox. Named slightly different in old VS versions btw. Then Debug > New Breakpoint > Function Breakpoint. Function name = user32!SetProcessDPIAware, Language = C. You can exercise this in a do-nothing WPF app to ensure that everything is set correctly. For completeness you can also add the breakpoint for SetProcessDPIAwareness, the new flavor.
Press F5 to start debugging and trigger the OpenFileDialog.ShowDialog() call. The breakpoint should now hit, use Debug > Windows > Call Stack to look at the stack trace. You typically will not see anything very recognizable in your case since the evil code lives in a DLL that you don't have a PDB for. But the DLL name and location (visible in Debug > Windows > Modules) ought to be helpful to identify the person you need to file a bug with. Uninstall it if you can live without it.
Last but not least, it is getting pretty important to start creating Winforms apps that are dpiAware so such a bug can never byte. You kick this off by declaring your app to be dpiAware so DPI virtualization is disabled. Plus whatever you need to do in your code to ensure the UI design scales properly.

Notification in screen corner

I need to create a small notification in the right-bottom corner of the screen. It should provide the following functionality:
Should NOT change the current focus.
Should allow me to put some text in it.
Should appear (and stay if possible) on top of all windows.
Can you suggest using something? The less installing required the better.
Well, there are a few ways to do it.
Roll your own
Use the infrastructure of the desktop environment
Naturally, #2 is going to be more reliable — if you know what the desktop environment you're targeting is.
You mention Linux, so let's look at Gnome. The two most popular (?) Linux-based operating systems are the Red Hat/Fedora/CentOS family and Ubuntu, both of which are based on Gnome 3.
Gnome 3's Notifications;
Do not change the keyboard focus
Allow text (and more)
Appear for a moment above other windows, but then tuck away at the bottom of the screen after a bit; but, can be called back up by mousing over their icons.
Plus, there's nothing to “install” — unless you're running an unusual build, the stock distributions all include the Notification support you want already.
The documentation is found on the Developer.GNOME.org web site, here.
If you are not running on a “normal” Linux distribution, you still have options.
Install libnotify, and enough Gnome infrastructure to let it work.
Re-inventing the wheel…
In the latter case, you'll want to:
Create a top-level X Window;
Set flags on it to ask the Window Manager to please* keep it on top, not decorate it
with the usual resize and title decorations, and so forth;
and set up its contents on your own.
Some documentation on providing hints to the window manager can be found on FreeDesktop.org.
*- the window manager, however, is free to ignore your hints, if it chooses.

Avoiding all system messages and messages from other software

Here is the situation. The company I work for builds this piece of software in c that can make a Windows computer act a bit like a TV. Essentially, our piece of software is meant to be played full screen and content is displayed from the internet without the user having to ever touch the computer again.
The problem is that once in a while, the system brings up pop-ups like "Your Windows system is ready for an upgrade." or "Please renew your Norton subscription" etc. which the user has to periodically and manually remove.
Is there a way to display content full screen without being bothered by those warnings?
Yah, whether or not the development community agrees, Microsoft has several standards for when and why it might be acceptable to have exclusive use of the monitor.
The most official strategy is to use DirectX in exclusive mode. This is what games do, what windows media player does in full screen video with hardware acceleration enabled, etc... If your application is multimedia intensive (as suggested by TV like functionality), you should probably be using DirectX too. Besides giving you the exclusive display access it will also increase your applications performance while lowering the CPU load (as it will overload graphics work to the video card when possible).
If DirectX is not an option, there are a great number of hacks available that seem to all behave differently between various generations of windows operating systems. So you might have to be prepared to implement several techniques to cover each OS you plan to support.
One technique is to set your application as the currently running screensaver. A screensaver if really just an EXE renamed to SCR with certain command line switches it should support. But you can write your own application to be such a screensaver and a little launcher stub that sets it as the screensaver and launches it. Upon exit the application should return the original screensaver settings (perhaps the launcher waits for the process to exit so that it returns the settings in both graceful exits and any unplanned process terminations ie: app crash). I'm not sure if this behavior is consistent across platforms though, you'll have to test it.
Preventing other applications from creating window handles is truly a hack in my opinion and pretty bad one that I wouldn't appreciate as a customer of such software.
A constant BringWindowToTop() call to keep you in front is better (it doesn't break other software) but still a little hack-ish.
Catch window creation messages with a global hook. This way you can close or hide unwanted windows before they become visible.
EDIT: If you definitely want to avoid hooks, then you can call a function periodically, which puts your window to the top of the z-stack.
You could disable system updates http://support.microsoft.com/kb/901037 and remove the norton malware.
You could also connect a second screen so that the bubbles appear in the the first monitor.
Or you rewrite it for linux or windows ce.
One final option is to install software that reconfigures your os into a kiosk http://shop.inteset.com/Products/9-securelockdown.aspx
If you don't need keyboard or mouse input, how about running your application as a screensaver?
A lot of thoses messages are trigged/managed by Windows Explorer.
Just replace it with your dummy c#/winform.
By changing the registry value
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon]
"Shell"="Explorer.exe"
You can specify virtually any exe as an alternative to explorer.exe
That's the way all windows based (embedded) system (ATM & co) do.
There's still few adjustment (disable services you dont need / dr watson & others), and of course, you'll want to keep a "restart explorer.exe" backdoor.
But that's a good start

prevent screen capture

i am developing a video player i silverlight
i wanna something to prevent recording or screen capturing
i thought about hacking the windows APIs and stop my program from running if there was any of those capturing software asking the user to close it first but i donno how to do this
is there another solution ??!!!!
It's simple not possible. If you try it, you're only going to annoy people.
Even 'hacking the windows API' would not work, since the OS itself could be run inside a VM.
I hate to be a downer, but task is impossible to fully accomplish.
If you were somehow able to hook the keyboard (from a silverlight app no-less) I would certainly hope that whatever AV the user is running would throw up some red flags.
Also what if the user doesn't use the standard (alt)+prtscr? A third-party tool might use a different key-combo. Also, I've written a screen-grabber with the GDI+ API, and there's no way to disable something that low-level.
What about attached capture-cards? What if your app is running in a VM or over remote-desktop?
If you are that deeply concerned about protection your HD content, watermark it, or make the user pay for it first.
All-in-all, as soon as your content's data enters your user's computer, they can duplicate it.
You could go about using a key hook system, stopping the user pressing the print screen key on the keyboard, that would be a start. There aren't many systems which stop users from print screening video specifically. You might want to try just watermarking your video instead? At least then people know that the video was originally sourced from you.
The solution is not to allow your application to run on a computer, but instead target a device such as a phone. Computers will always allow some kind of screen capture and video capture but this is much harder and less likely to be tackled if you restrict to only playing on certain devices.
How badly do you need this? There are many ways to defeat screen capture protection: for instance, aiming a video recorder at the computer screen (or looping output to a TV with a capture card, etc. etc. etc.)
Go for a commercial solution if you really really need this: don't have any experience with those myself, however.

Listing and finding windows on OS X

I am trying to do some stuff on OS X using the carbon api, but I can't find anything I am looking for on google or the Apple development website.
Ideally I would like to find a function that finds the window at a certain location on screen. It seems that there are similar functions, but one of them says that it only finds windows in the current process, and the other says that it is for locating the destination of mouse clicks.
Assuming that there is no way to do that, how would I go about iterating through all the windows on the screen. Finding information about how the OS X window manager works is quite difficult, because it has no name, and any google search is overpowered by referenced to the operating system Windows. Does it have nested windows? What is a window list? Is there only one? does each process have one? can you create arbitrarily many of them? I tentatively guess that GetWindowList is what I am looking for, but there is no example, and the documentation is all vague "Gets the next window", without any explaination of the abstraction or example code.
If someone could either explain how I could do this, or how the window manager sees things, or point me to somewhere I could read about it, that would be great!
I think what you're looking for is Quartz Window Services, part of the Core Graphics framework. You'll probably want to start with the CGWindowListCreate() function to get a list of ID numbers for the windows on screen, which you can then use to get further information about each individiual window.

Resources