Listing and finding windows on OS X - C

I am trying to do some stuff on OS X using the Carbon API, but I can't find anything I am looking for on Google or the Apple developer website.
Ideally I would like to find a function that finds the window at a certain location on screen. There seem to be similar functions, but one of them says that it only finds windows in the current process, and the other says that it is for locating the destination of mouse clicks.
Assuming that there is no way to do that, how would I go about iterating through all the windows on the screen? Finding information about how the OS X window manager works is quite difficult, because it has no name, and any Google search is overpowered by references to the operating system Windows. Does it have nested windows? What is a window list? Is there only one? Does each process have one? Can you create arbitrarily many of them? I tentatively guess that GetWindowList is what I am looking for, but there is no example, and the documentation is all a vague "Gets the next window", without any explanation of the abstraction or example code.
If someone could either explain how I could do this, or how the window manager sees things, or point me to somewhere I could read about it, that would be great!

I think what you're looking for is Quartz Window Services, part of the Core Graphics framework. You'll probably want to start with the CGWindowListCreate() function to get a list of ID numbers for the windows on screen, which you can then use to get further information about each individual window.
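For reference, here's a minimal C sketch of that approach (a sketch only, assuming macOS 10.5 or later). It uses CGWindowListCopyWindowInfo(), a sibling of CGWindowListCreate() that returns a dictionary of properties for each window; since the on-screen list is ordered front to back, the first window whose bounds contain the point is the topmost one at that location:

/* Build: clang windows.c -framework ApplicationServices */
#include <ApplicationServices/ApplicationServices.h>
#include <stdio.h>

int main(void) {
    CGPoint target = CGPointMake(200, 200);  /* screen point to hit-test */

    /* On-screen windows, ordered front to back. */
    CFArrayRef windows = CGWindowListCopyWindowInfo(
        kCGWindowListOptionOnScreenOnly, kCGNullWindowID);
    if (!windows) return 1;

    for (CFIndex i = 0; i < CFArrayGetCount(windows); i++) {
        CFDictionaryRef info =
            (CFDictionaryRef)CFArrayGetValueAtIndex(windows, i);

        CGRect bounds;
        CGRectMakeWithDictionaryRepresentation(
            (CFDictionaryRef)CFDictionaryGetValue(info, kCGWindowBounds),
            &bounds);

        if (CGRectContainsPoint(bounds, target)) {
            int windowID = 0;
            CFNumberGetValue(
                (CFNumberRef)CFDictionaryGetValue(info, kCGWindowNumber),
                kCFNumberIntType, &windowID);
            printf("Topmost window at the point has ID %d\n", windowID);
            break;
        }
    }

    CFRelease(windows);
    return 0;
}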

Related

C GTK2 - frustrated with GNOME documentation

I have been attempting to create an app using C in Code::Blocks on Win7.
Can anyone please point me to better documentation than the GNOME site? Or, failing that, can someone point me to a place where I can see which signals are allowed for which widgets?
I recently wrote an app using Python and found Tkinter to be very good; every time I searched Google for help on Tkinter, the documentation was easy to read and understandable.
The GNOME GTK documentation, however, is really bad. Yes, it does describe each function, but it doesn't lead you to the other parts needed to get a full understanding of that function.
They go into great detail in some cases, actually including an entire program as an example (without comments in the code, I might add), totally obscuring the forest trying to describe the tree.
I don't want to get too bogged down in a detail of my problem now, but this is an example of my frustration.
Specifically, I am attaching a signal to an entry widget, and I can find the g_signal_connect declaration that gives the parameters needed, like the instance, the_signal, the handler and such, but nowhere does it say WHICH signals can be used.
I guess it is because each widget may use a different subset of signals, but, to date, I have not found even a list of the available signals, let alone which ones can be used on which widgets.
I can find the gtk_entry_new() definition, but again, that description doesn't give a list of allowable signals. Just how to call it.
I saw an example that uses the "insert_text" signal, but that isn't really right; another site says there is an "activate" signal, but that only works if the user presses Enter, not if the user clicks elsewhere in the window.
Any help is appreciated.
Mark.
I've already seen that doc issue. The way the doc is generated has changed, and it seems this broke some parts of the generated GTK+ 2 doc. Now, you shouldn't be using GTK+ 2 in the first place. GTK+ 3 has been the stable release for years now, and GTK+ 2 should only be used in legacy projects. GTK+ 4 is on its way to being released this year.
To know which signals can be used on which widget, you just have to go to the "signals" section of the documentation page of that widget. For example, here are the signals specific to GtkEntry. Each widget doc page has a top bar with several section shortcuts, with links to the sections you want:
Top | Description | Object Hierarchy | Implemented Interfaces | Properties | Style Properties | Signals
You see the last one is about signals.
Now, this is only for signals specific to the class. This is object-oriented programming, so you can also use the signals from the parent classes. Just click on the "Object Hierarchy" link and you'll be sent to the inheritance diagram of the class. From there you may explore the parent classes, and then their signals.
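As a concrete illustration, here's a minimal GTK+ 3 sketch using both kinds of signal on one widget: "activate" is listed on the GtkEntry page itself, while "focus-out-event" comes from the parent class GtkWidget, and it fires when the user clicks elsewhere, which is exactly the case "activate" doesn't cover:

/* Build: gcc entry.c $(pkg-config --cflags --libs gtk+-3.0) */
#include <gtk/gtk.h>

/* "activate": GtkEntry's own signal, emitted when Enter is pressed. */
static void on_activate(GtkEntry *entry, gpointer user_data) {
    g_print("Enter pressed: %s\n", gtk_entry_get_text(entry));
}

/* "focus-out-event": inherited from GtkWidget, emitted when the user
 * clicks (or tabs) away from the entry. */
static gboolean on_focus_out(GtkWidget *widget, GdkEvent *event,
                             gpointer user_data) {
    g_print("Entry lost focus: %s\n", gtk_entry_get_text(GTK_ENTRY(widget)));
    return FALSE;  /* FALSE = let GTK continue normal focus handling */
}

int main(int argc, char *argv[]) {
    gtk_init(&argc, &argv);

    GtkWidget *window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    GtkWidget *entry = gtk_entry_new();
    gtk_container_add(GTK_CONTAINER(window), entry);

    g_signal_connect(entry, "activate", G_CALLBACK(on_activate), NULL);
    g_signal_connect(entry, "focus-out-event", G_CALLBACK(on_focus_out), NULL);
    g_signal_connect(window, "destroy", G_CALLBACK(gtk_main_quit), NULL);

    gtk_widget_show_all(window);
    gtk_main();
    return 0;
}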
You may also want to install the Devhelp program, which gives you a search-as-you-type entry and gathers the docs of lots of other libraries on which GTK+ and the GNOME platform depend (cairo, pango, etc.). Install it with your package manager, and you'll have access to offline help for all the development packages you installed, at the versions you're really using.

Autohotkey - How to detect all input areas/checkboxes in an application?

Is there a way to detect input areas such as textboxes and checkboxes within an application? I want to label each input area with a number so I can jump between input fields with AHK using my keyboard.
For example: once the script is activated and the active window is Google Chrome, Chrome could have its address bar labeled #1. When I press "1", the cursor will be directed to that area.
I'm basically trying to create a workaround for applications that are not very keyboard friendly.
Most Windows applications use standard Windows controls.
For these:
https://autohotkey.com/docs/commands/WinGet.htm - with the ControlList parameter, gets a list of all standard controls.
For those controls:
https://autohotkey.com/docs/commands/ControlGet.htm - can get the type of a control, and
https://autohotkey.com/docs/commands/ControlGetPos.htm - can get the position and dimensions of a control.
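If you're curious what ControlList is actually walking, it's the target window's tree of child controls. Here's a minimal C sketch of the same enumeration using the raw Win32 calls (the window title "Untitled - Notepad" is just an example target):

#include <windows.h>
#include <stdio.h>

/* Called once per child control of the target window. */
static BOOL CALLBACK list_control(HWND hwnd, LPARAM lparam) {
    char cls[256];
    RECT rc;
    GetClassNameA(hwnd, cls, sizeof(cls));
    GetWindowRect(hwnd, &rc);  /* screen coordinates */
    printf("%-20s at (%ld,%ld) size %ldx%ld\n", cls, rc.left, rc.top,
           rc.right - rc.left, rc.bottom - rc.top);
    return TRUE;  /* keep enumerating */
}

int main(void) {
    HWND target = FindWindowA(NULL, "Untitled - Notepad");
    if (target)
        EnumChildWindows(target, list_control, 0);
    return 0;
}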
Some can also be controlled through COM: https://gist.github.com/kheybot/7026077#automation-of-office-applications
Command-line and console programs can sometimes be communicated with directly, using the standard streams (STDIN, STDOUT, STDERR) or DOS-style device names (LPTn, PRN, NUL), or you can communicate with the terminal that displays the program using COM or WSH:
https://gist.github.com/kheybot/7026077#interact-with-command-line
This is important for a lot of legacy data-entry programs.
Browsers (e.g. Chrome), unfortunately, can't use these heavyweight controls, because a page may contain far too many elements; but there are other options for communicating with them, such as COM and DDE, to talk to the DOM:
https://gist.github.com/kheybot/7026077#browser-automation
For a web browser, I'd be inclined to go for a hybrid approach: combine AHK handling of the browser's own input areas (address bar, etc.) with a Greasemonkey/Tampermonkey script to handle input fields within the web page itself, since JavaScript working through the DOM will handle input areas far better than any screen-scraping software could. There's also the possibility of using a functional-testing suite like Selenium for automation, or using the browser's plug-in functionality to write an extension to handle its UI.
This would mean that you now have TWO programming problems, of course...
Java applications, Flash applications, HTML5 applications, some graphic design software, and just about all computer games are essentially just graphics, with no way of externally identifying controls.
For these, you have to use basic screen-scraping techniques: http://www.autohotkey.com/docs/commands/ImageSearch.htm and http://www.autohotkey.com/docs/commands/PixelSearch.htm can identify specific areas, but this can only really be done by individually programming each specific control.
One option for generic detection, though, is to have something that detects shadows (drop shadows, buttonized components, etc.) and allows you to tab between, and send a click to, the components detected that way. Unfortunately, modern flat design means this won't always work, so you could also try searching for flat-colored rectangles... except sometimes they have curved corners. Because graphic designers hate people.
At this point, you will hopefully see that what you have here is an infinite rabbit hole of fractal complexity.
You can make a simple ControlGet solution which doesn't work for a lot of applications you would use regularly... or you can create a hybrid approach that targets many applications individually, while also trying to have a generic solution for unrecognized apps.
If you are creating this for your own use, I'd say aim for making it work with the apps you know and use regularly, and that should be enough.
If you're writing it as accessibility software for others to use, I'd say aim for having it user-configurable for each application: let them control what input element they want to click, and in what order, because auto-detection will never work perfectly, and will only rarely pick the ideal solution.
The answer is yes, if the number of checkboxes and their positions in the application are fixed and you know on which machine the automation takes place.
Please research ImageSearch to learn how to locate them from screenshots.
If you know the X/Y position of a checkbox in the window, you can also use PixelGetColor to check whether a check mark is visible.
You should also examine your application with the included Window Spy tool. This program shows you what it can see in the application window.
To get your labelling, check out the Gui commands. If you make your GUI transparent and prevent it from taking focus, you can draw labels on top of the application.

Notification in screen corner

I need to create a small notification in the right-bottom corner of the screen. It should provide the following functionality:
Should NOT change the current focus.
Should allow me to put some text in it.
Should appear (and stay if possible) on top of all windows.
Can you suggest something to use? The less installation required, the better.
Well, there are a few ways to do it.
Roll your own
Use the infrastructure of the desktop environment
Naturally, #2 is going to be more reliable — if you know which desktop environment you're targeting.
You mention Linux, so let's look at Gnome. The two most popular (?) Linux-based operating systems are the Red Hat/Fedora/CentOS family and Ubuntu, both of which are based on Gnome 3.
Gnome 3's notifications:
Do not change the keyboard focus
Allow text (and more)
Appear for a moment above other windows, but then tuck away at the bottom of the screen after a bit; but, can be called back up by mousing over their icons.
Plus, there's nothing to “install” — unless you're running an unusual build, the stock distributions all include the Notification support you want already.
The documentation is found on the Developer.GNOME.org web site, here.
If you are not running on a “normal” Linux distribution, you still have options.
Install libnotify, and enough Gnome infrastructure to let it work (a minimal sketch follows below).
Re-inventing the wheel…
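For the libnotify option, the whole thing is only a few calls. Here's a minimal sketch, assuming libnotify 0.7 or later and its development headers:

/* Build: gcc notify.c $(pkg-config --cflags --libs libnotify) */
#include <libnotify/notify.h>

int main(void) {
    if (!notify_init("my-app"))  /* "my-app" is an arbitrary app name */
        return 1;

    NotifyNotification *n = notify_notification_new(
        "Job finished",                  /* summary line */
        "All 42 files were processed.",  /* body text */
        NULL);                           /* optional icon name */

    notify_notification_show(n, NULL);   /* shown without stealing focus */

    g_object_unref(n);
    notify_uninit();
    return 0;
}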
If you choose to re-invent the wheel, you'll want to:
Create a top-level X Window;
Set flags on it to ask the Window Manager to please* keep it on top, not decorate it with the usual resize and title decorations, and so forth;
and set up its contents on your own.
Some documentation on providing hints to the window manager can be found on FreeDesktop.org.
*- the window manager, however, is free to ignore your hints, if it chooses.
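To make the roll-your-own route concrete, here's a rough Xlib sketch: an override-redirect window (one the window manager won't decorate, focus, or manage) placed in the bottom-right corner. A polished version would instead set the EWMH hints mentioned above and handle redraws properly:

/* Build: gcc corner.c -lX11 */
#include <X11/Xlib.h>
#include <unistd.h>

int main(void) {
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) return 1;
    int scr = DefaultScreen(dpy);

    int w = 300, h = 80;  /* bottom-right corner, with a small margin */
    int x = DisplayWidth(dpy, scr) - w - 20;
    int y = DisplayHeight(dpy, scr) - h - 20;

    Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, scr), x, y, w, h,
                                     1, BlackPixel(dpy, scr),
                                     WhitePixel(dpy, scr));

    /* Tell the WM to leave this window alone: no decorations, no focus. */
    XSetWindowAttributes attrs;
    attrs.override_redirect = True;
    XChangeWindowAttributes(dpy, win, CWOverrideRedirect, &attrs);

    XSelectInput(dpy, win, ExposureMask);
    XMapRaised(dpy, win);

    /* Wait for the first Expose before drawing, or the text is lost. */
    XEvent ev;
    do { XNextEvent(dpy, &ev); } while (ev.type != Expose);
    XDrawString(dpy, win, DefaultGC(dpy, scr), 10, 40, "Hello!", 6);
    XFlush(dpy);

    sleep(5);  /* keep the notification up briefly, then exit */
    XCloseDisplay(dpy);
    return 0;
}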

Is it possible for an application to take ownership of a window from another application?

Basically, I have two applications that run sequentially (second is started by the first, and the first exits immediately after.) I'd like to pass ownership of a window the first application created to the second application. The actual contents of the window don't need to be passed along, it's just being drawn in by DirectX.
Alternatively, but less desirably, is it possible to at least disable the window closing/opening animation, so it at least looks like the desired effect is achieved?
(This is in C, using the vanilla Win32 API.)
Instead of a separate application, make a DLL that is loaded by the first application and runs within it.
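A rough sketch of that idea (the DLL name second.dll and its export run_stage_two are placeholders, not real APIs): the first application never exits, so the window and its WindowProc stay valid, and "stage two" simply runs in-process against the existing HWND.

#include <windows.h>

typedef int (*stage_two_fn)(HWND window);

int run_second_stage(HWND window) {
    HMODULE dll = LoadLibraryA("second.dll");  /* placeholder DLL name */
    if (!dll) return -1;

    stage_two_fn run = (stage_two_fn)GetProcAddress(dll, "run_stage_two");
    int result = run ? run(window) : -1;  /* DLL reuses the existing HWND */

    FreeLibrary(dll);
    return result;
}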
I suspect that you're going to run into problems because the WindowProc function is located in the memory address space of the program that you're closing.
Also, a quick look at the second remark at the bottom of the documentation for RegisterClass doesn't seem to offer up much hope.
The only work around that I can suggest for what you've described is to not close the first application until the second application is finished with the window in question.
You can use API hooking to make your DLL capture the window-related API calls made by the application and respond as if your DLL were the Windows DLL.
For more information about hooking, check:
Hooks Overview
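As one concrete flavor of this, here's a sketch of installing a global WH_CBT hook, which lets a DLL observe window creation, activation, and destruction across processes (hook.dll and HookProc are placeholder names; the hook procedure must live in that DLL so the system can map it into other processes):

#include <windows.h>

HHOOK install_cbt_hook(void) {
    HMODULE dll = LoadLibraryA("hook.dll");  /* placeholder hook DLL */
    if (!dll) return NULL;

    HOOKPROC proc = (HOOKPROC)GetProcAddress(dll, "HookProc");
    if (!proc) return NULL;

    /* Thread id 0 = hook every thread on the desktop (a global hook). */
    return SetWindowsHookExA(WH_CBT, proc, dll, 0);
}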

Is there a way for my binary to react to some global hotkeys in Linux?

Is it possible to listen for a certain hotkey (e.g. Ctrl-I) and then perform a specific action? My application is written in C, will only run on Linux, and it doesn't have a GUI. Are there any libraries that help with this kind of task?
EDIT: as an example, Amarok has global shortcuts: if you map a combination of keys to an action (let's say Ctrl-+, Ctrl and +), that action is executed whenever you press the keys. If I mapped Ctrl-+ to the volume-increase action, each time I pressed Ctrl-+ the volume would increase by a certain amount.
Thanks
How global do your hotkeys need to be? Is it enough for them to be global for an X session? In that case you should be able to open an Xlib connection and listen for the events you need.
Ordinarily, keyboard events in X are delivered to the window that currently has the focus and propagated up the hierarchy until they are handled. Clearly this is not what we want; we need to process the event before any other window can get to it. To accomplish this, we call XGrabKey on the root window with the keycode and modifiers of our hotkey.
I found a good example here.
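For reference, a minimal sketch of the grab-and-listen loop (grabbing Ctrl+I for the whole X session; a robust version would also grab the NumLock/CapsLock modifier variants):

/* Build: gcc hotkey.c -lX11 */
#include <X11/Xlib.h>
#include <X11/keysym.h>
#include <stdio.h>

int main(void) {
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) return 1;

    Window root = DefaultRootWindow(dpy);
    KeyCode key = XKeysymToKeycode(dpy, XK_i);

    /* Grab Ctrl+I on the root window so we see it before any client. */
    XGrabKey(dpy, key, ControlMask, root, True,
             GrabModeAsync, GrabModeAsync);
    XSelectInput(dpy, root, KeyPressMask);

    for (;;) {
        XEvent ev;
        XNextEvent(dpy, &ev);
        if (ev.type == KeyPress)
            printf("Ctrl+I pressed - run your action here\n");
    }
}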
I think smoofra is on the right track here; you're looking to register a global hotkey with X so that you can intercept keypresses and take appropriate action. Xlib is probably what you want, and XGrabKey is the function, I think.
It's not easy to learn, I'm afraid; I did locate this example that seems useful: TinyWM. I also found an example using Java/JNI (accessing the same underlying Xlib function).
You should look at the source code of xbindkeys.
Xlib programming is pretty arcane, documentation is hard to find, and there are subtle portability issues. You'll be better off copying some battle-hardened code.
One way to do it is to have your application listen on a certain port, or socket file, for incoming requests.
Then you can write a small client application that connects to that port or socket file and sends commands to the running application.
Then you can configure your window manager to bind certain key combinations to launch your small client app.
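A minimal sketch of the listener side (the socket path /tmp/myapp.sock and the volume-up command are placeholders): the running app accepts short commands on a Unix socket, and the hotkey bound in your window manager just writes to it (with a tiny client, or nc -U if your netcat supports it).

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void) {
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = {0};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/myapp.sock", sizeof(addr.sun_path) - 1);

    unlink(addr.sun_path);  /* remove a stale socket from a previous run */
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 1);

    for (;;) {
        int client = accept(srv, NULL, NULL);
        char cmd[128] = {0};
        read(client, cmd, sizeof(cmd) - 1);
        if (strncmp(cmd, "volume-up", 9) == 0)
            puts("raising volume");  /* placeholder action */
        close(client);
    }
}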
In UNIX, your access to a commandline shell is via a terminal. This harks back to the days when folks accessed their big shared computers literally via terminals connected directly to the machines (e.g. by a serial cable).
In fact, the 'xterm' program or whatever derivative you use on your UNIX box is properly referred to as a terminal emulator - it behaves (from both your point of view and that of the operating system) much like one of those old-fashioned terminal machines.
This makes it slightly complicated to handle input in interesting ways, since there are lots of different kinds of terminals, and your UNIX system has to know about the capabilities of each kind. These capabilities were traditionally stored in a termcap file, and I think more modern systems use terminfo instead. Try
man 5 terminfo
on a Linux system for more information.
Now, the good news is that you don't need to do too much messing about with terminal capabilities to have a command-line application that does interesting things with input or windowing features. There's a library, curses, that will help. Look up
man 3 ncurses
on your Linux system for more information. You will probably be able to find a decent tutorial on using curses online.
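As a taste of what that looks like (a minimal sketch; note that at the terminal level Ctrl-I is indistinguishable from Tab, since both are ASCII 9):

/* Build: gcc keys.c -lncurses */
#include <ncurses.h>

int main(void) {
    initscr();             /* enter curses mode */
    raw();                 /* deliver Ctrl combinations to us directly */
    noecho();
    keypad(stdscr, TRUE);  /* decode arrow/function keys too */

    printw("Press Ctrl-I (q to quit)\n");
    int ch;
    while ((ch = getch()) != 'q') {
        if (ch == 9)       /* Ctrl-I arrives as ASCII 9 (Tab) */
            printw("Ctrl-I pressed\n");
    }

    endwin();              /* restore the terminal state */
    return 0;
}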
