Do graphics cards typically write their output to some place in memory which I can access? Do I have to use the driver? If so, can I use OpenGL?
I want to know if it's possible to "capture" the output of a VM on Linux which has direct access to a GPU, and is running Windows. Ideally, I could access the output from memory directly, without touching the GPU, since this code would be able to run on the Linux host.
The other option is to maybe write a Windows driver which reads the output of the GPU and writes it to some place in memory. Then, on the Linux side, a program can read this memory. This seems somewhat impossible, since I'm not really sure how to get a process on the host to share memory with a process on the guest.
Is it possible to do option 1 and simply read the output from memory?
I do not code under Linux but in Windows (you are running it in emulator anyway) you can use WinAPI to directly access canvas of any window or even desktop from 3th party App. Some GPU overlays could be problematic to catch (especially DirectX based) but I had no problems with mine GL/GLSL for now.
If you got access to App source you can use glReadPixels for image extraction from GL directly (but that works only for current GL based rendering).
Using glReadPixels
As mentioned this must be implemented directly in the targeted app so you need to have it source code or inject your code in the right place/time. I use for screenshoting this code:
void OpenGLscreen::screenshot(Graphics::TBitmap *bmp)
{
if (bmp==NULL) return;
int *dat=new int[xs*ys],x,y,a,*p;
if (dat==NULL) return;
bmp->HandleType=bmDIB;
bmp->PixelFormat=pf32bit;
if ((bmp->Width!=xs)||(bmp->Height!=ys)) bmp->SetSize(xs,ys);
if ((bmp->Width==xs)&&(bmp->Height==ys))
{
glReadPixels(0,0,xs,ys,GL_BGRA,GL_UNSIGNED_BYTE,dat);
glFinish();
for (a=0,y=ys-1;y>=0;y--)
for (p=(int*)bmp->ScanLine[y],x=0;x<xs;x++,a++)
p[x]=dat[a];
}
delete[] dat;
}
where xs,ys is OpenGL window resolution, you can ignore the whole bmp stuff (it is VCL bitmap I use to store screenshot) and also can ignore the for it just copy the image from buffer to bitmap. So the important stuff is just this:
int *dat=new int[xs*ys]; // each pixel is 32bit int
glReadPixels(0,0,xs,ys,GL_BGRA,GL_UNSIGNED_BYTE,dat);
glFinish();
You need to execute this code after the rendering is done otherwise you will obtain unfinished or empty buffer. I use it after redraw/repaint events. As mentioned before this will obtain only the GL rendered stuff so if your App combines GDI+OpenGL it is better to use the next approach.
WinAPI approach
To obtain Canvas image of any window I wrote this class:
//---------------------------------------------------------------------------
//--- screen capture ver: 1.00 ----------------------------------------------
//---------------------------------------------------------------------------
class scrcap
{
public:
HWND hnd,hnda;
TCanvas *scr;
Graphics::TBitmap *bmp;
int x0,y0,xs,ys;
scrcap()
{
hnd=NULL;
hnda=NULL;
scr=new TCanvas();
bmp=new Graphics::TBitmap;
#ifdef _mmap_h
mmap_new('scrc',scr,sizeof(TCanvas() ));
mmap_new('scrc',bmp,sizeof(Graphics::TBitmap));
#endif
if (bmp)
{
bmp->HandleType=bmDIB;
bmp->PixelFormat=pf32bit;
}
x0=0; y0=0; xs=1; ys=1;
hnd=GetDesktopWindow();
}
~scrcap()
{
#ifdef _mmap_h
mmap_del('scrc',scr);
mmap_del('scrc',bmp);
#endif
if (scr) delete scr; scr=NULL;
if (bmp) delete bmp; bmp=NULL;
}
void init(HWND _hnd=NULL)
{
RECT r;
if (scr==NULL) return;
if (bmp==NULL) return;
bmp->SetSize(1,1);
if (!IsWindow(_hnd)) _hnd=hnd;
scr->Handle=GetDC(_hnd);
hnda=_hnd;
resize();
}
void resize()
{
if (!IsWindow(hnda)) return;
RECT r;
// GetWindowRect(hnda,&r);
GetClientRect(hnda,&r);
x0=r.left; xs=r.right-x0;
y0=r.top; ys=r.bottom-y0;
bmp->SetSize(xs,ys);
xs=bmp->Width;
ys=bmp->Height;
}
void capture()
{
if (scr==NULL) return;
if (bmp==NULL) return;
bmp->Canvas->CopyRect(Rect(0,0,xs,ys),scr,TRect(x0,y0,x0+xs,y0+ys));
}
};
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
Again it uses VCL so rewrite bitmap bmp and canvas scr to your programing environment style. Also igneore the _mmap_h chunks of code they are just for debugging/tracing of memory pointers related to some nasty compiler bug I was facing at that time I wrote this.
The usage is simple:
// globals
scrcap cap;
// init
cap.init();
// on screenshot
cap.capture();
// here use cap.bmp
If you call cap.init() it will lock on the whole windows desktop. If you call cap.init(window_handle) it will lock on specific visual window/component instead. To obtain window handle from 3th app side see:
is ther a way an app can display a message without the use of messagebox API?
Sorry it is on SE/RE instead of here on SE/SO but My answer here covering this topic was deleted. I use this for video capture ... all of the animated GIFs in my answers where created by this code. Another example could be seen on the bottom of this answer of mine:
Image to ASCII art conversion
As you can see it works also for DirectX overlay with media player classic (even Windows PrintScreen function cant do it right). As I wrote I got no problems with this yet.
Beware visual stuff WinAPI calls MUST BE CALLED FROM APP MAIN THREAD (WNDPROC) otherwise serious problems could occur leading to random unrelated WinAPI calls exceptions anywhere in the App.
Related
I have a simple GUI application I wrote in C for the RaspBerry PI while using GTK+2.0 to handle the actual UI rendering. The application so far is pretty simple, with just a few pushbuttons for testing simple functions I wrote. One button causes a thread to be woken up which prints text to the console, and goes back to sleep, while another button stops this operation early by locking a mutex, changing a status variable, then unlocking the mutex again. Fairly simple stuff so far. The point of using this threaded approach is so that I don't ever "lock up" the UI during a long function call, forcing the user to be blocked on the I/O operations completing before the UI is usable again.
If I call the following function in my thread's processing loop, I encounter a number of issues.
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdio.h>
#include <errno.h>
using namespace std;
using namespace cv;
#define PROJECT_NAME "CAMERA_MODULE" // Include before liblog
#include <log.h>
int cameraAcquireImage(char* pathToImage) {
if (!pathToImage) {
logError("Invalid input");
return (-EINVAL);
}
int iErr = 0;
CvCapture *capture = NULL;
IplImage *frame, *img;
//0=default, -1=any camera, 1..99=your camera
capture = cvCaptureFromCAM(CV_CAP_ANY);
if(!capture) {
logError("No camera interface detected");
iErr = (-EIO);
}
if (!iErr) {
if ((frame = cvQueryFrame(capture)) == NULL) {
logError("ERROR: frame is null...");
iErr = (-EIO);
}
}
if (!iErr) {
CvSize size = cvSize(100, 100);
if ((img = cvCreateImage(size, IPL_DEPTH_16S, 1)) != NULL) {
img = frame;
cvSaveImage(pathToImage, img);
}
}
if (capture) {
cvReleaseCapture(&capture);
}
return 0;
}
The function uses some simple OpenCV code to take a snapshot with a webcam connected to my Raspberry PI. It issues warnings of VIDIOC_QUERYMENU: Invalid argument to the console, but still manages to acquire the images and save them to a file for me. However, my GUI becomes sluggish, and sometimes hangs. If it doesn't outright hang, then the window goes blank, and I have to randomly click all over the UI area until I click on where a pushbutton would normally be located, and the UI finally re-renders again rather than showing a white empty layout.
How do I go about resolving this? Is this some quirk in OpenCv when using it as part of a Gtk+2.0 application? I had originally had my project setup as a GTK3.0 application, but it wouldn't run due to some check in GTK preventing multiple versions from being included in a single application, and it seems OpenCv is an extension of GTK+2.0.
Thank you.
there is something quite broken here:
CvSize size = cvSize(100, 100);
if ((img = cvCreateImage(size, IPL_DEPTH_16S, 1)) != NULL) {
img = frame;
cvSaveImage(pathToImage, img);
}
first, you create a useless 16-bit image (why even?), then you reassign(alias) that pointer to your original image, and then you don't cvReleaseImage it (memleak).
please, stop using opencv's deprecated c-api. please.
any noob will shoot into his foot using this (one of the main reasons to get rid of it)
also, you can only use ~30% of opencv's functionality this way (the opencv1.0 set)
again, please, stop using opencv's deprecated c-api. please.
Didn't you forget to free the img pointer ?
Also, I did in the past an app that stored uncompressed images on the disk, and things used to become sluggish. In fact, what was taking time was storing the images on the disk, as it was exceeding the max bandwidth of what the filesystem layer could handle.
So try to see is you can store compressed images instead (trading some CPU to save bandwidth), or store your images in RAM in a queue and save them afterwards (in a separate thread, or in an idle handler). Of course, if the video you capture is too long, you may end up with an Out Of Memory condition. I only had sequences of a few seconds to store, so that did the trick.
I'm pretty new to OpenCV and I'm trying to get my bearings by looking at, and running, sample code.
One of the sample programs that I was looking at is a program for displaying webcam video. Here are the important lines (the program doesn't execute farther than this):
// Make frame.
CvCapture* capture = cvCaptureFromCAM(0);
if(!capture) {
printf("Webcam not initialized....");
}
// Display video in frame.
Unfortunately, the if statement always executes. For some reason, capture is not initialized.
Even stranger, when I run the program, it even gives me a GUI to select the webcam that I want to use:
However, even after I select the webcam, capture is not initialized!
What does this mean? How do I fix this?
Thanks for any suggestions.
It is possible that OpenCV cannot access the webcam until after you select it. In that case, try looping until the webcam is available:
CvCapture *capture = NULL;
do {
// you could also try passing in CV_CAP_ANY or -1 instead of 0
capture = cvCaptureFromCAM(0);
} while (!capture);
If this still doesn't work, call cvErrorStr(cvGetErrStatus()) to get a string explaining the error.
I want to write an application that will automatically detect and fill the text field in the window shown below:
(assuming the data to be entered is in a file).
The question is how does my application find this text field?
I can do this job if I am able to find the location of the text field on the desktop through program.
Can someone help me understand possible ways for finding this text field?
I am using Windows Form application in C++.
Update:
I played with spy++.
I used spy++, to find the window handle. I did it by putting finder on the window I am interested in. Its giving handle in hex values: 00080086 (actually just for testing purpose I put the finder tool on Visual Studio new project page ). How do I interpret this Hex value into meaningful window name ?
See the below figure.
What is the next step to get to the text field " Enter name" under "name" field.
****Any sample code will be highly appreciated.**
I am open to any solution not necessarily how I am doing this.
One solution is to use the Microsoft UI Automation technology. It's shipped out-of-the-box with Windows since Vista. It's usable from .NET but also from C++ using COM.
Here is a short C++ console application example that displays the class name of the UI Automation Element currently at the middle of the desktop window, each second (you can have it run and see what it displays):
int _tmain(int argc, _TCHAR* argv[])
{
CoInitialize(NULL);
IUIAutomation *pAutomation; // requires Uiautomation.h
HRESULT hr = CoCreateInstance(__uuidof(CUIAutomation), NULL, CLSCTX_INPROC_SERVER, __uuidof(IUIAutomation), (LPVOID *)&pAutomation);
if (SUCCEEDED(hr))
{
RECT rc;
GetWindowRect(GetDesktopWindow(), &rc);
POINT center;
center.x = (rc.right - rc.left) / 2;
center.y = (rc.bottom - rc.top) / 2;
printf("center x:%i y:%i'\n", center.x, center.y);
do
{
IUIAutomationElement *pElement;
hr = pAutomation->ElementFromPoint(center, &pElement);
if (SUCCEEDED(hr))
{
BSTR str;
hr = pElement->get_CurrentClassName(&str);
if (SUCCEEDED(hr))
{
printf("element name:'%S'\n", str);
::SysFreeString(str);
}
pElement->Release();
}
Sleep(1000);
}
while(TRUE);
pAutomation->Release();
}
CoUninitialize();
return 0;
}
From this sample, what you can do is launch the application you want to automate and see if the sample detects it (it should).
You could also use the UISpy tool to display the full tree of what can be automated in your target app. You should see the windows and other elements (text field) of this target app and you should see the element displayed by the console application example.
From the pElement discovered in the sample, you can call FindFirst with the proper condition (class name, name, control type, etc...) to get to the text field. From this text field, you would use one of the UI Automation Patterns that should be available (probably TextPattern or ValuePattern) to get or set the text itself.
The cool thing is you can use the UISpy tool to check all this is possible before actually coding it.
You could enumerate windows and then find it.
For exploring application on your screenshot you could you Spy++ (spyxx.exe) that is distributed with visual studio. In you code, you clould use EnumWindows and EnumChildWindows to enumerates all window or all child windows to find one you need.
Although the answer given by Simon is accepted and is the best one, but still for future visitors I am providing this link which has more description for UI automation of windows applications. .
Also for automating a web application one may want to go to this link
I'm actually trying to read a specific pixel on a window which is hidden by others. I want to use the GetPixel function from GDI library but it seems it only works with the global device context. I can't read pixel from a specific window and I don't understand why..
I found this article which uses the PrintWindow function to copy a specific window content to a temporary device context which can be read. But I can't reproduce it.
EDIT
Thank you all my problem is solved :)
This script give you the RGB color of the pointer on the choosen window, even though the window is hidden. Remind that this program must be launch with admin privileges to get the pixels of processes launched with admin privileges.
#define STRICT
#define WINVER 0x0501
#define _WIN32_WINNT 0x0501
// 0x0501 for PrintWindow function
// You must be at least running Windows XP
// See http://msdn.microsoft.com/en-us/library/6sehtctf.aspx
#include <stdio.h>
#include <string.h>
#include <windows.h>
#define WINDOW_LIST_LIMIT 32
#define WINDOW_NAME_LIMIT 1024
void FatalError(char* error)
{
printf("%s", error);
exit(-1);
}
HWND window_list[WINDOW_LIST_LIMIT];
unsigned int window_list_index = 0;
BOOL EnumWindowsProc(HWND window_handle, LPARAM param)
{
char window_title[WINDOW_NAME_LIMIT];
if(!IsWindowVisible(window_handle)) return TRUE;
RECT rectangle = {0};
GetWindowRect(window_handle, &rectangle);
if (IsRectEmpty(&rectangle)) return TRUE;
GetWindowText(window_handle, window_title, sizeof(window_title));
if(strlen(window_title) == 0) return TRUE;
if(!strcmp(window_title, "Program Manager")) return TRUE;
window_list[window_list_index] = window_handle;
window_list_index++;
printf("%u - %s\n", window_list_index, window_title);
if(window_list_index == WINDOW_LIST_LIMIT) return FALSE;
return TRUE;
}
int main(int argc, char** argv)
{
unsigned int i, input;
EnumWindows((WNDENUMPROC) EnumWindowsProc, (LPARAM) NULL);
printf("\nChoose a window: ");
scanf("%u", &input);
printf("\n");
if(input > window_list_index) FatalError("Bad choice..\n");
HDC window_dc = GetWindowDC(window_list[input - 1]), global_dc = GetDC(0), temp_dc;
if(!window_dc && !global_dc) FatalError("Fatal Error - Cannot get device context.\n");
POINT cursor, previous_cursor;
while(1)
{
temp_dc = CreateCompatibleDC(window_dc);
if(!temp_dc) FatalError("Fatal Error - Cannot create compatible device context.\n");
RECT window_rectangle;
GetWindowRect(window_list[input - 1], &window_rectangle);
HBITMAP bitmap = CreateCompatibleBitmap(window_dc,
window_rectangle.right - window_rectangle.left,
window_rectangle.bottom - window_rectangle.top);
if (bitmap)
{
SelectObject(temp_dc, bitmap);
PrintWindow(window_list[input - 1], temp_dc, 0);
DeleteObject(bitmap);
}
GetCursorPos(&cursor);
if(cursor.x != previous_cursor.x && cursor.y != previous_cursor.y)
{
COLORREF color = GetPixel(temp_dc, cursor.x - window_rectangle.left, cursor.y - window_rectangle.top);
int red = GetRValue(color);
int green = GetGValue(color);
int blue = GetBValue(color);
printf("\rRGB %02X%02X%02X", red, green, blue);
cursor = previous_cursor;
}
DeleteDC(temp_dc);
Sleep(50); // for lags
}
ReleaseDC(window_list[input - 1], window_dc);
return 0;
}
I've changed some things, now User32 isn't dynamically loaded.
It compiles with
gcc main.c -o main.exe -lGid32 -lUser32
Have a great day !
You are passing a process handle to GetDC. That's not right. Processes don't have device contexts, windows do. Remember a process can have many windows, or even none at all.
You need to get hold of the window handle, the HWND, for the window in question, and pass that to GetDC. I'd look to using FindWindow or EnumWindows to find your target top-level window.
Of course, there may be other problems with your code, but that's the one that jumps out at me.
HDC process_dc = GetDC(process_handle)
Well that's all kinds of wrong. GetDC accepts a window handle, not a process handle.
In order to find such errors, recompile with
#define STRICT
placed before your includes.
This is a bit of a confusing topic, so let's see if I can clarify a few things.
First things first: as both David and Ben have already answered, you're passing a process handle to the GetDC function, which is wrong. GetDC accepts a handle to a window (the HWND type), and it returns a device context (DC, the HDC type) corresponding to that window. You need to get that fixed before anything else will work.
Now, as the article you've read indicates, windows (assuming they've been correctly programmed) respond to the WM_PRINT or WM_PRINTCLIENT messages by rendering an image of themselves into the specified device context (HDC). This is a simple and effective way of capturing an "image" of a window, whether an overlapping window or the window of an individual control.
The rub comes in, as Hans mentioned in a comment, because handles to device contexts have process affinity, which means that the HDC you pass to the window in a separate process, into which it is supposed to render itself, will not be valid from that other process. Handles to device contexts cannot be passed across process boundaries. That's the primary reason that your code fails (or is going to fail, once you fix the handle type problems). The MSDN entry on GDI Objects makes this explicitly clear:
Handles to GDI objects are private to a process. That is, only the process that created the GDI object can use the object handle.
Fixing or getting around that is going to be a bit of an uphill battle. The only solution that I know of is to inject code into the other application's process that first creates a DC in memory, then sends the WM_PRINT or WM_PRINTCLIENT message to a window owned by that process to draw into that in-memory device context, and then transfers the resulted bitmap back to your own application. This is going to require that you implement some type of inter-process communication mechanism.
I've seen some anecdotal evidence that passing device context handles between processes via the WM_PRINT and WM_PRINTCLIENT messages "works", but it's unclear to me whether this is an artifact of the current implementation (and therefore subject to breaking in future versions of Windows), or if this is because Windows is actually handling the marshaling between processes. I haven't seen any documentation one way or the other. If this is a one-off project you're doing for fun or for a limited use, you might try it and get away with it. For other purposes, you probably want to investigate using IPC to really do this the right way.
Don't use GetDC for the DC to pass to PrintWindow. You need to create a compatible DC as you're doing (though you can pass it NULL to get a generic screen DC), then create a compatible bitmap the size of the window you're trying to capture and select it into the DC. Then pass that DC handle to PrintWindow.
Windows aren't required to respond properly to WM_PRINT or WM_PRINTCLIENT, so there may be some glitches even when you get this to work.
My app. will be running on the system try monitoring for a hotkey; when the user selects some text in any window and presses a hotkey, how do I obtain the selected text, when I get the WM_HOTKEY message?
To capture the text on to the clipboard, I tried sending Ctrl + C using keybd_event() and SendInput() to the active window (GetActiveWindow()) and forground window (GetForegroundWindow()); tried combinations amongst these; all in vain. Can I get the selected text of the focused window in Windows with plain Win32 system APIs?
TL;DR: Yes, there is a way to do this using plain win32 system APIs, but it's difficult to implement correctly.
WM_COPY and WM_GETTEXT may work, but not in all cases. They depend on the receiving window handling the request correctly - and in many cases it will not. Let me run through one possible way of doing this. It may not be as simple as you were hoping, but what is in the adventure filled world of win32 programming? Ready? Ok. Let's go.
First we need to get the HWND id of the target window. There are many ways of doing this. One such approach is the one you mentioned above: get the foreground window and then the window with focus, etc. However, there is one huge gotcha that many people forget. After you get the foreground window you must AttachThreadInput to get the window with focus. Otherwise GetFocus() will simply return NULL.
There is a much easier way. Simply (miss)use the GUITREADINFO functions. It's much safer, as it avoids all the hidden dangers associated with attaching your input thread with another program.
LPGUITHREADINFO lpgui = NULL;
HWND target_window = NULL;
if( GetGUIThreadInfo( NULL, lpgui ) )
target_window = lpgui->hwndFocus;
else
{
// You can get more information on why the function failed by calling
// the win32 function, GetLastError().
}
Sending the keystrokes to copy the text is a bit more involved...
We're going to use SendInput instead of keybd_event because it's faster, and, most importantly, cannot be messed up by concurrent user input, or other programs simulating keystrokes.
This does mean that the program will be required to run on Windows XP or later, though, so, sorry if your running 98!
// We're sending two keys CONTROL and 'V'. Since keydown and keyup are two
// seperate messages, we multiply that number by two.
int key_count = 4;
INPUT* input = new INPUT[key_count];
for( int i = 0; i < key_count; i++ )
{
input[i].dwFlags = 0;
input[i].type = INPUT_KEYBOARD;
}
input[0].wVK = VK_CONTROL;
input[0].wScan = MapVirtualKey( VK_CONTROL, MAPVK_VK_TO_VSC );
input[1].wVK = 0x56 // Virtual key code for 'v'
input[1].wScan = MapVirtualKey( 0x56, MAPVK_VK_TO_VSC );
input[2].dwFlags = KEYEVENTF_KEYUP;
input[2].wVK = input[0].wVK;
input[2].wScan = input[0].wScan;
input[3].dwFlags = KEYEVENTF_KEYUP;
input[3].wVK = input[1].wVK;
input[3].wScan = input[1].wScan;
if( !SendInput( key_count, (LPINPUT)input, sizeof(INPUT) ) )
{
// You can get more information on why this function failed by calling
// the win32 function, GetLastError().
}
There. That wasn't so bad, was it?
Now we just have to take a peek at what's in the clipboard. This isn't as simple as you would first think. The "clipboard" can actually hold multiple representations of the same thing. The application that is active when you copy to the clipboard has control over what exactly to place in the clipboard.
When you copy text from Microsoft Office, for example, it places RTF data into the clipboard, alongside a plain-text representation of the same text. That way you can paste it into wordpad and notepad. Wordpad would use the rich-text format, while notepad would use the plain-text format.
For this simple example, though, let's assume we're only interested in plaintext.
if( OpenClipboard(NULL) )
{
// Optionally you may want to change CF_TEXT below to CF_UNICODE.
// Play around with it, and check out all the standard formats at:
// http://msdn.microsoft.com/en-us/library/ms649013(VS.85).aspx
HGLOBAL hglb = GetClipboardData( CF_TEXT );
LPSTR lpstr = GlobalLock(hglb);
// Copy lpstr, then do whatever you want with the copy.
GlobalUnlock(hglb);
CloseClipboard();
}
else
{
// You know the drill by now. Check GetLastError() to find out what
// went wrong. :)
}
And there you have it! Just make sure you copy lpstr to some variable you want to use, don't use lpstr directly, since we have to cede control of the contents of the clipboard before we close it.
Win32 programming can be quite daunting at first, but after a while... it's still daunting.
Cheers!
Try adding a Sleep() after each SendInput(). Some apps just aren't that fast in catching keyboard input.
Try SendMessage(WM_COPY, etc. ).