I have a simple GUI application I wrote in C for the Raspberry Pi, using GTK+ 2.0 to handle the actual UI rendering. The application so far is pretty simple, with just a few pushbuttons for testing simple functions I wrote. One button wakes a thread which prints text to the console and goes back to sleep, while another button stops this operation early by locking a mutex, changing a status variable, and unlocking the mutex again. Fairly simple stuff so far. The point of using this threaded approach is that I never "lock up" the UI during a long function call, which would force the user to wait for the I/O operations to complete before the UI is usable again.
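For context, the start/stop mechanism is essentially the usual flag-behind-a-mutex pattern; here is a stripped-down sketch of it (illustrative names, not my real code):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative names only; the real code is driven by GTK+ button callbacks. */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  g_wake = PTHREAD_COND_INITIALIZER;
static int g_run_requested  = 0;   /* set by the "start" button */
static int g_stop_requested = 0;   /* set by the "stop" button  */

void *workerThread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&g_lock);
        while (!g_run_requested)                 /* sleep until a button wakes us */
            pthread_cond_wait(&g_wake, &g_lock);
        g_run_requested  = 0;
        g_stop_requested = 0;
        pthread_mutex_unlock(&g_lock);

        for (int i = 0; i < 100; ++i) {          /* the "long" operation */
            pthread_mutex_lock(&g_lock);
            int stop = g_stop_requested;         /* check the status variable */
            pthread_mutex_unlock(&g_lock);
            if (stop)
                break;
            printf("working %d\n", i);
            usleep(100000);
        }
    }
    return NULL;
}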
If I call the following function in my thread's processing loop, I encounter a number of issues.
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdio.h>
#include <errno.h>
using namespace std;
using namespace cv;
#define PROJECT_NAME "CAMERA_MODULE" // Include before liblog
#include <log.h>
int cameraAcquireImage(char* pathToImage) {
    if (!pathToImage) {
        logError("Invalid input");
        return (-EINVAL);
    }

    int iErr = 0;
    CvCapture *capture = NULL;
    IplImage *frame, *img;

    // 0=default, -1=any camera, 1..99=your camera
    capture = cvCaptureFromCAM(CV_CAP_ANY);
    if (!capture) {
        logError("No camera interface detected");
        iErr = (-EIO);
    }

    if (!iErr) {
        if ((frame = cvQueryFrame(capture)) == NULL) {
            logError("ERROR: frame is null...");
            iErr = (-EIO);
        }
    }

    if (!iErr) {
        CvSize size = cvSize(100, 100);
        if ((img = cvCreateImage(size, IPL_DEPTH_16S, 1)) != NULL) {
            img = frame;
            cvSaveImage(pathToImage, img);
        }
    }

    if (capture) {
        cvReleaseCapture(&capture);
    }

    return 0;
}
The function uses some simple OpenCV code to take a snapshot with a webcam connected to my Raspberry Pi. It prints warnings of "VIDIOC_QUERYMENU: Invalid argument" to the console, but still manages to acquire the images and save them to a file for me. However, my GUI becomes sluggish and sometimes hangs. If it doesn't outright hang, the window goes blank, and I have to click randomly all over the UI area until I hit where a pushbutton would normally be located; only then does the UI finally re-render instead of showing an empty white layout.
How do I go about resolving this? Is this some quirk of OpenCV when using it as part of a GTK+ 2.0 application? I originally had my project set up as a GTK 3.0 application, but it wouldn't run because of a check in GTK that prevents multiple GTK versions from being loaded into a single application, and it seems OpenCV (its highgui module) is built on GTK+ 2.0.
Thank you.
there is something quite broken here:
CvSize size = cvSize(100, 100);
if ((img = cvCreateImage(size, IPL_DEPTH_16S, 1)) != NULL) {
    img = frame;
    cvSaveImage(pathToImage, img);
}
First, you create a useless 16-bit image (why even?), then you reassign (alias) that pointer to your original image, and then you never cvReleaseImage it (a memory leak).
Please, stop using OpenCV's deprecated C API. Any newcomer will shoot themselves in the foot with it (one of the main reasons it is being retired), and you can only reach roughly 30% of OpenCV's functionality this way (the OpenCV 1.0 feature set). Again, please: stop using the deprecated C API.
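For comparison, the same capture-and-save needs only a few lines with the C++ API. This is just a sketch against the OpenCV 2.x headers you are already including, with minimal error handling:

#include <opencv2/highgui/highgui.hpp>

// Sketch only: same capture-and-save using the C++ API (OpenCV 2.x).
int cameraAcquireImage(const char* pathToImage)
{
    if (!pathToImage) return -1;

    cv::VideoCapture capture(0);               // 0 = default camera
    if (!capture.isOpened()) return -1;        // no camera interface detected

    cv::Mat frame;
    if (!capture.read(frame) || frame.empty())
        return -1;                             // no frame acquired

    // imwrite picks the format from the file extension (.png, .jpg, ...)
    return cv::imwrite(pathToImage, frame) ? 0 : -1;
}   // capture is released automatically by its destructor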
Didn't you forget to free the img pointer?
Also, I wrote an app in the past that stored uncompressed images on disk, and it became sluggish. What was actually taking the time was writing the images to disk, because it exceeded the bandwidth the filesystem layer could handle.
So see if you can store compressed images instead (trading some CPU to save bandwidth), or store your images in RAM in a queue and save them afterwards (in a separate thread, or in an idle handler). Of course, if the capture runs too long you may end up with an out-of-memory condition; I only had sequences of a few seconds to store, so that did the trick.
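To sketch the second idea: the capture path only enqueues the (already compressed) image bytes, and a dedicated thread does the slow disk writes. All names here are illustrative, not from the question:

#include <condition_variable>
#include <deque>
#include <fstream>
#include <mutex>
#include <string>
#include <vector>

// Sketch of "queue in RAM, save later": enqueueImage() is cheap and can be
// called from the capture thread; saverThread() runs once in the background.
struct SaveJob { std::string path; std::vector<unsigned char> bytes; };

static std::deque<SaveJob> g_queue;
static std::mutex g_m;
static std::condition_variable g_cv;
static bool g_done = false;

void enqueueImage(std::string path, std::vector<unsigned char> bytes)
{
    std::lock_guard<std::mutex> lk(g_m);
    g_queue.push_back({std::move(path), std::move(bytes)});
    g_cv.notify_one();
}

void saverThread()
{
    for (;;) {
        std::unique_lock<std::mutex> lk(g_m);
        g_cv.wait(lk, [] { return g_done || !g_queue.empty(); });
        if (g_queue.empty()) return;            // shutting down and drained
        SaveJob job = std::move(g_queue.front());
        g_queue.pop_front();
        lk.unlock();                            // write without holding the lock
        std::ofstream(job.path, std::ios::binary)
            .write((const char*)job.bytes.data(), job.bytes.size());
    }
}

You would start saverThread once at startup (with pthread_create or std::thread), and on shutdown set g_done under the lock and notify, so the thread drains the queue and exits.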
Is there any way, using static analysis tools (I'm using CodeSonar now), to detect unreleased-lock problems (something like unreleased semaphores) in the following program? (The relevant part is marked by arrows in the comments.)
The project is a multi-task system using round-robin scheduling, where new_request() is an interrupt task that arrives randomly and send_buffer() is a periodic task.
In the real code, get_buffer() and send_buffer() are various kinds of wrappers which contain many call layers before the actual lock/unlock happens, so I can't simply specify get_buffer() as the lock function in the static analysis tool's settings.
int bufferSize = 0; // say max size is 5

// random task
void new_request()
{
    int bufferNo = get_buffer(); // wrapper
    if (bufferNo == -1)
    {
        return; // buffer is full
    }
    if (check_something() == OK)
    {
        add_to_sendlist(bufferNo); // for asynchronous processing by send_buffer()
    }
    else // bad request
    {
        // ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
        // There should be a clear_buffer() call here,
        // but it was forgotten. Eventually the buffer will
        // fill up and never be cleared once the 5th bad request comes.
        // ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
        do_nothing();
        // clear_buffer(bufferNo);
    }
}

int get_buffer()
{
    if (bufferSize < 5)
    {
        bufferSize++;
        return bufferSize;
    }
    else
    {
        wait_until_empty(); // wait until something is sent by send_buffer()
        return -1;
    }
}

// clear the specified entry in the buffer
void clear_buffer(int bufferNo)
{
    delete(bufferNo);
    bufferSize--;
}

// periodic task
void send_buffer()
{
    int sent = send_1st_stuff_in_list();
    clear_buffer(sent);
}
yoyozi - Fair disclosure: I'm an engineer at GrammaTech who works on CodeSonar.
First some general things. The relevant parts of the manual for this are on the page: codesonar/doc/html/C_Module/LibraryModels/ConcurrencyModelsLocks.html. Especially the bottom of the page on Resolving Lock Operation Identification Problems.
Based on your comments, I think you have already read this, since you address setting the names in the configuration settings.
So then the question is how many different wrappers do you have? If it is only a few, then the settings in the configuration file are the way to go. If there are many, that gets tedious. And if there are very many it becomes practically impossible.
So knowing some estimate for how many wrapper sets you have would help.
Even with the wrappers accounted for, it may be that the deadlock and race detectors aren't quite what you need for your problem.
If I understand your issue correctly, you have a queue with limited space, and by accident malformed items don't get cleaned out of the queue, and so the queue gets full and that stalls all processing. While you may have multiple threads involved in this implementation, the issue itself would still be a problem in a basically serial setting.
The best way to work with an issue like this is to try and make a simpler example that displays the same core problem. If you can do this in a way that can be shared with GrammaTech, we can work with you on ways to adjust settings or maybe provide hints to the analysis so it can find this issue.
If you would like to talk about this in more detail, and with protection against public disclosure of your code, please contact us at support_at_grammatech_dot_com, where the at and dot should be replaced as needed to make a well-formed email address.
I want to create a realtime sine generator using Apple's Core Audio framework. I want to do it at a low level so I can learn and understand the fundamentals.
I know that using PortAudio or JACK would probably be easier, and I will use them at some point, but I would like to get this working first so I can be confident that I understand the fundamentals.
I have literally searched on this topic for days now, but no one seems to have ever created a realtime wave generator using Core Audio, aiming at low latency, in C rather than Swift or Objective-C.
For this I use a project I set up a while ago. It was first designed to be a game, so after the application starts up it enters a run loop. I thought this would fit perfectly, as I can use the main loop to copy samples into the audio buffer and handle rendering and input as well.
So far I get sound. Sometimes it works for a while and then starts to glitch, sometimes it glitches right away.
This is my code. I tried to simplify it and only present the important parts.
I have multiple questions. They are located in the bottom section of this post.
Applications main run loop. This is where it all starts after the window is created and buffers and memory is initialized:
while (OSXIsGameRunning())
{
    OSXProcessPendingMessages(&GameData);

    [GlobalGLContext makeCurrentContext];

    CGRect WindowFrame = [window frame];
    CGRect ContentViewFrame = [[window contentView] frame];

    CGPoint MouseLocationInScreen = [NSEvent mouseLocation];
    BOOL MouseInWindowFlag = NSPointInRect(MouseLocationInScreen, WindowFrame);
    CGPoint MouseLocationInView = {};

    if (MouseInWindowFlag)
    {
        NSRect RectInWindow = [window convertRectFromScreen:NSMakeRect(MouseLocationInScreen.x, MouseLocationInScreen.y, 1, 1)];
        NSPoint PointInWindow = RectInWindow.origin;
        MouseLocationInView = [[window contentView] convertPoint:PointInWindow fromView:nil];
    }

    u32 MouseButtonMask = [NSEvent pressedMouseButtons];

    OSXProcessFrameAndRunGameLogic(&GameData, ContentViewFrame,
                                   MouseInWindowFlag, MouseLocationInView,
                                   MouseButtonMask);

#if ENGINE_USE_VSYNC
    [GlobalGLContext flushBuffer];
#else
    glFlush();
#endif
}
By using VSYNC I can throttle the loop down to 60 FPS. The timing is not super tight, but it is quite steady. I also have some code to throttle it manually using Mach timing, which is even more imprecise; I left it out for readability.
Not using VSYNC, or using Mach timing to get 60 iterations a second, also makes the audio glitch.
Timing log:
CyclesElapsed: 8154360866, TimeElapsed: 0.016624, FPS: 60.155666
CyclesElapsed: 8174382119, TimeElapsed: 0.020021, FPS: 49.946926
CyclesElapsed: 8189041370, TimeElapsed: 0.014659, FPS: 68.216309
CyclesElapsed: 8204363633, TimeElapsed: 0.015322, FPS: 65.264511
CyclesElapsed: 8221230959, TimeElapsed: 0.016867, FPS: 59.286217
CyclesElapsed: 8237971921, TimeElapsed: 0.016741, FPS: 59.733719
CyclesElapsed: 8254861722, TimeElapsed: 0.016890, FPS: 59.207333
CyclesElapsed: 8271667520, TimeElapsed: 0.016806, FPS: 59.503273
CyclesElapsed: 8292434135, TimeElapsed: 0.020767, FPS: 48.154209
What is important here is the function OSXProcessFrameAndRunGameLogic. It is called 60 times a second and it is passed a struct containing basic information like a buffer for rendering, keyboard state and a sound buffer which looks like this:
typedef struct osx_sound_output
{
    game_sound_output_buffer SoundBuffer;
    u32 SoundBufferSize;
    s16* CoreAudioBuffer;
    s16* ReadCursor;
    s16* WriteCursor;

    AudioStreamBasicDescription AudioDescriptor;
    AudioUnit AudioUnit;
} osx_sound_output;
Where game_sound_output_buffer is:
typedef struct game_sound_output_buffer
{
    real32 tSine;
    int SamplesPerSecond;
    int SampleCount;
    int16 *Samples;
} game_sound_output_buffer;
These get set up before the application enters its run loop.
The size of the SoundBuffer itself is SamplesPerSecond * sizeof(uint16) * 2, where SamplesPerSecond = 48000, i.e. 48000 * 2 bytes * 2 channels = 192,000 bytes, enough for one second of 16-bit stereo audio.
So inside OSXProcessFrameAndRunGameLogic is the sound generation:
void OSXProcessFrameAndRunGameLogic(osx_game_data *GameData, CGRect WindowFrame,
                                    b32 MouseInWindowFlag, CGPoint MouseLocation,
                                    int MouseButtonMask)
{
    GameData->SoundOutput.SoundBuffer.SampleCount = GameData->SoundOutput.SoundBuffer.SamplesPerSecond / GameData->TargetFramesPerSecond;

    // Oszi 1
    OutputTestSineWave(GameData, &GameData->SoundOutput.SoundBuffer, GameData->SynthesizerState.ToneHz);

    int16* CurrentSample = GameData->SoundOutput.SoundBuffer.Samples;
    for (int i = 0; i < GameData->SoundOutput.SoundBuffer.SampleCount; ++i)
    {
        *GameData->SoundOutput.WriteCursor++ = *CurrentSample++;
        *GameData->SoundOutput.WriteCursor++ = *CurrentSample++;

        if ((char*)GameData->SoundOutput.WriteCursor >= ((char*)GameData->SoundOutput.CoreAudioBuffer + GameData->SoundOutput.SoundBufferSize))
        {
            //printf("Write cursor wrapped!\n");
            GameData->SoundOutput.WriteCursor = GameData->SoundOutput.CoreAudioBuffer;
        }
    }
}
Where OutputTestSineWave is the part where the buffer is actually filled with data:
void OutputTestSineWave(osx_game_data *GameData, game_sound_output_buffer *SoundBuffer, int ToneHz)
{
    int16 ToneVolume = 3000;
    int WavePeriod = SoundBuffer->SamplesPerSecond / ToneHz;

    int16 *SampleOut = SoundBuffer->Samples;

    for(int SampleIndex = 0;
        SampleIndex < SoundBuffer->SampleCount;
        ++SampleIndex)
    {
        real32 SineValue = sinf(SoundBuffer->tSine);
        int16 SampleValue = (int16)(SineValue * ToneVolume);
        *SampleOut++ = SampleValue;
        *SampleOut++ = SampleValue;

        SoundBuffer->tSine += Tau32*1.0f/(real32)WavePeriod;
        if(SoundBuffer->tSine > Tau32)
        {
            SoundBuffer->tSine -= Tau32;
        }
    }
}
When the buffers are created at startup, Core Audio is also initialized, which I do like this:
void OSXInitCoreAudio(osx_sound_output* SoundOutput)
{
    AudioComponentDescription acd;
    acd.componentType         = kAudioUnitType_Output;
    acd.componentSubType      = kAudioUnitSubType_DefaultOutput;
    acd.componentManufacturer = kAudioUnitManufacturer_Apple;

    AudioComponent outputComponent = AudioComponentFindNext(NULL, &acd);

    AudioComponentInstanceNew(outputComponent, &SoundOutput->AudioUnit);
    AudioUnitInitialize(SoundOutput->AudioUnit);

    // uint16
    //AudioStreamBasicDescription asbd;
    SoundOutput->AudioDescriptor.mSampleRate       = SoundOutput->SoundBuffer.SamplesPerSecond;
    SoundOutput->AudioDescriptor.mFormatID         = kAudioFormatLinearPCM;
    SoundOutput->AudioDescriptor.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsNonInterleaved | kAudioFormatFlagIsPacked;
    SoundOutput->AudioDescriptor.mFramesPerPacket  = 1;
    SoundOutput->AudioDescriptor.mChannelsPerFrame = 2; // Stereo
    SoundOutput->AudioDescriptor.mBitsPerChannel   = sizeof(int16) * 8;
    SoundOutput->AudioDescriptor.mBytesPerFrame    = sizeof(int16); // don't multiply by channel count with non-interleaved!
    SoundOutput->AudioDescriptor.mBytesPerPacket   = SoundOutput->AudioDescriptor.mFramesPerPacket * SoundOutput->AudioDescriptor.mBytesPerFrame;

    AudioUnitSetProperty(SoundOutput->AudioUnit,
                         kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Input,
                         0,
                         &SoundOutput->AudioDescriptor,
                         sizeof(SoundOutput->AudioDescriptor));

    AURenderCallbackStruct cb;
    cb.inputProc       = OSXAudioUnitCallback;
    cb.inputProcRefCon = SoundOutput;

    AudioUnitSetProperty(SoundOutput->AudioUnit,
                         kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Global,
                         0,
                         &cb,
                         sizeof(cb));

    AudioOutputUnitStart(SoundOutput->AudioUnit);
}
The initialization code for Core Audio sets the render callback to OSXAudioUnitCallback:
OSStatus OSXAudioUnitCallback(void * inRefCon,
                              AudioUnitRenderActionFlags * ioActionFlags,
                              const AudioTimeStamp * inTimeStamp,
                              UInt32 inBusNumber,
                              UInt32 inNumberFrames,
                              AudioBufferList * ioData)
{
    #pragma unused(ioActionFlags)
    #pragma unused(inTimeStamp)
    #pragma unused(inBusNumber)

    //double currentPhase = *((double*)inRefCon);

    osx_sound_output* SoundOutput = ((osx_sound_output*)inRefCon);

    if (SoundOutput->ReadCursor == SoundOutput->WriteCursor)
    {
        SoundOutput->SoundBuffer.SampleCount = 0;
        //printf("AudioCallback: No Samples Yet!\n");
    }

    //printf("AudioCallback: SampleCount = %d\n", SoundOutput->SoundBuffer.SampleCount);

    int SampleCount = inNumberFrames;
    if (SoundOutput->SoundBuffer.SampleCount < inNumberFrames)
    {
        SampleCount = SoundOutput->SoundBuffer.SampleCount;
    }

    int16* outputBufferL = (int16 *)ioData->mBuffers[0].mData;
    int16* outputBufferR = (int16 *)ioData->mBuffers[1].mData;

    for (UInt32 i = 0; i < SampleCount; ++i)
    {
        outputBufferL[i] = *SoundOutput->ReadCursor++;
        outputBufferR[i] = *SoundOutput->ReadCursor++;

        if ((char*)SoundOutput->ReadCursor >= (char*)((char*)SoundOutput->CoreAudioBuffer + SoundOutput->SoundBufferSize))
        {
            //printf("Callback: Read cursor wrapped!\n");
            SoundOutput->ReadCursor = SoundOutput->CoreAudioBuffer;
        }
    }

    for (UInt32 i = SampleCount; i < inNumberFrames; ++i)
    {
        outputBufferL[i] = 0.0;
        outputBufferR[i] = 0.0;
    }

    return noErr;
}
This is mostly all there is to it. It is quite long, but I did not see a way to present all the needed information more compactly. I wanted to show everything because I am by no means a professional programmer. If you feel something is missing, please tell me.
My gut feeling is that there is something wrong with the timing. I suspect OSXProcessFrameAndRunGameLogic sometimes needs more time, so that the Core Audio callback is already pulling samples out of the buffer before it has been fully written by OutputTestSineWave.
There is actually more going on in OSXProcessFrameAndRunGameLogic which I did not show here. I am "software rendering" some very basic stuff into a framebuffer which is then displayed by OpenGL, and I also do keypress checks there, because, well, it's the main function of functionality. In the future this is where I would like to handle the controls for multiple oscillators, filters and so on.
Anyway, even if I stop the rendering and input handling from being called every iteration, I still get audio glitches.
I tried pulling all of the sound processing in OSXProcessFrameAndRunGameLogic into its own function void* RunSound(void *GameData) and changed it to:
pthread_t soundThread;
pthread_create(&soundThread, NULL, RunSound, GameData);
pthread_join(soundThread, NULL);
However I got mixed results and was not even sure if multithreading is done like that. Creating and destroying threads 60 times a second didn't seem the way to go.
I also had the idea to let sound processing happen on a completely different thread before the application actually runs into the main loop. Something like two simultaneously running while loops where the first processes audio and the latter UI and input.
Questions:
1. I get glitchy audio. Rendering and input seem to work correctly, but the audio sometimes glitches and sometimes doesn't. From the code I provided, can you see me doing something wrong?
2. Am I using the Core Audio technology the wrong way for real-time, low-latency signal generation?
3. Should I do the sound processing in a separate thread, as I described above? How would threading be done correctly in this context? It would make sense to have a thread dedicated only to sound, right?
4. Am I right that the basic audio processing should not be done in the render callback of Core Audio? Is this function only for outputting the provided sound buffer? And if the sound processing should be done right there, how can I access information like the keyboard state from inside the callback?
5. Are there any resources you could point me to that I may have missed?
This is the only place I know where I can get help with this project. I would really appreciate your help.
And if something is not clear to you please let me know.
Thank you :)
In general, when dealing with low-latency audio you want to achieve the most deterministic behaviour possible.
This, for example, translates to:
- Don't hold any locks on the audio thread (priority inversion)
- No memory allocation on the audio thread (often takes too much time)
- No file/network I/O on the audio thread (often takes too much time)
Question 1:
There are indeed some problems with your code for when you want to achieve continuous, realtime, non-glitching audio.
1. Two different clock domains.
You are providing audio data from (what I call) a different clock domain than the clock domain asking for the data. Clock domain 1 in this case is defined by your TargetFramesPerSecond value, clock domain 2 by Core Audio. However, due to how scheduling works, you have no guarantee that your thread finishes in time and on time. You try to target your rendering to n frames per second, but what happens when you don't make it time-wise? As far as I can see, you don't compensate for how much a render cycle deviated from the ideal timing.
The way threading works is that ultimately the OS scheduler decides when your thread is active. There are never guarantees, and this makes your render cycles imprecise (in terms of the precision you need for audio rendering).
2. There is no synchronisation between the render thread and the Core Audio render-callback thread.
The thread where OSXAudioUnitCallback runs is not the same as the one where OSXProcessFrameAndRunGameLogic, and thus OutputTestSineWave, run. You are providing data from your main thread, and data is being read from the Core Audio render thread. Normally you would use mutexes to protect your data, but in this case that's not possible, because you would run into the problem of priority inversion.
A way of dealing with the race condition is to use a buffer which uses atomic variables to store its usage and pointers, and to let only one producer and one consumer use it.
Good examples of such buffers are:
https://github.com/michaeltyson/TPCircularBuffer
https://github.com/andrewrk/libsoundio/blob/master/src/ring_buffer.h
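Just to illustrate the idea (this is a sketch, not a replacement for the libraries above): the core of a single-producer/single-consumer ring buffer is only a pair of atomically published indices. The capacity and element type here are arbitrary:

#include <atomic>
#include <cstddef>
#include <cstdint>

// Minimal SPSC ring buffer sketch for 16-bit samples.
// One writer thread, one reader thread, no locks; one slot stays unused
// so "full" can be told apart from "empty".
struct SampleRing {
    static constexpr size_t kSize = 1 << 15;    // must be a power of two
    int16_t data[kSize];
    std::atomic<size_t> writePos{0};            // only the producer writes this
    std::atomic<size_t> readPos{0};             // only the consumer writes this

    bool push(int16_t s) {                      // producer side
        size_t w = writePos.load(std::memory_order_relaxed);
        size_t next = (w + 1) & (kSize - 1);
        if (next == readPos.load(std::memory_order_acquire))
            return false;                       // full
        data[w] = s;
        writePos.store(next, std::memory_order_release);
        return true;
    }

    bool pop(int16_t *out) {                    // consumer side
        size_t r = readPos.load(std::memory_order_relaxed);
        if (r == writePos.load(std::memory_order_acquire))
            return false;                       // empty
        *out = data[r];
        readPos.store((r + 1) & (kSize - 1), std::memory_order_release);
        return true;
    }
};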
3. There are a lot of calls in your audio-providing thread which prevent deterministic behaviour.
As you wrote, you are doing a lot more inside the same thread that provides the audio data. Chances are quite high that there will be stuff going on (under the hood) which prevents your thread from being on time. Generally, you should avoid calls which either take too much time or are not deterministic. With all the OpenGL/keypress/framebuffer rendering there is no way to be certain your thread will "arrive on time".
Below are some resources worth looking into.
Question 2:
AFAICT generally speaking, you are using the Core Audio technology correctly. The only problem I think you have is on the providing side.
Question 3:
Yes. Definitely! Although, there are multiple ways of doing this.
In your case you have a normal-priority thread running to do the rendering and a high-performance, realtime thread on which the audio render callback is being called. Looking at your code I would suggest putting the generation of the sine wave inside the render callback function (or call OutputTestSineWave from the render callback). This way you have the audio generation running inside a reliable high prio thread, there is no other rendering interfering with the timing precision and there is no need for a ringbuffer.
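To make that concrete, here is a sketch of what the render callback could look like with the sine generated in place. It reuses the types from your question (osx_sound_output, int16, Tau32), and ToneHz is simply hard-coded here, since parameter passing is covered under question 4:

#include <math.h>                       // sinf
#include <AudioUnit/AudioUnit.h>        // or AudioToolbox/AudioToolbox.h on newer SDKs

// Sketch only: generate the sine directly on the Core Audio thread.
// No locks, no allocation, no I/O; the phase lives in the refCon struct.
OSStatus SineRenderCallback(void *inRefCon,
                            AudioUnitRenderActionFlags *ioActionFlags,
                            const AudioTimeStamp *inTimeStamp,
                            UInt32 inBusNumber,
                            UInt32 inNumberFrames,
                            AudioBufferList *ioData)
{
    #pragma unused(ioActionFlags)
    #pragma unused(inTimeStamp)
    #pragma unused(inBusNumber)

    osx_sound_output *SoundOutput = (osx_sound_output *)inRefCon;
    game_sound_output_buffer *SB = &SoundOutput->SoundBuffer;

    const float ToneHz    = 440.0f;     // hard-coded for the sketch
    const float PhaseStep = Tau32 * ToneHz / (float)SB->SamplesPerSecond;
    float Phase = SB->tSine;

    int16 *OutL = (int16 *)ioData->mBuffers[0].mData;   // non-interleaved stereo,
    int16 *OutR = (int16 *)ioData->mBuffers[1].mData;   // as configured in the question

    for (UInt32 i = 0; i < inNumberFrames; ++i)
    {
        int16 Sample = (int16)(sinf(Phase) * 3000.0f);
        OutL[i] = Sample;
        OutR[i] = Sample;
        Phase += PhaseStep;
        if (Phase > Tau32) Phase -= Tau32;
    }

    SB->tSine = Phase;                  // keep the phase continuous across callbacks
    return noErr;
}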
In other cases where you need to do "non-realtime" processing to get audiodata ready (think of reading from a file, reading from a network or even from another physical audio device) you cannot run this logic inside the Core Audio thread. A way to solve this is to start a separate, dedicated thread to do this processing. To pass the data to the realtime audio thread you would then make use of the earlier mentioned ringbuffer.
It basically boils down to two simple goals: for the realtime thread it is necessary to have the audio data available at all times (for every render call); if this fails you will end up sending invalid (or better, zeroed) audio data.
The main goal for the secondary thread is to fill up the ringbuffer as fast as possible and to keep it as full as possible. So, whenever there is room to put new audio data into the ringbuffer, the thread should do just that.
The size of the ringbuffer dictates how much tolerance there is for delay. It is a balance between certainty (a bigger buffer) and latency (a smaller buffer).
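For a rough feel: at 48000 frames per second, a ring that holds 4096 frames gives you about 4096 / 48000 ≈ 85 ms of slack before the callback runs dry, at the cost of up to that much extra latency.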
BTW. I'm quite certain Core Audio has all the facilities to do all this for you.
Question 4:
There are multiple ways of achieving your goal, and rendering the audio inside the render callback from Core Audio is definitely one of them. The one thing you should keep in mind is that you have to make sure the function returns in time.
For changing parameters to manipulate the audio rendering, you'll have to find a way of passing messages which enables the reader (the audio render function) to get them without locking and waiting. The way I have done this is to create a second ringbuffer which holds messages the audio renderer can consume. This can be as simple as a ringbuffer holding structs with data (or even pointers to data), as long as you stick to the rule of no locking.
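For a single parameter like ToneHz, the simplest lock-free variant is just an atomic that the UI thread writes and the callback reads; the message ringbuffer generalises the same idea to whole structs. A tiny sketch with illustrative names:

#include <atomic>

// UI/main thread writes, audio callback reads; no locks, no waiting.
std::atomic<float> g_toneHz{440.0f};

// On the UI / main thread, e.g. in a key-press handler:
//     g_toneHz.store(523.25f, std::memory_order_relaxed);

// Inside the render callback:
//     float toneHz = g_toneHz.load(std::memory_order_relaxed);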
Question 5:
I don't know what resources you are aware of but here are some must-reads:
http://atastypixel.com/blog/four-common-mistakes-in-audio-development/
http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing
https://developer.apple.com/library/archive/qa/qa1467/_index.html
Your basic problem is that you are trying to push audio from your game loop instead of letting the audio system pull it; i.e. you should always have (or be able to quickly create*) enough audio samples ready for whatever amount the audio callback requests. The "always" has to account for enough slop to cover the timing jitter (being called late, early, or too few times) in your game loop.
(* with no locks, semaphores, memory allocation or Objective-C messages)
Do graphics cards typically write their output to some place in memory which I can access? Do I have to use the driver? If so, can I use OpenGL?
I want to know if it's possible to "capture" the output of a VM on Linux which has direct access to a GPU, and is running Windows. Ideally, I could access the output from memory directly, without touching the GPU, since this code would be able to run on the Linux host.
The other option is to maybe write a Windows driver which reads the output of the GPU and writes it to some place in memory. Then, on the Linux side, a program can read this memory. This seems somewhat impossible, since I'm not really sure how to get a process on the host to share memory with a process on the guest.
Is it possible to do option 1 and simply read the output from memory?
I don't code under Linux, but in Windows (you are running it in a VM anyway) you can use WinAPI to directly access the canvas of any window, or even the desktop, from a third-party app. Some GPU overlays can be problematic to capture (especially DirectX-based ones), but I have had no problems with my GL/GLSL ones so far.
If you have access to the app's source you can use glReadPixels to extract the image from GL directly (but that works only for the current GL-based rendering).
Using glReadPixels
As mentioned, this must be implemented directly in the targeted app, so you need its source code or you must inject your code at the right place/time. I use this code for screenshots:
void OpenGLscreen::screenshot(Graphics::TBitmap *bmp)
{
    if (bmp==NULL) return;
    int *dat=new int[xs*ys],x,y,a,*p;
    if (dat==NULL) return;
    bmp->HandleType=bmDIB;
    bmp->PixelFormat=pf32bit;
    if ((bmp->Width!=xs)||(bmp->Height!=ys)) bmp->SetSize(xs,ys);
    if ((bmp->Width==xs)&&(bmp->Height==ys))
    {
        glReadPixels(0,0,xs,ys,GL_BGRA,GL_UNSIGNED_BYTE,dat);
        glFinish();
        for (a=0,y=ys-1;y>=0;y--)
            for (p=(int*)bmp->ScanLine[y],x=0;x<xs;x++,a++)
                p[x]=dat[a];
    }
    delete[] dat;
}
where xs, ys is the OpenGL window resolution. You can ignore the whole bmp stuff (it is a VCL bitmap I use to store the screenshot) and also the for loops, which just copy the image from the buffer to the bitmap. The important part is just this:
int *dat=new int[xs*ys]; // each pixel is 32bit int
glReadPixels(0,0,xs,ys,GL_BGRA,GL_UNSIGNED_BYTE,dat);
glFinish();
You need to execute this code after the rendering is done, otherwise you will get an unfinished or empty buffer. I use it after redraw/repaint events. As mentioned before, this captures only the GL-rendered content, so if your app combines GDI + OpenGL it is better to use the next approach.
WinAPI approach
To obtain the canvas image of any window I wrote this class:
//---------------------------------------------------------------------------
//--- screen capture ver: 1.00 ----------------------------------------------
//---------------------------------------------------------------------------
class scrcap
{
public:
    HWND hnd,hnda;
    TCanvas *scr;
    Graphics::TBitmap *bmp;
    int x0,y0,xs,ys;

    scrcap()
    {
        hnd=NULL;
        hnda=NULL;
        scr=new TCanvas();
        bmp=new Graphics::TBitmap;
        #ifdef _mmap_h
        mmap_new('scrc',scr,sizeof(TCanvas() ));
        mmap_new('scrc',bmp,sizeof(Graphics::TBitmap));
        #endif
        if (bmp)
        {
            bmp->HandleType=bmDIB;
            bmp->PixelFormat=pf32bit;
        }
        x0=0; y0=0; xs=1; ys=1;
        hnd=GetDesktopWindow();
    }

    ~scrcap()
    {
        #ifdef _mmap_h
        mmap_del('scrc',scr);
        mmap_del('scrc',bmp);
        #endif
        if (scr) delete scr; scr=NULL;
        if (bmp) delete bmp; bmp=NULL;
    }

    void init(HWND _hnd=NULL)
    {
        RECT r;
        if (scr==NULL) return;
        if (bmp==NULL) return;
        bmp->SetSize(1,1);
        if (!IsWindow(_hnd)) _hnd=hnd;
        scr->Handle=GetDC(_hnd);
        hnda=_hnd;
        resize();
    }

    void resize()
    {
        if (!IsWindow(hnda)) return;
        RECT r;
        // GetWindowRect(hnda,&r);
        GetClientRect(hnda,&r);
        x0=r.left; xs=r.right-x0;
        y0=r.top;  ys=r.bottom-y0;
        bmp->SetSize(xs,ys);
        xs=bmp->Width;
        ys=bmp->Height;
    }

    void capture()
    {
        if (scr==NULL) return;
        if (bmp==NULL) return;
        bmp->Canvas->CopyRect(Rect(0,0,xs,ys),scr,TRect(x0,y0,x0+xs,y0+ys));
    }
};
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
Again, it uses VCL, so rewrite the bitmap bmp and canvas scr to your programming environment's style. Also ignore the _mmap_h chunks of code; they are just for debugging/tracing of memory pointers, related to a nasty compiler bug I was facing at the time I wrote this.
The usage is simple:
// globals
scrcap cap;
// init
cap.init();
// on screenshot
cap.capture();
// here use cap.bmp
If you call cap.init() it will lock onto the whole Windows desktop. If you call cap.init(window_handle) it will lock onto a specific visual window/component instead. To obtain a window handle from the third-party-app side, see:
Is there a way an app can display a message without the use of the MessageBox API?
Sorry, it is on SE/RE instead of here on SE/SO, but my answer there covering this topic was deleted. I use this for video capture ... all of the animated GIFs in my answers were created by this code. Another example can be seen at the bottom of this answer of mine:
Image to ASCII art conversion
As you can see, it also works for the DirectX overlay of Media Player Classic (even the Windows PrintScreen function can't capture that correctly). As I wrote, I have had no problems with it so far.
Beware: visual WinAPI calls MUST BE CALLED FROM THE APP'S MAIN THREAD (WNDPROC), otherwise serious problems can occur, leading to random exceptions from unrelated WinAPI calls anywhere in the app.
I'm working on a project in C where I'm going to do some heavy physics calculations, and I want to be able to see the results when I'm finished. The way it works now is that I run GLUT on the main thread and use a separate thread (pthread) for input (from the terminal) and calculations. I currently use glutTimerFunc to do the animation, but the problem is that that function fires every given time interval no matter what. I can stop the animation with an if statement in the animation function, which stops the variables from being updated, but this wastes a lot of resources (I think).
To fix this I was thinking I could use an extra thread with a custom timer function that I can control myself (without glutMainLoop messing things up). Currently this is my test function to check whether this would work (meaning the function itself is in no way finished). It runs in a separate thread created just before glutMainLoop:
void *threadAnimation() {
    while (1) {
        if (animationRun) {
            rotate = rotate + 0.00001;
            if (rotate > 360) {
                rotate = rotate - 360;
            }
            glutSetWindow(window);
            glutPostRedisplay();
        }
    }
}
The specific problem I have is that the animation runs for just a couple of seconds and then stops. Does anybody know how I can fix this? I am planning to use timers and so on later, but what I'm looking for is a way to ensure that glutPostRedisplay is sent to the right place. I thought glutSetWindow(window) was the solution, but apparently not. If I remove glutSetWindow(window) the animation still works, just not for as long, and it runs much faster (so maybe glutSetWindow(window) takes a lot of resources?).
By the way, the variable "window" is created like this:
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
glutInitWindowSize(854, 540);
glutInitWindowPosition(100, 100);
window = glutCreateWindow("Animation View");
init();
glutDisplayFunc(display);
glutReshapeFunc(reshape);
timerInt = pthread_create(&timerThread, NULL, &threadAnimation, NULL);
glutMainLoop();
I don't actually know if this is correct, but it compiles just fine. Any help is greatly appreciated!
Here's a little idea: create a class that will contain all the dynamic settings:
class DynamicInfo {
public:
    int vertexCount;
    float *vertexes;
    ...
    DynamicInfo &operator=(const DynamicInfo &origin);
};
And then the main application will contain these:
DynamicInfo buffers[2];
int activeBuffer = 0;
In the animation thread, just draw (maybe use a lock around the one shared variable):
DynamicInfo *current = buffers + activeBuffer; // Or rather use reference
In calculations:
// Store currently used buffer as current (for future manipulation)
DynamicInfo *current = buffers + activeBuffer;
// We finished calculations on activeBuffer[1] so we may propagate it to application
activeBuffer = (activeBuffer + 1)%2;
// Let actual data propagate to current buffer
(*current) = buffers[activeBuffer];
Again, only a single variable needs locking.
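Putting it together, the hand-off could look roughly like this. It mirrors the scheme above (a mutex only around the index), and the names are illustrative; as written it assumes the display callback is done with current before the next flip, otherwise you would want a third buffer or to copy the data under the lock:

#include <pthread.h>

DynamicInfo buffers[2];
int activeBuffer = 0;                       // index the drawer reads from
pthread_mutex_t bufferLock = PTHREAD_MUTEX_INITIALIZER;

// calculation thread: fill the inactive buffer, then flip the index
void publishFrame(const DynamicInfo &result)
{
    int inactive = 1 - activeBuffer;        // only this thread ever flips the index
    buffers[inactive] = result;             // heavy copy done outside the lock
    pthread_mutex_lock(&bufferLock);
    activeBuffer = inactive;                // the flip itself is tiny
    pthread_mutex_unlock(&bufferLock);
}

// display callback: grab the current index, then draw from that buffer
void drawFrame()
{
    pthread_mutex_lock(&bufferLock);
    const DynamicInfo *current = &buffers[activeBuffer];
    pthread_mutex_unlock(&bufferLock);
    // ... issue GL calls using current->vertexes / current->vertexCount ...
}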
I'm trying to read a specific pixel in a window that is hidden behind others. I want to use the GetPixel function from the GDI library, but it seems to work only with the global device context; I can't read a pixel from a specific window and I don't understand why.
I found this article, which uses the PrintWindow function to copy a specific window's content into a temporary device context that can then be read, but I couldn't reproduce it.
EDIT
Thank you all, my problem is solved :)
This script gives you the RGB color under the pointer in the chosen window, even when the window is hidden. Note that the program must be launched with admin privileges in order to read pixels of processes that were themselves launched with admin privileges.
#define STRICT
#define WINVER 0x0501
#define _WIN32_WINNT 0x0501
// 0x0501 for the PrintWindow function
// You must be running at least Windows XP
// See http://msdn.microsoft.com/en-us/library/6sehtctf.aspx

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

#define WINDOW_LIST_LIMIT 32
#define WINDOW_NAME_LIMIT 1024

void FatalError(char* error)
{
    printf("%s", error);
    exit(-1);
}

HWND window_list[WINDOW_LIST_LIMIT];
unsigned int window_list_index = 0;

BOOL EnumWindowsProc(HWND window_handle, LPARAM param)
{
    char window_title[WINDOW_NAME_LIMIT];

    if(!IsWindowVisible(window_handle)) return TRUE;

    RECT rectangle = {0};
    GetWindowRect(window_handle, &rectangle);
    if(IsRectEmpty(&rectangle)) return TRUE;

    GetWindowText(window_handle, window_title, sizeof(window_title));
    if(strlen(window_title) == 0) return TRUE;
    if(!strcmp(window_title, "Program Manager")) return TRUE;

    window_list[window_list_index] = window_handle;
    window_list_index++;
    printf("%u - %s\n", window_list_index, window_title);

    if(window_list_index == WINDOW_LIST_LIMIT) return FALSE;
    return TRUE;
}

int main(int argc, char** argv)
{
    unsigned int input;

    EnumWindows((WNDENUMPROC) EnumWindowsProc, (LPARAM) NULL);

    printf("\nChoose a window: ");
    scanf("%u", &input);
    printf("\n");
    if(input < 1 || input > window_list_index) FatalError("Bad choice..\n");

    HDC window_dc = GetWindowDC(window_list[input - 1]), global_dc = GetDC(0), temp_dc;
    if(!window_dc && !global_dc) FatalError("Fatal Error - Cannot get device context.\n");

    POINT cursor, previous_cursor = {-1, -1};

    while(1)
    {
        temp_dc = CreateCompatibleDC(window_dc);
        if(!temp_dc) FatalError("Fatal Error - Cannot create compatible device context.\n");

        RECT window_rectangle;
        GetWindowRect(window_list[input - 1], &window_rectangle);

        HBITMAP bitmap = CreateCompatibleBitmap(window_dc,
                                                window_rectangle.right - window_rectangle.left,
                                                window_rectangle.bottom - window_rectangle.top);
        if(bitmap)
        {
            SelectObject(temp_dc, bitmap);
            PrintWindow(window_list[input - 1], temp_dc, 0);
            DeleteObject(bitmap);
        }

        GetCursorPos(&cursor);
        if(cursor.x != previous_cursor.x || cursor.y != previous_cursor.y)
        {
            COLORREF color = GetPixel(temp_dc, cursor.x - window_rectangle.left, cursor.y - window_rectangle.top);
            int red = GetRValue(color);
            int green = GetGValue(color);
            int blue = GetBValue(color);
            printf("\rRGB %02X%02X%02X", red, green, blue);
            previous_cursor = cursor;
        }

        DeleteDC(temp_dc);
        Sleep(50); // small delay to avoid hammering the CPU
    }

    ReleaseDC(window_list[input - 1], window_dc);
    return 0;
}
I've changed some things; User32 isn't dynamically loaded anymore.
It compiles with
gcc main.c -o main.exe -lGdi32 -lUser32
Have a great day!
You are passing a process handle to GetDC. That's not right. Processes don't have device contexts, windows do. Remember a process can have many windows, or even none at all.
You need to get hold of the window handle, the HWND, for the window in question, and pass that to GetDC. I'd look to using FindWindow or EnumWindows to find your target top-level window.
Of course, there may be other problems with your code, but that's the one that jumps out at me.
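For example, something along these lines; the window title is just a placeholder for however you identify your target window:

#include <windows.h>

// Sketch: find a top-level window by its title and get a DC for it.
HDC GetDCForWindowTitled(const char *title, HWND *outHwnd)
{
    HWND hwnd = FindWindowA(NULL, title);   // a window handle, not a process handle
    if (hwnd == NULL) return NULL;
    *outHwnd = hwnd;
    return GetDC(hwnd);                     // release with ReleaseDC(hwnd, dc) when done
}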
HDC process_dc = GetDC(process_handle)
Well that's all kinds of wrong. GetDC accepts a window handle, not a process handle.
In order to find such errors, recompile with
#define STRICT
placed before your includes.
This is a bit of a confusing topic, so let's see if I can clarify a few things.
First things first: as both David and Ben have already answered, you're passing a process handle to the GetDC function, which is wrong. GetDC accepts a handle to a window (the HWND type), and it returns a device context (DC, the HDC type) corresponding to that window. You need to get that fixed before anything else will work.
Now, as the article you've read indicates, windows (assuming they've been correctly programmed) respond to the WM_PRINT or WM_PRINTCLIENT messages by rendering an image of themselves into the specified device context (HDC). This is a simple and effective way of capturing an "image" of a window, whether an overlapping window or the window of an individual control.
The rub comes in, as Hans mentioned in a comment, because handles to device contexts have process affinity, which means that the HDC you pass to the window in a separate process, into which it is supposed to render itself, will not be valid from that other process. Handles to device contexts cannot be passed across process boundaries. That's the primary reason that your code fails (or is going to fail, once you fix the handle type problems). The MSDN entry on GDI Objects makes this explicitly clear:
Handles to GDI objects are private to a process. That is, only the process that created the GDI object can use the object handle.
Fixing or getting around that is going to be a bit of an uphill battle. The only solution that I know of is to inject code into the other application's process that first creates a DC in memory, then sends the WM_PRINT or WM_PRINTCLIENT message to a window owned by that process to draw into that in-memory device context, and then transfers the resulted bitmap back to your own application. This is going to require that you implement some type of inter-process communication mechanism.
I've seen some anecdotal evidence that passing device context handles between processes via the WM_PRINT and WM_PRINTCLIENT messages "works", but it's unclear to me whether this is an artifact of the current implementation (and therefore subject to breaking in future versions of Windows), or if this is because Windows is actually handling the marshaling between processes. I haven't seen any documentation one way or the other. If this is a one-off project you're doing for fun or for a limited use, you might try it and get away with it. For other purposes, you probably want to investigate using IPC to really do this the right way.
Don't use GetDC for the DC to pass to PrintWindow. You need to create a compatible DC as you're doing (though you can pass it NULL to get a generic screen DC), then create a compatible bitmap the size of the window you're trying to capture and select it into the DC. Then pass that DC handle to PrintWindow.
Windows aren't required to respond properly to WM_PRINT or WM_PRINTCLIENT, so there may be some glitches even when you get this to work.
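For reference, a condensed sketch of that sequence (error handling omitted; the caveat above still applies, and PW_CLIENTONLY is optional and assumes the client area is what you want):

#include <windows.h>

// Sketch: render a (possibly hidden) window into an in-memory bitmap,
// then read pixels from the memory DC.
COLORREF ReadWindowPixel(HWND hwnd, int x, int y)
{
    RECT rc;
    GetClientRect(hwnd, &rc);

    HDC screenDC = GetDC(NULL);                              // generic screen DC
    HDC memDC    = CreateCompatibleDC(screenDC);
    HBITMAP bmp  = CreateCompatibleBitmap(screenDC, rc.right, rc.bottom);
    HGDIOBJ old  = SelectObject(memDC, bmp);

    PrintWindow(hwnd, memDC, PW_CLIENTONLY);                 // window draws itself into memDC
    COLORREF color = GetPixel(memDC, x, y);

    SelectObject(memDC, old);
    DeleteObject(bmp);
    DeleteDC(memDC);
    ReleaseDC(NULL, screenDC);
    return color;
}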