PeekMessage triggers WndProc callback - c

Yesterday I encountered the weirdest problem I have ever seen.
I wrote a module that should get a notification on USB plugs.
To do so, I created a dummy window and registered it to device change notifications using some interface's GUID.
The strange error occurs when PeekMessage is called.
At this point, for some reason, the window's WndProc callback is called, but only when the peeked message is WM_DEVICECHANGE (the notification we registered for).
On any other message, the DispatchMessage triggers the callback, as expected.
NotificationFilter.dbcc_size = sizeof(DEV_BROADCAST_DEVICEINTERFACE);
NotificationFilter.dbcc_devicetype = DBT_DEVTYP_DEVICEINTERFACE;
NotificationFilter.dbcc_classguid = guid;
not = RegisterDeviceNotification(
    hWnd, // events recipient
    &NotificationFilter, // type of device
    DEVICE_NOTIFY_WINDOW_HANDLE // type of recipient handle
);
To integrate this module with the rest of my code, which is asynchronous (the Reactor design pattern built on Windows events), and following the advice of Stack Overflow community members, I used MsgWaitForMultipleObjects to listen for both events and window messages.
for (;;)
{
    dwRetval = MsgWaitForMultipleObjects(cntEvents, arrEvents, FALSE, INFINITE, QS_ALLINPUT);
    switch (dwRetval)
    {
    case WAIT_FAILED:
        // failed. TODO: status
        break;
    // TODO: handle abandoned.
    default:
        if (dwRetval == cntEvents)
        {
            // Message has popped.
            BOOL x = PeekMessage(&tMsg, hWnd, 0, 0, PM_REMOVE); // <---- WM_DEVICECHANGE triggers the callback
            if (x)
                DispatchMessage(&tMsg);
        }
        else if (dwRetval < cntEvents)
        {
            // event signaled
        }
        else
        {
            // TODO: status. unexpected.
            return FALSE; // unexpected failure
        }
    }
}
I disassembled the code, and compared the registers before any call to NtUserPeekMessage
Registers on successful call:
RAX = 00000059A604EFB0 RBX = 0000000000000000 RCX = 00000059A604EF18
RDX = 0000000000070C62 RSI = 00000059A604EF18 RDI = 0000000000070C62
R8  = 0000000000000000 R9  = 0000000000000000 R10 = 00007FF71A65D800
R11 = 0000000000000246 R12 = 0000000000000000 R13 = 0000000000000000
R14 = 0000000000000000 R15 = 0000000000000000 RIP = 00007FF954562AA1
RSP = 00000059A604EE70 RBP = 0000000000000000 EFL = 00000200
Registers on unknown callback trigger call:
RAX = 00000059A604EFB0 RBX = 0000000000000000 RCX = 00000059A604EF18
RDX = 0000000000070C62 RSI = 00000059A604EF18 RDI = 0000000000070C62
R8  = 0000000000000000 R9  = 0000000000000000 R10 = 00007FF71A65D800
R11 = 0000000000000246 R12 = 0000000000000000 R13 = 0000000000000000
R14 = 0000000000000000 R15 = 0000000000000000 RIP = 00007FF954562AA1
RSP = 00000059A604EE70 RBP = 0000000000000000 EFL = 00000200
The registers are exactly the same! (No parameters are passed on the stack, 64bit..)
In both cases (strange error and expected flow) I stepped into NtUserPeekMessage, and it turns out that the WndProc callback is triggered only from within the internal syscall!
00007FF954562A80 mov r10,rcx
00007FF954562A83 mov eax,1003h
00007FF954562A88 syscall
I couldn't find any documentation on MSDN, or any explanation on the internet, of this phenomenon.
I would really like some help,
Thanks in advance.

That is as expected, and is documented. PeekMessage is one of the functions that dispatches sent messages. From the documentation:
Dispatches incoming sent messages, checks the thread message queue for a posted message, and retrieves the message (if any exist).
And then later in the same document:
During this call, the system delivers pending, nonqueued messages, that is, messages sent to windows owned by the calling thread using the SendMessage, SendMessageCallback, SendMessageTimeout, or SendNotifyMessage function.
The documentation for SendMessage says this (with my emphasis):
If the specified window was created by the calling thread, the window procedure is called immediately as a subroutine. If the specified window was created by a different thread, the system switches to that thread and calls the appropriate window procedure. Messages sent between threads are processed only when the receiving thread executes message retrieval code.
By message retrieval code the documentation means functions like GetMessage and PeekMessage. There are a few others; I don't have a comprehensive list at hand.
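To make the distinction concrete, here is a toy model in portable C of the behavior quoted above. It is only a sketch: every `toy_` name is made up for illustration and none of them are Win32 APIs. It assumes a thread holds two lists, one of cross-thread sent messages and one of posted messages, and that the retrieval call delivers the sent ones itself before returning a posted one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are Win32 APIs. */
#define TOY_QUEUE_CAP 8

typedef void (*toy_wndproc)(int msg);

typedef struct {
    int sent[TOY_QUEUE_CAP];    /* cross-thread "sent" messages awaiting delivery */
    size_t sent_count;
    int posted[TOY_QUEUE_CAP];  /* "posted" messages waiting in the queue */
    size_t posted_head, posted_count;
    toy_wndproc wndproc;        /* the window procedure of this thread's window */
} toy_thread;

/* Models another thread sending a message: it is not run now, only recorded. */
static void toy_send_from_other_thread(toy_thread *t, int msg) {
    assert(t->sent_count < TOY_QUEUE_CAP);
    t->sent[t->sent_count++] = msg;
}

/* Models posting a message: it is appended to the posted queue. */
static void toy_post(toy_thread *t, int msg) {
    assert(t->posted_count < TOY_QUEUE_CAP);
    t->posted[(t->posted_head + t->posted_count++) % TOY_QUEUE_CAP] = msg;
}

/* Models a retrieval call: first deliver every pending sent message by
   calling the window procedure directly, then hand back one posted
   message, if any. Returns 1 if *out was filled, 0 otherwise. */
static int toy_peek(toy_thread *t, int *out) {
    for (size_t i = 0; i < t->sent_count; ++i)
        t->wndproc(t->sent[i]);
    t->sent_count = 0;
    if (t->posted_count == 0)
        return 0;
    *out = t->posted[t->posted_head];
    t->posted_head = (t->posted_head + 1) % TOY_QUEUE_CAP;
    t->posted_count--;
    return 1;
}

/* Models the dispatch step: only now does a posted message reach the wndproc. */
static void toy_dispatch(toy_thread *t, int msg) {
    t->wndproc(msg);
}

/* A window procedure that records the order in which messages arrive. */
static int toy_log[TOY_QUEUE_CAP];
static size_t toy_log_count;

static void toy_recording_wndproc(int msg) {
    if (toy_log_count < TOY_QUEUE_CAP)
        toy_log[toy_log_count++] = msg;
}
```

In this model, a message recorded with toy_send_from_other_thread reaches the window procedure during toy_peek, while a message added with toy_post reaches it only when the caller passes it to toy_dispatch. That mirrors what the question observed for WM_DEVICECHANGE versus the other messages.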


GetOverlappedResult(bWait=TRUE) vs WaitForSingleObject() for overlapped I/O

When I open and read a file in overlapped mode with the Win32 API, I have several ways to complete the I/O request, including waiting on the file handle (or on the event in the OVERLAPPED structure) using either:
WaitForSingleObject
GetOverlappedResult with bWait=TRUE
Both functions seem to have the same effect: the thread is blocked until the handle or event is signaled, which means the data has been placed in the buffer provided to ReadFile.
So, what is the difference? Why do I need GetOverlappedResult?
I fully agree with Remus Rusanu's answer. Also, instead of creating your own IOCP and a thread pool that listens on it, you can use either BindIoCompletionCallback or CreateThreadpoolIo (available from Vista onward). In that case the system itself creates the IOCP and the thread pool that listens on it, and calls your callback when an operation completes. This greatly simplifies the code compared with your own IOCP and thread pool (which, I think, only makes sense to implement when you have a very large number of I/O operations, say socket I/O on the server side, and need special performance tuning).
So, what is the difference? Why do I need GetOverlappedResult?
As the implementation below shows, GetOverlappedResult[Ex] does not only wait for the result. It also:
returns NumberOfBytesTransferred to you if the operation has completed;
converts an error NTSTATUS to a Win32 error and sets the last error if the operation completed with an error;
selects whether to wait on hFile or hEvent if the operation is still pending and you want to wait.
So GetOverlappedResult[Ex] does much more than simply call WaitForSingleObject.
However, it is not very hard to implement this API yourself. For example:
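// Hand-rolled equivalent of GetOverlappedResult (the function name is illustrative).
BOOL WINAPI MyGetOverlappedResult(
    _In_ HANDLE hFile,
    _In_ LPOVERLAPPED lpOverlapped,
    _Out_ LPDWORD lpNumberOfBytesTransferred,
    _In_ BOOL bWait
    )
{
    if ((NTSTATUS)lpOverlapped->Internal == STATUS_PENDING)
    {
        if (!bWait)
            return FALSE;
        // Wait on the event if one was supplied, otherwise on the file handle.
        if (lpOverlapped->hEvent)
            hFile = lpOverlapped->hEvent;
        if (WaitForSingleObject(hFile, INFINITE) != WAIT_OBJECT_0)
            return FALSE;
    }
    *lpNumberOfBytesTransferred = (ULONG)lpOverlapped->InternalHigh;
    NTSTATUS status = (NTSTATUS)lpOverlapped->Internal;
    if (status)
        SetLastError(RtlNtStatusToDosError(status)); // convert the NTSTATUS to a Win32 error
    return NT_SUCCESS(status);
}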
So which is better: using GetOverlappedResult[Ex] or implementing its functionality yourself?
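One detail worth spelling out is how the final NT_SUCCESS check classifies the status stored in the OVERLAPPED structure. The sketch below is portable C with local stand-ins (all `toy_`/`TOY_` names are mine, redefined so it builds without the Windows SDK; the numeric values are the ones from ntstatus.h). It assumes only that NT_SUCCESS means "the status is non-negative when viewed as a signed 32-bit value".

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins so the sketch builds without the Windows SDK. */
typedef int32_t toy_ntstatus;

#define TOY_NT_SUCCESS(s)          ((toy_ntstatus)(s) >= 0)
#define TOY_STATUS_SUCCESS         ((toy_ntstatus)0x00000000)
#define TOY_STATUS_PENDING         ((toy_ntstatus)0x00000103)
#define TOY_STATUS_BUFFER_OVERFLOW ((toy_ntstatus)0x80000005) /* warning severity */
#define TOY_STATUS_END_OF_FILE     ((toy_ntstatus)0xC0000011) /* error severity */

typedef enum { TOY_IO_PENDING, TOY_IO_SUCCEEDED, TOY_IO_FAILED } toy_io_state;

/* Classifies a completion status the way the hand-written function does:
   pending is checked first, because STATUS_PENDING is itself non-negative
   and would otherwise count as success. */
static toy_io_state toy_classify(toy_ntstatus status) {
    assert(sizeof(toy_ntstatus) == 4);
    if (status == TOY_STATUS_PENDING)
        return TOY_IO_PENDING;
    return TOY_NT_SUCCESS(status) ? TOY_IO_SUCCEEDED : TOY_IO_FAILED;
}
```

Under this reading, both warning and error statuses make the function return FALSE, which is why the byte count is copied out before the status is examined.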
You could use either, but truly that's not the 'right' way of doing it. You should attach the handle to an I/O completion port and then wait on the completion port. This way you have one pool of threads servicing many I/O events, as you can attach multiple handles to a completion port. I recommend reading Designing Applications for High Performance.

Which control codes have to be implemented in the control handler of a service

The SERVICE_STATUS documentation says this structure has to be filled out when calling the SetServiceStatus() function.
The third field is dwControlsAccepted.
Unfortunately I have not found any information about which control codes MUST ALWAYS be implemented, or at least reacted to.
The page says:
By default, all services accept the SERVICE_CONTROL_INTERROGATE value.
But, is there a problem when the service control handler does not react to the SERVICE_CONTROL_STOP control code? Is there a problem when the service control handler does not at least call SetServiceStatus() in this case?
As far as dwControlsAccepted is concerned, there are no mandatory control codes. You can set this value to zero if that meets your needs. Apart from SERVICE_CONTROL_INTERROGATE your code does not need to handle any control codes that you have not specified as acceptable.
For example, if you have not set SERVICE_ACCEPT_STOP then Windows will never send you the SERVICE_CONTROL_STOP control. Any attempt to stop the service will result in error 1052, "The requested control is not valid for this service."
Note that unless you have a specific need to perform a clean shutdown (for example, because you have a database file that has to be properly closed) you do not need to accept shutdown controls either. Such a service will continue to run until the computer is actually powered down.
If you always set dwControlsAccepted to zero, this is all you need for a control handler:
static DWORD WINAPI ServiceHandlerEx(DWORD control, DWORD eventtype, LPVOID lpEventData, LPVOID lpContext)
{
    return NO_ERROR;
}
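The rule described above can be written down as a small check. This is a toy model in portable C, not the service control manager's actual code: the function toy_check_control and all `TOY_` names are hypothetical, and the constants are copies of the winsvc.h and winerror.h values, redefined so the sketch builds anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Copies of the winsvc.h / winerror.h values, redefined for portability. */
#define TOY_SERVICE_ACCEPT_STOP           0x00000001u
#define TOY_SERVICE_ACCEPT_PAUSE_CONTINUE 0x00000002u
#define TOY_SERVICE_ACCEPT_SHUTDOWN       0x00000004u

#define TOY_SERVICE_CONTROL_STOP          1u
#define TOY_SERVICE_CONTROL_PAUSE         2u
#define TOY_SERVICE_CONTROL_CONTINUE      3u
#define TOY_SERVICE_CONTROL_INTERROGATE   4u
#define TOY_SERVICE_CONTROL_SHUTDOWN      5u

#define TOY_NO_ERROR                      0u
#define TOY_ERROR_INVALID_SERVICE_CONTROL 1052u

/* Hypothetical model: returns TOY_NO_ERROR if the control would reach the
   handler, or 1052 if it would be rejected because the matching accept bit
   is missing from dwControlsAccepted. */
static uint32_t toy_check_control(uint32_t accepted, uint32_t control) {
    uint32_t required;
    assert(control != 0u); /* 0 is not a control code */
    switch (control) {
    case TOY_SERVICE_CONTROL_INTERROGATE:
        return TOY_NO_ERROR; /* accepted by default, no bit needed */
    case TOY_SERVICE_CONTROL_STOP:
        required = TOY_SERVICE_ACCEPT_STOP;
        break;
    case TOY_SERVICE_CONTROL_PAUSE:
    case TOY_SERVICE_CONTROL_CONTINUE:
        required = TOY_SERVICE_ACCEPT_PAUSE_CONTINUE;
        break;
    case TOY_SERVICE_CONTROL_SHUTDOWN:
        /* Shutdown is issued by the system; the model only reports
           whether the handler would be told about it. */
        required = TOY_SERVICE_ACCEPT_SHUTDOWN;
        break;
    default:
        /* Other codes are outside this sketch; reported as rejected here
           only to keep the function total. */
        return TOY_ERROR_INVALID_SERVICE_CONTROL;
    }
    return (accepted & required) ? TOY_NO_ERROR : TOY_ERROR_INVALID_SERVICE_CONTROL;
}
```

With dwControlsAccepted set to zero, only the interrogate control gets through in this model, which is why the empty handler above is enough in that case.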

(Why) Does Windows "Calc.exe" lack a WndProc?

I am fiddling with WndProcs and WinSpy++, and I stumbled upon a strange thing with calc.exe.
It appears to lack a WndProc.
Here is my screenshot: a test program I made, the WinSpy++ window showing N/A, and the culprit.
Maybe the tool is a bit outdated, but the empirical evidence suggests no WndProc is there.
I don't know if this is by design (this would be strange), or if I am missing something...
Here is referenced code:
Function FindWindow(title As String) As IntPtr
Return AutoIt.AutoItX.WinGetHandle(title)
End Function
Function GetWindowProc(handle As IntPtr) As IntPtr
Return GetWindowLong(handle, WindowLongFlags.GWL_WNDPROC)
End Function
In short (about your code): GetWindowLong() fails because you're trying to read an address in target process address space.
When GetWindowLong() returns 0 it means there is an error, from MSDN:
If the function fails, the return value is zero. To get extended error information, call GetLastError.
Check Marshal.GetLastWin32Error() and you will probably see that the error code is ERROR_ACCESS_DENIED (numeric value 0x5).
Why? Because GetWindowLong() is trying to get the address (or handle) of a window procedure that lives not in your code but in the target process. In theory it may even be the default window procedure, but I have never seen an application main window that doesn't handle at least a few messages. You might use this trick (I never tried it!) to see whether a window is using the default procedure (whether or not you get an address); I don't know, someone should try.
Now think what WNDPROC is:
An address (valid in process A) is not callable in process B (where it makes no sense at all). Windows DLLs code segments are shared across processes (I assume, I didn't check but it's reasonable in the game between safety and performance).
Moreover CallWindowProc(NULL, ...) will understand that NULL as a special value to invoke window procedure for that window class (on HWND owner). From MSDN:
...If this value is obtained by calling the GetWindowLong function ...the address of a window or dialog box procedure, or a special internal value meaningful only to CallWindowProc.
How does Microsoft Spy++ do it (and maybe WinSpy++ does not)? Hard to say without the WinSpy++ source code. It is certainly not as easy as calling GetWindowLong(); the right way should involve CreateRemoteThread() and calling LoadLibrary() from it, but neither the Microsoft Spy++ nor the WinSpy++ source code is available (AFAIK) for further inspection...
Inspecting or debugging WinSpy++ is pretty off-topic for this question (you should file a ticket with its developers; your own code may fail for the reason I explained above, and you should always check error codes), but we may take a look for fun.
In InjectThread.c we see it uses WriteProcessMemory + CreateRemoteThread then ReadProcessMemory to read data back (not relevant code omitted):
// Write a copy of our injection thread into the remote process
WriteProcessMemory(hProcess, pdwRemoteCode, lpCode, cbCodeSize, &dwWritten);
// Write a copy of the INJTHREAD to the remote process. This structure
// MUST start on a 32bit boundary
pRemoteData = (void *)((BYTE *)pdwRemoteCode + ((cbCodeSize + 4) & ~ 3));
// Put DATA in the remote thread's memory block
WriteProcessMemory(hProcess, pRemoteData, lpData, cbDataSize, &dwWritten);
hRemoteThread = CreateRemoteThread(hProcess, NULL, 0,
(LPTHREAD_START_ROUTINE)pdwRemoteCode, pRemoteData, 0, &dwRemoteThreadId);
// Wait for the thread to terminate
WaitForSingleObject(hRemoteThread, INFINITE);
// Read the user-structure back again
if(!ReadProcessMemory(hProcess, pRemoteData, lpData, cbDataSize, &dwRead))
//an error occurred
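The offset expression in that excerpt, (cbCodeSize + 4) & ~3, is worth a second look. Here is a small portable C sketch of what it computes, next to the more conventional round-up form. The helper names are mine, not WinSpy++'s; only the expression itself comes from the excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the data block relative to the start of the remote code,
   using the same expression as the excerpt. The result is always a
   multiple of 4 and always strictly greater than cbCodeSize. */
static size_t toy_data_offset(size_t cbCodeSize) {
    size_t off = (cbCodeSize + 4) & ~(size_t)3;
    assert(off % 4 == 0 && off > cbCodeSize);
    return off;
}

/* Conventional "round up to a multiple of 4", shown for comparison.
   It leaves a size that is already aligned unchanged. */
static size_t toy_round_up_4(size_t n) {
    return (n + 3) & ~(size_t)3;
}
```

So the data block starts on a 4-byte boundary, with between 1 and 4 bytes of padding after the code. For an already aligned size such as 8, the excerpt's form yields 12 where the conventional form yields 8.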
The window procedure shown in the "General" tab differs from the one in the "Class" tab (the "Class" tab correctly displays a value). From DisplayClassInfo.c:
//window procedure
if(spy_WndProc == 0)
    wsprintf(ach, _T("N/A"));
else
{
    wsprintf(ach, szHexFmt, spy_WndProc);
    if(spy_WndProc != spy_WndClassEx.lpfnWndProc)
        lstrcat(ach, _T(" (Subclassed)"));
}
//class window procedure
if(spy_WndClassEx.lpfnWndProc == 0)
    wsprintf(ach, _T("N/A"));
else
    wsprintf(ach, szHexFmt, spy_WndClassEx.lpfnWndProc);
As you see they're different values (obtained in different ways). Code to fill spy_WndProc is in WinSpy.c and GetRemoteWindowInfo.c. Extracted code from GetRemoteInfo() in WinSpy.c:
GetClassInfoEx(0, spy_szClassName, &spy_WndClassEx);
GetRemoteWindowInfo(hwnd, &spy_WndClassEx, &spy_WndProc, spy_szPassword, 200);
Now in GetRemoteWindowInfo() we see a call to GetClassInfoExProc (injected in the other process):
pInjData->wndproc = (WNDPROC)pInjData->fnGetWindowLong(pInjData->hwnd, GWL_WNDPROC);
(LPTSTR)pInjData->szClassName, &pInjData->wcOutput);
As you can see (please follow along in the source code), wcOutput is what is displayed in the "Class" tab and wndproc is what is displayed in the "General" tab. Simply put, GetWindowLong() fails but GetClassInfoEx does not. They do not necessarily retrieve the same value because (if I'm not wrong) what you have in WNDCLASSEX is what you registered with RegisterClassEx, whereas what you get with GetWindowLong() is what you hooked with SetWindowLong().
You are right. It does not have a WndProc(...) function. It simply uses a DlgProc to process the dialog events. I know this because I have written 'server/thin client' code in C/C++ to capture direct calls into Windows API functions like WndProc(...), or really any Windows GUI function, BeginPaint(...) for example. I used CALC.EXE as a test: the executable runs on the server while GUI calls are relayed to and returned from the thin client. I have only tested calc.exe versions through Vista. There is a chance the newer versions have been 'programmed' differently, meaning not using the Win32 SDK. But even MFC is just a shell around the Win32 SDK.

Working with WINAPI with couple of threads

I've been working with WinAPI for a while, and I noticed that whenever I try to use WinAPI functions (such as creating buttons or windows, updating a ListView, and so on) inside a thread which isn't the main thread, the result just won't show up.
So for example, if I want to add items to a ListView, and I call a function that takes a string and adds it to the listview, if I call the function from the main thread, it'll work great, but if I call it from a different thread, it won't work at all.
What can I do?
As with most (all?) GUI systems you need to update the GUI from the thread that owns the window (usually the main thread). You need to find a way to communicate between the two threads. In Win32 my preferred way is to send a user message to the GUI thread (via PostMessage) and update accordingly. You will need to ensure there's no concurrent access to data you send between them, for example protect global data with a Critical Section or something.
A simple example, semi pseudo code:
// Worker thread: do some number crunching...
// inform user
EnterCriticalSection(&MessageCrit); // Protect the global data
strncpy(StatusMessageText, "Crunching away...", ARRAYSIZE(StatusMessageText));
LeaveCriticalSection(&MessageCrit);
PostMessage(hwndMain, WM_MY_MESSAGE, 0, 0); // You can utilize the params to your heart's content: structures, enums, etc...
// GUI thread: message handler
switch (message)
{
case WM_INITDIALOG: // etc - whatever is in your normal message handler
    ListView_InsertItem(...); // etc
    break;
case WM_MY_MESSAGE:
    EnterCriticalSection(&MessageCrit); // Protect the global data
    ListView_SetItemText(item, StatusMessageText);
    LeaveCriticalSection(&MessageCrit);
    break;
}
You should either use PostMessage:
static LVITEM lvi = { ... };
PostMessage( myListView, LVM_INSERTITEM, 0, (LPARAM)&lvi );
or, if you need the return value, create a message queue for your thread first:
MSG msg;
PeekMessage( &msg, 0, 0, 0, PM_NOREMOVE );
static LVITEM lvi = { ... };
ListView_InsertItem( myListView, &lvi );
If you use PostMessage, be sure to keep the memory alive also after PostMessage returns, as the message is processed asynchronously by your main thread.
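One common way to satisfy that lifetime requirement is to hand ownership of a heap copy to the receiver. The sketch below is a single-threaded toy model in portable C: the queue and every `toy_` name are invented for illustration and stand in for the real message queue, so it shows only the ownership hand-off, not actual cross-thread delivery.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a thread's posted-message queue; not a Win32 API. */
typedef struct { int msg; void *payload; } toy_msg;

#define TOY_CAP 8
#define TOY_MSG_STATUS 1

static toy_msg toy_queue[TOY_CAP];
static size_t toy_head, toy_count;

/* Enqueue a message; returns 1 on success, 0 if the queue is full. */
static int toy_post(int msg, void *payload) {
    if (toy_count == TOY_CAP)
        return 0;
    toy_queue[(toy_head + toy_count++) % TOY_CAP] = (toy_msg){ msg, payload };
    return 1;
}

/* Dequeue the oldest message; returns 1 if *out was filled. */
static int toy_get(toy_msg *out) {
    if (toy_count == 0)
        return 0;
    *out = toy_queue[toy_head];
    toy_head = (toy_head + 1) % TOY_CAP;
    toy_count--;
    return 1;
}

/* Sender side: post a heap copy so the payload outlives the caller's
   buffer. On success, ownership passes to whoever handles the message. */
static int toy_worker_report(const char *text) {
    size_t n;
    char *copy;
    assert(text != NULL);
    n = strlen(text) + 1;
    copy = malloc(n);
    if (!copy)
        return 0;
    memcpy(copy, text, n);
    if (!toy_post(TOY_MSG_STATUS, copy)) {
        free(copy); /* posting failed, so the sender still owns the copy */
        return 0;
    }
    return 1;
}

/* Receiver side: use the payload, then free it. */
static char toy_last_status[64];

static void toy_gui_handle(const toy_msg *m) {
    if (m->msg == TOY_MSG_STATUS) {
        strncpy(toy_last_status, m->payload, sizeof toy_last_status - 1);
        toy_last_status[sizeof toy_last_status - 1] = '\0';
        free(m->payload);
    }
}
```

The design choice here is that the sender never touches the copy again after a successful post, so the receiver can free it whenever the message is finally handled. The static LVITEM in the snippets above solves the same problem differently, by making the storage live for the whole program.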

Smooth animations using Win32 API - without controlling the message pump

I'm currently trying to integrate some animation drawing code of mine into a third party application, under the form of an external plugin.
This animation code is real-time 3D, based on OpenGL, and is supposed to render as fast as it can, usually at 60 frames per second.
In my base application, where I'm the king of the world, I control the application message pump, so that drawing occurs whenever possible. Like that :
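for (;;) {
    if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
        do {
            if (msg.message == WM_QUIT) break;
            TranslateMessage(&msg); DispatchMessage(&msg);
        } while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE));
    }
    // no more pending messages: draw the next frame here
}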
Now that I'm no more king in the world, I have to play nice with the application messages, so that it keeps being responsive. To my knowledge, as I'm a plugin, I can't hijack the whole application message pump ; so I tried various things, doing my drawing in WM_PAINT message handler :
Use WM_TIMER, which doesn't work: I don't know in advance which time step I need (it is often not fixed), and the timing is not accurate.
Call InvalidateRect as soon as I'm done drawing, which doesn't work either: it completely prevents the rest of the application from being responsive and doing its own refreshing.
Create a 'worker' thread, whose only job is to post a user message to the plugin window. This message is posted as soon as the drawing is finished (signaled by an event). The user message handler, in turn, calls InvalidateRect (see the code below).
So far, my last attempt is the best, and it sometimes works fine.
DWORD WINAPI PaintCommandThreadProc(LPVOID lpParameter)
{
    Plugin* plugin = static_cast<Plugin*>(lpParameter);
    HANDLE updateEvent = plugin->updateEvent();
    while (updateEvent == plugin->updateEvent())
    {
        ::WaitForSingleObject(updateEvent, 100);
        if (updateEvent == plugin->updateEvent())
            ::PostMessage(plugin->hwnd(), WM_USER+0x10, 0, 0);
    }
    return 0;
}
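// Subclassed window procedure (the function name is illustrative).
LRESULT CALLBACK PluginWndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    bool processDefault = true;
    LRESULT result = 0;
    // GetWindowLongPtr keeps the stored pointer intact on 64-bit builds.
    Plugin* plugin = reinterpret_cast<Plugin*>( GetWindowLongPtr(hWnd, GWLP_USERDATA) );
    switch (msg) {
    case WM_USER+0x10: // posted by PaintCommandThreadProc
        ::InvalidateRect( hWnd, NULL, FALSE );
        processDefault = false;
        result = TRUE;
        break;
    case WM_PAINT:
        // the drawing happens here, then the worker thread is signaled
        ::SetEvent( plugin->updateEvent() );
        processDefault = false;
        result = TRUE;
        break;
    }
    if (processDefault && plugin && plugin->m_wndOldProc)
        result = ::CallWindowProc(plugin->m_wndOldProc, hWnd, msg, wParam, lParam);
    return result;
}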
On some occasions, the host application still seems to miss messages. The main characteristics of the problem are that I have to press the 'Alt' key for modal dialogs to show up ; and I have to move the mouse to give some processing time to the host application !...
Is there any 'industry standard' solution for this kind of as-often-as-you-can animation repaint problem ?
Each thread has its own message queue, and messages sent to a window arrive in the queue of the thread that created the window. If you create your plugin window yourself, you can create it in a separate thread, and that way you will have complete control over its message pump.
An alternative solution (which imho is better) is to do only the OpenGL rendering in a separate thread. All OpenGL calls must occur in the thread that created the OpenGL context. However, you can create a window in one thread (your application main thread), but create the OpenGL context in another thread. That way the original application message pump stays intact, and in your rendering thread you can loop forever doing rendering (with calls to SwapBuffers to vsync).
The main problem with that second solution is that communication between the plugin WindowProc and the rendering loop must take into account threading (ie. use locks when accessing shared memory). However since the message pump is separate from rendering, it can be simultaneous, and your message handling is as responsive as it can get.
