Setting hardware breakpoint using winapi for current process - c

I want to set a breakpoint on my variable, when i do the read access, i have this code:
char * lpBuffer = NULL;
void proc(PVOID)
{
for (int i = 0; i < 10; i++) {
Sleep(100);
MessageBoxA(0, 0, 0, 0);
char * h = lpBuffer;
h[0x4] = 0x88;
}
}
int main()
{
lpBuffer = malloc(20);
SetDebugPrivilege(TRUE);
HANDLE hThread = CreateThread(0, 0, (LPTHREAD_START_ROUTINE)proc, 0, CREATE_SUSPENDED, 0);
CONTEXT ctx = {};
BOOL st = GetThreadContext(hThread, &ctx);
ctx.ContextFlags = CONTEXT_DEBUG_REGISTERS;
ctx.Dr0 = (DWORD)&lpBuffer;
ctx.Dr7 = 0x30002;
st = SetThreadContext(hThread, &ctx);
ResumeThread(hThread);
DEBUG_EVENT dbgEvent;
int status = WaitForDebugEvent(&dbgEvent, INFINITE);
//..
}
I always get 0 from int status = WaitForDebugEvent(&dbgEvent, INFINITE); it doesnt perfrorm any waiting just returns 0 immediatly, i placed MessageBoxA to test, basically i don't get any notifications and waiting that the read has performed on lpBuffer. I think i have done something wrong, maybe it has to do with dr7 flags? 0x30002 is ‭00110000000000000010‬, so it should be hardware read/write bp.

about debugging. for debug need create special DebugObject and associate process (which you want to debug) with this debug object. after this, when some events occur in process associated with debug object, system suspend all threads in process, insert debug event to debug object and set it to signal state. in debugger, thread which wait on debug object, say via WaitForDebugEvent (but possible direct pass handle of debug object to say MsgWaitForMultipleObjectsEx) awakened and handle this debug event. but if thread will be from debugged process - it will be suspended, as all threads in process and never handle this event. process hang in this case. so - process can not debug itself because system suspend all threads in process when debug event (exception for example) occur.
about WaitForDebugEvent doesnt perfrorm any waiting. WaitForDebugEvent internally call ZwWaitForDebugEvent and from where is HANDLE hDebugObject is geted ? from thread TEB (here exist special field for save it). when you call CreateProcess with flag DEBUG_PROCESS or DebugActiveProcess system internally call DbgUiConnectToDbg - this api check - are thread already have accosiated debug object (in TEB) and if yet no - create new with ZwCreateDebugObject and store it in TEB (it can be accessed via DbgUiGetThreadDebugObject and changed via DbgUiSetThreadDebugObject). only after this you can call WaitForDebugEvent. because you not direct create debug object via ZwCreateDebugObject, and not call CreateProcess with flag DEBUG_PROCESS or DebugActiveProcess - no debug objects assosiated with your thread, and call WaitForDebugEvent of course fail
only one option catch hardware breakpoint in own process - use VEX. for example:
LONG NTAPI OnVex(PEXCEPTION_POINTERS ExceptionInfo)
{
WCHAR buf[1024], *sz = buf, wz[64];
PEXCEPTION_RECORD ExceptionRecord = ExceptionInfo->ExceptionRecord;
swprintf(wz, L"%x> %x at %p", GetCurrentThreadId(), ExceptionRecord->ExceptionCode, ExceptionRecord->ExceptionAddress);
*buf = 0;
if (ULONG NumberParameters = ExceptionRecord->NumberParameters)
{
sz += swprintf(sz, L"[ ");
PULONG_PTR ExceptionInformation = ExceptionRecord->ExceptionInformation;
do
{
sz += swprintf(sz, L"%p, ", *ExceptionInformation++);
} while (--NumberParameters);
sz += swprintf(sz - 2, L" ]");
}
MessageBoxW(0, buf, wz, 0);
return ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP ? EXCEPTION_CONTINUE_EXECUTION : EXCEPTION_CONTINUE_SEARCH;
}
DWORD HardwareTest(PVOID pv)
{
WCHAR sz[64];
swprintf(sz, L"hThread = %p\n", *(void**)pv);
return MessageBoxW(0, 0, sz, MB_ICONINFORMATION);
}
void ep()
{
if (PVOID handler = AddVectoredExceptionHandler(TRUE, OnVex))
{
if (HANDLE hThread = CreateThread(0, 0, HardwareTest, &hThread, CREATE_SUSPENDED, 0))
{
CONTEXT ctx = {};
ctx.ContextFlags = CONTEXT_DEBUG_REGISTERS;
ctx.Dr3 = (ULONG_PTR)&hThread;
ctx.Dr7 = 0xF0000040;
SetThreadContext(hThread, &ctx);
ResumeThread(hThread);
WaitForSingleObject(hThread, INFINITE);
CloseHandle(hThread);
}
RemoveVectoredExceptionHandler(handler);
}
}

Related

Child process (via CreateProcess) stalls on getch() with redirected stdout and stdin

I am trying to launch a process with CreateProcess() with stdin and stdout redirected to pipes. When the child process consists of just printf() statements, I see them piped up to the parent and displayed just fine. If my child process does a printf() and a _getch() statement, then things fail. I have considered a possible deadlock between the pipes in several ways to no avail:
changing the order of things,
applying PeekNamedPipe() and Sleep() statements, and
overlapped I/O using a named pipe.
I'm left suspecting a subtle configuration issue somewhere. This is part of an issue in a larger program but I've reduced it to this simple test case. I started with the Microsoft example for "Creating a Child Process with Redirected Input and Output". That worked, so maybe the child process using ReadFile() works, but my problem is _getch() (among other programs that seem to have related failures). I replaced the child process with my test program and it stalls. I try solving deadlocks as above, with the overlapped I/O achieved following this example for using named pipes to this purpose (in my reading someone mentioned that the Windows implementation of named and anonymous pipes are reasonably unified).
Again, works if the child issues only printfs but fails with _getch(). Of note is that if a _getch() is present in the child program then even the printfs don't show up - even printfs() issued before the _getch(). I've read that pipes have buffering and as above they have potential deadlocks waiting on the other end of the pipe, but I can't think what else I can do to avoid that besides what's done below.
Just in case, I also made sure I had a large heap buffer for the command-line buffer since CreateProcess() is known to modify it.
Here is my parent test code, with those first booleans configuring overlapped/not overlapped behavior:
#include <string>
#include <Windows.h>
#include <tchar.h>
#include <stdio.h>
#include <strsafe.h>
#include <conio.h>
#include <assert.h>
TCHAR szCmdline[] = TEXT("child.exe");
bool OverlappedStdOutRd = true;
bool OverlappedStdInWr = true;
#define BUFSIZE 4096
HANDLE g_hChildStd_IN_Rd = NULL;
HANDLE g_hChildStd_IN_Wr = NULL;
HANDLE g_hChildStd_OUT_Rd = NULL;
HANDLE g_hChildStd_OUT_Wr = NULL;
using namespace std;
void ErrorExit(PTSTR lpszFunction)
// Format a readable error message, display a message box,
// and exit from the application.
{
LPVOID lpMsgBuf;
LPVOID lpDisplayBuf;
DWORD dw = GetLastError();
FormatMessage(
FORMAT_MESSAGE_ALLOCATE_BUFFER |
FORMAT_MESSAGE_FROM_SYSTEM |
FORMAT_MESSAGE_IGNORE_INSERTS,
NULL,
dw,
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
(LPTSTR)&lpMsgBuf,
0, NULL);
lpDisplayBuf = (LPVOID)LocalAlloc(LMEM_ZEROINIT,
(lstrlen((LPCTSTR)lpMsgBuf) + lstrlen((LPCTSTR)lpszFunction) + 40) * sizeof(TCHAR));
StringCchPrintf((LPTSTR)lpDisplayBuf,
LocalSize(lpDisplayBuf) / sizeof(TCHAR),
TEXT("%s failed with error %d: %s"),
lpszFunction, dw, lpMsgBuf);
MessageBox(NULL, (LPCTSTR)lpDisplayBuf, TEXT("Error"), MB_OK);
LocalFree(lpMsgBuf);
LocalFree(lpDisplayBuf);
ExitProcess(1);
}
static ULONG PipeSerialNumber = 1;
static BOOL APIENTRY MyCreatePipeEx(
OUT LPHANDLE lpReadPipe,
OUT LPHANDLE lpWritePipe,
IN LPSECURITY_ATTRIBUTES lpPipeAttributes,
IN DWORD nSize,
DWORD dwReadMode,
DWORD dwWriteMode
)
/*++
Routine Description:
The CreatePipeEx API is used to create an anonymous pipe I/O device.
Unlike CreatePipe FILE_FLAG_OVERLAPPED may be specified for one or
both handles.
Two handles to the device are created. One handle is opened for
reading and the other is opened for writing. These handles may be
used in subsequent calls to ReadFile and WriteFile to transmit data
through the pipe.
Arguments:
lpReadPipe - Returns a handle to the read side of the pipe. Data
may be read from the pipe by specifying this handle value in a
subsequent call to ReadFile.
lpWritePipe - Returns a handle to the write side of the pipe. Data
may be written to the pipe by specifying this handle value in a
subsequent call to WriteFile.
lpPipeAttributes - An optional parameter that may be used to specify
the attributes of the new pipe. If the parameter is not
specified, then the pipe is created without a security
descriptor, and the resulting handles are not inherited on
process creation. Otherwise, the optional security attributes
are used on the pipe, and the inherit handles flag effects both
pipe handles.
nSize - Supplies the requested buffer size for the pipe. This is
only a suggestion and is used by the operating system to
calculate an appropriate buffering mechanism. A value of zero
indicates that the system is to choose the default buffering
scheme.
Return Value:
TRUE - The operation was successful.
FALSE/NULL - The operation failed. Extended error status is available
using GetLastError.
--*/
{
HANDLE ReadPipeHandle, WritePipeHandle;
DWORD dwError;
CHAR PipeNameBuffer[MAX_PATH];
//
// Only one valid OpenMode flag - FILE_FLAG_OVERLAPPED
//
if ((dwReadMode | dwWriteMode) & (~FILE_FLAG_OVERLAPPED)) {
SetLastError(ERROR_INVALID_PARAMETER);
return FALSE;
}
//
// Set the default timeout to 120 seconds
//
if (nSize == 0) {
nSize = 4096;
}
sprintf_s(PipeNameBuffer,
"\\\\.\\Pipe\\TruthPipe.%08x.%08x",
GetCurrentProcessId(),
PipeSerialNumber++ // TODO: Should use InterlockedIncrement() here to be thread-safe.
);
ReadPipeHandle = CreateNamedPipeA(
PipeNameBuffer,
PIPE_ACCESS_INBOUND | dwReadMode,
PIPE_TYPE_BYTE | PIPE_WAIT,
1, // Number of pipes
nSize, // Out buffer size
nSize, // In buffer size
1000, // Timeout in ms
lpPipeAttributes
);
if (!ReadPipeHandle) {
return FALSE;
}
WritePipeHandle = CreateFileA(
PipeNameBuffer,
GENERIC_WRITE,
0, // No sharing
lpPipeAttributes,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL | dwWriteMode,
NULL // Template file
);
if (INVALID_HANDLE_VALUE == WritePipeHandle) {
dwError = GetLastError();
CloseHandle(ReadPipeHandle);
SetLastError(dwError);
return FALSE;
}
*lpReadPipe = ReadPipeHandle;
*lpWritePipe = WritePipeHandle;
return(TRUE);
}
bool OutstandingWrite = false;
OVERLAPPED WriteOverlapped;
CHAR chWriteBuf[BUFSIZE];
DWORD dwBytesWritten;
DWORD dwBytesToWrite;
bool OutstandingRead = false;
OVERLAPPED ReadOverlapped;
CHAR chReadBuf[BUFSIZE];
DWORD dwBytesRead;
void OnReadComplete();
void StartOverlappedRead();
void WaitForIO(bool Wait)
{
HANDLE hEvents[2];
int iEvent = 0;
int iReadEvent = -1;
int iWriteEvent = -1;
if (OutstandingRead) {
hEvents[iEvent] = ReadOverlapped.hEvent;
iReadEvent = iEvent;
iEvent++;
}
if (OutstandingWrite) {
hEvents[iEvent] = WriteOverlapped.hEvent;
iWriteEvent = iEvent;
iEvent++;
}
DWORD dwStatus = WaitForMultipleObjects(iEvent, hEvents, FALSE, Wait ? INFINITE : 250 /*ms*/);
int Index = -2;
switch (dwStatus)
{
case WAIT_OBJECT_0: Index = 0; break;
case WAIT_OBJECT_0 + 1: Index = 1; break;
case WAIT_TIMEOUT: return;
default:
ErrorExit(TEXT("WaitForMultipleObjects"));
}
if (Index == iReadEvent)
{
if (!GetOverlappedResult(
g_hChildStd_OUT_Rd, // handle to pipe
&ReadOverlapped, // OVERLAPPED structure
&dwBytesRead, // bytes transferred
FALSE)) // do not wait
ErrorExit(TEXT("GetOverlappedResult"));
OutstandingRead = false;
if (dwBytesRead > 0) OnReadComplete();
StartOverlappedRead();
}
else if (Index == iWriteEvent)
{
if (!GetOverlappedResult(
g_hChildStd_IN_Wr, // handle to pipe
&WriteOverlapped, // OVERLAPPED structure
&dwBytesWritten, // bytes transferred
FALSE)) // do not wait
ErrorExit(TEXT("GetOverlappedResult"));
if (dwBytesWritten != dwBytesToWrite) ErrorExit(TEXT("Write incomplete."));
OutstandingWrite = false;
}
else ErrorExit(TEXT("WaitForMultipleObjects indexing"));
}
void WriteToPipe(string text)
{
BOOL bSuccess = FALSE;
printf("Writing: %s\n", text.c_str());
if (!OverlappedStdInWr)
{
bSuccess = WriteFile(g_hChildStd_IN_Wr, text.c_str(), (DWORD)text.length(), &dwBytesWritten, NULL);
if (!bSuccess) ErrorExit(TEXT("WriteToPipe"));
return;
}
else
{
while (OutstandingWrite) WaitForIO(true); // Can only have one outstanding write at a time.
WriteOverlapped.Offset = 0;
WriteOverlapped.OffsetHigh = 0;
WriteOverlapped.Pointer = nullptr;
if (text.length() > BUFSIZE) ErrorExit(TEXT("Attempt to write too long a message!"));
CopyMemory(chWriteBuf, text.c_str(), text.length());
dwBytesToWrite = text.length();
bSuccess = WriteFile(g_hChildStd_IN_Wr, chWriteBuf, dwBytesToWrite, &dwBytesWritten, &WriteOverlapped);
if (bSuccess) return;
if (!bSuccess)
{
if (GetLastError() == ERROR_IO_PENDING) {
OutstandingWrite = true;
return;
}
ErrorExit(TEXT("WriteToPipe"));
}
}
}
void OnReadComplete()
{
chReadBuf[dwBytesRead] = '\0';
printf("Rx: ");
for (DWORD ii = 0; ii < dwBytesRead; ii++)
{
if (chReadBuf[ii] >= 0x20 && chReadBuf[ii] <= 0x7e) printf("%c", chReadBuf[ii]);
else
{
printf("\\0x%02X", chReadBuf[ii]);
}
if (chReadBuf[ii] == '\n') printf("\n");
}
printf("\n");
}
void StartOverlappedRead()
{
int loops = 0;
for (;; loops++)
{
if (loops > 10) ErrorExit(TEXT("Read stuck in loop"));
assert(!OutstandingRead);
ReadOverlapped.Offset = 0;
ReadOverlapped.OffsetHigh = 0;
ReadOverlapped.Pointer = nullptr;
BOOL Success = ReadFile(g_hChildStd_OUT_Rd, chReadBuf, BUFSIZE - 1, &dwBytesRead, &ReadOverlapped);
if (!Success && GetLastError() != ERROR_IO_PENDING)
ErrorExit(TEXT("ReadFile"));
if (Success)
{
if (dwBytesRead > 0)
OnReadComplete();
continue;
}
else {
OutstandingRead = true; return;
}
}
}
void ReadFromPipe(void)
// Read output from the child process's pipe for STDOUT
// and write to the parent process's pipe for STDOUT.
// Stop when there is no more data.
{
BOOL bSuccess = FALSE;
if (!OverlappedStdOutRd)
{
for (;;)
{
DWORD total_available_bytes;
if (FALSE == PeekNamedPipe(g_hChildStd_OUT_Rd,
0,
0,
0,
&total_available_bytes,
0))
{
ErrorExit(TEXT("ReadFromPipe - peek"));
return;
}
else if (total_available_bytes == 0)
{
// printf("No new pipe data to read at this time.\n");
return;
}
bSuccess = ReadFile(g_hChildStd_OUT_Rd, chReadBuf, BUFSIZE - 1, &dwBytesRead, NULL);
if (!bSuccess) ErrorExit(TEXT("ReadFromPipe"));
if (dwBytesRead == 0) return;
OnReadComplete();
}
}
else
{
if (!OutstandingRead) StartOverlappedRead();
WaitForIO(false);
}
}
void Create()
{
SECURITY_ATTRIBUTES saAttr;
printf("\n->Start of parent execution.\n");
// Set the bInheritHandle flag so pipe handles are inherited.
saAttr.nLength = sizeof(SECURITY_ATTRIBUTES);
saAttr.bInheritHandle = TRUE;
saAttr.lpSecurityDescriptor = NULL;
if (!OverlappedStdOutRd)
{
// As per the MS example, create anonymous pipes
// Create a pipe for the child process's STDOUT.
if (!CreatePipe(&g_hChildStd_OUT_Rd, &g_hChildStd_OUT_Wr, &saAttr, 0))
ErrorExit(TEXT("StdoutRd CreatePipe"));
}
else
{
// Create overlapped I/O pipes (only one side is overlapped).
if (!MyCreatePipeEx(&g_hChildStd_OUT_Rd, &g_hChildStd_OUT_Wr, &saAttr, 0, FILE_FLAG_OVERLAPPED, 0))
ErrorExit(TEXT("Stdout MyCreatePipeEx"));
ZeroMemory(&ReadOverlapped, sizeof(ReadOverlapped));
ReadOverlapped.hEvent = CreateEvent(NULL, TRUE, TRUE, NULL); // Manual-reset event, unnamed, initially signalled.
if (ReadOverlapped.hEvent == NULL)
ErrorExit(TEXT("CreateEvent Read"));
}
// Ensure the read handle to the pipe for STDOUT is not inherited.
if (!SetHandleInformation(g_hChildStd_OUT_Rd, HANDLE_FLAG_INHERIT, 0))
ErrorExit(TEXT("Stdout SetHandleInformation"));
if (!OverlappedStdInWr)
{
// Create a pipe for the child process's STDIN.
if (!CreatePipe(&g_hChildStd_IN_Rd, &g_hChildStd_IN_Wr, &saAttr, 0))
ErrorExit(TEXT("Stdin CreatePipe"));
}
else
{
// Create overlapped I/O pipes (only one side is overlapped).
if (!MyCreatePipeEx(&g_hChildStd_IN_Rd, &g_hChildStd_IN_Wr, &saAttr, 0, 0, FILE_FLAG_OVERLAPPED))
ErrorExit(TEXT("Stdin MyCreatePipeEx"));
ZeroMemory(&WriteOverlapped, sizeof(WriteOverlapped));
WriteOverlapped.hEvent = CreateEvent(NULL, TRUE, TRUE, NULL); // Manual-reset event, unnamed, initially signalled.
if (WriteOverlapped.hEvent == NULL)
ErrorExit(TEXT("CreateEvent Write"));
}
// Ensure the write handle to the pipe for STDIN is not inherited.
if (!SetHandleInformation(g_hChildStd_IN_Wr, HANDLE_FLAG_INHERIT, 0))
ErrorExit(TEXT("Stdin SetHandleInformation"));
// Create the child process.
TCHAR* szMutableCmdline = new TCHAR[1024];
ZeroMemory(szMutableCmdline, 1024 * sizeof(TCHAR));
CopyMemory(szMutableCmdline, szCmdline, _tcslen(szCmdline) * sizeof(TCHAR));
PROCESS_INFORMATION piProcInfo;
STARTUPINFO siStartInfo;
BOOL bSuccess = FALSE;
// Set up members of the PROCESS_INFORMATION structure.
ZeroMemory(&piProcInfo, sizeof(PROCESS_INFORMATION));
// Set up members of the STARTUPINFO structure.
// This structure specifies the STDIN and STDOUT handles for redirection.
ZeroMemory(&siStartInfo, sizeof(STARTUPINFO));
siStartInfo.cb = sizeof(STARTUPINFO);
siStartInfo.hStdError = g_hChildStd_OUT_Wr;
siStartInfo.hStdOutput = g_hChildStd_OUT_Wr;
siStartInfo.hStdInput = g_hChildStd_IN_Rd;
siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
// Create the child process.
bSuccess = CreateProcess(NULL,
szMutableCmdline, // command line
NULL, // process security attributes
NULL, // primary thread security attributes
TRUE, // handles are inherited
0, // creation flags
NULL, // use parent's environment
NULL, // use parent's current directory
&siStartInfo, // STARTUPINFO pointer
&piProcInfo); // receives PROCESS_INFORMATION
// If an error occurs, exit the application.
if (!bSuccess)
ErrorExit(TEXT("CreateProcess"));
else
{
// Close handles to the child process and its primary thread.
// Some applications might keep these handles to monitor the status
// of the child process, for example.
CloseHandle(piProcInfo.hProcess);
CloseHandle(piProcInfo.hThread);
}
}
int main()
{
printf("Launching...\n");
Create();
Sleep(500);
ReadFromPipe();
Sleep(250);
WriteToPipe("A\r\n");
Sleep(250);
ReadFromPipe();
WriteToPipe("\r\n");
Sleep(250);
ReadFromPipe();
WriteToPipe("X\r\n");
Sleep(250);
ReadFromPipe();
Sleep(250);
ReadFromPipe();
printf("Press any key to exit.\n");
_getch();
// TODO: Not doing proper cleanup in this test app. Overlapped I/O, CloseHandles, etc. are outstanding. Bad.
return 0;
}
And child code can be as simple as:
#include <conio.h>
int main()
{
printf("Hello!\n");
_getch();
printf("Bye!\n");
return 0;
}
Edit: As #Rbmm points out, _getch() uses ReadConsoleInput(). I assume it uses CONIN$ as opposed to STDIN. So the question becomes: can I redirect CONIN$ or have the parent process write to it?
In child process after printf you can add fflush(stdout);. This will immediately transfer data from stdout buffer to pipe. In some configurations stdout buffer data is automatically flushed on end of line character \n, but I'm not sure if it is in this case - probably not.
If your child should read data from pipe (not from console) use getchar, fgets, fread, fscanf giving them stdin as stream argument.
int main()
{
printf("Hello!\n");
fflush(stdout);
getchar();
printf("Bye!\n");
fflush(stdout);
return 0;
}
And you don't have dead lock. Your child just waits for char from console. Press Enter key to revive it.

Asynchronous Majordomo Pattern example using the CZMQ-4.1.0 new zsock API updated not working

After installing zmq and czmq with brew, I tried to compile and play the Asynchronous-Majordomo-Pattern but it did not work as it requires czmq v3. As far as I understood, I tried to update it to the v4, using zactor because
zthread is deprecated in favor of zactor http://czmq.zeromq.org/czmq3-0:zthread
So right now the following code looks fine to me as updated async-majordomo pattern, but it does not work as expected, It does not create any thread when I run it via my terminal.
// Round-trip demonstrator
// While this example runs in a single process, that is just to make
// it easier to start and stop the example. The client task signals to
// main when it's ready.
#include "czmq.h"
#include <stdlib.h>
void dbg_write_in_file(char * txt, int nb_request) {
FILE * pFile;
pFile = fopen ("myfile.txt","a");
if (pFile!=NULL)
{
fputs (txt, pFile);
char str_nb_request[12];
sprintf(str_nb_request, "%d", nb_request);
fputs (str_nb_request, pFile);
fputs ("\n", pFile);
fclose (pFile);
}
}
static void
client_task (zsock_t *pipe, void *args)
{
zsock_t *client = zsock_new (ZMQ_DEALER);
zsock_connect (client, "tcp://localhost:5555");
printf ("Setting up test...\n");
zclock_sleep (100);
printf("child 1: parent: %i\n\n", getppid());
printf("child 1: my pid: %i\n\n", getpid());
int requests;
int64_t start;
printf ("Synchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 10000; requests++) {
zstr_send (client, "hello");
// stuck here /!\
char *reply = zstr_recv (client);
zstr_free (&reply);
// check if it does something
dbg_write_in_file("sync round-trip requests : ", requests);
// end check
}
printf (" %d calls/second\n",
(1000 * 10000) / (int) (zclock_time () - start));
printf ("Asynchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 100000; requests++) {
zstr_send (client, "hello");
// check if it does something
dbg_write_in_file("async round-trip send requests : ", requests);
// end check
}
for (requests = 0; requests < 100000; requests++) {
char *reply = zstr_recv (client);
zstr_free (&reply);
// check if it does something
dbg_write_in_file("async round-trip rec requests : ", requests);
// end check
}
printf (" %d calls/second\n",
(1000 * 100000) / (int) (zclock_time () - start));
zstr_send (pipe, "done");
}
// Here is the worker task. All it does is receive a message, and
// bounce it back the way it came:
static void
worker_task (zsock_t *pipe, void *args)
{
printf("child 2: parent: %i\n\n", getppid());
printf("child 2: my pid: %i\n\n", getpid());
zsock_t *worker = zsock_new (ZMQ_DEALER);
zsock_connect (worker, "tcp://localhost:5556");
while (true) {
zmsg_t *msg = zmsg_recv (worker);
zmsg_send (&msg, worker);
}
zsock_destroy (&worker);
}
// Here is the broker task. It uses the zmq_proxy function to switch
// messages between frontend and backend:
static void
broker_task (zsock_t *pipe, void *args)
{
printf("child 3: parent: %i\n\n", getppid());
printf("child 3: my pid: %i\n\n", getpid());
// Prepare our sockets
zsock_t *frontend = zsock_new (ZMQ_DEALER);
zsock_bind (frontend, "tcp://localhost:5555");
zsock_t *backend = zsock_new (ZMQ_DEALER);
zsock_bind (backend, "tcp://localhost:5556");
zmq_proxy (frontend, backend, NULL);
zsock_destroy (&frontend);
zsock_destroy (&backend);
}
// Finally, here's the main task, which starts the client, worker, and
// broker, and then runs until the client signals it to stop:
int main (void)
{
// Create threads
zactor_t *client = zactor_new (client_task, NULL);
assert (client);
zactor_t *worker = zactor_new (worker_task, NULL);
assert (worker);
zactor_t *broker = zactor_new (broker_task, NULL);
assert (broker);
// Wait for signal on client pipe
char *signal = zstr_recv (client);
zstr_free (&signal);
zactor_destroy (&client);
zactor_destroy (&worker);
zactor_destroy (&broker);
return 0;
}
When I run it, it looks like the program is stuck at the comment
// stuck here /!\
Then when I kill it as it does not finish, or print anything at all, I need to press five time Ctrl+C ( ^C ). Only then, it looks more verbose on the console, like it was indeed running. => Note that I delete all my printf() steps' outputs, as it was really messy to read.
When it runs, it does not write anything to the file, called by the dbg_write_in_file() function, only after sending five Ctrl+C ( ^C ).
Both client worker and broker task return the same getppid number ( my terminal ) and getpid as the program itself.
I use gcc trippingv4.c -o trippingv4 -L/usr/local/lib -lzmq -lczmq to compile.
When I try to kill it :
./trippingv4
Setting up test...
child 1: parent: 60967
child 1: my pid: 76853
Synchronous round-trip test...
^Cchild 2: parent: 60967
child 2: my pid: 76853
^Cchild 3: parent: 60967
child 3: my pid: 76853
^C^C^CE: 18-02-28 00:16:37 [76853]dangling 'PAIR' socket created at src/zsys.c:471
E: 18-02-28 00:16:37 [76853]dangling 'DEALER' socket created at trippingv4.c:29
E: 18-02-28 00:16:37 [76853]dangling 'PAIR' socket created at src/zsys.c:471
E: 18-02-28 00:16:37 [76853]dangling 'DEALER' socket created at trippingv4.c:89
Update
Thanks for the detailed answer #user3666197. In first part, the compiler does not compile the assert call so I just show the value instead and compare visually, they are the same.
int czmqMAJOR,
czmqMINOR,
czmqPATCH;
zsys_version ( &czmqMAJOR, &czmqMINOR, &czmqPATCH );
printf( "INF: detected CZMQ ( %d, %d, %d ) -version\n",
czmqMAJOR,
czmqMINOR,
czmqPATCH
);
printf( "INF: CZMQ_VERSION_MAJOR %d, CZMQ_VERSION_MINOR %d, CZMQ_VERSION_PATCH %d\n",
CZMQ_VERSION_MAJOR,
CZMQ_VERSION_MINOR,
CZMQ_VERSION_PATCH
);
Output :
INF: detected CZMQ ( 4, 1, 0 ) -version
INF: CZMQ_VERSION_MAJOR 4, CZMQ_VERSION_MINOR 1, CZMQ_VERSION_PATCH 0
The zsys_info call does compile but does not show anything on the terminal, even with a fflush(stdout) just in case so I just used printf :
INF: This system's Context() limit is 65535 ZeroMQ socketsINF: current state of the global Context()-instance has:
( 1 )-IO-threads ready
( 1 )-ZMQ_BLOCKY state
Then I changed the global context thread value with zsys_set_io_threads(2) and/or zmq_ctx_set (aGlobalCONTEXT, ZMQ_BLOCKY, false);, still blocked. It looks like zactor does not works with systems threads as zthread was... or does not gives a similar behavior. Given my experience in zeromq (also zero) probably I trying something that can't be achieved.
Update solved but unproper
My main error was to not have properly initiate zactor instance
An actor function MUST call zsock_signal (pipe) when initialized and MUST listen to pipe and exit on $TERM command.
And to not have blocked the zactor's proxy execution before it called zactor_destroy (&proxy);
I let the final code below but you still need to exit at the end with Ctrl+C because I did not figure it out how to manage $TERM signal properly. Also, zactor still appears to not use system theads. It's probably design like this but I don't know how it's work behind the wood.
// Round-trip demonstrator
// While this example runs in a single process, that is just to make
// it easier to start and stop the example. The client task signals to
// main when it's ready.
#include <czmq.h>
static void
client_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Client"));
zsock_signal (pipe, 0);
zsock_t *client = zsock_new (ZMQ_DEALER);
zsock_connect (client, "tcp://127.0.0.1:5555");
printf ("Setting up test...\n");
zclock_sleep (100);
int requests;
int64_t start;
printf ("Synchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 10000; requests++) {
zstr_send (client, "hello");
zmsg_t *msgh = zmsg_recv (client);
zmsg_destroy (&msgh);
}
printf (" %d calls/second\n",
(1000 * 10000) / (int) (zclock_time () - start));
printf ("Asynchronous round-trip test...\n");
start = zclock_time ();
for (requests = 0; requests < 100000; requests++) {
zstr_send (client, "hello");
}
for (requests = 0; requests < 100000; requests++) {
char *reply = zstr_recv (client);
zstr_free (&reply);
}
printf (" %d calls/second\n",
(1000 * 100000) / (int) (zclock_time () - start));
zstr_send (pipe, "done");
printf("send 'done' to pipe\n");
}
// Here is the worker task. All it does is receive a message, and
// bounce it back the way it came:
static void
worker_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Worker"));
zsock_signal (pipe, 0);
zsock_t *worker = zsock_new (ZMQ_DEALER);
zsock_connect (worker, "tcp://127.0.0.1:5556");
bool terminated = false;
while (!terminated) {
zmsg_t *msg = zmsg_recv (worker);
zmsg_send (&msg, worker);
// zstr_send (worker, "hello back"); // Give better perf I don't know why
}
zsock_destroy (&worker);
}
// Here is the broker task. It uses the zmq_proxy function to switch
// messages between frontend and backend:
static void
broker_task (zsock_t *pipe, void *args)
{
assert (streq ((char *) args, "Hello, Task"));
zsock_signal (pipe, 0);
// Prepare our proxy and its sockets
zactor_t *proxy = zactor_new (zproxy, NULL);
zstr_sendx (proxy, "FRONTEND", "DEALER", "tcp://127.0.0.1:5555", NULL);
zsock_wait (proxy);
zstr_sendx (proxy, "BACKEND", "DEALER", "tcp://127.0.0.1:5556", NULL);
zsock_wait (proxy);
bool terminated = false;
while (!terminated) {
zmsg_t *msg = zmsg_recv (pipe);
if (!msg)
break; // Interrupted
char *command = zmsg_popstr (msg);
if (streq (command, "$TERM")) {
terminated = true;
printf("broker received $TERM\n");
}
freen (command);
zmsg_destroy (&msg);
}
zactor_destroy (&proxy);
}
// Finally, here's the main task, which starts the client, worker, and
// broker, and then runs until the client signals it to stop:
int main (void)
{
// Create threads
zactor_t *client = zactor_new (client_task, "Hello, Client");
assert (client);
zactor_t *worker = zactor_new (worker_task, "Hello, Worker");
assert (worker);
zactor_t *broker = zactor_new (broker_task, "Hello, Task");
assert (broker);
char *signal = zstr_recv (client);
printf("signal %s\n", signal);
zstr_free (&signal);
zactor_destroy (&client);
printf("client done\n");
zactor_destroy (&worker);
printf("worker done\n");
zactor_destroy (&broker);
printf("broker done\n");
return 0;
}
Let's diagnose the as-is state, going step by step:
int czmqMAJOR,
czmqMINOR,
czmqPATCH;
zsys_version ( &czmqMAJOR, &czmqMINOR, &czmqPATCH );
printf( "INF: detected CZMQ( %d, %d, %d )-version",
czmqMAJOR,
czmqMINOR,
czmqPATCH
);
assert ( czmqMAJOR == CZMQ_VERSION_MAJOR & "Major: does not match\n" );
assert ( czmqMINOR == CZMQ_VERSION_MINOR & "Minor: does not match\n" );
assert ( czmqPATCH == CZMQ_VERSION_PATCH & "Patch: does not match\n" );
if this matches your expectations, you may hope the DLL-versions are both matching and found in proper locations.
Next:
may test the whole circus run in a non-blocking mode, to prove, there is no other blocker, but as briefly inspected, I have not found such option exposed in CZMQ-API, the native API allows one to flag a NOBLOCK option on { _send() | _recv() }-operations, which prevents them from remaining blocked ( which may be the case for DEALER socket instance in cases on _send()-s, when there are not yet any counterparty with a POSACK-ed .bind()/.connect() state ).
Here I did not find some tools to do this as fast as expected in native API. Maybe you will have more luck on going through this.
Test the presence of a global Context() instance, if it is ready:
add before a first socket instantiation, to be sure we are before any and all socket-generation and their respective _bind()/_connect() operation a following self-reporting row, using:
zsys_info ( "INF: This system's Context() limit is %zu ZeroMQ sockets",
zsys_socket_limit ()
);
One may also enforce the Context() instantiation manually:
so as to be sure the global Context() instance is up and running, before any higher abstracted instances ask if for implementing additional internalities ( sockets, counters, handlers, port-management, etc. )
// Initialize CZMQ zsys layer; this happens automatically when you create
// a socket or an actor; however this call lets you force initialization
// earlier, so e.g. logging is properly set-up before you start working.
// Not threadsafe, so call only from main thread. Safe to call multiple
// times. Returns global CZMQ context.
CZMQ_EXPORT void *
zsys_init (void);
// Optionally shut down the CZMQ zsys layer; this normally happens automatically
// when the process exits; however this call lets you force a shutdown
// earlier, avoiding any potential problems with atexit() ordering, especially
// with Windows dlls.
CZMQ_EXPORT void
zsys_shutdown (void);
and possibly better tune IO-performance, using this right at the initialisation state:
// Configure the number of I/O threads that ZeroMQ will use. A good
// rule of thumb is one thread per gigabit of traffic in or out. The
// default is 1, sufficient for most applications. If the environment
// variable ZSYS_IO_THREADS is defined, that provides the default.
// Note that this method is valid only before any socket is created.
CZMQ_EXPORT void
zsys_set_io_threads (size_t io_threads);
This manual instantiation gives one an additional benefit, from having the instance-handle void pointer, so that one can inspect it's current state and shape by zmq_ctx_get() tools:
void *aGlobalCONTEXT = zsys_init();
printf( "INF: current state of the global Context()-instance has:\n" );
printf( " ( %d )-IO-threads ready\n", zmq_ctx_get( aGlobalCONTEXT,
ZMQ_IO_THREADS
)
);
printf( " ( %d )-ZMQ_BLOCKY state\n", zmq_ctx_get( aGlobalCONTEXT,
ZMQ_BLOCKY
)
); // may generate -1 in case DLL is << 4.2+
...
If unhappy with signal-handling, one may design and use another one:
// Set interrupt handler; this saves the default handlers so that a
// zsys_handler_reset () can restore them. If you call this multiple times
// then the last handler will take affect. If handler_fn is NULL, disables
// default SIGINT/SIGTERM handling in CZMQ.
CZMQ_EXPORT void
zsys_handler_set (zsys_handler_fn *handler_fn);
where
// Callback for interrupt signal handler
typedef void (zsys_handler_fn) (int signal_value);

Main UI processing thread messages

I am writing a win32 application which allows an operator to perform testing on 4 devices at a time. So i have 4 threads running simultaneously, this works as intended (though could be more efficient). The problem i have is when the thread tries to send either a Pass or Fail to the main UI to be displayed to the operator, i have debugged the PostMessage call and this returns '1', but nothing is shown in the listbox that should display the result. Here's some code;
Thread function first, this is the same as the other 3 threads
void Thread1(PVOID pvoid)
{
for(int i=0;i<numberOfTests1;i++) {
//DWORD id = GetCurrentThreadId();
int ret;
double TimeOut = 60.0;
int Lng = 1;
test1[i].testNumber = CMD_TOOL_BUZZER;
//test1[i].testNumber = getTestNumber(test1[i].testName);
unsigned char Param[255] = {0};
unsigned char Port1 = port1;
ret = PSB30_Open(Port1, 16);
ret = PSB30_SendOrder (Port1, test1[i].testNumber, &Param[0], &Lng, &TimeOut);
ret = PSB30_Close (Port1);
int result = 0;
if(*Param == 1) {
PostMessage(hWnd,WM_TEST_PASS,i,(LPARAM)"PASS");
test1[i].passed = true;
}
else PostMessage(hWnd, WM_TEST_FAIL, i, (LPARAM)"FAIL") ;
}
_endthread();
}
In main, the message handler;
while (GetMessage(&msg, NULL, 0, 0))
{
if (!TranslateAccelerator(msg.hwnd, hAccelTable, &msg))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
}
return (int) msg.wParam;
And finally user defined message;
case WM_TEST_PASS:
i = wParam;
SendDlgItemMessage(hWnd,IDT_RESULTLIST1,LB_ADDSTRING,i,(LPARAM)"PASS");
MessageBox(hWnd,"test",0,0); //debug
break;
Is there anything else i require to display the results in the listbox, i am at my wits end with this.
Note, i have tried PostThreadMesage and SendMessage and get the same results, which leads me to think the issue lies with the message handler or my user-defined message
Thanks for looking

Glib: how to start a new thread until another thread is finished?

I am using Glib to develop a multi-threading C software.
I would like to have a set of alive threads. Once some thread finishes, another thread starts with a different parameter. It is something like a thread pool.
I am using glib thread to implement the multi-threading. But I cannot find a lot of tutorials from Google. I can now start a set of threads, but have no idea about the waiting. Some code of mine:
GThread *threads[n_threads];
thread_aux_t *data = (thread_aux_t*) calloc(n_threads, sizeof(thread_aux_t));
for (i = 0; i < n_threads; ++i) {
data[i].parameter = i;
threads[i] = g_thread_create((GThreadFunc) pe_lib_thread, data + i,
TRUE, NULL);
}
/* wait for threads to finish */
for (i = 0; i < n_threads; ++i) {
g_thread_join(threads[i]); // How to start a new thread depending on the return value?
}
free(data);
Thanks.
Problem solved. Update:
Just found a thread pool implementation of glib: Thread Pools. I have run it and it works correctly.
The code is written as:
// 'query' is for this new thread,
// 'data' is the global parameters set when initiating the pool
void *pe_lib_thread(gpointer query, gpointer data) {
}
void run_threads() {
GThreadPool *thread_pool = NULL;
// Global parameters by all threads.
thread_aux_t *data = (thread_aux_t*) calloc(1, sizeof(thread_aux_t));
data->shared_hash_table = get_hash_table();
g_thread_init(NULL);
thread_pool = g_thread_pool_new((GFunc) pe_lib_thread, data, n_threads,
TRUE, NULL);
// If n_threads is 10, there are maximum 10 threads running, others are waiting.
for (i = 0; i < n_queries; i++) {
query = &queries[i];
g_thread_pool_push(thread_pool, (gpointer) query, NULL);
}
g_thread_pool_free(thread_pool, 0, 1);
}
g_thread_join returns the return value, so you just check it.
Let's say you want to create a new process if the return value is 17.
for (i = 0; i < n_threads; ++i) {
if (threads[i] && g_thread_join(threads[i]) == 17) {
/* Start a new thread. */
threads[i] = g_thread_create((GThreadFunc) pe_lib_thread, data + i,
TRUE, NULL);
} else {
threads[i] = NULL;
}
}

Is it possible to call a non-exported function that resides in an exe?

I'd like to call a function that resides in a 3rd-party .exe and obtain its result. It seems like there should be a way, as long as I know the function address, calling-convention, etc... but I don't know how.
Does anyone know how I would do this?
I realize that any solution would be a non-standard hack, but there must be a way!
My non-nefarious use-case: I'm reverse engineering a file-format for my software. The calculations in this function are too complex for my tiny brain to figure out; I've been able to pull the assembly-code directly into my own DLL for testing, but of course I can't release that, as that would be stealing. I will be assuming users already have this particular application pre-installed so my software will run.
It is possible but not trivial. And yes, this is a very dirty hack.
In some cases loading the EXE file with LoadLibrary is enough. The returned HMODULE is actually the base address of the loaded EXE. Cast it to a suitable int type, add your relative function address to that, cast it back to a function pointer and call the function through that pointer.
Unfortunately, the EXE file may have its relocation info stripped. It means that the EXE will be expecting to run from a specific address. In this case, you have to change your own program's base address to avoid conflict. Check out your linker's docs, there should be an option to do that. After that, LoadLibrary will load the EXE in its preferred base address and hopefully all should work fine.
There is some very useful info on this here. Make sure to check the update at the end of the page for a different technique that may work better in some cases.
Edit: As Alex correctly stated in the comment below, if the function relies on some initialized value, or it calls such a function, including most C runtime functions, it will be much harder to make it work. One can identify the initialization functions and call them beforehand but using debug API may be your best bet in those situations.
OK, I've put together a prototype.
This program creates another instance of itself as a debugged child process.
An automatic breakpoint will be encountered before main() and CRT initialization code. This is when we can change the memory and registers of the debugged process to make it execute a function of interest. And that's what the program does.
It tries to catch and handle all the bad situations (e.g. unexpected exceptions) and reports them as errors.
One bad situation is actually a good one. It's the #UD exception from the UD2 instruction that the program places into the debugged process. It uses this #UD to stop the process execution after the function of interest has returned.
A few more notes:
This code is 32-bit only. I didn't even try to make it 64-bit compilable or support 64-bit child processes.
This code will likely leak handles. See the Windows Debug API function descriptions on MSDN to find out where they need to be closed.
This code is a proof of concept only and does not support passing and returning data via pointers or registers other than EAX, ECX and EDX. You'll have to extend it as necessary.
This code requires some privileges in order to be able to create and fully debug a process. You may have to worry about this if your program's users aren't admins.
Enjoy.
Code:
// file: unexported.c
//
// compile with Open Watcom C/C++: wcl386 /q /wx /we /s unexported.c
// (Note: "/s" is needed to avoid stack check calls from the "unexported"
// functions, these calls are through a pointer, and it'll be
// uninitialized in our case.)
//
// compile with MinGW gcc 4.6.2: gcc unexported.c -o unexported.exe
#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <limits.h>
#ifndef C_ASSERT
#define C_ASSERT(expr) extern char CAssertExtern[(expr)?1:-1]
#endif
// Compile as a 32-bit app only.
C_ASSERT(sizeof(void*) * CHAR_BIT == 32);
#define EXC_CODE_AND_NAME(X) { X, #X }
const struct
{
DWORD Code;
PCSTR Name;
} ExcCodesAndNames[] =
{
EXC_CODE_AND_NAME(EXCEPTION_ACCESS_VIOLATION),
EXC_CODE_AND_NAME(EXCEPTION_ARRAY_BOUNDS_EXCEEDED),
EXC_CODE_AND_NAME(EXCEPTION_BREAKPOINT),
EXC_CODE_AND_NAME(EXCEPTION_DATATYPE_MISALIGNMENT),
EXC_CODE_AND_NAME(EXCEPTION_FLT_DENORMAL_OPERAND),
EXC_CODE_AND_NAME(EXCEPTION_FLT_DIVIDE_BY_ZERO),
EXC_CODE_AND_NAME(EXCEPTION_FLT_INEXACT_RESULT),
EXC_CODE_AND_NAME(EXCEPTION_FLT_INVALID_OPERATION),
EXC_CODE_AND_NAME(EXCEPTION_FLT_OVERFLOW),
EXC_CODE_AND_NAME(EXCEPTION_FLT_STACK_CHECK),
EXC_CODE_AND_NAME(EXCEPTION_FLT_UNDERFLOW),
EXC_CODE_AND_NAME(EXCEPTION_ILLEGAL_INSTRUCTION),
EXC_CODE_AND_NAME(EXCEPTION_IN_PAGE_ERROR),
EXC_CODE_AND_NAME(EXCEPTION_INT_DIVIDE_BY_ZERO),
EXC_CODE_AND_NAME(EXCEPTION_INT_OVERFLOW),
EXC_CODE_AND_NAME(EXCEPTION_INVALID_DISPOSITION),
EXC_CODE_AND_NAME(EXCEPTION_NONCONTINUABLE_EXCEPTION),
EXC_CODE_AND_NAME(EXCEPTION_PRIV_INSTRUCTION),
EXC_CODE_AND_NAME(EXCEPTION_SINGLE_STEP),
EXC_CODE_AND_NAME(EXCEPTION_STACK_OVERFLOW),
EXC_CODE_AND_NAME(EXCEPTION_GUARD_PAGE),
EXC_CODE_AND_NAME(DBG_CONTROL_C),
{ 0xE06D7363, "C++ EH exception" }
};
PCSTR GetExceptionName(DWORD code)
{
DWORD i;
for (i = 0; i < sizeof(ExcCodesAndNames) / sizeof(ExcCodesAndNames[0]); i++)
{
if (ExcCodesAndNames[i].Code == code)
{
return ExcCodesAndNames[i].Name;
}
}
return "?";
}
typedef enum tCallConv
{
CallConvCdecl, // Params on stack; caller removes params
CallConvStdCall, // Params on stack; callee removes params
CallConvFastCall // Params in ECX, EDX and on stack; callee removes params
} tCallConv;
DWORD Execute32bitFunctionFromExe(PCSTR ExeName,
int FunctionAddressIsRelative,
DWORD FunctionAddress,
tCallConv CallConvention,
DWORD CodeDataStackSize,
ULONG64* ResultEdxEax,
DWORD DwordParamsCount,
.../* DWORD params */)
{
STARTUPINFO startupInfo;
PROCESS_INFORMATION processInfo;
DWORD dwContinueStatus = DBG_CONTINUE; // exception continuation
DEBUG_EVENT dbgEvt;
UCHAR* procMem = NULL;
DWORD breakPointCount = 0;
DWORD err = ERROR_SUCCESS;
DWORD ecxEdxParams[2] = { 0, 0 };
DWORD imageBase = 0;
CONTEXT ctx;
va_list ap;
va_start(ap, DwordParamsCount);
*ResultEdxEax = 0;
memset(&startupInfo, 0, sizeof(startupInfo));
startupInfo.cb = sizeof(startupInfo);
memset(&processInfo, 0, sizeof(processInfo));
if (!CreateProcess(
NULL,
(LPSTR)ExeName,
NULL,
NULL,
FALSE,
DEBUG_ONLY_THIS_PROCESS, // DEBUG_PROCESS,
NULL,
NULL,
&startupInfo,
&processInfo))
{
printf("CreateProcess() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
printf("Process 0x%08X (0x%08X) \"%s\" created,\n"
" Thread 0x%08X (0x%08X) created\n",
processInfo.dwProcessId,
processInfo.hProcess,
ExeName,
processInfo.dwThreadId,
processInfo.hThread);
procMem = VirtualAllocEx(
processInfo.hProcess,
NULL,
CodeDataStackSize,
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE);
if (procMem == NULL)
{
printf("VirtualAllocEx() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
printf("Allocated RWX memory in process 0x%08X (0x%08X) "
"at address 0x%08X\n",
processInfo.dwProcessId,
processInfo.hProcess,
procMem);
while (dwContinueStatus)
{
// Wait for a debugging event to occur. The second parameter indicates
// that the function does not return until a debugging event occurs.
if (!WaitForDebugEvent(&dbgEvt, INFINITE))
{
printf("WaitForDebugEvent() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
// Process the debugging event code.
switch (dbgEvt.dwDebugEventCode)
{
case EXCEPTION_DEBUG_EVENT:
// Process the exception code. When handling
// exceptions, remember to set the continuation
// status parameter (dwContinueStatus). This value
// is used by the ContinueDebugEvent function.
printf("%s (%s) Exception in process 0x%08X, thread 0x%08X\n"
" Exc. Code = 0x%08X (%s), Instr. Address = 0x%08X",
dbgEvt.u.Exception.dwFirstChance ?
"First Chance" : "Last Chance",
dbgEvt.u.Exception.ExceptionRecord.ExceptionFlags ?
"non-continuable" : "continuable",
dbgEvt.dwProcessId,
dbgEvt.dwThreadId,
dbgEvt.u.Exception.ExceptionRecord.ExceptionCode,
GetExceptionName(dbgEvt.u.Exception.ExceptionRecord.ExceptionCode),
dbgEvt.u.Exception.ExceptionRecord.ExceptionAddress);
if (dbgEvt.u.Exception.ExceptionRecord.ExceptionCode ==
EXCEPTION_ACCESS_VIOLATION)
{
ULONG_PTR* info = dbgEvt.u.Exception.ExceptionRecord.ExceptionInformation;
printf(",\n Access Address = 0x%08X, Access = 0x%08X (%s)",
(DWORD)info[1],
(DWORD)info[0],
(info[0] == 0) ?
"read" : ((info[0] == 1) ? "write" : "execute")); // 8 = DEP
}
printf("\n");
// Get the thread context (register state).
// We'll need to either display it (in case of unexpected exceptions) or
// modify it (to execute our code) or read it (to get the results of
// execution).
memset(&ctx, 0, sizeof(ctx));
ctx.ContextFlags = CONTEXT_INTEGER | CONTEXT_CONTROL;
if (!GetThreadContext(processInfo.hThread, &ctx))
{
printf("GetThreadContext() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
#if 0
printf(" EAX=0x%08X EBX=0x%08X ECX=0x%08X EDX=0x%08X EFLAGS=0x%08X\n"
" ESI=0x%08X EDI=0x%08X EBP=0x%08X ESP=0x%08X EIP=0x%08X\n",
ctx.Eax, ctx.Ebx, ctx.Ecx, ctx.Edx, ctx.EFlags,
ctx.Esi, ctx.Edi, ctx.Ebp, ctx.Esp, ctx.Eip);
#endif
if (dbgEvt.u.Exception.ExceptionRecord.ExceptionCode == EXCEPTION_BREAKPOINT &&
breakPointCount == 0)
{
// Update the context so our code can be executed
DWORD mem, i, data;
SIZE_T numberOfBytesCopied;
mem = (DWORD)procMem + CodeDataStackSize;
// Child process memory layout (inside the procMem[] buffer):
//
// higher
// addresses
// .
// . UD2 instruction (causes #UD, indicator of successful
// . completion of FunctionAddress())
// .
// . last on-stack parameter for FunctionAddress()
// . ...
// . first on-stack parameter for FunctionAddress()
// .
// . address of UD2 instruction (as if "call FunctionAddress"
// . executed just before it and is going to return to UD2)
// . (ESP will point here)
// .
// . FunctionAddress()'s stack
// .
// lower
// addresses
mem -= 2;
data = 0x0B0F; // 0x0F, 0x0B = UD2 instruction
if (!WriteProcessMemory(processInfo.hProcess,
(PVOID)mem,
&data,
2,
&numberOfBytesCopied))
{
ErrWriteMem1:
printf("WriteProcessMemory() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
else if (numberOfBytesCopied != 2)
{
ErrWriteMem2:
printf("WriteProcessMemory() failed with error 0x%08X\n",
err = ERROR_BAD_LENGTH);
goto Cleanup;
}
// Copy function parameters.
mem &= 0xFFFFFFFC; // align the address for the stack
for (i = 0; i < DwordParamsCount; i++)
{
if (CallConvention == CallConvFastCall && i < 2)
{
ecxEdxParams[i] = va_arg(ap, DWORD);
}
else
{
data = va_arg(ap, DWORD);
if (!WriteProcessMemory(processInfo.hProcess,
(DWORD*)mem - DwordParamsCount + i,
&data,
sizeof(data),
&numberOfBytesCopied))
{
goto ErrWriteMem1;
}
else if (numberOfBytesCopied != sizeof(data))
{
goto ErrWriteMem2;
}
}
}
// Adjust what will become ESP according to the number of on-stack parameters.
for (i = 0; i < DwordParamsCount; i++)
{
if (CallConvention != CallConvFastCall || i >= 2)
{
mem -= 4;
}
}
// Store the function return address.
mem -= 4;
data = (DWORD)procMem + CodeDataStackSize - 2; // address of UD2
if (!WriteProcessMemory(processInfo.hProcess,
(PVOID)mem,
&data,
sizeof(data),
&numberOfBytesCopied))
{
goto ErrWriteMem1;
}
else if (numberOfBytesCopied != sizeof(data))
{
goto ErrWriteMem2;
}
// Last-minute preparations for execution...
// Set up the registers (ECX, EDX, EFLAGS, EIP, ESP).
if (CallConvention == CallConvFastCall)
{
if (DwordParamsCount >= 1) ctx.Ecx = ecxEdxParams[0];
if (DwordParamsCount >= 2) ctx.Edx = ecxEdxParams[1];
}
ctx.EFlags &= ~(1 << 10); // clear DF for string instructions
ctx.Eip = FunctionAddress + imageBase * !!FunctionAddressIsRelative;
ctx.Esp = mem;
if (!SetThreadContext(processInfo.hThread, &ctx))
{
printf("SetThreadContext() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
printf("Copied code/data to the process\n");
#if 0
for (i = esp; i < (DWORD)procMem + CodeDataStackSize; i++)
{
data = 0;
ReadProcessMemory(processInfo.hProcess,
(void*)i,
&data,
1,
&numberOfBytesCopied);
printf("E[SI]P = 0x%08X: 0x%02X\n", i, data);
}
#endif
breakPointCount++;
dwContinueStatus = DBG_CONTINUE; // continue execution of our code
}
else if (dbgEvt.u.Exception.ExceptionRecord.ExceptionCode == EXCEPTION_ILLEGAL_INSTRUCTION &&
breakPointCount == 1 &&
ctx.Eip == (DWORD)procMem + CodeDataStackSize - 2/*UD2 size*/)
{
// The code has finished execution as expected.
// Collect the results.
*ResultEdxEax = ((ULONG64)ctx.Edx << 32) | ctx.Eax;
printf("Copied code/data from the process\n");
dwContinueStatus = 0; // stop debugging
}
else
{
// Unexpected event. Do not continue execution.
printf(" EAX=0x%08X EBX=0x%08X ECX=0x%08X EDX=0x%08X EFLAGS=0x%08X\n"
" ESI=0x%08X EDI=0x%08X EBP=0x%08X ESP=0x%08X EIP=0x%08X\n",
ctx.Eax, ctx.Ebx, ctx.Ecx, ctx.Edx, ctx.EFlags,
ctx.Esi, ctx.Edi, ctx.Ebp, ctx.Esp, ctx.Eip);
err = dbgEvt.u.Exception.ExceptionRecord.ExceptionCode;
goto Cleanup;
}
break; // case EXCEPTION_DEBUG_EVENT:
case CREATE_PROCESS_DEBUG_EVENT:
// As needed, examine or change the registers of the
// process's initial thread with the GetThreadContext and
// SetThreadContext functions; read from and write to the
// process's virtual memory with the ReadProcessMemory and
// WriteProcessMemory functions; and suspend and resume
// thread execution with the SuspendThread and ResumeThread
// functions. Be sure to close the handle to the process image
// file with CloseHandle.
printf("Process 0x%08X (0x%08X) "
"created, base = 0x%08X,\n"
" Thread 0x%08X (0x%08X) created, start = 0x%08X\n",
dbgEvt.dwProcessId,
dbgEvt.u.CreateProcessInfo.hProcess,
dbgEvt.u.CreateProcessInfo.lpBaseOfImage,
dbgEvt.dwThreadId,
dbgEvt.u.CreateProcessInfo.hThread,
dbgEvt.u.CreateProcessInfo.lpStartAddress);
// Found image base!
imageBase = (DWORD)dbgEvt.u.CreateProcessInfo.lpBaseOfImage;
dwContinueStatus = DBG_CONTINUE;
break;
case EXIT_PROCESS_DEBUG_EVENT:
// Display the process's exit code.
printf("Process 0x%08X exited, exit code = 0x%08X\n",
dbgEvt.dwProcessId,
dbgEvt.u.ExitProcess.dwExitCode);
// Unexpected event. Do not continue execution.
err = ERROR_PROC_NOT_FOUND;
goto Cleanup;
case CREATE_THREAD_DEBUG_EVENT:
case EXIT_THREAD_DEBUG_EVENT:
case LOAD_DLL_DEBUG_EVENT:
case UNLOAD_DLL_DEBUG_EVENT:
case OUTPUT_DEBUG_STRING_EVENT:
dwContinueStatus = DBG_CONTINUE;
break;
case RIP_EVENT:
printf("RIP: Error = 0x%08X, Type = 0x%08X\n",
dbgEvt.u.RipInfo.dwError,
dbgEvt.u.RipInfo.dwType);
// Unexpected event. Do not continue execution.
err = dbgEvt.u.RipInfo.dwError;
goto Cleanup;
} // end of switch (dbgEvt.dwDebugEventCode)
// Resume executing the thread that reported the debugging event.
if (dwContinueStatus)
{
if (!ContinueDebugEvent(dbgEvt.dwProcessId,
dbgEvt.dwThreadId,
dwContinueStatus))
{
printf("ContinueDebugEvent() failed with error 0x%08X\n",
err = GetLastError());
goto Cleanup;
}
}
} // end of while (dwContinueStatus)
err = ERROR_SUCCESS;
Cleanup:
if (processInfo.hProcess != NULL)
{
if (procMem != NULL)
{
VirtualFreeEx(processInfo.hProcess, procMem, 0, MEM_RELEASE);
}
TerminateProcess(processInfo.hProcess, 0);
CloseHandle(processInfo.hThread);
CloseHandle(processInfo.hProcess);
}
va_end(ap);
return err;
}
int __cdecl FunctionCdecl(int x, int y, int z)
{
return x + y + z;
}
int __stdcall FunctionStdCall(int x, int y, int z)
{
return x * y * z;
}
ULONG64 __fastcall FunctionFastCall(DWORD x, DWORD y, DWORD z)
{
return (ULONG64)x * y + z;
}
int main(int argc, char** argv)
{
DWORD err;
ULONG64 resultEdxEax;
err = Execute32bitFunctionFromExe(argv[0]/*ExeName*/,
1/*FunctionAddressIsRelative*/,
(DWORD)&FunctionCdecl -
(DWORD)GetModuleHandle(NULL),
CallConvCdecl,
4096/*CodeDataStackSize*/,
&resultEdxEax,
3/*DwordParamsCount*/,
2, 3, 4);
if (err == ERROR_SUCCESS)
printf("2 + 3 + 4 = %d\n", (int)resultEdxEax);
err = Execute32bitFunctionFromExe(argv[0]/*ExeName*/,
1/*FunctionAddressIsRelative*/,
(DWORD)&FunctionStdCall -
(DWORD)GetModuleHandle(NULL),
CallConvStdCall,
4096/*CodeDataStackSize*/,
&resultEdxEax,
3/*DwordParamsCount*/,
-2, 3, 4);
if (err == ERROR_SUCCESS)
printf("-2 * 3 * 4 = %d\n", (int)resultEdxEax);
err = Execute32bitFunctionFromExe(argv[0]/*ExeName*/,
1/*FunctionAddressIsRelative*/,
(DWORD)&FunctionFastCall -
(DWORD)GetModuleHandle(NULL),
CallConvFastCall,
4096/*CodeDataStackSize*/,
&resultEdxEax,
3/*DwordParamsCount*/,
-1, -1, -1);
if (err == ERROR_SUCCESS)
printf("0xFFFFFFFF * 0xFFFFFFFF + 0xFFFFFFFF = 0x%llX\n",
(unsigned long long)resultEdxEax);
return 0;
}
Output:
Process 0x00001514 (0x00000040) "C:\MinGW\msys\1.0\home\Alex\unexported.exe" cre
ated,
Thread 0x00000CB0 (0x0000003C) created
Allocated RWX memory in process 0x00001514 (0x00000040) at address 0x002B0000
Process 0x00001514 (0x00000044) created, base = 0x00400000,
Thread 0x00000CB0 (0x00000048) created, start = 0x0040126C
First Chance (continuable) Exception in process 0x00001514, thread 0x00000CB0
Exc. Code = 0x80000003 (EXCEPTION_BREAKPOINT), Instr. Address = 0x77090FAB
Copied code/data to the process
First Chance (continuable) Exception in process 0x00001514, thread 0x00000CB0
Exc. Code = 0xC000001D (EXCEPTION_ILLEGAL_INSTRUCTION), Instr. Address = 0x002
B0FFE
Copied code/data from the process
2 + 3 + 4 = 9
Process 0x00001828 (0x0000003C) "C:\MinGW\msys\1.0\home\Alex\unexported.exe" cre
ated,
Thread 0x00001690 (0x00000040) created
Allocated RWX memory in process 0x00001828 (0x0000003C) at address 0x002B0000
Process 0x00001828 (0x0000006C) created, base = 0x00400000,
Thread 0x00001690 (0x00000074) created, start = 0x0040126C
First Chance (continuable) Exception in process 0x00001828, thread 0x00001690
Exc. Code = 0x80000003 (EXCEPTION_BREAKPOINT), Instr. Address = 0x77090FAB
Copied code/data to the process
First Chance (continuable) Exception in process 0x00001828, thread 0x00001690
Exc. Code = 0xC000001D (EXCEPTION_ILLEGAL_INSTRUCTION), Instr. Address = 0x002
B0FFE
Copied code/data from the process
-2 * 3 * 4 = -24
Process 0x00001388 (0x00000040) "C:\MinGW\msys\1.0\home\Alex\unexported.exe" cre
ated,
Thread 0x00001098 (0x0000003C) created
Allocated RWX memory in process 0x00001388 (0x00000040) at address 0x002B0000
Process 0x00001388 (0x0000008C) created, base = 0x00400000,
Thread 0x00001098 (0x00000090) created, start = 0x0040126C
First Chance (continuable) Exception in process 0x00001388, thread 0x00001098
Exc. Code = 0x80000003 (EXCEPTION_BREAKPOINT), Instr. Address = 0x77090FAB
Copied code/data to the process
First Chance (continuable) Exception in process 0x00001388, thread 0x00001098
Exc. Code = 0xC000001D (EXCEPTION_ILLEGAL_INSTRUCTION), Instr. Address = 0x002
B0FFE
Copied code/data from the process
0xFFFFFFFF * 0xFFFFFFFF + 0xFFFFFFFF = 0xFFFFFFFF00000000
Instead of loading the EXE into your process, two better (IMO) ways:
1) Use debugging API (or something like PyDbg) to start the target under a debugger, then set up the arguments in the stack, set EIP to the necessary address, put breakpoint on the return address, and resume.
2) make a small DLL with some IPC to communicate with your program, inject it into target (there are several ways to do it, the easiest is probably keyboard hooking) and have it call the necessary code. Or you could use an existing too that can do it, e.g. Intel's PIN.

Resources