Win32 - Backtrace from C code - c

I'm currently looking for a way to get backtrace information under Windows, from C code (no C++).
I'm building a cross-platform C library with reference-counting memory management. It also has an integrated memory debugger that provides information about memory mistakes (XEOS C Foundation Library).
When a fault occurs, the debugger is launched, providing information about the fault, and the memory record involved.
On Linux or Mac OS X, I can use the backtrace function from execinfo.h to display additional information about the memory fault.
I'm looking for the same thing on Windows.
I've seen How can one grab a stack trace in C? on Stack Overflow. I don't want to use a third-party library, so the CaptureStackBackTrace or StackWalk functions look good.
The only problem is that I just don't get how to use them, even with the Microsoft documentation.
I'm not used to Windows programming, as I usually work on POSIX compliant systems.
What are some explanations for those functions, and maybe some examples?
EDIT
I'm now considering using the CaptureStackBackTrace function, together with the symbol functions from DbgHelp.lib, as it seems there's a little less overhead...
Here's what I've tried so far:
unsigned int i;
void * stack[ 100 ];
unsigned short frames;
SYMBOL_INFO symbol;
HANDLE process;
process = GetCurrentProcess();
SymInitialize( process, NULL, TRUE );
frames = CaptureStackBackTrace( 0, 100, stack, NULL );
for( i = 0; i < frames; i++ )
{
SymFromAddr( process, ( DWORD64 )( stack[ i ] ), 0, &symbol );
printf( "%s\n", symbol.Name );
}
I'm just getting junk. I guess I should use something other than SymFromAddr.

Alright, now I got it. : )
The problem was in the SYMBOL_INFO structure. It needs to be allocated on the heap, reserving space for the symbol name, and initialized properly.
Here's the final code:
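#include <windows.h>
#include <dbghelp.h> /* link with DbgHelp.lib */
#include <stdio.h>
#include <stdlib.h>
void printStack( void );
void printStack( void )
{
    unsigned int   i;
    void         * stack[ 100 ];
    unsigned short frames;
    SYMBOL_INFO  * symbol;
    HANDLE         process;
    process = GetCurrentProcess();
    SymInitialize( process, NULL, TRUE );
    frames = CaptureStackBackTrace( 0, 100, stack, NULL );
    /* SYMBOL_INFO ends in a one-character Name array, so reserve room for the name after it */
    symbol = ( SYMBOL_INFO * )calloc( 1, sizeof( SYMBOL_INFO ) + 256 * sizeof( char ) );
    if( symbol == NULL )
    {
        return;
    }
    symbol->MaxNameLen   = 255;
    symbol->SizeOfStruct = sizeof( SYMBOL_INFO );
    for( i = 0; i < frames; i++ )
    {
        SymFromAddr( process, ( DWORD64 )( stack[ i ] ), 0, symbol );
        /* Address is a 64-bit value, so it is printed with %llX */
        printf( "%u: %s - 0x%llX\n", frames - i - 1, symbol->Name, ( unsigned long long )symbol->Address );
    }
    free( symbol );
}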
Output is:
6: printStack - 0xD2430
5: wmain - 0xD28F0
4: __tmainCRTStartup - 0xE5010
3: wmainCRTStartup - 0xE4FF0
2: BaseThreadInitThunk - 0x75BE3665
1: RtlInitializeExceptionChain - 0x770F9D0F
0: RtlInitializeExceptionChain - 0x770F9D0F
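The fix hinges on the layout of SYMBOL_INFO: the structure ends in a one-element Name array, and the name is written past that element, so the caller has to allocate the extra bytes. Below is a minimal portable sketch of that allocation pattern. It assumes nothing from DbgHelp: `sym_record`, `sym_record_alloc` and `sym_record_set_name` are hypothetical stand-ins, and the sketch uses a C99 flexible array member in place of the one-element array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SYMBOL_INFO: a fixed header followed by a
   name buffer whose real length is chosen at allocation time. */
typedef struct sym_record {
    unsigned long size_of_struct; /* size of the fixed part only */
    unsigned long max_name_len;   /* characters available, excluding NUL */
    char          name[];         /* C99 flexible array member */
} sym_record;

/* Allocate a zeroed record with room for max_name_len characters plus NUL. */
sym_record *sym_record_alloc(unsigned long max_name_len)
{
    sym_record *rec = calloc(1, sizeof(sym_record) + max_name_len + 1);
    if (rec == NULL)
        return NULL;
    rec->size_of_struct = sizeof(sym_record);
    rec->max_name_len = max_name_len;
    return rec;
}

/* Copy a name into the trailing buffer, truncating to max_name_len. */
void sym_record_set_name(sym_record *rec, const char *name)
{
    size_t n;

    assert(rec != NULL && name != NULL);
    n = strlen(name);
    if (n > rec->max_name_len)
        n = rec->max_name_len;
    memcpy(rec->name, name, n);
    rec->name[n] = '\0';
}
```

A record declared on the stack has no such trailing space, which is consistent with the junk output from the first attempt.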

Here's my super-low-fi alternative, as used for reading stacks from a C++ Builder app. This code is executed within the process itself when it crashes and gets a stack into the cs array.
int cslev = 0;
void* cs[300];
void* it = <ebp at time of crash>;
void* rm[2];
while(it && cslev<300)
{
/* Could just memcpy instead of ReadProcessMemory, but who knows if
the stack's valid? If it's invalid, memcpy could cause an AV, which is
pretty much exactly what we don't want
*/
err=ReadProcessMemory(GetCurrentProcess(),it,(LPVOID)rm,sizeof(rm),NULL);
if(!err)
break;
it=rm[0];
cs[cslev++]=(void*)rm[1];
}
UPDATE
Once I've got the stack, I then go about translating it into names. I do this by cross-referencing with the .map file that C++Builder outputs. The same thing could be done with a map file from another compiler, although the formatting would be somewhat different. The following code works for C++Builder maps. This is again quite low-fi and probably not the canonical MS way of doing things, but it works in my situation. The code below isn't delivered to end users.
char linbuf[300];
char *pars;
unsigned long coff,lngth,csect;
unsigned long thisa,sect;
char *fns[300];
unsigned int maxs[300];
FILE *map;
map = fopen(mapname, "r");
if (!map)
{
...Add error handling for missing map...
}
do
{
fgets(linbuf,300,map);
} while (!strstr(linbuf,"CODE"));
csect=strtoul(linbuf,&pars,16); /* Find out code segment number */
pars++; /* Skip colon */
coff=strtoul(pars,&pars,16); /* Find out code offset */
lngth=strtoul(pars,NULL,16); /* Find out code length */
do
{
fgets(linbuf,300,map);
} while (!strstr(linbuf,"Publics by Name"));
for(lop=0;lop!=cslev;lop++)
{
fns[lop] = NULL;
maxs[lop] = 0;
}
do
{
fgets(linbuf,300,map);
sect=strtoul(linbuf,&pars,16);
if(sect!=csect)
continue;
pars++;
thisa=strtoul(pars,&pars,16);
for(lop=0;lop!=cslev;lop++)
{
if(cs[lop]<coff || cs[lop]>coff+lngth)
continue;
if(thisa<cs[lop]-coff && thisa>maxs[lop])
{
maxs[lop]=thisa;
while(*pars==' ')
pars++;
fns[lop] = fnsbuf+(100*lop);
fnlen = strlen(pars);
if (fnlen>100)
fnlen = 100;
strncpy(fns[lop], pars, 99);
fns[lop][fnlen-1]='\0';
}
}
} while (!feof(map));
fclose(map);
After running this code, the fns array contains the best-matching function from the .map file.
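The loop above reads each Publics line as a hex segment number, a colon, a hex offset and then the symbol name. Here is a minimal sketch of parsing one such line with basic validation. The line layout `SSSS:OOOOOOOO name` is an assumption taken from the code above, and `parse_map_line` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse one line of the assumed form "SSSS:OOOOOOOO  name".
   Returns 1 on success, 0 if the line does not match. */
int parse_map_line(const char *line, unsigned long *seg,
                   unsigned long *off, char *name, size_t name_size)
{
    char *end;
    size_t len;

    assert(line && seg && off && name && name_size > 0);

    *seg = strtoul(line, &end, 16);
    if (end == line || *end != ':')
        return 0;                      /* no segment or no colon */

    line = end + 1;
    *off = strtoul(line, &end, 16);
    if (end == line)
        return 0;                      /* no offset digits */

    while (*end == ' ' || *end == '\t')
        end++;                         /* skip padding before the name */

    len = strcspn(end, "\r\n");        /* drop the trailing newline */
    if (len == 0)
        return 0;                      /* no symbol name */
    if (len >= name_size)
        len = name_size - 1;           /* truncate to fit */
    memcpy(name, end, len);
    name[len] = '\0';
    return 1;
}
```

Checking the return value also makes it easy to skip header lines such as "Publics by Name", which do not parse as an address.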
In my situation, I actually have the call stack, as produced by the first piece of code, submitted to a PHP script, and I do the equivalent of the C code above using a piece of PHP. This first bit parses the map file (again, this works with C++Builder maps but could be easily adapted to other map file formats):
$file = fopen($mapdir.$app."-".$appversion.".map","r");
if (!$file)
... Error handling for missing map ...
do
{
$mapline = fgets($file);
} while (!strstr($mapline,"CODE"));
$tokens = split("[[:space:]\:]", $mapline);
$codeseg = $tokens[1];
$codestart = intval($tokens[2],16);
$codelen = intval($tokens[3],16);
do
{
$mapline = fgets($file);
} while (!strstr($mapline,"Publics by Value"));
fgets($file); // Blank
$addrnum = 0;
$lastaddr = 0;
while (1)
{
if (feof($file))
break;
$mapline = fgets($file);
$tokens = split("[[:space:]\:]", $mapline);
$thisseg = $tokens[1];
if ($thisseg!=$codeseg)
break;
$addrs[$addrnum] = intval($tokens[2],16);
if ($addrs[$addrnum]==$lastaddr)
continue;
$lastaddr = $addrs[$addrnum];
$funcs[$addrnum] = trim(substr($mapline, 16));
$addrnum++;
}
fclose($file);
Then this bit translates an address (in $rowaddr) into a given function (as well as the offset after the function):
$thisaddr = intval($rowaddr,16);
$thisaddr -= $codestart;
if ($thisaddr>=0 && $thisaddr<=$codelen)
{
for ($lop=0; $lop!=$addrnum; $lop++)
if ($thisaddr<$addrs[$lop])
break;
}
else
$lop = $addrnum;
if ($lop!=$addrnum)
{
$lop--;
$lines[$ix] = substr($line,0,13).$rowaddr." : ".$funcs[$lop]." (+".sprintf("%04X",$thisaddr-$addrs[$lop]).")";
$stack .= $rowaddr;
}
else
{
$lines[$ix] = substr($line,0,13).$rowaddr." : external";
}
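The lookup step can be sketched in C as well: with the symbols sorted by address, the enclosing function is the last entry whose address is not greater than the target, and the offset is the difference. This is a minimal sketch that uses a binary search rather than the linear scan above; `map_symbol` and `find_symbol` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct map_symbol {
    unsigned long addr;   /* offset of the symbol within the code segment */
    const char   *name;
} map_symbol;

/* Return the last symbol with addr <= target (table sorted ascending),
   or NULL if target lies before the first symbol. The distance into the
   function is stored in *offset when offset is not NULL. */
const map_symbol *find_symbol(const map_symbol *table, size_t count,
                              unsigned long target, unsigned long *offset)
{
    size_t lo = 0, hi = count;  /* search window is [lo, hi) */

    assert(table != NULL || count == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= target)
            lo = mid + 1;       /* candidate; look for a later one */
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;            /* target precedes every symbol */
    if (offset != NULL)
        *offset = target - table[lo - 1].addr;
    return &table[lo - 1];
}
```

The caller is still expected to reject addresses outside the code segment first, as the PHP does with $codestart and $codelen.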

@Jon Bright: You say "who knows if the stack's valid?" Well, there's a way to find out, as the stack addresses are known. This assumes you need a trace of the current thread, of course:
NT_TIB* pTEB = GetTEB();
UINT_PTR ebp = GetEBPForStackTrace();
HANDLE hCurProc = ::GetCurrentProcess();
while (
((ebp & 3) == 0) &&
ebp + 2*sizeof(VOID*) < (UINT_PTR)pTEB->StackBase &&
ebp >= (UINT_PTR)pTEB->StackLimit &&
nAddresses < nTraceBuffers)
{
pTraces[nAddresses++]._EIP = ((UINT_PTR*)ebp)[1];
ebp = ((UINT_PTR*)ebp)[0];
}
My "GetTEB()" is NtCurrentTeb() from NTDLL.DLL, and it is not limited to Windows 7 and above as the current MSDN states; it has been there for a long time. MS junks up the documentation. Using the Thread Environment Block (TEB), you do not need ReadProcessMemory(), as you know the stack's lower and upper limits. I assume this is the fastest way to do it.
Using the MS compiler, GetEBPForStackTrace() can be
inline __declspec(naked) UINT_PTR GetEBPForStackTrace()
{
__asm
{
mov eax, ebp
ret
}
}
as an easy way to get the EBP of the current thread (but you can pass any valid EBP to this loop as long as it is for the current thread).
Limitation: This is valid for x86 under Windows.
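The same bounds-checked walk can be tried portably on a simulated stack. This is a minimal sketch, assuming frames laid out as { saved frame pointer, return address } pairs; `walk_frames` is a hypothetical name, `stack_lo` and `stack_hi` play the role of StackLimit and StackBase, and the extra check that the chain only moves upward is an addition not present in the loop above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of frames laid out as { saved frame pointer, return address }.
   stack_lo/stack_hi bound the valid stack memory (like StackLimit/StackBase).
   Returns the number of return addresses stored in out[]. */
size_t walk_frames(uintptr_t fp, uintptr_t stack_lo, uintptr_t stack_hi,
                   uintptr_t *out, size_t max_out)
{
    size_t n = 0;

    assert(out != NULL || max_out == 0);

    while (n < max_out &&
           (fp % sizeof(uintptr_t)) == 0 &&               /* aligned        */
           fp >= stack_lo &&                              /* inside stack   */
           fp <= stack_hi &&
           stack_hi - fp >= 2 * sizeof(uintptr_t)) {      /* both words fit */
        const uintptr_t *frame = (const uintptr_t *)fp;
        uintptr_t next = frame[0];

        out[n++] = frame[1];
        if (next <= fp)
            break;   /* caller frames sit at higher addresses; a chain that
                        does not move upward (or a zero link) ends the walk */
        fp = next;
    }
    return n;
}
```

Because every dereference is preceded by the range check, a corrupted frame pointer ends the walk instead of faulting.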

Related

char * array not valid anymore when modified from a function (C99)

This comes up in Vulkan, but it is more of a C99 problem. I create my extension list this way:
const char *extension_list[7];
uint32_t extension_count = 0;
extension_list[ extension_count++ ] = VK_KHR_SWAPCHAIN_EXTENSION_NAME;
extension_list[ extension_count++ ] = VK_KHR_MAINTENANCE1_EXTENSION_NAME;
Then I can call vkCreateDevice with this extensions array and count:
VkDeviceCreateInfo desc;
desc.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
desc.enabledExtensionCount = extension_count;
desc.ppEnabledExtensionNames = extension_list ;
// (...)
vkCreateDevice( physical_device, &desc, NULL, &vk.device );
I need to add other extensions from a function, addOtherExt(). (I simplified the code to show only my problem, but I have several extensions to add.)
void addOtherExt( const char *list[], uint32_t *count )
{
list[ *count ] = VK_KHR_MAINTENANCE1_EXTENSION_NAME;
*count += 1;
}
The code is the same except for the function call:
extension_list[ extension_count++ ] = VK_KHR_SWAPCHAIN_EXTENSION_NAME;
addOtherExt( extension_list, &extension_count );
vkCreateDevice doesn't return an error, but I got a crash with message "Failed to find entrypoint vkAcquireNextImageKHR".
I printed the array extension_list and it looks exactly like it does in the version without the function (the extension is added), and extension_count is correct.
I suspected the terminating '\0', but did not manage to make it work.
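For what it's worth, passing the array to a function is well defined in C99: a parameter declared as `const char *list[]` is a pointer to the caller's first element, so the assignment lands in the caller's array. Below is a minimal sketch with plain string literals standing in for the Vulkan macros; `add_other_ext` and `has_ext` are illustrative names, not Vulkan API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Append one entry to the caller's array; list points at the caller's
   first element, so the write is visible after the call returns. */
void add_other_ext(const char *list[], uint32_t *count)
{
    assert(list != NULL && count != NULL);
    list[*count] = "VK_KHR_maintenance1";  /* stand-in for the macro */
    *count += 1;
}

/* Return 1 if name is among the first count entries, 0 otherwise. */
int has_ext(const char *list[], uint32_t count, const char *name)
{
    uint32_t i;

    assert(list != NULL && name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(list[i], name) == 0)
            return 1;
    }
    return 0;
}
```

Since both versions leave the same pointers and the same count in the array, this suggests the array handling is not what differs between them.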

Return large string from lua function in nodemcu lua

I'm trying to modify the NodeMCU Lua file.list function (/app/modules/file.c)
so that it returns one large string of filenames separated by newline characters. Currently it returns an array, which is very memory consuming. I'm also stripping the file size.
Here is what I have done (after examining how some other functions return strings):
static int file_list( lua_State* L )
{
char temp[32];
unsigned st = luaL_optinteger( L, 1, 1 ); // start offset
unsigned tf = luaL_optinteger( L, 2, 100000 ); // how much files to list
tf=tf+st;
vfs_dir *dir;
if (dir = vfs_opendir("")) {
lua_newtable( L );
struct vfs_stat stat;
int i=1;
int ii=0;
while (vfs_readdir(dir, &stat) == VFS_RES_OK) {
if (i<st)
{
i++;
continue;
}
if (i>=tf)
{
break;
}
strcpy (temp,stat.name);
strcat (temp,"\n");
lua_pushstring( L, temp );
i++;
ii++;
}
vfs_closedir(dir);
return ii;
}
return 0;
}
It does not work as expected. If I request more than 40 files (at once, after the device reboots), I see output like this:
....
3fff0d10 already freed
3fff11b8 already freed
3fff0cc8 already freed
3fff1f88 already freed
.....
and the device restarts. But if I request 30 files and then increase the count in steps of 30 each time, I manage to get 400 files at one time.
=file.list(1,30)
=file.list(1,60)
=file.list(1,90)
That way it works. If I directly run:
=file.list(1,60)
it does not work. I also noticed that memory is allocated and not freed after the function finishes, but it is not allocated again when the same command runs again, so it is not a memory leak; perhaps some data just stays on the stack.
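As written, the function pushes one Lua string per file and returns ii values, rather than one large string. One way to build a single newline-separated result is to append every name to one growing buffer and hand it over once (in the Lua C API, the luaL_Buffer helpers serve that purpose). Below is a plain C sketch of the accumulation step only; `name_buf` and `name_buf_add` are hypothetical names, and nothing here is NodeMCU code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct name_buf {
    char  *data;  /* NUL-terminated once anything has been appended */
    size_t len;   /* bytes used, excluding the NUL */
    size_t cap;   /* bytes allocated */
} name_buf;

/* Append name followed by '\n'. Returns 1 on success, 0 on allocation
   failure (the buffer is left unchanged in that case). */
int name_buf_add(name_buf *b, const char *name)
{
    size_t n, need;

    assert(b != NULL && name != NULL);
    n = strlen(name);
    need = b->len + n + 2;             /* name + '\n' + NUL */

    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        char *p;

        while (cap < need)
            cap *= 2;                  /* grow geometrically */
        p = realloc(b->data, cap);
        if (p == NULL)
            return 0;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, name, n);
    b->len += n;
    b->data[b->len++] = '\n';
    b->data[b->len] = '\0';
    return 1;
}
```

Starting from a zeroed `name_buf`, one call per directory entry leaves a single string that can be returned as one value.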

Entry Point Obscuring

I've been writing an EPO program and so far I've been able to find a call opcode and get the RVA from the following address in the binary, then parse the IAT to get names of functions that are imported and their corresponding RVA's.
I've run into a problem when trying to fill arrays with the names and RVAs, and then comparing the WORD value I have from the call address against the RVAs of all the imported functions.
Here's the code I've been working with:
//Declarations.
DWORD dwImportDirectoryVA,dwSectionCount,dwSection=0,dwRawOffset;
PIMAGE_IMPORT_DESCRIPTOR pImportDescriptor;
PIMAGE_THUNK_DATA pThunkData, pFThunkData;
// Arrays to hold names + rva's
unsigned long namearray[100];
DWORD rvaArray[100];
int i = 0;
And the rest:
/* Import Code: */
dwSectionCount = pNtHeaders->FileHeader.NumberOfSections;
dwImportDirectoryVA = pNtHeaders->OptionalHeader.DataDirectory[1].VirtualAddress;
for(;dwSection < dwSectionCount && pSectionHeader->VirtualAddress <= dwImportDirectoryVA;pSectionHeader++,dwSection++);
pSectionHeader--;
dwRawOffset = (DWORD)hMap+pSectionHeader->PointerToRawData;
pImportDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(dwRawOffset+(dwImportDirectoryVA-pSectionHeader->VirtualAddress));
for(;pImportDescriptor->Name!=0;pImportDescriptor++)
{
pThunkData = (PIMAGE_THUNK_DATA)(dwRawOffset+(pImportDescriptor->OriginalFirstThunk-pSectionHeader->VirtualAddress));
pFThunkData = (PIMAGE_THUNK_DATA)pImportDescriptor->FirstThunk;
for(;pThunkData->u1.AddressOfData != 0;pThunkData++)
{
if(!(pThunkData->u1.Ordinal & IMAGE_ORDINAL_FLAG32))
{
namearray[i] = (dwRawOffset+(pThunkData->u1.AddressOfData-pSectionHeader->VirtualAddress+2));
rvaArray[i] = pFThunkData;
i++;
//
pFThunkData++;
}
}
}
printf("\nFinished.\n");
for (i = 0 ; i <= 100 ; i++)
{
//wRva is defined and initialized earlier in code.
if (rvaArray[i] == wRva)
{
printf("Call to %s found. Address: %X\n", namearray[i], rvaArray[i]);
}
}
NOTE: A lot of this code has been stripped down ( printf statements to track progress.)
The problem is the types of arrays I've been using. I'm not sure how I can store pThunkData (Names) and pFThunkData (RVA's) correctly for usage later on.
I've tried a few things and messed around with the code, but I'm admitting defeat and asking for your help.
You could create a list or array of structs, containing pThunkData and pFThunkData.
#define n 100
struct pdata
{
    PIMAGE_THUNK_DATA p_thunk_data;
    PIMAGE_THUNK_DATA pf_thunk_data;
};
struct pdata pdatas[n];
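Along the same lines, here is a minimal sketch of an array of structs that pairs each import name with its RVA and then searches it for a call target. It uses fixed-width standard types instead of the Windows ones so it stays self-contained; `import_entry` and `find_import` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct import_entry {
    const char *name;  /* points at the import's name string */
    uint32_t    rva;   /* RVA of its IAT slot */
} import_entry;

/* Return the entry whose rva equals the target, or NULL if none does.
   Only the first count entries are examined. */
const import_entry *find_import(const import_entry *entries, size_t count,
                                uint32_t rva)
{
    size_t i;

    assert(entries != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (entries[i].rva == rva)
            return &entries[i];
    }
    return NULL;
}
```

Passing the number of entries actually filled in, rather than the array capacity, keeps the search from reading uninitialized slots.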

How to know the address range when searching for a function by its signature?

I'm trying to search for a function by its "signature".
However I can't figure out what address range I'm supposed to be searching?
I've had a look at VirtualQuery() and GetNativeSystemInfo(), but I'm not sure if I'm on the right path or not.
Edit: Question re-attempt.
Using Win32 API I'm trying to find out how to get the start and end address of the executable pages of the process my code is executing in.
This is what I've tried:
SYSTEM_INFO info;
ZeroMemory( &info, sizeof( SYSTEM_INFO ) );
GetNativeSystemInfo( &info ); // GetSystemInfo() might be wrong on WOW64.
info.lpMinimumApplicationAddress;
info.lpMaximumApplicationAddress;
HANDLE thisProcess = GetCurrentProcess();
MEMORY_BASIC_INFORMATION memInfo;
ZeroMemory( &memInfo, sizeof( memInfo ) );
DWORD addr = (DWORD)info.lpMinimumApplicationAddress;
do
{
if ( VirtualQueryEx( thisProcess, (LPVOID)addr, &memInfo, sizeof( memInfo ) ) == 0 )
{
DWORD gle = GetLastError();
if ( gle != ERROR_INVALID_PARAMETER )
{
std::stringstream str;
str << "VirtualQueryEx failed with: " << gle;
MessageBoxA( NULL, str.str().c_str(), "Error", MB_OK );
}
break;
}
if ( memInfo.Type == MEM_IMAGE )
{
// TODO: Scan this memory block for the signature
}
addr += info.dwPageSize;
}
while ( addr < (DWORD)info.lpMaximumApplicationAddress );
The reason for doing this is that I'm looking for an un-exported function by its signature as asked here:
Find a function by it signature in Windows DLL
See the answer about "code signature scanning".
While this is enumerating an address range, I don't know if this is correct or not, since I don't know what the expected range should be. It's just the best I could come up with from looking around MSDN.
The address range when signature scanning a module runs from the start of the code section to that start plus the section size. Both the start of the code section and its size are in the PE headers. Most tools take the lazy route and scan the entire module (again using the PE headers to get the size, but with the module handle as the start address).
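Once the start address and size are known, the scan itself is a plain byte search. This is a minimal sketch under the common convention of a pattern plus a mask string in which '?' marks a wildcard byte; that convention and the name `find_signature` are assumptions, not something taken from the linked answer.

```c
#include <assert.h>
#include <stddef.h>

/* Search [start, start + size) for pattern; mask[i] == '?' marks a
   wildcard byte, any other mask character requires an exact match.
   Returns a pointer to the first match, or NULL. */
const unsigned char *find_signature(const unsigned char *start, size_t size,
                                    const unsigned char *pattern,
                                    const char *mask, size_t pattern_len)
{
    size_t i, j;

    assert(start != NULL && pattern != NULL && mask != NULL);
    if (pattern_len == 0 || pattern_len > size)
        return NULL;

    for (i = 0; i + pattern_len <= size; i++) {
        for (j = 0; j < pattern_len; j++) {
            if (mask[j] != '?' && start[i + j] != pattern[j])
                break;
        }
        if (j == pattern_len)
            return start + i;
    }
    return NULL;
}
```

Here `start` and `size` would be the code section's address and length, and wildcards cover bytes such as relative call targets that change between builds.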

tmpfile() on windows 7 x64

Running the following code on Windows 7 x64
#include <stdio.h>
#include <errno.h>
int main() {
int i;
FILE *tmp;
for (i = 0; i < 10000; i++) {
errno = 0;
if(!(tmp = tmpfile())) printf("Fail %d, err %d\n", i, errno);
else fclose(tmp);
}
return 0;
}
gives errno 13 (Permission denied) on the 637th and 1004th calls. It works fine on XP (I haven't tried 7 x86). Am I missing something, or is this a bug?
I had a similar problem on Windows 8: tmpfile() was causing the Win32 ERROR_ACCESS_DENIED error code, and yes, if you run the application with administrator privileges, it works fine.
I guess the problem is the one mentioned here:
https://lists.gnu.org/archive/html/bug-gnulib/2007-02/msg00162.html
Under Windows, the tmpfile function is defined to always create
its temporary file in the root directory. Most users don't have
permission to do that, so it will often fail.
I suspect this is an incomplete Windows port issue, so it should be reported to Microsoft as an error. (Why code a tmpfile function if it's useless?)
But who has time to tilt at Microsoft windmills?! :-)
I've coded a similar implementation using GetTempPathW / GetModuleFileNameW / _wfopen. The code where I encountered this problem came from libjpeg. I'm attaching the whole source file here, but you can pick the relevant code out of jpeg_open_backing_store.
jmemwin.cpp:
//
// Windows port for jpeg lib functions.
//
#define JPEG_INTERNALS
#include <Windows.h> // GetTempFileName
#undef FAR // Will be redefined - disable warning
#include "jinclude.h"
#include "jpeglib.h"
extern "C" {
#include "jmemsys.h" // jpeg_ api interface.
//
// Memory allocation and freeing are controlled by the regular library routines malloc() and free().
//
GLOBAL(void *) jpeg_get_small (j_common_ptr cinfo, size_t sizeofobject)
{
return (void *) malloc(sizeofobject);
}
GLOBAL(void) jpeg_free_small (j_common_ptr cinfo, void * object, size_t sizeofobject)
{
free(object);
}
/*
* "Large" objects are treated the same as "small" ones.
* NB: although we include FAR keywords in the routine declarations,
* this file won't actually work in 80x86 small/medium model; at least,
* you probably won't be able to process useful-size images in only 64KB.
*/
GLOBAL(void FAR *) jpeg_get_large (j_common_ptr cinfo, size_t sizeofobject)
{
return (void FAR *) malloc(sizeofobject);
}
GLOBAL(void) jpeg_free_large (j_common_ptr cinfo, void FAR * object, size_t sizeofobject)
{
free(object);
}
//
// Used only by command line applications, not by static library compilation
//
#ifndef DEFAULT_MAX_MEM /* so can override from makefile */
#define DEFAULT_MAX_MEM 1000000L /* default: one megabyte */
#endif
GLOBAL(long) jpeg_mem_available (j_common_ptr cinfo, long min_bytes_needed, long max_bytes_needed, long already_allocated)
{
// jmemansi.c's jpeg_mem_available implementation was insufficient for some of .jpg loads.
MEMORYSTATUSEX status = { 0 };
status.dwLength = sizeof(status);
GlobalMemoryStatusEx(&status);
if( status.ullAvailPhys > LONG_MAX )
// Normally goes here since new PC's have more than 4 Gb of ram.
return LONG_MAX;
return (long) status.ullAvailPhys;
}
/*
Backing store (temporary file) management.
Backing store objects are only used when the value returned by
jpeg_mem_available is less than the total space needed. You can dispense
with these routines if you have plenty of virtual memory; see jmemnobs.c.
*/
METHODDEF(void) read_backing_store (j_common_ptr cinfo, backing_store_ptr info, void FAR * buffer_address, long file_offset, long byte_count)
{
if (fseek(info->temp_file, file_offset, SEEK_SET))
ERREXIT(cinfo, JERR_TFILE_SEEK);
size_t readed = fread( buffer_address, 1, byte_count, info->temp_file);
if (readed != (size_t) byte_count)
ERREXIT(cinfo, JERR_TFILE_READ);
}
METHODDEF(void)
write_backing_store (j_common_ptr cinfo, backing_store_ptr info, void FAR * buffer_address, long file_offset, long byte_count)
{
if (fseek(info->temp_file, file_offset, SEEK_SET))
ERREXIT(cinfo, JERR_TFILE_SEEK);
if (JFWRITE(info->temp_file, buffer_address, byte_count) != (size_t) byte_count)
ERREXIT(cinfo, JERR_TFILE_WRITE);
// E.g. if you need to debug writes.
//if( fflush(info->temp_file) != 0 )
// ERREXIT(cinfo, JERR_TFILE_WRITE);
}
METHODDEF(void)
close_backing_store (j_common_ptr cinfo, backing_store_ptr info)
{
fclose(info->temp_file);
// File is deleted using 'D' flag on open.
}
static HMODULE DllHandle()
{
MEMORY_BASIC_INFORMATION info;
VirtualQuery(DllHandle, &info, sizeof(MEMORY_BASIC_INFORMATION));
return (HMODULE)info.AllocationBase;
}
GLOBAL(void) jpeg_open_backing_store(j_common_ptr cinfo, backing_store_ptr info, long total_bytes_needed)
{
// Generate unique filename.
wchar_t path[ MAX_PATH ] = { 0 };
wchar_t dllPath[ MAX_PATH ] = { 0 };
GetTempPathW( MAX_PATH, path );
// Based on .exe or .dll filename
GetModuleFileNameW( DllHandle(), dllPath, MAX_PATH );
wchar_t* p = wcsrchr( dllPath, L'\\');
wchar_t* ext = wcsrchr( p + 1, L'.');
if( ext ) *ext = 0;
wchar_t* outFile = path + wcslen(path);
static int iTempFileId = 1;
// Based on process id (so processes would not fight with each other)
// Based on some process global id.
wsprintfW(outFile, L"%s_%d_%d.tmp",p + 1, GetCurrentProcessId(), iTempFileId++ );
// 'D' - temporary file.
if ((info->temp_file = _wfopen(path, L"w+bD") ) == NULL)
ERREXITS(cinfo, JERR_TFILE_CREATE, "");
info->read_backing_store = read_backing_store;
info->write_backing_store = write_backing_store;
info->close_backing_store = close_backing_store;
} //jpeg_open_backing_store
/*
* These routines take care of any system-dependent initialization and
* cleanup required.
*/
GLOBAL(long)
jpeg_mem_init (j_common_ptr cinfo)
{
return DEFAULT_MAX_MEM; /* default for max_memory_to_use */
}
GLOBAL(void)
jpeg_mem_term (j_common_ptr cinfo)
{
/* no work */
}
}
I'm intentionally ignoring errors from some of the functions. Have you ever seen GetTempPathW or GetModuleFileNameW fail?
A bit of a refresher from the man page for tmpfile(), which returns a FILE*:
The file will be automatically deleted when it is closed or the program terminates.
My verdict for this issue: Deleting a file on Windows is weird.
When you delete a file on Windows, for as long as something holds a handle to it, you can't call CreateFile on anything with the same absolute path; the call fails with the NT error code STATUS_DELETE_PENDING, which gets mapped to the Win32 code ERROR_ACCESS_DENIED. This is probably where the EACCES (13) in errno is coming from. You can confirm this with a tool like Sysinternals Process Monitor.
My guess is that CRT somehow wound up creating a file that has the same name as something it's used before. I've sometimes witnessed that deleting files on Windows can appear asynchronous because some other process (sometimes even an antivirus product, in reaction to the fact that you've just closed a delete-on-close handle...) will leave a handle open to the file, so for some timing window you will see a visible file that you can't get a handle to without hitting delete pending/access denied. Or, it could be that tmpfile has simply chosen a filename that some other process is working on.
To avoid this sort of thing you might want to consider another mechanism for temp files... For example a function like Win32 GetTempFileName allows you to create your own prefix which might make a collision less likely. That function appears to resolve race conditions by retrying if a create fails with "already exists", so be careful about deleting the temp filenames that thing generates - deleting the file cancels your rights to use it concurrently with other processes/threads.
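The retry-on-collision idea can be sketched in standard C as well. This is a minimal sketch, assuming a C11 runtime whose fopen accepts the "x" (exclusive create) flag; `open_unique_temp` and its naming scheme are hypothetical, and it is not how GetTempFileName itself is implemented.

```c
#include <assert.h>
#include <stdio.h>

/* Try dir/prefix_<n>.tmp for n = 0..max_tries-1 and return the first file
   that could be created exclusively ("x": fail if the file already exists).
   On success the chosen path is left in path_out. */
FILE *open_unique_temp(const char *dir, const char *prefix,
                       char *path_out, size_t path_size, unsigned max_tries)
{
    unsigned n;

    assert(dir && prefix && path_out && path_size > 0);

    for (n = 0; n < max_tries; n++) {
        FILE *f;
        int len = snprintf(path_out, path_size, "%s/%s_%u.tmp",
                           dir, prefix, n);

        if (len < 0 || (size_t)len >= path_size)
            return NULL;               /* path does not fit */

        f = fopen(path_out, "w+bx");   /* C11 exclusive create */
        if (f != NULL)
            return f;                  /* name was free; we own it */
        /* otherwise assume a collision and try the next candidate */
    }
    return NULL;
}
```

Unlike tmpfile(), the file is not removed automatically, so the caller has to close it and delete `path_out` when done.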

Resources