I'm encountering issues when trying to compile my C code on Win64. More specifically, the compiler cannot find the sys/mman.h header, which I understand is found in Unix environments only.
I already know that it deals with memory allocation.
Is there an equivalent for Windows that I can use to port the code (this is my first time trying)?
The code that causes the issue:
/* Allocate memory required by processes */
buf = (int*) malloc (sizeof(int));
if (!buf)
{
perror("Error");
free (buf);
return -3;
}
/* Lock down pages mapped to processes */
puts("Locking down processes.");
if(mlockall (MCL_CURRENT | MCL_FUTURE) < 0)
{
perror("mlockall");
free (buf);
return -4;
}
You should look at the mman-win32 library. But as @Mgetz pointed out, a simpler way is to look at the VirtualAllocEx functions and try to adapt your code.
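For the mlockall call in the question specifically, there does not seem to be a one-call Windows equivalent; the closest documented counterpart I know of is VirtualLock, which pins a given address range rather than the whole process. Below is a minimal sketch, not a drop-in port: lock_region and unlock_region are hypothetical helper names, and on Windows the amount you can lock is bounded by the process working-set size, so large regions may first need SetProcessWorkingSetSize.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/* Hypothetical helper: pin [addr, addr + len) in physical memory.
 * Returns 0 on success, -1 on failure. */
int lock_region(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#ifdef _WIN32
    return VirtualLock(addr, len) ? 0 : -1;
#else
    return mlock(addr, len) == 0 ? 0 : -1;
#endif
}

/* Hypothetical helper: undo lock_region. Returns 0 on success, -1 on failure. */
int unlock_region(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#ifdef _WIN32
    return VirtualUnlock(addr, len) ? 0 : -1;
#else
    return munlock(addr, len) == 0 ? 0 : -1;
#endif
}
```

In the question's code this would mean calling lock_region(buf, sizeof(int)) after the malloc instead of mlockall, and treating a -1 result the same way the mlockall failure is treated.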
I was able to get around the issue by using g++ under Cygwin, making sure that the g++ came from the Cygwin installation (the version selected in the installer) and not from the compiler already installed on Windows.
Is there any way to put processor instructions into an array, make its memory segment executable, and run it as a simple function:
int main()
{
unsigned char myarr[13] = {0x90, 0xc3};
void (*myfunc)(void) = (void (*)(void)) myarr;
myfunc();
return 0;
}
On Unix (these days, that means "everything except Windows and some embedded and mainframe stuff you've probably never heard of") you do this by allocating a whole number of pages with mmap, writing the code into them, and then making them executable with mprotect.
void execute_generated_machine_code(const uint8_t *code, size_t codelen)
{
// in order to manipulate memory protection, we must work with
// whole pages allocated directly from the operating system.
static size_t pagesize;
if (!pagesize) {
pagesize = sysconf(_SC_PAGESIZE);
if (pagesize == (size_t)-1) fatal_perror("getpagesize");
}
// allocate at least enough space for the code + 1 byte
// (so that there will be at least one INT3 - see below),
// rounded up to a multiple of the system page size.
size_t rounded_codesize = ((codelen + 1 + pagesize - 1)
/ pagesize) * pagesize;
void *executable_area = mmap(0, rounded_codesize,
PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0);
if (executable_area == MAP_FAILED) fatal_perror("mmap");
// at this point, executable_area points to memory that is writable but
// *not* executable. load the code into it.
memcpy(executable_area, code, codelen);
// fill the space at the end with INT3 instructions, to guarantee
// a prompt crash if the generated code runs off the end.
// must change this if generating code for non-x86.
memset((uint8_t *)executable_area + codelen, 0xCC, rounded_codesize - codelen);
// make executable_area actually executable (and unwritable)
if (mprotect(executable_area, rounded_codesize, PROT_READ|PROT_EXEC))
fatal_perror("mprotect");
// now we can call it. passing arguments / receiving return values
// is left as an exercise (consult libffi source code for clues).
((void (*)(void)) executable_area)();
munmap(executable_area, rounded_codesize);
}
You can probably see that this code is very nearly the same as the Windows code shown in cherrydt's answer. Only the names and arguments of the system calls are different.
When working with code like this, it is important to know that many modern operating systems will not allow you to have a page of RAM that is simultaneously writable and executable. On such a system, if I'd written PROT_READ|PROT_WRITE|PROT_EXEC in the call to mmap or mprotect, it would fail. This is called the W^X policy; the acronym stands for Write XOR eXecute. It originates with OpenBSD, and the idea is to make it harder for a buffer-overflow exploit to write code into RAM and then execute it. (It's still possible; the exploit just has to find a way to make an appropriate call to mprotect first.)
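Going back to the rounding step in execute_generated_machine_code: the expression for rounded_codesize is the usual integer round-up-to-a-multiple idiom. Here it is pulled out into a small function so it can be checked in isolation; round_up_to_page is a hypothetical name, and the 4096-byte page used in the note below is only an assumption (the real value comes from sysconf).

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of pagesize (pagesize must be non-zero). */
size_t round_up_to_page(size_t n, size_t pagesize)
{
    assert(pagesize != 0);
    return ((n + pagesize - 1) / pagesize) * pagesize;
}
```

With an assumed 4096-byte page, a 2-byte code sequence plus the extra byte rounds up to 4096, so the remaining 4094 bytes would be filled with INT3. Code that is exactly 4096 bytes long needs a second page, because of the extra byte.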
Depends on the platform.
For Windows, you can use this code:
// Allocate some memory as readable+writable
// TODO: Check return value for error
LPVOID memPtr = VirtualAlloc(NULL, sizeof(myarr), MEM_COMMIT, PAGE_READWRITE);
// Copy data
memcpy(memPtr, myarr, sizeof(myarr));
// Change memory protection to readable+executable
// Again, TODO: Error checking
DWORD oldProtection; // Not used but required for the function
VirtualProtect(memPtr, sizeof(myarr), PAGE_EXECUTE_READ, &oldProtection);
// Assign and call the function
void (*myfunc)(void) = (void (*)(void)) memPtr;
myfunc();
// Free the memory
VirtualFree(memPtr, 0, MEM_RELEASE);
This code assumes a myarr array as in your question's code, and it assumes that sizeof will work on it, i.e. it has a directly defined size and is not just a pointer passed from elsewhere. If the latter is the case, you would have to specify the size in another way.
Note that there are two possible "simplifications" here, in case you wonder, but I would advise against them:
1) You could call VirtualAlloc with PAGE_EXECUTE_READWRITE, but this is generally bad practice because it opens an attack vector for unwanted code execution.
2) You could call VirtualProtect on &myarr directly, but this would make whatever page happens to contain your array executable, which is even worse than #1 because any other data in that page suddenly becomes executable as well.
For Linux, I found this on Google but I don't know much about it.
Very OS-dependent: not all OSes will deliberately (read: without a bug) allow you to execute code in the data segment. DOS will because it runs in Real Mode, Linux can also with the appropriate privileges. I don't know about Windows.
Casting is often undefined and has its own caveats, so some elaboration on that topic here. From C11 standard draft N1570, §J.5.7/1:
A pointer to an object or to void may
be cast to a pointer to a function, allowing data to be invoked as a
function (6.5.4).
(Formatting added.)
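So the cast is documented as a common extension (Annex J.5) rather than guaranteed by the standard itself, but mainstream compilers support it and it should work as expected. Of course, you would need to adhere to the ABI's calling convention.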
I'm dealing with a strange error in code which I wrote in C.
This is the place where the error occurs:
char* firstChar = (char*) malloc(ONE_CHAR_STRING);
if (!firstChar) {
*result = MTM_OUT_OF_MEMORY;
return false;
}
if (command != NULL) {
strcpy(firstChar, command);
firstChar[1] = '\0';
}
free(firstChar);
'command' is a string, and ONE_CHAR_STRING is defined in the program (ONE_CHAR_STRING = 2).
The error that appears when the program reaches the 'free' call is:
warning: Heap block at 00731528 modified at 00731532 past requested size of 2
Strangely, this error happens only on my PC (Eclipse on Windows). When I run the code on Linux, it doesn't show this error and this specific part works fine.
What could be the reason?
Another question, again about memory errors: how is it possible that my program (without this part) works fine on Windows, but on Linux there is a problem in one of my memory allocations?
I can't post the code here because it's too long (and gdb doesn't give me the lines where the error occurs). The question is about whether this is possible and what the reasons could be.
Thanks, Almog.
You can use another string copy function to avoid the overflow:
strncpy(firstChar,command,ONE_CHAR_STRING);
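strcpy will write past the end of firstChar if command is longer than ONE_CHAR_STRING - 1 characters or is not null-terminated, which leads to this strange behavior. Note that strncpy does not add a terminator when it truncates, so keep the firstChar[1] = '\0' line. You can also copy the first character safely by assigning firstChar[0] = command[0]; firstChar[1] = '\0';
If your compiler on both Linux and Windows is gcc (MinGW on Windows), use -fstack-protector as a compiler option to help you debug buffer overflows from functions such as strcpy. Note that it guards stack buffers only, so it will not catch an overflow of a malloc'd block like this one.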
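The direct-assignment approach can be wrapped in a small function. This is only a sketch: copy_first_char is a hypothetical name, and ONE_CHAR_STRING is redefined here with the value given in the question.

```c
#include <assert.h>
#include <stddef.h>

#define ONE_CHAR_STRING 2   /* same value as in the question */

/* Store at most the first character of command in out and always
 * NUL-terminate it. A NULL or empty command yields an empty string. */
void copy_first_char(char out[ONE_CHAR_STRING], const char *command)
{
    assert(out != NULL);
    out[0] = (command != NULL) ? command[0] : '\0';
    out[1] = '\0';
}
```

Because exactly two bytes are ever written, the length of command no longer matters and the heap block cannot be overrun.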
Consider this code (error checking removed for brevity):
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
int main(void)
{
int fd, nread;
struct stat st_buff;
/* Get information about the file */
stat("data",&st_buff);
/* Open file data for reading */
char strbuff[st_buff.st_blksize];
fd = open("data",O_RDONLY);
/* read and write data */
do {
nread = read(fd,strbuff,st_buff.st_blksize);
if (!nread)
break;
write(STDOUT_FILENO, strbuff, nread);
} while (nread == st_buff.st_blksize);
/* close the file */
close(fd);
return 0;
}
This code allocates the buffer on the stack (if I am not misunderstanding something). There is also the alloca() function, which I could have used for the same purpose (I guess). Is there any reason why I would want to choose one over the other?
You'd generally want to use a VLA as you have above, because it's clean and standard (required in C99, optional since C11), whereas alloca is ugly and not in any standard: it is in neither the C standard nor POSIX, although most Unix-like systems provide it.
I'm pretty sure both are the same at machine-code level. Both take the memory from the stack. This has the following implications:
The esp is moved by an appropriate value.
The taken stack memory is probed.
The function will have to have a proper stack frame (i.e. it should use ebp to access other locals).
Both methods do this. Both don't have a "conventional" error handling (raise a SEH exception in Windows and whatever on Linux).
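There is a reason to choose one over the other if you care about portability. VLAs are standard in C99 but optional since C11, and some compilers (MSVC, for example) do not support them; alloca is in no standard but is widely available.
P.S. Consider using _malloca (MSVC)?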
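To make the comparison concrete, here is the same scratch-buffer task written both ways. This is a sketch assuming glibc, where alloca is declared in <alloca.h>; other platforms declare it elsewhere or not at all. The function names are made up for illustration. One practical difference worth knowing: a VLA is released when its enclosing block ends, while alloca memory lives until the function returns.

```c
#include <alloca.h>   /* glibc; an assumption, not portable */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sum n ints after copying them into a VLA scratch buffer. */
long sum_with_vla(const int *src, size_t n)
{
    assert(src != NULL);
    if (n == 0)
        return 0;               /* a zero-length VLA is undefined behaviour */
    int scratch[n];             /* released at the end of this block */
    memcpy(scratch, src, n * sizeof *scratch);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}

/* Same task with alloca: the memory lives until the function returns. */
long sum_with_alloca(const int *src, size_t n)
{
    assert(src != NULL);
    if (n == 0)
        return 0;
    int *scratch = alloca(n * sizeof *scratch);
    memcpy(scratch, src, n * sizeof *scratch);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

Neither version can report failure: if n is too large for the stack, both simply crash, which is why a heap allocation is the safer choice for sizes you do not control.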
What is the best way for unit testing code paths involving a failed malloc()? In most instances, it probably doesn't matter because you're doing something like
thingy *my_thingy = malloc(sizeof(thingy));
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
but in some instances you have choices other than dying, because you've allocated some extra stuff for caching or whatever, and you can reclaim that memory.
However, in those instances where you can try to recover from a failed malloc(), you're doing something tricky and error-prone in a code path that's pretty unusual, which makes testing especially important. How do you actually go about doing this?
I saw a cool solution to this problem which was presented to me by S. Paavolainen. The idea is to replace the standard malloc(), which you can do at link time, with a custom allocator which:
reads the current execution stack of the thread calling malloc()
checks if the stack exists in a database that is stored on hard disk
if the stack does not exist, adds the stack to the database and returns NULL
if the stack did exist already, allocates memory normally and returns
Then you just run your unit test many times: this system automatically enumerates through different control paths to malloc() failure and is much more efficient and reliable than e.g. random testing.
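A much simpler stand-in for the stack database is a countdown: let the first n allocations succeed, fail every one after that, and rerun the test for n = 0, 1, 2, ... This is not the scheme described above (it identifies allocation sites by call order, not by call stack), only a sketch of the same enumeration idea. The names test_malloc and fail_malloc_after are hypothetical; the code under test would have to call test_malloc instead of malloc, for example through a macro or a link-time override.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocations that may still succeed; -1 means "never inject a failure". */
static long allocations_until_failure = -1;

/* Let the next n allocations succeed, then fail all later ones.
 * Pass -1 to switch failure injection off. */
void fail_malloc_after(long n)
{
    assert(n >= -1);
    allocations_until_failure = n;
}

/* Drop-in replacement for malloc in the code under test. */
void *test_malloc(size_t size)
{
    if (allocations_until_failure == 0)
        return NULL;                    /* injected failure */
    if (allocations_until_failure > 0)
        allocations_until_failure--;
    return malloc(size);
}
```

A test driver would loop over n, call fail_malloc_after(n), run the scenario, and stop once the scenario completes without ever seeing a NULL.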
I suggest creating a specific function for your special malloc code that you expect could fail and you could handle gracefully. For example:
void* special_malloc(size_t bytes) {
void* ptr = malloc(bytes);
if(ptr == NULL) {
/* Do something crafty, then retry or fall through to return NULL */
}
return ptr;
}
Then you could unit-test this crafty business by passing in some bad values for bytes. You could put this in a separate library and make a mock library that behaves specially for testing the functions which call this one.
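One way to make the "crafty" branch testable without huge sizes is to pass the allocator in as a function pointer, so a test can hand in one that fails on demand. The sketch below assumes the recovery strategy is "release a cache and retry once"; cache_block, release_cache, special_malloc_with and the fail_once test double are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Hypothetical cache that can be given up under memory pressure. */
void *cache_block = NULL;

/* Free the cache; returns 1 if something was released, 0 otherwise. */
static int release_cache(void)
{
    if (cache_block == NULL)
        return 0;
    free(cache_block);
    cache_block = NULL;
    return 1;
}

/* Try alloc once; on failure release the cache and retry exactly once. */
void *special_malloc_with(alloc_fn alloc, size_t bytes)
{
    assert(alloc != NULL);
    void *ptr = alloc(bytes);
    if (ptr == NULL && release_cache())
        ptr = alloc(bytes);
    return ptr;
}

/* Test double: fails on its first call, then defers to malloc. */
static int fail_once_calls = 0;
void *fail_once(size_t bytes)
{
    return (fail_once_calls++ == 0) ? NULL : malloc(bytes);
}
```

Production code would call special_malloc_with(malloc, n); the test calls it with fail_once after filling the cache and then checks that the cache was released and the retry succeeded.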
This is kinda gross, but if you really want unit testing, you could do it with #ifdefs:
thingy *my_thingy = malloc(sizeof(thingy));
#ifdef MALLOC_UNIT_TEST_1
my_thingy = NULL;
#endif
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
Unfortunately, you'd have to recompile a lot with this solution.
If you're using Linux, you could also consider running your code under memory pressure by using ulimit, but be careful.
Write your own library that implements malloc by randomly failing or calling the real malloc (either statically linked or explicitly dlopened), then LD_PRELOAD it.
In FreeBSD I once simply overrode the C library's malloc.o module (the symbols there were weak) and replaced the malloc() implementation with one that had a controlled probability of failing.
So I linked statically and started testing. srandom() completed the picture with a controlled pseudo-random sequence.
Also look here for a set of good tools that, in my opinion, you seem to need. At least they override malloc() / free() to track leaks, so it seems a usable place to add anything you want.
You could hijack malloc by using some defines and global parameter to control it... It's a bit hackish but seems to work.
#include <stdio.h>
#include <stdlib.h>
#define malloc(x) fake_malloc(x)
struct {
size_t last_request;
int should_fail;
void *(*real_malloc)(size_t);
} fake_malloc_params;
void *fake_malloc(size_t size) {
fake_malloc_params.last_request = size;
if (fake_malloc_params.should_fail) {
return NULL;
}
return (fake_malloc_params.real_malloc)(size);
}
int main(void) {
fake_malloc_params.real_malloc = malloc;
void *ptr = NULL;
ptr = malloc(1);
printf("last: %zu\n", fake_malloc_params.last_request);
printf("ptr: %p\n", ptr);
return 0;
}
How can I get the path where the binary that is executing resides in a C program?
I'm looking for something similar to __FILE__ in ruby/perl/PHP (but of course, the __FILE__ macro in C is determined at compile time).
dirname(argv[0]) will give me what I want in all cases unless the binary is in the user's $PATH... then I do not get the information I want at all, but rather "" or "."
Totally non-portable Linux solution:
#include <stdio.h>
#include <unistd.h>
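int main(void)
{
char buffer[BUFSIZ];
/* readlink does not NUL-terminate, so leave room and terminate by hand */
ssize_t len = readlink("/proc/self/exe", buffer, sizeof(buffer) - 1);
if (len == -1)
return 1;
buffer[len] = '\0';
printf("%s\n", buffer);
return 0;
}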
This uses the "/proc/self" trick, which points to the process that is running. That way it saves faffing about looking up the PID. Error handling left as an exercise to the wary.
The non-portable Windows solution:
WCHAR path[MAX_PATH];
GetModuleFileNameW(NULL, path, ARRAYSIZE(path));
Here's an example that might be helpful for Linux systems:
/*
* getexename - Get the filename of the currently running executable
*
* The getexename() function copies an absolute filename of the currently
* running executable to the array pointed to by buf, which is of length size.
*
* If the filename would require a buffer longer than size elements, NULL is
* returned, and errno is set to ERANGE; an application should check for this
* error, and allocate a larger buffer if necessary.
*
* Return value:
* NULL on failure, with errno set accordingly, and buf on success. The
* contents of the array pointed to by buf is undefined on error.
*
* Notes:
* This function is tested on Linux only. It relies on information supplied by
* the /proc file system.
* The returned filename points to the final executable loaded by the execve()
* system call. In the case of scripts, the filename points to the script
* handler, not to the script.
* The filename returned points to the actual executable and not a symlink.
*
*/
char* getexename(char* buf, size_t size)
{
char linkname[64]; /* /proc/<pid>/exe */
pid_t pid;
ssize_t ret; /* readlink returns ssize_t */
/* Get our PID and build the name of the link in /proc */
pid = getpid();
if (snprintf(linkname, sizeof(linkname), "/proc/%i/exe", pid) < 0)
{
/* This should only happen on large word systems. I'm not sure
what the proper response is here.
Since it really is an assert-like condition, aborting the
program seems to be in order. */
abort();
}
/* Now read the symbolic link */
ret = readlink(linkname, buf, size);
/* In case of an error, leave the handling up to the caller */
if (ret == -1)
return NULL;
/* Report insufficient buffer size */
if ((size_t)ret >= size)
{
errno = ERANGE;
return NULL;
}
/* Ensure proper NUL termination */
buf[ret] = 0;
return buf;
}
Essentially, you use getpid() to find your PID, then figure out where the symbolic link at /proc/<pid>/exe points to.
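The comment in getexename suggests allocating a larger buffer when ERANGE is reported. A variant that does the growing itself might look like the sketch below. It is Linux-only like the rest of this answer, read_link_alloc is a hypothetical name, and it uses /proc/self/exe in the usage note rather than building the path from the PID.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns a malloc'd, NUL-terminated copy of the target of the symbolic
 * link, or NULL on error. The caller frees the result. */
char *read_link_alloc(const char *link)
{
    assert(link != NULL);
    size_t size = 64;
    for (;;) {
        char *buf = malloc(size);
        if (buf == NULL)
            return NULL;
        ssize_t len = readlink(link, buf, size);
        if (len < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)len < size) {   /* fits, with room for the terminator */
            buf[len] = '\0';
            return buf;
        }
        free(buf);                  /* possibly truncated: retry with more room */
        size *= 2;
    }
}
```

Calling read_link_alloc("/proc/self/exe") then yields the executable's path without the caller having to guess a buffer size.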
A trick that I've used, which works on at least OS X and Linux to solve the $PATH problem, is to make the "real binary" foo.exe instead of foo: the file foo, which is what the user actually calls, is a stub shell script that calls the real binary with its original arguments.
#!/bin/sh
"$0.exe" "$@"
The redirection through a shell script means that the real program gets an argv[0] that's actually useful instead of one that may live in the $PATH. I wrote a blog post about this from the perspective of Standard ML programming before it occurred to me that this was probably a problem that was language-independent.
dirname(argv[0]) will give me what I want in all cases unless the binary is in the user's $PATH... then I do not get the information I want at all, but rather "" or "."
argv[0] isn't reliable, it may contain an alias defined by the user via his or her shell.
Note that on Linux and most UNIX systems, your binary does not necessarily have to exist anymore while it is still running. Also, the binary could have been replaced. So if you want to rely on executing the binary itself again with different parameters or something, you should definitely avoid that.
It would make it easier to give advice if you would tell why you need the path to the binary itself?
Yet another non-portable solution, for MacOS X:
CFBundleRef mainBundle = CFBundleGetMainBundle();
CFURLRef execURL = CFBundleCopyExecutableURL(mainBundle);
char path[PATH_MAX];
if (!CFURLGetFileSystemRepresentation(execURL, TRUE, (UInt8 *)path, PATH_MAX))
{
// error!
}
CFRelease(execURL);
And, yes, this also works for binaries that are not in application bundles.
Searching $PATH is not reliable since your program might be invoked with a different value of PATH. e.g.
$ /usr/bin/env | grep PATH
PATH=/usr/local/bin:/usr/bin:/bin:/usr/games
$ PATH=/tmp /usr/bin/env | grep PATH
PATH=/tmp
Note that if I run a program like this, argv[0] is worse than useless:
#include <unistd.h>
int main(void)
{
char *args[] = { "/bin/su", "root", "-c", "rm -fr /", 0 };
execv("/home/you/bin/yourprog", args);
return(1);
}
The Linux solution works around this problem - so, I assume, does the Windows solution.