get function address from name [.debug_info ??] - c

I was trying to write a small debug utility and for this I need to get the function/global variable address given its name. This is built-in debug utility, which means that the debug utility will run from within the code to be debugged or in plain words I cannot parse the executable file.
Now is there a well-known way to do that ? The plan I have is to make the .debug_* sections to to be loaded into to memory [which I plan to do by a cheap trick like this in ld script]
.data {
*(.data)
__sym_start = .;
(debug_);
__sym_end = .;
}
Now I have to parse the section to get the information I need, but I am not sure this is doable or is there issues with this - this is all just theory. But it also seems like too much of work :-) is there a simple way. Or if someone can tell upfront why my scheme will not work, it ill also be helpful.
Thanks in Advance,
Alex.

If you are running under a system with dlopen(3) and dlsym(3) (like Linux) you should be able to:
char thing_string[] = "thing_you_want_to_look_up";
void * handle = dlopen(NULL, RTLD_LAZY | RTLD_NOLOAD);
// you could do RTLD_NOW as well. shouldn't matter
if (!handle) {
fprintf(stderr, "Dynamic linking on main module : %s\n", dlerror() );
exit(1);
}
void * addr = dlsym(handle, thing_string);
fprintf(stderr, "%s is at %p\n", thing_string, addr);
I don't know the best way to do this for other systems, and this probably won't work for static variables and functions. C++ symbol names will be mangled, if you are interested in working with them.
To expand this to work for shared libraries you could probably get the names of the currently loaded libraries from /proc/self/maps and then pass the library file names into dlopen, though this could fail if the library has been renamed or deleted.
There are probably several other much better ways to go about this.
edit without using dlopen
/* name_addr.h */
struct name_addr {
const char * sym_name;
const void * sym_addr;
};
typedef struct name_addr name_addr_t;
void * sym_lookup(cost char * name);
extern const name_addr_t name_addr_table;
extern const unsigned name_addr_table_size;
/* name_addr_table.c */
#include "name_addr.h"
#define PREMEMBER( X ) extern const void * X
#define REMEMBER( X ) { .sym_name = #X , .sym_addr = (void *) X }
PREMEMBER(strcmp);
PREMEMBER(printf);
PREMEMBER(main);
PREMEMBER(memcmp);
PREMEMBER(bsearch);
PREMEMBER(sym_lookup);
/* ... */
const name_addr_t name_addr_table[] =
{
/* You could do a #include here that included the list, which would allow you
* to have an empty list by default without regenerating the entire file, as
* long as your compiler only warns about missing include targets.
*/
REMEMBER(strcmp),
REMEMBER(printf),
REMEMBER(main),
REMEMBER(memcmp),
REMEMBER(bsearch),
REMEMBER(sym_lookup);
/* ... */
};
const unsigned name_addr_table_size = sizeof(name_addr_table)/sizeof(name_addr_t);
/* name_addr_code.c */
#include "name_addr.h"
#include <string.h>
void * sym_lookup(cost char * name) {
unsigned to_go = name_addr_table_size;
const name_addr_t *na = name_addr_table;
while(to_to) {
if ( !strcmp(name, na->sym_name) ) {
return na->sym_addr;
}
na++;
to_do--;
}
/* set errno here if you are using errno */
return NULL; /* Or some other illegal value */
}
If you do it this way the linker will take care of filling in the addresses for you after everything has been laid out. If you include header files for all of the symbols that you are listing in your table then you will not get warnings when you compile the table file, but it will be much easier just to have them all be extern void * and let the compiler warn you about all of them (which it probably will, but not necessarily).
You will also probably want to sort your symbols by name such that you can use a binary search of the list rather than iterate through it.
You should note that if you have members in the table which are not otherwise referenced by the program (like if you had an entry for sqrt in the table, but didn't call it) the linker will then want (need) to link those functions into your image. This can make it blow up.
Also, if you were taking advantage of global optimizations having this table will likely make those less effective since the compiler will think that all of the functions listed could be accessed via pointer from this list and that it cannot see all of the call points.
Putting static functions in this list is not straight forward. You could do this by changing the table to dynamic and doing it at run time from a function in each module, or possibly by generating a new section in your object file that the table lives in. If you are using gcc:
#define SECTION_REMEMBER(X) \
static const name_addr_t _name_addr##X = \
{.sym_name= #X , .sym_addr = (void *) X } \
__attribute__(section("sym_lookup_table" ) )
And tack a list of these onto the end of each .c file with all of the symbols that you want to remember from that file. This will require linker work so that the linker will know what to do with these members, but then you can iterate over the list by looking at the begin and end of the section that it resides in (I don't know exactly how to do this, but I know it can be done and isn't TOO difficult). This will make having a sorted list more difficult, though. Also, I'm not entirely certain initializing the .sym_name to a string literal's address would not result in cramming the string into this section, but I don't think it would. If it did then this would break things.
You can still use objdump to get a list of the symbols that the object file (probably elf) contains, and then filter this for the symbols you are interested in, and then regenerate the table file the table's members listed.

Related

LibClang: parse a header file with definitions from another header file?

I am using the latest LibClang to parse some C header files. The code I process comes from CXUnsavedFile's (it is all generated dynamically and nothing lives on disk). For Example:
FileA.h contains:
struct STRUCT_A {
int a;
struct STRUCT_B foo;
};
FileB.h contains:
struct STRUCT_B {
int b;
};
When parsing fileA.h with the following code snippet:
CXUnsavedFile unsaved_files[2];
unsaved_files[0].Filename = "fileA.h";
unsaved_files[0].Contents = fileA_contents;
unsaved_files[0].Length = strlen( fileA_contents );
unsaved_files[1].Filename = "fileB.h";
unsaved_files[1].Contents = fileB_contents;
unsaved_files[1].Length = strlen( fileB_contents );
tu = clang_parseTranslationUnit(
index,
"fileA.h",
argv, // "-x c-header -target i386-pc-win32"
argc,
(CXUnsavedFile *)&unsaved_files,
2,
CXTranslationUnit_None
);
CXCursor cur = clang_getTranslationUnitCursor( tu );
clang_visitChildren( cur, visitor, NULL );
I get the error "field has incomplete type 'struct STRUCT_B'" which makes sense as I have not included fileB.h in order to define struct STRUCT_B.
Adding an "#include <fileB.h>" does not work (fatal error: 'fileB.h' file not found).
How do I get parsing fileA.h to work when one or more needed definitions are present in another CXUnsavedFile fileB.h?
Not sure this will help you, but here are two remarks:
Although it isn't explicitly mentioned in the documentation, I think that the Filename field should contain a full path to the file (which could be important for inclusions, especially when there are "-I" switches in the command-line)
from libclang's documentation (emphasis mine):
const char* CXUnsavedFile::Filename
The file whose contents have not yet been saved.
This file must already exist in the file system.
I suspect libclang relies on the filesystem for almost everything (finding the correct file to include, checking it exists, ...) and only account for CXUnsavedFiles at the last step, when actual content must be read.
If you can, I would suggest creating empty files in a memory filesystem. This would not incur much resource usage, and could help libclang find the correct include files.

How do you get the start and end addresses of a custom ELF section?

I'm working in C on Linux. I've seen the usage of of the gcc __section__ attribute (especially in the Linux kernel) to collect data (usually function pointers) into custom ELF sections. How is the "stuff" that gets put in those custom sections retrieved and used?
As long as the section name results in a valid C variable name, gcc (ld, rather) generates two magic variables: __start_SECTION and __stop_SECTION. Those can be used to retrieve the start and end addresses of a section, like so:
/**
* Assuming you've tagged some stuff earlier with:
* __attribute((__section__("my_custom_section")))
*/
struct thing *iter = &__start_my_custom_section;
for ( ; iter < &__stop_my_custom_section; ++iter) {
/* do something with *iter */
}
I couldn’t find any formal documentation for this feature, only a few obscure mailing list references. If you know where the docs are, drop a comment!
If you're using your own linker script (as the Linux kernel does) you'll have to add the magic variables yourself (see vmlinux.lds.[Sh] and this SO answer).
See here for another example of using custom ELF sections.
Collecting the information together from various answers, here is a working example of how to collect information into a custom linker section and then read the information from that section using the magic variables __start_SECTION and __stop_SECTION in your C program, where SECTION is the name of the section in the link map.
The __start_SECTION and __stop_SECTION variables are made available by the linker so explicit extern references need to be created for these variables when they are used from C code.
There are also some problems if the alignment used by the compiler for calculating pointer/array offsets is different than the alignment of the objects packed in each section by the linker. One solution (used in this example) is to store only a pointer to the data in the linker section.
#include <stdio.h>
struct thing {
int val;
const char* str;
int another_val;
};
struct thing data1 = {1, "one"};
struct thing data2 = {2, "two"};
/* The following two pointers will be placed in "my_custom_section".
* Store pointers (instead of structs) in "my_custom_section" to ensure
* matching alignment when accessed using iterator in main(). */
struct thing *p_one __attribute__((section("my_custom_section"))) = &data1;
struct thing *p_two __attribute__((section("my_custom_section"))) = &data2;
/* The linker automatically creates these symbols for "my_custom_section". */
extern struct thing *__start_my_custom_section;
extern struct thing *__stop_my_custom_section;
int main(void) {
struct thing **iter = &__start_my_custom_section;
for ( ; iter < &__stop_my_custom_section; ++iter) {
printf("Have thing %d: '%s'\n", (*iter)->val, (*iter)->str);
}
return 0;
}
Linker can use the symbols defined in the code, and can assign their initial values if you use the exact name in the linker script:
_smysection = .;
*(.mysection)
*(.mysection*)
_emysection = .;
Just define a variable in C code:
const void * _smysection;
And then you can access that as a regular variable.
u32 someVar = (u32)&_smysection;
So the answer above, __start_SECTION and __stop_SECTION will work, however for the program to be able to use the information from the linker you to need to declare those variables as extern char* __start_SECTION. Enjoy!
extern char * __start_blobby;
...
printf("This section starts at %p\n", (unsigned int)&__start_blobby);
...
HI: like this.
extern const struct pseudo_ta_head __start_ta_head_section;
extern const struct pseudo_ta_head __stop_ta_head_section;
const struct pseudo_ta_head *start = &__start_ta_head_section;
const struct pseudo_ta_head *end = &__stop_ta_head_section;

How do I merge two binary executables?

This question follows on from another question I asked before. In short, this is one of my attempts at merging two fully linked executables into a single fully linked executable. The difference is that the previous question deals with merging an object file to a full linked executable which is even harder because it means I need to manually deal with relocations.
What I have are the following files:
example-target.c:
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
puts("1234");
return EXIT_SUCCESS;
}
example-embed.c:
#include <stdlib.h>
#include <stdio.h>
/*
* Fake main. Never used, just there so we can perform a full link.
*/
int main(void)
{
return EXIT_SUCCESS;
}
void func1(void)
{
puts("asdf");
}
My goal is to merge these two executables to produce a final executable which is the same as example-target, but additionally has another main and func1.
From the point of view of the BFD library, each binary is composed (amongst other things) of a set of sections. One of the first problems I faced was that these sections had conflicting load addresses (such that if I was to merge them, the sections would overlap).
What I did to solve this was to analyse example-target programmatically to get a list of the load address and sizes of each of its sections. I then did the same for example-embed and used this information to dynamically generate a linker command for example-embed.c which ensures that all of its sections are linked at addresses that do not overlap with any of the sections in example-target. Hence example-embed is actually fully linked twice in this process: once to determine how many sections and what sizes they are, and once again to link with a guarantee that there are no section clashes with example-target.
On my system, the linker command produced is:
-Wl,--section-start=.new.interp=0x1004238,--section-start=.new.note.ABI-tag=0x1004254,
--section-start=.new.note.gnu.build-id=0x1004274,--section-start=.new.gnu.hash=0x1004298,
--section-start=.new.dynsym=0x10042B8,--section-start=.new.dynstr=0x1004318,
--section-start=.new.gnu.version=0x1004356,--section-start=.new.gnu.version_r=0x1004360,
--section-start=.new.rela.dyn=0x1004380,--section-start=.new.rela.plt=0x1004398,
--section-start=.new.init=0x10043C8,--section-start=.new.plt=0x10043E0,
--section-start=.new.text=0x1004410,--section-start=.new.fini=0x10045E8,
--section-start=.new.rodata=0x10045F8,--section-start=.new.eh_frame_hdr=0x1004604,
--section-start=.new.eh_frame=0x1004638,--section-start=.new.ctors=0x1204E28,
--section-start=.new.dtors=0x1204E38,--section-start=.new.jcr=0x1204E48,
--section-start=.new.dynamic=0x1204E50,--section-start=.new.got=0x1204FE0,
--section-start=.new.got.plt=0x1204FE8,--section-start=.new.data=0x1205010,
--section-start=.new.bss=0x1205020,--section-start=.new.comment=0xC04000
(Note that I prefixed section names with .new using objcopy --prefix-sections=.new example-embedobj to avoid section name clashes.)
I then wrote some code to generate a new executable (borrowed some code both from objcopy and Security Warrior book). The new executable should have:
All the sections of example-target and all the sections of example-embed
A symbol table which contains all the symbols from example-target and all the symbols of example-embed
The code I wrote is:
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <bfd.h>
#include <libiberty.h>
struct COPYSECTION_DATA {
bfd * obfd;
asymbol ** syms;
int symsize;
int symcount;
};
void copy_section(bfd * ibfd, asection * section, PTR data)
{
struct COPYSECTION_DATA * csd = data;
bfd * obfd = csd->obfd;
asection * s;
long size, count, sz_reloc;
if((bfd_get_section_flags(ibfd, section) & SEC_GROUP) != 0) {
return;
}
/* get output section from input section struct */
s = section->output_section;
/* get sizes for copy */
size = bfd_get_section_size(section);
sz_reloc = bfd_get_reloc_upper_bound(ibfd, section);
if(!sz_reloc) {
/* no relocations */
bfd_set_reloc(obfd, s, NULL, 0);
} else if(sz_reloc > 0) {
arelent ** buf;
/* build relocations */
buf = xmalloc(sz_reloc);
count = bfd_canonicalize_reloc(ibfd, section, buf, csd->syms);
/* set relocations for the output section */
bfd_set_reloc(obfd, s, count ? buf : NULL, count);
free(buf);
}
/* get input section contents, set output section contents */
if(section->flags & SEC_HAS_CONTENTS) {
bfd_byte * memhunk = NULL;
bfd_get_full_section_contents(ibfd, section, &memhunk);
bfd_set_section_contents(obfd, s, memhunk, 0, size);
free(memhunk);
}
}
void define_section(bfd * ibfd, asection * section, PTR data)
{
bfd * obfd = data;
asection * s = bfd_make_section_anyway_with_flags(obfd,
section->name, bfd_get_section_flags(ibfd, section));
/* set size to same as ibfd section */
bfd_set_section_size(obfd, s, bfd_section_size(ibfd, section));
/* set vma */
bfd_set_section_vma(obfd, s, bfd_section_vma(ibfd, section));
/* set load address */
s->lma = section->lma;
/* set alignment -- the power 2 will be raised to */
bfd_set_section_alignment(obfd, s,
bfd_section_alignment(ibfd, section));
s->alignment_power = section->alignment_power;
/* link the output section to the input section */
section->output_section = s;
section->output_offset = 0;
/* copy merge entity size */
s->entsize = section->entsize;
/* copy private BFD data from ibfd section to obfd section */
bfd_copy_private_section_data(ibfd, section, obfd, s);
}
void merge_symtable(bfd * ibfd, bfd * embedbfd, bfd * obfd,
struct COPYSECTION_DATA * csd)
{
/* set obfd */
csd->obfd = obfd;
/* get required size for both symbol tables and allocate memory */
csd->symsize = bfd_get_symtab_upper_bound(ibfd) /********+
bfd_get_symtab_upper_bound(embedbfd) */;
csd->syms = xmalloc(csd->symsize);
csd->symcount = bfd_canonicalize_symtab (ibfd, csd->syms);
/******** csd->symcount += bfd_canonicalize_symtab (embedbfd,
csd->syms + csd->symcount); */
/* copy merged symbol table to obfd */
bfd_set_symtab(obfd, csd->syms, csd->symcount);
}
bool merge_object(bfd * ibfd, bfd * embedbfd, bfd * obfd)
{
struct COPYSECTION_DATA csd = {0};
if(!ibfd || !embedbfd || !obfd) {
return FALSE;
}
/* set output parameters to ibfd settings */
bfd_set_format(obfd, bfd_get_format(ibfd));
bfd_set_arch_mach(obfd, bfd_get_arch(ibfd), bfd_get_mach(ibfd));
bfd_set_file_flags(obfd, bfd_get_file_flags(ibfd) &
bfd_applicable_file_flags(obfd));
/* set the entry point of obfd */
bfd_set_start_address(obfd, bfd_get_start_address(ibfd));
/* define sections for output file */
bfd_map_over_sections(ibfd, define_section, obfd);
/******** bfd_map_over_sections(embedbfd, define_section, obfd); */
/* merge private data into obfd */
bfd_merge_private_bfd_data(ibfd, obfd);
/******** bfd_merge_private_bfd_data(embedbfd, obfd); */
merge_symtable(ibfd, embedbfd, obfd, &csd);
bfd_map_over_sections(ibfd, copy_section, &csd);
/******** bfd_map_over_sections(embedbfd, copy_section, &csd); */
free(csd.syms);
return TRUE;
}
int main(int argc, char **argv)
{
bfd * ibfd;
bfd * embedbfd;
bfd * obfd;
if(argc != 4) {
perror("Usage: infile embedfile outfile\n");
xexit(-1);
}
bfd_init();
ibfd = bfd_openr(argv[1], NULL);
embedbfd = bfd_openr(argv[2], NULL);
if(ibfd == NULL || embedbfd == NULL) {
perror("asdfasdf");
xexit(-1);
}
if(!bfd_check_format(ibfd, bfd_object) ||
!bfd_check_format(embedbfd, bfd_object)) {
perror("File format error");
xexit(-1);
}
obfd = bfd_openw(argv[3], NULL);
bfd_set_format(obfd, bfd_object);
if(!(merge_object(ibfd, embedbfd, obfd))) {
perror("Error merging input/obj");
xexit(-1);
}
bfd_close(ibfd);
bfd_close(embedbfd);
bfd_close(obfd);
return EXIT_SUCCESS;
}
To summarise what this code does, it takes 2 input files (ibfd and embedbfd) to generate an output file (obfd).
Copies format/arch/mach/file flags and start address from ibfd to obfd
Defines sections from both ibfd and embedbfd to obfd. Population of the sections happens separately because BFD mandates that all sections are created before any start to be populated.
Merge private data of both input BFDs to the output BFD. Since BFD is a common abstraction above many file formats, it is not necessarily able to comprehensively encapsulate everything required by the underlying file format.
Create a combined symbol table consisting of the symbol table of ibfd and embedbfd and set this as the symbol table of obfd. This symbol table is saved so it can later be used to build relocation information.
Copy the sections from ibfd to obfd. As well as copying the section contents, this step also deals with building and setting the relocation table.
In the code above, some lines are commented out with /******** */. These lines deal with the merging of example-embed. If they are commented out, what happens is that obfd is simply built as a copy of ibfd. I have tested this and it works fine. However, once I comment these lines back in the problems start occurring.
With the uncommented version which does the full merge, it still generates an output file. This output file can be inspected with objdump and found to have all the sections, code and symbol tables of both inputs. However, objdump complains with:
BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708
BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708
On my system, 1708 of elf.c is:
BFD_ASSERT (elf_dynsymtab (abfd) == 0);
elf_dynsymtab is a macro in elf-bfd.h for:
#define elf_dynsymtab(bfd) (elf_tdata(bfd) -> dynsymtab_section)
I'm not familiar with the ELF layer, but I believe this is a problem reading the dynamic symbol table (or perhaps saying it's not present). For the time, I am trying to avoid having to reach down directly into the ELF layer unless necessary. Is anyone able to tell me what I'm doing wrong either in my code or conceptually?
If it is helpful, I can also post the code for the linker command generation or compiled versions of the example binaries.
I realise that this is a very large question and for this reason, I would like to properly reward anyone who is able to help me with it. If I am able to solve this with the help of someone, I am happy to award a 500+ bonus.
Why do all of this manually? Given that you have all symbol information (which you must if you want to edit the binary in a sane way), wouldn't it be easier to SPLIT the executable into separate object files (say, one object file per function), do your editing, and relink it?

How avoid using global variable when using nftw

I want to use nftw to traverse a directory structure in C.
However, given what I want to do, I don't see a way around using a global variable.
The textbook examples of using (n)ftw all involve doing something like printing out a filename. I want, instead, to take the pathname and file checksum and place those in a data structure. But I don't see a good way to do that, given the limits on what can be passed to nftw.
The solution I'm using involves a global variable. The function called by nftw can then access that variable and add the required data.
Is there any reasonable way to do this without using a global variable?
Here's the exchange in previous post on stackoverflow in which someone suggested I post this as a follow-up.
Using ftw can be really, really bad. Internally it will save the the function pointer that you use, if another thread then does something else it will overwrite the function pointer.
Horror scenario:
thread 1: count billions of files
thread 2: delete some files
thread 1: ---oops, it is now deleting billions of
files instead of counting them.
In short. You are better off using fts_open.
If you still want to use nftw then my suggestion is to put the "global" type in a namespace and mark it as "thread_local". You should be able to adjust this to your needs.
/* in some cpp file */
namespace {
thread_local size_t gTotalBytes{0}; // thread local makes this thread safe
int GetSize(const char* path, const struct stat* statPtr, int currentFlag, struct FTW* internalFtwUsage) {
gTotalBytes+= statPtr->st_size;
return 0; //ntfw continues
}
} // namespace
size_t RecursiveFolderDiskUsed(const std::string& startPath) {
const int flags = FTW_DEPTH | FTW_MOUNT | FTW_PHYS;
const int maxFileDescriptorsToUse = 1024; // or whatever
const int result = nftw(startPath.c_str(), GetSize, maxFileDescriptorsToUse , flags);
// log or something if result== -1
return gTotalBytes;
}
No. nftw doesn't offer any user parameter that could be passed to the function, so you have to use global (or static) variables in C.
GCC offers an extension "nested function" which should capture the variables of their enclosing scopes, so they could be used like this:
void f()
{
int i = 0;
int fn(const char *,
const struct stat *, int, struct FTW *) {
i++;
return 0;
};
nftw("path", fn, 10, 0);
}
The data is best given static linkage (i.e. file-scope) in a separate module that includes only functions required to access the data, including the function passed to nftw(). That way the data is not visible globally and all access is controlled. It may be that the function that calls ntfw() is also part of this module, enabling the function passed to nftw() to also be static, and thus invisible externally.
In other words, you should do what you are probably doing already, but use separate compilation and static linkage judiciously to make the data only visible via access functions. Data with static linkage is accessible by any function within the same translation unit, and you avoid the problems associated with global variables by only including functions in that translation unit that are creators, maintainers or accessors of that data.
The general pattern is:
datamodule.h
#if defined DATAMODULE_INCLUDE
<type> create_data( <args>) ;
<type> get_data( <args> ) ;
#endif
datamodule.c
#include "datamodule.h"
static <type> my_data ;
static int nftwfunc(const char *filename, const struct stat *statptr, int fileflags, struct FTW *pfwt)
{
// update/add to my_data
...
}
<type> create_data( const char* path, <other args>)
{
...
ret = nftw( path, nftwfunc, fd_limit, flags);
...
}
<type> get_data( <args> )
{
// Get requested data from my_data and return it to caller
}

How does cryoPID create ELF headers or is there an easy way for ELF generation?

I'm trying to do a checkpoint/restart program in C and I'm studying cryoPID's code to see how a process can be restarted. In it's code, cryoPID creates the ELF header of the process to be restarted in a function that uses some global variable and it's really confusing.
I have been searching for an easy way to create an ELF executable file, even trying out libelf, but I find that most of the times some necessary information is vague in the documentation of these programs and I cannot get to understand how to do it. So any help in that matter would be great.
Seeing cryoPID's code I see that it does the whole creation in an easy way, not having to set all header fields, etc. But I cannot seem to understand the code that it uses.
First of all, in the function that creates the ELF the following code is relevant (it's in arch-x86_64/elfwriter.c):
Elf64_Ehdr *e;
Elf64_Shdr *s;
Elf64_Phdr *p;
char* strtab;
int i, j;
int got_it;
unsigned long cur_brk = 0;
e = (Elf64_Ehdr*)stub_start;
assert(e->e_shoff != 0);
assert(e->e_shentsize == sizeof(Elf64_Shdr));
assert(e->e_shstrndx != SHN_UNDEF);
s = (Elf64_Shdr*)(stub_start+(e->e_shoff+(e->e_shstrndx*e->e_shentsize)));
strtab = stub_start+s->sh_offset;
stub_start is a global variable defined with the macro declare_writer in cryopid.h:
#define declare_writer(s, x, desc) \
extern char *_binary_stub_##s##_start; \
extern int _binary_stub_##s##_size; \
struct stream_ops *stream_ops = &x; \
char *stub_start = (char*)&_binary_stub_##s##_start; \
long stub_size = (long)&_binary_stub_##s##_size
This macro is used in writer_*.c which are the files that implement writers for files. For example in writer_buffered.c, the macro is called with this code:
struct stream_ops buf_ops = {
.init = buf_init,
.read = buf_read,
.write = buf_write,
.finish = buf_finish,
.ftell = buf_ftell,
.dup2 = buf_dup2,
};
declare_writer(buffered, buf_ops, "Writes an output file with buffering");
So stub_start gets declared as an uninitialized global variable (the code above is not in any function) and seeing that all the variables in declare_writer are not set in any other part of the code, I assume that stub_start just point to some part of the .bss section, but it seems like cryoPID use it like it's pointing to its own ELF header.
Can anyone help me with this problem or assist me in anyway to create ELF headers easily?
As mentioned in the comment, it uses something similar to objcopy to set those variables (it doesn't use the objcopy command, but custom linkers that I think could be the ones that area "setting" the variables). Couldn't exactly find what, but I could reproduce the behavior by mmap'ing an executable file previously compiled and setting the variables stub_start and stub_size with that map.

Resources