So I wanted to configure different structures for which I have a header file such as:
header.h
typedef struct
{
uint32_t hola;
uint32_t adios;
}Signal_t;
typedef struct
{
bool goodbye;
uint32_t hello;
} FrameTx_t;
In order to do so, at some point, within my source code, I will need to detect which kind of structure is to be configured by the received text.
That is, if I have a JSON file along these lines:
JSON_File.txt
{"Signal_t" :
{
"hola" : 1024,
"adios" : 555555
}
}
I need to recognize that the to-be-configured structure is of type Signal_t.
For now I have written some simple code which, after parsing the text, gives me the name of the structure as a string. I then created the following function to determine which structure is to be configured:
Code.c
int structure_Select(char* structName, int sizeOfStructName) {
char Signal_tName[] = "Signal_t";
char FrameTx_tName[] = "FrameTx_t";
int idx=0;
if ((sizeof(Signal_tName) - 1) == sizeOfStructName) {
for (idx = 0;idx < sizeOfStructName;idx++) {
if (Signal_tName[idx] != structName[idx]) {
break;
}
}
if (idx == sizeOfStructName) {
printf("%s", Signal_tName);
return 0;
}
}
if ((sizeof(FrameTx_tName) - 1) == sizeOfStructName) {
for (idx = 0;idx < sizeOfStructName;idx++) {
if (FrameTx_tName[idx] != structName[idx]) {
break;
}
}
if (idx == sizeOfStructName) {
printf("%s", FrameTx_tName);
return 1;
}
}
return -1; /* no known structure name matched */
}
I can assure you it works; it just does not go as "automatic" as I would like it to be...
I would like for the program to be able to read the header file and automatically recognize: "Oh okay, so I'm dealing with a Signal_t data type, I'm gonna then read two different data from the stream and assign data1 to Signal_t.hola and data2 to Signal_t.adios"
The assignment itself is clearly not a problem; the only problem is determining, from an existing structure definition in a file, its name or some other way to differentiate between structs.
So far I've thought about the following possibilities beyond what I already have, but I keep going in circles:
o Create a mini "C-structure parser" function
o Is there ANY way to get the name of a structure tag within C that I don't know of?
I'm open to suggestions, whether they are just ideas that I could work on and am not seeing, or whether any of you has dealt with a similar issue in the past and, instead of using a list of char arrays, actually reads the header file for the structure names. Thanks in advance!
TL;DR: Is there any way to read structures' tags from a header file as a string?
Edit: This is what I mean with structure tag/type alias:
//structure to get a rectangle
typedef struct {
int left;
int bottom;
int right;
int top;
} rect_t; //this rect_t is what I mean by structure tag...
//I checked the name and it should be type alias***
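For the name lookup done by structure_Select above, one possible sketch (not a way to read header.h at run time) is to keep a single list of type names and let the preprocessor's stringizing operator # turn each one into a string, so the strings cannot drift away from the type names. STRUCT_LIST, AS_ENUM, AS_STRING, struct_names and struct_select are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical single list of configurable types; it is expanded twice below. */
#define STRUCT_LIST(X) \
    X(Signal_t)        \
    X(FrameTx_t)

/* First expansion: one enum constant per type (ID_Signal_t, ID_FrameTx_t). */
#define AS_ENUM(name) ID_##name,
enum struct_id { STRUCT_LIST(AS_ENUM) ID_COUNT };

/* Second expansion: the same names as string literals, via the # operator. */
#define AS_STRING(name) #name,
static const char *const struct_names[] = { STRUCT_LIST(AS_STRING) };

/* Returns the index of the matching type name, or -1 if none matches.
 * name does not have to be NUL-terminated; len is its length in characters. */
int struct_select(const char *name, size_t len)
{
    for (int i = 0; i < ID_COUNT; i++) {
        if (strlen(struct_names[i]) == len &&
            memcmp(struct_names[i], name, len) == 0) {
            return i;
        }
    }
    return -1;
}
```

Adding a new configurable type then means adding one X(...) line to the list; the names are still fixed at compile time, since C keeps no type names around at run time.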
static inline Elf32_Shdr *elf_sheader(Elf32_Ehdr *hdr) {
return (Elf32_Shdr *)((int)hdr + hdr->e_shoff);
}
static inline Elf32_Shdr *elf_section(Elf32_Ehdr *hdr, int idx) {
return &elf_sheader(hdr)[idx];
}
Okay, the first function here returns a pointer to the first ELF section header by using e_shoff, because that is the file offset of the section header table. The second function then reaches any other section header (if there are any) simply by array indexing.
static inline char *elf_str_table(Elf32_Ehdr *hdr) {
if(hdr->e_shstrndx == SHN_UNDEF) return NULL;
return (char *)hdr + elf_section(hdr, hdr->e_shstrndx)->sh_offset;
}
static inline char *elf_lookup_string(Elf32_Ehdr *hdr, int offset) {
char *strtab = elf_str_table(hdr);
if(strtab == NULL) return NULL;
return strtab + offset;
}
I am having trouble with the two functions above, which are used for accessing section names. hdr->e_shstrndx is the index of the string table, so in elf_str_table we first check it against SHN_UNDEF. What I don't understand is the return statement: if hdr->e_shstrndx is an index into a string table, how does passing it to elf_section give another section header (we use the result to access sh_offset)? In other words, e_shstrndx is supposed to be a string table index, yet elf_section uses it to return a pointer to a struct Elf32_Shdr.
Reference : http://wiki.osdev.org/ELF_Tutorial#Accessing_Section_Headers
You said yourself that elf_section returns a section header based on an index.
e_shstrndx is the index of the section header that contains the offset of the section header string table.
So, you use e_shstrndx as a parameter for elf_section to get that section header:
Elf32_Shdr* shstr = elf_section(hdr, hdr->e_shstrndx);
Then get the offset from that section header:
int strtab_offset = shstr->sh_offset;
And use it to get the actual string table:
char* strtab = (char*) hdr + strtab_offset;
From this string table, you can then get the names of sections based on their offset:
char* str = strtab + offset;
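Put together, the steps above might look like the following sketch. It assumes a Linux-style <elf.h> and a complete 32-bit ELF image in memory starting at hdr; section_header and section_name are hypothetical helper names, not part of the tutorial.

```c
#include <elf.h>
#include <stddef.h>

/* Section header number idx, located through the file offset e_shoff. */
static const Elf32_Shdr *section_header(const Elf32_Ehdr *hdr, size_t idx)
{
    const Elf32_Shdr *table =
        (const Elf32_Shdr *)((const char *)hdr + hdr->e_shoff);
    return &table[idx];
}

/* Name of section idx, or NULL if the file has no section name table. */
const char *section_name(const Elf32_Ehdr *hdr, size_t idx)
{
    if (hdr->e_shstrndx == SHN_UNDEF)
        return NULL;

    /* Step 1: e_shstrndx selects the header of the string table section. */
    const Elf32_Shdr *shstr = section_header(hdr, hdr->e_shstrndx);

    /* Step 2: that header's sh_offset locates the table's bytes in the file. */
    const char *strtab = (const char *)hdr + shstr->sh_offset;

    /* Step 3: each section's sh_name is a byte offset into that table. */
    return strtab + section_header(hdr, idx)->sh_name;
}
```

The point of the sketch is that e_shstrndx never indexes the string table itself; it indexes the section header table, and only sh_name values index into the string bytes.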
I am writing code to find a symbol within an ELF file.
In my code I open an ELF file, map all the segments to memory and store all the information related to various section and tables in a structure like this.
typedef struct Struct_Obj_Entry{
//Name of the file
const char *filepath;
//File pointer
void* ELF_fp;
//Metadata of ELF header and Program header tables
Elf32_Ehdr* Ehdr;
Elf32_Phdr* Phdr_array;
//base address of mapped region
uint32 mapbase;
//DYNAMIC Segment
uint32 *dynamic;
//DT_SYMTAB
Elf32_Sym *symtab; //Ptr to DT_SYMTAB
//DT_STRTAB
char *strtab;
//DT_HASH
uint32 *hashtab;//Ptr to DT_HASH
//Hash table variables
int nbuckets, nchains;
uint32 *buckets,
*chains;
} Obj_Entry;
This portion works perfectly fine and all the struct elements are correctly populated holding valid addresses to the regions of mapped ELF file.
Here is how I search for a symbol name,
void *return_symbol_vaddr(Obj_Entry *obj, const char *name){
unsigned long hash_value;
uint32 y=0,z=0;
/*following part is DLSYM code to locate a symbol in a given file*/
//Lets query for a symbol name
hash_value = elf_Hash(name);
printf("hash value =%lu\n",hash_value);
//See if correct symbol entry found in bucket list
//If it is break out
y = (obj->buckets[hash_value % obj->nbuckets]);
if((!strcmp(name, obj->strtab + obj->symtab[z].st_name))) {
return (void*)(obj->mapbase + (obj->symtab[z]).st_value);
}
//If not there is a collision
else{
while(obj->chains[y] !=0){
z = obj->chains[y];
if((!strcmp(name, obj->strtab + obj->symtab[z].st_name))) {
//return (void*)(obj->symtab[z].st_value);
return (void*)(obj->mapbase + obj->symtab[z].st_value);
}
else{
//If the symbol is not found in chains
//There is double collision
//In that case chain[y] gives the next symbol table entry with the same hash value
y = z;
}
}
}
}
The string hash function is the standard one from the ABI specification:
//Get hash value for a symbol name
unsigned long
elf_Hash(const unsigned char *name)
{
unsigned long h = 0, g;
while (*name)
{
h = (h << 4) + *name++;
if (g = h & 0xf0000000)
h ^= g >> 24;
h &= ~g;
}
return h;
}
Now the problem appears when I compile a position-independent .so file and try to look up symbols. I can find some of the symbols, but for the rest the function returns NULL.
Example ELF file
typedef struct _data{
int x;
int y;
}data;
int add(void){
return 1;
}
int sub(void){
return 4;
}
data Data ={3, 2};
When I compile this file to an ELF I can find the add and Data symbols, but surprisingly I can't find 'sub'. When I run readelf on the .so file, I can see that sub appears in the DT_SYMTAB list of dynamic symbols.
Can anybody pinpoint the bug in my code?
Here is a link to how symbols are packed in a .so:
http://docs.oracle.com/cd/E19082-01/819-0690/chapter6-48031/index.html
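For comparison, here is a minimal sketch of the lookup as I understand the System V hash table layout described at that link: the bucket entry gives the first symbol table index to test, chain[i] gives the next index with the same hash, and index 0 (STN_UNDEF) ends the chain. struct sym_tables, sysv_hash and lookup_symbol are hypothetical names standing in for the fields of Obj_Entry, and a Linux-style <elf.h> is assumed.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Symbol name hash as given in the System V ABI. */
static unsigned long sysv_hash(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    unsigned long h = 0, g;
    while (*p) {
        h = (h << 4) + *p++;
        g = h & 0xf0000000UL;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Hypothetical view of the tables the question keeps in Obj_Entry. */
struct sym_tables {
    const Elf32_Sym *symtab;
    const char *strtab;
    const uint32_t *buckets;
    const uint32_t *chains;
    uint32_t nbuckets;
};

/* Returns the symbol table index of name, or STN_UNDEF (0) if it is absent. */
uint32_t lookup_symbol(const struct sym_tables *t, const char *name)
{
    if (t->nbuckets == 0)
        return STN_UNDEF;

    /* Start at the bucket's symbol index, then follow chains[] until 0. */
    uint32_t i = t->buckets[sysv_hash(name) % t->nbuckets];
    for (; i != STN_UNDEF; i = t->chains[i]) {
        if (strcmp(name, t->strtab + t->symtab[i].st_name) == 0)
            return i;
    }
    return STN_UNDEF;
}
```

In this sketch every comparison uses the index currently being walked, starting with the bucket's own index, and the function has an explicit not-found return.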
Full disclosure: This is my first time doing any significant programming in C, and my first post on Stack Overflow.
I'm working on code that will eventually be used with Bison to implement a small subset of the Scheme/Racket language. All of this code is in a single C file. I have three structs: Binding, Lambda, and SymbolEntry. I'm not using the Lambda struct yet, it's just there for completeness. I also have a symbol table that holds symbol entries. printSymbolTable() does exactly what the name implies:
typedef struct
{
char* name;
char* value;
} Binding;
typedef struct
{
int numBindings;
Binding** bindings;
char* functionBody;
} Lambda;
typedef struct
{
Binding* binding;
Lambda* function;
} SymbolEntry;
SymbolEntry* symbolTable = NULL;
int numSymbols = 0;
void printSymbolTable()
{
if (symbolTable)
{
int i = 0;
for (i; i < numSymbols; i++)
{
printf("\tsymbolTable[%i]: %s = %s\n", i, symbolTable[i].binding->name, symbolTable[i].binding->value);
}
}
}
I'm currently trying to work out the logic for defining and looking up variables. The 2 relevant functions:
// Takes a name and an expression and stores the result in the symbol table
void defineVar(char* name, char* expr)
{
printf("\nSetting %s = %s\n", name, expr);
printf("Previous number of symbols: %i\n", numSymbols);
Binding props;
props.name = name;
props.value = expr;
SymbolEntry entry;
entry.binding = &props;
entry.function = NULL;
symbolTable = realloc(symbolTable, sizeof(SymbolEntry) * ++numSymbols);
if (!symbolTable)
{
printf("Memory allocation failed. Exiting.\n");
exit(1);
}
symbolTable[numSymbols - 1] = entry;
printf("New number of symbols: %i\n", numSymbols);
printf("defineVar result:\n");
printSymbolTable();
}
// Test storing and looking up at least 4 variables, including one that is undefined
void testVars()
{
printf("Variable tests\n");
defineVar("foo", "0");
printf("After returning from defineVar:\n");
printSymbolTable();
defineVar("bar", "20");
printf("After returning from defineVar:\n");
printSymbolTable();
}
main() calls testVars(). I get no warnings or errors when compiling, and the program executes successfully. However, this is the result:
Variable tests
Setting foo = 0
Previous number of symbols: 0
New number of symbols: 1
defineVar result:
symbolTable[0]: foo = 0
After returning from defineVar:
symbolTable[0]: 1�I��^H��H���PTI��# = �E
Setting bar = 20
Previous number of symbols: 1
New number of symbols: 2
defineVar result:
symbolTable[0]: bar = 20
symbolTable[1]: bar = 20
After returning from defineVar:
symbolTable[0]: 1�I��^H��H���PTI��# = �E
symbolTable[1]: 1�I��^H��H���PTI��# = �E���
Not only am I getting junk values when outside of the defineVar() function, but the call to define bar shows incorrect non-junk values as well. I'm not sure what I'm doing wrong, but I assume it's probably something with realloc(). However, a similar strategy worked when parsing a string into individual tokens, so that's what I was trying to emulate. What am I doing wrong?
Because entry.binding points to props, a variable local to defineVar() (I haven't read further, so there may be others). The function's stack frame is discarded, and soon overwritten, after you return, so the pointer stored in the symbol table is left dangling.
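One way around this, as a rough sketch rather than the only fix, is to allocate the Binding on the heap and copy the strings so that nothing in the table refers to a dead stack frame. copy_string is a hypothetical helper, Lambda is left as an incomplete type here, and this version of defineVar returns a status code instead of calling exit().

```c
#include <stdlib.h>
#include <string.h>

typedef struct { char *name; char *value; } Binding;
typedef struct Lambda Lambda; /* details omitted in this sketch */
typedef struct { Binding *binding; Lambda *function; } SymbolEntry;

SymbolEntry *symbolTable = NULL;
int numSymbols = 0;

/* Hypothetical helper: heap copy of a string, or NULL on failure. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Stores name = expr in the symbol table.
 * Returns 0 on success, -1 if an allocation fails. */
int defineVar(const char *name, const char *expr)
{
    /* The binding lives on the heap, so it outlives this call. */
    Binding *b = malloc(sizeof *b);
    if (!b)
        return -1;
    b->name = copy_string(name);
    b->value = copy_string(expr);
    if (!b->name || !b->value) {
        free(b->name);
        free(b->value);
        free(b);
        return -1;
    }

    /* Grow through a temporary so the old table survives a failed realloc. */
    SymbolEntry *grown =
        realloc(symbolTable, sizeof *grown * (size_t)(numSymbols + 1));
    if (!grown) {
        free(b->name);
        free(b->value);
        free(b);
        return -1;
    }
    symbolTable = grown;
    symbolTable[numSymbols].binding = b;
    symbolTable[numSymbols].function = NULL;
    numSymbols++;
    return 0;
}
```

Copying the strings is a design choice: it also protects the table if the caller later reuses the buffer that held the name, at the cost of having to free each binding when the table is torn down.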
This question follows on from another question I asked before. In short, this is one of my attempts at merging two fully linked executables into a single fully linked executable. The difference is that the previous question dealt with merging an object file into a fully linked executable, which is even harder because it means I need to deal with relocations manually.
What I have are the following files:
example-target.c:
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
puts("1234");
return EXIT_SUCCESS;
}
example-embed.c:
#include <stdlib.h>
#include <stdio.h>
/*
* Fake main. Never used, just there so we can perform a full link.
*/
int main(void)
{
return EXIT_SUCCESS;
}
void func1(void)
{
puts("asdf");
}
My goal is to merge these two executables to produce a final executable which is the same as example-target, but additionally has another main and func1.
From the point of view of the BFD library, each binary is composed (amongst other things) of a set of sections. One of the first problems I faced was that these sections had conflicting load addresses (such that if I was to merge them, the sections would overlap).
What I did to solve this was to analyse example-target programmatically to get a list of the load address and sizes of each of its sections. I then did the same for example-embed and used this information to dynamically generate a linker command for example-embed.c which ensures that all of its sections are linked at addresses that do not overlap with any of the sections in example-target. Hence example-embed is actually fully linked twice in this process: once to determine how many sections and what sizes they are, and once again to link with a guarantee that there are no section clashes with example-target.
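The placement step described above could be sketched roughly as follows. This is not my actual generator; struct range, ranges_overlap and place_section are hypothetical names, and align is assumed to be a non-zero power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: load address and size of one existing section. */
struct range { uint64_t start; uint64_t size; };

/* True if the half-open ranges [start, start + size) intersect. */
static bool ranges_overlap(struct range a, struct range b)
{
    return a.start < b.start + b.size && b.start < a.start + a.size;
}

/* Finds an address at or above from, aligned to align, where a block of
 * size bytes does not overlap any of the n ranges in used[]. */
uint64_t place_section(const struct range *used, size_t n,
                       uint64_t from, uint64_t size, uint64_t align)
{
    uint64_t addr = (from + align - 1) & ~(align - 1);
    bool moved = true;

    /* Rescan until a full pass finds no clash with any used range. */
    while (moved) {
        moved = false;
        for (size_t i = 0; i < n; i++) {
            struct range cand = { addr, size };
            if (ranges_overlap(cand, used[i])) {
                /* Jump just past the clashing range and realign. */
                addr = (used[i].start + used[i].size + align - 1)
                       & ~(align - 1);
                moved = true;
            }
        }
    }
    return addr;
}
```

Each section of example-embed would be placed this way against the sections of example-target, and then added to the used list before placing the next one.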
On my system, the linker command produced is:
-Wl,--section-start=.new.interp=0x1004238,--section-start=.new.note.ABI-tag=0x1004254,
--section-start=.new.note.gnu.build-id=0x1004274,--section-start=.new.gnu.hash=0x1004298,
--section-start=.new.dynsym=0x10042B8,--section-start=.new.dynstr=0x1004318,
--section-start=.new.gnu.version=0x1004356,--section-start=.new.gnu.version_r=0x1004360,
--section-start=.new.rela.dyn=0x1004380,--section-start=.new.rela.plt=0x1004398,
--section-start=.new.init=0x10043C8,--section-start=.new.plt=0x10043E0,
--section-start=.new.text=0x1004410,--section-start=.new.fini=0x10045E8,
--section-start=.new.rodata=0x10045F8,--section-start=.new.eh_frame_hdr=0x1004604,
--section-start=.new.eh_frame=0x1004638,--section-start=.new.ctors=0x1204E28,
--section-start=.new.dtors=0x1204E38,--section-start=.new.jcr=0x1204E48,
--section-start=.new.dynamic=0x1204E50,--section-start=.new.got=0x1204FE0,
--section-start=.new.got.plt=0x1204FE8,--section-start=.new.data=0x1205010,
--section-start=.new.bss=0x1205020,--section-start=.new.comment=0xC04000
(Note that I prefixed section names with .new using objcopy --prefix-sections=.new example-embedobj to avoid section name clashes.)
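Generating an option string of that shape can be sketched as below, assuming the section names and addresses have already been chosen. struct placed_section and format_section_starts are hypothetical names for this illustration; --section-start=NAME=ADDRESS is the GNU ld option used in the command above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: a section name and the address chosen for it. */
struct placed_section { const char *name; unsigned long address; };

/* Writes "-Wl,--section-start=NAME=0xADDR,..." into out.
 * Returns the string length, or -1 if it does not fit in cap bytes. */
int format_section_starts(char *out, size_t cap,
                          const struct placed_section *s, size_t n)
{
    int w = snprintf(out, cap, "-Wl");
    if (w < 0 || (size_t)w >= cap)
        return -1;
    size_t used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        /* Append one comma-separated linker option per section. */
        w = snprintf(out + used, cap - used, ",--section-start=%s=0x%lX",
                     s[i].name, s[i].address);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```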
I then wrote some code to generate a new executable (borrowing some code from both objcopy and the book Security Warrior). The new executable should have:
All the sections of example-target and all the sections of example-embed
A symbol table which contains all the symbols from example-target and all the symbols of example-embed
The code I wrote is:
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <bfd.h>
#include <libiberty.h>
struct COPYSECTION_DATA {
bfd * obfd;
asymbol ** syms;
int symsize;
int symcount;
};
void copy_section(bfd * ibfd, asection * section, PTR data)
{
struct COPYSECTION_DATA * csd = data;
bfd * obfd = csd->obfd;
asection * s;
long size, count, sz_reloc;
if((bfd_get_section_flags(ibfd, section) & SEC_GROUP) != 0) {
return;
}
/* get output section from input section struct */
s = section->output_section;
/* get sizes for copy */
size = bfd_get_section_size(section);
sz_reloc = bfd_get_reloc_upper_bound(ibfd, section);
if(!sz_reloc) {
/* no relocations */
bfd_set_reloc(obfd, s, NULL, 0);
} else if(sz_reloc > 0) {
arelent ** buf;
/* build relocations */
buf = xmalloc(sz_reloc);
count = bfd_canonicalize_reloc(ibfd, section, buf, csd->syms);
/* set relocations for the output section */
bfd_set_reloc(obfd, s, count ? buf : NULL, count);
free(buf);
}
/* get input section contents, set output section contents */
if(section->flags & SEC_HAS_CONTENTS) {
bfd_byte * memhunk = NULL;
bfd_get_full_section_contents(ibfd, section, &memhunk);
bfd_set_section_contents(obfd, s, memhunk, 0, size);
free(memhunk);
}
}
void define_section(bfd * ibfd, asection * section, PTR data)
{
bfd * obfd = data;
asection * s = bfd_make_section_anyway_with_flags(obfd,
section->name, bfd_get_section_flags(ibfd, section));
/* set size to same as ibfd section */
bfd_set_section_size(obfd, s, bfd_section_size(ibfd, section));
/* set vma */
bfd_set_section_vma(obfd, s, bfd_section_vma(ibfd, section));
/* set load address */
s->lma = section->lma;
/* set alignment -- the power 2 will be raised to */
bfd_set_section_alignment(obfd, s,
bfd_section_alignment(ibfd, section));
s->alignment_power = section->alignment_power;
/* link the output section to the input section */
section->output_section = s;
section->output_offset = 0;
/* copy merge entity size */
s->entsize = section->entsize;
/* copy private BFD data from ibfd section to obfd section */
bfd_copy_private_section_data(ibfd, section, obfd, s);
}
void merge_symtable(bfd * ibfd, bfd * embedbfd, bfd * obfd,
struct COPYSECTION_DATA * csd)
{
/* set obfd */
csd->obfd = obfd;
/* get required size for both symbol tables and allocate memory */
csd->symsize = bfd_get_symtab_upper_bound(ibfd) /********+
bfd_get_symtab_upper_bound(embedbfd) */;
csd->syms = xmalloc(csd->symsize);
csd->symcount = bfd_canonicalize_symtab (ibfd, csd->syms);
/******** csd->symcount += bfd_canonicalize_symtab (embedbfd,
csd->syms + csd->symcount); */
/* copy merged symbol table to obfd */
bfd_set_symtab(obfd, csd->syms, csd->symcount);
}
bool merge_object(bfd * ibfd, bfd * embedbfd, bfd * obfd)
{
struct COPYSECTION_DATA csd = {0};
if(!ibfd || !embedbfd || !obfd) {
return FALSE;
}
/* set output parameters to ibfd settings */
bfd_set_format(obfd, bfd_get_format(ibfd));
bfd_set_arch_mach(obfd, bfd_get_arch(ibfd), bfd_get_mach(ibfd));
bfd_set_file_flags(obfd, bfd_get_file_flags(ibfd) &
bfd_applicable_file_flags(obfd));
/* set the entry point of obfd */
bfd_set_start_address(obfd, bfd_get_start_address(ibfd));
/* define sections for output file */
bfd_map_over_sections(ibfd, define_section, obfd);
/******** bfd_map_over_sections(embedbfd, define_section, obfd); */
/* merge private data into obfd */
bfd_merge_private_bfd_data(ibfd, obfd);
/******** bfd_merge_private_bfd_data(embedbfd, obfd); */
merge_symtable(ibfd, embedbfd, obfd, &csd);
bfd_map_over_sections(ibfd, copy_section, &csd);
/******** bfd_map_over_sections(embedbfd, copy_section, &csd); */
free(csd.syms);
return TRUE;
}
int main(int argc, char **argv)
{
bfd * ibfd;
bfd * embedbfd;
bfd * obfd;
if(argc != 4) {
fprintf(stderr, "Usage: infile embedfile outfile\n");
xexit(-1);
}
bfd_init();
ibfd = bfd_openr(argv[1], NULL);
embedbfd = bfd_openr(argv[2], NULL);
if(ibfd == NULL || embedbfd == NULL) {
perror("asdfasdf");
xexit(-1);
}
if(!bfd_check_format(ibfd, bfd_object) ||
!bfd_check_format(embedbfd, bfd_object)) {
perror("File format error");
xexit(-1);
}
obfd = bfd_openw(argv[3], NULL);
bfd_set_format(obfd, bfd_object);
if(!(merge_object(ibfd, embedbfd, obfd))) {
perror("Error merging input/obj");
xexit(-1);
}
bfd_close(ibfd);
bfd_close(embedbfd);
bfd_close(obfd);
return EXIT_SUCCESS;
}
To summarise what this code does, it takes 2 input files (ibfd and embedbfd) to generate an output file (obfd).
1. Copies the format/arch/mach/file flags and start address from ibfd to obfd.
2. Defines the sections of both ibfd and embedbfd in obfd. Populating the sections happens separately because BFD mandates that all sections are created before any of them is populated.
3. Merges the private data of both input BFDs into the output BFD. Since BFD is a common abstraction above many file formats, it is not necessarily able to comprehensively encapsulate everything required by the underlying file format.
4. Creates a combined symbol table consisting of the symbol tables of ibfd and embedbfd and sets this as the symbol table of obfd. This symbol table is saved so it can later be used to build relocation information.
5. Copies the sections from ibfd to obfd. As well as copying the section contents, this step also deals with building and setting the relocation table.
In the code above, some lines are commented out with /******** */. These lines deal with the merging of example-embed. If they are commented out, obfd is simply built as a copy of ibfd. I have tested this and it works fine. However, once I comment these lines back in, the problems start.
With the uncommented version which does the full merge, it still generates an output file. This output file can be inspected with objdump and found to have all the sections, code and symbol tables of both inputs. However, objdump complains with:
BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708
BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708
On my system, line 1708 of elf.c is:
BFD_ASSERT (elf_dynsymtab (abfd) == 0);
elf_dynsymtab is a macro in elf-bfd.h for:
#define elf_dynsymtab(bfd) (elf_tdata(bfd) -> dynsymtab_section)
I'm not familiar with the ELF layer, but I believe this is a problem reading the dynamic symbol table (or perhaps it is saying the table is not present). For the time being, I am trying to avoid reaching down directly into the ELF layer unless necessary. Is anyone able to tell me what I'm doing wrong, either in my code or conceptually?
If it is helpful, I can also post the code for the linker command generation or compiled versions of the example binaries.
I realise that this is a very large question and for this reason, I would like to properly reward anyone who is able to help me with it. If I am able to solve this with the help of someone, I am happy to award a 500+ bonus.
Why do all of this manually? Given that you have all symbol information (which you must if you want to edit the binary in a sane way), wouldn't it be easier to SPLIT the executable into separate object files (say, one object file per function), do your editing, and relink it?