Get the address of __data_start symbol - c

I want to get the address of the __data_start symbol programmatically. For _GLOBAL_OFFSET_TABLE_, declaring extern void* _GLOBAL_OFFSET_TABLE_ worked (see an example here). However, the same technique does not work for __data_start: the program compiles fine, but the value it prints is bogus. Any idea how this can be solved?

Magic symbols like __data_start are not pointer variables whose value is the address you want. It's the address of the symbol that you want. So you need the & operator, as in &__data_start.

You could try
extern char _GLOBAL_OFFSET_TABLE_[];
extern char __data_start[];
(These are declarations of arrays, not of pointers!)
and use &__data_start (or simply __data_start, which decays to the same address) in your code.

This code works with no problems at all.
#include <stdio.h>

extern void *data_start;

int main(void) {
    fprintf(stdout, ">%p\n", (void *)&data_start);
    return 0;
}
atom :: » nm test | grep "data_start" ; ./test
0804a00c D __data_start
0804a00c W data_start
>0x804a00

Related

Is there a way to find the address of a const variable in the map file?

I am writing embedded C code with SEGGER Embedded Studio for a Nordic nRF52840 microcontroller.
After compiling my code, GCC gives me a .map file and a .hex file.
I want to write a script that looks up the address of a constant variable in the .map file.
The script should then find this address in the .hex file and return the value of the constant stored there.
My assumption is that a constant variable is placed in the flash of the microcontroller and is therefore present in the .hex file. To find its address, I look for the name of the variable in the .map file.
However, my compiler/linker behaves as follows:
In my code, I have defined the following variable in the main file and tried to make sure that the linker does not optimize my constant away:
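#include <stdbool.h>
#include <stdint.h>

int foo_in_other_file(void); /* defined in another translation unit */

uint32_t const test123 = 0x12345;

int main(void)
{
    int retVal = foo_in_other_file();
    if (test123 == retVal)
    {
        static volatile int i = 0;
        i++;
        retVal = true;
    }
}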
The Section around the constant in the .map-file looks like:
.bss.is_asleeparr
0x0000000000000000 0x400 temp/release/main.o
.rodata.test123
0x0000000000000000 0x4 temp/release/main.o
.text 0x0000000000000000 0x0 temp/release/ble_advdata.o
.data 0x0000000000000000 0x0 temp/release/ble_advdata.o
.bss 0x0000000000000000 0x0 temp/release/ble_advdata.o
.text.sd_ble_gap_addr_get
The constant variable test123 is present in the .map file, but why is its address zero?
Was it optimized away by the linker?
Thank you in advance for your help :)
EDIT:
Here is a screenshot of the disassembly. Does this mean that the compiler treats the value as an immediate and it is not even stored in flash?
If this is true, how can I avoid this?
Thanks to the help in the comments of my post, I could solve the problem.
With the following declaration, the address of the variable is shown in my map file:
static uint32_t const test123 __attribute__((used)) = 0x4711;
I also had to access the variable as follows, to prevent the compiler and linker from optimizing it away or using its value as an immediate:
if (*((volatile uint32_t *)&test123) == 0x4711)
{
    static volatile uint32_t optimization_prevention = 0;
    optimization_prevention += test123;
}
With this Code, the address of "test123" is shown in the .map file:
.rodata.test123
0x0000000000052dcc 0x4 temp/release/test_file.o
Thanks to all for the great help :)

How to find compiled variable/function address from debug symbols

I found the following post (How to generate gcc debug symbol outside the build target?) on how to split the debugging symbols from the compiled file.
However, I cannot find any useful information in the debugging file.
For example,
My helloWorld code is:
#include<stdio.h>
int main(void) {
int a;
a = 5;
printf("The memory address of a is: %p\n", (void*) &a);
return 0;
}
I ran gcc -g -o hello hello.c
objcopy --only-keep-debug hello hello.debug
gdb -s main.debug -e main
In gdb, nothing I tried gives me any information on a: I cannot find its address, and I cannot find the address of the main function either.
For example :
(gdb) info variables
All defined variables:
Non-debugging symbols:
0x0000000000400618 _IO_stdin_used
0x0000000000400710 __FRAME_END__
0x0000000000600e3c __init_array_end
0x0000000000600e3c __init_array_start
0x0000000000600e40 __CTOR_LIST__
0x0000000000600e48 __CTOR_END__
0x0000000000600e50 __DTOR_LIST__
0x0000000000600e58 __DTOR_END__
0x0000000000600e60 __JCR_END__
0x0000000000600e60 __JCR_LIST__
0x0000000000600e68 _DYNAMIC
0x0000000000601000 _GLOBAL_OFFSET_TABLE_
0x0000000000601028 __data_start
0x0000000000601028 data_start
0x0000000000601030 __dso_handle
0x0000000000601038 __bss_start
0x0000000000601038 _edata
0x0000000000601038 completed.6603
0x0000000000601040 dtor_idx.6605
0x0000000000601048 _end
Am I doing something wrong? Am I understanding the debug file incorrectly? Is there even a way to find out an address of compiled variable/function from a saved debugging information?
int a is a stack variable so it does not have a fixed address unless you are in a call to that specific function. Furthermore, each call to that function will allocate its own variable.
When we say "debugging symbols" we usually mean functions and global variables. A local variable is not a "symbol" in this context. In fact, if you compile with optimisations enabled int a would almost certainly be optimised to a register variable so it would not have an address at all, unless you forced it to be written to memory by doing some_function(&a) or similar.
You can find the address of main just by writing print main in GDB. This is because functions are implicitly converted to pointers in C when they appear in value context, and GDB's print uses C semantics.

What is the first column of nm output?

That's my code:
int const const_global_init = 2;
int const const_global;
int global_init = 4;
int global;
static int static_global_init = 3;
static int static_global;
static int static_function(){
return 2;
}
double function_with_param(int a){
static int static_local_init = 3;
static int static_local;
return 2.2;
}
int main(){
}
I generated main.o and am trying to understand the nm output. After running nm main.o --print-file-name -a I get this output:
main.o:0000000000000000 b .bss
main.o:0000000000000000 n .comment
main.o:0000000000000004 C const_global
main.o:0000000000000000 R const_global_init
main.o:0000000000000000 d .data
main.o:0000000000000000 r .eh_frame
main.o:000000000000000b T function_with_param
main.o:0000000000000004 C global
main.o:0000000000000000 D global_init
main.o:0000000000000027 T main
main.o:0000000000000000 a main.c
main.o:0000000000000000 n .note.GNU-stack
main.o:0000000000000000 r .rodata
main.o:0000000000000000 t static_function
main.o:0000000000000000 b static_global
main.o:0000000000000004 d static_global_init
main.o:0000000000000004 b static_local.1733
main.o:0000000000000008 d static_local_init.1732
main.o:0000000000000000 t .text
I understand the 2nd and 3rd columns, but I really don't know what the first column is: is it the address or the size? I know something about the .bss, .comment, .data and .text sections, but what are .eh_frame, .note.GNU-stack and .rodata?
... I really don't know what the first column is: is it the address or the size?
My local manpage (from man nm) says
DESCRIPTION
GNU nm lists the symbols from object files objfile.... If no object files are listed as arguments, nm assumes the file a.out.
For each symbol, nm shows:
· The symbol value, in the radix selected by options (see below), or hexadecimal by default.
that is, the first column is the 'value' of the symbol. To understand what that means, it's helpful to know something about ELF and the runtime linker, but in general it will simply be an offset into the relevant section.
Understanding something about ELF will also help with the other points: man elf tells us that the .rodata section is read-only data (that is: constant values hardcoded into the program that never change. String literals might go here).
.eh_frame is used for exception-handling and other call-stack-frame metadata (a search for eh_frame has this question as the first hit).

How can the C function "Exp" be properly used in NASM for Linux?

I am trying to implement the C function "exp" in NASM for Linux. The function takes a double value x, and returns a double value r = e^x, where e is Euler's Number. This is my implementation:
extern exp
SECTION .bss
doubleActual: resq 1
doubleX: resq 1
SECTION .text
main:
;some other code here
;calculate actual result
push doubleActual ; place to store result
push doubleX ;give the function what x is.
call exp
add esp, 8
When I try to build, I get the following:
hw7_3.o: In function `termIsLess':
hw7_3.asm:(.text+0xf9): undefined reference to `exp'
This refers to the point where I actually call exp, which is odd, because "extern exp" seems to work just fine. What am I doing incorrectly?
via http://www.linuxtopia.org/online_books/an_introduction_to_gcc/gccintro_17.html ....
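I need to link with gcc as follows:
gcc -m32 name.o -lm -o name
The "-lm" option links the C math library (libm), which is separate from the standard C library. The "extern exp" line only declares the symbol to the assembler; the linker still needs a library that defines it.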

Replacing static function in kernel module

Folks,
I'm trying to hack a kernel module by modifying its symbol. The basic idea is to replace the original function with new function by overwriting its address in the symtab. However, I found when declaring the function as static, the hacking fails. But it works with non-static function. My example code is below:
filename: orig.c
int fun(void) {
printk(KERN_ALERT "calling fun!\n");
return 0;
}
int evil(void) {
printk(KERN_ALERT "===== EVIL ====\n");
return 0;
}
static int init(void) {
printk(KERN_ALERT "Init Original!");
fun();
return 0;
}
void clean(void) {
printk(KERN_ALERT "Exit Original!");
return;
}
module_init(init);
module_exit(clean);
Then I follow the styx's article to replace the original function "fun" in symtab to call function "evil", http://www.phrack.org/issues.html?issue=68&id=11
>objdump -t orig.ko
...
000000000000001b g F .text 000000000000001b evil
0000000000000056 g F .text 0000000000000019 cleanup_module
0000000000000036 g F .text 0000000000000020 init_module
0000000000000000 g F .text 000000000000001b fun
...
By executing the elfchger
>./elfchger -s fun -v 1b orig.ko
[+] Opening orig.ko file...
[+] Reading Elf header...
>> Done!
[+] Finding ".symtab" section...
>> Found at 0xc630
[+] Finding ".strtab" section...
>> Found at 0xc670
[+] Getting symbol' infos:
>> Symbol found at 0x159f8
>> Index in symbol table: 0x1d
[+] Replacing 0x00000000 with 0x0000001b... done!
This successfully changes the symbol table entry of fun to the same value as evil, and after inserting the module I can see the effect:
000000000000001b g F .text 000000000000001b evil
...
000000000000001b g F .text 000000000000001b fun
> insmod ./orig.ko
> dmesg
[ 7687.797211] Init Original!
[ 7687.797215] ===== EVIL ====
This works fine. However, when I change the declaration of fun to "static int fun(void)" and follow the same steps as above, evil does not get called. Could anyone give me some suggestions?
Thanks,
William
Short version: Declaring a function as 'static' makes it local and prevents the symbol from being exported. Thus, the call is linked statically, and the dynamic linker does not affect the call in any way at load time.
Long Version
Declaring a symbol as 'static' prevents the compiler from exporting the symbol, making it local instead of global. You can verify this by looking for the (missing) 'g' in your objdump output, or at the lower-case 't' (instead of 'T') in the output of 'nm'. The compiler might also inline the local function, in which case the symbol table wouldn't contain it at all.
Local symbols have to be unique only for the translation unit in which they are defined. If your module consisted of multiple translation units, you could have a static fun() in each of them. An nm or objdump of the finished .ko may then contain multiple local symbols called fun.
This also implies that local symbols are valid only in their respective translation unit and can be referred to (in your case: called) only from inside that unit. Otherwise, the linker would not know which one you mean. Thus, the call to static fun() is already linked at compile time, before the module is loaded.
At load time, the dynamic linker won't tamper with the local symbol fun or references (in particular: calls) to it, since:
its local linkage has already been resolved
there are potentially more symbols named 'fun' throughout the module, and the dynamic linker would not be able to tell which one you meant

Resources