I'm relatively new to ARM and I'm having problems with scatter files. This is the scatter file from the project I joined:
ROOT 0x00020200 0x000DFDFC
{
ITCM 0x00020200
{
ssw01fiq.o (Startup, +First)
startup.o (+RO)
ssw01irq.o (+RO)
}
EXEC_REST +0
{
* (+RO)
}
DTCM 0x00400000 {startup.o (StartOfRAM, +First)}
SCT +0x348
{
ibmp_slot.o* (+ZI);
}
INIT_CALL +0
{
* (INIT)
}
SRAM +0
{
* (+RW, +ZI)
startup.o* (DummyStack, +Last)
}
JUSTAFTERRAM 0x00410000
{
startup.o* (JustAfterRAM)
}
JUSTAFTERROM 0x00100000
{
startup.o* (JustAfterROM)
}
}
What I would like to do is add a new execution region called INIT_CALL and then, in the source code, define pointers to init functions and place them in the INIT section (in a simplified form, this is what the Linux kernel does).
For that I use a macro like this:
typedef int (*_init_fn) (void);
#define component_init(__fn) static _init_fn fn __attribute__((section ("INIT"))) = __fn
And in one .c file I'm using it to register an init function:
component_init(productUiInitialize);
void dump_fn()
{
printf("Here we have, fn=0x%08X, addr=0x%08X\n", fn, &fn);
}
In the other .c file I do this:
extern int productUiInitialize(void);
_init_fn *test = (_init_fn *)0x00400F78;
void do_init_calls( void )
{
dump_fn();
printf("FUNC=0x%08X, INITCALL=0x%08X, ADDR=0x%08X\n",productUiInitialize, *test, test);
}
The outcome is really strange: my pointers are not being initialized correctly; they have a NULL value. I then noticed that if I add const to my define, the pointer in dump_fn is fine: its address is 0x00400F78 and its value is the address of productUiInitialize. But in the other .c file, the value of test is naturally 0x00400F78, yet when I dereference it I get NULL (I was expecting productUiInitialize).
Am I doing anything wrong in the scatter file? Any input would be really appreciated.
UPDATE: I recently had some more time to look into this. What I found is that if I put my INIT section inside SRAM, everything works as expected. The problem is that in this case I don't know how to force the start address and the size of my section. Moreover, I checked what happens if I initialize a variable like this in test.c:
_init_fn *fn __attribute__((section ("INIT"))) = (init *)0x00400F78;
The variable is not correctly initialized: when I print it, its value is something completely different from 0x00400F78. So it seems that if my section is outside the SRAM execution region (where the RW and ZI sections are), things don't work as I expected. So I have basically two questions. Does anyone know the reason for this behavior? And is there any way to force a start address for my INIT section if I place it inside the SRAM execution region?
Thanks in advance!
In the linker script, I have defined
MEMORY {
sec_1 : ORIGIN = 0x1B800, LENGTH = 2048
......
}
How can I read the start address of this section in C? I would like to copy it in a variable in the C code
Basically to achieve this, you have two tasks to fulfill:
Tell the linker to save the start address of the section. This can be achieved by placing a symbol in the linker script at the beginning of your section.
Tell the compiler to initialize a constant with an address that the linker fills in later.
As for the first step: In your section sec_1 you have to place a symbol that will be placed at the start of that section:
SECTIONS
{
...
.sec_1 :
{
__SEC_1_START = ABSOLUTE(.); /* <-- add this */
...
} > sec_1
...
}
Now that the linker produces this symbol, you have to make it accessible from the compiler side. To do so, you need code like this somewhere:
/* Make the compiler aware of the linker symbol, by telling it, there
is something, somewhere that the linker will put together (i.e. "extern") */
extern int __SEC_1_START;
void Sec1StartPrint(void) {
void * const SEC_1_START = &__SEC_1_START;
printf("The start address for sec_1 is: %p", SEC_1_START);
}
By calling Sec1StartPrint() you should get an address output that matches your *.map file the linker created.
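The key point above is that a linker symbol has an address but no storage, so only its address should ever be used. A minimal sketch of that idiom follows. It assumes a Linux host with GNU ld, where the default linker script already provides etext, edata and end (see end(3)); these are used here only as stand-ins for a custom symbol such as __SEC_1_START, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for a custom linker symbol such as __SEC_1_START.
 * Declaring them as arrays makes it hard to read "the value" by
 * accident: the array name already evaluates to the address. */
extern char etext[], edata[], end[];

/* Hypothetical helper: turn a linker symbol into a plain integer. */
uintptr_t linker_symbol_address(const char *sym)
{
    return (uintptr_t)sym;
}

/* Print the three addresses; compare them with the linker map file. */
void print_segment_ends(void)
{
    printf("end of text: %p\n", (void *)etext);
    printf("end of data: %p\n", (void *)edata);
    printf("end of bss:  %p\n", (void *)end);
}
```

Declaring the symbol as `extern char name[]` instead of `extern int name` is a design choice, not a requirement: both work as long as the code only ever takes the address.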
I'm using arm-none-eabi-g++ to compile for an ARM Cortex M microcontroller. My code statically instantiates some modules, initializes them, and executes them sequentially in a loop. Each module (ModuleFoo, ModuleBar...) is a class which inherits from a common Module class. The Module class defines two virtual functions, init() and exec(), which are implemented in every derived module. There are no explicit constructors, neither in the base nor in the derived classes. Finally, I have a Context struct which is passed around and contains a list of pointers to the modules (Module* modules[]).
I had the following code which worked :
int main() {
ModuleFoo foo;
ModuleBar bar;
Context context;
const int N_MODULES = 2;
context.modules[0] = &foo; // Indexes are actually an enum but I stripped it to make it shorter
context.modules[1] = &bar;
for (int i = 0; i < N_MODULES; i++) {
context.modules[i]->init(context);
}
while (1) {
for (int i = 0; i < N_MODULES; i++) {
context.modules[i]->exec(context);
}
}
}
So far, so good (at least I think so, in any case it worked).
Now, I want to make the system more maintainable by moving all the code related to "which modules are used in a particular configuration" to a separate config.cpp/config.h file :
config.cpp :
ModuleFoo foo;
ModuleBar bar;
void initContext(Context& context) {
context.nModules = 2;
context.modules[0] = &foo;
context.modules[1] = &bar;
}
main.cpp :
#include "config.h"
int main() {
Context context;
initContext(context);
for (int i = 0; i < context.nModules; i++) {
context.modules[i]->init(context);
}
while (1) {
for (int i = 0; i < context.nModules; i++) {
context.modules[i]->exec(context);
}
}
}
The problem appears when init() is called on the first module (the MCU HardFaults). This is because, according to GDB, the vtable pointer is not initialized :
(gdb) p foo
$1 = {
<Module> = {
_vptr.Module = 0x0 <__isr_vector>,
_enabled = false
},
I rolled back with Git to check, with the previous code structure the vtable pointer was correctly initialized. And according to the linker's map file and GDB, the vtable exists (at around the same address as before):
.rodata 0x0000000000008e14 0x2c ModuleFoo.o
0x0000000000008e14 typeinfo name for ModuleFoo
0x0000000000008e1c typeinfo for ModuleFoo
0x0000000000008e28 vtable for ModuleFoo
The pointer is simply not set. The only difference I see between the two versions is that in the first one the modules are instantiated on the stack, whereas in the second they are instantiated globally in the bss:
.bss 0x00000000200015fc 0x22c config.o
0x00000000200015fc foo
0x000000002000164c bar
Could this be the problem?
In any case, thanks for taking the time to read this far!
EDIT:
The problem was coming from the startup code and the linker script. I used the sample files provided with Atmel's ARM GCC toolchain, which seem to be poorly written and, most importantly, didn't call __libc_init_array() (which is used to call global constructors). I switched to using the startup code and linker script from ASF, and it works much better. Thanks @FreddieChopin!
Show us the startup code you are using. Most likely you did not enable global constructors, which can be done by calling __libc_init_array() function. You can test this theory, by manually calling this function at the beginning of main() - it should work fine then. If it does, then you should add that function to your startup code (Reset_Handler).
Quick test:
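// A linkage specification is only allowed at namespace scope,
// so declare the function outside main().
extern "C" void __libc_init_array();
int main() {
__libc_init_array();
// rest of your code...
}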
To do it properly, find the place where your startup code calls main() (usually something like ldr rX, =main followed by blx rX, or maybe directly bl main) and right before that do exactly the same, but with __libc_init_array instead of main.
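To make it clearer why a missing call leaves global objects unconstructed, here is a rough host-side sketch of the central loop. As far as I know, newlib's __libc_init_array() walks the linker-provided range __init_array_start .. __init_array_end (after handling .preinit_array and _init) and calls each entry. The function and table names below are hypothetical stand-ins, and the table is a local array rather than the real linker section.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Call every function pointer in [start, end), in order. This mirrors
 * the loop that runs compiler-generated global constructors. */
void run_init_array(init_fn *start, init_fn *end)
{
    for (init_fn *p = start; p < end; p++) {
        (*p)();
    }
}

/* Demo "constructors" standing in for the ones the compiler emits
 * for global objects such as foo and bar. */
static int constructed_count = 0;
static void construct_foo(void) { constructed_count++; }
static void construct_bar(void) { constructed_count++; }

/* Run the demo table once and report how many constructors ran. */
int demo_constructed_count(void)
{
    init_fn table[] = { construct_foo, construct_bar };
    constructed_count = 0;
    run_init_array(table, table + 2);
    return constructed_count;
}
```

If the startup code never runs this loop, the constructors never execute, which matches the zeroed vtable pointer seen in GDB.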
I have a legacy C Linux application that I need to reuse. This application uses a lot of global variables. I want to reuse this application's main function and invoke it in a loop. I have found that when I call the main function (renamed to callableMain) in a loop, the application's behavior is not consistent, because the values of global variables set in a previous iteration affect the program flow in the next iteration.
What I would like to do is reset all the global variables to their default values before each new iteration.
For example, the original program looks like this:
OriginalMain.C
#include <stdio.h>
int global = 3; /* This is the global variable. */
void doSomething(){
global++; /* Reference to global variable in a function. */
}
// i want to rename this main method to callableMain() and
// invoke it in a loop
int main(void){
if(global==3) {
printf(" All Is Well \n");
doSomething() ;
}
else{
printf(" Noooo\n");
doNothing() ;
}
return 0;
}
I want to change this program as follows:
I changed the above file to rename main() to callableMain().
And my new main function is as follows:
int main(){
for(int i=0;i<20;i++){
callableMain();
// this is where I need to reset the value of global variables
// otherwise the execution flow changes
}
}
Is it possible to reset all the global variables to the values they had before main() was invoked?
The short answer is that there is no magical api call that would reset global variables. The global variables would have to be cached and reused.
I would invoke it as a subprocess, modifying its input and output as needed. Let the operating system do the dirty work for you.
The idea is to isolate the legacy program from your new program by relegating it to its own process. Then you have a clean separation between the two. Also, the legacy program is reset to a clean state every time you run it.
First, modify the program so that it reads the input data from a file, and writes its output in a machine-readable format to another file, with the files being given on the command line.
You can then create named pipes (using the mkfifo call) and invoke the legacy program using system, passing it the named pipes on the command line. Then you feed it its input and read back its output.
I am not an expert on these matters; there is probably a better way of doing the IPC. Others here have mentioned fork. However, the basic idea of separating out the legacy code and invoking it as a subprocess is probably the best approach here.
fork() early?
You could fork(2) at some early point when you think the globals are in a good state, and then have the child wait on a pipe or something for some work to do. This would require writing any changed state or at least the results back to the parent process but would decouple your worker from your primary control process.
In fact, it might make sense to fork() at least twice, once to set up a worker controller and save the initialized (but not too initialized :-) global state, and then have this worker controller fork() again for each loop you need run.
A simpler variation might be to just modify the code so that the process can start in a "worker mode", and then use fork() or system() to start the application at the top, but with an argument that puts it in to the slave mode.
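The fork-per-iteration idea can be sketched roughly as follows, assuming a POSIX system. callableMain here is a small stand-in for the legacy entry point, and run_isolated is a hypothetical helper name; only the exit status is passed back to the parent, so any richer result would need a pipe or a file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int global = 3;   /* the legacy global from the question */

/* Stand-in for the renamed legacy main(): mutates the global and
 * reports its new value through the return code. */
int callableMain(void)
{
    global++;
    return global;
}

/* Run fn in a forked child so its writes to globals never reach the
 * parent. Returns the child's exit status (0..255), or -1 on error. */
int run_isolated(int (*fn)(void))
{
    fflush(NULL);                 /* do not duplicate buffered output */
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        int rc = fn();            /* child: run the legacy code */
        fflush(NULL);             /* keep whatever it printed */
        _exit(rc & 0xff);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Calling run_isolated(callableMain) in the 20-iteration loop gives each iteration a fresh copy of the globals, because the parent's copy is never modified.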
There is a way to do this on certain platforms / compilers, you'd basically be performing the same initialization your compiler performs before calling main().
I have done this for a TI DSP, in that case I had the section with globals mapped to a specific section of memory and there were linker directives available that declared variables pointing to the start and end of this section (so you can memset() the whole area to zero before starting initialization). Then, the compiler provided a list of records, each of which comprised of an address, data length and the actual data to be copied into the address location. So you'd just loop through the records and do memcpy() into the target address to initialize all globals.
Very compiler specific, so hopefully the compiler you're using allows you to do something similar.
In short, no. What I would do in this instance is create definitions, constants if you will, and then use those to reset the global variables with.
Basically
#define var1 10
int vara = var1;
etc... basic C right?
You can then go ahead and wrap the reinitialization in a handy function =)
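A minimal sketch of such a reset function, using the global from the question plus one extra hypothetical variable (counter), might look like this. The macro names are made up for illustration.

```c
/* Default values, defined once so that the initializers and the
 * reset function cannot drift apart. */
#define GLOBAL_DEFAULT  3
#define COUNTER_DEFAULT 0   /* hypothetical second global */

int global = GLOBAL_DEFAULT;
int counter = COUNTER_DEFAULT;

/* Legacy code that mutates the globals. */
void doSomething(void)
{
    global++;
    counter++;
}

/* Put every global back to its default; call this after each
 * iteration of callableMain(). */
void reset_globals(void)
{
    global = GLOBAL_DEFAULT;
    counter = COUNTER_DEFAULT;
}
```

The obvious drawback is maintenance: every new global has to be added to reset_globals() by hand.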
I think you must change the way you see the problem.
Declare all the variables used by callableMain() inside callableMain()'s body, so they are not global anymore and are destroyed after the function is executed and created once again with the default values when you call callableMain() on the next iteration.
EDIT:
OK, here's what you could do if you have the source code for callableMain(): at the beginning of the function, add a check to verify whether it's the first time the function is being called. Inside this check, copy the values of all the global variables used to another set of static variables (name them as you like). Then, in the function's body, replace all occurrences of the global variables with the static variables you created.
This way you will preserve the initial values of all the global variables and use them on every iteration of callableMain(). Does it make sense to you?
void callableMain()
{
static bool first_iter = true;
// Declared at function scope so they stay visible after the check.
static int my_global_var1;
static float my_global_var2;
if (first_iter)
{
first_iter = false;
my_global_var1 = global_var1;
my_global_var2 = global_var2;
..
}
// perform operations on my_global_var1 and my_global_var2,
// which store the default values of the original global variables.
}
for (int i = 0; i < 20; i++) {
int saved_var1 = global_var1;
char saved_var2 = global_var2;
double saved_var3 = global_var3;
callableMain();
global_var1 = saved_var1;
global_var2 = saved_var2;
global_var3 = saved_var3;
}
Or maybe you can find out where the global variables start and memcpy them. But I would always cringe when starting a loop ...
for (int i = 0; i < 20; i++) {
static unsigned char global_copy[SIZEOFGLOBALDATA];
memcpy(global_copy, STARTOFGLOBALDATA, SIZEOFGLOBALDATA);
callableMain();
memcpy(STARTOFGLOBALDATA, global_copy, SIZEOFGLOBALDATA);
}
If you don't want to refactor the code and encapsulate these global variables, I think the best you can do is define a reset function and then call it within the loop.
Assuming we are dealing with ELF on Linux, then the following function to reset the variables works
// these extern variables come from glibc
// https://github.com/ysbaddaden/gc/blob/master/include/config.h
extern char __data_start[];
extern char __bss_start[];
extern char _end[];
#define DATA_START ((char *)&__data_start)
#define DATA_END ((char *)&__bss_start)
#define BSS_START ((char *)&__bss_start)
#define BSS_END ((char *)&_end)
/// first call saves globals, subsequent calls restore
void reset_static_data();
// variable for quick check
static int pepa = 42;
// writes to memory between global variables are reported as buffer overflows by asan
ATTRIBUTE_NO_SANITIZE_ADDRESS
void reset_static_data()
{
// global variable, ok to leak it
static char * x;
size_t s = BSS_END - DATA_START;
// memcpy is always sanitized, so access memory as chars in a loop
if (x == NULL) { // store current static variables
x = (char *) malloc(s);
for (size_t i = 0; i < s; i++) {
*(x+i) = *(DATA_START + i);
}
} else { // restore previously saved static variables
for (size_t i = 0; i < s; i++) {
*(DATA_START + i) = *(x+i);
}
}
// quick check, see that pepa does not grow in stderr output
fprintf(stderr, "pepa: %d\n", pepa++);
}
The general approach is based on answer in How to get the data and bss address space in run time (In Unix C program), see the linked ysbaddaden/gc GitHub repo for macOS version of the macros.
To test the above code, just call it a few times and note that the incremented global variable pepa still keeps the value of 42.
reset_static_data();
reset_static_data();
reset_static_data();
Saving current state of the globals is convenient in that it does not require rerunning __attribute__((constructor)) functions which would be necessary if I set everything in .bss to zero (which is easy) and everything in .data to the initial values (which is not so easy). For example, if you load libpython3.so in your program, it does do run-time initialization which is lost by zeroing .bss. Calling into Python then crashes.
Sanitizers
Writing into areas of memory immediately before or after a static variable will trigger buffer-overflow warning from Address Sanitizer. To prevent this, use the ATTRIBUTE_NO_SANITIZE_ADDRESS macro the way the code above does. The macro is defined in sanitizer/asan_interface.h.
Code coverage
Code coverage counters are implemented as global variables. Therefore, resetting globals will cause coverage information to be forgotten. To solve this, always dump the coverage-to-date before restoring the globals. There does not seem to be a macro to detect whether code coverage is enabled or not in the compiler, so use your build system (CMake, ...) to define suitable macro yourself, such as QD_COVERAGE below.
// The __gcov_dump function writes the coverage counters to gcda files
// and the __gcov_reset function resets them to zero.
// The interface is defined at https://github.com/gcc-mirror/gcc/blob/7501eec65c60701f72621d04eeb5342bad2fe4fb/libgcc/libgcov-interface.c
extern "C" void __gcov_reset();
extern "C" void __gcov_dump();
void flush_coverage() {
#if defined(QD_COVERAGE)
__gcov_dump();
__gcov_reset();
#endif
}
I was trying to write a small debug utility, and for this I need to get the address of a function or global variable given its name. This is a built-in debug utility, which means that it will run from within the code being debugged; in plain words, I cannot parse the executable file.
Is there a well-known way to do that? My plan is to have the .debug_* sections loaded into memory [which I plan to do with a cheap trick like this in the ld script]
.data {
*(.data)
__sym_start = .;
(debug_);
__sym_end = .;
}
Now I have to parse the section to get the information I need, but I am not sure whether this is doable or whether there are issues with it; this is all just theory. It also seems like too much work :-) Is there a simpler way? Or if someone can tell me up front why my scheme will not work, that will also be helpful.
Thanks in Advance,
Alex.
If you are running under a system with dlopen(3) and dlsym(3) (like Linux) you should be able to:
char thing_string[] = "thing_you_want_to_look_up";
void * handle = dlopen(NULL, RTLD_LAZY | RTLD_NOLOAD);
// you could do RTLD_NOW as well. shouldn't matter
if (!handle) {
fprintf(stderr, "Dynamic linking on main module : %s\n", dlerror() );
exit(1);
}
void * addr = dlsym(handle, thing_string);
fprintf(stderr, "%s is at %p\n", thing_string, addr);
I don't know the best way to do this for other systems, and this probably won't work for static variables and functions. C++ symbol names will be mangled, if you are interested in working with them.
To expand this to work for shared libraries you could probably get the names of the currently loaded libraries from /proc/self/maps and then pass the library file names into dlopen, though this could fail if the library has been renamed or deleted.
There are probably several other much better ways to go about this.
edit without using dlopen
/* name_addr.h */
struct name_addr {
const char * sym_name;
const void * sym_addr;
};
typedef struct name_addr name_addr_t;
void * sym_lookup(const char * name);
extern const name_addr_t name_addr_table[];
extern const unsigned name_addr_table_size;
/* name_addr_table.c */
#include "name_addr.h"
#define PREMEMBER( X ) extern const void * X
#define REMEMBER( X ) { .sym_name = #X , .sym_addr = (const void *) &X }
PREMEMBER(strcmp);
PREMEMBER(printf);
PREMEMBER(main);
PREMEMBER(memcmp);
PREMEMBER(bsearch);
PREMEMBER(sym_lookup);
/* ... */
const name_addr_t name_addr_table[] =
{
/* You could do a #include here that included the list, which would allow you
* to have an empty list by default without regenerating the entire file, as
* long as your compiler only warns about missing include targets.
*/
REMEMBER(strcmp),
REMEMBER(printf),
REMEMBER(main),
REMEMBER(memcmp),
REMEMBER(bsearch),
REMEMBER(sym_lookup),
/* ... */
};
const unsigned name_addr_table_size = sizeof(name_addr_table)/sizeof(name_addr_t);
/* name_addr_code.c */
#include "name_addr.h"
#include <string.h>
void * sym_lookup(const char * name) {
unsigned to_go = name_addr_table_size;
const name_addr_t *na = name_addr_table;
while(to_go) {
if ( !strcmp(name, na->sym_name) ) {
return (void *) na->sym_addr;
}
na++;
to_go--;
}
/* set errno here if you are using errno */
return NULL; /* Or some other illegal value */
}
If you do it this way the linker will take care of filling in the addresses for you after everything has been laid out. If you include header files for all of the symbols that you are listing in your table then you will not get warnings when you compile the table file, but it will be much easier just to have them all be extern void * and let the compiler warn you about all of them (which it probably will, but not necessarily).
You will also probably want to sort your symbols by name such that you can use a binary search of the list rather than iterate through it.
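A sketch of the sorted-table variant, using the standard bsearch() from <stdlib.h>, could look like the following. The three demo_* symbols and the name sym_lookup_sorted are hypothetical, and the table must be kept in strcmp order by whatever generates it.

```c
#include <stdlib.h>
#include <string.h>

struct name_addr {
    const char *sym_name;
    const void *sym_addr;
};

static int demo_a, demo_b, demo_c;   /* hypothetical symbols */

/* Must stay sorted by sym_name in strcmp() order. */
static const struct name_addr sorted_table[] = {
    { "demo_a", &demo_a },
    { "demo_b", &demo_b },
    { "demo_c", &demo_c },
};

/* bsearch() passes the key first and the table element second. */
static int compare_by_name(const void *key, const void *elem)
{
    const struct name_addr *entry = elem;
    return strcmp((const char *)key, entry->sym_name);
}

/* Binary search by name; returns NULL when the name is not listed. */
const void *sym_lookup_sorted(const char *name)
{
    const struct name_addr *hit = bsearch(
        name, sorted_table,
        sizeof sorted_table / sizeof sorted_table[0],
        sizeof sorted_table[0], compare_by_name);
    return hit ? hit->sym_addr : NULL;
}
```

For a handful of symbols the linear scan is just as good; the binary search only pays off once the table grows.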
You should note that if you have members in the table which are not otherwise referenced by the program (like if you had an entry for sqrt in the table, but didn't call it) the linker will then want (need) to link those functions into your image. This can make it blow up.
Also, if you were taking advantage of global optimizations having this table will likely make those less effective since the compiler will think that all of the functions listed could be accessed via pointer from this list and that it cannot see all of the call points.
Putting static functions in this list is not straight forward. You could do this by changing the table to dynamic and doing it at run time from a function in each module, or possibly by generating a new section in your object file that the table lives in. If you are using gcc:
#define SECTION_REMEMBER(X) \
static const name_addr_t _name_addr##X \
__attribute__((section("sym_lookup_table"), used)) = \
{.sym_name= #X , .sym_addr = (void *) X }
And tack a list of these onto the end of each .c file with all of the symbols that you want to remember from that file. This will require linker work so that the linker will know what to do with these members, but then you can iterate over the list by looking at the begin and end of the section that it resides in (I don't know exactly how to do this, but I know it can be done and isn't TOO difficult). This will make having a sorted list more difficult, though. Also, I'm not entirely certain initializing the .sym_name to a string literal's address would not result in cramming the string into this section, but I don't think it would. If it did then this would break things.
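On the question of finding the begin and end of the section: with GNU ld on ELF targets, when a section name is a valid C identifier the linker automatically provides __start_SECNAME and __stop_SECNAME symbols, so no linker script change is needed in that case. The sketch below assumes that toolchain; the two hidden_* variables and the function name section_sym_lookup are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct name_addr {
    const char *sym_name;
    const void *sym_addr;
} name_addr_t;

/* Place one table entry for X in the ELF section "sym_lookup_table".
 * "used" keeps the otherwise unreferenced entry from being dropped. */
#define SECTION_REMEMBER(X) \
    static const name_addr_t _name_addr_##X \
    __attribute__((section("sym_lookup_table"), used)) = \
    { .sym_name = #X, .sym_addr = (const void *)&X }

/* Provided by GNU ld because the section name is a C identifier. */
extern const name_addr_t __start_sym_lookup_table[];
extern const name_addr_t __stop_sym_lookup_table[];

static int hidden_counter;   /* hypothetical file-local symbols */
static int hidden_flag;
SECTION_REMEMBER(hidden_counter);
SECTION_REMEMBER(hidden_flag);

/* Walk every entry that any object file contributed to the section. */
const void *section_sym_lookup(const char *name)
{
    for (const name_addr_t *na = __start_sym_lookup_table;
         na < __stop_sym_lookup_table; na++) {
        if (strcmp(name, na->sym_name) == 0) {
            return na->sym_addr;
        }
    }
    return NULL;
}
```

The entries appear in link order, not name order, which is why a sorted lookup is harder with this scheme. Other linkers need explicit start and end symbols in the linker script instead.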
You can still use objdump to get a list of the symbols that the object file (probably ELF) contains, filter it for the symbols you are interested in, and then regenerate the table file with the table's members listed.
I'm getting into kernel work for a bit of my summer research. We are looking to make modifications to TCP, specifically the RTT calculations. What I would like to do is redirect the resolution of one of the functions in tcp_input.c to a function provided by a dynamically loaded kernel module. I think this would improve the pace at which we can develop and distribute the modification.
The function I'm interested in was declared as static, however I've recompiled the kernel with the function non-static and exported by EXPORT_SYMBOL. This means the function is now accessible to other modules/parts of the kernel. I have verified this by "cat /proc/kallsyms".
Now I'd like to be able to load a module that can rewrite the symbol address from the initial to my dynamically loaded function. Similarly, when the module is to be unloaded, it would restore the original address. Is this a feasible approach? Do you all have suggestions how this might be better implemented?
Thanks!
Same as Overriding functionality with modules in Linux kernel
Edit:
This was my eventual approach.
Given the following function (which I wanted to override, and is not exported):
static void internal_function(void)
{
// do something interesting
return;
}
modify like so:
static void internal_function_original(void)
{
// do something interesting
return;
}
void (*internal_function)(void) = &internal_function_original;
EXPORT_SYMBOL(internal_function);
This redefines the expected function identifier instead as a function pointer (which can be called in a similar manner) pointing to the original implementation. EXPORT_SYMBOL() makes the address globally accessible, so we can modify it from a module (or other kernel location).
Now you can write a kernel module with the following form:
static void (*original_function_reference)(void);
extern void (*internal_function)(void);
static void new_function_implementation(void)
{
// do something new and interesting
// return
}
int init_module(void)
{
original_function_reference = internal_function;
internal_function = &new_function_implementation;
return 0;
}
void cleanup_module(void)
{
internal_function = original_function_reference;
}
This module replaces the original implementation with a dynamically loaded version. Upon unloading, the original reference (and implementation) is restored. In my specific case, I provided a new estimator for the RTT in TCP. By using a module, I am able to make small tweaks and restart testing, all without having to recompile and reboot the kernel.
I'm not sure that'll work - I believe the symbol resolution for the internal calls to the function you want to replace will have already been done by the time your module loads.
Instead, you could change the code by renaming the existing function, then creating a global function pointer with the original name of the function. Initialise the function pointer to the address of the internal function, so the existing code will work unmodified. Export the symbol of the global function pointer, then your module can just change its value by assignment at module load and unload time.
I once made a proof of concept of a hijack module that inserted its own function in place of a kernel function.
It just so happens that the new kernel tracing architecture uses a very similar system.
I injected my own function into the kernel by overwriting the first couple of bytes of code with a jump pointing to my custom function. As soon as the real function gets called, it jumps instead to my function, which calls the original function after it has done its work.
#include <linux/module.h>
#include <linux/kernel.h>
#define CODESIZE 12
static unsigned char original_code[CODESIZE];
static unsigned char jump_code[CODESIZE] =
"\x48\xb8\x00\x00\x00\x00\x00\x00\x00\x00" /* movq $0, %rax */
"\xff\xe0" /* jump *%rax */
;
/* FILL THIS IN YOURSELF */
int (*real_printk)( char * fmt, ... ) = (int (*)(char *,...) )0xffffffff805e5f6e;
int hijack_start(void);
void hijack_stop(void);
void intercept_init(void);
void intercept_start(void);
void intercept_stop(void);
int fake_printk(char *, ... );
int hijack_start()
{
real_printk(KERN_INFO "I can haz hijack?\n" );
intercept_init();
intercept_start();
return 0;
}
void hijack_stop()
{
intercept_stop();
return;
}
void intercept_init()
{
*(long *)&jump_code[2] = (long)fake_printk;
memcpy( original_code, real_printk, CODESIZE );
return;
}
void intercept_start()
{
memcpy( real_printk, jump_code, CODESIZE );
}
void intercept_stop()
{
memcpy( real_printk, original_code, CODESIZE );
}
int fake_printk( char *fmt, ... )
{
int ret;
intercept_stop();
ret = real_printk(KERN_INFO "Someone called printk\n");
intercept_start();
return ret;
}
module_init( hijack_start );
module_exit( hijack_stop );
I'm warning you: when you experiment with this kind of thing, watch out for kernel panics and other disastrous events. I would advise you to do this in a virtualised environment. This is proof-of-concept code I wrote a while ago; I'm not sure it still works.
It's a really easy principle, but very effective. Of course, a real solution would use locks to make sure nobody would call the function while you're overwriting it.
Have fun!
You can try using ksplice - you don't even need to make it non static.
I think what you want is Kprobe.
Another way that caf has mentioned is to add a hook to the original routine, and register/unregister hook in the module.
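The register/unregister hook idea can be illustrated in plain user-space C. This is only a sketch of the pattern, not kernel code: the names rtt_hook_fn, register_rtt_hook, unregister_rtt_hook and estimate_rtt are hypothetical, the "built-in behaviour" is a placeholder, and a real kernel version would export the two functions and protect the pointer against concurrent callers.

```c
#include <stddef.h>

typedef int (*rtt_hook_fn)(int sample);   /* hypothetical signature */

static rtt_hook_fn rtt_hook = NULL;

/* Install a hook. Returns 0 on success, -1 if one is already set. */
int register_rtt_hook(rtt_hook_fn fn)
{
    if (fn == NULL || rtt_hook != NULL) {
        return -1;
    }
    rtt_hook = fn;
    return 0;
}

/* Remove the hook so the built-in behaviour applies again. */
void unregister_rtt_hook(void)
{
    rtt_hook = NULL;
}

/* The "original routine": defers to the hook when one is present. */
int estimate_rtt(int sample)
{
    if (rtt_hook != NULL) {
        return rtt_hook(sample);
    }
    return sample;   /* placeholder for the built-in estimator */
}

/* What a module would supply at load time. */
int doubled_estimate(int sample)
{
    return 2 * sample;
}
```

Compared with exporting the raw function pointer, going through register and unregister functions lets the core code reject a second hook and keeps the pointer itself private.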