I have two distinct projects running on the same target.
I want my second project to use a few functions from the first project, which are located at specific addresses.
To do that, I thought I could use the symbol table from the first project in the second, but it doesn't work. (I use the arm-none-eabi toolchain and run nm on the .elf file to generate the symbol table.)
I know it is possible, but how can I do it?
Well, the brute-force approach will very likely work:
int (*far_function)(int a, int b, int c) = (int(*)(int, int, int)) 0xfeedf00d;
far_function(1, 2, 3);
In other words, just make a function pointer and initialize it using the known address.
If the address isn't well-known (which it won't be if the other application is re-built and you haven't taken steps to "lock" the target function to a particular address), I would instead add meta-data at some fixed address, that contains the pointer. The other application would embed this data, thereby "exporting" the location of the interesting function.
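A minimal sketch of that meta-data idea, assuming both projects share one header that describes the table. The names export_table_t, EXPORT_MAGIC, EXPORT_VERSION, import_far_function and the ".exports" section are made up for illustration; in a real build the table would be pinned to a fixed address through the linker script, and the second project would pass that address cast to a pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values; both projects only have to agree on them. */
#define EXPORT_MAGIC   0x45585054u   /* "EXPT" */
#define EXPORT_VERSION 1u

typedef int (*far_function_t)(int a, int b, int c);

/* Layout shared by both projects (same header included in each). */
typedef struct {
    uint32_t       magic;         /* lets the importer detect a missing table */
    uint32_t       version;       /* lets the importer detect a layout change */
    far_function_t far_function;  /* filled in by the linker of project 1 */
} export_table_t;

/* Project 1 side. In the real build this object would be pinned, e.g. with
 * __attribute__((section(".exports"), used)) plus a linker script entry that
 * places .exports at an agreed, fixed address (an assumption, not shown). */
static int add3(int a, int b, int c) { return a + b + c; }

const export_table_t example_exports = { EXPORT_MAGIC, EXPORT_VERSION, add3 };

/* Project 2 side: validate the table before trusting the pointer in it.
 * On the target, 'table' would be the agreed fixed address cast to
 * (const export_table_t *). */
far_function_t import_far_function(const export_table_t *table)
{
    if (table == NULL ||
        table->magic != EXPORT_MAGIC ||
        table->version != EXPORT_VERSION) {
        return NULL;
    }
    return table->far_function;
}
```

Because the pointer in the table is written by the first project's own toolchain, it stays correct when that project is rebuilt; only the address of the table itself has to stay fixed.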
The addresses yielded by nm are the locations of the symbols, but on Cortex-M, which uses the Thumb-2 instruction set, those addresses cannot be used directly for jump/call/branch execution: it is necessary to set the LSB of the address to 1 to indicate Thumb mode.
For example:
typedef void (*voidFn_void_t)(void) ;
uint32_t symbol_address = symbolLookup( "myfunction" ) ;
symbol_address |= 1 ; // Make Thumb mode address
((voidFn_void_t)symbol_address)() ; // Make call
Even then, the called function must have no dependencies on the execution environment, since it executes in the environment of the caller, not that of the project it was built in. You may get away with it if the execution environments are identical, but keeping them identical may be a problem.
Are these attributes incompatible? The address attribute seems to be ignored, emitting no warnings (-Wall).
(For reference, EEMEM is defined in eeprom.h as: #define EEMEM __attribute__((section(".eeprom"))).)
Using a declaration like:
uint8_t storedFlags EEMEM __attribute__((address (100)));
(and similarly for all the others) results in the variables being placed in whatever order the linker prefers, ignoring my attribute. Order of attributes doesn't make a difference.
I am aware of the preferred method (creating sections and passing their locations to the linker). I was just looking to shove them around for the moment, as I'm in active development and adding and removing allocations in EEPROM; I'd rather things not move around every other build so I don't have to reprogram EEPROM from default values every damn time. Worst of all, I'm sure I've done precisely this before, and had it work. Version differences? Coincidental assignments? (I have GCC 3.4 and 8.1, not sure what that project used; I'm using 8.1 for this one.)
The documentation for the address attribute states:
Variables with the address attribute are used to address memory-mapped peripherals that may lie outside the io address range.
Looking at the AVR memory space shows the I/O addresses fall under the SRAM data memory space.
This explains why your construct doesn't work as expected since EEMEM and the address attribute map to conflicting memory sections.
Edit: Testing with avr-gcc 3.6.2 suggests that the section attribute overrides the address attribute (without warning). Using eeprom_read_byte to read data from EEPROM, the following example is compiled correctly by avr-gcc (correctly because the address 0x0123 is passed to the eeprom_read_byte function):
#include <avr/eeprom.h>
uint8_t __attribute__((address (0x0123))) storedFlags;
int main(void){
if (eeprom_read_byte(&storedFlags) == 1){
return 1;
}
}
Edit2: tested on avr-gcc 11.1, also generates correct instructions.
I am aware of the C memory layout and the process by which a binary is formed.
I have a question about when, and by whom, addresses are assigned to global variables.
extern int dummy; //Declared in some other file
int * pTest = &dummy;
This code compiles fine. pTest can hold the address of dummy only once dummy has been assigned an address.
I want to know in which phase (compilation or linking) the dummy variable gets its address.
The compiler says:
int *pTest = &<where is dummy?>;
The linker says:
int *pTest= &<dummy is here>;
The loader says:
int *pTest= <dummy is at 0x1234>;
This somewhat simplified explanation tries to convey the following:
The compiler identifies that an external variable dummy is used
The linker identifies where and in which module this variable resides
But only once the executable program is placed in memory is the actual location of the variable known and the loader puts this actual address in all the places where dummy is used.
The actual process is a bit different.
The compiler records the assignment and the reference to the external object in the object file.
Depending on the hardware's instruction set architecture and the implementation, the linker either calculates the absolute address (if the code will be placed at a fixed address, for example in an embedded microcontroller project) or assigns a virtual address and creates an entry in the relocation table (if the code is position-independent). In the latter case the loader changes this virtual address to the correct one during program loading and start-up.
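To make the hand-off concrete, here is a small sketch built on the snippet from the question. The definition of dummy and the helper read_through_pointer are added only so the example is self-contained; normally the definition would live in another source file. If you compile just the first two declarations to an object file, binutils tools such as objdump -r or readelf -r should list a relocation against dummy for the initializer of pTest.

```c
/* Declaration only: at this point the compiler knows the type of dummy,
 * not its address. */
extern int dummy;

/* The initializer is an address constant. The compiler cannot fill it in,
 * so it records a relocation that the linker (and, for position-independent
 * code, the loader) resolves later. */
int *pTest = &dummy;

/* Definition. Kept in the same file here only to make the sketch
 * self-contained; it would normally be in a different translation unit. */
int dummy = 42;

/* Illustrative helper: by the time any code runs, pTest already holds the
 * final address of dummy. */
int read_through_pointer(void)
{
    return *pTest;
}
```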
I'm writing a small tool that should be able to inspect an arbitrary process of interest and check if any of its statically linked functions were trampolined. (An example of a trampoline could be what Microsoft Detours does to a process.)
For that I parse the PE header of the target process and retrieve all of its imported DLLs with all imported functions in them. Then I can compare the following between DLLs on disk and the DLLs loaded in the target process memory:
A. Entries in the Import Address Table for each imported function.
B. First N bytes of each function's machine code.
And if any of the above do not match, this will most certainly mean that a trampoline was applied to a particular function (or WinAPI.)
This works well, except for one situation: a target process can import a global variable instead of a function. For example, _acmdln is such a global variable. You can still find it in msvcrt.dll and use it like this:
//I'm not sure why you'd want to do it this way,
//but it will give you the current command line.
//So just to prove the concept ...
HMODULE hMod = ::GetModuleHandle(L"msvcrt.dll");
char* pVar = (char*)::GetProcAddress(hMod, "_acmdln");
char* pCmdLine = pVar ? *(char**)pVar : NULL;
So, what this means for my trampoline checking tool is that I need to differentiate between an imported function (WinAPI) and a global variable. Any idea how?
PS. If I don't do that, my algorithm that I described above will compare a global variable's "code bytes" as if it was a function, which is just a pointer to a command line that will most certainly be different, and then flag it as a trampolined function.
PS2. Not exactly my code, but a similar way to parse PE header can be found here. (Search for DumpImports function for extracting DLL imports.)
Global variables will be in the .data section, not the .text section. In addition, the section that holds a variable rather than a function will not have execute permission. You can therefore use both of these characteristics to filter.
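A rough sketch of the permission check, written as portable C so it does not depend on the Windows headers. section_info_t is a simplified, made-up stand-in for the fields of IMAGE_SECTION_HEADER that matter here, and rva_is_executable is a hypothetical helper; the flag value is IMAGE_SCN_MEM_EXECUTE from the PE/COFF specification. You would fill the array from the section table of the module that exports the symbol and pass in the export's RVA.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of IMAGE_SCN_MEM_EXECUTE in the PE/COFF specification. */
#define SCN_MEM_EXECUTE 0x20000000u

/* Simplified stand-in for IMAGE_SECTION_HEADER (illustrative only). */
typedef struct {
    uint32_t virtual_address;  /* RVA where the section starts */
    uint32_t virtual_size;     /* size of the section in memory */
    uint32_t characteristics;  /* section flags */
} section_info_t;

/* Returns 1 if rva lies in an executable section (treat it as a function),
 * 0 if it lies in a non-executable section (treat it as data),
 * -1 if it lies in no section at all. */
int rva_is_executable(const section_info_t *sections, size_t count, uint32_t rva)
{
    for (size_t i = 0; i < count; ++i) {
        uint32_t start = sections[i].virtual_address;
        if (rva >= start && rva - start < sections[i].virtual_size) {
            return (sections[i].characteristics & SCN_MEM_EXECUTE) ? 1 : 0;
        }
    }
    return -1;
}
```

Only exports for which this returns 1 would then go through the "first N bytes" comparison; everything else can be skipped or reported separately.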
#define _FUID1(x) __attribute__((section("__FUID1.sec"),space(prog))) int _FUID1 = (x);
I am trying to make sense of the above define, the _FUID1(x) macro. Does it relate to program memory, with the section attribute placing the value in the code memory area?
What is the above trying to accomplish?
The macro isn't doing anything interesting or complicated at all; it just outputs a declaration for int _FUID1, with its parameter as an initializer, and with an attributes list ahead of it.
As for what the attributes list means, look at the documentation for variable attributes in GCC. section puts the variable in a named section, which allows the linker to relocate it to a special address or do some other interesting thing to it, and space isn't documented, but space(prog) sounds like a directive to put a value into the program address space instead of the data address space on a Harvard-architecture machine.
I think this is hardware specific (some Microchip unit), it places a value, for example:
__attribute__((section("__FUID1.sec"),space(prog))) int _FUID1 = (0xf1);
into unit ID register 1 (__FUID1.sec) in the program flash, to configure the hardware. See the PIC documentation (for references to FUID) and the MPLAB C30 manual (for a description of memory spaces).
How is the scope of a variable implemented by compilers?
I mean, when we say a variable is static, its scope is limited to the enclosing block, or to the functions defined in the same file as the static variable.
How is this achieved in machine level or at memory level?
How actually is this restriction achieved?
How is this scoping resolved at program run time?
It is not achieved at all at the machine level. The compiler checks for scopes before machine code is actually generated. The rules of C are implemented by the compiler, not by the machine. The compiler must check those rules, the machine does not and cannot.
A very simplistic explanation of how the compiler checks this:
Whenever a scope is introduced, the compiler gives it a name and puts it in a structure (a tree) that makes it easy to determine the position of that scope in relation to other scopes, and it is marked as the current scope. When a variable is declared, it is assigned to the current scope. When a variable is accessed, it is looked for in the current scope. If it is not found there, the search moves up the tree to the scope above the current one. This continues until the topmost scope is reached. If the variable is still not found, then we have a scope violation.
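The lookup described above can be sketched in a few lines of C. This is a toy model, not how any particular compiler does it: scope_t, scope_declare and scope_lookup are invented names, and a real symbol table would also store types, storage class and so on.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 8  /* arbitrary limit to keep the toy simple */

/* One node of the scope tree; 'parent' points to the enclosing scope. */
typedef struct scope {
    const struct scope *parent;
    const char         *names[MAX_SYMBOLS];
    size_t              count;
} scope_t;

/* Declare a name in the given (current) scope. Returns 0 if the scope is full. */
int scope_declare(scope_t *s, const char *name)
{
    if (s->count >= MAX_SYMBOLS) {
        return 0;
    }
    s->names[s->count++] = name;
    return 1;
}

/* Look a name up, starting in the current scope and walking towards the
 * topmost one. Returns the scope that declares it, or NULL if no enclosing
 * scope does (the "scope violation" case). */
const scope_t *scope_lookup(const scope_t *current, const char *name)
{
    for (const scope_t *s = current; s != NULL; s = s->parent) {
        for (size_t i = 0; i < s->count; ++i) {
            if (strcmp(s->names[i], name) == 0) {
                return s;
            }
        }
    }
    return NULL;
}
```

All of this happens while the compiler is running; none of it is left in the generated machine code.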
Inside compilers, it's implementation-defined. For example, if I were writing a compiler, I would use a tree to define 'scope', and it would definitely be a symbol table inside a binary tree.
Some would use an arbitrary-depth hash table. It's all implementation-defined.
I'm not 100% sure I understand what you are asking, but if you mean "how are static variables and functions stored in the final program", that is implementation-defined.
That said, a common way of storing such variables and functions is in the same place as any other global symbols (and some non-global ones) -- the difference is that these are not "exported", and thus not visible in any outside code trying to link to our software.
In other words, a program which has the following in it:
int var;
static int svar;
int func() { static int func_static; ... }
static int sfunc() { ... }
... might have the following layout in memory (let's say our data starts at 0xF000 and functions at 0xFF00):
0xF000: var
0xF004: svar
0xF008: func.func_static
...
0xFF00: func's data
0xFF40: sfunc's data /* assuming we needed 0x40 bytes for `func`! */
The list of exports, however, would only contain the non-static symbols, aka the exported ones:
var v 0xF000
func f 0xFF00
Again -- note how, while the static data is still written into the files (it has to be stored somewhere!), it is not exported; in layman's terms, our program does not tell anyone that it contains svar, sfunc and similar.
In Unices, you can list the symbols that a library or a program exports with the nm tool: http://unixhelp.ed.ac.uk/CGI/man-cgi?nm ; there do exist similar tools for Windows (GnuWin32 might have something similar).
In practice, executable code is often stored separately from the data (so that it can be protected from writes, for example), and both may get reordered to minimize memory use and cache misses, but the idea remains the same.
Of course, optimizations can be applied -- for example, a static function could be inlined at every invocation, meaning that no code is generated for the function itself at all, and thus it does not exist on its own anywhere.