I'm trying to force a global variable to a specific address without modifying the source code.
I'm well aware of solutions such as:
// C source code
MyStruct globalVariable __attribute__((section(".myLinkerSection")));
// Linker script
. = 0x400000;
.myLinkerSection :
{
*(.myLinkerSection)
}
But in my case I would like to do the same thing without the __attribute__((section(".myLinkerSection"))) keyword.
Is it doable?
EDIT:
I cannot modify the source code at all.
The variable is defined as follows:
file.h:
extern MyStruct globalVariable;
file.c:
MyStruct globalVariable;
I assume from the mentions of __attribute__ that you are using gcc / clang or something compatible. You can use the -fdata-sections option to make the compiler put every variable into its own section. With that option, your globalVariable, assuming it would otherwise go in .bss, will be placed in a section called .bss.globalVariable (the exact name might be platform-dependent). Then you can use your linker script to place this section at the desired address.
Note that this option will inhibit certain compiler optimizations. There is a guarantee that objects defined in the same section within the same assembler module are assembled in strict order, and that their addresses do not change after that. In some cases the compiler can take advantage of this; e.g. if it defines int variables foo and bar consecutively in the same section, then it knows their addresses are consecutive, and it can safely generate code that "hardcodes" their relative position. For instance, on some platforms such as ARM64, it takes multiple instructions to materialize the address of a global or static object. So if some function accesses both foo and bar, the compiler can materialize the address of foo, then add the fixed constant 4 to get the address of bar. But if foo and bar are in different sections, this can't be done, and you will pay the (small but nonzero) cost of materializing both addresses separately.
As such, you may want to use -fdata-sections only on the particular source files that define the particular variables of concern.
This also illustrates why you have to get the variable in its own section in order to set its address; you can't move just one variable from a section, since the compiler may have been relying on its relative position to some other variable in that section.
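A sketch of the linker-script side, reusing the address from the question. The input section name .bss.globalVariable is an assumption and should be checked on your toolchain (for example with objdump -h on the object file). With older GCC defaults (-fcommon), an uninitialized global may be emitted as a common symbol rather than into its own section; compiling with -fno-common avoids that. The rule should also appear before any catch-all pattern such as *(.bss*), since GNU ld assigns an input section to the first pattern that matches it.

```
/* Linker script fragment, placed inside the existing SECTIONS block */
. = 0x400000;
.myLinkerSection :
{
    *(.bss.globalVariable)   /* section created by -fdata-sections (name assumed) */
}
```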
You can define this variable in a separate translation unit. Then list its object file in the appropriate section.
I have a program test.c
int global_var=10;
printf("Done");
I did
gcc -g test.c -o test
My query is
Is there a way I can take a variable name as an argument (say "global_var") and print its value?
Thanks
No, C doesn't have introspection. Once the compiler has generated code, the program can not look up variable names.
The way these things are usually solved is by having a collection of all special variables that need to be looked up by name, containing both the actual name as a string and the variable itself.
Usually it's an array of structures, something like
struct
{
const char *name;
int value;
} variables[] = {
{ "global_var", 10 }
};
The program can then look through the array variables to search for "global_var" and use (or change) the value in the structure.
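A minimal sketch of that lookup, assuming the table only holds int values. The names named_int and find_variable are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One entry per variable that should be reachable by name. */
struct named_int {
    const char *name;
    int value;
};

static struct named_int variables[] = {
    { "global_var", 10 },
};

/* Linear search by name; returns NULL when the name is unknown. */
struct named_int *find_variable(const char *name)
{
    size_t count = sizeof variables / sizeof variables[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(variables[i].name, name) == 0)
            return &variables[i];
    }
    return NULL;
}
```

In main you could then call find_variable(argv[1]) and print the value member if the result is not NULL.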
General answer: No. There is no connection between a variable name and its string representation (you can get the string representation of a variable name at compile time with the preprocessor, though).
For identifiers with external linkage, there are (platform-dependent) ways: See e.g. dlsym for POSIX systems.
You can compile with debugging information and access (most) variables by names from input. Unless you really write something like a debugger, this would be a horrible design, however (and even then, you don’t access the variables used in the debugger itself but of the programme being debugged).
Finally, you could implement your own lookup table mapping from string representations to values.
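As a rough illustration of combining the last point with preprocessor stringification, the table can point at the real variables and have the names generated from the identifiers. VAR_ENTRY, var_table, lookup_int and other_var are hypothetical names, and the sketch only handles int variables.

```c
#include <stddef.h>
#include <string.h>

int global_var = 10;
int other_var = 20;   /* hypothetical second variable */

/* #v turns the identifier into a string literal at compile time. */
#define VAR_ENTRY(v) { #v, &v }

static const struct {
    const char *name;
    int *address;
} var_table[] = {
    VAR_ENTRY(global_var),
    VAR_ENTRY(other_var),
};

/* Returns the address of the named int, or NULL if it is not in the table. */
int *lookup_int(const char *name)
{
    for (size_t i = 0; i < sizeof var_table / sizeof var_table[0]; i++) {
        if (strcmp(var_table[i].name, name) == 0)
            return var_table[i].address;
    }
    return NULL;
}
```

Because the table stores addresses rather than copies, writing through the returned pointer changes the actual global.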
No.
We only have variable names so humans don't get confused.
After your program gets turned into assembly and eventually machine code, the computer doesn't care what you name your variables.
Alternatively you could use a structure in which you would store the value and the name as a string:
struct tag_name {
char *member1;
int member2;
};
In general, it is not possible to access global variables by name at runtime. Sometimes it depends on the operating system and on how the compiler is invoked. I still assume you want to dereference a global variable, and that you know its type.
Then on Linux and some other systems, you could use dlopen(3) with a NULL path (to get a handle for the executable), then use dlsym on the global variable name to get its address; you can then cast that void* pointer to a pointer of the appropriate type and dereference it. Notice that you need to know the type (or at least have a convention to encode the type of the variable in its name; C++ is doing that with name mangling). If you compiled and linked with debug information (i.e. with gcc -g) the type information is in its DWARF sections of your ELF executable, so there is some way to get it.
This works if you link your executable using -rdynamic and with -ldl.
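A sketch of the dlopen/dlsym approach, assuming a POSIX system and a variable known to be an int. The function name int_global_by_name is made up for illustration. Without -rdynamic the symbol is typically absent from the dynamic symbol table and the lookup returns NULL.

```c
#include <dlfcn.h>
#include <stddef.h>

int global_var = 10;

/* Returns the address of the named global, or NULL if it cannot be resolved.
   The caller must already know that the symbol really is an int. */
int *int_global_by_name(const char *name)
{
    /* A NULL path asks for a handle to the running executable itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return NULL;
    return (int *)dlsym(self, name);
}
```

A build line along the lines of gcc -g -rdynamic test.c -o test -ldl matches the flags mentioned above.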
Another possibility might be to customize your recent GCC with your own MELT extension which would remember and later re-use some of the compiler internal representations (i.e. the GCC Tree-s related to global variables). Use MELT register_finish_decl_first function to register a handler on declarations. But this will require some work (in coding your MELT extension).
using preprocessor tricks
You could use (portable) preprocessor tricks to achieve your goals (accessing variable by name at runtime).
The simplest way might be to define and follow your own conventions. For example you could have your own globvar.def header file containing just lines like
/* file globvar.def */
MY_GLOBAL_VARIABLE(globalint,int)
MY_GLOBAL_VARIABLE(globalint2,int)
MY_GLOBAL_VARIABLE(globalstr,char*)
#undef MY_GLOBAL_VARIABLE
And you adopt the convention that all global variables are in the above globvar.def file. Then you would #include "globvar.def" several times. For instance, in your global header, expand MY_GLOBAL_VARIABLE to some extern declaration:
/* in yourheader.h */
#define MY_GLOBAL_VARIABLE(Nam,Typ) extern Typ Nam;
#include "globvar.def"
In your main.c you'll need a similar trick to declare your globals.
Elsewhere you might define a function to get integer variables by name:
/* return the address of global int variable or else NULL */
int* global_int_var_by_name (const char*name) {
#define MY_GLOBAL_VARIABLE(Nam,Typ) \
if (!strcmp(#Typ,"int") && !strcmp(name,#Nam)) return (int*)&Nam;
#include "globvar.def"
return NULL;
}
etc etc... I'm using stringification of macro arguments.
Such preprocessor tricks are purely standard C and would work with any C99 compliant compiler.
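For experimenting in a single file, the same idea can be sketched with a list macro standing in for globvar.def. This is a variant of the approach above, not the layout the answer uses. MY_GLOBAL_VARIABLES, DEFINE_GLOBAL and MATCH_INT are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Single-file stand-in for globvar.def: one X(name, type) line per global. */
#define MY_GLOBAL_VARIABLES(X) \
    X(globalint, int)          \
    X(globalint2, int)         \
    X(globalstr, char *)

/* First expansion: define the globals themselves. */
#define DEFINE_GLOBAL(Nam, Typ) Typ Nam;
MY_GLOBAL_VARIABLES(DEFINE_GLOBAL)

/* Second expansion: one name comparison per global, matching only ints. */
int *global_int_var_by_name(const char *name)
{
#define MATCH_INT(Nam, Typ) \
    if (!strcmp(#Typ, "int") && !strcmp(name, #Nam)) return (int *)&Nam;
    MY_GLOBAL_VARIABLES(MATCH_INT)
#undef MATCH_INT
    return NULL;
}
```

The separate globvar.def file from the answer scales better across many translation units. The list macro is only convenient when everything lives in one file.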
How is the scope of a variable implemented by compilers?
I mean, when we declare a static variable, its scope is limited to the block, or to the functions defined in the same file as the static variable.
How is this achieved in machine level or at memory level?
How actually is this restriction achieved?
How is this scoping resolved at program run time?
It is not achieved at all at the machine level. The compiler checks for scopes before machine code is actually generated. The rules of C are implemented by the compiler, not by the machine. The compiler must check those rules, the machine does not and cannot.
A very simplistic explanation of how the compiler checks this:
Whenever a scope is introduced, the compiler gives it a name and puts it in a structure (a tree) that makes it easy to determine the position of that scope in relation to other scopes, and it is marked as being the current scope. When a variable is declared, it is assigned to the current scope. When a variable is accessed, it is looked for in the current scope. If it is not found, the compiler walks up the tree to the scope above the current one. This continues until the topmost scope is reached. If the variable is still not found, then we have a scope violation.
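The lookup described above can be sketched as a chain of scopes, each pointing at its parent. This is a toy model and not how any particular compiler stores its symbol tables. struct scope, declare, resolve and MAX_SYMBOLS are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 8   /* arbitrary limit for the sketch */

struct scope {
    struct scope *parent;              /* enclosing scope, NULL for the topmost */
    const char *names[MAX_SYMBOLS];    /* identifiers declared in this scope */
    size_t count;
};

/* Record a declaration in the given scope; returns 0 when the scope is full. */
int declare(struct scope *s, const char *name)
{
    if (s->count == MAX_SYMBOLS)
        return 0;
    s->names[s->count++] = name;
    return 1;
}

/* Walk from the current scope outwards; NULL means the name is not visible. */
const struct scope *resolve(const struct scope *current, const char *name)
{
    for (const struct scope *s = current; s != NULL; s = s->parent) {
        for (size_t i = 0; i < s->count; i++) {
            if (strcmp(s->names[i], name) == 0)
                return s;
        }
    }
    return NULL;
}
```

A NULL result from resolve corresponds to the "scope violation" case, which the compiler reports as an error before any machine code is generated.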
Inside compilers, it's implementation-defined. For example, if I were writing a compiler, I would use a tree to define 'scope', and it would definitely be a symbol table inside a binary tree.
Some would use an arbitrary-depth hash table. It's all implementation-defined.
I'm not 100% sure I understand what you are asking, but if you mean "how are static variables and functions stored in the final program", that is implementation-defined.
That said, a common way of storing such variables and functions is in the same place as any other global symbols (and some non-global ones) -- the difference is that these are not "exported", and thus not visible in any outside code trying to link to our software.
In other words, a program which has the following in it:
int var;
static int svar;
int func() { static int func_static; ... }
static int sfunc() { ... }
... might have the following layout in memory (let's say our data starts at 0xF000 and functions at 0xFF00):
0xF000: var
0xF004: svar
0xF008: func.func_static
...
0xFF00: func's code
0xFF40: sfunc's code /* assuming we needed 0x40 bytes for `func`! */
The list of exports, however, would only contain the non-static symbols, aka the exported ones:
var v 0xF000
func f 0xFF00
Again -- note how, while the static data is still written into the files (it has to be stored somewhere!), it is not exported; in layman's terms, our program does not tell anyone that it contains svar, sfunc and similar.
In Unices, you can list the symbols that a library or a program exports with the nm tool: http://unixhelp.ed.ac.uk/CGI/man-cgi?nm ; there do exist similar tools for Windows (GnuWin32 might have something similar).
In practice, executable code is often stored separately from the data (so that it can be protected from writes, for example), and both may get reordered to minimize memory use and cache misses, but the idea remains the same.
Of course, optimizations can be applied -- for example, a static function could be inlined at every invocation, meaning that no code is generated for the function itself at all, and thus it does not exist on its own anywhere.
I found this declaration in kernel/sched/core.c and I do not understand what it specifies.
static void __sched __schedule(void)
Any help appreciated.
[EDIT] kernel version 3.5.4
__sched is actually a macro defined as __attribute__((__section__(".sched.text"))) in include/linux/sched.h. This attribute is picked up by the GCC compiler:
Normally, the compiler places the objects it generates in sections
like data and bss. Sometimes, however, you need additional sections,
or you need certain particular variables to appear in special
sections, for example to map to special hardware. The section
attribute specifies that a variable (or function) lives in a
particular section.
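A small sketch of the same pattern outside the kernel, assuming GCC or Clang targeting ELF. The macro name demo_sched and the section name ".demo.text" are hypothetical stand-ins for __sched and ".sched.text".

```c
/* Stand-in for the kernel's __sched macro; the section name is made up. */
#define demo_sched __attribute__((__section__(".demo.text")))

/* The attribute changes where the function's code is placed,
   not how the function is called. */
int demo_sched add_one(int x)
{
    return x + 1;
}

int call_add_one(int x)
{
    return add_one(x);
}
```

Running objdump -h on the resulting object file should list .demo.text as a separate section alongside .text.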
I have the following string declared as a constant in my code. The purpose is to provide a crude and simple way of storing simple metadata in the compiled output.
const char myString1[] ="abc123\0";
const char myString2[] = {'a','b','c','1','2','3','\0'};
When I inspect the output with a hex editor, I see other string constants but "abc123" does not appear. This leads me to believe that the optimizations that are enabled are causing the lines not to be compiled, as they are never referenced in the program.
Is there a way in code to force this to compile, or another way (in code) of getting this metadata into the binary? I don't want to do any manipulation of the binary post-compile, the goal is to keep it as simple as possible.
compiler flags
-O2 -g -Wall -c -fmessage-length=0 -fno-builtin -ffunction-sections -mcpu=cortex-m3 -mthumb
I think you are looking for the used attribute:
`used'
This attribute, attached to a variable, means that the variable
must be emitted even if it appears that the variable is not
referenced.
When applied to a static data member of a C++ class template, the
attribute also means that the member will be instantiated if the
class itself is instantiated.
Apply it like
__attribute__((used))
const char myString1[] ="abc123\0";
__attribute__((used))
const char myString2[] = {'a','b','c','1','2','3','\0'};
Given the compiler flags you posted, it is almost certainly the linker. The -ffunction-sections flag puts each function into its own section in the object files (-fdata-sections does the same for data items). This allows the linker, when it is asked to discard unused sections (--gc-sections), to determine that a function or data item is not referenced and omit it from the final binary.
Use the binutils strings command to see if these strings are present in your binary.
If they have been optimized out, you can try to use the volatile qualifier when you declare them. Note that if they are not used, some compilers can still optimize them out even with the volatile qualifier.
I've come up with a solution that uses attributes and involves modifying the link script.
First I define a custom section called ".metadata".
__attribute__ ((section(".metadata")))
Then, in the SECTIONS block of the .ld script, I added a KEEP(*(.metadata)), which forces the linker to include .metadata even if it's not used.
.text :
{
KEEP(*(.isr_vector))
KEEP(*(.metadata))
*(.text*)
*(.rodata*)
} > MFlash32
NOTE
I found that the __attribute__ keyword had to be on the same line as the variable or else it didn't actually show up in the binary, though the .metadata section did show up in the memory map.
If you have these variables at file scope, the compiler must emit the strings, since it can't know whether they will be used from a different compilation unit. So any of your ".o" files where you place these variables must contain the strings.
Now a clever linker could decide for the final binary that these constants are not needed. (I have never observed that, though.) If this is the case for your platform, you should use the variable on a "hypothetical" path, that in reality will never be taken by the program. Something like
int main(int argc, char*argv[]){
switch (argv[0][0]) {
case 1: return myString1[argv[0][1]];
case 2: return myString2[argv[0][1]];
}
...
}
I simply need a way to load the address of a label e.g. MyLabel: in e.g. 'src.asm' into a variable in e.g. 'src.c'. (These files will be linked together) I am using gcc and nasm to assemble these files. How can I load the label address?
There are two steps to this. First, you must export the label as global from the assembly file using the global directive.
global MyLabel
MyLabel: dd 1234 ; data or code, in whatever section. It doesn't matter.
Next, you must declare the label as external in C. You can do this either in the code using it, or in a header.
// It doesn't matter, and can be plain void,
// but prefer giving it a C type that matches what you put there with asm
extern void MyLabel(void); // The label is code, even if not actually a function
extern const uint32_t MyLabel[]; // The label is data
// *not* extern long *MyLabel, unless the memory at MyLabel *holds* a pointer.
Finally, you get the address of the label in C the same way you get the address of any variable.
doSomethingWith( &MyLabel );
Note that some compilers add an underscore to the beginning of C variable and function names. For example, GCC does this on Mac OS X, but not Linux. I don't know about other platforms/compilers. To be on the safe side, you can add an asm statement to the variable declaration to tell GCC what the assembly name for the variable is.
extern uint8_t MyLabel asm("MyLabel");
You might consider an assembler "getter" routine.
Also, you might be able to simply fake the label to look like a routine to the C binder so that you could take the address of the "procedure".