Are __func__ and __FUNCTION__ pointers persistent? - c

For gcc projects, are the pointers returned by __FUNCTION__, __FILE__ and __func__ guaranteed to point to persistent memory? (That is, can I safely dereference the pointers in the scope of another function?) I know that __func__ is supposed to act like a const char __func__ = "filename" at the beginning of the function, which implies that "filename" points to something in the data segment of the program, and that the pointer should therefore be valid outside of the function. The others are strings, which again, should create entries in the data section. That being said, I don't trust it, and I'm wondering if someone here can confirm whether the assumption is correct.
For example:
#include <stdint.h>
#include <stdio.h>
struct debugLog_t {
    const char * func;
    const char * file;
    const char * function;
    uint32_t line;
    int val;
};
struct debugLog_t someLog = {0};
void someFunc(int x, int val) {
    // create debug log:
    if (x) {
        //uh oh...
        someLog.func = __func__;
        someLog.function = __FUNCTION__;
        someLog.file = __FILE__;
        someLog.line = __LINE__;
        someLog.val = val;
    }
}
void dumpSomeLog(void) {
    printf("%s(%s) -- %s.%u: error val is %d\n",
        someLog.function, someLog.func, someLog.file, (unsigned)someLog.line,
        someLog.val);
}
I want to do this to reduce memory/processing time of recording debug logs.

I wouldn't call that persistent memory (read the Wikipedia page on persistence) but read-only memory (or a read-only section) in the code segment.
And yes, __func__, __FUNCTION__, __FILE__ go there (as static const char[] arrays); like literal strings.
Notice that two occurrences of a literal string like "ab" may or may not be compiled to the same address (likewise, "bc" may or may not be equal to the pointer "abc"+1). The same goes for two occurrences of __FILE__; however, within the same function, all occurrences of __func__ should have the same address.
With GCC (at least at -O1 optimization) literal constant strings of the same content share the same location. I would even believe that in function foo the __func__ and "foo" might share the same address (but with GCC they don't, even at -O2). You could check by compiling with gcc -fverbose-asm -S -O1 and looking at the generated *.s assembler file.
For example:
const char*f(int x) {
if (x==0) return "f";
if (x>0) return __func__;
return __FUNCTION__;
}
gets compiled with gcc -O -fverbose-asm -S (using GCC 7 on Linux/Debian/Sid/x86-64) as
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "f"
.text
.globl f
.type f, @function
f:
.LFB0:
.cfi_startproc
# f.c:2: if (x==0) return "f";
leaq .LC0(%rip), %rax #, <retval>
testl %edi, %edi # x
je .L1 #,
# f.c:3: if (x>0) return __func__;
testl %edi, %edi # x
# f.c:4: return __FUNCTION__;
leaq __func__.1795(%rip), %rax #, tmp94
leaq __FUNCTION__.1796(%rip), %rdx #, tmp95
cmovle %rdx, %rax # tmp94,, tmp95, <retval>
.L1:
# f.c:5: }
rep ret
.cfi_endproc
.LFE0:
.size f, .-f
.section .rodata
.type __FUNCTION__.1796, @object
.size __FUNCTION__.1796, 2
__FUNCTION__.1796:
.string "f"
.type __func__.1795, @object
.size __func__.1795, 2
__func__.1795:
.string "f"
.ident "GCC: (Debian 7.2.0-8) 7.2.0"
Even with -Os or -O3 I'm getting three different locations in the code segment.
However, Clang 5 with -O3 (or even -O1) merges all three of "f", __FUNCTION__ and __func__ by putting them at the same location (and optimizes the test away):
.type f,@function
f: # @f
.cfi_startproc
# BB#0:
movl $.L.str, %eax
retq
.Lfunc_end0:
.size f, .Lfunc_end0-f
.cfi_endproc
# -- End function
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "f"
.size .L.str, 2
So the pointers you care about are pointers to static const char[] arrays in the code segment, but you should not expect __func__ to always have the same address as __FUNCTION__ (even though it could).

Yes they are. These constants actually act like static declarations. From the GCC docs, __func__ acts as though the function begins with
static const char __func__[] = "function-name";
and __FUNCTION__ is basically the same.

According to C2011, the __FILE__ macro expands to
The presumed name of the current source file (a character string literal).
(C2011 6.10.8.1/1; emphasis added)
Therefore, yes, you can assign that to a pointer variable, and expect to be able to safely dereference it for the lifetime of the program.
The standard also specifies the form for __func__, which is effectively an implicit variable, not a macro:
The identifier __func__ shall be implicitly declared by the
translator as if, immediately following the opening brace of each
function definition, the declaration
static const char __func__[] = "function-name";
appeared [...].
(C2011, 6.4.2.2/1)
In this case, then, the identifier designates an array of const char with static storage duration. In this case, too, it is safe to record a pointer to this and dereference at an arbitrary time thereafter in the program run.
As an extension and backwards-compatibility provision, GCC also provides __FUNCTION__ as an alias for __func__, so the same answer applies to the former as applies to the latter: yes, the strings they reference reside in persistent memory, which you can safely access from another function.

Related

malloc pointer address in main and in other function difference [duplicate]

I have the following question. Why is there a difference in the addresses of the two pointers in following example? This is the full code:
#include <stdio.h>
#include <stdlib.h>
void *mymalloc(size_t bytes){
void * ptr = malloc(bytes);
printf("Address1 = %zx\n",(size_t)&ptr);
return ptr;
}
void main (void)
{
unsigned char *bitv = mymalloc(5);
printf("Address2 = %zx\n",(size_t)&bitv);
}
Result:
Address1 = 7ffe150307f0
Address2 = 7ffe15030810
It's because you are printing the address of the pointer variable, not the pointer. Remove the ampersand (&) from bitv and ptr in your printfs.
printf("Address1 = %zx\n",(size_t)ptr);
and
printf("Address2 = %zx\n",(size_t)bitv);
Also, use %p for pointers (and then cast to void * instead of size_t).
WHY?
In this line of code:
unsigned char *bitv = mymalloc(5);
bitv is a pointer and its value is the address of the newly allocated block of memory. But that address also needs to be stored somewhere, and &bitv is the address of where that value is stored. If you have two variables storing the same pointer, they will still each have their own address, which is why &ptr and &bitv have different values.
But, as you expected, ptr and bitv will have the same value when you change your code.
Why is there a difference in the addresses of the two pointers
Because the two pointers are two different pointer(-variable)s, each having its own address.
The values those two pointer(-variable)s carry are in fact the same.
To prove this print their value (and not their address) by changing:
printf("Address1 = %zx\n",(size_t)&ptr);
to be
printf("Address1 = %p\n", (void*) ptr);
and
printf("Address2 = %zx\n",(size_t)&bitv);
to be
printf("Address2 = %p\n", (void*) bitv);
In your code you used the following to print the pointer's address:
printf("%zx", (size_t)&p);
It doesn't print the address of the variable it's pointing to; it prints the address of the pointer itself.
You could print an address using the '%p' format:
printf("%p", (void *)&n); // PRINTS ADDRESS OF 'n'
There's an example which explains printing addresses
int n;
int *v;
n = 54;
v = &n;
printf("%p\n", (void *)v); // PRINTS ADDRESS OF 'n'
printf("%p\n", (void *)&v); // PRINTS ADDRESS OF pointer 'v'
printf("%p\n", (void *)&n); // PRINTS ADDRESS OF 'n'
printf("%d\n", *v); // PRINTS VALUE OF 'n'
printf("%d\n", n); // PRINTS VALUE OF 'n'
So your code should be written like this:
#include <stdio.h>
#include <stdlib.h>
void * get_mem(size_t size)
{
    void * buff = malloc(size); // allocation of memory
    // buff is pointing to result of malloc(size)
    if (!buff) return NULL; // when malloc returns NULL end function
    // else print the value of the pointer (the address of the block)
    printf("ADDRESS->%p\n", buff);
    return buff;
}
int main(void)
{
    void * buff = get_mem(54);
    printf("ADDRESS->%p\n", buff);
    free(buff);
    return 0;
}
(In addition to the other answers, which you should read first and which will probably help you more ...)
Read a good C programming book. Pointers and addresses are very difficult to explain, and I'm not even trying to. So the address of a pointer &ptr is generally not the same as the value of a pointer (however, you could code ptr= &ptr; but you often don't want to do that)... Look also at the picture explaining virtual address space.
Then read more documentation about malloc: the malloc(3) Linux man page, this reference documentation, etc. Here is a fast, standard-conforming, but disappointing implementation of malloc.
Read also the documentation about printf: the printf(3) man page, a printf reference, etc. It should mention %p for printing pointers.
Notice that you don't print a pointer (see Alk's answer), you don't even print its address (of an automatic variable on the call stack), you print some cast to size_t (which might not have the same bit width as a pointer, even if on my Linux/x86-64 it does).
Read also more about C dynamic memory allocation and about pointer aliasing.
Lastly, read the C11 standard specification n1570.
(I can't see why you would expect the two outputs to be the same; actually it could happen if a compiler optimizes the call to mymalloc by inlining it.)
So I did not expect the output to be the same in general. However, with gcc -O2 antonis.c -o antonis I've got (with a tiny modification of your code)....
a surprise
However, if you declare the first void *mymalloc(size_t bytes) as a static void*mymalloc(size_t bytes) and compile with GCC 7 on Linux/Debian/x86-64 with optimizations enabled, you do get the same output; because the compiler inlined the call and used the same location for bitv and ptr; here is the generated assembler code with gcc -S -O2 -fverbose-asm antonis.c:
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Address1 = %zx\n"
.LC1:
.string "Address2 = %zx\n"
.section .text.startup,"ax",@progbits
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB22:
.cfi_startproc
pushq %rbx #
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
# antonis.c:5: void * ptr = malloc(bytes);
movl $5, %edi #,
# antonis.c:11: {
subq $16, %rsp #,
.cfi_def_cfa_offset 32
# antonis.c:6: printf("Address1 = %zx\n",(size_t)&ptr);
leaq 8(%rsp), %rbx #, tmp92
# antonis.c:5: void * ptr = malloc(bytes);
call malloc@PLT #
# antonis.c:6: printf("Address1 = %zx\n",(size_t)&ptr);
leaq .LC0(%rip), %rdi #,
# antonis.c:5: void * ptr = malloc(bytes);
movq %rax, 8(%rsp) # tmp91, ptr
# antonis.c:6: printf("Address1 = %zx\n",(size_t)&ptr);
movq %rbx, %rsi # tmp92,
xorl %eax, %eax #
call printf@PLT #
# antonis.c:13: printf("Address2 = %zx\n",(size_t)&bitv);
leaq .LC1(%rip), %rdi #,
movq %rbx, %rsi # tmp92,
xorl %eax, %eax #
call printf@PLT #
# antonis.c:14: }
addq $16, %rsp #,
.cfi_def_cfa_offset 16
popq %rbx #
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE22:
.size main, .-main
BTW, if I compile your unmodified source (without static) with gcc -fwhole-program -O2 -S -fverbose-asm I'm getting the same assembler as above.
If you don't add static and don't compile with -fwhole-program, Address1 and Address2 stay different.
two run outputs
I ran that antonis executable and got the first time:
/tmp$ ./antonis
Address1 = 7ffe2b07c148
Address2 = 7ffe2b07c148
and the second time:
/tmp$ ./antonis
Address1 = 7ffc441851a8
Address2 = 7ffc441851a8
If you want to guess why the outputs are different from one run to the next one, think of ASLR.
BTW, a very important notion when coding in C is that of undefined behavior (see also this and that answers and the references I gave there). You don't have any in your question (it is just unspecified behavior), but as my contrived answer shows, you should not expect a particular behavior in that precise case.
PS. I believe (but I am not entirely sure) that a standard-conforming C implementation could output Address1= hello world and likewise for Address2. After all, the behavior of printf with %p is implementation-defined. And surely you could get 0xdeadbeef for both. More seriously, an address does not always have the same bit width as a size_t or an int, and the standard defines intptr_t in <stdint.h>.

What parts of this HelloWorld assembly code are essential if I were to write the program in assembly?

I have this short hello world program:
#include <stdio.h>
static const char* msg = "Hello world";
int main(){
printf("%s\n", msg);
return 0;
}
I compiled it into the following assembly code with gcc:
.file "hello_world.c"
.section .rodata
.LC0:
.string "Hello world"
.data
.align 4
.type msg, @object
.size msg, 4
msg:
.long .LC0
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
movl msg, %eax
movl %eax, (%esp)
call puts
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
.section .note.GNU-stack,"",@progbits
My question is: are all parts of this code essential if I were to write this program in assembly (instead of writing it in C and then compiling to assembly)? I understand the assembly instructions but there are certain pieces I don't understand. For instance, I don't know what .cfi* is, and I'm wondering if I would need to include this to write this program in assembly.
The absolute bare minimum that will work on the platform that this appears to be, is
.globl main
main:
pushl $.LC0
call puts
addl $4, %esp
xorl %eax, %eax
ret
.LC0:
.string "Hello world"
But this breaks a number of ABI requirements. The minimum for an ABI-compliant program is
.globl main
.type main, @function
main:
subl $24, %esp
pushl $.LC0
call puts
xorl %eax, %eax
addl $28, %esp
ret
.size main, .-main
.section .rodata
.LC0:
.string "Hello world"
Everything else in your object file is either the compiler not optimizing the code down as tightly as possible, or optional annotations to be written to the object file.
The .cfi_* directives, in particular, are optional annotations. They are necessary if and only if the function might be on the call stack when a C++ exception is thrown, but they are useful in any program from which you might want to extract a stack trace. If you are going to write nontrivial code by hand in assembly language, it will probably be worth learning how to write them. Unfortunately, they are very poorly documented; I am not currently finding anything that I think is worth linking to.
The line
.section .note.GNU-stack,"",@progbits
is also important to know about if you are writing assembly language by hand; it is another optional annotation, but a valuable one, because what it means is "nothing in this object file requires the stack to be executable." If all the object files in a program have this annotation, the kernel won't make the stack executable, which improves security a little bit.
(To indicate that you do need the stack to be executable, you put "x" instead of "". GCC may do this if you use its "nested function" extension. (Don't do that.))
It is probably worth mentioning that in the "AT&T" assembly syntax used (by default) by GCC and GNU binutils, there are three kinds of lines: A line with a single token on it, ending in a colon, is a label. (I don't remember the rules for what characters can appear in labels.) A line whose first token begins with a dot, and does not end in a colon, is some kind of directive to the assembler. Anything else is an assembly instruction.
related: How to remove "noise" from GCC/clang assembly output? The .cfi directives are not directly useful to you, and the program would work without them. (It's stack-unwind info needed for exception handling and backtraces, so -fomit-frame-pointer can be enabled by default. And yes, gcc emits this even for C.)
As far as the number of asm source lines needed to produce a valid Hello World program, obviously we want to use libc functions to do more work for us.
@Zwol's answer has the shortest implementation of your original C code.
Here's what you could do by hand, if you don't care about the exit status of your program, just that it prints your string.
# Hand-optimized asm, not compiler output
.globl main # necessary for the linker to see this symbol
main:
# main gets two args: argv and argc, so we know we can modify 8 bytes above our return address.
movl $.LC0, 4(%esp) # replace our first arg with the string
jmp puts # tail-call puts.
# you would normally put the string in .rodata, not leave it in .text where the linker will mix it with other functions.
.section .rodata
.LC0:
.asciz "Hello world" # asciz zero-terminates
The equivalent C (you just asked for the shortest Hello World, not one that had identical semantics):
int main(int argc, char **argv) {
return puts("Hello world");
}
Its exit status is implementation-defined but it definitely prints. puts(3) returns "a non-negative number", which could be outside the 0..255 range, so we can't say anything about the program's exit status being 0 / non-zero in Linux (where the process's exit status is the low 8 bits of the integer passed to the exit_group() system call, in this case by the CRT startup code that called main()).
Using JMP to implement the tail-call is a standard practice, and commonly used when a function doesn't need to do anything after another function returns. puts() will eventually return to the function that called main(), just like if puts() had returned to main() and then main() had returned. main()'s caller still has to deal with the args it put on the stack for main(), because they're still there (but modified, and we're allowed to do that).
gcc and clang don't generate code that modifies arg-passing space on the stack. It is perfectly safe and ABI-compliant, though: functions "own" their args on the stack, even if they were const. If you call a function, you can't assume that the args you put on the stack are still there. To make another call with the same or similar args, you need to store them all again.
Also note that this calls puts() with the same stack alignment that we had on entry to main(), so again we're ABI-compliant in preserving the 16B alignment required by modern versions of the x86-32 aka i386 System V ABI (used by Linux).
.string zero-terminates strings, same as .asciz, but I had to look it up to check. I'd recommend just using .ascii or .asciz to make sure you're clear on whether your data has a terminating byte or not. (You don't need one if you use it with explicit-length functions like write())
In the x86-64 System V ABI (and Windows), args are passed in registers. This makes tail-call optimization a lot easier, because you can rearrange args or pass more args (as long as you don't run out of registers). This makes compilers willing to do it in practice. (Because as I said, they currently don't like to generate code that modifies the incoming arg space on the stack, even though the ABI is clear that they're allowed to, and compiler generated functions do assume that callees clobber their stack args.)
clang or gcc -O3 will do this optimization for x86-64, as you can see on the Godbolt compiler explorer:
#include <stdio.h>
int main() { return puts("Hello World"); }
# clang -O3 output
main: # @main
movl $.L.str, %edi
jmp puts # TAILCALL
# Godbolt strips out comment-only lines and directives; there's actually a .section .rodata before this
.L.str:
.asciz "Hello World"
This works because, in a non-PIE executable, static data addresses always fit in the low 31 bits of address space and the code doesn't need to be position-independent; otherwise the mov would be lea .LC0(%rip), %rdi. (You'll get this from gcc if it was configured with --enable-default-pie to make position-independent executables.)
How to load address of function or label into register in GNU Assembler
Hello World using 32-bit x86 Linux int 0x80 system calls directly, no libc
See Hello, world in assembly language with Linux system calls? My answer there was originally written for SO Docs, then moved here as a place to put it when SO Docs closed down. It didn't really belong here so I moved it to another question.
related: A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux. The smallest binary file you can run that just makes an exit() system call. That is about minimizing the binary size, not the source size or even just the number of instructions that actually run.

Editing ASM result of an operation in C when compiling in GCC

My friend and I got a computer architecture project and we don't really know how to approach it. I hope you can at least point us in the right direction so we know what to look for. As our professor isn't really good at explaining what we need to do and the subject is rather vague, we'll start from the beginning.
Our task is to somehow "edit" GCC to treat some operations differently. For example, when you add two char arguments in a .c program it uses addb. We need to change it to use, for example, 16-bit registers (addl), without using unnecessary parameters during compilation (just regular gcc p.c -o p). Why we'd do this, or whether it will work, doesn't really matter at this point.
We'd like to know how we could change something inside GCC, where we can even start looking as I can't find any information about similar tasks besides making plugins/extensions. Is there anything we could read about something like this or anything we could use?
In C, 'char' variables are normally added together as integers, so the C compiler will already use addl, except when it can see that using a smaller or faster form makes no difference to the result.
For example this C code
unsigned char a, b, c;
int i;
void func1(void) { a = b + c; }
void func2(void) { i = b + c; }
Gives this assembler for GCC.
.file "xq.c"
.text
.p2align 4,,15
.globl func1
.type func1, @function
func1:
movzbl c, %eax
addb b, %al
movb %al, a
ret
.size func1, .-func1
.p2align 4,,15
.globl func2
.type func2, @function
func2:
movzbl b, %edx
movzbl c, %eax
addl %edx, %eax
movl %eax, i
ret
.size func2, .-func2
.comm i,4,4
.comm c,1,4
.comm b,1,4
.comm a,1,4
.ident "GCC: (Debian 4.7.2-5) 4.7.2"
.section .note.GNU-stack,"",@progbits
Note that the first function uses addb but the second uses addl because the high bits of the result will be discarded in the first function when the result is stored.
This version of GCC is generating i686 code, so integers are 32 bits (addl). Depending on exactly what you want, you may need to make the result a short, or actually get a compiler version that outputs 16-bit 8086 code.

good explanation of __read_mostly, __init, __exit macros

The macro expansion of __read_mostly :
#define __read_mostly __attribute__((__section__(".data..read_mostly")))
This one is from cache.h
__init:
#define __init __section(.init.text) __cold notrace
from init.h
__exit:
#define __exit __section(.exit.text) __exitused __cold notrace
After searching the net I have not found any good explanation of what is happening there.
Additional question: I have heard about various "linker magic" employed in kernel development. Any information regarding this would be wonderful.
I have some ideas about what these macros do. For example, __init is supposed to indicate that the function code can be removed after initialization, and __read_mostly indicates that the data is seldom written, which minimizes cache misses. But I have no idea how they do it. They are gcc extensions, so in theory they can be demonstrated by small userland C code.
UPDATE 1:
I have tried to test the __section__ with arbitrary section name. the test code :
#include <stdio.h>
#define __read_mostly __attribute__((__section__("MY_DATA")))
struct ro {
char a;
int b;
char * c;
};
struct ro my_ro __read_mostly = {
.a = 'a',
.b = 3,
.c = NULL,
};
int main(int argc, char **argv) {
printf("hello");
printf("my ro %c %d %p \n", my_ro.a, my_ro.b, my_ro.c);
return 0;
}
Now with __read_mostly the generated assembly code :
.file "ro.c"
.globl my_ro
.section MY_DATA,"aw",@progbits
.align 16
.type my_ro, @object
.size my_ro, 16
my_ro:
.byte 97
.zero 3
.long 3
.quad 0
.section .rodata
.LC0:
.string "hello"
.LC1:
.string "my ro %c %d %p \n"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
pushq %rbx
subq $24, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movl $.LC0, %eax
movq %rax, %rdi
movl $0, %eax
.cfi_offset 3, -24
call printf
movq my_ro+8(%rip), %rcx
movl my_ro+4(%rip), %edx
movzbl my_ro(%rip), %eax
movsbl %al, %ebx
movl $.LC1, %eax
movl %ebx, %esi
movq %rax, %rdi
movl $0, %eax
call printf
movl $0, %eax
addq $24, %rsp
popq %rbx
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.4.6 20110731 (Red Hat 4.4.6-3)"
.section .note.GNU-stack,"",@progbits
Now without the __read_mostly macro the assembly code remains more or less the same.
this is the diff
--- rm.S 2012-07-17 16:17:05.795771270 +0600
+++ rw.S 2012-07-17 16:19:08.633895693 +0600
@@ -1,6 +1,6 @@
.file "ro.c"
.globl my_ro
- .section MY_DATA,"aw",@progbits
+ .data
.align 16
.type my_ro, @object
.size my_ro, 16
So essentially only a section is created, nothing fancy.
Even the objdump disassembly does not show any difference.
So my final conclusion about them: it's the linker's job to do something for data sections marked with a special name. I think the Linux kernel uses some kind of custom linker script to achieve these things.
One of the thing about __read_mostly, data which were put there can be grouped and managed in a way so that cache misses can be reduced.
Someone on lkml submitted a patch to remove __read_mostly, which spawned a fascinating discussion on the merits and demerits of __read_mostly.
here is the link : https://lkml.org/lkml/2007/12/13/477
I will post further update on __init and __exit.
UPDATE 2
The macros __init, __exit and __read_mostly put data (in the case of __read_mostly) and text (in the case of __init and __exit) into custom named sections. These sections are utilized by the linker. Since the linker's default behaviour is not suitable here for various reasons, a linker script is employed to achieve the purposes of these macros.
A background may be found how a custom linker script can be used to eliminate dead code(code which is linked to by linker but never executed). This issue is of very high importance in embedded scenarios. This document discusses how a linker script can be fine tuned to remove dead code : elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdf
In the case of the kernel, the initial linker script can be found in include/asm-generic/vmlinux.lds.h. This is not the final script; it is a kind of starting point, and the linker script is further modified for different platforms.
A quick look at this file immediately turns up the portions of interest:
#define READ_MOSTLY_DATA(align) \
. = ALIGN(align); \
*(.data..read_mostly) \
. = ALIGN(align);
It seems this macro collects the ".data..read_mostly" section.
Also you can find __init and __exit section related linker commands :
#define INIT_TEXT \
*(.init.text) \
DEV_DISCARD(init.text) \
CPU_DISCARD(init.text) \
MEM_DISCARD(init.text)
#define EXIT_TEXT \
*(.exit.text) \
DEV_DISCARD(exit.text) \
CPU_DISCARD(exit.text) \
MEM_DISCARD(exit.text)
Linking seems pretty complex thing to do :)
GCC attributes are a general mechanism to give instructions to the compiler that are outside the specification of the language itself.
The common facility that the macros you list share is the use of the __section__ attribute, which is described as:
The section attribute specifies that a function lives in a particular section. For example, the declaration:
extern void foobar (void) __attribute__ ((section ("bar")));
puts the function foobar in the bar section.
So what does it mean to put something in a section? An object file is divided into sections: .text for executable machine code, .data for read-write data, .rodata for read-only data, .bss for data initialised to zero, etc. The names and purposes of these sections are a matter of platform convention, and some special sections can only be accessed from C using the __attribute__ ((section)) syntax.
In your example you can guess that .data..read_mostly is a subsection of .data for data that will be mostly read; .init.text is a text (machine code) section that will be run when the program is initialised, etc.
On Linux, deciding what to do with the various sections is the job of the kernel; when userspace requests to exec a program, it will read the program image section-by-section and process them appropriately: .data sections get mapped as read-write pages, .rodata as read-only, .text as execute-only, etc. Presumably .init.text will be executed before the program starts; that could either be done by the kernel or by userspace code placed at the program's entry point (I'm guessing the latter).
If you want to see the effect of these attributes, a good test is to run gcc with the -S option to output assembler code, which will contain the section directives. You could then run the assembler with and without the section directives and use objdump or even hex dump the resulting object file to see how it differs.
As far as I know, these macros are used exclusively by the kernel. In theory, they could apply to user-space, but I don't believe this is the case. They all group similar variable and code together for different effects.
init/exit
A lot of code is needed to set up the kernel; this happens before any user space is running at all, i.e. before the init task runs. In many cases, this code is never used again, so it would be a waste to consume un-swappable RAM after boot. The familiar kernel message Freeing init memory is a result of the init section. Some drivers may be configured as modules. In these cases, they exit. However, if they are compiled into the kernel, they don't necessarily exit (they may shut down). This is another section to group this type of code/data.
cold/hot
Each cache line has a fixed size. You can get the most out of a cache by putting the same type of data/function in it. The idea is that often-used code can go side by side. If a cache line holds four instructions, the end of one hot routine should merge with the beginning of the next hot routine. Similarly, it is good to keep seldom-used code together, as we hope it never goes in the cache.
read_mostly
The idea here is similar to hot; the difference is that with data we can update the values. When this is done, the entire cache line becomes dirty and must be written back to main RAM. This is needed for multi-CPU consistency and when that cache line goes stale. If nothing has changed between the CPU cache version and main memory, then nothing needs to happen on an eviction. This optimizes the RAM bus so that other important things can happen.
These items are strictly for the kernel. Similar tricks could be (are?) implemented for user space. That would depend on the loader in use, which often differs depending on the libc in use.

What's the difference between this 2 type of code?

Here are two code snippets that produce the same output.
char *p = "abc";
::printf("%s",p);
And
::printf("%s","abc");
Is there any difference as to where the "abc" string is stored in memory?
I once heard that in the second code, the "abc" string is placed by the compiler in read-only memory (the .text part?)
How to tell this difference from code if any?
Many thanks.
Update
My current understanding is:
when we write:
char *p="abc"
Though this seems to be only a declarative statement, the compiler will in fact generate several imperative instructions for it. These instructions allocate the proper space within the stack frame of the containing function; it could look like this:
subl $4, %esp
then the address of "abc" string is moved to that allocated space, it could be like this:
movl $abc_string_address, -4(%ebp)
The "abc" string is stored in the executable file image. But where in the memory it (i mean the string) will be loaded totally depends on the implementation of the compiler/linker, if it is loaded into the read-only part of the process's address space (i.e. the protection bit of the memory page is flagged as read-only), then the p is a read-only pointer, if it is loaded into the r/w part, the p is writable.
Correct me if I am wrong. Now I am looking into the assembly code generated by the gcc to have a confirmation for my understanding. I'll update this thread again shortly.
Is there any difference as to where the "abc" string is stored in memory?
Nope, and that is true for both. String literals are stored in the read-only segment. However, if you declare your variable as a char[] it will be copied onto the stack, i.e., not read only.
There is no difference in where the string literal is stored. The only difference is that the former also allocates space on the stack for the variable to store the pointer.
No difference besides a char pointer allocated on the stack for the first one.
They both use a string literal, delimited by double quotes.
Yes, it won't be stored in different locations. They are all values known at compile time, so the compiler will generate the assembly code, and the "abc" string will be in the data segment, that is, initialized data. The .bss section is for uninitialized data.
Try compiling with gcc and the -S option. It will generate a .s file, which is assembly code. The "abc" string will be under the .rodata section, same as .data for NASM assembly.
Here's the assembly code if you don't want to do the work:
This is for char* c = "abc"; printf("%s\n", c);
Note how this file has more lines of code than the other one: this code allocates a pointer variable and prints through it, while the other solution doesn't use a variable and just references a static memory address.
.file "test.c"
.section .rodata
.LC0:
.string "abc"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $32, %esp
movl $.LC0, 28(%esp)
movl 28(%esp), %eax
movl %eax, (%esp)
call puts
movl $0, %eax
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
And this is for printf("abc\n");
.file "test2.c"
.section .rodata
.LC0:
.string "abc"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $.LC0, (%esp)
call puts
movl $0, %eax
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
To your edit:
I don't think any compiler would put p in read-only memory, since you declared it as a variable in the code. It's not a hidden/protected variable generated by the compiler; it's a variable you can use whenever you want.
If you do char* p = "abc"; the compiler will subtract the pointer's size from the stack pointer, and in a later instruction it will load the memory address of the "abc" string (which is itself placed in read-only memory) into a register and, if the compiler needs that register, save its value to the stack.
If you do printf("abc"); no variable will be allocated, since the compiler knows the string's value at compile time, so it just inserts a number there (relative to the start of the executable file) and it can read the content of that part of the memory.
With this option, you can compile it, generate a .exe, then use a hex editor to search for the "abc" string and change it to "cba" or whatever (it will probably be in one of the first lines or one of the last), if the compiler generates a simple .exe like this, which is probable.

Resources