How do 'C' statements execute inside memory - c

Suppose I have this piece of C code here:
#include <stdio.h>

void someFunc(void);  /* defined elsewhere */

int main() {
    int a = 10;
    printf("test1");
    printf("test2");
    someFunc();
    a = 20;
    printf("%d", a);
}
Well, I think that all these statements are stored on the stack at once and then popped one by one to be executed. Am I correct? If not, please correct me.

Not really, no. The C standard doesn't even mention a stack, so that mental model is misconceived.
A C compiler (or interpreter for that matter) can do anything it likes so long as it follows the C standard.
In your case, that could mean, among other things, (i) the removal of a altogether, since it's only used to output 20 at the end of the function, and (ii) someFunc() could be removed if there are no side effects in doing so.
What normally happens is that your code is converted to machine code suitable for the target architecture. These machine code instructions do tend to follow the code quite faithfully (C in this sense is quite "low level") although modern compilers will optimise aggressively.

The C standard doesn't specify where things go in memory.
However, computers work like this in general:
Your program consists of various memory sections/segments, usually called .data, .bss, .rodata, .stack, .text, and possibly also a .heap. The .text section is for storing the actual program code; the rest of the sections are for storing variables. More info on Wikipedia.
The C code is translated to machine code (assembler) by the compiler. All code is stored inside the .text section, which is read-only memory.
Modern computers can "mark" memory as either code or data, and will be capable of generating hardware exceptions if you try to execute code in the data section, or treat the code section as data. This way the processor can assist in catching bugs like dangling pointers or run-away code.
In theory you could execute code from any memory section, even on the stack, but because of the previously mentioned feature, this is not typically done. Most often, it wouldn't make any sense to do so anyhow.
So for your specific code snippet, the only thing that is stored on the stack is the variable a. Or more likely, it is stored in a CPU register, for performance reasons.
The string literals "test1", "test2" and "%d" will be stored in the .rodata section.
The literal 20 could either be stored in the .rodata section, or more likely, merged into the code and therefore stored in .text together with the rest of the code.
The program counter determines which part of the code is currently being executed. The stack is not involved in that whatsoever; it is only for storing data.

What's worth noting is that the C standard does not mandate an implementation. So as long as the output is correct according to the standard, the compiler is free to implement this as it chooses.
What most likely happens for you is that the compiler translates this bit of C into assembler code which is then executed top to bottom.
a = 20;
Is most likely optimised out unless you use it somewhere else. Good compilers also throw a warning for you like:
Warning: Unused Variable a

Related

When initializing variables in C, does the compiler store the initial value in the executable?

When initializing variables in C, I was wondering: does the compiler make it so that when the code is loaded at run time the value is already set, or does it have to execute an explicit assembly instruction to set it to its initial value?
The first one would be slightly more efficient, because you're not executing a second instruction just to set the value.
e.g.
void foo() {
    int c = 1234;
}
A compiler is not required to do either of them. As long as the behavior of the program stays the same it can pretty much do whatever it wants.
Especially when using optimization, crazy stuff can happen. Looking at the assembly code after heavy optimization can be confusing to say the least.
In your example, both the constant 1234 and the variable c would be optimized away since they are not used.
If it's a variable with static lifetime, it'll typically become part of the executable's static image, which'll get memcpy'ed, along with other statically known data, into the process's allocated memory when the process is started/loaded.
void take_ptr(int*);
void static_lifetime_var(void)
{
static int c = 1234;
take_ptr(&c);
}
x86-64 assembly from gcc -Os:
static_lifetime_var:
mov edi, OFFSET FLAT:c.1910
jmp take_ptr
c.1910:
.long 1234
If it's unused, it'll typically vanish:
void unused(void)
{
int c = 1234;
}
x86-64 assembly from gcc -Os:
unused:
ret
If it is used, it may not be necessary to put it into the function's frame (its local variables on the stack)—it might be possible to directly embed it into an assembly instruction, or "use it as an immediate":
void take_int(int);
void used_as_an_immediate(int d)
{
int c = 1234;
take_int(c*d);
}
x86-64 assembly from gcc -Os:
used_as_an_immediate:
imul edi, edi, 1234
jmp take_int
If it is used as a true local, it'll need to be loaded into stack-allocated space:
void take_ptr(int*);
void used(int d)
{
int c = 1234;
take_ptr(&c);
}
x86-64 assembly from gcc -Os:
used:
sub rsp, 24
lea rdi, [rsp+12]
mov DWORD PTR [rsp+12], 1234
call take_ptr
add rsp, 24
ret
When pondering these things Compiler Explorer along with some basic knowledge of assembly are your friends.
TL;DR: Your example declares and initializes an automatic variable. It has to be initialized each time the function is called, so there will be some instruction to do this.
As an adjusted duplicate of my answer to How compile time initialization of variables works internally in c?:
The standard defines no exact way of initialization. It depends on the environment your code is developed and run on.
How variables are initialized also depends on their storage duration. You didn't mention it in the text, but your example is an automatic variable. (Which is most probably optimized away, as commenters point out.)
Initialized automatic variables will be written each time their declaration is reached. The compiled program executes some machine code for this to happen.
Static variables are always initialized, and only once, before program startup.
Examples from the real world:
Most (if not all) PC systems store the initial values of explicitly (and not zero-) initialized static variables in a special section called data that is loaded by the system's loader to RAM. That way those variables get their values before the program startup. Static variables not explicitly initialized or with zero-like values are placed in a section bss and are filled with zeroes by the startup code before program startup.
Many embedded systems have their program in non-volatile memory that can't be changed. On such systems the startup code copies the initial values of the section data into its allocated space in RAM, producing a similar result. The same startup code zeroes also the section bss.
Note 1: The sections don't have to be named like this, but it is common.
This startup code might be part of the compiled program, or might not; it depends, see above. But speaking of efficiency, it doesn't matter which program initializes the variables. It just has to be done.
Note 2: There are more kinds of storage duration, please see chapter 6.2.4 of the standard.
As long as the standard is met, a system is free to implement any other kind of initialization, including writing the initial values into their variables step by step.
First, it's important to have a common understanding of the word 'compiler'; otherwise we could end up arguing endlessly.
In simple words,
a compiler is a computer program that translates computer code written
in one programming language (the source language) into another
programming language (the target language). The name compiler is
primarily used for programs that translate source code from a
high-level programming language to a lower level language (e.g.,
assembly language, object code, or machine code) to create an
executable program.
(Ref: Wikipedia)
With this common understanding, let's now find the answer to your question. The answer is 'yes, the final code contains an explicit instruction to set the initial value' for any kind of variable. This is because the variables ultimately either live in some memory location, or they live in a CPU register when there are few enough of them that they can all be accommodated in registers, as with your code snippet running on, let's say, most modern servers (side note: different systems have different numbers of registers).
For the variables that are stored in registers, there has to be a mov (or equivalent) instruction to load the initial value into the register. Without such an instruction the assigned register cannot hold the intended value.
For the variables that are stored in memory, depending on the architecture and the compiler's efficiency, the initial value has to somehow be stored to the given/assigned address, which takes at least a couple of assembly instructions.
Does this answer your question?

C Function to Step through its own Assembly

I am wondering if there is a way to write a function in a C program that walks through another function to locate the address at which a given instruction occurs.
For example, I want to find the address that the ret instruction is used in the main function.
My first thought is to make a loop that begins at &main and increments the address by 1 each iteration, until the instruction at the current address is ret, and then returns that address.
It is certainly possible to write a program that disassembles machine code. (Obviously, this is architecture-specific. A program like this works only for the architectures it is designed for.) And such a program could take the address of its main routine and examine it. (In some C implementations, a pointer to a function is not actually the address of the code of the function. However, a program designed to disassemble code would take this into account.)
This would be a task of considerable difficulty for a novice.
Your program would not increment the address by one byte between instructions. Many architectures have a fixed instruction size of four bytes, although other sizes are possible. The x86-64 architecture (known by various names) has variable instruction sizes. Disassembling it is fairly complicated. As part of the process of disassembling an instruction, you have to figure out how big it is, so you know where the next instruction is.
In general, though, it is not always feasible to determine which return instruction is the one executed by main when it is done. Although functions are often written in a straightforward way, they may jump around. A function may have multiple return statements. Its code may be in multiple non-contiguous places, and it might even share code with other functions. (I do not know if this is common practice in common compilers, but it could be.) And, of course, main might never return (and, if the compiler detects this, it might not bother emitting a return instruction at all).
(Incidentally, there is a mathematical proof that it is impossible to write a program that always determines whether a program terminates or not. This is called the Halting Problem.)

Economizing on variable use

I am working on some (embedded) device, and I recently started thinking about using less memory, in case the stack size isn't that big.
I have long functions (unfortunately).
And inside them I was thinking of saving space in this way.
Imagine there is code
1. void f()
2. {
3. ...
4. char someArray[300];
5. char someOtherArray[300];
6. someFunc(someArray, someOtherArray);
7. ...
8. }
Now, imagine, someArray and someOtherArray are never used in function f beyond line 6.
Would the following save some stack space?
1. void f()
2. {
3. ...
4. {//added
5. char someArray[300];
6. char someOtherArray[300];
7. someFunc(someArray, someOtherArray);
8. }//added
9. ...
10. }
nb: removed second part of the question
For the compiler proper, both are exactly the same, and thus it makes no difference. The preprocessor replaces all instances of TEXT1 with the string constant.
#define TEXT1 "SomeLongStringLiteral"
someFunc(TEXT1)
someOtherFunc(TEXT1)
After the preprocessor's job is done, the above snippet becomes
someFunc("SomeLongStringLiteral");
someOtherFunc("SomeLongStringLiteral");
Thus it makes no difference performance or memory-wise.
Aside: The reason #define TEXT1 "SomeLongStringLiteral" is done is to have a single place to change all instances of TEXT1's usage; but that's a convenience only for the programmer and has no effect on the produced output.
recently I just started thinking maybe to use less memory, in case stack size isn't that big.
Never micro-optimise or prematurely optimise. In case the stack size isn't that big, you'll get to know it when you benchmark/measure it. Don't make any assumptions when you optimise; 99% of the time they'd be wrong.
I am working on some device
Really? Are you? I wouldn't have thought that.
Now, imagine, someArray and someOtherArray are never used in f function beyond line 6. Would following save some stack space?
On a good compiler, it wouldn't make a difference. By the standard, it isn't specified if it saves or not, it isn't even specified if there is a stack or not.
But on a not so good compiler, the one with the additional {} may be better. It is worth a test: compile it and look at the generated assembler code.
it seems my compiler doesn't allow me to do this (this is C), so never mind...
But it should. What happens then? Maybe you are just confusing levels of {}...
I'll ask another one here.
Would better be a separate question...
someFunc("SomeLongStringLiteral");
someOtherFunc("SomeLongStringLiteral");
vs.
someFunc(TEXT1)
someOtherFunc(TEXT1)
A #define is processed before any compilation step, so it makes absolutely no difference.
If it happens within the same compilation unit, the compiler will tie them together anyway. (At least, in this case. On an ATXmega, if you use PSTR("whatever") for having them in flash space only, each occurrence of them will be put into flash separately. But that's a completely different thing...)
Modern compilers should push stack variables before they are used, and pop them when they are no longer needed. The old thinking with { ... } marking the start and end of a stack push/pop should be rather obsolete by now.
Since 1999, C allows stack variables to be declared anywhere and not just immediately after a {. C++ allowed this far earlier. Today, where the local variable is declared inside the scope has little to do with when it actually starts to exist in the machine code. And similarly, the } has little to do with when it ceases to exist.
So regarding adding extra { }, don't bother. It is premature optimization and only adds pointless clutter.
Regarding the #define it absolutely makes no difference in terms of efficiency. Macros are just text replacement.
Furthermore, from the generic point-of-view, data must always be allocated somewhere. Data used by a program cannot be allocated in thin air! That's a very common misunderstanding. For example, many people incorrectly believe that
int x = func();
if(x == something)
consumes more memory than
if(func() == something)
But both examples compile into identical machine code. The result of func must be stored somewhere, it cannot be stored in thin air. In the first example, the result is stored in a memory segment that the programmer may refer to as x.
In the second example, it is stored in the very same memory segment, taking up the same amount of space, for the same duration of program execution. The only difference is that the memory segment is anonymous and the programmer has no name for it. As far as the machine code is concerned, that doesn't matter, since no variable names exist in machine code.
And this would be why every professional C programmer needs to understand a certain amount of assembler. You cannot hope to ever do any kind of manual code optimization if you don't.
(Please don't ask two questions in one, this is really annoying since you get two types of answer for your two different questions.)
For your first question: putting {} around the use of a variable will probably not help. The lifetime of automatic variables that are not VLAs (see below) is not bound to the scope in which they are declared. So compilers may have a hard time figuring out how the use of the stack may be optimized, and maybe don't do such an optimization at all. In your case this is most likely so, since you are passing pointers to your data to a function that is perhaps not visible here. The compiler has no way to figure out whether there is a valid use of the arrays later on in the code.
I see two ways to "force" the compiler into optimizing that space: functions or VLAs. The first, functions, is simple: instead of putting the block around the code, put it in a static function. Function calls are quite optimized on modern platforms, and here the compiler knows exactly how it may clear the stack at the end.
The second alternative in your case is a VLA, a variable length array, if your compiler supports that C99 feature. Arrays whose size doesn't depend on a compile-time constant have a special rule for their lifetime: it ends exactly at the end of the scope where they are defined. Even a const-qualified variable can be used for the size:
{
    size_t const len = 300;
    char someArray[len];
    char someOtherArray[len];
    someFunc(someArray, someOtherArray);
}
At the end, on a given platform, you'd really have to inspect what assembler your compiler produces.

auto variable and register variable -- optimized the same?

I am reading APUE, and when I came to longjmp, this question came up. The book says that before optimization both auto variables and register variables are stored in memory, and after optimization they are stored in registers. But when I used objdump -S a.out, I found that both of them became immediate operands. So?
Your book was just simplifying. Even before optimizing there is no guarantee that variables are realized in memory. The difference between auto and register is just that you are not allowed to take the address of a register variable. The C compiler can do anything that behaves the same as the abstract machine.
That your compiler realizes these variables as immediates is an indication that the values you have there are small and are compile-time constants. So you probably could have declared them const or even as enum constants in the first place.
So the program is very simple, and compilers have gotten much smarter since the book was written.
So you used a different compiler, maybe on a different machine, probably with different levels of optimization, and you can conclude essentially nothing from this except that the behaviour of compilers varies, which makes it hard to write a text book that is accurate in every detail on all machines for all time.

Why do compilers create one variable "twice"?

I know this is a more "heavy" question, but I think it's interesting too. It was part of my previous questions about compiler functions, but back then I explained it very badly, and many answered just my first question, so here it is:
So, if my knowledge is correct, modern Windows systems use paging as a way to switch tasks and ensure that each task has an appropriate place in memory. So, every process gets its own address space starting from 0.
When multitasking takes effect, the kernel has to save all important registers to the task's stack, I believe, then save the current stack pointer, change the page table entry to switch to another process's physical address space, load the new process's stack pointer, pop the saved registers, and continue by jumping to the popped instruction pointer address.
Because of this nice feature (paging), every process thinks it has a nice flat memory within reach. So there are no far jumps, far pointers, memory segments or data segments. All is nice and linear.
But when there is no more segmentation for the process, why do compilers still create variables on the stack (or, when global, directly in another memory space) rather than directly in the program code?
Let me give an example. I have this C code: int a = 10;
which gets translated into (Intel syntax): mov [position of a], #10
But then you actually occupy more bytes in RAM than needed. Because the first few bytes are taken by the actual instruction, and after that instruction is done, there is a new byte containing the value 10.
Why, instead of this, when there is no need to switch any segment (thus slowing the process down), isn't the value of 10 just coded directly into the program like this:
xor eax,eax //just some instruction
10 //the value inserted into the program
call end //just some instruction
Because the compiler knows the exact position of every instruction, when operating with that variable it would just use its address.
I know that const variables do this, but they are not really variables when you cannot change them.
I hope I explained my question well, but I am still learning English, so forgive my syntactical and even semantic errors.
EDIT:
I have read your answers, and it seems that based on those I can modify my question:
So, someone here told me that a global variable is actually a piece of data attached directly to the program. I mean: when a variable is global, is it attached to the end of the program, or is it created at execution time just like a local one, but directly on the heap instead of on the stack?
If it is the first case (attached to the program itself), why do local variables even exist? I know you will tell me it's because of recursion, but that is not the case. When you call a function, you can push any memory space onto the stack, so there is no program there.
I hope you understand me: there is always inefficient use of memory when some value (even 0) is created on the stack by some instruction, because you need space in the program for that instruction and then for the actual variable. Like so: push #5 //instruction that creates a local variable holding the integer 5
And then this instruction just puts the number 5 on the stack. Please help me, I really want to know why it is this way. Thanks.
Consider:
local variables may have more than one simultaneous existence if a routine is called recursively (even indirectly in, say, a recursive descent parser) or from more than one thread, and these cases occur in the same memory context
marking the program memory non-writable and the stack+heap as non-executable is a small but useful defense against certain classes of attacks (stack smashing...) and is used by some OSs (I don't know if Windows does this, however)
Your proposal doesn't allow for either of these cases.
So there are no far jumps, far pointers, memory segments or data segments. All is nice and linear.
Yes and no. Different program segments have different purposes - despite the fact that they reside within flat virtual memory. E.g. data segment is readable and writable, but you can't execute data. Code segment is readable and executable, but you can't write into it.
why do compilers still create variables on the stack, [...] rather than directly in program code?
Simple.
The code segment isn't writable. For safety reasons, first. Second, most CPUs do not like having the code segment written to, as it breaks many existing optimizations used to accelerate execution.
The state of the function has to be private to the function, due to things like recursion and multi-threading.
isn't the value of 10 just coded directly into the program like this
Modern CPUs prefetch instructions to allow things like parallel execution and out-of-order execution. Putting garbage (to the CPU, that is garbage) into the code segment would simply diminish (or flat out cancel) the effect of these techniques. And they are responsible for the lion's share of the performance gains CPUs have shown in the past decade.
when there is no need to switch any segment
So if there is no overhead of switching segments, why then put that into the code segment? There is no problem with keeping it in the data segment.
Especially in case of read-only data segment, it makes sense to put all read-only data of the program into one place - since it can be shared by all instances of the running application, saving physical RAM.
Because the compiler knows the exact position of every instruction, when operating with that variable it would just use its address.
No, not really. Most code is relocatable or position-independent. The code is patched with real memory addresses when the OS loads it into memory. Actually, special techniques are used to avoid patching the code at all, so that the code segment too can be shared by all running application instances.
The ABI is responsible for defining how and what the compiler and linker are supposed to do for a program to be executable by a complying OS. I haven't seen the Windows ABI, but the ABIs used by Linux are easy to find: search for "AMD64 ABI". Even reading the Linux ABI might answer some of your questions.
What you are talking about is optimization, and that is the compiler's business. If nothing ever changes that value, and the compiler can figure that out, then the compiler is perfectly free to do just what you say (unless a is declared volatile).
Now if you are saying that you are seeing that the compiler isn't doing that, and you think it should, you'd have to talk to your compiler writer. If you are using VisualStudio, their address is One Microsoft Way, Redmond WA. Good luck knocking on doors there. :-)
Why isn't the value of 10 just coded directly into the program like this:
xor eax,eax //just some instruction
10 //the value inserted into the program
call end //just some instruction
That is how global variables are stored. However, instead of being stuck in the middle of executable code (which is messy, and not even possible nowadays), they are stored just after the program code in memory (in Windows and Linux, at least), in what's called the .data section.
When it can, the compiler will move variables to the .data section to optimize performance. However, there are several reasons it might not:
Some variables cannot be made global, including instance variables for a class, parameters passed into a function (obviously), and variables used in recursive functions.
The variable still exists in memory somewhere, and still must have code to access it. Thus, memory usage will not change. In fact, on the x86 ("Intel"), according to this page the instruction to reference a local variable:
mov eax, [esp+8]
and the instruction to reference a global variable:
mov eax, [0xb3a7135]
both take 1 (one!) clock cycle.
The only advantage, then, is that if every local variable is global, you wouldn't have to make room on the stack for local variables.
Adding a variable to the .data segment may actually increase the size of the executable, since the variable is actually contained in the file itself.
As caf mentions in the comments, stack-based variables only exist while the function is running - global variables take up memory during the entire execution of the program.
Not quite sure what your confusion is.
int a = 10; means: make a spot in memory, and put the value 10 at that memory address.
if you want a to be 10
#define a 10
though more typically
#define TEN 10
Variables have storage space and can be modified. It makes no sense to stick them in the code segment, where they cannot be modified.
If you have code with int a=10 or even const int a=10, the compiler cannot convert code which references 'a' to use the constant 10 directly, because it has no way of knowing whether 'a' may be changed behind its back (even const variables can be changed, although doing so is undefined behavior). For example, one way 'a' can be changed without the compiler knowing is if you have a pointer which points to 'a'. Pointers are not fixed at runtime, so the compiler cannot determine at compile time whether there will be a pointer which will point to and modify 'a'.
