Big array initialization issues in C

I need your help with three questions (which all relate, more or less, to the same subject).
1) I have a LARGE array of ints which is initialized in the following manner:
int arr [] = {.....}; // a lot of values!
Within the program there is only one function that uses this array, and only for read-only operations.
We have two options regarding this array:
a) Declare the array as a local array inside that function.
b) Declare it as a global array outside of this function.
How will the image file of the program differ between these two cases?
How will the program's execution speed be affected?
2) Regarding the TI MSP430 micro controller:
I have in my program a very large global array of C-style strings, as follows:
char *arr [] = {"string 1","string2",.......}; // a lot of strings
Usually, at the beginning of the main program I use a command to stop the watchdog timer.
As I understand it, this is needed in cases where, for example, a very large array has to be initialized. So my question is:
Is that the case here (with the large array of strings)? When does the array get initialized?
Will it matter if I declare it in a different manner?
3) How (if at all) will the answers to questions 1 and 2 differ in C++?
Thanks a lot,
Guy.

"How will the image file of the program be modified for each of these cases?":
If you declare it as a local (automatic) variable, the initial values still have to be stored somewhere in the executable, so the image does not get smaller. In addition, every time you call the function, the values are copied onto the stack before the rest of the function code is executed (unless the compiler manages to optimize the copy away).
If you declare it as a global variable, the values are stored in the executable image once and there are no additional data-copy operations on every call (at most, the startup code copies the image values into RAM once, before main() runs).
So option #2 is at least as good in terms of size, and better in terms of performance.
Please also note that with the first option a large array may well overflow the stack at runtime, which on a small microcontroller corrupts other data or crashes the program. To avoid that, you would have to increase the size of your stack, typically defined in your project's linker-command file (*.lcf). Increasing the stack takes RAM away from everything else, hence option #1 is no better than option #2 in any respect, leaving you with only one sensible choice (declaring it as a global variable).
Another point is that you might want to declare this array as const, for two reasons:
It will prevent runtime errors (and give you compilation errors instead), should you ever attempt to change values within this read-only array.
It will tell the linker to allocate this array in the RO section of your program, which is typically mapped to non-volatile memory (flash or FRAM on most MSP430 devices). If you choose not to use const, then the linker will allocate this array in the RW section of your program, which is mapped to RAM. So this consideration is really a matter of which memory you are shorter of, RAM or non-volatile memory. If you're not sure, then you can check it in your project's linker-command file, or in the map file that is generated every time you build the project.
"When does the global array get initialized?":
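The initial values are fixed at build time and stored in the executable image.
If the array ends up in the read-only section (for example because it is declared const), nothing needs to happen at runtime. If it ends up in the read-write section, the C startup code copies the initial values from the image into RAM before main() is called. With a very large array that copy takes time, and because it happens before main() it also happens before the line in main() that stops the watchdog. So a watchdog reset during startup is possible here; another possible cause is a memory access violation that crashes your program.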
NOTE:
An executable program is typically divided into three sections:
Code (read-only) section, which contains the code and all the constant variables.
Data (read-write) section, which contains all the non-constant global/static variables.
Stack (read-write) section, where all the other variables are allocated during runtime.
The size and base-address of each section, can be configured in the linker-settings of the project (or in the linker-command file of the project).

You have a third option; declare it within the function, but with the static keyword:
void func()
{
static int arr[] = {...};
...
}
This will set aside storage in a different memory segment (depends on the architecture and the executable file format; for ELF, it will use the .data segment), which will be initialized at program startup and held until the program terminates.
Advantages:
The array is allocated and initialized once, at program startup, rather than every time you enter the function;
The array name is still local to the function, so it's not visible to the rest of the program;
The array size can be quite a bit larger than if allocated on the stack;
Disadvantages:
If the array is not truly read-only, but is updated by the function, then the function is no longer re-entrant;
Note that if the array is truly meant to be read-only, you might want to declare it as:
static const int arr[] = {...};
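A minimal sketch of the static const variant. The function name lookup_square and the small table are hypothetical stand-ins for the real function and its large array:

```c
#include <stddef.h>

/* Hypothetical example: lookup_square() and its table are illustrative names. */
int lookup_square(size_t i)
{
    /* Set up once rather than on every call; const allows the toolchain
       to place the table in read-only memory where it supports that. */
    static const int squares[] = {0, 1, 4, 9, 16, 25, 36, 49};
    const size_t count = sizeof squares / sizeof squares[0];

    return (i < count) ? squares[i] : -1;   /* -1 signals an out-of-range index */
}
```

The table name stays private to the function, and callers only see the lookup, for example lookup_square(3).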

Related

How memory allocation of variables or data in a program are done by compiler and OS

Want to get an overview on a few things about how exactly the memory for a variable is allocated.
In C programming,
Taking the context of "auto" variables, which are allocated on the stack section, I have the following question:
Does the compiler generate a logical address for the variables? If yes, then how? Won't the compiler need OS permission to generate or assign such addresses? If no, then is there some sort of indication or instruction that the compiler puts in the code segment asking the OS to allocate memory when running the executable?
Now taking the context of heap allocated variables,
Is the heap of the same size for all programs? If not, then does the executable consist of a header or something that tells the OS how much heap space it needs for dynamic allocation?
I'd be grateful if someone provides the answer or shares any related content/links that explains this.
Memory for the stack (most implementations use a stack for objects with automatic storage duration) and for objects with static storage duration is allocated during program load and startup.
Does the compiler generate a logical address for the variables? If yes, then how?
I do not know what you mean by "logical address", but compilers do calculate the references to objects with automatic storage duration. How? The compiler simply knows how far from the stack pointer address each such object is located (its offset).
Generally the same applies to objects with static storage duration and to the code: the compiler only calculates the offset from the start of their sections.
Is the heap of the same size for all programs?
It is implementation defined.
A method typically used in operating systems is that, when a program is starting, there is a piece (or collection) of software used that loads the program. The program loader reads the executable file and sets up memory for the program.
Part of the executable file says what size stack should be allocated for it. Most often, this is set by default when linking the program. (It is 8 MiB for macOS, 2 MiB for Linux, and 1 MiB for Windows.) However, it can be changed by asking the linker to set a different size.
The program loader calls operating system routines to request virtual memory be mapped. It does this for the stack and for other parts of the program, such as the code sections (the parts of memory that contain, mostly, the executable instructions of the program), and the initialized and uninitialized data. When it starts the program, it tells the program where the stack starts by putting that address into a designated register (or similar means).
One of the processor registers is used as a stack pointer; it points to the address within the memory allocated for the stack that is the current top of stack. When the compiler arranges to use stack space for objects, it generates instructions that adjust the stack pointer. The addresses for the objects are calculated relative to the stack pointer. If a function needs 128 bytes of data, the compiler generates an instruction that subtracts 128 from the stack pointer. (This may occur in multiple steps, such as “call” and “push” instructions that make some changes to the stack pointer plus an additional “subtract” instruction that finishes the changes.) Then the addresses of all the objects in this stack frame are calculated as offsets from the value of the stack pointer. For example, by taking the stack pointer and adding 40, we calculate the address of the object that has been assigned to be 40 bytes higher than the top of the stack.
(There is some confusion about the wording of directions here because stacks commonly grow from high addresses to low addresses. The program loader may allocate some chunk of memory from, say, address 0x123000000 to 0x124000000. The stack pointer will start at 0x124000000. Subtracting 128 will make it 0x123FFFF80. Then 0x123FFFFA8 is an address that is 40 bytes “higher” than 0x123FFFF80 in the address space, but the “top of stack” is below that. That is because the term “top of stack” refers to the model of physically stacking things on top of each other, with the latest thing on top.)
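The address arithmetic in this example can be written out as a small sketch. The two helper functions are purely illustrative (real programs never compute the stack pointer this way; the compiler emits the equivalent instructions):

```c
#include <stdint.h>

/* Illustrative helpers only: they model the arithmetic, not a real stack. */

/* Stack pointer after a function reserves frame_bytes on a downward-growing stack. */
uint64_t sp_after_reserve(uint64_t sp, uint64_t frame_bytes)
{
    return sp - frame_bytes;
}

/* Address of an object placed offset bytes above the current top of stack. */
uint64_t object_address(uint64_t sp, uint64_t offset)
{
    return sp + offset;
}
```

With the numbers from the text, sp_after_reserve(0x124000000, 128) gives 0x123FFFF80, and object_address(0x123FFFF80, 40) gives 0x123FFFFA8.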
The so-called “heap” is not the same size for all programs. In typical implementations, the memory management routines call system routines to request more virtual memory when they need it.
Note that “heap” is properly a word for a general data structure. Heaps may be used to organize things other than available memory, and the memory management routines keep track of available memory using data structures other than heaps. When referring to memory allocated via the memory management routines, you can call it “dynamically allocated memory.” It may also be shortened to “allocated memory,” but that can be confusing in some situations since all memory that has been reserved for some use is allocated memory.
Some background first
In C programming, Taking the context of "auto" variables, which are allocated on the stack section ...
To understand my answer, you first need to know how the stack works.
Imagine you write the following function:
int myFunction()
{
return function1() + function2() + function3();
}
Unfortunately, you do not use C as programming language but you use a programming language that neither supports local variables nor return values. (This is how most CPUs work internally.)
You may return a value from a function in a global variable:
function1()
{
result = 1234; // instead of: return 1234;
}
And your program may now look the following way if you use a global variable instead of local ones:
int a;
myFunction()
{
function1();
a = result;
function2();
a += result;
function3();
result += a;
}
Unfortunately, one of the three functions (e.g. function3()) may call myFunction() (so the function is called recursively) and the variable a is overwritten when calling function3().
To solve this problem, you may define an array for local variables (myvars[]) and a variable named mypos. In the example, the elements 0...mypos in myvars[] are used; the elements (mypos+1)...(MAX_LOCALS-1) are free:
int myvars[MAX_LOCALS];
int mypos;
...
myFunction()
{
function1();
mypos++;
myvars[mypos] = result;
function2();
myvars[mypos] += result;
function3();
result += myvars[mypos];
mypos--;
}
By changing the value of mypos from 10 to 11 (as an example), your program indicates that the element myvars[11] is now in use and that the functions being called shall store their data in elements myvars[x] with x>=12.
This is exactly how the stack works.
Typically, the "variable" mypos is not a variable but a CPU register named "stack pointer". (However, there are a few historic CPUs where an ordinary variable was used for this!)
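The scheme above can be tried out as real C. This is only a sketch: the values stored by function1() and function2(), the depth counter that makes function3() recurse twice, and the run_demo() wrapper are all assumptions added for illustration.

```c
#define MAX_LOCALS 64

static int myvars[MAX_LOCALS];  /* plays the role of the stack memory */
static int mypos = -1;          /* plays the role of the stack pointer; -1 = empty */
static int result;              /* "return value" passed through a global */
static int depth;               /* hypothetical recursion counter, not in the text */

static void myFunction(void);

static void function1(void) { result = 1; }
static void function2(void) { result = 20; }

/* Recurses into myFunction() until depth reaches 2, then "returns" 300. */
static void function3(void)
{
    if (depth < 2) {
        depth++;
        myFunction();
        depth--;
    } else {
        result = 300;
    }
}

static void myFunction(void)
{
    function1();
    mypos++;                    /* reserve one slot for this call */
    myvars[mypos] = result;
    function2();
    myvars[mypos] += result;
    function3();                /* may recurse; nested calls use higher slots */
    result += myvars[mypos];
    mypos--;                    /* release the slot */
}

/* Runs the outermost call and returns the final "return value". */
int run_demo(void)
{
    depth = 0;
    mypos = -1;
    myFunction();
    return result;
}
```

Because every nested call works in its own slot of myvars[], the recursion no longer overwrites the partial sums of the outer calls, and mypos is back at its starting value when run_demo() returns.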
The actual answers
Does the compiler generate a logical address for the variables?
In the example above, the compiler will perform a mypos+=3 if there are 3 local variables. Let's say they are named a, b and c.
The compiler simply replaces a by myvars[mypos-2], b by myvars[mypos-1] and c by myvars[mypos].
On most CPUs, the stack pointer (named mypos in the example) is not an index into an array but already a pointer (comparable to int * mypos;), so the compiler would replace a by *(mypos-2) instead of myvars[mypos-2] in the example.
For global variables, the compiler simply counts the number of bytes needed for all global variables. In the simplest case, it chooses a range of memory of the same size (e.g. 0x10000...0x10123) and places the variables there.
Won't the compiler need OS permission to generate or assign such addresses?
No.
The "stack" (in the example this is the array myvars[]) is already provided by the OS and the stack pointer (mypos in the example) is also set to a correct value by the OS before the program is started.
Your program knows that the elements myvars[x] with x>mypos can be used.
For global variables, the information about the range used by global variables (e.g. 0x10000...0x10123) is stored in the executable file. The OS must ensure that this memory range can be used by the program. (For example by configuring the MMU accordingly.)
If this is not possible, the OS will simply refuse to start the program with an error message.
... asking the OS to allocate memory when running the executable?
For variables on the stack:
There may be operating systems where this is done.
However, in most cases, the program will simply crash with a "stack overflow" if too much stack is needed. In the example, this would mean: The program crashes if an element myvars[x] with x>=MAX_LOCALS is accessed.
Now taking the context of heap allocated variables ...
Please first note that global variables are not stored on the heap.
The heap is used for data allocated using malloc() and similar functions (new in C++ for example).
Under many operating systems, malloc() calls an operating system function - so it is actually the operating system that allocates memory on the heap.
... and if there is not enough space, the OS (and malloc()) will return NULL.
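A short sketch of what this means for calling code: the result of malloc() has to be checked before use. The helper name make_filled is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates count ints on the heap and sets each to value.
   Returns NULL if the allocator cannot provide the memory. */
int *make_filled(size_t count, int value)
{
    int *p = malloc(count * sizeof *p);

    if (p == NULL) {
        return NULL;            /* the caller must handle the failure */
    }
    for (size_t i = 0; i < count; i++) {
        p[i] = value;
    }
    return p;                   /* the caller releases it with free() */
}
```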
Does the compiler generate a logical address for the variables? If yes, then how?
Yes, but they are relative to the stack pointer at function entry time (which is normally saved as a constant base pointer, stored in a CPU register). This is because the function can be recursive, and you can have two calls to the function with different instances of that variable (each relative to a different copy of the base pointer). The compiler assigns the variable an offset from the base pointer, but the base pointer itself can differ, depending on the stack contents at function entry time.
Won't the compiler need OS permission to generate or assign such addresses?
Nope, the compiler just generates an executable in the format and form needed for the operating system to manage process' memory. When the program starts, it is given normally three (or more) segments of memory:
Text segment. A normally read-only (or execute only) segment that gives no write access to the program. This is normally because the text segment is shared between all programs that are using the same executable at the same time. A program can demand exclusive read-write access to the text (to allow programs that modify their own executable code) but this happens only rarely. This is normally specified to the compiler and the compiler writes a special flag in the text segment to inform the kernel of this requirement.
Data segment. A read-write segment that can be grown by means of a system call (sbrk(2)). This is used for global variables and the heap (although in modern systems the heap is usually allocated in a new segment acquired by calling the mmap(2) system call). Sometimes this segment is divided in two: a read-only data segment for constants (so the program receives a signal if you try to change the value of a constant) and a read-write segment, freely usable by the program. This is where global variables are stored.
Stack segment. A read-write segment, that is allocated for the process to use as the stack segment. It has the capability of growing in one direction as the process starts using it. When the process accesses the data one memory page below the start of the segment, it generates a page fault trap that results in a new page being appended to the segment, so its workings are transparent to the process. This is the memory we are talking about.
The process can ask the kernel explicitly for a new segment if it wants to (let's say it needs to map some file into memory, or it has to load a shared executable/library). On some systems, the read-only variables (declared as const) are explicitly stored in the text segment or in a specific section called .rodata that demands from the system a special data segment that is read-only. The compiler doesn't normally code this kind of resource itself, it is normally encoded in the program being compiled.
The total memory is limited by system-imposed limits, so if you try to exceed them (around 8 MB of stack space by default, depending on the operating system) the system will signal your program and abort it.
As you see, the process memory is owned by the process, and it can make whatever use of it that it is permitted to. The stack memory is read/write and allocated on demand; you can use up to 8 MB, but there's no provision to check what use you make of it.
If no, then is there some sort of indication or instruction that the compiler puts in the code segment asking the OS to allocate memory when running the executable?
The system knows the size of the text segment of the process from the size it has in the executable. The data segment is divided into two parts, normally separating the initialized global variables from the ones defaulting to zero (the memory allocated by the kernel to a process is initialized to zeros for security reasons), so the sizes of the initialized data section and the uninitialized data section are added to know how much memory to assign to the data segment. The stack segment is initially assigned just one page of memory, but as the process starts running and filling the stack, it grows as the process generates page faults on the so-called next page of the stack segment. As you see, there's no need for the compiler to embed in the code any instruction to ask for more memory. Everything is read from the executable file.
The compiler runs as a normal program. It only generates all this information and writes it to a file (the executable file) so that the kernel knows the resources needed to run the program. The compiler's own communication with the kernel is limited to asking it to open files, read the source code, write the output, and rack its brains to achieve its task. :)
In most POSIX systems, the kernel loads a program in memory by means of the exec*(2) system calls. The kernel reads the executable file pointed to in a parameter of the call and creates the segments above mentioned, based on the parameters passed in the file, checks if another instance of the same program is running in the system to avoid loading the instructions from the file and referencing in this process the segment already open by the other. The data segment contents is initialized to zeros, and the contents of the initialization data are read into the segment (so the first part has the .data section of initialized global variables and the .bss section, which has only a size, is used to calculate the total size of the data segment). Then the stack is normally allocated one or more pages, depending on the initial contents that the exec() calls put in the initial stack. The initial stack is filled with:
a structure of data containing references to the program parameter list, which was used on legacy systems to tell the kernel which command line parameters to show in the ps(1) command output (this is still generated for legacy purposes, but not used by the kernel for obvious security reasons). Today, a special system call is used to tell the kernel the command line parameters to show in the ps(1) output.
a snippet of machine code to use in the return from a system call to allow the execution (in user mode) of any signal handler that should be executed (this is the reason for the requirement that all signal handlers are called when the kernel returns from kernel mode and switches back again to user mode, and not otherwise)
the environment of the process.
the array of pointers to environment strings.
the command line parameters.
the array of char pointers that point to the command line parameters.
the envp array reference for main().
the argv array reference to the command line parameters.
the argc counter of the number of command line parameters.
Once all this data is pushed onto the stack, the program jumps to the start address (fixed by the linker, or by the user through a linker option) and is allowed to start running.
Before the program jumps to main(), the executed code is part of the C runtime, which loads a special shared executable (called /lib/ld.so or similar) that is responsible for searching for and loading all the shared libraries that are linked to the program. Not all programs have this feature (although almost all of them today are dynamically linked), but IMHO this is outside the scope of this question, as the program has already started and is running.

Organization of Virtual Memory in C

For each of the following, where does it appear to be stored in memory, and in what order: global variables, local variables, static local variables, function parameters, global constants, local constants, the functions themselves (and is main a special case?), dynamically allocated variables.
How can I evaluate this experimentally, i.e., using C code?
I know that
global variables -- data
static variables -- data
constant data types -- code
local variables(declared and defined in functions) -- stack
variables declared and defined in main function -- stack
pointers(ex: char *arr,int *arr) -- data or stack
dynamically allocated space(using malloc,calloc) -- heap
You could write some code to create all of the above, and then print out their addresses. For example:
#include <stdio.h>

void func(int a) {
    int i = 0;
    /* %p expects a void *; %x with a pointer is undefined behaviour */
    printf("local i address is %p\n", (void *) &i);
    printf("parameter a address is %p\n", (void *) &a);
}

int main(void) {
    func(1);
    /* converting a function pointer to void * is a common extension, not strictly portable C */
    printf("func address is %p\n", (void *) &func);
    return 0;
}
Note the function address is a bit tricky: you have to cast it to void *, and when you take the address of a function you omit the (). Compare the memory addresses and you will start to get a picture of where things are. Normally text (instructions) is at the bottom (closest to 0x0000), the heap is in the middle, and the stack starts at the top and grows down.
In theory
Pointers are no different from other variables as far as memory location is concerned.
Local variables and parameters might be allocated on the stack or directly in registers.
constant strings will be stored in a special data section, but basically the same kind of location as data.
numerical constants themselves will not be stored anywhere, they will be put into other variables or translated directly into CPU instructions.
for instance int a = 5; will store the constant 5 into the variable a (the actual memory is tied to the variable, not the constant), but a *= 5 will generate the code necessary to multiply a by the constant 5.
main is just a function like any other as far as memory location is concerned. A local main variable is no different from any other local variable, main code is located somewhere in code section like any other function, argc and argv are just parameters like any others (they are provided by the startup code that calls the main), etc.
code generation
Now if you want to see where the compiler and runtime put all these things, a possibility is to write a small program that defines a few of each, and ask the compiler to produce an assembly listing. You will then see how each element is stored.
For heap data, you will see calls to malloc, which is responsible for interfacing with the dynamic memory allocator.
For stack data, you will see references relative to the stack or frame pointer (the esp and ebp registers on 32-bit x86), which are used both for parameters and for (automatic) local variables.
For global/static data, you will see labels named after your variables.
Constant strings will probably be labelled with an awful name, but you will notice they all go into a read-only data section (often named .rodata or .rdata) that will be linked next to the code or data.
runtime addresses
Alternatively, you can run this program and ask it to print the addresses of each element. This, however, will not show you the register usage.
If you take a variable's address, you force the compiler to put it in memory, whereas it could otherwise have kept it in a register.
Note also that the memory organization is compiler and system dependent. The same code compiled with gcc and MSVC may have completely different addresses and elements in a completely different order.
The code optimizer is likely to do strange things too, so I advise compiling your sample code with all optimizations disabled first.
Looking at what the compiler does to gain size and/or speed might be interesting though.
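As a sketch of the runtime-address approach, the following function prints the address of one object from each category. All names are illustrative, and the actual addresses and their ordering depend on the compiler, the system, and address randomization:

```c
#include <stdio.h>
#include <stdlib.h>

int g_init = 42;            /* initialized global */
int g_zero;                 /* zero-initialized global */
const int g_const = 7;      /* global constant */

/* Prints one address per storage category.
   Returns 0 on success, -1 if malloc fails. */
int print_addresses(int param)
{
    static int s_local = 1; /* static local */
    int local = 2;          /* automatic local */
    int *heap = malloc(sizeof *heap);

    if (heap == NULL) {
        return -1;
    }
    printf("initialized global: %p\n", (void *)&g_init);
    printf("zeroed global:      %p\n", (void *)&g_zero);
    printf("global constant:    %p\n", (void *)&g_const);
    printf("static local:       %p\n", (void *)&s_local);
    printf("automatic local:    %p\n", (void *)&local);
    printf("parameter:          %p\n", (void *)&param);
    printf("heap object:        %p\n", (void *)heap);
    free(heap);
    return 0;
}
```

Calling print_addresses(0) from main() and grouping the printed values by their leading digits usually makes the separate memory regions visible.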

How a pointer initialization C statement in global space gets its assigned value during compile/link time?

The background of this question is to understand how the compiler/linker deals with the pointers when it is initialized in global space.
For e.g.
#include <stdio.h>
int a = 8;
int *p = &a;
int main(void) {
printf("Address of 'a' = %p", (void *)p);
return 0;
}
Executing the above code prints the exact address for a.
My question here is: at which stage (compile or link?) does the pointer p get the address of a? It would be nice if your explanation included the equivalent assembly code of the above program and showed how the compiler and linker deal with the pointer initialization int *p = &a; in global space.
P.S: I could find lot of examples when the pointer is declared and initialized in local scope but hardly for global space.
Thanks in advance!
A startup module (often named crt0.o) is linked along with your program code; it is responsible for setting up the environment for a C program. It contains code that initializes global and static variables where needed and that is executed before main is called.
The actual addresses of the global variables are determined by the operating system when it loads the executable and performs the necessary relocations so that the new process can be executed.
To run a program, the system has to load it into RAM. So it creates one huge memory block containing the actual compiled instructions. This block usually also contains a "data section" which contains strings etc. If you declare a global variable, what compilers usually do is reserve space for that variable in such a data section (there's usually several, non-writable ones for strings, and writable ones for globals etc.).
Whenever you reference the global, the compiler just records the offset from the current instruction to that global. So an instruction can just calculate [current instruction address] + [offset] to get at the global, wherever it ended up being loaded. Since space in the data section has been reserved in the file anyway, the compiler can write any (constant) value you want in there, and it will get loaded with the rest of the code.
This is how it works in C, and is why C only allows constants. C++ works like Devolus wrote, where there is extra code that is run before main(). Effectively they rename the main function and give you a function that does the setup, then calls your main function. This allows C++ to call constructors.
There are also some optimizations like, if a global is initialized to zero, it usually just gets an offset in a "zero" section that doesn't exist in the file. The file just says: "After this code, I want 64 bytes of zeroes". That way, your file doesn't waste space on disk with hundreds of "empty" bytes.
It gets a tad more complicated if you have dynamically loaded libraries (dylibs or DLLs), where you have two segments loaded into separate memory blocks. Since neither knows where in RAM the other one ended up, the executable file contains a list of name -> offset mappings. When you load a library, the loader looks up the symbol in the (already loaded) other library and calculates the actual address at which e.g. the global is at, before main() is called (and before any of the constructors run).
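A small sketch of the consequence for the program in the question: by the time any of your code runs, p already holds the address of a. The check function is a hypothetical name added for illustration.

```c
int a = 8;
int *p = &a;    /* address constant: filled in by the linker/loader, not by your code */

/* Returns 1 if p already points at a and a holds its initial value.
   Intended to be called before anything has modified either variable. */
int pointer_is_preinitialized(void)
{
    return p == &a && *p == 8;
}
```

An initializer such as int *q = some_function(); would be rejected at file scope in C, because it is not a constant expression that the linker or loader could resolve.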

How do global variables contribute to the size of the executable?

Does having global variables increase the size of the executable? If yes how? Does it increase only the data section size or also the text section size?
If I have a global variable and initialization as below:
char g_glbarr[1024] = {"jhgdasdghaKJSDGksgJKASDGHKDGAJKsdghkajdgaDGKAjdghaJKSDGHAjksdghJKDG"};
Now, does this add 1024 to the data section and the size of the initialization string to the text section?
If, instead of allocating space for this array statically, I malloc it and then do a memcpy, will only the data section size be reduced, or will the text section size be reduced as well?
Yes, it does. Basically compilers store them in the data segment. Sometimes if you use a constant char array in your code (like printf("<1024 char array goes here");) it will go to the data segment (AFAIK some old compilers /Borland?/ may store it in the text segment). You can force the compiler to put a global variable in a custom section (for VC++ it was #pragma data_seg(<segment name>)).
Dynamic memory allocation doesn't affect data/text segments, since it allocates memory in the heap.
The answer is implementation-dependent, but for sane implementations this is how it works for variables with static storage duration (global or otherwise):
Whenever the variable is initialized, the whole initialized value of the object will be stored in the executable file. This is true even if only the initial part of it is explicitly initialized (the rest is implicitly zero).
If the variable is constant and initialized, it will be in the "text" segment, or equivalent. Some systems (modern ELF-based, maybe Windows too?) have a separate "rodata" segment for read-only data to allow it to be marked non-executable, separate from program code.
Non-constant initialized variables will be in the "data" segment in the executable, which is mapped into memory in copy-on-write mode by the operating system when the program is loaded.
Uninitialized variables (which are implicitly zero as per the standard) will have no storage reserved in the executable itself, but a size and offset in the "bss" segment, which is created at program load-time by the operating system.
Such uninitialized variables may be created in a separate read-only "bss"-like segment if they're const-qualified.
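The first and fourth points can be illustrated with a short sketch (array and function names are hypothetical). Which section each array lands in is up to the implementation; on toolchains that ship a size utility, comparing its output with and without the initializer shows the effect.

```c
#include <stddef.h>

char g_partial[1024] = {"abc"};   /* explicit initializer covers only 4 bytes */
char g_unset[1024];               /* no initializer: implicitly all zero */

/* The object is 1024 bytes regardless of how short the initializer is. */
size_t partial_size(void)
{
    return sizeof g_partial;
}

/* Returns 1 if every byte of g_partial from index `from` to the end is zero. */
int tail_is_zero(size_t from)
{
    for (size_t i = from; i < sizeof g_partial; i++) {
        if (g_partial[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```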
I am not speaking as an expert, but I would guess that simply having that epic string literal in your program would increase the size of your executable. What you do with that string literal doesn't matter, because it has to be stored somewhere.
Why does it matter which "section" of the executable is increased? This isn't a rhetorical question!
The answer is slightly implementation sensitive. If g_glbarr were declared as a pointer to char (char *g_glbarr = "..."), the string itself would be put into the data section with constant strings, and only a pointer-sized variable holding its address would be added. As declared, however, g_glbarr is an array of 1024 chars, so all 1024 bytes become part of the initialized data and the string is stored directly in them; typically there is no separate copy in the text section.
Update
@Jay, it's sorta kinda the same. The integers (usually) just are in-line: the compiler will come as close as it can to just putting the constant in the code, because that's such a common case that most normal architectures have a straightforward way of doing it from immediate data. The string constants will still be in some read-only data section. So when you make something like:
// warning: I haven't compiled this and wouldn't normally
// do it quite this way so I'm not positive this is
// completely grammatical C
struct X {int a; char * b; } x = { 1, "Hello" } ;
the 1 becomes "immediate" data, the "Hello" is allocated in read-only data somewhere, and the compiler will just generate something that allocates a piece of read-write data that looks something like
x:
x.a: WORD 1
x.b: WORD #STR42
where STR42 is a symbolic name for the location of the string "Hello" in memory. Then when everything is linked together, the #STR42 is replaced with the actual virtual address of the string in memory.
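A compilable variant of that struct, as a sketch only: the const on the pointer and the helper retarget are my additions, not part of the answer above. It illustrates that the struct object is ordinary read-write data, while the characters of the literal sit wherever the implementation keeps string constants.

```c
struct X {
    int a;
    const char *b;   /* const added here: the literal itself must not be modified */
};

/* x is read-write data; it holds the value 1 and the address of "Hello". */
struct X x = { 1, "Hello" };

/* Hypothetical helper: repoint b at another string.
   Only the pointer stored inside the struct changes; no characters are copied. */
void retarget(struct X *obj, const char *text)
{
    obj->b = text;
}
```

After retarget(&x, "World"), x.b holds the address of a different literal and the bytes of "Hello" are untouched.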

Where are constant variables stored in C?

I wonder where constant variables are stored. Is it in the same memory area as global variables? Or is it on the stack?
How they are stored is an implementation detail (depends on the compiler).
For example, in the GCC compiler, on most machines, read-only variables, constants, and jump tables are placed in the text section.
Depending on the data segmentation that a particular processor follows, we have five segments:
Code segment - stores only code (ROM)
Data segment - stores initialised global and static variables
Stack segment - stores all the local variables and other information, such as function return addresses
Heap segment - all dynamic allocations happen here
BSS (Block Started by Symbol) segment - stores uninitialised global and static variables
Note that the difference between the data and BSS segments is that the former stores initialised global and static variables and the latter stores uninitialised ones.
Now, why am I talking about data segmentation when I should just be telling you where constant variables are stored? There's a reason for it.
Every segment has a write-protected region where all the constants are stored.
For example:
If I have a const int which is a local variable, then it is stored in the write-protected region of the stack segment.
If I have a global const variable that is initialised, then it is stored in the data segment.
If I have an uninitialised const variable, then it is stored in the BSS segment.
To summarize, "const" is just a data QUALIFIER: the compiler first decides which segment the variable is to be stored in, and then, if the variable is const, it qualifies to be stored in the write-protected region of that particular segment.
Consider the code:
void totherfunc(const int *p); /* assumed to be defined elsewhere */
const int i = 0;
static const int k = 99;
int function(void)
{
    const int j = 37;
    totherfunc(&j);
    totherfunc(&i);
    //totherfunc(&k);
    return(j+3);
}
Generally, i can be stored in the text segment (it's a read-only variable with a fixed value). If it is not in the text segment, it will be stored beside the global variables. Given that it is initialized to zero, it might be in the 'bss' section (where zeroed variables are usually allocated) or in the 'data' section (where initialized variables are usually allocated).
If the compiler is convinced that k is unused (which it could be, since it is local to a single file), it might not appear in the object code at all. If the call to totherfunc() that references k was not commented out, then k would have to be allocated an address somewhere - it would likely be in the same segment as i.
The constant (if it is a constant, is it still a variable?) j will most probably appear on the stack of a conventional C implementation. (If you were asking in the comp.std.c news group, someone would mention that the standard doesn't say that automatic variables appear on the stack; fortunately, SO isn't comp.std.c!)
Note that I forced the variables to appear because I passed them by reference - presumably to a function expecting a pointer to a constant integer. If the addresses were never taken, then j and k could be optimized out of the code altogether. To remove i, the compiler would have to know all the source code for the entire program - it is accessible in other translation units (source files), and so cannot as readily be removed. Doubly not if the program indulges in dynamic loading of shared libraries - one of those libraries might rely on that global variable.
(Stylistically - the variables i and j should have longer, more meaningful names; this is only an example!)
Depends on your compiler, your system capabilities, your configuration while compiling.
gcc puts read-only constants in the .text section, unless instructed otherwise.
Usually they are stored in read-only data section (while global variables' section has write permissions). So, trying to modify constant by taking its address may result in access violation aka segfault.
But it depends on your hardware, OS and compiler really.
Of course not, because:
1) The bss segment stores uninitialized variables, and it has two kinds of section:
(I) Large static and global variables that are non-constant and uninitialized are stored in the .BSS section.
(II) Small static and global variables that are non-constant and uninitialized are stored in the .SBSS section, which is included in the .BSS segment.
2) The data segment stores initialized variables, and it has three kinds of section:
(I) Large static and global variables that are initialized and non-constant are stored in the .DATA section.
(II) Small static and global variables that are non-constant and initialized are stored in the .SDATA1 section.
(III) Small static and global variables that are constant, whether initialized or uninitialized, are stored in the .SDATA2 section.
"Small" and "large" above depend on the compiler; for example, small may mean less than 8 bytes and large may mean 8 bytes or more.
But my doubt is: where are local constants stored?
This is mostly an educated guess, but I'd say that constants are usually stored in the actual CPU instructions of your compiled program, as immediate data. So in other words, most instructions include space for the address to get data from, but if it's a constant, the space can hold the value itself.
This is specific to Win32 systems.
It's compiler dependent, but please be aware that it may not even be stored at all, since the compiler may simply optimize it away and put its value directly into the expressions that use it.
I added this code to a program and compiled it with gcc for an ARM Cortex-M4; check the difference in memory usage.
Without const:
int someConst[1000] = {0};
With const:
const int someConst[1000] = {0};
Global and constant are two completely separate properties. A variable can have one or the other, neither or both.
Where your variable, then, is stored in memory depends on the configuration. Read up a bit on the heap and the stack, that will give you some knowledge to ask more (and if I may, better and more specific) questions.
It may not be stored at all.
Consider some code like this:
#include <math.h> /* M_PI is provided by POSIX, not by ISO C */
double toRadian(int degree){
    return degree * (M_PI * 2 / 360.0);
}
This lets the programmer see what is going on, but the compiler can optimize some of it away, and most compilers do, by evaluating constant expressions such as M_PI * 2 / 360.0 at compile time, which means that the value of M_PI itself may not be in the resulting program at all.
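One way to see compile-time evaluation at work is to use the expression where C requires a constant expression. The sketch below uses hypothetical macro names and writes the value of pi out by hand, since ISO C's <math.h> does not define one. Because an initializer for a static object must be a constant expression, the implementation has to be able to work the factor out during translation.

```c
/* Hypothetical macros; pi is spelled out because ISO C does not define it. */
#define PI_VALUE 3.14159265358979323846
#define RADIANS_PER_DEGREE (PI_VALUE * 2 / 360.0)

/* A static initializer must be a constant expression, so this
   product and quotient are evaluated during translation. */
static const double radians_per_degree = RADIANS_PER_DEGREE;

/* Convert whole degrees to radians using the precomputed factor. */
double to_radians(int degree)
{
    return degree * radians_per_degree;
}
```

Only the folded factor needs to exist in the program; whether it ends up in a read-only section or inside an instruction is up to the compiler.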
Just as an add-on: as you know, the memory layout of the final executable is decided during the linking process. There is one more section, called COMMON, in which the common symbols from the different input files are placed. This COMMON section actually falls under the .bss section.
Some constants aren't even stored.
Consider the following code:
int x = foo();
x *= 2;
Chances are that the compiler will turn the multiplication into x = x+x; as that reduces the need to load the number 2 from memory.
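A minimal sketch of that idea, with hypothetical function names: both functions below compute the same result, and an optimizing compiler is free to emit the same machine code for them (for instance an add or a shift) without ever keeping the constant 2 in memory. Whether it actually does so depends on the compiler and its optimization settings.

```c
/* Written with an explicit constant multiplier. */
int times_two(int x)
{
    return x * 2;
}

/* The form a compiler might choose instead: no constant operand needed. */
int add_to_self(int x)
{
    return x + x;
}
```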
I checked on an x86_64 GNU/Linux system. By writing through a pointer to the 'const' variable, the value can be changed. I used objdump and didn't find the 'const' variable in the text segment; the 'const' variable is stored on the stack.
In C, 'const' is enforced by the compiler: it reports an error when it comes across a statement that changes a 'const' variable.

Resources