A highly voted previous answer's highly voted comment states:
consider having many uninitialized buffers 4096 bytes in length. Would
you want all of those 4k buffers to contribute to the size of the
binary? That would be a lot of wasted space.
I am building the following two files into an executable on ubuntu:
main.c
int sum(int *a, int n);
int array[2] = {1,2};
int abc;//Comment in case (a) Uncomment in case (b) and (c)
int def;//Comment in case (a) and (b) Uncomment in case (c)
int main(){
int val = sum(array, 2);
return val;
}
sum.c
int sum(int *a, int n){
int i, s = 0;
for(i = 0; i < n; i++)
s += a[i];
return s;
}
The following command is used to create the executable
$gcc -Og -o prog main.c sum.c
There are 3 cases:
(a) has no uninitialized global variable. The executable has size 8648 bytes.
(b) has uninitialized global variable abc. The executable has size 8680 bytes.
(c) has uninitialized global variables abc and def. The executable has size 8704.
My question is, why does the executable size even change? My understanding (also confirmed by the answer linked to above) was that uninitialized global variables should NOT affect executable size.
let's imagine I have this dynamically allocated 2D array:
//Example of a 3 row * 2 columns int array
int (*arr)[2] = malloc(sizeof(int[3][2]));
However, then I found that if I do:
arr[0][5] = 1;
The compiler does not complain, and neither does valgrind in my tests. Valgrind only complains when I access memory beyond the end of the allocated block.
I found that the same happens for automatic arrays:
int arr[3][2];
arr[0][5] = 1; //Code works without errors
My question now is: what's the point of having for example declared: int arr[3][2]; if the compiler will accept arr[0][5] = 1; anyway?
I'm using GCC compiler
In general, don't write past the bounds of memory that you've allocated.
Clang will warn about both examples by default, while GCC will warn about neither without the variables actually being used (that's the fault of the dead code eliminator). You can enable the warning with -O2 -Wall -Wextra if the variable is used or is declared volatile.
With GCC and Clang it's sort of "safe" to do this; the same thing will happen each time.
However, this is undefined behavior, so it's a bad idea. It's entirely valid for a program that does this to make your computer grow legs and walk away.
An equivalent way of doing the assignment would be:
arr[2][1] = 1;
This relies on the array elements being stored sequentially in memory.
So, &arr[0][5] is technically the same address as &arr[2][1], but it shouldn't be used.
My advice:
int arr[3][2];
int x, y;
for( x = 0; x < 3; x++ )
for( y = 0; y < 2; y++ )
arr[x][y] = x * y;
This is guaranteed to be safe.
On my PC with GCC 8.1.0:
#include <stdio.h>
#include <stdlib.h>
int main(){
int i,j;
int (*arr)[2] = malloc(sizeof(int[3][2]));
printf("%p %zu %zu\n",(void*)arr,sizeof(int),sizeof(int[3][2])); //sizeof yields size_t, so use %zu
//in my computer print
//00C63E38 4 24
//legal memory from 00C63E38~00C63E4C
for(i=0;i<3;i++){
for(j=0;j<2;j++){
printf("%p ",&arr[i][j]);
}
printf("\n");
}
//00C63E38 00C63E3C
//00C63E40 00C63E44
//00C63E48 00C63E4C
printf("------------------\n");
for(i=0;i<3;i++){
for(j=0;j<2;j++){
printf("%p ",*(arr+i)+j);
}
printf("\n");
}
//00C63E38 00C63E3C
//00C63E40 00C63E44
//00C63E48 00C63E4C
//So &arr[i][j] is equal to *(arr+i)+j
printf("-------------\n");
for(i=0;i<6;i++){
printf("%p ",arr+i);
printf("\n");
}
printf("-------------\n");
//each step advances by 4*2 bytes (one row) from 00C63E38; addresses past 00C63E4C are outside the allocation
//00C63E38
//00C63E40
//00C63E48
//00C63E50
//00C63E58
//00C63E60
for(i=0;i<6;i++){
printf("%p ",arr[0]+i);
printf("\n");
}
//each step advances by 4 bytes (one int) from 00C63E38
//00C63E38
//00C63E3C
//00C63E40
//00C63E44
//00C63E48
//00C63E4C
free(arr);
return 0;
}
I have two C files:
main.c
#include <stdio.h>
int sum(int n);
double array[2] = { 0.001, 1.0001 };
int main()
{
int val = sum(2);
printf("%d\n", val);
return 0;
}
sum.c
extern int array[2];
int sum(int n)
{
int i, ret = 0;
for (i = 0; i < n; i++) {
ret += array[i];
}
return ret;
}
I compile and link the files and then run the executable, but I am getting some unexpected output:
306318409
Why is that happening?
C Standard section 6.2.7/2 says
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
So anything could have happened. That's what you get when you lie to the compiler.
This really helps to bring out the difference between compiling and linking.
When these two files were compiled:
main.c
declares a function sum() with return type int.
Defines a variable array of type double[2] in its list of symbols.
sum.c:
Declares array as extern, so the compiler leaves it to be resolved at link time and adds it to its list of undefined symbols.
Defines a function sum that uses array with type int[2]; the compiler recorded that type independently for this translation unit.
Now at link time:
The sum() call in main.c is resolved to the definition in sum.c.
The extern symbol array in sum.c is resolved to the double[2] object defined in main.c. The linker matches symbols by name only, so this does not change how sum() accesses array, which was fixed at compile time.
An array of doubles typically stores each element in 8 bytes, while sum() assumes 4-byte elements. So array[0] reads the first 4 bytes and array[1] reads the next 4 bytes, and both of those belong to the 8-byte representation of 0.001 stored in array[0] of main.c. That produces the unexpected value, and formally it is undefined behavior.
It is said that 8 MB of stack is given to each process.
This stack is used to store local variables.
So if I declare arrays larger than the stack, it must overflow?
int main()
{
int arr[88388608];
int arr1[88388608];
int arr2[88388608];
while(1);
return 0;
}
But I am unable to get that result!
Welcome to the world of optimizing compilers!
Because of the as-if rule, the compiler is only required to build something that would have same observable results as your original code.
So the compiler is free to:
remove the unused arrays
remove the empty loop
store the automatic arrays from main outside of the stack - because main is a special function that shall be called only once by the environment
If you want to observe the stack overflow (the bad one, not our nice site :-) ),
you should:
use some code to fill the arrays
compile with all optimizations disabled, preferably in debug mode, to tell the compiler: do what I wrote as accurately as you can
The following code crashes with SIGSEGV with Clang 3.4.1 when compiled as cc -g foo.c -o foo
#include <stdio.h>
#define SIZE 88388608
void fill(int *arr, size_t size, int val) {
for (size_t i=0; i<size; i++) {
arr[i] = val;
}
}
int main() {
int arr[SIZE];
int arr1[SIZE];
int arr2[SIZE];
fill(arr, SIZE, 0);
fill(arr1, SIZE, 0);
fill(arr2, SIZE, 0);
printf("%d %d %d\n", arr[12], arr1[15], arr2[18]);
return 0;
}
and even this code works fine when compiled at the -O2 optimization level... Compilers are now too clever for me, and I'm not brave enough to thoroughly look at the assembly code, which would be the only real way to understand what is actually executed!
I am getting a strange result using global variables. This question was inspired by another question. In the code below if I change
int ncols = 4096;
to
static int ncols = 4096;
or
const int ncols = 4096;
the code runs much faster and the assembly is much simpler.
//c99 -O3 -Wall -fopenmp foo.c
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>
int nrows = 4096;
int ncols = 4096;
//static int ncols = 4096;
char* buff;
void func(char* pbuff, int * _nrows, int * _ncols) {
for (int i=0; i<*_nrows; i++) {
for (int j=0; j<*_ncols; j++) {
*pbuff += 1;
pbuff++;
}
}
}
int main(void) {
buff = calloc(ncols*nrows, sizeof*buff);
double dtime = -omp_get_wtime();
for(int k=0; k<100; k++) func(buff, &nrows, &ncols);
dtime += omp_get_wtime();
printf("time %.16e\n", dtime/100);
return 0;
}
I also get the same result if char* buff is an automatic variable (i.e. not global or static). I mean:
//c99 -O3 -Wall -fopenmp foo.c
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>
int nrows = 4096;
int ncols = 4096;
void func(char* pbuff, int * _nrows, int * _ncols) {
for (int i=0; i<*_nrows; i++) {
for (int j=0; j<*_ncols; j++) {
*pbuff += 1;
pbuff++;
}
}
}
int main(void) {
char* buff = calloc(ncols*nrows, sizeof*buff);
double dtime = -omp_get_wtime();
for(int k=0; k<100; k++) func(buff, &nrows, &ncols);
dtime += omp_get_wtime();
printf("time %.16e\n", dtime/100);
return 0;
}
If I change buff to be a short pointer then the performance is fast and does not depend on whether ncols is static or constant or on whether buff is automatic. However, when I make buff an int* pointer I observe the same effect as with char*.
I thought this may be due to pointer aliasing so I also tried
void func(int * restrict pbuff, int * restrict _nrows, int * restrict _ncols)
but it made no difference.
Here are my questions
When buff is either a char* or an int* global pointer, why is the code faster when ncols is static or const?
Why does buff being an automatic variable instead of global or static make the code faster?
Why does it make no difference when buff is a short pointer?
If this is due to pointer aliasing why does restrict have no noticeable effect?
Note that I'm using omp_get_wtime() simply because it's convenient for timing.
As written, some details of the code let GCC make different optimization decisions; the one with the most impact here is most likely loop vectorization. Therefore:
Why is the code faster?
The code is faster because its hot part, the loops in func, has been optimized with auto-vectorization. When ncols is qualified with static or const, GCC indeed emits:
note: loop vectorized
note: loop peeled for vectorization to enhance alignment
which is visible if you turn on -fopt-info-loop, -fopt-info-vec or combinations of those with a further -optimized since it has the same effect.
Why does buff being an automatic variable instead of global or static
make the code faster?
When buff is automatic, GCC is able to compute the number of iterations, which is intuitively necessary to apply vectorization. This is again due to the linkage of buff, which is external at file scope unless specified otherwise. With a global buff the whole vectorization is skipped immediately, unlike when buff is local, where it carries on and succeeds.
Why does it make no difference when buff is a short pointer?
Why should it? func accepts a char* which may alias anything.
If this is due to pointer aliasing why does restrict have no noticeable effect?
I don't think it is: GCC can see that they don't alias when func is invoked, so restrict isn't needed.
A const will most likely always yield faster or equally fast code as a read/write variable, since the compiler knows that the variable won't be changed, which in turn enables a whole lot of optimization options.
Declaring a file scope variable int or static int should not affect performance much, as it will still be allocated at the very same place: the .data section.
But as mentioned in comments, if the variable is global, the compiler might have to assume that some other file (translation unit) might modify it and therefore block some optimization. I suppose this is what's happening.
But this shouldn't be any concern anyhow, since there is never a reason to declare a global variable in C, period. Always declare them as static to prevent the variable from getting abused for spaghetti-coding purposes.
In general I'd also question your benchmarking results. In Windows you should be using QueryPerformanceCounter and similar.
https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx
I just want to understand the difference in RAM allocation.
Why do I get a RAM overflow if I define a variable before the function, while it is fine when I define it inside the function?
For example:
/*RAM OK*/
void Record(int16_t* current, int i,int n)
{
float Arr[NLOG2] = {0};
for(i=0;i<n;i++)
Arr[i]=current[i*5];
}
/*RAM OVERFLOW*/
static float Arr[NLOG2] = {0};
void Record(int16_t* current, int i,int n)
{
for(i=0;i<n;i++)
Arr[i]=current[i*5];
}
This is the message:
unable to allocate space for sections/blocks with a total estimated
minimum size of 0x330b bytes (max align 0x8) in
<[0x200000c8-0x200031ff]> (total uncommitted space 0x2f38).
The difference is that in the first case, Arr is declared on the stack; until the function is called, that array doesn't exist. The generated binary contains code for creating the array, but the array itself isn't in the binary.
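In the second case, however, Arr is declared outside of any function (aka at file scope). Therefore it has static storage duration: it always exists, and the linker must reserve RAM for it for the whole lifetime of the program. Because you appear to be working on an embedded platform, this otherwise insignificant difference causes your "RAM overflow" error. Stack usage is not checked at link time, so the first version links, although it may still overflow the stack at run time.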
In the 2nd case, the array is allocated when the application starts. It remains in the memory until the app quits.
In the 1st case, the array is only allocated when function void Record(int16_t* current, int i,int n) is called. The array is gone after the function finishes its execution.
The static keyword doesn't have any impact on a file-scope variable if you have only a single compilation unit (.o file).
Global variables (not static) are there when you create the .o file available to the linker for use in other files. Therefore, if you have two files like this, you get name collision on a:
a.c:
#include <stdio.h>
int a;
int compute(void);
int main()
{
a = 1;
printf("%d %d\n", a, compute());
return 0;
}
b.c:
int a;
int compute(void)
{
a = 0;
return a;
}
because the linker doesn't know which of the two globals named a to use.
However, when you define static globals, you are telling the compiler to keep the variable only for that file and not to let the linker know about it. So if you add static (in the definition of a) to the two sample codes I wrote, you won't get name collisions simply because the linker doesn't even know there is an a in either of the files:
a.c:
#include <stdio.h>
static int a;
int compute(void);
int main()
{
a = 1;
printf("%d %d\n", a, compute());
return 0;
}
b.c:
static int a;
int compute(void)
{
a = 0;
return a;
}
This means that each file works with its own a without knowing about the other ones.