Difference between memory addresses of variables is constant - c

I ran the following code and this is the output I got:
#include <stdio.h>
int main()
{
int x = 3;
int y = x;
printf("%d\n", &x);
printf("%d\n", &y);
getchar();
return 0;
}
Output:
3078020
3078008
Now, the output changes every time I run the program, but the difference between the location of x to the location of y is always 12. I wondered why.
Edit: I understand why the difference is constant. What I don't understand is why the difference is specifically 12, and why the memory address of y, which is defined later, is lower than x's.

The addresses themselves may well change due to protective measures such as address space layout randomisation (ASLR).
This is something done to mitigate the possibility of an attack vector on code. If it always loads at the same address, it's more likely that a successful attack can be made.
By moving the addresses around each time it loads, it becomes much more difficult for an attack to work everywhere (or anywhere, really).
However, because both variables will be on the stack in a single stack frame (assuming the implementation even uses a stack, which is by no means guaranteed), the difference between them is very unlikely to change.

In your code,
printf("%d\n", &x);
is not the correct way to print a pointer value. What you need is
printf("%p\n", (void *)&x);
That said, the difference between the addresses is constant because those two variables are usually placed in successive memory locations on the stack. However, AFAIK this is not guaranteed by the C standard. The only thing guaranteed is that sizeof(a) [the size of a datatype] is fixed for a particular platform.

To print the address of a variable you have to use %p, not %d.
To print the integer value itself, use %d.
printf("%d\n",x);// It will print the value of x
printf("%p\n",(void*)&x);// It will print the address of x.
From the man page of printf
p The void * pointer argument is printed in hexadecimal

Related

code order with variable length array

In C99, is there a big difference between these two?:
int main() {
int n , m;
scanf("%d %d", &n, &m);
int X[n][m];
X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}
and:
int main(int argc, char *argv[]) {
int n , m;
int X[n][m];
scanf("%d %d", &n, &m);
X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}
The first one seems to always work, whereas the second one appears to work for most inputs, but gives a segfault for the inputs 5 5 and 6 6 and returns a different value than 5 for the input 9 9. So do you need to make sure to get the values before declaring them with variable length arrays or is there something else going on here?
When the second one works, it's pure chance. The fact that it ever works proves that, thankfully, compilers can't yet make demons fly out of your nose.
Declaring a variable doesn't necessarily initialize it. int n, m; leaves both n and m with indeterminate values in this case, and attempting to read those values is undefined behavior. If the raw binary data in the memory those variables occupy happens to be interpreted as values larger than the ones you later enter for n and m -- which is very, very far from guaranteed -- then your code will work; if not, it won't. Your compiler could also have made this segfault, or made it melt your CPU; it's undefined behavior, so anything can happen.
For example, let's say that the area of memory that the compiler dedicates to n happened to contain the number 10589231, and m got 14. If you then entered an n of 12 and an m of 6, you're golden -- the array happens to be big enough. On the other hand, if n got 4 and m got 2, then your code will look past the end of the array, and you'll get undefined behavior -- which might not even break, since it's entirely possible that the bits stored in four-byte segments after the end of the array are both accessible to your program and valid integers according to your compiler/the C standard. In addition, it's possible for n and m to end up with negative values, which leads to... weird stuff. Probably.
Of course, this is all fluff and speculation depending on the compiler, OS, time of day, and phase of the moon,[1] and you can't rely on any numbers happening to be initialized to the right ones.
With the first one, on the other hand, you're assigning the values through scanf, so (assuming it doesn't error) (and the entered numbers aren't negative) (or zero) you're going to have valid indices, because the array is guaranteed to be big enough because the variables are initialized properly.
Just to be clear, the fact that variables are required to be zero-initialized under some circumstances doesn't mean you should rely on that behavior. You should always explicitly give variables a default value, or initialize them as soon as possible after their declaration (as when using something like scanf). This makes your code clearer, and prevents people from wondering whether you're relying on implicit initialization.
[1]: Source: Ryan Bemrose, in chat
int X[n][m]; means to declare an array whose dimensions are the values that n and m currently have. C code doesn't look into the future; statements and declarations are executed in the order they are encountered.
In your second code you did not give n or m values, so this is undefined behaviour which means that anything may happen.
Here is another example of sequential execution:
int x = 5;
printf("%d\n", x);
x = 7;
This will print 5, not 7.
The second one should produce bugs because n and m hold indeterminate, pretty much random values if they're local variables. If they were global, they would be initialized to 0.

In C, technical explanation for how %d recognizes variables in a string

I came across an interesting output and I'd like to know how the computer is working to produce this. I know that whenever you have %d in a string, you should have a variable to accompany it. When I wrote two %d's and only one variable, I expected that the computer would churn out the same value for both %d's, since it had only one variable to draw on, but for some reason, the %d's returned the value of x and the value of the variable xCubed. I want to know why the program prints xCubed without my writing xCubed at the end of the argument list. Here's the code:
#include <stdio.h>
int cube(int x);
int main(void){
int x = 5;
int xCubed = cube(x);
printf("Why does this number, %d, equal this number %d?", x);
return 0;
}
int cube(int x){
return x * x * x;
}
Thank you!
Your program invokes undefined behaviour. Anything could happen. Possibly the value returned from the call to cube happens to lie next to the value of x on the stack. Of course, this behaviour being undefined means that any change to your program, or your compiler options, could result in different behaviour.
In any case, you are expected to supply two values. Do so.
printf("Why does this number, %d, equal this number %d?", x, x);
If you compiled your program with full warnings then the compiler would have warned you of your error. And you could even ask your compiler to treat warnings as errors to stop you committing the mistake.
Your program causes undefined behaviour, so anything is possible. It's some quirk of stack/register layout and calling convention for your platform that gives you the results you see.
That is because xCubed happens to be allocated just after x, which means closer to the printf part of the stack (activation frame).
printf is a vararg function, it has no implicit way of knowing how many arguments it was passed. So, when you call printf with two placeholders but just one value supplied, it will read past the first argument expecting a second and "fall" into the stack of the caller, whose nearest content is exactly xCubed.
Just to be clear: this is the reason why your code exhibits that particular behaviour, not the way it is expected to work. You have a serious bug in your code.
This was by good luck. In effect, it is undefined behaviour.
Obviously, in your case the variable xCubed was put onto stack immediately after the free space. Upon doing the printf() call, x was put immediately before that, and then the address of the format string.
If you compile this program with other optimization settings, your compiler might decide to put xCubed somewhere else, or in a register, or omit it altogether, as its value is never used.

C memory management in gcc

I am using gcc version 4.7.2 on Ubuntu 12.10 x86_64.
First of all these are the sizes of data types on my terminal:
sizeof(char) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8
sizeof(long long) = 8
sizeof(float) = 4
sizeof(double) = 8
sizeof(long double) = 16
Now please have a look at this code snippet:
int main(void)
{
char c = 'a';
printf("&c = %p\n", &c);
return 0;
}
If I am not wrong we can't predict anything about the address of c. But each time this program gives some random hex address ending in f. So the next available location will be some hex value ending in 0.
I observed this pattern in case of other data types too. For an int value the address was some hex value ending in c. For double it was some random hex value ending in 8 and so on.
So I have 2 questions here.
1) Who is governing this kind of memory allocation ? Is it gcc or C standard ?
2) Whoever it is, Why it's so ? Why the variable is stored in such a way that next available memory location starts at a hex value ending in 0 ? Any specific benefit ?
Now please have a look at this code snippet:
int main(void)
{
double a = 10.2;
int b = 20;
char c = 30;
short d = 40;
printf("&a = %p\n", &a);
printf("&b = %p\n", &b);
printf("&c = %p\n", &c);
printf("&d = %p\n", &d);
return 0;
}
Now here what I observed is completely new for me. I thought the variable would get stored in the same order they are declared. But No! That's not the case. Here is the sample output of one of random run:
&a = 0x7fff8686a698
&b = 0x7fff8686a694
&c = 0x7fff8686a691
&d = 0x7fff8686a692
It seems that the variables get sorted in increasing order of their sizes and are then stored in that sorted order, while still following observation 1; i.e. the last variable (the largest one) gets stored in such a way that the next available memory location is a hex value ending in 0.
Here are my questions:
3) Who is behind this ? Is it gcc or C standard ?
4) Why to waste the time in sorting the variables first and then allocating the memory instead of directly allocating the memory on 'first come first serve' basis ? Any specific benefit of this kind of sorting and then allocating memory ?
Now please have a look at this code snippet:
int main(void)
{
char array1[] = {1, 2};
int array2[] = {1, 2, 3};
printf("&array1[0] = %p\n", &array1[0]);
printf("&array1[1] = %p\n\n", &array1[1]);
printf("&array2[0] = %p\n", &array2[0]);
printf("&array2[1] = %p\n", &array2[1]);
printf("&array2[2] = %p\n", &array2[2]);
return 0;
}
Now this is also shocking for me. What I observed is that an array is always stored at some random hex address ending in '0' if it has 2 or more elements; if it has fewer than 2 elements, its memory location follows observation 1.
So here are my questions:
5) Who is behind this storing an array at some random hex value ending at 0 thing ? Is it gcc or C standard ?
6) Now why to waste the memory ? I mean array2 could have been stored immediately after array1 (and hence array2 would have memory location ending at 2). But instead of that array2 is stored at next hex value ending at 0 thereby leaving 14 memory locations in between. Any specific benefits ?
The address at which the stack and the heap start is given to the process by the operating system. Everything else is decided by the compiler, using offsets that are known at compile time. Some of these things may follow an existing convention followed in your target architecture and some of these do not.
The C standard does not mandate anything regarding the order of the local variables inside the stack frame (as pointed out in a comment, it doesn't even mandate the use of a stack at all). The standard only bothers to define order when it comes to structs and, even then, it does not define specific offsets, only the fact that these offsets must be in increasing order. Usually, compilers try to align the variables in such a way that access to them takes as few CPU instructions as possible - and the standard permits that, without mandating it.
Part of the reason is that this is mandated by the application binary interface (ABI) specifications for your system & processor.
See the x86 calling conventions and the SVR4 x86-64 ABI supplement (I'm giving the URL of a recent copy; the latest original is surprisingly hard to find on the Web).
Within a given call frame, the compiler could place variables in arbitrary stack slots. It may try (when optimizing) to reorganize the stack at will, e.g. by decreasing alignment constraints. You should not worry about that.
A compiler tries to put local variables in stack locations with suitable alignment. See the alignof extension of GCC. Where exactly the compiler puts these variables is not important, see my answer here. (If it is important to your code, you really should pack the variables into a single common local struct, since each compiler, version and set of optimization flags could do different things; so don't depend on the precise behavior of your particular compiler.)

How is this loop ending and are the results deterministic?

I found some code and I am baffled as to how the loop exits, and how it works. Does the program produce a deterministic output?
The reason I am baffled is:
1. `someArray` is of size 2, but clearly, the loop goes till size 3,
2. The value is deterministic and the loop always exits when `someNumber` reaches 4
Can someone please explain how this is happening?
The code was not printing correctly when I put angle brackets <> around include's library names.
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
int main() {
int someNumber = 97;
int someArray[2] = {0,1};
int findTheValue;
for (findTheValue=0; (someNumber -= someArray[findTheValue]) >0; findTheValue++) {
}
printf("The crazy value is %d", findTheValue);
return EXIT_SUCCESS;
}
Accessing an array element beyond its bounds is undefined behavior. That is, the program is allowed to do anything it pleases, reply 42, eat your hard disk or spend all your money. Said in other words what is happening in such cases is entirely platform dependent. It may look "deterministic" but this is just because you are lucky, and also probably because you are only reading from that place and not writing to it.
This kind of code is just bad. Don't do that.
Depending on your compiler, someArray[2] may occupy the same memory as findTheValue!
Because these variables are declared one after another, it's entirely possible that they are positioned consecutively in memory (I believe on the stack). C doesn't really do any memory management or error checking, so someArray[2] just means the int stored 2 * sizeof(int) bytes past the address of someArray[0].
So when findTheValue is 0, we subtract 0; when findTheValue is 1, we subtract 1. When findTheValue is 2, we subtract findTheValue itself (2), leaving 94. If someNumber sits right after that, then when findTheValue is 3 we subtract someNumber (which is now 94), reach 0, and exit.
This behavior is by no means guaranteed. Don't rely on it!
EDIT: It is probably more likely that someArray[2] just points to garbage (unspecified) values in your RAM. These values are likely more than 93 and will cause the loop to exit.
EDIT2: Or maybe someArray[2] and someArray[3] are large negative numbers, and subtracting both causes someNumber to roll over to negative.
The loop exits because (someNumber -= someArray[findTheValue]) > 0 eventually becomes false.
Adding a debug line, you can see
value 0 number 97 array 0
value 1 number 96 array 1
value 2 number 1208148276 array -1208148180
That is printing out findTheValue, someNumber, and someArray[findTheValue].
It's not the answer I would have expected at first glance.
Checking addresses:
printf("&someNumber = %p\n", &someNumber);
printf("&someArray[0] = %p\n", &someArray[0]);
printf("&someArray[1] = %p\n", &someArray[1]);
printf("&findTheValue = %p\n", &findTheValue);
gave this output:
&someNumber = 0xbfc78e5c
&someArray[0] = 0xbfc78e50
&someArray[1] = 0xbfc78e54
&findTheValue = 0xbfc78e58
It seems that for some reason the compiler puts the array at the beginning of the stack area, then the variables that are declared below it, and then those that are declared above it, in the order they are declared. So someArray[2] effectively refers to findTheValue and someArray[3] to someNumber.
I really do not know the reason, but I tried gcc on Ubuntu 32 bit and Visual Studio with and without optimisation and the results were always similar.

Query on activation record in C

Below is some code that I do not understand.
#include<stdio.h>
int main(int argc, char *argv[])
{
int num;
printf("\n Number: " );
scanf("%d", &num);
if (num >= 0)
{
int abs = num;
}
else
{
int abs = -num;
}
{
int abs;
printf("\n Values are %d %d", num ,abs);
}
return 0;
}
When I enter a number as 4, the output is Values are 4 4
When I enter a number as -4, the output is Values are -4 4
I am not able to understand how it is able to print the absolute value. The variable abs defined in the if block and the else block should have been deallocated after exiting.
Kindly let me know.
Regards,
darkie
You are absolutely correct.
Do you see the last block, where int abs is declared for the last time? Notice that abs is not initialized, and using uninitialized variables yields undefined results. With your particular compiler, it just happens that you luck out, and the block of memory where the new abs sits still contains the result from its (expired) previous scope.
These variables are allocated on the stack, and you haven't modified it (you haven't left the function). So programmatically you get a "new" abs int in the last code block, but in reality this "new" abs sits in the place where the old abs was (on the stack!), so its initial value is the same.
That's called "undefined behavior".
You're getting "stack trash" when you declare the abs with the printf.
It works like this:
if (num >= 0) {
create 'abs' at memory address N, put 'num' in it.
destroy 'abs' // but leave the 'garbage' at memory address N
} else {
create 'abs' at memory address N, put '-num' in it.
destroy 'abs' // but leave the 'garbage' at memory address N
}
{
create 'abs' at memory address N, don't put anything in it.
// your compiler has decided it will reuse N. That's a valid choice.
// your compiler has decided it will not zero the memory at address N. That's valid.
read whatever was at 'abs'. // it's whatever was assigned in the conditional.
}
Always compile with -Wall :)
Hilarious code.
It relies on the fact that all three definitions of abs will be allocated on the same place on the stack, due to compiler optimization.
The third abs has to be random garbage; the garbage turns out to be the result of the previous variable with the same name (the name would not matter).
You're using an uninitialized value of abs in the printf. The C language standard doesn't require it to be anything in particular because it's uninitialized. It could be 0, or 1, or -32765.
In this particular case, you're probably getting the same number because the compiled code is reusing a register for the temporary values of abs, and that same register again for your abs variable in the printf block.
You can look at the disassembly code to see exactly what the compiler is doing in terms of machine instructions.
