#define N 100
Taking &N is not possible. How can I find the address of N? It must have some address.
A macro is a fragment of code which has been given a name. Whenever
the name is used, it is replaced by the contents of the macro.
So no, N is not a variable, and you can't take the address of N.
MACROS
Macros don't have addresses. A macro such as N expands to a constant value, not to an object in memory.
A macro is simple text substitution. If you write code like
int* ptr = &N;
it will then be pre-processed into
int* ptr = &100;
100 is an integer constant, and integer constants are rvalues (not lvalues), meaning you can't assign a value to one, nor can you take its address.
The value 100 is of course stored somewhere, since numbers can't be conjured out of thin air, but it will most likely be stored as part of the executable program code. In the binary, you'll have some machine code instruction amounting to "store 100 in the memory location of the pointer variable", and the 100 is encoded in that very instruction. Part of a machine code instruction isn't addressable from C.
It is simple text substitution. The preprocessor replaces all occurrences of N with 100, so what happens next depends on what N expands to. In your example, taking the address of a constant won't compile, but these two other examples do work.
#include <stdio.h>
#define N 100
#define M x
#define L "Hallo world!"
int main()
{
int x = 42;
//printf ("Address of 'N' is %p\n", (void*)&N); // error C2101: '&' on constant
printf ("Address of 'M' is %p\n", (void*)&M);
printf ("Address of 'L' is %p\n", (void*)&L);
return 0;
}
Program output:
Address of 'M' is 0018FF3C
Address of 'L' is 0040C018
More explanation:
With #define N 100 you can't get the address of N because a numerical constant like that does not have a single memory location. The value 100 might be stored into a variable, or the optimising compiler might load 100 directly into a processor register.
In the case of #define M x that's a simple substitution so that M can be used exactly as x can. There is no functional difference between &x and &M because the two statements are identical after the preprocessor has made the substitution.
In the case of #define L "Hallo world!" we have a string literal, which the compiler does place in memory. Asking for &L is the same as asking for &"Hallo world!" and that is what you get.
N is not a variable, so it never has an address. It's just a value that gets pasted in when you write something like int val = N;. You can then get the address of val using &.
Related
Could someone explain why the following code behaves differently depending on whether the first 3 printf calls are there or not?
#include <stdlib.h>
#include <stdio.h>
int main(void) {
    int a = 1;
    int b = 2;
    int c = 3;
    int* a_ptr = &a;
    // comment out the following 3 lines
    printf(" &a : %p\n", (void *)&a);
    printf(" &b : %p\n", (void *)&b);
    printf(" &c : %p\n", (void *)&c);
    printf("*(a_ptr) : %d\n", *(a_ptr));
    printf("*(a_ptr-1) : %d\n", *(a_ptr-1)); // reads outside the object a
    printf("*(a_ptr-2) : %d\n", *(a_ptr-2)); // reads outside the object a
    return 0;
}
I was playing around to learn how variables are laid out in memory depending on the order of declaration. By printing the addresses, I see they are one int apart if they are declared one after another. After printing the addresses, I just subtract 1 and 2 from the address of a and dereference the result. When I print it, it shows what I'd expect: the values of a, b and c. But if I do not print the addresses first, I just get a=1, while b and c seem random to me. What is going on here? Do you get the same behaviour on your machine?
// without printing the addresses
*(a_ptr) : 1
*(a_ptr-1) : 4200880
*(a_ptr-2) : 6422400
// with printing
&a : 0061FF18
&b : 0061FF14
&c : 0061FF10
*(a_ptr) : 1
*(a_ptr-1) : 2
*(a_ptr-2) : 3
The C standard does not define what happens in the code you show, notably because the address arithmetic a_ptr-1 and a_ptr-2 is not defined. For an array of n elements, pointer arithmetic is defined only for adjusting pointers to locations from that corresponding to index 0 to that corresponding to index n (which is one beyond the last element of the array). For a single object, pointer arithmetic is defined as if it were an array of one element, so only for adjusting a pointer to the locations corresponding to index 0 and index 1 (just beyond the object). a_ptr-1 and a_ptr-2 would point to elements at indices −1 and −2, and the C standard does not define the behavior for these.
However, what is happening in the experiments you tried is:
When the addresses of a, b, and c are printed, the compiler has to ensure there are addresses for it to print. So it assigns memory locations to a, b, and c. It happens that these are consecutive in memory and in reverse order, and therefore a_ptr-1 happened to point to b and a_ptr-2 happened to point to c. (However, compilers may assign locations in memory based on alphabetical order of the names rather than declaration order, based on alignment and size and other properties, based on some arbitrary hash it uses to organize the names in a data structure, and/or other factors. In particular, if you compiled without optimization, and you request high optimization instead, the order may change.)
When the addresses are not printed, the compiler has no need to assign memory locations for b and c because they are not used in the code. In this case, a_ptr-1 and a_ptr-2 do not point to b and c but point to locations in memory that have been used for other purposes.
I am trying to get C to interpret my string as a macro.
Suppose there is a macro defined as:
#define ABC 900
If I define:
char s[] = "ABC";
then:
printf("%d", s);
Is there any way to make the compiler treat "ABC" as the macro ABC and pass the integer value 900 to printf?
#include<stdio.h>
#define abc 15
int main(void) {
char a[] = "abc" ;
printf("%d",a);
return 0;
}
When I try the above code, instead of my desired output 15, I get 6487568, which I guess is the integer equivalent of that string.
Edit: those were random values, or the address of the string (as stated below by others).
No, what you're trying to do is doubly impossible. You can't access variables by name at runtime (string -> variable) because the compiled machine code knows nothing about the names in your C code, and the compiler proper knows nothing about macros either (they're expanded by the preprocessor before the compiler even sees the code).
In other words, compilation / execution happens in multiple stages:
1) C source code is preprocessed (which gets rid of directives like #include or #define and expands macros).
2) The preprocessed token stream is passed to the compiler, which converts it to machine code (a runnable program).
3) Finally the program runs.
Simplified example:
// original C code
#define FOO 42
...
int x = y + FOO;
After preprocessing:
...
int x = y + 42;
After compilation:
movl %ecx, %eax
addl $42, %eax
There is no trace of FOO in step 2, and the final code knows nothing about x or y.
Variable values such as strings only exist at runtime, in step 3. You can't get back to step 1 from there. If you wanted to access information about macros at runtime, you'd have to keep it explicitly in some sort of data structure, but none of this is automatic.
Macros are simple copy-and-paste, and they are pretty limited. A macro name is not expanded when it appears inside a string literal or a comment.
One solution would be:
#define ABC "900"
char s[] = ABC;
But no, macros cannot be used for what you're trying to do.
You wrote: "When I try the above code, instead of my desired output 15, I get 6487568, which I guess is the integer equivalent of that string."
Passing a pointer where %d expects an int is undefined behavior. Most likely the value you see is the address of the string. If you compile with -Wall, you will get a warning for this.
#include<stdio.h>
int main()
{
char *str[] = {"Frogs","Do","Not","Die.","They","Croak"};
printf("%c %c %c",*str[0],*str[1],*str[2]);//expected F D N
printf("\n%u %u %u",str[0],str[1],str[2]);//expected 1000 1006 1003
}
The expected output is based on the assumption that "Frogs" begins at address 1000.
The actual output is as follows:
F D N
2162395060 2162395057 2162395053
How is that possible? Here the addresses decrease from str[0] to str[2], and printing the addresses in str[3], str[4] and str[5] shows no pattern, only abrupt changes.
You are printing the addresses of three string constants. The compiler is under no obligation to organize the string constants in any predictable fashion.
The compiler is required to provide an array of pointers. The array can be accessed sequentially to obtain addresses of the string constants, but the string constants may be stored in any location which the compiler deems efficient or useful.
I ran the same code on macOS using AppleClang 10.0.0.10001044 and got the following output:
F D N
104431486 104431492 104431495
As you can see, the pointers are sequential using AppleClang.
However, that is irrelevant. Nothing in your code should depend on how the compiler chooses to allocate memory for the string constants.
I ran the following code and this is the output I got:
#include <stdio.h>
int main()
{
int x = 3;
int y = x;
printf("%d\n", &x);
printf("%d\n", &y);
getchar();
return 0;
}
Output:
3078020
3078008
Now, the output changes every time I run the program, but the difference between the location of x and the location of y is always 12. I wondered why.
Edit: I understand why the difference is constant. What I don't understand is why the difference is specifically 12, and why the memory address of y, which is defined later, is lower than that of x.
The addresses themselves may well change due to protective measures such as address space layout randomisation (ASLR).
This is something done to mitigate the possibility of an attack vector on code. If it always loads at the same address, it's more likely that a successful attack can be made.
By moving the addresses around each time it loads, it becomes much more difficult for an attack to work everywhere (or anywhere, really).
However, since both variables will be on the stack in a single stack frame (assuming the implementation even uses a stack, which is by no means guaranteed), the difference between them is very unlikely to change.
In your code,
printf("%d\n", &x);
is not the correct way to print a pointer value. What you need is
printf("%p\n", (void *)&x);
That said, the difference between the addresses is constant because those two variables are usually placed in nearby memory locations on the stack. However, as far as I know, this is not guaranteed by the C standard. The only thing guaranteed is that sizeof for a given data type is fixed on a particular platform.
To print the address of a variable you have to use %p, not %d.
To print an integer value, use %d.
printf("%d\n",x);// It will print the value of x
printf("%p\n",(void*)&x);// It will print the address of x.
From the man page of printf
p The void * pointer argument is printed in hexadecimal
I am using gcc version 4.7.2 on Ubuntu 12.10 x86_64.
First of all these are the sizes of data types on my terminal:
sizeof(char) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8
sizeof(long long) = 8
sizeof(float) = 4
sizeof(double) = 8
sizeof(long double) = 16
Now please have a look at this code snippet:
#include <stdio.h>
int main(void)
{
    char c = 'a';
    printf("&c = %p\n", (void *)&c);
    return 0;
}
If I am not wrong we can't predict anything about the address of c. But each time this program gives some random hex address ending in f. So the next available location will be some hex value ending in 0.
I observed this pattern in case of other data types too. For an int value the address was some hex value ending in c. For double it was some random hex value ending in 8 and so on.
So I have 2 questions here.
1) Who governs this kind of memory allocation? Is it gcc or the C standard?
2) Whoever it is, why is it so? Why is the variable stored in such a way that the next available memory location starts at a hex value ending in 0? Is there any specific benefit?
Now please have a look at this code snippet:
#include <stdio.h>
int main(void)
{
    double a = 10.2;
    int b = 20;
    char c = 30;
    short d = 40;
    printf("&a = %p\n", (void *)&a);
    printf("&b = %p\n", (void *)&b);
    printf("&c = %p\n", (void *)&c);
    printf("&d = %p\n", (void *)&d);
    return 0;
}
Now here what I observed is completely new for me. I thought the variable would get stored in the same order they are declared. But No! That's not the case. Here is the sample output of one of random run:
&a = 0x7fff8686a698
&b = 0x7fff8686a694
&c = 0x7fff8686a691
&d = 0x7fff8686a692
It seems that the variables get sorted in increasing order of their sizes and are then stored in that sorted order, while still following observation 1: the last variable (the largest one) is stored in such a way that the next available memory location is a hex value ending in 0.
Here are my questions:
3) Who is behind this? Is it gcc or the C standard?
4) Why waste time sorting the variables first and then allocating the memory, instead of directly allocating the memory on a first come, first served basis? Is there any specific benefit to sorting and then allocating memory?
Now please have a look at this code snippet:
#include <stdio.h>
int main(void)
{
    char array1[] = {1, 2};
    int array2[] = {1, 2, 3};
    printf("&array1[0] = %p\n", (void *)&array1[0]);
    printf("&array1[1] = %p\n\n", (void *)&array1[1]);
    printf("&array2[0] = %p\n", (void *)&array2[0]);
    printf("&array2[1] = %p\n", (void *)&array2[1]);
    printf("&array2[2] = %p\n", (void *)&array2[2]);
    return 0;
}
Now this is also shocking to me. What I observed is that an array is always stored at some random hex address ending in 0 if it has 2 or more elements; if it has fewer than 2 elements, it gets a memory location following observation 1.
So here are my questions:
5) Who is behind storing an array at some random hex address ending in 0? Is it gcc or the C standard?
6) Why waste the memory? I mean, array2 could have been stored immediately after array1 (and hence array2 would have a memory location ending in 2). Instead, array2 is stored at the next hex address ending in 0, thereby leaving 14 memory locations in between. Are there any specific benefits?
The address at which the stack and the heap start is given to the process by the operating system. Everything else is decided by the compiler, using offsets that are known at compile time. Some of these decisions follow an existing convention for your target architecture and some do not.
The C standard does not mandate anything regarding the order of the local variables inside the stack frame (as pointed out in a comment, it doesn't even mandate the use of a stack at all). The standard only bothers to define order when it comes to structs and, even then, it does not define specific offsets, only the fact that these offsets must be in increasing order. Usually, compilers try to align the variables in such a way that access to them takes as few CPU instructions as possible - and the standard permits that, without mandating it.
Some of this is mandated by the application binary interface (ABI) specifications for your system and processor.
See the x86 calling conventions and the SVR4 x86-64 ABI supplement.
Within a given call frame, the compiler could place variables in arbitrary stack slots. It may try (when optimizing) to reorganize the stack at will, e.g. by decreasing alignment constraints. You should not worry about that.
A compiler tries to put local variables at stack locations with suitable alignment. See the __alignof__ extension of GCC. Where exactly the compiler puts these variables is not important. (If it is important to your code, you really should pack the variables in a single common local struct, since each compiler, version, and set of optimization flags could do different things; so don't depend on the precise behavior of your particular compiler.)