C variable not where I expect to find it in memory - c

Can someone explain why printing the pointers to the two ints results in them being placed in different locations relative to the chars?
The piece of code below should print out the memory addresses from &a to &c which (I think) should include the two ints defined, but it doesn't. However, when I try to find out where they're stored in memory (see the second code segment), it does print them between the two chars as expected.
Please explain why printing the int pointers affects whether the ints are stored between the chars in memory.
The two code samples are the same except that code 2 has an extra line, printf("\n\n%p,%p\n",&i,&j);, which prints the pointers to the two ints.
Edit: Yes, I know the printf formatting is ugly, but the code was only to help me clarify how memory and pointers work, so I didn't need it to be pretty.
Code1
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char **argv){
char a='a';
int i=1;
int j=2;
char c='c';
char *pos;
for ( pos=&c; pos<=&a; pos++ ){
printf("%p\t",pos);
}
printf("\n");
for ( pos=&c; pos<=&a; pos++ ){
printf("%i\t\t",*pos);
}
}
Results from Code1
0x7ffde6321e7e 0x7ffde6321e7f
99 97
Code2
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char **argv){
char a='a';
int i=1;
int j=2;
char c='c';
char *pos;
for ( pos=&c; pos<=&a; pos++ ){
printf("%p\t",pos);
}
printf("\n");
for ( pos=&c; pos<=&a; pos++ ){
printf("%i\t\t",*pos);
}
printf("\n\n%p,%p\n",&i,&j);
}
Results from Code2
0x7ffc3575616b 0x7ffc3575616c 0x7ffc3575616d 0x7ffc3575616e 0x7ffc3575616f 0x7ffc35756170 0x7ffc35756171 0x7ffc35756172 0x7ffc35756173 0x7ffc35756174 0x7ffc35756175 0x7ffc35756176 0x7ffc35756177
99 2 0 0 0 1 0 0 0 -4 127 0 97
0x7ffc35756170,0x7ffc3575616c

You're relying on something (see Note 1) which is not specified in the C standard. The behaviour cannot be defined. It invokes undefined behavior (see Note 2).
That said, you should always cast the argument of %p to void *, as the expected type is void * and there's no default promotion for pointers.
Note 1:
C does not mention or guarantee the order in which variables / objects are allocated in a program. There's no guarantee that they will occupy consecutive memory locations, whether increasing or decreasing. They are allowed to be placed at arbitrary memory locations, so the assumption behind
for ( pos=&c; pos<=&a; pos++ )
does not hold. An implementation can place (reorder) variables however it sees fit. There's absolutely no guarantee that the order of memory addresses follows the order of definition in the code.
Note 2:
For relational operators, quoting C11, chapter §6.5.8 (emphasis mine):
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
So, in your case, the comparison pos<=&a is an attempt to compare two pointers which are none of the following:
pointers to the same object
pointers to members of the same aggregate object
pointers to elements of the same array
pointers to members of the same union object
In short, they are not within the defined scope and hence, using them as operands of a relational operator invokes undefined behaviour.

The location of local variables is left to the implementation. The compiler may put them in any order it deems best.
Making seemingly unrelated code changes such as an extra print statement or changing the optimization level can change how the compiler lays out the variables.
In short, you can't depend on any particular layout of variables in memory.

Local variables are placed on the stack (or in registers if possible and if their address is never taken). In your example, i is the first and j the second such local variable, so in effect you have push i, push j, and the address of the second, &j, is &i-1.

Related

C : If as I understand 0 and '\0' are the same, how does the compiler knows the size of an array when I write int my_array = {0};?

I am trying to create a function to copy an array into another using pointers. I'd like to add the following condition : if the array of destination is smaller, the loop must break.
So basically it's working, but it is not working if I initialize the destination array as follows:
int dest_array[10] = {0};
From what I understand it fills the array with int 0's which are equivalent to '\0' (null characters). So here is my question :
In this case how can the computer know the array size or when it ends ?
(And how do I compare arrays passed as parameters ?)
void copy(int *src_arr, int *dest_arr)
{
// The advantage of using pointers is that you don't need to provide the source array's size
// I can't use sizeof to compare the sizes of the arrays because it does not work on parameters.
// It returns the size of the pointer to the array and not of the whole array
int* ptr1;
int* ptr2;
for( ptr1 = src_arr, ptr2 = dest_arr ;
*ptr1 != '\0' ;
ptr1++, ptr2++ )
{
if(!*ptr2) // Problem here if dest_arr full of 0's
{
printf("Copy interrupted :\n"
"Destination array is too small");
break;
}
*ptr2 = *ptr1;
}
}
In C, it is impossible to know the length of an array inherently. This is due to the fact that an array is really just a contiguous chunk of memory, and the value passed to functions is really just a pointer to the first element in the array. As a result of this, to actually know the length of an array within a function other than the function where that array was declared, you have to somehow provide that value to the function. Two common approaches are the use of sentinel values which indicate the last element (similar to the way '\0', the null character, is per convention interpreted as the first character not part of a string in C), or providing another parameter which contains the array length.
As a very common example of this: if you have written any programs which use command-line parameters, then surely you are familiar with the common definition of int main(int argc, char *argv[]), which uses the second of the aforementioned approaches by providing the length of the argv array via the argc parameter.
The compiler has some ways to work around this for local variables. E.g., the following would work:
#include <stdio.h>
int main(){
int nums[10] = {0};
printf("%zu\n", sizeof(nums)/sizeof(nums[0]));
return 0;
}
Which prints 10 to STDOUT; however, this only works because the sizeof operation is done locally, and the compiler knows the length of the array at that point.
On the other hand, we can consider the situation of passing the array to another function:
#include <stdio.h>
int tryToGetSizeOf(int arr[]){
printf("%zu", sizeof(arr)/sizeof(arr[0]));
}
int main(){
int nums[10] = {0};
printf("%zu\n", sizeof(nums)/sizeof(nums[0]));
puts("Calling other function...");
tryToGetSizeOf(nums);
return 0;
}
This will end up printing the following to STDOUT:
10
Calling other function...
2
This may not be the value you're expecting, but this occurs due to the fact that the method signature int tryToGetSizeOf(int arr[]) is functionally equivalent to int tryToGetSizeOf(int *arr). Therefore, you are dividing the size of an integer pointer (int *) by the size of a single int; whereas while you're still in the local context of main() (i.e., where the array was defined originally), you are dividing the size of the allocated memory region by the size of the datatype that memory region is partitioned as (int).
An example of this is available on Ideone.
int* ptr1;
int* ptr2;
You lose size information when you refer to arrays through pointers. There is no way to determine the size of the array, i.e. the number of elements, using ptr1. You need another variable that holds the size of the array referred to by ptr1 (or ptr2).
Same holds for character arrays as well. Consider the below:
char some_string[100];
strcpy(some_string, "hello");
The approach you mentioned of checking for \0 (or 0) gives you the number of elements which are part of the string residing in some_string. In no way does it tell you the number of elements in some_string, which is 100.
To identify the size of the destination, you have to pass another argument giving its size.
There are other ways to identify the end of the array, but it is cleaner to pass the size explicitly rather than using some pointer hack like passing a pointer to the end of the array or using some invalid value as the last element of the array.
TL/DR - You will need to pass the array size as a separate parameter to your function. Sentinel values like 0 only mark the logical end of a sequence, not the end of the array itself.
Unless it is the operand of the sizeof or unary & operators, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array. So when you pass your source and destination arrays as arguments to copy, what the function actually receives is just two pointers.
There's no metadata associated with a pointer that tells it whether it's pointing to the first object in a sequence, or how long that sequence is [1]. A sentinel value like the 0 terminator in strings only tells you how long a logical sequence of values is, not the size of the array in which they are stored [2].
You will need to supply at least one more parameter to copy to tell it how large the target buffer is, so you stop copying when you've reached the end of the target buffer or you see a 0 in the source buffer, whichever comes first.
[1] The same is true for array objects - there's no runtime metadata in the array object to store the size or anything else. The only reason the sizeof trick works is that the array's declaration is in scope. The array object itself doesn't know how big it is.
[2] This is a problem for library functions like strcpy, which only receives the starting address for each buffer - if there are more characters in the source buffer than the target is sized to hold, strcpy will blast right past the end of the target buffer and overwrite whatever follows.

can you explain me why it is possible p[-1]?

int *p[10]={5,663,36,6};
*(p - 1) = 'e';
int c=*(p-1);
printf("%c",c);
i am not able to understand why we use negative number in array index
*(p - 1) = 'e';
For your example it would be undefined behaviour, but there are situations where you might want to use it, notably if your pointer was pointing to somewhere inside an array and you check that you are still inside the bounds of the array.
Like in this example...
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]){
char hello[]="worlld";
char *p;
for(p=hello;*p!='\0';p++) {
if ((p>hello) && (p[-1]==p[0])) {
printf("%c\n",p[0]);
}
}
return(0);
}
The language does not prevent you from using negative numbers when indexing an array or a pointer. This does not mean that it is always correct. In your example, it would access an array element one position before the beginning of the array; in other words, you would access an invalid memory address.
However, in a situation like the following, where p1 points to an element other than the first element of the array, you can use negative indexes:
int p[] = {1,2,3,4};
int *p1 = &p[1];
int x = *(p1-1);
int y = p1[-1]; // equivalent to the previous one
In both cases, x and y will be 1.
i am not able to understand why we use negative number in array index
That's because
you apparently think [] is an array operator, but it is not. It is a pointer operator, defined in terms of pointer arithmetic, which, in a general sense, permits subtracting integers from pointers.
you seem to have an expectation of some particular kind of behavior arising from evaluating your example code, but it exhibits undefined behavior on account of performing arithmetic on pointer p that does not produce a result pointing into (or just past the end of) the same object that p points [in]to. "Undefined" means exactly what it says. Although an obvious program failure or error message might be emitted, you cannot rely on that, or on any other particular behavior, and that applies to the entire program.

Two ways to initialize an array. What happens with each one?

There are two ways (at least) to initialize an array in C. What is the difference between
int array[] = {1,2,3,4,5,6,7,8,9} ;
and:
int array[100] = {1,2,3,4,5,6,7,8,9} ;
I do not mean in terms of memory allocation. Perhaps the thing that provoked this question would be useful for understanding it.
I wanted to get the length of an int array by iterating through it. Here is the code:
#include <stdio.h>
#include <stdlib.h>
int array[] = {1,2,3,4,5,6,7,8,9} ;
int i = 0 ; // i is length
while( array[i] ) {
printf("%d\n" , array[i] ) ;
i++ ;
}
printf("%d\n" , i) ;
And I noticed that when I used array[] the length sometimes was wrong because of some sort of overflow , but when I used array[100] the length was always right. What is the difference between these two?
Has it got something to do with '\0' character ?
When you create the array without specifying its size, the compiler infers it from the initializer (in this case, the length would be 9). The memory locations immediately after the array have unspecified contents since no one bothered giving them specific values, and that's why you get the "overflow" behavior. This is technically undefined behavior, but the result is a very common way for the compiler vendor to implement "undefined".
When you explicitly specify the size the compiler initializes the array with as many elements as you have provided, then fills the remaining space with zeroes.
In both cases the behavior is according to the standard.
You can programmatically get the size of the array with the sizeof operator. So in this case, you can do sizeof(array)/sizeof(int) to get the actual size. The sizeof operator is handled by the compiler, which will insert the correct size constant at compile time.
Note that iterating through the array until you get to a false result is undefined behavior and should not be done.
Pertaining to your original question, @Jon is correct; either array size specifier is correct and will yield the same results.
It is because there is no null terminating character in arrays of numbers, only in arrays of char that hold strings. So if your array were a C-style string, your code would have successfully found the length.

Calculate and output difference of two pointers

The output of the following c program is 1. Can someone please explain?
#include<stdio.h>
#include<string.h>
int main(){
int a = 5,b = 10,c;
int *p = &a,*q = &b;
c = p - q;
printf("%d" , c);
return 0;
}
The program invokes undefined behavior. Pointer subtraction has to be done with pointers to elements of the same array.
From the C Standard:
(C99, 6.5.6p9) "When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object [...]"
You are returning the difference between the memory locations of two successive stack-allocated ints, using int pointer arithmetic as your unit of measurement.
Technically the program behaviour is undefined. Rather pointless therefore to say any more.
The variables were allocated one after the other in the stack. Say on a machine with 4-byte sized integers the addresses may be say 1000 and 1004; the difference is 4 bytes; pointer arithmetic says that the difference should return the value in elements (not bytes), so number of elements (ints) between the address is 1 (integer).
However, this is valid only within the same array or one element past it, which isn't the case in your example, and thus is undefined.
C++11 §5.7/6:
When two pointers to elements of the same array object are subtracted,
the result is the difference of the subscripts of the two array
elements ... Unless both pointers point to elements of the same array
object, or one past the last element of the array object, the behavior
is undefined.
EDIT: Thanks ouah for making me lookup the C++ standard on the matter.

Does C99 guarantee that arrays are contiguous?

Following a heated comment thread in another question, I came to debate what is and what is not defined in the C99 standard about C arrays.
Basically, when I define a 2D array like int a[5][5], does the C99 standard guarantee or not that it will be a contiguous block of ints? Can I cast it to (int *)a and be sure I will have a valid 1D array of 25 ints?
As I understand the standard, the above property is implicit in the sizeof definition and in pointer arithmetic, but others seem to disagree and say that casting the above structure to (int*) gives undefined behavior (even if they agree that all existing implementations actually allocate contiguous values).
More specifically, consider an implementation that instruments arrays to check array boundaries for all dimensions and returns some kind of error when the array is accessed as a 1D array, or does not give correct access to elements beyond the first row. Could such an implementation be standard compliant? And in that case, which parts of the C99 standard are relevant?
We should begin with inspecting what int a[5][5] really is. The types involved are:
int
array[5] of ints
array[5] of arrays
There is no array[25] of ints involved.
It is correct that the sizeof semantics imply that the array as a whole is contiguous. The array[5] of ints must have 5*sizeof(int), and recursively applied, a[5][5] must have 5*5*sizeof(int). There is no room for additional padding.
Additionally, the array as a whole must be working when given to memset, memmove or memcpy with the sizeof. It must also be possible to iterate over the whole array with a (char *). So a valid iteration is:
int a[5][5], i, *pi;
char *pc;
pc = (char *)(&a[0][0]);
for (i = 0; i < 25; i++)
{
pi = (int *)pc;
DoSomething(pi);
pc += sizeof(int);
}
Doing the same with an (int *) would be undefined behaviour, because, as said, there is no array[25] of int involved. Using a union as in Christoph's answer should be valid, too. But there is another point complicating this further, the equality operator:
6.5.9.6
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space. 91)
91) Two objects may be adjacent in memory because they are adjacent elements of a larger array or adjacent members of a structure with no padding between them, or because the implementation chose to place them so, even though they are unrelated. If prior invalid pointer operations (such as accesses outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined behavior.
This means for this:
int a[5][5], *i1, *i2;
i1 = &a[0][0] + 5;
i2 = &a[1][0];
i1 compares as equal to i2. But when iterating over the array with an (int *), it is still undefined behaviour, because it is originally derived from the first subarray. It doesn't magically convert to a pointer into the second subarray.
Even when doing this
char *c = (char *)(&a[0][0]) + 5*sizeof(int);
int *i3 = (int *)c;
won't help. It compares equal to i1 and i2, but it isn't derived from any of the subarrays; it is a pointer to a single int or an array[1] of int at best.
I don't consider this a bug in the standard. It is the other way around: Allowing this would introduce a special case that violates either the type system for arrays or the rules for pointer arithmetic or both. It may be considered a missing definition, but not a bug.
So even if the memory layout for a[5][5] is identical to the layout of a[25], and the very same loop using a (char *) can be used to iterate over both, an implementation is allowed to blow up if one is used as the other. I don't know why it should or know any implementation that would, and maybe there is a single fact in the Standard not mentioned till now that makes it well defined behaviour. Until then, I would consider it to be undefined and stay on the safe side.
I've added some more comments to our original discussion.
sizeof semantics imply that int a[5][5] is contiguous, but visiting all 25 integers by incrementing a pointer like int *p = *a is undefined behaviour: pointer arithmetic is only defined as long as all pointers involved lie within (or one element past the last element of) the same array, which, for example, &a[2][1] and &a[3][1] do not (see C99 section 6.5.6).
In principle, you can work around this by casting &a - which has type int (*)[5][5] - to int (*)[25]. This is legal according to 6.3.2.3 §7, as it doesn't violate any alignment requirements. The problem is that accessing the integers through this new pointer is illegal as it violates the aliasing rules in 6.5 §7. You can work around this by using a union for type punning (see footnote 82 in TC3):
int *p = ((union { int multi[5][5]; int flat[25]; } *)&a)->flat;
This is, as far as I can tell, standards compliant C99.
If the array is static, like your int a[5][5] array, it's guaranteed to be contiguous.
