Using memset with uninitialized variables - c

Is this valid C code without undefined behaviour?
int main(){
int a;
memset(&a, 5, sizeof(int));
return a;
}
I'm assuming this is equal to just doing int a = 5.
I'm trying to understand if just declaring a variable in the above example (without defining it) is enough to put it on the stack.

Is this valid C code without undefined behaviour?
Yes – Once the a variable has been declared in a given scope (like a function or other { ... } delimited block), it is valid to take its address and access the variable using that address within that scope (as your memset call does). An attempt to use that address when that scope has ended (i.e. is no longer 'active') will cause undefined behaviour; for example, the following is UB:
int main()
{
int* p;
{ // New scope ...
int a;
p = &a; // Pointer to "a" is valid HERE
} // The scope of "a" (and its 'lifetime') ends here
memset(p, 5, sizeof(int)); // INVALID: "p" now points to a dead (invalid) variable
}
However, there's a major caveat in your code sample …
I'm assuming this is equal to just doing int a = 5.
There's the rub: It's assigning 5 to each component byte of the a variable, so it's doing this (assuming a 4-byte int):
int a = 0x05050505;
Which is the same as:
int a = 84215045;

From the C Standard (7.23.6.1 The memset function)
2 The memset function copies the value of c (converted to an unsigned
char) into each of the first n characters of the object pointed to by
s.
So this call
memset(&a, 5, sizeof(int));
does not set the variable a equal to 5. Internally the variable will look like
0x05050505
Here is a demonstrative program
#include <stdio.h>
#include <string.h>
int main(void)
{
int a;
memset( &a, 5, sizeof( int ) );
printf( "%#x\n", ( unsigned )a );
return 0;
}
Its output is
0x5050505
Use memset on integers with caution, because in general it can produce a trap representation. The resulting value also depends on how integers are represented internally (for example, the size of int).
P.S. You declared a variable inside a block scope with no linkage. That declaration is also a definition, and the variable has automatic storage duration. Because the variable was not explicitly initialized, it has an indeterminate value. You may apply the address-of operator & to get the address of the memory where the variable is stored.

That's not undefined behavior. The problem is that it doesn't do what you expect.
The result of
memset(&a, 5, sizeof(int));
is that each of the four bytes of your integer a is set to 5.

Related

Printing Dangling Pointers in C

#include <stdio.h>
int main()
{
int *ptr;
{
int x = 2;
ptr = &x;
}
printf("%x %d", ptr, *ptr);
return 0;
}
Output: address of x, value of x.
Here, ptr should be a dangling pointer, right? Yet, it still stores the address of x. How is it still pointing the value of x, even after that block is deleted?
#include <stdio.h>
int * func (int n)
{
int temp;
int *ptr = &temp;
temp = n * n;
return ptr;
}
int main()
{
int n = 4;
int *p = func(4);
printf("%x, %d", p, *p);
return 0;
}
Output: address of temp, 16
In this program, the data variable temp and its pointer variable ptr are created in a separate function. Why does it produce a correct result?
#include <stdio.h>
int * func (int n)
{
int temp;
int *ptr = &temp;
temp = n * n;
for (int i = 0; i < 10; i++)
printf("%d ", *ptr);
return ptr;
}
int main()
{
int n = 4;
int *p = func(4);
printf("\n%x, %d", p, *p);
for (int i = 0; i < 10; i++)
printf("%d ", *p);
*p = 12;
printf("%d\n", *p);
printf("%d\n", *p);
return 0;
}
Output: 16 16 16 16 16 16 16 16 16 16
address of temp, 1
16 16 16 16 16 16 16 16 16 16
12
12
The above program is similar to the second one aside from the for loop. In the main() function, it gives the correct output every time. Even if I tried to change it to *p = 10, it would still give the correct output no matter how many times I print it.
But in the second program, it only gives the correct output once because of undefined behavior. It gives garbage values after the first printf.
But in third program, how does it still give the correct output every time?
My questions are:
The pointer variable points to a local variable which goes out of scope, but it still prints the correct output, and the variable is still accessible through the pointer, even for changing its value. Why is that?
Like the temp created in increment(), ptr is also created locally. Why is it printing the values correctly all of the time without any warning or error? If the for loop is not there, it also gives an error after printing once. Why is that so?
When I passed temp I got a warning and segmentation fault error. But why is ptr, a local variable, printing the values correctly?
In the first program, after printing *ptr many times, it gives a correct output, and I was able to change *ptr = 1; after the first printf. Why can I access ptr even though the variable went out of scope?
Thank you everyone for answering. I understand now from all your answers. Thank you very much.
The behaviour of both of your programs is undefined.
In the first program, your code accesses x, via its address, outside the block in which it was declared. x is a local (automatic) non-static variable and its lifetime is limited to its scope 1), i.e. the block in which it was declared. Any attempt to access it outside its lifetime results in undefined behaviour 2). The same applies to the temp variable in the second program.
Undefined behaviour means the program may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.
Also, the correct format specifier for printing a pointer is %p.
1). From C11 Standard#6.2.1p4 [emphasis mine]
Every other identifier has scope determined by the placement of its declaration (in a declarator or type specifier). If the declarator or type specifier that declares the identifier appears outside of any block or list of parameters, the identifier has file scope, which terminates at the end of the translation unit. If the declarator or type specifier that declares the identifier appears inside a block or within the list of parameter declarations in a function definition, the identifier has block scope, which terminates at the end of the associated block. ......
2). From C11 Standard#6.2.4p2 [emphasis mine]
2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
I disassembled your third program with IDA.
The func() function was compiled inline as part of main(), not as an independent function.
That is why the correct values remain.
I guess this is the result of compiler optimization.
But when I add one line to func(), the result of the program is different.
In this case, the compiler treated func() as a separate function.
The expected result occurs: the program crashes at '*p = 12'.
The 'x' in the first program and the 'temp' in the second program are local variables, so they are released from the stack when execution leaves the block in which they are defined.
'ptr' and 'p' hold the addresses of these local variables, but the values stored in these pointers are no longer valid once the local variables have been released.
Whether the value remains in memory after the local variable is released depends on the development tools and the environment. Whether the memory formerly occupied by the local variable is cleared is up to the OS or compiler; the point is that you can no longer treat the value at that address as valid.
When I checked with VC++ 2008, after the local variable was released the pointer no longer had a valid value; it had a random value.

Printing Pointer Data with Printf in C

My understanding is that when you declare a pointer, say int *a = 5, a is the pointer, and *a is the int pointed to - so the * indicates you're accessing the pointer data. (And the & is accessing the address). Hopefully this is correct?
How come when I'm doing printf it doesn't seem to work the way I want?
int main()
{
int *a = 5;
printf("%d\n",a);
return 0;
}
This gives me the correct result, which I didn't expect. When I used *a instead of a in the printf, it failed, which confuses me.
No, int *a = 5; does not store an int value of 5 into the memory location pointed to by a; the memory location itself is 5 (which is most likely invalid). This is an initialization statement, which initializes the variable a, which is of type int * (a pointer), to 5.
For ease of understanding, consider the following valid case
int var = 10;
int *ptrVar = &var;
here, ptrVar is assigned the value of &var, the pointer. So, in other words, ptrVar points to a memory location which holds an int and upon dereferencing ptrVar, we'll get that int value.
That said, in general,
printf("%d\n",a);
is an invite to undefined behavior, as you're passing a pointer type as the argument to %d format specifier.
The declaration int *a does declare a to be a pointer. Thus, the declaration
int *a = 5;
initializes a with the value 5. Just like how
int i = 5;
would initialize i with the value 5.
There are very few situations where you would want to initialize a pointer variable with a literal value (other than 0 or NULL). Those would likely be embedded (or otherwise esoteric) applications where certain addresses have a defined meaning on a particular platform.

Assigning an int to a pointer, What happens?

Instead of initializing a pointer like this,
int main (){
int *ptr;
int x = 5;
ptr = &x;
}
What happens in memory when you do something like this?
int main (){
int *ptr = 100;
}
Would *ptr be looking for a random address that contains the value 100 or is it storing the value of 100 in ptr?
This is a constraint violation, the compiler should give you a diagnostic message. (If the compiler doesn't say "error" then I would recommend changing compiler flags so that it does). If the compiler generates an executable anyway, then the behaviour of that executable is undefined.
In Standard C, this code does not assign 100 to the pointer, as claimed by several other comments/answers. Integers cannot be assigned to pointers (i.e. using the assignment operator or initialization syntax with integer on the right-hand side), except the special case of constant expression zero.
To attempt to point the pointer to address 100, the code would be int *ptr = (int *)100;.
First of all, as mentioned in other answers, the code int *ptr = 100; is not even valid C. Assigning an integer to a pointer is not a valid form of simple assignment (6.5.16.1) so the code is a so-called "constraint violation", meaning it is a C standard violation.
So your first concern needs to be why your compiler does not follow the standard. If you are using gcc, please note that it is unfortunately not configured to be a conforming compiler per default, you have to tell it to become one by passing -std=c11 -pedantic-errors.
Once that is sorted, you can fix the code to become valid C by converting the integer to a pointer type, through an explicit cast. int *ptr = (int*)100;
This means nothing else but store the address 100 inside a pointer variable. No attempts have been made to access that memory location.
If you would attempt to access that memory by for example *ptr = 0; then it is your job to ensure that 100 is an aligned address for the given system. If not, you invoke undefined behavior.
And that's as far as the C language is concerned. C doesn't know or care what is stored at address 100. It is outside the scope of the language. On some systems this could be a perfectly valid address, on other systems it could be nonsense.
int *ptr = (int*)100; // valid
int *ptr = 100; // invalid as the int to int * conversion is not implicit
Using absolute addresses is discouraged because, in a relocatable program segment, you cannot know in advance which address a pointer should hold; the pointer should instead be set to the address of a valid object to avoid surprises.
Some compilers only warn here rather than reporting an error, but it is still a constraint violation. Pointer variables store addresses of other variables, and *ptr is used to access or alter the value stored at that address.
The long and short of it is: nothing. Assigning the value 100 to the pointer ptr does do something: it sets the address held by the pointer to 100, just as assigning the return value of a malloc call would. But since the value is never used, it has no observable effect. We can take a look at the value it produces:
/* file test.c */
#include <stdlib.h>
#include <stdio.h>
int main() {
int *ptr = (int *) 100;
printf("%p\n", ptr);
return 0;
}
Executing this results in this output:
0x64
Pointers are, in a sense, just integers. They (sometimes) take up about the same space as an integer (depending on the platform), and can easily be cast to and from an integer. There is even a specific (optional) type, available since C99 in <stdint.h>, for an integer that can hold a pointer: intptr_t (see What is the use of intptr_t?).
Going back to the point, attempting to perform any dereferencing of this pointer of any kind can cause Weird Behavior(tm) - the layout of the memory is platform and implementation dependent, and so attempting to grab something at the address 100 will likely cause a Segmentation Fault (if you're lucky):
/* file test.c */
#include <stdlib.h>
#include <stdio.h>
int main() {
int *ptr = (int *) 100;
printf("%i\n", *ptr);
return 0;
}
Executing results in this output:
fish: "./test" terminated by signal SIGSEGV (Address boundary error)
So don't do it unless you know exactly what you're doing, and why you're doing it.
with:
int *ptr = 100;
you have assigned the value 100 to the pointer. Doing anything with it but printing its value will be undefined behavior:
printf ("pointer value is %p\n", (void *)ptr);

Reason of error on pointer de-referencing

Output is: 10 and it gives no error.
int main(){
int j=10;
int *i=&j;
printf("%d",*i);
return 0;
}
but it gives me an error:
int main(){
int *i;
int j=10;
*i=&j;
printf("%d",*i);
return 0;
}
I understand that pointer de-referencing is causing the error. But how is that happening?
Because you are using an uninitialized pointer.
Your *i = &j should be i = &j
This defines i as an int * and sets its value to the address of j:
int *i=&j;
This defines i as an int *, then tries to set what i points to to the address of j:
int *i;
int j=10;
*i=&j;
The final *i = ... is trying to dereference an uninitialized variable.
int *i=&j;
Here you're declaring i to be an int *, and assigning the address of j.
*i=&j;
In this case, though, you've already declared i, and you're assigning &j to the location that i points to rather than to i itself. So that's one error. Another is that i doesn't point to anything yet because you haven't initialized it. If you want i to point to j, you should drop the *:
i = &j;
i itself is declared as a "pointer to int". So you should write i = &j; to assign it with the address of j.
In your case, *i = &j dereferences it before the assignment, that is, the value of a pointer is assigned to an int, which resides in a legal or illegal memory block, because i is uninitialised.
Note that accessing an uninitialised variable causes undefined behaviour, not to mention accessing the object an uninitialised pointer points to.
Here is a simple declaration, with initialization, of an integer variable i:
int i = 10;
Normally it is very easy to split up the declaration and the initialization:
int i;
/* ... */
i = 10;
That's fine. But the syntax of pointer declarations in C is unusual, and it leads to a little bit of asymmetry when working with declarations and initializations. You can write
int *i = &j; /* correct */
But if you split it up, the * does not tag along, because it was part of the declaration.
int *i;
/* ... */
i = &j; /* right */
*i = &j; /* WRONG */
int *i;
You've declared i as a pointer to int, but you haven't set it to point to anything yet; the value of i is indeterminate [1]. It will contain some random string of bits that (most likely) does not correspond to a valid address [2]. Attempting to dereference i at this point leads to undefined behavior, which can mean anything from an outright crash to corrupted data to working without any apparent issues.
The line
*i = &j;
has two problems, the first one being that i doesn't point anywhere meaningful (and this is where your runtime error is undoubtedly coming from; you're attempting to access an invalid address). The second is that the types of *i and &j don't match; *i has type int, while &j has type int *.
[1] Variables declared locally to a function without the static keyword have automatic storage duration, and are not implicitly initialized to any particular value. Do not assume that any such variable is initially set to 0 or NULL in the absence of an explicit initializer. Variables declared outside of any function body or with the static keyword have static storage duration, and those variables will be initialized to 0 or NULL in the absence of an explicit initializer.
[2] "Valid" meaning the address of an object defined within your program (i.e., another variable, or a chunk of memory allocated via malloc, etc.) or a well-known address defined by the platform (such as a fixed hardware input address). NULL is a well-defined invalid address that's easy to test against.

Confusion with pointers

I am trying to learn C. The reading I've been doing explains pointers as such:
/* declare */
int *i;
/* assign */
i = &something;
/* or assign like this */
*i = 5;
Which I understand to mean i = the address of the thing stored in something
Or
Put 5, or an internal representation of 5, into the address that *i points to.
However in practice I am seeing:
i = 5;
Should that not cause a mismatch of types?
Edit: Semi-colons. Ruby habits..
Well, yes, in your example setting an int pointer to 5 is a mismatch of types, but this is C, so there's nothing stopping you. This will probably cause faults. Some real hackery could be expecting some relevant data at the absolute address of 5, but you should never do that.
The English equivalents:
i = &something
Assign i equal to the address of something
*i = 5
Assign what i is pointing to, to 5.
If you set i = 5 as you wrote in your question, i would contain the address 0x00000005, which probably points to garbage.
Hope this helps explain things:
int *i; /* declare 'i' as a pointer to an integer */
int something; /* declare an integer ... */
something = 42; /* ... and set it to 42 */
i = &something; /* now this contains the address of 'something' */
*i = 5; /* change the value, of the int that 'i' points to, to 5 */
/* Oh, and 'something' now contains 5 rather than 42 */
If you're seeing something along the lines of
int *i;
...
i = 5;
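then somebody is attempting to assign the address 0x00000005 to i. Converting an integer to a pointer is allowed, although somewhat dangerous; written without a cast, as here, a conforming compiler must also issue a diagnostic (N1256):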
6.3.2.3 Pointers
...
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.55) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.56)
...
55) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.17.
56) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
Depending on the architecture and environment you're working in, 0x00000005 may not be a valid integer address (most architectures I'm familiar with require multibyte types to start with even addresses) and such a low address may not be directly accessible by your code (I don't do embedded work, so take that with a grain of salt).
I understand to mean i = the address of the thing stored in something
Actually i contains an address, which SHOULD be the address of a variable containing an int.
I said should because you can't be sure of that in C:
char x;
int *i;
i = (int *)&x;
If i is a pointer, then assigning it anything other than a valid address accessible from your program is an error, and I think it could lead to undefined behavior:
int *i;
i = 5;
*i; //undefined behavior..probably segfault
Here is an example:
int var;
int *ptr_to_var;
var = 5;
ptr_to_var = &var; /* store the address of var, not its value */
printf("var %d ptr_to_var %d\n", var, *ptr_to_var); //both print 5
printf("value of ptr_to_var %p must be equal to pointed variable var %p \n" , (void *)ptr_to_var, (void *)&var);
I hope this helps.
This declares a variable name "myIntPointer" which has type "pointer to an int".
int *myIntPointer;
This takes the address of an int variable named "blammy" and stores it in the int pointer named "myIntPointer".
int blammy;
int *myIntPointer;
myIntPointer = &blammy;
This takes an integer value 5 and stores it in the space in memory that is addressed by the int variable named "blammy" by assigning the value through an int pointer named "myIntPointer".
int blammy;
int *myIntPointer;
myIntPointer = &blammy;
*myIntPointer = 5;
This sets the int pointer named "myIntPointer" to point to memory address 5.
int *myIntPointer;
myIntPointer = 5;
Assigning hard-coded addresses is something that generally shouldn't be done (even in the embedded world, although there are some cases where it's suitable).
When declaring a pointer, limit yourself to assigning it either dynamically allocated memory (see malloc()) or the address (&) of a static (not temporary) variable. This will ensure robust code and less chance of getting the famous segmentation fault.
Good luck with learning C.
