I wrote this code and expected it to fail, since I don't allocate memory for the pointer variable. To my surprise, it didn't throw any error. What is the reason?
And if I simply delete the 2nd line, it throws a segmentation fault. How can this seemingly strange behavior be explained?
uint16_t *c;
uint8_t *d;
*c = 1;
printf("%x:%x",c,*c);
As others have pointed out, it is UB. Your observation of the code "working", in the sense of not causing a segfault or similar, is more or less down to chance. Allocating another variable on the stack might change where your pointer c lands, and thereby it can end up with a different (random) place it points to. (Or, to put it differently, its initial garbage value will, or might, be different.)
Observably different behaviour of a program depending on where and what (independent) objects are allocated within a function is a dead giveaway that there is something wrong with memory handling in that function.
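For contrast, here is a minimal sketch (my own code, not the question's) of the same write with the pointer initialized to a real object, so the behaviour is well-defined:
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t storage;        /* a real object for c to point at */
    uint16_t *c = &storage;  /* initialized, so the write below is well-defined */
    *c = 1;
    printf("%p:%x\n", (void *)c, (unsigned)*c);
    return 0;
}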
Related
So my programming teacher told us that if you don't use a pointer but still want to declare it, it is always better to initialize it with NULL. How does that prevent any errors if I don't even use it?
Or, if I am wrong about that, what are the benefits?
Dereferencing NULL is likely (though not guaranteed) to cause a segmentation fault, immediately crashing your app and alerting you to the unsafe memory access you just performed.
Leaving your pointer uninitialized means it still holds whatever junk was left over from the previous user of that memory. It's entirely possible for that to be a pointer to a real memory region in your app. Dereferencing it is undefined behavior: it might cause a segfault, or it might not. The latter is the worst case, and the one you should fear, because your app will just keep chugging along with whatever nonsense behavior resulted from that.
Here's a demonstration:
#include <stdio.h>
#include <stdlib.h>

void i_segfault() {
    fflush(stdout); // Flush whatever is left of STDOUT before we blow up
    int *i = NULL;
    printf("%i", *i); // Boom
}

void use_some_memory() {
    int *some_pointer = malloc(sizeof(int));
    *some_pointer = 123;
    printf(
        "I used the address %p to store a pointer to %p, which contains %i\n",
        &some_pointer, some_pointer, *some_pointer
    );
}

void i_dont_segfault() {
    int *my_new_pointer; // uninitialized
    printf(
        "I re-used the address %p, which still has a lingering value %p, which still points to %i\n",
        &my_new_pointer, my_new_pointer, *my_new_pointer
    );
}

int main(int argc, char *argv[]) {
    use_some_memory();
    i_dont_segfault();
    i_segfault();
}
There are at least two ways to make sure your program does not use an uninitialized pointer:
You can design and write your program carefully so that you are sure that program control never flows through a path that uses the pointer without initializing it first.
You can initialize the pointer to NULL.
The first can be hard, depending on the program, and humans keep making mistakes with it. The second is easy.
On the other hand, the second only solves one problem: It ensures that if you use the pointer without otherwise initializing it, it will have an assigned value. Further, in many systems, it ensures that if you attempt to use that value to access a pointed-to object, your program will crash rather than do something worse, like corrupt data and produce wrong results or erase valuable information.
That is a useful problem to solve, because it means this bug of failing to assign the desired value is likely to be caught during testing and, even if it is not, the damage it may do is likely to be limited. However, it does not solve the problem of ensuring that the pointer is assigned the desired value before it is used. So, like many of these coding recommendations, it is a useful tip that helps limit human error, but it is not a complete solution.
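A minimal sketch of the pattern, with a hypothetical helper of my own (not from the answer): initializing to NULL gives the pointer a well-defined "not set" value on every control path.
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, purely for illustration: the pointer is only
   assigned on some control paths, so the NULL initialization matters. */
const char *find_flag(int argc, char **argv) {
    const char *flag = NULL;              /* without this, a missed path leaves garbage */
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--flag") == 0)
            flag = argv[i];
    }
    return flag;                          /* NULL unambiguously means "not found" */
}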
If you don't initialize a variable (by mistake), the problem is that your program can behave even more erratically than you expect. Imagine you have declared an enum with the values ONE, TWO and THREE, you forget to initialize a variable of that type, and at some point you include the following code:
switch(my_var) {
case ONE: /* do one */
    ...
    break;
case TWO: /* do two */
    ...
    break;
case THREE: /* do three */
    ...
    break;
}
and you can go nuts, because you assume that at least one of the possible values of the type must be handled... but that's simply not true: the variable has not been initialized, and its value need not even comply with the constraints imposed by the data type it is supposed to represent.
Initializing a variable adds a couple of instructions to your code, but it saves a lot of nightmares when searching for errors.
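A minimal sketch of that fix (names are my own, not from the answer): initialize the variable, and optionally add a defensive default case.
enum numbers { ONE, TWO, THREE };

void handle(void) {
    enum numbers my_var = ONE;     /* initialized: exactly one case below is guaranteed to run */
    switch (my_var) {
    case ONE:   /* do one */   break;
    case TWO:   /* do two */   break;
    case THREE: /* do three */ break;
    default:    /* defensive: catches values outside the enum */ break;
    }
}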
In the case of an uninitialized pointer, things are worse, as many programmers free() memory allocated from the heap based on the test
if (var) free(var);
and an uninitialized automatic variable of pointer type will most probably point somewhere non-null, so it will pass the test and free() will be attempted on it.
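A short sketch (my own, with a hypothetical function name) of why the NULL initialization makes that idiom safe:
#include <stdlib.h>

void demo(int want_buffer) {
    char *var = NULL;          /* initialized, so the test below is meaningful */
    if (want_buffer)
        var = malloc(100);
    /* ... use the buffer if it exists ... */
    if (var) free(var);        /* never called on a garbage address */
    /* (free(NULL) is a no-op anyway, so the test could even be dropped) */
}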
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *a;
    a = (int *)malloc(100*sizeof(int));
    int i=0;
    for (i=0;i<100;i++)
    {
        a[i] = i+1;
        printf("a[%d] = %d \n " , i,a[i]);
    }
    a = (int*)realloc(a,75*sizeof(int));
    for (i=0;i<100;i++)
    {
        printf("a[%d] = %d \n " , i,a[i]);
    }
    free(a);
    return 0;
}
In this program I expected to get a segmentation fault, because I'm trying to access elements of the array that were freed using realloc(). But the output is pretty much the same, except for a few of the final elements!
So my doubt is whether the memory is actually getting freed? What exactly is happening?
The way realloc works is that it guarantees that a[0]..a[74] will have the same values after the realloc as they did before it.
However, the moment you try to access a[75] after the realloc, you have undefined behaviour. This means that the program is free to behave in any way it pleases, including segfaulting, printing out the original values, printing out some random values, not printing anything at all, launching a nuclear strike, etc. There is no requirement for it to segfault.
So my doubt is whether the memory is actually getting freed?
There is absolutely no reason to think that realloc is not doing its job here.
What exactly is happening?
Most likely, the memory is getting freed by shrinking the original memory block and not wiping out the now-unused final 25 array elements. As a result, the undefined behaviour manifests itself by printing out the original values. It is worth noting that even the slightest change to the code, the compiler, the runtime library, the OS, etc. could make the undefined behaviour manifest itself differently.
You may get a segmentation fault, but you may not. The behaviour is undefined, which means anything can happen, but I'll attempt to explain what you might be experiencing.
There's a mapping between your virtual address space and physical pages, and that mapping usually works in pages of at least 4096 bytes (there is also paging out to disk, but let's ignore that for the moment).
You get a segmentation fault if you attempt to access virtual address space that doesn't map to a physical page. So your call to realloc may not have resulted in a physical page being returned to the system, so it is still mapped into your program and can be used. However, a following call to malloc could hand out that space again, or it could be reclaimed by the system at any time. In the former case you'd possibly overwrite another variable; in the latter case you'll segfault.
Accessing an array beyond its bounds is undefined behaviour. You might encounter a runtime error. Or you might not. The memory manager may well have decided to re-use the original block of memory when you re-sized. But there's no guarantee of that. Undefined behaviour means that you cannot reason about or predict what will happen. There's no grounds for you to expect anything to happen.
Simply put, don't access beyond the end of the array.
Some other points:
The correct main declaration here is int main(void).
Casting the value returned by malloc is not needed and can mask errors. Don't do it.
Always store the return value of realloc into a separate variable so that you can detect NULL being returned and so avoid losing and leaking the original block.
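For that last point, a minimal sketch (my own code, not the questioner's) of the recommended realloc pattern:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *a = malloc(100 * sizeof *a);
    if (a == NULL)
        return 1;

    int *tmp = realloc(a, 75 * sizeof *tmp);   /* result goes into a separate variable */
    if (tmp == NULL) {
        free(a);                               /* a is still valid on failure, so it can be freed */
        return 1;
    }
    a = tmp;                                   /* only overwrite a once realloc has succeeded */

    free(a);
    return 0;
}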
The program was written in C and compiled with GCC.
I was trying to help a friend who was trying to (shallow) copy a value that was passed into a function. The value was a struct holding primitives and pointers (no arrays or buffers). Unsure of how malloc works, he used it similarly to the following:
void some_function(int rand_params, SOME_STRUCT_TYPEDEF *ptr){
    SOME_STRUCT_TYPEDEF *cpy;
    cpy = malloc(sizeof(SOME_STRUCT_TYPEDEF)); // this line makes a difference?!?!?
    cpy = ptr; // overwrites cpy anyway, right?
    //prints a value in the struct documented to be a char*,
    //sorry couldn't find the documentation right now
}
I told him that the malloc shouldn't affect the program, so I told him to comment it out. To my surprise, the version with the malloc produced different output (with some intended strings) from the version with the malloc commented out (which printed garbage values). The pointer that's passed into this function comes from some other library function for which I don't have documentation at the moment. The best I can assume is that the pointer was to a value that was actually a buffer (on the stack). But I still don't see how the malloc can cause such a difference. Could someone explain how that malloc may cause a difference?
I would say that the evident lack of understanding of pointers is responsible for ptr actually pointing to memory that has not been correctly allocated (if at all), and you are experiencing undefined behaviour. The issue is elsewhere in the program, prior to the call to some_function.
As an aside, the correct way to allocate and copy the data is this:
SOME_STRUCT_TYPEDEF *cpy = malloc(sizeof(SOME_STRUCT_TYPEDEF));
if (cpy) {
*cpy = *ptr;
// Don't forget to clean up later
free(cpy);
}
However, unless the structure is giant, it's a bit silly to do it on the heap when you can do it on the stack like this:
SOME_STRUCT_TYPEDEF cpy = *ptr;
I can't see why there's a difference in the print.
Can you show the print code?
In any case, the malloc causes a memory leak. You're not supposed to allocate memory for cpy, because pointer assignment is not a shallow copy of the struct: you simply make cpy point to the same memory ptr points to, by storing the address of the start of that memory in cpy. (cpy is essentially a 32/64-bit value that stores an address; in the case of malloc, it stores the address of the memory block you allocated.)
I wrote a program to reverse a string and I am having trouble understanding why I get (or don't get) a segmentation fault. I have listed my program below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void reverse(char *);

int main() {
    char *str = calloc(1,'\0');
    strcpy(str,"mystring0123456789");
    reverse(str);
    printf("Reverse String is: %s\n",str);
    return 0;
}

void reverse(char *string) {
    char ch, *start, *end;
    int c=0;
    int length = strlen(string);
    start = string;
    end = string;
    while (c < length-1){
        end++;
        c++;
    }
    c=0;
    while(c < length/2){
        ch = *end;
        *end = *start;
        *start = ch;
        start++;
        end--;
        c++;
    }
}
1st Question:
Even though I have allocated only 1 byte of memory to the char pointer str (calloc(1,'\0')), and I copied an 18-byte string mystring0123456789 into it, it didn't throw any error and the program worked fine, without any SEGFAULT.
Why did my program not throw an error? Ideally it should throw some error, as it doesn't have enough memory to store that big a string. Can someone throw some light on this?
The program ran perfectly and gives me the output Reverse String is: 9876543210gnirtsym.
2nd Question:
If I replace the statement
strcpy(str,"mystring0123456789");
with
str="mystring0123456789\0";
the program gives a segmentation fault, even though I have allocated enough memory for str (malloc(100)).
Why is the program throwing a segmentation fault?
Even though I have allocated only 1 byte of memory to the char pointer str (calloc(1,'\0')), and I copied an 18-byte string "mystring0123456789" into it, it didn't throw any error and the program worked fine without any SEGFAULT.
Your code had a bug -- of course it's not going to do what you expect. Fix the bug and the mystery will go away.
If I replace the statement
strcpy(str,"mystring0123456789");
with
str="mystring0123456789\0";
the program gives a segmentation fault even though I have allocated enough memory for str (malloc(100)).
Because when you finish this, str points to a constant. This throws away the previous value of str, a pointer to memory you allocated, and replaces it with a pointer to that constant.
You cannot modify a constant, that's what makes it a constant. The strcpy function copies the constant into a variable which you can then modify.
Imagine if you could do this:
int* h = &2;
Now, if you did *h = 1; you'd be trying to change that constant 2 in your code, which of course you can't do.
That's effectively what you're doing with str="mystring0123456789\0";. It makes str point to that constant in your source code which, of course, you can't modify.
There's no requirement that it throw a segmentation fault. All that happens is that your broken code invokes undefined behavior. If that behavior has no visible effect, that's fine. If it formats the hard drive and paints the screen blue, that's fine too. It's undefined.
You're overwriting the pointer value with the address of a string literal, which totally doesn't use the allocated memory. Then you try to reverse the string literal which is in read-only memory, which causes the segmentation fault.
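A sketch of a version that avoids both problems (assuming the reverse function from the question is linked in): let the compiler make a writable array copy of the literal instead of pointing str at the literal itself.
#include <stdio.h>

void reverse(char *);   /* the reverse function from the question */

int main(void) {
    char str[] = "mystring0123456789";   /* a writable array copy of the literal */
    reverse(str);                        /* fine: modifies the array, not the literal */
    printf("Reverse String is: %s\n", str);
    return 0;
}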
Your program did not throw an error because, even though you did the wrong thing, nothing caught you (more below). You wrote data where you were not supposed to, but you got “lucky” and did not break anything by doing this.
strcpy(str,"mystring0123456789"); copies data into the place where str points. It so happens that, at that place, you are able to write data without causing a trap (this time). In contrast, str="mystring0123456789\0"; changes str to point to a new place. The place it points to is the place where "mystring0123456789\0" is stored. That place is likely read-only memory, so, when you try to write to it in the reverse routine, you get a trap.
More about 1:
When calloc allocates memory, it merely arranges for there to be some space that you are allowed to use. Physically, there is other memory present. You can write to that other memory, but you should not. This is exactly the way things work in the real world: If you rent a hotel room, you are allowed to use that hotel room, but it is wrong for you to use other rooms even if they happen to be open.
Sometimes when you trespass where you are not supposed to, in the real world or in a program, nobody will see, and you will get away with it. Sometimes you will get caught. The fact that you do not get caught does not mean it was okay.
One more note about calloc: You asked it to allocate space for one thing of zero size (the source code '\0' evaluates to zero). So you are asking for zero bytes. Various standards (such as C and Open Unix) may say different things about this, so it may be that, when you ask for zero bytes, calloc gives you one byte. However, it certainly does not give you as many bytes as you wrote with strcpy.
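A minimal sketch (my own code) of an allocation that actually covers the string, asking for its length plus one byte for the terminator:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const char *src = "mystring0123456789";
    char *str = malloc(strlen(src) + 1);   /* 18 characters plus the terminating '\0' */
    if (str == NULL)
        return 1;
    strcpy(str, src);
    printf("%s\n", str);
    free(str);
    return 0;
}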
It sounds like you are writing C programs having come from a dynamic language, or at least a language that does automatic string handling. For lack of a more formal definition, I find C to be a language very close to the architecture of the machine. That is, you make a lot of the programming decisions. A lot of your program's problems are the result of your code causing undefined behavior. You got a segfault because the code tried to write into a protected location (the string literal); the behavior was undefined. Whereas assigning your fixed string "mystring0123456789\0" merely assigned that literal's address to str.
When you implement in C, you decide whether you want to define your storage areas at compile or run-time, or decide to have storage allocated from the heap (malloc/calloc). In either case, you have to write housekeeping routines to make sure you do not exceed the storage you have defined.
Assigning a string to a pointer merely assigns the string's address in memory; it does not copy the string, and a fixed string inside quotes "test-string" is read-only, and you cannot modify it. Your program may have worked just fine, having done that assignment, even though it would not be considered good C coding practice.
There are advantages to handling storage allocations this way, which is why C is a popular language.
Another case: you can also get a segfault when you use memory correctly but your heap grows so big that your physical memory can no longer manage it (without overlapping the stack, text, data or bss segments -> link).
Proof: link, section Possible Cause #2
This gives proper output even though I have not allocated memory and have declared the pointer to structure two inside main:
struct one
{
    char x;
    int y;
};

struct two
{
    char a;
    struct one * ONE;
};

main()
{
    struct two *TWO;
    scanf("%d",&TWO->ONE->y);
    printf("%d\n",TWO->ONE->y);
}
But when I declare the pointer to two after the structure, outside main, I get a segmentation fault. Why is it that I don't get a segmentation fault in the previous case?
struct one
{
    char x;
    int y;
};

struct two
{
    char a;
    struct one * ONE;
}*TWO;

main()
{
    scanf("%d",&TWO->ONE->y);
    printf("%d\n",TWO->ONE->y);
}
In both cases, TWO is a pointer to an object of type struct two.
In case 1 the pointer is wild and can be pointing anywhere.
In case 2 the pointer is NULL as it is global.
But in both cases it is a pointer that does not point to a valid struct two object. Your scanf call treats this pointer as though it referred to a valid object. This leads to undefined behavior.
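A minimal sketch (my own code, not from the answer) of what has to happen before the scanf is valid: both pointers must point at real objects.
#include <stdio.h>
#include <stdlib.h>

struct one { char x; int y; };
struct two { char a; struct one *ONE; };

int main(void) {
    struct two *TWO = malloc(sizeof *TWO);            /* give TWO a real object to point at */
    if (TWO == NULL) return 1;
    TWO->ONE = malloc(sizeof *TWO->ONE);              /* and the inner pointer as well */
    if (TWO->ONE == NULL) { free(TWO); return 1; }

    if (scanf("%d", &TWO->ONE->y) == 1)
        printf("%d\n", TWO->ONE->y);

    free(TWO->ONE);
    free(TWO);
    return 0;
}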
Because what you are doing is undefined behaviour. Sometimes it seems to work. That doesn't mean you should do it :-)
The most likely explanation is to do with how the variables are initialised. Automatic variables (on the stack) will get whatever garbage happens to be on the stack when the stack pointer was decremented.
Variables outside functions (like in the second case) are always initialised to zero (null pointer for pointer types).
That's the basic difference between your two situations but, as I said, the first one is working purely by accident.
When you declare a global pointer, it is initialized to zero, so the addresses generated through it will be small numbers that may or may not be readable on your system.
When declaring an automatic pointer, its initial value is likely to be much more interesting. It will be, in this case, whatever the run-time library left at that point on the stack prior to calling main(), or perhaps a left-over value from the compiler-generated stack-frame setup code. It is somewhat likely to be a saved stack pointer or frame pointer, which is a valid pointer if used with small offsets.
So anyway, the uninitialized pointer does have something in it, and one value leads to a fault while the other, for now, on your system, does not.
And that's because the segmentation fault is a mechanism of the OS and not the C language.
A fault is a page-based mechanism: the OS allocates to itself and to each program some number of pages (each several KB), and it protects its own pages and other programs' pages while allowing your program free rein within its own. You must stray outside of your own pages, or try to write to a read-only page (even one of yours), to generate a fault. Simply breaking a language rule is not necessarily enough. The OS is happy to let your program misbehave and act oddly due to its wild references, just as long as it only reads and writes (or clobbers) itself.