Possible duplicate: initial value of int array in C
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    char name[10];        /* never initialized */
    printf("%s\n", name); /* prints the uninitialized array */
    return 0;
}
What value does an uninitialized string in C hold? Does the compiler automatically allocate storage of size 10 and fill it with garbage values? What exactly happens when the above code runs?
Ten bytes are allocated on the stack, and that's all. Their value is left as is, meaning it is whatever happened to be written to those ten bytes before they were allocated.
As the string is uninitialized, its value is not defined; it may be anything. I would also say it is unsafe to print an uninitialized string, as it is not guaranteed to have a terminating zero character, so in theory you may end up printing far more than 10 chars.
And another thing - C does not fill the storage with anything. It just leaves it the way it is.
EDIT: Please note I am not saying that as long as you have a terminating zero character it is safe to access the uninitialized string. Invoking undefined behavior is never safe; as it is undefined, you never know what will happen.
The contents of uninitialized variables are, unlike in e.g. Java, undefined. In other words: the contents consist of whatever values were recently pushed onto the stack by other function calls.
In your particular example, it's probably going to be zeroes. But it doesn't matter.
The key point is that it's undefined. If you can't trust it to always be the same, it's of no use to you. You can't make any assumptions. No other part of your code can depend on it. It's like it didn't exist.
If you're curious as to where the actual contents come from, they are remainders of previous execution contexts stored in the stack. If you run a few function calls, you're going to leave garbage lying around that your program will feel free to overwrite. Those only-good-for-overwriting bytes may end up in your string.
The C standard calls such a value "indeterminate", i.e. it can be anything. In real life, it will most likely be filled with random garbage, and if you are unlucky, it won't have a terminating zero byte, so you invoke undefined behavior and the call to printf() will probably crash (segmentation fault, anyone?).
It contains garbage (effectively random) values. See the material on storage classes for a better understanding.
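For what it's worth, if you want a defined value you must initialize the array yourself; a minimal sketch (the empty-string initializer is one option among several):

#include <stdio.h>

int main(void)
{
    char name[10] = "";   /* "" zero-fills the whole array, so name[0] == '\0' */
    printf("%s\n", name); /* well-defined: prints an empty line */
    return 0;
}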
Related
This code should give a segmentation fault, but it works fine. How is that possible? I just got this code by trial and error while trying out some code on the topic of arrays of pointers. Can anyone explain?
int main()
{
    int i, size;
    printf("enter the no of names to be entered\n");
    scanf("%d", &size);
    char *name[size];
    for (i = 0; i < size; i++)
    {
        scanf("%s", name[i]);
    }
    printf("the names in your array are\n");
    for (i = 0; i < size; i++)
    {
        printf("%s\n", &name[i]);
    }
    return 0
The problem in your code (which is incomplete, BTW; you need #include <stdio.h> at the top and a closing } at the bottom) can be illustrated in a much shorter chunk of code:
char *name[10]; // make the size an arbitrary constant
scanf("%s", name[0]); // Read into memory pointed to by an uninitialized pointer
(name could be a single pointer rather than an array, but I wanted to preserve your program's structure for clarity.)
The pointer name[0] has not been initialized, so its value is garbage. You pass that garbage pointer value to scanf, which reads characters from stdin and stores them in whatever memory location that garbage pointer happens to point to.
The behavior is undefined.
That doesn't mean that the program will die with a segmentation fault. C does not require checking for invalid pointers (nor does it forbid it, but most implementations don't do that kind of checking). So the most likely behavior is that your program will take whatever input you provide and attempt to store it in some arbitrary memory location.
If the garbage value of name[0] happens to point to a detectably invalid memory location, your program might die with a segmentation fault. That's if you're lucky. If you're not, it might happen to point to some writable memory location that your program is able to modify. Storing data in that location might be harmless, or it might clobber some critical internal data structure that your program depends on.
Again, your program's behavior is undefined. That means the C standard imposes no requirements on its behavior. It might appear to "work", it might blow up in your face, or it might do anything that it's physically possible for a program to do. Appearing to behave correctly is probably the worst consequence of undefined behavior, since it makes it difficult to diagnose the problem (which will probably appear during a critical demo).
Incidentally, using scanf with a bare %s format specifier (no field width) is inherently unsafe, since there's no way to limit the amount of data it will attempt to read. Even with a properly initialized pointer, there's no way to guarantee that it points to enough memory to hold whatever input it receives.
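For illustration, a field width caps what scanf will store; a minimal sketch (the 100-byte buffer is an arbitrary choice):

#include <stdio.h>

int main(void)
{
    char buf[100];
    /* "%99s" stores at most 99 characters plus the terminating '\0' */
    if (scanf("%99s", buf) == 1)
        printf("%s\n", buf);
    return 0;
}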
You may be accustomed to languages that do run-time checking and can reliably detect (most) problems like this. C is not such a language.
I'm not sure what your test case is (not enough reputation to post a comment), so I just tried inputs of 0 and 1\n1\n2\n.
The details are a little complex to explain, but let's start :-). There are two things you should know. First, main() is a function. Second, char *name[size]; uses either a C99 feature, variable-length arrays, or a GNU extension supported by gcc, zero-length arrays.
Because main() is a function, all the variables declared in it are local variables, and local variables live in the stack section. You must know that first.
If you input 1\n1\n2\n, the variable-length array is used; it too is allocated on the stack. Notice that the elements of the array are not initialized to 0. That is the likely reason your program runs without a segmentation fault: you cannot be sure the garbage pointer refers to an address that isn't writable (at least, that was my experience).
If the input is 0\n, you use the zero-length array extension supported by GNU. As you saw, it means the array has no elements. The value of name equals &size, because size is the last local variable you declared before name (think of the stack pointer). The value of name[0] is then what you get by dereferencing &size, which is zero (== '\0'), so it works fine.
The simple answer to your question begins with what a segmentation fault is:
A segmentation fault (aka segfault) is caused by a program trying to read or write an illegal memory location.
So it all depends upon what is classed as illegal. If the memory in question is part of the valid address space of the process the program is running in, e.g. the stack, it may not cause a segfault.
When I run this code in a debugger, the line:
scanf("%s", name[i]);
overwrites the content of the size variable, clearly not the intended behaviour, and the code essentially goes into an infinite loop.
But that is just what happens on my 64-bit Intel Linux machine using gcc 5.4. Another environment will probably do something different.
If I put the missing & in front of name[i], it works OK. Whether that is luck, or expert exploitation of the intended behaviour of C99 variable-length arrays as suggested above, I'm afraid I don't know.
So welcome to the world of subtle memory overwriting bugs.
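For reference, a minimal corrected sketch of the original program, giving each pointer real storage before scanf writes through it (the 32-byte cap per name is an arbitrary assumption):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, size;
    printf("enter the no of names to be entered\n");
    if (scanf("%d", &size) != 1 || size <= 0)
        return 1;                  /* refuse bad or non-positive counts */
    char *name[size];              /* C99 variable-length array of pointers */
    for (i = 0; i < size; i++) {
        name[i] = malloc(32);      /* each pointer now refers to real storage */
        if (name[i] == NULL)
            return 1;
        scanf("%31s", name[i]);    /* field width prevents overflow */
    }
    printf("the names in your array are\n");
    for (i = 0; i < size; i++) {
        printf("%s\n", name[i]);   /* name[i], not &name[i] */
        free(name[i]);
    }
    return 0;
}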
Possible duplicate: How dangerous is it to access an array out of bounds?
I wrote a short program in C just to see what happens when you index past the end of an array.
I found that it mostly produces random values (I know they are not actually random) up until a point (52 indexes past the end, in this case) where it produced 0 every single time. Accessing anything past this point crashes the program. Why is this? Is it the end of the program's allocated memory space?
main()
{
    int ar[4];
    ar[0] = 99;
    ar[1] = 45;
    printf("array: %d, %d random value: %d", ar[0], ar[1], ar[55]);
}
Edit: I also found that if I alter the value that always ends up being 0 (e.g. ar[55] = 1000), then the return code of the program goes up.
... just to see what happens when you index past the end of an array
Trying to access out-of-bounds memory invokes undefined behavior. Anything can happen, just anything.
In your case, for some reason, the memory addresses up to index 52 are accessible to your process, so the accesses succeed. Indexing past 52 reaches a memory region not allocated in your process's address space, which raises an access violation leading to the segfault. This behaviour is not deterministic at all, and there is no way you can rely on the output of a program invoking UB.
Accessing array elements beyond array boundaries (before 0 or from its size up) is undefined behavior. It may or may not produce values, it may cause the program to end abruptly, it may cause your system to stop, restart or catch fire...
Modern systems try to confine undefined behavior within reasonable limits via memory protection, user-space limitations etc., but even user-space code errors can have dire consequences:
a pacemaker messing with its timing values can cause premature death;
banking software overflowing array boundaries can overwrite account balance information, crediting some random account with untold amounts of dollars;
your self-driving car could behave worse than a drunk driver...
think of nuclear power-plant control software, airplane instruments, military systems...
There is no question that undefined behavior should be avoided.
Regarding the exit status: your program uses an obsolete syntax for the definition of main(), with an implicit int return type, which is no longer supported in C99 and later, and it never returns anything, so its exit status could in principle be any random value, possibly different on every execution. C99 added a kludge for the main() function that forces an implicit return 0; at the end of main(), but relying on it is bad style.
Similarly, invoking printf() without a proper prototype is undefined behavior. You should include <stdio.h> before the definition of function main().
Lastly, ar[0] and ar[1] are initialized in main(), but ar[2] and ar[3] are not. Be aware that accessing uninitialized values also has undefined behavior. The values can be anything at all, what you describe as random values, but on some systems, they could be trap values, causing undefined behavior by just reading them.
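Putting those fixes together, a minimal sketch of the test program with the undefined behavior removed:

#include <stdio.h>

int main(void)                 /* explicit return type instead of implicit int */
{
    int ar[4] = {99, 45};      /* a partial initializer zero-fills ar[2] and ar[3] */
    printf("array: %d, %d, %d, %d\n", ar[0], ar[1], ar[2], ar[3]);
    return 0;                  /* explicit exit status */
}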
Some very handy tools are available to track this kind of problem in simple and complex programs, most notably Valgrind. If you are curious about this subject, you should definitely look into it.
Possible duplicate: How dangerous is it to access an array out of bounds?
In a C program we can declare an array like int array[10], so it can store 10 integer values. But when I give input using a loop, it takes more than 10 inputs and doesn't show any error.
What is actually happening?
#include <stdio.h>

main()
{
    int array[10], i;
    for (i = 0; i <= 11; i++)   /* writes array[10] and array[11]: out of bounds */
        scanf("%d", &array[i]);
    for (i = 0; i < 10; i++)
        printf("%d", array[i]);
}
Because C doesn't do any array bounds checking. You as a programmer are responsible for making sure that you don't index out of bounds.
Depending on the compiler used and the system the code is running on, you might read random data from memory or eventually get a SIGSEGV when reading/writing out of bounds.
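A minimal sketch of the same loop with the bound tied to the array size (the scanf return-value check is an extra defensive touch):

#include <stdio.h>

#define N 10

int main(void)
{
    int array[N];
    int count = 0;
    /* read at most N values; the loop bound matches the array size */
    while (count < N && scanf("%d", &array[count]) == 1)
        count++;
    for (int i = 0; i < count; i++)   /* print only what was actually read */
        printf("%d\n", array[i]);
    return 0;
}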
The C compiler and the runtime are not required to perform any array bounds checking.
What you describe is an example of a whole class of programming errors that result in undefined behavior. From Wikipedia:
In computer programming, undefined behavior refers to computer code whose behavior is specified to be arbitrary.
What this means is that the program is allowed to misbehave (or not) in any way it pleases.
In practice, any of the following are reasonably likely to happen when you write past the end of an array:
The program crashes, either immediately or at a later point.
Other, unrelated, data gets overwritten. This could result in arbitrary misbehaviour and/or in serious security vulnerabilities.
Internal data structures that are used to keep track of allocated memory get corrupted by the out-of-bounds write.
The program works exactly as if more memory had been allocated in the first place (memory is often allocated in blocks, and by luck there might happen to be some spare capacity after the end of the array).
(This is not an exhaustive list.)
There exist tools, such as Valgrind, that can help discover and diagnose this type of error.
The C-language standard does not dictate how variables should be allocated in memory.
So the theoretical answer is that you are performing an unsafe memory access operation, which will lead to undefined behavior (anything could happen).
In practice, however, compilers allocate local variables on the stack and global variables in the data section, so the practical answer is:
In the case of a local array, you will either override some other local variable or perform an illegal memory access operation.
In the case of a global array, you will either override some other global variable or perform an illegal memory access operation.
Possible duplicate: Is character array in C dynamic?
#include <stdio.h>

int main()
{
    char str[10];
    scanf("%[^\n]", str);
    printf("%s\n", str);
}
When this is compiled and the input given is "subhash das india", the output is the same: subhash das india.
I want to know how the string is stored in memory, since str has size 10 and the input given is longer than 10 characters.
The first ten characters are stored in the array, the rest in the bytes right after the array in memory. You don't know what was supposed to be stored in those bytes; nevertheless you overwrite them with the string "s india". This is bad. Even though it might work in this case, it is not even likely to keep working with a different compiler / on a different system / when you run the program again.
printf() does not care what you have overwritten, it will only search the string for the terminating null byte, which happens to come somewhere after the end of the array. The unexpected behavior is not likely to take effect until you return from your function (main() in this case), which may segfault, or use some other variables declared in main() which happen to have been overwritten by scanf().
Note, that the behavior of your program is entirely undefined. It is allowed to format your harddrive. It is allowed to save a backup of your computer to the NSA. It is allowed to download a nifty little program that will automatically redirect your browser the next time you try to do online banking. These are real dangers, code like the one you've given is at the heart of the most easily exploitable security holes.
If you want to be safe, use the POSIX-2008 functions: getline() to read entire lines, and asprintf() to replace uses of sprintf() and strcat(). The GNU implementation of the standard C library also allows you to use the format "%as" with scanf() to allocate enough memory for the string that is to be read, but that is not part of any standard.
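A minimal getline() sketch, assuming a POSIX-2008 system:

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = NULL;   /* getline() allocates and resizes this buffer */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, stdin); /* reads the whole line */
    if (len != -1)
        printf("%s", line);   /* the buffer keeps the trailing '\n' */
    free(line);
    return 0;
}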
printf will continue to print until it finds the terminating null character '\0'; thus, although only 10 bytes are set aside, the text is still written starting at str and read back the same way, because str decays to a plain pointer and neither function knows the array's size.
The array str occupies 10 bytes on the application's stack.
Some compilers calculate the required stack size by simply summing the bytes of all variables, arrays and structures located on the stack in all functions, plus the bytes that all function arguments and function calls would need. But this stack-size calculation is practically always wrong, as not all functions are called in one sequence.
Therefore many compilers instead define a fixed stack size, which is allocated at application start before main is called. The stack size can be controlled by the developer, for example when an application needs more than usual because of many recursive function calls.
For example, take a look at the Visual Studio page about /F (Set Stack Size). The default stack size for C/C++ applications compiled with Visual C/C++ is 1 MB.
Therefore it causes no visible problem in your simple code that a string longer than 10 bytes is copied onto the stack at the location where str begins. The string from scanf is terminated with a null byte, and printf simply outputs bytes until a null byte is found.
But if you modified your code by adding a function that, for example, also contains an array located on the stack and initialized with something, and called that function between scanf and printf, you would see a different string in the output than the one you typed in.
Unbounded scanf conversions should no longer be used, because they perform no check on array sizes. The C/C++ libraries and the various frameworks offer safer functions that check the sizes of the destination string arrays.
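As one safer alternative, a minimal fgets() sketch that cannot write past the buffer:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[10];
    if (fgets(str, sizeof str, stdin) != NULL) { /* stores at most 9 chars + '\0' */
        str[strcspn(str, "\n")] = '\0';          /* strip the newline, if present */
        printf("%s\n", str);
    }
    return 0;
}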
We know that automatic variables are destroyed upon the return of the function.
Then why does this C program return the correct value?
#include <stdio.h>
#include <process.h>

int *ReturningPointer()
{
    int myInteger = 99;
    int *ptrToMyInteger = &myInteger;
    return ptrToMyInteger;
}
main()
{
    int *pointerToInteger = ReturningPointer();
    printf("*pointerToInteger = %d\n", *pointerToInteger);
    system("PAUSE");
}
Output
*pointerToInteger = 99
Edit
Then why is this giving garbage values?
#include <stdio.h>
#include <process.h>

char *ReturningPointer()
{
    char array[13] = "Hello World!";
    return array;
}

main()
{
    printf("%s\n", ReturningPointer());
    system("PAUSE");
}
Output
xŤ
There is no answer to that question: your code exhibits undefined behavior. It could print "the right value" as you are seeing, it could print anything else, it could segfault, it could order pizza online with your credit card.
Dereferencing that pointer in main is illegal, it doesn't point to valid memory at that point. Don't do it.
There's a big difference between your two examples: in the first case, *pointerToInteger is evaluated before calling printf. So, given that there are no function calls between the line where you get the pointer value and the printf, chances are high that the stack location the pointer refers to has not been overwritten yet. So the value that was stored there prior to calling printf is likely to be output (that value is passed on printf's stack, not the pointer).
In the second case, you're passing a pointer to the stack to printf. The call to printf overwrites (a part of) that same stack region the pointer is pointing to, and printf ends up trying to print its own stack (more or less) which doesn't have a high chance of containing something readable.
Note that you can't rely on getting gibberish either. Your implementation is free to use a different stack for the printf call if it feels like it, as long as it follows the requirements laid out by the standard.
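If you need the pointer to remain valid after the function returns, two common fixes are heap allocation and static storage; a minimal sketch (both function names here are illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Fix 1: heap allocation survives the function return; the caller must free. */
char *ReturningHeapPointer(void)
{
    char *p = malloc(13);       /* 12 characters plus the terminating '\0' */
    if (p != NULL)
        strcpy(p, "Hello World!");
    return p;
}

/* Fix 2: static storage also survives, but is shared across all calls. */
const char *ReturningStaticPointer(void)
{
    static const char array[] = "Hello World!";
    return array;
}

int main(void)
{
    char *heap = ReturningHeapPointer();
    if (heap != NULL) {
        printf("%s\n", heap);
        free(heap);
    }
    printf("%s\n", ReturningStaticPointer());
    return 0;
}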
This is undefined behavior, and it could have launched a missile instead. But it just happened to give you the correct answer.
Think about it: it kind of makes sense. What else did you expect? Should it have given you zero? If so, the compiler would have to insert special instructions at the end of the scope to erase the variable's contents, a waste of resources. The most natural thing for the compiler to do is to leave the contents unchanged, so you just got the correct output from undefined behavior by chance.
You could say this behavior is implementation-dependent. For example, another compiler (or the same compiler in "Release" mode) may decide to keep myInteger purely in a register (not sure it can actually do this when you take its address, but for the sake of argument...); in that case no memory would hold the 99 and you would get garbage output.
As a more illustrative (but totally untested) example: if you insert some malloc calls and exercise some memory usage before the printf, you may find the garbage value you were looking for :P
Answer to "Edited" part
The "real" answer that you want needs to be answered in disassembly. A good place to start is gcc -S and gcc -O3 -S. I will leave the in-depth analysis for wizards that will come around. But I did a cursory peek using GCC and it turns out that printf("%s\n") gets translated to puts, so the calling convention is different. Since local variables are allocated on the stack, calling a function could "destroy" previously allocated local variables.
1. Destroying is the wrong word, imho. Locals reside on the stack; if the function returns, the stack space may be reused again. Until then it is not overwritten and is still accessible through pointers, which you might not really want (because such a pointer may no longer point to something valid).
2. Pointers are used to address space in memory; for local pointers, the same as I described in 1 applies. However, here the pointer is passed back to the main program.
If it really is the address storing the former integer, it will yield "99" up until the point in the execution of your program when the program overwrites this memory location. It may also be another 99 by coincidence. Either way: do not do this.
These kinds of errors will lead to trouble some day, maybe on other machines, other OSes, other compilers or compiler options. Imagine you upgrade your compiler, which may change the memory-usage behaviour, or you change build flags, e.g. release builds with optimization vs. default debug builds; you name it.
In most C/C++ programs, local variables live on the stack, and "destroyed" means overwritten with something else. In this case that particular location had not yet been overwritten when it was passed as a parameter to printf().
Of course, having such code is asking for trouble because per the C and C++ standards it exhibits undefined behavior.
That is undefined behavior. That means that anything can happen, even what you would expect.
The tricky part of UB is when it gives you the result you expect, and so you think that you are doing it right. Then, any change in an unrelated part of the program changes that...
Answering your question more specifically: you are returning a pointer to an automatic variable that no longer exists when the function returns, but since you call no other functions in between, it happens to keep its old value.
If you call printf twice, for example, the second call will most likely print a different value.
The key idea is that a variable represents a name and type for value stored somewhere in memory. When it is "destroyed", it means that a) that value can no longer be accessed using that name, and b) the memory location is free to be overwritten.
The behavior is undefined because the implementation of the compiler is free to choose what time after "destruction" the location is actually overwritten.