I am starting to learn C and cannot find a clear example of handling memory violations. Currently I have written a piece of code that uses a variable and an array.
I assign a value to the variable and then populate the array with a set of initial values. However, one of the values in the array is being saved at the same address as the variable, and hence overwrites the variable.
Could someone please give me a simple example of how to handle such errors, or how to avoid them? Thanks.
Once an error such as a memory violation has occurred in C, you cannot 'handle' it. So, you have to avoid it in the first place. The way to do what you want is as follows:
int a[10];
int i;
for( i = 0; i < 10; i++ )
a[i] = 5;
This is a guess but seems pretty much your problem.
You are overwriting beyond the bounds of the array.
C does not guard you against writing beyond the bounds of an allocated array. You as a programmer must ensure you do not do so. Failing to do so results in Undefined Behavior, and then literally anything can happen: your program might work, might not work, or might show unusual behavior.
For example:
int arr[10];
This declares an array of 10 integers, and the valid subscript range is 0 to 9.
You should ensure your program uses only valid subscripts.
I was reading through some source code and found a functionality that basically allows you to use an array as a linked list? The code works as follows:
#include <stdio.h>
int
main (void)
{
int *s;
for (int i = 0; i < 10; i++)
{
s[i] = i;
}
for (int i = 0; i < 10; i++)
{
printf ("%d\n", s[i]);
}
return 0;
}
I understand that s points to the beginning of an array in this case, but the size of the array was never defined. Why does this work and what are the limitations of it? Memory corruption, etc.
Why does this work
It does not; it only appears to work (which is actually bad luck).
and what are the limitations of it? Memory corruption, etc.
Undefined behavior.
Keep in mind: In your program whatever memory location you try to use, it must be defined. Either you have to make use of compile-time allocation (scalar variable definitions, for example), or, for pointer types, you need to either make them point to some valid memory (address of a previously defined variable) or, allocate memory at run-time (using allocator functions). Using any arbitrary memory location, which is indeterminate, is invalid and will cause UB.
I understand that s points to the beginning of an array in this case
No, the pointer has automatic storage duration and was not initialized:
int *s;
So it has an indeterminate value and points nowhere.
but the size of the array was never defined
There is no array declared or defined anywhere in the program.
Why does this work and what are the limitations of it?
It works by chance. That is, it produced the expected result when you ran it. But the program actually has undefined behavior.
As I first pointed out in the comments, what you are doing does not work; it seems to work, but it is in fact undefined behaviour.
In computer programming, undefined behavior (UB) is the result of
executing a program whose behavior is prescribed to be unpredictable,
in the language specification to which the computer code adheres.
Hence, it might "work" sometimes, and sometimes not. Consequently, one should never rely on such behaviour.
If it were that easy to allocate a dynamic array in C, why would anyone use malloc?! Try it with a value bigger than 10 to increase the likelihood of a segmentation fault.
See the SO thread for how to properly allocate an array in C.
When I try to print all the values in the array(which should be zero?), it starts printing 0's but at the end prints wonky numbers:
"(printing zeros)...0,0,0,0,0,0,0,1810432,0,1809600,0,1809600,0,0,0,5,0,3907584..."
When I extend the array, only at the end do the numbers start to mess up. Is this a memory limitation or something? Very confused, would greatly appreciate if anyone could help a newbie out.
Done in CS50IDE, not sure if that changes anything
int main()
{
int counter [100000];
for(int i = 0; i < 100000; i++)
{
printf("%i,", counter[i]);
}
}
Your array isn't initialized. You simply declare it but never actually set it. In C (and C++, Objective-C) you need to manually set a starting value. Unlike Python, Java, JavaScript or C# this isn't done for you...
which should be zero?
The above assertion is incorrect.
auto variables (variables declared within a block without the static keyword) are not initialized to any particular value when they are created; their value is indeterminate. You can't rely on that value being 0 or anything else.
static variables (declared at file scope or with the static keyword) are initialized to 0 or NULL, depending on type.
You can initialize all of the elements of the array to 0 by doing
int counter [100000] = {0};
If there are fewer elements in the initializer than there are elements in the array, then the extra elements are initialized as though they were static - 0 or NULL. So the first element is being explicitly initialized to 0, and the remaining 99999 elements are implicitly initialized to 0.
This happens because you reserved 100000 * 4 = 400000 bytes of memory (assuming a 4-byte int) but never wrote anything to it, i.e. you didn't initialize it.
So garbage is printed when you read a memory location that hasn't been written to yet. Zeros aren't guaranteed because C favors speed: the compiler doesn't spend time writing to 100000 integer addresses you didn't ask it to, and a developer is expected never to read memory that they haven't written to or allocated. If you try printing:
printf("%d\n", counter[100000]);
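This would probably also print a garbage value, but you didn't allocate that element, did you? Index 100000 is one past the end of the array, and reading it is undefined behavior. C and C++ don't check bounds or raise errors when you attempt such an operation, unlike Java.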
Try it yourself
for (int i=0; i<100000; i++) {
counter[i] = i;
printf("%d\n", counter[i]);
}
Now only the numbers 0, 1, 2, ..., 99999 will be printed on the screen.
When you declare an array in C, it does not set the elements to zero by default. Instead, it will be filled with whatever data last occupied that location in memory, which could be anything.
The fact that the first portion of the array contained zeros is just a coincidence.
This beginning state of an array is referred to as an "uninitialized" array, as you have not provided any initial values for the array. Before you can use the array, it should be "initialized", meaning that you specify a default value for each position.
Suppose I declare the following
typedef struct{
int age;
int weight;
} Man;
Then I make an array of Man such as
Man *manArr = malloc(sizeof(Man) * 2);
My understanding is that I now have two cells each capable of holding a Man type in them..but how am I able to do this then?
manArr[45] = (Man) {33, 23};
I would have imagined that I would have seg faulted because there only exist two cells, but I can printf the values of manArr[45]. What's a good way to, for example, go through struct arrays, do something to their fields, and move on to the next without "going out of bounds" per se?
Thanks
Accessing out-of-bounds is not guaranteed to segfault. It is defined by the C standard as undefined behavior, i.e. anything can happen, including seemingly error-free behavior.
What's a good way to, for example, go through struct arrays, do something to their fields, and move on to the next without "going out of bounds" per se?
Remember the size of the array.
const size_t manArrSize = 2;
Man *manArr = malloc(sizeof(Man) * manArrSize);
for (size_t index = 0; index < manArrSize; ++index)
{
// Access `manArr[index]`.
}
Going out of bounds of an array causes undefined behaviour. This means that anything could happen. If you're lucky you'll get a segfault, but you may also get cases where the memory location happens to be somewhere you can access.
As for move on to the next without "going out of bounds":
You could either use an extra variable to store the size of the array, or decide on a sentinel value in the array (and allocate one more slot for it) so that if a certain element is equal to the sentinel value, you know it is the end. For example, argv uses NULL as the sentinel value.
Line 1:
int temp2 [4];
for(j=0;j<=4;j++){
for(i=0;i<=4;i++) {
temp2[j] = temp2[j] + election[i][j];
}
}
printf("%d",temp2[3]);
In this above example, the nested for loops sums up the columns of a 5x5 table.
However, the last column is always summed up incorrectly.
When I changed Line 1 to:
int temp2[4] = {0};
All of a sudden the calculations came out perfectly! What exactly happened between the initialization of the array?
If an array is uninitialized, does that mean its last element will always contain some garbage value?
If an array is uninitialized, does that mean its last element will always contain some garbage value?
Whether they contain a garbage value or any value at all is a matter of interpretation, because any attempt to read from such uninitialized variables is undefined behaviour (UB) [1]. So, you can't even check what is stored in those variables. In practice, UB may manifest itself as "garbage" values being printed out, but technically anything could happen.
Also note that you are accessing the array out of bounds. That is also UB.
for(j=0;j<=4;j++){ /* Oops! Should be j < 4 */
[1] This is a simplification. In practice, implementations can assign unspecified values to uninitialized variables, or use trap representations. This means the result of reading an uninitialized variable could simply be unspecified. But it could also be whatever a given implementation does when a trap value is read. I find it easier to lump everything under UB. See related question: What happens to a declared, uninitialized variable in C? Does it have a value?
Yes, an uninitialized array will contain unpredictable garbage. You must initialize it.
If an array is uninitialized, does that mean its last element will always contain some garbage value?
If the array is not global or static, yes, it can contain garbage values. Global and static variables without an explicit initializer are typically placed in the BSS segment, which is set to zero at program startup; variables with an explicit initializer get that value instead.
An automatic array gets no such treatment: its memory holds whatever was left there before, so computing with it can give wrong results or crash the program.
Now, when you are accessing that memory what you get is undefined behavior.
Also, note that the snippet is accessing the array out of bounds. So, please use:
int temp2 [4];
for(j=0;j<=3;j++){
for(i=0;i<=3;i++) {
or
int temp2 [4];
for(j=0;j<4;j++){
for(i=0;i<4;i++) {
First, as Jonathan Leffler mentioned, you are looping too many times: you declared an array of 4 elements but loop 5 times. Try changing your outer loop to j<4 and inner loop to i<4:
Line 1: int temp2 [4];
for(j=0;j<4;j++){
for(i=0;i<4;i++) {
temp2[j] += election[i][j];
}
}
printf("%d",temp2[3]);
You should also initialize your array, as you can't predict what is in memory at the point of creation (this also depends on what language you're using).
An uninitialized array will contain garbage data. I've noticed that in the latest version of Visual Studio, if the array is of a simple data type such as int, then the compiler/IDE automatically initializes it to zeros, but I wouldn't rely on it. As a rule, I recommend you initialize your arrays before you start doing operations like summing.
I want to write a C code to see the difference between static and dynamic allocation.
That's my idea but it doesn't work.
It simply initializes an array of size 10, but assigns 100 elements instead of 10. I'll then initialize another array large enough hoping to replace the 90 elements that're not part of array1[10], then I print out the 100 elements of array1.
int i;
int array1[10];
int array2[10000];
for(i=0;i<100;i++)
array1[i] = i;
for(i=0;i<10000;i++)
array2[i] = i+1;
for(i=0;i<100;i++)
{
printf("%d \n",array1[i]);
}
What I hope to get is garbage outside the first 10 elements when using static allocation; afterwards, I'll use malloc and realloc to ensure that the 100 elements would be there correctly. But unfortunately, it seems that the memory is large enough so that the rest of the 100 elements wouldn't be replaced!
I tried to run the code on linux and use "ulimit" to limit the memory size, but it didn't work either.
Any ideas please?
C doesn't actually do any bounds checking on arrays. It is left to the OS to detect accesses to invalid memory.
Accessing outside the array bounds is undefined behavior. In the C99 draft standard, Annex J.2 (Undefined behavior) includes the following point:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
In this example you are declaring a stack-based array. Accessing it out of bounds will touch memory from already allocated stack space. Currently undefined behavior is not in your favor, as there is no seg fault. It is the programmer's responsibility to handle boundary conditions when writing code in C/C++.
You do get garbage after the first 10 elements of array1. All of the data after element 9 should not be considered allocated by the stack and can be written over at any time. When the program prints the 100 elements of array1, you might see the remnants of either for loop because the two arrays are allocated next to each other and normally haven't been written over. If this were implemented in a larger program, other arrays might take up the space after these two example arrays.
When you access array1[10] and higher index values, the program will just keep writing into adjacent memory locations even though they don't "belong" to your array. At some point you might try to access a memory location that's forbidden, but as long as you're mucking with memory that the OS has given to your program, this will run. The results will be unpredictable though. It could happen that this will corrupt data that belongs to another variable in your program, for example. It could also happen that the value that you wrote there will still be there when you go back to read it if no other variable has been "properly assigned" that memory location. (This seems to be what's happening in the specific case that you posted.)
All of that being said, I'm not clear at all how this relates to potential differences between static and dynamic memory allocation since you've only done static allocation in the program and you've deliberately introduced a bug.
Changing the memory size won't resolve your problem, because when you create your two arrays, the second one should be right after the first one in memory.
Your code should do what you think it will, and on my computer, it does.
Here's my output :
0
1
2
3
4
5
6
7
8
9
10
11
1
2
3
4
5
...
What OS are you running your code on ? (I'm on linux 64bit).
Anyway, as everybody told you, DON'T EVER DO THIS IN A REAL PROGRAM. Writing outside an array is undefined behaviour and could cause your program to crash.
Writing out of bounds of an array will prove nothing and is not well-defined. Generally, there's nothing clever or interesting involved in invoking undefined behavior. The only thing you'll achieve by that is random crashes.
If you wish to know where a variable is allocated, you have to look at addresses. Here's one example:
#include <stdio.h>
#include <stdlib.h>
int main (void)
{
int stack;
static int data = 1;
static int bss = 0;
int* heap = malloc(sizeof(*heap));
printf("stack: %p\n", (void*)&stack);
printf(".data: %p\n", (void*)&data);
printf(".bss: %p\n", (void*)&bss);
printf(".heap: %p\n", (void*)heap);
}
This should print 4 distinctively different addresses (.data and .bss probably close to each other though). To know exactly where a certain memory area starts, you either need to check some linker script or use a system-specific API. And once you know the memory area's offset and size, you can determine if a variable is stored within one of the different memory segments.