C Array Memory Error?(Beginner) - c

When I try to print all the values in the array(which should be zero?), it starts printing 0's but at the end prints wonky numbers:
"(printing zeros)...0,0,0,0,0,0,0,1810432,0,1809600,0,1809600,0,0,0,5,0,3907584..."
When I extend the array, only at the end do the numbers start to mess up. Is this a memory limitation or something? Very confused, would greatly appreciate if anyone could help a newbie out.
Done in CS50IDE, not sure if that changes anything
int main()
{
int counter [100000];
for(int i = 0; i < 100000; i++)
{
printf("%i,", counter[i]);
}
}

Your array isn't initialized. You simply declare it but never actually set it. In C (and C++, Objective-C) you need to manually set a starting value. Unlike Python, Java, JavaScript or C# this isn't done for you...
which should be zero?
The above assertion is incorrect.

auto variables (variables declared within a block without the static keyword) are not initialized to any particular value when they are created; their value is indeterminate. You can't rely on that value being 0 or anything else.
static variables (declared at file scope or with the static keyword) are initialized to 0 or NULL, depending on type.
You can initialize all of the elements of the array to 0 by doing
int counter [100000] = {0};
If there are fewer elements in the initializer than there are elements in the array, then the extra elements are initialized as though they were static - 0 or NULL. So the first element is being explicitly initialized to 0, and the remaining 99999 elements are implicitly initialized to 0.
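A minimal sketch of the difference between the two cases described above. The names static_counter, count_nonzero, demo_initialized and demo_static are made up for illustration and are not from the question:

```c
#include <stddef.h>

/* Static storage duration: guaranteed to start out as all zeros. */
static int static_counter[1000];

/* Counts the non-zero elements among the first n elements of a. */
size_t count_nonzero(const int *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] != 0) {
            count++;
        }
    }
    return count;
}

/* Automatic array with an initializer: element 0 is set explicitly,
   the remaining 999 elements are zeroed implicitly. */
size_t demo_initialized(void)
{
    int counter[1000] = {0};
    return count_nonzero(counter, 1000);
}

/* Static array without an initializer: still all zeros. */
size_t demo_static(void)
{
    return count_nonzero(static_counter, 1000);
}
```

Both functions should return 0. There is deliberately no variant that reads an automatic array without an initializer, since that read is exactly what the question got wrong.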

This happens because you reserved 100000 * 4 = 400000 bytes of memory (assuming a 4-byte int) but never wrote anything to it, that is, you didn't initialize it.
Reading a memory location that has never been written to gives you garbage. C doesn't zero the array for you because that would cost time: the compiler would have to emit code that writes to 100000 integers, and the language expects the developer never to read memory that hasn't been written to or allocated. If you try printing:
printf("%d\n", counter[100000]);
This may also print a garbage value, even though that element is past the end of the array and was never allocated. Unlike Java, C and C++ don't check array bounds or raise an error for such an access; it is undefined behavior.
Try it yourself
for (int i=0; i<100000; i++) {
counter[i] = i;
printf("%d\n", counter[i]);
}
Now the numbers 0, 1, 2, ..., 99999 will be printed on the screen.

When you declare an array in C, it does not set the elements to zero by default. Instead, it will be filled with whatever data last occupied that location in memory, which could be anything.
The fact that the first portion of the array contained zeros is just a coincidence.
This beginning state of an array is referred to as an "uninitialized" array, as you have not provided any initial values for the array. Before you can use the array, it should be "initialized", meaning that you specify a default value for each position.

Related

Why do variables declared with the same name in different scopes get assigned the same memory addresses?

I know that declaring a char[] variable in a while loop is scoped, having seen this post: Redeclaring variables in C.
Going through a tutorial on creating a simple web server in C, I'm finding that I have to manually clear memory assigned to responseData in the example below, otherwise the contents of index.html are just continuously appended to the response and the response contains duplicated contents from index.html:
while (1)
{
int clientSocket = accept(serverSocket, NULL, NULL);
char httpResponse[8000] = "HTTP/1.1 200 OK\r\n\n";
FILE *htmlData = fopen("index.html", "r");
char line[100];
char responseData[8000];
while(fgets(line, 100, htmlData) != 0)
{
strcat(responseData, line);
}
strcat(httpResponse, responseData);
send(clientSocket, httpResponse, sizeof(httpResponse), 0);
close(clientSocket);
}
Correct by:
while (1)
{
...
char responseData[8000];
memset(responseData, 0, strlen(responseData));
...
}
Coming from JavaScript, this was surprising. Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Also... Why is it that variables of the same name declared in different scopes get assigned the same memory addresses?
According to this question: Variable declared interchangebly has the same pattern of memory address that ISN'T the case. However, I'm finding that this is occurring pretty reliably.
Not completely correct. You don't need to clear the whole responseData array - clearing its first byte is just enough:
responseData[0] = 0;
As Gabriel Pellegrino notes in the comment, a more idiomatic expression is
responseData[0] = '\0';
It explicitly defines a character via its code point of zero value, while the former uses an int constant zero. In both cases the right-side argument has type int which is implicitly converted (truncated) to char type for assignment. (Paragraph fixed thx to the pmg's comment.)
You could know that from the strcat documentation: the function appends its second argument string to the first one. If you need the very first chunk to get stored into the buffer, you want to append it to an empty string, so you need to ensure the string in the buffer is empty. That is, it consists of the terminating NUL character only. memset-ting the whole array is an overkill, hence a waste of time.
Additionally, using strlen on the array is asking for trouble. You can't know what the actual contents of the memory block allocated for the array are. If it has not been used yet, or was overwritten with some other data since your last use, it may contain no NUL character. Then strlen will run past the end of the array, causing Undefined Behavior. And even if it returns successfully, it will give you a string length bigger than the size of the array. As a result memset will run past the end of the array, possibly overwriting some vital data!
Use sizeof whenever you memset an array!
memset(responseData, 0, sizeof(responseData));
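To illustrate the "empty the string before appending" idea, here is a small sketch. The function build_body and its parameters are hypothetical, not part of the tutorial code; it also adds a capacity check that the original loop lacks:

```c
#include <stddef.h>
#include <string.h>

/* Concatenates nlines strings into out (capacity cap bytes).
   Returns the length of the resulting string. */
size_t build_body(char *out, size_t cap, const char *const lines[], size_t nlines)
{
    if (cap == 0) {
        return 0;               /* no room even for the terminator */
    }
    out[0] = '\0';              /* make the buffer an empty string first */

    size_t used = 0;
    for (size_t i = 0; i < nlines; i++) {
        size_t len = strlen(lines[i]);
        if (used + len + 1 > cap) {
            break;              /* stop rather than overflow the buffer */
        }
        strcat(out, lines[i]);  /* safe: out is a valid, terminated string */
        used += len;
    }
    return used;
}
```

Calling build_body twice on the same buffer gives the same result both times, because the first byte is reset on every call and whatever was left behind is ignored.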
EDIT
In the above I tried to explain how to fix the issue with your code, but I didn't answer your questions. Here they are:
Why do variables (...) in different scopes get assigned the same memory addresses?
In regard of execution each iteration of the while(1) { ... } loop indeed creates a new scope. However, each scope terminates before the new one is created, so the compiler reserves appropriate block of memory on the stack and the loop re-uses it in every iteration. That also simplifies a compiled code: every iteration is executed by exactly the same code, which simply jumps at the end to the beginning. All instructions within the loop that access local variables use exactly the same addressing (relative to the stack) in each iteration. So, each variable in the next iteration has precisely the same location in memory as in all previous iterations.
I'm finding that I have to manually clear memory
Yes, automatic variables, allocated on the stack, are not initialized in C by default. We always need to explicitly assign an initial value before we use it – otherwise the value is undefined and may be incorrect (for example, a floating-point variable can appear not-a-number, a character array may appear not terminated, an enum variable may have a value out of the enum's definition, a pointer variable may not point at a valid, accessible location, etc.).
otherwise the contents (...) are just continuously appended
This one was answered above.
Coming from JavaScript, this was surprising
Yes, JavaScript apparently creates new variables at the new scope, hence each time you get a brand new array – and it is empty. In C you just get the same area of a previously allocated memory for an automatic variable, and it's your responsibility to initialize it.
Additionally, consider two consecutive loops:
void test()
{
int i;
for (i=0; i<5; i++) {
char buf1[10];
sprintf(buf1, "%d", i);
}
for (i=0; i<1; i++) {
char buf2[10];
printf("%s\n", buf2);
}
}
The first one prints a single-digit, character representation of five numbers into the character array, overwriting it each time - hence the last value of buf1[] (as a string) is "4".
What output do you expect from the second loop? Generally speaking, we can't know what buf2[] will contain, and printf-ing it causes UB. However we may suppose the same set of variables (namely a single 10-items character array) from both disjoint scopes will get allocated the same way in the same part of a stack. If this is the case, we'll get a digit 4 as an output from a (formally uninitialized) array.
This result depends on the compiler construction and should be considered a coincidence. Do not rely on it as this is UB!
Why wouldn't C just reset that memory behind the scenes?
Because it's not told to. The language was created to compile to efficient, compact code. It does as little 'behind the scenes' as possible. Among other things, it does not initialize automatic variables unless it's told to. Which means you need to add an explicit initializer to a local variable declaration or add an initializing instruction (e.g. an assignment) before the first use. (This does not apply to global, module-scope variables; those are initialized to zeros by default.)
In higher-level languages some or all variables are initialized on creation, but not in C. That's its feature and we must live with it – or just not use this language.
With this line:
char responseData[8000];
You are saying to your compiler: Hey big C, give me a 8000 bytes chunk and name it responseData.
At runtime, unless you ask for it, no one will ever clean the memory or give you a "brand-new" chunk. That means the 8000-byte chunk you get in any given execution can hold any possible combination of bits. What can easily happen is that you get the same memory region on every execution, and thus the same bits your big C gave you the first time. So, if you don't clean it, you get the impression that you're using the same variable, but you're not! You're just using the same (never cleaned) memory region.
I'd add that it's part of the programmer's responsibilities to clean, if you need to, the memory you're allocating, in dynamic or static way.
Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Objects with auto storage duration (i.e., block-scope variables) are not automatically initialized - their initial contents are indeterminate. Remember that C is a product of the early 1970s, and errs on the side of runtime speed over convenience. The C philosophy is that the programmer is in the best position to know whether something should be initialized to a known value or not, and is smart enough to do it themselves if needed.
While you're logically creating and destroying a new instance of responseData on each loop iteration, it turns out the same memory location is being reused each time through. We like to think that space is allocated for each block-scope object as we enter the block and released as we leave it, but in practice that's (usually) not the case - space for all block-scope objects within a function is allocated on function entry, and released on function exit.
Different objects in different scopes may map to the same memory behind the scenes. Consider something like
void bletch( void )
{
if ( some_condition )
{
int foo = some_function();
printf( "%d\n", foo );
}
else
{
int bar = some_other_function();
printf( "%d\n", bar );
}
}
It's impossible for both foo and bar to exist at the same time, so there's no reason to allocate separate space for both - the compiler will (usually) allocate space for one int object at function entry, and that space gets used for either foo or bar depending on which branch is taken.
So, what happens with responseData is that space for one 8000-character array is allocated on function entry, and that same space gets used for each iteration of the loop. That's why you need to clear it out on each iteration, either with a memset call or with an initializer like
char responseData[8000] = {0};
As M.M points out in a comment, this isn't true for variable-length arrays (and potentially other variably modified types) - space for those is set aside as needed, although where that space is taken from isn't specified by the language definition. For all other types, though, the usual practice is to allocate all necessary space on function entry.
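As a quick check of the initializer approach, the sketch below (function name is my own) dirties the buffer at the end of every pass and verifies that the initializer is applied again at the start of the next one:

```c
#include <string.h>

/* Returns 1 if buf was an empty string at the start of every iteration. */
int initializer_runs_each_iteration(void)
{
    int always_empty = 1;
    for (int i = 0; i < 3; i++) {
        char buf[16] = {0};         /* re-applied on every pass through the loop */
        if (buf[0] != '\0') {
            always_empty = 0;
        }
        strcpy(buf, "stale data");  /* dirty the storage for the next pass */
    }
    return always_empty;
}
```

Without the = {0} the check would read an indeterminate value, so that variant is not something to test; the point is only that an initializer on a block-scope declaration runs each time the declaration is reached.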

How does an uninitialized array affect the outcome? (no debugging)

Line 1:
int temp2 [4];
for(j=0;j<=4;j++){
for(i=0;i<=4;i++) {
temp2[j] = temp2[j] + election[i][j];
}
}
printf("%d",temp2[3]);
In this above example, the nested for loops sums up the columns of a 5x5 table.
However, the last column is always summed up incorrectly.
When I changed Line 1 to:
int temp2[4] = {0};
All of a sudden the calculations came out perfectly! What exactly happened between the initialization of the array?
If an array is uninitialized, does that mean its last element will always contain some garbage value?
If an array is uninitialized, does that mean its last element will always contain some garbage value?
Whether they contain a garbage value or any value at all is a matter of interpretation, because any attempt to read from such uninitialized variables is undefined behaviour (UB)[1]. So, you can't even check what is stored in those variables. In practice, UB may manifest itself as "garbage" values being printed out, but technically anything could happen.
Also note that you are accessing the array out of bounds. That is also UB.
for(j=0;j<=4;j++){ /* Oops! Should be j < 4 */
[1] This is a simplification. In practice, implementations can assign unspecified values to uninitialized variables, or use trap representations. This means the result of reading an uninitialized variable could simply be unspecified. But it could also be whatever a given implementation does when a trap value is read. I find it easier to lump everything under UB. See related question: What happens to a declared, uninitialized variable in C? Does it have a value?
Yes, an uninitialized array will contain unpredictable garbage. You must initialize it.
If an array is uninitialized, does that mean its last element will always contain some garbage value?
If the array is not global or static, yes, it may contain garbage values. Global and static variables that have no explicit initializer are placed in the BSS segment, which is set to zero at program start.
An automatic array gets no such treatment: it simply holds whatever was previously stored at that memory location.
Now, when you are accessing that memory what you get is undefined behavior.
Also, note that the snippet is accessing the array out of bounds. So, please use:
int temp2 [4];
for(j=0;j<=3;j++){
for(i=0;i<=3;i++) {
or
int temp2 [4];
for(j=0;j<4;j++){
for(i=0;i<4;i++) {
First, as Jonathan Leffler mentioned, you are looping too many times: you declared an array of 4 elements but loop 5 times. Try changing your outer loop to j<4 and inner loop to i<4:
Line 1: int temp2 [4];
for(j=0;j<4;j++){
for(i=0;i<4;i++) {
temp2[j] += election[i][j];
}
}
printf("%d",temp2[3]);
You should also initialize your array, as you can't predict what is in memory at the point of creation (also depends on what language you're using)
An uninitialized array will contain garbage data. I've noticed that in the latest version of Visual Studio, if the array is of a simple data type such as int, then the compiler/IDE automatically initializes it to zeros, but I wouldn't rely on it. As a rule, I recommend you initialize your arrays before you start doing operations like summing etc.
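Putting both fixes together, here is one possible version. It assumes the table really is 5x5, as the question states, so the sums array needs five elements rather than four; ROWS, COLS and column_sums are names I picked for the sketch:

```c
#define ROWS 5
#define COLS 5

/* Stores the sum of each column of table in sums[0..COLS-1]. */
void column_sums(int table[ROWS][COLS], int sums[COLS])
{
    for (int j = 0; j < COLS; j++) {
        sums[j] = 0;                    /* start every sum from a known value */
        for (int i = 0; i < ROWS; i++) {
            sums[j] += table[i][j];     /* i and j both stay inside the bounds */
        }
    }
}
```

Tying the loop limits to the same constants used in the array declarations makes it harder for the two to drift apart.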

handling data violation in c

I am starting to learn c and cannot find a clear example of handling memory violations. Currently I have written a piece of code that uses a variable and an array.
I assign a value to the variable and then populate the array with a set of initial values. However one of the values in the array is being saved at the same address as the variable and hence overwriting the variable.
Could some one please give me a simple example of how to handle such errors or to avoid such errors....thanks
Once an error such as a memory violation has occurred in C, you cannot 'handle' it. So, you have to avoid it in the first place. The way to do what you want is as follows:
int a[10];
int i;
for( i = 0; i < 10; i++ )
a[i] = 5;
This is a guess but seems pretty much your problem.
You are overwriting beyond the bounds of the array.
C does not guard you against writing beyond the bounds of an allocated array. You as a programmer must ensure you do not do so. Failing to do so results in Undefined Behavior, and then anything can happen (literally): your program might work, might not work, or might show unusual behavior.
For eg:
int arr[10];
Declares an array of 10 integers and the valid subscript range is from 0 to 9,
You should ensure your program uses valid subscripts.
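Since C won't check the subscript for you, one way to do it yourself is a small helper like the one sketched here. set_checked is a hypothetical name, and passing the length explicitly is just one possible design:

```c
#include <stddef.h>

/* Writes value to arr[index] only if index is a valid subscript.
   Returns 1 on success, 0 if the index is out of range. */
int set_checked(int arr[], size_t len, size_t index, int value)
{
    if (index >= len) {
        return 0;       /* refuse to write past the end of the array */
    }
    arr[index] = value;
    return 1;
}
```

For an array declared as int arr[10], set_checked(arr, 10, 9, 5) succeeds, while set_checked(arr, 10, 10, 5) returns 0 and leaves neighbouring variables untouched.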

Check if 2d pointer array has user defined value in C?

Sample code:
float** a;
a = (float**) malloc(numNodes * sizeof(float*));
for(int i=0; i<numNodes; i++)
{
a[i] = (float*)malloc((numNodes-1) * sizeof(float));
}
I am creating a dynamic 2d array above. Before populating, I noticed that each block in the array already holds this value: -431602080.000000 and not NULL. Why is this?
There are situations where not all spaces within the array are used.
So, my query is simple, Is there an elegant way to check if the each block has this default value or a user defined value?
Thanks in advance.
The content of memory allocated with malloc (as well as of variables allocated on the stack) is undefined, so it may very well be anything. Usually you get space filled with zeroes (because the OS blanks memory pages that were used by other processes) or residues of the previous use of those memory pages (this is often the case if the memory page belonged to your process), but this is what happens under the hood, the C standard does not give any guarantees.
So, in general there's no "default value" and no way to check if your memory has been changed; however you can init the memory blocks you use with magic values that you're sure that will not be used as "real data", but it'll be just a convention internal to your application.
Luckily, for floating point variables there are several magic values like quiet NaN you can use for this purpose; in general you can use the macro NAN defined in <math.h> to set a float to NaN.
By the way, you shouldn't read uninitialized floats and doubles, since the usual format they are stored in (IEEE 754) contains some magic values (like the signaling NaN) that can raise arithmetic exceptions when they are read, so if your uninitialized memory happens to contain such bit pattern your application will probably crash.
C runtimes are not required to initialize any memory that you didn't initialize yourself and the values that they hold are essentially random garbage left over from the last time that memory was used. You will have to set them all to NULL explicitly first or use calloc.
Extending the good answer of Matteo Italia:
The code of initialization of a single array would look like:
float* row;
row = malloc( numNodes*sizeof(float) );
for (int i=0; i<numNodes; ++i) {
row[i] = nanf(""); // set a Not-a-Number magic value of type float (nanf is declared in <math.h> and takes a string argument)
}
(I'll leave it up to you to change this for your multi-dimensional array)
Then somewhere:
float value = ...; // read the array
if (isnan(value)) {
// not initialized
} else {
// initialized - do something with this
}
One thing is important to remember: NaN == NaN will yield false, so it's best to use isnan(), not == to test for the presence of this value.
In C, automatic variables don't get automatically initialized. You need to explicitly set your variable to 0, if that's what you want.
The same is true for malloc, which doesn't initialize the space it allocates on the heap. You can use calloc if you want it initialized:
a = malloc( numNodes*sizeof(float*) ); // no need to initialize this
for ... {
a[i] = calloc( numNodes-1, sizeof(float) );
}
Before populating, I noticed that each block in the array already holds this value: -431602080.000000 and not NULL. Why is this?
malloc() doesn't initialize the memory which it allocates. You need to use calloc() if you want 0 initialization
void *calloc(size_t nelem, size_t elsize);
The calloc() function allocates unused space for an array of nelem elements each of whose size in bytes is elsize. The space shall be initialized to all bits 0.
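A fuller sketch of the calloc approach, with the error handling the snippets above leave out. alloc_zeroed_rows and free_rows are hypothetical helper names. Note the assumption in the comment: all-bits-zero reads as 0.0f on IEEE 754 platforms, which the C standard itself does not promise.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates rows pointers, each pointing to cols zeroed floats.
   Assumes all-bits-zero represents 0.0f (true for IEEE 754).
   Returns NULL if any allocation fails. */
float **alloc_zeroed_rows(size_t rows, size_t cols)
{
    float **a = malloc(rows * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < rows; i++) {
        a[i] = calloc(cols, sizeof **a);
        if (a[i] == NULL) {
            while (i > 0) {
                free(a[--i]);   /* release the rows allocated so far */
            }
            free(a);
            return NULL;
        }
    }
    return a;
}

/* Releases everything allocated by alloc_zeroed_rows. */
void free_rows(float **a, size_t rows)
{
    if (a == NULL) {
        return;
    }
    for (size_t i = 0; i < rows; i++) {
        free(a[i]);
    }
    free(a);
}
```

For the code in the question this would be called as alloc_zeroed_rows(numNodes, numNodes - 1). Keep in mind that zero only works as a "not set" marker if 0.0 is never a real value in your data; otherwise the NaN approach above is the better fit.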

How to check if "set" in c

If I allocate a C array like this:
int array[ 5 ];
Then, set only one object:
array[ 0 ] = 7;
How can I check whether all the other keys ( array[1], array[2], …) are storing a value? (In this case, of course, they aren't.)
Is there a function like PHP's isset()?
if ( isset(array[ 1 ]) ) ...
There is nothing like this in C. A static array's content is always "set". You could, however, fill in some special value to pretend it is uninitialized, e.g.
// make sure this value isn't really used.
#define UNINITIALIZED 0xcdcdcdcd
int array[5] = {UNINITIALIZED, UNINITIALIZED, UNINITIALIZED, UNINITIALIZED, UNINITIALIZED};
array[0] = 7;
if (array[1] != UNINITIALIZED) {
...
You can't
Their values are all undefined (thus random).
You could explicitly zero out all values to start with so you at least have a good starting point. But using magic numbers to detect if an object has been initialized is considered bad practice (but initializing variables is considered good practice).
int array[ 5 ] = {0};
But if you want to explicitly check if they have been explicitly set (without using magic numbers) since creation you need to store that information in another structure.
int array[ 5 ] = {0}; // Init all to 0
int isSetFlag[ 5 ] = {0}; // Init all to 0 (false)
int getVal(int index) {return array[index];}
int isSet(int index) {return isSetFlag[index];}
void setVal(int index,int val) {array[index] = val; isSetFlag[index] = 1; }
In C, all the elements will have values (garbage) at the time of allocation. So you cannot really have a function like what you are asking for.
However, you can fill it by default with some standard value, such as 0 using memset() or INT_MIN using a loop (memset() sets individual bytes, so it cannot store an arbitrary int value), and then write your own isset() code.
I don't know php, but one of two things is going on here
the php array is actually a hash-map (awk does that)
the php array is being filled with nullable types
in either case there is a meaningful concept of "not set" for the values of the array. On the other hand a c array of built in type has some value in every cell at all times. If the array is uninitialized and is automatic or was allocated on the heap those values may be random, but they exist.
To get the php behavior:
Implement (or find a library with) and use a hashmap instead of an array.
Make it an array of structures which include an isNull field.
Initialize the array to some sentinel value in all cells.
One solution perhaps is to use a separate array of flags. When you assign one of the elements, set the flag in the boolean array.
You can also use pointers. You can use null pointers to represent data which has not been assigned yet. I made an example below:
int * p_array[3] = {NULL,NULL,NULL};
p_array[0] = malloc(sizeof(int));
*p_array[0] = (int)0;
p_array[2] = malloc(sizeof(int));
*p_array[2] = (int)4;
for (int x = 0; x < 3; x++) {
if (p_array[x] != NULL) {
printf("Element at %i is assigned and the value is %i\n",x,*p_array[x]);
}else{
printf("Element at %i is not assigned.\n",x);
}
}
You could make a function which allocates the memory and sets the data and another function which works like the isset function in PHP by testing for NULL for you.
I hope that helps you.
Edit: Make sure the memory is deallocated once you have finished. Another function could be used to deallocate certain elements or the entire array.
I've used NULL pointers before to signify data has not been created yet or needs to be recreated.
An approach I like is to make 2 arrays, one a bit-array flagging which indices of the array are set, and the other containing the actual values. Even in cases where you don't need to know whether an item in the array is "set" or not, it can be a useful optimization. Zeroing a 1-bit-per-element bit array is a lot faster than initializing an 8-byte-per-element array of size_t, especially if the array will remain sparse (mostly unfilled) for its entire lifetime.
One practical example where I used this trick is in a substring search function, using a Boyer-Moore-style bad-character skip table. The table requires 256 entries of type size_t, but only the ones corresponding to characters which actually appear in the needle string need to be filled. A 1kb (or 2kb on 64-bit) memset would dominate cpu usage in the case of very short searches, leading other implementations to throw around heuristics for whether or not to use the table. But instead, I let the skip table go uninitialized, and used a 256-bit bit array (only 32 bytes to feed to memset) to flag which entries are in use.
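Roughly, the idea looks like the sketch below. This is not the code from my search function; the struct and function names are invented for the example, and callers are assumed to pass idx < TABLE_SIZE.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 256
#define FLAG_BYTES ((TABLE_SIZE + CHAR_BIT - 1) / CHAR_BIT)

struct sparse_table {
    size_t values[TABLE_SIZE];        /* deliberately left uninitialized */
    unsigned char set[FLAG_BYTES];    /* one bit per entry: 1 = value is valid */
};

/* Only the small flag array is cleared, not the values. */
void table_init(struct sparse_table *t)
{
    memset(t->set, 0, sizeof t->set);
}

/* Stores value at idx and marks the entry as set. */
void table_put(struct sparse_table *t, size_t idx, size_t value)
{
    t->values[idx] = value;
    t->set[idx / CHAR_BIT] |= (unsigned char)(1u << (idx % CHAR_BIT));
}

/* Returns 1 if idx has been stored with table_put, 0 otherwise. */
int table_is_set(const struct sparse_table *t, size_t idx)
{
    return (t->set[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1u;
}
```

The rule that makes this safe is that values[idx] is only ever read after table_is_set(t, idx) has returned 1, so the uninitialized entries are never touched.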

Resources