I have some C code.
I declared an array with a size of 12 in each dimension, but the compiler lets me assign to elements beyond that size instead of reporting an index-out-of-bounds error.
Can anyone please explain why this is happening?
int vas1[12][12];
vas1[15][15]=0;
int i,j;
for (i = 0; i < 15; i ++)
{
for (j = 0; j < 15; j ++) {
printf("i=%d j=%d vas=%d",i,j,vas1[i][j]);
}
}
printf("Success");
Thanks
C doesn't do bounds checking on array accesses. It simply marks illegal accesses as "undefined behavior", so each implementation can do as it pleases. Since using C means you know what you're doing, C allows you to shoot yourself in the foot.
In practice, sometimes you will get an error, sometimes not. Sometimes you won't get an error but the client will. Worst case scenario: you won't get an error but the program will behave really weirdly (variables changing values for no reason, etc.).
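Because the language will not check for you, any checking has to be written by hand. Here is a minimal sketch, assuming a 12 x 12 grid like vas1 in the question. The names grid_set, ROWS and COLS are hypothetical and only serve the illustration:

```c
#include <stdbool.h>
#include <stddef.h>

#define ROWS 12
#define COLS 12

/* Hypothetical helper: stores value only when (row, col) lies inside the
 * ROWS x COLS grid. Returns true on success, false when out of range. */
static bool grid_set(int grid[ROWS][COLS], size_t row, size_t col, int value)
{
    if (row >= ROWS || col >= COLS) {
        return false; /* reject the write instead of invoking undefined behavior */
    }
    grid[row][col] = value;
    return true;
}
```

With a helper like this, a call such as grid_set(vas1, 15, 15, 0) would return false rather than write outside the array.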
I'm stuck with a program where just having a printf statement is causing changes in the output.
I have an array of n elements. For the median of every d consecutive elements, if the (d+1)th element is greater than or equal to twice the median, I increment the value of notifications. The complete problem statement can be found here.
This is my program:
#include <math.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#define RANGE 200
float find_median(int *freq, int *ar, int i, int d) {
int *count = (int *)calloc(sizeof(int), RANGE + 1);
for (int j = 0; j <= RANGE; j++) {
count[j] = freq[j];
}
for (int j = 1; j <= RANGE; j++) {
count[j] += count[j - 1];
}
int *arr = (int *)malloc(sizeof(int) * d);
float median;
for (int j = i; j < i + d; j++) {
int index = count[ar[j]] - 1;
arr[index] = ar[j];
count[ar[j]]--;
if (index == d / 2) {
if (d % 2 == 0) {
median = (float)(arr[index] + arr[index - 1]) / 2;
} else {
median = arr[index];
}
break;
}
}
free(count);
free(arr);
return median;
}
int main() {
int n, d;
scanf("%d %d", &n, &d);
int *arr = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) {
scanf("%i", &arr[i]);
}
int *freq = (int *)calloc(sizeof(int), RANGE + 1);
int notifications = 0;
if (d < n) {
for (int i = 0; i < d; i++)
freq[arr[i]]++;
for (int i = 0; i < n - d; i++) {
float median = find_median(freq, arr, i, d); /* Count sorts the arr elements in the range i to i+d-1 and returns the median */
if (arr[i + d] >= 2 * median) { /* If the (i+d)th element is greater or equals to twice the median, increments notifications*/
printf("X");
notifications++;
}
freq[arr[i]]--;
freq[arr[i + d]]++;
}
}
printf("%d", notifications);
return 0;
}
Now, for large inputs like this, the program outputs 936 as the value of notifications, whereas when I just remove the statement printf("X"), the program outputs 1027 as the value of notifications.
I'm really not able to understand what is causing this behavior in my program, and what I'm missing/overseeing.
Your program has undefined behavior here:
for (int j = 0; j <= RANGE; j++) {
count[j] += count[j - 1];
}
You should start the loop at j = 1. As coded, you access memory before the beginning of the array count, which could cause a crash or produce an unpredictable value. Changing anything in the running environment can lead to a different behavior. As a matter of fact, even changing nothing could.
The rest of the code is more difficult to follow at a quick glance, but given the computations on index values, there may be more problems there too.
For starters, you should add some consistency checks:
verify the return value of scanf() to ensure proper conversions.
verify the values read into arr, they must be in the range 0..RANGE
verify that int index = count[ar[j]] - 1; never produces a negative number.
same for count[ar[j]]--;
verify that median = (float)(arr[index] + arr[index - 1]) / 2; is never evaluated with index == 0.
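The first few checks could be sketched as follows. parse_value and valid_index are hypothetical helper names, and parsing from a string with sscanf is an assumption made to keep the example small (the program in the question uses scanf on standard input):

```c
#include <stdbool.h>
#include <stdio.h>

#define RANGE 200

/* Hypothetical helper: accepts a value only if the conversion succeeded
 * and the result lies in 0..RANGE, so it is safe to use as freq[value]. */
static bool parse_value(const char *text, int *out)
{
    int v;

    if (sscanf(text, "%d", &v) != 1) {
        return false; /* nothing was converted */
    }
    if (v < 0 || v > RANGE) {
        return false; /* would index outside freq[] and count[] */
    }
    *out = v;
    return true;
}

/* Hypothetical helper: true when index can be used with an array of d elements. */
static bool valid_index(int index, int d)
{
    return index >= 0 && index < d;
}
```

A check such as valid_index(index, d) before arr[index] = ar[j], and index > 0 before reading arr[index - 1], would turn silent memory corruption into a failure you can see.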
Your program has undefined behavior (on several occasions). You really should be scared (and you are not scared enough).
I'm really not able to understand what is causing this behavior in my program
With UB, that question is pointless. You need to dive into implementation details (e.g. study the generated machine code of your program, and the code of your C compiler and standard library) to understand anything more. You probably don't want to do that (it could take years of work).
Please read, as soon as possible, Lattner's blog series What Every C Programmer Should Know About Undefined Behavior.
what I'm missing/overseeing.
You don't understand UB well enough. Be aware that a programming language is a specification (and code written against it), not a piece of software (e.g. your compiler). Program semantics are important.
As I said in comments:
compile with all warnings and debug info (gcc -Wall -Wextra -g with GCC)
improve your code to get no warnings; perhaps try also another compiler like Clang and work to also get no warnings from it (since different compilers give different warnings).
consider using some version control system like git to keep various variants of your code, and some build automation tool.
think more about your program and invariants inside it.
use the debugger (gdb), in particular with watchpoints, to understand the internal state of your process; and have several test cases to run under the debugger and without it.
use instrumentation facilities such as the address sanitizer -fsanitize=address of GCC and tools like valgrind.
use rubber duck debugging methodology
sometimes consider static source code analysis tools (e.g. Frama-C). They require expertise to be used, and/or give many false positives.
read more about programming (e.g. SICP) and about the C Programming Language. Download and study the C11 programming language specification n1570 (and be very careful about every mention of UB in it). Read carefully the documentation of every standard or external function you are using. Study also the documentation of your compiler and of other tools. Handle error and failure cases (e.g. calloc and scanf can fail).
Debugging is difficult (e.g. because of the Halting Problem, of Heisenbugs, etc...) - but sometimes fun and challenging. You can spend weeks on finding one single bug. And you often cannot understand the behavior of a buggy program without diving into implementation details (studying the machine code generated by the compiler, studying the code of the compiler).
PS. Your question shows the wrong mindset (which you should improve) and a misunderstanding of UB.
So, I am teaching Programming 1 to some college-level students at the moment, and I specifically told them to go out and look online for references, particularly on the data structure topics I am covering. Today one student emailed me a link to tutorialspoint.com and asked about this piece of code he pulled from there:
#include <stdio.h>
main() {
int LA[] = {1,3,5,7,8};
int item = 10, k = 3, n = 5;
int i = 0, j = n;
printf("The original array elements are :\n");
for(i = 0; i<n; i++) {
printf("LA[%d] = %d \n", i, LA[i]);
}
n = n + 1;
while( j >= k) {
LA[j+1] = LA[j];
j = j - 1;
}
LA[k] = item;
printf("The array elements after insertion :\n");
for(i = 0; i<n; i++) {
printf("LA[%d] = %d \n", i, LA[i]);
}
}
Now, without knowing exactly where it's from, I don't know exactly how they described it, but obviously it inserts a value into an array at index k, shuffling the elements from k upwards.
What he asked about is this: I have told my students that when doing something like:
int arr[] = {1,2,3,4};
the compiler will auto-count the size, by checking the supplied value list. This case means an array size of 4 elements. I have also told them that an array size is fixed when the array is first initialized, like:
int likethis[5];
int orthis[] = {1,2,3,4};
int orlikeso[MAX_ARR_SIZE];
Thus, to resize an array, dynamic memory management is needed, so that you would declare space for a new array (a part of the course they have yet to get to).
But the code from this tutorial site actually seems to do an auto-size by the compiler with the initializer list, then go about merrily resizing it in the loop, when shuffling.
So the final size of LA in their example would be 6 elements. Now, my student wants to know why this is valid. I have not tested this code myself, but apparently it compiles on GCC according to my student. If so, how can that code be valid? Wouldn't this write past the bounds of LA when setting LA[5] in the shuffle loop?
Questions: Is it me who is an old geezer, and has this been allowed in C since way back? Only in GCC? Seeing as I learned C somewhere in the 80s, I assume I might be wrong here, but to me it is writing past the assigned size of LA. I just wanted to check it on S.O.
But the code from this tutorial site actually seems to do an auto-size by the compiler with the initializer list, then go about merrily resizing it in the loop, when shuffling.
The code only appears to do that. In reality, the code causes undefined behavior as soon as it touches index 5 of a five-element array.
Now, my student wants to know why this is valid.
He should have started with a simpler "is this valid" question. The answer to it would be "no". The code will compile, and may even appear to work, but this code is invalid.
Unfortunately, there is no easy way to demonstrate it to students at the early stages of learning C, because reading memory profiler reports (say, valgrind) is an advanced skill. On the other hand, if the students have enough determination to learn how to run their code through a memory profiler, they are in for a very rewarding experience of having good confidence in their code.
Note: I think this is a great teaching moment, because it lets you teach the student an important point about undefined behavior in C, and also reinforce that the rule "you shouldn't trust things just because you found them on the internet" applies to code as well.
By attempting to write past the last element of the array, the code invokes undefined behavior, which means it may crash outright, silently corrupt data, or appear to run without any problems.
There may be some padding or scratch space that the extra element is being written to, which is why it isn't crashing, but this code is not valid.
To answer your question, the code is simply not valid. The array overflows, but the bug is not visible (although enabling compiler optimizations may make it more likely that this code crashes).
To help you spot the overflow, I suggest you run the code under Valgrind, which can detect many invalid memory accesses.
Edit: I ran Valgrind with Memcheck and it did not report the overflow, which surprised me. (Memcheck does not bounds-check arrays on the stack or in static storage, and LA is a stack array.)
There is no such thing as automatic resizing of arrays in C. What is happening here is known as a "buffer overflow". (Check the answer at Memory confusion for strncpy in C for more details on possible side effects of a buffer overflow.)
To show that the size of LA has not changed at all you can try printing the size at the beginning and at the end of the code as below:
#include <stdio.h>
int main() {
int LA[] = {1,3,5,7,8};
int item = 10, k = 3, n = 5;
int i = 0, j = n;
printf("The original array elements are :\n");
printf("Number of elements in LA = %zu\n", sizeof(LA) / sizeof(LA[0]));
for(i = 0; i<n; i++) {
printf("LA[%d] = %d \n", i, LA[i]);
}
n = n + 1;
while( j >= k) {
LA[j+1] = LA[j];
j = j - 1;
}
LA[k] = item;
printf("The array elements after insertion :\n");
for(i = 0; i<n; i++) {
printf("LA[%d] = %d \n", i, LA[i]);
}
printf("Number of elements in LA = %zu\n", sizeof(LA) / sizeof(LA[0]));
}
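For contrast, a valid insertion needs an array that was declared with spare room from the start, plus a separate count of how many elements are in use. This is a minimal sketch under that assumption. insert_at and its parameters are hypothetical names, not part of the tutorial code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: inserts item at position k among the first *n
 * elements of a, which has room for capacity elements in total.
 * Returns false, leaving the array untouched, when there is no free
 * slot or k is past the end of the used part. */
static bool insert_at(int *a, size_t *n, size_t capacity, size_t k, int item)
{
    if (*n >= capacity || k > *n) {
        return false;
    }
    /* Shift a[k..*n-1] up by one, starting from the last used element. */
    for (size_t j = *n; j > k; j--) {
        a[j] = a[j - 1];
    }
    a[k] = item;
    (*n)++;
    return true;
}
```

Declared as int LA[6] = {1,3,5,7,8}; with n = 5, a call to insert_at(LA, &n, 6, 3, 10) stays inside the array. The array itself is still 6 elements before and after; only the count of used elements changes.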
Here is my simple program. I have run it many times. Sometimes it pops up a warning and crashes Turbo C. Why? I am using 32-bit Windows 7.
#include <stdio.h>
#include <conio.h>
void main(){
int arr[10][10];
int i,j;
clrscr();
for(i=1;i<11;i++){
for(j=1;j<11;j++){
arr[i][j]=i*j;
printf("%d\t",arr[i][j]);
}
printf("\n");
}
}
It's very simple: arrays in C are indexed from 0 to N - 1.
So instead of
for (i = 1 ; i < 11 ; ++i)
it has to be
for (i = 0 ; i < 10 ; ++i)
because N in your case is 10, and the same for j of course.
As you can see, there is no "undefined reason" here. It certainly is undefined behavior, but the reason is a bug in your code. Always blame your code first: it is by far the most likely cause of unexpected behavior. Only if you can prove that your code is correct, and I mean a mathematical kind of proof, can you blame the compiler or anyone else.
In this line:
arr[i][j]=i*j;
i and j range from 1 to 10. However, the valid indices of arr run from 0 to 9 in each dimension, so any access with i or j equal to 10, such as arr[10][10], is out of bounds.
Since C follows 0-based indexing, change this:
for(i=1;i<11;i++){
for(j=1;j<11;j++){
to this:
for(i=0;i<10;i++){
for(j=0;j<10;j++){
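If the goal is still a 1-to-10 multiplication table, the indices can stay 0-based while the stored values use i + 1 and j + 1. A minimal sketch, where N and fill_table are hypothetical names introduced for the example:

```c
#define N 10

/* Fills a 10 x 10 multiplication table. The indices run from 0 to N - 1,
 * and the factors 1..N are obtained by adding 1 to each index. */
static void fill_table(int table[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            table[i][j] = (i + 1) * (j + 1);
        }
    }
}
```

Here table[9][9] holds 100, and no index ever reaches 10.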
I am learning to use C in my operating systems class and this is what I have so far for my function to find the intersection of two arrays.
An intersection basically is when you take two sets and you get ONLY the elements that are in both sets.
So for example, if set A contains {1,2,3} and set B contains {2,3,4}, then the intersection of A and B is {2,3}. I'm trying to create a function in C that takes two arrays and returns an array containing the integers that are in both of the arrays passed in.
I think I almost have the solution here, but I'm getting an error that says:
"identifier 'count' is undefined"
int intersection(int array1[4], int array2[4])
{
int arrayReturn[sizeof(array1) + sizeof(array2)]
int count = 0;
for(int i = 0; i < 4; i++)
{
for(int j = 0; j < 4; j++)
{
if(array1[i]==array2[j])
{
count = count + 1;
arrayReturn[count] = array1[i];
}
}
}
}
I'm very used to Java and I feel like Java and C are nearly identical. I can't really find what's wrong here since count is well within its scope inside the if statement. I don't see how count is undefined.
What's wrong with count and how could I fix this intersection function?
You are missing a semi-colon in the line before count declaration.
int arrayReturn[sizeof(array1) + sizeof(array2)]; //Semicolon Here
int count = 0;
How did I spot the error?
The error message was identifier 'count' is undefined, so the first thing I checked was the cause that the compiler reported. That, however, was not the problem, as the declaration is there and in the correct scope. So what should I do next? I should look at the line just before the declaration of the variable and at the lines just before its uses. This is where you will most often find the error.
In short, when the compiler messages don't seem helpful, don't stop. Look around.
Also, as GRAYgoose124 points out, you should have a return statement at the end of your function body as your function is supposed to return an integer.
Missing a semi colon on this line:
int arrayReturn[sizeof(array1) + sizeof(array2)]; //semicolon was missing
As AshRj points out, you are missing a semicolon.
Tip: The clang compiler is excellent at giving diagnostic output. If you try to compile your code with it, you get the following output:
source.c:3:57: error: expected ';' at end of declaration
int arrayReturn[sizeof(array1) + sizeof(array2)]
^
;
Even if you're not compiling your project with clang normally, you can try to compile snippets to help you find out what's wrong. That's what I did with your snippet, as you can see here. (Note the warnings as well.)
void readArr(int arr1[],int n1, int arr2[],int n2,int arr3[]){
int n=0;
int j=0;
for(j=0;j<n1;j++){
for(int k=0; k<n2;k++){
if(arr1[j]==arr2[k]){
int element = arr1[j];
int k=checkArrayContains(arr3,n,element);
if(k==1){
printf("%d\n",arr1[j] );
arr3[n]=arr1[j];
n++;
}
}
}
}
displayArray(arr3,n);}
I think this implementation is more clear.
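The snippet above relies on checkArrayContains and displayArray, which are not shown. A self-contained sketch of the same idea follows; contains and intersection are hypothetical names here. The lengths are passed explicitly because, inside a function, sizeof applied to an array parameter yields the size of a pointer, not of the caller's array:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when value occurs among the first n elements of a. */
static bool contains(const int *a, size_t n, int value)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] == value) {
            return true;
        }
    }
    return false;
}

/* Writes each value found in both a and b into out, once, in the order it
 * appears in a. out must have room for at least n1 elements.
 * Returns the number of values written. */
static size_t intersection(const int *a, size_t n1,
                           const int *b, size_t n2, int *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n1; i++) {
        if (contains(b, n2, a[i]) && !contains(out, count, a[i])) {
            out[count++] = a[i];
        }
    }
    return count;
}
```

Returning the count and filling a caller-supplied buffer avoids returning a pointer to a local array, which would no longer exist once the function returns.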
I am using Code::Blocks 10.05, and the GNU GCC Compiler.
Basically, I ran into a really strange (and for me, inexplicable) issue that arises when trying to assign to an array outside its declared size. In words, it's this:
* There is a declared array of size [x][y].
* There is another declared array with size [y-1].
The issue comes up when trying to put values into this second, size [y-1] array beyond its [y-1] elements. When this is attempted, the first array [x][y] no longer keeps all of its values. I simply don't understand why breaking (or attempting to break) one array would affect the contents of the other. Here is some sample code to see it happening. It is shown in its broken form; to see the issue vanish, simply change array2[4] to array2[5], which eliminates what I have pinpointed as the problem.
#include <stdio.h>
int main(void)
{
//Declare the array/indices
char array[10][5];
int array2[4]; //to see it work (and verify the issue), change 4 to 5
int i, j;
//Set up use of an input text file to fill the array
FILE *ifp;
ifp = fopen("input.txt", "r");
//Fill the array
for (i = 0; i <= 9; i++)
{
for (j = 0; j <= 5; j++)
{
fscanf(ifp, "%c", &array[i][j]);
//printf("[%d][%d] = %c\n", i, j, array[i][j]);
}
}
for (j = 4; j >= 0; j--)
{
for (i = 0; i <= 9; i++)
{
printf("[%d][%d] = %c\n", i, j, array[i][j]);
}
//PROBLEM LINE*************
array2[j] = 5;
}
fclose(ifp);
return 0;
}
So does anyone know how or why this happens?
Because when you write outside of an array bounds, C lets you. You're just writing to somewhere else in the program.
C is known as the lowest level high level language. To understand what "low level" means, remember that each of the variables you have created can be thought of as living in physical memory. An array of 16 integers might occupy 64 bytes if integers are 4 bytes each. Perhaps they occupy bytes 100-163 (unlikely, but I'm not going to make up realistic numbers; also, these are usually better thought of in hexadecimal). What occupies byte 164? Maybe another variable in your program. What happens if you write one element past the end of your array of 16 integers? Well, the write might land on that byte.
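The arithmetic above can be checked directly. This small sketch measures the distance in bytes between two elements of the same array. byte_distance is a hypothetical helper name, and it assumes both pointers point into (or one past the end of) the same array, with the second not before the first:

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes from first to second.
 * Both pointers must refer to the same array (or one past its end),
 * and second must not come before first. */
static size_t byte_distance(const int *first, const int *second)
{
    return (size_t)((const char *)second - (const char *)first);
}
```

For int a[16], byte_distance(&a[0], &a[1]) equals sizeof(int), and byte_distance(&a[0], &a[16]) equals sizeof a, which is 64 when int is 4 bytes. Forming the address &a[16] is allowed; reading or writing through it is not, and whatever happens to live there is what an out-of-bounds write may clobber.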
C lets you do this. Why? If you can't think of any answers, then maybe you should switch languages. I'm not being pedantic - if this doesn't benefit you then you might want to program in a language in which it is a little harder for you to make weird mistakes like this. But reasons include:
It's faster and smaller. Adding bounds checking takes time and space, so if you're writing code for a microprocessor, or writing a JIT compiler, speed and size really do matter a lot.
If you want to understand machine architecture and go into hardware, e.g. if you're a student, it's a good gateway from programming into OS/hardware/electrical engineering. And much of computer science.
Because it is close to machine code, it serves as a de facto standard that many other languages and systems have to, or can easily, be compatible with to some degree.
Other reasons that I would be able to give if I ever actually had to work this close to the machine code.
The moral is: In C, be very careful. You must check your own array bounds. You must clean up your own memory. If you don't, your program often won't crash but will start just doing really weird things without telling you where or why.
for (j = 0; j <= 5; j++)
should be
for (j = 0; j <= 4; j++)
and array2 max index is 3 so
array2[j] = 5;
is also going to be a problem when j == 4.
C array indexes start from 0. So the valid indexes of an [X] array run from 0 to X-1, which gives X elements in total.
You should use the < operator, instead of <=, in order to show the same number in both the array declaration [X] and in the expression < X. For instance
int array[10];
...
for (i=0 ; i < 10 ; ++i) ... // instead of `<= 9`
This is less error prone.
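One way to go a step further is to derive the bound from the array itself, so the loop cannot drift out of sync with the declaration. A minimal sketch, where ARRAY_LEN and fill_and_sum are hypothetical names. The macro only works on a real array that is in scope, not on a pointer or an array parameter:

```c
#include <stddef.h>

/* Number of elements in an array that is in scope (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Fills a 4-element array with 5s using the derived bound, then sums it. */
static int fill_and_sum(void)
{
    int array2[4];
    int sum = 0;

    for (size_t j = 0; j < ARRAY_LEN(array2); j++) {
        array2[j] = 5; /* j never reaches 4 */
    }
    for (size_t j = 0; j < ARRAY_LEN(array2); j++) {
        sum += array2[j];
    }
    return sum;
}
```

If array2 is later changed to hold 5 elements, both loops follow automatically.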
If you're outside the bounds of one array, there's always a possibility you'll be inside the bounds of the other.
array2[j] = 5; - This is your overflow problem.
for (j = 0; j <= 5; j++) - This is also an overflow problem. Here you are trying to access index 5, but only indexes 0 to 4 are valid.
In process memory, each function call creates an activation record (stack frame) that holds the function's local variables, plus some extra memory for bookkeeping such as the return address. Your function has four local variables: array, array2, i and j, and they are laid out near one another in some order. So when an overflow happens, it will first overwrite a variable declared above or below the array, depending on the compiler and architecture. If the overflow spans more bytes, it may corrupt the stack itself by overwriting the return address or the local variables of the calling functions. This may lead to a crash. Sometimes it does not, but the program then behaves unpredictably, as you are seeing now.