Program crashing with undefined reason? - c

Here is my simple program. I run it many times. Sometimes it will pop out a warning and break Turbo C. Why? I am using 32bit Windows 7.
#include <stdio.h>
#include <conio.h>
void main(){
int arr[10][10];
int i,j;
clrscr();
for(i=1;i<11;i++){
for(j=1;j<11;j++){
arr[i][j]=i*j;
printf("%d\t",arr[i][j]);
}
printf("\n");
}
}

It's very simple: arrays in C are indexed from 0 to N - 1.
So instead of
for (i = 1 ; i < 11 ; ++i)
it has to be
for (i = 0 ; i < 10 ; ++i)
because N in your case is 10, and the same for j of course.
As you can see, it's not an undefined reason. It certainly is undefined behavior, but the reason is a bug in your code. Always blame your code first, since it is by far the most likely cause of unexpected behavior. Only once you have proved that your code is correct, and I mean a mathematical kind of proof, can you blame the compiler or anything else.

In this line:
arr[i][j]=i*j;
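the values of i and j range from 1 to 10. However, arr is declared as int arr[10][10], so the valid indices are 0 to 9, and any access with i or j equal to 10 (such as arr[10][10]) is out of bounds of the array.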
Since C follows 0-based indexing, change this:
for(i=1;i<11;i++){
for(j=1;j<11;j++){
to this:
for(i=0;i<10;i++){
for(j=0;j<10;j++){

Related

Why is the use of unrelated printf statement causing changes in my program output?

I'm stuck with a program where just having a printf statement is causing changes in the output.
I have an array of n elements. For the median of every d consecutive elements, if the (d+1)th element is greater or equals to twice of it (the median), I'm incrementing the value of notifications. The complete problem statement might be referred here.
This is my program:
#include <math.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#define RANGE 200
float find_median(int *freq, int *ar, int i, int d) {
int *count = (int *)calloc(sizeof(int), RANGE + 1);
for (int j = 0; j <= RANGE; j++) {
count[j] = freq[j];
}
for (int j = 1; j <= RANGE; j++) {
count[j] += count[j - 1];
}
int *arr = (int *)malloc(sizeof(int) * d);
float median;
for (int j = i; j < i + d; j++) {
int index = count[ar[j]] - 1;
arr[index] = ar[j];
count[ar[j]]--;
if (index == d / 2) {
if (d % 2 == 0) {
median = (float)(arr[index] + arr[index - 1]) / 2;
} else {
median = arr[index];
}
break;
}
}
free(count);
free(arr);
return median;
}
int main() {
int n, d;
scanf("%d %d", &n, &d);
int *arr = malloc(sizeof(int) * n);
for (int i = 0; i < n; i++) {
scanf("%i", &arr[i]);
}
int *freq = (int *)calloc(sizeof(int), RANGE + 1);
int notifications = 0;
if (d < n) {
for (int i = 0; i < d; i++)
freq[arr[i]]++;
for (int i = 0; i < n - d; i++) {
float median = find_median(freq, arr, i, d); /* Count sorts the arr elements in the range i to i+d-1 and returns the median */
if (arr[i + d] >= 2 * median) { /* If the (i+d)th element is greater or equals to twice the median, increments notifications*/
printf("X");
notifications++;
}
freq[arr[i]]--;
freq[arr[i + d]]++;
}
}
printf("%d", notifications);
return 0;
}
Now, for large inputs like this, the program outputs 936 as the value of notifications, whereas when I just remove the statement printf("X") the program outputs 1027 as the value of notifications.
I'm really not able to understand what is causing this behavior in my program, and what I'm missing/overseeing.
Your program has undefined behavior here:
for (int j = 0; j <= RANGE; j++) {
count[j] += count[j - 1];
}
You should start the loop at j = 1. As coded, you access memory before the beginning of the array count, which could cause a crash or produce an unpredictable value. Changing anything in the running environment can lead to a different behavior. As a matter of fact, even changing nothing could.
The rest of the code is more difficult to follow at a quick glance, but given the computations on index values, there may be more problems there too.
For starters, you should add some consistency checks:
verify the return value of scanf() to ensure proper conversions.
verify the values read into arr, they must be in the range 0..RANGE
verify that int index = count[ar[j]] - 1; never produces a negative number.
same for count[ar[j]]--;
verify that median = (float)(arr[index] + arr[index - 1]) / 2; is never evaluated with index == 0.
Your program has undefined behavior (on several occasions). You really should be scared (and you are not scared enough).
I'm really not able to understand what is causing this behavior in my program
With UB, that question is pointless. You need to dive into implementation details (e.g. study the generated machine code of your program, and the code of your C compiler and standard library) to understand anything more. You probably don't want to do that (it could take years of work).
Please read, as soon as possible, Lattner's blog post What Every C Programmer Should Know About Undefined Behavior.
what I'm missing/overseeing.
You don't understand UB well enough. Be aware that a programming language is a specification (and you code against it), not a piece of software (e.g. your compiler). Program semantics is important.
As I said in comments:
compile with all warnings and debug info (gcc -Wall -Wextra -g with GCC)
improve your code to get no warnings; perhaps try also another compiler like Clang and work to also get no warnings from it (since different compilers give different warnings).
consider using some version control system like git to keep various variants of your code, and some build automation tool.
think more about your program and invariants inside it.
use the debugger (gdb), in particular with watchpoints, to understand the internal state of your process; and have several test cases to run under the debugger and without it.
use instrumentation facilities such as the address sanitizer -fsanitize=address of GCC and tools like valgrind.
use rubber duck debugging methodology
sometimes consider static source code analysis tools (e.g. Frama-C). They require expertise to be used, and/or give many false positives.
read more about programming (e.g. SICP) and about the C Programming Language. Download and study the C11 programming language specification n1570 (and be very careful about every mention of UB in it). Read carefully the documentation of every standard or external function you are using. Study also the documentation of your compiler and of other tools. Handle error and failure cases (e.g. calloc and scanf can fail).
Debugging is difficult (e.g. because of the Halting Problem, of Heisenbugs, etc...) - but sometimes fun and challenging. You can spend weeks on finding one single bug. And you often cannot understand the behavior of a buggy program without diving into implementation details (studying the machine code generated by the compiler, studying the code of the compiler).
PS. Your question shows the wrong mindset, which you should improve, and a misunderstanding of UB.

Summation of nth prime

I'm writing a program that finds the summation of prime numbers. It must use redirected input. I've written it so that it finds the largest number in the input and then uses that as the nth prime. It then uses the nth prime to set the size of the array. It works up until I try to printf the sum. I can't figure out why I'm getting a seg fault there of all places. I think I've allocated the arrays correctly with malloc. Why would the fault happen at printf and not when I'm using my arrays? Any suggestions on my code are also welcome.
EDIT
I used a test input of 2000 values, from 1 to 2000, and it worked, but the full test file of 10000 values, from 1 to 10000, still crashes. I'm still looking into why; I'm guessing I didn't allocate enough space.
EDIT
My problem was in my sieve: I didn't take sqrt(nthprime), so it found more primes than the array could hold.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int nprime (int max);
void sieve_sum ( int *primes, int nthprime, int tests,int *input);
int main(void)
{
int i=0;
int max=0; //largest input
int tests; //number of tests
int nthprime; //estimated nth prime
int *primes; //array of primes
int *input; // numbers to put in to P(n), where p(n) is the summation of primes
scanf("%d",&tests); //gets number of tests
input = malloc(sizeof(int)*tests);
//test values
for(i=0; i<=tests-1; i++)
scanf("%d",&input[i]);
//finds max test value
i=0;
for (i = 0; i < tests; i++ )
{
if ( input[i] > max-1 )
max = input[i];
}
// calls nprime places value in n
nthprime = nprime(max);
primes = malloc(sizeof(int)*nthprime);
// calls sieve_sum
sieve_sum( primes, nthprime, tests, input);
//free memory
free(input);
free(primes);
return 0;
}
//finds Primes and their sum
void sieve_sum ( int *primes, int nthprime, int tests,int *input)
{
int i;
int j;
//fills in arrays with 1's
for(i=2; i<=nthprime; i++)
primes[i] = 1;
//replaces non primes with 0's
i=0;
for(i=2; i<=sqrt(nthprime); i++)
{
if(primes[i] == 1)
{
for(j=i; (i*j)<=(nthprime); j++)
primes[(i*j)] = 0;
}
}
//rewrites array with only primes
j=1;
i=0;
for(i=2; i<=nthprime; i++)
{
if(primes[i] == 1)
{
primes[j] = i;
j++;
}
}
//sums
i=0;
for ( i=1; i<=tests; i++ )
{
int sum=0;//sum of primes
j=0;
for(j=1; j<=input[i-1]; j++)
{
sum = primes[j] + sum;
}
printf("%d\n", sum );
}
return 0;
}
//finds the Nth number prime
int nprime (int max)
{
//aproximization of pi(n) (the nth prime) times 2 ensures correct allocation of memory
max = ceil( max*((log (max)) + log ((log (max)))))*2;
return (max);
}
Example input file:
20
1
2
3
4
5
6
7
8
9
10
10
9
8
7
6
5
4
3
2
1
Example output should be:
2
5
10
17
28
41
58
77
100
129
129
100
77
58
41
28
17
10
5
2
OK, so I'm going to have a shot at this without saying too much because it looks like it might be a homework problem.
My best guess is that a lot of people have shied away from even looking at your code because it's a bit too messy for their level of patience. It's not irredeemable, but you would probably get a better response if it were considerably cleaner.
So, first, a few critical comments on your code to help you clean it up: it is insufficiently well-commented to make your intentions clear at any level, including what the overall purpose of the program is; it is inconsistently indented and spaced in an unconventional manner; and your choice of variable names leaves something to be desired, which is exacerbated by the absence of comments on variable declarations.
You should compile this code with something like (assuming your source file is called sumprimes.c):
gcc -std=c99 -pedantic -Wall -Wextra -o sumprimes sumprimes.c -lm
Have a look at the warnings this produces, and it will alert you to a few, admittedly fairly minor, problems.
The major immediate issue that I can see by inspection is that your program will certainly segfault because the storage that you have allocated with malloc() is too small by a factor of sizeof(int), which you have omitted.
A static error-checker like splint would help you detect some further issues; there's no need to slavishly follow all of its recommendations, though: once you understand them all, you can decide which ones to follow.
A few other remarks:
“Magic numbers”, such as 100 in your code, are considered very bad form. As a rule of thumb, the only number that should ever appear in code literally is 0 (zero), and then only sometimes. Your 100 would be much better represented as something named (like a const int or, more traditionally, a #define) to give an indication of its meaning.
It is unconventional to “outdent” variable declarations in the manner done in your code
If a function is advertised to return a value, you should always check it for errors, e.g. make sure that the return value of malloc() is not NULL, check that the return value of scanf() (if you use it) is the expected value, etc.
As a general matter of style, it is usually considered good practice in C to declare variables one per line with a short explanatory comment. There are exceptions, but it's a reasonable rule-of-thumb.
For any kind of input, scanf() is a poor choice because it alters the state of stdin in ways that are difficult to predict unless the input is exactly as expected, which you can never depend on. If you want to read in an integer, it is much better to read what's available on stdin with fgets() into a buffer, and then use strtol(), as you can do much more effective error-checking and reporting that way.
It is not recommended to cast the return of malloc() any more.
Hope this helps.

What happens when you write to memory out of bounds of an array?

On a recent test question I was asked to print the output of the following program. I got the answer correct however this program caused me significant mental anguish as I didn't know what the behavior would be when writing to memory that is out of bounds of an array.
Here is the program under question, the comments are my notes:
#include <stdio.h>
#define MAX 4
void RecordArgs(int x);
int main()
{
RecordArgs(1);
RecordArgs(7);
RecordArgs(-11);
return 0;
}
void RecordArgs(int x)
{
static int i = 0;
int call_count = 0;
int arg_history[MAX] = {0};
if (call_count == MAX)
{
/* call_count is not static and is initialized to 0 on each call;
as a result, under no circumstance can call_count == MAX and
this printf is never executed */
printf("Too many calls to RecordArgs\n");
}
else
{
/* index out of bounds on second/third call (i + call_count will be 4??) */
arg_history[i + call_count] = x;
++call_count;
++i;
for (i = 0; i < MAX; ++i)
printf("%d ", arg_history[i]);
printf("\n");
}
}
And the expected output:
1 0 0 0
0 0 0 0
0 0 0 0
When RecordArgs is called the second and third times, where do the values 7 and -11 get written? I tried compiling it under different settings to see if I could get it to write to something it shouldn't, but everything I've tried has resulted in that exact output without any segfaults.
Expanding on Patashu's comment, segmentation faults occur when you access memory from a page in a way which clashes with the page of memory's permissions. In other words, they occur when you access a page of memory in a way that you're not allowed to. What's possibly occurring in your situation is that you are accessing memory still within the same page on which arg_history is stored, for which you obviously have permission to read and write.
Another possible scenario is that the page of memory right after the one you're working on has the same permissions which allow you to access it the same way.
In any case, this is undefined behavior in C. Although you witness "expected results," that should not indicate to you that the program is correct. In fact, this is a circumstance in which an out-of-bounds error could potentially go unnoticed, if it doesn't cause a segmentation fault.

Bubble-sort 'works' in Windows, but not with GCC on GNU/Linux

I am using Ubuntu Linux for programming purposes. Yesterday I came across a very strange and obscure problem.
I tried to write a bubble sort; the logic and syntax all looked correct, but the output was wrong. I wrote the same program in Windows and it worked fine. I am using the Eclipse IDE in Linux. What can be the problem? I also tried a version that uses pointers (call by reference) to do the bubble sort, and again the output was wrong in Ubuntu while it was okay in Windows. I don't know how to figure it out.
My code for bubble sort is as following:
#include<stdio.h>
void main(void)
{
int array[] = {4,2,6,3,1,5,8,4,6,1};
int i=0;
int j=0;
for(i=1;i<=10;i++)
{
for(j=0;j<=10-i;j++)
{
if(array[j]>array[j+1])
{
int temp = array[j];
array[j] = array[j+1];
array[j+1] = temp;
}
}
}
for(i=0;i<=9;i++)
{
printf("%d\t",array[i]);
}
}
Output:
gcc bubblesort.c -o output
./output
2 3 4 1 5 6 4 6 1 1
Going beyond the bounds of an array is undefined behaviour (one possible outcome of which is appearing to behave "correctly"), and that is what is occurring in the program. Arrays use zero-based indexing, meaning the last valid index is one less than the number of elements in the array:
/* 10 elements in 'array'. */
int array[] = {4,2,6,3,1,5,8,4,6,1};
for(j=0;j<=10-i ;j++)
{
if(array[j]>array[j+1]) /* When 'j' is 9 the
'array[j + 1]' is
out of bounds. */
Change the inner for loop terminating condition:
for(j=0;j<=9-i ;j++)
Instead of hard-coding 9 and 10 throughout the code you could use sizeof(array)/sizeof(array[0]) to obtain the number of elements in array. This makes it less error prone and simpler to change the number of elements in array later:
const int ARRAY_SIZE = sizeof(array)/sizeof(array[0]);
This:
for(j=0;j<=10-i;j++)
together with this:
if(array[j]>array[j+1])
and other places where you access your array out of bound is a likely cause of your problems.
Accessing an array out of bounds is undefined behavior.
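This is pseudocode for a simple sort that stays within bounds (strictly speaking an exchange sort rather than a classic bubble sort):
for (i = 0; i < 9; i++) {
for (j = i + 1; j < 10; j++) {
if (element[i] > element[j]) swap_elements(); /* swap element[i] and element[j] */
}
}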

c language+Two dimensional array

I have some C code in which I declared an array with a size of 12 in each dimension.
But it lets me assign to and read from the array beyond that size instead of giving an index-out-of-bounds error.
Can anyone please explain why this is happening?
int vas1[12][12];
vas1[15][15]=0;
int i,j;
for (i = 0; i < 15; i ++)
{
for (j = 0; j < 15; j ++) {
printf("i=%d j=%d vas=%d",i,j,vas1[i][j]);
}
}
printf("Success");
Thanks
C doesn't do bounds checking on array accesses. It simply marks illegal accesses as "undefined behavior", so each implementation can do as it pleases. C assumes you know what you're doing, so it allows you to shoot yourself in the foot.
In practice, sometimes you will get an error, sometimes not. Sometimes you won't get an error but the client will. Worst case scenario: you won't get an error but the program will behave really weirdly (variables changing values for no reason, etc.).

Resources