Segmentation fault: 11 - C [duplicate]

Why doesn't this code defining and using a VLA (variable-length array) work reliably?
#include <stdio.h>

int main(void)
{
    int n;
    double vla[n];
    if (scanf("%d", &n) != 1)
        return 1;
    for (int i = 0; i < n; i++)
    {
        if (scanf("%lf", &vla[i]) != 1)
            return 1;
    }
    for (int i = 0; i < n; i++)
        printf("[%d] = %.2f\n", i, vla[i]);
    return 0;
}

Diagnosis
In the code in the question, the variable n is uninitialized when it is used in the definition of vla. Indeed, with GCC set fussy, the code shown produces a compilation error (it'll give a warning if you are careless enough to omit -Werror from your compilation options — don't do that!):
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wstrict-prototypes -Wmissing-prototypes -Wshadow -pedantic-errors vla37.c -o vla37
vla37.c: In function ‘main’:
vla37.c:6:5: error: ‘n’ is used uninitialized [-Werror=uninitialized]
6 | double vla[n];
| ^~~~~~
vla37.c:5:9: note: ‘n’ declared here
5 | int n;
| ^
cc1: all warnings being treated as errors
$
(That's from GCC 11.2.0 on a machine running RedHat RHEL 7.4.)
The trouble is that the size of a VLA is fixed at the point where the array is declared, but at that point the value in n is indeterminate because it is uninitialized. It could be huge; it could be zero; it could be negative.
Prescription
The cure for the problem is simple — make sure the size is known and sane before it is used to declare the VLA:
#include <stdio.h>

int main(void)
{
    int n;
    if (scanf("%d", &n) != 1)
        return 1;
    double vla[n];
    for (int i = 0; i < n; i++)
    {
        if (scanf("%lf", &vla[i]) != 1)
            return 1;
    }
    for (int i = 0; i < n; i++)
        printf("[%d] = %.2f\n", i, vla[i]);
    return 0;
}
Now you can run the result:
$ vla41 <<<'9 2.34 3.45 6.12 8.12 99.60 -12.31 1 2 3'
[0] = 2.34
[1] = 3.45
[2] = 6.12
[3] = 8.12
[4] = 99.60
[5] = -12.31
[6] = 1.00
[7] = 2.00
[8] = 3.00
$
(That assumes your shell is Bash, or compatible with Bash, and supports 'here strings', the <<<'…' notation.)
The code shown in the question and in this answer is barely adequate in handling I/O errors; it detects input problems but doesn't provide useful feedback to the user.
The code shown does not validate the value of n for plausibility. You should ensure that the size is larger than zero and less than some upper bound. The maximum size depends on the size of the data being stored in the VLA and the platform you're on.
If you're on a Unix-like machine, you probably have 8 MiB of stack; on a Windows machine, probably 1 MiB; on an embedded system, you may have much less. You need to leave some stack space for other code too, so you should probably check that the array size is no more than, for the sake of discussion, 1024 elements. That would be 8 KiB of stack for an array of double: not huge at all, but plenty of space for most homework programs. Tweak the number upward to suit your purposes, but once it grows, you should use malloc() et al to allocate the array dynamically instead of using an on-stack VLA. For example, on a Windows machine, if you use a VLA of type int, setting the size above 262,144 (256 * 1024) almost guarantees that your program will crash, and it may crash at somewhat smaller sizes than that.
Lessons to learn
- Compile with stringent warning options.
- Compile with -Werror or its equivalent so warnings are treated as errors.
- Make sure the variable defining the size of a VLA is initialized before defining the array, and validate it:
  - Not too small (not zero, not negative).
  - Not too big (not using more than 1 megabyte on Windows, 8 megabytes on Unix).
  - Leave a decent margin for other code to use as well.
Note that all compilers that support VLAs also support variables defined at arbitrary points within a function. Both features were added in C99. VLAs were made optional in C11 — and a compiler should define __STDC_NO_VLA__ if it does not support VLAs at all but claims conformance with C11 or later.
C++ and variable-length arrays
Standard C++ does not support C-style VLAs. However, GCC (g++) does support them by default as an extension. This can cause confusion.

Related

C, when calling function's arguments not meet the need of called function's parameters

I'm learning C, and I have done some little experiments while working through the chapters.
My main question is that I can't understand why the code produces the output it does; it doesn't match what I thought the code would do.
source code:
#include <stdio.h>

int imax();

int main(void)
{
    printf("%zd %zd\n", sizeof(int), sizeof(double));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3.0, 1000.0));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3.0));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(1000.0));
    return 0;
}

int imax(n, m)
int n, m;
{
    return (n > m ? n : m);
}
the output:
(output omitted in the original post)
What I can't understand is why the last three print statements print the same thing!
I know I am doing a test to research what happens when a function is declared in the old style, which does not specify the types of the formal parameters.
In this context, I designed four cases in which the calling function's actual arguments do not match the requirements of the called function's formal parameters.
I know this is related to the mechanism of the stack in C, and I have tried my best to find out why. In my opinion, the last three print statements should behave differently. In fact, I think the statement imax(3.0, 1000.0) may behave the same as imax(3.0) or as imax(1000.0), but it can't possibly be the same as both!
int imax();
and
int imax(n, m)
int n, m;
{
    return (n > m ? n : m);
}
is an ancient style of C code. Don't use it; one of its problems is that it doesn't do any checking of function arguments.
The proper standardized code would be
int imax(int n, int m);
and
int imax(int n, int m)
{
    return (n > m ? n : m);
}
Edit: To be more precise, it's because you implement your function in the K&R style. Change "int imax(n, m)" to "int imax(int n, int m)" too.
You also have to correct the prototype of imax, because a function declared with nothing in the parentheses allows this kind of thing.
When there is nothing between the parentheses, the function can be called with any number of arguments. This is useless and dangerous, since it can lead to security flaws. It's useless because the callee can't retrieve the correct number of arguments, and it's a security flaw because passing the wrong number of arguments can smash the call stack.
Just take
int imax();
and transform in
int imax(int n, int m);
Now your function must take two parameters, and compilation will fail if that's not the case.
If you want a function that takes no arguments, put void between the parentheses to ensure the function can't be called with arguments:
int funcname(void);
Your code is not conforming to C99 or C11 standard. Don't use anything older (like K&R C).
This is wrong or at least undefined behaviour and you should be scared of it.
You really should have a prototype (in C99 or C11 style) for every function, that is you need to have a declaration of every used function.
(In old C89 or K&R C, that was not mandatory; but today you should code in C99 at least)
Actually you should code:
#include <stdio.h>

int imax(int, int);

int main(void)
{
    printf("%zu %zu\n", sizeof(int), sizeof(double));
    // WRONG CODE BELOW: the compiler should reject it or emit warnings.
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3.0, 1000.0));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(3.0));
    printf("The maximum of %d and %d is %d.\n", 3, 5, imax(1000.0));
    return 0;
}

int imax(int n, int m)
{
    return (n > m ? n : m);
}
and a recent compiler would reject that code. Be sure to compile with all warnings enabled, e.g. gcc -Wall -Wextra -g.
What really happens (on a mismatch between the declared function signature and an incorrect call) depends upon the ABI and the calling conventions. Today, on x86-64 with the SVR4/Linux ABI, some arguments are passed through registers. So even if you forced the compiler to accept the call (e.g. by casting some function address to a function pointer and using that), what would happen is undefined and implementation specific.
If your book shows K&R C code, you need a better book that shows C99 (or C11) code. Bookmark also some C reference site (for C99 at least).
Read the C11 standard, e.g. n1570.
In 2018, you should avoid coding in K&R C (unless forced to).
I know this is relevant to the mechanism of the stack in C.
This is a misconception. Implementations are not required to have a call stack. And most recent implementations don't pass every argument on the stack (most calling conventions may use registers, details are implementation specific).
On Ubuntu, I recommend compiling with gcc -Wall -Wextra -g (all warnings and debug info), and probably specifying the language standard explicitly, e.g. by adding -std=c99 or -std=gnu11. When you want to benchmark, enable more compiler optimization (e.g. with -O2).

Inconsistent undefined behavior

For a class, I wanted to demonstrate undefined behavior with goto to the students. I came up with the following program:
#include <stdio.h>

int main()
{
    goto x;
    for (int i = 0; i < 10; i++)
    x:  printf("%d\n", i);
    return 0;
}
I would expect the compiler (gcc version 4.9.2) to warn me about the access to i being undefined behavior, but there is no warning, not even with:
gcc -std=c99 -Wall -Wextra -pedantic -O0 test.c
When running the program, i is apparently initialized to zero. To understand what is happening, I extended the code with a second variable j:
#include <stdio.h>

int main()
{
    goto x;
    for (int i = 0, j = 1; i < 10; i++)
    x:  printf("%d %d\n", i, j);
    return 0;
}
Now the compiler warns me that I am accessing j without it being initialized. I understand that, but why is i not uninitialized as well?
Undefined behavior is a run-time phenomenon. Therefore it is quite rare that the compiler will be able to detect it for you. Most cases of undefined behavior are invoked when doing things beyond the scope of the compiler.
To make things even more complicated, the compiler might optimize the code. Suppose it decided to put i in a CPU register but j on the stack or vice versa. And then suppose that during debug build, it sets all stack contents to zero.
If you need to reliably detect undefined behavior, you need a static analysis tool which does checks beyond what is done by a compiler.
Now the compiler warns me that I am accessing j without it being initialized. I understand that, but why is i not uninitialized as well?
That's the point of undefined behavior: it sometimes works, or doesn't, or partially works, or prints garbage. The problem is that you can't know exactly what your compiler is doing under the hood to produce this, and it's not the compiler's fault for producing inconsistent results since, as you admit yourself, the behavior is undefined.
At that point the only thing that's guaranteed is that nothing is guaranteed about how this will play out. Different compilers may give different results, and so may different optimization levels.
A compiler is also not required to check for this, and it's not required to handle it, so consequently compilers don't. You can't use a compiler to check for undefined behavior reliably anyway. That's what unit tests, lots of test cases, and static analysis are for.
Using "goto" to skip a variable initialization would, per the C Standard, allow a compiler to do anything it wants even on platforms where it would normally yield an Indeterminate Value which may not behave consistently but wouldn't have any other side-effects. The behavior of gcc in this case doesn't seem to have devolved as much as its behavior in case of e.g. integer overflow, but its optimizations may be somewhat interesting though benign. Given:
int test(int x)
{
    int y;
    if (x) goto SKIP;
    y = x + 1;
SKIP:
    return y * 2;
}

int test2(unsigned short y)
{
    int q = 0;
    int i;
    for (i = 0; i <= y; i++)
        q += test(i);
    return q;
}
The compiler will observe that in all defined cases, test will return 2, and can thus eliminate the loop by generating code for test2 equivalent to:
int test2(unsigned short y)
{
    return (int)y << 1;
}
Such an example, however, may give the impression that compilers treat UB in a benign fashion. Unfortunately, in the case of gcc, that is no longer true in general. It used to be that on machines without hardware traps, compilers would treat uses of an Indeterminate Value as simply yielding an arbitrary value that may or may not behave in any consistent fashion, but without any other side effects. I'm not aware of any cases where using goto to skip variable initialization would yet cause side effects other than leaving a meaningless value in the variable, but that doesn't mean the authors of gcc won't decide to exploit that freedom in the future.

How to make gcc complain about comparison of char with 256

I found the following code on codegolf.stackexchange to print a code table for ASCII characters:
#include <stdio.h>

int main()
{
    char i;
    for (i = 0; i < 256; i++)
    {
        printf("%3d 0x%2x: %c\n", i, i, i);
    }
    return 0;
}
Since a char stores a single byte, its value is always less than 256, so the comparison is always true and the loop never terminates. I would like to detect this at compile time.
Nicely, clang gives the following warning:
a.c:5:18: warning: comparison of constant 256 with expression of type 'char' is always true [-Wtautological-constant-out-of-range-compare]
for(i = 0; i < 256; i++){
~ ^ ~~~
However, neither gcc nor gcc -Wall give any warning of any sort. Is there any set of command line options I can give to turn on this kind of warnings? Or is it not possible in gcc?
-Wtype-limits (or -Wextra) should trigger this warning
Add -Wextra and -Wconversion. The first includes a warning for your actual problem, but the latter will warn about many other related problems.
But beware: -Wconversion will also warn about many other potential problems if your code is not well written (signed/unsigned mismatches, etc.). Best is to compile, see the warnings, and carefully verify the listed lines, possibly adding casts (after thinking thrice about whether the code is correct!).
I compiled the posted code with gcc on Ubuntu 14.04 Linux using:
-Wall -Wextra -pedantic -std=c99
and the compiler output this warning:
warning: comparison is always true due to limited range of data type [-Wtype-limits]
Just one more reason to always enable all the warnings when compiling.

how to elegantly construct long argument lists that iterate through arrays in C

I have a C function that takes variable arguments, and I need to call it with a very long list of arguments, where the arguments all step through the elements of an array. Example:
myFunction( A[0], B[0], A[1], B[1], A[2], B[2], A[3], B[3], ..... A[N], B[N] );
where N is typically 100-200.
I would prefer not having to construct this call manually every time I make N bigger, and got to thinking, is there an elegant way to do this?
I tried something like:
i=0;
myFunction( A[i], B[i++], A[i], B[i++], A[i], B[i++], A[i], B[i++], ..... A[i], B[i++] );
but of course that fails. What is preferred about it, however, is anytime I make N larger, I can simply copy the same line over and over, instead of having to ensure each array index is correct, which is quite tedious.
Changing myFunction() is not an option.
I wish C had a way to construct function calls on the fly, like:
for( i = 0 ; i <= N ; i++ )
{
CONSTRUCT_CALL( myFunction, A[i], B[i] );
}
which would be exactly what I want, but of course that's not an option.
Is there anything that might be easier or more elegant?
Thank you very much.
There is no standard C way of doing that (synthesizing a variadic call at runtime). But...
you can use libffi which is designed to handle such issues (so I recommend it)
you could consider GCC specific Builtins for Constructing Calls
you could have some fixed limit on the arity (e.g. 500) and have some C file generated with some (shell, awk, Python, ...) script doing a switch on the 500 cases, one for each arity.
you might consider generating some C code at runtime into _gen123.c, compile it into a dynamically loadable plugin (e.g. forking some gcc -shared -fPIC -Wall -O _gen123.c -o _gen123.so command on Linux), then loading that plugin (with dlopen(3) on Linux or Posix)
you might consider some just-in-time compilation library (e.g. libjit, llvm, GNU lightning, asmjit, ...)
Of course, avoid several i++ in a single call. Avoid undefined behavior, since bad things could happen.
There is something very bad in your design.
Rewrite your myFunction so that it takes the two arrays (A and B) and the number of elements to use.
A short example of calling such a function:
int A[100];
int B[100];
int c = myFunction(A, B, 100);
A possible implementation of myFunction:
int myFunction(int* A, int* B, int count)
{
    int result = 0;
    for (int j = 0; j < count; j++)
        result += A[j] + B[j]*2;
    return result;
}

Behaviour of printf when printing a %d without supplying variable name

I've just encountered a weird problem, I'm trying to printf an integer variable, but I forgot to specify the variable name, i.e.
printf("%d");
instead of
printf("%d", integerName);
Surprisingly, the program compiles, there is output, and it is not random. In fact, it happens to be the very integer I wanted to print in the first place, which happens to be m-1.
The erroneous printf statement will consistently output m-1 for as long as the program keeps running... In other words, it behaves exactly as if the statement read
printf("%d", m-1);
Anybody knows the reason behind this behaviour? I'm using g++ without any command line options.
#include <iostream>

#define maxN 100
#define ON 1
#define OFF 0

using namespace std;

void clearArray(int* array, int n);
int fillArray(int* array, int m, int n);

int main()
{
    int n = -1, i, m;
    int array[maxN];
    int found;
    scanf("%d", &n);
    while (n != 0)
    {
        found = 0;
        m = 1;
        while (found != 1)
        {
            if (m != 2 && m != 3 && m != 4 && m != 6 && m != 12)
            {
                clearArray(array, n);
                if (fillArray(array, m, n) == 0)
                {
                    found = 1;
                }
            }
            m++;
        }
        printf("%d\n");
        scanf("%d", &n);
    }
    return 0;
}

void clearArray(int* array, int n)
{
    for (int i = 1; i <= n; i++)
        array[i] = ON;
}

int fillArray(int* array, int m, int n)
{
    int i = 1, j, offCounter = 0, incrementCounter;
    while (offCounter != n)
    {
        if (*(array+i) == ON)
        {
            *(array+i) = OFF;
            offCounter++;
        }
        else
        {
            j = 0;
            while ((*array+i+j) == OFF)
            {
                j++;
            }
            *(array+i+j) = OFF;
            offCounter++;
        }
        if (*(array+13) == OFF && offCounter != n) return 1;
        if (offCounter == n) break;
        incrementCounter = 0;
        while (incrementCounter != m)
        {
            i++;
            if (i > n) i = 1;
            if (*(array+i) == ON) incrementCounter++;
        }
    }
    return 0;
}
You say that "surprisingly the program compiles". Actually, it is not surprising at all. C & C++ allow for functions to have variable argument lists. The definition for printf is something like this:
int printf(char*, ...);
The "..." signifies that there are zero or more optional arguments to the function. In fact, one of the main reasons C has optional arguments is to support the printf & scanf family of functions.
C has no special knowledge of the printf function. In your example:
printf("%d");
The compiler doesn't analyse the format string and determine that an integer argument is missing. This is perfectly legal C code. The fact that you are missing an argument is a semantic issue that only appears at runtime. The printf function will assume that you have supplied the argument and go looking for it on the stack. It will pick up whatever happens to be on there. It just happens that in your special case it is printing the right thing, but this is an exception. In general you will get garbage data. This behaviour will vary from compiler to compiler and will also change depending on what compile options you use; if you switch on compiler optimisation you will likely get different results.
As pointed out in one of the comments to my answer, some compilers have "lint" like capabilities that can actually detect erroneous printf/scanf calls. This involves the compiler parsing the format string and determining the number of extra arguments expected. This is very special compiler behaviour and will not detect errors in the general case. i.e. if you write your own "printf_better" function which has the same signature as printf, the compiler will not detect if any arguments are missing.
What happens looks like this.
printf("%d", m);
On most systems the address of the string will get pushed on the stack, and then 'm' as an integer (assuming it's an int/short/char). There is no warning because printf is basically declared as 'int printf(const char *, ...);' - the ... meaning 'anything goes'.
So since 'anything goes' some odd things happen when you put variables there. Any integral type smaller than an int goes as an int - things like that. Sending nothing at all is ok as well.
In the printf implementation (or at least a 'simple' implementation) you will find usage of va_list and va_arg (names sometimes differ slightly based on conformance). These are what an implementation uses to walk through the '...' part of the argument list. The problem here is that there is NO type checking. Since there is no type checking, printf will pull random data off the execution stack when it looks at the format string ("%d") and thinks there is supposed to be an 'int' next.
A random shot in the dark would be that the function call you made just before printf happened to pass 'm-1' as its second parameter. That's one of many possibilities, but it would be interesting if that happened to be the case. :)
Good luck.
By the way, most modern compilers (GCC, I believe?) have warnings that can be enabled to detect this problem. Lint does as well, I believe. Unfortunately, I think with VC you need to use the /analyze flag instead of getting it for free.
It got an int off the stack.
http://en.wikipedia.org/wiki/X86_calling_conventions
You're peering into the stack. Change the optimizer settings, and this may change. Change the order of the declarations of your variables (particularly m). Make m a register variable. Make m a global variable.
You'll see some variations in what happens.
This is similar to the famous buffer overrun hacks that you get when you do simplistic I/O.
While I would highly doubt this would result in a memory violation, the integer you get is undefined garbage.
You found one behavior. It could have been any other behavior, including an invalid memory access.
