Behaviour of printf when printing a %d without supplying variable name - c

I've just encountered a weird problem: I'm trying to printf an integer variable, but I forgot to specify the variable name, i.e.
printf("%d");
instead of
printf("%d", integerName);
Surprisingly the program compiles, there is output and it is not random. In fact, it happens to be the very integer I wanted to print in the first place, which happens to be m-1.
The erroneous printf statement will consistently output m-1 for as long as the program keeps running. In other words, it's behaving exactly as if the statement reads
printf("%d", m-1);
Does anybody know the reason behind this behaviour? I'm using g++ without any command line options.
#include <iostream>
#include <cstdio>
#define maxN 100
#define ON 1
#define OFF 0
using namespace std;
void clearArray(int* array, int n);
int fillArray(int* array, int m, int n);
int main()
{
int n = -1, i, m;
int array[maxN];
int found;
scanf("%d", &n);
while(n!=0)
{
found=0;
m = 1;
while(found!=1)
{
if(m != 2 && m != 3 && m != 4 && m != 6 && m != 12)
{
clearArray(array, n);
if(fillArray(array, m, n) == 0)
{
found = 1;
}
}
m++;
}
printf("%d\n");
scanf("%d", &n);
}
return 0;
}
void clearArray(int* array, int n)
{
for(int i = 1; i <= n; i++)
array[i] = ON;
}
int fillArray(int* array, int m, int n)
{
int i = 1, j, offCounter = 0, incrementCounter;
while(offCounter != n)
{
if(*(array+i)==ON)
{
*(array+i) = OFF;
offCounter++;
}
else
{
j = 0;
while((*array+i+j)==OFF)
{
j++;
}
*(array+i+j) = OFF;
offCounter++;
}
if(*(array+13) == OFF && offCounter != n) return 1;
if(offCounter ==n) break;
incrementCounter = 0;
while(incrementCounter != m)
{
i++;
if(i > n) i = 1;
if(*(array+i) == ON) incrementCounter++;
}
}
return 0;
}

You say that "surprisingly the program compiles". Actually, it is not surprising at all. C & C++ allow for functions to have variable argument lists. The definition for printf is something like this:
int printf(const char *format, ...);
The "..." signifies that there are zero or more optional arguments to the function. In fact, one of the main reasons C has optional arguments is to support the printf & scanf family of functions.
C has no special knowledge of the printf function. In your example:
printf("%d");
The compiler doesn't analyse the format string and determine that an integer argument is missing. This is perfectly legal C code. The fact that you are missing an argument is a semantic issue that only appears at runtime. The printf function will assume that you have supplied the argument and go looking for it on the stack. It will pick up whatever happens to be on there. It just happens that in your special case it is printing the right thing, but this is an exception. In general you will get garbage data. This behaviour will vary from compiler to compiler and will also change depending on what compile options you use; if you switch on compiler optimisation you will likely get different results.
As pointed out in one of the comments to my answer, some compilers have "lint"-like capabilities that can actually detect erroneous printf/scanf calls. This involves the compiler parsing the format string and determining the number of extra arguments expected. This is very special compiler behaviour and will not detect errors in the general case. For example, if you write your own "printf_better" function which has the same signature as printf, the compiler will not detect if any arguments are missing.

What happens looks like this.
printf("%d", m);
On most systems the address of the string will get pushed on the stack, and then 'm' as an integer (assuming it's an int/short/char). There is no warning because printf is basically declared as 'int printf(const char *, ...);' - the ... meaning 'anything goes'.
So since 'anything goes' some odd things happen when you put variables there. Any integral type smaller than an int goes as an int - things like that. Sending nothing at all is ok as well.
In the printf implementation (or at least a 'simple' implementation) you will find usage of va_list and va_arg (names sometimes differ slightly based on conformance). These are what an implementation uses to walk around the '...' part of the argument list. Problem here is that there is NO type checking. Since there is no type checking, printf will pull random data off the execution stack when it looks at the format string ("%d") and thinks there is supposed to be an 'int' next.
Random shot in the dark would say that the function call you made just before printf possibly passed 'm-1' as its second parameter? That's one of many possibilities - but it would be interesting if this happened to be the case. :)
Good luck.
By the way - most modern compilers (GCC I believe?) have warnings that can be enabled to detect this problem. Lint does as well I believe. Unfortunately I think with VC you need to use the /analyze flag instead of getting it for free.

It got an int off the stack.
http://en.wikipedia.org/wiki/X86_calling_conventions

You're peering into the stack. Change the optimizer values, and this may change. Change the order of the declarations of your variables (particularly m). Make m a register variable. Make m a global variable.
You'll see some variations in what happens.
This is similar to the famous buffer overrun hacks that you get when you do simplistic I/O.

While I would highly doubt this would result in a memory violation, the integer you get is undefined garbage.

You found one behavior. It could have been any other behavior, including an invalid memory access.

Related

scanf inside function to return value (or other function)

So I was going to run a function in an infinite loop which takes a number input, but then I remembered I couldn't do
while (true) {
myfunc(scanf("%d"));
}
because I need to put the scanf input into a variable. I can't do scanf("%*d") because that doesn't return the value at all. I don't want to have to do
int temp;
while (true) {
scanf("%d", &temp);
myfunc(temp);
}
or include more libraries. Is there any standard single function like gets? (I could do myfunc((int) strtol(gets(), (char**) NULL, 10)); but it's kinda messy, so yeah.)
Sorry if I'm asking too much or being pedantic and I should just do the above.
By the way, an unrelated question: is there any way to declare a string as an int, or even better, a single function for converting an int to a string? I usually use
//num is some number
char* str = (char*) malloc(12);
sprintf(str, "%d", num);
func(str);
but wouldn't func(str(num)); be easier?
For starters, the return value of scanf (and similar functions) is the number of conversions that took place. That return value is also used to signify if an error occurred.
In C you must manually manage these errors.
if ((retv = scanf("%d", &n)) != 1) {
/* Something went wrong. */
}
What you seem to be looking for are conveniences found in higher-level languages. Languages & runtimes that can hide the details from you with garbage collection strategies, exception nets (try .. catch), etc. C is not that kind of language, as by today's standards it is quite a low-level language. If you want "non-messy" functions, you will have to build them up from scratch, but you will have to decide what kinds of tradeoffs you can live with.
For example, perhaps you want a simple function that just gets an int from the user. A tradeoff you could make is that it simply returns 0 on any error whatsoever, in exchange for never knowing if this was an error, or the user actually input 0.
int getint(void) {
int n;
if (scanf("%d", &n) != 1)
return 0;
return n;
}
This means that if a user makes a mistake on input, you have no way of retrying, and the program must simply roll on ahead.
This naive approach scales poorly with the fact that you must manually manage memory in C. It is up to you to free any memory you dynamically allocate.
You could certainly write a simple function like
char *itostr(int n) {
char *r = malloc(12);
if (r && sprintf(r, "%d", n) < 1) {
r[0] = '0';
r[1] = '\0';
}
return r;
}
which does the most minimal of error checking (Again, we don't know if "0" is an error, or a valid input).
The problem comes when you write something like func(itostr(51));. Unless func is expected to free its argument (which would rule out passing non-dynamically allocated strings), you will constantly be leaking memory with this pattern.
So no, there is no real "easy" way to do these things. You will have to get "messy" (handle errors, manage memory, etc.) if you want to build anything with complexity.

How to include last element in loops in C?

This is a general issue I've run into, and I've yet to find a solution to it that doesn't feel very "hack-y". Suppose I have some array of elements xs = {a_1, a_2, ..., a_n}, where I know some x is in the array. I wish to loop through the array, and do something with each element, up to and including the element x. Here's a version that does almost that, except it leaves out the very last element. Note that in this example the array happens to be a sorted list of integers, but in the general case this might not necessarily be true.
int xs[] = {1,2,3,4,5};
for (int i = 0; xs[i] != 4; i++) {
foo(xs[i]);
}
The only solutions I've seen so far are:
Just add a final foo(xs[i]); statement after the for-loop. This is first of all ugly and repetitious, especially in the case where foo is not just a function call but a list of statements. Second, it requires i to be defined outside the scope of the for-loop.
Manually break the loop, with an if-statement inside an infinite loop. This again seems ugly to me, since we're not really using the for and while constructs to their full extent. The problem is almost archetypal of what you'd use a for-loop for, the only difference is that we just want it to go through the loop one more time.
Does anyone know of a good solution to this problem?
In C, the for loop is a "check before body" operation. You want the "check after body" variant, a do while loop, something like:
int xs[] = {1,2,3,4,5};
{
int i = 0;
do {
foo(xs[i]);
} while (xs[i++] != 4);
}
You'll notice I've enclosed the entire chunk in its own scope (the outermost {} braces). This is just to limit the existence of i to make it conform more with the for loop behaviour.
In terms of a complete program showing this, the following code:
#include <stdio.h>
void foo(int x) {
printf("%d\n", x);
}
int main(void) {
int xs[] = {1,2,3,4,5};
{
int i = 0;
do {
foo(xs[i]);
} while (xs[i++] != 4);
}
return 0;
}
outputs:
1
2
3
4
As an aside, like you, I'm also not that keen on the two other solutions you've seen.
For the first solution, that won't actually work in this case since the lifetime of i is limited to the for loop itself (the int in the for statement initialisation section makes this so).
That means i will not have the value you expect after the loop. Either there will be no i (a compile-time error) or there will be an i which was hidden within the for loop and therefore unlikely to have the value you expect, leading to insidious bugs.
For the second, I will sometimes break loops within the body but generally only at the start of the body so that the control logic is still visible in a single area. I tend to do that if the for condition would be otherwise very lengthy but there are other ways to do this.
Try processing the loop as long as the previous element (if available) is not 4:
int xs[] = {1,2,3,4,5};
for (int i = 0; i == 0 || xs[i - 1] != 4; i++) {
foo(xs[i]);
}
This may not be a direct answer to the original question, but I would strongly suggest against making a habit of parsing arrays like that (it's like a ticking bomb waiting to explode at a random point in time).
I know you said you already know x is a member of xs, but when it is not (and this can accidentally happen for a variety of reasons) then your loop will crash your program if you are lucky, or it will corrupt irrelevant data if you are not lucky.
In my opinion, it is neither ugly nor "hacky" to be defensive with an extra check.
If the hurdle is the seemingly unknown length of xs, it is not. Static arrays have a known length, either by declaration or by initialization (like your example). In the latter case, the length can be calculated on demand within the scope of the declared array, by sizeof(arr) / sizeof(*arr) - you can even make it a reusable macro.
#define ARRLEN(a) (sizeof(a)/sizeof(*(a)))
...
int xs[] = {1,2,3,4,5};
/* way later... */
size_t xslen = ARRLEN(xs);
for (size_t i=0; i < xslen; i++) {
foo( xs[i] ); /* process every element up to and including 4 */
if (xs[i] == 4) {
break;
}
}
This will not overrun xs, even when 4 is not present in the array.
EDIT:
"within the scope of the declared array" means that the macro (or its direct code) will not work on an array passed as a function parameter.
// This does NOT work, because sizeof(arr) returns the size of an int-pointer
size_t foo( int arr[] ) {
return( sizeof(arr)/sizeof(*arr) );
}
If you need the length of an array inside a function, you can pass it too as a parameter along with the array (which actually is just a pointer to the 1st element).
Or if performance is not an issue, you may use the sentinel approach, explained below.
[end of EDIT]
An alternative could be to manually mark the end of your array with a sentinel value (a value you intentionally consider invalid). For example, for integers it could be INT_MAX:
#include <limits.h>
...
int xs[] = {1,2,3,4,5, INT_MAX};
for (size_t i=0; xs[i] != INT_MAX; i++) {
foo( xs[i] ); /* process every element up to and including 4 */
if (xs[i] == 4) {
break;
}
}
Sentinels are quite common for parsing unknown-length dynamically-allocated arrays of pointers, with the sentinel being NULL.
Anyway, my main point is that preventing accidental buffer overruns probably has a higher priority compared to code prettiness :)

C function returns correct result but no return statement is given, why?

So I've written a function that should return the n'th Fibonacci number, but I forgot to actually return my result. I did get the "control reaches end of non-void function" warning, but the code executed fine and returned the correct result. Why is that? How does C know that it should return "result"?
int fib (int n);
int main(int argc, char** argv) {
printf("%d", fib(10));
}
int fib (int n){
if (n == 1){
return 1;
}
if (n == 0){
return 0;
}
fib(n-2) + fib(n-1);
}
It returned this
55
I've tried to add
int j = n+1
to the last line of the function, and then it actually returned 2, not 256. Is this a bug, or how does C read something like this?
Reaching the end of a non-void function without a return statement, and then using the returned value, invokes undefined behavior. Getting the expected result is a form of undefined behavior commonly called luck. By the way, 256 may be what you expected, but it is not correct.
A possible explanation is: the last value computed by the function and stored into the register that would normally contain the return value is the expected result.
Of course you should never rely on this, nor expect it.
This is a good example of the use of compiler warnings: do not ignore them. Always turn on more compiler warnings and fix the code. gcc -Wall -W or clang -Weverything can spot many silly mistakes and save hours of debugging.
Here are some other problems:
you do not include <stdio.h>
you compute an unsigned long long but only return a probably smaller type int
your algorithm computes powers of 2, not Fibonacci numbers.

Code only works if all variables are set to 0 first. UB?

This code fails randomly by correctly identifying some numeric palindromes and failing on others.
#include <stdio.h>
int main(int argc, char *argv[])
{
int n, reverse = 0, temp;
printf("Enter a number to check if it is a palindrome or not\n");
scanf("%d",&n);
temp = n;
while( temp != 0 )
{
reverse = reverse * 10;
reverse = reverse + temp%10;
temp = temp/10;
}
if ( n == reverse )
printf("%d is a palindrome number.\n", n);
else
printf("%d is not a palindrome number.\n", n);
return 0;
}
For example, the above code incorrectly says "87678" isn't a numeric palindrome.
Checking the return of scanf() shows it's succeeding and printing the value of n is correct for input of 87678.
However the code correctly says "4554" is a palindrome.
However, by adding:
n = reverse = temp = 0;
before the first printf() the program appears to work correctly all the time. So what is happening in the first version? Is this some sort of undefined behavior when the variables aren't initialized before use?
EDIT: Will later provide the assembly of the compiled version that is failing to see what the compiler is doing.
Unless sizeof(int) is less than 4, you've either hit a compiler bug, your hardware is malfunctioning, or you have some form of data corruption going on in your system.
To answer the question: no, there's no undefined behavior anywhere in your program (assuming the scanf() really doesn't fail).
Try running memtest on your system to rule out RAM issues: http://www.memtest.org
It sounds very much like you have a compiler error since this works with later versions of gcc. I'd be very interested to see the output of gcc -S (pastebin please?) and also to know the compile command you are using. (optimization level especially).
Unlike Java, C does not have a default value for int. You can refer to this post, as it discusses a similar problem.

Why does the dependence graph of this scanf()-using program by Frama-C look like this?

I use the Frama-C tool to generate the dependence graph of this program (main.c).
#include<stdio.h>
int main()
{
int n,i,m,j;
while(scanf("%d",&n)!=EOF)
{
m=n;
for(i=n-1;i>=1;i--)
{
m=m*i;
while(m%10==0)
{
m=m/10;
}
m=m%10000;
}
m=m%10;
printf("%5d -> %d\n",n,m);
}
return 0;
}
The command is:
frama-c -pdg -dot-pdg main main.c
dot -Tpdf main.main.dot -o main.pdf
The result is
My question is why the statements "m=m*i;" and "m=m%10000;" do not map to nodes. The result does not seem right, because there are three loops in the code.
A slicer for C programs only works in practice if its defined goal is to preserve defined executions, and the slicer is allowed to change undefined executions.
Otherwise, the slicer would be unable to remove a statement such as x = *p; as soon as it is unable to determine that p is a valid pointer at that point, even if it knows that it does not need x, just because if the statement is removed, executions where p is NULL at that point are changed.
Frama-C does not handle complex library functions such as scanf(). Because of this, it thinks that local variable n is used without being initialized.
Type frama-c -val main.c
You should get a warning like:
main.c:10:[kernel] warning: accessing uninitialized left-value:
assert \initialized(&n);
...
[value] Values for function main:
NON TERMINATING FUNCTION
The word assert means that Frama-C's option -val is unable to determine that all executions are defined, and "NON TERMINATING FUNCTION" means that it is unable to find a single defined execution of the program to continue from.
The undefined use of an uninitialized variable is the reason the PDG removes most statements. The PDG algorithm thinks it can remove them because they come after what it thinks is an undefined behavior, the first access to variable n.
I modified your program slightly to replace the scanf() call with a simpler statement:
#define EOF (-1)
int unknown_int();
int scan_unknown_int(int *p)
{
*p = unknown_int();
return unknown_int();
}
int main()
{
int n,i,m,j;
while(scan_unknown_int(&n) != EOF)
{
m=n;
for(i=n-1;i>=1;i--)
{
m=m*i;
while(m%10==0)
{
m=m/10;
}
m=m%10000;
}
m=m%10;
printf("%5d -> %d\n",n,m);
}
return 0;
}
and I got the PDG below. It looks complete as far as I can tell. If you know of a better layout program than dot that accepts the dot format, this is a good chance to use it.
Note that the condition of the outermost while became tmp != -1. The nodes of the graph are the statements of an internal normalized representation of the program. The condition tmp != -1 has a data dependency on the node for the statement tmp = unknown_int();. You can display the internal representation with frama-c -print main.c, and it will show that the outermost loop condition has been broken into:
while (1) {
int tmp;
tmp = scan_unknown_int(& n);
if (! (tmp != -1)) { break; }
This helps, among other things, the slicing to remove only the parts of a complex statement that can be removed instead of having to keep the entire complex statement.

Resources