Why does this do whatever it does? - c

#include <stdio.h>
void littledot(){}//must use C, not C++
int main() {
littledot(568,76,105,84,116,76,101,68,111,84);
printf("%c%c%c%c%c%c%c%c%c\n");
getchar();
return 0;
}
The above code yields the result "LiTtLeDoT". Why does it do that? Why is 568 crucial?

This differs per platform and is UB (the implementation can do anything it wants*), but probably the arguments to littledot() are still on the stack after littledot() returns and printf prints those arguments from the stack.
NEVER RELY ON THIS!
*really anything. Afaik an ancient version of GCC started a videogame when it encountered something that would behave in an undefined way.

You were lucky. This is undefined behaviour, specifically the call to printf. The program could do anything. Your implementation happens to write "LiTtLeDoT".
That really is the nature of undefined behaviour. The compiler can do anything it wants. If you really want to know why it does what it does then you will need to look at the emitted object code. Looking at the C code will yield nothing because of the aforementioned undefined behaviour.

This is what's working reliably for me with Open Watcom 1.9 on Windows:
//must use C, not C++
#include <stdio.h>
#include <stdlib.h>
void
__stdcall // callee cleans up
littledot()
{
}
int main(void)
{
littledot(/*568,*/'L','i','T','t','L','e','D','o','T');
printf("%c%c%c%c%c%c%c%c%c\n");
getchar();
exit(0);
return 0;
}
littledot() is called with a number of parameters that are passed on the stack.
If the calling convention for littledot() is __stdcall (or __fastcall), it will have to remove its parameters from the stack.
If it's __cdecl, then main() will have to remove them, but this won't work for us.
However, littledot() doesn't and can't do anything about the parameters because they're not specified, which is something you can do in C, but not C++.
So, what happens is that not only littledot()'s parameters remain on the stack after the call to it, but also the stack pointer is not restored (because neither littledot() nor main() remove the parameters, which is typically done by adjusting the stack pointer) and the stack pointer points to 'L'. Then there's the call to printf() that first places on the stack the address of the format string "%c%c%c%c%c%c%c%c%c\n" thus forming all expected parameters for the function on the stack. printf() happily prints the text and returns.
After that the stack isn't correctly balanced and doing return in main() is risky as the app may crash. Returning by means of exit(0) fixes that.
As all others have said, none of this is guaranteed to work. It may only work with specific compilers for specific OSes and then only sometimes.

http://codepad.org/tfRLaCB5
I'm sorry, what you claim the program to print is not what happens on my box and not what happens on codepad's box.
And the reason is, the program has undefined behavior. printf expects one additional argument (an int) for every %c you have in the format string. You don't give it those arguments, hence anything can happen.
You are in a situation where with certain implementations of printf, compiler options, certain compilers and certain ABIs, you end up with that output. But you should not think that this output is required by any specification.

Related

The "%p" printf parameter

I have this code:
#include <stdio.h>
#include <string.h>
void main(){
printf("%p");
}
This is the output:
0x7ffdd9b973d8
I know %p stands for pointer and when using it as for example
#include <stdio.h>
#include <string.h>
void main(){
int i = 0;
printf("%p", i);
}
it returns the pointer address of i. But my question is what does it return when not adding any other argument in the printf function just printf("%p")
Trash. printf uses a variable-length argument list. It uses the format string to determine how many arguments you actually passed. If you did not actually pass anything in, it will still read from basically arbitrary portions of memory as though you did. The result is undefined/trash.
Some compilers will be able to catch this situation with a warning because the printf family of functions is so popular. Some cases may crash your system if the function tries to read from memory you do not have access to. There is no way to tell how it will behave next time even if you have obtained a certain result.
But my question is what does it return when not adding any other argument in the printf function just printf("%p");
Anything. Nothing. Random junk. Maybe it crashes.
There is no way to know without investigating a specific combination of compiler, CPU, platform, libraries, execution environment, and so on. There is no rule that requires it to operate any particular way.
The behavior of
printf("%p");
is undefined. When you specify a %p format in the format string, the corresponding argument of void * (or char *) type shall be present in the argument list.

What main(++i) will return in C

I have a program like this.
#include<stdio.h>
#include<stdlib.h>
int main(int i) { /* i will start from value 1 */
if(i<10)
printf("\n%d",main(++i)); /* printing the values until i becomes 9 */
}
output :
5
2
2
2
Can anyone explain how the output is coming ?? what main(++i) is returning for each iteration.
Also it is producing output 5111 if i remove the \n in the printf function.
Thanks in advance.
First of all, the declaration of main() is supposed to be int main(int argc, char **argv). You cannot modify that. Even if your code compiles, the system will call main() the way it is supposed to be called, with the first parameter being the number of parameters of your program (1 if no parameter is given). There is no guarantee it will always be 1. If you run your program with additional parameters, this number will increase.
Second, your printf() is attempting to print the return value of main(++i); however, your main() simply doesn't return anything at all. You have to give your function a return value if you expect to see any coherence here.
And finally, you are not supposed to call your own program's entrypoint, much less play with recursion with it. Create a separate function for this stuff.
Here's what the C Draft Standard (N1570) says about main:
5.1.2.2.1 Program startup
1 The function called at program startup is named main. The implementation declares no
prototype for this function. It shall be defined with a return type of int and with no
parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though any names may be
used, as they are local to the function in which they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent or in some other implementation-defined manner.
Clearly, the main function in your program is neither of the above forms. Unless your platform supports the form you are using, your program is exhibiting undefined behavior.
This program has undefined behavior (UB) all over the place, and if you have a single instance of undefined behavior in your program, you can't safely assume any output or behavior of your program. Legally, anything can happen (although in the real world the effects are often somewhat localized near the place of the UB in the code).
The old C90 standard listed more than 100 (if I recall correctly) situations of UB, and on top of that there is an unknown number of further UBs: situations whose behavior the standard simply does not describe. Such a set of UB situations exists for every C and C++ standard.
In your case (without consulting standards) instances of UB are at least:
not returning a value from a function that is declared with a return value. (exception: calling main the FIRST time - thanks, Jim for the comments)
defining (and calling) main other than with the predefined forms of the standard, or as specified (as implementation defined behavior) by your compiler.
Since you have at least one instance of UB in your program, speculations about the results are somewhat... speculative, and must make assumptions about your compiler, your operating system, hardware, and even software running in parallel, which are normally not documented or cannot be known.
You are not initializing i, so by default value will be taken from the address where it is stored in RAM.
This code will produce garbage output if you run the code multiple times after restarting your computer.
The output will also depend on compiler.
I'm surprised that even compiles.
When the operating system actually runs the program and main() gets called, two 32 (or 64) bit values are passed to it. You can either ignore them by declaring main(void), or use them by declaring main(int argc, char** args).
As the above prototype suggests, the first value passed is a count of the number of command-line arguments that are being passed to the process, and the second is a pointer to where a list of these arguments is stored in memory, likely on the program's local stack.
The value of argc is always at least 1, because the first item string in args is always the name of the program itself, generated by the OS.
Regarding your unexpected output, I'd say something is not getting pulled off or pushed onto the stack, so variables are getting mixed up. This is either due to the incomplete argument list for main() or the fact that you've declared main to return an int, but haven't returned anything.
I think, the main method is calling itself inside the main method.
To increment the value of a variable: i++ yields the value of i before it is incremented, while ++i increments i first and then yields the new value.
You could use this:
#include <stdio.h>
int x = 0;
int main(void)
{
do
{
printf("%d\n", x++); /* print x, then increment it */
} while (x < 10);
return 0;
}

How undefined is undefined behavior?

I'm not sure I quite understand the extent to which undefined behavior can jeopardize a program.
Let's say I have this code:
#include <stdio.h>
int main()
{
int v = 0;
scanf("%d", &v);
if (v != 0)
{
int *p;
*p = v; // Oops
}
return v;
}
Is the behavior of this program undefined for only those cases in which v is nonzero, or is it undefined even if v is zero?
I'd say that the behavior is undefined only if the user enters a number different from 0. After all, if the offending code section is not actually run, the conditions for UB aren't met (i.e. the uninitialized pointer is neither created nor dereferenced).
A hint of this can be found in the standard, at 3.4.3:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
This seems to imply that, if such "erroneous data" was instead correct, the behavior would be perfectly defined - which seems pretty much applicable to our case.
Additional example: integer overflow. Any program that does an addition with user-provided data without doing extensive check on it is subject to this kind of undefined behavior - but an addition is UB only when the user provides such particular data.
Since this has the language-lawyer tag, I have an extremely nitpicking argument that the program's behavior is undefined regardless of user input, but not for the reasons you might expect -- though it can be well-defined (when v==0) depending on the implementation.
The program defines main as
int main()
{
/* ... */
}
C99 5.1.2.2.1 says that the main function shall be defined either as
int main(void) { /* ... */ }
or as
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; or in some other implementation-defined manner.
int main() is not equivalent to int main(void). The former, as a declaration, says that main takes a fixed but unspecified number and type of arguments; the latter says it takes no arguments. The difference is that a recursive call to main such as
main(42);
is a constraint violation if you use int main(void), but not if you use int main().
For example, these two programs:
int main() {
if (0) main(42); /* not a constraint violation */
}
int main(void) {
if (0) main(42); /* constraint violation, requires a diagnostic */
}
are not equivalent.
If the implementation documents that it accepts int main() as an extension, then this doesn't apply for that implementation.
This is an extremely nitpicking point (about which not everyone agrees), and is easily avoided by declaring int main(void) (which you should do anyway; all functions should have prototypes, not old-style declarations/definitions).
In practice, every compiler I've seen accepts int main() without complaint.
To answer the question that was intended:
Once that change is made, the program's behavior is well defined if v==0, and is undefined if v!=0. Yes, the definedness of the program's behavior depends on user input. There's nothing particularly unusual about that.
Let me give an argument for why I think this is still undefined.
First, the responders saying this is "mostly defined" or somesuch, based on their experience with some compilers, are just wrong. A small modification of your example will serve to illustrate:
#include <stdio.h>
int
main()
{
int v;
scanf("%d", &v);
if (v != 0)
{
printf("Hello\n");
int *p;
*p = v; // Oops
}
return v;
}
What does this program do if you provide "1" as input? If your answer is "It prints Hello and then crashes", you are wrong. "Undefined behavior" does not mean the behavior of some specific statement is undefined; it means the behavior of the entire program is undefined. The compiler is allowed to assume that you do not engage in undefined behavior, so in this case, it may assume that v is zero and simply not emit any of the bracketed code at all, including the printf.
If you think this is unlikely, think again. GCC may not perform this analysis exactly, but it does perform very similar ones. My favorite example that actually illustrates the point for real:
int test(int x) { return x+1 > x; }
Try writing a little test program to print out INT_MAX, INT_MAX+1, and test(INT_MAX). (Be sure to enable optimization.) A typical implementation might show INT_MAX to be 2147483647, INT_MAX+1 to be -2147483648, and test(INT_MAX) to be 1.
In fact, GCC compiles this function to return a constant 1. Why? Because integer overflow is undefined behavior, therefore the compiler may assume you are not doing that, therefore x cannot equal INT_MAX, therefore x+1 is greater than x, therefore this function can return 1 unconditionally.
Undefined behavior can and does result in variables that are not equal to themselves, negative numbers that compare greater than positive numbers (see above example), and other bizarre behavior. The smarter the compiler, the more bizarre the behavior.
OK, I admit I cannot quote chapter and verse of the standard to answer the exact question you asked. But people who say "Yeah yeah, but in real life dereferencing NULL just gives a seg fault" are more wrong than they can possibly imagine, and they get more wrong with every compiler generation.
And in real life, if the code is dead you should remove it; if it is not dead, you must not invoke undefined behavior. So that is my answer to your question.
If v is 0, your random pointer assignment never gets executed, and the function will return zero, so it is not undefined behaviour
When you declare variables (especially explicit pointers), a piece of memory is allocated (usually the size of an int). This piece of memory is reserved for the variable, but the old value stored there is not cleared (this depends on how the compiler implements memory allocation; it might fill the place with zeroes), so your int *p will have a random value (junk) which is interpreted as an address. The result is the place in memory that p points to (p's pointee). When you try to dereference it (i.e. access that piece of memory), it will (almost every time) be occupied by another process/program, so trying to alter someone else's memory will result in an access violation raised by the memory manager.
So in this example, any value other than 0 will result in undefined behavior, because no one knows what p will point to at that moment.
I hope this explanation is of any help.
Edit: Ah, sorry, again few answers ahead of me :)
It is simple. If a piece of code doesn't execute, it doesn't have a behavior!!!, whether defined or not.
If input is 0, then the code inside if doesn't run, so it depends on the rest of the program to determine whether the behavior is defined (in this case it is defined).
If input is not 0, you execute code that we all know is a case of undefined behavior.
I would say it makes the whole program undefined.
The key to undefined behavior is that it is undefined. The compiler can do whatever it wants to when it sees that statement. Now, every compiler will handle it as expected, but they still have every right to do whatever they want to - including changing parts unrelated to it.
For example, a compiler may choose to add a message "this program may be dangerous" to the program if it detects undefined behavior. This would change the output whether or not v is 0.
Your program is pretty-well defined. If v == 0 then it returns zero. If v != 0 then it splatters over some random point in memory.
p is a pointer, its initial value could be anything, since you don't initialise it. The actual value depends on the operating system (some zero memory before giving it to your process, some don't), your compiler, your hardware and what was in memory before you ran your program.
The pointer assignment is just writing into a random memory location. It might succeed, it might corrupt other data or it might segfault - it depends on all of the above factors.
As far as C goes, it's pretty well defined that uninitialised variables do not have a known value, and your program (though it might compile) will not be correct.

C programming language, array, pointer

int main()
{
int j=97;
char arr[4]="Abc";
printf(arr,j);
getch();
return 0;
}
this code gives me a stack overflow error why?
But if instead of printf(arr,j) we use printf(arr) then it prints Abc.
please tell me how printf works , means 1st argument is const char* type so how arr is
treated by compiler.
sorry! above code is right it doesn't give any error,I write this by mistake. but below code give stack overflow error.
#include <stdio.h>
int main()
{
int i, a[i];
getch();
return 0;
}
Since variable i holds a garbage value, that value becomes the size of the array.
So why does this code give this error when I use Dev-C++, while Turbo C++ 3.0 displays
"error: constant expression required"? If the size of an array can't be a variable, why is
no error displayed when we take the size of the array from user input, but there is one in this case?
please tell me how printf works
First of all, pass only non-user supplied or validated strings to the first argument of printf()!
printf() accepts a variable number of arguments after the required const char* argument (because printf() is what's called a variadic function). The first const char* argument is where you pass a format string so that printf() knows how to display the rest of your arguments.
If the arr character array contains user-inputted values, then it may cause a segfault if the string happens to contain those formatting placeholders, so the format string should always be a hard-coded constant (or validated) string. Your code sample is simple enough to see that it's really a constant, but it's still good practice to get used to printf("%s", arr) to display strings instead of passing them directly to the first argument (unless you absolutely have to of course).
That being said, you use the formatting placeholders (those that start with %) to format the output. If you want to display:
Abc 97
Then your call to printf() should be:
printf("%s %d", arr, j);
The %s tells printf() that the second argument should be interpreted as a pointer to a null-terminated string. The %d tells printf() that the third argument should be interpreted as a signed decimal.
this code gives me a stack overflow error why?
See AndreyT's answer.
I see that now the OP changed the description of the behavior to something totally different, so my explanation no longer applies to his code. Nevertheless, the points I made about variadic functions still stand.
This code results in stack invalidation (or something similar) because you failed to declare function printf. printf is a so called variadic function, it takes variable number of arguments. In C language it has [almost] always been mandatory to declare variadic functions before calling them. The practical reason for this requirement is that variadic functions might (and often will) require some special approach for argument passing. It is often called a calling convention. If you forget to declare a variadic function before calling it, a pre-C99 compiler will assume that it is an ordinary non-variadic function and call it as an ordinary function. I.e. it will use a wrong calling convention, which in turn will lead to stack invalidation. This all depends on the implementation: some might even appear to "work" fine, some will crash. But in any case you absolutely have to declare variadic functions before calling them.
In this case you should include <stdio.h> before calling printf. Header file <stdio.h> is a standard header that contains the declaration of printf. You forgot to do it; hence the error (most likely). There's no way to be 100% sure, since it depends on the implementation.
Otherwise, your code is valid. The code is weird, since you are passing j to printf without supplying a format specifier for it, but it is not an error - printf simply ignores extra variadic arguments. Your code should print Abc in any case. Add #include <stdio.h> at the beginning of your code, and it should work fine, assuming it does what you wanted it to do.
Again, this code
#include <stdio.h>
int main()
{
int j=97;
char arr[4]="Abc";
printf(arr,j);
return 0;
}
is a strange, but perfectly valid C program with a perfectly defined output (adding \n at the end of the output would be a good idea though).
In your line int i, a[i]; in the corrected sample of broken code, a is a variable-length array of i elements, but i is uninitialized. Thus your program has undefined behavior.
You see strings in C language are treated as char* and printf function can print a string directly. For printing strings using this function you should use such code:
printf("%s", arr);
%s tells the function that the first variable will be char*.
If you want to print both arr and j you should define the format first:
printf("%s%d", arr, j);
%d tells the function that the second variable will be int
I suspect the printf() issue is a red herring, since printf() with the null-terminated format string "Abc" will ignore the other arguments.
Have you debugged your program? If not can you be sure the fault isn't in getch()?
I cannot duplicate your issue but then I commented out the getch() for simplicity.
BTW, why did you not use fgetc() or getchar()? Are you intending to use curses in a larger program?
===== Added after your edit =====
Okay, not a red herring, just a mistake by the OP.
Standard C++ does not allow an array whose size is specified by a variable, but C99 does, and GCC (which Dev-C++ uses) accepts it in C++ as an extension; you've essentially done this with a random (garbage) size and overflowed the stack, as you deduced. When you compile with C++ you are typically no longer compiling C and the rules change (depending on the particular compiler).
That said, I don't understand your question - you need to be a lot more clear with "when we take size of array through user input" ...

Why would you precede the main() function in C with a data type? [duplicate]

Many are familiar with the hello world program in C:
#include <stdio.h>
main ()
{
printf ("hello world");
return 0;
}
Why do some precede the main() function with int as in:
int main()
Also, I've seen the word void entered inside the () as in:
int main(void)
It seems like extra typing for nothing, but maybe it's a best practice that pays dividends in other situations?
Also, why precede main() with an int if you're returning a character string? If anything, one would expect:
char main(void)
I'm also foggy about why we return 0 at the end of the function.
The main function returns an integer status code that tells the operating system whether the program exited successfully.
return 0 indicates success; returning any other value indicates failure.
Since this is an integer and not a string, it returns int, not char or char*. (Calling printf does not have anything to do with returning from the function)
Older versions of C allow a default return type of int.
However, it's better to explicitly specify the return type.
In C (unlike C++), a function that doesn't take any parameters is declared as int myFunc(void)
The following has been valid in C89
main() {
return 0;
}
But in modern C (C99), this isn't allowed anymore because you need to explicitly tell the type of variables and return type of functions, so it becomes
int main() {
return 0;
}
Also, it's legal to omit the return 0 in modern C, so it is legal to write
int main() {
}
And the behavior is as if it returned 0.
People put void between the parentheses because it ensures proper typechecking for function calls. An empty set of parentheses in C means that no information about the number and types of the parameters is exposed outside of the function, and the caller has to know these exactly.
void f();
/* defined as void f(int a) { } later in a different unit */
int main() {
f("foo");
}
The call to f causes undefined behavior, because the compiler can't verify the type of the argument against what f expects in the other modules. If you were to write it with void or with int, the compiler would know
void f(int); /* only int arguments accepted! */
int main(void) {
f("foo"); /* 'char*' isn't 'int'! */
}
So for main it's just a good habit to put void there since it's good to do it elsewhere. In C you are allowed to recursively call main in which case such differences may even matter.
Sadly, only a few compilers support modern C, so on many C compilers you may still get warnings for using modern C features.
Also, you may see programs that declare main to return something other than int. Such programs can do that if they use a freestanding C implementation. Such C implementations do not impose any restrictions on main since they don't even know or require such a function in the first place. But to have a common and portable interface to the program's entry point, the C specification requires strictly conforming programs to declare main with return type int, and requires hosted C implementations to accept such programs.
It's called an exit status. When the program finishes, you want to let the caller know how your program exited. If it exited normally, you'll return 0, etc.
Here's where you can learn more about exit statuses.
Why do some precede the main () function with int as in:
In C (C 89 anyway, that's the one you seem to be referring to), functions return int by default when they aren't preceded by any data type. Preceding it with int is ultimately just a matter of preference.
Also, I've seen the word 'void' entered inside the () as in:
This is also a matter of preference mostly, however these two are quite different:
f(void) means the function f accepts zero arguments, while f() means the function f accepts an unspecified number of arguments.
f(void) is the correct way to declare a function that takes no arguments in C.
It seems like extra typing for nothing, but maybe it's a best practice that pays dividends in other situations?
The benefit is clarity. If you're going to argue that less typing is good, then what do you think about naming all of your functions and variables and files and everything with as few characters as possible?
Also, why precede main() with an int if you're returning a character string? If anything, one would expect:
char main(void)
What? why in the world would you return a character string from main? There is absolutely no reason to do this...
Also, your declaration of main returns a character, not a character string.
I'm also foggy about why we return 0 at the end of the function.
A program returns a value to the operating system when it ends. That value can be used to inform the OS about certain things. For example, your virus scanner could return 1 if a virus was found, 0 if a virus wasn't found and 2 if some error occurred. Usually, 0 means there was no error and everything went well.
The reasoning behind this was to provide some form of error reporting mechanism. So whenever a program returns a zero value, that typically means that it believes it completed its task successfully. For non-zero values, the program failed for some reason. Every now and then, that number might correspond to a certain error message.
Several system API calls in C return status codes to indicate whether the processing was successful. Since there is no concept of an exception in C, it is the best way to tell if your calls were successful or not.
The "int" before the main function declaration is the return type of the function. That is, the function returns an integer.
From the function, you can return a piece of data. It is typical to return "0" if the function completed successfully. Any other number, up to 255 is returned if the program did not execute successfully, to show the error that happened.
That is, you return 0, with an int data type, to show that the program executed correctly.
It is traditional to use int main(int argc, char * argv[]) because the integer return can indicate how the program executed. Returning a 0 indicates that the program executed successfully, whereas a -1 would indicate that it had a failure of some sort. Usually, the exit codes would be very specific and would be used as a method of debugging since exceptions (in the C++ sense) don't exist in C.
Most operating systems use exit codes for applications. Typically, 0 is the exit code for success (as in, the application successfully completed). The return value of main() is what gets returned to the OS, which is why main() usually returns an integer (and since 0 is success, most C programs will have a return 0; at the end).

Resources