How does this code print 404? - c

I copied the code below from Stack Overflow's 404 Not Found error page.
# define v putchar
# define print(x)
main(){v(4+v(v(52)-4));return 0;}/*
#>+++++++4+[>++++++<-]>
++++.----.++++.*/
print(202*2);exit();
#define/*>.#*/exit()
The above code compiles fine and prints 404 on the console. I thought the statement print(202*2); was responsible for printing 404, but I was wrong, because the program still prints 404 after I change the numbers in this statement.
Could somebody help me to understand this code and how it prints 404?
I am posting the compilation output for your reference, as there are comments saying this code doesn't compile. The file containing the above code is Test.c.
gcc Test.c -o Test
Test.c:3:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 main(){v(4+v(v(52)-4));return 0;}/*
 ^
Test.c: In function ‘main’:
Test.c:1:12: warning: implicit declaration of function ‘putchar’ [-Wimplicit-function-declaration]
 # define v putchar
            ^
Test.c:3:8: note: in expansion of macro ‘v’
 main(){v(4+v(v(52)-4));return 0;}/*
        ^
Test.c: At top level:
Test.c:6:14: warning: data definition has no type or storage class
 print(202*2);exit();
              ^
Test.c:6:14: warning: type defaults to ‘int’ in declaration of ‘exit’ [-Wimplicit-int]
Test.c:6:14: warning: conflicting types for built-in function ‘exit’
./Test
404

Cannot use a meta-question as dupe, so blatantly copying from the MSO answer.
Since this is tagged c and mentions "compiled", I am extracting just the C part of it.
Credits: Mark Rushakoff is the original author of the polyglot.
The C code is fairly easy to read, but even easier if you run it
through a preprocessor:
main(){putchar(4+putchar(putchar(52)-4));return 0;};exit();
Your standard main function is declared there, and exit is also
declared as a function with an implicit return type of int (exit
is effectively ignored).
putchar was used because you don't need any #include to use it;
you give it an integer argument and it puts the corresponding ASCII
character to stdout and returns the same value you gave it. So, we
put 52 (which is 4); then we subtract 4 and output 0; then we add
4 to output 4 again.
Also, a little more elaboration, from Cole Johnson's (https://meta.stackoverflow.com/users/1350209/cole-johnson) answer:
Disregarding all of that, when we reformat the code a bit, and replace
52 with its ASCII equivalent ('4'), we get:
int main() {
putchar(4 + putchar(putchar('4') - 4));
return 0;
}
As for the putchar declaration, it is defined by the standard to
return its input, like realloc. First, this program prints a 4,
then takes the ASCII value (52), subtracts 4 (48), prints that
(ASCII 0), adds 4 (52), prints that (4), then finally
terminates. This results in the following output:
404
As for this polyglot being valid C++, unfortunately it is not, as
C++ requires an explicit return type for functions. This program takes
advantage of the fact that C (before C99) treats a function declared
without an explicit return type as returning int.
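The putchar chain described above can be unrolled one call at a time. This is only a sketch: print_404 is a hypothetical helper name, and like the original it assumes an ASCII execution character set.

```c
#include <assert.h>
#include <stdio.h>

/* Unrolled version of putchar(4 + putchar(putchar(52) - 4)).
   Assumes ASCII, where '4' is 52 and '0' is 48. */
int print_404(void)
{
    int first = putchar(52);          /* writes '4', returns 52 */
    int second = putchar(first - 4);  /* writes '0', returns 48 */
    int third = putchar(4 + second);  /* writes '4', returns 52 */

    /* putchar returns the character it wrote (or EOF on error). */
    assert(first == '4' && second == '0' && third == '4');
    return third;
}
```

Calling print_404() from main writes the same three characters as the one-line original, because each call feeds its return value into the next.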

# define v putchar
this defines v as the putchar() function. It prints a character and returns it.
# define print(x)
this defines print(x) as nothing (so print(202*2) means nothing)
main(){v(4+v(v(52)-4));return 0;}/*
this could be rewritten as:
main()
{
putchar(4 + putchar(putchar(52) - 4));
return 0;
}
It is using ASCII codes to print '4' (code 52), '0' (code 52 - 4 = 48) and again '4' (code 48 + 4 = 52), so "404".
That line ends with a /* starting a comment that continues through the next two lines:
#>+++++++4+[>++++++<-]>
++++.----.++++.*/
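The line below is a bit tricky. print(202*2) expands to nothing, but exit() is not expanded, because its #define appears only on the following line and the preprocessor applies a macro only from its definition onward. What is left, ;exit();, is a stray semicolon plus an implicit-int declaration of a function named exit, which gcc accepts with warnings.
print(202*2);exit();
The line below defines exit() as empty, but nothing after it uses the macro. The /*>.#*/ inside it is just a comment as far as C is concerned.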
#define/*>.#*/exit()
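To see that a #define only affects the text that follows it, here is a minimal sketch. The names STR, STR_, GREETING, before_define and after_define are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x) STR_(x)   /* expands x first, then stringifies the result */

/* GREETING is not a macro yet, so the identifier itself is stringified. */
const char *before_define(void) { return STR(GREETING); }

#define GREETING hello

/* From this point on, GREETING expands to hello. */
const char *after_define(void) { return STR(GREETING); }

void check_macro_order(void)
{
    assert(strcmp(before_define(), "GREETING") == 0);
    assert(strcmp(after_define(), "hello") == 0);
}
```

The same rule is why exit() on the line before its #define is left alone by the preprocessor.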

The code cannot compile on a standard C compiler, such as gcc -std=c11 -pedantic-errors.
1) main must return int on hosted systems.
2) putchar() must have #include <stdio.h>.
3) You can't write semicolons outside functions.
After fixing these beginner-level bugs and removing all superfluous fluff that doesn't do anything but creating compiler errors, we are left with this:
#include <stdio.h>
#define v putchar
int main(){v(4+v(v(52)-4));return 0;}
This revolves around putchar returning the character written:
putchar(4+putchar(putchar(52)-4));
52 is ASCII for '4'. Print 4.
52 - 4 = 48, ASCII for 0. Print 0.
4 + 48 = 52. Again print 4.
And that's it. Very bleak as far as obfuscation attempts go.
Proper, standard-compliant obfuscation would rather look something like this:
#include <stdio.h>
#include <iso646.h>
??=define not_found_404(a,b,c,d,e,f,g,h,i,j)a%:%:b%:%:c%:%:d%:%:e%:%:f(\
(g%:%:h%:%:i%:%:j<::>)<%'$'+d##o%:%:e not "good",g??=??=ompl ??-- -0163l,\
((void)(0xBAD bito##b not "bad"),not "ugly")??>,(g%:%:h%:%:i%:%:j??(??)){\
((c%:%:d%:%:e)- -not "lost") <:??=a??) -??-??- '<',\
((c%:%:d%:%:e)- -not "found") <:??=b??) -??-??- 'B',\
((c%:%:d%:%:e)- -not 0xDEADC0DE) <:??=c??) -??-??- '5',\
((c%:%:d%:%:e)- -6##6##6 xo##b- -6##6##6)%>)
int main()
{
not_found_404(p,r,i,n,t,f,c,h,a,r);
}

You have defined v as putchar(), which takes the ASCII code of the character to print and returns the ASCII value of the printed character. Execution of your program starts from main, as below:
First, v(52) prints 4 and returns 52.
Second, v(52-4) prints 0 (48 is the ASCII value of '0') and returns 48.
Finally, v(48+4) prints 4, as 52 is the ASCII value of '4'.


Can I use a string as a macro to pass the value of that macro in C?

I am trying to get C to interpret a string as a macro name.
Suppose there is a macro defined as
#define ABC 900
If I define
char s[] = "ABC" ;
then,
printf("%d",s) ;
Is there any way the compiler understands that "ABC" as macro ABC and passes 900 integer value to printf ?
#include<stdio.h>
#define abc 15
int main(void) {
char a[] = "abc" ;
printf("%d",a);
return 0;
}
When i try the above code, instead of my desired output 15 , i get 6487568 which i guess the integer equivalent of that string.
Edit : those were random values , or address of strings. ( as stated below by others )
No, what you're trying to do is doubly impossible. You can't access variables by name at runtime (string -> variable) because the compiled machine code knows nothing about the names in your C code, and you can't access macros from the compiler because the compiler knows nothing about macros (they're expanded by the preprocessor before the compiler even sees the code).
In other words, compilation / execution happens in multiple stages:
1. C source code is preprocessed (which gets rid of directives like #include or #define and expands macros).
2. The preprocessed token stream is passed to the compiler, which converts it to machine code (a runnable program).
3. Finally the program runs.
Simplified example:
// original C code
#define FOO 42
...
int x = y + FOO;
After preprocessing:
...
int x = y + 42;
After compilation:
movl %ecx, %eax
addl $42, %eax
There is no trace of FOO in step 2, and the final code knows nothing about x or y.
Variable values such as strings only exist at runtime, in step 3. You can't get back to step 1 from there. If you wanted to access information about macros at runtime, you'd have to keep it explicitly in some sort of data structure, but none of this is automatic.
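The "keep it explicitly in some sort of data structure" idea could look roughly like this. It is a minimal sketch, not an automatic mechanism: the table, the ENTRY helper macro, the second macro XYZ and the lookup function are all hypothetical names, and every macro you want to reach at runtime has to be listed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ABC 900
#define XYZ 15   /* hypothetical second macro, for illustration only */

/* #name keeps the spelling, plain name is expanded to the value. */
#define ENTRY(name) { #name, name }

struct named_value {
    const char *name;
    int value;
};

static const struct named_value table[] = { ENTRY(ABC), ENTRY(XYZ) };

/* Looks up a macro by its name at runtime. Returns 1 and stores the
   value in *out on success, 0 if the name is not in the table. */
int lookup(const char *name, int *out)
{
    assert(name != NULL && out != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *out = table[i].value;
            return 1;
        }
    }
    return 0;
}
```

With this in place, lookup("ABC", &v) stores 900 in v, but only because ABC was entered in the table when the source was preprocessed.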
Macros are simple copy paste and they are pretty limited. A macro will not expand if it's quoted or commented.
One solution would be:
#define ABC "900"
char s[] = ABC;
But no, macros cannot be used for what you're trying to do.
When i try the above code, instead of my desired output 15 , i get 6487568 which i guess the integer equivalent of that string.
It's undefined behavior. Most likely it's the address of the string. If you compile with -Wall you will get a warning for this.

Argument counting in macro

I'm trying to understand the argument counting in C preprocessing macro and the idea in this answer. We have the following macro (I changed the number of arguments for simplicity):
#define HAS_ARGS(...) HAS_ARGS_(__VA_ARGS__, 1, 1, 0,)
#define HAS_ARGS_(a, b, c, N, ...) N
As far as I understand, the purpose of this macro is to check whether the given varargs are empty. So on empty varargs the macro invocation is replaced with 0, which seems fine. But with a single argument it also turns into 0, which seems strange to me.
HAS_ARGS(); //0
HAS_ARGS(123); //also 0
HAS_ARGS(1, 2); //1
I think I understand the reason. In the case of empty varargs, a is replaced with an empty preprocessing token; in the case of a single argument, a is replaced with that argument, yielding the same result.
Is there a way to get 0 when the varargs are empty and 1 when the number of arguments is anywhere from 1 up to the limit defined in the HAS_ARGS_ macro invocation, without using comma-swallowing or other non-conforming tricks? I mean:
SOME_MACRO_F() //0
SOME_MACRO_F(234) //1
SOME_MACRO_F(123, 132) //1
//etc
You cannot pass zero arguments to HAS_ARGS(...). ISO C (and C++, at least for the next two years) requires that an ellipsis corresponds to at least one additional argument after the last named one.
If there are no named ones, then the macro needs to be passed at least one argument. In the case of HAS_ARGS() the extra argument is simply an empty token sequence. Zero arguments is simply not possible.
This is exactly the use case in the answer. The target macro expects at least one argument. So we can use a wrapper accepting only an ellipsis for "overload resolution". A better name probably would have been HAS_MORE_THAN_1_ARGS. Because that's what the predicate is meant to tell you. Alas, I favored brevity on that answer.
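A small sketch that checks the "more than one argument" reading against the macro from the question. The macro definitions are copied from the question; check_has_args is a hypothetical name.

```c
#include <assert.h>

#define HAS_ARGS(...) HAS_ARGS_(__VA_ARGS__, 1, 1, 0,)
#define HAS_ARGS_(a, b, c, N, ...) N

void check_has_args(void)
{
    /* One (empty) argument: N lines up with the 0. */
    assert(HAS_ARGS() == 0);
    /* One non-empty argument: same argument count, same result. */
    assert(HAS_ARGS(123) == 0);
    /* Two or three arguments push a 1 into the N position. */
    assert(HAS_ARGS(1, 2) == 1);
    assert(HAS_ARGS(1, 2, 3) == 1);
}
```

With this reduced argument list the predicate stops working beyond three arguments, because the trailing 1s and 0 are shifted past the N slot.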
It seems difficult to compute that at compile-time, but you can do it at run-time by stringifying the arguments and testing if the string is empty.
Tested with gcc:
#include <stdio.h>
#define HAS_ARGS(...) (#__VA_ARGS__[0] != '\0')
int main()
{
printf("%d %d %d %d\n",HAS_ARGS(),HAS_ARGS(10),HAS_ARGS(20,"foo"),HAS_ARGS(10,20));
return 0;
}
this prints:
0 1 1 1
behind the scenes, here's what the pre-processor outputs:
int main()
{
printf("%d %d %d %d\n",(("")[0] != '\0'),(("10")[0] != '\0'),(("20,\"foo\"")[
0] != '\0'),(("10,20")[0] != '\0'));
return 0;
}

How to understand the behaviour of printf statement in C?

I have come across two C aptitude questions.
main()
{
int x=4,y,z;
y=--x;
z=x--;
printf("\n%d %d %d",x,y,z);
}
Output: 2 3 3 (it is printed left to right)
main()
{
int k=35;
printf("\n%d %d %d",k==35,k=50,k>40);
}
Output: 0 50 0 (it is printed right to left)
Why is it so? I have seen so many similar answers on Stack Overflow similar to this. People answer this is undefined behaviour, but if this is asked in interviews, how should one answer them?
C does not specify the order in which the arguments to a function are evaluated. It looks like the platform / compiler you are being asked about evaluates the arguments right to left, which would print the result you obtained, but the C standard gives no such guarantee. Because k is also modified in one argument and read in the others, what is shown here is undefined behavior and could easily give different results on a different compiler or platform.
Note, in your function, all the variable values are assigned before calling printf() - while in your main(), the values are being assigned to the variable in printf()'s argument list.
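A sketch of the second example restructured so that every value is fixed before printf is called. The name print_sequenced is hypothetical, and reading k == 35 before the assignment is only an assumption about what the puzzle intended.

```c
#include <assert.h>
#include <stdio.h>

/* Each step is a separate statement, so the order is fully defined.
   Returns what printf returns: the number of characters written. */
int print_sequenced(void)
{
    int k = 35;
    int was_35 = (k == 35);    /* read while k is still 35 */
    k = 50;                    /* the assignment happens here, and only here */
    int above_40 = (k > 40);   /* read after the assignment */

    assert(was_35 == 1 && above_40 == 1);
    return printf("%d %d %d\n", was_35, k, above_40);
}
```

Written this way the function prints 1 50 1 on every conforming compiler, because nothing is left to the argument evaluation order.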
Yes, you need to read the documentation of printf. Read it carefully and several times.
You should compile with all warnings and debug info, i.e. using gcc -Wall -Wextra -g with GCC. Improve your code to get no warnings. Then use the gdb debugger to understand the behavior of your program.
On the second example (where I added the missing but mandatory #include <stdio.h>) GCC 8.1 gives on Linux/x86-64/Debian:
% gcc -Wall -Wextra -g m.c -o myprog
m.c:3:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
main()
^~~~
m.c: In function ‘main’:
m.c:6:29: warning: operation on ‘k’ may be undefined [-Wsequence-point]
printf("\n%d %d %d",k==35,k=50,k>40);
~^~~
m.c:6:29: warning: operation on ‘k’ may be undefined [-Wsequence-point]
Also, as explained in John H's answer, the order of evaluation of arguments is unspecified (and the compiler gives some clue). A good way to think of it is to assume it is random and could change dynamically (though few implementations behave like that), and to write your source code in such a way that the order cannot change the intended behavior of your program.
In
printf("\n%d %d %d",k==35,k=50,k>40);
// ^
you have an assignment operator. So k is changing. But you don't know exactly when (it could happen after or before the k==35 and k>40 comparisons). So you have undefined behavior, be very scared!
Finally, stdout is often buffered (see setvbuf(3) & stdio(3) for more) and usually line buffered. So the buffer could be flushed by the \n, which you'd better place at the end of the format control string. Otherwise, ensure flushing by calling fflush(3).
From the C standard C99, section 6.5
The grouping of operators and operands is indicated by the syntax.
Except as specified later (for the function-call (), &&, ||, ?:,
and comma operators), the order of evaluation of subexpressions and
the order in which side effects take place are both unspecified.
So the C standard says nothing to the effect that function arguments are evaluated from right to left and printed from left to right.
Case 1: The statement y=--x; results in y=3 and x=3. Then z=x--; is performed, after which x=2 and z=3. Finally, when the printf statement is executed
printf("\n%d %d %d",x,y,z);
it prints 2 3 3.
Case 2: Here k=35, and when the printf statement executes
printf("\n%d %d %d",k==35, k=50, k>40);
| | | <---- R to L (on your machine the arguments seem to be evaluated right to left, but this is not guaranteed on other platforms)
50==35 50 35>40
| | | ----> L to R
0 50 0
Also, the main() prototype you used is incorrect. It should be int main(void) { /*... */ } as specified in the C standard:
The function called at program startup is named main. The
implementation declares no prototype for this function. It shall be
defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though any
names may be used, as they are local to the function in which they are
declared):
int main(int argc, char *argv[]) { /* ... */ }
Let's trace through the code.
Example 1:
main()
This is no longer correct (since 1999). main needs a return type: int main().
{
int x=4,y,z;
Here x = 4 whereas y and z are indeterminate.
y=--x;
--x decrements x (from 4 to 3) and returns the new value of x (3), which is then assigned to y. At the end we have x = 3, y = 3, and z still indeterminate.
z=x--;
x-- decrements x (from 3 to 2) and returns the old value of x (3), which is then assigned to z. At the end we have x = 2, y = 3, z = 3.
printf("\n%d %d %d",x,y,z);
Here we're calling printf, but the function is not declared. The code is missing #include <stdio.h>; without it, the behavior is undefined (because it's calling an undeclared varargs function). But let's assume <stdio.h> was included. Then:
This outputs x, y, z as 2 3 3. Note that the format string should be "%d %d %d\n"; in the C model, lines are terminated by '\n', so you should always have a '\n' at the end.
}
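The trace of Example 1 can be written down as checks. This is a sketch with a hypothetical function name, trace_increments, and it uses assert instead of printing.

```c
#include <assert.h>

/* Replays Example 1 and checks each intermediate state.
   Returns the final value of x. */
int trace_increments(void)
{
    int x = 4, y, z;

    y = --x;                   /* pre-decrement: x becomes 3, y gets 3 */
    assert(x == 3 && y == 3);

    z = x--;                   /* post-decrement: z gets 3, x becomes 2 */
    assert(x == 2 && y == 3 && z == 3);

    return x;
}
```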
Example 2:
main()
Same issue, should be int main() and #include <stdio.h> is missing.
{
int k=35;
Now k = 35.
printf("\n%d %d %d",k==35,k=50,k>40);
This is just broken. Function arguments can be evaluated in any order. In particular, the assignment to k (k = 50) and the comparisons (k == 35, k > 40) are not sequenced relative to each other, which means this piece of code has undefined behavior. You're not allowed to modify a variable while at the same time reading from it.
}
People answer this is undefined behaviour, but if this is asked in interviews, how should one answer them?
Tell them "this is undefined behavior". That's the correct answer. The example above is not required to produce any output. It could print 1 2 3, but it also could print hello!, or go into an infinite loop, or crash, or delete all of your files.
As far as the C standard is concerned, the code is simply meaningless.
(What happens on any particular platform is highly dependent on your compiler, the exact version of the compiler, any optimization options used, etc.)
I have seen so many similar answers on Stack Overflow similar to this.
People answer this is undefined behaviour,
And that's quite often the only correct answer.
but if this is asked in interviews, how should one answer them?
If your interviewer truly knows C, but chooses to ask this sort of question, it can be thought of as a trick question. They might seem to be expecting an answer like "1 50 1", but really they do expect the correct answer, which is, "It's undefined."
So if I were asked this question, I would give the interviewer a look suggesting "I can't believe you're asking me this", but then say, confidently, "It's undefined."
If the interviewer doesn't realize it's undefined, you have somewhat of a problem, but that's a human psychology and interview strategy question, not a C programming question, so I think I'll avoid delving into it further.

'system' was not declared in this scope error

I don't get this error in other programs, but I do get it in this one.
This program is an example where I don't get the error.
#include<stdio.h>
int main() {
system("pause");
} // end main
But in the program below I do get the error.
#include <stdio.h>
//#include <stdlib.h>
// Takes the number from function1, calculates the result and returns recursively.
int topla (int n) {
if(n == 1)
return 3;
else
return topla(n-1) + topla(n-1) + topla(n-1);
}
// Takes a number from main and calls function topla to find out what is 3 to the
// power of n
int function1(int n) {
return topla(n);
}
int main() {
int n; // We use this to calculate 3 to the power of n
printf("Enter a number n to find what 3 to the power n is: ");
scanf("%d", &n);
function1(n);
system("pause");
} // end main
Just include stdlib.h. But don't use system("pause"): it's not standard and will not work on every system. A simple getchar() (or a loop involving getchar(), since you've used scanf()) should do the trick.
Normally system("pause") is found in Windows command-line programs because the Windows command prompt window closes when the program exits, so running the program from the command prompt directly may help, as may using an IDE that handles this, like Geany.
Finally, always check the return value of scanf() instead of assuming that it worked.
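A sketch of both suggestions, checking the scanf-family return value and consuming the rest of the line before waiting. The helper names read_int and discard_line are hypothetical, and they take a FILE * (pass stdin in the program above) only so they are easy to try out on other streams.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one int. Returns 1 on success, 0 on bad input or end of file. */
int read_int(FILE *in, int *out)
{
    assert(in != NULL && out != NULL);
    return fscanf(in, "%d", out) == 1;
}

/* Discards the rest of the current line, including the newline that
   "%d" leaves unread. Returns how many characters preceded the newline. */
int discard_line(FILE *in)
{
    int c, dropped = 0;

    assert(in != NULL);
    while ((c = getc(in)) != '\n' && c != EOF)
        dropped++;
    return dropped;
}
```

After read_int(stdin, &n) succeeds, calling discard_line(stdin) followed by a single getchar() makes the program wait for Enter without relying on system("pause").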
Note: This code
return topla(n - 1) + topla(n - 1) + topla(n - 1)
you can write as
return 3 * topla(n - 1);
instead of calling topla() recursively 3 times.
And you don't really need the else, because the function has already returned when n == 1; only the n != 1 case ever reaches the statement after the if.
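Putting both notes together, topla could be sketched as below. The assert is an addition of this sketch, not part of the original code: it documents that the recursion only terminates for n >= 1.

```c
#include <assert.h>

/* Computes 3 to the power n for n >= 1, one recursive call per level. */
int topla(int n)
{
    assert(n >= 1);   /* a smaller n would never reach the base case */
    if (n == 1)
        return 3;
    return 3 * topla(n - 1);
}
```

The original makes three recursive calls per level, so its work grows as 3 to the power n, while this version makes n calls in total.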
The system function is declared in the standard header <stdlib.h>. If your program calls system(), you must have
#include <stdlib.h>
at or near the top of your source file.
But part of your question is: why didn't the compiler complain when you omitted the #include directive?
The 1990 C standard (sometimes called "ANSI C") permits calls to functions that have not been explicitly declared. If you write, for example:
system("pause");
with no visible declaration for the system function, it would be assumed that system is declared with a return type of int and parameters matching the arguments in the call -- in this case, a single argument of type char*. That happens to be consistent with the actual declaration of system, so with a C90 compiler, you can get away with omitting the #include directive. And some C compilers that support the more current 1999 and 2011 standards (which don't permit implicit declarations) still permit the old form, perhaps with a warning, for the sake of backward compatibility.
Even given a C90 compiler, there is no advantage to depending on the now obsolete implicit function declaration rule (often lumped together with "implicit int"). Just add the #include <stdlib.h>. More generally, for any library function you call, read its documentation and #include the header that declares it.
As for why you got an error with one of your programs and not another, I don't have an explanation for that. Perhaps you invoked your compiler with different settings. In any case, it doesn't really matter -- though you might look into how to configure your compiler so it always warns about things like this, so you can avoid this kind of error.
Here you need to know about two things.
Firstly, your code works absolutely fine and the program really finds the value of 3^n. So do not worry about that.
Coming to the system() part,
In order to use the system(); function, you need to include the stdlib.h header file, as the function is declared in that header.
So it is a good practice to include the header (rather than commenting it).
Now, the pause command is used on Windows to stop the console from closing after the program completes, and it exists only on Windows.
Note that system("pause"); is also not standard and does not work on other systems such as Linux, because with system() you are talking directly to the command line. Such commands are specific to each operating system and cannot be used on another OS.
So it is better to use getchar();, a C standard library function, to hold the console window open.

C function defined as int but having no return statement in the body still compiles

Say you have a C code like this:
#include <stdio.h>
int main(){
printf("Hello, world!\n");
printf("%d\n", f());
}
int f(){
}
It compiles fine with gcc, and the output (on my system) is:
Hello, world!
14
But.. but.. how is that possible? I thought that C won't let you compile something like that because f() doesn't have a return statement returning an integer. Why is that allowed? Is it a C feature or compiler omission, and where did 14 come from?
The return value in this case, depending on the exact platform, will likely be whatever random value happened to be left in the return register (e.g. EAX on x86) at the assembly level. Not explicitly returning a value is allowed, but gives an undefined value.
In this case, the 14 is the return value from printf.
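The claim that 14 comes from printf is easy to check, since printf returns the number of characters it wrote. A small sketch with a hypothetical helper, print_and_count:

```c
#include <assert.h>
#include <stdio.h>

/* Prints text and returns the character count reported by printf. */
int print_and_count(const char *text)
{
    int written = printf("%s", text);

    assert(written >= 0);   /* printf returns a negative value on error */
    return written;
}
```

For "Hello, world!\n" the count is 13 visible characters plus the newline, which matches the 14 seen in the question.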
compile with -Wall to enable more sanity checking in the compiler.
gcc -Wall /tmp/a.c
/tmp/a.c: In function ‘main’:
/tmp/a.c:5: warning: implicit declaration of function ‘f’
/tmp/a.c:6: warning: control reaches end of non-void function
/tmp/a.c: In function ‘f’:
/tmp/a.c:10: warning: control reaches end of non-void function
Note how it flags up the missing return statements - as "control reaches end of non-void function"?
Always compile using -Wall or similar - you will save yourself heartache later on.
I can recall several occasions when this exact issue has caused hours or days of debugging to fix - it sorta works until it doesn't one day.
14 is exactly the return value of the first printf, also. Probably, the compiler optimized out the call to f() and then we're left with 14 on EAX.
Try compiling your code with different optimization levels (-O0, -O1, -O2, -O3, -Os) and see if the output changes.
The number printed is the return value of the first printf call (the number of characters it wrote).
This value is stored in stack memory.
In second printf function the stack value is 'returned' by function f. But f just left the value that printf created in that slot on the stack.
for example, in this code:
#include <stdio.h>
int main(){
printf("Hello, world!1234\n");
printf("%d\n", f());
}
int f(){
}
Which compiles fine with gcc, and the output (on my system) is:
Hello, world!1234
18
for details of printf() returning value mail me.
You probably have the warning level of your compiler set very low. So it is allowed, even if the result is undefined.
The default return value from a
function is int. In other words,
unless explicitly specified the
default return value by compiler would
be integer value from function.
So omitting the return statement is allowed, but the returned value is undefined if you try to use it.
I compile with -Werror=return-type to prevent this. GCC will give an error if you don't return from a function (unless it's void return).
