C code after preprocessor - c

This is an exercise taken from a book. The question asks what this code outputs.
The code always prints "N is undefined", but I don't know why. The "#undef N" directive comes after the call to f. So why is the output always "N is undefined"?
#include <stdio.h>
#define N 100
void f(void);
int main(void)
{
f();
#ifdef N
#undef N
#endif
return 0;
}
void f(void)
{
#if defined(N)
printf("N is %d\n", N);
#else
printf("N is undefined\n");
#endif
}

The point of this exercise is to demonstrate that the preprocessor's control flow is completely separate from the control flow of your program.
#if/#undef directives are processed in the order in which they appear in the text of your program. They are processed only once, at compile time; the decision to define or undefine a preprocessor macro cannot be reconsidered at runtime.
That's why the fact that f executes before the #ifdef/#undef lines in main is irrelevant. Without editing the directives, you can change the output of this program only by moving the definition of f to a position in the file before the #undef, for example above main.

If you run the compiler with the -E flag (for gcc at least) it'll show you what the code you're actually compiling is.
You'll see that the preprocessor doesn't follow the code execution - it performs its actions in the order that they appear in the file.
Then the compiler takes the resulting code and f just has the one call to printf in it that says N isn't defined.

The C preprocessor goes through your code line by line. As such, it is wrong to assume the #undef happens after the function f() because of the function call. Instead, it happens before your definition of function f().
To understand this, you have to distinguish between the preprocessor (line by line) and the control flow (follows function calls).

Because preprocessor directives are processed in "physical" order, line after line.
Think of it as something that runs before the actual compilation, so that the compiler receives only plain C code.

Related

What is the problem with this preprocessor in the code below?

//C code starts
#define mod(a) (a>=0?a:-a)
#include<stdio.h>
int main(){
int x,y,z;
scanf("%d%d%d",&x,&y,&z);
printf("%d %d %d %d %d\n",x,y,z,y-z,x-z);
if(mod(y-x)<mod(x-z)) printf("%d %d Cat A",mod(y-z),mod(x-z));
else if(mod(y-z)>mod(x-z)) printf("%d %d Cat B",mod(y-z),mod(x-z));
else printf("Mouse C");
printf("\n");
}
/*code ends here*/
For the input "1 3 2" I would expect the output to be "Mouse C", but that is not what I get.
Also, if I wrap each mod argument in an extra pair of parentheses (e.g. mod(y-z) written as mod((y-z))), the output is as expected.
So what is going on?
Macros perform direct text (or more accurately, token) substitution. So this:
mod(y-x)
Is exactly the same as this:
(y-x>=0?y-x:-y-x)
Notice that the last part is -y-x, i.e. the negation of y, minus x, while what you wanted was -(y-x). This is a prime example of why macro arguments should always be wrapped in parentheses, as follows:
#define mod(a) ((a)>=0?(a):-(a))
When you want to see what your macro actually does, look at your code after the preprocessor has actually done its job:
gcc -E file.c
clang -E file.c or clang --preprocess file.c
cl.exe /E file.c, cl.exe /P file.c, or cl.exe /EP file.c for those on Windows
It's in general a bad idea to use macros for these things. Use a function instead.
The first thing to note is that the preprocessor just replaces text. mod(y-x) will be expanded to (y-x>=0?y-x:-y-x), which is obviously wrong. You can remedy this with
#define mod(a) ((a)>=0?(a):-(a))
But here's another catch. Suppose you do this:
mod(f())
where f is a function with side effects, for instance the rand() function. The expansion would then contain three calls to f, two of which are executed each time the expression is evaluated.
You can solve this problem in gcc (thus making it non-portable) with this construct:
#define mod(a) ({ int _a=(a); _a>=0?_a:-_a; })
But isn't it easier to just do this?
long long mod(long long a) {
return a>=0?a:-a;
}

Why GCC won't give me stackerror on long arguments?

My goal is to show someone that every argument you send to a C function is pushed onto the stack. The same happens in Ruby, Python, Lua, JavaScript, etc.
So I have written Ruby code that generates C code:
#!/usr/bin/env ruby
str = Array.new(10, &:itself)
a = <<~EOF
#include <stdio.h>
void x(#{str.map { |x| "int n#{x}" }.join(?,)}) {
printf("%d\\n", n#{str.length - 1}) ;
}
int main() { x(#{str.join(?,)}) ; }
EOF
IO.write('p.c', a)
After running this code with the Ruby interpreter, I get a file called p.c, which has this content:
#include <stdio.h>
void x(int n0,int n1,int n2,int n3,int n4,int n5,int n6,int n7,int n8,int n9) {
printf("%d\n", n9) ;
}
int main() { x(0,1,2,3,4,5,6,7,8,9) ; }
This is fine: it compiles and executes without problems.
But if I give the Ruby program an array size of 100,000, it should generate a C file whose function takes the arguments n0 to n99999. That means 100,000 arguments.
A quick Google search tells me that C's arguments are stored on the stack.
Passing these arguments should give me a stack error, but it doesn't. GCC compiles it just fine, and I get the output 99999.
But with Clang, I get:
p.c:4:17: error: use of undeclared identifier 'n99999'
printf("%d\n", n99999) ;
^
p.c:8:195690: error: too many arguments to function call, expected 34464, have 100000
p.c:3:6: note: 'x' declared here
2 errors generated.
How does GCC deal with that many arguments? In most cases, I get a stack error in other programming languages when the stack size is 10900.
The best way to prove this to your friend is to write an infinite recursive function:
#include <stdio.h>
void recurse(int x) {
static int iterations=0;
printf("Iteration: %d\n", ++iterations);
recurse(x);
}
int main() {
recurse(1);
}
This will overflow the stack, assuming there is a stack (not all architectures use stacks) and that the compiler does not optimize the tail call into a loop. It will tell you how many stack frames you get to before the stack overflow happens; this will give you an idea of the depth of the stack.
As for why gcc compiles, gcc does not know the target stack size so it cannot check for a stack overflow. It's theoretically possible to have a stack large enough to accommodate 100,000 arguments. That's less than half a megabyte. Not sure why clang behaves differently; it would depend on seeing the generated C code.
If you can share what computer system/architecture you are using, it would be helpful. You cited information that applies to 64-bit Intel systems (e.g. PC/Windows).

Self-replicating code, how to implement different behavior in first iteration vs following ones?

So I'm having a tough time with a school project. The goal is to write self-replicating code, named Sully.c. That program must output its own source code (it's a quine) into a file named Sully_x.c, where x is an integer in the source code, then compile that file and execute it iff x > 0. x must decrement from one copy to the next, but not from the original Sully.c to Sully_5.c.
Here is my code so far:
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int k = 5;
#define F1 int main(void){int fd = open("Sully_5.c", 0);if(fd != -1){close(fd);k-=1;}char buff[62];(sprintf)(buff, "Sully_%d.c", k);FILE *f = fopen(buff, "w");fprintf(f, "#include <fcntl.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <unistd.h>\nint k = %d;\n#define F1 %s\n#define F2(x) #x\n#define F3(x) F2(x)\nconst char *s = F3(F1);\nF1\n", k, s);fclose(f);(sprintf)(buff, "gcc -Wall -Wextra -Werror Sully_%d.c -o Sully_%d", k, k);system(buff);if (k != 0){(sprintf)(buff, "./Sully_%d", k);system(buff);}return 0;}
#define F2(x) #x
#define F3(x) F2(x)
const char *s = F3(F1);
F1
That code works and meets all the requirements for the program. However, I'm using a method that checks something other than the code itself: I'm checking whether Sully_5.c already exists. If it doesn't, x stays the same; if it does, x is decremented.
Another method would have been to use argv[0] or the macro __FILE__, but both these options are explicitly forbidden for the assignment and considered cheating.
But apparently there are other methods that don't require any of the above techniques. I can't think of any, because if Sully.c and Sully_5.c need different behaviors but share the same source code, then there must be an external variable that influences the code's behavior, or so is my hypothesis.
Am I right? Wrong? How else could this be done?
... there must be an external variable that needs to influence the code behavior
How else could this be done?
You can define (or leave undefined) preprocessor macros on the compiler command line (e.g. -Daze or -Daze=12) to generate different code through conditional compilation without changing the source.
The program can also use the argument(s) given to it at run time to change its behavior.

Will the compiler allocate any memory for code disabled by macro in C language?

For example:
int main()
{
fun();//calling a fun
}
void fun(void)
{
#if 0
int a = 4;
int b = 5;
#endif
}
What is the size of the fun() function? And how much memory in total will be allocated for the main() function?
Compilation of a C source file is done in multiple phases. The phase where the preprocessor runs is done before the phase where the code is compiled.
The "compiler" will not even see code that the preprocessor has removed; from its point of view, the function is simply
void fun(void)
{
}
Whether the function will "create memory" depends on the compiler and its optimization settings. For a debug build the function will probably still exist and be called. For an optimized release build the compiler might not call or even keep (generate boilerplate code for) the function.
Compilation is split into 4 stages:
1. Preprocessing
2. Compilation
3. Assembly
4. Linking
The compiler processes preprocessor directives before starting the actual compilation, and conditional inclusion is performed in this stage, along with the other directives.
The #if is a conditional inclusion directive.
From C11 draft 6.10.1-3:
Preprocessing directives of the forms
# if constant-expression new-line group_opt
# elif constant-expression new-line group_opt
check whether the controlling constant expression evaluates to nonzero.
In your code, the controlling expression of #if 0 is zero, so the condition is false and the code within the conditional block is excluded.
The output of the preprocessing stage can be written to stdout with the -E option:
gcc -E filename.c
The output of the command above is:
# 943 "/usr/include/stdio.h" 3 4
# 2 "filename.c" 2
void fun(void)
{
}
int main()
{
fun();
return 0;
}
As we can see, the statements inside the #if block are removed during the preprocessing stage.
This directive can be used to exclude a block of code from compilation.
Now, to see whether the compiler allocates any memory for an empty function:
filename.c:
void fun(void)
{
}
int main()
{
fun();
return 0;
}
The size command gives,
$ size a.out
   text    data     bss     dec     hex filename
   1171     552       8    1731     6c3 a.out
and for the code,
filename.c:
void fun(void)
{
#if 0
int a = 4;
int b = 5;
#endif
}
int main()
{
fun();
return 0;
}
The output of size command for the above code is,
$ size a.out
   text    data     bss     dec     hex filename
   1171     552       8    1731     6c3 a.out
As seen, the sizes are the same in both cases, from which we can conclude that the compiler does not allocate memory for a block of code disabled by the preprocessor.
According to the GCC reference:
The simplest sort of conditional is
#ifdef MACRO
controlled text
#endif /* MACRO */
This block is called a conditional group. controlled text will be
included in the output of the preprocessor if and only if MACRO is
defined. We say that the conditional succeeds if MACRO is defined,
fails if it is not.
The controlled text inside of a conditional can include preprocessing
directives. They are executed only if the conditional succeeds. You
can nest conditional groups inside other conditional groups, but they
must be completely nested. In other words, ‘#endif’ always matches the
nearest ‘#ifdef’ (or ‘#ifndef’, or ‘#if’). Also, you cannot start a
conditional group in one file and end it in another.
Even if a conditional fails, the controlled text inside it is still
run through initial transformations and tokenization. Therefore, it
must all be lexically valid C. Normally the only way this matters is
that all comments and string literals inside a failing conditional
group must still be properly ended.
The comment following the ‘#endif’ is not required, but it is a good
practice if there is a lot of controlled text, because it helps people
match the ‘#endif’ to the corresponding ‘#ifdef’. Older programs
sometimes put MACRO directly after the ‘#endif’ without enclosing it
in a comment. This is invalid code according to the C standard. CPP
accepts it with a warning. It never affects which ‘#ifndef’ the
‘#endif’ matches.
Sometimes you wish to use some code if a macro is not defined. You can
do this by writing ‘#ifndef’ instead of ‘#ifdef’. One common use of
‘#ifndef’ is to include code only the first time a header file is
included.

"if " and " #if "; which one is better to use [duplicate]

I learned that if or #if can both be used for condition checks.
As we can check conditions using if, why would we use preprocessor #if?
What difference will it make to my code if I use #if instead of if?
Which one is better to use and why?
if and #if are different things with different purposes.
If you use the if statement, the condition is evaluated at runtime, and the code for both branches exists within the compiled program. The condition can be based on runtime information, such as the state of a variable. if is for standard flow control in a program.
If you use the preprocessor's #if, the condition is evaluated at compile-time (originally this was before compile-time, but these days the preprocessor is usually part of the compiler), and the code for the false branch is not included in the compiled program. The condition can only be based on compile-time information (such as #define constants and the like). #if is for having different code for different compile-time environments (for instance, different code for compiling on Windows vs. *nix, that sort of thing).
We cannot say which is better to use, because one is handled in the compilation phase (#if) and the other at run time (if).
#if 1
printf("this code will be built\n");
#else
printf("this code will not\n");
#endif
Try to build the above code with gcc -E and you will see that the preprocessor generates new code containing only:
printf("this code will be built\n");
The other printf will not be present in the preprocessed code, and therefore not in the program binary.
Conclusion: #if is handled in the compilation phase, but a normal if is handled when your program runs.
You can use #if 0 around a part of your code in order to keep the compiler from compiling it. It is as if you had commented that part out.
Example:
int main(void) {
printf("this code will be built\n");
#if 0
printf("this code will not\n");
#endif
}
it's equivalent to
int main(void) {
printf("this code will be built\n");
/*
printf("this code will not\n");
*/
}
The two are different:
#if tests whether the condition is true at compile time.
if is evaluated at runtime.
You should use #if when the outcome of the condition is known at compile time, and a regular if when the outcome is not known until runtime.
#if DEBUG
I know at compile time I am making a debug build
if (date == DateTime.Today)
Depends on what day it is
Some uses of #if are:
You want to add extra prints or checks when you build a debug version of your code.
You want to ensure the compiler doesn't include a .h file twice.
You want to write code that uses different system calls, choosing the appropriate ones for the system it is compiled on.
Because all of the above are checked at compile time, this means that:
The condition must be evaluable at compile time.
The produced code will not contain the branches that evaluate to false, leading to smaller and faster code, as the condition is not checked every time the program is run.
Examples:
Adding extra checks only for debug mode:
#define DEBUGLEVEL 2
#if DEBUGLEVEL > 1
printf("The value of x is: %d", x);
#endif
#if DEBUGLEVEL > 2
printf("The address of x is: %p", (void *)&x);
ASSERT(x > 100);
#endif
Ensuring header only gets included once:
#ifndef PERSON_H
#define PERSON_H
class Person{
....
};
#endif
Having different code depending on platform:
#ifdef WINDOWS
time = QueryPerformanceCounter(..);
#else
time = gettimeofday(..);
#endif

Resources