I'm curious as to why I see nearly all C macros formatted like this:
#ifndef FOO
# define FOO
#endif
Or this:
#ifndef FOO
#define FOO
#endif
But never this:
#ifndef FOO
#define FOO
#endif
(moreover, vim's = operator only seems to count the first two as correct.)
Is this due to portability issues among compilers, or is it just a standard practice?
I've seen it done all three ways, it seems to be a matter of style, not of syntax
While usually the second example is the most common, i've seen cases where the first (or third) is used to help distinguish multiple levels of #ifdefs. Sometimes the logic can become deeply nested and the only way to understand it at a glance is to use indentation much like it is common practice to indent blocks of code between { and }.
IIRC, older C preprocessors required the # to be the first character on the line (though I've never actually encountered one that had this requirement).
I never seen your code like your first example. I usually wrote preprocessor directives as in your second example. I found that it visually interfered with the indentation of the actual code less (not that I write in C anymore).
The GNU C Preprocessor manual says:
Preprocessing directives are lines in
your program that start with '#'.
Whitespace is allowed before and after
the '#'.
For preference I use the third style, with the exception of include guards, for which I use the second style.
I don't like the first style at all - I think of #define as being a preprocessor instruction, even though really of course it isn't, it's a # followed by the preprocessor instruction define. But since I do think of it that way, it seems wrong to separate them. I expect text editors written by people who advocate that style will have a block indent/un-indent that works on code written in that style. But I would hate to encounter it using a text editor that didn't.
There's no point pandering to ancient preprocessors where the # has to be the first character of the line, unless you can also list off the top of your head all the other differences between those implementations and standard C, in order to avoid the other things you could possibly do that they would not support. Of course if you genuinely are working with a pre-standard compiler, fair enough.
Preprocessor directives are lines included in our programs that are not actually program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#).Whitespace is allowed before and after the '#'. As soon as a newline character is found, the preprocessor directive is considered to end.
There is no other rule as far the standard of C/C++ concerned,So it remains as the matter of style and readability issue,I have seen/wrote programs only in the second way that you posted,although the third one seems more readable.
Related
Is it mandatory to write #include at the top of the program and outside the main function?
I tried using #define preprocessor inside the main function and it worked fine with only one exception..that being the constant which i defined using the define directive can be used only after the line #define
For instance say printf("%d",PI); #define PI 3.14will give error "Undefined symbol PI". But in the following code i did not encounter any error
#define PI 3.14
printf("%d",PI);
Is this because C is a procedural language and procedural languages implements top down approach?
Also i would like to know that can we use only #define inside the main function or other preprocessor directives too? If we can use then which ones?
Or is it the other way around, instead of #include we can use all the preprocessor directives in the main function?
The only place you can't put a preprocessor directive is in a macro expansion. The sole exception is #pragma, which can also be written _Pragma().
This has nothing to do with "procedural", but due to the fact that C is defined in terms of 8 translation phases, each of which is "as-if" fully-completed before the next phase. For more details, see the C11 standard, section 5.1.1.2.
One example of when it is useful to use preprocessor directives after the start of a file is for the "X Macro" technique (which many people only know as "those .def files").
Preprocessor directives work pretty much anywhere. Of course, you can make your code confusing pretty easily if you abuse this.
The pre-processor does its work before the compiler performs the source code translation into object code. Pre-processing is mostly a string replacement task, so it can be placed just about anywhere in your code. Of course, if the resulting expansion is syntactically incorrect, the expanded source code will fail to compile.
A commonly tolerated practice is to embed conditional compilation directives inside a function to allow the function to use platform specific APIs.
void some_wrapper_function () {
#if defined(UNIX)
some_unix_specific_function();
#elif defined(WIN32)
some_win32_specific_function();
#else
#error "Compiled on an unsupported platform"
#endif
}
By their nature, the directives themselves normally have to be defined at the beginning of the line, and not somewhere in the middle of source line. But, defined macros can of course appear anywhere in the source, and will be replaced according to the substitution rules defined by your directives.
The trick here is to realize that # directives have traditionally been interpreted by a pre-processor, that runs before any compilation. The pre-processor would produce a new source file, which was then compiled. I don't think any modern compiler works that way by default, but the same principles apply.
So when you say
#include "foo.h"
you're saying "insert the entire contents of foo.h into my source code starting at this line."
You can use this directive pretty much anywhere in a source file, but it's rarely useful (and not often readable) to use it anywhere other than at the start of the source.
I'm currently using the __COUNTER__ macro in my C library code to generate unique integer identifiers. It works nicely, but I see two issues:
It's not part of any C or C++ standard.
Independent code that also uses __COUNTER__ might get confused.
I thus wish to implement an equivalent to __COUNTER__ myself.
Alternatives that I'm aware of, but do not want to use:
__LINE__ (because multiple macros per line wouldn't get unique ids)
BOOST_PP_COUNTER (because I don't want a boost dependency)
BOOST_PP_COUNTER proves that this can be done, even though other answers claim it is impossible.
In essence, I'm looking for a header file "mycounter.h", such that
#include "mycounter.h"
__MYCOUNTER__
__MYCOUNTER__ __MYCOUNTER__
__MYCOUNTER__
will be preprocessed by gcc -E to
(...)
0
1 2
3
without using the built-in __COUNTER__.
Note: Earlier, this question was marked as a duplicate of this, which deals with using __COUNTER__ rather than avoiding it.
You can't implement __COUNTER__ directly. The preprocessor is purely functional - no state changes. A hidden counter is inherently impossible in such a system. (BOOST_PP_COUNTER does not prove what you want can be done - it relies on #include and is therefore one-per-line only - may as well use __LINE__. That said, the implementation is brilliant, you should read it anyway.)
What you can do is refactor your metaprogram so that the counter could be applied to the input data by a pure function. e.g. using good ol' Order:
#include <order/interpreter.h>
#define ORDER_PP_DEF_8map_count \
ORDER_PP_FN(8fn(8L, 8rec_mc(8L, 8nil, 0)))
#define ORDER_PP_DEF_8rec_mc \
ORDER_PP_FN(8fn(8L, 8R, 8C, \
8if(8is_nil(8L), \
8R, \
8let((8H, 8seq_head(8L)) \
(8T, 8seq_tail(8L)) \
(8D, 8plus(8C, 1)), \
8if(8is_seq(8H), \
8rec_mc(8T, 8seq_append(8R, 8seq_take(1, 8L)), 8C), \
8rec_mc(8T, 8seq_append(8R, 8seq(8C)), 8D) )))))
ORDER_PP (
8map_count(8seq( 8seq(8(A)), 8true, 8seq(8(C)), 8true, 8true )) //((A))(0)((C))(1)(2)
)
(recurses down the list, leaving sublist elements where they are and replacing non-list elements - represented by 8false - with an incrementing counter variable)
I assume you don't actually want to simply drop __COUNTER__ values at the program toplevel, so if you can place the code into which you need to weave __COUNTER__ values inside a wrapper macro that splits it into some kind of sequence or list, you can then feed the list to a pure function similar to the example.
Of course a metaprogramming library capable of expressing such code is going to be significantly less portable and maintainable than __COUNTER__ anyway. __COUNTER__ is supported by Intel, GCC, Clang and MSVC. (not everyone, e.g. pcc doesn't have it, but does anyone even use that?) Arguably if you demonstrate the feature in use in real code, it makes a stronger case to the standardisation committee that __COUNTER__ should become part of the next C standard.
You are confusing two different things:
1 - the preprocessor which handles#define and #include like stuff. It does only works as the text (meaning character sequences) level and has very few computing capabilities. It is so limited that it cannot implement __COUNTER__. The preprocessor work consist only in macro expansion and file replacement. The crucial point it that it occur before the compilation even start.
2 - the C++ language and in particular the template (meta)programming language which can be used to compute stuff during the compilation phase. It is indeed turing complete but as I already said compilation start after preprocessing.
So what you are asking is not doable in standard C or C++. To solve this problem boost implement its own preprocessor which is not standard compliant and has much more computing capabilities. In particular it is possible to use build an analogue to __counter__ with it.
This small header of mine contains an own implementation of a C preprocessor counter (it uses a slightly different syntax).
Let's say I have a macro in an inclusion file:
// a.h
#define VALUE SUBSTITUTE
And another file that includes it:
// b.h
#define SUBSTITUTE 3
#include "a.h"
Is it the case that VALUE is now defined to SUBSTITUTE and will be macro expanded in two passes to 3, or is it the case that VALUE has been set to the macro expanded value of SUBSTITUTE (i.e. 3)?
I ask this question in the interest of trying to understand the Boost preprocessor library and how its BOOST_PP_SLOT defines work (edit: and I mean the underlying workings). Therefore, while I am asking the above question, I'd also be interested if anyone could explain that.
(and I guess I'd also like to know where the heck to find the 'painted blue' rules are written...)
VALUE is defined as SUBSTITUTE. The definition of VALUE is not aware at any point that SUBSTITUTE has also been defined. After VALUE is replaced, whatever it was replaced by will be scanned again, and potentially more replacements applied then. All defines exist in their own conceptual space, completely unaware of each other; they only interact with one another at the site of expansion in the main program text (defines are directives, and thus not part of the program proper).
The rules for the preprocessor are specified alongside the rules for C proper in the language standard. The standard documents themselves cost money, but you can usually download the "final draft" for free; the latest (C11) can be found here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
For at-home use the draft is pretty much equivalent to the real thing. Most people who quote the standard are actually looking at copies of the draft. (Certainly it's closer to the actual standard than any real-world C compiler is...)
There's a more accessible description of the macro rules in the GCC manual: http://gcc.gnu.org/onlinedocs/cpp/Self_002dReferential-Macros.html
Additionally... I couldn't tell you much about the Boost preprocessor library, not having used it, but there's a beautiful pair of libraries by the same authors called Order and Chaos that are very "clean" (as macro code goes) and easy to understand. They're more academic in tone and intended to be pure rather than portable; which might make them easier reading.
(Since I don't know Boost PP I don't know how relevant this is to your question but) there's also a good introductory example of the kids of techniques these libraries use for advanced metaprogramming constructs in this answer: Is the C99 preprocessor Turing complete?
I want to run simple analysis on C files (such as if you call foo macro with INT_TYPE as argument, then cast the response to int*), I do not want to prerprocess the file, I just want to parse it (so that, for instance, I'll have correct line numbers).
Ie, I want to get from
#include <a.h>
#define FOO(f)
int f() {FOO(1);}
an list of tokens like
<include_directive value="a.h"/>
<macro name="FOO"><param name="f"/><result/></macro>
<function name="f">
<return>int</return>
<body>
<macro_call name="FOO"><param>1</param></macro_call>
</body>
</function>
with no need to set include path, etc.
Is there any preexisting parser that does it? All parsers I know assume C is preprocessed. I want to have access to the macros and actual include instructions.
Our C Front End can parse code containing preprocesser elements can do this to fair extent and still build a usable AST. (Yes, the parse tree has precise file/line/column number information).
There are a number of restrictions, which allows it to handle most code. In those few cases it cannot handle, often a small, easy change to the source file giving equivalent code solves the problem.
Here's a rough set of rules and restrictions:
#includes and #defines can occur wherever a declaration or statement can occur, but not in the middle of a statement. These rarely cause a problem.
macro calls can occur where function calls occur in expressions, or can appear without semicolon in place of statements. Macro calls that span non-well-formed chunks are not handled well (anybody surprised?). The latter occur occasionally but not rarely and need manual revision. OP's example of "j(v,oid)*" is problematic, but this is really rare in code.
#if ... #endif must be wrapped around major language concepts (nonterminals) (constant, expression, statement, declaration, function) or sequences of such entities, or around certain non-well-formed but commonly occurring idioms, such as if (exp) {. Each arm of the conditional must contain the same kind of syntactic construct as the other arms. #if wrapped around random text used as bad kind of comment is problematic, but easily fixed in the source by making a real comment. Where these conditions are not met, you need to modify the original source code, often by moving the #if #elsif #else #end a few tokens.
In our experience, one can revise a code base of 50,000 lines in a few hours to get around these issues. While that seems annoying (and it is), the alternative is to not be able to parse the source code at all, which is far worse than annoying.
You also want more than just a parser. See Life After Parsing, to know what happens after you succeed in getting a parse tree. We've done some additional work in building symbol tables in which the declarations are recorded with the preprocessor context in which they are embedded, enabling type checking to include the preprocessor conditions.
You can have a look at this ANTLR grammar. You will have to add rules for preprocessor tokens, though.
Your specific example can be handled by writing your own parsing and ignore macro expansion.
Because FOO(1) itself can be interpreted as a function call.
When more cases are considered however, the parser is much more difficult. You can refer PDF Link to find more information.
If I write
#include <stdio.h>;
there no error but a warning comes out during compilation
pari.c:1:18: warning: extra tokens at end of #include directive
What is the reason ?
The reason is that preprocessor directives don't use semicolons. This is because they use a line break to delimit statements. This means that you cannot have multiple directives per line:
#define ABC #define DEF // illegal
But you can have one on multiple lines by ending each line (except the last) with a \ (or /, I forget).
Because Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor.
These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. That's why no semicolon (;) is expected at the end of a preprocessor directive.
Preprocessor directives are a different language than C, and have a much simpler grammar, because originally they were "parsed", if you can call it that, by a different program called cpp before the C compiler saw the file. People could use that to pre-process even non-C files to include conditional parts of config files and the like.
There is a Linux program called "unifdef" that you can still use to remove some of the conditional parts of a program if you know they'll never be true. For instance, if you have some code to support non-ANSI standard compilers surrounded by #ifdef ANSI/#else/#end or just #ifndef ANSI/#end, and you know you'll never have to support non-ANSI any more, you can eliminate the dead code by running it through unifdef -DANSI.
Because they're unnecessary. Preprocessor directives only exist on one line, unless you explicitly use a line-continuation character (for e.g. a big macro).
During compilation, your code is processed by two separate programs, the pre-processor and the compiler. The pre-processor runs first.
Your code is actually comprised of two languages, one overlaid on top of another. The pre-processor deals with one language, which is all directives starting with "#" (and the implications of these directives). It processes the "#include", "#define" and other directives, and leaves the rest of the code untouched (well, except as side effect of the pre-processor directives, like macro substitutions etc.).
Then the compiler comes along and processes the output generated by the pre-processor. It deals with "C" language, and pretty much ignores the pre-processor directives.
The answer to your question is that "#include" is a part of the language processed by the pre-processor, and in this language ";" are not required, and are, in fact, "extra tokens".
and if you use #define MACRO(para) fun(para); it could be WRONG to put an semikolon behind it.
if (cond)
MACRO (par1);
else
MACRO (par2);
leads to an syntactical error