CPP : using different sign instead of hash '#' for directives? - c-preprocessor

Is there a way to use a different character in place of the hash sign # for specifying directives?
Perhaps one that can be specified via a command-line parameter when running the preprocessor?

The C preprocessor unambiguously uses # to indicate directives. No standard implementation allows this to be changed.
Even a non-standard preprocessor must be at least somewhat aware of C lexical syntax, in order to avoid expanding macros inside comments and strings. It also must correctly handle the # and ## preprocessor operators. So modifying the sigil character is likely to be an intrusive change. There are open source preprocessor libraries available if you want to give it a try.

Related

How to process macros in LEX?

How do I implement #define in yacc/bison?
For Example:
#define f(x) x*x
If anywhere f(x) appears in any function then it is replaced by the right side of the
macro substituting for the argument ‘x’.
For example, f(3) would be replaced with 3*3. The macro can call another macro too.
It's not usually possible to do macro expansion inside a parser, at least not C-style macros, because C-style macro expansion doesn't respect syntax. For example
#define IF if(
#define THEN )
is legal (although very bad style IMHO). But for that to be handled inside the grammar, it would be necessary to allow a macro identifier to appear anywhere in the input, not just where an identifier might be expected. The necessary modifications to the grammar are going to make it much less readable and are very likely to introduce parser action conflicts. [Note 1]
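As a small illustration of why such macros cannot be recognized by a grammar, here is a sketch in which the condition of an if statement is assembled from two macro fragments. The function name larger_of is an assumption made up for the example; neither macro is a complete syntactic unit on its own.

```c
#define IF if(
#define THEN )

/* Returns the larger of a and b. After preprocessing, the first
   statement reads: if( a > b ) return a; */
int larger_of(int a, int b)
{
    IF a > b THEN
        return a;
    return b;
}
```

The preprocessor accepts this because it works on tokens only; a parser would have to accept IF and THEN at arbitrary positions to do the same.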
Alternatively, you could do the macro expansion in the lexical analyzer. The lexical analyzer is not a parser, but parsing a C-style macro invocation doesn't require much sophistication, and if macro parameters were not allowed, it would be even simpler. This is how Flex handles macro replacement in its regular expressions. ({identifier}, for example.) [Note 2] Since Flex macros are just raw character sequences, not token lists as with C-style macros, they can be handled by pushing the replacement text back into the input stream. (F)lex provides the unput special action for this purpose. unput pushes one character back into the input stream, so if you want to push an entire macro replacement, you have to unput it one character at a time, back to front so that the last character unput is the first one to be read afterwards.
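The back-to-front order can be sketched in plain C without Flex. The pushback stack and the functions push_back_char, push_back_text and next_char below are hypothetical stand-ins for the scanner's input buffer and for unput; they are not part of the (f)lex API.

```c
#include <stddef.h>
#include <string.h>

#define PUSHBACK_CAPACITY 256

/* Hypothetical stand-in for the scanner's input: a stack of
   characters that have been pushed back. */
static char pushback[PUSHBACK_CAPACITY];
static size_t pushback_count = 0;

/* Push one character back, in the way unput pushes a single character.
   Returns 0 on success, -1 if the buffer is full. */
int push_back_char(char c)
{
    if (pushback_count >= PUSHBACK_CAPACITY)
        return -1;
    pushback[pushback_count++] = c;
    return 0;
}

/* Push a whole replacement text, last character first, so that the
   first character of the text is the next one to be read. */
int push_back_text(const char *text)
{
    size_t i = strlen(text);
    while (i > 0) {
        if (push_back_char(text[--i]) != 0)
            return -1;
    }
    return 0;
}

/* Read the next pushed-back character, or -1 when none remain. */
int next_char(void)
{
    if (pushback_count == 0)
        return -1;
    return (unsigned char)pushback[--pushback_count];
}
```

After push_back_text("x*x"), successive calls to next_char return the characters in their original order, which is the behaviour a macro replacement needs.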
That's workable but ugly. And it's not really scalable to even the small feature list provided by the C preprocessor. And it violates the fundamental principle of software design, which is that each component does just one thing (so that it can do it well).
So that leaves the most common approach, which is to add a separate macro processor component, so that instead of dividing the parse into lexical scan/syntax analysis, the parse becomes lexical scan/macro expansion/syntax analysis. [Note 3]
A C-style macro processor which works between the lexical analyser and the syntactic analyser could itself be written in Bison. As I mentioned above, the parsing requirements are generally minimal, but there is still parsing to be done and Bison is presumably already part of the project. Although I don't know of any macro processor (other than proof-of-concept programs I've written myself) which does this, I think it's a very flexible solution. In particular, the Bison syntactic analysis phase could be implemented with a push-parser, which avoids the need to produce the entire macro-expanded token stream in order to make it available to a traditional pull-parser.
That's not the only way to design macros, though. Indeed, it has a lot of shortcomings, because the macro expansions are not hygienic, respecting neither syntax nor scope. Probably anyone who has used C macros has at one time or other been bitten by these problems; the simplest manifestation is defining a macro like:
#define NEXT(a) a + 1
and then writing
int x = NEXT(a) * 3;
which is not going to produce the expected result (unless what is expected is a violation of the syntactic form of the last statement). Also, any macro expansion which needs to use a local variable will sooner or later produce an incorrect expansion because of unexpected name collision. Hygienic macro expansion seeks to solve these issues by viewing macro expansion as an operation on syntax trees, not token streams, making the parsing paradigm lexical scan/syntax analysis/macro expansion (of the parse tree). For that operation, the appropriate tool might well be some kind of tree parser.
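The NEXT example can be checked with a short sketch. NEXT_PAREN and the two function names are assumptions added for comparison. Parenthesizing the macro body is the usual C workaround for the precedence problem only; it does nothing about name collisions.

```c
#define NEXT(a) a + 1
/* Hypothetical parenthesized variant, for comparison. */
#define NEXT_PAREN(a) ((a) + 1)

/* Expands to: return a + 1 * 3;  so the multiplication binds first. */
int next_times_three_unparenthesized(int a)
{
    return NEXT(a) * 3;
}

/* Expands to: return ((a) + 1) * 3; */
int next_times_three_parenthesized(int a)
{
    return NEXT_PAREN(a) * 3;
}
```

With a equal to 2, the first function computes 2 + 3 and the second computes 3 * 3.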
Notes
Note 1: Also, you'd want to remove the token from the parse tree. Yacc/bison does have a poorly-documented feature, YYBACKUP, which might possibly help accomplish this. I don't know if that's one of its intended use cases; indeed, it is not clear to me what its intended use cases are.
Note 2: The (f)lex documentation calls these definitions, but they really are macros, and they suffer from all the usual problems macros bring with them, such as mysterious interactions with surrounding syntax.
Note 3: Another possibility is macro expansion/lexical scan/syntax analysis, which could be implemented using a macro processor like M4. But that completely divorces the macros from the rest of the language.
yacc and lex generate C source at the end, so you can use macros inside the parser and lexer actions.
The actual #define preprocessor directives can go in the first section of the lexer and parser file:
%{
// Somewhere here
#define f(x) x*x
%}
These sections will be copied verbatim to the generated C source.

Computed Includes in C

I was reading the C Preprocessor guide page on gnu.org on computed includes which has the following explanation:
2.6 Computed Includes
Sometimes it is necessary to select one of several different header
files to be included into your program. They might specify
configuration parameters to be used on different sorts of operating
systems, for instance. You could do this with a series of
conditionals,
#if SYSTEM_1
# include "system_1.h"
#elif SYSTEM_2
# include "system_2.h"
#elif SYSTEM_3 …
#endif
That rapidly becomes tedious. Instead, the preprocessor offers the
ability to use a macro for the header name. This is called a computed
include. Instead of writing a header name as the direct argument of
‘#include’, you simply put a macro name there instead:
#define SYSTEM_H "system_1.h"
…
#include SYSTEM_H
This doesn't make sense to me. The first code snippet allows for optionality based on which system type you encounter by using branching if elifs. The second seems to have no optionality as a macro is used to define a particular system type and then the macro is placed into the include statement without any code that would imply its definition can be changed. Yet, the text implies these are equivalent and that the second is a shorthand for the first. Can anyone explain how the optionality of the first code snippet exists in the second? I also don't know what code is implied to be contained in the "..." in the second code snippet.
There's some other places in the code or build system that define or don't define the macros that are being tested in the conditionals. What's suggested is that instead of those places defining lots of different SYSTEM_1, SYSTEM_2, etc. macros, they'll just define SYSTEM_H to the value that's desired.
Most likely this won't actually be in an explicit #define; instead it will be in a compiler option, e.g.
gcc -DSYSTEM_H='"system_1.h"' ...
And this will most likely actually come from a setting in a makefile or other configuration file.
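A minimal sketch of how the source file itself might look, assuming the build normally supplies SYSTEM_H on the command line. The fallback to the standard header <limits.h> and the function system_int_max are assumptions added only so the sketch compiles on its own; a real project would name its own system header there.

```c
/* Normally defined by the build, e.g. -DSYSTEM_H='"system_1.h"'.
   The fallback below is a hypothetical default for this sketch. */
#ifndef SYSTEM_H
#define SYSTEM_H <limits.h>
#endif

/* Computed include: the macro is expanded to obtain the header name. */
#include SYSTEM_H

/* Uses something the selected header provides. */
int system_int_max(void)
{
    return INT_MAX;
}
```

The choice between headers is made wherever SYSTEM_H gets defined, so the source file needs no #if chain.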

How can I get the function name as text not string in a macro?

I am trying to use a function-like macro to generate an object-like macro name (generically, a symbol). The following will not work because __func__ (C99 6.4.2.2-1) puts quotes around the function name.
#define MAKE_AN_IDENTIFIER(x) __func__##__##x
The desired result of calling MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED) would be MyFunctionName__NULL_POINTER_PASSED. There may be other reasons this would not work (such as __func__ being taken literally and not interpreted, but I could fix that) but my question is what will provide a predefined macro like __func__ except without the quotes? I believe this is not possible within the C99 standard so valid answers could be references to other preprocessors.
Presently I have simply created my own object-like macro and redefined it manually before each function to be the function name. Obviously this is a poor and probably unacceptable practice. I am aware that I could take an existing cpp program or library and modify it to provide this functionality. I am hoping there is either a commonly used cpp replacement which provides this or a preprocessor library (prefer Python) which is designed for extensibility so as to allow me to 'configure' it to create the macro I need.
I wrote the above to try to provide a concise and well defined question but it is certainly the Y referred to by @Ruud. The X is...
I am trying to manage unique values for reporting errors in an embedded system. The values will be passed as a parameter to a(some) particular function(s). I have already written a Python program using pycparser to parse my code and identify all symbols being passed to the function(s) of interest. It generates a .h file of #defines maintaining the values of previously existing entries, commenting out removed entries (to avoid reusing the value and also allow for reintroduction with the same value), assigning new unique numbers for new identifiers, reporting malformed identifiers, and also reporting multiple use of any given identifier. This means that I can simply write:
void MyFunc(int * p)
{
if (p == NULL)
{
myErrorFunc(MYFUNC_NULL_POINTER_PASSED);
return;
}
// do something actually interesting here
}
and the Python program will create the #define MYFUNC_NULL_POINTER_PASSED 7 (or whatever next available number) for me with all the listed considerations. I have also written a set of macros that further simplify the above to:
#define FUNC MYFUNC
void MyFunc(int * p)
{
RETURN_ASSERT_NOT_NULL(p);
// do something actually interesting here
}
assuming I provide the #define FUNC. I want to use the function name since that will be constant throughout many changes (as opposed to __LINE__) and will be much easier for someone to transfer the value from the old generated #define to the new generated #define when the function itself is renamed. Honestly, I think the only reason I am trying to 'solve' this 'issue' is because I have to work in C rather than C++. At work we are writing fairly object oriented C and so there is a lot of NULL pointer checking and IsInitialized checking. I have two line functions that turn into 30 because of all these basic checks (these macros reduce those lines by a factor of five). While I do enjoy the challenge of crazy macro development, I much prefer to avoid them. That said, I dislike repeating myself and hiding the functional code in a pile of error checking even more than I dislike crazy macros.
If you prefer to take a stab at this issue, have at.
__FUNCTION__ used to compile to a string literal (I think in gcc 2.96), but it hasn't for many years. Now instead we have __func__, which compiles to a string array, and __FUNCTION__ is a deprecated alias for it. (The change was a bit painful.)
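A short sketch of what __func__ gives you in C99 and later. The two function names are assumptions chosen for the example. Because __func__ behaves like a static const char array rather than a string literal, it is not available to the preprocessor, so it cannot be pasted with ## or joined to an adjacent string literal.

```c
#include <stddef.h>

/* __func__ has static storage duration, so returning a pointer to it
   is safe. */
const char *current_function_name(void)
{
    return __func__;
}

/* sizeof sees an array: the name's length plus the terminating null. */
size_t current_function_name_size(void)
{
    return sizeof(__func__);
}
```

The name only exists once the compiler proper knows which function it is in, which is after macro expansion has finished.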
But in neither case was it possible to use this predefined macro to generate a valid C identifier (i.e. "remove the quotes").
But could you instead use the line number rather than function name as part of your identifier?
If so, the following would work. As an example, compiling the following 5-line source file:
#define CONCAT_TOKENS4(a,b,c,d) a##b##c##d
#define EXPAND_THEN_CONCAT4(a,b,c,d) CONCAT_TOKENS4(a,b,c,d)
#define MAKE_AN_IDENTIFIER(x) EXPAND_THEN_CONCAT4(line_,__LINE__,__,x)
static int MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED);
will generate the warning:
foo.c:5: warning: 'line_5__NULL_POINTER_PASSED' defined but not used
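The extra EXPAND_THEN_CONCAT4 level matters because arguments that sit next to ## are pasted without being macro-expanded first. The sketch below compares the two cases by turning the resulting identifier into a string. PASTE_DIRECTLY, STRINGIZE, EXPAND_THEN_STRINGIZE and the function names are assumptions added for the comparison.

```c
#include <string.h>

#define CONCAT_TOKENS4(a,b,c,d) a##b##c##d
#define EXPAND_THEN_CONCAT4(a,b,c,d) CONCAT_TOKENS4(a,b,c,d)
#define MAKE_AN_IDENTIFIER(x) EXPAND_THEN_CONCAT4(line_,__LINE__,__,x)

/* Hypothetical one-level variant, for comparison only. */
#define PASTE_DIRECTLY(x) CONCAT_TOKENS4(line_,__LINE__,__,x)

/* Two-level stringizing, so the argument is expanded before # applies. */
#define STRINGIZE(x) #x
#define EXPAND_THEN_STRINGIZE(x) STRINGIZE(x)

/* Spelling from the two-level macro: "line_<N>__NULL_POINTER_PASSED",
   where <N> is the number of the source line below. */
const char *two_level_name(void)
{
    return EXPAND_THEN_STRINGIZE(MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED));
}

/* Spelling without the extra level: __LINE__ is pasted unexpanded. */
const char *one_level_name(void)
{
    return EXPAND_THEN_STRINGIZE(PASTE_DIRECTLY(NULL_POINTER_PASSED));
}

/* Returns 1 if the one-level spelling still contains "__LINE__". */
int one_level_keeps_line_macro_text(void)
{
    return strstr(one_level_name(), "__LINE__") != NULL;
}
```

The one-level variant yields the fixed identifier line___LINE____NULL_POINTER_PASSED, which is the same on every line and so cannot be used to make names unique.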
As pointed out by others, there is no macro that returns the (unquoted) function name (mainly because the C preprocessor has insufficient syntactic knowledge to recognize functions). You would have to explicitly define such a macro yourself, as you already did:
#define FUNC MYFUNC
To avoid having to do this manually, you could write your own preprocessor to add the macro definition automatically. A similar question is this: How to automatically insert pragmas in your program
If your source code has a consistent coding style (particularly indentation), then a simple line-based filter (sed, awk, perl) might do. In its most naive form: every function starts with a line that does not start with a hash or whitespace, and ends with a closing parenthesis or a comma. With awk:
{
print $0;
}
/^[^# \t].*[,\)][ \t]*$/ {
sub(/\(.*$/, "");
sub(/^.*[ \t]/, "");
print "#define FUNC " toupper($0);
}
For a more robust solution, you need a compiler framework like ROSE.
GNU C has a __FUNCTION__ macro, but sadly even that cannot be used in the way you are asking.

Where are macro variables created, and what is their size?

I have some doubts about macros. When we create one like the following
#define DATA 40
where is DATA created? I also need to know its size and its type.
In Java we create a macro together with a data type.
And what about function-like macros: are they all inline functions?
Macros are essentially text substitutions.
DATA does not exist beyond the pre-processing stage. The compiler never sees it. Since no variable is created, we can't talk about its data type, size or address.
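A small sketch of what this means in practice. The two function names are assumptions made up for the example. Any size or type you observe belongs to the replacement text 40, which is an int constant, and not to DATA itself.

```c
#include <stddef.h>

#define DATA 40

/* After preprocessing this is sizeof(40): the size of an int constant. */
size_t size_of_data_expansion(void)
{
    return sizeof(DATA);
}

/* The same replacement text can initialize objects of different types;
   the type comes from the declaration, not from the macro. */
long data_as_long(void)
{
    long value = DATA;
    return value;
}
```

Writing &DATA would not compile, because after substitution it reads &40 and a constant has no address.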
Macros are literally pasted into the code. They are not "parsed", but expanded. The compiler does not see DATA, but 40. This is why you must be careful because macros are not like normal functions or variables. See gcc's documentation.
A macro is a fragment of code which has been given a name. Whenever
the name is used, it is replaced by the contents of the macro. There
are two kinds of macros. They differ mostly in what they look like
when they are used. Object-like macros resemble data objects when
used, function-like macros resemble function calls.
You may define any valid identifier as a macro, even if it is a C
keyword. The preprocessor does not know anything about keywords. This
can be useful if you wish to hide a keyword such as const from an
older compiler that does not understand it. However, the preprocessor
operator defined (see Defined) can never be defined as a macro, and
C++'s named operators (see C++ Named Operators) cannot be macros when
you are compiling C++.
Macros are not present in your final executable. They are present in your source code only. Macros are processed during the pre-processing stage of compilation.
Preprocessor directives like #define are replaced with the corresponding text during the preprocessing phase of compilation, and are (almost) never represented in the final executable.

Why are there no semicolons after preprocessor directives?

If I write
#include <stdio.h>;
there is no error, but a warning comes out during compilation
pari.c:1:18: warning: extra tokens at end of #include directive
What is the reason ?
The reason is that preprocessor directives don't use semicolons. This is because they use a line break to delimit statements. This means that you cannot have multiple directives per line:
#define ABC #define DEF // illegal
But you can spread one directive over multiple lines by ending each line (except the last) with a backslash (\).
Because Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor.
These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. That's why no semicolon (;) is expected at the end of a preprocessor directive.
Preprocessor directives are a different language than C, and have a much simpler grammar, because originally they were "parsed", if you can call it that, by a different program called cpp before the C compiler saw the file. People could use that to pre-process even non-C files to include conditional parts of config files and the like.
There is a Linux program called "unifdef" that you can still use to remove some of the conditional parts of a program if you know they'll never be true. For instance, if you have some code to support non-ANSI standard compilers surrounded by #ifdef ANSI/#else/#endif or just #ifndef ANSI/#endif, and you know you'll never have to support non-ANSI any more, you can eliminate the dead code by running it through unifdef -DANSI.
Because they're unnecessary. Preprocessor directives only exist on one line, unless you explicitly use a line-continuation character (for e.g. a big macro).
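As a sketch of line continuation, here is one #define spread over three physical lines. The macro CLAMP and the function clamp_to_percent are assumptions made up for the example. The backslash must be the very last character on each continued line.

```c
/* One directive on three physical lines; every line but the last
   ends with a backslash, and no semicolon terminates the directive. */
#define CLAMP(value, low, high)                 \
    ((value) < (low) ? (low)                    \
     : (value) > (high) ? (high) : (value))

/* Limits value to the range 0..100. */
int clamp_to_percent(int value)
{
    return CLAMP(value, 0, 100);
}
```

The preprocessor joins the continued lines before it looks for the end of the directive, so the newline still acts as the terminator.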
During compilation, your code is processed by two separate programs, the pre-processor and the compiler. The pre-processor runs first.
Your code is actually comprised of two languages, one overlaid on top of another. The pre-processor deals with one language, which is all directives starting with "#" (and the implications of these directives). It processes the "#include", "#define" and other directives, and leaves the rest of the code untouched (well, except as side effect of the pre-processor directives, like macro substitutions etc.).
Then the compiler comes along and processes the output generated by the pre-processor. It deals with "C" language, and pretty much ignores the pre-processor directives.
The answer to your question is that "#include" is a part of the language processed by the pre-processor, and in this language ";" are not required, and are, in fact, "extra tokens".
Also, if you write #define MACRO(para) fun(para); with a semicolon at the end of the definition, that semicolon becomes part of every expansion, which can be outright wrong:
if (cond)
MACRO (par1);
else
MACRO (par2);
leads to a syntax error, because the expansion fun(par1);; ends the if statement before the else is reached.
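A sketch of the version that does compile, assuming the semicolon is left out of the macro definition so that the caller supplies it. The body of fun, the variable last_call and the function choose are assumptions added to make the example checkable.

```c
/* Hypothetical helper that records its argument. */
static int last_call = 0;

static void fun(int para)
{
    last_call = para;
}

/* No trailing semicolon here: the caller's own ';' ends the statement. */
#define MACRO(para) fun(para)

/* Calls fun with par1 or par2 depending on cond and reports which
   value was passed. */
int choose(int cond, int par1, int par2)
{
    if (cond)
        MACRO(par1);
    else
        MACRO(par2);
    return last_call;
}
```

Each branch now expands to a single statement, so the else still pairs with the if.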
