What does #define without assignment assign to in C?

I have often seen code like
#ifndef HEADERFILE_H
#define HEADERFILE_H
// some declarations in
// the header file.
#endif
I want to know what #define HEADERFILE_H defines HEADERFILE_H to.
I tried doing
cout<<HEADERFILE_H<<endl;
but I am getting
error: expected expression

A define preprocessing directive has the form # define identifier preprocessing-tokens, ending with a new-line character. The preprocessing-tokens is a list of zero or more preprocessing tokens. It may be empty, that is, it may have zero tokens. This means, that when the identifier is encountered in a place where macro replacement occurs, it will be replaced with nothing.1
Tests of the form #ifdef identifier, #ifndef identifier, or defined identifier in a #if or #elif directive test whether identifier is defined or not. If it was not defined (or its definition was removed with the #undef directive), then the test indicates it is not defined. If it was defined, then the test indicates it was defined, even if the definition is for zero tokens.
A definition with zero tokens is different from no definition at all, and defined identifier will evaluate as true for the former and false for the latter.
Footnote
1 If the list does have tokens, then identifier will be replaced with those tokens and # and ## operators among them will be applied. A preprocessing token is largely an identifier (like foo34), a constant (like 3, 4u, or 1.8e4), one of the C operators or special characters (like * or +=), or certain other components of the C language.
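A minimal sketch of the difference between an empty definition and no definition at all. The names EMPTY_MARKER and NEVER_DEFINED and the helper functions are made up for illustration; they are not from the question.

```c
#include <assert.h>

#define EMPTY_MARKER            /* defined, with zero replacement tokens */
/* NEVER_DEFINED is deliberately not defined anywhere. */

/* Returns 1: an empty definition still counts as "defined". */
int empty_marker_is_defined(void)
{
#if defined EMPTY_MARKER
    return 1;
#else
    return 0;
#endif
}

/* Returns 0: there is no definition at all. */
int never_defined_is_defined(void)
{
#ifdef NEVER_DEFINED
    return 1;
#else
    return 0;
#endif
}

/* EMPTY_MARKER is replaced with nothing, so this is just "return 40 + 2;". */
int value_around_empty_marker(void)
{
    return 40 EMPTY_MARKER + 2;
}

/* Compile-time check of the same replacement. */
static_assert(40 EMPTY_MARKER + 2 == 42, "EMPTY_MARKER must expand to nothing");
```

The same replacement explains the error in the question: with HEADERFILE_H replaced by nothing, cout<<HEADERFILE_H<<endl; becomes cout<<<<endl;, and the compiler finds no expression between the two << operators.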

It defines the macro as existing, with no replacement text. That is: you may define a macro without assigning it a specific value. Since you can check whether a given macro is defined or not, you can use the mere "existence" of a macro as a flag.
This is useful to indicate a context (for example, if you're compiling for a given OS) and/or the availability of some resources.
In this particular example, this is called a "guard": it defines itself if this hasn't been done before, and includes the rest of the file, which is entirely embedded in the #ifndef … #endif clause.
This is used to implement a kind of require_once, that is something that will be included if needed, but not multiple times. This is required when you are defining functions or declaring variables at a global scope, for instance.
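As a rough illustration of using macro existence to indicate a context, the sketch below picks a label based on which platform macro the compiler predefines. _WIN32, __APPLE__ and __linux__ are commonly predefined by compilers for those platforms; the function names and the returned strings are made up here.

```c
#include <assert.h>

/* Returns a label chosen by which platform macro exists.
   Only existence is tested; the macros' values are never used. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}

/* Whichever branch was compiled in, a non-empty label comes back. */
void check_platform_name(void)
{
    assert(platform_name()[0] != '\0');
}
```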

This is a language idiom (I will comment it):
#ifndef HEADERFILE_H
Between this, and the last #endif everything is included in compilation, but only if HEADERFILE_H has not been defined before.
#define HEADERFILE_H
The first thing we do in the block is to #define the identifier, so the next time this fragment is encountered, the contents between #ifndef and #endif will not be included again (because the identifier is now defined).
// some declarations in
// the header file.
This block will be processed only once per translation unit, even if you #include this file several times.
#endif
And this marks the end of the protected block.
It is common to include a file that itself #includes another, which in turn includes another, until you no longer know which files have been included and which have not. This idiom protects you and lets you #include the same file several times. Normally you cannot, because some definitions (for example a struct or enum definition) must not be repeated in the same translation unit. The first inclusion brings in the contents and defines the identifier; every later inclusion is still performed, but its contents are skipped because the identifier is already #defined.
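A small sketch of what the preprocessor sees when a guarded header is included twice. To keep it in a single file, the header text is simply written out twice; POINT_H, struct point and make_and_sum are hypothetical names chosen for the example.

```c
#include <assert.h>

/* First "inclusion": POINT_H is not defined yet, so the block is kept. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif

/* Second "inclusion" of the same text: POINT_H is now defined, so the
   preprocessor skips the block and struct point is not defined twice. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif

/* The type was defined exactly once and is complete. */
static_assert(sizeof(struct point) >= 2 * sizeof(int),
              "struct point is a complete type");

int make_and_sum(int x, int y)
{
    struct point p = { x, y };
    return p.x + p.y;
}
```

Removing the #ifndef/#define/#endif lines from both copies should make the compiler reject the second struct point as a redefinition.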

Related

What does the name _headerfile_h mean

I have been reading Zed Shaw's "Learn C The Hard Way". In the 20th chapter, the author creates a header file e.g. headerfile.h and includes the line
#ifndef _headerfile_h. I understand the #ifndef directive but not _headerfile_h. Please explain this _headerfile_h or mention any resource to look for it.
#ifndef _headerfile_h
#define _headerfile_h
…other material…
#endif /* _headerfile_h */
It's simply a unique name that will only be used by that header, to prevent problems if the header is included twice.
Note that you should not, in general, create function, variable, tag or macro names that start with an underscore. Part of C11 §7.1.3 Reserved identifiers says:
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
See also:
What does double underscore (__const) mean in C?
How do I use extern to share variables between source files?
Should I use #include in headers?
Why doesn't the compiler generate a header guard automatically?
What is a good reference documenting patterns of use of ".h" files in C?
When to use include guards in C
and probably some others too. A number of those questions have further links to other resources — SO questions and external links.
The directive #ifndef checks if the "argument" is defined as a macro or not. If it's not defined (the n in ifndef stands for "not") then the next block up to the matching #endif is passed on by the preprocessor.
If the macro is defined, then the block is skipped and not passed on to the compiler by the preprocessor.
So what #ifndef _headerfile_h does is check if the symbol _headerfile_h is defined as a macro or not.
From the macro name it seems like this is part of a header include guard.

Check if a constructed constant is #define'd

I am trying to build a test that checks if a certain file defines a header guard with a certain namespace. Because the test is generic, this namespace is only known at compile-time and passed in as -DTHENAMESPACE=BLA. We then use some magic from https://stackoverflow.com/a/1489985/1711232 to paste that together.
This means I want to do something like:
#define PASTER(x, y) x##_##y
#define EVALUATOR(x, y) PASTER(x, y)
#define NAMESPACE(fun) EVALUATOR(THENAMESPACE, fun)
#ifndef NAMESPACE(API_H) // evaluates to BLA_API_H
# error "namespace not properly defined"
#endif
But this does not work properly, with cpp complaining about the ifndef not expecting the parentheses.
How can I do this properly, if it is possible at all?
I have also tried adding more layers of indirection, but not with a lot of success.
So doing this directly with #ifdef appears not to be possible:
Considering the defined operator:
If the defined operator appears as a result of a macro expansion, the C standard says the behavior is undefined. GNU cpp treats it as a genuine defined operator and evaluates it normally. It will warn wherever your code uses this feature if you use the command-line option -Wpedantic, since other compilers may handle it differently. The warning is also enabled by -Wextra, and can also be enabled individually with -Wexpansion-to-defined.
https://gcc.gnu.org/onlinedocs/cpp/Defined.html#Defined
and ifdef expects a MACRO, and does not do further expansion.
https://gcc.gnu.org/onlinedocs/cpp/Ifdef.html#Ifdef
But maybe it is possible to trigger an 'undefined constant' warning (-Wundef), which would also allow my test pipeline to catch this problem.
If we assume that include guards always look like
#define NAME /* no more tokens here */
and if, as you said, any compile time error (rather than #error exclusively) is acceptable, then you can do following:
#define THENAMESPACE BLA
#define BLA_API_H // Comment out to get an error.
#define CAT(x,y) CAT_(x,y)
#define CAT_(x,y) x##y
#define NAMESPACE(x) static int CAT(UNUSED_,__LINE__) = CAT(CAT(THENAMESPACE,CAT(_,x)),+1);
NAMESPACE(API_H)
Here, NAMESPACE(API_H) tries to concatenate BLA_API_H and + using ##.
This results in error: pasting "BLA_API_H" and "+" does not give a valid preprocessing token except if BLA_API_H is #defined to 'no tokens'.
In presence of #define BLA_API_H, NAMESPACE(API_H) simply becomes
static int UNUSED_/*line number here*/ = +1;
If you settle for a less robust solution, you can even get nice error messages:
#define THENAMESPACE BLA
#define BLA_API_H // Comment out to get an error.
#define TRUTHY_VALUE_X 1
#define CAT(x,y) CAT_(x,y)
#define CAT_(x,y) x##y
#define NAMESPACE(x) CAT(CAT(TRUTHY_VALUE_,CAT(THENAMESPACE,CAT(_,x))),X)
#if !NAMESPACE(API_H)
#error "namespace not properly defined"
#endif
Here, if BLA_API_H is defined (to no tokens), then NAMESPACE(API_H) expands to TRUTHY_VALUE_X, which is 1, so the directive becomes #if !1 and the #error is skipped.
If BLA_API_H is not defined, then the directive becomes #if !TRUTHY_VALUE_BLA_API_HX. TRUTHY_VALUE_BLA_API_HX evaluates to 0 because it is undefined, so the condition is true and the #error fires.
The problem here is that if TRUTHY_VALUE_BLA_API_HX accidentally turns out to be defined to something truthy, you'll get a false negative.
I don't think that macro expansion inside #ifndef directive is possible in the realm of standard C.
#ifndef symbol is equivalent to #if !defined symbol. The standard says this about it (§6.10.1 Conditional inclusion):
... it may contain unary operator expressions of the form
defined identifier
or
defined ( identifier )
and
Prior to evaluation, macro invocations in the list of preprocessing tokens that will become the controlling constant expression are replaced (except for those macro names modified by the defined unary operator), just as in normal text. If the token defined is generated as a result of this replacement process or use of the defined unary operator does not match one of the two specified forms prior to macro replacement, the behavior is undefined. ...
So basically identifiers in a defined expression are not expanded, and your current NAMESPACE(API_H) is not a valid form of identifier.
Possible workaround could be to simply use:
#if NAMESPACE(API_H) == 0
# error "namespace not properly defined"
#endif
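This works because identifiers that are not defined as macros are replaced with 0. The problem with this approach is that there will be a false error if BLA_API_H is defined as 0, and an empty definition (#define BLA_API_H) leaves #if == 0, which is a syntax error, so the guard has to be defined to a nonzero value. Depending on your situation that may be acceptable.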
You can also do this using preprocessor pattern matching.
#define PASTE3(A,B,C) PASTE3_I(A,B,C)
#define PASTE3_I(A,B,C) A##B##C
#define PASTE(A,B) PASTE_I(A,B)
#define PASTE_I(A,B) A##B
#define THIRD(...) THIRD_I(__VA_ARGS__,,,)
#define THIRD_I(A,B,C,...) C
#define EMPTINESS_DETECTOR ,
#if THIRD(PASTE3(EMPTINESS_,PASTE(THENAMESPACE,_API_H),DETECTOR),0,1)
# error "namespace not properly defined"
#endif
This is the same idea as @HolyBlackCat's answer except that it's done in the preprocessor; the inner paste in the expression in the #if directive generates a token based on THENAMESPACE, pasting it to your required _API_H. If that token is itself defined as a macro with an empty replacement list, it will expand to a placemarker during the PASTE3 operation; this effectively pastes EMPTINESS_ [placemarker] DETECTOR, which is a macro expanding to a comma. That comma will shift the arguments of the indirect THIRD, placing 0 there, making the conditional equivalent to #if 0. Anything else won't shift the arguments, which results in THIRD selecting 1, which triggers the #error.
This also makes the same assumption HolyBlackCat's answer makes... that inclusion guards always look like #define BLA_API_H, but you can accommodate specific alternate styles using expanded pattern matching... for example, if you want to accept inclusion guards like #define BLAH_API_H 1 (who does that?), you could add #define EMPTINESS_1DETECTOR ,.
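The pattern-matching trick above can be wrapped into a reusable test. The sketch below is a variation, not the answer's exact code: it swaps the 0 and 1 so that the result reads as "is defined empty", and IS_DEFINED_EMPTY, GUARD_PRESENT and GUARD_ABSENT are hypothetical names.

```c
#include <assert.h>

#define PASTE3(A,B,C) PASTE3_I(A,B,C)
#define PASTE3_I(A,B,C) A##B##C
#define THIRD(...) THIRD_I(__VA_ARGS__,,,)
#define THIRD_I(A,B,C,...) C
#define EMPTINESS_DETECTOR ,

/* Expands to 1 if name is defined with an empty replacement list,
   and to 0 if name is not a macro at all. */
#define IS_DEFINED_EMPTY(name) THIRD(PASTE3(EMPTINESS_, name, DETECTOR), 1, 0)

#define GUARD_PRESENT           /* looks like a plain include guard */
/* GUARD_ABSENT is deliberately left undefined. */

/* The result is an ordinary integer constant, usable at compile time. */
static_assert(IS_DEFINED_EMPTY(GUARD_PRESENT) == 1, "empty guard detected");
static_assert(IS_DEFINED_EMPTY(GUARD_ABSENT) == 0, "missing guard detected");

int guard_present(void) { return IS_DEFINED_EMPTY(GUARD_PRESENT); }
int guard_absent(void)  { return IS_DEFINED_EMPTY(GUARD_ABSENT); }
```

As in the answer, a guard defined to anything other than nothing (for example 1) is reported as 0 unless a matching EMPTINESS_...DETECTOR macro is added for that style.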
The language defines no way other than use of the defined operator or an equivalent to test whether identifiers are defined as macro names. In particular, a preprocessor directive of the form
#ifndef identifier
is equivalent to
#if ! defined identifier
(C11, 6.10.1/5). The same applies to #ifdef.
The defined operator takes a single identifier as its operand, however, not an expression (C11, 6.10.1/1). Moreover, although the expression associated with an #if directive is macro expanded prior to evaluation, the behavior is undefined if in that context macro expansion produces the token "defined", and macro names modified by the defined unary operator are explicitly excluded from expansion (C11, 6.10.1/4).
Thus, although it is possible in many contexts to construct macro names via token pasting, and in such contexts the results are thereafter subject to macro expansion, the operand of a defined operator is not such a context. The language therefore defines no way to test whether a constructed or indirectly-specified identifier is defined as a macro name.
HOWEVER, you can avoid relying on defined if you are in control of all the header guards, and you are willing to deviate slightly from traditional style. Instead of merely #define-ing your header guards, define them to some nonzero integer value, say 1:
#if ! MYPREFIX_SOMEHEADER_H
#define MYPREFIX_SOMEHEADER_H 1
// body of someheader.h ...
#endif
You can then drop the defined operator from your test expression:
#if ! NAMESPACE(API_H)
# error "namespace not properly defined"
#endif
Do note, however, that the #define directive has a similar issue: it defines a single identifier, which is not subject to prior macro expansion. Thus, you cannot dynamically construct header guard names. I'm not sure whether you had that in mind, but if you did, then all the foregoing is probably moot.
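Putting the pieces of this approach together, a sketch might look like the following. It assumes the guard is written as BLA_API_H with the value 1, and it supplies a fallback for THENAMESPACE so the file compiles without -DTHENAMESPACE=BLA; guard_value is a hypothetical helper added only to make the result observable.

```c
#include <assert.h>

#ifndef THENAMESPACE
#define THENAMESPACE BLA        /* normally passed as -DTHENAMESPACE=BLA */
#endif

/* Stand-in for the header under test: the guard is defined to 1
   instead of being left empty. */
#if !BLA_API_H
#define BLA_API_H 1
#endif

/* Token pasting with one level of indirection, as in the question. */
#define PASTER(x, y) x##_##y
#define EVALUATOR(x, y) PASTER(x, y)
#define NAMESPACE(fun) EVALUATOR(THENAMESPACE, fun)

/* NAMESPACE(API_H) becomes BLA_API_H, which then expands to 1.
   An undefined guard would evaluate to 0 here and trigger the error. */
#if !NAMESPACE(API_H)
#error "namespace not properly defined"
#endif

static_assert(NAMESPACE(API_H) == 1, "guard expands to its value");

int guard_value(void) { return NAMESPACE(API_H); }
```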

How does the preprocessor know to translate HEADER_H to header.h?

Per this question, it seems there is some flexibility to how you can write that--
#ifndef _HEADER_H
or:
#ifndef __HEADER___H__
etc. It's not set in stone.
But I don't understand why we're using underscores at all in the first place. Why can't I just write:
#ifndef header.h
What's wrong with that? Why are we placing underscores everywhere and capitalizing everything? What does the preprocessor do with underscores?
header.h is not a valid identifier. You cannot have a period in a macro name.
That said, the name you pick for your include guard macros is completely arbitrary. After all, it's just another macro name. It is purely convention (and reasonable in order to avoid clashes) to name them after the file.
I encourage you to read the header structure out loud to see what the preprocessor does.
#ifndef MY_HEADER_H /* If the macro MY_HEADER_H is not defined (yet)... */
#define MY_HEADER_H /* ... then define it now ... */
... /* ... and deal with all this stuff ... */
#endif /* ... otherwise, skip all over it and go here. */
You see that this mechanism works equally well if you substitute MY_HEADER_H with I_REALLY_LIKE_BANANAS or whatever. The only requirement is that it be a valid macro identifier and not clash with the name of any other include guard.
In the above example, the macro is defined empty. That's fine, but it is not the only option. The second line could equally well read
#define MY_HEADER_H 1
which would then define the macro to 1. Some people do this but it doesn't really add anything and the value 1 is rather arbitrary. I generally don't do this. The only advantage is that if you define it to 1, you can also use #if in addition to #ifdef.
A final word of caution: identifiers that start with an underscore followed by an uppercase letter or another underscore are reserved for the implementation (C++ additionally reserves any identifier containing two consecutive underscores) and should not be used in user code. Hence, _MY_HEADER_H and __MY_HEADER__H__ are both unfortunate choices.
The logic by which the preprocessor finds the correct header file if you say
#include <myheader.h>
is completely unrelated. Here, myheader.h names a file and the preprocessor will search for it in a number of directories (that usually can be configured via the -I command line option). Only after it has found and opened the file will it go ahead and parse it, and thereby eventually find the include guards, which cause it to essentially skip over the file if it has already parsed it before (the include guard macro is then already defined, so the #ifndef check evaluates to false).
Because #ifdef or #ifndef requires a preprocessor symbol after it, and these symbols cannot contain dots.
In the C11 (latest draft) spec n1570 (§6.10.1):
Preprocessing directives of the forms
# ifdef identifier new-line group opt
# ifndef identifier new-line group opt
check whether the identifier is or is not currently defined as a macro name.
and identifiers cannot contain dots (§6.4.2.1)
BTW, include guards are not required to have #ifdef symbols related to the file name. You can have a header file foo.h guarded with a #ifndef JESUISCHARLIEHEBDO or by #ifndef I_LOVE_PINK_ROSES_BUT_NOT_YELLOW_ONES preprocessor directive if you want so. But by human convention, the names are often related.
Notice that identifiers starting with an underscore are reserved for the implementation, so you should avoid #ifndef _FOO_INCLUDED and prefer #ifndef FOO_INCLUDED

Using #undef before #define

In many places I see the usage of undefine macro before defining the same macro. For example:
#undef FORMULA
#ifdef SOMETHING
#define FORMULA 1
#else
#define FORMULA 2
#endif
What is the #undef FORMULA used for?
I may guess that it deals with the case when the macro was already defined before. But doesn't the new definition override the old one? Thanks!
A macro name currently defined cannot be redefined with a different definition (see below), so #undef allows that macro name to be redefined with a different definition.
Here's the relevant legalese:
Both C and C++ Standards (same wording):
A macro definition lasts (independent of block structure) until a corresponding #undef directive is encountered or (if none is encountered) until the end of the preprocessing translation unit.
Slight differences in wording, same meaning:
C Standard (N1256), §6.10.3/2:
An identifier currently defined as an object-like macro shall not be redefined by another #define preprocessing directive unless the second definition is an object-like macro definition and the two replacement lists are identical. Likewise, an identifier currently defined as a function-like macro shall not be redefined by another #define preprocessing directive unless the second definition is a function-like macro definition that has the same number and spelling of parameters, and the two replacement lists are identical.
C++ Standard (N3337) §16.3/2
An identifier currently defined as an object-like macro may be redefined by another #define preprocessing directive provided that the second definition is an object-like macro definition and the two replacement lists are identical, otherwise the program is ill-formed. Likewise, an identifier currently defined as a function-like macro may be redefined by another #define preprocessing directive provided that the second definition is a function-like macro definition that has the same number and spelling of parameters, and the two replacement lists are identical, otherwise the program is ill-formed.
Same wording in both Standards:
Two replacement lists are identical if and only if the preprocessing tokens in both have the same number, ordering, spelling, and white-space separation, where all white-space separations are considered identical.
So:
#define X(y) (y+1)
#define X(z) (z+1) // ill-formed, not identical
IMHO, using #undef is generally dangerous due to the scoping rules for preprocessor macros. I'd prefer to get a warning or error from the preprocessor and come up with a different preprocessor macro rather than have some translation unit silently accept a wrong macro definition that introduces a bug into the program. Consider:
// header1.h
#undef PORT_TO_WRITE_TO
#define PORT_TO_WRITE_TO 0x400
// header2.h
#undef PORT_TO_WRITE_TO
#define PORT_TO_WRITE_TO 0x410
and have a translation unit #include both headers. No warning, probably not the intended result.
Yes, the new definition overrides all previous ones, but the compiler issues a corresponding warning message. With #undef you get no warning.
But doesn't the new definition override the old one?
Yes, it does (when your compiler allows it). However, redefining a macro results in a compiler warning, which using #undef lets you avoid.
This may be important in programming shops with strict rules on compiler warnings - for example, by requiring all production code to be compiled with -Werror flag, which treats all warnings as errors.
#undef removes the macro, so the name is free to be defined again.
If the macro was never defined in the first place, #undef has no effect, so there's no downside. #undef … #define should be read as replacing any potential previous definition, but not to imply that it must already be defined.
Popular compilers do allow you to skip the #undef, but this is not allowed by the official standard ISO C and C++ language specifications. It is not portable to do so.
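A short sketch of the #undef-then-#define pattern from the question. The earlier definition of FORMULA as 10 is invented here to stand in for "some definition that may already exist"; SOMETHING is assumed not to be defined, and NEVER_DEFINED and formula are hypothetical names.

```c
#include <assert.h>

#define FORMULA 10      /* stand-in for an earlier definition elsewhere */

#undef FORMULA          /* drop it, so the name is free to be defined again */
#ifdef SOMETHING
#define FORMULA 1
#else
#define FORMULA 2       /* a different replacement list, but no conflict */
#endif

#undef NEVER_DEFINED    /* harmless: #undef of an undefined name does nothing */

static_assert(FORMULA == 2, "the definition after #undef is the one in effect");

int formula(void) { return FORMULA; }
```

Deleting the #undef FORMULA line should make the compiler diagnose the redefinition, since 10 and 2 are not identical replacement lists.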

Why does my #define macro appear to be a global?

I was investigating a compile and link issue within my program when I came across the following macro that was defined in a header and source file:
/* file_A.c */
#ifndef _NVSize
#define _NVSize 1
#endif
/* file_B.c */
#include "My_Header.h"
#ifndef _NVSize
#define _NVSize 1
#endif
/* My_Header.h */
#define _NVSize 1024
Nothing out of the ordinary yet, until I saw the following information in the GCC output map file:
/* My Map File */
...
.rodata 0x08015694 _NVSize
...
My understanding of the map file is that if you see a symbol in the .rodata section of the map file, this symbol is being treated as a global variable by the compiler. But this shouldn't be the case, because macros should be handled by the preprocessor before the compiler even parses the file. This macro should be replaced with its defined value before compiling.
Is this the standard way that GCC handles macros or is there some implementation specific reason that GCC would treat this as a global (debug setting maybe)? Also, what does this mean if my macro gets redefined in a different source file? Did I just redefine it for a single source file or did I modify a global variable, thereby changing _NVSize everywhere it's used within my program?
I think the compiler is free to assign your macro to a global variable as long as it ensures that this produces the exact same result as if it did a textual replacement.
During the compilation the compiler can mark this global specially to denote that it is a macro constant value, so no re-assignment is possible, no address can be taken, etc.
If you redefine the macro in your source, the compiler might not perform this transformation (and treat it as you'd expect: a preprocessor textual replacement), perform it on one of the different values (or on all of them, say, using different names for each occurrence), or do something else :)
Macros are substituted in the preprocessor step, the compiler only sees the substituted result. Thus if it sees the macro name, then my bet is that the macro wasn't defined at the point of usage. It is defined between the specific #define _NVSize and an #undef _NVSize. Redefining an existing macro without using an #undef first should result in a preprocessor error, AFAIR.
BTW, you shouldn't start your macro names with an underscore. These are reserved for the implementation.
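To illustrate the point that a macro is substituted per translation unit before compilation, here is a single-file sketch of what file_B.c sees. NV_SIZE is used instead of _NVSize to stay clear of reserved names, the header's #define is written inline as a stand-in for My_Header.h, and nv_buffer and nv_buffer_size are hypothetical.

```c
#include <assert.h>

/* Stand-in for the contents of My_Header.h. */
#define NV_SIZE 1024

/* Fallback as in file_B.c: skipped, because NV_SIZE is already defined
   at this point in this translation unit. */
#ifndef NV_SIZE
#define NV_SIZE 1
#endif

/* The compiler proper only ever sees "nv_buffer[1024]"; the name
   NV_SIZE no longer exists after preprocessing. */
static unsigned char nv_buffer[NV_SIZE];

static_assert(sizeof nv_buffer == 1024, "the header's definition is in effect");

unsigned long nv_buffer_size(void)
{
    return (unsigned long)sizeof nv_buffer;
}
```

A source file that does not include the header, such as file_A.c in the question, would take the fallback branch and get 1 instead; each translation unit is preprocessed on its own.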

Resources