Two quick questions about flex / C - c

I would like to use this idiom :
yy_scan_string(line);
int i;
while ((i = yylex()))
....
where these two functions are defined in the flex-generated lex.yy.c, from my main C file. So far, I am
#including "lex.yy.c"
but it seems fishy. How do I do that the correct C way?
Secondly, I would like the last line of my .l file,
. { return WORD; }
to no longer return a "WORD" token, but rather to return its input. For example (it is a smallish Linux shell)
ls > ls.txt
This currently returns two WORD tokens, a GREATER token, and six WORD tokens, whereas I would like it to return "ls" GREATER "ls.txt". Of course yylex() can only return one type, so what is the accepted way to obtain the desired result?
Thanks.

You can tell flex to generate a header file as well as the C source file, using the --header-file=<filename> command line option, or by including %option header-file="<filename>" in the flex source. I typically invoke flex with:
flex --header-file=file.h -o file.c file.l
(Actually, I use make rules to generate a command like that, but that's the idea.) Then you can #include "file.h" in any source file which needs to invoke a flex function.
Normally, yylex returns the token type (an integer). The global variable yytext contains a pointer to the token string itself, which is probably sufficient for your purposes. However, please read "A Note About yytext And Memory" in the flex manual. (Summary: if you need to save the value of yytext, you must make a copy of it; strdup is recommended. Don't forget to free the copy when you don't need it anymore.)
Sometimes, the token string itself is not exactly what you want as a semantic value. By convention, flex actions place the semantic value of the token in the global yylval, which is where bison-generated parsers will look for it. However, yylval is not declared anywhere by flex-generated code, so you need to include a declaration yourself, both in the flex-generated code and in any source file which includes it. (If you use bison to generate your parser, bison will generate this declaration and put it in the header file it generates.)
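To make the "token type plus token text" convention concrete, here is a minimal hand-written stand-in for a flex scanner. It is only a sketch: scan_string(), next_token(), tok_text and the TOK_* codes are hypothetical names that play the roles of yy_scan_string(), yylex(), yytext and your token definitions. It also assumes that a word is a run of characters other than blanks and '>'; in a real .l file that would mean replacing the single-character . rule with a pattern that matches a whole word.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; 0 is reserved for end of input, as with yylex(). */
enum { TOK_EOF = 0, TOK_WORD = 258, TOK_GREATER = 259 };

/* Stand-ins for the scanner's globals (tok_text plays the role of yytext). */
static const char *scan_pos = "";   /* next unread character         */
static char        tok_text[256];   /* text of the most recent token */

/* Plays the role of yy_scan_string(): choose the string to tokenize. */
void scan_string(const char *line) { scan_pos = line; }

/* Plays the role of yylex(): return the token type, leave its text in tok_text. */
int next_token(void)
{
    size_t n = 0;

    while (*scan_pos == ' ' || *scan_pos == '\t')
        scan_pos++;                 /* skip blanks between tokens */

    if (*scan_pos == '\0')
        return TOK_EOF;

    if (*scan_pos == '>') {
        tok_text[0] = *scan_pos++;
        tok_text[1] = '\0';
        return TOK_GREATER;
    }

    /* A word is a maximal run of characters that are not blanks or '>'. */
    while (*scan_pos != '\0' && *scan_pos != ' ' && *scan_pos != '\t' &&
           *scan_pos != '>' && n + 1 < sizeof tok_text)
        tok_text[n++] = *scan_pos++;
    tok_text[n] = '\0';
    return TOK_WORD;
}

/* Copy the current token text, because tok_text is overwritten by the
 * next call. The caller owns the result and must free() it. */
char *save_token_text(void)
{
    size_t len = strlen(tok_text) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, tok_text, len);
    return copy;
}
```

Calling scan_string("ls > ls.txt") and then next_token() repeatedly yields a word with text "ls", a GREATER token, a word with text "ls.txt", and finally 0. The explicit copy in save_token_text() mirrors the advice about copying yytext before the next call to the scanner.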

Related

Linker error while using lex to compile simple print file. have newest version of macOS 13.1 and xcode installed

I'm trying to run a simple program in flex that reads the string "hello world" and prints "Goodbye"
here is the file:
%%
"hello world" printf("Goodbye\n");
. ;
%%
the commands are
% flex ex1.l
% gcc lex.yy.c -ll
Error message:
ld: warning: object file (/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libl.a(libmain.o)) was built for newer macOS version (13.1) than being linked (13.0)
I have redownloaded the newest version of Xcode and have the updated macOS. I'm not sure what I can try to get this error code to go away.
I have also run
% CMAKE_OSX_DEPLOYMENT_TARGET 13.0
to try and force the link to V13.0 without success.
The lex library, which the linker will use if given the -ll command-line flag, is supposed to be a convenience to make it easier to write quick and dirty (f)lex programs. If your compiler toolchain installation is OK, then it is a small convenience, but it seems like it's pretty easy to get the installation wrong, and since no-one seems to bother documenting the details of toolchain installation, it's hard to debug configuration issues unless you have lots of experience. That makes it a lot less convenient.
You could use this as a learning experience in configuring your compilation environment, a skill which you will certainly find useful. But if you just want to get on with flex, it's really easy to avoid the need for the lex library.
The lex library consists of exactly two things, neither of which is necessary.
First, it contains a definition of yywrap(). The generated lexer calls yywrap() when it encounters the end of an input file. If yywrap returns 0, the lexer assumes that you have done something which will allow the lexer to continue reading input. That's actually not a very common use case, because most parsers just read one input file from beginning to end, and then finish. So a good default is to write a version of yywrap() which always returns 1, in which case the lexer will return an end-of-file token (0), and that should cause the lexer's caller to stop calling the lexer. The lex library includes the definition of precisely that simple default implementation:
int yywrap() { return 1; }
So you could include that code in your flex file, but a better solution is to tell flex to omit the yywrap() call and just assume that end-of-file means that there is no more input. Which you do by inserting the following declaration in your Flex prologue (not in the %{...%} code section):
%option noyywrap
I always recommend using a few more options:
%option noinput nounput noyywrap nodefault
The first two of those suppress the generation of the input() and unput() functions, so that you don't get "Unused function" warnings when you compile. Or better said, so that you wouldn't get warnings if you requested warnings when you compiled. But you should always request warnings. (And you should read them and act on them.) See the sample build instructions at the end of this answer.
The last option, nodefault, causes flex to produce a warning if there is any input which could trigger the automatic default action. The automatic default is to write the unmatched input to stdout and do nothing else, which is practically never what you want to do with unmatched input. Even if you wanted to ignore all unmatched input, which is a good way of ignoring programming errors, you'd very rarely want to print unmatched input to the output.
As it happens, your code will trigger this warning if you add the nodefault option, because you used . as your fallback pattern. . in (f)lex matches anything other than a newline character, and nothing in your program matches a newline, so the newline will go unmatched. Which means it will be echoed to standard output. Instead, you should use .|\n so that your default pattern also matches a newline. (Alternatively, with flex, you can use the s pattern flag, which causes . to really match anything: (?s:.).)
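The effect of the unmatched newline can be imitated in plain C. The function below is only a hand-written simulation of the rules discussed here, not flex-generated code; scan_sim and its parameters are hypothetical names. It writes to a buffer instead of stdout so that the two fallback patterns can be compared.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hand-written imitation of the three behaviours discussed above:
 *   "hello world"  -> emit "Goodbye\n"
 *   fallback rule  -> discard one character ('.' or '.|\n')
 *   default rule   -> echo the unmatched character
 * fallback_matches_newline selects between '.' (0) and '.|\n' (1).
 * Output goes to out (NUL-terminated, truncated to outsz bytes).
 */
void scan_sim(const char *in, int fallback_matches_newline,
              char *out, size_t outsz)
{
    static const char pattern[] = "hello world";
    static const char reply[]   = "Goodbye\n";
    const size_t plen = sizeof pattern - 1;
    size_t used = 0;

    if (outsz == 0)
        return;

    while (*in != '\0') {
        if (strncmp(in, pattern, plen) == 0) {
            /* Longest match wins, so the string rule beats the 1-char fallback. */
            for (const char *r = reply; *r != '\0' && used + 1 < outsz; r++)
                out[used++] = *r;
            in += plen;
        } else if (*in != '\n' || fallback_matches_newline) {
            in++;                       /* fallback rule: ignore the character */
        } else {
            if (used + 1 < outsz)
                out[used++] = *in;      /* default rule: echo the character */
            in++;
        }
    }
    out[used] = '\0';
}
```

With the input "hello world\n", the '.' variant produces "Goodbye\n" followed by an extra echoed newline, while the '.|\n' variant produces only "Goodbye\n".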
The other thing that the lex library contains is a default definition of main(). It's somewhat interesting that this is even possible. It works because the linker only includes functions from a library if the functions are not defined in the object files being linked. And, in C, main() really acts like an ordinary function. So if you link with a library which defines main(), that main program will be used whenever you don't define main() in your source code. That can be handy for debugging, but it's not really recommended in production code.
The particular main() function which is included in -ll just calls yylex() repeatedly until it returns an end-of-file token (0):
int main() {
while (yylex() != 0) { }
return 0;
}
So instead of relying on the lex library, just add those four lines at the end of your lexer definition, after the second %%.
If you apply both of these suggestions, you will end up with:
%option noinput nounput noyywrap nodefault
%{
/* So that `printf` is declared */
#include <stdio.h>
%}
%%
"hello world" printf("Goodbye\n");
.|\n ;
%%
int main() {
while (yylex() != 0) { }
return 0;
}
You can then build it without the -ll flag. But you should always build with compiler warnings enabled, and personally I think that using the default executable name a.out is bad style:
% flex ex1.l
% gcc -Wall -o ex1 lex.yy.c

Invalid argument in C using vscode [duplicate]

What is the difference between using angle brackets and quotes in an include directive?
#include <filename>
#include "filename"
What differs is the locations in which the preprocessor searches for the file to be included.
#include <filename>   The preprocessor searches in an implementation-defined manner, normally in directories pre-designated by the compiler/IDE. This method is normally used to include header files for the C standard library and other header files associated with the target platform.
#include "filename"   The preprocessor also searches in an implementation-defined manner, but one that is normally used to include programmer-defined header files and typically includes same directory as the file containing the directive (unless an absolute path is given).
For GCC, a more complete description is available in the GCC documentation on search paths.
The only way to know is to read your implementation's documentation.
In the C standard, section 6.10.2, paragraphs 2 to 4 state:
A preprocessing directive of the form
#include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
A preprocessing directive of the form
#include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
#include <h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
A preprocessing directive of the form
#include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. (Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens.) The directive resulting after all replacements shall match one of the two previous forms. The method by which a sequence of preprocessing tokens between a < and a > preprocessing token pair or a pair of " characters is combined into a single header name preprocessing token is implementation-defined.
Definitions:
h-char: any member of the source character set except the new-line character and >
q-char: any member of the source character set except the new-line character and "
The sequence of characters between < and > uniquely refers to a header, which isn't necessarily a file. Implementations are pretty much free to use the character sequence as they wish. (Most, however, just treat it as a file name and do a search in the include path, as the other posts state.)
If the #include "file" form is used, the implementation first looks for a file of the given name, if supported. If not (supported), or if the search fails, the implementation behaves as though the other (#include <file>) form was used.
Also, a third form exists and is used when the #include directive doesn't match either of the forms above. In this form, some basic preprocessing (such as macro expansion) is done on the "operands" of the #include directive, and the result is expected to match one of the two other forms.
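A small sketch of that third form, assuming a compiler such as GCC, Clang or MSVC that combines the expanded tokens into a header name in the usual way. STRING_HEADER and name_length are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* HYPOTHETICAL macro name; it expands to a header name in angle brackets. */
#define STRING_HEADER <string.h>

/* Third form: the tokens after #include are macro-expanded first,
 * and the result must look like one of the two basic forms. */
#include STRING_HEADER

/* Uses strlen() to confirm that <string.h> really was included. */
size_t name_length(const char *name)
{
    return strlen(name);
}
```

The same mechanism works with a macro that expands to a quoted name, which is one way a project can choose a configuration header at build time.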
Some good answers here make references to the C standard but forget the POSIX standard, especially the specific behavior of the c99 (i.e. C compiler) command.
According to The Open Group Base Specifications Issue 7,
-I directory
Change the algorithm for searching for headers whose names are not absolute pathnames to look in the directory named by the directory pathname before looking in the usual places. Thus, headers whose names are enclosed in double-quotes ( "" ) shall be searched for first in the directory of the file with the #include line, then in directories named in -I options, and last in the usual places. For headers whose names are enclosed in angle brackets ( "<>" ), the header shall be searched for only in directories named in -I options and then in the usual places. Directories named in -I options shall be searched in the order specified. Implementations shall support at least ten instances of this option in a single c99 command invocation.
So, in a POSIX-compliant environment, with a POSIX-compliant C compiler, #include "file.h" is likely to search for ./file.h first, where . is the directory containing the file with the #include statement, while #include <file.h> is likely to search for /usr/include/file.h first, where /usr/include stands for your system's usual places for headers (which do not seem to be defined by POSIX).
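The search order in the quoted POSIX text can be written down as a small model. This is not compiler code, only a sketch of the rule: search_order and its parameters are hypothetical names, and the "usual places" are simply passed in by the caller because POSIX does not name them.

```c
#include <stddef.h>

/*
 * Model of the c99 search order quoted above (not real compiler code).
 * Fills order[] with the directories to try, in order, and returns the count.
 *   quoted      nonzero for #include "...", zero for #include <...>
 *   source_dir  directory of the file containing the #include line
 *   idirs       directories given with -I, in command-line order
 *   usual       the implementation's "usual places"
 */
size_t search_order(int quoted, const char *source_dir,
                    const char *const *idirs, size_t n_idirs,
                    const char *const *usual, size_t n_usual,
                    const char **order, size_t max_order)
{
    size_t n = 0;

    if (quoted && n < max_order)
        order[n++] = source_dir;    /* only the quoted form looks here */
    for (size_t i = 0; i < n_idirs && n < max_order; i++)
        order[n++] = idirs[i];      /* then every -I directory, in order */
    for (size_t i = 0; i < n_usual && n < max_order; i++)
        order[n++] = usual[i];      /* and last the usual places */
    return n;
}
```

For a file in src compiled with -Iinclude, the quoted form gives src, include, then the usual places, while the angle-bracket form gives include, then the usual places.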
The exact behavior of the preprocessor varies between compilers. The following answer applies for GCC and several other compilers.
#include <file.h> tells the compiler to search for the header in its "includes" directory, e.g. for MinGW the compiler would search for file.h in C:\MinGW\include\ or wherever your compiler is installed.
#include "file" tells the compiler to search the current directory (i.e. the directory in which the source file resides) for file.
You can use the -I flag for GCC to tell it that, when it encounters an include with angled brackets, it should also search for headers in the directory after -I. GCC will treat the directory after the flag as if it were the includes directory.
For instance, if you have a file called myheader.h in your own directory, you could say #include <myheader.h> if you called GCC with the flag -I . (indicating that it should search for includes in the current directory.)
Without the -I flag, you will have to use #include "myheader.h" to include the file, or move myheader.h to the include directory of your compiler.
GCC documentation says the following about the difference between the two:
Both user and system header files are included using the preprocessing directive ‘#include’. It has two variants:
#include <file>
This variant is used for system header files. It searches for a file named file in a standard list of system directories. You can prepend directories to this list with the -I option (see Invocation).
#include "file"
This variant is used for header files of your own program. It searches for a file named file first in the directory containing the current file, then in the quote directories and then the same directories used for <file>. You can prepend directories to the list of quote directories with the -iquote option.
The argument of ‘#include’, whether delimited with quote marks or angle brackets, behaves like a string constant in that comments are not recognized, and macro names are not expanded. Thus, #include <x/*y> specifies inclusion of a system header file named x/*y.
However, if backslashes occur within file, they are considered ordinary text characters, not escape characters. None of the character escape sequences appropriate to string constants in C are processed. Thus, #include "x\n\\y" specifies a filename containing three backslashes. (Some systems interpret ‘\’ as a pathname separator. All of these also interpret ‘/’ the same way. It is most portable to use only ‘/’.)
It is an error if there is anything (other than comments) on the line after the file name.
It does:
"mypath/myfile" is short for ./mypath/myfile
with . being either the directory of the file where the #include is contained in, and/or the current working directory of the compiler, and/or the default_include_paths
and
<mypath/myfile> is short for <defaultincludepaths>/mypath/myfile
If ./ is in <default_include_paths>, then it doesn't make a difference.
If mypath/myfile is in another include directory, the behavior is undefined.
The <file> include tells the preprocessor to search in -I directories and in predefined directories first, then in the .c file's directory. The "file" include tells the preprocessor to search the source file's directory first, and then revert to -I and predefined. All destinations are searched anyway, only the order of search is different.
The C++ 2011 standard (C++11) discusses include files mainly in section 16.2, "Source file inclusion".
2 A preprocessing directive of the form
# include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
3 A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include <h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
Note that "xxx" form degrades to <xxx> form if the file is not found. The rest is implementation-defined.
By the standard - yes, they are different:
A preprocessing directive of the form
#include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
A preprocessing directive of the form
#include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
#include <h-char-sequence> new-line
with the identical contained sequence (including > characters, if any) from the original directive.
A preprocessing directive of the form
#include pp-tokens new-line
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. (Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens.) The directive resulting after all replacements shall match one of the two previous forms. The method by which a sequence of preprocessing tokens between a < and a > preprocessing token pair or a pair of " characters is combined into a single header name preprocessing token is implementation-defined.
Definitions:
h-char: any member of the source character set except the new-line character and >
q-char: any member of the source character set except the new-line character and "
Note that the standard does not specify any relation between the two implementation-defined manners. The first form searches in one implementation-defined way, and the other in a (possibly different) implementation-defined way. The standard also specifies that certain include files shall be present (for example, <stdio.h>).
Formally you'd have to read the manual for your compiler, however normally (by tradition) the #include "..." form searches the directory of the file in which the #include was found first, and then the directories that the #include <...> form searches (the include path, eg system headers).
At least for GCC version <= 3.0, the angle-bracket form does not generate a dependency between the included file and the including one.
So if you want to generate dependency rules (using the GCC -M option for example), you must use the quoted form for the files that should be included in the dependency tree.
(See http://gcc.gnu.org/onlinedocs/cpp/Invocation.html )
For #include "" a compiler normally searches the folder of the file which contains that include and then the other folders. For #include <> the compiler does not search the current file's folder.
Thanks for the great answers, esp. Adam Stelmaszczyk and piCookie, and aib.
Like many programmers, I have used the informal convention of using the "myApp.hpp" form for application specific files, and the <libHeader.hpp> form for library and compiler system files, i.e. files specified in /I and the INCLUDE environment variable, for years thinking that was the standard.
However, the C standard states that the search order is implementation specific, which can make portability complicated. To make matters worse, we use jam, which automagically figures out where the include files are. You can use relative or absolute paths for your include files. i.e.
#include "../../MyProgDir/SourceDir1/someFile.hpp"
Older versions of MSVS required double backslashes (\\), but now that's not required. I don't know when it changed. Just use forward slashes for compatibility with 'nix (Windows will accept that).
If you are really worried about it, use "./myHeader.h" for an include file in the same directory as the source code (my current, very large project has some duplicate include file names scattered about--really a configuration management problem).
Here's the MSDN explanation, copied here for your convenience.
Quoted form
The preprocessor searches for include files in this order:
1. In the same directory as the file that contains the #include statement.
2. In the directories of the currently opened include files, in the reverse order in which they were opened. The search begins in the directory of the parent include file and continues upward through the directories of any grandparent include files.
3. Along the path that's specified by each /I compiler option.
4. Along the paths that are specified by the INCLUDE environment variable.
Angle-bracket form
The preprocessor searches for include files in this order:
1. Along the path that's specified by each /I compiler option.
2. When compiling occurs on the command line, along the paths that are specified by the INCLUDE environment variable.
An #include with angle brackets will search an "implementation-dependent list of places" (which is a very complicated way of saying "system headers") for the file to be included.
An #include with quotes will just search for a file (and, "in an implementation-dependent manner", bleh). Which means, in normal English, it will try to apply the path/filename that you toss at it and will not prepend a system path or tamper with it otherwise.
Also, if #include "" fails, it is re-read as #include <> by the standard.
The gcc documentation has a (compiler specific) description which although being specific to gcc and not the standard, is a lot easier to understand than the attorney-style talk of the ISO standards.
Many of the answers here focus on the paths the compiler will search in order to find the file. While this is what most compilers do, a conforming compiler is allowed to be preprogrammed with the effects of the standard headers, and to treat, say, #include <list> as a switch, and it need not exist as a file at all.
This is not purely hypothetical. There is at least one compiler that works that way. Using #include <xxx> only with standard headers is recommended.
#include <abc.h>
is used to include standard library files. So the compiler will check in the locations where standard library headers are residing.
#include "xyz.h"
will tell the compiler to include user-defined header files. So the compiler will check for these header files in the current folder or -I defined folders.
#include <> is for predefined header files
If the header file is predefined then you would simply write the header file name in angular brackets, and it would look like this (assuming we have a predefined header file name iostream):
#include <iostream>
#include " " is for header files the programmer defines
If you (the programmer) wrote your own header file then you would write the header file name in quotes. So, suppose you wrote a header file called myfile.h, then this is an example of how you would use the include directive to include that file:
#include "myfile.h"
#include "filename" // User defined header
#include <filename> // Standard library header.
Example:
The filename here is Seller.h:
#ifndef SELLER_H // Header guard
#define SELLER_H // Header guard
#include <string>
#include <iostream>
#include <iomanip>
class Seller
{
private:
char name[31];
double sales_total;
public:
Seller();
Seller(char[], double);
char* getName();
}; // The class definition must be closed before the guard ends
#endif
In the class implementation (for example, Seller.cpp, and in other files that will use the file Seller.h), the header defined by the user should now be included, as follows:
#include "Seller.h"
In C++, you can include a file in two ways:
The first is #include <filename>, which tells the preprocessor to look for the file in the predefined default location.
This location is often an INCLUDE environment variable that denotes the path to include files.
The second is #include "filename", which tells the preprocessor to look for the file in the current directory first, then in the predefined locations the user has set up.
The #include <filename> form is used when a system file is being referred to, that is, a header file that can be found at system default locations like /usr/include or /usr/local/include. For your own files that need to be included in another program, you have to use the #include "filename" syntax.
The simple general rule is to use angled brackets to include header files that come with the compiler. Use double quotes to include any other header files. Most compilers do it this way.
1.9 — Header files explains in more detail about pre-processor directives. If you are a novice programmer, that page should help you understand all that. I learned it from here, and I have been following it at work.
#include <filename>
is used when you want to use the header file of the C/C++ system or compiler libraries. These libraries can be stdio.h, string.h, math.h, etc.
#include "path-to-file/filename"
is used when you want to use your own custom header file which is in your project folder or somewhere else.
For more information about the preprocessor and headers, read C - Preprocessors.
#include <filename>
The preprocessor searches in an implementation-dependent manner. It tells the compiler to search the directories where system header files are held.
This method is usually used to find standard header files.
#include "filename"
This tells the compiler to search for the header file in the directory of the current source file. If that fails, it behaves like #include <filename> and searches the directories where system header files are stored.
This method is usually used for user-defined header files (header files created by the user). Therefore, don't use it to include standard library headers, because the extra search can take more compile time than #include <filename>.
Form 1 - #include <xxx>
This searches the preconfigured list of standard system directories. With GCC, the current directory is not searched for this form.
Form 2 - #include "xxx"
This first looks for the header file in the directory of the file containing the directive. If it is not found there, the search continues as for Form 1.
The exact search directory list depends on the target system, how GCC is configured, and where it is installed.
You can find the search directory list of your GCC compiler by running it with the -v option.
You can add additional directories to the search path by using -Idir, which causes dir to be searched after the current directory (for the quote form of the directive) and ahead of the standard system directories.
Basically, the form "xxx" is nothing but a search in the current directory first, falling back to the form <xxx> if the file is not found.
#include <file>
Includes a file where the default include directory is.
#include "file"
Includes a file in the current directory in which it was compiled. Double quotes can specify a full file path to a different location as well.
The <filename> form searches in the standard C library locations,
whereas "filename" searches in the current directory as well.
Ideally, you would use <...> for standard C libraries and "..." for libraries that you write and are present in the current directory.
In general the difference is where the preprocessor searches for the header file:
#include is a preprocessor directive used to include a header file. Both forms add a header file to the program, but the first is for system header files and the latter is for user-defined header files.
#include <filename> is used to include a system library header file in the program. The C/C++ preprocessor will search for the filename where the C library files or predefined system header files are stored.
#include "filename" is used to include a user-defined header file in the program. The C/C++ preprocessor will search for the filename in the directory of the current source file and then follow the search path used for #include <filename>.
Check the gcc docs gcc include files
"" will search ./ first. Then search the default include path.
You can use command like this to print the default include path:
gcc -v -o a a.c
Here are some examples to make things clearer:
the code a.c works
// a.c
#include "stdio.h"
int main() {
int a = 3;
printf("a = %d\n", a);
return 0;
}
the code of b.c works too
// b.c
#include <stdio.h>
int main() {
int a = 3;
printf("a = %d\n", a);
return 0;
}
but when I create a new file named stdio.h in current directory
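// stdio.h
// static: a plain C99 inline function would give d.c no external definition to link against
static inline int foo(void)
{
return 10;
}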
a.c will generate a compile error, but b.c still works.
"" and <> can also be used together with the same file name, since the search path priority is different,
so d.c also works
// d.c
#include <stdio.h>
#include "stdio.h"
int main()
{
int a = 0;
a = foo();
printf("a=%d\n", a);
return 0;
}
To see the search order on your system using gcc, based on the current configuration, you can execute the following command.
cpp -v /dev/null -o /dev/null
Apple LLVM version 10.0.0 (clang-1000.10.44.2)
Target: x86_64-apple-darwin18.0.0
Thread model: posix InstalledDir: Library/Developer/CommandLineTools/usr/bin
"/Library/Developer/CommandLineTools/usr/bin/clang" -cc1 -triple
x86_64-apple-macosx10.14.0 -Wdeprecated-objc-isa-usage
-Werror=deprecated-objc-isa-usage -E -disable-free -disable-llvm-verifier -discard-value-names -main-file-name null -mrelocation-model pic -pic-level 2 -mthread-model posix -mdisable-fp-elim -fno-strict-return -masm-verbose -munwind-tables -target-cpu penryn -dwarf-column-info -debugger-tuning=lldb -target-linker-version 409.12 -v -resource-dir /Library/Developer/CommandLineTools/usr/lib/clang/10.0.0 -isysroot
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk
-I/usr/local/include -fdebug-compilation-dir /Users/hogstrom -ferror-limit 19 -fmessage-length 80 -stack-protector 1 -fblocks -fencode-extended-block-signature -fobjc-runtime=macosx-10.14.0 -fmax-type-align=16 -fdiagnostics-show-option -fcolor-diagnostics -traditional-cpp -o - -x c /dev/null
clang -cc1 version 10.0.0 (clang-1000.10.44.2) default target x86_64-apple-darwin18.0.0 ignoring
nonexistent directory "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/local/include"
ignoring nonexistent directory "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/Library/Frameworks"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/Library/Developer/CommandLineTools/usr/lib/clang/10.0.0/include
/Library/Developer/CommandLineTools/usr/include
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks (framework directory)
End of search list.
The implementation-defined warnings generated by the compiler can (and will) treat system libraries differently than program libraries.
So
#include <myFilename>
-- which in effect declares that myFilename is in the system library location -- may well (and probably will) hide dead code and unused variable warnings etc, that would show up when you use:
#include "myFilename"
There exist two ways to write an #include statement. These are:
#include"filename"
#include<filename>
The meaning of each form is
#include"mylib.h"
This command would look for the file mylib.h in the current directory as well as the specified list of directories as mentioned n the include search path that might have been set up.
#include<mylib.h>
This command would look for the file mylib.h in the specified list of directories only.
The include search path is nothing but a list of directories that would be searched for the file being included.Different C compilers let you set the search path in different manners.

C - Should I use quotes or brackets to include headers in a separate directory

I have a project with a src and an include directory. When compiling, I pass the include directory via the -I option (gcc -Iinclude ...).
Should I use double quotes (") or angle brackets (<) to include my own header files?
I tried to look for an answer and found these two conflicting statements:
1. include header files relative to the c file via double quotes. Everything else (header files in include paths) with angle brackets. -> thus use angle brackets
2. include standard headers with angle brackets. Everything else with double quotes. -> thus use double quotes
In my opinion statement 2 is clearer. When including a file with double quotes it is most obvious that it is my own header.
Should I use quotes or brackets to include my own header files? The C standard allows both possibilities. So what is the best practice?
The common convention is:
Use < … > for headers that are part of the C implementation or the platform—headers outside your project such as the C standard library, Unix or Windows headers, and headers of libraries generally installed for your development environment.
Use " … " for headers that are part of your project.
This is not fully determined by the C standard; it is a matter of general practice. For each delimiter choice, a compiler has a list of places (a search path) where it looks for headers. Those search paths are commonly designed to facilitate the use described above, but they are customizable (depending on the compiler you use) by command-line switches, by environment variables, by system settings, and/or by settings made when building the compiler.
Here is what the C standard says about them in C 2018 6.10.2. Paragraph 2 says:
A preprocessing directive of the form
# include < h-char-sequence > new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
Paragraph 3 says:
A preprocessing directive of the form
# include " q-char-sequence " new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters. The named source file is searched for in an implementation-defined manner. If this search is not supported, or if the search fails, the directive is reprocessed as if it read
# include < h-char-sequence > new-line
with the identical contained sequence (including > characters, if any) from the original directive.
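A small sketch of the fallback described in paragraph 3, assuming no file named string.h sits next to the source file: the quoted search finds nothing, so the directive is reprocessed as if it had been written with angle brackets. The function name greeting_length is made up for illustration.

```c
/* Quoted form: the compiler first runs its implementation-defined
 * source-file search. Assuming no local "string.h" exists, the directive
 * is then reprocessed as #include <string.h>. */
#include "string.h"

/* Bracket form: searches only the implementation-defined header places. */
#include <stddef.h>

/* Hypothetical helper that needs strlen() from the header found above. */
size_t greeting_length(void)
{
    return strlen("hello");
}
```

This compiles on the usual toolchains precisely because of the fallback, but it is also why the quoted form is a poor choice for system headers: a stray local file with the same name would be picked up first.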
Note some of the differences between the two:
The text for the bracket form says it searches for a header identified uniquely. The text for the quote form does not include the word “uniquely”. This suggests all the headers referred to by the bracketed form are supposed to be different from each other, which you might expect if they were part of a designed system seeking to avoid ambiguity.
Note that it says the first form “searches a sequence of implementation-defined places.” This accords with the compiler having a list of places to search for standard headers. For the second form, it uses “the source file identified by the specified sequence.” This accords with using the text between quotes as a path in the file system.
This text in the standard is quite lax, both allowing implementation-defined methods of identifying the files, so either can be stretched to be the same as the other (although it would be interesting to see a compiler complain that a header named in brackets is not unique), and compiler configuration options are sufficiently broad that you could use each in either way for your project. However, it is generally better to stick to convention.
Consider this example:
+- include/
|  |
|  \- header.h
|
+- src/
   |
   \- main.c
The statements are saying to either use:
#include "../include/header.h"
gcc src/main.c
or:
#include <header.h>
gcc -Iinclude src/main.c
You decide which style to use. I personally prefer the second one. But it is more important to use a consistent style throughout the project.

What is the difference between '#include' and '##include'?

For example:
#include "pathtoheader1/header1.hh"
##include "pathtoheader2/header2.hh"
What is the difference between these two preprocessor directives?
Edit
From what I can tell, the ##include directive, in the context of the program I am working with, will prepend -I flags to the specified include path.
TRICK_CFLAGS += -Imodels
TRICK_CXXFLAGS += -Imodels
The compiler will now look for:
/models/pathtoheader1/header1.hh
instead of
/pathtoheader1/header1.hh
These flags are stored in a .mk file.
Additional Information
I am using NASA's Trick Simulation environment to build a simple 2-body simulation of the earth orbiting the sun. The specific tool I am using is called 'trick-CP', Trick's compilation tool.
https://github.com/nasa/trick
## is the token pasting operator in both the C and C++ preprocessors. It's used to concatenate two arguments.
Since it requires an argument either side, a line starting with it is not syntactically valid, unless it's a continuation of a previous line where that previous line has used the line continuation symbol \ or equivalent trigraph sequence.
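For comparison, here is a minimal sketch of ## used where it is valid, inside a macro definition. The names PASTE, counter_1 and read_counter are made up for illustration.

```c
/* PASTE glues its two arguments into a single token. */
#define PASTE(a, b) a##b

/* Expands to: int counter_1 = 10; */
int PASTE(counter_, 1) = 10;

int read_counter(void)
{
    /* The pasted identifier is an ordinary variable afterwards. */
    return counter_1;
}
```

Outside of a macro replacement list, as in the ##include line from the question, a plain C or C++ preprocessor has nothing to paste and rejects the line.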
The question is about NASA Trick, which extends the C and C++ languages with its own syntax.
From Trick documentation:
Headers files, that supply data-types for user-defined models should be included using ##include . Note the double hash (#).
The second one is a syntax error in C++, and I am pretty sure it is a syntax error in C too. The ## preprocessor operator is only valid inside a preprocessor macro (where it forces token pasting).
Here is what the Trick Documentation says about include:
Include files
There are two types of includes in the S_define file.
Single pound "#" includes.
Include files with a single pound "#" are parsed as they are part of the S_define file. They are treated just as #include files in C or C++ files. These files usually include other sim objects or instantiations as part of the S_define file.
Double pound "##" includes.
Include files with a double pound "##" are not parsed as part of the S_define file. These files are the model header files. They include the model class and structure definitions as well as C prototypes for functions used in the S_define file. Double pound files are copied, minus one pound, to S_source.hh.
Also here is a link to where it talks about it in the Trick documentation: https://nasa.github.io/trick/documentation/building_a_simulation/Simulation-Definition-File

gcc check if file is main (#if __BASE_FILE__ == __FILE__)

In Ruby there's a very common idiom to check if the current file is the "main" file:
if __FILE__ == $0
# do something here (usually run unit tests)
end
I'd like to do something similar in C. After reading the gcc documentation, I figured it should work like this:
#if __FILE__ == __BASE_FILE__
// Do stuff
#endif
The only problem is that when I try this, I get:
$ gcc src/bitmap_index.c -std=c99 -lm && ./a.out
src/bitmap_index.c:173:1: error: token ""src/bitmap_index.c"" is not valid in preprocessor expressions
Am I using #if wrong?
As a summary for future visitors:
You cannot compare strings using #if.
__BASE_FILE__ is the name of the file that is being compiled (that is actually what I wanted).
The best way to do this is to set a flag during compilation with -D.
In gcc you can use:
#if __INCLUDE_LEVEL__ == 0
or:
if(!__INCLUDE_LEVEL__)
to check if you're inside the __BASE_FILE__.
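A minimal sketch of the -D approach from the summary. The macro name BITMAP_SELF_TEST and the function count_bits are made up for illustration; they stand in for whatever the real file contains.

```c
#include <assert.h>
#include <stdio.h>

/* Count the set bits in a byte; stands in for the real library code. */
unsigned count_bits(unsigned char byte)
{
    unsigned n = 0;
    while (byte) {
        n += byte & 1u;
        byte >>= 1;
    }
    return n;
}

/* Hypothetical flag: this main() exists only when the file is compiled
 * with -DBITMAP_SELF_TEST, so a normal build still links against the
 * program's real main(). */
#ifdef BITMAP_SELF_TEST
int main(void)
{
    assert(count_bits(0x00) == 0);
    assert(count_bits(0xFF) == 8);
    puts("self-test passed");
    return 0;
}
#endif
```

Under these assumptions, something like gcc -DBITMAP_SELF_TEST src/bitmap_index.c && ./a.out would run the tests, while a build without the flag compiles only count_bits.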
Yes, you are misusing #if. It only works on integer constant expressions. But even if you were using if, comparing pointers for equality is never a valid way to compare strings in C.
It seems you can't.
Alternatively, it works perfectly fine on a regular if condition, and gcc can optimize this nicely.
if (!strcmp(__BASE_FILE__, __FILE__)) {
// works.
}
However, you can't define new main functions or use other preprocessor tricks this way. You could short-circuit main by using static functions, but that's harsh and dirty.
But maybe you shouldn't do it. In Ruby/Python, this works because files are loaded at runtime. In C, every file has to be compiled before it can be used.
Keep in mind that most build systems build one file at a time, producing object files and rebuilding them only when necessary. So __BASE_FILE__ and __FILE__ will be equal most of the time in source files, if not always. And I would strongly discourage you from doing this in header files.
It's easier to just put your tests in separate files, only linking them when needed.
Yup, as others say, you're misusing it since you can't compare strings that way in C, and especially not in the preprocessor.
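To make the difference concrete, here is a small sketch with two made-up helpers, same_name and same_pointer. Only the first compares the characters; the second compares addresses, which is what == does on char pointers.

```c
#include <string.h>

/* Returns 1 when both strings hold the same characters, 0 otherwise. */
int same_name(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Returns 1 only when both pointers refer to the same object. */
int same_pointer(const char *a, const char *b)
{
    return a == b;
}
```

Two separate char arrays that both contain "main.c" are equal according to same_name but not according to same_pointer.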
The file that defines int main(int argc, char* argv[]) is the main file. There can be only one such function in an executable.
In addition to what others have said (you can't have the C preprocessor compare strings), be careful with __BASE_FILE__ because it may not correspond to your definition of "main" file. __BASE_FILE__ is the name of the file being compiled, so it's always equal to __FILE__ in source files, and only differs in headers and other included files.
In particular, __BASE_FILE__ is not the name of the file which contains the main() function.

Resources