C compiler option in Bazel CROSSTOOL file - c

How does one set C-only (not C++) compiler flags in the CROSSTOOL file in Bazel?
compiler_flag applies to both C and C++, and cxx_flag applies to C++ code only. What is the corresponding way to set C-only options?
In particular, I need to specify -std=c99 as an option. The only way I know of doing this right now is by passing copts = ["-std=c99"] to every target, which is messy and error-prone.

Looking at the CROSSTOOL protobuf definition, I don't think it's supported. You could write a Skylark macro called "c_library"/"c_binary" or something similar that adds your required copt and then calls cc_library/cc_binary underneath.

Related

How to work around compiler built-in types in C standard header files

I am working on a static analysis tool for C. I need to pass the code being analysed through the C preprocessor so that the tool can see the library function prototypes, type definitions, etc. Unfortunately, with both clang on Mac OS X and gcc on Linux distros, some of the standard header files refer to compiler built-in types like __builtin_va_list that my tool doesn't know about. Does anyone have any suggestions for how to work around this? One possibility, if it's available somewhere, would be a vanilla-flavoured set of header files that produce C that conforms strictly to the standard. The header files don't have to map to any ABI, as the tool doesn't need to compile and run the code: they just have to give the API promised by the C standard. Any suggestions will be gratefully received.
Instead of finding a set of standard header files, you can just use a set of empty files with the expected names and pass the source code through the compiler preprocessor with a -Idirectory option. Your syntax analysis tool should be able to deal with the remaining symbols.
It would be useful to have a preprocessor option in addition to -dI to preserve #include lines instead of handling them.
In the meantime, you can try using the include files from my nolibc repository.

What are Vectors and < > in C?

I was looking at the source code for gcc (out of curiosity), and I noticed a data structure that I've never seen in C before.
At lines 80 and 129 (and many other places) in the parser, they seem to be using vectors.
80: vec<tree> incomplete_record_decls;
129: ridpointers = ggc_cleared_vec_alloc<tree> ((int) RID_MAX);
I've never encountered this data type in C, nor these: < >. Are they native to C?
Does anyone know what they are and how they are used?
Despite the .c filename, this code is not valid C; it is C++, using that language's template feature. If you inspect the gcc build process, you will find that this file is actually compiled with a C++ compiler.
https://gcc.gnu.org/codingconventions.html
The directories gcc, libcpp and fixincludes may use C++03. They may also use the long long type if the host C++ compiler supports it. These directories should use reasonably portable parts of C++03, so that it is possible to build GCC with C++ compilers other than GCC itself. If testing reveals that reasonably recent versions of non-GCC C++ compilers cannot compile GCC, then GCC code should be adjusted accordingly. (Avoiding unusual language constructs helps immensely.) Furthermore, these directories should also be compatible with C++11.
Keep in mind that although compilers will usually by default infer a source file's language from its filename, this default can always be overridden. It is entirely possible to have C++ code in a .c file, or C code in a .bas file for that matter; you just may have to tell the compiler some other way what language is in use.
I expect that gcc chose this file naming convention because this code was originally written in C and later converted to C++, and they found it too much of a pain to change all the filenames. It would mean a lot of work to update all the makefiles, etc. It may have been less of a pain to just change which compiler was used, and to explain the convention to all the developers. Of course, in general it is better programming practice to name your files in the standard way, but apparently the gcc developers felt it was not the best course of action in this case.
GCC has used C++ as its implementation language since GCC 4.8:
GCC now uses C++ as its implementation language. This means that to build GCC from sources, you will need a C++ compiler that understands C++ 2003. For more details on the rationale and specific changes, please refer to the C++ conversion page.
GCC 4.8 Release Series - Changes, New Features, and Fixes
The work had actually begun long before that, with the creation of the gcc-in-cxx branch. The developers first tried to compile the source code with a C++ compiler, so there weren't any name changes. I guess they didn't bother to rename the files later, when the two branches were merged and only one C++ branch officially remained.
You can read "GCC's move to C++" for more historical information.

Use #$? symbols in a C function/variable

I'm curious if there's any way to use #, $, or ? in a C function or variable name. I know that linkers allow them (because of C++ name mangling).
Is there any kind of escape code that could allow this (I don't care how ugly it looks)? Or, in standard C, is this completely impossible?
It is not possible in purely standard C.
But if you are using GCC (or probably Clang/LLVM), you can have $ in identifiers, and you can set the linker name using asm labels.
You could perhaps also play GNU ld tricks with ld scripts.
Standard C does not allow any of these (and it can't really allow ? since it's an operator and thus a separate token). GCC (and possibly compatible compilers) allow $ but not the others. However, you could use the GNU C (GCC) extension for making a declaration for an external-linkage name that references a different underlying symbol name; this may achieve what you want, e.g. if you're trying to reference C++ symbols. I believe the syntax is something like adding __asm__("symbol_name") to the end of the function declaration. There are some examples in the glibc headers on most Linux systems.
Alternatively, if you have dlsym, you could use it to look up the names at runtime.
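The asm-label approach described in the answers above can be sketched as follows. This is only an illustration, assuming GCC or Clang on a target whose assembler accepts $ in symbol names (x86-64 Linux, for example). The names add_one and add$one are made up for the example and do not come from any real library.

```c
#include <assert.h>

/* GNU C extension (not standard C): the asm label sets the symbol name
   that the assembler and linker see. C code keeps calling add_one, but
   the object file exports the symbol "add$one" instead.
   Both names are hypothetical. */
int add_one(int x) __asm__("add$one");

/* The definition picks up the asm label from the declaration above. */
int add_one(int x)
{
    return x + 1;
}
```

Running nm on the resulting object file should list add$one rather than add_one. The same declaration form, without the definition, can be used to refer to a symbol that another object file or library exports under a name that is not a valid C identifier.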

Use cases for -U

Some C compilers provide -D to define a macro on the command line and -U to undefine one (built-in or defined with -D).
I have used -D, but I'm curious about -U. What are the cases where it's useful in practice?
Here's one use case (I'm sure there are others):
Where your C compiler is being called from another application that generates the source for the C compiler, you won't easily get access to the source to modify it by hand (although most such compilers have a "keep C" option, editing generated code by hand is something to avoid). Usually the first compiler will have a bunch of options to set, and also let you pass further options to the C compiler yourself in an "options for the C compiler" argument (for instance, it might do this to let you control C compiler optimisation levels without assuming that the compiler is GCC). And sometimes the options for how to ultimately compile the generated code are controlled by macros built into the C output: since the output doesn't exist at the time you're entering command line options, -U and -D may be the only way to set those flags.
Real-world example: Gambit-C defaults to the option to output one massive C function instead of many separate ones, which (according to the docs) makes it easier for a C compiler to optimise the final code. It actually outputs the same C either way, toggling the behaviour with the __SINGLE_HOST macro. But compiling one huge function can take forever (or just fail) on an older machine, so there needs to be a way to turn this behaviour off. -U__SINGLE_HOST as one of the passed-through arguments to the C compiler can make it possible to actually compile Gambit projects on older computers while still enjoying some level of optimisation.
In this case the behaviour of __SINGLE_HOST could have been handled by the Gambit compiler instead, but while not strictly necessary, it gives more freedom to the person designing the first compiler. Which is always good.
The more generalised version of this answer would be that -U is useful any time your build system passes a bunch of -D arguments, and you don't want all of them; it can un-set default definitions after the system sets them.
I can only think of two cases where this can be useful:
If you have a #ifdef MY_MACRO or #ifndef MY_MACRO in your code, and MY_MACRO is defined (probably built-in, otherwise you could just delete it), and you want to compile without this macro (to change the behaviour of #ifdef)
Or if you want to redefine a macro with a different definition, you "should" undefine it first (I write "should" because the compiler complains if you don't, but everything works fine anyway)
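The #ifdef case can be sketched like this. DEMO_SINGLE_HOST is a hypothetical macro standing in for a default such as __SINGLE_HOST, and the function name is made up for the example. The sketch assumes a GCC-style driver, which processes -D and -U in the order they appear on the command line.

```c
#include <assert.h>

/* Reports which branch of the #ifdef was compiled in.
   DEMO_SINGLE_HOST is a hypothetical macro used only for illustration. */
int single_host_enabled(void)
{
#ifdef DEMO_SINGLE_HOST
    return 1;   /* built with -DDEMO_SINGLE_HOST in effect */
#else
    return 0;   /* macro never defined, or cancelled by a later -U */
#endif
}
```

If a build system always passes -DDEMO_SINGLE_HOST, appending -UDEMO_SINGLE_HOST to the extra compiler options makes the function return 0 without touching the source or the build system's own flags.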

C/C++ Compiler listing what's defined

This question, "Is there a way to tell whether code is now being compiled as part of a PCH?", led me to think about this.
Is there a way, in perhaps only certain compilers, of getting a C/C++ compiler to dump out the defines that it's currently using?
Edit: I know this is technically a pre-processor issue but let's add that within the term compiler.
Yes. In GCC:
g++ -E -dM <file>
I would bet it is possible in nearly all compilers.
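The names that -dM lists are ordinary preprocessor macros, so a program can also test a few of them directly. The following is a small sketch rather than a replacement for the full dump. It relies on __STDC_VERSION__ (standard since C95) and on __GNUC__ and __clang_major__ (defined by GCC-compatible compilers and Clang respectively); the function names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the value of __STDC_VERSION__, or 0 if the compiler does not
   define it (for instance in strict C89 mode). */
long c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Prints a handful of predefined macros, if the compiler provides them. */
void print_some_predefined_macros(void)
{
    printf("__STDC_VERSION__ = %ld\n", c_standard_version());
#ifdef __GNUC__
    printf("__GNUC__ = %d\n", __GNUC__);
#endif
#ifdef __clang_major__
    printf("__clang_major__ = %d\n", __clang_major__);
#endif
}
```

Compiling this with -std=c99 should make c_standard_version() return 199901, which is also a quick way to check that a flag such as -std=c99 actually reached the compiler.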
Boost Wave (a preprocessor library that happens to include a command line driver) includes a tracing capability to trace macro expansions. It's probably a bit more than you're asking for though -- it doesn't just display the final result, but essentially every step of expanding a macro (even a very complex one).
The clang preprocessor is somewhat similar. It's also basically a library that happens to include a command line driver. The preprocessor defines a macro_iterator type and macro_begin/macro_end of that type, that will let you walk the preprocessor symbol table and do pretty much whatever you want with it (including printing out the symbols, of course).
