How is "__builtin_va_list" implemented? - c

I want to delve into the implementation of function "printf" in C on macOS. "printf" uses the <stdarg.h> header file. I open the <stdarg.h> file and find that va_list is just a macro.
So, I am really curious about how the __builtin_va_list is implemented? I know it is compiler-specific. Where can I find the definition of the __builtin_va_list? Should I download the source code of clang compiler?

So, I am really curious about how the __builtin_va_list is implemented?
__builtin_va_list is implemented inside the GCC compiler (or the Clang/LLVM one). So you should study the GCC compiler source code to understand details.
Look into gcc/builtins.def & gcc/builtins.c for more.
I am less familiar with Clang, which implements the same builtin.
But both GCC & Clang are open source or free software. They are complex beasts (several millions lines of code each), so you could need years of work to understand them.
Be aware that the ABI of your compiler matters. Look for example into X86 psABI for more details.
BTW, Grady Player commented:
Pops the correct number of bytes off of the stack for each of those tokens...
Unfortunately, today it is much more complex than that. On current processors and ABIs the calling conventions do use processor registers to pass some arguments (and the evil is in the details).
Should I download the source code of clang compiler?
Yes, and you also need to allocate several years of work to understand the details.
A few years ago, I did write some tutorial slides and links to external documentation regarding GCC implementation, see my GCC MELT documentation page (a bit rotten).

In Clang 9, this is implemented in
clang\lib\AST\ASTContext.cpp
call graph:
getVaListTagDecl
=>getBuiltinVaListDecl
=>CreateVaListDecl
=>Create***BuiltinVaListDecl
for example:
=>CreateCharPtrBuiltinVaListDecl
=>CreateCharPtrNamedVaListDecl
=>buildImplicitTypedef
When there is __builtin_va_list in the preprocessed source, the compiler calls getVaListTagDecl to build a TypedefDecl AST node and insert it into the AST, the typedef doesn't exist in any source code, it is generated dynamically during build, as if there is such in the source:
typedef *** __builtin_va_list;
//for example
typedef char* __builtin_va_list;

This answer, for clang, just show how I find implementation of a builtin function.
I'm interested in implementation of std::atomic<T>. If T is not a trivial type, clang use a lock to guard its atomicity. Look this answer first, I find a builtin function named __c11_atomic_store. The question is, how this builtin function implemented in clang?
Searching Builtin in clang codebase, find in clang/Basic/Builtins.def:
// Some of our atomics builtins are handled by AtomicExpr rather than
// as normal builtin CallExprs. This macro is used for such builtins.
#ifndef ATOMIC_BUILTIN
#define ATOMIC_BUILTIN(ID, TYPE, ATTRS) BUILTIN(ID, TYPE, ATTRS)
#endif
// C11 _Atomic operations for <stdatomic.h>.
ATOMIC_BUILTIN(__c11_atomic_init, "v.", "t")
ATOMIC_BUILTIN(__c11_atomic_load, "v.", "t")
ATOMIC_BUILTIN(__c11_atomic_store, "v.", "t")
ATOMIC_BUILTIN(__c11_atomic_exchange, "v.", "t")
...
The keyword are AtomicExpr and CallExpr. Then I check every caller of AtomicExpr's constructor, but doesn't find any useful information. So I guess, maybe in parse phase, if parser match an builtin function calling, it will construct an CallExpr to AST with builtin flag. In code generate phase, it will emit the implementation.
Check CodeGen, I find the answer in lib/CodeGen/CGBuiltin.cpp and CodeGen/CGAtomic.cpp.
You can check CodeGenFunction::EmitVAArg, I holp that would be useful for you.

Related

`__noinline__` macro conflict between GLib and CUDA

I'm working on an application using both GLib and CUDA in C. It seems that there's a conflict when importing both glib.h and cuda_runtime.h for a .cu file.
7 months ago GLib made a change to avoid a conflict with pixman's macro. They added __ before and after the token noinline in gmacros.h: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2059
That should have worked, given that gcc claims:
You may optionally specify attribute names with __ preceding and following the name. This allows you to use them in header files without being concerned about a possible macro of the same name. For example, you may use the attribute name __noreturn__ instead of noreturn.
However, CUDA does use __ in its macros, and __noinline__ is one of them. They acknowledge the possible conflict, and add some compiler checks to ensure it won't conflict in regular c files, but it seems that in .cu files it still applies:
#if defined(__CUDACC__) || defined(__CUDA_ARCH__) || defined(__CUDA_LIBDEVICE__)
/* gcc allows users to define attributes with underscores,
e.g., __attribute__((__noinline__)).
Consider a non-CUDA source file (e.g. .cpp) that has the
above attribute specification, and includes this header file. In that case,
defining __noinline__ as below would cause a gcc compilation error.
Hence, only define __noinline__ when the code is being processed
by a CUDA compiler component.
*/
#define __noinline__ \
__attribute__((noinline))
I'm pretty new to CUDA development, and this is clearly a possible issue that they and gcc are aware of, so am I just missing a compiler flag or something? Or is this a genuine conflict that GLib would be left to solve?
Environment: glib 2.70.2, cuda 10.2.89, gcc 9.4.0
Edit: I've raised a GLib issue here
It might not be GLib's fault, but given the difference of opinion in the answers so far, I'll leave it to the devs there to decide whether to raise it with NVidia or not.
I've used nemequ's workaround for now and it compiles without complaint.
GCC's documentation states:
You may optionally specify attribute names with __ preceding and following the name. This allows you to use them in header files without being concerned about a possible macro of the same name. For example, you may use the attribute name __noreturn__ instead of noreturn.
Now, that's only assuming you avoid double-underscored names the compiler and library use; and they may use such names. So, if you're using NVCC - NVIDIA could declare "we use noinline and you can't use it".
... and indeed, this is basically the case: The macro is protected as follows:
#if defined(__CUDACC__) || defined(__CUDA_ARCH__) || defined(__CUDA_LIBDEVICE__)
#define __noinline__ __attribute__((noinline))
#endif /* __CUDACC__ || __CUDA_ARCH__ || __CUDA_LIBDEVICE__ */
__CUDA_ARCH__ - only defined for device-side code, where NVCC is the compiler (ignoring clang CUDA support here).
__CUDA_LIBDEVICE__ - Don't know where this is used, but you're certainly not building it, so you don't care about that.
__CUDACC__ defined when NVCC is compiling the code.
So in regular host-side code, including this header will not conflict with Glib's definitions.
Bottom line: NVIDIA is (basically) doing the right thing here and it shouldn't be a real problem.
GLib is clearly in the right here. They check for __GNUC__ (which is what compilers use to indicate compatibility with GNU C, AKA the GNU extensions to C and C++) prior to using __noinline__ exactly as the GNU documentation indicates it should be used: __attribute__((__noinline__)).
GNU C is clearly doing the right thing here, too. Compilers offering the GNU extensions (including GCC, clang, and many many others) are, well, compilers, so they are allowed to use the double-underscore prefixed identifiers. In fact, that's the whole idea behind them; it's a way for compilers to provide extensions without having to worry about conflicts to user code (which is not allowed to declare double-underscore prefixed identifiers).
At first glance, NVidia seems to be doing the right thing, too, but they're not. Assuming you consider them to be the compiler (which I think is correct), they are allowed to define double-underscore prefixed macros such as __noinline__. However, the problem is that NVidia also defines __GNUC__ (quite intentionally since they want to advertise support for GNU extensions), then proceeds to define __noinline__ in an incompatible way, breaking an API provided by GNU C.
Bottom line: NVidia is in the wrong here.
As for what to do about it, well that's a less interesting question but there are a few options. You could (and should) file an issue with NVidia to fix their compiler. In my experience they're pretty good about responding quickly but unlikely to get around to fixing the problem in a reasonable amount of time.
You could also send a patch to GLib to work around the problem by doing something like
#if defined(__CUDACC__)
__attribute__((noinline))
#elif defined(__GNUC__)
__attribute__((__noinline__))
#else
...
#endif
If you're in control of the code which includes glib, another option would be to do something like
#undef __noinline__
#include glib_or_file_which_includes_glib
#define __noinline__ __attribute__((noinline))
My advice would be to do all three, but especially the first one (file an issue with NVidia) and find a way to work around it in your code until NVidia fixes the problem.

paste operator in macros

I found the following snippet of code .
#define f(g,g2) g##g2
main() {
int var12=100;
printf("%d",f(var,12));
}
I understand that this will translate f(var,12) into var12 .
My question is in the macro definition, why didn't they just write the following :
#define f(g,g2) gg2
why do we need ## to concatenate text, rather than concatenate it ourselves ?
If one writes gg2 the preprocessor will perceive that as a single token. The preprocessor cannot understand that that is the concatenation of g and g2.
#define f(g,g2) g##g2
My opinion is that this is poor unreadable code. It needs at least a comment (giving some motivation, explanation, etc...), and a short name like f is meaningless.
My question is in the macro definition, why didn't they just write the following :
#define f(g,g2) gg2
With such a macro definition, f(x,y) would still be expanded to the token gg2, even if the author wanted the expansion to be xy
Please take time to read e.g. the documentation of GNU cpp (and of your compiler, perhaps GCC) and later some C standard like n1570 or better.
Consider also designing your software by (in some cases) generating C code (inspired by GNU bison, or GNU m4, or GPP). Your build machinery (e.g. your Makefile for GNU make) would process that as you want. In some cases (e.g. programs running for hours of CPU time), you might consider doing some partial evaluation and generating specialized code at runtime (for example, with libgccjit or GNU lightning). Pitrat's book on Artificial Beings, the conscience of a conscious machine explains and arguments that idea in an entire book.
Don't forget to enable all warnings and debug info in your compiler (e.g. with GCC use gcc -Wall -Wextra -g) and learn to use a debugger (like GNU gdb).
On Linux systems I sometimes like to generate some (more or less temporary) C code at runtime (from some kind of abstract syntax tree), then compile that code as a plugin, and dlopen(3) that plugin then dlsym(3) inside it. For a stupid example, see my manydl.c program (to demonstrate that you can generate hundreds of thousands of C files and plugins in the same program). For serious examples, read books.
You might also read books about Common Lisp or about Rust; both have a much richer macro system than C provides.

What does gcc -D_REENTRANT really do?

I am writing Java bindings for a C library, and therefore working with JNI. Oracle specifies, reasonably, that native libraries for use with Java should be compiled with multithread-aware compilers.
The JNI docs give the specific example that for gcc, this multithread-awareness requirement should be met by defining one of the macros _REENTRANT or _POSIX_C_SOURCE. That seems odd to me. _REENTRANT and _POSIX_C_SOURCE are feature-test macros. GCC and POSIX documentation describe their effects in terms of defining symbols and making declarations visible, just as I would expect for any feature-test macro.
If I do not need the additional symbols or functions, then do these macros in fact do anything useful for me? Does one or both cause gcc to generate different code than it otherwise would? Do they maybe cause my code's calls to standard library functions to be linked to different implementations? Or is Oracle just talking out of its nether regions?
Edit:
Additionally, it occurs to me that reentrancy is a separate consideration from threading. Non-reentrancy can be an issue even for single-threaded programs, so Oracle's suggestion that defining _REENTRANT makes gcc multithread-aware now seems even more dubious.
The Oracle recommendation was written for Solaris, not for Linux.
On Solaris, if you compiled a .so without _REENTRANT and ended up loaded by a multi-threaded application then very bad things could happen (e.g. random data corruption of libc internals). This was because without the define you ended up with unlocked variants of some routines by default.
This was the case when I first read this documentation, which was maybe 15 years ago, the mention of the -mt flag for the sun studio compiler was added after I last read this document in any detail.
This is no longer the case - You always get the same routine now whether or not you compile with the _REENTRANT flag; it's now only a feature macro, and not a behaviour macro.

Linking libraries built with different preprocessor flags or C standards

Scenario 1:
I want to link an new library (libA) into my program, libA was built using gcc with -std=gnu99 flag, while the current libraries of my program were built without that option (and let's assume gcc uses -std=gnu89 by default).
Scenario 2:
libB was built with some preprocessor flags like "-D_XOPEN_SOURCE -D_XOPEN_SOURCE_EXTENDED" to enable XPG4 features, e.g. msg_control member of struct msghdr. While libC wasn't built without those preprocessor flags, then it's linked against libB.
Is it wrong to link libraries built with different preprocessor flags or C standards ?
My concern is mainly about structure definitions mismatch.
Thanks.
Scenario 1 is completely safe for you. std= option in GCC checks code for compatibility with standard, but has nothing to do with ABI, so you may feel free to combine precompiled code with different std options.
Scenario 2 may be unsafe. I will put here just one simple example, real cases may be much more tricky.
Consider, that you have some function, like:
#ifdef MYDEF
int foo(int x) { ... }
#else
int foo(float x) { ... }
#endif
And you compile a.o with -DMYDEF and b.o without, and function bar from a.o calls function foo in b.o. Next you link it together and everything seems to be fine. Then everything fails in runtime and you may have very hard time debugging why are you passing int from one module, while expecting float on callee side.
Some more tricky cases may include conditionally defined structure fields, calling conventions, global variable sizes.
P.S. Assuming all your sources are written in the same language, varying only std options and macro definitions. Combining C and C++ code is really tricky sometimes, agree with Mikhail.
The few times I have encountered structure definition mismatch was when combining C and C++ code, in these cases there was a clear warning that something terrible was happening.
Something like
/usr/lib/gcc/i586-suse-linux/4.3/../../../../i586-suse-linux/bin/ld: Warning: size of symbol `tree' changed from 324 in /tmp/ccvx8fpJ.o to 328 in gpu.o
See that question.

C/C++ Compiler listing what's defined

This question : Is there a way to tell whether code is now being compiled as part of a PCH? lead me to thinking about this.
Is there a way, in perhaps only certain compilers, of getting a C/C++ compiler to dump out the defines that it's currently using?
Edit: I know this is technically a pre-processor issue but let's add that within the term compiler.
Yes. In GCC
g++ -E -dM <file>
I would bet it is possible in nearly all compilers.
Boost Wave (a preprocessor library that happens to include a command line driver) includes a tracing capability to trace macro expansions. It's probably a bit more than you're asking for though -- it doesn't just display the final result, but essentially every step of expanding a macro (even a very complex one).
The clang preprocessor is somewhat similar. It's also basically a library that happens to include a command line driver. The preprocessor defines a macro_iterator type and macro_begin/macro_end of that type, that will let you walk the preprocessor symbol table and do pretty much whatever you want with it (including printing out the symbols, of course).

Resources