Side effects in generic expressions - C

I'm doing some experiments with the new _Generic keyword and stumbled upon a special case regarding multiple evaluations. See the following:
#include <stdio.h>
#define write_char(c) _Generic(c, char: putchar, const char: putchar)(c)
int main(void)
{
    const char *s = "foo";
    write_char(*s++);
    write_char(*s++);
    write_char(*s++);
    putchar('\n');
}
This compiles fine and produces the expected result with GCC:
$ gcc -std=c11 -Wall plusplus.c -o plusplus
$ ./plusplus
foo
On the other hand, Clang outputs a big honking warning:
$ clang -std=c11 plusplus.c -o plusplus
plusplus.c:9:18: warning: multiple unsequenced modifications to 's'
[-Wunsequenced]
write_char(*s++);
^~
plusplus.c:3:32: note: expanded from macro 'write_char'
#define write_char(c) _Generic(c, char: putchar, const char: putchar)(c)
...
Yet the result is as expected:
$ ./plusplus
foo
I checked the draft of the standard, which says (at p. 97 of the PDF):
The controlling expression of a generic selection is not evaluated.
This seems to precisely address the problem of side-effects in macros (e.g., MIN and MAX).
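To convince myself, I wrote a small illustration (my own sketch, not taken from the standard): the controlling expression x++ is only inspected for its type, so x should never be incremented.
#include <stdio.h>

int main(void)
{
    int x = 0;
    /* The controlling expression x++ supplies only a type to _Generic;
       it is not evaluated, so x should remain 0. */
    const char *t = _Generic(x++, int: "int", default: "other");
    printf("%s %d\n", t, x);   /* expected to print: int 0 */
    return 0;
}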
Now, can I safely ignore Clang's warning, or am I wrong?

As I mentioned in the comments, you posted the question about two weeks after the bug was fixed in Clang's trunk. See revision rL223266 (December 3, 2014). The fix is included in Clang 3.6.
Now, can I safely ignore Clang's warning, or am I wrong?
We already know that you're right, so here is a way to silence the warning in Clang using pragmas, for future reference:
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wunsequenced"
write_char(*s++);
#pragma clang diagnostic pop
To avoid repeating it on every use of the macro, you could put _Pragma in its body:
#define write_char(c) \
    _Pragma("clang diagnostic push") \
    _Pragma("clang diagnostic ignored \"-Wunsequenced\"") \
    _Generic(c, char: putchar, const char: putchar)(c) \
    _Pragma("clang diagnostic pop")

It seems that this was a bug. It has been fixed from Clang 3.6 onward, as shown here.

Related

How to force Werror=declaration-after-statement with -std=c99 in clang

I would like to have the compiler throw an error every time there is a declaration after a statement, because that is the coding style I want to enforce, but I also want to compile with -std=c99 since I use some C99-specific features.
The problem is that in C99, declarations are allowed anywhere in the code, not just at the beginning of a block.
Take a look at the following program:
// prog.c
#include <stdio.h>
int main(void)
{
    printf("hello world\n");
    int i = 0;
    return 0;
}
If I compile this code with gcc like this:
gcc -std=c99 -Werror=declaration-after-statement prog.c
it throws the following error:
prog.c: In function ‘main’:
prog.c:6:9: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
6 | int i = 0;
| ^~~
cc1: some warnings being treated as errors
This is the behavior I would like to have when compiling with clang, but clang behaves differently.
If I compile the same code with clang like this:
clang -std=c99 -Werror=declaration-after-statement prog.c
it throws no errors.
Only if I compile the code with clang like this does it throw the error I want:
clang -std=c90 -Werror=declaration-after-statement prog.c
prog.c:6:6: error: ISO C90 forbids mixing declarations and code [-Werror,-Wdeclaration-after-statement]
int i = 0;
^
1 error generated.
But this is not good for me because I need to use -std=c99.
Is it possible to force -Werror=declaration-after-statement along with -std=c99 when compiling with clang?
Looking at the source code of clang, it seems this is not supported.
The diagnostic is defined in clang/include/clang/Basic/DiagnosticSemaKind.td
def ext_mixed_decls_code : Extension<
  "ISO C90 forbids mixing declarations and code">,
  InGroup<DiagGroup<"declaration-after-statement">>;
And its only usage is in clang/lib/Sema/SemaStmt.cpp
StmtResult Sema::ActOnCompoundStmt(SourceLocation L, SourceLocation R,
                                   ArrayRef<Stmt *> Elts, bool isStmtExpr) {
  const unsigned NumElts = Elts.size();
  // If we're in C89 mode, check that we don't have any decls after stmts. If
  // so, emit an extension diagnostic.
  if (!getLangOpts().C99 && !getLangOpts().CPlusPlus) {
    // Note that __extension__ can be around a decl.
    unsigned i = 0;
    // Skip over all declarations.
    for (; i != NumElts && isa<DeclStmt>(Elts[i]); ++i)
      /*empty*/;
    // We found the end of the list or a statement. Scan for another declstmt.
    for (; i != NumElts && !isa<DeclStmt>(Elts[i]); ++i)
      /*empty*/;
    if (i != NumElts) {
      Decl *D = *cast<DeclStmt>(Elts[i])->decl_begin();
      Diag(D->getLocation(), diag::ext_mixed_decls_code); // <-- here
    }
  }
  ...
Note the !getLangOpts().C99 in the if condition. The diagnostic code will never execute with a standard above C90.
Well, one thing you can surely try is to build clang yourself and delete that part of the condition, so that you end up with if (!getLangOpts().CPlusPlus).
I tried it and it worked for me.
You can configure the clang build with cmake -G "Ninja" -DCMAKE_BUILD_TYPE="Release" -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_C_COMPILER="/usr/bin/gcc" -DCMAKE_CXX_COMPILER="/usr/bin/g++" -DLLVM_PARALLEL_LINK_JOBS=2 -DLLVM_OPTIMIZED_TABLEGEN=ON path/to/llvm-project/llvm

How do I work around the "unknown conversion type character `z' in format" compiler-specific warning?

I'm working on code that is cross-compiled to several target architectures.
I looked at the handful of hits from searching Stack Overflow for the "printf size_t unknown conversion type character" warning, but those posts all seem to be related to MinGW. Those answers, which essentially ifdef against _WIN32, do not apply to my instance of essentially the same problem, i.e. printf not recognizing "%zu" as the format specifier for size_t, but with a MIPS cross-compiler.
Is there an existing compiler flag (for the noted cross-compiler) that enables libc to recognize "%zu" as the format-specifier for size_t?
$ cat ./main.c
// main.c
#include <stdio.h>
int main( int argc, char* argv[] )
{
    size_t i = 42;
    printf( "%zu\n", i );
    return 0;
}
$ /path/to/mips_fp_le-gcc --version
2.95.3
$
$ file /path/to/libc.so.6
/path/to/libc.so.6: ELF 32-bit LSB pie executable, MIPS, MIPS-I version 1 (SYSV), dynamically linked, interpreter /lib/ld.so.1, for GNU/Linux 2.2.15, not stripped, too many notes (256)
$
$ /path/to/mips_fp_le-gcc -mips2 -O2 -EL -DEL -pipe -Wall -Wa,-non_shared -DCPU=SPARC -DLINUX -D_REENTRANT -DPROCESS_AUID -DTAGGING -fPIC -I. -I../../../root/include -I../include -I../../../common/include -I../../../root/include -DDISABLE_CSL_BITE -DDISABLE_DNS_LOOKUP -DOS=UNIX -DLINUX -DPOSIX_THREADS -D__USE_GNU -D_FORTIFY_SOURCE=2 -DHANDLE_CSL_DUPLICATES -DOS=UNIX -DLINUX -DPOSIX_THREADS -D__USE_GNU -D_FORTIFY_SOURCE=2 -DHANDLE_CSL_DUPLICATES -DOS=UNIX -DLINUX -DPOSIX_THREADS -D__USE_GNU -D_FORTIFY_SOURCE=2 -DHANDLE_CSL_DUPLICATES -DOS=UNIX -DLINUX -DPOSIX_THREADS -D__USE_GNU -D_FORTIFY_SOURCE=2 -DHANDLE_CSL_DUPLICATES -o ./main.o -c main.c
main.c: In function `main':
main.c:6: warning: unknown conversion type character `z' in format
main.c:6: warning: too many arguments for format
If the direct answer to the bolded question is "no", what are other possible solutions? Possibilities that come to mind are...
register_printf_function()
Wrap the format-specifier in a target-specific macro (similar to this minGW-specific post)
...any other ideas? I'd have a strong preference for solutions not involving target-specific preprocessor code, for which reason the above two are not ideal.
I think (but am not sure) that the cross-compiler version is old; are newer versions of the noted toolchain known/guaranteed to have a libc that recognizes "%zu" as the format specifier for size_t?
Update: This cross-compiler seems to not recognize -std=c99; adding it to the compiler flags generates the error "cc1: unknown C standard 'c99'"
I work with a big codebase that's compiled under several different compilers, some of which are old and don't understand %z, so we just do things like
printf("size = %d", (int)size);
That's the easy way for small sizes, of course. If the size might be large, other alternatives are
printf("size = %u", (unsigned)size);
or
printf("size = %lu", (unsigned long)size);
(and there are other obvious possibilities as well).
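Putting it together, here is a minimal self-contained sketch of the cast-based approach; the variable names are just for illustration, and which cast to pick depends on how large the value can get.
#include <stdio.h>

int main(void)
{
    size_t size = sizeof(long double);
    /* Three portable alternatives for compilers that don't know %zu. */
    printf("size = %d\n", (int)size);
    printf("size = %u\n", (unsigned)size);
    printf("size = %lu\n", (unsigned long)size);
    return 0;
}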
Your gcc does not support z as a length modifier. It's nothing to do with MIPS, which makes no difference at all, but rather that version 2.95.3 lacks support.
Support for a Z length modifier was added on Feb 9th 1998, commit by Andreas Schwab "c-common.c (format_char_info): Add new field zlen.". There was a gcc extension of Z as a conversion type specifier (rather than length modifier) for size_t before that. This code is in gcc 2.95.3, so it should recognize Z, but not z.
Support for z was added on July 17, 2000 by Joseph Myers, "c-common.c (scan_char_table): Allow "z" length modifiers on diouxXn formats". Despite predating the gcc 2.95.3 release in time, this was on the gcc 3 branch and wasn't released until gcc 3.0. So your ancient compiler simply hasn't got it.
So you could change your code to use Z, which is still supported. You could also define a macro based on compiler version:
#if __GNUC__ < 3
#define PZ "Z"
#else
#define PZ "z"
#endif
Then use it as in printf("The size is %"PZ"u\n", sizeof(int));. You'll still have to modify your code, but the end result is the same: after the preprocessor, the format string would still be %zu on newer compilers and %Zu on old ones. Casting the size_t arguments to something else, on the other hand, will actually change the result of the code, as they will be converted to larger/smaller types in some cases, depending on what size_t is and what you cast to.
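For completeness, here is a minimal self-contained sketch of how the PZ macro might be used; the __GNUC__ < 3 test is an assumption based on the version history described above, so adjust it to match the compilers you actually need to support.
#include <stdio.h>

/* Pick the length modifier based on the (assumed) compiler version:
   old gcc only knows the "Z" extension, newer ones accept the standard "z". */
#if defined(__GNUC__) && __GNUC__ < 3
#define PZ "Z"
#else
#define PZ "z"
#endif

int main(void)
{
    size_t n = sizeof(int);
    printf("The size is %" PZ "u\n", n);
    return 0;
}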
Alternatively, if you can build your toolchain, you could patch gcc to know about z. I think a one line change in the case statement that uses zlen in "c-common.c" would do it.
register_printf_function() is part of glibc, which is where the printf() code lives. It would allow you to extend printf with new formats at run time. There's nothing you can do with it at compile time that will change the compiler, and I don't believe gcc will know that a new format has been added when it does its printf format checking, even when register_printf_function() is used.

C preprocessing fails to stop immediately after an #error

My question today should not be very complicated, but I simply can't find a reason/solution. As a small, reproducible example, consider the following toy C code:
#define _state_ 0
#if _state_ == 1
int foo(void) {return 1;}
#else
/* check GCC flags first
note that -mfma will automatically turn on -mavx, as shown by [gcc -mfma -dM -E - < /dev/null | egrep "SSE|AVX|FMA"]
so it is sufficient to check for -mfma only */
#ifndef __FMA__
#error "Please turn on GCC flag: -mfma"
#endif
#include <immintrin.h> /* All OK, compile C code */
void foo (double *A, double *B) {
    __m256d A1_vec = _mm256_load_pd(A);
    __m256d B_vec = _mm256_broadcast_sd(B);
    __m256d C1_vec = A1_vec * B_vec;
}
#endif
I am going to compile this test.c file by
gcc -fpic -O2 -c test.c
Note that I did not turn on the GCC flag -mfma, so the #error will be triggered. What I would expect is that compilation stops immediately after GCC sees this #error, but this is what I got with GCC 5.3:
test.c:14:2: error: #error "Please turn on GCC flag: -mfma"
#error "Please turn on GCC flag: -mfma"
^
test.c: In function ‘foo’:
test.c:22:11: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
__m256d A1_vec = _mm256_load_pd(A);
^
GCC does stop, but why does it also pick up a line after #error? Any explanation? Thanks.
For people who want to try this, there are some hardware requirements: you need an x86-64 CPU with the AVX and FMA instruction sets.
I have a draft copy of the ISO C spec, and in §4 ¶4 it states:
The implementation shall not successfully translate a preprocessing translation unit containing a #error preprocessing directive unless it is part of a group skipped by conditional inclusion.
Later on, in §6.10.5, where #error is formally defined, it says
A preprocessing directive of the form
# error pp-tokens_opt new-line
causes the implementation to produce a diagnostic message that includes the specified
sequence of preprocessing tokens.
In other words, the spec only requires that code containing an #error directive fail to compile and report an error message along the way, not that compilation terminate immediately as soon as the #error is reached.
Given that it's considered good practice to always check the top-level errors reported by a compiler before later ones, I'd imagine a competent programmer who saw a string of errors beginning with an #error directive would likely know what was going on.
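If the follow-on diagnostics bother you, one possible workaround (a sketch, assuming the function only makes sense with FMA available anyway) is to move the intrinsics into the conditional group, so that when __FMA__ is missing they sit in a skipped group and only the #error is reported:
#ifdef __FMA__
#include <immintrin.h>
void foo(double *A, double *B)
{
    __m256d A1_vec = _mm256_load_pd(A);
    __m256d B_vec  = _mm256_broadcast_sd(B);
    __m256d C1_vec = _mm256_mul_pd(A1_vec, B_vec); /* intrinsic instead of the vector-extension multiply */
    (void)C1_vec;
}
#else
/* The intrinsics above are in a skipped group, so this is the only diagnostic emitted. */
#error "Please turn on GCC flag: -mfma"
#endif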

In which version of C was _Bool type introduced?

Answers to this question on SO specify that _Bool was introduced in C99 (specifically
BobbyShaftoe's answer).
But the following code compiles perfectly with gcc -std=c90 -pedantic test.c and gcc -std=c89 -pedantic test.c both, producing output 1 (my gcc version is 4.7.1)
#include <stdio.h>
#include <stdlib.h>
int main(){
    _Bool a = 0;
    printf("%u\n", sizeof(a));
    return 0;
}
So in which version was _Bool introduced ?
As pointed out by @0xC0000022L, you need to differentiate between the standard and an actual implementation. Interestingly, however, there is some inconsistency in how GCC (and Clang as well) treats _Bool. C99 also introduced the _Complex type, for which there is a diagnostic message:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    printf("%lu\n", (unsigned long) (sizeof(_Complex)));
    return 0;
}
which results in (GCC 4.4.7):
$ gcc -ansi -pedantic-errors check.c
check.c: In function ‘main’:
check.c:5: error: ISO C90 does not support complex types
check.c:5: error: ISO C does not support plain ‘complex’ meaning ‘double complex’
If you examine the source code, you will find that the _Complex type is indeed checked against the standard version, while _Bool is not:
gcc/po/gcc.pot:19940
#: c-decl.c:7366
#, gcc-internal-format
msgid "ISO C90 does not support complex types"
msgstr ""
gcc/c-decl.c:7380
case RID_COMPLEX:
  dupe = specs->complex_p;
  if (!flag_isoc99 && !in_system_header)
    pedwarn (input_location, OPT_pedantic, "ISO C90 does not support complex types");
I wouldn't call it a bug, but rather a decision that was made by developers.
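If you would rather not depend on the compiler flagging it, a sketch of a guard keyed off __STDC_VERSION__ (the typedef name my_bool is made up for this example) lets your own code adapt to pre-C99 translation modes:
/* Use the real boolean type only when the compiler advertises C99 or later;
   fall back to int otherwise. "my_bool" is a made-up name for this sketch. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdbool.h>
typedef bool my_bool;
#else
typedef int my_bool;
#endif

int main(void)
{
    my_bool flag = 1;
    return flag ? 0 : 1;
}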

How can I keep gcc -O2 from optimizing putchar out?

I have an application that uses a custom putchar(), which until now has been working fine.
I bumped up the optimization level of the application to -O2, and now my putchar isn't used.
I already use -fno-builtin, and based on some googling I added -fno-builtin-putchar to my CFLAGS, but that didn't matter.
Is there a "correct" way to get around this or do I have to go into my code and add something like
#define putchar myputchar
to be able to use -O2 and still pull in my own putchar() function?
edit--
Since my original post of this question, I stumbled on -fno-builtin-functions=putchar, as yet another gcc commandline option. Both this and the one above are accepted by gcc, but don't seem to have any noticeable effect.
edit more--
Experimenting further, I see that gcc also swallows -fno-builtin-yadayada, so apparently the option parsing in the gcc front end just passes the text after the second dash to some lower level, which ignores it.
more detail:
Three files try1.c, try2.c and makefile...
try1.c:
#include <stdio.h>
int
main(int argc, char *argv[])
{
    putchar('a');
    printf("hello\n");
    return(0);
}
try2.c:
#include <stdio.h>
int
putchar(int c)
{
printf("PUTCHAR: %c\n",c);
return(1);
}
makefile:
OPT=

try: try1.o try2.o
	gcc -o try try1.o try2.o

try1.o: try1.c
	gcc -o try1.o $(OPT) -c try1.c

try2.o: try2.c
	gcc -o try2.o $(OPT) -c try2.c

clean:
	rm -f try1.o try2.o try
Here's the output:
Notice that without optimization it uses the putchar I provided, but with -O2 it gets it from some other "magic" place...
els:make clean
rm -f try1.o try2.o try
els:make
gcc -o try1.o -c try1.c
gcc -o try2.o -c try2.c
gcc -o try try1.o try2.o
els:./try
PUTCHAR: a
hello
els:
els:
els:
els:make clean
rm -f try1.o try2.o try
els:make OPT=-O2
gcc -o try1.o -O2 -c try1.c
gcc -o try2.o -O2 -c try2.c
gcc -o try try1.o try2.o
els:./try
ahello
els:
Ideally, you should produce an MCVE (Minimal, Complete, Verifiable Example) or
SSCCE (Short, Self-Contained, Correct Example) — two names (and links) for the same basic idea.
When I attempted to reproduce the problem, I created:
#include <stdio.h>
#undef putchar
int putchar(int c)
{
    fprintf(stderr, "%s: 0x%.2X\n", __func__, (unsigned char)c);
    return fputc(c, stdout);
}
int main(void)
{
    int c;
    while ((c = getchar()) != EOF)
        putchar(c);
    return 0;
}
When compiled with GCC 4.9.1 on Mac OS X 10.9.4 under both -O2 and -O3, my putchar function was called:
$ gcc -g -O2 -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Werror pc.c -o pc
$ ./pc <<< "abc"
putchar: 0x61
putchar: 0x62
putchar: 0x63
putchar: 0x0A
abc
$
The only thing in the code that might be relevant to you is the #undef putchar which removes the macro override for the function.
Why try1.c doesn't call your putchar() function
#include <stdio.h>
int
main(int argc, char *argv[])
{
    putchar('a');
    printf("hello\n");
    return(0);
}
The function putchar() may be overridden by a macro in <stdio.h>. If you wish to be sure to call a function, you must undefine the macro.
If you don't undefine the macro, that will override anything you do. Hence, it is crucial that you write the #undef putchar (the other changes are recommended, but not actually mandatory):
#include <stdio.h>
#undef putchar
int main(void)
{
    putchar('a');
    printf("hello\n");
    return(0);
}
Note that putchar() is a reserved symbol. Although in practice you will get away with using it as a function, you have no grounds for complaint if you manage to find an implementation where it does not work. This applies to all the symbols in the standard C library. Officially, therefore, you should use something like:
#include <stdio.h>
#undef putchar
extern int put_char(int c); // Should be in a local header
#define putchar(c) put_char(c) // Should be in the same header
int main(void)
{
    putchar('a');
    printf("hello\n");
    return(0);
}
This allows you to leave your 'using' source code unchanged (apart from including a local header — but you probably already have one to use). You just need to change the implementation to use the correct local name. (I'm not convinced that put_char() is a good choice of name, but I dislike the my_ prefix, for all it is a common convention in answers.)
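The renamed implementation itself could then be as simple as the following sketch, which just mirrors the asker's try2.c under the put_char name suggested above:
/* put_char.c - sketch of the renamed implementation (mirrors try2.c above) */
#include <stdio.h>

int put_char(int c)
{
    printf("PUTCHAR: %c\n", c);
    return 1;
}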
ISO/IEC 9899:2011 §7.1.4 Use of library functions
Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: …

Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro.185) The use of #undef to remove any macro definition will also ensure that an actual function is referred to. Any invocation of a library function that is implemented as a macro shall expand to code that evaluates each of its arguments exactly once, fully protected by parentheses where necessary, so it is generally safe to use arbitrary expressions as arguments.186) Likewise, those function-like macros described in the following subclauses may be invoked in an expression anywhere a function with a compatible return type could be called.187)

185) This means that an implementation shall provide an actual function for each library function, even if it also provides a macro for that function.

186) Such macros might not contain the sequence points that the corresponding function calls do.

187) Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify

#define abs(x) _BUILTIN_abs(x)

for a compiler whose code generator will accept it.

In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write

#undef abs

whether the implementation's header provides a macro implementation of abs or a built-in implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.
Judging from what you observe, in one set of headers putchar() is not defined as a macro (it does not have to be, but it may be). Switching compilers/libraries means that putchar() is now defined as a macro, and the missing #undef putchar means that things no longer work as before.
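For completeness, here is a minimal sketch of the parenthesized-call technique from the quoted passage; it suppresses any function-like macro definition of putchar without an #undef:
#include <stdio.h>

int main(void)
{
    /* Because the name is not immediately followed by '(' at macro-expansion
       time, a function-like macro definition of putchar is not expanded and
       the actual function is called. */
    (putchar)('a');
    (putchar)('\n');
    return 0;
}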
