C compiler preprocessor output - c

C compilers support generating a preprocessor output file with the .i extension.
As far as I know, this is true for Microsoft (Visual Studio), ARM, Keil and some GNU compilers.
They usually use the compiler switch -E or -P for that.
There's also the compiler switch -C to retain comments.
Is the creation of preprocessor files a standard in ANSI-C, or is this compiler specific?
Is the option -C also a standard?
EDIT:
To be more precise: This is about the support for creation of the .i file, not the compiler switch syntax or names.

It is not standardized in the ISO C standard.
However most compilers seem to have -E for generating prepro output. This is reasonable as the prepro output is often very useful for debugging.
Here is a list of compilers I checked:
gcc
pcc
clang
ctc (TriCore)
Tasking C166
Wind River (DIAB)
All these compilers allow writing the prepro output to any file (with any extension). The .i is definitely not standard.
The option -C for retaining comments seems to be rather specific.
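To make it concrete what ends up in the preprocessor output, here is a small translation unit you could run through -E. It is only a sketch: the file name demo.c, the macros LIMIT and SQUARE and the function square_of are made up for illustration, and the exact layout of the output differs from compiler to compiler.

```c
/* demo.c: hypothetical file name, used only for illustration. */
#include <assert.h>

/* An object-like and a function-like macro; both are expanded away
   during preprocessing. 181 * 181 fits even in a 16-bit int. */
#define LIMIT 181
#define SQUARE(x) ((x) * (x))

/* Comments like this one are dropped from the preprocessor output
   unless the compiler is told to retain them. */
int square_of(int n)
{
    /* Precondition keeps the multiplication within the range of int. */
    assert(n >= -LIMIT && n <= LIMIT);
    return SQUARE(n); /* after preprocessing: return ((n) * (n)); */
}
```

With gcc or clang, something like gcc -E demo.c -o demo.i writes the expanded text to demo.i. The file name after -o is your choice, which is why the .i extension is a convention rather than a rule.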

The C programming language standard doesn't specify anything about how compilers are invoked.
It might be an "ad hoc" standard, but it's not something that is controlled in any official fashion "globally". There are local standards, such as POSIX that can specify things like these, but that would of course not cover compilers implemented for non-POSIX environments.

Related

What are Vectors and < > in C?

I was looking at the source code for gcc (out of curiosity), and I noticed a data structure that I've never seen in C before.
At line 80 and 129 (and many other places) in the parser, they seem to be using vectors.
80: vec<tree> incomplete_record_decls;
129: ridpointers = ggc_cleared_vec_alloc<tree> ((int) RID_MAX);
I've never encountered this data type in C, nor these: < >. Are they native to C?
Does anyone know what they are and how they are used?
Despite the .c filename, this code is not valid C; it is C++, using that language's template feature. If you inspect the gcc build process, you will find that this file is actually compiled with a C++ compiler.
https://gcc.gnu.org/codingconventions.html
The directories gcc, libcpp and fixincludes may use C++03. They may also use the long long type if the host C++ compiler supports it. These directories should use reasonably portable parts of C++03, so that it is possible to build GCC with C++ compilers other than GCC itself. If testing reveals that reasonably recent versions of non-GCC C++ compilers cannot compile GCC, then GCC code should be adjusted accordingly. (Avoiding unusual language constructs helps immensely.) Furthermore, these directories should also be compatible with C++11.
Keep in mind that although compilers will usually by default infer a source file's language from its filename, this default can always be overridden. It is entirely possible to have C++ code in a .c file, or C code in a .bas file for that matter; you just may have to tell the compiler some other way what language is in use.
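A quick way to check which language a file is really being compiled as, whatever its name, is the predefined macro __cplusplus. The following is a minimal sketch; the function name language_in_use is made up for illustration.

```c
#include <assert.h>

/* Reports the language the compiler was actually operating in,
   regardless of what the source file happens to be called. */
const char *language_in_use(void)
{
#ifdef __cplusplus
    return "C++";   /* __cplusplus is predefined by C++ compilers */
#else
    return "C";
#endif
}
```

Built normally as a .c file this returns "C". Forcing the language with the driver's -x option (for example gcc -x c++) would make the same file report "C++".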
I expect that gcc chose this file naming convention because this code was originally written in C and later converted to C++, and they found it too much of a pain to change all the filenames. It would mean a lot of work to update all the makefiles, etc. It may have been less of a pain to just change which compiler was used, and to explain the convention to all the developers. Of course, in general it is better programming practice to name your files in the standard way, but apparently the gcc developers felt it was not the best course of action in this case.
GCC has moved from C to C++ since GCC 4.8
GCC now uses C++ as its implementation language. This means that to build GCC from sources, you will need a C++ compiler that understands C++ 2003. For more details on the rationale and specific changes, please refer to the C++ conversion page.
GCC 4.8 Release Series - Changes, New Features, and Fixes
The work had actually begun long before that, with the creation of the gcc-in-cxx branch. The developers first tried to compile the source code with a C++ compiler, so there weren't any name changes. I guess they didn't bother to rename the files later when merging the two branches and officially having only one C++ branch.
You can read GCC's move to C++ for more historical information.

what gcc compiler options can I use for gfortran

I studied Option Summary for gfortran but found no compiler option to detect integer overflow. Then I found the GCC (GNU Compiler Collection) flag option -fsanitize=signed-integer-overflow here and used it when invoking gfortran. It works--integer overflow can be detected at run time!
So what does -fsanitize=signed-integer-overflow do here? Just adding to the machine code generated by gfortran some machine-level pieces that check integer overflow?
What is the relation between GCC (GNU Compiler Collection) flag options and gfortran compiler options? What gcc compiler options can I use for gfortran, g++ etc.?
There is the GCC - GNU Compiler Collection. It shares the common backend and middleend and has frontends for different languages. For example frontends for C, C++ and Fortran which are usually invoked by commands gcc, g++ and gfortran.
It is actually more complicated: you can call gcc on a Fortran source and gfortran on a C source, and it will work almost the same, with the exception of the libraries being linked (there are some other fine points). The appropriate frontend will be called based on the file extension or the language requested.
You can use almost all GCC (not just gcc) flags with all of the mentioned frontends. There are certain flags which are language specific. If you pass one of those for the wrong language, you will normally get a warning like
gfortran -fcheck=all source.c
cc1: warning: command line option ‘-fcheck=all’ is valid for Fortran but not for C
but the file will compile fine, the option is just ignored and you will get a warning about that. Notice it is a C file and it is compiled by the gfortran command just fine.
The sanitization options are AFAIK not that language specific and work for multiple languages implemented in GCC, maybe with some exceptions for some obviously language-specific checks. In particular, -fsanitize=signed-integer-overflow, which you ask about, works perfectly fine for both C and C++. Signed integer overflow is undefined behaviour in C and C++, and it is not allowed by the Fortran standard (which effectively means the same, Fortran just uses different words).
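Conceptually, the sanitizer makes the compiler emit a check around each signed arithmetic operation. The C function below is only a hand-written sketch of that idea for addition, not the code GCC actually generates; the name checked_add and its return convention are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Adds a and b. Returns 1 and stores the sum in *out on success;
   returns 0 and leaves *out untouched if the sum would overflow. */
int checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    /* Test against the limits before adding, so the overflowing
       addition itself is never executed. */
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return 0; /* this is where a sanitizer would report a runtime error */
    }
    *out = a + b;
    return 1;
}
```

The instrumented program does roughly this for you at every signed operation, which is why the option is not tied to one particular frontend.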
This isn't a terribly precise answer to your question, but an aha! moment, when learning about compilers, is learning that gcc (the GNU Compiler Collection), like llvm, is an example of a three-stage compiler.
The ‘front end’ parses the syntax of whichever language you're interested in, and spits out an Abstract Syntax Tree (AST), which represents your program in a language-independent way.
Then the ‘middle end’ (terrible name, but ‘the clever bit’) reorganises that AST into another AST which is semantically equivalent but easier to turn into machine code.
Then the ‘back end’ turns that reorganised AST into assembler for one-or-other processor, possibly doing platform-specific micro-optimisations along the way.
That's why the (huge number of) gcc/llvm options are unexpectedly common to (apparently wildly) different languages. A few of the options are specific to C, or Fortran, or Objective-C, or whatever, but the majority of them (probably) are concerned with the middle and last bits, and so are common to all of the languages that gcc/llvm supports.
Thus the various options are specific to stage 1, 2 or 3, but may not be conveniently labelled as such; with this in mind, however, you might reasonably intuit what is and isn't relevant to the particular language you're interested in.
(It's for this sort of reason that I will dogmatically claim that CC++FortranJavaPerlPython is essentially a single language, with only trivial syntactical and library minutiae to distinguish between dialects).

What is the difference between the "c99" and "gcc" commands with appropriate flags?

Up until today I always read on the Internet how gcc is the best compiler for C (at least for the student level of programming, followed closely by Clang).
However in "21st Century C" Mr Ben Klemens suggests that c99 is better(?) than running gcc -std=c99 (actual line is [page 11]: everybody else switched to C99 being the default a long time ago...)
I wasn't able to find anything on the subject of c99 compiler, so my question is:
Is there any difference between those commands and if there are, which one is better?
EDIT: The C99 standard is clearly mentioned in the paragraph; however, from the beginning the suggested method of compiling is the command:
gcc erf.c -o erf -lm -g -Wall -O3 -std=gnu11
However on page 11 the author states:
The POSIX standard specifies that c99 be present on your system, so the
compiler-agnostic version of the above line would be:
c99 erf.c -o erf -lm -g -Wall -O3
This seems to suggest there is a difference in those 2 commands. I wasn't able to find any additional info nor was it clear to me from the text, what the second line is exactly (no man page for c99 on my Cygwin either).
C99 is the 1999 edition of the ISO C standard. It replaced the 1990 standard, and has been (officially, at least) replaced by the 2011 standard.
What you're asking about, though, is the c99 command (I've updated your question's title to clarify that).
POSIX specifies a c99 command. The requirements are documented here. It is "an interface to the standard C compilation system".
On typical Linux systems, the c99 command /usr/bin/c99 is a small shell script that invokes the gcc command. It invokes gcc with the -std=c99 option. It also checks whether the user has already specified an equivalent option, so it doesn't use the same option twice. If an incompatible option has been given, such as c99 -std=c90, it terminates with an error message.
Given such an implementation, the command
c99 [args]
is exactly equivalent to
gcc -std=c99 [args]
As I mentioned above, the C99 standard has been officially superseded by the C11 standard. gcc version 5 (the current latest release is 5.3.1) has reasonably good support but not 100% complete support for C11. POSIX has not (yet) specified a c11 command.
There's nothing wrong with using the C99 standard if you don't need C11-specific features -- or even the C90 standard if you don't need C99-specific features.
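If you want to see which standard a given command (c99, gcc, gcc -std=...) actually selects, the predefined macro __STDC_VERSION__ tells you. A minimal sketch, with the function name c_standard_in_use made up for illustration:

```c
#include <assert.h>

/* Returns a short name for the C standard the compiler is following,
   based on the value of __STDC_VERSION__. */
const char *c_standard_in_use(void)
{
#if !defined(__STDC_VERSION__)
    return "C90";                   /* the macro did not exist yet in C90 */
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C90 with Amendment 1";  /* 199409L */
#endif
}
```

Compiling the same file once with c99 and once with plain gcc, and printing the result, should show whether the two commands differ on your system.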
In my PDF copy of the book, the discussion about using c99 instead of gcc -std=c99 seems to be on page 10, not 11.
And what is being discussed is not that c99 is "better" than gcc, but that you might be able to more easily use C99-standard compiler features with the c99 command, since you don't then need to know the specific option to enable C99 features or whether the default for the compiler is C99 or C89.
On my system, the command c99 is just an alias or link for gcc that has the -std=c99 set by default (and complains if a non-C99 standard is specified with the -std= option). I imagine that or something similar is true on most systems with a c99 compiler command.
In fact, on my system c99 is a link to a shell script:
#! /bin/sh
# Call the appropriate C compiler with options to accept ANSI/ISO C
# The following options are the same (as of gcc-3.3):
#   -std=c99
#   -std=c9x
#   -std=iso9899:1999
#   -std=iso9899:199x
extra_flag=-std=c99
for i; do
    case "$i" in
        -std=c9[9x]|-std=iso9899:199[9x])
            # The user already asked for C99; don't add the flag twice
            extra_flag=
            ;;
        -std=*|-ansi)
            # Any other standard contradicts the purpose of this command
            echo >&2 "`basename $0` called with non ISO C99 option $i"
            exit 1
            ;;
    esac
done
# Run gcc with the extra flag (if any) followed by the original arguments
exec gcc $extra_flag ${1+"$@"}
Try c99 --version on a typical Linux box. You will get the version and name of the compiler which is gcc.
c99 is just a shortcut to the c99 compliant compiler on your machine. That way you don't have to care about the actual compiler used. POSIX also requires some common command line options the compiler has to understand. If that is gcc, it shall enable c99 compliant features. This should be identical to gcc -std=c99.
gcc provides additional features which are enabled by default [1] when called by its native name and by the -std=gnuXX option in addition to the CXX standard. For older versions, some of these extensions became part of the next C standard either directly or with slightly different syntax. A typical and appreciated extension for C90 is support for C++-style line-comments:
// this is not allowed in pure C90
For c99/gnu99 things are less obvious, but the GNU mode might still add some useful features.
On other POSIX systems, e.g. Unix, you may find a different compiler. It shall still be available by the name c99.
Note that the current and only valid C standard is C11 since 2011. So if you want to use the new features (e.g. atomics, thread-support), you have to deviate from the pure POSIX-path. Yet it is likely POSIX might be updated some day.
[1] The default version of the C standard depends on the version of gcc. Versions before 5 defaulted to gnu90 (C90 plus GNU extensions), while gcc 5.x defaults to gnu11.

Using c89 in Xcode

Is there any way to compile C code with c89 standard NOT c99 in Xcode (or another way with terminal)?
I've searched in Xcode settings but I didn't find any way to choose compiler or standard.
You should add -pedantic-errors to Other C Flags in your project settings.
Of course, don't forget to set the C language dialect to C89 as well.
This will give you the appropriate compile time errors when you try to compile something that is not valid C89.
Optionally, if you want Xcode to compile your code regardless of incompatibilities, but only give you yellow warnings at the problematic lines, use -pedantic instead of -pedantic-errors.
In a nutshell, these flags make the compiler stick to the language standard more strictly, as opposed to the default behavior, which is to attempt compiling the code any way possible.
I hope this helps :)
Source
(even though they mention this in the context of GCC, but the same flags apply for Clang as well)
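For reference, here is what code written to pass a strict C89 build tends to look like. This is just a sketch (the function sum_c89 is made up for illustration); the comments point out C99 habits that a compiler in C89 mode with the pedantic flags would typically complain about.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first n elements of values, written in strict C89 style. */
long sum_c89(const int *values, size_t n)
{
    /* C89: all declarations come first in a block. */
    long total = 0;
    size_t i;

    assert(values != NULL || n == 0);

    /* C99 would allow declaring i inside the for statement; C89 does not. */
    for (i = 0; i < n; i++) {
        total += values[i];
    }
    return total; /* only block comments are used: line comments are not C89 */
}
```

If this compiles cleanly with the dialect set to C89 and -pedantic-errors, but a variant that declares i inside the for statement does not, the settings are being applied.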

Is there any C preprocessor as an independent program?

I know that C preprocessor exists as part of compiler. But I'm looking for an independent program. Is there any such tool?
It's often called cpp. For example, on my Linux box:
CPP(1) GNU CPP(1)
NAME
cpp - The C Preprocessor
SYNOPSIS
cpp [-Dmacro[=defn]...] [-Umacro]
[-Idir...] [-iquotedir...]
[-Wwarn...]
[-M|-MM] [-MG] [-MF filename]
[-MP] [-MQ target...]
[-MT target...]
[-P] [-fno-working-directory]
[-x language] [-std=standard]
infile outfile
This particular one is part of gcc and is available for a wide variety of platforms.
mcpp.
From the homepage:
mcpp is a C/C++ preprocessor with the following features.
Implements all of C90, C99 and C++98 specifications.
Provides a validation suite to test C/C++ preprocessor's conformance and quality comprehensively. When this validation suite is applied, mcpp distinguishes itself among many existing preprocessors.
Has plentiful and on-target diagnostics to check all the preprocessing problems such as latent bug or lack of portability in source code.
Has #pragma directives to output debugging information.
Is portable and has been ported to many compiler-systems, including GCC and Visual C++, on UNIX-like systems and Windows.
Has various behavior modes.
Can be built either as a compiler-specific preprocessor to replace the resident preprocessor of a particular compiler system, or as a compiler-independent command, or even as a subroutine called from some other main program.
Provides comprehensive documents both in Japanese and in English.
Is an open source software released under BSD-style-license.
You can also have a look at m4
What is m4?
M4 can be called a “template language”, a “macro language” or a “preprocessor language”. The name “m4” also refers to the program which processes texts in this language: this “preprocessor” or “macro processor” takes as input an m4 template and sends this to the output, after acting on any embedded directives, called macros.
I've used filepp for preprocessing files other than straight C. It's a Perl module, so it's pretty portable. It's handy in that you can use all the familiar idioms you are used to, and adds some useful features.
From the web site:
Why filepp and not plain old cpp?
cpp is designed specifically to generate output for the C compiler. Yes, you can use any file type with it, but the output it creates includes loads of blank lines and lines of the style:
# 1 "file.c"
Obviously these lines are very useful to the C-compiler, but no use in say an HTML file. Also, as filepp is written in Perl, it is 8-bit clean and so works on any character set, not just ASCII characters. filepp is also customisable and hopefully more user friendly than cpp.
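The # 1 "file.c" lines mentioned in that quote are line markers: they tell the compiler which original file and line the following text came from, so diagnostics point at the right place. The standard #line directive has the same effect, which the sketch below uses to show the idea. The file name virtual.c and the function names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* From the next line on, the compiler behaves as if it were reading
   line 100 of a file called "virtual.c". */
#line 100 "virtual.c"
int reported_line(void) { return __LINE__; }          /* sits on "line 100" */
const char *reported_file(void) { return __FILE__; }  /* yields "virtual.c" */
```

This is why the markers are useful to a C compiler and just noise in, say, an HTML file. The cpp synopsis above lists a -P option, which asks cpp not to emit them.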
cpp is one such program. It's a separate program called by gcc when compiling.
It is a part of the package, and usually called cpp (C PreProcessor).
which cpp
# /usr/bin/cpp
man cpp
