This question concerns the various types that need to be defined in the required headers of the POSIX/SUS standard.
Some types need to be defined in many header files, and I'm wondering what the correct, compliant way of achieving this is.
For instance, look at the <time.h> header specification:
Inclusion of the header may make visible all symbols from the
<signal.h> header.
This is straightforward: <signal.h> may simply be included from <time.h>.
Now what about this:
The clock_t, size_t, time_t, clockid_t, and
timer_t types shall be defined as described in
<sys/types.h>.
As I understand it, it means we can't simply include <sys/types.h> from <time.h>, as this would expose more symbols than required.
So can anyone confirm this?
Would it break the standard compliance to include <sys/types.h>?
If so, I think the best solution is to create a specific header for each type, so a particular type can be made visible from anywhere, without worrying about other symbols.
Is there any other good solution?
And one last thing: what about the types from the C standard?
Lots of types in the POSIX/SUS specification are integral types, and may need to have fixed width.
So would it be OK, as far as the standard is concerned, to include <stdint.h> from a specific header, or would that break compliance?
You are right that exposing unwanted types and declarations from additional headers would be non-conforming.
One trivial (but, in terms of time the preprocessor spends opening files, expensive) solution is to have a separate header for each type, and whenever you need, for example, time_t, do:
#include <types/time_t.h>
Of course types/time_t.h would have the appropriate multiple-inclusion guards.
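A minimal sketch of such a per-type header (the guard name and the underlying type here are illustrative only; the real definition is implementation-specific):

/* types/time_t.h */
#ifndef TYPES_TIME_T_H
#define TYPES_TIME_T_H

typedef long long time_t;  /* example only; the actual underlying type varies */

#endif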
There are many other ways of achieving the same thing. The approach glibc and gcc use is to have special "needed" macros you can define before including a header, which ask it to do nothing but provide one or more specific types. This solution is also very expensive, probably more so than the above, because it breaks compilers' heuristics for multiple-inclusion guards: the compiler can't elide repeated inclusions; it has to parse the file each time it's included.
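For illustration, the consumer side of that pattern looks roughly like this (the macro and header names are made up, not glibc's actual ones); note that the providing header cannot have one overall include guard, which is exactly what defeats the compiler's caching:

#define __need_time_t            /* ask for just time_t ... */
#include <bits/shared_types.h>   /* ...and nothing else from this header */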
The way we do it in musl is to have a single file, bits/alltypes.h, which includes the definitions of all types which are needed by multiple headers and macros to control which are exposed. It's generated by a simple sed script that hides all the macro logic:
http://git.musl-libc.org/cgit/musl/tree/arch/i386/bits/alltypes.h.in?id=9448b0513e2eec020fbca9c10412b83df5027a16
http://git.musl-libc.org/cgit/musl/tree/include/alltypes.h.in?id=9448b0513e2eec020fbca9c10412b83df5027a16
http://git.musl-libc.org/cgit/musl/tree/tools/mkalltypes.sed?id=9448b0513e2eec020fbca9c10412b83df5027a16
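Roughly, a public header that needs a couple of shared types does something like the following (paraphrased; see the alltypes.h.in links above for the real macro names and the generated logic):

#define __NEED_time_t
#define __NEED_clockid_t
#include <bits/alltypes.h>   /* defines only the types requested via __NEED_* */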
Upon reading this question I couldn't resist checking how this is handled in the largest UNIX-compliant (strictly speaking, only "UNIX-like") open-source project of all time, the Linux kernel (which these days is almost always concerned with standards compliance).
A sample implementation:
The time.h header defines a few internal flags and then includes types.h, which defines ALL the types, albeit inside #ifdefs that check whether the relevant internal flag is defined.
So this boils down to the following:
Define modular headers like time.h
#define the relevant internal flags to establish the context.
Include the "internal" headers that provide dependencies.
Implement the "internal" headers such that they selectively expose functionality based upon the context in which they are #include-ed.
This way modular headers can be provided for inclusion without worrying about accidentally exposing more symbols than required.
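A sketch of that pattern (all names here are made up for illustration, not actual kernel code):

/* time.h -- a "modular" public header */
#define _NEED_TIME_T
#define _NEED_CLOCK_T
#include "internal_types.h"

/* internal_types.h -- exposes only what the including header asked for */
#ifdef _NEED_TIME_T
#ifndef _TIME_T_DEFINED
#define _TIME_T_DEFINED
typedef long long time_t;   /* underlying type is implementation-specific */
#endif
#endif
#ifdef _NEED_CLOCK_T
#ifndef _CLOCK_T_DEFINED
#define _CLOCK_T_DEFINED
typedef long clock_t;
#endif
#endif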
Related
From my understanding, C99's new types such as uint32_t, uint_fast64_t, uintmax_t etc. are defined in <stdint.h>. However, I noticed they're also defined in stdlib.h, and from gnu.org I found out that it is one of many headers checked, but on other websites only <stdint.h> is referenced.
If I use these types while including only <stdlib.h>, which has them defined in my implementation, will my program be portable to other platforms, or could it not work because on another computer they're only defined in <stdint.h>?
My guess is that if I compile the program for every architecture/OS from my computer there won't be any problems, but the compilation could fail on another machine because in that particular implementation the new types are only defined in another header.
Is it necessary to include <stdint.h> to guarantee portability of C99 new types?
Yes, or <inttypes.h>. It's not really about "guaranteeing portability": you include <stdint.h> in order to use those types at all, with any compiler, at any time. Note that uint32_t and the other exact-width intN_t/uintN_t types are optional: they may (potentially) not be available even after including <stdint.h>.
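As an illustration of the optional-type point, one way to check for an exact-width type is to test the limit macro that <stdint.h> defines only when the corresponding type exists:

#include <stdint.h>

#ifdef UINT32_MAX
/* uint32_t is available on this implementation */
typedef uint32_t counter_t;
#else
/* the least-width variants are always required to exist */
typedef uint_least32_t counter_t;
#endif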
If I use these types while including only <stdlib.h>, which has them defined in my implementation, will my program be portable to other platforms, or could it not work because on another computer they're only defined in <stdint.h>?
It could not work.
could fail on another machine because in that particular implementation the new types are only defined in another header.
Yes.
from gnu.org I found out that it is one of many headers checked
The site you referenced is about the build-system configuration used to build the GCC compiler itself; it's not related to what GCC offers when compiling your programs. GCC only targets two's-complement platforms, so you can be sure that <stdint.h> will be available when you are using GCC.
So I'm writing portable embedded ansi C code that is attempting to support multiple compilers and hardware targets. Each compiler/hardware vendor has different math.h functions it supports. Some support only C90, some support a subset of C99, others a full set of C99.
I'm trying to find a way to check at preprocessing time whether a given function exists, so that I can substitute a custom macro if it doesn't. Some vendors declare the functions as extern in math.h, some use #define to remap them to some internal call. Is there a piece of code that can tell whether something is #defined or an extern function? I can use #ifdef for the define, but what about an actual function call?
The usual solution is instead to look at macros defined by the preprocessor itself, or passed into the build process as -D definitions, which identify the compiler and platform you're running on, and use those plus your knowledge of what special assists each environment needs to configure your code.
I suppose you could write a series of test .c files, try compiling them, look at the error codes coming back, and use those to set appropriate -D flags... but I'm not convinced that would be any cleaner.
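As a sketch of the macro-based approach (assuming the only distinction you care about is a C90 versus a C99 math library; real code would usually also key off vendor-specific macros):

#include <math.h>

/* __STDC_VERSION__ reflects the compiler's language level; on many toolchains
   the bundled libm tracks it, but verify this per vendor. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define LOG2(x) log2(x)                 /* C99 provides log2() */
#else
#define LOG2(x) (log(x) / log(2.0))     /* C90 fallback using always-present functions */
#endif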
This is just a general compiler question, directed at C-based languages.
If I have some code that looks like this:
#include "header1.h"
#include "header2.h"
#include "header3.h"
#include "header4.h" //Header where #define BUILD_MODULE is located
#ifdef BUILD_MODULE
//module code to build
#endif //BUILD_MODULE
Will all of the code associated with those headers get built even if BUILD_MODULE is not defined? The compiler just "pastes" the contents of headers, correct? So this would essentially build a useless bunch of header code that just takes up space?
All of the text of the headers will be included in the compilation, but they will generally have little or no effect, as explained below.
C does not have any concept of “header code”. A compilation of the file in the question would be treated the same as if the contents of all the included files appeared in a single file. Then what matters is whether the contents define any objects or functions.
Most declarations in header files are (as header files are commonly used) just declarations, not definitions. They just tell the compiler about things; they do not actually cause objects or code to be created. For the most part, a compiler will not generate any data or code from declarations that are not definitions.
If the headers define external objects or functions, the compiler must generate data (or space) or code for them, because these objects or functions could be referred to from other source files to be compiled later and then linked with the object produced from the current compilation. (Some linkers can determine that external objects or functions are not used and discard them.)
If the headers define static objects or functions (to be precise, objects with internal or no linkage), then a compiler may generate data or code for these. However, the optimizer should see that these objects and functions are not referenced, and therefore generation may be suppressed. This is a simple optimization, because it does not require any complicated code or data analysis, simply an observation that nothing depends on the objects or functions.
So, the C standard does not guarantee that no data or code is generated for static objects or functions, but even moderate quality C implementations should avoid it, unless optimization is disabled.
It depends on the actual compiler. Optimizing compilers will not generate output for unneeded code, whereas dumber compilers will.
gcc (a very common c compiler for open-source platforms) will optimize your code with the -O option, which will not generate unneeded expressions.
Code inside #ifdef blocks whose condition is not satisfied will never generate any output, as doing so would violate the language specification.
Conceptually, at least, include/macro processing is a separate step from compilation. The main source file is read and a new temporary file is constructed containing all the included code. If anything is "#ifdefed out" then that code is not included in the temporary file. At the same time, the occurrences of macro names are replaced with the text they "expand" into. It is that resulting file, with all the includes included, etc, that is fed into the actual compiler.
Some compilers do this literally (and you can even "capture" the intermediate file) while others sort of simulate it (and actually require an entire separate step if you request that the intermediate file be produced). But most compilers have one means or another of producing the file for your examination.
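For example, with gcc the preprocessed intermediate file can be captured like this:
gcc -E source.c -o source.i
The resulting source.i contains the source with all #includes expanded, macros replaced, and #ifdefed-out regions removed.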
The C/C++ standards lay out some rather arcane rules that must be followed to assure that any "simulated" implementation doesn't somehow change the behavior of the resulting code, vs the "literal" approach.
I was reading the man page of sigaction, and I came across the following line.
sigaction(): _POSIX_C_SOURCE >= 1 || _XOPEN_SOURCE || _POSIX_SOURCE
What do _POSIX_C_SOURCE, _XOPEN_SOURCE, and _POSIX_SOURCE mean? What should I do with them?
These are feature test macros. Their purpose is to allow your program to inform the system header files which standards you want it to attempt to conform to, and what extensions you want available.
Without any feature test macros defined, implementations vary a lot in what macros, functions, and type definitions they make visible in their headers. A common practice is to make everything visible by default, which is a problem because "everything" is not very specific, and it's very possible that symbol names used in your program might clash with some of the extensions. Even if they don't clash now, there's no way to know if they will in the future. So the standards (like ISO C and POSIX) put strict requirements on the implementation that it not pollute the application's namespace with names not explicitly defined or reserved in the standards. When you use a feature test macro to request a particular standard, you're asking the implementation to ensure that (1) it provides everything defined in that standard, and (2) it doesn't pollute your application's namespace by providing anything not defined in that standard.
A correct program should always explicitly use the right feature test macros for the standard(s) it's written to. The easiest way to do this is putting the right -D argument on the compiler command line (CFLAGS). Adding the #define as the first line in each source file also works. Be aware if you do it in source files though:
The feature test macros must be defined at the top before any system header is included.
It's usually a bad idea to use different feature test macros in different translation units.
As an aside, it's not exactly the same as the other feature test macros, but all modern programs should define _FILE_OFFSET_BITS=64 when built on Linux/glibc to request that off_t be 64-bit for large file support.
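A minimal sketch of the source-file approach, combining the two points above (the value 200809L requests POSIX.1-2008; adjust to the standard you target):

/* must appear before any #include */
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64

#include <signal.h>
#include <sys/stat.h>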
Here is a man for Feature macros: http://www.kernel.org/doc/man-pages/online/pages/man7/feature_test_macros.7.html
They will turn on or off some level of standard support in the headers.
E.g. _POSIX_C_SOURCE >= 1 means that POSIX.1-1990 should be supported (and values >= 2 add POSIX.2-1992); _XOPEN_SOURCE means POSIX.1, POSIX.2, and XPG4 are enabled; and larger values of _XOPEN_SOURCE (>= 500, >= 600, >= 700) additionally turn on SUSv2, v3, or v4 (UNIX 98; UNIX 03; or POSIX.1-2008 + XSI). _POSIX_SOURCE is an obsolete way of defining _POSIX_C_SOURCE = 1.
They're the things you have to #define to get the prototype, and are known as feature test macros.
For example, the following code will successfully define the prototype for sigaction:
#define _XOPEN_SOURCE
#include <signal.h>
Including signal.h without that #define (or the others) will not define the prototype.
It is a Feature test macro.
Symbols called "feature test macros" are used to control the visibility of symbols that might be included in a header. Implementations, future versions of IEEE Std 1003.1-2001, and other standards may define additional feature test macros.
When defining macros that headers rely on, such as _FILE_OFFSET_BITS, FUSE_USE_VERSION, _GNU_SOURCE among others, where is the best place to put them?
Some possibilities I've considered include
At the top of any source file that relies on definitions exposed by headers included in that file
Immediately before the include for the relevant header(s)
At the CPPFLAGS level via the compiler (such as -D_FILE_OFFSET_BITS=64), for:
Entire source repo
The whole project
Just the sources that require it
In project headers, which should also include those relevant headers to which the macros apply
Some other place I haven't thought of, but is infinitely superior
A note: applicability to make, autotools, and other build systems is a factor in my decision.
If the macros affect system headers, they probably ought to go somewhere where they affect every source file that includes those system headers (which includes those that include them indirectly). The most logical place would therefore be on the command line, assuming your build system allows you to set e.g. CPPFLAGS to affect the compilation of every file.
If you use precompiled headers, and have a precompiled header that must therefore be included first in every source file (e.g. stdafx.h for MSVC projects) then you could put them in there too.
For macros that affect self-contained libraries (whether third-party or written by you), I would create a wrapper header that defines the macros and then includes the library header. All uses of the library from your project should then include your wrapper header rather than including the library header directly. This avoids defining macros unnecessarily, and makes it clear that they relate to that library. If there are dependencies between libraries then you might want to make the macros global (in the build system or precompiled header) just to be on the safe side.
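A sketch of such a wrapper (the header name and the FUSE_USE_VERSION value are just examples; use whatever your code actually targets):

/* my_fuse.h -- project-local wrapper; include this instead of <fuse.h> */
#ifndef MY_FUSE_H
#define MY_FUSE_H

#define FUSE_USE_VERSION 26
#include <fuse.h>

#endif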
Well, it depends.
Most of them I'd define via the command line, in a Makefile or whatever build system you use.
As for _FILE_OFFSET_BITS, I really wouldn't define it explicitly, but rather use getconf LFS_CFLAGS and getconf LFS_LDFLAGS.
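For example (following the command-line style used elsewhere on this page; the compiler and file names are placeholders):
gcc $(getconf LFS_CFLAGS) -c source.c
gcc $(getconf LFS_LDFLAGS) source.o -o prog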
I would always put them on the command line via CPPFLAGS for the whole project. If you put them any other place, there's a danger that you might forget to copy them into a new source file or include a system header before including the project header that defines them, and this could lead to extremely nasty bugs (like one file declaring a legacy 32-bit struct stat and passing its address to a function in another file which expects a 64-bit struct stat).
BTW, it's really ridiculous that _FILE_OFFSET_BITS=64 still isn't the default on glibc.
Most projects that I've seen use them did so via -D command-line options. They are there because that eases building the source with different compilers and system headers. If you were to build with a system compiler for another system that didn't need them, or needed a different set of them, then a configure script can easily change the command-line arguments that a makefile passes to the compiler.
It's probably best to do it for the entire program, because some of the flags affect which version of a function gets brought in or the size/layout of a struct, and mixing those up could cause crazy things if you aren't careful.
They certainly are annoying to keep up with.
For _GNU_SOURCE and the autotools in particular, you could use AC_USE_SYSTEM_EXTENSIONS (citing liberally from the autoconf manual here):
-- Macro: AC_USE_SYSTEM_EXTENSIONS
This macro was introduced in Autoconf 2.60. If possible, enable
extensions to C or Posix on hosts that normally disable the
extensions, typically due to standards-conformance namespace
issues. This should be called before any macros that run the C
compiler. The following preprocessor macros are defined where
appropriate:
_GNU_SOURCE
Enable extensions on GNU/Linux.
__EXTENSIONS__
Enable general extensions on Solaris.
_POSIX_PTHREAD_SEMANTICS
Enable threading extensions on Solaris.
_TANDEM_SOURCE
Enable extensions for the HP NonStop platform.
_ALL_SOURCE
Enable extensions for AIX 3, and for Interix.
_POSIX_SOURCE
Enable Posix functions for Minix.
_POSIX_1_SOURCE
Enable additional Posix functions for Minix.
_MINIX
Identify Minix platform. This particular preprocessor macro
is obsolescent, and may be removed in a future release of
Autoconf.
For _FILE_OFFSET_BITS, you need to call AC_SYS_LARGEFILE and AC_FUNC_FSEEKO:
— Macro: AC_SYS_LARGEFILE
Arrange for 64-bit file offsets, known as large-file support. On some hosts, one must use special compiler options to build programs that can access large files. Append any such options to the output variable CC. Define _FILE_OFFSET_BITS and _LARGE_FILES if necessary.
Large-file support can be disabled by configuring with the --disable-largefile option.
If you use this macro, check that your program works even when off_t is wider than long int, since this is common when large-file support is enabled. For example, it is not correct to print an arbitrary off_t value X with printf("%ld", (long int) X).
The LFS introduced the fseeko and ftello functions to replace their C counterparts fseek and ftell that do not use off_t. Take care to use AC_FUNC_FSEEKO to make their prototypes available when using them and large-file support is enabled.
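Putting those together, a configure.ac fragment might look roughly like this (project name and version are placeholders):

AC_INIT([myproject], [1.0])
AC_USE_SYSTEM_EXTENSIONS
AC_PROG_CC
AC_SYS_LARGEFILE
AC_FUNC_FSEEKO
AC_CONFIG_HEADERS([config.h])
AC_OUTPUT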
If you are using autoheader to generate a config.h, you could define the other macros you care about using AC_DEFINE or AC_DEFINE_UNQUOTED:
AC_DEFINE([FUSE_VERSION], [28], [FUSE Version.])
The definition will then get passed to the command line or placed in config.h, if you're using autoheader. The real benefit of AC_DEFINE is that it easily allows preprocessor definitions as a result of configure checks and separates system-specific cruft from the important details.
When writing the .c file, #include "config.h" first, then the interface header (e.g., foo.h for foo.c - this ensures that the header has no missing dependencies), then all other headers.
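For instance (file names here are just for illustration):

/* foo.c */
#include "config.h"   /* configuration results first, so feature macros are set */
#include "foo.h"      /* this file's own interface header next, proving it is self-contained */

#include <stdio.h>    /* then system and other project headers */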
I usually put them as close as practicable to the things that need them, whilst ensuring you don't set them incorrectly.
Related pieces of information should be kept close to make it easier to identify. A classic example is the ability for C to now allow variable definitions anywhere in the code rather than just at the top of a function:
void something (void) {
// 600 lines of code here
int x = fn(y);
// more code here
}
is a lot better than:
void something (void) {
int x;
// 600 lines of code here
x = fn(y);
// more code here
}
since you don't have to go searching for the type of x, as you would in the latter case.
By way of example, if you need to compile a single source file multiple times with different values, you have to do it with the compiler:
gcc -Dmydefine=7 -o binary7 source.c
gcc -Dmydefine=9 -o binary9 source.c
However, if every compilation of that file will use 7, it can be moved closer to the place where it's used:
source.c:
#include <stdio.h>
#define mydefine 7
#include "header_that_uses_mydefine.h"
#define mydefine 7
#include "another_header_that_uses_mydefine.h"
Note that I've done it twice so that it's more localised. This isn't a problem since, if you change only one, the compiler will tell you about it, but it ensures that you know those defines are set for the specific headers.
And, if you're certain that you will never include (for example) bitio.h without first setting BITCOUNT to 8, you can even go so far as to create a bitio8.h file containing nothing but:
#define BITCOUNT 8
#include "bitio.h"
and then just include bitio8.h in your source files.
Global, project-wide constants that are target specific are best put in CCFLAGS in your makefile. Constants you use all over the place can go in appropriate header files which are included by any file that uses them.
For example,
// bool.h - a boolean type for C
#ifndef BOOL_H
#define BOOL_H
typedef int bool_t;
#define TRUE 1
#define FALSE 0
#endif
Then, in some other header,
`#include "bool.h"`
// blah
Using header files is what I recommend because it allows you to have a code base built by make files and other build systems as well as IDE projects such as Visual Studio. This gives you a single point of definition that can be accompanied by comments (I'm a fan of doxygen which allows you to generate macro documentation).
The other benefit with header files is that you can easily write unit tests to verify that only valid combinations of macros are defined.