As of C99, C supports implementation-defined extended integer types (6.2.5 p7). Does any implementation actually implement an extended integer type?
I'm aware of gcc's __int128, which is currently treated as a language extension and is not formally listed as an extended integer type in gcc's documentation of implementation-defined behavior (J.3.5). I couldn't find anything mentioned in the documentation for clang or MSVC. Solaris states that there are no extended integer types.
There is some related discussion at What are “extended integer types”?, but the only other candidate mentioned is __int64 in an older version of MSVC, and the comments seem to agree that it's not a formal extended integer type due to that version of MSVC being C90.
Various processors have a 24-bit width for instructions and constant memory. Compilers supporting such Microchip processors offer (u)int24_t.
One such compiler's documentation states: "int24_t types added to C99: The int24_t and uint24_t types (along with the existing __int24 and __uint24 types) are now available when using the C99 library and when CCI is not active."
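A tiny usage sketch, assuming a Microchip XC8-style toolchain whose <stdint.h> actually declares these types (the variable name is made up):

#include <stdint.h>   /* assumed to declare int24_t/uint24_t on this toolchain */

/* a 24-bit program-memory address fits exactly in uint24_t */
static uint24_t flash_address = 0x01FFFFUL;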
Even though some compilers do offer 128-bit integer types, if such a type were an extended integer type, the C library would require (u)intmax_t to be at least that wide (C11 draft 7.20.1.5).
C also requires preprocessor arithmetic to be done in intmax_t/uintmax_t.
I suspect compilers offering intN (N > 64) do so as a language extension.
I know of no compiler where (u)int128_t exists as an extended integer type.
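A quick way to see the distinction on an x86-64 GCC or Clang target (a sketch; __SIZEOF_INT128__ is only predefined where the __int128 extension exists):

#include <stdint.h>
#include <stdio.h>

int main(void) {
#ifdef __SIZEOF_INT128__
    /* __int128 is a language extension, not an extended integer type */
    printf("sizeof(__int128) = %zu\n", sizeof(__int128));
#endif
    /* if __int128 were an extended integer type, intmax_t would have to
       be at least as wide; on these targets it is typically still 8 */
    printf("sizeof(intmax_t) = %zu\n", sizeof(intmax_t));
    return 0;
}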
Related
C99 (and later standards) requires certain types to be available in the header <stdint.h>. The exact-width types, e.g., int8_t, int16_t, etc., are optional, and the standard explains why that is.
But the uintptr_t and intptr_t types are also optional, and I don't see a reason for them being optional instead of required.
On some platforms, pointer types are much larger than any integer type. An example of such a platform is the IBM AS/400, whose virtual instruction set defines all pointers as 128-bit. A more recent example is Elbrus, which uses 128-bit pointers that are hardware descriptors rather than ordinary addresses.
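One practical consequence of the optionality: a program can detect at compile time whether uintptr_t exists, because the corresponding limit macro is only defined when the type is provided (C11 7.20.2.4). A minimal sketch:

#include <stdint.h>
#include <stdio.h>

int main(void) {
#ifdef UINTPTR_MAX
    /* UINTPTR_MAX is defined only if uintptr_t is provided */
    puts("this implementation provides uintptr_t");
#else
    puts("no uintptr_t; pointers may be wider than any integer type");
#endif
    return 0;
}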
I have this MISRA C:2004 violation: "typedefs that indicate size and signedness should be used in place of the basic types".
For example, in the following piece of code, I did not understand the right way to resolve the violation:
static int handlerCalled = 0;

int llvm_test_diagnostic_handler(void) {
    LLVMContextRef C = LLVMGetGlobalContext();
    /* diagnosticHandler: a callback defined elsewhere (not shown) */
    LLVMContextSetDiagnosticHandler(C, &diagnosticHandler, &handlerCalled);
The MISRA rule is aimed at the fact that C does not define the exact size, range, or representation of its standard integer types. The stdint.h header mitigates this issue by providing several families of typedefs expressing the implementation-supported integer types that provide specific combinations of signedness, size, and representation. Each C implementation provides a stdint.h header appropriate for that implementation.
You should comply with the MISRA rule by using the types defined in your implementation's stdint.h header, choosing the types that meet your needs from among those it actually supports (or those you expect it to support). For example, if you want a signed integer type exactly 32 bits wide, with no padding bits, and expressed in two's complement representation, then that is int32_t -- if your implementation provides that at all (it would be surprising, but not impossible, for such a type not to be available).
For example,
#include <stdint.h>
// relies on the 'int32_t' definition from the above header:
static int32_t handlerCalled = 0;
The point I was raising in my comment was that you seemed to say that you not only included the header, but also defined your own typedef for uint32_t. You must not define your own typedef for this or other types in the scope of stdint.h. At best it is redundant to do so, but at worst it satisfies the MISRA checker yet breaks your code.
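To illustrate the risk, a hypothetical anti-example (do not copy this): on an LP64 platform where long is 64 bits wide, a hand-rolled typedef like the following silently has the wrong width and conflicts with the <stdint.h> definition, even though it may still pacify a MISRA checker:

/* WRONG: the name belongs to <stdint.h>, and on an LP64 platform this
   "uint32_t" would actually be 64 bits wide */
typedef unsigned long uint32_t;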
Some background:
the header stdint.h has been part of the C standard since C99. It provides typedefs that are guaranteed to be exactly 8, 16, 32, and 64 bits wide, both signed and unsigned. This header is not part of the C89 standard, though, and I haven't yet found any straightforward way to ensure that my datatypes have a known width.
Getting to the actual topic
The following code is how SQLite (written in C89) defines 64-bit integers, but I don't find it convincing. That is, I don't think it's going to work everywhere. Worst of all, it could fail silently:
/*
** CAPI3REF: 64-Bit Integer Types
** KEYWORDS: sqlite_int64 sqlite_uint64
**
** Because there is no cross-platform way to specify 64-bit integer types
** SQLite includes typedefs for 64-bit signed and unsigned integers.
*/
#ifdef SQLITE_INT64_TYPE
typedef SQLITE_INT64_TYPE sqlite_int64;
typedef unsigned SQLITE_INT64_TYPE sqlite_uint64;
#elif defined(_MSC_VER) || defined(__BORLANDC__)
typedef __int64 sqlite_int64;
typedef unsigned __int64 sqlite_uint64;
#else
typedef long long int sqlite_int64;
typedef unsigned long long int sqlite_uint64;
#endif
typedef sqlite_int64 sqlite3_int64;
typedef sqlite_uint64 sqlite3_uint64;
So, this is what I've been doing so far:
Checking that the "char" data type is 8 bits wide, since that is not guaranteed. If the macro CHAR_BIT is not equal to 8, compilation fails.
Now that "char" is known to be 8 bits wide, I create a struct containing an array of several unsigned chars, which correspond to the bytes of the integer.
I write "operator" functions for my datatypes: addition, multiplication, division, modulo, conversion from/to string, etc. (a simplified sketch follows).
I have abstracted this process in a header file, which is the best I can do with what I know, but I wonder if there is a more straightforward way to achieve this.
I'm asking because I want to write a portable C library.
First, you should ask yourself whether you really need to support implementations that don't provide <stdint.h>. It was standardized in 1999, and even many pre-C99 implementations are likely to provide it as an extension.
Assuming you really need this, Doug Gwyn, a member of the ISO C standard committee, created an implementation of several of the new headers for C9x (as C99 was then known), compatible with C89/C90. The headers are in the public domain and should be reasonably portable.
http://www.lysator.liu.se/(nobg)/c/q8/index.html
(As I understand it, the name "q8" has no particular meaning; he just chose it as a reasonably short and unique search term.)
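If you only need a couple of widths, another common pattern is to pick the underlying type yourself from <limits.h>; a C89-compatible sketch (my_int32/my_uint32 are placeholder names):

#include <limits.h>

#if UINT_MAX == 0xFFFFFFFFUL
typedef int my_int32;
typedef unsigned int my_uint32;
#elif ULONG_MAX == 0xFFFFFFFFUL
typedef long my_int32;
typedef unsigned long my_uint32;
#else
#error "no exactly 32-bit integer type found"
#endif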
One rather nasty quirk of integer types in C stems from the fact that many "modern" implementations will have, for at least one size of integer, two incompatible signed types of that size with the same bit representation and likewise two incompatible unsigned types. Most typically the types will be 32-bit "int" and "long", or 64-bit "long" and "long long". The "fixed-sized" types will typically alias to one of the standard types, though implementations are not consistent about which one.
Although compilers used to assume that accesses to one type of a given size might affect objects of the other, the authors of the Standard did not mandate that they do so (probably because there would have been no point in ordering people to do what they would do anyway, and they couldn't imagine any sane compiler writer doing otherwise; once compilers started behaving differently, it was politically difficult to revoke that "permission"). Consequently, if one library stores data in a 32-bit "int" and another reads that data through a 32-bit "long", the only way to be assured of correct behavior is either to disable aliasing analysis altogether (probably the sanest choice when using gcc) or to add gratuitous copy operations (being careful that gcc doesn't optimize them out and then use their absence as an excuse to break the code, something it sometimes does as of 6.2).
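A small illustration of the hazard, assuming an ILP32 target where int and long are both 32 bits:

/* int and long may have identical size and representation, yet remain
   distinct types for aliasing purposes */
long read_after_write(int *pi, long *pl) {
    *pl = 1L;
    *pi = 2;      /* the optimizer may assume this cannot modify *pl */
    return *pl;   /* may be folded to "return 1" even when pi and pl
                     refer to the same 32-bit object */
}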
Choosing uintmax_t handles the integer case if I'm not overlooking something.
(1) Is there a similar data type for floats, and, if yes, in which header? (It's not in float.h for me.)
(2) Is it correct that a union of these two data types (assuming we can answer (1) in the affirmative) will always be the most restrictive?
As pointed out in Eric's answer, C11 defines an object type with the greatest fundamental alignment in <stddef.h>: max_align_t.
Note, however, that this might still not work as expected on GCC. Taken from GCC's website:
A fourth version of the C standard, known as C11, was published in 2011 as ISO/IEC 9899:2011. GCC has substantially complete support for this standard, enabled with -std=c11 or -std=iso9899:2011. (While in development, drafts of this standard version were referred to as C1X.)
If you don't want to rely on C11 due to its freshness and current lack of support, I suggest defining a union with all integer types, all floating point types, a void pointer, and a function pointer: one of these must be the most restrictive type.
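A sketch of that fallback with a few representative members (the names are made up; the offsetof trick is a common pre-C11 way to observe the union's alignment requirement):

#include <stddef.h>
#include <stdio.h>

/* union over candidate types; one member is assumed to impose the
   strictest fundamental alignment */
union max_align_guess {
    long long ll;
    long double ld;
    void *vp;
    void (*fp)(void);
};

/* the offset of 'u' after a single char usually equals the union's
   alignment requirement */
struct align_probe { char c; union max_align_guess u; };

int main(void) {
    printf("guessed strictest alignment: %lu\n",
           (unsigned long)offsetof(struct align_probe, u));
    return 0;
}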
An object type with the greatest fundamental alignment supported by the implementation is max_align_t, defined in <stddef.h>.
I do not see text in the standard that specifies the alignment requirement of a union must be the strictest alignment requirement of its members. So, in theory, a union could require a stricter alignment than any of its members need. I see little reason for this and do not expect C implementations would do it. The usual case would be that the alignment requirement of a union is the strictest requirement of its members, unless the program explicitly requested greater alignment (as with the _Alignas keyword).
I am using a C library provided to me already compiled. I have limited information on the compiler, version, options, etc., used when compiling the library. The library interface uses enum both in structures that are passed and directly as passed parameters.
The question is: how can I assure or establish that when I compile code to use the provided library, that my compiler will use the same size for those enums? If it does not, the structures won't line up, and the parameter passing may be messed up, e.g. long vs. int.
My concern stems from the C99 standard, which states that the enum type:
shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration.
As far as I can tell, so long as the largest value fits, the compiler can pick any type it darn well pleases, effectively on a whim, potentially varying not only between compilers but also between different versions of the same compiler and/or compiler options. It could pick a 1-, 2-, 4-, or 8-byte representation, resulting in potential incompatibilities in both structures and parameter passing. (It could also pick signed or unsigned, but I don't see a mechanism for that being a problem in this context.)
Am I missing something here? If I am not missing something, does this mean that enum should never be used in an API?
Update:
Yes, I was missing something. While the language specification doesn't help here, as noted by @Barmar the Application Binary Interface (ABI) does. Or if it doesn't, then the ABI is deficient. The ABI for my system indeed specifies that an enum must be a signed four-byte integer. If a compiler does not obey that, then it is a bug. Given a complete ABI and compliant compilers, enum can be used safely in an API.
APIs that use enum are depending on the assumption that the compiler will be consistent, i.e. given the same enum declaration, it will always choose the same underlying type.
While the language standard doesn't specifically require this, it would be quite perverse for a compiler to do anything else.
Furthermore, all compilers for a particular OS need to be consistent with the OS's ABI. Otherwise, you would have far more problems, such as the library using 64-bit int while the caller uses 32-bit int. Ideally, the ABI should constrain the representation of enums, to ensure compatibility.
More generally, the language specification only ensures compatibility between programs compiled with the same implementation. The ABI ensures compatibility between programs compiled with different implementations.
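If an API must expose an enum, one defensive pattern (sketched here with made-up names) is to pin the enumerator range to int and assert the size the ABI promises:

#include <limits.h>

/* hypothetical API enum; the FORCE_INT enumerator is a common trick to
   keep the required range within int */
enum lib_status {
    LIB_OK = 0,
    LIB_ERROR = 1,
    LIB_STATUS_FORCE_INT = INT_MAX
};

/* C11: fail the build if the size differs from the assumed 4-byte ABI */
_Static_assert(sizeof(enum lib_status) == 4, "enum lib_status does not match the ABI");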
From the question:
The ABI for my system indeed specifies that an enum must be a signed four-byte integer. If a compiler does not obey that, then it is a bug.
I'm surprised by that. I suspect that in reality your compiler will select a 64-bit (8-byte) size for your enum if you define an enumerated constant with a value larger than 2^32.
On my platforms (MinGW gcc 4.6.2 targeting x86 and gcc 4.4 on Linux targeting x86_64), the following code shows that I get both 4-byte and 8-byte enums:
#include <stdio.h>

enum { a } foo;
enum { b = 0x123456789 } bar;

int main(void) {
    /* cast: sizeof yields size_t, which need not match %lu directly */
    printf("%lu\n", (unsigned long)sizeof(foo));
    printf("%lu\n", (unsigned long)sizeof(bar));
    return 0;
}
I compiled with -Wall -std=c99 switches.
I guess you could say that this is a compiler bug. But the alternatives of removing support for enumerated constants larger than 2^32 or always using 8-byte enums both seem undesirable.
Given that these common versions of GCC don't provide a fixed size enum, I think the only safe action in general is to not use enums in APIs.
Further notes for GCC
Compiling with "-pedantic" causes the following warnings to be generated:
main.c:4:8: warning: integer constant is too large for 'long' type [-Wlong-long]
main.c:4:12: warning: ISO C restricts enumerator values to range of 'int' [-pedantic]
The behavior can be tailored via the -fshort-enums and -fno-short-enums switches.
Results with Visual Studio
Compiling the above code with VS 2008 x86 causes the following warnings:
warning C4341: 'b' : signed value is out of range for enum constant
warning C4309: 'initializing' : truncation of constant value
And with VS 2013 x86 and x64, just:
warning C4309: 'initializing' : truncation of constant value