Integer types in C

Suppose I wish to write a C program (C99 or C2011) that I want to be completely portable and not tied to a particular architecture.
It seems that I would then want to make a clean break from the old integer types (int, long, short) and friends and use only int8_t, uint8_t, int32_t and so on (perhaps using the least and fast versions as well).
What then is the return type of main? Or must we stick with int? Is it required by the standard to be int?
GCC-4.2 allows me to write
#include <stdint.h>
#include <stdio.h>
int32_t main() {
    printf("Hello\n");
    return 0;
}
but I cannot use uint32_t or even int8_t because then I get
hello.c:3: warning: return type of ‘main’ is not ‘int’
This is because of a typedef, no doubt. It seems this is one case where we are stuck with having to use the unspecified size types, since it's not truly portable unless we leave the return type up to the target architecture. Is this interpretation correct? It seems odd to have "just one" plain old int in the code base but I am happy to be pragmatic.

Suppose I wish to write a C program (C99 or C2011) that I want to be
completely portable and not tied to a particular architecture.
It seems that I would then want to make a clean break from the old
integer types (int, long, short) and friends and use only int8_t,
uint8_t, int32_t and so on (perhaps using the the least and fast
versions as well).
These two statements, quoted above, are contradictory. That's because whether uint32_t, uint8_t et al. are available is actually implementation-defined (C11, 7.20.1.1/3: Exact-width integer types).
If you want your program to be truly portable, you must use the built-in types (int, long, etc.) and stick to the minimum ranges defined in the C standard (i.e.: C11, 5.2.4.2.1: Sizes of integer types),
For example, the standard says that both short and int must be able to represent at least the range -32767 to 32767. So if you need to store a value outside that range, say 42000, you'd use a long instead.

The return type of main is required by the Standard to be int in C89, C99 and in C11.
Now the exact-width integer types are aliases for standard integer types. So if you use the right alias for int, it will still be valid.
For example:
int32_t main(void)
if int32_t is a typedef to int.

Related

Are exact-width integers in Cython actually platform dependent?

In Cython one can use exact-width integral types by importing them from stdint, e.g.
from libc.stdint cimport int32_t
Looking through stdint.pxd, we see that int32_t is defined as
cdef extern from "<stdint.h>" nogil:
...
ctypedef signed int int32_t
Does this mean that if I use int32_t in my Cython code, this type is just an alias for signed int (int), which might in fact be only 16 bits wide?
The issue is the same for all the other integral types.
They should be fine.
The typedefs that are really used come from the C stdint.h header, and those are almost certainly right.
The Cython typedef
ctypedef signed int int32_t
Is really just so that Cython understands that the type is an integer and that it's signed. It isn't what's actually used in the generated C code. Since it's in a cdef extern block it's telling Cython "a typedef like this exists" rather than acting as the real definition.
No.
On a platform where signed int is not 32 bits wide, int32_t would be typedef'ed to a type that actually is 32 bits wide.
If no such type is available -- e.g. on a platform where the maximum int width is 16 bits, or where all ints are 64 bits, or where CHAR_BIT does not equal 8 -- the exact-width types would not be defined. (Yes, the exact-width types are optional. That is why there are least-width types as well.)
Disclaimer: This is speaking from a purely C perspective. I have no experience with Cython whatsoever. But it would be very surprising (and a bug) if this would not be covered adequately in Cython as well.
And as @JörgWMittag points out in his comment, the alternative is of course to simply not support any platform where signed int isn't 32 bits wide.

When to use uint16_t vs int and when to cast type [duplicate]

This question already has answers here:
Should I use cstdint?
(6 answers)
I have 2 questions about C programming:
For int and uint16_t, long and uint32_t, and so on: when should I use the u*_t types instead of int, long, and so on? I find it confusing to choose which one is best for my program.
When do I need to cast type?
I have the following statement in my program:
long * src;
long * dst;
...
memcpy(dst, src, len);
My friend changes this to
memcpy((char *)dst, (char *)src, len).
This is just an example I encountered. Generally, I am confused about when a cast is required.
Use the plain types (int etc.) except when you need a precisely-sized type. You might need a precisely sized type if you are working with a wire protocol which defines that the size field shall be a 2-byte unsigned integer (hence uint16_t), but for most work, most of the time, use the plain types. (There are some caveats to this, but most of the time, most people can work with the plain types for simple numeric work. If you are working to a set of interfaces, use the types dictated by the interfaces. If you're using multiple interfaces and the types clash, you'll have to consider casting some of the time, or changing one or both interfaces. Etc.)
The casts added by your friend are pointless. The actual prototype of memcpy() is:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
The compiler converts the long * values to void * (nominally via char * because of the cast), all of which is almost always a no-op.
More generally, you use a cast when you need to change the type of something. One place you might need it is in bitwise operations, where you want a 64-bit result but the operands are 32-bit, and leaving the conversion until after the bitwise operations gives a different result from the one you wanted. For example, assume a system where int is 32 bits and long is 64 bits:
unsigned int x = 0x012345678;
unsigned long y = (~x << 22) | 0x1111;
This would calculate ~x as a 32-bit quantity, and the shift would be performed on a 32-bit quantity, losing a number of bits. By contrast:
unsigned long z = (~(unsigned long)x << 22) | 0x1111;
ensures that the calculation is done in 64-bit arithmetic and doesn't lose any bits from the original value.
The size of "classical" types like int and long int can vary between systems. This can cause problems when, for example, accessing files with fixed-width data structures. For example, long int is a 64-bit integer on most modern 64-bit systems, but only 32 bits on older (or 32-bit) systems.
The intN_t and uintN_t types were introduced with C99 and are defined in <stdint.h>. Since they explicitly specify the number of bits, they eliminate any ambiguity. As a rule, you should prefer these types if you are at all concerned about making your code portable.
Wikipedia has more information
If you do not want to rely on your compiler's choice of sizes, use the predefined types provided by the standard library headers. Every conforming C library guarantees that these typedefs refer to types at least wide enough to store the range of values their names declare.
In your friend's specific case, one can assume that he made this cast just to signal to other readers that the two pointers are being treated as raw bytes. Or maybe he is the kind of old-fashioned guy who remembers the times when there was no void type and the "lowest common denominator" was a pointer to char. In my developer life, if I want to emphasize some of my actions, I'll make an explicit cast even if it is, in fact, redundant.
For your first question, look at: https://stackoverflow.com/questions/11786113/difference-between-different-integer-types
Basically, the names ending in _t are the standard typedef names defined in <stdint.h>.
The u prefix is for unsigned, which doesn't allow negative numbers.
As for your second question, you often need a cast when the function you call expects arguments of a different type than the one you are passing. You can look here for casting tips, or here...

What is the maximum value of a pointer on Mac OSX?

I'd like to make a custom pointer address printer (like printf("%p")) and I'd like to know the maximum value that a pointer can have on the computer I'm using, which is an iMac running OS X 10.8.5.
Someone recommended I use an unsigned long. Is the following cast appropriate, and is unsigned long big enough?
void print_address(void *pointer)
{
    unsigned long a;
    a = (unsigned long) pointer;
    [...]
}
I searched in the limits.h header but I couldn't find any mention of it. Is it a fixed value, or is there a way to find out the maximum on my system?
Thanks for your help!
Quick summary: Convert the pointer to uintptr_t (defined in <stdint.h>), which will give you a number in the range 0 to UINTPTR_MAX. Read on for the gory details and some unlikely problems you might theoretically run into.
In general there is no such thing as a "maximum value" for a pointer. Pointers are not numbers, and < and > comparisons aren't even defined unless both pointers point into the same (array) object or just past the end of it.
But I think that the size of a pointer is really what you're looking for. And if you can convert a void* pointer value to an unsigned 32-bit or 64-bit integer, the maximum value of that integer is going to be 2^32 - 1 or 2^64 - 1, respectively.
The type uintptr_t, declared in <stdint.h>, is an unsigned integer type such that converting a void* value to uintptr_t and back again yields a value that compares equal to the original pointer. In short, the conversion (uintptr_t)ptr will not lose information.
<stdint.h> defines a macro UINTPTR_MAX, which is the maximum value of type uintptr_t. That's not exactly the "maximum value of a pointer", but it's probably what you're looking for.
(On many systems, including Mac OSX, pointers are represented as if they were integers that can be used as indices into a linear monolithic address space. That's a common memory model, but it's not actually required by the C standard. For example, some systems may represent a pointer as a combination of a descriptor and an offset, which makes comparisons between arbitrary pointer values difficult or even impossible.)
The <stdint.h> header and the uintptr_t type were added to the C language by the 1999 standard. For MacOS, you shouldn't have to worry about pre-C99 compilers.
Note also that the uintptr_t type is optional. If pointers are bigger than any available integer type, then the implementation won't define uintptr_t. Again, you shouldn't have to worry about that for MacOS. If you want to be fanatical about portable code, then you can use
#include <stdint.h>
#ifdef UINTPTR_MAX
/* uintptr_t exists */
#else
/* uintptr_t doesn't exist; do something else */
#endif
where "something else" is left as an exercise.
You probably are looking for the value of UINTPTR_MAX defined in <stdint.h>.
As ouah's answer says, uintptr_t sounds like the type you really want.
unsigned long is not guaranteed to be able to represent a pointer value. Use uintptr_t, which is an unsigned integer type that can hold a pointer value.

Difference between int32, int, int32_t, int8 and int8_t

I came across the data type int32_t in a C program recently. I know that it stores 32 bits, but don't int and int32 do the same?
Also, I want to use char in a program. Can I use int8_t instead? What is the difference?
To summarize: what is the difference between int32, int, int32_t, int8 and int8_t in C?
Between int32 and int32_t (and likewise between int8 and int8_t) the difference is pretty simple: the C standard defines int8_t and int32_t, but does not define anything named int8 or int32 -- the latter (if they exist at all) are probably from some other header or library (most likely predating the addition of int8_t and int32_t in C99).
Plain int is quite a bit different from the others. Where int8_t and int32_t each have a specified size, int can be any size >= 16 bits. At different times, both 16 bits and 32 bits have been reasonably common (and for a 64-bit implementation, it should probably be 64 bits).
On the other hand, int is guaranteed to be present in every implementation of C, where int8_t and int32_t are not. It's probably open to question whether this matters to you though. If you use C on small embedded systems and/or older compilers, it may be a problem. If you use it primarily with a modern compiler on desktop/server machines, it probably won't be.
Oops -- missed the part about char. You'd use int8_t instead of char if (and only if) you want an integer type guaranteed to be exactly 8 bits in size. If you want to store characters, you probably want to use char instead. Its size can vary (in terms of number of bits) but it's guaranteed to be exactly one byte. One slight oddity though: there's no guarantee about whether a plain char is signed or unsigned (and many compilers can make it either one, depending on a compile-time flag). If you need it to be signed or unsigned, you need to specify signed char or unsigned char explicitly.
The _t data types are typedefs in the stdint.h header, while int is a built-in fundamental data type. This makes the _t types available only if stdint.h exists. int, on the other hand, is guaranteed to exist.
Always keep in mind that the size is variable if not explicitly specified, so if you declare
int i = 10;
on some systems the compiler may produce a 16-bit integer and on others a 32-bit integer (or a 64-bit integer on newer systems).
In embedded environments this may end up in weird results (especially while handling memory-mapped I/O, or consider a simple array situation), so it is highly recommended to use fixed-size variables. In legacy systems you may come across
typedef short INT16;
typedef int INT32;
typedef long INT64;
Starting from C99, the designers added the stdint.h header file, which provides essentially the same kind of typedefs.
On a Windows-based system, you may see entries in the stdint.h header file like
typedef signed char int8_t;
typedef signed short int16_t;
typedef signed int int32_t;
typedef unsigned char uint8_t;
There is quite a bit more to it, like the minimum-width or exact-width integer types; I think it is not a bad idea to explore stdint.h for a better understanding.

Is the sizeof(enum) == sizeof(int), always?

Is sizeof(enum) == sizeof(int), always?
Or is it compiler dependent?
Is it wrong to say that, as compilers are optimized for word lengths (memory alignment), int is the word size on a particular compiler? Does it mean that there is no processing penalty if I use enums, as they would be word-aligned?
Is it not better if I put all the return codes in an enum, as I clearly do not care about the values they get, only the names, while checking the return types? If this is the case, won't #define be better, as it would save memory?
What is the usual practice?
If I have to transport these return types over a network and some processing has to be done at the other end, what would you prefer: enums, #defines, or const ints?
EDIT - Just checking on the net: as compilers don't symbolically link macros, how do people debug then? Compare the integer value with the header file?
From the answers (I am adding this line below, as I need clarification):
"So it is implementation-defined, and sizeof(enum) might be equal to sizeof(char), i.e. 1."
Does it not mean that the compiler checks the range of values in the enum and then assigns memory? I don't think so, but of course I don't know. Can someone please explain to me what "might be" means?
It is compiler dependent and may differ between enums. The following are the semantics
enum X { A, B };
// A has type int
assert(sizeof(A) == sizeof(int));
// some integer type. Maybe even int. This is
// implementation defined.
assert(sizeof(enum X) == sizeof(some_integer_type));
Note that "some integer type" in C99 may also include extended integer types (which the implementation, however, has to document, if it provides them). The type of the enumeration is some type that can store the value of any enumerator (A and B in this case).
I don't think there are any penalties in using enumerations. Enumerators are integral constant expressions too (so you may use them to initialize static or file-scope variables, for example), and I prefer them to macros whenever possible.
Enumerators don't need any runtime memory. Only when you create a variable of the enumeration type, you may use runtime memory. Just think of enumerators as compile time constants.
I would just use a type that can store the enumerator values (I should know the rough range of values beforehand), cast to it, and send it over the network. Preferably the type should be some fixed-width one, like int32_t, so it doesn't come into conflict when different machines are involved. Or I would print the number and scan it on the other side, which gets rid of some of these problems.
Response to Edit
Well, the compiler is not required to use any particular size. An easy thing to see is that the sign of the values matters: unsigned types can have a significant performance boost in some calculations. The following is the behavior of GCC 4.4.0 on my box:
int main(void) {
    enum X { A = 0 };
    enum X a; // X compatible with "unsigned int"
    unsigned int *p = &a;
}
But if you assign a -1, then GCC chooses to use int as the type that X is compatible with:
int main(void) {
    enum X { A = -1 };
    enum X a; // X compatible with "int"
    int *p = &a;
}
Using GCC's --short-enums option makes it use the smallest type that still fits all the values:
int main() {
    enum X { A = 0 };
    enum X a; // X compatible with "unsigned char"
    unsigned char *p = &a;
}
In recent versions of GCC, the compiler flag has changed to -fshort-enums. On some targets, the default type is unsigned int. You can check the answer here.
C99, 6.7.2.2p4 says
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined,108) but shall be capable of representing the values of all the members of the enumeration. [...]
Footnote 108 adds:
An implementation may delay the choice of which integer type until all enumeration constants have been seen.
So it is implementation-defined, and sizeof(enum) might be equal to sizeof(char), i.e. 1.
In choosing the size of some small range of integers, there is always a penalty. If you make it small in memory, there probably is a processing penalty; if you make it larger, there is a space penalty. It's a time-space trade-off.
Error codes are typically #defines, because they need to be extensible: different libraries may add new error codes. You cannot do that with enums.
Is the sizeof(enum) == sizeof(int), always
The ANSI C standard says:
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined. (6.7.2.2 Enumeration specifiers)
So I would take that to mean no.
If this is the case, won't #define be better, as it would save memory?
In what way would using defines save memory over using an enum? An enum is just a type that allows you to provide more information to the compiler. In the actual resulting executable, it's just turned into an integer, just as the preprocessor converts a macro created with #define into its value.
What is the usual practice if I have to transport these return types over a network and some processing has to be done at the other end?
If you plan to transport values over a network and process them on the other end, you should define a protocol. Decide on the size in bits of each type, the endianess (in which order the bytes are) and make sure you adhere to that in both the client and the server code. Also don't just assume that because it happens to work, you've got it right. It just might be that the endianess, for example, on your chosen client and server platforms matches, but that might not always be the case.
No.
Example: The CodeSourcery compiler
When you define an enum like this:
enum MyEnum1 {
    A = 1,
    B = 2,
    C = 3
};
// will have sizeof 1 (fits in a char)

enum MyEnum2 {
    D = 1,
    E = 2,
    F = 3,
    G = 400
};
// will have sizeof 2 (doesn't fit in a char)
Details from their mailing list
On some compilers the size of an enum depends on how many entries the enum has (if all the values fit in a byte => byte; otherwise => int).
But this depends on the compiler and the compiler settings.
enum fruits { apple, orange, strawberry, grapefruit };
char fruit = apple;
fruit = orange;
if (fruit < strawberry)
    ...
All of this works perfectly.
If you want a specific underlying type for an enum instance, just don't use the enum type itself.
