Is there a C99 data type guaranteed to be at least two bytes? - c

To determine the endianness of a system, I plan to store a multi-byte integer value in a variable and access the first byte via an unsigned char wrapped in a union; for example:
union {
    unsigned int val;
    unsigned char first_byte;
} test;

test.val = 1; /* stored on a little-endian system as 0x01 0x00 0x00 0x00 */
if (test.first_byte == 1) {
    printf("Little-endian system!");
} else {
    printf("Big-endian system!");
}
I want to make this test portable across platforms, but I'm not sure if the C99 standard guarantees that the unsigned int data type will be greater than one byte in size. Furthermore, since a "C byte" does not technically have to be 8 bits in size, I cannot use the exact-width integer types (e.g. uint8_t, uint16_t, etc.).
Are there any C data types guaranteed by the C99 standard to be at least two bytes in size?
P.S. Assuming an unsigned int is in fact greater than one byte, would my union behave as I'm expecting (with the variable first_byte accessing the first byte in variable val) across all C99 compatible platforms?

Since int must support a range that requires at least 16 bits, int will meet your criterion on most practical systems. So would short (and long, and long long). If you want exactly 16 bits, you have to check whether int16_t and uint16_t are declared in <stdint.h>.
If you are worried about systems where CHAR_BIT is greater than 8, then you have to work harder. If CHAR_BIT is 32, then only long long is guaranteed to hold two characters.
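Putting that advice together with the question's union test, here is a minimal sketch (my own illustration, not taken from the question or any answer) that views the bytes through an unsigned char array and guards against the exotic case where unsigned int is a single byte, in which case byte order cannot be observed this way at all:

#include <stdio.h>

int main(void)
{
    union {
        unsigned int val;
        unsigned char bytes[sizeof(unsigned int)];
    } test;

    test.val = 1;
    if (sizeof(unsigned int) == 1) {
        printf("unsigned int is a single byte here; the test is meaningless.\n");
    } else if (test.bytes[0] == 1) {
        printf("Little-endian system!\n");
    } else {
        printf("Big-endian system!\n");
    }
    return 0;
}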
What the C standard says about sizes of integer types
In a comment, Richard J Ross III says:
The standard says absolutely nothing about the size of an int except that it must be larger than or equal to short, so, for example, it could be 10 bits on some systems I've worked on.
On the contrary, the C standard has specifications on the lower bounds on the ranges that must be supported by different types, and a system with 10-bit int would not be conformant C.
Specifically, in ISO/IEC 9899:2011 §5.2.4.2.1 Sizes of integer types <limits.h>, it says:
¶1 The values given below shall be replaced by constant expressions suitable for use in #if
preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the
following shall be replaced by expressions that have the same type as would an
expression that is an object of the corresponding type converted according to the integer
promotions. Their implementation-defined values shall be equal or greater in magnitude
(absolute value) to those shown, with the same sign.
— number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
[...]
— minimum value for an object of type short int
SHRT_MIN -32767 // −(2^15 − 1)
— maximum value for an object of type short int
SHRT_MAX +32767 // 2^15 − 1
— maximum value for an object of type unsigned short int
USHRT_MAX 65535 // 2^16 − 1
— minimum value for an object of type int
INT_MIN -32767 // −(2^15 − 1)
— maximum value for an object of type int
INT_MAX +32767 // 2^15 − 1
— maximum value for an object of type unsigned int
UINT_MAX 65535 // 2^16 − 1

GCC provides some macros giving the endianness of a system: GCC common predefined macros
example (from the link supplied):
/* Test for a little-endian machine */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
Of course, this is only useful if you use gcc. Furthermore, conditional compilation for endianness can be considered harmful. Here is a nice article about this: The byte order fallacy.
I would prefer to do this using a regular condition so the compiler still checks the other branch, i.e.:
if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
...
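As a sketch of what the linked article recommends (my own example, assuming C99's <stdint.h>): when decoding external data, assemble the value byte by byte in the format's documented order, and the host's endianness never enters the picture:

#include <stdint.h>

/* Reconstruct a 32-bit value stored in little-endian order in a buffer,
   independent of the byte order of the machine running this code. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}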

No, nothing is guaranteed to be larger than one byte -- but it is guaranteed that no (non-bitfield) type is smaller than one byte and that one byte can hold at least 256 distinct values, so if you have an int8_t and an int16_t, then it's guaranteed that int8_t is one byte, so int16_t must be two bytes.

The C standard guarantees only that the size of char <= short <= int <= long <= long long [and likewise for unsigned]. So, theoretically, there can be systems that use only one size for all of these types.
If it REALLY is critical that this isn't going wrong on some particular architecture, I would add a piece of code to do something like if (sizeof(char) == sizeof(int)) exit_with_error("Can't do this...."); to the code.
On nearly all machines, int or short should be perfectly fine. I'm not actually aware of any machine where char and int are the same size, but I'm 99% sure that they do exist. Those machines may also have a native byte that isn't 8 bits, such as 9 or 14 bits, and words that are 14, 18, 28 or 36 bits...

Take a look at the man page of stdint.h (uint_least16_t is guaranteed to be at least 16 bits wide)

At least according to http://en.wikipedia.org/wiki/C_data_types -- an int is guaranteed to be at least 16 bits wide, i.e. two 8-bit chars. So, this test should work, although I'm wondering if there is a more appropriate solution. For one, with rare exceptions, most architectures have their endianness set at compile time, not at runtime. There are a few architectures that can switch endianness, though (I believe ARM and PPC are configurable, but ARM is traditionally LE, and PPC is mostly BE).

A conforming implementation can have all its fundamental types of size 1 (with that single byte holding at least 64 bits, so that long long still covers its required range). For such an implementation, however, the notion of endianness is not applicable.
Nothing forbids a conforming implementation to have, say, little-endian shorts and big-endian longs.
So there are three possible outcomes for each integral type: it could be big-endian, little-endian, or of size 1. Check each type separately for maximum theoretical portability. In practice this probably never happens.
Middle-endian types, or e.g. big-endian stuff on even-numbered pages only, are theoretically possible, but I would refrain from even thinking about such an implementation.

While the answer is basically "no", satisfying the interface requirements for the stdio functions requires that the range [0,UCHAR_MAX] fit in int, which creates an implicit requirement that sizeof(int) is greater than 1 on hosted implementations (freestanding implementations are free to omit stdio, and there's no reason they can't have sizeof(int)==1). So I think it's fairly safe to assume sizeof(int)>1.
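A minimal sketch (my own) of turning that reasoning into a build-time check, using only the <limits.h> macros, which the standard requires to be usable in #if directives:

#include <limits.h>

/* Refuse to build where int cannot represent every unsigned char value. */
#if INT_MAX < UCHAR_MAX
#error "int cannot hold the full range of unsigned char on this platform"
#endif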

Related

size_t is unsigned long long under 64 bit system? [duplicate]

I notice that modern C and C++ code seems to use size_t instead of int/unsigned int pretty much everywhere - from parameters for C string functions to the STL. I am curious as to the reason for this and the benefits it brings.
The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GB).
The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.
You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.
Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem.
Explained in detail at embedded.com (with a very good example)
In short, size_t is never negative, and it maximizes performance because it's typedef'd to be the unsigned integer type that's big enough -- but not too big -- to represent the size of the largest possible object on the target platform.
Sizes should never be negative, and indeed size_t is an unsigned type. Also, because size_t is unsigned, you can store numbers that are roughly twice as big as in the corresponding signed type: the bit that would otherwise be the sign bit is used for magnitude like all the other bits, and gaining one more bit roughly doubles the range of numbers we can represent.
So, you ask, why not just use an unsigned int? It may not be able to hold big enough numbers. In an implementation where unsigned int is 32 bits, the biggest number it can represent is 4294967295. Some processors, such as the IP16L32, can copy objects larger than 4294967295 bytes.
So, you ask, why not use an unsigned long int? It exacts a performance toll on some platforms. Standard C requires that a long occupy at least 32 bits. An IP16L32 platform implements each 32-bit long as a pair of 16-bit words. Almost all 32-bit operators on these platforms require two instructions, if not more, because they work with the 32 bits in two 16-bit chunks. For example, moving a 32-bit long usually requires two machine instructions -- one to move each 16-bit chunk.
Using size_t avoids this performance toll. According to this fantastic article, "Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform."
The size_t type is the type returned by the sizeof operator. It is an unsigned integer capable of expressing the size in bytes of any memory range supported on the host machine. It is (typically) related to ptrdiff_t in that ptrdiff_t is a signed integer value such that sizeof(ptrdiff_t) and sizeof(size_t) are equal.
When writing C code you should always use size_t whenever dealing with memory ranges.
The int type on the other hand is basically defined as the size of the (signed) integer value that the host machine can use to most efficiently perform integer arithmetic. For example, on many older PC-type computers the value of sizeof(size_t) would be 4 (bytes) but sizeof(int) would be 2 (bytes). 16-bit arithmetic was faster than 32-bit arithmetic, though the CPU could handle a (logical) memory space of up to 4 GiB.
Use the int type only when you care about efficiency, as its actual precision depends strongly on both compiler options and machine architecture. In particular, the C standard specifies the following invariant: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long), placing no other limitation on the actual precision available to the programmer for each of these primitive types.
Note: This is NOT the same as in Java (which actually specifies the bit precision for each of the types 'char', 'byte', 'short', 'int' and 'long').
Type size_t must be big enough to store the size of any possible object. Unsigned int doesn't have to satisfy that condition.
For example in 64 bit systems int and unsigned int may be 32 bit wide, but size_t must be big enough to store numbers bigger than 4G
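A small sketch (my own) that simply prints what the current implementation chose; SIZE_MAX is the upper limit of size_t and comes from C99's <stdint.h>:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    printf("sizeof(unsigned int) = %zu\n", sizeof(unsigned int));
    printf("sizeof(size_t)       = %zu\n", sizeof(size_t));
    printf("SIZE_MAX             = %zu\n", (size_t)SIZE_MAX);
    return 0;
}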
This excerpt from the glibc manual 0.02 may also be relevant when researching the topic:
There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in `stddef.h' to be whatever type the system's `sys/types.h' defines it to be. Most Unix systems that define size_t in `sys/types.h' define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.
The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the `fixincludes' script will massage the system's `sys/types.h' so as not to conflict with this.
In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. `configure' will automatically detect what type GCC uses for size_t and arrange to override it if necessary.
If my compiler is set to 32 bit, size_t is nothing other than a typedef for unsigned int. If my compiler is set to 64 bit, size_t is nothing other than a typedef for unsigned long long.
In practice, size_t has the same size as a pointer.
So in 32 bits or the common ILP32 (integer, long, pointer) model size_t is 32 bits.
and in 64 bits or the common LP64 (long, pointer) model size_t is 64 bits (integers are still 32 bits).
There are other models, but these are the ones that g++ uses (at least by default).

Can I assume the size of long int is always 4 bytes?

Is it always true that long int (which as far as I understand is a synonym for long) is 4 bytes?
Can I rely on that? If not, could it be true for a POSIX based OS?
The standards say nothing regarding the exact size of any integer types aside from char. Typically, long is 32-bit on 32-bit systems and 64-bit on 64-bit systems.
The standard does however specify a minimum size. From section 5.2.4.2.1 of the C Standard:
1 The values given below shall be replaced by constant expressions
suitable for use in #if preprocessing directives. Moreover,
except for CHAR_BIT and MB_LEN_MAX, the following shall be
replaced by expressions that have the same type as would an
expression that is an object of the corresponding type converted
according to the integer promotions. Their implementation-defined
values shall be equal or greater in magnitude (absolute value) to
those shown, with the same sign.
...
minimum value for an object of type long int
LONG_MIN -2147483647 // −(2^31−1)
maximum value for an object of type long int
LONG_MAX +2147483647 // 2^31−1
This says that a long int must be a minimum of 32 bits, but may be larger. On a machine where CHAR_BIT is 8, this gives a minimum byte size of 4. However, on a machine with, e.g., CHAR_BIT equal to 16, a long int could be 2 bytes long.
Here's a real-world example. For the following code:
#include <stdio.h>

int main(void)
{
    printf("sizeof(long) = %zu\n", sizeof(long));
    return 0;
}
Output on Debian 7 i686:
sizeof(long) = 4
Output on CentOS 7 x64:
sizeof(long) = 8
So no, you can't make any assumptions on size. If you need a type of a specific size, you can use the types defined in stdint.h. It defines the following types:
int8_t: signed 8-bit
uint8_t: unsigned 8-bit
int16_t: signed 16-bit
uint16_t: unsigned 16-bit
int32_t: signed 32-bit
uint32_t: unsigned 32-bit
int64_t: signed 64-bit
uint64_t: unsigned 64-bit
The stdint.h header is described in section 7.20 of the standard, with exact width types in section 7.20.1.1. The standard states that these typedefs are optional, but they exist on most implementations.
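For illustration, a short sketch (mine) using one of these exact-width types, guarded so it still compiles where int32_t is absent; the PRId32 format macro comes from <inttypes.h>:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
#ifdef INT32_MAX                     /* defined only if int32_t exists */
    int32_t x = 123456;
    printf("x = %" PRId32 "\n", x);
#else
    printf("int32_t is not provided by this implementation\n");
#endif
    return 0;
}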
No, neither the C standard nor POSIX guarantee this and in fact most Unix-like 64-bit platforms have a 64 bit (8 byte) long.
Use sizeof(long int) and check the size. It will give you the size of long int in bytes on the system you're currently working on. The answer to your question in particular is no: it is not guaranteed anywhere, neither in C nor in POSIX.
As pointed out by @delnan, POSIX implementations leave the sizes of long and int unspecified, and they often differ between 32-bit and 64-bit systems.
The length of long is mostly hardware related (often matching the size of the CPU's data registers), and sometimes shaped by other software issues such as OS design and ABI interfacing.
To ease your mind, sizeof isn't a function but an operator that is normally evaluated at compile time*, so your code isn't doing any runtime work when using sizeof - it's the same as writing a number, only it's portable.
use:
sizeof(long int)
* As Dave pointed out in the comments, sizeof will be computed at runtime when it's impossible to compute the value during compilation, such as when using variable length arrays.
Also, as pointed out in another comment, sizeof takes into account the padding and alignment used by the implementation, meaning that an object can occupy more bytes than its value strictly needs (this can be important when bit shifting).
If you're looking for variables of a specific byte size, consider using a byte array or (what I would assume to be supported) the types defined by C99 in stdint.h - as suggested by @dbush.
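To make the padding point concrete, here is a small sketch (my own; the struct is purely illustrative) showing that sizeof a struct can exceed the sum of its members because of alignment padding inserted by the implementation:

#include <stdio.h>

struct sample {
    char c;    /* 1 byte                        */
    long l;    /* typically preceded by padding */
};

int main(void)
{
    printf("sizeof(char) + sizeof(long) = %zu\n", sizeof(char) + sizeof(long));
    printf("sizeof(struct sample)       = %zu\n", sizeof(struct sample));
    return 0;
}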
When we first implemented C on ICL Series 39 hardware, we took the standard at its word and mapped the data types to the natural representation on that machine architecture, which was short = 32 bits, int = 64 bits, long = 128 bits.
But we found that no serious C applications worked; they all assumed the mapping short = 16, int = 32, long = 64, and we had to change the compiler to support that.
So whatever the official standard says, for many years everyone has converged on long = 64 bits and it's not likely to change.
The standard says nothing about the size of long int, so it is dependent on the environment which you are using.
To get the size of long int on your environment you can use the sizeof operator and get the size of long int. Something like
sizeof(long int)
The C standard only requires the following about the sizes of the types:
int >= 16 bits,
long >= 32 bits,
long long (since C99) >= 64 bits
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
sizeof(char) == 1
CHAR_BIT >= 8
The rest is implementation-defined, so it isn't surprising to encounter systems where int has 18/24/36/60 bits, uses ones' complement signed form, has sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long) == 4, a 48-bit long, or a 9-bit char - see "Exotic architectures the standards committees care about" and "List of platforms supported by the C standard".
The point about long int above is completely wrong. Most Linux/Unix implementations define long as a 64-bit type, but it's only 32 bits in Windows because they use different data models (have a look at the table here: 64-bit computing), and this is regardless of whether the OS is a 32- or 64-bit version.
Source
The compiler determines the size based on the type of hardware and OS.
So, assumptions should not be made regarding the size.
No, you can't assume that, since the size of the long data type varies from compiler to compiler.
Check out this article for more details.
From Usrmisc's Blog:
The standard leaves it completely up to the compiler, which also means the same compiler can make it depend on options and target architecture.
So you can't.
Incidentally, even long long int could be the same size as long int.
Short answer: no! You cannot make fixed assumptions about the size of long int, because the standards (the C standard and POSIX) do not fix the size of long int (as repeatedly emphasized). Just to provide a counterexample to your belief, most 64-bit systems have a 64-bit long! To maximize portability, use sizeof appropriately.
Use sizeof(long int) to check the size; it returns the size of long in bytes. The value is system- and environment-dependent: the compiler determines the size based on the hardware and OS.

Is the size of C "int" 2 bytes or 4 bytes?

Does an Integer variable in C occupy 2 bytes or 4 bytes? What are the factors that it depends on?
Most of the textbooks say integer variables occupy 2 bytes.
But when I run a program printing the successive addresses of an array of integers it shows the difference of 4.
I know it's equal to sizeof(int). The size of an int is really compiler dependent. Back in the day, when processors were 16-bit, an int was 2 bytes. Nowadays, it's most often 4 bytes on 32-bit as well as 64-bit systems.
Still, using sizeof(int) is the best way to get the size of an integer for the specific system the program is executed on.
EDIT: Fixed wrong statement that int is 8 bytes on most 64-bit systems. For example, it is 4 bytes on 64-bit GCC.
This is one of the points in C that can be confusing at first, but the C standard only specifies a minimum range for integer types that is guaranteed to be supported. int is guaranteed to be able to hold -32767 to 32767, which requires 16 bits. In that case, int is 2 bytes. However, implementations are free to go beyond that minimum: many modern compilers make int 32-bit (which also means 4 bytes pretty ubiquitously).
The reason your book says 2 bytes is most probably because it's old. At one time, this was the norm. In general, you should always use the sizeof operator if you need to find out how many bytes it is on the platform you're using.
To address this, C99 added new types where you can explicitly ask for a certain sized integer, for example int16_t or int32_t. Prior to that, there was no universal way to get an integer of a specific width (although most platforms provided similar types on a per-platform basis).
There's no specific answer. It depends on the platform. It is implementation-defined. It can be 2, 4 or something else.
The idea behind int was that it was supposed to match the natural "word" size on the given platform: 16 bit on 16-bit platforms, 32 bit on 32-bit platforms, 64 bit on 64-bit platforms, you get the idea. However, for backward compatibility purposes some compilers prefer to stick to 32-bit int even on 64-bit platforms.
The time of 2-byte int is long gone though (16-bit platforms?) unless you are using some embedded platform with 16-bit word size. Your textbooks are probably very old.
The answer to this question depends on which platform you are using.
But irrespective of platform, you can reliably assume at least the following ranges (a sketch that prints your platform's actual limits follows this list):
[8-bit]  signed char: -127 to 127
[8-bit]  unsigned char: 0 to 255
[16-bit] signed short: -32767 to 32767
[16-bit] unsigned short: 0 to 65535
[32-bit] signed long: -2147483647 to 2147483647
[32-bit] unsigned long: 0 to 4294967295
[64-bit] signed long long: -9223372036854775807 to 9223372036854775807
[64-bit] unsigned long long: 0 to 18446744073709551615
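A quick sketch (mine) that prints the actual limits of your platform so you can compare them with the guaranteed minimums listed above:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("signed char: %d to %d\n", SCHAR_MIN, SCHAR_MAX);
    printf("short:       %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("int:         %d to %d\n", INT_MIN, INT_MAX);
    printf("long:        %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("long long:   %lld to %lld\n", LLONG_MIN, LLONG_MAX);
    return 0;
}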
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
The sizes of int and all other integer types are implementation-defined; C99 only specifies:
minimum size guarantees
relative sizes between the types
5.2.4.2.1 "Sizes of integer types <limits.h>" gives the minimum sizes:
1 [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown [...]
UCHAR_MAX 255 // 2^8 − 1
USHRT_MAX 65535 // 2^16 − 1
UINT_MAX 65535 // 2^16 − 1
ULONG_MAX 4294967295 // 2^32 − 1
ULLONG_MAX 18446744073709551615 // 2^64 − 1
6.2.5 "Types" then says:
8 For any two integer types with the same signedness and different integer conversion rank
(see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a
subrange of the values of the other type.
and 6.3.1.1 "Boolean, characters, and integers" determines the relative conversion ranks:
1 Every integer type has an integer conversion rank defined as follows:
The rank of long long int shall be greater than the rank of long int, which
shall be greater than the rank of int, which shall be greater than the rank of short
int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has
greater rank than T3, then T1 has greater rank than T3
Does an Integer variable in C occupy 2 bytes or 4 bytes?
That depends on the platform you're using, as well as how your compiler is configured. The only authoritative answer is to use the sizeof operator to see how big an integer is in your specific situation.
What are the factors that it depends on?
Range might be best considered, rather than size. Both will vary in practice, though it's much more fool-proof to choose variable types by range than size as we shall see. It's also important to note that the standard encourages us to consider choosing our integer types based on range rather than size, but for now let's ignore the standard practice, and let our curiosity explore sizeof, bytes and CHAR_BIT, and integer representation... let's burrow down the rabbit hole and see it for ourselves...
sizeof, bytes and CHAR_BIT
The following statement, taken from the C standard (linked to above), describes this in words that I don't think can be improved upon.
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand.
Assuming a clear understanding will lead us to a discussion about bytes. It's commonly assumed that a byte is eight bits, when in fact CHAR_BIT tells you how many bits are in a byte. That's just another one of those nuances which isn't considered when talking about the common two (or four) byte integers.
Let's wrap things up so far:
sizeof => size in bytes, and
CHAR_BIT => number of bits in byte
Thus, depending on your system, sizeof (unsigned int) could be any value greater than zero (not just 2 or 4): if CHAR_BIT is 16, then a single (sixteen-bit) byte has enough bits in it to represent the sixteen-bit integer described by the standard (quoted below). That's not necessarily useful information, is it? Let's delve deeper...
Integer representation
The C standard specifies the minimum precision/range for all standard integer types (and CHAR_BIT, too, fwiw) here. From this, we can derive a minimum for how many bits are required to store the value, but we may as well just choose our variables based on ranges. Nonetheless, a huge part of the detail required for this answer resides here. For example, the following shows that the standard requires unsigned int to have (at least) sixteen bits of storage:
UINT_MAX 65535 // 2¹⁶ - 1
Thus we can see that unsigned int requires (at least) 16 bits, which is where you get the two bytes (assuming CHAR_BIT is 8)... and later, when that limit increased to 2³² - 1, people were stating 4 bytes instead. This explains the phenomenon you've observed:
Most of the textbooks say integer variables occupy 2 bytes. But when I run a program printing the successive addresses of an array of integers it shows the difference of 4.
You're using an ancient textbook and compiler which is teaching you non-portable C; the author who wrote your textbook might not even be aware of CHAR_BIT. You should upgrade your textbook (and compiler), and strive to remember that I.T. is an ever-evolving field that you need to stay ahead of to compete... Enough about that, though; let's see what other non-portable secrets those underlying integer bytes store...
Value bits are what the common misconceptions appear to be counting. The above example uses an unsigned integer type which typically contains only value bits, so it's easy to miss the devil in the detail.
Sign bits... In the above example I quoted UINT_MAX as being the upper limit for unsigned int because it's a trivial example to extract the value 16 from the comment. For signed types, in order to distinguish between positive and negative values (that's the sign), we need to also include the sign bit.
INT_MIN -32767 // -(2¹⁵ - 1)
INT_MAX +32767 // 2¹⁵ - 1
Padding bits... While it's not common to encounter computers that have padding bits in integers, the C standard allows that to happen; some machines (i.e. this one) implement larger integer types by combining two smaller (signed) integer values together... and when you combine signed integers, you get a wasted sign bit. That wasted bit is considered padding in C. Other examples of padding bits might include parity bits and trap bits.
As you can see, the standard seems to encourage considering ranges like INT_MIN..INT_MAX and other minimum/maximum values from the standard when choosing integer types, and discourages relying upon sizes as there are other subtle factors likely to be forgotten such as CHAR_BIT and padding bits which might affect the value of sizeof (int) (i.e. the common misconceptions of two-byte and four-byte integers neglects these details).
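As a short sketch (my own) of checking those details instead of assuming them - it reports CHAR_BIT, the total storage bits of int, and its actual range:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT               = %d\n", CHAR_BIT);
    printf("sizeof(int) * CHAR_BIT = %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("INT_MIN .. INT_MAX     = %d .. %d\n", INT_MIN, INT_MAX);
    return 0;
}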
The only guarantees are that char must be at least 8 bits wide, short and int must be at least 16 bits wide, and long must be at least 32 bits wide, and that sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long) (same is true for the unsigned versions of those types).
int may be anywhere from 16 to 64 bits wide depending on the platform.
Is the size of C “int” 2 bytes or 4 bytes?
The answer is "yes" / "no" / "maybe" / "maybe not".
The C programming language specifies the following: the smallest addressable unit, known by char and also called "byte", is exactly CHAR_BIT bits wide, where CHAR_BIT is at least 8.
So, one byte in C is not necessarily an octet, i.e. 8 bits. In the past one of the first platforms to run C code (and Unix) had 4-byte int - but in total int had 36 bits, because CHAR_BIT was 9!
int is supposed to be the natural integer size for the platform that has range of at least -32767 ... 32767. You can get the size of int in the platform bytes with sizeof(int); when you multiply this value by CHAR_BIT you will know how wide it is in bits.
While 36-bit machines are mostly dead, there are still platforms with non-8-bit bytes. Just yesterday there was a question about a Texas Instruments MCU with 16-bit bytes, that has a C99, C11-compliant compiler.
On TMS320C28x it seems that char, short and int are all 16 bits wide, and hence one byte. long int is 2 bytes and long long int is 4 bytes. The beauty of C is that one can still write an efficient program for a platform like this, and even do it in a portable manner!
Mostly it depends on the platform you are using. It varies from compiler to compiler. Nowadays, in most compilers an int is 4 bytes.
If you want to check what your compiler is using you can use sizeof(int).
#include <stdio.h>

int main(void)
{
    printf("%zu\n", sizeof(int));
    printf("%zu\n", sizeof(short));
    printf("%zu\n", sizeof(long));
    return 0;
}
The only thing the C compiler promises is that the size of short must be less than or equal to the size of int, and the size of long must be greater than or equal to the size of int. So if the size of int is 4, then the size of short may be 2 or 4 but not larger, and the size of long must be 4 or more. (Nothing, incidentally, forbids short and long from having the same size.)
This depends on implementation, but usually on x86 and other popular architectures like ARM ints take 4 bytes. You can always check at compile time using sizeof(int) or whatever other type you want to check.
If you want to make sure you use a type of a specific size, use the types in <stdint.h>
#include <stdio.h>

int main(void) {
    printf("size of int: %d", (int)sizeof(int));
    return 0;
}
This returns 4, but it's probably machine dependent.
Is the size of C “int” 2 bytes or 4 bytes?
Does an Integer variable in C occupy 2 bytes or 4 bytes?
C allows "bytes" to be something other than 8 bits per "byte".
CHAR_BIT number of bits for smallest object that is not a bit-field (byte) C11dr §5.2.4.2.1 1
A value other than 8 is increasingly uncommon. For maximum portability, use CHAR_BIT rather than 8. The size of an int in bits in C is sizeof(int) * CHAR_BIT.
#include <limits.h>
#include <stdio.h>
printf("(int) Bit size %zu\n", sizeof(int) * CHAR_BIT);
What are the factors that it depends on?
The int bit size is commonly 32 or 16 bits. C specified minimum ranges:
minimum value for an object of type int INT_MIN -32767
maximum value for an object of type int INT_MAX +32767
C11dr §5.2.4.2.1 1
The minimum range for int forces the bit size to be at least 16 - even if the processor was "8-bit". A size like 64 bits is seen in specialized processors. Other values like 18, 24, 36, etc. have occurred on historic platforms or are at least theoretically possible. Modern coding rarely worries about non-power-of-2 int bit sizes.
The computer's processor and architecture drive the int bit size selection.
Yet even with 64-bit processors, the compiler's int size may be 32-bit for compatibility reasons as large code bases depend on int being 32-bit (or 32/16).
This is a good source for answering this question.
But the honest answer to this question is always "Yes. Both."
It depends on your architecture. If you're going to work on a 16-bit machine or smaller, it can't be 4 bytes (= 32 bits). If you're working on a 32-bit or better machine, its length is typically 32 bits.
To figure it out, get your program ready to output something readable and use the sizeof operator. It returns the size in bytes of your declared datatype. But be careful when using this with arrays.
If you're declaring int t[12]; it will return 12*4 bytes. To get the length of this array, just use sizeof(t)/sizeof(t[0]).
If you are going to write a function that should calculate the size of an array passed to it, remember that in

#include <stdio.h>

typedef int array[12];

int function(array t) {                       /* t decays to a pointer (int *) here */
    int size_of_t = sizeof(t) / sizeof(t[0]); /* really sizeof(int *) / sizeof(int) */
    return size_of_t;
}

int main(void) {
    array t = {1, 1, 1};   /* remember: t = [1, 1, 1, 0, ..., 0] */
    int a = function(t);   /* passing t passes only a pointer to its first element */
    printf("%d\n", a);     /* prints sizeof(int *) / sizeof(int), e.g. 1 or 2 - not 12 */
    return 0;
}

the call to sizeof inside the function won't return the array length. If you define an array and take its length in the same scope, use sizeof. If you pass an array to a function, remember that the value passed is just a pointer to the first element. In the first case you always know what size your array has. The second case can be handled by passing the size along explicitly: define function(array t) and function2(array t, int size_of_t), measure the length where the real array is still in scope, and pass the result to function2, which can then do whatever it wants with arrays of varying sizes.
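Here is a small sketch (my own, with hypothetical names) of that explicit-size approach: compute the element count with sizeof at the call site, where the real array is still in scope, and pass it along with the decayed pointer:

#include <stdio.h>

void print_all(const int *t, size_t size_of_t)
{
    size_t i;
    for (i = 0; i < size_of_t; i++)
        printf("%d ", t[i]);
    printf("\n");
}

int main(void)
{
    int t[12] = {1, 1, 1};                   /* remaining elements are 0 */
    print_all(t, sizeof(t) / sizeof(t[0]));  /* count computed where t is still an array */
    return 0;
}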

Range of char type values in C

Is it possible to store more than a byte value to a char type?
Say for example char c; and I want to store 1000 in c. is it possible to do that?
Technically, no, you can't store more than a byte value to a char type. In C, a char and a byte are the same size, but not necessarily limited to 8 bits. Many standards bodies tend to use the term "octet" for an exactly-8-bit value.
If you look inside limits.h (from memory), you'll see the CHAR_BIT symbol (among others) telling you how many bits are actually used for a char and, if this is large enough then, yes, it can store the value 1000.
The range of values you can store in a C type depends on its size, and that is not specified by C, it depends on the architecture. A char type has a minimum of 8 bits. And typically (almost universally) that's also its maximum (you can check it in your limits.h).
Hence, in a char you will be able to store from -128 to 127, or from 0 to 255 (signed or unsigned).
The minimum size for a char in C is 8 bits, which is not wide enough to hold more than 256 values. It may be wider in a particular implementation such as a word-addressable architecture, but you shouldn't rely on that.
Include limits.h and check the value of CHAR_MAX.
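A tiny sketch (mine) of that check done at compile time instead of by hand:

#include <limits.h>

#if CHAR_MAX >= 1000
/* a plain char can hold 1000 on this implementation */
#else
/* it cannot - use int, or a wider type from <stdint.h>, instead */
#endif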
Probably not. The C standard requires that a char be at least 8 bits wide, so you can't portably depend on being able to store a value wider than 8 bits in a char.*
(* In most commonly-used systems today, chars are exactly 8 bits.)
Char's width is system-dependent. But assuming you're using something reasonably C99-compatible, you should have access to a header stdint.h, which defines types of the form intN_t and uintN_t where N = 8, 16, 32, 64. Where they exist, these are guaranteed to be exactly N bits wide. So if you want to be certain of having a type with a specific number of bits (regardless of system), those are the guys you want.
Example:
#include <stdint.h>
uint32_t foo; /* Unsigned, 32 bits */
int16_t bar; /* Signed, 16 bits */

Any guaranteed minimum sizes for types in C?

Can you generally make any assumptions about the minimum size of a data type?
What I have read so far:
char: 1 Byte
short: 2 Byte
int: 2 Byte, typically 4 Byte
long: 4 Byte
float??? double???
Are the values in float.h and limits.h system dependent?
This is covered in the Wikipedia article:
A short int must not be larger than an int.
An int must not be larger than a long int.
A short int must be at least 16 bits long.
An int must be at least 16 bits long.
A long int must be at least 32 bits long.
A long long int must be at least 64 bits long.
The standard does not require that any of these sizes be necessarily different. It is perfectly valid, for example, if all four types are 64 bits long.
Yes, the values in float.h and limits.h are system dependent. You should never make assumptions about the width of a type, but the standard does lay down some minimums. See §6.2.5 and §5.2.4.2.1 in the C99 standard.
For example, the standard only says that a char should be large enough to hold every character in the execution character set. It doesn't say how wide it is.
For the floating-point case, the standard hints at the order in which the widths of the types are given:
§6.2.5 ¶10
There are three real floating types, designated as float, double, and long double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.
This implicitly defines which type is wider than which, but not specifically how wide each is. "Subset" itself is vague, because a long double can have the exact same range as a double and still satisfy this clause.
This is pretty typical of how C goes, and a lot is left to each individual environment. You can't assume, you have to ask the compiler.
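As a sketch (my own) of "asking the compiler": the macros below come straight from <float.h> and <limits.h>, so printing them shows exactly what your environment provides:

#include <float.h>
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT = %d, INT_MAX = %d, LONG_MAX = %ld\n",
           CHAR_BIT, INT_MAX, LONG_MAX);
    printf("float:       %d decimal digits, max %e\n", FLT_DIG, FLT_MAX);
    printf("double:      %d decimal digits, max %e\n", DBL_DIG, DBL_MAX);
    printf("long double: %d decimal digits, max %Le\n", LDBL_DIG, LDBL_MAX);
    return 0;
}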
Nine years and still no direct answer about the minimum size for float, double, long double.
Any guaranteed minimum sizes for types in C?
For floating point type ...
From a practical point of view, float's minimum size is 32 bits and double's is 64 bits. C allows double and long double to share similar characteristics, so a long double could be as small as a double (Example 1), or 80-bit, or 128-bit, or ...
I could imagine that a C-compliant 48-bit double may have existed - yet I do not know of any.
Now, let us imagine our rich uncle dies and left us a fortune to pay for the development and cultural promotion for www.smallest_C_float.com.
C specifies:
float finite range is at least [1E-37… 1E+37]. See FLT_MIN, FLT_MAX
(1.0f + FLT_EPSILON) – 1.0f <= 1E-5.
float supports positive and negative values.
Let X: Digit 1-9
Let Y: Digit 0-9
Let E: value -37 to 36
Let S: + or -
Let b: 0 or 1
Our float could minimally represent all the combinations, using base 10, of SX.YYYYY*10^E.
0.0 and ±1E+37 are also needed (3 more). We do not need -0.0, sub-normals, ±infinity nor not-a-numbers.
That is 2*9*10^5*74 + 3 combinations, or 133,200,003, which needs at least 27 bits to encode - somehow. Recall the goal is minimal size.
With a classic base 2 approach, we can assume an implied 1 and get
S1.bbbb_bbbb_bbbb_bbbb_b * 2^e, or 2*2^17*226 combinations, or 26 bits.
If we try base 16, we then need about 2*15*16^(4 or 5)*57 combinations, or at least 26 to 30 bits.
Conclusion: A C float needs at least 26 bits of encoding.
A C double need not express a greater exponential range than float; it only has a stricter minimum precision requirement: 1E-9.
S1.bbbb_bbbb_bbbb_bbbb_bbbb_bbbb_bbbb_bb * 2^e --> 2*2^30*226 combinations, or 39 bits.
On our imagine-if-you-will computer, we could have a 13-bit char and so encode float, double, long double without padding. Thus we can realize a non-padded 26-bit float and 39-bit double, long double.
1: Microsoft Visual C++ for x86, which makes long double a synonym for double
[Edit] 2020
Additional double requirements may require 41 bits. May have to use 42-bit double and 28-bit float. Will need to review. Uncle will not be happy.
However, C99 specifies (in stdint.h) least-width types like uint_least8_t, int_least32_t, and so on; these are required to exist, unlike the optional exact-width intN_t types.
(see the Wikipedia article on stdint.h)
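A small sketch (mine) using one of those least-width types; the PRIuLEAST16 format macro comes from <inttypes.h>:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint_least16_t x = 50000;   /* guaranteed to be at least 16 bits wide */
    printf("x = %" PRIuLEAST16 ", sizeof = %zu\n", x, sizeof x);
    return 0;
}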
If you want to check whether the size (in multiples of chars) of any type on your system/platform really is the size you expect, you could do:
enum CHECK_FLOAT_IS_4_CHARS
{
    IF_THIS_FAILS_FLOAT_IS_NOT_4_CHARS = 1/(sizeof(float) == 4)
};
Often developers asking this kind of question are dealing with arranging a packed struct to match a defined memory layout (as for a message protocol). The assumption is that the language should directly specify laying out 16-, 24-, 32-bit, etc. fields for the purpose.
That is routine and acceptable for assembly languages and other application-specific languages closely tied to a particular CPU architecture, but is sometimes a problem in a general purpose language which might be targeted at who-knows-what kind of architecture.
In fact, the C language was not intended for a particular hardware implementation. It was specified generally so a C compiler implementer could properly adapt to the realities of a particular CPU. A Frankenstein hardware architecture consisting of 9 bit bytes, 54 bit words, and 72 bit memory addresses is easily—and unambiguously—mapped to C features. (char is 9 bits; short int, int, and long int are 54 bits.)
This generality is why the C specification says something to the effect of "don't expect much about the sizes of ints beyond sizeof (char) <= sizeof (short int) <= sizeof (int) <= sizeof (long int)." That implies that chars could be the same size as longs!
The current reality is—and the future seems to hold—that software demands architectures provide 8-bit bytes and that memory words be addressable as individual bytes. This wasn't always so. Not too long ago, I worked on the CDC Cyber architecture, which features 6-bit "bytes" and 60-bit words. A C implementation on that would be interesting. In fact, that architecture is responsible for the weird packing semantics of Pascal—if anyone remembers that.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
C99 specifies two types of integer guarantees:
minimum size guarantees
relative sizes between the types
Relative guarantees
6.2.5 Types:
8 For any two integer types with the same signedness and different integer conversion rank
(see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a
subrange of the values of the other type.
and 6.3.1.1 Boolean, characters, and integers determines the relative conversion ranks:
1 Every integer type has an integer conversion rank defined as follows:
The rank of long long int shall be greater than the rank of long int, which
shall be greater than the rank of int, which shall be greater than the rank of short
int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has
greater rank than T3, then T1 has greater rank than T3
Absolute minimum sizes
Mentioned by https://stackoverflow.com/a/1738587/895245 , here is the quote for convenience.
5.2.4.2.1 Sizes of integer types <limits.h>:
1 [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown [...]
UCHAR_MAX 255 // 2^8 − 1
USHRT_MAX 65535 // 2^16 − 1
UINT_MAX 65535 // 2^16 − 1
ULONG_MAX 4294967295 // 2^32 − 1
ULLONG_MAX 18446744073709551615 // 2^64 − 1
Floating point
If the __STDC_IEC_559__ macro is defined, then IEEE types are guaranteed for each C type, although long double has a few possibilities: Is it safe to assume floating point is represented using IEEE754 floats in C?
Quoting the standard does give what is defined to be "the correct answer" but it doesn't actually reflect the way programs are generally written.
People make assumptions all the time that char is 8 bits, short is 16, int is 32, long is either 32 or 64, and long long is 64.
Those assumptions are not a great idea but you will not get fired for making them.
In theory, <stdint.h> can be used to specify fixed-bit-width types, but you have to scrounge one up for Microsoft. (See here for a MS stdint.h.) One of the problems here is that C++ technically only needs C89 compatibility to be a conforming implementation; even for plain C, C99 is not fully supported even in 2009.
It's also not accurate to say there is no width specification for char. There is, the standard just avoids saying whether it is signed or not. Here is what C99 actually says:
number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
minimum value for an object of type signed char
SCHAR_MIN -127 // -(2^7 - 1)
maximum value for an object of type signed char
SCHAR_MAX +127 // 2^7 - 1
maximum value for an object of type unsigned char
UCHAR_MAX 255 // 2^8 - 1
Most of the libraries define something like this:
#ifdef MY_ARCHITECTURE_1
typedef unsigned char u_int8_t;
typedef short int16_t;
typedef unsigned short u_int16_t;
typedef int int32_t;
typedef unsigned int u_int32_t;
typedef unsigned char u_char;
typedef unsigned int u_int;
typedef unsigned long u_long;
typedef unsigned short u_short;
#endif
You can then use those typedefs in your programs instead of the standard types.
