Does the ANSI C specification call for the size of int to be equal to the word size (32-bit / 64-bit) of the system?
In other words, can I decipher the word size of the system based on the space allocated to an int?
The size of the int type is implementation-dependent, but cannot be shorter than 16 bits. See the Minimum Type Limits section here.
This Linux kernel development site claims that the size of the long type is guaranteed to be the machine's word size, but that statement is likely to be false: I couldn't find any confirmation of that in the standard, and long is only 32 bits wide on Win64 systems (since these systems use the LLP64 data model).
The language specification recommends that int should have the natural "word" size for the hardware platform. However, it is not strictly required. As you may have noticed, to simplify the transition from 32-bit to 64-bit code, some modern implementations prefer to keep int as a 32-bit type even when the underlying hardware platform has a 64-bit word size.
And as Frederic already noted, in any case the size may not be smaller than 16 value-forming bits.
The original intention was that int would be the word size - the most efficient data-processing size. Still, what tends to happen is that massive amounts of code are written that assume the size of int is X bits, and when the hardware that code runs on moves to a larger word size, the carelessly written code breaks. Compiler vendors have to keep their customers happy, so they say "OK, we'll leave int sized as before, but we'll make long bigger now". Or, "ahhh... too many people complained about us making long bigger, we'll create a long long type while leaving sizeof(int) == sizeof(long)". So, these days, it's all a mess:
Does the ANSI C specification call for the size of int to be equal to the word size (32-bit / 64-bit) of the system?
Pretty much the idea, but it doesn't insist on it.
In other words, can I decipher the word size of the system based on the space allocated to an int?
Not in practice.
You should check the limits.h header file provided with your system. The INT_MAX definition should help you back-calculate the minimum size an int must have. For details, look at http://www.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html
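For instance, a minimal sketch of that back-calculation (assuming nothing beyond <limits.h> and <stdio.h>; the variable names are just illustrative):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* Count the value-forming bits of int by shifting INT_MAX down to zero. */
    int value_bits = 0;
    unsigned int max = INT_MAX;
    while (max != 0) {
        value_bits++;
        max >>= 1;
    }
    printf("INT_MAX      = %d\n", INT_MAX);
    printf("value bits   = %d (plus one sign bit)\n", value_bits);
    /* sizeof reports storage size, which may also include padding bits. */
    printf("storage bits = %zu\n", sizeof(int) * CHAR_BIT);
    return 0;
}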
Related
Why is there a different size for C data types on 32-bit and 64-bit systems?
For example: int is 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
What is the reason for doubling the size of data types on 64-bit? As far as I know, there is no performance issue if we use the same sizes on a 64-bit system as on a 32-bit one.
Why is there a different size for C data types on 32-bit and 64-bit systems?
The sizes of C's basic data types are implementation-dependent. They are not necessarily dependent on machine architecture, and counterexamples abound. For example, 32-bit and 64-bit implementations of GCC use the same size data types for x86 and x86_64.
What is the reason for doubling the size of data types on 64-bit?
The reasons for implementation decisions vary with implementors and implementation characteristics. int is often, but not always, chosen to have a size that is natural for the target machine in some sense. That might mean that operations on it are fast, or that it is efficient to load from and store to memory, or other things. These are the kinds of considerations involved.
The C language definition does not mandate a specific size for most data types; instead, it specifies the range of values each type must be able to represent. char must be large enough to represent a single character of the execution character set (at least the range [-127,127]), short and int must be large enough to represent at least the range [-32767,32767], etc.
Traditionally, the size of int was the same as the "natural" word size for a given architecture, on the premise that a) it would be easier to implement, and b) operations on that type would be the most efficient. Whether that's still true today, I'm not qualified to say (not a hardware guy).
So, I was writing an implementation of ye olde SHA1 algorithm in C (I know it's insecure, it's for the Matasano problems), and for some of the variables it's pretty crucial that they're exactly 32 bits long. Having read that unsigned long int is 32 bits by the standard, I just used that, and then spent 4 hours trying to find out why the hell my hashes were coming out all wrong, until I thought to check how wide unsigned long int actually was. Spoiler: it was 64 bits.
Now of course I'm using uint32_t, and always will in the future, but could someone (preferably, someone who has more discretion and less gullibility than I do) please point me to where the actual standards for variable sizes are written down for modern C? Or tell me why this question is misguided, if it is?
The minimum sizes are:
char - 8-bit
short - 16-bit
int - 16-bit
long - 32-bit
long long - 64-bit
There are no maximum sizes; the compiler writer chooses whatever they think will work best for the target platform. The corresponding unsigned types have the same minimum sizes.
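A quick way to see what your own compiler chose is a sketch like this (the comments assume a typical LP64 Linux/GCC build; an LLP64 system such as 64-bit Windows would report 4 for long):

#include <stdio.h>

int main(void)
{
    /* The actual sizes are implementation-defined; only the minimums above are guaranteed. */
    printf("char      : %zu\n", sizeof(char));      /* always 1 */
    printf("short     : %zu\n", sizeof(short));     /* typically 2 */
    printf("int       : %zu\n", sizeof(int));       /* typically 4 */
    printf("long      : %zu\n", sizeof(long));      /* 8 on LP64, 4 on LLP64 */
    printf("long long : %zu\n", sizeof(long long)); /* typically 8 */
    return 0;
}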
In C, integer and short integer variables are identical: both range from -32768 to 32767, and the number of bytes required by both is also identical, namely 2.
So why are two different types necessary?
Basic integer types in the C language do not have strictly fixed ranges. They only have minimum range requirements specified by the language standard. That means that your assertion about int and short having the same range is generally incorrect.
Even though the minimum range requirements for int and short are the same, in a typical modern implementation the range of int is usually greater than the range of short.
The standard only guarantees sizeof(short) <= sizeof(int) <= sizeof(long), as far as I remember. So short and int can be the same size, but they don't have to be. 32-bit compilers usually have a 2-byte short and a 4-byte int.
The C++ standard says the following (the C standard has a very similar paragraph; this quote is from the n3337 version of the C++11 draft specification):
Section 3.9.1, point 2:
There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types. Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs.
Different architectures have different "natural" integer sizes, so a 16-bit architecture will naturally calculate with 16-bit values, whereas a 32- or 64-bit architecture will use 32- or 64-bit ints. It's a choice for the compiler producer (or the definer of the ABI for a particular architecture, which tends to be a decision formed by a combination of the OS and the "main" compiler producer for that architecture).
In modern C and C++, there are types along the lines of int32_t that are guaranteed to be exactly 32 bits. This helps portability. If these types aren't sufficient (or the project is using a not-so-modern compiler), it is a good idea NOT to use int in a data structure or type that needs a particular precision/size, but to define a uint32 or int32 or something similar that can be used in all places where the size matters.
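A rough sketch of that pattern (the names my_u32 and my_u32_is_4_bytes are purely illustrative, not taken from any particular project):

#include <stdint.h>

/* With a C99-or-later compiler, just use the exact-width type directly. */
typedef uint32_t my_u32;

/* With an older compiler you would pick a base type by hand, e.g.
   typedef unsigned int my_u32; and then it pays to verify the choice.
   This array type fails to compile if my_u32 is not exactly 4 bytes. */
typedef char my_u32_is_4_bytes[(sizeof(my_u32) == 4) ? 1 : -1];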
In a lot of code, the size of a variable isn't critical, because the numbers involved are in a range where a few thousand is far more than you will ever need - e.g. the number of characters in a filename is defined by the OS, and I'm not aware of any OS where a filename/path is more than 4K characters - so a 16-, 32- or 64-bit value that can count to at least 32K would be perfectly fine for counting that; it doesn't really matter what size it is. So here we SHOULD use int, not try to use a specific size. int should be the type the compiler treats as "efficient", so it should give good performance, whereas some architectures will run slower if you use short, and 16-bit architectures will certainly run slower using long.
The guaranteed minimum ranges of int and short are the same. However, an implementation is free to define short with a smaller range than int (as long as it still meets the minimum), which means that it may be expected to take the same or less storage space than int [1]. The standard says of int that:
A ‘‘plain’’ int object has the natural size suggested by the architecture of the execution environment.
Taken together, this means that (for values that fall into the range -32767 to 32767) portable code should prefer int in almost all cases. The exception would be where a very large number of values are being stored, such that the potentially smaller storage space occupied by short is a consideration.
1. Of course a pathological implementation is free to define a short that has a larger size in bytes than int, as long as it still has equal or lesser range - there is no good reason to do so, however.
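To illustrate that exception, here is a minimal sketch; the sizes in the comments assume a typical implementation with a 16-bit short and a 32-bit int:

#include <stdio.h>

/* static so the large arrays do not live on the (usually small) stack */
static short small[100000];
static int   wide[100000];

int main(void)
{
    printf("array of short: %zu bytes\n", sizeof small); /* typically 200000 */
    printf("array of int  : %zu bytes\n", sizeof wide);  /* typically 400000 */
    return 0;
}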
They are both identical on a 16-bit IBM-compatible PC. However, it is not guaranteed that they will be identical on other hardware as well.
A VAX-type system (VAX stands for Virtual Address eXtension) treats these two types differently: it uses 2 bytes for a short integer and 4 bytes for an integer.
That is why we have two different, though sometimes identical, types with their own properties.
For general-purpose use on desktops and laptops, we use int.
How is the size of int decided?
Is it true that the size of int depends on the processor? For a 32-bit machine it will be 32 bits, and for a 16-bit machine it's 16.
On my machine it shows as 32 bits, although the machine has a 64-bit processor and 64-bit Ubuntu installed.
It depends on the implementation. The only thing the C standard guarantees is that
sizeof(char) == 1
and
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
and also some minimum ranges of representable values for the types, which imply that char is at least 8 bits wide, int at least 16 bits, etc.
So it must be decided by the implementation (compiler, OS, ...) and be documented.
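Those relationships can be written down as compile-time checks; a small sketch in C11 (static_assert comes from <assert.h> there):

#include <assert.h> /* defines static_assert in C11 */

/* These encode the guarantees listed above and should hold on any
   implementation you are likely to meet. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(char)  <= sizeof(short),     "char <= short");
static_assert(sizeof(short) <= sizeof(int),       "short <= int");
static_assert(sizeof(int)   <= sizeof(long),      "int <= long");
static_assert(sizeof(long)  <= sizeof(long long), "long <= long long");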
It depends on the compiler.
For example: try an old Turbo C compiler and it would give a size of 16 bits for an int, because the word size (the size the processor can address with the least effort) at the time the compiler was written was 16 bits.
Making int as wide as possible is not the best choice. (The choice is made by the ABI designers.)
A 64bit architecture like x86-64 can efficiently operate on int64_t, so it's natural for long to be 64 bits. (Microsoft kept long as 32bit in their x86-64 ABI, for various portability reasons that make sense given the existing codebases and APIs. This is basically irrelevant because portable code that actually cares about type sizes should be using int32_t and int64_t instead of making assumptions about int and long.)
Having int be int32_t actually makes for better, more efficient code in many cases. An array of int uses only 4B per element and has only half the cache footprint of an array of int64_t. Also, specific to x86-64, 32bit operand-size is the default, so 64bit instructions need an extra code byte for a REX prefix. So code density is better with 32bit (or 8bit) integers than with 16 or 64bit. (See the x86 wiki for links to docs / guides / learning resources.)
If a program requires 64bit integer types for correct operation, it won't use int. (Storing a pointer in an int instead of an intptr_t is a bug, and we shouldn't make the ABI worse to accommodate broken code like that.) A programmer writing int probably expected a 32bit type, since most platforms work that way. (The standard of course only guarantees 16bits).
Since there's no expectation that int will be 64bit in general (e.g. on 32bit platforms), and making it 64bit will make some programs slower (and almost no programs faster), int is 32bit in most 64bit ABIs.
Also, there needs to be a name for a 32bit integer type, for int32_t to be a typedef for.
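To illustrate the pointer point above, a minimal sketch of the correct round-trip through intptr_t (as opposed to squeezing the pointer into an int, which loses the upper bits on common 64-bit ABIs where int is 32 bits and pointers are 64 bits):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int x = 42;
    /* intptr_t is defined so that a pointer converted to it can be converted back;
       int has no such guarantee. */
    intptr_t as_integer = (intptr_t)(void *)&x;
    int *back = (int *)(void *)as_integer;
    printf("%d\n", *back); /* prints 42 */
    return 0;
}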
It depends on the compiler.
If you are using Turbo C, the integer size is 2 bytes.
If you are using the GNU GCC compiler, the integer size is 4 bytes.
It depends only on the implementation of the C compiler.
The size of int basically depends on the architecture of your system.
Generally, if you have a 16-bit machine, your compiler will support an int of size 2 bytes.
If your system is 32-bit, the compiler will support 4 bytes for int.
In more detail:
This is where the concept of the data bus comes into the picture: 16-bit, 32-bit and so on refers to the size of the data bus in your system.
The data bus size matters for the size of an integer because the purpose of the data bus is to provide data to the processor. The maximum it can provide in a single fetch is important, and that maximum size is what the compiler prefers to use for the data at a time.
Based on this data bus size, the compiler is designed to use the maximum size of the data bus as the size of int.
x86 -> 16-bit -> DOS -> Turbo C -> size of int -> 2 bytes
x386 -> 32-bit -> Windows/Linux -> GCC -> size of int -> 4 bytes
Yes, the size of int depends on the compiler.
For a 16-bit int the range is -32768 to 32767. For 32-bit and 64-bit compilers it will be larger.
In this article, taken from the book "Linux Kernel Development":
http://www.makelinux.net/books/lkd2/ch19lev1sec2
it says:
The size of the C long type is guaranteed to be the machine's word size. On the downside, however, code cannot assume that the standard C types have any specific size. Furthermore, there is no guarantee that an int is the same size as a long
The question is: I thought int was the same as the word size, not long, and I couldn't find any official standard that defines this.
Any thoughts?
Sometimes, people on the Internet are wrong. The sizes are fixed by the ABI. Linux ports don't necessarily create an original ABI (usually another platform or manufacturer recommendation is followed), so there's nobody making guarantees about int and long. The term "machine word" is also very ill-defined.
The size of the C long type is guaranteed to be the machine's word size.
This is wrong for a lot of platforms. For example, in the embedded world, 8-bit MCUs (e.g., HC08) usually have an 8-bit word size and 16-bit MCUs (e.g., MSP430) have a 16-bit word size, but long is 32 bits on these platforms. On Windows x64 (the MSVC compiler), the size of a word is 64 bits but long is 32 bits.
The C standard does not know what a word is, and a C implementation might do things in unusual ways, so your book is wrong (for example, some C implementation might use a 64-bit long on an 8-bit microcontroller).
However, the C99 standard defines the <stdint.h> header with types like intptr_t (an integer type with the same size as void* pointers) or int64_t (a 64-bit integer), etc.
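For instance, a short C99 sketch showing that those names pin the width regardless of what the implementation chose for int or long (the PRId64 format macro comes from <inttypes.h>):

#include <stdint.h>
#include <inttypes.h>
#include <limits.h>
#include <stdio.h>

int main(void)
{
    int64_t big = INT64_C(1) << 40;  /* needs 64 bits no matter how wide long is */
    printf("big = %" PRId64 "\n", big);
    printf("intptr_t is %zu bits here\n", sizeof(intptr_t) * CHAR_BIT);
    return 0;
}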
See also this question, and wikipedia's page on C data types.