Integer size in C depends on what?

Size of the integer depends on what?
Is the size of an int variable in C dependent on the machine or the compiler?

It's implementation-dependent. The C standard only requires that:
char has at least 8 bits
short has at least 16 bits
int has at least 16 bits
long has at least 32 bits
long long has at least 64 bits (added in 1999)
sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long)
In the 16/32-bit days, the de facto standard was:
int was the "native" integer size
the other types were the minimum size allowed
However, 64-bit systems generally did not make int 64 bits, which would have created the awkward situation of having three 64-bit types and no 32-bit type. Some compilers expanded long to 64 bits.
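As a quick sanity check, here is a minimal sketch (assuming a hosted C99-or-later compiler, for the %zu format) that prints what a given implementation actually chose; on a typical x86-64 Linux/gcc build it prints 1, 2, 4, 8, 8, but other implementations may legitimately differ:

#include <stdio.h>

int main(void)
{
    /* sizeof reports sizes in bytes (units of char); only the minimums
     * listed above are guaranteed by the standard. */
    printf("char:      %zu\n", sizeof(char));
    printf("short:     %zu\n", sizeof(short));
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));
    printf("long long: %zu\n", sizeof(long long));
    return 0;
}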

Formally, representations of all fundamental data types (including their sizes) are compiler-dependent and only compiler-dependent. The compiler (or, more properly, the implementation) can serve as an abstraction layer between the program and the machine, completely hiding the machine from the program or distorting it in any way it pleases.
But in practice compilers are designed to generate the most efficient code for a given machine and/or OS. In order to achieve that, the fundamental data types should have a natural representation for the given machine and/or OS. In that sense, these representations are indirectly dependent on the machine and/or OS.
In other words, from the abstract, formal and pedantic point of view the compiler is free to completely ignore the data type representations specific to the machine. But it makes no practical sense. In practice compilers make full use of data type representations provided by the machine.
Still, if some data type is not supported by the machine, the compiler can still provide that data type to programs by implementing its support at the compiler level ("emulating" it). For example, 64-bit integer types are normally available in 32-bit compilers for 32-bit machines, even though they are not directly supported by the machine. Back in the day, compilers would often provide compiler-level support for floating-point types on machines that were not equipped with a floating-point co-processor (and therefore did not support floating-point types directly).
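For instance, as a small illustrative sketch (not tied to any particular compiler), a 32-bit build such as gcc -m32 on x86 can still do 64-bit arithmetic; the compiler simply emits multi-instruction sequences to emulate it:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint64_t a = 0x100000000ULL;     /* 2^32: does not fit in 32 bits */
    uint64_t b = a * 3 + 7;          /* emulated with 32-bit instructions in a 32-bit build */
    printf("b = %" PRIu64 "\n", b);  /* prints 12884901895 either way */
    return 0;
}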

It depends primarily on the compiler. For example, if you have a 64-bit x86 processor, you can use an old 16-bit compiler and get 16-bit ints, a 32-bit compiler and get 32-bit ints, or a 64-bit compiler and get 64-bit ints.
It depends on the processor to the degree that the compiler targets a particular processor, and (for example) an ancient 16-bit processor simply won't run code that targets a shiny new 64-bit processor.
The C and C++ standards do guarantee some minimum size (indirectly by specifying minimum supported ranges):
char: 8 bits
short: 16 bits
int: 16 bits
long: 32 bits
long long: 64 bits
They also guarantee that the sizes/ranges are non-decreasing in the following order: char, short, int, long, and long long¹.
¹ long long is specified in C99 and C++0x, but some compilers (e.g., gcc, Intel, Comeau) allow it in C++03 code as well. If you want to, you can persuade most (if not all) of them to reject long long in C++03 code.

As MAK said, it's implementation dependent. That means it depends on the compiler. Typically, a compiler targets a single machine so you can also think of it as machine dependent.

AFAIK, the size of data types is implementation dependent. This means that it is entirely up to the implementer (i.e. the guy writing the compiler) to choose what it will be.
So, in short, it depends on the compiler. But it is often simplest to use whatever size maps most easily to the word size of the underlying machine, so the compiler usually picks the size that fits the underlying machine best.

It depends on the running environment, no matter what hardware you have. If you are using a 16-bit OS like DOS, it will be 2 bytes. On a 32-bit OS like Windows or Unix, it is 4 bytes, and so on. Even if you run a 32-bit OS on a 64-bit processor, the size will still be only 4 bytes. I hope this helps.

It depends on both the architecture (machine, executable type) and the compiler. C and C++ only guarantee certain minimums. (I think those are char: 8 bits, int: 16 bits, long: 32 bits)
C99 includes fixed-width types like uint32_t (where available); see <stdint.h>.
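For example, a small hedged sketch of those <stdint.h> exact-width types (they are optional in the standard, but present on any implementation that actually has types of those exact widths):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t  flags  = 0x7F;        /* exactly 8 bits  */
    uint32_t crc    = 0xDEADBEEFu; /* exactly 32 bits */
    int64_t  offset = -1;          /* exactly 64 bits */

    /* Unlike int/long, these sizes do not vary between implementations
     * that provide them. */
    printf("%zu %zu %zu\n", sizeof flags, sizeof crc, sizeof offset); /* 1 4 8 */
    return 0;
}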

The size of an integer variable depends on the compiler (its data model):
with a typical 16-bit compiler:
int is 2 bytes
char is 1 byte
float is 4 bytes
with a typical 32-bit compiler:
int is 4 bytes
char is still 1 byte (sizeof(char) is 1 by definition)
float is still 4 bytes
It is not the case that every type doubles: with a 64-bit compiler int usually stays at 4 bytes, and it is mainly long and pointers that grow wider.

Related

Different size of C data types in 32-bit and 64-bit

Why is there a different size of C data types on 32-bit and 64-bit systems?
For example: the size of int is 4 bytes on 32-bit and 8 bytes on 64-bit.
What is the reason for doubling the size of data types on 64-bit? As far as my knowledge is concerned, there is no performance issue if we use the same sizes on a 64-bit system as on a 32-bit system.
Why is there a different size of C data types on 32-bit and 64-bit systems[?]
The sizes of C's basic data types are implementation-dependent. They are not necessarily dependent on machine architecture, and counterexamples abound. For example, 32-bit and 64-bit implementations of GCC use the same size data types for x86 and x86_64.
What is the reason for doubling the size of data types on 64-bit[?]
The reasons for implementation decisions vary with implementors and implementation characteristics. int is often, but not always, chosen to have a size that is natural for the target machine in some sense. That might mean that operations on it are fast, or that it is efficient to load from and store to memory, or other things. These are the kinds of considerations involved.
The C language definition does not mandate a specific size for most data types; instead, it specifies the range of values each type must be able to represent. char must be large enough to represent a single character of the execution character set (at least the range [-127,127]), short and int must be large enough to represent at least the range [-32767,32767], etc.
Traditionally, the size of int was the same as the "natural" word size for a given architecture, on the premise that a) it would be easier to implement, and b) operations on that type would be the most efficient. Whether that's still true today, I'm not qualified to say (not a hardware guy).
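One way to see that int does not simply track the machine word is to compile the same source as 32-bit and 64-bit code; this is only a sketch and assumes gcc on x86 with 32-bit multilib support installed:

/* sizes.c: build twice, e.g.  gcc -m32 sizes.c  and  gcc -m64 sizes.c */
#include <stdio.h>

int main(void)
{
    printf("int: %zu  long: %zu  void*: %zu\n",
           sizeof(int), sizeof(long), sizeof(void *));
    /* Typical output on Linux:
     *   -m32  -> int: 4  long: 4  void*: 4   (ILP32)
     *   -m64  -> int: 4  long: 8  void*: 8   (LP64)
     * int stays 4 bytes in both modes. */
    return 0;
}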

size of CPU register

It's typically better to use CPU registers to their full capacity.
For a portable piece of code, that means using 64-bit arithmetic and storage on a 64-bit CPU, and only 32-bit on a 32-bit CPU (otherwise, 64-bit operations have to be emulated in 32-bit mode, with devastating performance).
That means it's necessary to detect the size of CPU registers, typically at compile-time (since runtime tests are expensive).
For years now, I've used the simple heuristic sizeof(nativeRegisters) == sizeof(size_t).
It has worked fine for a lot of platforms, but it turns out to be a wrong heuristic for Linux x32: there, size_t is only 32 bits, while the registers can still handle 64 bits. That results in a lost performance opportunity (significant for my use case).
I would like to correctly detect the usable size of CPU registers even in such a situation.
I suspect I could try to find some compiler-specific macro to special-case x32 mode. But I was wondering if something more generic would exist, to cover more situations. For example, another target would be 64-bit OpenVMS: there, the native register size is 64 bits, but size_t is only 32 bits.
There is no reliable and portable way to determine register size from C. C doesn't even have a concept of "registers" (the description of the register keyword doesn't mention CPU registers).
But it does define a set of integer types that are the fastest type of at least a specified size. <stdint.h> defines uint_fastN_t, for N = 8, 16, 32, 64.
If you're assuming that registers are at least 32 bits, then uint_fast32_t is likely to be the same size as a register, either 32 or 64 bits. This isn't guaranteed. Here's what the standard says:
Each of the following types designates an integer type that is usually
fastest to operate with among all integer types that have at least the
specified width.
with a footnote:
The designated type is not guaranteed to be fastest for all purposes;
if the implementation has no clear grounds for choosing one type over
another, it will simply pick some integer type satisfying the
signedness and width requirements.
In fact, I suggest that using the [u]int_fastN_t types expresses your intent more clearly than trying to match the CPU register size.
If that doesn't work for some target, you'll need to add some special-case
#if or #ifdef directives to choose a suitable type. But uint_fast32_t (or uint_fast16_t if you want to support 16-bit systems) is probably a better starting point than size_t or int.
A quick experiment shows that if I compile with gcc -mx32, both uint_fast16_t and uint_fast32_t are 32 bits. They're both 64 bits when compiled without -mx32 (on my x86_64 system). Which means that, at least for gcc, the uint_fastN_t types don't do what you want. You'll need special-case code for x32. (Arguably gcc should be using 64-bit types for uint_fastN_t in x32 mode. I've just posted this question asking about that.)
This question asks how to detect an x32 environment in the preprocessor. gcc provides no direct way to determine this, but I've just posted an answer suggesting the use of the __x86_64__ and SIZE_MAX macros.
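Putting those two suggestions together, a sketch of the kind of special-casing the answer has in mind might look like this (the typedef name reg_word_t is made up for this example, and the check assumes gcc-style predefined macros):

#include <stdint.h>

#if defined(__x86_64__) && SIZE_MAX == UINT32_MAX
/* x32 ABI: 64-bit registers, but 32-bit pointers and size_t */
typedef uint64_t reg_word_t;
#else
/* elsewhere, fall back on the implementation's "fast" type */
typedef uint_fast32_t reg_word_t;
#endif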

What's the difference between "int" and "int_fast16_t"?

As I understand it, the C specification says that type int is supposed to be the most efficient type on the target platform that contains at least 16 bits.
Isn't that exactly what the C99 definition of int_fast16_t is too?
Maybe they put it in there just for consistency, since the other int_fastXX_t are needed?
Update
To summarize discussion below:
My question was wrong in many ways. The C standard does not specify bitness for int. It gives a range [-32767,32767] that it must contain.
I realize at first most people would say, "but that range implies at least 16 bits!" But C doesn't require two's-complement storage of integers. If they had said "16-bit", there might be some platform that has 1-bit parity, 1-bit sign, and 14-bit magnitude that would still be "meeting the standard", but not satisfy that range.
The standard does not say anything about int being the most efficient type. Aside from size requirements above, int can be decided by the compiler developer based on whatever criteria they deem most important. (speed, size, backward compatibility, etc)
On the other hand, int_fast16_t is like providing a hint to the compiler that it should use a type that is optimum for performance, possibly at the expense of any other tradeoff.
Likewise, int_least16_t would tell the compiler to use the smallest type that's >= 16-bits, even if it would be slower. Good for preserving space in large arrays and stuff.
Example: MSVC on x86-64 has a 32-bit int, even on 64-bit systems. MS chose to do this because too many people assumed int would always be exactly 32-bits, and so a lot of ABIs would break. However, it's possible that int_fast32_t would be a 64-bit number if 64-bit values were faster on x86-64. (Which I don't think is actually the case, but it just demonstrates the point)
int is a "most efficient type" in speed/size - but that is not specified by per the C spec. It must be 16 or more bits.
int_fast16_t is most efficient type in speed with at least the range of a 16 bit int.
Example: A given platform may have decided that int should be 32-bit for many reasons, not only speed. The same system may find a different type is fastest for 16-bit integers.
Example: On a 64-bit machine, where one would expect int to be 64-bit, a compiler may compile with a 32-bit int for compatibility. In that mode, int_fast16_t could be 64-bit, since that is natively the fastest width and avoids alignment issues, etc.
int_fast16_t is guaranteed to be the fastest int with a size of at least 16 bits. int has no guarantee of its size except that:
sizeof(char) = 1 and sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long).
And that it can hold the range of -32767 to +32767.
(7.20.1.3p2) "The typedef name int_fastN_t designates the fastest signed integer type with a width of at least N. The typedef name uint_fastN_t designates the fastest unsigned integer type with a width of at least N."
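A small sketch that makes the difference visible on whatever implementation you run it on (on 64-bit Linux with glibc it typically prints 4, 8, 2; other implementations may well print 4, 4, 2):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* All three can hold at least [-32767, +32767]; how wide each one
     * actually is depends on the implementation's trade-offs. */
    printf("int:           %zu\n", sizeof(int));
    printf("int_fast16_t:  %zu\n", sizeof(int_fast16_t));
    printf("int_least16_t: %zu\n", sizeof(int_least16_t));
    return 0;
}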
As I understand it, the C specification says that type int is supposed to be the most efficient type on the target platform that contains at least 16 bits.
Here's what the standard actually says about int: (N1570 draft, section 6.2.5, paragraph 5):
A "plain" int object has the natural size suggested by the
architecture of the execution environment (large enough to contain any
value in the range INT_MIN to INT_MAX as defined in the
header <limits.h>).
The reference to INT_MIN and INT_MAX is perhaps slightly misleading; those values are chosen based on the characteristics of type int, not the other way around.
And the phrase "the natural size" is also slightly misleading. Depending on the target architecture, there may not be just one "natural" size for an integer type.
Elsewhere, the standard says that INT_MIN must be at most -32767, and INT_MAX must be at least +32767, which implies that int is at least 16 bits.
Here's what the standard says about int_fast16_t (7.20.1.3):
Each of the following types designates an integer type that is usually
fastest to operate with among all integer types that have at least the
specified width.
with a footnote:
The designated type is not guaranteed to be fastest for all purposes;
if the implementation has no clear grounds for choosing one type over
another, it will simply pick some integer type satisfying the
signedness and width requirements.
The requirements for int and int_fast16_t are similar but not identical -- and they're similarly vague.
In practice, the size of int is often chosen based on criteria other than "the natural size" -- or that phrase is interpreted for convenience. Often the size of int for a new architecture is chosen to match the size for an existing architecture, to minimize the difficulty of porting code. And there's a fairly strong motivation to make int no wider than 32 bits, so that the types char, short, and int can cover sizes of 8, 16, and 32 bits. On 64-bit systems, particularly x86-64, the "natural" size is probably 64 bits, but most C compilers make int 32 bits rather than 64 (and some compilers even make long just 32 bits).
The choice of the underlying type for int_fast16_t is, I suspect, less dependent on such considerations, since any code that uses it is explicitly asking for a fast 16-bit signed integer type. A lot of existing code makes assumptions about the characteristics of int that go beyond what the standard guarantees, and compiler developers have to cater to such code if they want their compilers to be used.
The difference is that the fast types are allowed to be wider than their counterparts (without fast) for efficiency/optimization purposes. But the C standard by no means guarantees they are actually faster.
C11, 7.20.1.3 Fastest minimum-width integer types
1 Each of the following types designates an integer type that is
usually fastest 262) to operate with among all integer types that have
at least the specified width.
2 The typedef name int_fastN_t designates the fastest signed integer
type with a width of at least N. The typedef name uint_fastN_t
designates the fastest unsigned integer type with a width of at least
N.
262) The designated type is not guaranteed to be fastest for all
purposes; if the implementation has no clear grounds for choosing
one type over another, it will simply pick some integer type
satisfying the signedness and width requirements.
Another difference is that the fast and least types are required, whereas the exact-width types are optional:
3 The following types are required: int_fast8_t int_fast16_t
int_fast32_t int_fast64_t uint_fast8_t uint_fast16_t uint_fast32_t
uint_fast64_t All other types of this form are optional.
From the C99 rationale, 7.8 Format conversion of integer types <inttypes.h> (a document that accompanies the Standard), emphasis mine:
C89 specifies that the language should support four signed and
unsigned integer data types, char, short, int and long, but places
very little requirement on their size other than that int and short be
at least 16 bits and long be at least as long as int and not smaller
than 32 bits. For 16-bit systems, most implementations assign 8, 16,
16 and 32 bits to char, short, int, and long, respectively. For 32-bit
systems, the common practice is to assign 8, 16, 32 and 32 bits to
these types. This difference in int size can create some problems for
users who migrate from one system to another which assigns different
sizes to integer types, because Standard C’s integer promotion rule
can produce silent changes unexpectedly. The need for defining an
extended integer type increased with the introduction of 64-bit
systems.
The purpose of <inttypes.h> is to provide a set of integer types whose
definitions are consistent across machines and independent of
operating systems and other implementation idiosyncrasies. It defines,
via typedef, integer types of various sizes. Implementations are free
to typedef them as Standard C integer types or extensions that they
support. Consistent use of this header will greatly increase the
portability of a user’s program across platforms.
The main difference between int and int_fast16_t is that the latter is likely to be free of these "implementation idiosyncrasies". You may think of it as something like:
I don't care about current OS/implementation "politics" of int size. Just give me whatever the fastest signed integer type with at least 16 bits is.
On some platforms, using 16-bit values may be much slower than using 32-bit values [e.g. an 8-bit or 16-bit store would require performing a 32-bit load, modifying the loaded value, and writing back the result]. Even if one could fit twice as many 16-bit values in a cache as 32-bit values (the normal situation where 16-bit values would be faster than 32-bit values on 32-bit systems), the need to have every write preceded by a read would negate any speed advantage this could produce, unless a data structure was read far more often than it was written. On such platforms, a type like int_fast16_t would likely be 32 bits.
That having been said, the Standard unfortunately does not allow what would be the most helpful semantics for a compiler, which would be to allow variables of type int_fast16_t whose address is not taken to behave arbitrarily as 16-bit types or as larger types, depending upon what is convenient. Consider, for example, the function:
#include <stdint.h>

int32_t blah(int32_t x)
{
    int_fast16_t y = x;
    return y;
}
On many platforms, 16-bit integers stored in memory can often be manipulated just as those stored in registers, but there are no instructions to perform 16-bit operations on registers. If an int_fast16_t variable stored in memory is only capable of holding -32768 to +32767, that same restriction would apply to int_fast16_t variables stored in registers. Since coercing oversized values into signed integer types too small to hold them is implementation-defined behavior, that would compel the above code to add instructions to sign-extend the lower 16 bits of x before returning it; if the Standard allowed for such a type, a flexible "at least 16 bits, but more if convenient" type could eliminate the need for such instructions.
An example of how the two types might be different: suppose there’s an architecture where 8-bit, 16-bit, 32-bit and 64-bit arithmetic are equally fast. (The i386 comes close.) Then, the implementer might use a LLP64 model, or better yet allow the programmer to choose between ILP64, LP64 and LLP64, since there’s a lot of code out there that assumes long is exactly 32 bits, and that sizeof(int) <= sizeof(void*) <= sizeof(long). Any 64-bit implementation must violate at least one of these assumptions.
In that case, int would probably be 32 bits wide, because that will break the least code from other systems, but uint_fast16_t could still be 16 bits wide, saving space.

size of int variable

How is the size of int decided?
Is it true that the size of int depends on the processor? For a 32-bit machine it will be 32 bits, and for 16-bit it's 16.
On my machine it's showing as 32 bits, although the machine has a 64-bit processor and 64-bit Ubuntu installed.
It depends on the implementation. The only thing the C standard guarantees is that
sizeof(char) == 1
and
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
and also some representable minimum values for the types, which imply that char is at least 8 bits long, int is at least 16 bits, etc.
So it must be decided by the implementation (compiler, OS, ...) and be documented.
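The documentation of that decision is largely <limits.h>; here is a minimal sketch to dump what your implementation chose:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);  /* bits per byte, at least 8 */
    printf("INT_MIN  = %d,  INT_MAX  = %d\n", INT_MIN, INT_MAX);
    printf("LONG_MIN = %ld, LONG_MAX = %ld\n", LONG_MIN, LONG_MAX);
    return 0;
}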
It depends on the compiler.
For example: try an old Turbo C compiler and it will give a size of 16 bits for an int, because the word size (the size the processor can address with least effort) at the time that compiler was written was 16 bits.
Making int as wide as possible is not the best choice. (The choice is made by the ABI designers.)
A 64bit architecture like x86-64 can efficiently operate on int64_t, so it's natural for long to be 64 bits. (Microsoft kept long as 32bit in their x86-64 ABI, for various portability reasons that make sense given the existing codebases and APIs. This is basically irrelevant because portable code that actually cares about type sizes should be using int32_t and int64_t instead of making assumptions about int and long.)
Having int be int32_t actually makes for better, more efficient code in many cases. An array of int uses only 4B per element, so it has only half the cache footprint of an array of int64_t. Also, specific to x86-64, 32bit operand-size is the default, so 64bit instructions need an extra code byte for a REX prefix. So code density is better with 32bit (or 8bit) integers than with 16 or 64bit. (See the x86 wiki for links to docs / guides / learning resources.)
If a program requires 64bit integer types for correct operation, it won't use int. (Storing a pointer in an int instead of an intptr_t is a bug, and we shouldn't make the ABI worse to accommodate broken code like that.) A programmer writing int probably expected a 32bit type, since most platforms work that way. (The standard of course only guarantees 16bits).
Since there's no expectation that int will be 64bit in general (e.g. on 32bit platforms), and making it 64bit will make some programs slower (and almost no programs faster), int is 32bit in most 64bit ABIs.
Also, there needs to be a name for a 32bit integer type, for int32_t to be a typedef for.
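As a side note on the pointer point above, a hedged sketch of how to round-trip a pointer through an integer correctly is (u)intptr_t rather than int (the type is optional in the standard, but available on common platforms):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int value = 42;
    intptr_t bits = (intptr_t)&value;  /* wide enough to hold the pointer */
    int *back = (int *)bits;           /* converting back recovers the pointer */
    printf("%d\n", *back);             /* prints 42 */
    /* Storing &value in a plain int would truncate it on LP64/LLP64 systems. */
    return 0;
}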
It depends on the compiler.
If you use Turbo C, the integer size is 2 bytes.
If you use the GNU gcc compiler on a typical modern target, the integer size is 4 bytes.
It depends only on the implementation in the C compiler.
The size of an integer basically depends upon the architecture of your system.
Generally, if you have a 16-bit machine, your compiler will typically provide an int of size 2 bytes.
If your system is 32-bit, the compiler will typically use 4 bytes for an integer.
In more detail:
The concept of the data bus comes into the picture: 16-bit or 32-bit refers to the width of the data bus in your system.
The data bus width matters for the size of an integer because the purpose of the data bus is to deliver data to the processor; the maximum it can deliver in a single fetch is what the compiler typically prefers to use as the size of an integer.
Based on this data bus width of your system, the compiler is usually designed to make the size of int match it.
8086 -> 16-bit -> DOS -> Turbo C -> size of int -> 2 bytes
80386 -> 32-bit -> Windows/Linux -> GCC -> size of int -> 4 bytes
Yes, the size of int depends on the compiler.
For a 16-bit int, the range is -32768 to 32767. With a 32-bit or 64-bit compiler it will be larger.

What is the historical context for long and int often being the same size?

According to numerous answers here, long and int are both 32 bits in size on common platforms in C and C++ (Windows & Linux, 32 & 64 bit.) (I'm aware that there is no standard, but in practice, these are the observed sizes.)
So my question is, how did this come about? Why do we have two types that are the same size? I previously always assumed long would be 64 bits most of the time, and int 32. I'm not saying it "should" be one way or the other, I'm just curious as to how we got here.
From the C99 rationale (PDF) on section 6.2.5:
[...] In the 1970s, 16-bit C (for the
PDP-11) first represented file
information with 16-bit integers,
which were rapidly obsoleted by disk
progress. People switched to a 32-bit
file system, first using int[2]
constructs which were not only
awkward, but also not efficiently
portable to 32-bit hardware.
To solve the problem, the long type
was added to the language, even though
this required C on the PDP-11 to
generate multiple operations to
simulate 32-bit arithmetic. Even as
32-bit minicomputers became available
alongside 16-bit systems, people still
used int for efficiency, reserving
long for cases where larger integers
were truly needed, since long was
noticeably less efficient on 16-bit
systems. Both short and long were
added to C, making short available
for 16 bits, long for 32 bits, and
int as convenient for performance.
There was no desire to lock the
numbers 16 or 32 into the language, as
there existed C compilers for at least
24- and 36-bit CPUs, but rather to
provide names that could be used for
32 bits as needed.
PDP-11 C might have been
re-implemented with int as 32-bits,
thus avoiding the need for long; but
that would have made people change
most uses of int to short or
suffer serious performance degradation
on PDP-11s. In addition to the
potential impact on source code, the
impact on existing object code and
data files would have been worse, even
in 1976. By the 1990s, with an immense
installed base of software, and with
widespread use of dynamic linked
libraries, the impact of changing the
size of a common data object in an
existing environment is so high that
few people would tolerate it, although
it might be acceptable when creating a
new environment. Hence, many vendors,
to avoid namespace conflicts, have
added a 64-bit integer to their 32-bit
C environments using a new name, of
which long long has been the most
widely used. [...]
Historically, most of the sizes and types in C can be traced back to the PDP-11 architecture. That had bytes, words (16 bits) and doublewords (32 bits). When C and UNIX were moved to another machine (the Interdata 8/32, I think), the word length was 32 bits. To keep the source compatible, long and int were defined so that, strictly,
sizeof(short) ≤ sizeof(int) ≤ sizeof(long).
Most machines now end up with sizeof(int) = sizeof(long) because 16 bits is no longer convenient, but we have long long to get 64 bits if needed.
Update: strictly, I should have said "compilers", because different compiler implementors can make different decisions for the same instruction set architecture. GCC and Microsoft, for example.
Back in the late 70s and early 80s many architectures were 16 bit, so typically char was 8 bit, int was 16 bit and long was 32 bit. In the late 80s there was a general move to 32 bit architectures and so int became 32 bits but long remained at 32 bits.
Over the last 10 years there has been a move towards 64 bit computing and we now have a couple of different models, the most common being LP64, where ints are still 32 bits and long is now 64 bits.
Bottom line: don't make any assumptions about the sizes of different integer types (other than what's defined in the standard of course) and if you need fixed size types then use <stdint.h>.
As I understand it, the C standard requires that a long be at least 32 bits long, and be at least as long as an int. An int, on the other hand, is always (I think) equal to the native word size of the architecture.
Bear in mind that, when the standards were drawn up, 32-bit machines were not common; originally, an int would probably have been the native 16-bits, and a long would have been twice as long at 32-bits.
In 16-bit operating systems, int was 16-bit and long was 32-bit. After moving to Win32, both became 32-bit. When moving to a 64-bit OS, it is a good idea to keep the size of long unchanged, so existing code doesn't break when it is compiled as 64-bit. New types (like the Microsoft-specific __int64, size_t, etc.) can be used in 64-bit programs.
