Is the size of int or any other data type in C dependent on the underlying architecture? - c

#include <stdio.h>
int main()
{
    int c;
    return 0;
} // on Intel architecture

#include <stdio.h>
int main()
{
    int c;
    return 0;
} // on AMD architecture
/*
Here I have the same code on two different machines, and I want to know:
is the size of the data types dependent on the machine?
*/

see here:
size guarantee for integral/arithmetic types in C and C++
Fundamental C type sizes depend on the implementation (compiler) and architecture, but they have some guaranteed lower bounds. One should therefore never hardcode type sizes and should instead use sizeof(TYPENAME) to get their length in bytes.
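For instance, a minimal sketch (assuming a hosted implementation with printf) that reports whatever sizes your particular compiler chose:

#include <stdio.h>

/* Prints the sizes this particular compiler/ABI chose.
   Only the minimum ranges are guaranteed by the standard;
   the exact numbers differ between implementations. */
int main(void)
{
    printf("char:      %zu\n", sizeof(char));   /* always 1, by definition */
    printf("short:     %zu\n", sizeof(short));
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));
    printf("long long: %zu\n", sizeof(long long));
    printf("void*:     %zu\n", sizeof(void *));
    return 0;
}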

Quick answer: Yes, mostly, but ...
The sizes of types in C are dependent on the decisions of compiler writers, subject to the requirements of the standard.
The decisions of compiler writers tend to be strongly influenced by the CPU architecture. For example, the C standard says:
A "plain" int object has the natural size suggested by the
architecture of the execution environment.
though that leaves a lot of room for judgement.
Such decisions can also be influenced by other considerations, such as compatibility with compilers from the same vendor for other architectures and the convenience of having types for each supported size. For example, on a 64-bit system, the obvious "natural size" for int is 64 bits, but many compilers still have 32-bit int. (With 8-bit char and 64-bit int, short would probably be either 16 or 32 bits, and you couldn't have fundamental integer types covering both sizes.)
(C99 introduces "extended integer types", which could solve the issue of covering all the supported sizes, but I don't know of any compiler that implements them.)

Yes. The size of the basic data types depends on the underlying CPU architecture. ISO C (and C++) guarantees only minimum sizes for data types.
But it's not consistent across compiler vendors for the same CPU. Consider that there are compilers with 32-bit long ints for Intel x86 CPUs, and other compilers that give you 64-bit longs.
And don't forget about the decade or so of pain that MS programmers had to deal with during the era of the Intel 286 machines, what with all of the different "memory models" that compilers forced on us. 16-bit pointers versus 32-bit segmented pointers. I for one am glad that those days are gone.

It usually does, for performance reasons. The C standard defines the minimum value ranges for all types like char, short, int, long, long long and their unsigned counterparts.
However, x86 CPUs from Intel and AMD are essentially the same hardware to most x86 compilers. At least, they expose the same registers and instructions to the programmer and most of them operate identically (if we consider what's officially defined and documented).
At any rate, it's up to the compiler (or its developers) to use any other size, not necessarily matching the natural operand size of the target hardware, as long as that size agrees with the C standard.
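Those minimum ranges are visible through <limits.h>; here is a small sketch (assuming a hosted implementation) that prints what your compiler actually provides, which you can compare against the guaranteed minimums:

#include <stdio.h>
#include <limits.h>

/* The standard only guarantees minimums, e.g. CHAR_BIT >= 8 and INT_MAX >= 32767;
   the actual values are up to the implementation. */
int main(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("INT_MIN  = %d,  INT_MAX  = %d\n", INT_MIN, INT_MAX);
    printf("LONG_MIN = %ld, LONG_MAX = %ld\n", LONG_MIN, LONG_MAX);
    return 0;
}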

Related

Why do we use explicit data types? (from a low level point of view)

When we take a look at some fundamental data types, such as char and int, we know that a char is simply an unsigned byte (depending on the language), int is just a signed dword, bool is just a char that can only be 1 or 0, etc. My question is, why do we use these types in compiled languages instead of just declaring a variable of type byte, dword, etc., since the operations applied to the types mentioned above are pretty much all the same, once you differentiate signed and unsigned data, and floating-point data?
To extend the context of the question, in the C language, if and while statements can take a boolean value as an input, which is usually stored as a char, which removes the need for an explicit boolean type.
In practice, the two pieces of code should be equivalent at the binary level:
#include <stdio.h>

int main()
{
    int x = 5;
    char y = 'c';
    printf("%d %c\n", x - 8, y + 1);
    return 0;
}
// outputs: -3 d
-
signed dword main()
{
    signed dword x = 5;
    byte y = 'c';
    printf("%d %c\n", x - 8, y + 1);
    return 0;
}
// outputs: -3 d
My question is, why do we use these types in compiled languages
To make the code target-agnostic. Some platforms only have efficient 16-bit integers, and forcing your variables to always be 32-bit would make your code slower for no reason when compiled for such platforms. Or maybe you have a target with 36-bit integers, and a strict 32-bit type would require extra instructions to implement.
Your question sounds very x86-centric. x86 is not the only architecture, and for most languages not the one language designers had in mind.
Even more recent languages that were designed in the era of x86 being widespread on desktops and servers were designed to be portable to other ISAs, like 8-bit AVR where a 32-bit int would take 4 registers vs. 2 for a 16-bit int.
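If what you really mean is "at least 16 bits, as fast as possible on this target" rather than "exactly 32 bits", C99's <stdint.h> lets you say so explicitly. A small sketch (assuming the header is available):

#include <stdint.h>
#include <stdio.h>

/* int_fast16_t: the fastest type with at least 16 bits on this target
   (16 bits on a typical AVR/MSP430 toolchain, often 32 or 64 bits on x86-64).
   int_least16_t: the smallest type with at least 16 bits. */
int main(void)
{
    int_fast16_t counter = 0;    /* speed matters, exact width does not */
    int_least16_t stored = 0;    /* storage size matters more than speed */
    printf("%zu %zu\n", sizeof counter, sizeof stored);
    return 0;
}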
A programming language defines an "abstract" data model that the implementer is free to realize however they see fit. For instance, nothing mandates storing a Boolean in a byte; it could be "packed" as a single bit along with others. And if you read the C standard carefully, you will notice that a char has no fixed bit width: sizeof(char) is 1 by definition, but CHAR_BIT only has to be at least 8.
[Anecdotally, I recall a time when FORTRAN variables, including integers and floats but also Booleans, were stored in 72 bits on IBM machines.]
Language designers should put few constraints on machine architecture, to leave room for nice designs. In fact, languages have no "low level"; they implicitly describe a virtual machine not tied to any particular hardware (it could be implemented with cogwheels and ropes).
As far as I know, only the Ada language went as far as specifying in detail all the characteristics of the arithmetic, but not to the point of enforcing a particular number of bits per word.
Ignoring the Boolean type was one of the saddest design decisions in the C language. It took until C99 to add one (_Bool, with bool in <stdbool.h>) :-(
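A small illustration of that C99 addition (a sketch assuming a C99 compiler):

#include <stdbool.h>   /* C99: defines bool, true and false on top of _Bool */
#include <stdio.h>

int main(void)
{
    bool done = false;   /* holds only 0 or 1, but is typically stored in a whole byte */
    done = true;
    if (done)
        printf("sizeof(bool) = %zu\n", sizeof(bool));
    return 0;
}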
Another sad decision was to stop treating the int type as the one that naturally fits in a machine word (it arguably should have become 64 bits on current PCs).
The point of a high-level language is to provide some isolation from machine details. So, we speak of "integers", not some particular number of bytes of memory. The implementation then maps the higher-level types on whatever seems best suited to the target hardware.
And there are different semantics associated with different 4-byte types: for integers, signed versus unsigned is important to some classes of programs.
I understand this is a C question and it's arguable about how high-level C is or is not; but it is at least intended to be portable across machine architectures.
And, in your example, you assume 'int' is 32 bits. Nothing in the language says that has to be true. It has not always been true, and certainly was not true in the original PDP-11 implementation. And nowadays, for example, it is possibly appropriate to have 'int' be 64 bits on a 64-bit machine.
Note that it's not invariable that languages have types like "integer", etc. BLISS, a language at the same conceptual level as C, has the machine word as the only builtin datatype.

different size of c data type in 32 and 64 bit

Why is there a different size of C data types on 32-bit and 64-bit systems?
For example: int size is 4 bytes on 32-bit and 8 bytes on 64-bit.
What is the reason behind doubling the size of data types on 64-bit? As far as my knowledge is concerned, there is no performance issue if we use the same size on a 64-bit system as on a 32-bit one.
Why is there a different size of C data types on 32-bit and 64-bit systems?
The sizes of C's basic data types are implementation-dependent. They are not necessarily dependent on machine architecture, and counterexamples abound. For example, 32-bit and 64-bit implementations of GCC use the same size of int (4 bytes) for both x86 and x86_64.
What is the reason behind doubling the size of data types on 64-bit?
The reasons for implementation decisions vary with implementors and implementation characteristics. int is often, but not always, chosen to have a size that is natural for the target machine in some sense. That might mean that operations on it are fast, or that it is efficient to load from and store to memory, or other things. These are the kinds of considerations involved.
The C language definition does not mandate a specific size for most data types; instead, it specifies the range of values each type must be able to represent. char must be large enough to represent a single character of the basic execution character set (and signed char at least the range [-127,127]), short and int must be large enough to represent at least the range [-32767,32767], etc.
Traditionally, the size of int was the same as the "natural" word size for a given architecture, on the premise that a) it would be easier to implement, and b) operations on that type would be the most efficient. Whether that's still true today, I'm not qualified to say (not a hardware guy).

C word size and standard size

in this article, taken from the book "Linux kernel development":
http://www.makelinux.net/books/lkd2/ch19lev1sec2
it says:
The size of the C long type is guaranteed to be the machine's word size. On the downside, however, code cannot assume that the standard C types have any specific size. Furthermore, there is no guarantee that an int is the same size as a long
The question is: I thought int was the same as the word size, not long, and I couldn't find any official standard that says this.
Any thoughts?
Sometimes, people on the Internet are wrong. The sizes are fixed by the ABI. Linux ports don't necessarily create an original ABI (usually another platform or manufacturer recommendation is followed), so there's nobody making guarantees about int and long. The term "machine word" is also very ill-defined.
The size of the C long type is guaranteed to be the machine's word size.
This is wrong for a lot of platforms. For example, in the embedded world, 8-bit MCUs (e.g., the HC08) have an 8-bit word size and 16-bit MCUs (e.g., the MSP430) have a 16-bit word size, but long is 32 bits on these platforms. On Windows x64 (MSVC compiler), the machine word is 64 bits but long is 32 bits.
The C standard does not know what a word is, and a C implementation might do things in unusual ways. So your book is wrong (for example, some C implementation might use a 64-bit long on an 8-bit microcontroller).
However, the C99 standard defines the <stdint.h> header with types like intptr_t (an integer type capable of holding a void* pointer) or int64_t (a 64-bit integer), etc.
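For example, a sketch assuming a C99 compiler that provides the (optional) exact-width types, with the matching printf macros from <inttypes.h>:

#include <stdint.h>
#include <inttypes.h>   /* PRId64, PRIu32, ... for printf */
#include <stdio.h>

int main(void)
{
    int64_t  big  = INT64_C(1) << 40;   /* exactly 64 bits, where the type exists */
    uint32_t mask = 0xFFFFFFFFu;        /* exactly 32 bits */
    printf("big = %" PRId64 ", mask = %" PRIu32 "\n", big, mask);
    return 0;
}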
See also this question, and wikipedia's page on C data types.

integer size in c depends on what?

Size of the integer depends on what?
Is the size of an int variable in C dependent on the machine or the compiler?
It's implementation-dependent. The C standard only requires that:
char has at least 8 bits
short has at least 16 bits
int has at least 16 bits
long has at least 32 bits
long long has at least 64 bits (added in 1999)
sizeof(char) ≤ sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long)
In the 16/32-bit days, the de facto standard was:
int was the "native" integer size
the other types were the minimum size allowed
However, 64-bit systems generally did not make int 64 bits, which would have created the awkward situation of having three 64-bit types and no 32-bit type. Some compilers expanded long to 64 bits.
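If code nevertheless relies on a particular choice, it is safer to state the assumption explicitly so the build fails on a target where it does not hold. A sketch using C11's _Static_assert (in C99 a negative-size array typedef can serve the same purpose):

#include <limits.h>

/* Fail at compile time, rather than misbehave at run time,
   if int is narrower than this code assumes. */
_Static_assert(sizeof(int) * CHAR_BIT >= 32,
               "this code assumes int has at least 32 bits");

int main(void)
{
    return 0;
}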
Formally, representations of all fundamental data types (including their sizes) are compiler-dependent and only compiler-dependent. The compiler (or, more properly, the implementation) can serve as an abstraction layer between the program and the machine, completely hiding the machine from the program or distorting it in any way it pleases.
But in practice compilers are designed to generate the most efficient code for a given machine and/or OS. In order to achieve that, the fundamental data types should have natural representations for the given machine and/or OS. In that sense, these representations are indirectly dependent on the machine and/or OS.
In other words, from the abstract, formal and pedantic point of view the compiler is free to completely ignore the data type representations specific to the machine. But it makes no practical sense. In practice compilers make full use of data type representations provided by the machine.
Still, if some data type is not supported by the machine, the compiler can still provide that data type to programs by implementing its support at the compiler level ("emulating" it). For example, 64-bit integer types are normally available in 32-bit compilers for 32-bit machines, even though they are not directly supported by the machine. Back in the day, compilers would often provide compiler-level support for floating-point types on machines that were not equipped with a floating-point co-processor (and therefore did not support floating-point types directly).
It depends primarily on the compiler. For example, if you have a 64-bit x86 processor, you can use an old 16-bit compiler and get 16-bit ints, a 32-bit compiler and get 32-bit ints, or a 64-bit compiler and get 64-bit ints.
It depends on the processor to the degree that the compiler targets a particular processor, and (for example) an ancient 16-bit processor simply won't run code that targets a shiny new 64-bit processor.
The C and C++ standards do guarantee some minimum sizes (indirectly, by specifying minimum supported ranges):
char: 8 bits
short: 16 bits
int: 16 bits
long: 32 bits
long long: 64 bits
They also guarantee that the sizes/ranges are non-decreasing in the following order: char, short, int, long, and long long [1].
[1] long long is specified in C99 and C++0x, but some compilers (e.g., gcc, Intel, Comeau) allow it in C++03 code as well. If you want to, you can persuade most (if not all) of them to reject long long in C++03 code.
As MAK said, it's implementation dependent. That means it depends on the compiler. Typically, a compiler targets a single machine so you can also think of it as machine dependent.
AFAIK, the size of data types is implementation dependent. This means that it is entirely up to the implementer (i.e. the guy writing the compiler) to choose what it will be.
So, in short it depends on the compiler. But often it is simpler to just use whatever size it is easiest to map to the word size of the underlying machine - so the compiler often uses the size that fits the best with the underlying machine.
It depends on the execution environment the compiler targets, no matter what hardware you have. If you are targeting a 16-bit environment like DOS, int will typically be 2 bytes. On a 32-bit OS like Windows or Unix it is typically 4 bytes, and so on. Even if you run a 32-bit OS on a 64-bit processor, the size will still be 4 bytes. I hope this helps.
It depends on both the architecture (machine, executable type) and the compiler. C and C++ only guarantee certain minimums. (I think those are char: 8 bits, int: 16 bits, long: 32 bits)
C99 adds fixed-width types like uint32_t (where the platform can support them). See <stdint.h>.
The size of an integer variable depends upon the compiler (more precisely, on the ABI the compiler targets):
with a typical 16-bit compiler:
int is 2 bytes
char is 1 byte (always, by definition)
float occupies 4 bytes
with a typical 32-bit or 64-bit compiler:
int is 4 bytes, char is still 1 byte, and float is still 4 bytes
The sizes do not simply keep doubling: on common 64-bit platforms it is usually long and pointers, not int, that grow to 8 bytes.

Primitive data type in C to represent the WORD size of a CPU-arch

I observed that size of long is always equal to the WORD size of any given CPU architecture. Is it true for all architectures? I am looking for a portable way to represent a WORD sized variable in C.
C doesn't deal with instructions. In C (since C89), you can copy a struct of any size using a single assignment:
struct huge { int data[1 << 20]; };   /* roughly 4 MiB, assuming 32-bit int */
struct huge a, b;
a = b;                                /* copies the entire struct in one assignment */
With a smart compiler, this should generate the fastest (single-threaded, though in the future hopefully multi-threaded) code to perform the copy.
You can use the int_fast8_t type if you want the "fastest possible" integer type of at least 8 bits, as defined by the vendor. This will likely correspond with the word size, but it's certainly not guaranteed to even be single-instruction-writable.
I think your best option would be to default to one type (e.g. int) and use the C preprocessor to optimize for certain CPUs.
No. In fact, the scalar and vector units often have different word sizes. And then there are string instructions and built-in DMA controllers with oddball capabilities.
If you want to copy data fast, memcpy from the platform's standard C library is usually the fastest.
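For example, a minimal sketch reusing the struct from above:

#include <string.h>

struct huge { int data[1 << 20]; };

/* memcpy lets the library pick the widest loads/stores the platform supports. */
void copy_huge(struct huge *dst, const struct huge *src)
{
    memcpy(dst, src, sizeof *dst);
}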
Under Windows, sizeof(long) is 4, even on 64-bit versions of Windows.
I think the nearest answers you'll get are...
int and unsigned int often (but not always) match the register width of the machine.
there's a type which is an integer the same size as a pointer, spelled intptr_t and available from <stdint.h>. This should obviously match the address width for your architecture, though I don't know that there's any guarantee.
However, there often really isn't a single word-size for the architecture - there can be registers with different widths (e.g. the "normal" vs. MMX registers in Intel x86), the register width often doesn't match the bus width, addresses and data may be different widths and so on.
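A small round-trip sketch (uintptr_t/intptr_t are optional in C99, though nearly universal in practice):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int x = 42;
    uintptr_t bits = (uintptr_t)&x;   /* an integer wide enough to hold the pointer value */
    int *p = (int *)bits;             /* converting back yields the original pointer */
    printf("%d\n", *p);
    return 0;
}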
No, the standard has no such type (one that maximizes memory throughput).
But it does say that a plain int has the natural size suggested by the architecture of the execution environment, which usually, though not necessarily, makes it the fastest type for ALU operations.
Things get more complicated in the embedded world. AFAIK, the 8051 (C51) is an 8-bit processor, but in Keil C for the C51, long is 4 bytes. I think it's compiler-dependent.
