Will a variable's size be the same on different microcontrollers? [duplicate] - c

This question already has answers here:
Different int sizes on my computer and Arduino
(5 answers)
Closed 9 years ago.
If I define an int variable on a PIC microcontroller, will it be the same size as an int defined on an Atmel microcontroller, or can the sizes differ?
This question came up in an Embedded Systems interview. What should the answer be?
I'm a little confused!
Does it depend on the microcontroller or on the programming language?
Is the same variable type, such as an integer, the same size in all programming languages?
It's not the same question as the linked one, as things are a little different on embedded controllers.

The answer to the interview question should be something like:
Possibly the same, possibly not. Where it matters, one should use the types defined in stdint.h, or otherwise consult the compiler documentation or inspect the definitions in limits.h.
The interviewer is unlikely to be asking for a yes/no answer, and probably would not appreciate such terseness in an interview situation in any case - the questions are intended to get you talking until you have said something useful or interesting about yourself or your abilities and knowledge. What he is perhaps looking for is whether you are aware that the standard type sizes in C are a compiler/architecture dependency, and how you might handle that potential variability in portable code.
It is entirely possible that an int differs between one PIC and another, or between one Atmel part and another, let alone between PIC and Atmel. An Atmel AVR32, for example, will certainly differ from an 8-bit AVR, and similarly the MIPS-based PIC32 differs from the "classic" PICs.
Also, the size of the built-in types is strictly a compiler implementation issue, so it is possible that two different compilers for the same processor will differ (although it is highly improbable, since no compiler vendor would sensibly go out of their way to be that perverse!).
Languages other than C and C++ (and assembler, of course) are less common on small microcontrollers, because C and C++ are systems-level languages with minimal runtime environment requirements; but certainly the sizes of types may vary depending on the language definition.
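As a quick illustration of what "inspect the definitions in limits.h" means in practice, a minimal sketch like the following (assuming a hosted build where printf is available) prints what a particular toolchain actually chose; the values will differ from target to target:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* All of these are whatever the compiler targeting your particular part decided. */
    printf("CHAR_BIT    = %d\n", CHAR_BIT);
    printf("sizeof(int) = %u bytes\n", (unsigned)sizeof(int));
    printf("INT_MIN     = %d\n", INT_MIN);
    printf("INT_MAX     = %d\n", INT_MAX);
    return 0;
}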

The problem is that standard C types will tend to vary from implementation to implementation. Using the types found in stdint.h will allow you to specify how many bits you want.
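For example (a minimal sketch; the variable names are made up), the exact-width types keep the intent explicit no matter which microcontroller the code is compiled for, provided the implementation supplies them:
#include <stdint.h>

uint8_t  status_flags;     /* exactly 8 bits  */
uint16_t adc_reading;      /* exactly 16 bits */
int32_t  position_counts;  /* exactly 32 bits */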

It depends on the architecture and the ABI, for example whether it is a 16-bit, 32-bit or 64-bit target.
On most 32-bit systems, your integer would be encoded on 32 bits:
for a signed 32-bit integer:
value between -2,147,483,648 and 2,147,483,647
On an ILP64-style 64-bit system it will be 64 bits:
for a signed 64-bit integer: value between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 (note, though, that most 64-bit platforms use the LP64 or LLP64 models and keep int at 32 bits).
So to answer your question: an integer can have different sizes depending on the architecture you are using.

TIP: If your code assumes that a specific type has a specific size, you can verify this assumption during compilation:
#define C_ASSERT_CAT2(a, b) a##b
#define C_ASSERT_CAT(a, b)  C_ASSERT_CAT2(a, b)
#define C_ASSERT(cond) char C_ASSERT_CAT(c_assert_var_, __LINE__)[(cond) ? 1 : -1]
C_ASSERT(sizeof(int) == 4);
(The two-level paste is needed so that __LINE__ expands to the actual line number.) If the assertion sits on line 350, the preprocessor produces:
char c_assert_var_350[(sizeof(int) == 4) ? 1 : -1];
which will not compile if sizeof(int) != 4, because an array may not have a negative size.
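If the toolchain supports C11, the same check can be written without any macro trickery; a minimal sketch:
#include <assert.h>   /* provides the static_assert convenience macro in C11 */

static_assert(sizeof(int) == 4, "this code assumes a 32-bit int");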

It depends on many things, so I can say neither yes nor no, but my answer leans towards no.
int is only guaranteed to be at least 16 bits. On many later architectures int is a 32-bit number, and that doesn't break any rules. As far as I know, on Atmel's 8-bit microcontrollers int is 16 bits; I'm not sure about PIC.
Anyway, my suggestion would be to use the defined types. I don't know which compiler you are using, but I'm using AVR Studio. It has defined types such as:
uint8_t
int8_t
uint16_t
...
int64_t
These types are guaranteed to have the same size on every processor that provides them; you just need to do a little research in your compiler's documentation.

Related

Why do datatypes have different sizes on different architectures? (i.e. a C int on 16-bit, 32-bit, 64-bit)

I read here that it depends on the specific compiler, so you always have to use sizeof() to find out the size. However, when reading about datatype sizes I always read things such as "on a typical 64-bit machine datatype_x is x_bytes long", "on a 16-bit machine...".
Why is it like this?
What's the correlation between datatype size and the machine architecture?
Edit: The reason why I posted this question despite there being similar duplicates is because I'm not content with the answer "It depends on the compiler, and the compiler is usually made to achieve the best performance on the system". I wanted to know why a certain size for a datatype on a given bit system is considered to give the best performance, which I guess has to do with how instructions are processed by the CPU; but I didn't want to go and read a bunch about CPUs and such, just the part that's relevant to this question.
"What's the correlation between datatype size and the machine architecture?"
There is no defined correlation. There is a tendency, more like guidelines than actual rules, that int corresponds to the processor's integer width and is at least 16 bits. The minimum limit for size_t is 65535 (C11 §7.20.3 ¶1).
Why is it like this?
That is the strength of C: size_t follows the processor's "best/native" size, making for good performance, tight executable code and full use of the platform's memory capacity. Yet there are exceptions to the guideline.
It is also a weakness of C, in that sizes vary from platform to platform.
Use fixed-width types like int32_t if your code's goals require it.
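A tiny sketch of what that variability looks like in practice (the output differs across platforms; this just prints what the current one chose):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    printf("sizeof(size_t)  = %zu\n", sizeof(size_t));   /* e.g. 2, 4 or 8 depending on the platform */
    printf("sizeof(int32_t) = %zu\n", sizeof(int32_t));  /* always 4, where the type exists at all   */
    return 0;
}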

How does the compiler make the link between the basic ANSI C types and the selected processor?

For each processor (Cortex M, Cortex 5), the compiler must know what data size is associated with unsigned char, int, short, and so on.
Please could you help me to understand how this choice is done?
Please could you help me to understand how this choice is done?
Guessing the compiler we are talking about is a C one.
First we have the spec for the C language; see the wiki article on C data types. But the C spec leaves the implementer some flexibility...
int is the architecture's word. A word is the architecture's natural unit of processing. For classic ARM this is 32 bits; think of the architecture as having 32-bit registers.
char is the shortest bit string. Nowadays you can't get fewer than 8 bits, and almost all architectures let you work with 8. Eight is good because it matches ASCII, so you get efficient support from the architecture for handling ASCII strings.
short helps when your architecture can do operations on 16-bit values, and long is useful for wider (e.g. 64-bit) values.
signed / unsigned isn't really related to this; when a bit string is used as signed, the upper limit of what it can represent is simply reduced.
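A short sketch to see what a particular compiler/architecture pair actually chose (results vary by target, and this assumes a hosted build where printf is available):
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("char      : %zu byte(s), CHAR_BIT = %d\n", sizeof(char), CHAR_BIT);
    printf("short     : %zu bytes\n", sizeof(short));
    printf("int       : %zu bytes\n", sizeof(int));
    printf("long      : %zu bytes\n", sizeof(long));
    printf("long long : %zu bytes\n", sizeof(long long));
    return 0;
}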

Would making plain int 64-bit break a lot of reasonable code?

Until recently, I'd considered the decision by most systems implementors/vendors to keep plain int 32-bit even on 64-bit machines a sort of expedient wart. With modern C99 fixed-size types (int32_t and uint32_t, etc.) the need for there to be a standard integer type of each size 8, 16, 32, and 64 mostly disappears, and it seems like int could just as well be made 64-bit.
However, the biggest real consequence of the size of plain int in C comes from the fact that C essentially does not have arithmetic on smaller-than-int types. In particular, if int is larger than 32-bit, the result of any arithmetic on uint32_t values has type signed int, which is rather unsettling.
Is this a good reason to keep int permanently fixed at 32-bit on real-world implementations? I'm leaning towards saying yes. It seems to me like there could be a huge class of uses of uint32_t which break when int is larger than 32 bits. Even applying the unary minus or bitwise complement operator becomes dangerous unless you cast back to uint32_t.
Of course the same issues apply to uint16_t and uint8_t on current implementations, but everyone seems to be aware of and used to treating them as "smaller-than-int" types.
As you say, I think that the promotion rules really are the killer. uint32_t would then promote to int and all of a sudden you'd have signed arithmetic where almost everybody expects unsigned.
This would be mostly hidden in places where you just do arithmetic and assign back to a uint32_t. But it could be deadly in places where you compare against constants. Whether code that relies on such comparisons without an explicit cast is reasonable, I don't know. Casting constants like (uint32_t)1 can become quite tedious. I personally always use the suffix U for constants that I want to be unsigned, but this is already not as readable as I would like.
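A sketch of the kind of expression that would silently change meaning (it assumes a hypothetical ILP64 target where int is 64 bits, and the function itself is made up for illustration):
#include <stdint.h>

int check(uint32_t a, uint32_t b)
{
    /* With a 32-bit int, a - b stays unsigned, so check(0, 1) wraps to UINT32_MAX
     * and the comparison is true.  With a 64-bit int, both operands promote to
     * signed int, 0 - 1 is -1, and the comparison is false. */
    return a - b > 0;
}
The result flips between the two models with no diagnostic from the compiler.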
Also bear in mind that uint32_t etc. are not guaranteed to exist, not even uint8_t; requiring them is an extension that comes from POSIX. So in that sense C as a language is far from being able to make that move.
"Reasonable Code"...
Well... the thing about development is, you write and fix it and then it works... and then you stop!
And maybe you've been burned a lot so you stay well within the safe ranges of certain features, and maybe you haven't been burned in that particular way so you don't realize that you're relying on something that could kind-of change.
Or even that you're relying on a bug.
On olden Mac 68000 compilers, int was 16 bit and long was 32. But even then most extant C code assumed an int was 32, so typical code you found on a newsgroup wouldn't work. (Oh, and Mac didn't have printf, but I digress.)
So, what I'm getting at is, yes, if you change anything, then some things will break.
With modern C99 fixed-size types (int32_t and uint32_t, etc.) the need for there to be a standard integer type of each size 8, 16, 32, and 64 mostly disappears,
C99 has fixed-sized typeDEFs, not fixed-size types. The native C integer types are still char, short, int, long, and long long. They are still relevant.
The problem with ILP64 is that it has a great mismatch between C types and C99 typedefs.
int8_t = char
int16_t = short
int32_t = nonstandard type
int64_t = int, long, or long long
From 64-Bit Programming Models: Why LP64?:
Unfortunately, the ILP64 model does not provide a natural way to describe 32-bit data types, and must resort to non-portable constructs such as __int32 to describe such types. This is likely to cause practical problems in producing code which can run on both 32 and 64 bit platforms without #ifdef constructions. It has been possible to port large quantities of code to LP64 models without the need to make such changes, while maintaining the investment made in data sets, even in cases where the typing information was not made externally visible by the application.
DEC Alpha and OSF/1 Unix was one of the first 64-bit versions of Unix, and it used 64-bit integers - an ILP64 architecture (meaning int, long and pointers were all 64-bit quantities). It caused lots of problems.
One issue I've not seen mentioned - which is why I'm answering at all after so long - is that if you have a 64-bit int, what size do you use for short? Both 16 bits (the classical, change nothing approach) and 32 bits (the radical 'well, a short should be half as long as an int' approach) will present some problems.
With the C99 <stdint.h> and <inttypes.h> headers, you can code to fixed size integers - if you choose to ignore machines with 36-bit or 60-bit integers (which is at least quasi-legitimate). However, most code is not written using those types, and there are typically deep-seated and largely hidden (but fundamentally flawed) assumptions in the code that will be upset if the model departs from the existing variations.
Notice Microsoft's ultra-conservative LLP64 model for 64-bit Windows. That was chosen because too much old code would break if the 32-bit model was changed. However, code that had been ported to ILP64 or LP64 architectures was not immediately portable to LLP64 because of the differences. Conspiracy theorists would probably say it was deliberately chosen to make it more difficult for code written for 64-bit Unix to be ported to 64-bit Windows. In practice, I doubt whether that was more than a happy (for Microsoft) side-effect; the 32-bit Windows code had to be revised a lot to make use of the LP64 model too.
There's one code idiom that would break if ints were 64-bits, and I see it often enough that I think it could be called reasonable:
checking if a value is negative by testing if ((val & 0x80000000) != 0)
This is commonly found in checking error codes. Many error code standards (like Windows' HRESULT) use bit 31 to represent an error, and code will sometimes check for that error either by testing bit 31 or by checking whether the value is a negative number.
Microsoft's macros for testing HRESULT use both methods - and I'm sure there's a ton of code out there that does similar without using the SDK macros. If MS had moved to ILP64, this would be one area that caused porting headaches that are completely avoided with the LLP64 model (or the LP64 model).
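A sketch of how the two checks diverge (the constant is just an illustrative failure-style value, and the divergence assumes a hypothetical ILP64 target with typical two's-complement conversion behaviour):
int val = (int)0x80000004u;               /* an HRESULT-style failure code, bit 31 set        */

int by_sign = (val < 0);                  /* true with a 32-bit int; false under ILP64, where
                                             the value is just a large positive number        */
int by_bit  = ((val & 0x80000000) != 0);  /* tests bit 31 regardless of the width of int      */
Under LP64 or LLP64 both tests keep agreeing, because int stays 32 bits.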
Note: if you're not familiar with terms like "ILP64", please see the mini-glossary at the end of the answer.
I'm pretty sure there's a lot of code (not necessarily Windows-oriented) out there that uses plain-old-int to hold error codes, assuming that those ints are 32 bits in size. And I bet there's a lot of code with that error status scheme that also uses both kinds of checks (< 0 and bit 31 being set) and which would break if moved to an ILP64 platform. These checks could be made to continue to work correctly either way if the error codes were carefully constructed so that sign-extension took place, but again, many such systems I've seen construct the error values by or-ing together a bunch of bit fields.
Anyway, I don't think this is an unsolvable problem by any means, but I do think it's a fairly common coding practice that would cause a lot of code to require fixing up if moved to an ILP64 platform.
Note that I also don't think this was one of the foremost reasons for Microsoft to choose the LLP64 model (I think that decision was largely driven by binary data compatibility between 32-bit and 64-bit processes, as mentioned in MSDN and on Raymond Chen's blog).
Mini-Glossary for the 64-bit Platform Programming Model terminology:
ILP64: int, long, pointers are 64-bits
LP64: long and pointers are 64-bits, int is 32-bits (used by many (most?) Unix platforms)
LLP64: long long and pointers are 64-bits, int and long remain 32-bits (used on Win64)
For more information on 64-bit programming models, see "64-bit Programming Models: Why LP64?"
While I don't personally write code like this, I'll bet that it's out there in more than one place... and of course it'll break if you change the size of int.
int i, x = getInput();
for (i = 0; i < 32; i++)
{
    if (x & (1 << i))
    {
        // Do something
    }
}
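A width-independent variant of the same idiom, for comparison (just a sketch that reuses the hypothetical getInput() from above; it derives the loop bound from the type instead of hard-coding 32):
#include <limits.h>

unsigned int x = getInput();
unsigned int i;
for (i = 0; i < sizeof x * CHAR_BIT; i++)
{
    if (x & (1u << i))
    {
        // Do something
    }
}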
Well, it's not like this story is all new. With "most computers" I assume you mean desktop computers. There already has been a transition from 16-bit to 32-bit int. Is there anything at all that says the same progression won't happen this time?
Not particularly. int is 64 bit on some 64 bit architectures (not x64).
The standard does not actually guarantee that you get 32-bit integers at all: (u)int32_t is optional, and only (u)int_least32_t, which merely has to be able to hold a 32-bit value, is required.
Now, if you are depending on int being the same size as ptrdiff_t, you may be broken.
Remember, C does not even guarantee two's complement, or that integer types are free of padding bits.

Primitive data type in C to represent the WORD size of a CPU-arch

I observed that the size of long is always equal to the WORD size of any given CPU architecture. Is that true for all architectures? I am looking for a portable way to represent a WORD-sized variable in C.
C doesn't deal with instructions. In C99, you can copy a struct of any size using a single assignment:
struct huge { int data[1 << 20]; };
struct huge a, b;
a = b;
With a smart compiler, this should generate the fastest (single-threaded, though in the future hopefully multi-threaded) code to perform the copy.
You can use the int_fast8_t type if you want the "fastest possible" integer type as defined by the vendor. This will likely correspond with the word size, but it's certainly not guaranteed to even be single-instruction-writable.
I think your best option would be to default to one type (e.g. int) and use the C preprocessor to optimize for certain CPU's.
No. In fact, the scalar and vector units often have different word sizes. And then there are string instructions and built-in DMA controllers with oddball capabilities.
If you want to copy data fast, memcpy from the platform's standard C library is usually the fastest.
Under Windows, sizeof(long) is 4, even on 64-bit versions of Windows.
I think the nearest answers you'll get are...
int and unsigned int often (but not always) match the register width of the machine.
there's a type which is an integer-the-same-size-as-a-pointer, spelled intptr_t and available from stdint.h (it is optional, though). This should obviously match the address width of your architecture, though I don't know that there's any stronger guarantee.
However, there often really isn't a single word-size for the architecture - there can be registers with different widths (e.g. the "normal" vs. MMX registers in Intel x86), the register width often doesn't match the bus width, addresses and data may be different widths and so on.
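A small sketch of the pointer-sized-integer round trip (where intptr_t exists, C99 guarantees that converting a valid pointer to it and back yields a pointer that compares equal to the original):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int object = 42;
    intptr_t as_integer = (intptr_t)&object;   /* pointer stored in a pointer-sized integer */
    int *back = (int *)as_integer;             /* and converted back again                  */
    printf("%d\n", *back);                     /* prints 42                                 */
    return 0;
}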
No, the standard has no such type (one that maximizes memory throughput).
But it does say that int should have the natural size suggested by the architecture, which usually makes it a fast type for ALU operations.
Things get more complicated in the embedded world. AFAIK, the 8051 targeted by C51 is an 8-bit processor, yet in Keil C for the C51, long is 4 bytes. I think it's compiler dependent.

Smart typedefs

I've always used typedef in embedded programming to avoid common mistakes:
int8_t - 8 bit signed integer
int16_t - 16 bit signed integer
int32_t - 32 bit signed integer
uint8_t - 8 bit unsigned integer
uint16_t - 16 bit unsigned integer
uint32_t - 32 bit unsigned integer
A recent Embedded Muse (issue 177, not on the website yet) introduced me to the idea that it's useful to have some performance-specific typedefs. This standard suggests having typedefs that indicate you want the fastest type that has at least a given minimum size.
For instance, one might declare a variable using int_fast16_t, but it would actually be implemented as an int32_t on a 32-bit processor, or an int64_t on a 64-bit processor, as those would be the fastest types of at least 16 bits on those platforms. On an 8-bit processor it would be int16_t, to meet the minimum size requirement.
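A quick way to see what a given toolchain actually picked for these (just a sketch; the output differs per platform):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    printf("int16_t       : %zu bytes\n", sizeof(int16_t));        /* always 2                                  */
    printf("int_fast16_t  : %zu bytes\n", sizeof(int_fast16_t));   /* 2, 4 or 8, whatever the vendor deems fast */
    printf("int_least16_t : %zu bytes\n", sizeof(int_least16_t));  /* smallest type with at least 16 bits       */
    return 0;
}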
Having never seen this usage before I wanted to know
Have you seen this in any projects, embedded or otherwise?
Any possible reasons to avoid this sort of optimization in typedefs?
For instance, one might declare a variable using int_fast16_t, but it would actually be implemented as an int32_t on a 32 bit processor, or int64_t on a 64 bit processor as those would be the fastest types of at least 16 bits on those platforms
That's what int is for, isn't it? Are you likely to encounter an 8-bit CPU any time soon, where that wouldn't suffice?
How many unique datatypes are you able to remember?
Does it provide so much additional benefit that it's worth effectively doubling the number of types to consider whenever I create a simple integer variable?
I'm having a hard time even imagining the possibility that it might be used consistently.
Someone is going to write a function which returns an int_fast16_t, and then someone else is going to come along and store that value in an int16_t.
Which means that in the obscure case where the fast variants are actually beneficial, it may change the behavior of your code. It may even cause compiler errors or warnings.
Check out stdint.h from C99.
The main reason I would avoid this typedef is that it allows the type to lie to the user. Take int16_t vs int_fast16_t. Both type names encode the size of the value into the name. This is not an uncommon practice in C/C++. I personally use the size specific typedefs to avoid confusion for myself and other people reading my code. Much of our code has to run on both 32 and 64 bit platforms and many people don't know the various sizing rules between the platforms. Types like int32_t eliminate the ambiguity.
If I had not read the 4th paragraph of your question and instead just saw the type name, I would have assumed it was some scenario specific way of having a fast 16 bit value. And I obviously would have been wrong :(. For me it would violate the "don't surprise people" rule of programming.
Perhaps if it had another distinguishing word, letter or acronym in the name it would be less likely to confuse users. Maybe int_fast16min_t?
When I am looking at int_fast16_t and am not sure about the native width of the CPU it will run on, things can get complicated, for example with the ~ operator:
int_fast16_t i = 10;
int16_t j = 10;
if (~i != ~j) {
    // scary !!!
}
Somehow, I would like to willfully use 32 bit or 64 bit based on the native width of the processor.
I'm actually not much of a fan of this sort of thing.
I've seen this done many times (in fact, we even have these typedefs at my current place of employment)... For the most part, I doubt their true usefulness... It strikes me as change for changes sake... (and yes, I know the sizes of some of the built ins can vary)...
I commonly use size_t; it happens to match the native address size, a habit I picked up in embedded work. It never caused any issues or confusion in embedded circles, but it actually began causing me problems when I started working on 64-bit systems.
