Different int sizes on my computer and Arduino - c

I'm working on a spare-time project, writing some server code for an Arduino Duemilanove, but before I test this code on the controller I am testing it on my own machine (an OS X based MacBook). I am using ints in some places, and I am worried that this will cause strange errors when the code is compiled and run on the Arduino Duemilanove, because the Arduino treats ints as 2 bytes while my MacBook treats them as 4 bytes. I'm not a hardcore C and C++ programmer, so I am a bit worried about how an experienced programmer would handle this situation.
Should I restrict the code with a typedef that wraps my own definition of an int restricted to 2 bytes? Or is there another way around this?

Your best bet is to use the stdint.h header. It defines typedefs that explicitly refer to the signedness and size of your variables. For example, a 16-bit unsigned integer is a uint16_t. It's part of the C99 standard, so it's available pretty much everywhere. See:
http://en.wikipedia.org/wiki/Stdint.h
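For instance, a minimal sketch (the variable names are my own, purely illustrative) where the declarations have the same width on the Duemilanove and on the MacBook:

#include <stdint.h>

uint16_t packet_length = 0;    /* exactly 16 bits, unsigned, on both machines */
int16_t  temperature   = -40;  /* exactly 16 bits, signed, on both machines   */
uint32_t total_bytes   = 0;    /* exactly 32 bits if you need the wider range */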

The C standard defines an int as being a signed type large enough to at least hold all integers between -32767 and 32767 - implementations are free to choose larger types, and any modern 32-bit system will choose a 32-bit integer. However, as you've seen, some embedded platforms still use 16-bit ints. I would recommend using uint16_t or uint32_t if your Arduino compiler supports it; if not, use preprocessor macros to typedef those types yourself.
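A hedged sketch of that fallback approach; the widths assumed in the #else branch (short being 16 bits, long being 32 bits) hold on the AVR-based Arduino toolchain but are not guaranteed in general:

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>   /* C99 compiler: use the real fixed-width types */
#else
/* Assumed widths for this particular target; verify them for your compiler. */
typedef signed short   int16_t;
typedef unsigned short uint16_t;
typedef signed long    int32_t;
typedef unsigned long  uint32_t;
#endif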

The correct way to handle the situation is to choose the type based on the values it will need to represent:
If it's a general small integer, and the range -32767 to 32767 is OK, use int;
Otherwise, if the range -2147483647 to 2147483647 is OK, use long;
Otherwise, use long long.
If the range -32767 to 32767 is OK and space efficiency is important, use short (or signed char, if the range -127 to 127 is OK).
As long as you have made no other assumptions than these (i.e. always using sizeof instead of assuming the width of a type), your code will be portable.
In general, you should only need to use the fixed-width types from stdint.h for values that are being exchanged through a binary interface with another system, i.e. being read from or written to the network or a file.
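As a small illustration of that last point (the function name is mine, not from the answer): a 16-bit length field written into a buffer byte by byte gives the same result whether the host's int is 16 or 32 bits wide.

#include <stdint.h>

/* Store a 16-bit value in big-endian order, independent of the host's int size. */
void put_u16_be(uint8_t *buf, uint16_t value)
{
    buf[0] = (uint8_t)(value >> 8);   /* high byte first */
    buf[1] = (uint8_t)(value & 0xFF); /* then low byte   */
}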

Will you need values smaller than −32,768 or bigger than +32,767? If not, ignore the different sizes. If you do need them, there's stdint.h with fixed-size integers, signed and unsigned, called intN_t/uintN_t (N = number of bits). It's C99, but most compilers will support it. Note that using integers wider than the CPU's word size (16 bits in this case) will hurt performance, as there are no native instructions for handling them.

Avoid using the type int, as its size can depend upon the architecture / compiler.
Use short and long instead.

Related

Is it possible to create a custom sized variable type in c?

Good evening, sorry in advance if my English is bad, I'm French.
So, in C, there are different variable types, for example int, long, ... each taking a number of bytes depending on the type, and if I'm not wrong the "largest" type is long long int (or just long long), which takes 8 bytes of memory (like long, which is weird, so if someone could explain that to me too, thanks).
So my first question is: can I create a custom variable type that takes, for example, 16 bytes, or am I forced to use strings if the number is too high for long long (or unsigned long long)?
You can create custom types of all sorts, and if you want an "integer" type that is 16 bytes wide you could create a custom struct and pair two long longs together. But then you'd have to implement all the arithmetic on those types manually. This was quite common in the past when 16-bit (and even 32-bit) machines were most common; you'd have "bigint" libraries to do, say, 64-bit integer math. That's less useful now that most machines are either 64-bit or have native long long support on 32-bit targets.
You used to see libraries with stuff like this quite often:
typedef struct _BigInt {
    unsigned long long high;
    unsigned long long low;
} BigInt;

// Arithmetic functions:
BigInt BigIntAdd(BigInt a, BigInt b);
// etc.
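A minimal sketch (not from the original answer, ignoring sign handling) of how such a BigIntAdd might propagate the carry from the low word into the high word:

BigInt BigIntAdd(BigInt a, BigInt b)
{
    BigInt r;
    r.low  = a.low + b.low;                      /* unsigned add wraps modulo 2^64 */
    r.high = a.high + b.high + (r.low < a.low);  /* add 1 if the low word wrapped  */
    return r;
}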
These have faded away somewhat because the current typical CPU register width is 64 bits, which allows for an enormous range of values, and unless you're working with very specialized data, it's no longer "common" in normal programming tasks to need values outside that range. As #datenwolf is explicit and correct about in the comments below, if you find the need for such functionality in production code, seek out a reliable and debugged library for it. (Writing your own could be a fun exercise, though this sort of thing is likely to be a bug farm if you try to just whip it up as a quick step along the way to other work.) As Eric P indicates in the comments above, clang offers a native way of doing this without a third party library.
(The weird ambiguities or equivalences about the widths of long and long long are mostly historical, and if you didn't evolve with the platforms it's confusing and kind of unnecessary. See the comment on the question about this: the C standard defines minimum sizes for the integer types but doesn't say they have to be different from each other; historically the types char, short, int, long and long long were often useful ways of distinguishing e.g. 8, 16, 32, and 64 bit sizes, but it's a bit of a mess now, and if you want a particular size, modern platforms provide types like uint32_t to guarantee it rather than relying on the "classic" C types.)
Obviously you can. But for preference you should not use strings, because computations with those will be a lot more complicated and slower.
Also, you may not want to build it out of bytes, but out of the second-largest datatype available on your compiler, because detecting overflow can be cumbersome if you're using the largest datatype.
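A hedged sketch of that idea (the function name and the 4-limb width are just for illustration): with 32-bit limbs, each partial sum fits in a 64-bit intermediate, so the carry falls out naturally instead of needing overflow tricks on the widest type.

#include <stdint.h>

/* Add two 128-bit numbers stored as four 32-bit limbs, least significant first. */
void add_limbs(uint32_t result[4], const uint32_t a[4], const uint32_t b[4])
{
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        result[i] = (uint32_t)sum;   /* keep the low 32 bits      */
        carry     = sum >> 32;       /* carry into the next limb  */
    }
}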

Is there a general way in C to make a number greater than 64 bits or smaller than 8 bits?

First of all, I am aware that what I am trying to do might be outside the C standard.
I'd like to know if it is possible to make a uint4_t/int4_t or uint128_t/int128_t type in C.
I know I could do this using bitshifts and complex functions, but can I do it without those?
You can use bitfields within a structure to get fields narrower than a uint8_t, but the base datatype they're stored in will not be any smaller.
struct SmallInt
{
    unsigned int a : 4;
};
will give you a structure with a member called a that is 4 bits wide.
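A small usage sketch (my own, repeating the struct above): values outside 0..15 keep only their low 4 bits, and the struct still occupies at least one full byte.

#include <stdio.h>

struct SmallInt { unsigned int a : 4; };   /* as defined above */

int main(void)
{
    struct SmallInt s;
    s.a = 20;   /* only the low 4 bits are kept: 20 mod 16 == 4 */
    printf("s.a = %u, sizeof(struct SmallInt) = %zu\n",
           (unsigned)s.a, sizeof(struct SmallInt));
    return 0;
}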
Individual storage units (bytes) are no less than CHAR_BIT bits wide[1]; even if you create a struct with a single 4-bit bitfield, the associated object will always take up a full storage unit.
There are multiple precision libraries such as GMP that allow you to work with values that can't fit into 32 or 64 bits. Might want to check them out.
[1] 8 bits minimum, but may be wider.
In practice, if you want very wide numbers (but that is not specified in standard C11) you probably want to use some arbitrary-precision arithmetic external library (a.k.a. bignums). I recommend using GMPlib.
In some cases, for tiny ranges of numbers, you might use bitfields inside struct to have tiny integers. Practically speaking, they can be costly (the compiler would emit shift and bitmask instructions to deal with them).
See also this answer mentioning __int128_t as an extension in some compilers.
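For completeness, a sketch assuming a GCC/Clang-style compiler that offers __int128 as an extension (printf has no standard conversion for it, so the value is split into two 64-bit halves for printing):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    unsigned __int128 big = (unsigned __int128)UINT64_MAX + 1;   /* 2^64 */
    uint64_t high = (uint64_t)(big >> 64);
    uint64_t low  = (uint64_t)big;
    printf("high = %llu, low = %llu\n",
           (unsigned long long)high, (unsigned long long)low);
    return 0;
}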

Software exposed Bit width in C

I have two questions:
Is there any method to specify or limit the bit widths used for integer variables in a C program?
Is there any way to monitor the actual bit usage for a variable in a C program? What I mean by bit usage is: in some programs, when a register is allocated for a variable, not all the bits of that register are used for calculations. Hence, when a program is executed, can we monitor how many bits in a register have actually been changed throughout the execution of the program?
You can use fixed-width (or guaranteed-at-least-this-many-bits) types in C as of the 1999 standard; see e.g. Wikipedia or any decent C description. They are defined in the inttypes.h C header (called cinttypes in C++) and also in stdint.h (C) or cstdint (C++).
You certainly can check, for each computation, what the values could be and limit the variables accordingly. But unless you are seriously strapped for space, I'd just forget about this. In many cases, using "just large enough" data types wastes space (and computation time) by having to cast small values to the natural widths for computation, and then cast back. Beware of premature optimization, and even more of optimizing the wrong code (measure whether the performance is enough, and if not, where modifications are worthwhile, before digging in to make the code "better").
You have limited control if you use <stdint.h>.
On most systems, it will provide:
int8_t and uint8_t for 8-bit integers.
int16_t and uint16_t for 16-bit integers.
int32_t and uint32_t for 32-bit integers.
int64_t and uint64_t for 64-bit integers.
You don't usually get other choices. However, you might be able to use a bit-field to get a more arbitrary size value:
typedef struct int24_t
{
    signed int b24:24;
} int24_t;
This might occupy more than 24 bits (probably 32 bits), but arithmetic will end up being 24-bit. You're not constrained to a power of 2, or even a multiple of 2:
typedef struct int13_t
{
    signed int b13:13;
} int13_t;
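A quick check (my own sketch, repeating the typedef above): the wrapper struct typically still occupies a whole int, but the field only holds 13-bit values; the exact behaviour of signed bit-fields is implementation-defined.

#include <stdio.h>

typedef struct int13_t { signed int b13:13; } int13_t;   /* as above */

int main(void)
{
    int13_t v;
    v.b13 = 4095;   /* 13-bit signed range is typically -4096..4095 */
    printf("sizeof(int13_t) = %zu, v.b13 = %d\n", sizeof(int13_t), (int)v.b13);
    return 0;
}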

C Variable Definition

In C, integer and short integer variables are identical: both range from -32768 to 32767, and the required number of bytes for both is also identical, namely 2.
So why are two different types necessary?
Basic integer types in C language do not have strictly defined ranges. They only have minimum range requirements specified by the language standard. That means that your assertion about int and short having the same range is generally incorrect.
Even though the minimum range requirements for int and short are the same, in a typical modern implementation the range of int is usually greater than the range of short.
The standard only guarantees sizeof(short) <= sizeof(int) <= sizeof(long) as far as I remember. So both short and int can be the same but don't have to. 32 bit compilers usually have 2 bytes short and 4 bytes int.
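A trivial way to see what your own compiler chose (the sizes printed will vary by platform):

#include <stdio.h>

int main(void)
{
    printf("short: %zu bytes, int: %zu bytes, long: %zu bytes\n",
           sizeof(short), sizeof(int), sizeof(long));
    return 0;
}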
The C++ standard says the following (the C standard has a very similar paragraph; this quote is from the n3337 version of the C++11 draft specification):
Section 3.9.1, point 2:
There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types. Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs.
Different architectures have different-sized "natural" integers, so a 16-bit architecture will naturally calculate with a 16-bit value, where a 32- or 64-bit architecture will use either 32- or 64-bit ints. It's a choice for the compiler producer (or the definer of the ABI for a particular architecture, which tends to be a decision formed by a combination of the OS and the "main" compiler producer for that architecture).
In modern C and C++, there are types along the lines of int32_t that are guaranteed to be exactly 32 bits. This helps portability. If these types aren't sufficient (or the project is using a not-so-modern compiler), it is a good idea NOT to use int in a data structure or type that needs a particular precision/size, but to define a uint32 or int32 or something similar that can be used in all the places where the size matters.
In a lot of code, the size of a variable isn't critical, because the numbers stay within such a range that a few thousand is way more than you ever need. For example, the number of characters in a filename is defined by the OS, and I'm not aware of any OS where a filename/path is more than 4K characters, so a 16, 32 or 64 bit value that can count to at least 32K would be perfectly fine for that; it doesn't really matter what size it is, so here we SHOULD use int rather than trying to pick a specific size. int should be a type the compiler treats as "efficient", so it should give good performance; some architectures will run slower if you use short, and 16-bit architectures will certainly run slower using long.
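A sketch of that distinction (the struct and function names are mine, purely illustrative): exact-width types where the data crosses a binary boundary, plain int where it does not.

#include <stdint.h>

struct FileRecord {
    uint32_t id;        /* written to disk: size must not vary */
    int32_t  balance;   /* read back on other machines         */
};

int count_nonzero(const struct FileRecord *recs, int n)
{
    int count = 0;      /* local counter: plain int is fine here */
    for (int i = 0; i < n; i++)
        if (recs[i].balance != 0)
            count++;
    return count;
}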
The guaranteed minimum ranges of int and short are the same. However an implementation is free to define short with a smaller range than int (as long as it still meets the minimum), which means that it may be expected to take the same or smaller storage space than int[1]. The standard says of int that:
A ‘‘plain’’ int object has the natural size suggested by the
architecture of the execution environment.
Taken together, this means that (for values that fall into the range -32767 to 32767) portable code should prefer int in almost all cases. The exception would be where a very large number of values are being stored, such that the potentially smaller storage space occupied by short is a consideration.
[1] Of course a pathological implementation is free to define a short that has a larger size in bytes than int, as long as it still has equal or lesser range; there is no good reason to do so, however.
They are both identical on a 16-bit IBM-compatible PC. However, it is not certain that they will be identical on other hardware as well.
A VAX-type system (VAX stands for Virtual Address Extension) treats these two variables differently: it uses 2 bytes for a short integer and 4 bytes for an integer.
So that is the reason we have two different, though sometimes identical, types and their properties.
For general purposes on desktops and laptops we use integer.

Is there a standard way to detect bit width of hardware?

Variables of type int are allegedly "one machine-type word in length"
but in embedded systems, C compilers for 8-bit micros tend to have a 16-bit int (and 8 bits for unsigned char); for more bits, int behaves normally:
in 16-bit micros int is 16 bits too, and in 32-bit micros int is 32 bits, etc.
So, is there a standard way to test it, something like BITSIZEOF(int)?
like "sizeof" is for bytes, but for bits.
This was my first idea:
register c = 1;
int bitwidth = 0;
do
{
    bitwidth++;
} while (c <<= 1);
printf("Register bit width is : %d", bitwidth);
But it takes c as an int, and it's common for 8-bit compilers to use a 16-bit int, so it gives me 16 as the result. It seems there is no standard for using "int" as the "register width" (or it's not respected).
Why do I want to detect it? Suppose I need many variables that need fewer than 256 values, so they could be 8, 16, or 32 bits; using the right size (matching memory and registers) will speed things up and save memory, and if this can't be decided in code, I have to re-write the function for every architecture.
EDIT
After reading the answers I found this good article:
http://embeddedgurus.com/stack-overflow/category/efficient-cc/page/4/
I will quote the conclusion:
Thus the bottom line is this. If you want to start writing efficient, portable embedded code, the first step you should take is start using the C99 data types ‘least’ and ‘fast’. If your compiler isn’t C99 compliant then complain until it is – or change vendors. If you make this change I think you’ll be pleasantly surprised at the improvements in code size and speed that you’ll achieve.
I have to re-write the function for every architecture
No you don't. Use C99's stdint.h, which has types like uint_fast8_t, which will be a type capable of holding 256 values, and quickly.
Then, no matter the platform, the types will change accordingly and you don't change anything in your code. If your platform has no set of these defined, you can add your own.
Far better than rewriting every function.
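A small sketch of what that looks like in practice (the sizes printed depend on the platform, which is the whole point):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint_fast8_t  counter = 0;     /* at least 8 bits, whatever is fastest here */
    uint_least8_t packed  = 200;   /* at least 8 bits, smallest type available  */
    printf("uint_fast8_t: %zu bytes, uint_least8_t: %zu bytes\n",
           sizeof(counter), sizeof(packed));
    return 0;
}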
To answer your deeper question more directly: if you have a need for very specific storage sizes that are portable across platforms, you should use something like stdint.h, which defines storage types specified with a number of bits.
For example, uint32_t is always unsigned 32 bits and int8_t is always signed 8 bits.
#include <limits.h>
const int bitwidth = sizeof(int) * CHAR_BIT;
The ISA you're compiling for is already known to the compiler when it runs over your code, so your best bet is to detect it at compile time. Depending on your environment, you could use anything from autoconf/automake-style machinery to lower-level #ifdefs to tune your code to the specific architecture it'll run on.
I don't exactly understand what you mean by "there is no standard for using int as the register width". In the original C language specification (C89/90) the type int is implied in certain contexts when no explicit type is supplied. Your register c is equivalent to register int c, and that is perfectly standard in C89/90. Note also that the C language specification requires type int to support at least the -32767...+32767 range, meaning that on any platform int will have at least 16 value-forming bits.
As for the bit width... sizeof(int) * CHAR_BIT will give you the number of bits in the object representation of type int.
Theoretically though, the value representation of type int is not guaranteed to use all bits of its object representation. If you need to determine the number of bits used for value representation, you can simply analyze the INT_MIN and INT_MAX values.
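A minimal sketch of that INT_MAX approach (padding bits, if any, are automatically excluded; the count starts at 1 to account for the sign bit):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int bits = 1;   /* start at 1 for the sign bit */
    for (unsigned int max = INT_MAX; max != 0; max >>= 1)
        bits++;
    printf("int has %d value-forming bits (including the sign bit)\n", bits);
    return 0;
}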
P.S. Looking at the title of your question, I suspect that what you really need is just the CHAR_BIT value.
Does an unsigned char or unsigned short suit your needs? Why not use that? If not, you should be using compile time flags to bring in the appropriate code.
I think that in this case you don't need to know how many bits your architecture has. Just use variables as small as possible if you want to optimize your code.
