Calculating with a variable outside of its bounds in C - c

If I make a calculation with a variable where an intermediate part of the calculation goes higher then the bounds of that variable type, is there any hazard that some platforms may not like?
This is an example of what I'm asking:
int a, b;
a=30000;
b=(a*32000)/32767;
I have compiled this, and it does give the correct answer of 29297 (well, within truncating error, anyway). But the part that worries me is that 30,000*32,000 = 960,000,000, which is a 30-bit number, and thus cannot be stored in a 16-bit int. The end result is well within the bounds of an int, but I was expecting that whatever working part of memory would have the same size allocated as the largest source variables did, so an overflow error would occur.
This is just a small example to show my problem, I am trying to avoid using floating points by making the fraction be a fraction of the max amount able to be stored in that variable (in this case, a signed integer, so 32767 on the positive side), because the embedded system I'm using I believe does not have an FPU.
So how do most processors handle calculations out of the bounds of the source and destination variables?

On a 16-bit compiler/CPU, you can (almost) plan on that code giving incorrect results. This is a bit sad, since nearly every CPU (that has a multiply instruction at all) will produce and store the intermediate result, but no C compiler (of which I'm aware) will normally use it (and if you made a and b unsigned, it wouldn't be allowed to use it).
You have a few choices to deal with this. One is to write small muldiv function in assembly language that does the multiplication (preserving the high word) then the division on that, and finally returns the value to C when it's been reduced back into range.
Another option is to do the math on unsigned integers, which at least allow you to figure out when a problem occurred. Unfortunately, none of the choices is what I'd call particularly appealing though...

As far as I know, most if not all processors will hold results for a word * word multiplication in a double word -- meaning, an 8 bit * 8 bit is stored in a 16-bit register(s) on an 8-bit processor, a 32-bit * 32 bit operation is stored in a 64-bit register(s) on a 32-bit machine. (At least, that's how it's been on all the embedded microcontrollers I've used)
If that weren't the case, the processor would be severely crippled in the sense of only allowing half-word * half-word multiplication.

AFAIK this kind of thing is formally "undefined". You have to do the algebra necessary to prevent overflow. That's always your first choice. Numeric stability is no accident, it requires some care in deciding when and how to do division and multiplication.
Or, you have to guarantee that you'll use an intermediate result buffer that's big enough.
Using a large intermediate buffer is what some C compilers do anyway. The language, however, doesn't make any guarantees.
So, to be sure that it works, most folks do something like this.
short a= 30000;
int temp= a;
int temp2= (a*32000)/32767;
// here you can check for errors; if temp2 > 32767, you have overflow.
short b= a;

Signed integer overflow is undefined behavior.
Almost any implementation you could possibly meet will wrap around on integer overflow, because (a) everyone uses 2's complement, in which arithmetic operations are bitwise identical for signed and unsigned types of the same size, and (b) wraparound is the defined behavior of unsigned types in C.
So, on an implementation with a 16 bit int, I would expect the result 0 for your calculation (and that is the result that it must have if you'd used an unsigned 16 bit int). But I'd code against the possibility it might throw a hardware exception, explode, etc.
Note that if you do the calculation with two 16 bit short variables on a machine with a 32 bit int, then you will generally get the "right" answer 29297, because the intermediate value (a*32000) is an int, and only gets truncated back to short at the end. I say "generally" because converting an out-of-bounds integer value to a signed integer type either gives an unspecified result or else raises a signal. But again, any implementation you'll encounter in polite company just takes a modulus.

Are you sure your compiler has 16 bit integers? On most systems nowadays, ints are 32 bits. Another possible reason you aren't getting an error is that some compilers will recognize that it can compute something like this at compile time and will do so.
If you are really concerned that you will end up with overflow, you can sometimes reorder or factor the formula differently so that no intermediate terms will overflow. In your example that would be hard to do since all of your terms are near the limit of a 16 bit value. Do you need the number to be exactly right, or can you approximate? If you can, you can do something like this:
int a, b;
a=30000;
//b=(a*32000)/32767 ~= a * (32000/32768) = a *(125/128)
b = (a / 128) * 125 // if a=30000, b = 29250 - about 0.16% error
Another option would be to use larger sized types for intermediate terms. If your compiler had 16 bit ints and 32 bit longs, you could do something like this:
int a, b;
a=30000;
b=((long)a*32000L)/32767L;
Really, there's no set answer for how to handle overflow. You need to evaluate each case on its own and decide what the best solution is.

Your compiler and target processor both have to do with the sizes of the various data types.
Compilers will usually promote variables to the largest easy to work with size during calculations and then convert the results whatever size is needed for an assignment at the end.
There's also C rules that govern promoting to sizes which are more difficult to work with for some calculations. If you are compiling for an AVR, which has 8 bit registers but defines an int to be 16 bits, many calculations end up using more registers than you might think that they need because of this promotion and the fact that constant numbers in your code have to be thought of as being int or unsigned int unless the compiler can prove to itself that this won't effect the outcome of the calculations.
Try rewriting your code with various different sizes of integers (short, int, long, long long) and see how that goes. You may also want to write a simple program that prints out the sizeof( ) of the standard predefined types.
If you need to worry about the sizes of your integer variables and/or the intermediate results of your calculations then you should include and use things like uint32_t and int64_t for your declarations and type casting.

Related

Comparison uint8_t vs uint16_t while declaring a counter

Assuming to have a counter which counts from 0 to 100, is there an advantage of declaring the counter variable as uint16_t instead of uint8_t.
Obviously if I use uint8_t I could save some space. On a processor with natural wordsize of 16 bits access times would be the same for both I guess. I couldn't think why I would use a uint16_t if uint8_t can cover the range.
Using a wider type than necessary can allow the compiler to avoid having to mask the higher bits.
Suppose you were working on a 16 bit architecture, then using uint16_t could be more efficient, however if you used uint16_t instead of uint8_t on a 32 bit architecture then you would still have the mask instructions but just masking a different number of bits.
The most efficient type to use in a cross-platform portable way is just plain int or unsigned int, which will always be the correct type to avoid the need for masking instructions, and will always be able to hold numbers up to 100.
If you are in a MISRA or similar regulated environment that forbids the use of native types, then the correct standard-compliant type to use is uint_fast8_t. This guarantees to be the fastest unsigned integer type that has at least 8 bits.
However, all of this is nonsense really. Your primary goal in writing code should be to make it readable, not to make it as fast as possible. Penny-pinching instructions like this makes code convoluted and more likely to have bugs. Also because it is harder to read, the bugs are less likely to be found during code review.
You should only try to optimize like this once the code is finished and you have tested it and found the particular part which is the bottleneck. Masking a loop counter is very unlikely to be the bottleneck in any real code.
Obviously if I use uint8_t I could save some space.
Actually, that's not necessarily obvious! A loop index variable is likely to end up in a register, and if it does there's no memory to be saved. Also, since the definition of the C language says that much arithmetic takes place using type int, it's possible that using a variable smaller than int might actually end up costing you space in terms of extra code emitted by the compiler to convert back and forth between int and your smaller variable. So while it could save you some space, it's not at all guaranteed that it will — and, in any case, the actual savings are going to be almost imperceptibly small in the grand scheme of things.
If you have an array of some number of integers in the range 0-100, using uint8_t is a fine idea if you want to save space. For an individual variable, on the other hand, the arguments are pretty different.
In general, I'd say that there are two reasons not to use type uint8_t (or, equivalently, char or unsigned char) as a loop index:
It's not going to save much data space (if at all), and it might cost code size and/or speed.
If the loop runs over exactly 256 elements (yours didn't, but I'm speaking more generally here), you may have introduced a bug (which you'll discover soon enough): your loop may run forever.
The interviewer was probably expecting #1 as an answer. It's not a guaranteed answer — under plenty of circumstances, using the smaller type won't cost you anything, and evidently there are microprocessors where it can actually save something — but as a general rule, I agree that using an 8-bit type as a loop index is, well, silly. And whether or not you agree, it's certainly an issue to be aware of, so I think it's a fair interview question.
See also this question, which discusses the same sorts of issues.
The interview question doesn't make much sense from a platform-generic point of view. If we look at code such as this:
for(uint8_t i=0; i<n; i++)
array[i] = x;
Then the expression i<n will get carried out on type int or larger because of implicit promotion. Though the compiler may optimize it to use a smaller type if it doesn't affect the result.
As for array[i], the compiler is likely to use a type corresponding to whatever address size the system is using.
What the interviewer was fishing for is likely that uint32_t on a 32 bitter tend to generate faster code in some situations. For those cases you can use uint_fast8_t, but more likely the compiler will perform optimizations no matter.
The only optimization uint8_t blocks the compiler from doing, is to allocate a larger variable than 8 bits on the stack. It doesn't however block the compiler from optimizing out the variable entirely and using a register instead. Such as for example storing it in an index register with the same width as the address bus.
Example with gcc x86_64: https://godbolt.org/z/vYscf3KW9. The disassembly is pretty painful to read, but the compiler just picked CPU registers to store anything regardless of the type of i, giving identical machine code between uint8_t anduint16_t. I would have been surprised if it didn't.
On a processor with natural wordsize of 16 bits access times would be the same for both I guess.
Yes this is true for all mainstream 16 bitters. Some might even manage faster code if given 8 bits instead of 16. Some exotic systems like DSP exist, but in case of lets say a 1 byte=16 bits DSP, then the compiler doesn't even provide you with uint8_t to begin with - it is an optional type. One generally doesn't bother with portability to wildly exotic systems, since doing so is a waste of everyone's time and money.
The correct answer: it is senseless to do manual optimization without a specific system in mind. uint8_t is perfectly fine to use for generic, portable code.

About the use of signed integers in C family of languages

When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happen most of the time.
When reading other peoples' code, I rarely see unsigned integers, even if the represented value can't be negative.
So I asked myself: «is there a good reason for this, or do people just use signed integers because the don't care»?
I've search on the subject, here and in other places, and I have to say I can't find a good reason not to use unsigned integers, when it applies.
I came across those questions: «Default int type: Signed or Unsigned?», and «Should you always use 'int' for numbers in C, even if they are non-negative?» which both present the following example:
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design. Of course, it may result in an infinite loop, with unsigned integers.
But is it so hard to check if foo.Length() is 0, before the loop?
So I personally don't think this is a good reason for using signed integers all the way.
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
Ok, that's good to have a specific value that means «error».
But then, what's wrong with something like UINT_MAX, for that specific value?
I'm actually asking this question because it may lead to some huge problems, usually when using third-party libraries.
In such a case, you often have to deal with signed and unsigned values.
Most of the time, people just don't care about the signedness, and just assign a, for instance, an unsigned int to a signed int, without checking the range.
I have to say I'm a bit paranoid with the compiler warning flags, so with my setup, such an implicit cast will result in a compiler error.
For that kind of stuff, I usually use a function or macro to check the range, and then assign using an explicit cast, raising an error if needed.
This just seems logical to me.
As a last example, as I'm also an Objective-C developer (note that this question is not related to Objective-C only):
- ( NSInteger )tableView: ( UITableView * )tableView numberOfRowsInSection: ( NSInteger )section;
For those not fluent with Objective-C, NSInteger is a signed integer.
This method actually retrieves the number of rows in a table view, for a specific section.
The result will never be a negative value (as the section number, by the way).
So why use a signed integer for this?
I really don't understand.
This is just an example, but I just always see that kind of stuff, with C, C++ or Objective-C.
So again, I'm just wondering if people just don't care about that kind of problems, or if there is finally a good and valid reason not to use unsigned integers for such cases.
Looking forward to hear your answers : )
a signed return value might yield more information (think error-numbers, 0 is sometimes a valid answer, -1 indicates error, see man read) ... which might be relevant especially for developers of libraries.
if you are worrying about the one extra bit you gain when using unsigned instead of signed then you are probably using the wrong type anyway. (also kind of "premature optimization" argument)
languages like python, ruby, jscript etc are doing just fine without signed vs unsigned. that might be an indicator ...
When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happen most of the time.
To carefully consider which type that is most suitable each time you declare a variable is very good practice! This means you are careful and professional. You should not only consider signedness, but also the potential max value that you expect this type to have.
The reason why you shouldn't use signed types when they aren't needed have nothing to do with performance, but with type safety. There are lots of potential, subtle bugs that can be caused by signed types:
The various forms of implicit promotions that exist in C can cause your type to change signedness in unexpected and possibly dangerous ways. The integer promotion rule that is part of the usual arithmetic conversions, the lvalue conversion upon assignment, the default argument promotions used by for example VA lists, and so on.
When using any form of bitwise operators or similar hardware-related programming, signed types are dangerous and can easily cause various forms of undefined behavior.
By declaring your integers unsigned, you automatically skip past a whole lot of the above dangers. Similarly, by declaring them as large as unsigned int or larger, you get rid of lots of dangers caused by the integer promotions.
Both size and signedness are important when it comes to writing rugged, portable and safe code. This is the reason why you should always use the types from stdint.h and not the native, so-called "primitive data types" of C.
So I asked myself: «is there a good reason for this, or do people just use signed integers because the don't care»?
I don't really think it is because they don't care, nor because they are lazy, even though declaring everything int is sometimes referred to as "sloppy typing" - which means sloppily picked type more than it means too lazy to type.
I rather believe it is because they lack deeper knowledge of the various things I mentioned above. There's a frightening amount of seasoned C programmers who don't know how implicit type promotions work in C, nor how signed types can cause poorly-defined behavior when used together with certain operators.
This is actually a very frequent source of subtle bugs. Many programmers find themselves staring at a compiler warning or a peculiar bug, which they can make go away by adding a cast. But they don't understand why, they simply add the cast and move on.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design
Indeed it is.
Once upon a time, down-counting loops would yield more effective code, because the compiler pick add a "branch if zero" instruction instead of a "branch if larger/smaller/equal" instruction - the former is faster. But this was at a time when compilers were really dumb and I don't believe such micro-optimizations are relevant any longer.
So there is rarely ever a reason to have a down-counting loop. Whoever made the argument probably just couldn't think outside the box. The example could have been rewritten as:
for(unsigned int i=0; i<foo.Length(); i++)
{
unsigned int index = foo.Length() - i - 1;
thing[index] = something;
}
This code should not have any impact on performance, but the loop itself turned a whole lot easier to read, while at the same time fixing the bug that your example had.
As far as performance is concerned nowadays, one should probably spend the time pondering about which form of data access that is most ideal in terms of data cache use, rather than anything else.
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
That's a poor argument. Good API design uses a dedicated error type for error reporting, such as an enum.
Instead of having some hobbyist-level API like
int do_stuff (int a, int b); // returns -1 if a or b were invalid, otherwise the result
you should have something like:
err_t do_stuff (int32_t a, int32_t b, int32_t* result);
// returns ERR_A is a is invalid, ERR_B if b is invalid, ERR_XXX if... and so on
// the result is stored in [result], which is allocated by the caller
// upon errors the contents of [result] remain untouched
The API would then consistently reserve the return of every function for this error type.
(And yes, many of the standard library functions abuse return types for error handling. This is because it contains lots of ancient functions from a time before good programming practice was invented, and they have been preserved the way they are for backwards-compatibility reasons. So just because you find a poorly-written function in the standard library, you shouldn't run off to write an equally poor function yourself.)
Overall, it sounds like you know what you are doing and giving signedness some thought. That probably means that knowledge-wise, you are actually already ahead of the people who wrote those posts and guides you are referring to.
The Google style guide for example, is questionable. Similar could be said about lots of other such coding standards that use "proof by authority". Just because it says Google, NASA or Linux kernel, people blindly swallow them no matter the quality of the actual contents. There are good things in those standards, but they also contain subjective opinions, speculations or blatant errors.
Instead I would recommend referring to real professional coding standards instead, such as MISRA-C. It enforces lots of thought and care for things like signedness, type promotion and type size, where less detailed/less serious documents just skip past it.
There is also CERT C, which isn't as detailed and careful as MISRA, but at least a sound, professional document (and more focused towards desktop/hosted development).
There is one heavy-weight argument against widely unsigned integers:
Premature optimization is the root of all evil.
We all have at least on one occasion been bitten by unsigned integers. Sometimes like in your loop, sometimes in other contexts. Unsigned integers add a hazard, even though a small one, to your program. And you are introducing this hazard to change the meaning of one bit. One little, tiny, insignificant-but-for-its-sign-meaning bit. On the other hand, the integers we work with in bread and butter applications are often far below the range of integers, more in the order of 10^1 than 10^7. Thus, the different range of unsigned integers is in the vast majority of cases not needed. And when it's needed, it is quite likely that this extra bit won't cut it (when 31 is too little, 32 is rarely enough) and you'll need a wider or an arbitrary-wide integer anyway. The pragmatic approach in these cases is to just use the signed integer and spare yourself the occasional underflow bug. Your time as a programmer can be put to much better use.
From the C FAQ:
The first question in the C FAQ is which integer type should we decide to use?
If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned types.
Another question concerns types conversions:
If an operation involves both signed and unsigned integers, the situation is a bit more complicated. If the unsigned operand is smaller (perhaps we're operating on unsigned int and long int), such that the larger, signed type could represent all values of the smaller, unsigned type, then the unsigned value is converted to the larger, signed type, and the result has the larger, signed type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both values are converted to a common unsigned type, and the result has that unsigned type.
You can find it here. So basically using unsigned integers, mostly for arithmetic conversions can complicate the situation since you'll have to either make all your integers unsigned, or be at the risk of confusing the compiler and yourself, but as long as you know what you are doing, this is not really a risk per se. However, it could introduce simple bugs.
And when it is a good to use unsigned integers? one situation is when using bitwise operations:
The << operator shifts its first operand left by a number of bits
given by its second operand, filling in new 0 bits at the right.
Similarly, the >> operator shifts its first operand right. If the
first operand is unsigned, >> fills in 0 bits from the left, but if
the first operand is signed, >> might fill in 1 bits if the high-order
bit was already 1. (Uncertainty like this is one reason why it's
usually a good idea to use all unsigned operands when working with the
bitwise operators.)
taken from here
And I've seen this somewhere:
If it was best to use unsigned integers for values that are never negative, we would have started by using unsigned int in the main function int main(int argc, char* argv[]). One thing is sure, argc is never negative.
EDIT:
As mentioned in the comments, the signature of main is due to historical reasons and apparently it predates the existence of the unsigned keyword.
Unsigned intgers are an artifact from the past. This is from the time, where processors could do unsigned arithmetic a little bit faster.
This is a case of premature optimization which is considered evil.
Actually, in 2005 when AMD introduced x86_64 (or AMD64, how it was then called), the 64 bit architecture for x86, they brought the ghosts of the past back: If a signed integer is used as an index and the compiler can not prove that it is never negative, is has to insert a 32 to 64 bit sign extension instruction - because the default 32 to 64 bit extension is unsigned (the upper half of a 64 bit register gets cleard if you move a 32 bit value into it).
But I would recommend against using unsigned in any arithmetic at all, being it pointer arithmetic or just simple numbers.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
Any recent compiler will warn about such an construct, with condition ist always true or similar. With using a signed variable you avoid such pitfalls at all. Instead use ptrdiff_t.
A problem might be the c++ library, it often uses an unsigned type for size_t, which is required because of some rare corner cases with very large sizes (between 2^31 and 2^32) on 32 bit systems with certain boot switches ( /3GB windows).
There are many more, comparisons between signed and unsigned come to my mind, where the signed value automagically gets promoted to a unsigned and thus becomes a huge positive number, when it has been a small negative before.
One exception for using unsigned exists: For bit fields, flags, masks it is quite common. Usually it doesn't make sense at all to interpret the value of these variables as a magnitude, and the reader may deduce from the type that this variable is to be interpreted in bits.
The result will never be a negative value (as the section number, by the way). So why use a signed integer for this?
Because you might want to compare the return value to a signed value, which is actually negative. The comparison should return true in that case, but the C standard specifies that the signed get promoted to an unsigned in that case and you will get a false instead. I don't know about ObjectiveC though.

Signed vs Unsigned operations in C

Very simple question:
I have a program doing lots and lots of mathematical computations over ints and long longs. To fit in an extra bit, I made the long longs unsigned, since I only dealt with positive numbers, and could now get a few more values.
Oddly enough, this gave me a 15% performance boost, which I confirmed to be in simply making all the long long's unsigned.
Is this possible? Are mathematical operations really faster with unsigned numbers? I remember reading that there would be no difference, and the compiler automatically picks out the fastest way to go whether signed or unsigned. Is this 15% boost really from making the vars unsigned, or could it be something else affected in my code?
And, if it really is from making the vars unsigned, should I aim to make everything (even ints) unsigned, as I never need negative numbers, and every second is important if I can save it.
In some operations, signed integers are faster, in others, unsigned are faster:
In C, signed integer operations can be assumed not to wrap. The compiler will take advantage of this in loop optimization, for example. Comparisons can be optimized away similarly. (This can also lead to subtle bugs if you don't expect this).
On the other hand, unsigned integers do not have this assumption. However, not having to deal with a sign is a big advantage for some operations, for example: division. Unsigned division by a constant power of two is a simple shift, but (depending on your rounding rules) there's a conditional off-by-1 for negative numbers.
Personally, I make a habit of only using unsigned integers unless I really, really do have a value which needs to be signed. It's not so much for performance as correctness.
You may see the effect magnified with long long, which (I'm guessing) is 64 bits in your case. The CPU usually doesn't have single instructions do deal with these types (in 32 bit mode), so the slight added complexity for signed operations will be more noticeable.
On a 32-bit processor, 64-bit integer operations are emulated; using unsigned instead of signed means the emulation library doesn't have to do extra work to propagate carry bits etc.
There are three cases where a compiler cares whether a variable is signed or unsigned:
When the variable is converted to a longer type
When the comparison operators (greater-than, etc.) are applied
When overflows might occur
On some machines, conversion of signed variables to longer types requires extra code; on other machines, a conversion may be performed as part of a 'load' or 'move' instruction.
Some machines (mainly small embedded microcontrollers) require more instructions to perform a signed-versus-signed comparison than unsigned-versus-unsigned, but most machines have a full array of both signed and unsigned compare instructions available.
When overflows occur with unsigned types, the compiler may have to add code to ensure that the defined behavior actually occurs. No such code is required for signed types, because anything that might happen in the absence of such code would be permitted by the standard.
The compiler doesn't pick if it's going to be unsigned or signed. But, yes, in theory, unsigned with unsigned is faster than signed with signed. If you really want to slow things down, you'll go with signed with unsigned. And even worse: floats with integers.
It depends on the processor, of course.

Fastest integer type for common architectures

The stdint.h header lacks an int_fastest_t and uint_fastest_t to correspond with the {,u}int_fastX_t types. For instances where the width of the integer type does not matter, how does one pick the integer type that allows processing the greatest quantity of bits with the least penalty to performance? For example, if one was searching for the first set bit in a buffer using a naive approach, a loop such as this might be considered:
// return the bit offset of the first 1 bit
size_t find_first_bit_set(void const *const buf)
{
uint_fastest_t const *p = buf; // use the fastest type for comparison to zero
for (; *p == 0; ++p); // inc p while no bits are set
// return offset of first bit set
return (p - buf) * sizeof(*p) * CHAR_BIT + ffsX(*p) - 1;
}
Naturally, using char would result in more operations than int. But long long might result in more expensive operations than the overhead of using int on a 32 bit system and so on.
My current assumption is for the mainstream architectures, the use of long is the safest bet: It's 32 bit on 32 bit systems, and 64 bit on 64 bit systems.
int_fast8_t is always the fastest integer type in a correct implementation. There can never be integer types smaller than 8 bits (because CHAR_BIT>=8 is required), and since int_fast8_t is the fastest integer type with at least 8 bits, it's thus the fastest integer type, period.
Theoretically, int is the best bet. It should map to the CPU's native register size, and thus be "optimal" in the sense you're asking about.
However, you may still find that an int-64 or int-128 is faster on some CPUs than an int-32, because although these are larger than the register size, they will reduce the number of iterations of your loop, and thus may work out more efficient by minimising the loop overheads and/or taking advantage of DMA to load/store the data faster.
(For example, on ARM-2 processors it took 4 memory cycles to load one 32-bit register, but only 5 cycles to load two sequentially, and 7 cycles to load 4 sequentially. The routine you suggest above would be optimised to use as many registers as you could free up (8 to 10 usually), and could therefore run up to 3 or 4 times faster by using multiple registers per loop iteration)
The only way to be sure is to write several routines and then profile them on the specific target machine to find out which produces the best performance.
I'm not sure I really understand the question, but why aren't you just using int? Quoting from my (free draft copy of the wrong, i. e. C++) standard, "Plain ints have the natural size suggested by the architecture of the execution environment."
But I think that if you want to have the optimal integer type for a certain operation, it will be different depending on which operation it is. Trying to find the first bit in a large data buffer, or finding a number in a sequence of integers, or moving them around, could very well have completely different optimal types.
EDIT:
For whatever it's worth, I did a small benchmark. On my particular system (Intel i7 920 with Linux, gcc -O3) it turns out that long ints (64 bits) are quite a bit faster than plain ints (32 bits), on this particular example. I would have guessed the opposite.
If you want to be certain you've got the fastest implementation, why not benchmark each one on the systems you're expecting to run on instead of trying to guess?
The answer is int itself. At least in C++, where 3.9.1/2 of the standard says:
Plain ints have the natural size
suggested by the architecture of the
execution environment
I expect the same is true for C, though I don't have any of the standards documents.
I would guess that the types size_t (for an unsigned type) and ptrdiff_t (for a signed type) will usually correspond to quite efficient integer types on any given platform.
But nothing can prove that than inspecting the produced assembler and to do benchmarks.
Edit, including the different comments, here and in other replies:
size_t and ptrdiff_t are the only typedefs that are normative in C99 and for which one may make a reasonable assumption that they are related to the architecture.
There are 5 different possible ranks for standard integer types (char, short, int, long, long long). All the forces go towards having types of width 8, 16, 32, 64 and in near future 128. As a consequence int will be stuck on 32 bit. Its definition will have nothing to do with efficiency on the platform, but just be constrained by that width requirement.
If you're compiling with gcc, i'd recommend using __builtin_ffs() for finding the first bit set:
Built-in Function: int __builtin_ffs (unsigned int x)
Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.
This will be compiled into (often a single) native assembly instruction.
It is not possible to answer this question since the question is incomplete. As an analogy, consider the question:
What is the fastest vehicle
A Bugatti Veyron? Certainly fast, but no good for going from London to New York.
What is missing from the question, is the context the integer will be used in. In the original example above, I doubt you'd see much difference between 8, 32 or 64 bit values if the array is large and sparse since you'll be hitting memory bandwidth limits before cpu limits.
The main point is, the architecture does not define what size the various integer types are, it's the compiler designer that does that. The designer will carefully weigh up the pros and cons for various sizes for each type for a given architecture and pick the most appropriate.
I guess the 32 bit int on the 64 bit system was chosen because for most operations ints are used for 32 bits are enough. Since memory bandwidth is a limiting factor, saving on memory use was probably the overriding factor.
For all existing mainstream architectures long is the fastest type at present for loop throughput.

Is there a standard way to detect bit width of hardware?

Variables of type int are allegedly "one machine-type word in length"
but in embedded systems, C compilers for 8 bit micro use to have int of 16 bits!, (8 bits for unsigned char) then for more bits, int behave normally:
in 16 bit micros int is 16 bits too, and in 32 bit micros int is 32 bits, etc..
So, is there a standar way to test it, something as BITSIZEOF( int ) ?
like "sizeof" is for bytes but for bits.
this was my first idea
register c=1;
int bitwidth=0;
do
{
bitwidth++;
}while(c<<=1);
printf("Register bit width is : %d",bitwidth);
But it takes c as int, and it's common in 8 bit compilers to use int as 16 bit, so it gives me 16 as result, It seems there is no standar for use "int" as "register width", (or it's not respected)
Why I want to detect it? suppose I need many variables that need less than 256 values, so they can be 8, 16, 32 bits, but using the right size (same as memory and registers) will speed up things and save memory, and if this can't be decided in code, I have to re-write the function for every architecture
EDIT
After read the answers I found this good article
http://embeddedgurus.com/stack-overflow/category/efficient-cc/page/4/
I will quote the conclusion (added bold)
Thus
the bottom line is this. If you want
to start writing efficient, portable
embedded code, the first step you
should take is start using the C99
data types ‘least’ and ‘fast’. If your
compiler isn’t C99 compliant then
complain until it is – or change
vendors. If you make this change I
think you’ll be pleasantly surprised
at the improvements in code size and
speed that you’ll achieve.
I have to re-write the function for every architecture
No you don't. Use C99's stdint.h, which has types like uint_fast8_t, which will be a type capable of holding 256 values, and quickly.
Then, no matter the platform, the types will change accordingly and you don't change anything in your code. If your platform has no set of these defined, you can add your own.
Far better than rewriting every function.
To answer your deeper question more directly, if you have a need for very specific storage sizes that are portable across platforms, you should use something like types.h stdint.h which defines storage types specified with number of bits.
For example, uint32_t is always unsigned 32 bits and int8_t is always signed 8 bits.
#include <limits.h>
const int bitwidth = sizeof(int) * CHAR_BIT;
The ISA you're compiling for is already known to the compiler when it runs over your code, so your best bet is to detect it at compile time. Depending on your environment, you could use everything from autoconf/automake style stuff to lower level #ifdef's to tune your code to the specific architecture it'll run on.
I don't exactly understand what you mean by "there is no standar for use "int" as "register width". In the original C language specification (C89/90) the type int is implied in certain contexts when no explicit type is supplied. Your register c is equivalent to register int c and that is perfectly standard in C89/90. Note also that C language specification requires type int to support at least -32767...+32767 range, meaning that on any platform int will have at least 16 value-forming bits.
As for the bit width... sizeof(int) * CHAR_BIT will give you the number of bits in the object representation of type int.
Theoretically though, the value representation of type int is not guaranteed to use all bits of its object representation. If you need to determine the number of bits used for value representation, you can simply analyze the INT_MIN and INT_MAX values.
P.S. Looking at the title of your question, I suspect that what you really need is just the CHAR_BIT value.
Does an unsigned char or unsigned short suit your needs? Why not use that? If not, you should be using compile time flags to bring in the appropriate code.
I think that in this case you don't need to know how many bits has your architecture. Just use variables as small as possible if you want to optimize your code.

Resources