Unsigned version of lldiv in C?

I have a function which needs the quotient and remainder for an unsigned 64-bit division. It looks like lldiv and lldiv_t, while long long ints rather than ints, are signed. Is there an unsigned version? If not, what's the best way to handle this?
Speed is important (as usual, billions or trillions of operations), but the compiler might be smart enough to handle this properly -- I'm using gcc 4.3.3.

Just use the division and remainder operators. Any sane compiler will do a much better job optimizing them than a call to div, ldiv, or lldiv.
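A minimal sketch of what that looks like for unsigned 64-bit values. The names u64div_t and u64div are made up for illustration and are not part of the standard library. When the quotient and remainder of the same operands are computed next to each other, compilers can usually derive both from one division, but that is worth confirming in the generated assembly.

```c
#include <stdint.h>

/* Hypothetical unsigned counterpart to lldiv_t (not a standard type). */
typedef struct {
    uint64_t quot; /* numer / denom */
    uint64_t rem;  /* numer % denom */
} u64div_t;

/* The caller must ensure denom != 0; dividing by zero is undefined. */
static inline u64div_t u64div(uint64_t numer, uint64_t denom)
{
    u64div_t r;
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
}
```

Usage would be something like u64div_t r = u64div(n, d); followed by reading r.quot and r.rem.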

Related

Determine the fastest unsigned integer type for basic arithmetic calculations

I'm writing some code for calculating with arbitrarily large unsigned integers. This is just for fun and training, otherwise I'd use libgmp. My representation uses an array of unsigned integers and for choosing the "base type", I use a typedef:
#include <limits.h>
#include <stddef.h> // for size_t
#include <stdint.h>
typedef unsigned int hugeint_Uint;
typedef struct hugeint hugeint;
#define HUGEINT_ELEMENT_BITS (CHAR_BIT * sizeof(hugeint_Uint))
#define HUGEINT_INITIAL_ELEMENTS (256 / HUGEINT_ELEMENT_BITS)
struct hugeint
{
size_t s; // <- maximum number of elements
size_t n; // <- number of significant elements
hugeint_Uint e[]; // <- elements of the number starting with least significant
};
The code is working fine, so I only show the part relevant to my question here.
I would like to pick a better "base type" than unsigned int, so the calculations are the fastest possible on the target system (e.g. a 64bit type when targeting x86_64, a 32bit type when targeting i686, an 8bit type when targeting avr_attiny, ...)
I thought that uint_fast8_t should do what I want. But I found out it doesn't, see e.g. here the relevant part of stdint.h from MinGW:
/* 7.18.1.3 Fastest minimum-width integer types
* Not actually guaranteed to be fastest for all purposes
* Here we use the exact-width types for 8 and 16-bit ints.
*/
typedef signed char int_fast8_t;
typedef unsigned char uint_fast8_t;
The comment is interesting: for which purpose would an unsigned char be faster than an unsigned int on win32? Well, the important thing is: uint_fast8_t will not do what I want.
So is there some good and portable way to find the fastest unsigned integer type?
It's not quite that black and white; processors may have different/specialized registers for certain operations, like AVX registers on x86_64, may operate most efficiently on half-sized registers or not have registers at all. The choice of the "fastest integer type" thus depends heavily on the actual calculation you need to perform.
Having said that, C99 defines uintmax_t which is meant to represent the maximum width unsigned integer type, but beware, it could be 64 bit simply because the compiler is able to emulate 64-bit math.
If you target commodity processors, size_t usually provides a good approximation for the "bitness" of the underlying hardware because it is directly tied to the memory addressing capability of the machine, and as such is most likely to be the most optimal size for integer math.
In any case you're going to have to test your solution on all hardware that you're planning to support.
It's a good idea to start your code with the largest integer type the platform has, uintmax_t. As has already been pointed out, this is not necessarily the fastest type, but it most probably is. There are exceptions, but as a default it is probably your best bet.
Be very careful to build the size granularity into expressions that the compiler can resolve at compile time rather than at runtime, for speed.
It is most probably a good idea to define the base type as something like
#define LGINT_BASETYPE uintmax_t
#define LGINT_GRANUL sizeof(LGINT_BASETYPE)
This will allow you to change the base type in one single place and adapt to different platforms quickly. That results in code that is easily moved to a new platform, but still can be adapted to the exception cases where the largest int type is not the most performant one (after you have proven that by measurement)
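As a rough sketch of that idea, the base type can be made overridable from the build command line, with everything derived from it kept as compile-time constants. The names lgint_base, LGINT_BITS and lgint_elements_for_bits are assumptions added for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Default to the widest type; override with e.g. -DLGINT_BASETYPE=uint32_t
   once a measurement shows another width is faster on the target. */
#ifndef LGINT_BASETYPE
#define LGINT_BASETYPE uintmax_t
#endif

typedef LGINT_BASETYPE lgint_base;

#define LGINT_GRANUL sizeof(lgint_base)        /* bytes per element */
#define LGINT_BITS   (CHAR_BIT * LGINT_GRANUL) /* bits per element  */

/* Number of elements needed to hold `bits` bits, rounded up.
   The divisor is a compile-time constant, so no runtime lookup is needed. */
static size_t lgint_elements_for_bits(size_t bits)
{
    return (bits + LGINT_BITS - 1) / LGINT_BITS;
}
```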
As always, it does not make a lot of sense to think about optimal performance when designing your code - Start with a reasonable balance of "designed for optimization" and "design for maintainability" - You might easily find out that the choice of base type is not really the most CPU-eating part of your code. In my experience, I was nearly always in for some surprises when comparing my guesses on where CPU is spent to my measurements. Don't fall into the premature optimization trap.

C atmega2560 Division of large integers

So I'm wondering about the cost of division on an ATmega2560 as well as in general:
Let's say I got something like this
unsigned long long a=some-large-value;
unsigned long long b=some-other-large-value;
unsigned long result=(a-b)/A_CONSTANT;
//A_CONSTANT is e.g. 16
How long does it actually take? Are we speaking about hundreds or thousands of cycles? And does it make a difference if I change the division to a multiplication, i.e. like so:
unsigned long result=(a-b)*1/A_CONSTANT;
I want to use that in a time-critical application for calculating a time span which is used for determining when to execute another part of the program. Assuming the division takes too much time, what other options do I have?
This really depends on your A_CONSTANT and how good the compiler is IMO.
I've looked up the chip and it's obviously an 8 bit processor with 8 or 16 MHz.
As such, I'd consider those unsigned long long integer to be the biggest hurdle to take, if your division is trivial.
For this it would have to be a power of two (like 2, 4, 8, 16, etc.). What would happen then is an optimization that replaces the whole division with a simple right shift, which completes in far fewer cycles.
Switching to a multiplication won't net you anything good. As written, (a-b)*1/A_CONSTANT is evaluated left to right as ((a-b)*1)/A_CONSTANT, so it is still the same division. If you instead wrote (a-b)*(1/A_CONSTANT), the result would be 0 all the time unless A_CONSTANT is 1, since 1/A_CONSTANT is an integer division whose result is rounded down.
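A small sketch of both points, assuming A_CONSTANT is 16 as in the question's comment. The function names are invented for illustration. For unsigned operands, dividing by 16 and shifting right by 4 give the same value, and where the parentheses go decides whether the "multiplication" variant is a division or a constant zero.

```c
#include <stdint.h>

#define A_CONSTANT 16u /* assumed value, taken from the question's comment */

/* Plain division; a compiler may turn this into a shift by itself. */
static uint32_t span_div(uint64_t a, uint64_t b)
{
    return (uint32_t)((a - b) / A_CONSTANT);
}

/* Explicit shift: 16 == 2^4, and the operand is unsigned. */
static uint32_t span_shift(uint64_t a, uint64_t b)
{
    return (uint32_t)((a - b) >> 4);
}

/* Parsed as ((a - b) * 1) / A_CONSTANT: still the same division. */
static uint32_t span_mul_left(uint64_t a, uint64_t b)
{
    return (uint32_t)((a - b) * 1 / A_CONSTANT);
}

/* 1 / A_CONSTANT is an integer division and yields 0 for A_CONSTANT > 1. */
static uint32_t span_mul_recip(uint64_t a, uint64_t b)
{
    return (uint32_t)((a - b) * (1 / A_CONSTANT));
}
```

Whether the shift is actually cheaper on the AVR than what the compiler emits for the division is something only the generated assembly or a measurement can tell.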
So what to do or whether to consider this something for optimization heavily depends on the actual value of A_CONSTANT.
Probably the easiest way solving this (or comparing solutions) would be comparing the resulting assembly code, because it will be the final result that's actually processed. Optimizing this purely on theory is rather complicated and might even get you wrong or misleading results.
The AVR instruction set doesn't have a divide instruction of its own, so, as mentioned in the comments, it all comes down to how the compiler you are using implements this operation.
You might want to have a look at the generated machine instructions to see what's actually produced and think about possible optimisations.
There is a lot of information available on Google about different implementations of integer division, like for example this
Also a very good source of information.

About the use of signed integers in C family of languages

When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happens most of the time.
When reading other peoples' code, I rarely see unsigned integers, even if the represented value can't be negative.
So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?
I've searched on the subject, here and in other places, and I have to say I can't find a good reason not to use unsigned integers, when it applies.
I came across those questions: «Default int type: Signed or Unsigned?», and «Should you always use 'int' for numbers in C, even if they are non-negative?» which both present the following example:
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design. Of course, it may result in an infinite loop, with unsigned integers.
But is it so hard to check if foo.Length() is 0, before the loop?
So I personally don't think this is a good reason for using signed integers all the way.
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
Ok, that's good to have a specific value that means «error».
But then, what's wrong with something like UINT_MAX, for that specific value?
I'm actually asking this question because it may lead to some huge problems, usually when using third-party libraries.
In such a case, you often have to deal with signed and unsigned values.
Most of the time, people just don't care about the signedness, and just assign, for instance, an unsigned int to a signed int, without checking the range.
I have to say I'm a bit paranoid with the compiler warning flags, so with my setup, such an implicit conversion will result in a compiler error.
For that kind of stuff, I usually use a function or macro to check the range, and then assign using an explicit cast, raising an error if needed.
This just seems logical to me.
As a last example, as I'm also an Objective-C developer (note that this question is not related to Objective-C only):
- ( NSInteger )tableView: ( UITableView * )tableView numberOfRowsInSection: ( NSInteger )section;
For those not fluent with Objective-C, NSInteger is a signed integer.
This method actually retrieves the number of rows in a table view, for a specific section.
The result will never be a negative value (as the section number, by the way).
So why use a signed integer for this?
I really don't understand.
This is just an example, but I just always see that kind of stuff, with C, C++ or Objective-C.
So again, I'm just wondering if people just don't care about that kind of problems, or if there is finally a good and valid reason not to use unsigned integers for such cases.
Looking forward to hear your answers : )
A signed return value might yield more information (think error numbers: 0 is sometimes a valid answer, -1 indicates an error, see man read), which might be relevant especially for developers of libraries.
If you are worrying about the one extra bit you gain when using unsigned instead of signed, then you are probably using the wrong type anyway (this is also a kind of "premature optimization" argument).
Languages like Python, Ruby, JScript etc. are doing just fine without signed vs. unsigned. That might be an indicator ...
When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.
When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happens most of the time.
To carefully consider which type that is most suitable each time you declare a variable is very good practice! This means you are careful and professional. You should not only consider signedness, but also the potential max value that you expect this type to have.
The reason why you shouldn't use signed types when they aren't needed has nothing to do with performance, but with type safety. There are lots of potential, subtle bugs that can be caused by signed types:
The various forms of implicit promotions that exist in C can cause your type to change signedness in unexpected and possibly dangerous ways. The integer promotion rule that is part of the usual arithmetic conversions, the lvalue conversion upon assignment, the default argument promotions used by for example VA lists, and so on.
When using any form of bitwise operators or similar hardware-related programming, signed types are dangerous and can easily cause various forms of undefined behavior.
By declaring your integers unsigned, you automatically skip past a whole lot of the above dangers. Similarly, by declaring them as large as unsigned int or larger, you get rid of lots of dangers caused by the integer promotions.
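One example of how the integer promotions can surprise, sketched with made-up function names. A uint8_t operand is promoted to int before ~ is applied, so the result is a negative int rather than an 8-bit value. The promotion is written out as a separate variable here to make the step visible.

```c
#include <stdint.h>

/* x is promoted to int before ~ is applied, so the result has all high
   bits set and can never equal a value in the range 0..255. */
static int complement_equals_naive(uint8_t x, uint8_t expected)
{
    int promoted = ~x; /* e.g. ~0x0F is -16 as an int, not 0xF0 */
    return promoted == (int)expected;
}

/* Converting back to uint8_t discards the high bits set by the promotion. */
static int complement_equals_cast(uint8_t x, uint8_t expected)
{
    return (uint8_t)~x == expected;
}
```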
Both size and signedness are important when it comes to writing rugged, portable and safe code. This is the reason why you should always use the types from stdint.h and not the native, so-called "primitive data types" of C.
So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?
I don't really think it is because they don't care, nor because they are lazy, even though declaring everything int is sometimes referred to as "sloppy typing" - which means sloppily picked type more than it means too lazy to type.
I rather believe it is because they lack deeper knowledge of the various things I mentioned above. There's a frightening number of seasoned C programmers who don't know how implicit type promotions work in C, nor how signed types can cause poorly-defined behavior when used together with certain operators.
This is actually a very frequent source of subtle bugs. Many programmers find themselves staring at a compiler warning or a peculiar bug, which they can make go away by adding a cast. But they don't understand why, they simply add the cast and move on.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
To me, this is just bad design
Indeed it is.
Once upon a time, down-counting loops would yield more efficient code, because the compiler could pick a "branch if zero" instruction instead of a "branch if larger/smaller/equal" instruction - the former is faster. But this was at a time when compilers were really dumb and I don't believe such micro-optimizations are relevant any longer.
So there is rarely ever a reason to have a down-counting loop. Whoever made the argument probably just couldn't think outside the box. The example could have been rewritten as:
for(unsigned int i=0; i<foo.Length(); i++)
{
unsigned int index = foo.Length() - i - 1;
thing[index] = something;
}
This code should not have any impact on performance, but the loop itself becomes a whole lot easier to read, while at the same time fixing the bug that your example had.
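For comparison, here is a sketch of the up-counting rewrite next to a down-counting form that is also safe with an unsigned index. The reverse-copy task and both function names are assumptions chosen only to make the loops checkable. The down-counting version tests i before decrementing it, so the body never sees a wrapped-around value.

```c
#include <stddef.h>

/* Up-counting loop; the reversed index is computed inside the body. */
static void reverse_copy_upcount(const int *src, int *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t index = n - i - 1;
        dst[i] = src[index];
    }
}

/* Down-counting loop: `i-- > 0` compares first, then decrements,
   so the body runs with i = n-1 ... 0 and stops cleanly when n == 0. */
static void reverse_copy_downcount(const int *src, int *dst, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        dst[n - 1 - i] = src[i];
    }
}
```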
As far as performance is concerned nowadays, one should probably spend the time pondering which form of data access is most ideal in terms of data cache use, rather than anything else.
Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.
That's a poor argument. Good API design uses a dedicated error type for error reporting, such as an enum.
Instead of having some hobbyist-level API like
int do_stuff (int a, int b); // returns -1 if a or b were invalid, otherwise the result
you should have something like:
err_t do_stuff (int32_t a, int32_t b, int32_t* result);
// returns ERR_A if a is invalid, ERR_B if b is invalid, ERR_XXX if... and so on
// the result is stored in [result], which is allocated by the caller
// upon errors the contents of [result] remain untouched
The API would then consistently reserve the return of every function for this error type.
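A possible shape for such a function, as a sketch only. The enum values ERR_OK and ERR_NULL, the validity rules (a must be non-negative, b must be positive) and the division performed are all invented here to have something concrete to check. Only the signature and the "result untouched on error" rule come from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error type; the members are assumptions for illustration. */
typedef enum {
    ERR_OK,   /* success, *result has been written */
    ERR_A,    /* a was invalid */
    ERR_B,    /* b was invalid */
    ERR_NULL  /* result pointer was NULL */
} err_t;

/* Assumed rules: a >= 0 and b > 0. On any error, *result is left untouched. */
static err_t do_stuff(int32_t a, int32_t b, int32_t *result)
{
    if (result == NULL) return ERR_NULL;
    if (a < 0)          return ERR_A;
    if (b <= 0)         return ERR_B;

    *result = a / b;
    return ERR_OK;
}
```

The caller then checks the return value first and only reads the result when it is ERR_OK.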
(And yes, many of the standard library functions abuse return types for error handling. This is because it contains lots of ancient functions from a time before good programming practice was invented, and they have been preserved the way they are for backwards-compatibility reasons. So just because you find a poorly-written function in the standard library, you shouldn't run off to write an equally poor function yourself.)
Overall, it sounds like you know what you are doing and giving signedness some thought. That probably means that knowledge-wise, you are actually already ahead of the people who wrote those posts and guides you are referring to.
The Google style guide for example, is questionable. Similar could be said about lots of other such coding standards that use "proof by authority". Just because it says Google, NASA or Linux kernel, people blindly swallow them no matter the quality of the actual contents. There are good things in those standards, but they also contain subjective opinions, speculations or blatant errors.
I would recommend referring to real professional coding standards instead, such as MISRA-C. It enforces lots of thought and care for things like signedness, type promotion and type size, where less detailed/less serious documents just skip past it.
There is also CERT C, which isn't as detailed and careful as MISRA, but at least a sound, professional document (and more focused towards desktop/hosted development).
There is one heavy-weight argument against widely using unsigned integers:
Premature optimization is the root of all evil.
We all have at least on one occasion been bitten by unsigned integers. Sometimes like in your loop, sometimes in other contexts. Unsigned integers add a hazard, even though a small one, to your program. And you are introducing this hazard to change the meaning of one bit. One little, tiny, insignificant-but-for-its-sign-meaning bit. On the other hand, the integers we work with in bread and butter applications are often far below the range of integers, more in the order of 10^1 than 10^7. Thus, the different range of unsigned integers is in the vast majority of cases not needed. And when it's needed, it is quite likely that this extra bit won't cut it (when 31 is too little, 32 is rarely enough) and you'll need a wider or an arbitrary-wide integer anyway. The pragmatic approach in these cases is to just use the signed integer and spare yourself the occasional underflow bug. Your time as a programmer can be put to much better use.
From the C FAQ:
The first question in the C FAQ is which integer type should we decide to use?
If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned types.
Another question concerns types conversions:
If an operation involves both signed and unsigned integers, the situation is a bit more complicated. If the unsigned operand is smaller (perhaps we're operating on unsigned int and long int), such that the larger, signed type could represent all values of the smaller, unsigned type, then the unsigned value is converted to the larger, signed type, and the result has the larger, signed type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both values are converted to a common unsigned type, and the result has that unsigned type.
You can find it here. So basically using unsigned integers, mostly for arithmetic conversions can complicate the situation since you'll have to either make all your integers unsigned, or be at the risk of confusing the compiler and yourself, but as long as you know what you are doing, this is not really a risk per se. However, it could introduce simple bugs.
And when is it good to use unsigned integers? One situation is when using bitwise operations:
The << operator shifts its first operand left by a number of bits
given by its second operand, filling in new 0 bits at the right.
Similarly, the >> operator shifts its first operand right. If the
first operand is unsigned, >> fills in 0 bits from the left, but if
the first operand is signed, >> might fill in 1 bits if the high-order
bit was already 1. (Uncertainty like this is one reason why it's
usually a good idea to use all unsigned operands when working with the
bitwise operators.)
taken from here
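A tiny sketch of the well-defined case from that quote, with a made-up helper name. With an unsigned operand, >> always shifts in zero bits, so the result does not depend on the top bit. The same expression on an int32_t holding a negative value would be implementation-defined.

```c
#include <stdint.h>

/* Extracts bits 24..31. Because `word` is unsigned, zeros are shifted in
   from the left even when the top bit is set. */
static uint32_t top_byte(uint32_t word)
{
    return word >> 24;
}
```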
And I've seen this somewhere:
If it was best to use unsigned integers for values that are never negative, we would have started by using unsigned int in the main function int main(int argc, char* argv[]). One thing is sure, argc is never negative.
EDIT:
As mentioned in the comments, the signature of main is due to historical reasons and apparently it predates the existence of the unsigned keyword.
Unsigned integers are an artifact from the past. This is from a time when processors could do unsigned arithmetic a little bit faster.
This is a case of premature optimization which is considered evil.
Actually, in 2003 when AMD introduced x86_64 (or AMD64, as it was then called), the 64-bit architecture for x86, they brought the ghosts of the past back: if a signed integer is used as an index and the compiler cannot prove that it is never negative, it has to insert a 32-to-64-bit sign extension instruction, because the default 32-to-64-bit extension is unsigned (the upper half of a 64-bit register gets cleared if you move a 32-bit value into it).
But I would recommend against using unsigned in any arithmetic at all, being it pointer arithmetic or just simple numbers.
for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}
Any recent compiler will warn about such a construct, with "condition is always true" or similar. By using a signed variable you avoid such pitfalls altogether. Use ptrdiff_t instead.
A problem might be the C++ library, which uses the unsigned type size_t for sizes. An unsigned type is required there because of some rare corner cases with very large sizes (between 2^31 and 2^32) on 32-bit systems with certain boot switches (/3GB on Windows).
There are many more pitfalls. Comparisons between signed and unsigned come to mind, where the signed value automagically gets converted to unsigned and thus becomes a huge positive number when it was a small negative one before.
One exception for using unsigned exists: For bit fields, flags, masks it is quite common. Usually it doesn't make sense at all to interpret the value of these variables as a magnitude, and the reader may deduce from the type that this variable is to be interpreted in bits.
The result will never be a negative value (as the section number, by the way). So why use a signed integer for this?
Because you might want to compare the return value to a signed value which is actually negative. The comparison should return true in that case, but the C standard specifies that the signed value gets converted to unsigned in that case, and you will get false instead. I don't know about Objective-C though.
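The comparison pitfall can be sketched like this. Both function names are invented. The first function spells out, with an explicit cast, the conversion that C applies implicitly when an int is compared with an unsigned int. The second compares by mathematical value.

```c
/* What C does for `s < u`: s is converted to unsigned int first,
   so a negative s turns into a very large positive value. */
static int less_as_converted(int s, unsigned int u)
{
    return (unsigned int)s < u;
}

/* Value-correct comparison: any negative s is less than every unsigned u. */
static int less_by_value(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

With s = -1 and u = 1, the first returns 0 and the second returns 1.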

Signed vs Unsigned operations in C

Very simple question:
I have a program doing lots and lots of mathematical computations over ints and long longs. To fit in an extra bit, I made the long longs unsigned, since I only dealt with positive numbers, and could now get a few more values.
Oddly enough, this gave me a 15% performance boost, which I confirmed to be in simply making all the long long's unsigned.
Is this possible? Are mathematical operations really faster with unsigned numbers? I remember reading that there would be no difference, and the compiler automatically picks out the fastest way to go whether signed or unsigned. Is this 15% boost really from making the vars unsigned, or could it be something else affected in my code?
And, if it really is from making the vars unsigned, should I aim to make everything (even ints) unsigned, as I never need negative numbers, and every second is important if I can save it.
In some operations, signed integers are faster, in others, unsigned are faster:
In C, signed integer operations can be assumed not to wrap. The compiler will take advantage of this in loop optimization, for example. Comparisons can be optimized away similarly. (This can also lead to subtle bugs if you don't expect this).
On the other hand, unsigned integers do not have this assumption. However, not having to deal with a sign is a big advantage for some operations, for example: division. Unsigned division by a constant power of two is a simple shift, but (depending on your rounding rules) there's a conditional off-by-1 for negative numbers.
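To illustrate the off-by-one, here is a sketch with invented function names. Since C99, signed division truncates toward zero, whereas a plain arithmetic shift would round toward negative infinity, so the two disagree for negative values that are not exact multiples. That difference is what the compiler has to correct for when it replaces a signed division by a shift.

```c
/* Unsigned: dividing by 4 and shifting right by 2 always agree. */
static unsigned int udiv4(unsigned int x)
{
    return x >> 2;
}

/* Signed division in C truncates toward zero: -7 / 4 is -1. */
static int sdiv4_trunc(int x)
{
    return x / 4;
}

/* Rounding toward negative infinity instead, which is what a bare
   arithmetic shift would compute: -7 becomes -2. */
static int sdiv4_floor(int x)
{
    int q = x / 4;
    if (x % 4 < 0) {
        q--; /* the conditional adjustment for negative, inexact results */
    }
    return q;
}
```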
Personally, I make a habit of only using unsigned integers unless I really, really do have a value which needs to be signed. It's not so much for performance as correctness.
You may see the effect magnified with long long, which (I'm guessing) is 64 bits in your case. The CPU usually doesn't have single instructions to deal with these types (in 32-bit mode), so the slight added complexity for signed operations will be more noticeable.
On a 32-bit processor, 64-bit integer operations are emulated; using unsigned instead of signed means the emulation library doesn't have to do extra work to propagate carry bits etc.
There are three cases where a compiler cares whether a variable is signed or unsigned:
When the variable is converted to a longer type
When the comparison operators (greater-than, etc.) are applied
When overflows might occur
On some machines, conversion of signed variables to longer types requires extra code; on other machines, a conversion may be performed as part of a 'load' or 'move' instruction.
Some machines (mainly small embedded microcontrollers) require more instructions to perform a signed-versus-signed comparison than unsigned-versus-unsigned, but most machines have a full array of both signed and unsigned compare instructions available.
When overflows occur with unsigned types, the compiler may have to add code to ensure that the defined behavior actually occurs. No such code is required for signed types, because anything that might happen in the absence of such code would be permitted by the standard.
The compiler doesn't pick if it's going to be unsigned or signed. But, yes, in theory, unsigned with unsigned is faster than signed with signed. If you really want to slow things down, you'll go with signed with unsigned. And even worse: floats with integers.
It depends on the processor, of course.

Calculating with a variable outside of its bounds in C

If I make a calculation with a variable where an intermediate part of the calculation goes higher than the bounds of that variable type, is there any hazard that some platforms may not like?
This is an example of what I'm asking:
int a, b;
a=30000;
b=(a*32000)/32767;
I have compiled this, and it does give the correct answer of 29297 (well, within truncating error, anyway). But the part that worries me is that 30,000*32,000 = 960,000,000, which is a 30-bit number, and thus cannot be stored in a 16-bit int. The end result is well within the bounds of an int, but I was expecting that whatever working part of memory would have the same size allocated as the largest source variables did, so an overflow error would occur.
This is just a small example to show my problem, I am trying to avoid using floating points by making the fraction be a fraction of the max amount able to be stored in that variable (in this case, a signed integer, so 32767 on the positive side), because the embedded system I'm using I believe does not have an FPU.
So how do most processors handle calculations out of the bounds of the source and destination variables?
On a 16-bit compiler/CPU, you can (almost) plan on that code giving incorrect results. This is a bit sad, since nearly every CPU (that has a multiply instruction at all) will produce and store the intermediate result, but no C compiler (of which I'm aware) will normally use it (and if you made a and b unsigned, it wouldn't be allowed to use it).
You have a few choices to deal with this. One is to write a small muldiv function in assembly language that does the multiplication (preserving the high word) then the division on that, and finally returns the value to C when it's been reduced back into range.
Another option is to do the math on unsigned integers, which at least allow you to figure out when a problem occurred. Unfortunately, none of the choices is what I'd call particularly appealing though...
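A sketch of the unsigned option, assuming 16-bit operands as in the question. The function name is made up. Unsigned arithmetic wraps modulo 2^16 in a defined way, and dividing the wrapped product by one factor reveals whether anything was lost.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiplies modulo 2^16, stores the wrapped product, and returns true
   only if the mathematical product actually fit into 16 bits. */
static bool mul_u16_checked(uint16_t a, uint16_t b, uint16_t *product)
{
    /* Cast to unsigned int first: otherwise both operands are promoted
       to (signed) int where int is wider than 16 bits, and the
       multiplication could overflow a signed type. */
    uint16_t p = (uint16_t)((unsigned int)a * b);

    *product = p;
    return a == 0 || p / a == b;
}
```

For the values in the question, 30000 * 32000 wraps to 28672 and the function reports that the product did not fit.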
As far as I know, most if not all processors will hold results for a word * word multiplication in a double word -- meaning, an 8 bit * 8 bit is stored in a 16-bit register(s) on an 8-bit processor, a 32-bit * 32 bit operation is stored in a 64-bit register(s) on a 32-bit machine. (At least, that's how it's been on all the embedded microcontrollers I've used)
If that weren't the case, the processor would be severely crippled in the sense of only allowing half-word * half-word multiplication.
AFAIK this kind of thing is formally "undefined". You have to do the algebra necessary to prevent overflow. That's always your first choice. Numeric stability is no accident, it requires some care in deciding when and how to do division and multiplication.
Or, you have to guarantee that you'll use an intermediate result buffer that's big enough.
Using a large intermediate buffer is what some C compilers do anyway. The language, however, doesn't make any guarantees.
So, to be sure that it works, most folks do something like this.
short a= 30000;
long temp= a; // long is at least 32 bits, so the product below fits
long temp2= (temp*32000L)/32767L;
// here you can check for errors; if temp2 > 32767, you have overflow.
short b= (short)temp2;
Signed integer overflow is undefined behavior.
Almost any implementation you could possibly meet will wrap around on integer overflow, because (a) everyone uses 2's complement, in which arithmetic operations are bitwise identical for signed and unsigned types of the same size, and (b) wraparound is the defined behavior of unsigned types in C.
So, on an implementation with a 16 bit int, I would expect the result 0 for your calculation (and that is the result that it must have if you'd used an unsigned 16 bit int). But I'd code against the possibility it might throw a hardware exception, explode, etc.
Note that if you do the calculation with two 16 bit short variables on a machine with a 32 bit int, then you will generally get the "right" answer 29297, because the intermediate value (a*32000) is an int, and only gets truncated back to short at the end. I say "generally" because converting an out-of-bounds integer value to a signed integer type either gives an implementation-defined result or else raises an implementation-defined signal. But again, any implementation you'll encounter in polite company just takes a modulus.
Are you sure your compiler has 16 bit integers? On most systems nowadays, ints are 32 bits. Another possible reason you aren't getting an error is that some compilers will recognize that they can compute something like this at compile time and will do so.
If you are really concerned that you will end up with overflow, you can sometimes reorder or factor the formula differently so that no intermediate terms will overflow. In your example that would be hard to do since all of your terms are near the limit of a 16 bit value. Do you need the number to be exactly right, or can you approximate? If you can, you can do something like this:
int a, b;
a=30000;
//b=(a*32000)/32767 ~= a * (32000/32768) = a *(125/128)
b = (a / 128) * 125; // if a=30000, b = 29250 - about 0.16% error
Another option would be to use larger sized types for intermediate terms. If your compiler had 16 bit ints and 32 bit longs, you could do something like this:
int a, b;
a=30000;
b=((long)a*32000L)/32767L;
Really, there's no set answer for how to handle overflow. You need to evaluate each case on its own and decide what the best solution is.
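The wider-intermediate approach can be wrapped in a small helper, sketched here with fixed-width types from stdint.h. The name muldiv16 and the bool-plus-out-parameter interface are assumptions for illustration. The product of two 16-bit values always fits in 32 bits, so only the final narrowing needs a range check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes (a * b) / c with a 32-bit intermediate. Returns false and
   leaves *out untouched if c is 0 or the result does not fit in 16 bits. */
static bool muldiv16(int16_t a, int16_t b, int16_t c, int16_t *out)
{
    if (c == 0) {
        return false;
    }

    int32_t wide = ((int32_t)a * b) / c; /* |a * b| <= 2^30, so no overflow */

    if (wide < INT16_MIN || wide > INT16_MAX) {
        return false;
    }
    *out = (int16_t)wide;
    return true;
}
```

With a = 30000, b = 32000 and c = 32767 this yields 29297, the value the question expected.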
Both your compiler and your target processor determine the sizes of the various data types.
Compilers will usually promote variables to the largest easy-to-work-with size during calculations and then convert the results to whatever size is needed for an assignment at the end.
There are also C rules that govern promoting to sizes which are more difficult to work with for some calculations. If you are compiling for an AVR, which has 8 bit registers but defines an int to be 16 bits, many calculations end up using more registers than you might think they need, because of this promotion and the fact that constant numbers in your code have to be treated as int or unsigned int unless the compiler can prove to itself that this won't affect the outcome of the calculations.
Try rewriting your code with various different sizes of integers (short, int, long, long long) and see how that goes. You may also want to write a simple program that prints out the sizeof( ) of the standard predefined types.
If you need to worry about the sizes of your integer variables and/or the intermediate results of your calculations, then you should include <stdint.h> and use things like uint32_t and int64_t for your declarations and type casts.
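A simple size-printing routine along those lines might look like this. The function name is made up, and the sizes are cast to unsigned and printed with %u on the assumption that a small embedded C library may not support %zu.

```c
#include <limits.h>
#include <stdio.h>

/* Prints the width in bits of the standard integer types.
   Returns the number of lines successfully printed. */
static int print_type_sizes(void)
{
    int lines = 0;

    lines += printf("char:      %u bits\n", (unsigned)(sizeof(char) * CHAR_BIT)) > 0;
    lines += printf("short:     %u bits\n", (unsigned)(sizeof(short) * CHAR_BIT)) > 0;
    lines += printf("int:       %u bits\n", (unsigned)(sizeof(int) * CHAR_BIT)) > 0;
    lines += printf("long:      %u bits\n", (unsigned)(sizeof(long) * CHAR_BIT)) > 0;
    lines += printf("long long: %u bits\n", (unsigned)(sizeof(long long) * CHAR_BIT)) > 0;

    return lines;
}
```

Calling print_type_sizes() once from main on each target shows directly whether int is 16 or 32 bits there.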
