Storing a large number in C - c

I have the following code where I have an array. I add a large number to that array, but when printing it, it shows a smaller, incorrect value. Why is that, and is there a way to fix this?
int x[10];
x[0] = 252121521121;
printf(" %i " , x[0]); //prints short wrong value

Your number requires 38 bits (39 with a sign bit). If your platform's int isn't that big (and there's no reason it should be), the number simply won't fit. (In fact, assigning that literal to an int should already have triggered a compiler warning, supposing that this is C or C++.)
You could always use a data type of guaranteed size, like an int64 or something like that, depending on your language and platform. Probably no need for arbitrary-precision libraries here.
In C, include <stdint.h> and use int64_t, or just use long long int, and make sure you initialize it from a long long integer literal, e.g. 252121521121LL. (long long has been standard since C99 and C++11; older compilers may offer it only as an extension.) When printing, use a matching conversion such as %lld rather than %i.
(Edit: long long int is guaranteed to be at least 64 bit, so it should be a good choice.)

An int, on most systems, is 32 bits. That's enough to store a number of about 2 billion signed, or 4 billion unsigned. To store larger numbers you need a larger form of int. (Unfortunately, on some systems a long int is the same as an int -- good ol' standardization -- so you need to go to a long long int. Better if you can find a typedef in your library such as int64_t.)

If you only have the problem with this particular number, then just use a long long int as suggested in previous answers.
Otherwise, for even larger numbers (beyond about 9.2E18 for signed 64-bit integers, or 1.8E19 for unsigned), you might want to switch to a large-number library or code this kind of data type yourself. You basically need to store each digit of your number in an array (or linked list) and manually code the basic operations you need on them: adding, subtracting, multiplying, etc.
Some libraries include
https://mattmccutchen.net/bigint/
or GMP.

Well, your number just seems to exceed the maximum value a 32-bit integer can hold.

Related

Declaring the array size in C

It's quite embarrassing but I really want to know... So I needed to make a conversion program that converts decimal (base 10) to binary and hex. I used arrays to store values and everything worked out fine, but I declared the array as int arr[1000]; because I thought 1000 was just an OK number, not too big, not too small... Someone in class said "why would you declare an array of 1000? Integers are 32 bits". I was too embarrassed to ask what that meant so I didn't say anything. But does this mean that I can just declare the array as int arr[32]; instead? I'm using C, by the way.
No, the int type typically has a 32-bit size, but when you declare
int arr[1000];
you are reserving space for 1000 integers, i.e. 32,000 bits, while with
int arr[32];
you can store up to 32 integers.
You are practically asking yourself a question like this: if an apple weighs 32 grams, do I want my bag to contain 1000 apples or 32 apples?
Don't be embarrassed. Fear is your enemy and in the end you will be perceived based on contexts that you have no hope of significantly influencing. Anyway, to answer your question, your approach is incorrect. You should declare the array with a size completely determined by the number of positions used.
Concretely, if you access the array at 87 distinct positions (from 0 to 86) then you need a size of 87.
0 to 4,294,967,295 is the maximum range of numbers you can store in 32 bits. If your number is outside this range, you cannot store it in 32 bits. Since each bit will occupy one index of your array, an array size of 32 will do fine if your number falls in that range. For example, the number 9 will be stored in the array as a[] = {1,0,0,1}.
To find the range of numbers, the formula is 0 to (2^n - 1), where n is the number of bits. This means that in an array of size 4 (4 bits) you can only store numbers from 0 to 15.
In C, an int can typically store values up to 2,147,483,647, or 4,294,967,295 if you are using unsigned int. Since the maximum value an int can store is within the range expressible in 32 bits, an array of 32 elements is enough for the binary digits: you will never need more than 32 bits to express any number held in a 32-bit int.
I would use
int a = 42;
char bin[sizeof a * CHAR_BIT + 1]; /* CHAR_BIT comes from <limits.h> */
char hex[sizeof a * CHAR_BIT / 4 + 1];
I think this covers all possibilities.
Consider also that the 'int' type is ambiguous. Generally it depends on the machine you're working on, and its minimum guaranteed range is -32767 to +32767:
https://en.wikipedia.org/wiki/C_data_types
Can I suggest to use the stdint types?
int32_t/uint32_t
What you did is okay. If that is precisely what you want to do. C is a language that lets you do whatever you want. Whenever you want. The reason you were berated on the declaration is because of 'hogging' memory. The thought being, how DARE YOU take up space that is possibly never used... it is inefficient.
And it is. But who cares if you just want to run a program that has a simple purpose? A 1000 16 or 32 bit block of memory is weeeeeensy teeeeny tiny compared to computers from the way back times when it was necessary to watch over how much RAM you were taking up. So - go ahead.
But what they should have said next is how to avoid that. More on that at the end - but first a thing about built in data types in C.
An int can be 16 or 32 bits depending on your compiler and its settings. The standard only guarantees minimum widths.
A long int is at least 32 bits.
Consider:
short int x = 10; // at least 16 bits (exactly 16 on most platforms)
signed int x = 10; // typically 32 bits, with negative and positive range
unsigned int x = 10; // same width as int, but only 0 and positive values
To make sure you get at least 32 bits, declare it 'long':
long int x = 10; // at least 32 bits
unsigned long int x = 10; // at least 32 bits, 0 and positive values
Typical nomenclature is to call a 16 bit value a WORD and a 32 bit value a DWORD - (double word). But why would you want to type in:
long int x = 10;
instead of:
int x = 10;
?? For a few reasons. Some compilers may handle int as a 16-bit WORD if keeping up with older standards. But the main reason is to maintain a convention of explicitly typed code. Make it read directly what you intend it to do. This also helps readability. You will KNOW when you see it what size it is at minimum, and be reminded whilst coding. Many, many code mishaps happen for lack of attention to coding practices and to naming things well. Save yourself hours of headache later on by learning good habits now. Create YOUR OWN style of coding. Take a look at other styles just to get an idea of what the industry may expect. But in the end you will find your own way in it.
On to the array issue ---> So, I expect you know that the array takes up memory right when the program runs. Right then, wham - the RAM for that array is set aside just for your program. It is locked out from use by any other resource, service, etc the operating system is handling.
But wouldn't it be neat if you could just use the memory you needed when you wanted, and then let it go when done? Inside the program - as it runs. So when your program first started, the array (so to speak) would be zero. And when you needed a 'slot' in the array, you could just add one.... use it, and then let it go - or add another - or ten more... etc.
That is called dynamic memory allocation. And it requires the use of a data type that you may not have encountered yet. Look up "Pointers in C" to get an intro.
If you are coding in regular C there are a few functions that assist in performing dynamic allocation of memory:
malloc and free, declared in <stdlib.h>
in C++ they are implemented differently. Look for:
new and delete
A common construct for handling dynamic 'arrays' is called a "linked-list." Look that up too...
Don't let someone get you flustered with code concepts. Next time just say your program is designed to handle exactly what you have intended. That usually stops the discussion.
Atomkey

Should I use the stdint.h integer types on 32/64 bit machines?

One thing that bugs me about the regular c integer declarations is that their names are strange, "long long" being the worst. I am only building for 32 and 64 bit machines so I do not necessarily need the portability that the library offers, however I like that the name for each type is a single word in similar length with no ambiguity in size.
// multiple word types are hard to read
// long integers can be 32 or 64 bits depending on the machine
unsigned long int foo = 64;
long int bar = -64;
// easy to read
// no ambiguity
uint64_t foo = 64;
int64_t bar = -64;
On 32 and 64 bit machines:
1) Can using a smaller integer such as int16_t be slower than something higher such as int32_t?
2) If I needed a for loop to run just 10 times, is it ok to use the smallest integer that can handle it instead of the typical 32 bit integer?
for (int8_t i = 0; i < 10; i++) {
}
3) Whenever I use an integer that I know will never be negative is it ok to prefer using the unsigned version even if I do not need the extra range it provides?
// instead of the one above
for (uint8_t i = 0; i < 10; i++) {
}
4) Is it safe to use a typedef for the types included from stdint.h
typedef int32_t signed_32_int;
typedef uint32_t unsigned_32_int;
edit: both answers were equally good and I couldn't really lean towards one so I just picked the answerer with lower rep
1) Can using a smaller integer such as int16_t be slower than something higher such as int32_t?
Yes it can be slower. Use int_fast16_t instead. Profile the code as needed. Performance is very implementation dependent. A prime benefit of int16_t is its small, well defined size (also it must be 2's complement) as used in structures and arrays, not so much for speed.
The typedef name int_fastN_t designates the fastest signed integer type with a width of at least N. C11 §7.20.1.3 2
2) If I needed a for loop to run just 10 times, is it ok to use the smallest integer that can handle it instead of the typical 32 bit integer?
Yes but that savings in code and speed is questionable. Suggest int instead. Emitted code tends to be optimal in speed/size with the native int size.
3) Whenever I use an integer that I know will never be negative is it OK to prefer using the unsigned version even if I do not need the extra range it provides?
Using some unsigned type is preferred when the math is strictly unsigned (such as array indexing with size_t), yet code needs to watch for careless application like
for (unsigned i = 10 ; i >= 0; i--) // infinite loop
4) Is it safe to use a typedef for the types included from stdint.h
Almost always. Types like int16_t are optional. Maximum portability uses required types uint_least16_t and uint_fast16_t for code to run on rare platforms that use bits widths like 9, 18, etc.
Can using a smaller integer such as int16_t be slower than something higher such as int32_t?
Yes. Some CPUs do not have dedicated 16-bit arithmetic instructions; arithmetic on 16-bit integers must be emulated with an instruction sequence along the lines of:
r1 = r2 + r3
r1 = r1 & 0xffff
The same principle applies to 8-bit types.
Use the "fast" integer types in <stdint.h> to avoid this -- for instance, int_fast16_t will give you an integer that is at least 16 bits wide, but may be wider if 16-bit types are nonoptimal.
If I needed a for loop to run just 10 times, is it ok to use the smallest integer that can handle it instead of the typical 32 bit integer?
Don't bother; just use int. Using a narrower type doesn't actually save any space, and may cause you issues down the line if you decide to increase the number of iterations to over 127 and forget that the loop variable is using a narrow type.
Whenever I use an integer that I know will never be negative is it ok to prefer using the unsigned version even if I do not need the extra range it provides?
Best avoided. Certain C idioms do not work properly on unsigned integers; for instance, you cannot write a loop of the form:
for (i = 100; i >= 0; i--) { … }
if i is an unsigned type, because i >= 0 will always be true!
Is it safe to use a typedef for the types included from stdint.h
Safe from a technical perspective, but it'll annoy other developers who have to work with your code.
Get used to the <stdint.h> names. They're standardized and reasonably easy to type.
Absolutely possible, yes. On my laptop (Intel Haswell), in a microbenchmark that counts up and down between 0 and 65535 on two registers 2 billion times, this takes
1.313660150s - ax dx (16-bit)
1.312484805s - eax edx (32-bit)
1.312270238s - rax rdx (64-bit)
Minuscule but repeatable differences in timing. (I wrote the benchmark in assembly, because C compilers may optimize it to a different register size.)
It will work, but you'll have to keep it up to date if you change the bounds and the C compiler will probably optimize it to the same assembly code anyway.
As long as it's correct C, that's totally fine. Keep in mind that unsigned overflow is defined and signed overflow is undefined, and compilers do take advantage of that for optimization. For example,
void foo(int start, int count) {
    for (int i = start; i < start + count; i++) {
        // With unsigned arithmetic, this will execute 0 times if
        // "start + count" overflows to a number smaller than "start".
        // With signed arithmetic, that may happen, or the compiler
        // may assume this loop always runs "count" times.
        // For defined behavior, avoid signed overflow.
    }
}
Yes. Also, C99 (and therefore POSIX) provides inttypes.h, which extends stdint.h with some useful functions and macros.

How to find the largest prime factor of 600851475143?

#include <stdio.h>
main()
{
long n=600851475143;
int i,j,flag;
for(i=2;i<=n/2;i++)
{
flag=1;
if(n%i==0)//finds factors backwards
{
for(j=2;j<=(n/i)/2;j++)//checks if factor is prime
{
if((n/i)%j==0)
flag=0;
}
if(flag==1)
{
printf("%d\n",n/i);//displays largest prime factor and exits
exit(0);
}
}
}
}
The code above works for n = 6008514751. However, it doesn't work for n = 600851475143, even though that number still is within the range of a long.
What can I do to make it work?
One potential problem is that i and j are int, and could overflow for large n (assuming int is narrower than long, which it often is).
Another issue is that for n=600,851,475,143 your program does quite a lot of work (the largest prime factor is 6857). It is not unreasonable to expect it to take a long time to complete.
Use longs in place of ints. Better still, use uint64_t, which has been defined since C99 (acknowledge Zaibis). It is a 64-bit unsigned integral type on every platform that provides it. (The code as you have it will overflow on some platforms.)
And now we need to get your algorithm working more quickly:
Your test for prime is inefficient; you don't need to iterate over all the even numbers. Just iterate over primes, up to and including the square root of the number you're testing (not halfway, which is what you currently do).
Where do you get the primes from? Well, call your function recursively. Although in reality I'd be tempted to cache the primes up to, say, 65536.
From ISO/IEC 9899:TC3
5.2.4.2.1 Sizes of integer types
[...]
Their implementation-defined values shall be equal or greater in magnitude(absolute value) to those shown, with the same sign.
[...]
— minimum value for an object of type long int
LONG_MIN -2147483647 // -(2^31 - 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2^31 - 1
EDIT:
Sorry I forgot to add what this should tell you.
The point is that long doesn't even need to be able to hold the value you mentioned: the standard only requires it to hold at least 32 bits including the sign, so your machine may only be able to hold values up to 2147483647 in a variable of type long.
On a 32-bit machine long typically ranges from -2,147,483,648 to 2,147,483,647, and on a 64-bit machine it typically ranges from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (NOTE: this is not mandated by the C standard and may vary from one compiler to another).
As OP said in a comment, he is on a 32-bit machine, so 600851475143 is out of range: it does not fit in a long.
Try changing n to long long int, and change i and j to long.
EDIT: define n like this :
long long int n = 600851475143LL;
LL is a suffix that gives the literal the type long long. The printf conversion must then be %lld instead of %d.

int v/s. long in C

On my system, I get:
sizeof ( int ) = 4
sizeof ( long ) = 4
When I checked with a C program, both int & long overflowed to the negative after:
a = 2147483647;
a++;
If both can represent the same range of numbers, why would I ever use the long keyword?
int has a minimum range of -32767 to 32767, whereas long has a minimum range of -2147483647 to 2147483647.
If you are writing portable code that may have to compile on different C implementations, then you should use long if you need that range. If you're only writing non-portable code for one specific implementation, then you're right - it doesn't matter.
Because sizeof(int)==sizeof(long) isn't always true. int normally represents the fastest size with at least 16 bits. long, on the other hand, is at least 32 bits.
C defines a number of integer types and specifies the relation of their sizes. Basically, what it says is that sizeof(long long) >= sizeof(long) >= sizeof(int) >= sizeof(short) >= sizeof(char), and that sizeof(char) == 1.
But the actual sizes are not defined, and depend on the architecture you are running on. On a 32-bit PC, int and long are typically four bytes and long long is 8 bytes. But on a 64-bit system, long is typically 8 bytes, and thus different from int.
There are also types called uintptr_t and intptr_t that are wide enough to hold any data pointer converted to an integer (they are optional in the standard, but widely available).
The important thing to remember is to not assume that you can, for example, store pointer values in a long or an int. Being portable is probably more important than you think, and it is likely that you will want to compile your code on a 64-bit system in the near future.
I think it's more of a compiler issue nowadays, since computers have become much faster and handle larger numbers than they used to.
On different platform or with a different compiler, the int and long may be different.
If you don't plan to port your code to anything else or use a different machine, then pick the one you want, it won't make a difference.
It depends on the compiler, and you might want to check this out: What does the C++ standard state the size of int, long type to be?
The size of built-in data types varies with the C implementation, but they all have minimum ranges. Nowadays, int is typically 4 bytes (32 bits), on 32-bit and most 64-bit platforms alike. Note that sizeof(char) is always 1.
The size of a data type depends upon the compiler. Different compilers have different sizes for int and other data types.
So if you write code that is going to run on different machines, you should use long, or choose the type based on the range of values your variable may hold.

Should you always use 'int' for numbers in C, even if they are non-negative?

I always use unsigned int for values that should never be negative. But today I
noticed this situation in my code:
void CreateRequestHeader( unsigned bitsAvailable, unsigned mandatoryDataSize,
unsigned optionalDataSize )
{
if ( bitsAvailable - mandatoryDataSize >= optionalDataSize ) {
// Optional data fits, so add it to the header.
}
// BUG! The above includes the optional part even if
// mandatoryDataSize > bitsAvailable.
}
Should I start using int instead of unsigned int for numbers, even if they
can't be negative?
One thing that hasn't been mentioned is that interchanging signed/unsigned numbers can lead to security bugs. This is a big issue, since many of the functions in the standard C-library take/return unsigned numbers (fread, memcpy, malloc etc. all take size_t parameters)
For instance, take the following innocuous example (from real code):
//Copy a user-defined structure into a buffer and process it
char* processNext(char* data, short length)
{
char buffer[512];
if (length <= 512) {
memcpy(buffer, data, length);
process(buffer);
return data + length;
} else {
return NULL; // a pointer-returning function cannot return -1
}
}
Looks harmless, right? The problem is that length is signed, but is converted to unsigned when passed to memcpy. Thus setting length to SHRT_MIN will validate the <= 512 test, but cause memcpy to copy more than 512 bytes to the buffer - this allows an attacker to overwrite the function return address on the stack and (after a bit of work) take over your computer!
You may naively be saying, "It's so obvious that length needs to be size_t or checked to be >= 0, I could never make that mistake". Except, I guarantee that if you've ever written anything non-trivial, you have. So have the authors of Windows, Linux, BSD, Solaris, Firefox, OpenSSL, Safari, MS Paint, Internet Explorer, Google Picasa, Opera, Flash, Open Office, Subversion, Apache, Python, PHP, Pidgin, Gimp, ... on and on and on ... - and these are all bright people whose job is knowing security.
In short, always use size_t for sizes.
Man, programming is hard.
Should I always ...
The answer to "Should I always ..." is almost certainly 'no', there are a lot of factors that dictate whether you should use a datatype- consistency is important.
But, this is a highly subjective question, it's really easy to mess up unsigneds:
for (unsigned int i = 10; i >= 0; i--);
results in an infinite loop.
This is why some style guides including Google's C++ Style Guide discourage unsigned data types.
In my personal opinion, I haven't run into many bugs caused by these problems with unsigned data types — I'd say use assertions to check your code and use them judiciously (and less when you're performing arithmetic).
Some cases where you should use unsigned integer types are:
You need to treat a datum as a pure binary representation.
You need the semantics of modulo arithmetic you get with unsigned numbers.
You have to interface with code that uses unsigned types (e.g. standard library routines that accept/return size_t values).
But for general arithmetic, the thing is, when you say that something "can't be negative," that does not necessarily mean you should use an unsigned type. Because you can put a negative value in an unsigned, it's just that it will become a really large value when you go to get it out. So, if you mean that negative values are forbidden, such as for a basic square root function, then you are stating a precondition of the function, and you should assert. And you can't assert that what cannot be, is; you need a way to hold out-of-band values so you can test for them (this is the same sort of logic behind getchar() returning an int and not char.)
Additionally, the choice of signed-vs.-unsigned can have practical repercussions on performance, as well. Take a look at the (contrived) code below:
#include <stdbool.h>
bool foo_i(int a) {
return (a + 69) > a;
}
bool foo_u(unsigned int a)
{
return (a + 69u) > a;
}
Both foo's are the same except for the type of their parameter. But, when compiled with c99 -fomit-frame-pointer -O2 -S, you get:
.file "try.c"
.text
.p2align 4,,15
.globl foo_i
.type foo_i, @function
foo_i:
movl $1, %eax
ret
.size foo_i, .-foo_i
.p2align 4,,15
.globl foo_u
.type foo_u, @function
foo_u:
movl 4(%esp), %eax
leal 69(%eax), %edx
cmpl %eax, %edx
seta %al
ret
.size foo_u, .-foo_u
.ident "GCC: (Debian 4.4.4-7) 4.4.4"
.section .note.GNU-stack,"",@progbits
You can see that foo_i() is more efficient than foo_u(). This is because unsigned arithmetic overflow is defined by the standard to "wrap around," so (a + 69u) may very well be smaller than a if a is very large, and thus there must be code for this case. On the other hand, signed arithmetic overflow is undefined, so GCC will go ahead and assume signed arithmetic doesn't overflow, and so (a + 69) can't ever be less than a. Choosing unsigned types indiscriminately can therefore unnecessarily impact performance.
The answer is yes. The "unsigned" int type of C and C++ is not an "always positive integer", no matter what the name of the type looks like. The behavior of C/C++ unsigned ints makes no sense if you try to read the type as "non-negative"... for example:
The difference of two unsigned is an unsigned number (makes no sense if you read it as "The difference between two non-negative numbers is non-negative")
The addition of an int and an unsigned int is unsigned
There is an implicit conversion from int to unsigned int (if you read unsigned as "non-negative" it's the opposite conversion that would make sense)
If you declare a function accepting an unsigned parameter, then when someone passes a negative int it is simply converted implicitly to a huge positive value; in other words, using an unsigned parameter type doesn't help you find errors either at compile time or at runtime.
Indeed unsigned numbers are very useful for certain cases because they are elements of the ring "integers-modulo-N" with N being a power of two. Unsigned ints are useful when you want to use that modulo-n arithmetic, or as bitmasks; they are NOT useful as quantities.
Unfortunately in C and C++ unsigned types were also used to represent non-negative quantities, to be able to use all 16 bits when integers were that small... at that time being able to use 64k instead of 32k was considered a big difference. I'd classify it basically as a historical accident... you shouldn't try to read a logic in it because there was no logic.
By the way in my opinion that was a mistake... if 32k are not enough then quite soon 64k won't be enough either; abusing the modulo integer just because of one extra bit in my opinion was a cost too high to pay. Of course it would have been reasonable to do if a proper non-negative type was present or defined... but the unsigned semantic is just wrong for using it as non-negative.
Sometimes you may find someone who says that unsigned is good because it "documents" that you only want non-negative values... however that documentation is of value only to people who don't actually know how unsigned works in C or C++. For me, seeing an unsigned type used for non-negative values simply means that whoever wrote the code didn't understand that part of the language.
If you really understand and want the "wrapping" behavior of unsigned ints then they're the right choice (for example I almost always use "unsigned char" when I'm handling bytes); if you're not going to use the wrapping behavior (and that behavior is just going to be a problem for you, as in the case of the difference you showed) then this is a clear indicator that the unsigned type is a poor choice and you should stick with plain ints.
Does this mean that the return type of C++ std::vector<>::size() is a bad choice? Yes... it's a mistake. But if you say so, be prepared to be called bad names by those who don't understand that the "unsigned" name is just a name... what counts is the behavior, and that is a "modulo-n" behavior (and no one would consider a "modulo-n" type for the size of a container a sensible choice).
Bjarne Stroustrup, creator of C++, warns about using unsigned types in his book The C++ programming language:
The unsigned integer types are ideal for uses that treat storage as a bit array. Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.
I seem to be in disagreement with most people here, but I find unsigned types quite useful, but not in their raw historic form.
If you consistently stick to the semantics that a type represents for you, then there should be no problem: use size_t (unsigned) for array indices, data offsets, etc. Use off_t (signed) for file offsets. Use ptrdiff_t (signed) for differences of pointers. Use uint8_t for small unsigned integers and int8_t for signed ones. That way you avoid at least 80% of portability problems.
And don't use int, long, unsigned, char unless you must. They belong in the history books. (Sometimes you must: error returns and bit fields, for example.)
And to come back to your example:
bitsAvailable - mandatoryDataSize >= optionalDataSize
can be easily rewritten as
bitsAvailable >= optionalDataSize + mandatoryDataSize
which doesn't avoid the problem of a potential overflow (assert is your friend) but gets you a bit nearer to the idea of what you want to test, I think.
if (bitsAvailable >= optionalDataSize + mandatoryDataSize) {
// Optional data fits, so add it to the header.
}
Bug-free, so long as mandatoryDataSize + optionalDataSize can't overflow the unsigned integer type -- the naming of these variables leads me to believe this is likely to be the case.
You can't fully avoid unsigned types in portable code, because many typedefs in the standard library are unsigned (most notably size_t), and many functions return those (e.g. std::vector<>::size()).
That said, I generally prefer to stick to signed types wherever possible for the reasons you've outlined. It's not just the case you bring up: in mixed signed/unsigned arithmetic, the signed argument is quietly converted to unsigned.
From the comments on one of Eric Lippert's blog posts:
Jeffrey L. Whitledge
I once developed a system in which negative values made no sense as a parameter, so rather than validating that the parameter values were non-negative, I thought it would be a great idea to just use uint instead. I quickly discovered that whenever I used those values for anything (like calling BCL methods), they had be converted to signed integers. This meant that I had to validate that the values didn't exceed the signed integer range on the top end, so I gained nothing. Also, every time the code was called, the ints that were being used (often received from BCL functions) had to be converted to uints. It didn't take long before I changed all those uints back to ints and took all that unnecessary casting out. I still have to validate that the numbers are not negative, but the code is much cleaner!
Eric Lippert
Couldn't have said it better myself. You almost never need the range of a uint, and they are not CLS-compliant. The standard way to represent a small integer is with "int", even if there are values in there that are out of range. A good rule of thumb: only use "uint" for situations where you are interoperating with unmanaged code that expects uints, or where the integer in question is clearly used as a set of bits, not a number. Always try to avoid it in public interfaces.
Eric
The situation where (bitsAvailable - mandatoryDataSize) produces an 'unexpected' result when the types are unsigned and bitsAvailable < mandatoryDataSize is a reason that signed types are sometimes used even when the data is expected never to be negative.
I think there's no hard and fast rule. I typically 'default' to using unsigned types for data that has no reason to be negative, but then you have to take care to ensure that arithmetic wrapping doesn't expose bugs.
Then again, if you use signed types, you still have to sometimes consider overflow:
INT_MAX + 1
The key is that you have to take care when performing arithmetic for these kinds of bugs.
No, you should use the type that is right for your application. There is no golden rule. On small microcontrollers, for example, it is sometimes faster and more memory efficient to use 8- or 16-bit variables wherever possible, as that is often the native datapath size, but that is a very special case. I also recommend using stdint.h wherever possible. If you are using Visual Studio you can find BSD-licensed versions.
If there is a possibility of overflow, then assign the values to the next highest data type during the calculation, ie:
void CreateRequestHeader( unsigned int bitsAvailable, unsigned int mandatoryDataSize, unsigned int optionalDataSize )
{
signed __int64 available = bitsAvailable;
signed __int64 mandatory = mandatoryDataSize;
signed __int64 optional = optionalDataSize;
if ( (mandatory + optional) <= available ) {
// Optional data fits, so add it to the header.
}
}
Otherwise, just check the values individually instead of calculating:
void CreateRequestHeader( unsigned int bitsAvailable, unsigned int mandatoryDataSize, unsigned int optionalDataSize )
{
if ( bitsAvailable < mandatoryDataSize ) {
return;
}
bitsAvailable -= mandatoryDataSize;
if ( bitsAvailable < optionalDataSize ) {
return;
}
bitsAvailable -= optionalDataSize;
// Optional data fits, so add it to the header.
}
You'll need to look at the results of the operations you perform on the variables to check if you can get over/underflows - in your case, the result being potentially negative. In that case you are better off using the signed equivalents.
I don't know if it's possible in C, but in this case I would just cast the X-Y thing to an int.
If your numbers should never be less than zero, but have a chance to be < 0, by all means use signed integers and sprinkle assertions or other runtime checks around. If you're actually working with 32-bit (or 64, or 16, depending on your target architecture) values where the most significant bit means something other than "-", you should only use unsigned variables to hold them. It's easier to detect integer overflows where a number that should always be positive is very negative than when it's zero, so if you don't need that bit, go with the signed ones.
Suppose you need to count from 1 to 50000. You can do that with a two-byte unsigned integer, but not with a two-byte signed integer (if space matters that much).

Resources