Clarification needed on (u/i)int_fastN_t in C

I read many explanations of the fastest minimum-width integer types, but I couldn't understand when to use these data types.
My understanding:
On a 32-bit machine,
uint_least16_t could be a typedef for unsigned short.
1. uint_least16_t small = 38;
2. unsigned short is 16 bits, so the value 38 will be stored in 16 bits and take up 16 bits of memory.
3. The range for this data type will be 0 to (2^N)-1, here N = 16.
uint_fast16_t could be a typedef for unsigned int.
1. uint_fast16_t fast = 38;
2. unsigned int is 32 bits, so the value 38 will be stored in 32 bits and take up 32 bits of memory.
3. What will be the range for this data type?
uint_fast16_t => uint_fastN_t, here N = 16,
but the value can be stored in 32 bits, so is it 0 to (2^16)-1 or 0 to (2^32)-1?
How can we make sure that it's not overflowing?
Since it's 32 bits, can we assign a value greater than 65535 to it?
If it is a signed integer, how is signedness maintained?
For example, int_fast16_t x = 32768;
Since the value falls within the signed int range, it'll be a positive value.

A uint_fast16_t is just the fastest unsigned data type that has at least 16 bits. On some machines it will be 16 bits and on others it could be more. If you use it, you should be careful because arithmetic operations that give results above 0xFFFF could have different results on different machines.
On some machines, yes, you will be able to store numbers larger than 0xFFFF in it, but you should not rely on that being true in your design because on other machines it won't be possible.
Generally the uint_fast16_t type will be an alias for uint16_t, uint32_t, or uint64_t, and you should make sure the behavior of your code doesn't depend on which type is used.
I would say you should only use uint_fast16_t if you need to write code that is both fast and cross-platform. Most people should stick to uint16_t, uint32_t, and uint64_t so that there are fewer potential issues to worry about when porting code to another platform.
An example
Here is an example of how you might get into trouble:
bool bad_foo(uint_fast16_t a, uint_fast16_t b)
{
    uint_fast16_t sum = a + b;
    return sum > 0x8000;
}
If you call the function above with a as 0x8000 and b as 0x8000, then on some machines the sum will be 0 and on others it will be 0x10000, so the function could return true or false. Now, if you can prove that a and b will never sum to a number larger than 0xFFFF, or if you can prove that the result of bad_foo is ignored in those cases, then this code would be OK.
A safer implementation of the same code, which (I think) should behave the same way on all machines, would be:
bool good_foo(uint_fast16_t a, uint_fast16_t b)
{
    uint_fast16_t sum = a + b;
    return (sum & 0xFFFF) > 0x8000;
}
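For reference, here is a small self-contained driver (my own sketch, not part of the original answer) that exercises both functions; which result bad_foo prints depends on how wide uint_fast16_t is on your platform:

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

static bool bad_foo(uint_fast16_t a, uint_fast16_t b)
{
    uint_fast16_t sum = a + b;
    return sum > 0x8000;
}

static bool good_foo(uint_fast16_t a, uint_fast16_t b)
{
    uint_fast16_t sum = a + b;
    return (sum & 0xFFFF) > 0x8000;
}

int main(void)
{
    /* Where uint_fast16_t is exactly 16 bits, 0x8000 + 0x8000 wraps to 0 and
       bad_foo reports 0 (false); where it is wider, the sum is 0x10000 and
       bad_foo reports 1 (true). good_foo masks the sum back to 16 bits, so it
       reports 0 on every platform. */
    printf("bad_foo : %d\n", bad_foo(0x8000, 0x8000));
    printf("good_foo: %d\n", good_foo(0x8000, 0x8000));
    return 0;
}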

Related

32 bit unsigned multiply on 64 bit causing undefined behavior?

So I have roughly this code:
uint32_t s1 = 0xFFFFFFFFU;
uint32_t s2 = 0xFFFFFFFFU;
uint32_t v;
...
v = s1 * s2; /* Only need the low 32 bits of the result */
In all the following I assume the compiler couldn't have any preconceptions about the range of s1 or s2; the initializers above serve only as an example.
If I compiled this on a compiler with an integer size of 32 bits (such as when compiling for x86), no problem. The compiler would simply use s1 and s2 as uint32_t typed values (not being able to promote them further), and the multiplication would simply give the result as the comment says (modulo UINT_MAX + 1, which is 0x100000000 in this case).
However, if I compiled this on a compiler with an integer size of 64 bits (such as for x86-64), there might be undefined behavior from what I can deduce from the C standard. Integer promotion would see that uint32_t can be promoted to int (64-bit signed), the multiplication would then attempt to multiply two ints, which, if they happen to have the values shown in the example, would cause an integer overflow, which is undefined behavior.
Am I correct with this and if so how would you avoid it in a sane way?
I spotted this question which is similar, but covers C++: What's the best C++ way to multiply unsigned integers modularly safely?. Here I would like to get an answer applicable to C (preferably C89 compatible). I wouldn't consider making a poor 32 bit machine potentially executing a 64 bit multiply an acceptable answer though (usually in code where this would be of concern, 32 bit performance might be more critical as typically those are the slower machines).
Note that the same problem can apply to 16 bit unsigned ints when compiled with a compiler having a 32 bit int size, or unsigned chars when compiled with a compiler having a 16 bit int size (the latter might be common with compilers for 8 bit CPUs: the C standard requires integers to be at least 16 bits, so a conforming compiler is likely affected).
The simplest way to get the multiplication to happen in an unsigned type that is at least as wide as uint32_t, and also at least as wide as unsigned int, is to involve an expression of type unsigned int.
v = 1U * s1 * s2;
This either converts 1U to uint32_t, or s1 and s2 to unsigned int, depending on what's appropriate for your particular platform.
Deduplicator comments that some compilers, where uint32_t is narrower than unsigned int, may warn about the implicit conversion in the assignment, and notes that such warnings are likely suppressible by making the conversion explicit:
v = (uint32_t) (1U * s1 * s2);
It looks a bit less elegant, in my opinion, though.
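Here is a self-contained sketch of the same idea, wrapped in a helper function; the name mul_lo32 is mine, not from the answer, and the trick is exactly the 1U multiplication described above:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: the 1U factor forces the multiplication into a type
   that is both at least 32 bits wide and at least as wide as unsigned int,
   so the arithmetic is always unsigned and no signed overflow can occur. */
static uint32_t mul_lo32(uint32_t a, uint32_t b)
{
    return (uint32_t)(1U * a * b);   /* low 32 bits of the product */
}

int main(void)
{
    uint32_t s1 = 0xFFFFFFFFu;
    uint32_t s2 = 0xFFFFFFFFu;
    printf("0x%08lX\n", (unsigned long)mul_lo32(s1, s2)); /* prints 0x00000001 */
    return 0;
}

The multiplication happens in an unsigned type either way, so there is no signed overflow, and the final cast keeps only the low 32 bits.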
Congratulations on finding a friction point.
A possible way:
v = (uint32_t) (UINT_MAX <= 0xffffffff
                ? s1 * s2
                : (unsigned)s1 * (unsigned)s2);
Anyway, looks like adding some typedefs to <stdint.h> for types guaranteed to be no smaller than int would be in order ;-).

How can I define a datatype with 1 bit size in C?

I want to define a data type for boolean true/false values in C. Is there any way to define a 1-bit data type to use as a boolean?
Maybe you are looking for a bit-field:
struct bitfield
{
    unsigned b0:1;
    unsigned b1:1;
    unsigned b2:1;
    unsigned b3:1;
    unsigned b4:1;
    unsigned b5:1;
    unsigned b6:1;
    unsigned b7:1;
};
There are so many implementation-defined features to bit-fields that it is almost unbelievable, but each of the elements of the struct bitfield occupies a single bit. However, the size of the structure may be 4 bytes even though you only use 8 bits (1 byte) of it for the 8 bit-fields.
There is no other way to create a single bit of storage. Otherwise, C provides bytes as the smallest addressable unit, and a byte must have at least 8 bits (historically, there were machines with 9-bit or 10-bit bytes, but most machines these days provide 8-bit bytes only — unless perhaps you're on a DSP where the smallest addressable unit may be a 16-bit quantity).
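To illustrate the point about storage, here is a minimal sketch (mine, not from the answer) that sets one bit-field member and prints the size of the whole structure; on a typical compiler the struct occupies 4 bytes even though only 8 bits are used:

#include <stdio.h>

struct bitfield
{
    unsigned b0:1;
    unsigned b1:1;
    unsigned b2:1;
    unsigned b3:1;
    unsigned b4:1;
    unsigned b5:1;
    unsigned b6:1;
    unsigned b7:1;
};

int main(void)
{
    struct bitfield flags = {0};
    flags.b3 = 1;                     /* each member can only hold 0 or 1 */
    printf("b3 = %u\n", (unsigned)flags.b3);
    printf("sizeof(struct bitfield) = %zu\n", sizeof(struct bitfield)); /* whole bytes, often 4 */
    return 0;
}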
Try this:
#define bool int
#define true 1
#define false 0
In my opinion, use a variable of type int; that is what we do in C. For example:
int flag = 0;
if (flag)
{
    // do something
}
else
{
    // do something else
}
EDIT:
You can specify the width of each field in a structure of bit-fields in bits. However, the compiler will round the structure's size up to at least the nearest byte, so you save nothing; the fields are not addressable, and the order in which the bits of a bit-field are stored is implementation-defined. E.g.:
struct A {
    int a : 1; // 1 bit wide
    int b : 1;
    int c : 2; // 2 bits
    int d : 4; // 4 bits
};
Bit-fields are only allowed inside structures. And other than bit-fields, no object is allowed to be smaller than sizeof(char).
The answer is _Bool or bool.
C99 and later have a built-in type _Bool which is guaranteed to be large enough to store the values 0 and 1. It may be 1 bit or larger, and it is an integer.
They also provide the library header <stdbool.h>, which defines a macro bool that expands to _Bool, and macros true and false that expand to 1 and 0 respectively.
If you are using an older compiler, you will have to fake it.
[edit: thanks Jonathan]
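For completeness, a minimal sketch using <stdbool.h>, assuming a C99 or later compiler:

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    bool flag = true;                        /* bool expands to _Bool */
    if (flag)
        printf("sizeof(bool) = %zu\n", sizeof(bool)); /* at least 1 byte, even though it only stores 0 or 1 */
    return 0;
}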

Subtracting 0x8000 from an int

I am reverse engineering some old C running under Win95 (yes, in production); it appears to have been compiled with a Borland compiler (I don't have the tool chain).
There is a function which does (among other things) something like this:
static void unknown(int *value)
{
    int v = *value;
    v -= 0x8000;
    *value = v;
}
I can't quite work out what this does. I assume 'int' in this context is a signed 32-bit int. I think 0x8000 would be an unsigned 32-bit int, and outside the range of a signed 32-bit int. (Edit: this is wrong; it is outside the range of a signed 16-bit int.)
I am not sure if one of these would be cast first, how the casting would handle overflows, and/or how the subtraction would handle the overflow.
I could try on a modern system, but I am also unsure if the results would be the same.
Edit for clarity:
1: 'v-=0x8000;' is straight from the original code; this is what makes little sense to me. v is defined as an int.
2: I have the code, this is not from asm.
3: The original code is very, very bad.
Edit: I have the answer! The answer below wasn't quite right, but it got me there (fix up and I'll mark it as the answer).
The data in v is coming from an ambiguous source, which actually seems to be sending unsigned 16-bit data, but it is being stored as a signed int. Later on in the program all values are converted to floats and normalised to an average zero point, so the actual value doesn't matter, only the order. Because we are looking at an unsigned int as a signed one, values over 32767 are incorrectly placed below 0, so this hack leaves the value as signed but swaps the negative and positive numbers around (without changing their order). The end result is that all the numbers have the same order (but different values) as if they were unsigned in the first place.
(...and this is not the worst code example in this program)
In Borland C 3.x, int and short were the same: 16 bits. long was 32-bits.
A hex literal has the first type in which the value can be represented: int, unsigned int, long int or unsigned long int.
In the case of Borland C, 0x8000 is a decimal value of 32768 and won't fit in an int, but will in an unsigned int. So unsigned int it is.
The statement v -= 0x8000 ; is identical to v = v - 0x8000 ;
On the right-hand side, the int value v is implicitly converted to unsigned int, per the rules; the arithmetic operation is performed, yielding a result of type unsigned int. That unsigned int is then, again per the rules, implicitly converted back to the type of the lvalue.
So, by my estimation, the net effect is to toggle the sign bit — something that could be more easily and clearly done via simple bit-twiddling: *value ^= 0x8000 ;.
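To see the effect concretely, here is a sketch (mine, not the original code) that emulates the 16-bit Borland behaviour on a modern machine by forcing 16-bit types; the conversion back to int16_t is implementation-defined in standard C, but on common two's-complement compilers it simply keeps the low 16 bits:

#include <stdint.h>
#include <stdio.h>

static void unknown16(int16_t *value)
{
    uint16_t v = (uint16_t)*value;   /* reinterpret the 16-bit pattern as unsigned */
    v -= 0x8000u;                    /* subtraction modulo 2^16: toggles the top bit */
    *value = (int16_t)v;             /* store the pattern back as signed */
}

int main(void)
{
    int16_t samples[] = { 0, 1, -32768, 32767 };
    int i;
    for (i = 0; i < 4; i++) {
        int16_t x = samples[i];
        unknown16(&x);
        printf("%6d -> %6d (0x%04X)\n", (int)samples[i], (int)x, (unsigned)(uint16_t)x);
    }
    return 0;
}

The output shows the sign bit being toggled: 0 becomes -32768, 1 becomes -32767, -32768 becomes 0, and 32767 becomes -1.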
There is possibly a clue on this page http://www.ousob.com/ng/borcpp/nga0e24.php - Guide to Borland C++ 2.x ( with Turbo C )
There is no such thing as a negative numeric constant. If
a minus sign precedes a numeric constant it is treated as
the unary minus operator, which, along with the constant,
constitutes a numeric expression. This is important with
-32768, which, while it can be represented as an int,
actually has type long int, since 32768 has type long. To
get the desired result, you could use (int) -32768,
0x8000, or 0177777.
This implies the use of two's complement for negative numbers. Interestingly, the two's complement of 0x8000 is 0x8000 itself (as the value +32768 does not fit in the range for signed 2 byte ints).
So what does this mean for your function? Bit wise, this has the effect of toggling the sign bit, here are some examples:
f(0) = f(0x0000) = 0x8000 = -32768
f(1) = f(0x0001) = 0x8001 = -32767
f(0x8000) = 0
f(0x7fff) = 0xffff
It seems like this could be represented as val ^= 0x8000, but perhaps the XOR operator was not implemented in Borland back then?

Is it safe to compare a uint32_t with a hard-coded value?

I need to do bitwise operations on 32-bit integers (that indeed represent chars, but whatever).
Is the following kind of code safe?
uint32_t input;
input = ...;
if (input & 0x03000000) {
    output = 0x40000000;
    output |= (input & 0xFC000000) >> 2;
}
I mean, in the "if" statement, I am doing a bitwise operation on, on the left side, a uint32_t, and on the right side... I don't know!
So do you know the type and size (by that I mean how many bytes it is stored in) of the hard-coded "0x03000000"?
Is it possible that some systems consider 0x03000000 to be an int and hence encode it in only 2 bytes, which would be catastrophic?
Is the following kind of code safe?
Yes, it is.
So do you know the type and size (by that I mean on how much bytes is it stored) of hard-coded "0x03000000" ?
0x03000000 is int on a system with 32-bit int and long on a system with 16-bit int.
(As uint32_t is present here I assume two's complement and CHAR_BIT of 8. Also I don't know any system with 16-bit int and 64-bit long.)
Is it possible that some systems consider 0x03000000 as an int and hence code it only on 2 bytes, which would be catastrophic?
See above: on a 16-bit int system, 0x03000000 is a long and is 32 bits. A hexadecimal constant in C has the first type in which it can be represented:
int, unsigned int, long, unsigned long, long long, unsigned long long
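To make this concrete, here is a small self-contained sketch of the original code (the test value is arbitrary and my own); the usual arithmetic conversions bring the constant up to at least the width of uint32_t before the &, so no bits are lost on any conforming platform:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t input = 0x03FF0000u;    /* arbitrary test value */
    uint32_t output = 0;

    /* Even on a 16-bit-int system, 0x03000000 has type long (32 bits),
       so the & is performed on full-width operands. */
    if (input & 0x03000000) {
        output = 0x40000000;
        output |= (input & 0xFC000000) >> 2;
    }
    printf("output = 0x%08lX\n", (unsigned long)output);
    return 0;
}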

Can the type difference between constants 32768 and 0x8000 make a difference?

The Standard specifies that hexadecimal constants like 0x8000 (larger than fits in a signed integer) are unsigned (just like octal constants), whereas decimal constants like 32768 are signed long. (The exact types assume a 16-bit integer and a 32-bit long.) However, in regular C environments both will have the same representation, in binary 1000 0000 0000 0000.
Is a situation possible where this difference really produces a different outcome? In other words, is a situation possible where this difference matters at all?
Yes, it can matter. If your processor has a 16-bit int and a 32-bit long type, 32768 has the type long (since 32767 is the largest positive value fitting in a signed 16-bit int), whereas 0x8000 (since it is also considered for unsigned int) still fits in a 16-bit unsigned int.
Now consider the following program:
int main(int argc, char *argv[])
{
    volatile long long_dec = ((long)~32768);
    volatile long long_hex = ((long)~0x8000);
    return 0;
}
When 32768 is considered long, the negation will invert 32 bits, resulting in a representation 0xFFFF7FFF with type long; the cast is superfluous.
When 0x8000 is considered unsigned int, the negation will invert 16 bits, resulting in a representation 0x7FFF with type unsigned int; the cast will then zero-extend to a long value of 0x00007FFF.
Look at H&S5, section 2.7.1 page 24ff.
It is best to augment the constants with U, UL or L as appropriate.
On a 32 bit platform with 64 bit long, a and b in the following code will have different values:
int x = 2;
long a = x * 0x80000000; /* multiplication done in unsigned -> 0 */
long b = x * 2147483648; /* multiplication done in long -> 0x100000000 */
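If you want to see which type a constant actually gets on your platform, a C11 _Generic selection can report it. This is my own sketch, not part of the answer; on a typical 32-bit-int platform both constants simply report int, while on a 16-bit-int platform 32768 would be long and 0x8000 would be unsigned int:

#include <stdio.h>

#define TYPE_NAME(x) _Generic((x),                      \
        int: "int",                                      \
        unsigned int: "unsigned int",                    \
        long: "long",                                    \
        unsigned long: "unsigned long",                  \
        long long: "long long",                          \
        unsigned long long: "unsigned long long",        \
        default: "other")

int main(void)
{
    printf("32768  has type %s\n", TYPE_NAME(32768));
    printf("0x8000 has type %s\n", TYPE_NAME(0x8000));
    return 0;
}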
Another example not yet given: compare (with the greater-than or less-than operators) -1 to both 32768 and 0x8000. Or, for that matter, try comparing each of them for equality with an int variable equal to -32768.
Assuming int is 16 bits and long is 32 bits (which is actually fairly unusual these days; int is more commonly 32 bits):
printf("%ld\n", 32768); // prints "32768"
printf("%ld\n", 0x8000); // has undefined behavior
In most contexts, a numeric expression will be implicitly converted to an appropriate type determined by the context. (That's not always the type you want, though.) This doesn't apply to non-fixed arguments to variadic functions, such as any argument to one of the *printf() functions following the format string.
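If you do need to print such a constant portably, the usual fix (my suggestion, not part of the original answer) is to cast the argument to the type the format string expects:

#include <stdio.h>

int main(void)
{
    printf("%ld\n", (long)32768);   /* the cast makes the argument match %ld */
    printf("%ld\n", (long)0x8000);  /* now well defined regardless of the constant's type */
    return 0;
}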
The difference would show up if you tried to add a value to the constant: as a 16-bit int it could not hold the result, because it would exceed the bounds of the type, whereas as a 32-bit long you could add any number less than 2^16 to it without overflow.
