I have a question.
uint64_t var = 1; // this is 000000...00001 right?
And in my code this works:
var ^ (1 << 43)
But how does it know 1 should be in 64 bits? Shouldn’t I write this instead?
var ^ ( (uint64_t) 1 << 43 )
As you supposed, 1 is a plain signed int (which on your platform is probably 32 bits wide, with 2's complement arithmetic), and so is 43, so 1 << 43 shifts by more than the width of the type: if both operands are of type int, the operator rules dictate that the result will be an int as well.
In C, shifting by at least the width of the type (and signed integer overflow in general) is undefined behavior, so in principle anything could happen. In your case the compiler probably emitted code that performs the shift in a 64-bit register, so by luck it appears to work; to get a guaranteed-correct result you should use the second form you wrote, or alternatively write 1 as an unsigned long long literal using the ULL suffix (unsigned long long is guaranteed to be at least 64 bits wide):
var ^ ( 1ULL << 43 )
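To make that concrete, here is a minimal complete program (a sketch assuming a typical platform with a 32-bit int; the printf format macros are only there to keep the example portable) showing both guaranteed-correct forms:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint64_t var = 1;

    /* The left operand of << is (at least) 64 bits wide in both cases,
       so bit 43 is set as intended. */
    uint64_t a = var ^ ((uint64_t)1 << 43);
    uint64_t b = var ^ (1ULL << 43);

    printf("%" PRIX64 "\n%" PRIX64 "\n", a, b);   /* both print 80000000001 */
    return 0;
}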
I recommend OP's approach: cast the constant, as in ( (uint64_t) 1 << 43 ).
For OP's small example, the 2 below will likely perform the same.
uint64_t var = 1;
// OP solution)
var ^ ( (uint64_t) 1 << 43 )
// Others suggested answer
var ^ ( 1ULL << 43 )
The above results have the same value, but different types. The potential difference lies in the fact that uint64_t and unsigned long long are two distinct types in C, and in what may follow from that.
uint64_t has an exact range of 0 to 2^64 - 1.
unsigned long long has a range of 0 to at least 2^64 - 1.
If unsigned long long will always be 64 bits, as it seems to be on many a machine these days, there is no issue, but let's look to the future and say this code is run on a machine where unsigned long long is 16 bytes wide (0 to at least 2^128 - 1).
A contrived example below: the result of the first ^ is a uint64_t; when multiplied by 3, the product is still uint64_t, wrapping modulo 2^64 should overflow occur, and that result is assigned to d1. In the second case, the result of ^ is an unsigned long long, and when multiplied by 3 the product may be bigger than 2^64; that result is assigned to d2. So d1 and d2 end up with different answers.
double d1, d2;
d1 = 3*(var ^ ( (uint64_t) 1 << 43 ));
d2 = 3*(var ^ ( 1ULL << 43 ));
If one wants to work with uint64_t, be consistent. Do not assume uint64_t and unsigned long long are the same. If it is OK for your answer to be an unsigned long long, fine. But in my experience, if one starts using fixed-size types like uint64_t, one does not want variant-size types messing up the computations.
var ^ ( 1ULL << 43 ) should do it.
A portable way to have a uint64_t constant is to use the UINT64_C macro (from stdint.h):
UINT64_C(1) << 43
Most likely UINT64_C(c) is defined to something like c ## ULL.
From the C standard:
The macro INTN_C(value) shall expand to an integer constant expression
corresponding to the type int_leastN_t. The macro UINTN_C(value) shall
expand to an integer constant expression corresponding to the type
uint_leastN_t. For example, if uint_least64_t is a name for the type
unsigned long long int, then UINT64_C(0x123) might expand to the
integer constant 0x123ULL.
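For example, a small sketch of how the macro is typically used (assuming only <stdint.h> and <inttypes.h>):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint64_t var = 1;

    /* UINT64_C gives the constant a type corresponding to uint_least64_t,
       so the shift by 43 is well-defined regardless of the width of int. */
    uint64_t result = var ^ (UINT64_C(1) << 43);

    printf("%" PRIu64 "\n", result);
    return 0;
}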
Your compiler doesn't know that the shift should be done in 64 bits. However, with this particular version of the compiler in this particular configuration for this particular code, two wrongs happen to make a right. Don't count on it.
Assuming that int is a 32-bit type on your platform (which is very likely), the two wrongs in 1 << 43 are:
If the shift amount is greater than or equal to the width of the type of the left operand, the behavior is undefined. This means that if x is of type int or unsigned int, then x << 43 has undefined behavior, as does x << 32 or any other x << n where n ≥ 32. For example 1u << 43 would have undefined behavior too.
If the left operand has a signed type, and the result of the operation overflows that type, then the behavior is undefined. For example 0x12345 << 16 has undefined behavior, because the type of the left operand is the signed type int but the result value doesn't fit in int. On the other hand, 0x12345u << 16 is well-defined and has the value 0x23450000u.
“Undefined behavior” means that the compiler is free to generate code that crashes or returns a wrong result. It so happens that you got the desired result in this case — this is not forbidden, but Murphy's law dictates that one day the generated code won't do what you want.
To guarantee that the operation takes place on a 64-bit type, you need to ensure that the left operand is a 64-bit type — the type of the variable that you're assigning the result to doesn't matter. It's the same issue as float x = 1 / 2 resulting in x containing 0 and not 0.5: only the types of the operands matter to determine the behavior of the arithmetic operator.
Any of (uint64_t)1 << 43 or (long long)1 << 43 or (unsigned long long)1 << 43 or 1ll << 43 or 1ull << 43 will do. If you use a signed type, then the behavior is only defined if there is no overflow, so if you're expecting truncation on overflow, be sure to use an unsigned type. An unsigned type is generally recommended even if overflow isn't supposed to happen, because the behavior is reproducible — if you use a signed type, then the mere act of printing out values for debugging purposes could change the behavior (because compilers like to take advantage of undefined behavior to generate whatever code is most efficient on a micro level, which can be very sensitive to things like pressure on register allocation).
Since you intend the result to be of type uint64_t, it is clearer to perform all computations with that type. Thus:
uint64_t var = 1;
… var ^ ((uint64_t)1 << 43) …
Related
I have a function that takes an int data_length and does the following:
unsigned char *message = (unsigned char*)malloc(65535 * sizeof(char));
message[2] = (unsigned char)((data_length >> 56) & 255);
I'm getting the following:
warning: right shift count >= width of type [-Wshift-count-overflow]
message[2] = (unsigned char)((data_length >> 56) & 255);
The program works as expected, but how can I remove the compiler warning (without disabling it)?
Similar questions didn't seem to use a variable as the data to be inserted, so their solution seemed to be to cast it to int or the like.
Shifting by an amount greater than or equal to the bit width of the type in question is not allowed by the standard, and doing so invokes undefined behavior.
This is detailed in section 6.5.7p3 of the C standard regarding bitwise shift operators.
The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If
the value of the right operand is negative or is greater than
or equal to the width of the promoted left operand, the behavior is
undefined.
If the program appears to be working, it is by luck. You could make an unrelated change to your program or simply build it on a different machine and suddenly things will stop working.
If the size of data_length is 32 bits or less, then shifting right by 56 is too big. You can only shift by 0 - 31.
The problem is simple. You're using data_length as an int when it should be unsigned, since negative lengths hardly make sense. Also, to be able to shift right by 56 bits, the value must be at least 57 bits wide (the shift count must be strictly less than the width of the type). Otherwise the behaviour is undefined.
In practice processors are known to do wildly different things. In one, shifting a 32-bit value right by 32 bits will clear the variable. In another, the value is shifted by 0 bits (32 % 32!). And then in some, perhaps the processor considers it invalid opcode and the OS kills the process.
Simple solution: declare uint64_t data_length.
If you have really limited yourself to 32-bit data types, then you can just assign 0 to the bytes that would hold the most significant bytes of the length. Or just cast to uint64_t or unsigned long long before the shift.
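For illustration, here is a minimal sketch of the cast approach; the helper name and the big-endian layout starting at message[2] are assumptions taken from the snippet above, not something specified in the question:

#include <stdint.h>

/* Hypothetical helper: store a 64-bit length big-endian at message[2..9]. */
void store_length(unsigned char *message, int data_length)
{
    uint64_t len = (uint64_t)data_length;   /* widen before shifting */

    for (int i = 0; i < 8; i++) {
        /* Shift counts 56, 48, ..., 0 are all valid for a 64-bit operand. */
        message[2 + i] = (unsigned char)((len >> (56 - 8 * i)) & 0xFF);
    }
}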
I get the following warning from our analysis tool: "Composite expression assigned to a wider essential type".
This is the code:
uint32_t result;
uint8_t resolution;
result = 1U << resolution;
I tried the following:
#define SHIFT_BY_ONE (uint8_t)1
result = SHIFT_BY_ONE << resolution;
but that then throws this warning: "Shift left of signed quantity (int)"
so I guess I am not understanding the issue correctly. How do I fix this error?
This sounds as if you are running a MISRA-C:2012 checker. Essential type and composite expression are terms from MISRA-C, not from the C standard.
In order to understand anything at all from this warning, you have to study the meaning of essential type (MISRA-C:2012 8.10) and composite expression (MISRA-C:2012 8.10.3).
As for the reason for the warning, it is rule 10.6:
The value of a composite expression shall not be assigned to an object with wider essential type.
If we ignore the meaning of all these terms, what the rule boils down to is that if you have an operation with 2 operands of a smaller type, the result should not get assigned to a variable of larger type than those operands.
This is to prevent code like this:
uint16_t a = 40000;
uint16_t b = 40000;
uint32_t result = a + b;
On a 16-bit system, the operands a and b will not get promoted, so the actual operation is carried out on a 16-bit type - and there will be an unsigned wrap-around (or overflow in the case of signed variables).
Confused programmers who don't understand how implicit type promotions work in C might think that the above operation gets carried out on uint32_t just because the result is stored in such a type. But this is not correct; the left-hand side of the assignment operator has nothing to do with the sub-expression a + b whatsoever. Which type gets used in a sub-expression is entirely determined by its own operands and operator precedence, and = has lower precedence than +.
Apparently MISRA-C believes that such misunderstandings are common, which is the rationale for this rule.
As for how to fix it, it is easy:
result = (uint32_t)1U << resolution;
This happens because 1U does not match uint32_t type. Use UINT32_C(...) macro to ensure type compatibility:
result = UINT32_C(1) << resolution;
On your system 1U is probably 16-bit unsigned (which explains the "wider type" when trying to assign to 32-bit unsigned).
In that case, I would use the long suffix for the literal:
result = 1UL << resolution;
(some comments suggest ((uint32_t)1U) << resolution which would be the most portable way after all)
1U may be narrower (16-bit) than uint32_t, hence "Composite expression assigned to a wider essential type". A shift of a 16-bit unsigned 1 by 20 does not make 0x100000.
uint32_t result;
uint8_t resolution;
result = 1U << resolution; // Potential 16-bit assignment to a 32-bit type.
The narrowness of resolution is not an issue here.
How do I fix this error?
I prefer to avoid casting when able and offer 2 alternatives. Both effectively first make a 1 of the destination type without casting. The compiler will certainly emit optimized code.
result = 1u;
result <<= resolution;
// or
result = (result*0u + 1u) << resolution;
Previously, I had the following C code, with which I intended to sign-extend the variable 'sample' after casting the variable 'sample_unsigned' to 'signed short'.
unsigned short sample_unsigned;
signed short sample;
sample = ((signed short) sample_unsigned << 4) >> 4;
In binary representation, I would expect 'sample' to have its most significant bit repeated 4 times. For instance, if:
sample_unsigned = 0x0800 (corresponding to "100000000000" in binary)
I understand 'sample' should result being:
sample = 0xF800 (corresponding to "1111100000000000" in binary)
However, 'sample' always ended being the same as 'sample_unsigned', and I had to split the assignment statement as below, which worked. Why this?
sample = ((signed short) sample_unsigned << 4);
sample >>= 4;
Your approach will not work. There is no guarantee that right-shifting will preserve the sign. Even if it did, it would only work for a 16-bit int. For a >=32-bit int you have to replicate the sign manually into the upper bits, otherwise it just shifts the same data back and forth. In general, bit-shifts of signed values are critical - see the [standard](http://port70.net/~nsz/c/c11/n1570.html#6.5.7) for details. Some constellations invoke undefined behaviour. It is better to avoid them and just work with unsigned integers.
For most platforms, the following works, however. It is not necessarily slower (on platforms with 16 bit int, it is likely even faster):
uint16_t usample;
int16_t ssample;
ssample = (int16_t)usample;
if ( ssample & 0x800 )
ssample |= ~0xFFF;
The cast to int16_t is implementation-defined; your compiler must document how it is performed. For (almost?) all recent implementations no extra operation is performed. Just verify in the generated code or your compiler documentation. The bitwise-or relies on intX_t using 2's complement representation, which is guaranteed by the standard - as opposed to the standard types.
On 32 bit platforms, there might be an intrinsic instruction to sign-extend (e.g. ARM Cortex-M3/4 SBFX). Or the compiler provides a builtin function. Depending on your use-case and speed requirements, it might be suitable to use them.
Update:
An alternative approach would be using a bitfield structure:
struct {
int16_t val : 12; // assuming 12 bit signed value like above
} extender;
extender.val = usample;
ssample = extender.val;
This might result in using the same assembler instructions I proposed above.
It is because (signed short) sample_unsigned is automatically converted to int as an operand, due to integer promotion.
sample = (signed short)((signed short) sample_unsigned << 4) >> 4;
will work as well.
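A small program illustrating both forms (note that converting 0x8000 back to signed short and right-shifting a negative value are implementation-defined, so the second result assumes the usual two's-complement, arithmetic-shift behaviour):

#include <stdio.h>

int main(void) {
    unsigned short sample_unsigned = 0x0800;
    signed short sample;

    /* The shift happens in int after promotion, so no sign extension occurs: */
    sample = ((signed short) sample_unsigned << 4) >> 4;
    printf("0x%04hX\n", (unsigned short) sample);   /* 0x0800 */

    /* Truncating back to signed short before the right shift fixes it: */
    sample = (signed short)((signed short) sample_unsigned << 4) >> 4;
    printf("0x%04hX\n", (unsigned short) sample);   /* 0xF800 */

    return 0;
}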
I came across this piece of C code:
typedef int gint;
// ...
gint a, b;
// ...
a = (b << 16) >> 16;
For ease of notation let's assume that b = 0x11223344 at this point. As far as I can see it does the following:
b << 16 will give 0x33440000
>> 16 will give 0x00003344
So, the 16 highest bits are discarded.
Why would anyone write (b << 16) >> 16 if b & 0x0000ffff would work as well? Isn't the latter form more understandable? Is there any reason to use bitshifts in a case like this? Is there any edge-case where the two could not be the same?
Assuming that the size of int is 32 bits, then there is no need to use shifts. Indeed, bitwise & with a mask would be more readable, more portable and safer.
It should be noted that left-shifting on negative signed integers invokes undefined behavior, and that left-shifting things into the sign bits of a signed integer could also invoke undefined behavior. C11 6.5.7 (emphasis mine):
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated
bits are filled with zeros. If E1 has an unsigned type, the value of
the result is E1 × 2^E2, reduced modulo one more than the maximum value
representable in the result type. If E1 has a signed type and
nonnegative value, and E1 × 2^E2 is representable in the result type,
then that is the resulting value; otherwise, the behavior is
undefined.
(The only possible rationale I can think of is some premature optimization for a 16-bit CPU with a poor compiler. Then the code would be more efficient if you broke up the arithmetic into 16-bit chunks. But on such a system, int would most likely be 16 bits, so the code wouldn't make any sense then.)
As a side note, it doesn't make any sense to use the signed int type either. The most correct and safe type for this code would have been uint32_t.
So, the 16 highest bits are discarded.
They are not. Though it is formally implementation-defined how the right-shift operation is performed on signed types, most compilers do it so as to replicate the sign bit.
Thus, the 16 highest bits are filled by the replicated value of the 15th bit as the result of this expression.
For an unsigned integral type (eg, the uint32_t we first thought was being used),
(b << 16) >> 16
is identical to b & ((1 << 16) - 1), i.e. b & 0xFFFF.
For a signed integral type though,
(b << 16)
could become negative (ie, the low int16_t would have been considered negative when taken on its own), in which case
(b << 16) >> 16
will (probably) still be negative due to sign extension. In that case, it isn't the same as the & mask, because the top bits will be set instead of zero.
Either this behaviour is deliberate (in which case the commented-out typedef is misleading), or it's a bug. I can't tell without reading the code.
Oh, and the shift behaviour in both directions is how I'd expect gcc to behave on x86, but I can't comment on how portable it is outside that. The left-shift may be UB as Lundin points out, and sign extension on the right-shift is implementation defined.
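To make the difference concrete without invoking the undefined signed left shift discussed above, here is a small sketch that does the shift on an unsigned type and gets sign extension via an explicit conversion (the out-of-range conversion to int16_t is implementation-defined, but behaves as shown on two's-complement targets):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
    uint32_t b = 0x11228344;   /* the low 16 bits (0x8344) have bit 15 set */

    /* On an unsigned type, shifting up and back down is exactly the mask: */
    uint32_t masked = (b << 16) >> 16;           /* 0x8344 */

    /* To get sign extension instead, convert the low half to a signed
       16-bit type; on two's-complement targets this yields -31932. */
    int32_t extended = (int16_t)(b & 0xFFFF);

    printf("masked = 0x%" PRIX32 ", extended = %" PRId32 "\n", masked, extended);
    return 0;
}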
while ((1U << i) < nSize) {
i++;
}
Any particular reason to use 1U instead of 1?
On most compilers, both will give a result with the same representation. However, according to the C standard, a left shift of a signed value whose result does not fit in the type is undefined, and a right shift of a negative value is implementation-defined, so in theory 1U << i is more portable than 1 << i. In practice, all C compilers you'll ever encounter treat signed left shifts the same as unsigned left shifts.
The other reason is that if nSize is unsigned, then comparing it against a signed 1 << i will generate a compiler warning. Changing the 1 to 1U gets rid of the warning message, and you don't have to worry about what happens if i is 31 or 63.
The compiler warning is most likely the reason why 1U appears in the code. I suggest compiling C with most warnings turned on, and eliminating the warning messages by changing your code.
1U is unsigned. It can carry values twice as big, but without negative values.
Depending on the environment, when using U, i can be a maximum of either 31 or 15, without causing an overflow. Without using U, i can be a maximum of 30 or 14.
31, 30 are for 32 bit int
15, 14 are for 16 bit int
If nSize is an int, it can be at most 2147483647 (2^31 - 1). If you use 1 instead of 1U then 1 << 30 will get you 1073741824, while 1 << 31 overflows a signed int and will typically wrap to -2147483648, and so the while loop will never end if nSize is larger than 1073741824.
With 1U << i, 1U << 31 will evaluate to 2147483648, and so you can safely use it for nSize up to 2147483647. If nSize is an unsigned int, it is also possible that the loop never ends, as in that case nSize can be larger than 1U << 31.
Edit: So I disagree with the answers telling you nSize should be unsigned, but if it is signed then it should not be negative...
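A minimal sketch of where the difference bites, assuming a 32-bit int (the value of nSize is only an illustration):

#include <stdio.h>

int main(void) {
    unsigned int nSize = 2000000000u;   /* between 2^30 and 2^31 - 1 */
    unsigned int i = 0;

    /* With 1U the shift stays unsigned: 1U << 31 == 2147483648 >= nSize,
       so the loop stops with i == 31.  With a plain signed 1, 1 << 31
       overflows int (undefined behaviour); a typical wrap to a negative
       value would keep the loop running past i == 31 into further
       undefined shifts. */
    while ((1U << i) < nSize) {
        i++;
    }

    printf("i = %u\n", i);   /* prints 31 */
    return 0;
}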
1U is unsigned.
The reason why they used an unsigned value in that is expression is (I guess) because nSize is unsigned too, and compilers (when invoked with certain parameters) give warnings when comparing a signed and an unsigned values.
Another reason (less likely, in my opinion, but we cannot know without knowing what value nSize is supposed to assume) is that unsigned values can be twice as big as signed ones, so nSize could be up to ~4*10^9 instead of ~2*10^9.