Set most significant bit in C

I am trying to set the most significant bit in a long long unsigned, x.
To do that I am using this line of code:
x |= 1<<((sizeof(x)*8)-1);
I thought this should work, because sizeof gives the size in bytes, so I multiplied by 8 and subtracted one to reach the final bit. Whenever I do that, the compiler emits this warning: "warning: left shift count >= width of type"
I don't understand why this warning occurs.

The 1 that you are shifting is a constant of type int, which means that you are shifting an int value by (sizeof(unsigned long long) * 8) - 1 bits. This shift can easily be more than the width of int, which is apparently what happened in your case.
If you want to obtain a bit-mask of unsigned long long type, you should start with an initial bit-mask of unsigned long long type, not of int type.
1ull << (sizeof(x) * CHAR_BIT - 1)
An arguably better way to build the same mask would be
~(-1ull >> 1)
or
~(~0ull >> 1)
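As a minimal sketch of the two approaches above (the function names here are made up for illustration), both masks can be built and compared like this. The shift version assumes unsigned long long has no padding bits, so that sizeof times CHAR_BIT is its real width; the complement version does not need that assumption.

```c
#include <limits.h>

/* Mask built by shifting. 1ULL already has type unsigned long long,
   so the shift count stays below the width of the shifted type. */
static unsigned long long msb_by_shift(void)
{
    return 1ULL << (sizeof(unsigned long long) * CHAR_BIT - 1);
}

/* Mask built by complement. ~0ULL is all ones, the right shift
   clears the top bit, and the outer ~ leaves only the top bit set. */
static unsigned long long msb_by_complement(void)
{
    return ~(~0ULL >> 1);
}

/* Hypothetical helper: returns x with its most significant bit set. */
static unsigned long long set_msb(unsigned long long x)
{
    return x | msb_by_complement();
}
```

Calling set_msb(x) leaves every other bit of x untouched, which is the same effect as x |= mask in the question.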

Use 1ULL << instead of 1 <<.
Using just "1" makes you shift an int. 1ULL is an unsigned long long, which is what you need.
An int will probably be 32 bits wide and long long probably 64 bits wide. So shifting:
1 << ((sizeof(long long)*8)-1)
will be (most probably):
1 << 63
Since 1 is an int, which is (most probably) 32 bits wide, you get a warning because you are trying to shift past the MSB of a 32-bit value.

The literal 1 you are shifting is not automatically an unsigned long long (but an int) and thus does not have as many bits as you need. Suffix it with ULL (i.e., 1ULL), or cast it to unsigned long long before shifting to make it the correct type.
Also, to be a bit safer for strange platforms, replace 8 with CHAR_BIT. Note that this is still not necessarily the best way to set the most significant bit, see, e.g., this question for alternatives.
You should also consider using a type such as uint64_t if you're assuming unsigned long long to be a certain width, or uint_fast64_t/uint_least64_t if you need at least a certain width, or uintmax_t if you need the largest available type.

Thanks to the 2's complement representation of negative integers, the most-negative integer is exactly the desired bit pattern with only the MSB set. So x |= (unsigned long long)LLONG_MIN; should work too (LLONG_MIN is the standard macro from <limits.h>; LONG_LONG_MIN is an older GNU spelling of the same value).

Related

Is there a generic "isolate a single byte" bit mask for all systems, irrespective of CHAR_BIT?

If CHAR_BIT == 8 on your target system (most cases), it's very easy to mask out a single byte:
unsigned char lsb = foo & 0xFF;
However, there are a few systems and C implementations out there where CHAR_BIT is neither 8 nor a multiple thereof. Since the C standard only mandates a minimum range for char values, there is no guarantee that masking with 0xFF will isolate an entire byte for you.
I've searched around trying to find information about a generic "byte mask", but so far haven't found anything.
There is always the O(n) solution:
unsigned char mask = 1;
size_t i;
for (i = 0; i < CHAR_BIT; i++)
{
mask |= (mask << i);
}
However, I'm wondering if there is any O(1) macro or line of code somewhere that can accomplish this, given how important this task is in many system-level programming scenarios.
The easiest way to extract an unsigned char from an integer value is simply to cast it to unsigned char:
(unsigned char) SomeInteger
Per C 2018 6.3.1.3 2, the result is the remainder of SomeInteger modulo UCHAR_MAX+1. (This is a non-negative remainder; it is always adjusted to be greater than or equal to zero and less than UCHAR_MAX+1.)
Assigning to an unsigned char has the same effect, as assignment performs a conversion (and initializing works too):
unsigned char x;
…
x = SomeInteger;
If you want an explicit bit mask, UCHAR_MAX is such a mask. This is so because unsigned integers are pure binary in C, and the maximum value of an unsigned integer has all value bits set. (Unsigned integers in general may also have padding bits, but unsigned char may not.)
One difference can occur in very old or esoteric systems: If a signed integer is represented with sign-and-magnitude or one’s complement instead of today’s ubiquitous two’s complement, then the results of extracting an unsigned char from a negative value will differ depending on whether you use the conversion method or the bit-mask method.
On review (after acceptance), the part of @Eric Postpischil's answer about UCHAR_MAX makes for a preferable mask.
#define BYTE_MASK UCHAR_MAX
The value UCHAR_MAX shall equal 2^CHAR_BIT − 1. C11dr §5.2.4.2.1 2
unsigned char cannot have padding, so UCHAR_MAX is always the all-bits-set pattern in a character type and hence in a C "byte".
some_signed & some_unsigned is a problem on non-2's complement systems, as some_signed is converted to unsigned before the &, which changes the bit pattern of negative values. To avoid that, the all-ones mask needs to be signed when masking signed types. This is usually the case with foo & UCHAR_MAX, because UCHAR_MAX has type int whenever UCHAR_MAX <= INT_MAX.
Conclusion
Assume: foo is of some integer type.
If only 2's complement is of concern, use a cast - it does not change the bit pattern.
unsigned char lsb = (unsigned char) foo;
Otherwise, with any integer encoding and UCHAR_MAX <= INT_MAX:
unsigned char lsb = foo & UCHAR_MAX;
Otherwise TBD
Shifting an unsigned 1 by CHAR_BIT and then subtracting 1 will work even on esoteric non-2's complement systems (credit: @Some programmer dude). Be sure to use unsigned math.
On such systems, this preserves the bit pattern, unlike an (unsigned char) cast on negative integers.
unsigned char mask = (1u << CHAR_BIT) - 1u;
unsigned char lsb = foo & mask;
Or make a define
#define BYTE_MASK ((1u << CHAR_BIT) - 1u)
unsigned char lsb = foo & BYTE_MASK;
To also handle those pesky cases where UINT_MAX == UCHAR_MAX where 1u << CHAR_BIT would be UB, shift in 2 steps.
#define BYTE_MASK (((1u << (CHAR_BIT - 1)) << 1u) - 1u)
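A small sketch of how the two-step mask might be used follows. The low_byte helper and its unsigned long parameter are assumptions made for illustration; they are not part of the answer above.

```c
#include <limits.h>

/* Two-step shift: never shifts by the full width of unsigned int,
   even on a platform where UINT_MAX == UCHAR_MAX. */
#define BYTE_MASK (((1u << (CHAR_BIT - 1)) << 1) - 1u)

/* Hypothetical helper: isolate the lowest byte of an unsigned value.
   Using an unsigned operand keeps the & free of sign-representation issues. */
static unsigned char low_byte(unsigned long value)
{
    return (unsigned char)(value & BYTE_MASK);
}
```

On a platform with CHAR_BIT == 8, low_byte(0x1234UL) yields 0x34; with a wider byte it yields the value reduced modulo UCHAR_MAX + 1.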
UCHAR_MAX does not have to be equal to (1U << CHAR_BIT) - 1U.
You actually need to AND with that calculated value, not with UCHAR_MAX:
value & ((1U << CHAR_BIT) - 1U).
Many real implementations (for example TI) define UCHAR_MAX as 255 and emit code that behaves as it would on machines with 8-bit bytes. This is done to preserve compatibility with code written for other targets.
For example
unsigned char x;
x++;
will generate code that checks whether the value of x is larger than UCHAR_MAX and, if so, sets x to zero.

C find maximum two's complement integer

I am tasked with finding maximum two's complement integer, or the TMax. I am at a complete loss for how to do this. I know that the correct value is 0x7fffffff, or 2147483647, but I do not know how exactly to get to this result. That's the maximum number for a 32 bit integer. I cannot use functions or conditionals, and at most I can use 4 operations. Can anyone try and help explain this to me? I know the way to find the maximum number for a certain bit count is 2^(bits - 1) - 1, so 2^(31) - 1 = 2147483647
Assuming you know that your machine uses two's complement representation, this is how you would do so in a standard compliant manner:
unsigned int x = ~0u;
x >>= 1;
printf("max int = %d\n", (int)x);
By using an unsigned int, you prevent any implementation defined behavior caused by right shifting a negative value.
find maximum two's complement integer
int TMax = -1u >> 1 or -1u/2 is sufficient to find the maximum int when INT_MAX == UINT_MAX/2.
This "works" whether int is encoded as 2's complement or as the now rare 1s' complement or sign-magnitude.
Better to use
#include <limits.h>
int TMax = INT_MAX;
Other tricks can involve undefined, implementation defined, unspecified behavior which are best avoided in C.
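For comparison, here is a minimal sketch of both routes side by side. The function names are invented for this example, and the shift version only gives the right answer under the stated assumption that INT_MAX == UINT_MAX / 2.

```c
#include <limits.h>

/* Portable route: the implementation reports its own maximum. */
static int tmax_from_limits(void)
{
    return INT_MAX;
}

/* Bit-trick route: start from all ones in unsigned int and shift
   right once. Everything is done in unsigned arithmetic, so no
   negative value is ever shifted. Assumes INT_MAX == UINT_MAX / 2. */
static int tmax_from_shift(void)
{
    unsigned int all_ones = ~0u;
    return (int)(all_ones >> 1);
}
```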
There are two scenarios in which you may be looking for the maximum positive number: either given an integer data type or given a number of bits. There are also two solutions.
Fill and shift right
Working in an integer data type of a size that exactly matches the size of the desired twos complement data type, you might be able to solve the problem by
((unsigned 'type') ~0) >> 1
or equivalently,
((unsigned 'type') ~0) / 2
For example, on a machine where short is 16 bits,
(unsigned short) ~0 ==> 0xFFFF (65535)
((unsigned short) ~0) >> 1 ==> 0x7FFF (32767)
On a 32 bit data type, this method gives us 0x7FFFFFFF (2147483647).
In C, an integer type has a minimum size only; e.g., an int can be 16 bits, 32 bits, or larger. But the word size used in the calculation must exactly match that of the intended target.
Also, note that the data must be an unsigned type. The right shift for a signed type is usually implemented as a sign extended shift (the sign bit is copied into the result).
Set the sign bit only and subtract 1
The second technique, which works for any word size equal to or larger than the number of bits of the desired twos complement word size, is
((unsigned integer_type) 1 << (n-1)) - 1
For example, in any integer word size of 16 bits or more, we can find the TMax for 16 bits as
(unsigned integer_type) 1 << 15 ==> binary 1000 0000 0000 0000 (0x8000)
((unsigned integer_type) 1 << 15) - 1 ==> binary 0111 1111 1111 1111 (0x7FFF)
This is robust and works in almost any scenario that provides adequate word size.
Again the data type for the calculation has to be unsigned if the word size in the calculation is that of the target. This is not necessary for a larger word size.
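The "set the sign bit only and subtract 1" technique can be sketched as a small function. The name tmax_for_bits is hypothetical, and unsigned long long is chosen here only because the standard guarantees it is at least 64 bits wide. Note the parentheses around the shift: in C, - binds tighter than <<, so leaving them out would shift by one bit too few instead of subtracting 1 from the result.

```c
/* Hypothetical helper: largest value of an n-bit two's complement
   integer, valid for 1 <= n <= 64. */
static unsigned long long tmax_for_bits(unsigned n)
{
    /* Set only bit n-1 (the would-be sign bit), then subtract 1
       to turn it into n-1 one bits below it. */
    return (1ULL << (n - 1)) - 1;
}
```

For example, tmax_for_bits(16) gives 32767 and tmax_for_bits(32) gives 2147483647, matching the values quoted above.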
Examples
In the first example, we show that the second method works for 32 bits, using long or long long types.
#include <stdio.h>
int main() {
printf( "%ld\n", (long) ( ( ((unsigned long) 1)<<31 ) - 1 ) );
printf( "%lld\n", (long long) ( ( ((unsigned long long) 1)<<31 ) - 1 ) );
}
Output:
2147483647
2147483647
And here we show that the first method, shift right from all bits set, fails when int is not exactly 32 bits, which as noted, is not guaranteed in C.
#include <stdio.h>
int main() {
printf( "from long long %llu (%zu bits)\n", ( (unsigned long long) ~0 )>>1,
sizeof(unsigned long long)*8 );
printf( "from long %lu (%zu bits)\n", ( (unsigned long) ~0 )>>1,
sizeof(unsigned long)*8 );
printf( "from int %u (%zu bits)\n", ( (unsigned int) ~0 )>>1,
sizeof(unsigned int)*8 );
printf( "from short %d (%zu bits)\n", ( (unsigned short) ~0 )>>1,
sizeof(unsigned short)*8 );
}
Output:
from long long 9223372036854775807 (64 bits)
from long 9223372036854775807 (64 bits)
from int 2147483647 (32 bits)
from short 32767 (16 bits)
Again, recall that the C language only guarantees a minimum size for any integer data types. An int can be 16 bits or 32 bits or larger, depending on your platform.
Thanks for the help, everyone! Turns out I cannot use macros, unsigned, or longs. I came to this solution:
~(1 << 31)
That generates the correct output, so I will leave it at that!

Assigning bits to a 64-bit variable

I am kinda new to bit operations. I am trying to store information in an int64_t variable like this:
int64_t u = 0;
for(i=0;i<44;i++)
u |= 1 << i;
for(;i<64;i++)
u |= 0 << i;
int t = __builtin_popcountl(u);
and what I intended with this was to store 44 1s in variable u and make sure that the remaining positions are all 0, so "t" returns 44. However, it always returns 64. With other variables, e.g. int32, it also fails. Why?
The type of an expression is generally determined by the expression itself, not by the context in which it appears.
Your variable u is of type int64_t (incidentally, uint64_t would be better since you're performing bitwise operations).
In this line:
u |= 1 << i;
since 1 is of type int, 1 << i is also of type int. If, as is typical, int is 32 bits, this has undefined behavior for larger values of i.
If you change this line to:
u |= (uint64_t)1 << i;
it should do what you want.
You could also change the 1 to 1ULL. That gives it a type of unsigned long long, which is guaranteed to be at least 64 bits but is not necessarily the same type as uint64_t.
__builtin_popcountl takes unsigned long as its parameter, which is not always a 64-bit integer. I personally use __builtin_popcountll, which takes unsigned long long. That does not appear to be the issue in your case, though.
Integer literals have type 'int' by default, and by shifting an int by anything greater than or equal to 32 (to be precise, int's size in bits), you get undefined behavior. Correct usage: u |= 1LL << i; Here LL stands for long long.
ORing with zero does nothing. You can't just set a bit to a particular value; you should either OR with a mask (if you want to set some bits to 1) or AND with the mask's negation (if you want to set some bits to 0). Negation is done with a tilde (~).
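The OR-with-mask and AND-with-negated-mask rule can be sketched as two tiny helpers. The names set_bit and clear_bit are made up for this example, and both require pos to be below 64.

```c
#include <stdint.h>

/* Set bit 'pos' (0..63): OR with a mask that has only that bit set.
   UINT64_C(1) makes the shifted constant 64 bits wide. */
static uint64_t set_bit(uint64_t value, unsigned pos)
{
    return value | (UINT64_C(1) << pos);
}

/* Clear bit 'pos' (0..63): AND with the negation of the same mask. */
static uint64_t clear_bit(uint64_t value, unsigned pos)
{
    return value & ~(UINT64_C(1) << pos);
}
```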
When you shift a 1 into the high bit of the 32-bit integer and convert the result to 64 bits, the sign bit is extended through the upper 32 bits, which you then OR in, setting all 64 bits; this happens because your literal '1' is a signed 32-bit int by default. The shift itself does not affect the upper 32 bits, because the value is only 32 bits wide; the conversion to 64 bits does, when the value being converted is negative.
This can be fixed by writing your first loop like this:
for(i=0;i<44;i++)
u |= (int64_t)1 << i;
Moreover, this loop does nothing since ORing with 0 will not alter the value:
for(;i<64;i++)
u |= 0 << i;
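As a loop-free alternative sketch, the 44 low bits can be produced with a single shift and checked with a portable bit count. Both helper names are hypothetical, and the counting loop below is a plain C stand-in for the GCC builtin used in the question, not that builtin itself.

```c
#include <stdint.h>

/* Mask with the lowest n bits set, for 0 <= n <= 64. The n >= 64
   case is handled separately because shifting a 64-bit value by 64
   is undefined. */
static uint64_t low_bits(unsigned n)
{
    return n >= 64 ? UINT64_MAX : (UINT64_C(1) << n) - 1;
}

/* Portable population count: each iteration clears the lowest set bit. */
static int popcount64(uint64_t v)
{
    int count = 0;
    while (v != 0) {
        v &= v - 1;
        ++count;
    }
    return count;
}
```

With these, popcount64(low_bits(44)) evaluates to 44, which is the result the question was expecting.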

variables of incompatible width

I am using the following code to simplify assigning large values to specific locations in memory:
int buffer_address = virtual_to_physical(malloc(BUFFER_SIZE));
unsigned long int ring_slot = buffer_address << 32 | BUFFER_SIZE;
However, the compiler complains "warning: left shift count >= width of type". But an unsigned long int in C is 64 bits, so bit-shifting an int (32 bits) to the left 32 bits should yield a 64 bit value, and hence the compiler shouldn't complain. But it does.
Is there something obvious I'm missing, or otherwise is there a simple workaround?
An unsigned long int is not necessarily 64 bits, but for the simplicity let's assume it is.
buffer_address is of type int. The result of a shift has the (promoted) type of its left operand, so buffer_address << 32 has type int, not unsigned long. That is why the compiler complains.
This should solve your issue though:
unsigned long ring_slot = ((unsigned long) buffer_address) << 32 | BUFFER_SIZE;
Please note, an unsigned long is not necessarily 64 bits, this depends on the implementation. Use this instead:
#include <stdint.h> // introduced in C99
uint64_t ring_slot = ((uint64_t) buffer_address) << 32 | BUFFER_SIZE;
buffer_address is a (32-bit) int, so buffer_address << 32 is shifting it by an amount greater than or equal to its size.
unsigned long ring_slot = ((unsigned long) buffer_address << 32) | BUFFER_SIZE;
Note that 'unsigned long' need not be 64 bits wide (it is not on Windows, whether 32-bit (ILP32) or 64-bit (LLP64), nor is it on a 32-bit Unix machine (ILP32)). To get a guaranteed (at least) 64-bit integer, you need unsigned long long.
There are few machines where int is a 64-bit quantity (ILP64); the DEC Alpha was one such, and I believe some Cray machines also used that (and the Crays also used 'big' char types, with more than 8 bits per char).
The result of the expression on the right side of the = sign does not depend on what it's assigned to. You must cast to unsigned long first.
unsigned long int ring_slot = (unsigned long)buffer_address << 32 | BUFFER_SIZE;
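Putting the advice from these answers together, a fixed-width version might look like the sketch below. The function name make_ring_slot and the choice of uint32_t for both inputs are assumptions for illustration; the question stores the address in a plain int, which would sign-extend if it happened to be negative.

```c
#include <stdint.h>

/* Hypothetical helper: pack a 32-bit address into the high half and
   a 32-bit size into the low half of a 64-bit slot. The cast widens
   the address before the shift, so the shift count is in range. */
static uint64_t make_ring_slot(uint32_t buffer_address, uint32_t buffer_size)
{
    return ((uint64_t)buffer_address << 32) | buffer_size;
}
```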

Is there any difference between 1U and 1 in C?

while ((1U << i) < nSize) {
i++;
}
Any particular reason to use 1U instead of 1?
On most compilers, both will give a result with the same representation. However, according to the C specification, left-shifting a signed value is undefined behavior when the result is not representable in the signed type, so 1U << i is more portable than 1 << i. In practice, most C compilers you'll encounter treat signed left shifts the same as unsigned left shifts.
The other reason is that if nSize is unsigned, then comparing it against a signed 1 << i will generate a compiler warning. Changing the 1 to 1U gets rid of the warning message, and you don't have to worry about what happens if i is 31 or 63.
The compiler warning is most likely the reason why 1U appears in the code. I suggest compiling C with most warnings turned on, and eliminating the warning messages by changing your code.
1U is unsigned. It can carry values twice as big, but without negative values.
Depending on the environment, when using U, i can be a maximum of either 31 or 15, without causing an overflow. Without using U, i can be a maximum of 30 or 14.
31, 30 are for 32 bit int
15, 14 are for 16 bit int
If nSize is an int, it can be maximum of 2147483647 (2^31-1). If you use 1 instead of 1U then 1 << 30 will get you 1073741824 and 1 << 31 will be -2147483648, and so the while loop will never end if nSize is larger than 1073741824.
With 1U << i, 1U << 31 will evaluate to 2147483648, and so you can safely use it for nSize up to 2147483647. If nSize is an unsigned int, it is also possible that the loop never ends, as in that case nSize can be larger than 1U << 31.
Edit: So I disagree with the answers telling you nSize should be unsigned, but if it is signed then it should not be negative...
1U is unsigned.
The reason why they used an unsigned value in that expression is (I guess) because nSize is unsigned too, and compilers (when invoked with certain parameters) give warnings when comparing a signed value with an unsigned one.
Another reason (less likely, in my opinion, but we cannot know without knowing what value nSize is supposed to hold) is that unsigned values can be twice as big as signed ones, so nSize could be up to ~4*10^9 instead of ~2*10^9.
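The never-ending-loop concern raised in these answers can be avoided by capping the shift count. The following is only a sketch: the function name is invented, nSize is taken to be unsigned, and unsigned int is assumed to have no padding bits so that sizeof times CHAR_BIT is its width.

```c
#include <limits.h>

/* Hypothetical variant of the loop in the question: returns the
   smallest i with (1U << i) >= n_size, but stops at width - 1 so the
   shift count never reaches the width of unsigned int. For n_size
   above 1U << (width - 1) it therefore returns width - 1. */
static unsigned shift_count_for(unsigned n_size)
{
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    unsigned i = 0;
    while (i < width - 1 && (1U << i) < n_size) {
        i++;
    }
    return i;
}
```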
