Shift operator on integers - c

I have the following function:
#include<stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdint.h>
#include <inttypes.h>
uint64_t dtally(uint64_t x)
{
uint64_t t = 0;
while (x) t += 1 << ((x%10) * 6), x /= 10;
return t;
}
int main(int argc, char *argv[])
{
printf("%" PRIu64 "\n", dtally(39));
return 0;
}
When I pass 39, I expect it to return the following value:
18014398509481984
but it returns this value:
4456448
Why does it return this value and not the expected one?

There are two problems with your code (in fact, it is the same problem twice).
First, t must be a 64-bit type. An int is usually 32 bits (and at least 16 bits), so 2^54 will not fit in it. But fixing t alone is not enough.
The second problem is trickier: 1 << ((x % 10) * 6) performs a shift on the literal 1, and 1 is an int. Shifting a 32-bit int by 54 is undefined behavior because the shift count exceeds the width of the type. On x86 the count is typically reduced modulo 32, so 1 << 54 behaves like 1 << 22, which is why you get 4456448 (2^22 + 2^18). To solve this, cast the literal 1 to int64_t or use the literal 1LL (type long long) instead.
So you should have something like this:
int64_t count(int x)
{
int64_t t = 0;
while (x) t += 1LL << ((x % 10) * 6), x /= 10;
return t;
}
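For reference, here is a minimal sketch of the question's function with both fixes applied, assuming the intent is to add 2^(6*d) for every decimal digit d of x. It uses UINT64_C(1) from <stdint.h> rather than 1LL so that the shifted constant has the same unsigned 64-bit type as the accumulator; that choice is an assumption of this sketch, not part of the answer above.

```c
#include <stdint.h>

/* Adds 1 << (6 * d) for every decimal digit d of x.
   UINT64_C(1) makes the shifted constant 64 bits wide, so the largest
   shift count (54, for the digit 9) stays within the type. */
uint64_t dtally(uint64_t x)
{
    uint64_t t = 0;
    while (x) {
        t += UINT64_C(1) << ((x % 10) * 6);
        x /= 10;
    }
    return t;
}
```

With this version, dtally(39) is 2^54 + 2^18 = 18014398509744128: the digit 3 contributes 2^18 on top of the 2^54 that comes from the digit 9.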

18014398509481984 is probably too big for an int on your platform.
Check this by testing sizeof(int). If that's 4 then the largest number you can represent is a little over 2 billion. It might even be only 2 (in which case the largest int would be 32767).
You could use int64_t instead which is 64 bits (and available since C99; although a platform is not compelled to support it).
Don't forget to suffix any literals with LL, which denotes long long, a type of at least 64 bits.

Related

Find number of bits in a data type

I need to write a macro named CountBitsM. This macro has one parameter and produces a value of type int. The parameter is any expression with an object data type or the literal name of any object data type, so I used int. The macro determines the number of bits of storage used for the data type on whatever machine it runs on, and I may use a macro from limits.h. Here is what I wrote; does this look right?
#ifndef COUNTBITSM_H
#define COUNTBITSM_H
#include <limits.h>
#define CountBitsM(int) ((int)*(CHAR_BIT))
#endif
The second question is to create a function CountIntBitsF that counts the number of bits used to represent a type int value on any machine. However, I can NOT use any #define, #include of header files, or any macro. I also cannot use any multiplication or division. The hint given was to start with a value of 1 in a variable of type unsigned int and left-shift it one bit at a time, keeping count of the number of shifts, until the variable's value becomes 0. Here is what I have so far:
int CountIntBitsF(void)
{
int IntgMax = 8;
unsigned int count = 1;
while (IntgMax = IntgMax>>2) count++;
return count;
}
First off, I am not supposed to use division or multiplication, so am I doing the shift properly? Also, I can't assume a char/byte contains 8 or any other specific number of bits, so what should I set IntgMax to? Thanks for any help. I am new to C.
Macro for Bits in a Type
A macro to produce the number of bits used to represent a type in storage is:
#define CountBitsM(x) (sizeof (x) * CHAR_BIT)
However, this produces a result with type size_t (usually). If you really need an int result as stated in the question, convert it (but be aware overflow becomes possible):
#define CountBitsM(x) ((int) (sizeof (x) * CHAR_BIT))
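As a short sketch of how this macro can be used, note that sizeof accepts either a parenthesized type name or an expression, so both forms the question asks for work. The helper names bits_in_int and bits_in_sum are made up for this illustration.

```c
#include <limits.h>

#define CountBitsM(x) ((int) (sizeof (x) * CHAR_BIT))

/* sizeof accepts a parenthesized type name ... */
int bits_in_int(void)
{
    return CountBitsM(int);
}

/* ... or an expression. The operand here is not evaluated;
   only its type matters. */
int bits_in_sum(void)
{
    char c = 1;
    long n = 2;
    return CountBitsM(c + n);   /* c + n has type long */
}
```

On a typical platform with 8-bit bytes and a 4-byte int, bits_in_int() yields 32; the result of bits_in_sum() depends on the size of long.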
Counting Bits
The second question asks to count the number of bits “to represent a type int value” by shifting bits in an unsigned value. There are two theoretical problems here. One is that the number of bits used to represent a value may include padding bits, and counting the bits by shifting a 1 through them only counts the value bits, not the padding bits. The second is that an int may have more padding bits than an unsigned; it may use fewer bits for the sign and value. Overwhelmingly, modern systems will not have these issues; the number of used bits in an int will be the same as the total number of bits used to store it and the number of bits used in an unsigned.
That said, you can count the number of bits in an unsigned object with:
int count = 0;
for (unsigned u = 1; 0 != u; u <<= 1)
++count;
This repeatedly shifts the bit in u left until it is shifted out, while counting the number of iterations required to do this. Note that the bits in an int cannot properly be counted this way, because the behavior of left shift is not defined by the C standard when it overflows an int.
Question one
#define NBITS(type_or_object) (sizeof(type_or_object) * CHAR_BIT)
or without multiplication
#define NBITS(type_or_object) (sizeof(type_or_object) << (CHAR_BIT == 8 ? 3 : CHAR_BIT == 16 ? 4 : CHAR_BIT == 32 ? 5 : 0))
Second question:
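This is for a signed type on the most common representation, two's complement (I think it also works for sign-and-magnitude, as far as I remember). Strictly speaking, left-shifting a 1 into the sign bit of an int is not defined by the C standard, so this relies on the usual wrap-around. Unsigned types are easy.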
int CountIntBits(void)
{
int IntgMax = 1;
int count = 1;
while (IntgMax > 0 )
{
count++;
IntgMax <<= 1;
}
return count;
}
int main(void)
{
printf("%d\n", CountIntBits());
}
or (also no multiplication :) )
int CountIntBits(void)
{
int shift = CHAR_BIT == 8 ? 3 : CHAR_BIT == 16 ? 4 : CHAR_BIT == 32 ? 5 : 0;
return sizeof(int) << shift;
}
for unsigned types:
int CountIntBits(void)
{
unsigned IntgMax = 1;
int count = 0;
while (IntgMax)
{
count++;
IntgMax <<= 1;
}
return count;
}

Convert signed int of variable bit size

I have a number of bits (the number of bits can change) in an unsigned int (uint32_t). For example (12 bits in the example):
uint32_t a = 0xF9C;
The bits represent a signed int of that length.
In this case the number in decimal should be -100.
I want to store the value in a signed variable and get its actual value.
If I just use:
int32_t b = (int32_t)a;
it will be just the value 3996, since it gets casted to (0x00000F9C) but it actually needs to be (0xFFFFFF9C)
I know one way to do it:
union test
{
signed temp :12;
};
union test x;
x.temp = a;
int32_t result = (int32_t) x.temp;
Now I get the correct value, -100.
But is there a better way to do it?
My solution is not very flexible; as I mentioned, the number of bits can vary (anything between 1 and 64 bits).
But is there a better way to do it?
Well, depends on what you mean by "better". The example below shows a more flexible way of doing it as the size of the bit field isn't fixed. If your use case requires different bit sizes, you could consider it a "better" way.
unsigned sign_extend(unsigned x, unsigned num_bits)
{
unsigned f = ~((1 << (num_bits-1)) - 1);
if (x & f) x = x | f;
return x;
}
int main(void)
{
int x = sign_extend(0xf9c, 12);
printf("%d\n", x);
int y = sign_extend(0x79c, 12);
printf("%d\n", y);
}
Output:
-100
1948
A branch free way to sign extend a bitfield (Henry S. Warren Jr., CACM v20 n6 June 1977) is this:
// value i of bit-length len is a bitfield to sign extend
// i is right aligned and zero-filled to the left
sext = 1 << (len - 1);
i = (i ^ sext) - sext;
UPDATE based on @Lundin's comment
Here's tested code (prints -100):
#include <stdio.h>
#include <stdint.h>
int32_t sign_extend (uint32_t x, int32_t len)
{
int32_t i = (x & ((1u << len) - 1)); // or just x if you know there are no extraneous bits
int32_t sext = 1 << (len - 1);
return (i ^ sext) - sext;
}
int main(void)
{
printf("%d\n", sign_extend(0xF9C, 12));
return 0;
}
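Since the question mentions widths anywhere from 1 to 64 bits, the same xor-and-subtract step can also be carried out in 64-bit unsigned arithmetic. The following is only a sketch: the name sign_extend64 is made up, and the final conversion to int64_t is formally implementation-defined for values above INT64_MAX, although it wraps as expected on mainstream two's complement compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extends the low `len` bits of x (len must be in 1..64).
   All arithmetic is done on unsigned values, so nothing overflows. */
int64_t sign_extend64(uint64_t x, unsigned len)
{
    assert(len >= 1 && len <= 64);

    /* Drop any bits above the field; skipped for len == 64 because
       shifting a 64-bit value by 64 would be undefined. */
    if (len < 64)
        x &= (UINT64_C(1) << len) - 1;

    uint64_t sign = UINT64_C(1) << (len - 1);   /* the field's sign bit */
    uint64_t r = (x ^ sign) - sign;             /* wraps modulo 2^64 */

    /* Assumption: out-of-range conversion wraps (two's complement). */
    return (int64_t)r;
}
```

Under that assumption, sign_extend64(0xF9C, 12) gives -100 and sign_extend64(0x79C, 12) gives 1948, matching the outputs shown earlier in this thread.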
This relies on the implementation-defined behavior of sign extension when right-shifting negative signed integers. First you shift your unsigned integer all the way left until its sign bit becomes the MSB, then you cast it to a signed integer and shift back:
#include <stdio.h>
#include <stdint.h>
#define NUMBER_OF_BITS 12
int main(void) {
uint32_t x = 0xF9C;
int32_t y = (int32_t)(x << (32-NUMBER_OF_BITS)) >> (32-NUMBER_OF_BITS);
printf("%d\n", y);
return 0;
}
This is a solution to your problem:
int32_t sign_extend(uint32_t x, uint32_t bit_size)
{
// The expression (0xffffffff << bit_size) will fill the upper bits to sign extend the number.
// The expression (-(x >> (bit_size-1))) is a mask that will zero the previous expression in case the number was positive (to avoid having an if statement).
return (0xffffffff << bit_size) & (-(x >> (bit_size-1))) | x;
}
int main()
{
printf("%d\n", sign_extend(0xf9c, 12)); // -100
printf("%d\n", sign_extend(0x7ff, 12)); // 2047
return 0;
}
The sane, portable and effective way to do this is simply to mask out the data part, then fill up everything else with 0xFF... to get a proper 2's complement representation. All you need to know is how many bits make up the data part.
We can mask out the data with (1u << data_length) - 1.
In this case, with data_length = 8, the data mask becomes 0xFF. Let's call this data_mask.
Thus the data part of the number is a & data_mask.
The rest of the number, that is, everything not part of the data mask, needs to be filled with ones. Simply use ~data_mask to achieve that. (This fill applies to a negative value, one whose sign bit is set; a positive value keeps zeroes in its upper bits.)
C code: a = (a & data_mask) | ~data_mask. Now a is a proper 32-bit 2's complement number.
Example:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
const uint32_t data_length = 8;
const uint32_t data_mask = (1u << data_length) - 1;
uint32_t a = 0xF9C;
a = (a & data_mask) | ~data_mask;
printf("%"PRIX32 "\t%"PRIi32, a, (int32_t)a);
}
Output:
FFFFFF9C -100
This relies on int being 32 bits 2's complement but is otherwise fully portable.

Function for binary conversion

I am trying to convert a decimal value to binary using the function I wrote in C below. I cannot figure out the reason why it is printing 32 zeroes rather than the binary value of 2.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
int binaryConversion(int num){
int bin_buffer[32];
int mask = INT_MIN;
for(int i = 0; i < 32; i++){
if(num & mask){
bin_buffer[i] = 1;
mask >> 1;
}
else{
bin_buffer[i] = 0;
mask >> 1;
}
}
for(int j = 0; j < 32; j++){
printf("%d", bin_buffer[j]);
}
}
int main(){
binaryConversion(2);
}
Thanks
Two mistakes:
You use >> instead of >>=, so you're not actually ever changing mask.
You didn't declare mask as unsigned, so when you shift, it'll get sign-extended, which you don't want.
If you put a:
printf("%d %d\n", num, mask);
immediately inside your for loop, you'll see why:
2 -2147483648
2 -2147483648
2 -2147483648
2 -2147483648
:
2 -2147483648
The expression mask >> 1 does right shift the value of mask but doesn't actually assign it back to mask. I think you meant to use:
mask >>= 1;
On top of that (once you fix that problem), you'll see that the values in the mask are a bit strange because right-shifting a negative value can preserve the sign, meaning you will end up with multiple bits set.
You'd be better off using unsigned integers since the >> operator will act on them more in line with your expectations.
Additionally, there's little point in writing all those bits into a buffer just so you can print them out later. Unless you need to do some manipulation on the bits (and this appears to not be the case here), you can just output them directly as they're calculated (and get rid of the now unnecessary i variable).
So, taking all those points into account, you can greatly simplify your code such as with the following complete program:
#include <stdio.h>
#include <limits.h>
void binaryConversion(unsigned num) {
for (unsigned mask = (unsigned)INT_MIN; mask != 0; mask >>= 1)
putchar((num & mask) ? '1' : '0');
}
int main(void) {
binaryConversion(2);
putchar('\n');
}
And just one more note: the value of INT_MIN is not actually required to have just the top bit set. Because C currently allows ones' complement and sign-magnitude (as well as two's complement) for negative numbers, it is possible for INT_MIN to have a value with multiple bits set (such as -32767).
There are moves afoot to remove these little-used encodings from C (C++20 has already flagged this) but, for maximum portability, you could opt instead for the following function:
void binaryConversion(unsigned int num) {
// Done once to set topBit.
static unsigned topBit = 0;
if (topBit == 0) {
topBit = 1;
while (topBit << 1 != 0) topBit <<= 1;
}
// Loop to process all bits.
for (unsigned mask = topBit; mask != 0; mask >>= 1)
putchar(num & mask ? '1' : '0');
}
This calculates the value with the top bit set the first time you call the function, irrespective of the vagaries of negative encodings. Just watch out if you call it concurrently in a threaded program.
But, as mentioned, this probably isn't necessary, the number of environments that use the other two encodings would be countable on the fingers of a very careless/unlucky industrial machine operator.
You already have your primary question answered regarding the use of >> rather than >>=. However, from a fundamental standpoint there is no need to buffer the 1 and 0 in an array of int (e.g. int bin_buffer[32];) and there is no need to use the variadic printf function to display int values if all you are doing is outputting the binary representation of the number.
Instead, all you need is putchar() to output '1' or '0' depending on whether any bit is set or clear. You can also make your output function a bit more useful by providing the size of the representation you want, e.g. a byte (8-bits), a word (16-bits), and so on.
For example, you could do:
#include <stdio.h>
#include <limits.h>
/** binary representation of 'v' padded to 'sz' bits.
* the padding amount is limited to the number of
* bits in 'v'. valid range: 0 - sizeof v * CHAR_BIT.
*/
void binaryConversion (const unsigned long v, size_t sz)
{
if (!sz) { fprintf (stderr, "error: invalid sz.\n"); return; }
if (!v) { while (sz--) putchar ('0'); return; }
if (sz > sizeof v * CHAR_BIT)
sz = sizeof v * CHAR_BIT;
while (sz--)
putchar ((v >> sz & 1) ? '1' : '0');
}
int main(){
fputs ("byte : ", stdout);
binaryConversion (2, 8);
fputs ("\nword : ", stdout);
binaryConversion (2, 16);
putchar ('\n');
}
Which allows you to set the number of bits you want displayed, e.g.
Example Use/Output
$ ./bin/binaryconversion
byte : 00000010
word : 0000000000000010
There is nothing wrong with your approach, but there may be a simpler way to arrive at the same output.
Let me know if you have further questions.
INT_MIN is a negative number, so when you shift it right with >>, the most significant bit typically stays 1 instead of becoming 0, and the mask eventually ends up with every bit set (11111...111). Also, the mask value is not changing at all in your code; use >>= instead. Alternatively, you can mask with 0x1 and shift the value of num instead of the mask, like this:
void binaryConversion(int num) {
char bin_buffer[32 + 1]; //+1 for string terminator.
int shifted = num;
for (int i = 31; i >= 0; --i, shifted >>= 1) { //loop 32x
bin_buffer[i] = '0' + (shifted & 0x1);
}
bin_buffer[32] = 0; //terminate the string.
printf("%s", bin_buffer);
}
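A possible variant of the snippet above, sketched under two assumptions of mine: the value is handled as unsigned, so the right shift never depends on how negative numbers are shifted, and the width comes from sizeof and CHAR_BIT instead of a hard-coded 32 (which presumes unsigned has no padding bits). The name to_binary, the macro UNSIGNED_BITS and the caller-supplied buffer are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define UNSIGNED_BITS (sizeof(unsigned) * CHAR_BIT)

/* Writes the binary digits of num, most significant bit first, into buf.
   buf must have room for UNSIGNED_BITS + 1 chars. Returns buf. */
char *to_binary(unsigned num, char *buf)
{
    size_t n = UNSIGNED_BITS;
    buf[n] = '\0';                          /* terminate the string */
    while (n-- > 0) {                       /* fill from the last digit back */
        buf[n] = (char)('0' + (num & 1u));
        num >>= 1;                          /* unsigned shift: zero-filled */
    }
    return buf;
}
```

With a 32-bit unsigned, to_binary(2, buf) fills buf with 30 zeros followed by 10. Returning the string rather than printing it leaves the choice of output function to the caller.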

REPEAT_BYTE(x) macro

I was looking the code in kernel.h header file in /usr/src/linux-headers-3.11-.../include/linux/, I stumbled upon this macro (line 47) :
#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x))
After running this example I made:
#include <stdio.h>
#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x))
int main(void)
{
long z = 12;
fprintf(stderr, "\tz = %ldd (0x%lX)\n"
"\tREPEAT_BYTE(%ldd) = %ldd (0x%lX)\n",
z, z, z, REPEAT_BYTE(z), REPEAT_BYTE(z));
return 0;
}
I've figured out what it does: it takes an integer between 0 and 255 inclusive (that is, any value that fits in one byte) and repeats that byte. This is obvious (apart from the macro's name) when looking at the output:
z = 12d (0xC)
REPEAT_BYTE(12d) = 868082074056920076d (0xC0C0C0C0C0C0C0C)
However, I still can't understand how this expression works: ((~0ul / 0xff) * (x)). I could use some help with it.
Thanks a lot in advance!
On a 64-bit machine, ~0ul is 0xffffffffffffffff. Divide that by 0xff and you get 0x0101010101010101. Multiply by an 8-bit value and you get that 8-bit value repeated 8 times.
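The arithmetic can be checked step by step. The sketch below uses uint64_t and the made-up names REPEAT_BYTE64 and repeat_byte64 so that the width is exactly 64 bits even where unsigned long is 32 bits; the kernel macro itself uses ~0ul.

```c
#include <stdint.h>

/* All-ones divided by 0xff is 0x0101010101010101, because
   0xff * 0x0101010101010101 == 0xffffffffffffffff: each 0x01 byte
   times 0xff yields one 0xff byte, with no carries between bytes. */
#define REPEAT_BYTE64(x) ((UINT64_MAX / 0xff) * (uint64_t)(x))

/* Multiplying 0x0101...01 by a one-byte value copies that value into
   every byte, again without carries, since each product fits in a byte. */
uint64_t repeat_byte64(uint8_t b)
{
    return REPEAT_BYTE64(b);
}
```

For example, repeat_byte64(12) is 0x0C0C0C0C0C0C0C0C, which is the value 868082074056920076 printed in the question.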

Defining smallest possible sized macro in C

I want to define a boolean macro in C that uses less than 4 bytes. I have looked into this, and maybe it is possible to define an asm macro, with gcc, that could be less. It is important that the definition will be small because I will have tens of thousands of matrices which hold these boolean values, and it is important that they can be as memory efficient as possible. Ideally, I want to define a 4-bit, or 8-bit macro that represents true and false, and will evaluate as such in an if-statement.
Edit:
When I define a macro
#define True 0
#define False !True
and then print the size, it returns a size of 4 bytes, which is very inefficient.
Edit2:
I just read up on bit packing, and the fewer bits I can use for a boolean, the better. I'm just not too sure how to bit-pack a struct that has the size of a few bits.
Edit3:
#include <stdio.h>
#include <string.h>
#define false (unsigned char(0))
#define true (!false)
int main() {
if (true) {
printf("The size of true is %d\n", sizeof(true));
}
}
gives the following output
test.c: In function ‘main’:
test.c:8:9: error: expected ‘)’ before numeric constant
test.c:9:51: error: expected ‘)’ before numeric constant
Try this instead for your macros:
#define false ((unsigned char) 0)
#define true (!false)
This won't fix your space needs though. For more efficient storage, you need to use bits:
void SetBoolValue(int bitOffset, unsigned char *array, bool value)
{
int index = bitOffset >> 3;
int mask = 1 << (bitOffset & 0x07);
if (value)
array[index] |= mask;
else
array[index] &= ~mask;
}
bool GetBoolValue(int bitOffset, unsigned char *array)
{
int index = bitOffset >> 3;
int mask = 1 << (bitOffset & 0x07);
return array[index] & mask;
}
Where each value of "array" can hold 8 bools. On modern systems, it can be faster to use a U32 or U64 as the array, but it can take up more space for smaller amounts of data.
To pack larger amounts of data:
void SetMultipleBoolValues(int bitOffset, unsigned char *array, int value, int numBitsInValue)
{
for(int i=0; i<numBitsInValue; i++)
{
SetBoolValue(bitOffset + i, array, (value & (1 << i)));
}
}
And here would be a driver:
int main(void)
{
static unsigned char array[32]; // Static so it starts 0'd.
int value = 1234; // An 11-bit value to pack
for(int i=0; i<4; i++)
SetMultipleBoolValues(i * 11, array, value, 11); // 11 = 11-bits of data - do it 4 times
for(int i=0; i<32; i++)
printf("%c", array[i]);
return 0;
}
If you are using this in a structure, then you will want to use a bit field.
struct {
unsigned flag : 1;
/* other fields */
};
If you are wanting an array of boolean values, you should implement a bit vector (I was about to implement one, but Michael Dorgan's already done it).
First of all, there's no storage associated with your macros; they expand to the integer constants 0 and 1. The sizeof evaluates to 4 because the expressions have integer type. You can certainly assign those values to objects of smaller type (short or char).
For me, life got a lot simpler when I stopped using TRUE and FALSE macros [1]. Remember that in C, a zero-valued integral expression evaluates to false, and all non-zero-valued integral expressions evaluate to true.
If you want to store values into something smaller than 8 bits, then you're going to have to do your own bit packing, something like
#define TEST(x,bit) ((x) & (1 << (bit)))
#define SET(x,bit) ((x) |= (1 << (bit)))
#define CLEAR(x,bit) ((x) &= ~(1 << (bit)))
The smallest useful type for this is unsigned char. So if you need to store N single-bit values, you need an array of N/CHAR_BIT+1 elements (this allocates one spare element when N is an exact multiple of CHAR_BIT). For example, to store 10 single-bit boolean values, you need 2 eight-bit array elements. Bits 0 through 7 will be stored in element 0, and bits 8 and 9 will be stored in element 1.
So, something like
#define MAX_BITS 24
unsigned char bits[MAX_BITS / CHAR_BIT + 1];
int bit = ...;
SET(bits[bit/CHAR_BIT], bit % CHAR_BIT);
if ( TEST(bits[bit/CHAR_BIT], bit % CHAR_BIT) )
{
// do something if bit is set
}
CLEAR(bits[bit/CHAR_BIT], bit % CHAR_BIT);
No warranties express or implied; I don't do a lot of bit twiddling. But hopefully this at least points you in the right direction.
1. The precipitating event was someone dropping a header where TRUE == FALSE. Not the most productive afternoon.
You should probably just use an unsigned char, it will be the smallest individually addressable type:
typedef unsigned char smallBool;
smallBool boolMatrix[M][N];
The above will use M * N bytes for the matrix.
Of course, wasting CHAR_BIT - 1 bits to store a single bit is ... wasteful. Consider bit-packing the boolean values.
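One way such bit-packing could look for a matrix, as a rough sketch: the type BitMatrix and the functions bm_create, bm_set, bm_get and bm_free are hypothetical names, and the rows are simply laid out end to end in one byte array with CHAR_BIT booleans per byte.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    size_t rows, cols;
    unsigned char *bits;   /* rows * cols bits, rounded up to whole bytes */
} BitMatrix;

/* Allocates a zeroed matrix; returns false on allocation failure. */
bool bm_create(BitMatrix *m, size_t rows, size_t cols)
{
    size_t nbits = rows * cols;   /* no overflow check in this sketch */
    size_t nbytes = (nbits + CHAR_BIT - 1) / CHAR_BIT;
    m->rows = rows;
    m->cols = cols;
    m->bits = calloc(nbytes ? nbytes : 1, 1);
    return m->bits != NULL;
}

/* Sets or clears the bit for element (r, c). */
void bm_set(BitMatrix *m, size_t r, size_t c, bool value)
{
    size_t i = r * m->cols + c;   /* bit index in row-major order */
    unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));
    if (value)
        m->bits[i / CHAR_BIT] |= mask;
    else
        m->bits[i / CHAR_BIT] &= (unsigned char)~mask;
}

/* Reads the bit for element (r, c). */
bool bm_get(const BitMatrix *m, size_t r, size_t c)
{
    size_t i = r * m->cols + c;
    return (m->bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}

void bm_free(BitMatrix *m)
{
    free(m->bits);
    m->bits = NULL;
}
```

With this layout a 1000 x 1000 matrix takes 125000 bytes when CHAR_BIT is 8, instead of 1000000 bytes with one unsigned char per element. The price is a shift and a mask on every access.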

Resources