constructing key by bit shifting 3 integers in C - c

I want to construct a key composed of 3 values by using bit shifting operations:
According to my understanding, the C statement code I am starting from creates a hash table by constructing its keys from certain data variables:
uint64_t key = (uint64_t)c->pos<<32 | c->isize;
My interpretation is that key is a combination of the last 32 digits
of c->pos, which must be a 64 bit unsigned integer, and c->isize, also a 64bit unsigned integer.
But I am not sure if that is the case, and maybe the | pipe operator
has a different meaning when applied to bit shifting operations.
What I want to do next is to modify the way key is constructed and
include a third c->barc element into the variable. Given the number
of possibilities of c->barc and c->isize, I was thinking that instead
of building key with 32+32 bits (pos+isize), I would build it
with 32+16+16 bits (pos+isize+barc) splitting the last 32 bits between
isize and barc.
Any ideas how to do that?

What I think you need is a solid explanation of bitmasking.
For this particular case, you should use the & operator to mask out the upper 16 bits of c->isize before shifting it up, and then use the & operator again to mask the upper 48 bits of c->barc.
Let's look at some diagrams.
let
c->pos = xxxx_xxxx_....._xxxx
c->isize = yyyy_yyyy_....._yyyy
c->barc = zzzz_zzzz_....._zzzz
where
x, y, and z are bits.
note: underscores are to identify groups of 4 bits.
If I understand correctly, you want a 64-bit number like this:
xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_yyyy_yyyy_yyyy_yyyy_zzzz_zzzz_zzzz_zzzz
right?
As you already know, we get the upper 32 x's by doing
|-----32 bits of pos----|---32 0 bits--|
(uint64_t)c->pos<<32 = xxxx_xxxx_...._xxxx_xxxx_0000_...._0000
Now, we want to bitwise-or that with the following:
|----------32 0 bits----|
0000_0000_...._0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000
To get that number there, we do this:
((c->isize & 0xffff) << 16)
because:
c->isize & 0xffff gives
yyyy_yyyy_yyyy_yyyy_yyyy_yyyy_yyyy_yyyy
& 0000_0000_0000_0000_1111_1111_1111_1111
---------------------------------------------
0000_0000_0000_0000_yyyy_yyyy_yyyy_yyyy
and then we shift it left by 16 to get
|--------32 0 bits------|
0000_0000_...._0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000
Now, the final part, the
|-------48 0 bits-------|
0000_0000_...._0000_0000_zzzz_zzzz_zzzz_zzz
is the result plain and simply of
(c->barc & 0xffff) =
zzzz_zzzz_zzzz_zzzz_zzzz_zzzz_zzzz_zzzz
& 0000_0000_0000_0000_1111_1111_1111_1111
-------------------------------------------------
0000_0000_0000_0000_zzzz_zzzz_zzzz_zzzz
So we take all of these expressions and bitwise-or them together.
uint64_t key = ((uint64_t)c->pos << 32) | ((c->isize & 0xffff) << 16)
| (c->barc & 0xffff);
if we diagram it out, we see
xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_0000_0000_0000_0000_0000_0000_0000_0000
0000_0000_0000_0000_0000_0000_0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000
or 0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_zzzz_zzzz_zzzz_zzzz
-----------------------------------------------------------------------------------
xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_yyyy_yyyy_yyyy_yyyy_zzzz_zzzz_zzzz_zzzz

The "pipe operator" is actually a bitwise OR operator. The code takes two (presumably) 32-bit integers, one of them shifts left by 32 bits and combines them together. Thus you get a single 64-bit number. See Wiki for more info about bitwise operations.
If you want to compose your key from three 32-bit integers, then you obviously have to manipulate them to fit them into 64 bits. You can do something like this:
uint64_t key = (uint64_t)c->pos<<32 | (c->isize & 0xFFFF0000) | (c->barc & 0xFFFF);
This code takes 32 bits from c->pos, shifts them in the higher 32 bits of the 64-bit key, then takes the higher 16 bits of c->isize and finally the lower 16 bits of c->barc. See here for more.

I wouldn't do it. It is not safe if you are not designing whole thing by yourself. But let's explain some things.
My interpretation is that key is a combination of the last 32 digits of c->pos,
Generally, yes.
which must be a 64 bit unsigned integer, and c->isize, also a 64bit unsigned integer.
No. You know nothing about size of type of pos andisize, it is cast onto uint64_t it might be any type that allows such a cast.
My bet is that both values are 32-bit. 1st value is being cast onto 64bit type, because bit shift equal to or greater than the width of the type is undefined behaviour. So to stay safe it is widened.
The code probably packs two 32bit values into a 64bit one, otherwise it would loose information.
Moreover, if it wanted to construct key from values which would overlap it would most probably use xor rather than or. Your way is not a good approach, unless you precisely know what are you doing. You should find out what types your operands are and then choose a method for creation keys out of them.

Related

left shifting unsigned int to 32 bits, generates warning [duplicate]

I recently picked up a copy of Applied Cryptography by Bruce Schneier and it's been a good read. I now understand how several algorithms outlined in the book work, and I'd like to start implementing a few of them in C.
One thing that many of the algorithms have in common is dividing an x-bit key, into several smaller y-bit keys. For example, Blowfish's key, X, is 64-bits, but you are required to break it up into two 32-bit halves; Xl and Xr.
This is where I'm getting stuck. I'm fairly decent with C, but I'm not the strongest when it comes to bitwise operators and the like.
After some help on IRC, I managed to come up with these two macros:
#define splitup(a, b, c) {b = a >> 32; c = a & 0xffffffff; }
#define combine(a, b, c) {a = (c << 32) | a;}
Where a is 64 bits and b and c are 32 bits. However, the compiler warns me about the fact that I'm shifting a 32 bit variable by 32 bits.
My questions are these:
What's bad about shifting a 32-bit variable 32 bits? I'm guessing it's undefined, but these macros do seem to be working.
Also, would you suggest I go about this another way?
As I said, I'm fairly familiar with C, but bitwise operators and the like still give me a headache.
EDIT
I figured out that my combine macro wasn't actually combining two 32-bit variables, but simply ORing 0 by a, and getting a as a result.
So, on top of my previous questions, I still don't have a method of combining the two 32-bit variables to get a 64-bit one; a suggestion on how to do it would be appreciated.
Yes, it is undefined behaviour.
ISO/IEC 9899:1999 6.5.7 Bitwise shift operators ¶3
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
C11 aka ISO/IEC 9899:2011 says the same.
You should first cast b to the target integer type. Another point is that you should put parentheses around the macro parameters to avoid surprises by operator precedences. Additionally, the comma operator is very useful here, allowing you to avoid the braces, so that the macro can be used as a normal command, closed with a semicolon.
#define splitup(a,b,c) ( (b) = (a) >> 32, (c) = (a) & 0xffffffff )
#define combine(a,b,c) ( (a) = ((unsigned long long)(b) << 32) | (c) )
Additional casts may be necessary for `splitup to silence warnings about precision loss by over-paranoid compilers.
#define splitup(a,b,c) ( (b) = (unsigned long)((a) >> 32), (c) = (unsigned long)((a) & 0xffffffff) )
And please don't even think about using your self-written encryption for production code.
Shifting a 32-bit value by 32 bits or more is undefined in C and C++. One of the reasons it was left undefined is that on some hardware platforms the 32-bit shift instruction only takes into account 5 lowest bits of the supplied shift count. This means that whatever shift count you pass, it will be interpreted modulo 32. Attempting to shift by 32 on such platform will actually shift by 0, i.e. not shift at all.
The language authors did not want to burden the compilers written for such platform with the task of analyzing the shift count before doing the shift. Instead, the language specification says that the behavior is undefined. This means that if you want to get a 0 value from a 32-bit shift by 32 (or more), it is up to you to recognize the situation and process it accordingly.
what's bad about shifting a 32-bit variable 32 bits?
Its better to assign 0 to n-bit integer than to shift it by n-bits.
Example:
0 0 1 0 1 ----- 5 bit Integer
0 1 0 1 0 ----- 1st shift
1 0 1 0 0 ----- 2nd shift
0 1 0 0 0 ----- 3rd shift
1 0 0 0 0 ----- 4th shift
0 0 0 0 0 ----- 5th shift (all the bits are shifted!)
I still don't have a method of combining the two 32-bit variables to get a 64-bit one
Consider: a is 64 bit, b and c are 32 bit
a = b;
a = a << 32; //Note: a is 64 bit
a = a | c;
Unless this is some "reinventing the wheel to understand how it works" project, don't implement your own crypto functions.
Ever.
It's hard enough to use the available algorithms to work (and to choose the right one), don't shoot yourself in the foot by putting in production some home grown cryptor API. Chances are your encryption won't encrypt
What's bad about shifting a 32-bit variable 32 bits?
In addition to what have been already said, the 32nd bit is the sign bit, and you may get sign extension to preserve the sing, thereby losing significant bits.

Bitmask for obtaining the value in between X and Y bits

I have a char pointer which points to 16 bytes of an array (Therefore 128 bits).
These bits contain some valuable information for my task and I need to parse the values from their fixed locations. For example, there is a time information in between 23rd bit and 12nd bit. I know that I need to have bitmask to retrieve such data but I couldn't manage it.
Could anyone tell me how I can get this information?
Finally, I need to convert those retrieved bits to integer but this is already the easy part of the task.
A general approach would be to select the byte inside the square brackets, then mask off the bits you care about using the single &, and then right shift the byte by the number of bits it is from the lsb, then you have the integer value.
e.g.
int i = (int)((a[1] & 0x6) >> 1);
This will get the value of 2nd and 3rd least significant bits of the second word and put the value in an integer.
More specifically for your question:
23rd bit and 12nd bit
int i = (int)((a[1] & 0xF0) >> 4) || (((int)(a[2])) << 8);
Another approach in C/C++ would to define a packed struct with bitfields.

Control over array with bitmap

I've now created an array with 64 spots for 8-byte blocks.
How do I implement a bitmap that checks if these spots are used?
I created the array with 64spots with this uint64_t array[64]
Since this is (untagged) homework, I won't give code, but:
First: what's a bitmap? It's using the values of individual bits in a variable individually, rather than interpreting the value as a whole. So, if you need a bitmap to indicate which of 64 blocks are in use, you would need 64 bits - a 64-bit type would do well for this. If you need a bitmap for each of the individual bytes in a "block", the same idea holds, but you can use a smaller size map (one byte bitmap) and more of them, obviously.
Then, you need to access each bit individually - there are bitwise operators that make this easy. Bit 0 would indicate the state of block 0 (unset, or 0 = unused, 1, or set = used), bit 63 would indicate the state of block 63. A bitwise AND between a value that has the bit of the block you'd like to check, and the bitmap, will return whether that block is being used. It can be easy to visualize bits in a data type by opening up calculator programs and setting them to "programmer" mode.
Setting a bit is easy - you can bitwise OR the set bit. Unsetting is a bit trickier if you're not familiar with it, but easy once you've grasped the concept of bitwise operations.
Assuming you are numbering the bits 0 .. 4095.
Then 6 bits represent the index into the bitmap bytes and 6 bits represent the bit within the array element.
if you have
unsigned int bit ;
then
unsigned int index = (bit >> 6) & 63 ;
uint64_t mask = 1 << (bit & 63) ;
if (array [index] & mask)
// bit is set

Get byte - how is this wrong?

I want to get the designated byte from a 32 bit integer. I am getting wrong values but I don't know why.
The restrictions to this problem are:
Must use signed bits, and I can't use multiplication.
I specifically need to know what is wrong with the function as it's below.
Here is the function:
int retrieveByteFromWord(int word, int byte)
{
return (word >> (byte << 3)) & 0xFF;
}
ex:
(3) (2) (1) (0) ------ byte number
In word: 10010011 11001100 00110011 10101000
I want to return byte 2 (1100 1100).
retrieveByteFromWord(word, 2) ---- gives: 1100 1100
But for some cases it's wrong and it won't tell me what case.
Any ideas?
Here is the problem:
You just started working for a company that is implementing a set of procedures to operate on a data structure where 4 signed bytes are packed into a 32 bit unsigned. Bytes within the word are numbered from 0(LSB) to 3(MSB). You have been assigned the task of implementing a function for a machine using 2's complement arithmetic and arithmetic right shifts with the following prototype:
typedef unsigned packed_t
int xbyte(packed_t word, int bytenum);
This is the previous employees attempt which got him fired for being wrong:
int xbyte(packed_t word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}
A) What is wrong with the code?
B) Write a correct implementation using only left and right shifts and one subtraction.
I have done B but still don't know why A is wrong. Is it because the decimal numbers going in like 12, 15, 19, 55 and then getting packed into a word and then when I extract them they aren't the same number anymore??? It might be so I am going to run some tests real fast...
As this is homework I won't give you a full answer, but I'll point you in the right direction. Your problem statement says that:
4 signed bytes are packed into a 32 bit unsigned.
When you bitwise & a 32 bit signed integer with 0xFF the most significant bit - i.e. the sign bit - of the result is always 0, so the original function never returns a negative value regardless of the input.
By way of example...
When you say "retrieveByteFromWord(word, 2) ---- gives: 11001100" you're wrong.
Your return type is a 32 bit integer - not an 8 bit integer. You're not returning 11001100 you're returning 00000000 00000000 00000000 11001100.
To work with numbers, use signed integer types such as int.
To work with bits, use unsigned integer types such as unsigned. I.e. let the word argument be of type unsigned. That is what the unsigned types are for.
To multiply by 8, write just *8 (this does not mean that that part of the code is technically wrong, just that it is artificially contrived and needlessly unreadable).
Even better, create a self-describing name for that magic number 8, e.g. *bitsPerByte (the standard library calls it CHAR_BIT, which is not particularly self-describing nor readable).
Finally, at the design level, think about designing your functions so that the code that uses a function of yours – each call – becomes clear and readable. E.g. like int const b = byteAt( 2, x );. That can prevent bugs by e.g. preventing wrong actual argument order, and since designing for readability makes the code easier to read, it reduces time spent on that. :-)
Cheers & hth.,
Works fine for positive numbers. You may want to cast word to unsigned to make it work for integers with the MSB set.
int retrieveByteFromWord(int word, int byte)
{
return ((unsigned)word >> (byte << 3)) & 0xFF;
}

What's bad about shifting a 32-bit variable 32 bits?

I recently picked up a copy of Applied Cryptography by Bruce Schneier and it's been a good read. I now understand how several algorithms outlined in the book work, and I'd like to start implementing a few of them in C.
One thing that many of the algorithms have in common is dividing an x-bit key, into several smaller y-bit keys. For example, Blowfish's key, X, is 64-bits, but you are required to break it up into two 32-bit halves; Xl and Xr.
This is where I'm getting stuck. I'm fairly decent with C, but I'm not the strongest when it comes to bitwise operators and the like.
After some help on IRC, I managed to come up with these two macros:
#define splitup(a, b, c) {b = a >> 32; c = a & 0xffffffff; }
#define combine(a, b, c) {a = (c << 32) | a;}
Where a is 64 bits and b and c are 32 bits. However, the compiler warns me about the fact that I'm shifting a 32 bit variable by 32 bits.
My questions are these:
What's bad about shifting a 32-bit variable 32 bits? I'm guessing it's undefined, but these macros do seem to be working.
Also, would you suggest I go about this another way?
As I said, I'm fairly familiar with C, but bitwise operators and the like still give me a headache.
EDIT
I figured out that my combine macro wasn't actually combining two 32-bit variables, but simply ORing 0 by a, and getting a as a result.
So, on top of my previous questions, I still don't have a method of combining the two 32-bit variables to get a 64-bit one; a suggestion on how to do it would be appreciated.
Yes, it is undefined behaviour.
ISO/IEC 9899:1999 6.5.7 Bitwise shift operators ¶3
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
C11 aka ISO/IEC 9899:2011 says the same.
You should first cast b to the target integer type. Another point is that you should put parentheses around the macro parameters to avoid surprises by operator precedences. Additionally, the comma operator is very useful here, allowing you to avoid the braces, so that the macro can be used as a normal command, closed with a semicolon.
#define splitup(a,b,c) ( (b) = (a) >> 32, (c) = (a) & 0xffffffff )
#define combine(a,b,c) ( (a) = ((unsigned long long)(b) << 32) | (c) )
Additional casts may be necessary for `splitup to silence warnings about precision loss by over-paranoid compilers.
#define splitup(a,b,c) ( (b) = (unsigned long)((a) >> 32), (c) = (unsigned long)((a) & 0xffffffff) )
And please don't even think about using your self-written encryption for production code.
Shifting a 32-bit value by 32 bits or more is undefined in C and C++. One of the reasons it was left undefined is that on some hardware platforms the 32-bit shift instruction only takes into account 5 lowest bits of the supplied shift count. This means that whatever shift count you pass, it will be interpreted modulo 32. Attempting to shift by 32 on such platform will actually shift by 0, i.e. not shift at all.
The language authors did not want to burden the compilers written for such platform with the task of analyzing the shift count before doing the shift. Instead, the language specification says that the behavior is undefined. This means that if you want to get a 0 value from a 32-bit shift by 32 (or more), it is up to you to recognize the situation and process it accordingly.
what's bad about shifting a 32-bit variable 32 bits?
Its better to assign 0 to n-bit integer than to shift it by n-bits.
Example:
0 0 1 0 1 ----- 5 bit Integer
0 1 0 1 0 ----- 1st shift
1 0 1 0 0 ----- 2nd shift
0 1 0 0 0 ----- 3rd shift
1 0 0 0 0 ----- 4th shift
0 0 0 0 0 ----- 5th shift (all the bits are shifted!)
I still don't have a method of combining the two 32-bit variables to get a 64-bit one
Consider: a is 64 bit, b and c are 32 bit
a = b;
a = a << 32; //Note: a is 64 bit
a = a | c;
Unless this is some "reinventing the wheel to understand how it works" project, don't implement your own crypto functions.
Ever.
It's hard enough to use the available algorithms to work (and to choose the right one), don't shoot yourself in the foot by putting in production some home grown cryptor API. Chances are your encryption won't encrypt
What's bad about shifting a 32-bit variable 32 bits?
In addition to what have been already said, the 32nd bit is the sign bit, and you may get sign extension to preserve the sing, thereby losing significant bits.

Resources