I want to implement bitwise cyclic shift of a 64 bit integer.
ROT(a,b) will move bit at position i to position i+b. (a is the 64 bit integer)
However, my AVR processor is an 8-bit processor. Thus, to express a, I have to use
uint8_t x[8].
x[0] is the 8 most significant bits of a.
x[7] is the 8 least significant bits of a.
Can anyone help me implement ROT(a,b) in terms of the array x?
Thank you
It makes no functional difference whether the underlying processor is 64-bit, 8-bit or 1-bit. If the compiler is compliant, you are good to go. Use uint64_t. Code does not "have to use uint8_t" just because the processor is an 8-bit one.
uint64_t ROT(uint64_t a, unsigned b) {
return (a << (b & 63)) | (a >> ((64 - b) & 63));
}
Extra () added for explicitness.
& 63 (or % 64 if you like that style) added to ensure only the 6 LSBits of b contribute to the shift. Any higher bits simply imply multiple "revolutions" of a circular shift.
((64 - b) & 63) could be simplified to (-b & 63).
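As a minimal sketch of the simplified form, assuming a compiler that provides uint64_t (the helper names rotl64 and rotr64 are illustrative and not from the original post):

```c
#include <stdint.h>

/* Rotate left. Both shift counts are masked to 0..63, so b == 0 and
   b >= 64 never produce an undefined shift by 64 or more. */
uint64_t rotl64(uint64_t a, unsigned b) {
    return (a << (b & 63)) | (a >> (-b & 63));
}

/* Rotate right: the same idea with the shift directions swapped. */
uint64_t rotr64(uint64_t a, unsigned b) {
    return (a >> (b & 63)) | (a << (-b & 63));
}
```

Because b is unsigned, -b is well defined, and -b & 63 equals (64 - b) & 63 for every b.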
--
But if OP still wants to "implement ROT(a,b) in terms of array uint8_t x[8]":
#include <stdint.h>
// circular left shift. MSByte in a[0].
void ROT(uint8_t *a, unsigned b) {
uint8_t dest[8];
b &= 63;
// byte shift
unsigned byte_shift = b / 8;
for (unsigned i = 0; i < 8; i++) {
dest[i] = a[(i + byte_shift) & 7];
}
b &= 7; // b %= 8; form bit shift;
unsigned acc = dest[0] << b;
for (unsigned i = 8; i-- > 0;) {
acc >>= 8;
acc |= (unsigned) dest[i] << b;
a[i] = (uint8_t) acc;
}
}
@vlad_tepesch suggested a solution that emphasizes the 8-bit nature of the AVR. This is an untested attempt.
void ROT(uint8_t *a, uint8_t b) {
uint8_t dest[8];
b &= 63; // Could be eliminated as following code only uses the 6 LSBits.
// byte shift
uint8_t byte_shift = b / 8u;
for (uint8_t i = 0; i < 8u; i++) {
dest[i] = a[(i + byte_shift) & 7u];
}
b &= 7u; // b %= 8u; form bit shift;
uint16_t acc = dest[0] << b;
for (unsigned i = 8u; i-- > 0;) {
acc >>= 8u;
acc |= (uint8_t) dest[i] << b;
a[i] = (uint8_t) acc;
}
}
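Why not leave the work to the compiler and just implement a function:
uint64_t rotL(uint64_t v, uint8_t r){
r &= 63; // keep the shift count in 0..63
return (v << r) | (v >> ((64 - r) & 63)); // second mask avoids an undefined shift by 64 when r == 0
}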
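I take it the x(i) are 8 bits each.
To rotate by n positions,
number the 64 bits linearly, so that bit j of element x(i) has linear position p = 8*i + j,
where i is the array index, x(0) -> x(7), and j is the bit position within the element.
Then this bit will end up at linear position q = (p + n) & 63, that is, in
Y( q/8, q & 7 )
(i and j must be counted in the same direction; which end counts as position 0 decides whether this is a left or a right rotation.)
This will handle rotations up to 63;
for any number > 63, you just take it mod 64.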
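A bit-by-bit sketch of that index mapping, assuming the asker's layout (x[0] holds the most significant byte) and counting linear positions from the least significant bit so that adding n rotates left. The name rot_bitwise is hypothetical, and the loop favors clarity over speed:

```c
#include <stdint.h>
#include <string.h>

/* Rotate the 64-bit value stored in x[0..7] (x[0] = most significant byte)
   left by n bits, moving one bit at a time. */
void rot_bitwise(uint8_t x[8], unsigned n) {
    uint8_t y[8] = {0};
    n &= 63;                                  /* rotations repeat every 64 */
    for (unsigned p = 0; p < 64; p++) {
        unsigned src_byte = 7 - p / 8;        /* linear position 0 lives in x[7] */
        unsigned src_bit  = p & 7;
        if ((x[src_byte] >> src_bit) & 1u) {
            unsigned q = (p + n) & 63;        /* destination linear position */
            y[7 - q / 8] |= (uint8_t)(1u << (q & 7));
        }
    }
    memcpy(x, y, 8);
}
```

A routine like this is mainly useful as a slow reference for checking a faster byte-and-carry implementation.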
I'm lost on bit shifting operations. I'm trying to reverse the byte order of 32-bit ints. From what I've managed to look up online I only got this far, but I can't seem to find why it's not working:
int32_t swapped = 0; // Assign num to the tmp
for(int i = 0; i < 32; i++)
{
swapped |= num & 1; // putting the set bits of num
swapped >>= 1; //shift the swapped Right side
num <<= 1; //shift the swapped left side
}
And I'm printing like this
num = swapped;
for (size_t i = 0; i < 32; i++)
{
printf("%d",(num >> i));
}
Your code looks like it's attempting to swap bits, not bytes. If you want to swap bytes, then the 'complete' method would be:
int32_t swapped = ((num >> 24) & 0x000000FF) |
((num >> 8) & 0x0000FF00) |
((num << 8) & 0x00FF0000) |
((num << 24) & 0xFF000000);
I say 'complete', because the last bitwise-and can be omitted, and the first bitwise-and can be omitted if num is unsigned.
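For the unsigned case, a minimal sketch with both optional masks dropped (bswap32 is an illustrative name, and num is assumed to be a uint32_t):

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit unsigned value. With an unsigned
   operand the right shift by 24 fills with zeros and the left shift by 24
   discards the upper bytes, so the outer two masks are unnecessary. */
uint32_t bswap32(uint32_t num) {
    return (num >> 24) |
           ((num >> 8) & 0x0000FF00u) |
           ((num << 8) & 0x00FF0000u) |
           (num << 24);
}
```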
If you want to swap the bits in a 32bit number, your loop should probably max out at 16 (if it's 32, the first 16 steps will swap the bits, the next 16 steps will swap them back again).
int32_t swapped = 0;
for(int i = 0; i < 16; ++i)
{
// the masks for the two bits (hi and lo) we will be swapping
// shift a '1' to the correct bit location based on the index 'i'
uint32_t hi_mask = 1u << (31 - i); // 1u: shifting a signed 1 into bit 31 is undefined
uint32_t lo_mask = 1u << i;
// use bitwise and to mask out the original bits in the number
uint32_t hi_bit = num & hi_mask;
uint32_t lo_bit = num & lo_mask;
// shift the bits so they switch places
uint32_t new_lo_bit = hi_bit >> (31 - i);
uint32_t new_hi_bit = lo_bit << (31 - i);
// use bitwise-or to combine back into an int
swapped |= new_lo_bit;
swapped |= new_hi_bit;
}
Code written for readability - there are faster ways to reverse the bits in a 32bit number. As for printing:
for (size_t i = 0; i < 32; i++)
{
bool bit = (num >> (31 - i)) & 0x1;
printf(bit ? "1" : "0");
}
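One of the faster bit-reversal approaches alluded to above swaps progressively larger groups of bits, so the whole word is reversed in five steps instead of a per-bit loop. A sketch, assuming an unsigned 32-bit input (reverse_bits32 is an illustrative name):

```c
#include <stdint.h>

/* Reverse all 32 bits by swapping neighbours, then pairs, nibbles,
   bytes, and finally the two 16-bit halves. */
uint32_t reverse_bits32(uint32_t v) {
    v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
    v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
    v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4);
    v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8);
    v = (v >> 16) | (v << 16);
    return v;
}
```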
Suppose you have an integer a = 0x12345678 and a short b = 0xabcd.
What I want to do is replace given nibbles in integer a with nibbles from short b.
Eg: Replace 0,2,5,7th nibbles in a = 0x12345678 (where 8 = 0th nibble, 7=1st nibble, 6=2nd nibble and so on...) with nibbles from b = 0xabcd (where d = 0th nibble, c=1st nibble, b=2nd nibble & so on...)
My approach is:
1. Clear the bits we're going to replace from a, like a = 0x02045070.
2. Create the mask from the short b, like mask = 0xa0b00c0d.
3. Bitwise OR them to get the result: result = a | mask, i.e. result = 0xa2b45c7d, hence nibbles replaced.
My problem is I don't know any efficient way to create the desired mask (like in step 2) from the given short b
If you can give me an efficient way of doing so, it would be a great help to me and I thank you for that in advance ;)
Please ask if more info needed.
EDIT:
My code to solve the problem (not good enough though)
Any improvement is highly appreciated.
int index[4] = {0,1,5,7}; // Given nibbles to be replaced in integer
int s = 0x01024300; // integer mask i.e. cleared nibbles
int r = 0x0000abcd; // short (converted to int )
r = ((r & 0x0000000f) << 4*(index[0]-0)) |
((r & 0x000000f0) << 4*(index[1]-1)) |
((r & 0x00000f00) << 4*(index[2]-2)) |
((r & 0x0000f000) << 4*(index[3]-3));
s = s|r;
A nibble has 4 bits, and according to your indexing scheme, the zeroth nibble occupies the least significant bits at positions 0-3, the first nibble occupies bits 4-7, and so on.
Simply shift the values the necessary amount. This will set the nibble at position set by the variable index:
size_t index = 5; //6th nibble is at index 5
size_t shift = 4 * index; //6th nibble is represented by bits 20-23
unsigned long nibble = 0xC;
unsigned long result = 0x12345678;
result = result & ~( 0xFul << shift ); //clear the 6th nibble (unsigned long mask, so no upper bits of result are lost where long is 64 bits)
result = result | ( nibble << shift ); //set the 6th nibble
If you want to set more than one value, put this code in a loop. The variable index should be changed to an array of values, and variable nibble could also be an array of values, or it could contain more than one nibble, in which case you extract them one by one by shifting values to the right.
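The loop described above might look like the following sketch. The function name replace_nibbles and its parameters are assumptions for illustration; nibble k of b goes to position index[k] of a:

```c
#include <stddef.h>
#include <stdint.h>

/* Replace the nibbles of a listed in index[] with successive nibbles of b:
   nibble k of b is written to nibble index[k] of a. */
uint32_t replace_nibbles(uint32_t a, uint16_t b,
                         const unsigned *index, size_t count) {
    for (size_t k = 0; k < count; k++) {
        unsigned shift  = 4u * index[k];
        uint32_t nibble = ((uint32_t)b >> (4u * k)) & 0xFu; /* extract nibble k of b */
        a &= ~((uint32_t)0xF << shift);                     /* clear the target nibble */
        a |= nibble << shift;                               /* set the target nibble */
    }
    return a;
}
```

With index = {0, 2, 5, 7}, a = 0x12345678 and b = 0xabcd this yields 0xa2b45c7d, the result asked for in the question.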
A lot depends on how flexible you are in accepting the "nibble list", index[4] in your case.
You mentioned that you can replace anywhere from 0 to 8 nibbles. If you take your nibble bits as an 8-bit bitmap, rather than as a list, you can use the bitmap as a lookup in a 256-entry table, which maps from bitmap to a (fixed) mask with 1s in the nibble positions. For example, for the nibble list {1, 3}, you'd have the bitmap 0b00001010 which would map to the mask 0x0000F0F0.
Then you can use pdep which has intrinsics on gcc, clang, icc and MSVC on x86 to expand the bits in your short to the right position. E.g., for b == 0xab you'd have _pdep_u32(b, mask) == 0x0000a0b0.
If you aren't on a platform with pdep, you can accomplish the same thing with multiplication.
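For testing on machines without the instruction, a portable loop can stand in for pdep. Note that this is a plain bit-by-bit emulation, not the multiplication trick mentioned above, and pdep32_soft is a hypothetical name:

```c
#include <stdint.h>

/* Software stand-in for a 32-bit parallel bit deposit: the low bits of
   src are placed, in order, at the positions of the set bits of mask. */
uint32_t pdep32_soft(uint32_t src, uint32_t mask) {
    uint32_t result = 0;
    for (uint32_t bit = 1; mask != 0; bit <<= 1) {
        uint32_t lowest = mask & (~mask + 1u); /* isolate lowest set bit of mask */
        if (src & bit) {
            result |= lowest;
        }
        mask &= mask - 1u;                     /* clear that bit and continue */
    }
    return result;
}
```

With the mask 0xF0F00F0F for nibbles 0, 2, 5 and 7, depositing 0xabcd produces 0xa0b00c0d, which is the mask the question asks for.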
To make it easy to change the nibble assignment, a union with a bit-field structure could be used:
Step 1 - create a union allowing to have nibbles access
typedef union u_nibble {
uint32_t dwValue;
uint16_t wValue;
struct sNibble {
uint32_t nib0: 4;
uint32_t nib1: 4;
uint32_t nib2: 4;
uint32_t nib3: 4;
uint32_t nib4: 4;
uint32_t nib5: 4;
uint32_t nib6: 4;
uint32_t nib7: 4;
} uNibble;
} NIBBLE;
Step 2 - assign two NIBBLE items with your integer a and short b
NIBBLE myNibbles[2];
uint32_t a = 0x12345678;
uint16_t b = 0xabcd;
myNibbles[0].dwValue = a;
myNibbles[1].wValue = b;
Step 3 - initialize nibbles of a by nibbles of b
printf("a = %08x\n",myNibbles[0].dwValue);
myNibbles[0].uNibble.nib0 = myNibbles[1].uNibble.nib0;
myNibbles[0].uNibble.nib2 = myNibbles[1].uNibble.nib1;
myNibbles[0].uNibble.nib5 = myNibbles[1].uNibble.nib2;
myNibbles[0].uNibble.nib7 = myNibbles[1].uNibble.nib3;
printf("a = %08x\n",myNibbles[0].dwValue);
Output will be (on a typical little-endian target; bit-field layout is implementation-defined):
a = 12345678
a = a2b45c7d
If I understand your goal, the fun you are having comes from the reversal of the order of your fill from the upper half to the lower half of your final number. (instead of 0, 2, 4, 6, you want 0, 2, 5, 7) It isn't any more difficult, but it does make you count where the holes are in the final number. If I understood, then you could mask with 0x0f0ff0f0 and then fill in the zeros with shifts of 16, 12, 4 and 0. For example:
#include <stdio.h>
int main (void) {
unsigned a = 0x12345678, c = 0, mask = 0x0f0ff0f0;
unsigned short b = 0xabcd;
/* mask a, fill in the holes with the bits from b */
c = (a & mask) | (((unsigned)b & 0xf000) << 16);
c |= (((unsigned)b & 0x0f00) << 12);
c |= (((unsigned)b & 0x00f0) << 4);
c |= (unsigned)b & 0x000f;
printf (" a : 0x%08x\n b : 0x%0hx\n c : 0x%08x\n", a, b, c);
return 0;
}
Example Use/Output
$ ./bin/bit_swap_nibble
a : 0x12345678
b : 0xabcd
c : 0xa2b45c7d
Let me know if I misunderstood, I'm happy to help further.
With nibble = 4 bits and unsigned int = 32 bits, a nibble inside a unsigned int can be found as follows:
x = 0x00a0b000, find 3rd nibble in x i.e locate 'b'. Note nibble index starts with 0.
Now 3rd nibble is from 12th bit to 15th bit.
The 3rd nibble can be selected with n = 2^16 - 2^12. So, in n all the bits in the 3rd nibble will be 1 and all the bits in the other nibbles will be 0. That is, n = 0x0000F000.
In general, suppose if you want to find a continuous sequence of 1 in binary representation in which sequence starts from Xth bit to Yth bit then formula is 2^(Y+1) - 2^X.
#include <stdio.h>
#include <string.h> /* memset */
#define BUF_SIZE 33
char *int2bin(int a, char *buffer, int buf_size)
{
int i;
buffer[BUF_SIZE - 1] = '\0';
buffer += (buf_size - 1);
for(i = 31; i >= 0; i--)
{
*buffer-- = (a & 1) + '0';
a >>= 1;
}
return buffer;
}
int main()
{
unsigned int a = 0;
unsigned int b = 65535;
unsigned int b_nibble;
unsigned int b_at_a;
unsigned int a_nibble_clear;
signed char replace_with[8]; /* signed, so the -1 sentinel also works where plain char is unsigned */
unsigned int ai;
char buffer[BUF_SIZE];
memset(replace_with, -1, sizeof(replace_with));
replace_with[0] = 0; //replace 0th nibble of a with 0th nibble of b
replace_with[2] = 1; //replace 2nd nibble of a with 1st nibble of b
replace_with[5] = 2; //replace 5th nibble of a with 2nd nibble of b
replace_with[7] = 3; //replace 7th nibble of a with 3rd nibble of b
int2bin(a, buffer, BUF_SIZE - 1);
printf("a = %s, %08x\n", buffer, a);
int2bin(b, buffer, BUF_SIZE - 1);
printf("b = %s, %08x\n", buffer, b);
for(ai = 0; ai < 8; ++ai)
{
if(replace_with[ai] != -1)
{
b_nibble = (b & (1LL << ((replace_with[ai] + 1)*4)) - (1LL << (replace_with[ai]*4))) >> (replace_with[ai]*4);
b_at_a = b_nibble << (ai * 4);
a_nibble_clear = (a & ~(a & (1LL << ((ai + 1) * 4)) - (1LL << (ai * 4))));
a = a_nibble_clear | b_at_a;
}
}
int2bin(a, buffer, BUF_SIZE - 1);
printf("a = %s, %08x\n", buffer, a);
return 0;
}
Output:
a = 00000000000000000000000000000000, 00000000
b = 00000000000000001111111111111111, 0000ffff
a = 11110000111100000000111100001111, f0f00f0f
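The range formula 2^(Y+1) - 2^X from this answer can be wrapped in a small helper. This is a sketch; bit_range_mask is an illustrative name, and the 64-bit intermediate is there so that Y = 31 does not shift a 32-bit value by 32:

```c
#include <stdint.h>

/* Mask with ones in bits x..y inclusive (0 <= x <= y <= 31),
   computed as 2^(y+1) - 2^x. */
uint32_t bit_range_mask(unsigned x, unsigned y) {
    return (uint32_t)((UINT64_C(1) << (y + 1)) - (UINT64_C(1) << x));
}
```

For the 3rd nibble (bits 12 to 15) this gives 0x0000F000.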
I'm stuck on XORing a 32-bit integer with itself. I'm supposed to XOR the four 8-bit portions of the integer together. I understand how it works, but without storing the integer anywhere, I don't get how to do this.
I've thought it over and I'm thinking of using binary left shift and right shift operators to separate the 32 bit integer into 4 parts to XOR them. For example, if I were to use an 8-bit integer, I would do something like this:
int a = <some integer here>
(a << 4) ^ (a >> 4)
So far, it isn't working the way I thought it would work.
Here's a part of my code:
else if (choice == 2) {
int bits = 8;
printf("Enter an integer for checksum calculation: ");
scanf("%d", &in);
printf("Integer: %d, ", in);
int x = in, i;
int mask = 1 << sizeof(int) * bits - 1;
printf("Bit representation: ");
for (i = 1; i <= sizeof(int) * bits; i++) {
if (x & mask)
putchar('1');
else
putchar('0');
x <<= 1;
if (! (i % 8)) {
putchar(' ');
}
}
printf("\n");
}
Here's an example of an output:
What type of display do you want?
Enter 1 for character parity, 2 for integer checksum: 2
Enter an integer for checksum calculation: 1024
Integer: 1024, Bit representation: 00000000 00000000 00000100 00000000
Checksum of the number is: 4, Bit representation: 00000100
To accumulate the XOR of 8-bit values, you simply shift and XOR each part of the value. Conceptually it's this:
uint32_t checksum = ( (a >> 24) ^ (a >> 16) ^ (a >> 8) ^ a ) & 0xff;
However, since XOR can be done in any order, you can do the same with fewer operations:
uint32_t checksum = (a >> 16) ^ a;
checksum = ((checksum >> 8) ^ checksum) & 0xff;
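As a quick check that the two forms agree, here is a sketch with both wrapped in functions (the names xor_bytes_direct and xor_bytes_folded are made up for this example):

```c
#include <stdint.h>

/* XOR of the four bytes, written out directly. */
uint32_t xor_bytes_direct(uint32_t a) {
    return ((a >> 24) ^ (a >> 16) ^ (a >> 8) ^ a) & 0xffu;
}

/* The same result with fewer operations: fold the high half onto the
   low half, then the high byte of that onto the low byte. */
uint32_t xor_bytes_folded(uint32_t a) {
    uint32_t checksum = (a >> 16) ^ a;
    return ((checksum >> 8) ^ checksum) & 0xffu;
}
```

For the input 1024 from the question (bytes 00 00 04 00) both return 4, matching the expected output shown there.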
If you're doing this over many values, you can extend this idea by only condensing the value at the very end. This is quite similar to how parallel commutative operations are done in larger registers with technologies like SIMD (and indeed, compilers with SIMD support should be able to optimize the following code to make it much faster):
uint32_t simple_checksum( uint32_t *v, size_t count )
{
uint32_t checksum = 0;
uint32_t *end = v + count;
for( ; v != end; v++ )
{
checksum ^= *v; /* accumulate XOR of each 32-bit value */
}
checksum ^= (checksum >> 16); /* XOR high and low words into low word */
checksum ^= (checksum >> 8 ); /* XOR each byte of low word into low byte */
return checksum & 0xff; /* everything from bits 8-31 is rubbish */
}
In general, XORing a number with itself gives the value 0, so you can just as easily set the variable to 0.
0100101^0100101=0
This follows from the truth table of the XOR operation, which yields 0 when both bits are one or both are zero.
I am trying to convert a uint16_t input to a uint32_t bit mask. One bit in the input toggles two bits in the output bit mask. Here is an example converting a 4-bit input to an 8-bit bit mask:
Input Output
ABCDb -> AABB CCDDb
A,B,C,D are individual bits
Example outputs:
0000b -> 0000 0000b
0001b -> 0000 0011b
0010b -> 0000 1100b
0011b -> 0000 1111b
....
1100b -> 1111 0000b
1101b -> 1111 0011b
1110b -> 1111 1100b
1111b -> 1111 1111b
Is there a bithack-y way to achieve this behavior?
Interleaving bits by Binary Magic Numbers contained the clue:
uint32_t expand_bits(uint16_t bits)
{
uint32_t x = bits;
x = (x | (x << 8)) & 0x00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F;
x = (x | (x << 2)) & 0x33333333;
x = (x | (x << 1)) & 0x55555555;
return x | (x << 1);
}
The first four steps consecutively interleave the source bits in groups of 8, 4, 2, 1 bits with zero bits, resulting in 00AB00CD after the first step, 0A0B0C0D after the second step, and so on. The last step then duplicates each even bit (containing an original source bit) into the neighboring odd bit, thereby achieving the desired bit arrangement.
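To watch the intermediate values described above, the spreading steps can be stopped early. This is only a tracing sketch; spread_steps and its steps parameter are hypothetical and not part of the original answer:

```c
#include <stdint.h>

/* Run only the first `steps` (0..4) of the four interleaving steps,
   so the intermediate bit layouts can be inspected. */
uint32_t spread_steps(uint16_t bits, int steps) {
    uint32_t x = bits;
    if (steps >= 1) x = (x | (x << 8)) & 0x00FF00FFu;
    if (steps >= 2) x = (x | (x << 4)) & 0x0F0F0F0Fu;
    if (steps >= 3) x = (x | (x << 2)) & 0x33333333u;
    if (steps >= 4) x = (x | (x << 1)) & 0x55555555u;
    return x;
}
```

Reading A, B, C, D as the hex digits of the input, 0xABCD becomes 0x00AB00CD after one step and 0x0A0B0C0D after two. After all four steps every source bit sits in an even position, and OR-ing the value with itself shifted left by one gives the final mask.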
A number of variants are possible. The last step can also be coded as x + (x << 1) or 3 * x. The | operators in the first four steps can be replaced by ^ operators. The masks can also be modified as some bits are naturally zero and don't need to be cleared. On some processors short masks may be incorporated into machine instructions as immediates, reducing the effort for constructing and / or loading the mask constants. It may also be advantageous to increase instruction-level parallelism for out-of-order processors and optimize for those with shift-add or integer-multiply-add instructions. One code variant incorporating various of these ideas is:
uint32_t expand_bits (uint16_t bits)
{
uint32_t x = bits;
x = (x ^ (x << 8)) & ~0x0000FF00;
x = (x ^ (x << 4)) & ~0x00F000F0;
x = x ^ (x << 2);
x = ((x & 0x22222222) << 1) + (x & 0x11111111);
x = (x << 1) + x;
return x;
}
The easiest way to map a 4-bit input to an 8-bit output is with a 16 entry table. So then it's just a matter of extracting 4 bits at a time from the uint16_t, doing a table lookup, and inserting the 8-bit value into the output.
uint32_t expandBits( uint16_t input )
{
uint32_t table[16] = {
0x00, 0x03, 0x0c, 0x0f,
0x30, 0x33, 0x3c, 0x3f,
0xc0, 0xc3, 0xcc, 0xcf,
0xf0, 0xf3, 0xfc, 0xff
};
uint32_t output;
output = table[(input >> 12) & 0xf] << 24;
output |= table[(input >> 8) & 0xf] << 16;
output |= table[(input >> 4) & 0xf] << 8;
output |= table[ input & 0xf];
return output;
}
This provides a decent compromise between performance and readability. It doesn't have quite the performance of cmaster's over-the-top lookup solution, but it's certainly more understandable than thndrwrks' magical mystery solution. As such, it provides a technique that can be applied to a much larger variety of problems, i.e. use a small lookup table to solve a larger problem.
In case you want to get some estimate of relative speeds, some community wiki test code. Adjust as needed.
void f_cmp(uint32_t (*f1)(uint16_t x), uint32_t (*f2)(uint16_t x)) {
uint16_t x = 0;
do {
uint32_t y1 = (*f1)(x);
uint32_t y2 = (*f2)(x);
if (y1 != y2) {
printf("%4x %8lX %8lX\n", x, (unsigned long) y1, (unsigned long) y2);
}
} while (x++ != 0xFFFF);
}
void f_time(uint32_t (*f1)(uint16_t x)) {
f_cmp(expand_bits, f1);
clock_t t1 = clock();
volatile uint32_t y1 = 0;
unsigned n = 1000;
for (unsigned i = 0; i < n; i++) {
uint16_t x = 0;
do {
y1 += (*f1)(x);
} while (x++ != 0xFFFF);
}
clock_t t2 = clock();
printf("%6llu %6llu: %.6f %lX\n", (unsigned long long) t1,
(unsigned long long) t2, 1.0 * (t2 - t1) / CLOCKS_PER_SEC / n,
(unsigned long) y1);
fflush(stdout);
}
int main(void) {
f_time(expand_bits);
f_time(expandBits);
f_time(remask);
f_time(javey);
f_time(thndrwrks_expand);
// now in the other order
f_time(thndrwrks_expand);
f_time(javey);
f_time(remask);
f_time(expandBits);
f_time(expand_bits);
return 0;
}
Results
0 280: 0.000280 FE0C0000 // fast
280 702: 0.000422 FE0C0000
702 1872: 0.001170 FE0C0000
1872 3026: 0.001154 FE0C0000
3026 4399: 0.001373 FE0C0000 // slow
4399 5740: 0.001341 FE0C0000
5740 6879: 0.001139 FE0C0000
6879 8034: 0.001155 FE0C0000
8034 8470: 0.000436 FE0C0000
8486 8751: 0.000265 FE0C0000
Here's a working implementation:
uint32_t remask(uint16_t x)
{
uint32_t i;
uint32_t result = 0;
for (i=0;i<16;i++) {
uint32_t mask = (uint32_t)x & (1U << i);
result |= mask << (i);
result |= mask << (i+1);
}
return result;
}
On each iteration of the loop, the bit in question from the uint16_t is masked out and stored.
That bit is then shifted by its bit position and ORed into the result, then shifted again by its bit position plus 1 and ORed into the result.
If your concern is performance and simplicity, you are likely best off with a big lookup table (64k entries of 4 bytes each). With that, you can pretty much use any algorithm you like to generate the table; the lookup will just be a single memory access.
If that table is too big for your liking, you can split it. For instance, you can use an 8-bit lookup table with 256 entries of 2 bytes each. With that you can perform the entire operation with just two lookups. A bonus is that this approach allows for type-punning tricks to avoid the hassle of splitting the address with bit operations:
//Implementation defined behavior ahead:
//Works correctly for both little and big endian machines,
//however, results will be wrong on a PDP11...
uint32_t getMask(uint16_t input) {
assert(sizeof(uint16_t) == 2);
assert(sizeof(uint32_t) == 4);
static const uint16_t lookupTable[256] = { 0x0000, 0x0003, 0x000c, 0x000f, ... };
unsigned char* inputBytes = (unsigned char*)&input; //legal because we type-pun to char, but the order of the bytes is implementation defined
uint16_t outputShorts[2]; //filled in the same order as the input bytes are read
outputShorts[0] = lookupTable[inputBytes[0]];
outputShorts[1] = lookupTable[inputBytes[1]];
uint32_t output;
memcpy(&output, outputShorts, 4); //can't type-pun directly from uint16_t to uint32_t due to strict aliasing rules
return output;
}
The code above stays within the strict aliasing rules by reading the input through an unsigned char pointer, which is an explicit exception to those rules, and by assembling the result with memcpy. (The exception only works in that direction: a char buffer must not be accessed through a uint16_t pointer.) It also works around the effects of little/big-endian byte order by building the result in the same order as the input was split. However, it still exposes implementation defined behavior: a machine with a byte order of 1, 0, 3, 2, or other middle-endian orders, will silently produce wrong results (there have actually been such CPUs, like the PDP11...).
Of course, you can split the lookup table even further, but I doubt that would do you any good.
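Since any algorithm can be used to fill the table, here is a sketch that generates a 256-entry table at start-up and does the two lookups with plain shifts instead of type punning. The names lookup_table, init_lookup_table and get_mask_shifted are assumptions for this example, and the generated values are not taken from the elided table above:

```c
#include <stdint.h>

uint16_t lookup_table[256];

/* Fill the table: entry v has each of the 8 bits of v doubled. */
void init_lookup_table(void) {
    for (unsigned v = 0; v < 256; v++) {
        uint16_t out = 0;
        for (unsigned bit = 0; bit < 8; bit++) {
            if (v & (1u << bit)) {
                out |= (uint16_t)(3u << (2 * bit));
            }
        }
        lookup_table[v] = out;
    }
}

/* Two lookups combined with shifts; independent of byte order.
   init_lookup_table() must have been called first. */
uint32_t get_mask_shifted(uint16_t input) {
    return ((uint32_t)lookup_table[input >> 8] << 16) |
           lookup_table[input & 0xFFu];
}
```

The shifts cost a few instructions more than the punned version but give the same result on any byte order.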
A simple loop. Maybe not bit-hacky enough?
uint32_t thndrwrks_expand(uint16_t x) {
uint32_t mask = 3;
uint32_t y = 0;
while (x) {
if (x&1) y |= mask;
x >>= 1;
mask <<= 2;
}
return y;
}
Tried another that is twice as fast. Still 655/272 as slow as expand_bits(). Appears to be fastest 16 loop iteration solution.
uint32_t thndrwrks_expand(uint16_t x) {
uint32_t y = 0;
for (uint16_t mask = 0x8000; mask; mask >>= 1) {
y <<= 1;
y |= x&mask;
}
y *= 3;
return y;
}
Try this, where input16 is the uint16_t input mask:
uint32_t input32 = (uint32_t) input16;
uint32_t result = 0;
uint32_t i;
for(i=0; i<16; i++)
{
uint32_t bit_at_i = (input32 & (((uint32_t)1) << i)) >> i;
result |= ((bit_at_i << (i*2)) | (bit_at_i << ((i*2)+1)));
}
// result is now the 32 bit expanded mask
My solution is meant to run on mainstream x86 PCs and be simple and generic. I did not write this to compete for the fastest and/or shortest implementation. It is just another way to solve the problem submitted by OP.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#define BITS_TO_EXPAND (4U)
#define SIZE_MAX (256U)
static bool expand_uint(unsigned int *toexpand,unsigned int *expanded);
int main(void)
{
unsigned int in = 12;
unsigned int out = 0;
bool success;
char buff[SIZE_MAX];
success = expand_uint(&in,&out);
if(false == success)
{
(void) puts("Error: expand_uint failed");
return EXIT_FAILURE;
}
(void) snprintf(buff, (size_t) SIZE_MAX,"%u expanded is %u\n",in,out);
(void) fputs(buff,stdout);
return EXIT_SUCCESS;
}
/*
** It expands an unsigned int so that every bit in a nibble is copied twice
** in the resultant number. It returns true on success, false otherwise.
*/
static bool expand_uint(unsigned int *toexpand,unsigned int *expanded)
{
unsigned int i;
unsigned int shifts = 0;
unsigned int mask;
if(NULL == toexpand || NULL == expanded)
{
return false;
}
*expanded = 0;
for(i = 0; i < BITS_TO_EXPAND; i++)
{
mask = (*toexpand >> i) & 1;
*expanded |= (mask << shifts);
++shifts;
*expanded |= (mask << shifts);
++shifts;
}
return true;
}
How would I go about implementing a sign extend from 16 bits to 32 bits in C code?
I am supposed to be using bitwise operators. I also need to add and subtract; can anyone point me in the right direction? I did the first 4 but am confused on the rest. I have to incorporate a for loop somewhere as well for 1 of the cases.
I am not allowed to use any arithmetic operators (+, -, /, *) and no if statements.
Here is the code for the switch statement I am currently editing:
unsigned int csc333ALU(const unsigned int opcode,
const unsigned int argument1,
const unsigned int argument2) {
unsigned int result;
switch(opcode) {
case(0x01): // result = NOT argument1
result = ~(argument1);
break;
case(0x02): // result = argument 1 OR argument 2
result = argument1 | argument2;
break;
case(0x03): // result = argument 1 AND argument 2
result = argument1 & argument2;
break;
case(0x04): // result = argument 1 XOR argument 2
result = argument1 ^ argument2;
break;
case(0x05): // result = 16 bit argument 1 sign extended to 32 bits
result = 0x00000000;
break;
case(0x06): // result = argument1 + argument2
result = 0x00000000;
break;
case(0x07): // result = -argument1. In two's complement, negate and add 1.
result = 0x00000000;
break;
default:
printf("Invalid opcode: %X\n", opcode);
result = 0xFFFFFFFF;
}
partial answer for sign extension:
result = (argument1 & 0x8000) == 0x8000 ? 0xFFFF0000 | argument1 : argument1;
To sign-extend a 16 bit number to 32 bit, you need to copy bit 15 to the upper bits. The naive way to do this is with 16 instructions, copying bit 15 to bit 16, then 17, then 18, and so on. But you can do it more efficiently by using previously copied bits and doubling the number of bits you've copied each time like this:
unsigned int ext = (argument1 & 0x8000U) << 1;
ext |= ext << 1;
ext |= ext << 2;
ext |= ext << 4;
ext |= ext << 8;
result = (argument1 & 0xffffU) | ext;
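Wrapped in a function, the doubling steps can be checked against a few known values. A sketch, assuming 32-bit unsigned arithmetic via uint32_t (sign_extend16 is an illustrative name):

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of value to 32 bits using only
   bitwise operators and shifts. */
uint32_t sign_extend16(uint32_t value) {
    uint32_t ext = (value & 0x8000u) << 1; /* copy bit 15 into bit 16 */
    ext |= ext << 1;                       /* bits 16..17 */
    ext |= ext << 2;                       /* bits 16..19 */
    ext |= ext << 4;                       /* bits 16..23 */
    ext |= ext << 8;                       /* bits 16..31 */
    return (value & 0xFFFFu) | ext;
}
```

Any bits already present above bit 15 of the input are discarded before the extension is OR-ed in.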
To add two 32 bit numbers "manually" then you can simply do it bit by bit.
unsigned carry = 0;
result = 0;
for (int i = 0; i < 32; i++) {
// Extract the ith bit from argument1 and argument 2.
unsigned a1 = (argument1 >> i) & 1;
unsigned a2 = (argument2 >> i) & 1;
// The ith bit of result is set if 1 or 3 of a1, a2, carry is set.
unsigned v = a1 ^ a2 ^ carry;
result |= v << i;
// The new carry is 1 if at least two of a1, a2, carry is set.
carry = (a1 & a2) | (a1 & carry) | (a2 & carry);
}
Subtraction works with almost exactly the same code: a - b is the same as a + (~b+1) in two's complement arithmetic. Because you aren't allowed to simply add 1, you can achieve the same by initialising carry to 1 instead of 0.
unsigned carry = 1;
result = 0;
for (int i = 0; i < 32; i++) {
unsigned a1 = (argument1 >> i) & 1;
unsigned a2 = (~argument2 >> i) & 1;
unsigned v = a1 ^ a2 ^ carry;
result |= v << i;
carry = (a1 & a2) | (a1 & carry) | (a2 & carry);
}
To find the two's complement without using the negation operator, similar ideas apply: bitwise negate and then add 1. Adding 1 is simpler than adding argument2, so the code is correspondingly simpler.
result = ~argument1;
unsigned carry = 1;
for (int i = 0; i < 32 && carry; i++) {
carry &= (result >> i); // the carry continues only while bit i was 1
result ^= (1u << i); // adding the carry flips bit i
}
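Since addition, subtraction and negation differ only in their inputs and the initial carry, they can share one helper. The sketch below uses made-up names (ripple_add, bit_add, bit_sub, bit_neg); the loop counter still uses ++, just as the loops above do:

```c
#include <stdint.h>

/* Bit-by-bit ripple-carry addition of a + b + carry (carry is 0 or 1). */
uint32_t ripple_add(uint32_t a, uint32_t b, uint32_t carry) {
    uint32_t result = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t a1 = (a >> i) & 1u;
        uint32_t b1 = (b >> i) & 1u;
        result |= (a1 ^ b1 ^ carry) << i;                /* sum bit */
        carry = (a1 & b1) | (a1 & carry) | (b1 & carry); /* majority */
    }
    return result;
}

uint32_t bit_add(uint32_t a, uint32_t b) { return ripple_add(a, b, 0u); }
uint32_t bit_sub(uint32_t a, uint32_t b) { return ripple_add(a, ~b, 1u); }  /* a + ~b + 1 */
uint32_t bit_neg(uint32_t a)             { return ripple_add(0u, ~a, 1u); } /* ~a + 1 */
```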
to get sign extension from short int to int....
short int iShort = value;
int i = iShort; // compiler automatically creates code that performs sign extension
Note: going from i to iShort will generate a compiler warning.
However, for other situations...
There is no need to make a comparison: the result of the & is either zero or nonzero, which ?: tests directly. Be sure to cast the parts of the calculation to int.
int i = (argument & 0x8000) ? (int)(0xFFFF0000u | (unsigned)argument) : (int)argument;