Reversing the nibbles - c

I am trying to "build a new number by reversing its nibbles".
This is the exercise:
Write a function that given an unsigned n
a) returns the value with the nibbles placed in reverse order
I was thinking that all 8 nibbles of the 32-bit unsigned value should be placed in reverse order. So, as an example, for the number 24, which is 00000000000000000000000000011000,
=> the reversed value should be: 10000001000000000000000000000000 (0x81000000).
#include <stdio.h>

unsigned getNibble(unsigned n, unsigned p){
    unsigned mask = 0xFu;
    unsigned nibble = 0;
    nibble = (n & (mask << p)) >> p;
    return nibble;
}

unsigned swapNibbles(unsigned n){
    unsigned new = 0;
    unsigned nibble;
    for(unsigned i = 0; i < (sizeof(n) * 8); i = i + 4){
        nibble = getNibble(n, i);
        new = (new << i) + nibble;
    }
    return new;
}

int main(void) {
    printf("0x%x", swapNibbles(24));
    return 0;
}
I tried to debug it, and it went well until one point.
At one of the shifts, my "new" variable was turned into 0.

This statement
new = (new << i) + nibble;
is wrong. It should be
new = (new << 4) + nibble;
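
With that single change the function produces the expected 0x81000000 for 24. For reference, the corrected loop (the same code as above, only the shift amount changed):

unsigned swapNibbles(unsigned n){
    unsigned new = 0;
    unsigned nibble;
    for(unsigned i = 0; i < (sizeof(n) * 8); i = i + 4){
        nibble = getNibble(n, i);
        new = (new << 4) + nibble;   /* shift by 4, not by i */
    }
    return new;
}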

An approach that does the work in parallel:
uint32_t n = ...;

// Swap the nibbles of each byte.
n = (n & 0x0F0F0F0F) << 4
  | (n & 0xF0F0F0F0) >> 4;

// Swap the bytes of each byte pair.
n = (n & 0x00FF00FF) << 8
  | (n & 0xFF00FF00) >> 8;

// Swap the byte pairs.
n = (n & 0x0000FFFF) << 16
  | (n & 0xFFFF0000) >> 16;
Doing the work in parallel greatly reduces the number of operations.
            OP's        This
            Approach    Approach
            ---------   ---------   -----------------
Shifts      24 / 48      6 / 8      32 bits / 64 bits
Ands         8 / 16      6 / 8
Ors*         8 / 16      3 / 4
Assigns      8 / 16      3 / 4
Adds         8 / 16      0 / 0
Compares     8 / 16      0 / 0
            ---------   ---------
Total       64 / 128    18 / 24
            ---------   ---------
Scale       O(N)        O(log(N))

* Addition was used as "or" in the OP's solution.
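
For reference, a minimal test harness for the parallel version (a sketch; it assumes a 32-bit uint32_t and simply reuses the masks above):

#include <stdio.h>
#include <stdint.h>

static uint32_t reverse_nibbles(uint32_t n)
{
    n = ((n & 0x0F0F0F0F) << 4) | ((n & 0xF0F0F0F0) >> 4);   /* swap the nibbles of each byte */
    n = ((n & 0x00FF00FF) << 8) | ((n & 0xFF00FF00) >> 8);   /* swap the bytes of each pair   */
    n = ((n & 0x0000FFFF) << 16) | ((n & 0xFFFF0000) >> 16); /* swap the byte pairs           */
    return n;
}

int main(void)
{
    printf("0x%x\n", reverse_nibbles(24));   /* expected: 0x81000000 */
    return 0;
}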

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t x = 0xDEADBEEF;
    printf("original 4 bytes %X\n", x);

    uint32_t y = 0;
    for(uint8_t i = 0; i < 32; i += 4)
    {
        y <<= 4;
        y |= x >> i & 0xF;
    }
    printf("reverse order nibbles %X\n", y);
    return 0;
}
This could be made into a generic function that accepts 8-, 16- and 32-bit numbers, but for now it resolves the bug you are facing in your code.
I would point out that ikegami's code is much better than this approach.

Related

How do I split an unsigned 64-bit int into individual bytes (8 bits each), little endian, in C?

For example, I have uint64_t value = 42 and I would like to split it into 8 uint8_t values (8 bits each), little endian. But I am unsure how to do the bit shifting. Help would be much appreciated.
If you want the individual bytes of a 64-bit value in little endian, then you can do the following:
In order to get the 1st byte, you simply apply the AND-bitmask 0xFF. This will mask out all bits except for the 8 least-significant bits.
In order to get the 2nd byte, you shift right by 8 bits before applying the bit-mask.
In order to get the 3rd byte, you shift right by 16 bits before applying the bit-mask.
In order to get the 4th byte, you shift right by 24 bits before applying the bit-mask.
(...)
In order to get the 8th byte, you shift right by 56 bits before applying the bit-mask.
Here is the code for the value 42 (which is the example in the question):
#include <stdio.h>
#include <stdint.h>

int main( void )
{
    uint64_t value = 42;
    uint8_t bytes[8];

    //extract the individual bytes
    for ( int i = 0; i < 8; i++ )
    {
        bytes[i] = value >> (8 * i) & 0xFF;
    }

    //print the individual bytes
    for ( int i = 0; i < 8; i++ )
    {
        printf( "%2d ", bytes[i] );
    }
    printf( "\n" );
}
Output:
42 0 0 0 0 0 0 0
If you replace the value 42 with the value 74579834759 in the program above, then you get the following output:
135 247 77 93 17 0 0 0
The following code works on both little-endian and big-endian platforms. On both types of platforms, it will produce the bytes in little-endian byte order.
uint64_t input = 42;
uint8_t values[8];
values[0] = input >> 0 & 0xFF;
values[1] = input >> 8 & 0xFF;
values[2] = input >> 16 & 0xFF;
values[3] = input >> 24 & 0xFF;
values[4] = input >> 32 & 0xFF;
values[5] = input >> 40 & 0xFF;
values[6] = input >> 48 & 0xFF;
values[7] = input >> 56 & 0xFF;
Note that the & 0xFF is redundant here, but it makes the code more clear and it's useful if you want to do anything with the value other than immediately assign it to a uint8_t variable.
This macro extracts the b-th byte from the integer u:
#define EXTRACT(u,b) ((u) >> (8 * (b)))

void foo(uint64_t x)
{
    /* conversion to uint8_t keeps only the low byte of each shifted value */
    uint8_t b[8] = {
        EXTRACT(x,0),
        EXTRACT(x,1),
        EXTRACT(x,2),
        EXTRACT(x,3),
        EXTRACT(x,4),
        EXTRACT(x,5),
        EXTRACT(x,6),
        EXTRACT(x,7),
    };
    (void)b;   /* use the array as needed */
}
If the platform is little-endian, you can also use memcpy (from <string.h>):

#include <string.h>

void foo(uint64_t x)
{
    uint8_t b[8];
    memcpy(b, &x, sizeof(b));
}
Here's a pointer approach I usually use to retrieve byte data from u64 data; just sharing it with you. But this way, the user has to take care of the byte order.
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int i;
    uint64_t v = 0x123456789abcdef0;
    uint8_t* ptrb;

    ptrb = (uint8_t*)&v;
    for (i = 0; i < 8; i++)
    {
        printf("%2x ", ptrb[i]);
    }
    printf("\n");
    return 0;
}
Below is the output with my sample code,
$ ./foo
f0 de bc 9a 78 56 34 12

Extracting a particular range of bits and find number of zeros between them in C

I want to extract a particular range of bits in an integer variable.
For example: 0xA5 (10100101)
I want to extract from bit 2 to bit 5, i.e. 1001, into a variable and count the number of zeros between them.
I have another variable which gives the starting point; in this case its value is 2. So the starting point can be found by 0xA5 >> 2.
The 5th bit position is arbitrary here; it could just as well be 6 or 7. The main idea is to find whichever bit is set to 1 after the 2nd bit and extract up to that.
How can I do the rest?
Assuming you are dealing with unsigned int for your variable.
You will have to construct the appropriate mask.
Suppose you want the bits from position x to position y; there need to be y - x + 1 ones in the mask.
You can get this by:
int digits = y - x + 1;
unsigned int mask = (1u << digits) - 1;
Now you need to remove the lower x bits from the initial number, which can be done by:
unsigned int result = number >> x;
Finally, apply the mask to remove the upper bits:
result = result & mask;
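
Putting the three steps together for the example in the question (a sketch, with x = 2 and y = 5):

unsigned int number = 0xA5;                 /* 10100101 */
unsigned int x = 2, y = 5;
unsigned int digits = y - x + 1;            /* 4 bits wanted */
unsigned int mask = (1u << digits) - 1;     /* 00001111 */
unsigned int result = (number >> x) & mask; /* 1001 = 0x9 */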
In this example we put 0 or 1 values into an array. After that you can treat the array as you like.
#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv) {
    uint8_t value = 0xA5;
    unsigned char bytes[8];
    unsigned char i;

    for (i = 0; i < 8; i++) {
        bytes[i] = (value & (1 << i)) != 0 ? 1 : 0;
    }
    for (i = 0; i < 8; i++) {
        printf("%d", bytes[i]);
    }
    return 0;
}
You could use a mask and the "&" (AND) operation:
a = 0xA5;
a = a >> OFFSET;   // OFFSET is 2 in this example
mask = 0x0F;       // equals 00001111
a = a & mask;
In your example a = 0xA5 (10100101) and the offset is 2.
a >> 2: a now equals 0x29 (00101001)
a & 0x0F: (00101001 AND 00001111) = 00001001 = 0x09
If you want the bits starting from offset X, then shift right by X.
If you want Y bits, then the mask (after the shift) will be 2 to the power of Y, minus one (for your example with four bits, 2 to the power of 4 is 16, minus one is 15, which is 1111 in binary). This can be done by left-shifting 1 by Y bits and subtracting 1.
However, the masking isn't needed if you only want to count the number of zeros in the wanted bits; only the right shift is. Loop Y times, each time shifting a 1 left one more step, and check with a bitwise AND whether that bit of the value is zero. If it is, increment a counter. At the end of the loop the counter is the number of zeros.
To put it all in code:
// Count the number of zeros in a specific amount of bits starting at a specific offset
// value is the original value
// offset is the offset in bits
// bits is the number of bits to check
unsigned int count_zeros(unsigned int value, unsigned int offset, unsigned int bits)
{
// Get the bits we're interested in the rightmost position
value >>= offset;
unsigned int counter = 0; // Zero-counter
for (unsigned int i = 0; i < bits; ++i)
{
if ((value & (1 << i)) == 0)
{
++counter; // Bit is a zero
}
}
return counter;
}
To use with the example data you have:
count_zeros(0xa5, 2, 4);
The result should be 2, which it is if you run it.
// Counts the zeros between the set bit at 'offset' and the next set bit above it
// (assumes such a higher set bit exists)
int32_t do_test(int32_t value, int32_t offset)
{
    int32_t _zeros = 0;
    value >>= offset;
    int i = 1;
    while (1) {
        if ((value >> i) % 2 == 0) {
            _zeros += 1;
            i++;
        } else {
            break;
        }
    }
    return _zeros;
}
int result = (0xA5 >> 2) & 0x0F;
Truth table for the & operator:
|  INPUTS | OUTPUT |
--------------------
|  0 |  0 |   0    |
|  0 |  1 |   0    |
|  1 |  0 |   0    |
|  1 |  1 |   1    |
--------------------

Efficient Conversion of a Binary Number to Hexadecimal String [closed]

I am writing a program that converts a binary value to its hexadecimal representation as a regular string. Each nibble of the value becomes one hexadecimal character in the string, so the result will be twice the size: one byte of the value needs two bytes in the string.
Hexadecimal Characters
0123456789 ;0x30 - 0x39
ABCDEF ;0x41 - 0x46
Example
0xF05C1E3A ;hex
4032568890 ;dec
would become
0x4630354331453341 ;hex
5057600944242766657 ;dec
Question?
Are there any elegant/alternative(/interesting) methods for converting between these states, other than a lookup table, (bitwise operations, shifts, modulo, etc)?
I'm not looking for a function in a library, but rather how one would/should be implemented. Any ideas?
Here's a solution with nothing but shifts, and/or, and add/subtract. No loops either.
uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
x += 0x0606060606060606LL;
m = ((x & 0x1010101010101010LL) >> 4) + 0x7f7f7f7f7f7f7f7fLL;
x += (m & 0x2a2a2a2a2a2a2a2aLL) | (~m & 0x3131313131313131LL);
Above is the simplified version I came up with after a little time to reflect. Below is the original answer.
uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
x += 0x3636363636363636LL;
m = (x & 0x4040404040404040LL) >> 6;
x += m;
m = m ^ 0x0101010101010101LL;
x -= (m << 2) | (m << 1);
See it in action: http://ideone.com/nMhJ2q
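
The converted digits end up packed in x as eight ASCII codes, with the most significant digit in the most significant byte. A small sketch (my addition, not part of the answer) of turning that into a printable C string without depending on host byte order:

char out[9];
for (int i = 0; i < 8; i++)
    out[i] = (char)((x >> (56 - 8 * i)) & 0xFF);   /* take the bytes MSB first */
out[8] = '\0';
printf("%s\n", out);                               /* expected: F05C1E3A */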
Spreading out the nibbles to bytes is easy with pdep:
spread = _pdep_u64(raw, 0x0F0F0F0F0F0F0F0F);
Now we'd have to add 0x30 to the bytes in the range 0-9 and 0x37 to the higher bytes (so 10..15 become 0x41..0x46, i.e. 'A'..'F'). This could be done by SWAR-subtracting 10 from every byte and then using the sign to select which number to add, such as (not tested)
uint64_t H   = 0x8080808080808080;
uint64_t ten = 0x0A0A0A0A0A0A0A0A;
uint64_t cmp = ((spread | H) - (ten & ~H)) ^ ((spread ^ ~ten) & H); // SWAR subtract
uint64_t masks = ((cmp & H) >> 7) * 255;
// if x - 10 is negative, add 0x30, else add 0x37
uint64_t add = (masks & 0x3030303030303030) | (~masks & 0x3737373737373737);
uint64_t asString = spread + add;
That SWAR compare can probably be optimized since you shouldn't need a full subtract to implement it.
There are some different suggestions here, including SIMD: http://0x80.pl/articles/convert-to-hex.html
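
For completeness, a rough sketch of the full pdep route (my own, not from the answer above): it needs BMI2 and <immintrin.h> (compile with -mbmi2 or similar) and borrows the +0x06 / +0x30 carry trick from the next answer. Note that _pdep_u64 puts the lowest nibble into the lowest byte, so the bytes are emitted from the top down:

#include <stdio.h>
#include <stdint.h>
#include <immintrin.h>   /* _pdep_u64, requires BMI2 */

int main(void)
{
    uint64_t raw    = 0xF05C1E3A;
    uint64_t spread = _pdep_u64(raw, 0x0F0F0F0F0F0F0F0FULL);  /* one nibble per byte */
    uint64_t m      = (spread + 0x0606060606060606ULL) & 0x1010101010101010ULL;
    uint64_t ascii  = spread + 0x3030303030303030ULL + (m >> 1) - (m >> 4);

    for (int i = 7; i >= 0; i--)                   /* most significant nibble first */
        putchar((int)((ascii >> (8 * i)) & 0xFF));
    putchar('\n');                                 /* expected: F05C1E3A */
    return 0;
}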
A slightly simpler version based on Mark Ransom's:
uint64_t x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
x = (x + 0x3030303030303030LL) +
(((x + 0x0606060606060606LL) & 0x1010101010101010LL) >> 4) * 7;
And if you want to avoid the multiplication:
uint64_t m, x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
m = (x + 0x0606060606060606LL) & 0x1010101010101010LL;
x = (x + 0x3030303030303030LL) + (m >> 1) - (m >> 4);
A somewhat more general conversion, from an integer to a string in any base from 2 up to the length of the digits table:
#include <string.h>

char *reverse(char *);

const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

char *convert(long long number, char *buff, int base)
{
    char *result = (buff == NULL || base > strlen(digits) || base < 2) ? NULL : buff;
    char sign = 0;

    if (number < 0)
    {
        sign = '-';
        number = -number;
    }
    if (result != NULL)
    {
        do
        {
            *buff++ = digits[number % base];
            number /= base;
        } while (number);
        if (sign) *buff++ = sign;
        *buff = 0;
        reverse(result);
    }
    return result;
}
char *reverse(char *str)
{
    char tmp;
    int len;

    if (str != NULL)
    {
        len = strlen(str);
        for (int i = 0; i < len / 2; i++)
        {
            tmp = *(str + i);
            *(str + i) = *(str + len - i - 1);
            *(str + len - i - 1) = tmp;
        }
    }
    return str;
}
example - counting from -50 to 50 decimal in base 23
-24 -23 -22 -21 -20 -1M -1L -1K -1J -1I -1H -1G -1F -1E -1D
-1C -1B -1A -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -M -L
-K -J -I -H -G -F -E -D -C -B -A -9 -8 -7 -6
-5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9
A B C D E F G H I J K L M 10 11
12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 1G
1H 1I 1J 1K 1L 1M 20 21 22 23 24
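
For instance, a quick use of convert (a sketch; the buffer just has to be large enough for the digits, the sign, and the terminator):

char buf[72];
printf("%s\n", convert(0xF05C1E3A, buf, 16));  /* prints F05C1E3A */
printf("%s\n", convert(-50, buf, 23));         /* prints -24      */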
A LUT (lookup table) C++ variant. I didn't check the actual machine code produced, but I believe any modern C++ compiler can catch the idea and compile it well.
#include <cstdint>
#include <ostream>

static const char nibble2hexChar[] { "0123456789ABCDEF" };
// 17B in total, because I'm lazy to init it per char

void byteToHex(std::ostream & out, const uint8_t value) {
    out << nibble2hexChar[value >> 4] << nibble2hexChar[value & 0xF];
}

// this one is actually written more toward short+simple source, than performance
void dwordToHex(std::ostream & out, uint32_t value) {
    int i = 8;
    while (i--) {
        out << nibble2hexChar[value >> 28];
        value <<= 4;
    }
}
EDIT: For C code you just have to switch from std::ostream to some other output means. Unfortunately your question lacks any details about what you are actually trying to achieve and why you don't use the built-in printf family of C functions.
For example, C like this can write to some char* output buffer, converting an arbitrary number of bytes:

/**
 * Writes the "n" bytes of array "values", hexadecimally formatted, into "outputBuffer".
 * Make sure there's enough space allocated in the output buffer, and add the zero
 * terminator yourself if you plan to use it as a C string.
 *
 * Returns: pointer after the last character written.
 */
char* dataToHex(char* outputBuffer, const size_t n, const unsigned char* values) {
    for (size_t i = 0; i < n; ++i) {
        *outputBuffer++ = nibble2hexChar[values[i] >> 4];
        *outputBuffer++ = nibble2hexChar[values[i] & 0xF];
    }
    return outputBuffer;
}
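
A hypothetical usage (my addition), assuming the nibble2hexChar table above is visible as a plain C array:

unsigned char raw[4] = { 0xF0, 0x5C, 0x1E, 0x3A };
char out[2 * sizeof raw + 1];
*dataToHex(out, sizeof raw, raw) = '\0';   /* add the terminator ourselves */
printf("%s\n", out);                       /* expected: F05C1E3A */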
And finally, I once helped somebody on Code Review who had a performance bottleneck with exactly this kind of hexadecimal formatting; there I did the conversion variant without a LUT. The whole thread, the other answer, and the performance measurements may be instructive for you, as you can see that the fastest solution doesn't blindly convert the result afterwards, but mixes the conversion into the main operation to achieve better performance overall. That's also why I wonder what you are actually trying to solve, as the whole problem often allows for a more optimal solution; if you are asking only about the conversion itself, printf("%x", ...) is a safe bet.
Here is that other approach for the "to hex" conversion:
fast C++ XOR Function
Decimal -> Hex
Just iterate through the string, convert every character to an int, and then you can do
printf("%02x", c);
or use sprintf to save it to another variable.
Hex -> Decimal
Code:
printf("%c", 16 * hexToInt('F') + hexToInt('0'));
int hexToInt(char c)
{
    if (c >= 'a' && c <= 'z')
        c = c - ('a' - 'A');
    int sum;
    sum = c / 16 - 3;
    sum *= 10;
    sum += c % 16;
    return (sum > 9) ? sum - 1 : sum;
}
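
To show hexToInt in use on a whole string, a hypothetical helper (my addition; it assumes the input contains only valid hex digits):

unsigned long long parseHex(const char *s)
{
    unsigned long long v = 0;
    while (*s)
        v = 16 * v + (unsigned)hexToInt(*s++);
    return v;
}
/* parseHex("F05C1E3A") == 0xF05C1E3A */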
The articles below compare different methods of converting digits to strings; hex numbers are not covered, but switching from dec to hex does not seem to be a big problem:
Integers
Fixed and floating point
EDIT:
Thank you for pointing out that the answer above is not relevant.
A common way with no LUT is to split the integer into nibbles and map them to ASCII:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <inttypes.h>

#define HI_NIBBLE(b) (((b) >> 4) & 0x0F)
#define LO_NIBBLE(b) ((b) & 0x0F)

void int64_to_char(char carr[], int64_t val){
    memcpy(carr, &val, 8);
}

uint64_t inp = 0xF05C1E3A;
char tmp_st[8];

int main()
{
    int64_to_char(tmp_st, inp);
    printf("Sample: %" PRIx64 "\n", inp);
    printf("Result: 0x");
    /* walk the bytes from the end of the array down; on a little-endian
       machine this is most significant byte first */
    for (unsigned int k = 8; k; k--){
        char tmp_ch = *(tmp_st + k - 1);
        char hi_nib = HI_NIBBLE(tmp_ch);
        char lo_nib = LO_NIBBLE(tmp_ch);
        if (hi_nib || lo_nib){   /* skip all-zero bytes (suppresses the leading zeros) */
            printf("%c%c", hi_nib + ((hi_nib > 9) ? 55 : 48), lo_nib + ((lo_nib > 9) ? 55 : 48));
        }
    }
    printf("\n");
    return 0;
}
Another way is to use Allison's Algorithm. I am a total noob in ASM, so I post the code in the form I found it.
Variant 1:
ADD AL,90h
DAA
ADC AL,40h
DAA
Variant 2:
CMP AL, 0Ah
SBB AL, 69h
DAS

Bitwise Operation on a byte and an int

I have a byte array represented as
char * bytes = getbytes(object); //some api function
I want to check whether the bit at some position x is set.
I've been trying this
int mask = 1 << x % 8;
y = bytes[x >> 3] & mask;
However, y comes back as all zeros. What am I doing incorrectly, and is there an easier way to check if a bit is set?
EDIT:
I did run this as well. It didn't return with the expected result either.
int k = x >> 3;
int mask = x % 8;
unsigned char byte = bytes[k];
return (byte & mask);
It failed an assert-true ctest I ran. Byte and mask at this time were "0002" and 2 respectively when printed from gdb.
EDIT 2: This is how I set the bits in the first place. I'm just trying to write a test to verify they are set.
unsigned long x = somehash(void* a);
unsigned int mask = 1 << (x % 8);
unsigned int location = x >> 3;
char* filter = getData(ref);
filter[location] |= mask;
This would be one (perhaps crude) way, off the top of my head:
#include "stdio.h"
#include "stdlib.h"
// this function *changes* the byte array
int getBit(char *b, int bit)
{
int bitToCheck = bit % 8;
b = b + (bitToCheck ? (bit / 8) : (bit / 8 - 1));
if (bitToCheck)
*b = (*b) >> (8 - bitToCheck);
return (*b) & 1;
}
int main(void)
{
char *bytes = calloc(2, 1);
*(bytes + 1)= 5; // writing to the appropiate bits
printf("%d\n", getBit(bytes, 16)); // checking the 16th bit from the left
return 0;
}
Assumptions:
A byte is represented as:
----------------------------------------
| 2^7 | 2^6 | 2^5 | 2^4 | 2^3 |... |
----------------------------------------
The leftmost bit is considered bit number 1 and the rightmost bit is considered the highest-numbered bit (the 16th bit in a 2-byte object).
It's OK to overwrite the actual byte object (if this is not wanted, use memcpy).
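
For the original question, a check that mirrors the way the bits were set (filter[location] |= mask) can also be written directly; a sketch (my addition, not part of the answer above):

/* Returns nonzero if bit x is set, using the same x >> 3 / x % 8
   indexing the question uses when setting the bits. */
int isBitSet(const char *bytes, unsigned long x)
{
    return ((unsigned char)bytes[x >> 3] >> (x % 8)) & 1;
}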

Breaking apart bit patterns, shifting and creating new patterns

As part of a larger problem, I have to take some binary value: 00000000 11011110 (8)
Then, I have to:
Derive the bit count in this function - I've done that by finding the place of the most significant bit.
Then store the first 6 numbers of this value into the value 128, such that it equals: 10011110
Then store the last 5 numbers of this value into the value 192, such that it equals: 11000011 10011110
The two bytes should be stored in some array, buffer[]
I have written this function; however, position does not appear to initialise properly in gdb and the values are not output correctly. This is my attempt:
void create_value(unsigned short init_val, unsigned char buffer[])
{
    // get the count
    int position = 0;
    while (init_val >>= 1)
        position++;

    // get total
    int count = position++;

    int start = 128;
    for (int i = 0; i < 7; i++)
        if (((1 << i) & init_val) != 0) start = start | 1 << i;
    buffer[0] = start;

    start = 192;
    for (int i = 7; i < 11; i++) {
        if (((1 << i) & init_val) != 0) start = start | 1 << i;
    }
    buffer[1] = start;
}
After
while (init_val >>= 1)
position++;
init_val will be 0. When you later use
if (((1 << i) & init_val) != 0) start = start | 1 << i;
you will never change start.
So, after reading through what you're trying to do (which is pretty confusingly described), why don't you:
void create_value(unsigned short init_value, unsigned char buffer[])
{
    buffer[0] = (init_value & 63) | 128;
    buffer[1] = ((init_value >> 6) & 31) | 192;
    return;
}
What this does: init_value & 63 masks off all but the lowest 6 bits in init_value, as you wanted. The | 128 then sets the most significant bit of the byte (IFF CHAR_BIT == 8, mind you).
(init_value >> 6) shifts init_value down by 6 bits, so the original bits 6-10 are now bits 0-4. & 31 masks off all but the lowest 5 bits of this value, and | 192 sets the two most significant bits.
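
A minimal check of this version (a sketch; it assumes the create_value definition just above):

#include <stdio.h>

int main(void)
{
    unsigned char buffer[2];
    create_value(0x00DE, buffer);                 /* 00000000 11011110 */
    printf("%02X %02X\n", buffer[1], buffer[0]);  /* expected: C3 9E, i.e. 11000011 10011110 */
    return 0;
}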

Resources