Making a conversion similar to BN_hex2bn + BN_bn2bin without OpenSSL in C

I currently use OpenSSL to convert values from an encrypted hex string to what I thought was a binary array. I then decrypt this "array" (pass it to EVP_DecryptUpdate). I make the conversion like this:
BIGNUM *bnEncr = BN_new();
if (0 == BN_hex2bn(&bnEncr, encrypted)) { // from hex to big number
    printf("ERROR\n");
}
unsigned int numOfBytesEncr = BN_num_bytes(bnEncr);
unsigned char encrBin[numOfBytesEncr];
if (0 == BN_bn2bin(bnEncr, encrBin)) { // from big number to binary
    printf("ERROR\n");
}
Then I pass encrBin to EVP_DecryptUpdate and decryption works.
I do this in many places in my code and now want to write my own C function for converting hex to a binary array, which I can then pass to EVP_DecryptUpdate. I had a go at this and converted my encrypted hex string to an array of 0s and 1s, but it turns out that EVP_DecryptUpdate won't work with that. From what I could find online, BN_bn2bin "creates a representation that is truly binary (i.e. a sequence of bits). More specifically, it creates a big-endian representation of the number." So this is not just an array of 0s and 1s, right?
Can someone explain how I can make the hex->(truly) binary conversion myself in C, so I would get the format that EVP_DecryptUpdate expects? Is this complicated?

BN_bn2bin "creates a representation that is truly binary (i.e. a
sequence of bits). More specifically, it creates a big-endian
representation of the number." So this is not just an array of 0s and
1s, right?
The sequence of bits mentioned here is represented as an array of bytes. With each of those bytes containing 8 bits, this can be interpreted as an "array of 0s and 1s". It is not an "array of integers that have the value 0 or 1", if that is what you are asking.
Since you are unclear about the workings of BN_bn2bin(), it helps to just analyze the end result of your code snippet. You could do that like this (omitting any error checking):
#include <stdio.h>
#include <openssl/bn.h>

int main(
    int argc,
    char **argv)
{
    const char *hexString = argv[1];
    BIGNUM *bnEncr = BN_new();
    BN_hex2bn(&bnEncr, hexString);
    unsigned int numOfBytesEncr = BN_num_bytes(bnEncr);
    unsigned char encrBin[numOfBytesEncr];
    BN_bn2bin(bnEncr, encrBin);
    fwrite(encrBin, 1, numOfBytesEncr, stdout);
}
This outputs the contents of encrBin to standard output, which is not normally something you want, but it lets you pipe the result through a tool like hexdump, or redirect it to a file for analysis with a hex editor. It looks like this:
$ ./bntest 74162ac74759e85654e0e7762c2cdd26 | hexdump -C
00000000 74 16 2a c7 47 59 e8 56 54 e0 e7 76 2c 2c dd 26 |t.*.GY.VT..v,,.&|
00000010
Or, if you do want to see those 0s and 1s:
$ ./bntest 74162ac74759e85654e0e7762c2cdd26 | xxd -b -c 4
00000000: 01110100 00010110 00101010 11000111 t.*.
00000004: 01000111 01011001 11101000 01010110 GY.V
00000008: 01010100 11100000 11100111 01110110 T..v
0000000c: 00101100 00101100 11011101 00100110 ,,.&
This shows that your question
Can someone explain how I can make the hex->(truly) binary conversion
myself in C, so I would get the format that EVP_DecryptUpdate expects?
Is this complicated?
is essentially the same as the SO question How to turn a hex string into an unsigned char array?, like I commented.
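That said, the conversion itself is short. Here is a minimal sketch of a hex-to-bytes routine without OpenSSL (the function name hex_to_bytes is just illustrative, and error handling is reduced to returning -1 on malformed input):
/* Convert a hex string such as "74162ac7..." into raw bytes.
   Returns the number of bytes written, or -1 on invalid input.
   Assumes out is large enough (strlen(hex) / 2 bytes). */
static int hex_to_bytes(const char *hex, unsigned char *out)
{
    int n = 0;
    while (hex[0] != '\0' && hex[1] != '\0') {
        unsigned char byte = 0;
        for (int i = 0; i < 2; i++) {
            char c = hex[i];
            byte = (unsigned char)(byte << 4);
            if (c >= '0' && c <= '9')      byte |= (unsigned char)(c - '0');
            else if (c >= 'a' && c <= 'f') byte |= (unsigned char)(c - 'a' + 10);
            else if (c >= 'A' && c <= 'F') byte |= (unsigned char)(c - 'A' + 10);
            else return -1;
        }
        out[n++] = byte;
        hex += 2;
    }
    return (hex[0] == '\0') ? n : -1;   /* reject odd-length input */
}
One difference from the BIGNUM round trip: BN_hex2bn/BN_bn2bin strip leading zero bytes (they treat the input as a number), while this keeps them, which is usually what you want when the hex string is ciphertext of a fixed length.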

It is unclear why you want this, and it's definitely not advisable to roll your own implementation of the conversion functions (they may stop working with any number of internal changes to OpenSSL) but if you're interested in what it looks like:
static int bn2binpad(const BIGNUM *a, unsigned char *to, int tolen)
{
    int n;
    size_t i, lasti, j, atop, mask;
    BN_ULONG l;

    /*
     * In case |a| is fixed-top, BN_num_bytes can return bogus length,
     * but it's assumed that fixed-top inputs ought to be "nominated"
     * even for padded output, so it works out...
     */
    n = BN_num_bytes(a);
    if (tolen == -1) {
        tolen = n;
    } else if (tolen < n) {     /* uncommon/unlike case */
        BIGNUM temp = *a;

        bn_correct_top(&temp);
        n = BN_num_bytes(&temp);
        if (tolen < n)
            return -1;
    }

    /* Swipe through whole available data and don't give away padded zero. */
    atop = a->dmax * BN_BYTES;
    if (atop == 0) {
        OPENSSL_cleanse(to, tolen);
        return tolen;
    }

    lasti = atop - 1;
    atop = a->top * BN_BYTES;
    for (i = 0, j = 0, to += tolen; j < (size_t)tolen; j++) {
        l = a->d[i / BN_BYTES];
        mask = 0 - ((j - atop) >> (8 * sizeof(i) - 1));
        *--to = (unsigned char)(l >> (8 * (i % BN_BYTES)) & mask);
        i += (i - lasti) >> (8 * sizeof(i) - 1); /* stay on last limb */
    }

    return tolen;
}


How to reverse strings that have been obfuscated using floats and doubles?

I'm working on a crackme, and having a bit of trouble making sense of the flag I'm supposed to retrieve.
I have disassembled the binary using radare2 and Ghidra; Ghidra gives me back the following pseudo-code:
undefined8 main(void)
{
    long in_FS_OFFSET;
    double dVar1;
    double dVar2;
    int local_38;
    int local_34;
    int local_30;
    int iStack44;
    int local_28;
    undefined2 uStack36;
    ushort uStack34;
    char local_20;
    undefined2 uStack31;
    uint uStack29;
    byte bStack25;
    long local_10;

    local_10 = *(long *)(in_FS_OFFSET + 0x28);
    __printf_chk(1,"Insert flag: ");
    __isoc99_scanf(&DAT_00102012,&local_38);
    uStack34 = uStack34 << 8 | uStack34 >> 8;
    uStack29 = uStack29 & 0xffffff00 | (uint)bStack25;
    bStack25 = (undefined)uStack29;
    if ((((local_38 == 0x41524146) && (local_34 == 0x7b594144)) && (local_30 == 0x62753064)) &&
        (((iStack44 == 0x405f336c && (local_20 == '_')) &&
          ((local_28 == 0x665f646e && (CONCAT22(uStack34,uStack36) == 0x40746f31)))))) {
        dVar1 = (double)CONCAT26(uStack34,CONCAT24(uStack36,0x665f646e));
        dVar2 = (double)CONCAT17((undefined)uStack29,CONCAT43(uStack29,CONCAT21(uStack31,0x5f)));
        __printf_chk(0x405f336c62753064,1,&DAT_00102017);
        __printf_chk(dVar1,1,"y: %.30lf\n");
        __printf_chk(dVar2,1,"z: %.30lf\n");
        dVar1 = dVar1 * 124.8034902710365;
        dVar2 = (dVar1 * dVar1) / dVar2;
        round_double(dVar2,0x1e);
        __printf_chk(1,"%.30lf\n");
        dVar1 = (double)round_double(dVar2,0x1e);
        if (1.192092895507812e-07 <=
            (double)((ulong)(dVar1 - 4088116.817143337) & 0x7fffffffffffffff)) {
            puts("Try Again");
        }
        else {
            puts("Well done!");
        }
    }
    if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
        /* WARNING: Subroutine does not return */
        __stack_chk_fail();
    }
    return 0;
}
It is easy to see that there's a part of the flag in plain sight, but the other part is a bit more interesting:
if (1.192092895507812e-07 <= (double)((ulong)(dVar1 - 4088116.817143337) & 0x7fffffffffffffff))
From what I understand, I have to generate the missing part of the flag depending on this condition. The problem is that I absolutely have no idea how to do this.
I can assume this missing part is 8 bytes in size, according to this line:
dVar2 = (double)CONCAT17((undefined)uStack29,CONCAT43(uStack29,CONCAT21(uStack31,0x5f)));
Considering flags are usually ASCII, with some special characters, let's say each byte will have values from 0x21 to 0x7E; that's almost 100^8 combinations, which will clearly take too much time to compute.
Do you guys have an idea on how I should proceed to solve this?
Edit : Here is the link to the binary : https://filebin.net/dpfr1nocyry3sijk
You can tweak the Ghidra decompilation by editing the variable types. Based on the scanf format string %32s, your local_38 should be char [32].
Before the first if there are some byte swaps, and the first if statement gives you a long constraint on the flag.
At this point, you can confirm that part of the flag is FARADAY{d0ubl3_#nd_f1o#t; then comes the main part of this challenge.
The program prints x, y and z based on the flag, but you'll quickly find that x and y are fixed by the if constraint, so you only need to solve for z to get the flag. At first it looks like you have to brute-force every double value limited to printable ASCII.
But there is a limitation in the if statement that says byte 0 of this double must be _, and there is a math constraint: simple math tells you |dVar2 - 4088116.817143337| < 1.192092895507812e-07, so dVar2 must end up very close to 4088116.817143337.
Also, byte 3 and byte 7 of this double are swapped.
From the reversed result, dVar2 = y*y*x*x/z. Solving this equation, z must be near 407.2786840401004, which packed as a little-endian double is _be}uty#. Based on the internal structure of a double, the MSB affects the exponent, so you can be sure the last byte is #; byte 0 and byte 3 are now fixed by the constraint and by the common flag format with its {} pair.
So finally, you only need to brute-force 5 bytes of printable ASCII to solve this challenge.
import string, struct
from itertools import product

possible = string.ascii_lowercase + string.punctuation + string.digits
for nxt in product(possible, repeat=5):
    n = ''.join(nxt).encode()
    s = b'_' + n[:2] + b'}' + n[2:] + b'#'
    rtn = struct.unpack("<d", s)[0]
    rtn = 1665002837.488342 / rtn
    if abs(rtn - 4088116.817143337) <= 0.0000001192092895507812:
        print(s)
And bingo, the flag is FARADAY{d0ubl3_#nd_f1o#t_be#uty}.

How to read binary inputs from a file in C

What I need to do is to read binary inputs from a file. The inputs are for example (binary dump),
00000000 00001010 00000100 00000001 10000101 00000001 00101100 00001000 00111000 00000011 10010011 00000101
What I did is,
char* filename = vargs[1];
BYTE buffer;
FILE *file_ptr = fopen(filename,"rb");
fseek(file_ptr, 0, SEEK_END);
size_t file_length = ftell(file_ptr);
rewind(file_ptr);

for (int i = 0; i < file_length; i++)
{
    fread(&buffer, 1, 1, file_ptr); // read 1 byte
    printf("%d ", (int)buffer);
}
But the problem here is that I need to divide those binary inputs in some way so that I can use them as commands (e.g. 101 in the input means add two numbers).
But when I run the program with the code I wrote, this provides me an output like:
0 0 10 4 1 133 1 44 8 56 3 147 6
which shows in ASCII numbers.
How can I read the inputs as binary numbers, not ASCII numbers?
The inputs should be used in this way:
0 # Padding for the whole file!
0000|0000 # Function 0 with 0 arguments
00000101|00|000|01|000 # MOVE the value 5 to register 0 (000 is MOV function)
00000011|00|001|01|000 # MOVE the value 3 to register 1
000|01|001|01|100 # ADD registers 0 and 1 (100 is ADD function)
000|01|0000011|10|000 # MOVE register 0 to 0x03
0000011|10|010 # POP the value at 0x03
011 # Return from the function
00000110 # 6 instructions in this function
I am trying to implement something like assembly language commands.
Can someone please help me out with this problem?
Thanks!
You need to understand the difference between data and its representation. You are correctly reading the data in binary. When you print the data, printf() gives the decimal representation of the binary data. Note that 00001010 in binary is the same as 10 in decimal and 00000100 in binary is 4 in decimal. If you convert each sequence of bits into its decimal value, you will see that the output is exactly correct. You seem to be confusing the representation of the data as it is output with how the data is read and stored in memory. These are two different and distinct things.
The next step to solve your problem is to learn about bitwise operators: |, &, ~, >>, and <<. Then use the appropriate combination of operators to extract the data you need from the stream of bits.
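For instance, here is an illustrative sketch of pulling fields out of one byte with shifts and masks (the 3/2/3-bit layout here is made up for the example, not taken from your instruction format):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t byte = 0xAC;                 /* 1010 1100 */

    uint8_t opcode = (byte >> 5) & 0x07; /* top 3 bits:  101 = 5 */
    uint8_t reg    = (byte >> 3) & 0x03; /* next 2 bits: 01  = 1 */
    uint8_t value  =  byte       & 0x07; /* low 3 bits:  100 = 4 */

    printf("opcode=%u reg=%u value=%u\n",
           (unsigned)opcode, (unsigned)reg, (unsigned)value);
    return 0;
}
Since some of your fields cross byte boundaries, you will need to accumulate bits across reads before masking, as the circular-buffer approach in the next answer does.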
The format you use is not aligned to byte boundaries, so you need to read your bits into a circular buffer and parse them with a state machine.
Read "in binary" or "in text" is quite the same thing, the only thing that change is your interpretation of the data. In your exemple you are reading a byte, and you are printing the decimal value of that byte. But you want to print the bit of that char, to do that you just need to use binary operator of C.
For example:
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <limits.h>

struct binary_circle_buffer {
    size_t i;
    unsigned char buffer;
};

bool read_bit(struct binary_circle_buffer *bcn, FILE *file, bool *bit) {
    if (bcn->i == CHAR_BIT) {
        size_t ret = fread(&bcn->buffer, sizeof bcn->buffer, 1, file);
        if (!ret) {
            return false;
        }
        bcn->i = 0;
    }
    *bit = bcn->buffer & ((unsigned char)1 << bcn->i++); // maybe wrong order, you should test yourself
    // *bit = bcn->buffer & (((unsigned char)UCHAR_MAX / 2 + 1) >> bcn->i++);
    return true;
}

int main(void)
{
    struct binary_circle_buffer bcn = { .i = CHAR_BIT };
    FILE *file = stdin; // replace by your file
    bool bit;
    size_t i = 0;

    while (read_bit(&bcn, file, &bit)) {
        // here you must code your state machine to parse instructions; gl & hf
        printf(bit ? "1" : "0");
        if (++i >= CHAR_BIT) { // group the output one byte at a time
            i = 0;
            printf(" ");
        }
    }
}
Helping you more would be difficult; you are basically asking for help writing a virtual machine...

Store C structs for multiple platform use - would this approach work?

Compiler: GNU GCC
Application type: console application
Language: C
Platforms: Win7 and Linux Mint
I wrote a program that I want to run under Win7 and Linux. The program writes C structs to a file and I want to be able to create the file under Win7 and read it back in Linux and vice versa.
By now, I have learned that writing complete structs with fwrite() will give almost 100% assurance that they won't be read back correctly by the other platform. This is due to padding and maybe other causes.
I defined all structs myself and they (now, after my previous question on this forum) all have members of type int32_t, int64_t and char. I am thinking about writing a WriteStructname() function for each struct that will write the individual members as int32_t, int64_t and char to the output file. Likewise, a ReadStructname() function to read the individual struct members from the file and copy them to an empty struct again.
Would this approach work? I prefer to have maximum control over my sourcecode, so I'm not looking for libraries or other dependencies to achieve this unless I really have to.
Thanks for reading
Element-wise writing of data to a file is your best approach, since structs will differ due to alignment and packing differences between compilers.
However, even with the approach you're planning on using, there are still potential pitfalls, such as different endianness between systems, or different encoding schemes (i.e., two's complement versus one's complement encoding of signed numbers).
If you're going to do this, you should consider something like a JSON parser to encode and decode your data so you don't corrupt it due to the issues mentioned above.
Good luck!
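To make the approach concrete, here is a sketch of what such a Write/Read pair could look like. The struct and the function names are made up for illustration; the on-disk format is simply each member in big-endian byte order, written byte by byte so that neither padding nor host endianness ever reaches the file:
#include <stdio.h>
#include <stdint.h>

struct record {            /* example struct, not the asker's actual one */
    int32_t id;
    int64_t timestamp;
    char    tag;
};

/* Write an unsigned value as 'len' bytes, most significant byte first. */
static int write_be(FILE *f, uint64_t v, int len)
{
    for (int i = len - 1; i >= 0; i--)
        if (fputc((int)((v >> (8 * i)) & 0xFF), f) == EOF)
            return -1;
    return 0;
}

/* Read 'len' bytes, most significant byte first, into *v. */
static int read_be(FILE *f, uint64_t *v, int len)
{
    *v = 0;
    for (int i = 0; i < len; i++) {
        int c = fgetc(f);
        if (c == EOF)
            return -1;
        *v = (*v << 8) | (uint64_t)(unsigned char)c;
    }
    return 0;
}

int WriteRecord(FILE *f, const struct record *r)
{
    return write_be(f, (uint32_t)r->id, 4) == 0 &&
           write_be(f, (uint64_t)r->timestamp, 8) == 0 &&
           write_be(f, (unsigned char)r->tag, 1) == 0 ? 0 : -1;
}

int ReadRecord(FILE *f, struct record *r)
{
    uint64_t v;
    if (read_be(f, &v, 4)) return -1;
    r->id = (int32_t)(uint32_t)v;
    if (read_be(f, &v, 8)) return -1;
    r->timestamp = (int64_t)v;
    if (read_be(f, &v, 1)) return -1;
    r->tag = (char)(unsigned char)v;
    return 0;
}
Because every member is emitted explicitly in a defined order and width, the same file can be produced and consumed on both platforms without any struct-layout assumptions.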
If you use GCC or any other compiler that supports "packed" structs, then as long as you restrict yourself to [u]intX_t types in the struct and apply an endianness fix to every field whose type is bigger than 8 bits, you are platform safe :)
Here is example code that gives you portability between platforms; do not forget to manually edit the endianness setting UIP_BYTE_ORDER.
#include <stdint.h>
#include <stdio.h>

/* These macros are set manually, you should use some automated detection methodology */
#define UIP_BIG_ENDIAN     1
#define UIP_LITTLE_ENDIAN  2
#define UIP_BYTE_ORDER     UIP_LITTLE_ENDIAN

/* Borrowed from uIP */
#ifndef UIP_HTONS
#   if UIP_BYTE_ORDER == UIP_BIG_ENDIAN
#      define UIP_HTONS(n) (n)
#      define UIP_HTONL(n) (n)
#      define UIP_HTONLL(n) (n)
#   else /* UIP_BYTE_ORDER == UIP_BIG_ENDIAN */
#      define UIP_HTONS(n) (uint16_t)((((uint16_t) (n)) << 8) | (((uint16_t) (n)) >> 8))
#      define UIP_HTONL(n) (((uint32_t)UIP_HTONS(n) << 16) | UIP_HTONS((uint32_t)(n) >> 16))
#      define UIP_HTONLL(n) (((uint64_t)UIP_HTONL(n) << 32) | UIP_HTONL((uint64_t)(n) >> 32))
#   endif /* UIP_BYTE_ORDER == UIP_BIG_ENDIAN */
#else
#error "UIP_HTONS already defined!"
#endif /* UIP_HTONS */

struct __attribute__((__packed__)) s_test
{
    uint32_t a;
    uint8_t b;
    uint64_t c;
    uint16_t d;
    int8_t string[13];
};

struct s_test my_data =
{
    .a = 0xABCDEF09,
    .b = 0xFF,
    .c = 0xDEADBEEFDEADBEEF,
    .d = 0x9876,
    .string = "bla bla bla"
};

void save()
{
    FILE * f;

    f = fopen("test.bin", "wb+"); /* "b": binary mode, so Windows does not translate line endings */

    /* Fix endianness */
    my_data.a = UIP_HTONL(my_data.a);
    my_data.c = UIP_HTONLL(my_data.c);
    my_data.d = UIP_HTONS(my_data.d);

    fwrite(&my_data, sizeof(my_data), 1, f);

    fclose(f);
}

void read()
{
    FILE * f;

    f = fopen("test.bin", "rb");

    fread(&my_data, sizeof(my_data), 1, f);

    fclose(f);

    /* Fix endianness */
    my_data.a = UIP_HTONL(my_data.a);
    my_data.c = UIP_HTONLL(my_data.c);
    my_data.d = UIP_HTONS(my_data.d);
}

int main(int argc, char ** argv)
{
    save();
    return 0;
}
That's the saved file dump:
fanl#fanl-ultrabook:~/workspace-tmp/test3$ hexdump -v -C test.bin
00000000 ab cd ef 09 ff de ad be ef de ad be ef 98 76 62 |..............vb|
00000010 6c 61 20 62 6c 61 20 62 6c 61 00 00 |la bla bla..|
0000001c
This is a good approach. If all fields are integer types of a specific size such as int32_t, int64_t, or char, and you read/write the appropriate number of them to/from arrays, you should be fine.
The one thing you need to watch out for is endianness. Any integer type should be written in a known byte order and read back in the proper byte order for the system in question. The simplest way to do this is with the ntohs and htons functions for 16-bit ints and the ntohl and htonl functions for 32-bit ints. There are no corresponding standard functions for 64-bit ints, but they shouldn't be too difficult to write.
Here's a sample of how you could write these functions for 64 bit:
#include <stdint.h>
#include <string.h>

uint64_t htonll(uint64_t val)
{
    uint8_t v[8];
    uint64_t result;
    int i;

    for (i = 0; i < 8; i++) {
        v[i] = (uint8_t)(val >> ((7 - i) * 8));
    }
    memcpy(&result, v, sizeof result);  /* avoids the aliasing/alignment issues of casting v to uint64_t* */
    return result;
}

uint64_t ntohll(uint64_t val)
{
    uint8_t *v = (uint8_t *)&val;
    uint64_t result = 0;
    int i;

    for (i = 0; i < 8; i++) {
        result |= (uint64_t)v[i] << ((7 - i) * 8);
    }
    return result;
}

How to check the number of set bits in an 8-bit unsigned char?

So I have to count the set bits (those that are 1) of an unsigned char variable in C.
A similar question is How to count the number of set bits in a 32-bit integer?, but it uses an algorithm that's not easily adaptable to 8-bit unsigned chars (or at least it's not apparent how).
The algorithm suggested in the question How to count the number of set bits in a 32-bit integer? is trivially adapted to 8 bit:
int NumberOfSetBits( uint8_t b )
{
    b = b - ((b >> 1) & 0x55);
    b = (b & 0x33) + ((b >> 2) & 0x33);
    return (((b + (b >> 4)) & 0x0F) * 0x01);
}
It is simply a case of shortening the constants to the least significant eight bits and removing the final 24-bit right shift. Equally it could be adapted for 16-bit using an 8-bit shift, as shown below. Note that in the 8-bit case, the mechanical adaptation of the 32-bit algorithm results in a redundant * 0x01 which could be omitted.
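For example, a sketch of the same adaptation for 16-bit values, following the same pattern (an assumption of how it would look, not taken from the linked answer):
#include <stdint.h>

int NumberOfSetBits16( uint16_t b )
{
    b = b - ((b >> 1) & 0x5555);
    b = (b & 0x3333) + ((b >> 2) & 0x3333);
    b = (b + (b >> 4)) & 0x0F0F;     /* per-byte counts */
    return (b + (b >> 8)) & 0x1F;    /* sum of the two byte counts */
}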
The fastest approach for an 8-bit variable is using a lookup table.
Build an array of 256 values, one per 8-bit combination. Each value should contain the count of bits in its corresponding index:
int bit_count[] = {
    // 00 01 02 03 04 05 06 07 08 09 0a ... FE FF
       0,  1,  1,  2,  1,  2,  2,  3,  1,  2,  2, ..., 7, 8
};
Getting a count of a combination is the same as looking up a value from the bit_count array. The advantage of this approach is that it is very fast.
You can generate the array using a simple program that counts bits one by one in a slow way:
for (int i = 0 ; i != 256 ; i++) {
    int count = 0;
    for (int p = 0 ; p != 8 ; p++) {
        if (i & (1 << p)) {
            count++;
        }
    }
    printf("%d, ", count);
}
(demo that generates the table).
If you would like to trade some CPU cycles for memory, you can use a 16-byte lookup table for two 4-bit lookups:
static const char split_lookup[] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

int bit_count(unsigned char n) {
    return split_lookup[n&0xF] + split_lookup[n>>4];
}
Demo.
I think you are looking for the Hamming weight algorithm for 8 bits?
If that is the case, here is the code:
unsigned char in = 22; // This is your input number
unsigned char out = 0;
in = in - ((in >> 1) & 0x55);
in = (in & 0x33) + ((in >> 2) & 0x33);
out = (((in + (in >> 4)) & 0x0F) * 0x01); // note: the final mask must apply to the whole sum
Counting the number of bits that are not 0 is also known as computing the Hamming weight. In this case, you are counting the number of 1's.
Dasblinkenlight provided you with a table-driven implementation, and Olaf provided you with a software-based solution. I think you have two other potential solutions. The first is to use a compiler extension; the second is to use an ASM-specific instruction with inline assembly from C.
For the first alternative, see GCC's __builtin_popcount(). (Thanks to Artless Noise).
For the second alternative, you did not specify the embedded processor, but I'm going to offer this in case it's ARM-based.
Some ARM processors have the VCNT instruction, which performs the count for you. So you could do it from C with inline assembly:
inline
unsigned int hamming_weight(unsigned char value) {
    unsigned int result;
    /* Illustrative sketch only: duplicate the byte into a NEON D register,
       count bits per lane with VCNT.8, then move lane 0 back out. */
    __asm__ (
        "vdup.8  d0, %1     \n\t"
        "vcnt.8  d0, d0     \n\t"
        "vmov.u8 %0, d0[0]  \n\t"
        : "=r" (result)
        : "r" (value)
        : "d0"
    );
    return result;
}
Also see Fastest way to count number of 1s in a register, ARM assembly.
For completeness, here is Kernighan's bit counting algorithm:
int count_bits(int n) {
    int count = 0;
    while (n != 0) {
        n &= (n - 1);   // clears the lowest set bit, so the loop runs once per set bit
        count++;
    }
    return count;
}
Also see Please explain the logic behind Kernighan's bit counting algorithm.
I made an optimized version. With a 32-bit processor, utilizing multiplication, bit shifting and masking can make smaller code for the same task, especially when the input domain is small (8-bit unsigned integer).
The following two code snippets are equivalent:
unsigned int bit_count_uint8(uint8_t x)
{
    uint32_t n;
    n = (uint32_t)(x * 0x08040201UL);
    n = (uint32_t)(((n >> 3) & 0x11111111UL) * 0x11111111UL);
    /* The "& 0x0F" will be optimized out but I add it for clarity. */
    return (n >> 28) & 0x0F;
}

/*
unsigned int bit_count_uint8_traditional(uint8_t x)
{
    x = x - ((x >> 1) & 0x55);
    x = (x & 0x33) + ((x >> 2) & 0x33);
    x = ((x + (x >> 4)) & 0x0F);
    return x;
}
*/
This produces the smallest binary code for IA-32, x86-64 and AArch32 (without the NEON instruction set) as far as I can find.
For x86-64, this doesn't use the fewest instructions, but the bit shifts and downcasting avoid the use of 64-bit instructions and therefore save a few bytes in the compiled binary.
Interestingly, in IA-32 and x86-64, a variant of the above algorithm using a modulo ((((uint32_t)(x * 0x08040201U) >> 3) & 0x11111111U) % 0x0F) actually generates larger code, due to a requirement to move the remainder register for return value (mov eax,edx) after the div instruction. (I tested all of these in Compiler Explorer)
Explanation
I denote the eight bits of the byte x, from MSB to LSB, as a, b, c, d, e, f, g and h.
abcdefgh
* 00001000 00000100 00000010 00000001 (make 4 copies of x
--------------------------------------- with appropriate
abc defgh0ab cdefgh0a bcdefgh0 abcdefgh bit spacing)
>> 3
---------------------------------------
000defgh 0abcdefg h0abcdef gh0abcde
& 00010001 00010001 00010001 00010001
---------------------------------------
000d000h 000c000g 000b000f 000a000e
* 00010001 00010001 00010001 00010001
---------------------------------------
000d000h 000c000g 000b000f 000a000e
... 000h000c 000g000b 000f000a 000e
... 000c000g 000b000f 000a000e
... 000g000b 000f000a 000e
... 000b000f 000a000e
... 000f000a 000e
... 000a000e
... 000e
^^^^ (Bits 31-28 will contain the sum of the bits
a, b, c, d, e, f, g and h. Extract these
bits and we are done.)
Maybe not the fastest, but straightforward:
int count = 0;
for (int i = 0; i < 8; ++i) {
    unsigned char c = 1 << i;
    if (yourVar & c) {
        // bit n°i is set
        // first bit is bit n°0
        count++;
    }
}
For 8/16 bit MCUs, a loop will very likely be faster than the parallel-addition approach, as these MCUs cannot shift by more than one bit per instruction, so:
size_t popcount(uint8_t val)
{
    size_t cnt = 0;

    do {
        cnt += val & 1U;    // or: if ( val & 1 ) cnt++;
    } while ( val >>= 1 );

    return cnt;
}
For the incrementation of cnt, you might profile both variants. If it is still too slow, an assembler implementation using the carry flag (if available) might be worth a try. While I am against assembler optimizations in general, such algorithms are one of the few good exceptions (and even then only after the C version proves too slow).
If you can spare the Flash, a lookup table as proposed by @dasblinkenlight is likely the fastest approach.
Just a hint: for some architectures (notably ARM and x86/64), gcc has a builtin, __builtin_popcount(), which you also might want to try if available (although it takes at least an int). This might use a single CPU instruction - you cannot get faster and more compact.
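A minimal usage sketch (GCC/Clang; the wrapper name popcount8 is just for illustration):
#include <stdint.h>

static inline unsigned int popcount8(uint8_t x)
{
    /* The uint8_t argument is promoted to int for the builtin; that is
       harmless here because the upper bits are zero. */
    return (unsigned int)__builtin_popcount(x);
}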
Allow me to post a second answer. This one is the smallest possible for ARM processors with Advanced SIMD extension (NEON). It's even smaller than __builtin_popcount() (since __builtin_popcount() is optimized for unsigned int input, not uint8_t).
#ifdef __ARM_NEON
/* ARM C Language Extensions (ACLE) recommends us to check __ARM_NEON before
   including <arm_neon.h> */
#include <arm_neon.h>

unsigned int bit_count_uint8(uint8_t x)
{
    /* Set all lanes at once so that the compiler won't emit an instruction to
       zero-initialize other lanes. */
    uint8x8_t v = vdup_n_u8(x);
    /* Count the number of set bits for each lane (8-bit) in the vector. */
    v = vcnt_u8(v);
    /* Get lane 0 and discard other lanes. */
    return vget_lane_u8(v, 0);
}
#endif
#endif

Reading double from binary file (byte order?)

I have a binary file, and I want to read a double from it.
In hex representation, I have these 8 bytes in a file (and then some more after that):
40 28 25 c8 9b 77 27 c9 40 28 98 8a 8b 80 2b d5 40 ...
This should correspond to a double value of around 10 (based on what that entry means).
I have used
#include <stdio.h>
#include <assert.h>

int main(int argc, char ** argv) {
    FILE * f = fopen(argv[1], "rb");
    assert(f != NULL);
    double a;
    fread(&a, sizeof(a), 1, f);
    printf("value: %f\n", a);
}
However, that prints
value: -261668255698743527401808385063734961309220864.000000
So clearly, the bytes are not converted into a double correctly. What is going on?
Using ftell, I could confirm that 8 bytes are being read.
Just like integer types, floating point types are subject to platform endianness. When I run this program on a little-endian machine:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

uint64_t byteswap64(uint64_t input)
{
    uint64_t output = (uint64_t) input;
    output = (output & 0x00000000FFFFFFFF) << 32 | (output & 0xFFFFFFFF00000000) >> 32;
    output = (output & 0x0000FFFF0000FFFF) << 16 | (output & 0xFFFF0000FFFF0000) >> 16;
    output = (output & 0x00FF00FF00FF00FF) << 8  | (output & 0xFF00FF00FF00FF00) >> 8;
    return output;
}

int main()
{
    uint64_t bytes = 0x402825c89b7727c9;
    double a;
    memcpy(&a, &bytes, sizeof a);   /* reinterpret the bits; avoids the strict-aliasing issues of *(double*)&bytes */
    printf("%f\n", a);

    bytes = byteswap64(bytes);
    memcpy(&a, &bytes, sizeof a);
    printf("%f\n", a);

    return 0;
}
Then the output is
12.073796
-261668255698743530000000000000000000000000000.000000
This shows that your data is stored in the file in big-endian format, while your platform is little-endian (the value you got matches the byte-swapped interpretation). So, you need to perform a byte swap after reading the value. The code above shows how to do that.
Endianness is a convention. Reader and writer should agree on what endianness to use and stick to it.
You should read your number as a uint64_t, convert the endianness, and then reinterpret the bits as a double (for example with memcpy); a plain value cast would convert the integer's numeric value rather than its bit pattern.
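A minimal sketch of that approach, assuming the file stores IEEE-754 doubles most significant byte first (as the dump above suggests); the function name read_be_double is just illustrative:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Read one big-endian IEEE-754 double from f; returns 0 on success. */
static int read_be_double(FILE *f, double *out)
{
    uint8_t b[8];
    if (fread(b, 1, sizeof b, f) != sizeof b)
        return -1;

    uint64_t bits = 0;
    for (int i = 0; i < 8; i++)        /* first byte read is the most significant */
        bits = (bits << 8) | b[i];

    memcpy(out, &bits, sizeof *out);   /* reinterpret the bit pattern, not a value cast */
    return 0;
}
Because the byte order is assembled explicitly, this reads the same value on little-endian and big-endian hosts.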
