How to read binary inputs from a file in C

What I need to do is read binary inputs from a file. The inputs are, for example (binary dump):
00000000 00001010 00000100 00000001 10000101 00000001 00101100 00001000 00111000 00000011 10010011 00000101
What I did is,
char* filename = vargs[1];
BYTE buffer;
FILE *file_ptr = fopen(filename,"rb");
fseek(file_ptr, 0, SEEK_END);
size_t file_length = ftell(file_ptr);
rewind(file_ptr);
for (int i = 0; i < file_length; i++)
{
    fread(&buffer, 1, 1, file_ptr); // read 1 byte
    printf("%d ", (int)buffer);
}
The problem is that I need to divide those binary inputs in certain ways so that I can use them as commands (e.g. 101 in the input means add two numbers).
But when I run the program with the code I wrote, it gives me output like:
0 0 10 4 1 133 1 44 8 56 3 147 6
which shows them as ASCII numbers.
How can I read the inputs as binary numbers, not ASCII numbers?
The inputs should be used in this way:
0 # Padding for the whole file!
0000|0000 # Function 0 with 0 arguments
00000101|00|000|01|000 # MOVE the value 5 to register 0 (000 is MOV function)
00000011|00|001|01|000 # MOVE the value 3 to register 1
000|01|001|01|100 # ADD registers 0 and 1 (100 is ADD function)
000|01|0000011|10|000 # MOVE register 0 to 0x03
0000011|10|010 # POP the value at 0x03
011 # Return from the function
00000110 # 6 instructions in this function
I am trying to implement something like assembly language commands
Can someone please help me out with this problem?
Thanks!

You need to understand the difference between data and its representation. You are correctly reading the data in binary. When you print the data, printf() gives the decimal representation of the binary data. Note that 00001010 in binary is the same as 10 in decimal and 00000100 in binary is 4 in decimal. If you convert each sequence of bits into its decimal value, you will see that the output is exactly correct. You seem to be confusing the representation of the data as it is output with how the data is read and stored in memory. These are two different and distinct things.
The next step to solve your problem is to learn about bitwise operators: |, &, ~, >>, and <<. Then use the appropriate combination of operators to extract the data you need from the stream of bits.
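For instance, here is a minimal sketch of pulling fields out of one byte with a shift and a mask (the field widths are made up for illustration; your instruction layout defines the real ones):
#include <stdio.h>

int main(void)
{
    unsigned char byte = 0x2C; /* 00101100, one byte as read from the file */

    /* Hypothetical layout: low 3 bits = opcode, next 2 bits = addressing mode */
    unsigned opcode = byte & 0x07;        /* mask off bits 0-2 */
    unsigned mode   = (byte >> 3) & 0x03; /* shift bits 3-4 down, then mask */

    printf("opcode=%u mode=%u\n", opcode, mode); /* prints: opcode=4 mode=1 */
    return 0;
}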

The format you use is not byte-aligned, so you need to read your bits into a circular buffer and parse it with a state machine.
Reading "in binary" or "in text" is essentially the same thing; the only thing that changes is your interpretation of the data. In your example you are reading a byte and printing the decimal value of that byte. But you want the bits of that byte, and to get them you just need C's bitwise operators.
For example:
#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

struct binary_circle_buffer {
    size_t i;             /* next bit index; CHAR_BIT means the buffer is empty */
    unsigned char buffer; /* the byte currently being consumed */
};

bool read_bit(struct binary_circle_buffer *bcn, FILE *file, bool *bit) {
    if (bcn->i == CHAR_BIT) { /* buffer exhausted, fetch the next byte */
        size_t ret = fread(&bcn->buffer, sizeof bcn->buffer, 1, file);
        if (!ret) {
            return false;
        }
        bcn->i = 0;
    }
    *bit = bcn->buffer & ((unsigned char)1 << bcn->i++); // least-significant bit first; maybe the wrong order, you should test yourself
    // *bit = bcn->buffer & (((unsigned char)UCHAR_MAX / 2 + 1) >> bcn->i++); // most-significant bit first
    return true;
}

int main(void)
{
    struct binary_circle_buffer bcn = { .i = CHAR_BIT };
    FILE *file = stdin; // replace with your file
    bool bit;
    size_t i = 0;
    while (read_bit(&bcn, file, &bit)) {
        // here you must code your state machine to parse instructions, gl & hf
        printf(bit ? "1" : "0");
        if (++i >= CHAR_BIT) { // group the output byte by byte
            i = 0;
            printf(" ");
        }
    }
}
Helping you more would be difficult; you are basically asking for help writing a virtual machine...
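As a rough illustration of that state-machine idea (this fragment leans on read_bit() and struct binary_circle_buffer from above, and the 3-bit opcode width is invented; substitute the widths from your own instruction format):
/* Sketch only: assumes read_bit() and struct binary_circle_buffer from above. */
enum parser_state { WANT_OPCODE, WANT_OPERANDS };

void parse(FILE *file)
{
    struct binary_circle_buffer bcn = { .i = CHAR_BIT };
    enum parser_state state = WANT_OPCODE;
    unsigned value = 0;  /* bits accumulated for the current field */
    size_t nbits = 0;
    bool bit;

    while (read_bit(&bcn, file, &bit)) {
        value = (value << 1) | bit; /* accumulate most-significant bit first */
        nbits++;
        if (state == WANT_OPCODE && nbits == 3) {
            /* dispatch on the 3-bit opcode here, e.g. 100 = ADD in your encoding */
            state = WANT_OPERANDS;
            value = 0;
            nbits = 0;
        }
        /* ... further states would consume the operand fields ... */
    }
}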

Related

Value of CRC changing every time program is run

I am writing a CLI utility in C that analyzes PNG files and outputs data about them. More specifically, it prints out the length, CRC and type values of each chunk in the PNG file. I am using the official specification for the PNG file format, and it says that each chunk has a CRC value encoded in it for data integrity.
My tool runs fine and outputs the correct values for length and type, and what appears to be a correct value for the CRC (as in, it is formatted as 4-byte hexadecimal). The only problem is that every time I run the program, the value of the CRC changes. Is this normal, and if not, what could be causing it?
Here is the main part of the code
CHUNK chunk;
BYTE buffer;
int i = 1;
while (chunk.type != 1145980233) { // 1145980233 is a magic number that signals our program that the IEND chunk
                                   // has been reached; it is just the decimal equivalent of 'IEND'
    printf("============\nCHUNK: %i\n", i);

    // Read LENGTH value; we have to buffer and then append to length hexdigit-by-hexdigit to account for
    // reversals of byte-order when reading infile (I'm not sure why this reversal only happens here)
    for (unsigned j = 0; j < 4; ++j) {
        fread(&buffer, 1, sizeof(BYTE), png);
        chunk.length = (chunk.length | buffer)<<8; // If length is 0b4e and buffer is 67 this makes sure that length
                                                   // ends up 0b4e67 and not 0b67
    }
    chunk.length = chunk.length>>8; // Above bitshifting ends up adding an extra 00 to end of length
                                    // This gets rid of that
    printf("LENGTH: %u\n", chunk.length);

    // Read TYPE value
    fread(&chunk.type, 4, sizeof(BYTE), png);

    // Print out TYPE in chars
    printf("TYPE: ");
    printf("%c%c%c%c\n", chunk.type & 0xff, (chunk.type & 0xff00)>>8, (chunk.type & 0xff0000)>>16, (chunk.type & 0xff000000)>>24);

    // Allocate LENGTH bytes of memory for data
    chunk.data = calloc(chunk.length, sizeof(BYTE));

    // Populate DATA
    for (unsigned j = 0; j < chunk.length; ++j) {
        fread(&buffer, 1, sizeof(BYTE), png);
    }

    // Read CRC value
    for (unsigned j = 0; j < 4; ++j) {
        fread(&chunk.crc, 1, sizeof(BYTE), png);
    }
    printf("CRC: %x\n", chunk.crc);
    printf("\n");
    i++;
}
Here are some preprocessor directives and global variables:
#define BYTE uint8_t

typedef struct {
    uint32_t length;
    uint32_t type;
    uint32_t crc;
    BYTE* data;
} CHUNK;
Here are some examples of the output I am getting:
Run 1 -
============
CHUNK: 1
LENGTH: 13
TYPE: IHDR
CRC: 17a6a400
============
CHUNK: 2
LENGTH: 2341
TYPE: iCCP
CRC: 17a6a41e
Run 2 -
============
CHUNK: 1
LENGTH: 13
TYPE: IHDR
CRC: 35954400
============
CHUNK: 2
LENGTH: 2341
TYPE: iCCP
CRC: 3595441e
Run 3 -
============
CHUNK: 1
LENGTH: 13
TYPE: IHDR
CRC: 214b0400
============
CHUNK: 2
LENGTH: 2341
TYPE: iCCP
CRC: 214b041e
As you can see, the CRC values are different each time, yet within each run they are all fairly similar, whereas my intuition tells me this should not be the case: the CRC values should not be changing at all.
Just to make sure, I also ran
$ cat test.png > file1
$ cat test.png > file2
$ diff -s file1 file2
Files file1 and file2 are identical
so accessing the file at two different times doesn't change the CRC values in it, as expected.
Thanks,
This:
fread(&chunk.crc, 1, sizeof(BYTE), png);
keeps overwriting the first byte of chunk.crc with each byte read from the file. The other three bytes of chunk.crc are never written, and so you are seeing whatever was randomly in memory at those locations when your program started. You will note that the 00 and 1e at the ends are consistent, since that is the one byte that is actually being written.
Same problem with this in your data reading loop:
fread(&buffer, 1, sizeof(BYTE), png);
An unrelated error is that you are accumulating bytes in a 32-bit integer thusly:
chunk.length = (chunk.length | buffer)<<8;
and then after the end of that loop, rolling it back down:
chunk.length = chunk.length>>8;
That will always discard the most significant byte of the length, since you push it off the top of the 32 bits, and then roll eight zero bits back down in its place. Instead you need to do it like this:
chunk.length = (chunk.length << 8) | buffer;
and then all 32 bits are retained, and you don't need to fix it at the end.
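For illustration, the length-reading loop with that fix applied would read:
chunk.length = 0; /* start clean; four shifts of a uint32_t push the old bits out anyway, but this is clearer */
for (unsigned j = 0; j < 4; ++j) {
    fread(&buffer, 1, sizeof(BYTE), png);
    chunk.length = (chunk.length << 8) | buffer; /* shift previous bytes up, append the new one */
}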
This is a bad idea:
fread(&chunk.type, 4, sizeof(BYTE), png);
because it is not portable. What you end up with in chunk.type depends on the endianess of the architecture it is running on. For "IHDR", you will get 0x52444849 on a little-endian machine, and 0x49484452 on a big-endian machine.
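A portable alternative (a sketch; the helper name read_be32 is invented) is to read the four bytes one at a time and assemble them in a defined order, just as with the length field:
#include <stdio.h>
#include <stdint.h>

/* Read a 4-byte big-endian integer the same way on any host.
 * Error handling (EOF, short reads) omitted for brevity. */
static uint32_t read_be32(FILE *f)
{
    uint32_t value = 0;
    for (unsigned j = 0; j < 4; ++j) {
        int c = fgetc(f);                        /* most significant byte first */
        value = (value << 8) | (unsigned char)c; /* append the byte at the bottom */
    }
    return value;
}
With this, chunk.crc = read_be32(png); yields the same value on little- and big-endian machines, and reading the type this way always gives 0x49484452 for "IHDR".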

Making conversion similar to BN_hex2bn + BN_bn2bin without openSSL in C

I currently use OpenSSL to convert values from an encrypted string to what I thought was a binary array. I then decrypt this "array" (pass it to EVP_DecryptUpdate). I make the conversion like this:
BIGNUM *bnEncr = BN_new();
if (0 == BN_hex2bn(&bnEncr, encrypted)) { // from hex to big number
    printf("ERROR\n");
}
unsigned int numOfBytesEncr = BN_num_bytes(bnEncr);
unsigned char encrBin[numOfBytesEncr];
if (0 == BN_bn2bin(bnEncr, encrBin)) { // from big number to binary
    printf("ERROR\n");
}
Then I pass encrBin to EVP_DecryptUpdate and decryption works.
I do this in many places in my code and now want to write my own C function that converts hex to a binary array, which I can then pass to EVP_DecryptUpdate. I had a go at this and converted my encrypted hex string to an array of 0s and 1s, but it turns out that EVP_DecryptUpdate won't work with that. From what I could find online, BN_bn2bin "creates a representation that is truly binary (i.e. a sequence of bits). More specifically, it creates a big-endian representation of the number." So this is not just an array of 0s and 1s, right?
Can someone explain how I can make the hex->(truly) binary conversion myself in C, so I would get the format that EVP_DecryptUpdate expects? Is this complicated?
BN_bn2bin "creates a representation that is truly binary (i.e. a
sequence of bits). More specifically, it creates a big-endian
representation of the number." So this is not just an array of 0s and
1s, right?
The sequence of bits mentioned here is represented as an array of bytes. With each of those bytes containing 8 bits, this can be interpreted as an "array of 0s and 1s". It is not an "array of integers that have the value 0 or 1", if that is what you are asking.
Since you are unclear about the workings of BN_bn2bin(), it helps to just analyze the end result of your code snippet. You could do that like this (omitting any error checking):
#include <stdio.h>
#include <openssl/bn.h>

int main(
    int argc,
    char **argv)
{
    const char *hexString = argv[1];
    BIGNUM *bnEncr = BN_new();
    BN_hex2bn(&bnEncr, hexString);
    unsigned int numOfBytesEncr = BN_num_bytes(bnEncr);
    unsigned char encrBin[numOfBytesEncr];
    BN_bn2bin(bnEncr, encrBin);
    fwrite(encrBin, 1, numOfBytesEncr, stdout);
}
This writes the raw contents of encrBin to standard output, which is not pleasant to view directly, but you can pipe it through a tool like hexdump, or redirect it to a file for analysis with a hex editor. It looks like this:
$ ./bntest 74162ac74759e85654e0e7762c2cdd26 | hexdump -C
00000000 74 16 2a c7 47 59 e8 56 54 e0 e7 76 2c 2c dd 26 |t.*.GY.VT..v,,.&|
00000010
Or, if you do want to see those 0s and 1s:
$ ./bntest 74162ac74759e85654e0e7762c2cdd26 | xxd -b -c 4
00000000: 01110100 00010110 00101010 11000111 t.*.
00000004: 01000111 01011001 11101000 01010110 GY.V
00000008: 01010100 11100000 11100111 01110110 T..v
0000000c: 00101100 00101100 11011101 00100110 ,,.&
This shows that your question
Can someone explain how I can make the hex->(truly) binary conversion myself in C, so I would get the format that EVP_DecryptUpdate expects? Is this complicated?
is essentially the same as the SO question How to turn a hex string into an unsigned char array?, like I commented.
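For reference, a minimal sketch of such a conversion (no error handling; assumes a valid, even-length hex string):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert a hex string like "74162a..." into raw bytes.
 * Returns a malloc'd buffer of strlen(hex)/2 bytes; the caller frees it. */
static unsigned char *hex2bin(const char *hex, size_t *outlen)
{
    size_t n = strlen(hex) / 2;
    unsigned char *buf = malloc(n);
    for (size_t i = 0; i < n; i++) {
        unsigned int byte;
        sscanf(hex + 2 * i, "%2x", &byte); /* two hex digits -> one byte */
        buf[i] = (unsigned char)byte;
    }
    *outlen = n;
    return buf;
}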
It is unclear why you want this, and it's definitely not advisable to roll your own implementation of the conversion functions (they may stop working with any number of internal changes to OpenSSL) but if you're interested in what it looks like:
static int bn2binpad(const BIGNUM *a, unsigned char *to, int tolen)
{
    int n;
    size_t i, lasti, j, atop, mask;
    BN_ULONG l;

    /*
     * In case |a| is fixed-top, BN_num_bytes can return bogus length,
     * but it's assumed that fixed-top inputs ought to be "nominated"
     * even for padded output, so it works out...
     */
    n = BN_num_bytes(a);
    if (tolen == -1) {
        tolen = n;
    } else if (tolen < n) {     /* uncommon/unlike case */
        BIGNUM temp = *a;
        bn_correct_top(&temp);
        n = BN_num_bytes(&temp);
        if (tolen < n)
            return -1;
    }

    /* Swipe through whole available data and don't give away padded zero. */
    atop = a->dmax * BN_BYTES;
    if (atop == 0) {
        OPENSSL_cleanse(to, tolen);
        return tolen;
    }

    lasti = atop - 1;
    atop = a->top * BN_BYTES;
    for (i = 0, j = 0, to += tolen; j < (size_t)tolen; j++) {
        l = a->d[i / BN_BYTES];
        mask = 0 - ((j - atop) >> (8 * sizeof(i) - 1));
        *--to = (unsigned char)(l >> (8 * (i % BN_BYTES)) & mask);
        i += (i - lasti) >> (8 * sizeof(i) - 1); /* stay on last limb */
    }

    return tolen;
}

Binary file that can only be printed in hex format, but not binary format

Here is a binary file that contains:
0xff 0xff 0xff
which is exactly three bytes.
I tried to use the dump_file function here:
#include "table.h"
#include "debug.h"
typedef unsigned int Code
void dump_file( char* fileName[] )
{
char c;
for (int i = 0; i < 4; ++i)
{
log_info("File: %s",fileName[i]);
FILE* file = fopen(fileName[i],"rb");
fread(&c,sizeof(char),1,file);
while( !feof(file) ){
dump_code( c , 8 );
fread(&c,sizeof(char),1,file);
}
}
}
void dump_code( Code code,int BitsNum )
{
int mask = 1 << (BitsNum-1);
for (int i = 0; i < BitsNum ; ++i)
{
if(i%8==0)putchar('|');
putchar((mask & code) ? '1' : '0');
code <<= 1;
}
puts("");
}
to print the file in binary format, but it prints nothing. (Somehow it bumps into EOF in an undesirable manner??)
I also used the Unix utility xxd.
When I tell xxd to print my file in binary, it prints nothing. But if I choose to print it hexadecimally, it prints as expected. What's wrong with this file?
This file is generated by a parser. The C program uses fseek to jump to various locations in a file and print the corresponding binary code. It might go like:
0th byte --> 1st byte --> 3rd byte --> 5th byte --> 2nd byte --> 4th byte --> 6th byte
It is guaranteed that there is no "leak" in the resulting file, i.e, every byte will be traversed.
What is the reason for this strange behavior?
Update 1
While samgak pointed out that this might be due to the interpretation of 0xff, some of my other experiments indicate that even a file containing:
0x01 0x01 0x01
results in the same phenomenon.
Update 2
Here's the relevant code that writes a Code into the files:
#define CODE_FILE_NUM 3

void writeCode( FILE* out[], Code code ){
    for (int i = 0; i < CODE_FILE_NUM; ++i){
        fwrite(&code, sizeof(char), 1, out[i]);
        code >>= 8;
    }
}
Code is an unsigned int, which has 4 bytes. Function writeCode will only consider the lower 3 bytes, writing one byte to each of the 3 separate files.
I have found the reason.
It's because I forgot to close the output files.
I tried to dump unclosed binary files (that is, to open and read data from files that hadn't been closed), which resulted in unpredictable behavior.
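The likely mechanism (an assumption, since it depends on the C library's buffering): fwrite only fills stdio's in-memory buffer, and the bytes reach the file when the stream is flushed or closed. So after all the writeCode() calls the writer needs something like:
/* Flush and close every output stream so the buffered bytes actually
 * land on disk before anything tries to read the files back. */
for (int i = 0; i < CODE_FILE_NUM; ++i) {
    fclose(out[i]); /* fclose() flushes the stream's buffer, then closes it */
}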

Store zeros from ints and use them later

I have 3 sensors that each provide either 0 or 1 (repeatedly in a loop). They are stored individually as int variables. These are then printed using the following:
print ("%d%d%d", Sensor1, Sensor2, Sensor3);
I want to store each combination (ex: 010, 001, 110, etc.) temporarily so that I can use it do something else (I want to have a switch or something eventually where I can do a different operation depending on the value of the sensor combination). I can't store it as an int since that drops the 0s in front.
How can I store these combinations?
You can use a structure bit-field for this.
#include <stdio.h>
#include <stdbool.h>

struct Bit{
    bool Sensor1 : 1;
    bool Sensor2 : 1;
    bool Sensor3 : 1;
};

int main(void)
{
    struct Bit bit = {0, 1, 0};
    printf("%d%d%d", bit.Sensor1, bit.Sensor2, bit.Sensor3);
}
So you have
int Sensor1, Sensor2, Sensor3;
// have code to initialize above variables to 0 or 1
To store these as one integer in base 10, assuming they really all are 0 or 1, you can do:
int Sensors_10 = Sensor1 * 100 + Sensor2 * 10 + Sensor3;
And then to get them back:
Sensor1 = Sensors_10 / 100 % 10;
Sensor2 = Sensors_10 / 10 % 10;
Sensor3 = Sensors_10 % 10;
Obviously order of sensors can be whatever, as long as it matches between packing and unpacking.
But, you only need 1 bit to store each sensor, so could use binary:
int Sensors_2 = Sensor1 * 4 + Sensor2 * 2 + Sensor3;
...
Sensor1 = Sensors_2 / 4 % 2;
Sensor2 = Sensors_2 / 2 % 2;
Sensor3 = Sensors_2 % 2;
But binary numbers are special on a computer, so the binary version is more commonly written like this:
int Sensors_2 = Sensor1 << 2 | Sensor2 << 1 | Sensor3;
...
Sensor1 = Sensors_2 >> 2 & 1;
Sensor2 = Sensors_2 >> 1 & 1;
Sensor3 = Sensors_2 & 1;
Here |, <<, >> and & are the bitwise OR, shift and AND operators. Explaining them fully is beyond the scope of this question, but one note about them: when there are no "overlapping" one-bits and the numbers are positive, the result of | is the same as the result of +.
haccks's answer covers how to make the C compiler do this for you, without doing your own bit manipulation.
To print Sensors_10 with leading zeros, you can do printf("%03d", Sensors_10);. The C standard library does not have a way to print binary numbers directly, so you need your own code to print the bits one by one; you might as well printf("%d%d%d", Sensor1, Sensor2, Sensor3); then.
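If you do want to print the packed value bit by bit, a minimal sketch (assuming the 3-bit Sensors_2 packing above) might be:
#include <stdio.h>

/* Print the low nbits bits of value, most significant first. */
static void print_bits(unsigned value, int nbits)
{
    for (int b = nbits - 1; b >= 0; b--)
        putchar((value >> b & 1) ? '1' : '0');
    putchar('\n');
}
Calling print_bits(Sensors_2, 3) then prints e.g. 010 with the leading zero kept.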
You can use a 2D int array to store the values and use it later.
E.g. int sen_value[1000][3]; use it in the loop to store the values.
Example of how you can use it in a loop:
#include <stdio.h>

int main ()
{
    int i;
    int sen_value[10][3];
    for(i=0;i<10;i++)
    {
        //Assigning the values
        sen_value[i][0] = 0;
        sen_value[i][1] = 0;
        sen_value[i][2] = 0;
        //Use the way you want
        printf("%d %d %d\n",sen_value[i][0],sen_value[i][1],sen_value[i][2]);
    }
    return 0;
}
Or you can use it just once and then reset it after each operation. For example:
#include <stdio.h>

int main ()
{
    int sen_value[1][3];
    //Assigning the values
    sen_value[0][0] = 0;
    sen_value[0][1] = 0;
    sen_value[0][2] = 0;
    //Use the way you want
    printf("%d %d %d\n",sen_value[0][0],sen_value[0][1],sen_value[0][2]);
    return 0;
}
If you are using a Linux environment, you can easily save the output displayed on your console by redirecting it.
Say sensor.c is your source file. Then:
$ gcc -o a sensor.c
$ ./a > senser.txt
Then you have all the output stored in a .txt file, and it can be used again as input to your other programs, like:
$ gcc -o other other.c
$ ./other < senser.txt
If you want to store those sensor1, sensor2, sensor3 values internally and use them internally, then you can simply use arrays or a structure, like:
#include <stdio.h>

int main(void){
    int Sensor[1][3];
    Sensor[0][0] = 0;
    Sensor[0][1] = 1;
    Sensor[0][2] = 0;
    printf("%d%d%d", Sensor[0][0], Sensor[0][1], Sensor[0][2]);
    return 0;
}
While the leading zeroes of an integer are not displayed when it is printed, that does not mean they are "dropped"; they are merely implicit. That is a matter of the format specifier used in the output of the value, not of the zeros being absent: an int is always a fixed number of binary digits.
Consider:
uint32_t sensor_state = (sensor3 << 2) | (sensor2 << 1) | sensor1 ;
Note that uint32_t is a type alias for an unsigned integer 32 bits in length. It is defined by including the <stdint.h> header file. In this case a plain int would work, but when you are dealing with data at the bit level it is good to be explicit (and unsigned). Here of course a uint8_t would work too, and if your target is an 8 bit device, I suggest you use that.
Here sensor_state is a binary combination of the three sensor values and will have one of the following values:
Sensors       sensor_state
3  2  1    binary  decimal  hexadecimal
---------------------------------------
0  0  0     0000      0        0x00
0  0  1     0001      1        0x01
0  1  0     0010      2        0x02
0  1  1     0011      3        0x03
1  0  0     0100      4        0x04
1  0  1     0101      5        0x05
1  1  0     0110      6        0x06
1  1  1     0111      7        0x07
So you can switch on any combination:
switch( sensor_state )
{
    case 0x00 :
        ...
        break ;

    case 0x01 :
        ...
        break ;

    case 0x02 :
        ...
        break ;

    ...

    case 0x07 :
        ...
        break ;

    default :
        // unexpected invalid combination
        break ;
}
You might usefully create an enumeration for each combination:
enum eSensorStates
{
    NO_SENSOR = 0,
    SENSOR1,
    SENSOR2,
    SENSOR12,
    SENSOR3,
    SENSOR13,
    SENSOR23,
    SENSOR123
};
Then you can write:
switch( sensor_state )
{
    case NO_SENSOR :
        ...
        break ;

    case SENSOR1:
        ...
        break ;

    case SENSOR2:
        ...
        break ;

    ...

    case SENSOR123 :
        ...
        break ;

    default :
        // unexpected invalid combination
        break ;
}
You may of course use enumeration names that make specific sense in your application - that reflect the meaning or action for each combination rather than the generic names I have chosen.
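For instance (purely illustrative names, under the assumption that the sensors are a door switch, a motion detector and a smoke alarm):
/* Hypothetical application-specific names for the same eight combinations. */
enum eAlarmStates
{
    ALL_CLEAR      = 0, /* no sensor active   */
    DOOR_OPEN      = 1, /* sensor 1 only      */
    MOTION         = 2, /* sensor 2 only      */
    INTRUSION      = 3, /* door + motion      */
    SMOKE          = 4, /* sensor 3 only      */
    FIRE_EVACUATE  = 5, /* smoke + door       */
    FIRE_CONFIRMED = 6, /* smoke + motion     */
    FULL_ALERT     = 7  /* everything at once */
};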

how to disable fast frames in ath5k wireless driver

By default, fast frames are enabled in ath5k. (http://wireless.kernel.org/en/users/Drivers/ath5k)
I have found the macro which disables it
#define AR5K_EEPROM_FF_DIS(_v) (((_v) >> 2) & 0x1)
The question is what do I do with it?
Do I replace the above line with
#define AR5K_EEPROM_FF_DIS(_v) 1
?
Do I compile it passing some parameter?
The bit-shift expression confuses me. Is _v a variable?
The question is more general: how does one deal with such macros in drivers? I've seen them in other code too and always got confused.
OK, I will try to explain with a simplified example.
#include <stdio.h>

/* Just for printing in binary mode */
char *chartobin(unsigned char c)
{
    static char a[9];
    int i;
    for (i = 0; i < 8; i++)
        a[7 - i] = (c & (1 << i)) == (1 << i) ? '1' : '0';
    a[8] = '\0';
    return a;
}

int main(void)
{
    unsigned char u = 0xf;

    printf("%s\n", chartobin(u));
    u >>= 2; // Shift bits 2 positions (to the right)
    printf("%s\n", chartobin(u));
    printf("%s\n", chartobin(u & 0x1)); // Check if the last bit is on
    return 0;
}
Output:
00001111
00000011
00000001
Do I replace the above line with #define AR5K_EEPROM_FF_DIS(_v) 1?
Nooooo!!
If you initialize u with 0xb instead of 0xf you get:
00001011
00000010
00000000
As you can see, (((_v) >> 2) & 0x1) is not always 1, so you cannot simply replace the macro with 1.
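To see how such a macro is consumed, here is a self-contained sketch (ee_flags and its sample value are invented; in the driver the word would come from the device's EEPROM):
#include <stdio.h>

#define AR5K_EEPROM_FF_DIS(_v) (((_v) >> 2) & 0x1)

int main(void)
{
    unsigned int ee_flags = 0x04; /* sample value with bit 2 set */

    if (AR5K_EEPROM_FF_DIS(ee_flags))
        printf("fast frames disabled in EEPROM\n");
    else
        printf("fast frames allowed in EEPROM\n");
    return 0;
}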
Fast frames are not enabled or used on ath5k. It's a feature allowing the card to send multiple frames at once (think of it as an early version of 11n frame aggregation) that's implemented on MadWiFi and their proprietary drivers and can only be used with an Access Point that also supports it. What you see there is a flag stored at the device's EEPROM that instructs the driver if fast frames can be used or not, that macro you refer to just checks if that flag is set. You can modify the header file to always return 1 but that wouldn't make any difference, the driver never uses that information.
