I need a function that prints the binary representation of a file it has read, like the xxd program in Unix, but I want to write my own. Hexadecimal works just fine with %x, but there is no built-in format for binary. Does anyone know how to do this?
I usually do not believe in answering these sorts of questions with full code implementations; however, I was handed this bit of code many years ago and I feel obligated to pass it on. I have removed all the comments except for the usage, so you can try to figure out how it works yourself.
Code base 16
#include <stdio.h>
#include <ctype.h>
// Takes a pointer to an arbitrary chunk of data and prints the first len bytes.
void dump (void* data, unsigned int len)
{
printf ("Size: %u\n", len);
if (len > 0) {
unsigned width = 16;
char *str = (char *)data;
unsigned int j, i = 0;
while (i < len) {
printf (" ");
for (j = 0; j < width; j++) {
if (i + j < len)
printf ("%02x ", (unsigned char) str [j]);
else
printf ("   ");
if ((j + 1) % (width / 2) == 0)
printf (" - ");
}
for (j = 0; j < width; j++) {
if (i + j < len)
printf ("%c", isprint ((unsigned char) str [j]) ? str [j] : '.');
else
printf (" ");
}
str += width;
i += j;
printf ("\n");
}
}
}
Output base 16 (Excerpt from first 512 bytes* of a flash video)
Size: 512
00 00 00 20 66 74 79 70 - 69 73 6f 6d 00 00 02 00 - ... ftypisom....
69 73 6f 6d 69 73 6f 32 - 61 76 63 31 6d 70 34 31 - isomiso2avc1mp41
00 06 e8 e6 6d 6f 6f 76 - 00 00 00 6c 6d 76 68 64 - ....moov...lmvhd
00 00 00 00 7c 25 b0 80 - 7c 25 b0 80 00 00 03 e8 - ....|%..|%......
00 0c d6 2a 00 01 00 00 - 01 00 00 00 00 00 00 00 - ...*............
00 00 00 00 00 01 00 00 - 00 00 00 00 00 00 00 00 - ................
00 00 00 00 00 01 00 00 - 00 00 00 00 00 00 00 00 - ................
00 00 00 00 40 00 00 00 - 00 00 00 00 00 00 00 00 - ....@...........
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 - ................
00 01 00 02 00 01 9f 38 - 74 72 61 6b 00 00 00 5c - .......8trak...\
I assume you already know how to tell the size of a file and read a file in binary mode, so I will leave that out of the discussion. Depending on your terminal width you may need to adjust the variable: width -- the code is currently designed for 80 character terminals.
I am also assuming that when you mentioned xxd in conjunction with "binary" you meant non-text as opposed to base 2. If you want base 2, set width to 6 and replace printf ("%02x ", (unsigned char) str [j]); with this:
{
for (int k = 7; k >= 0; k--)
printf ("%d", ((unsigned char)str [j] >> k) & 1);
printf (" ");
}
The required change is pretty simple: shift each of the 8 bits of the octet down in turn and mask off all but the least-significant bit. Remember to iterate from bit 7 down to bit 0, an order that seems counter-intuitive at first, since we print left-to-right.
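The shift-and-mask step can also be pulled out into a small helper, which makes it easier to check on its own. This is only a sketch: byte_to_bits is a hypothetical name that does not appear in the code above, and it assumes 8-bit bytes.

```c
/* Hypothetical helper (not part of the original answer): writes the eight
   bits of `byte` into `out` as '0'/'1' characters, most significant first.
   `out` needs room for 9 chars: 8 digits plus the terminating NUL. */
void byte_to_bits(unsigned char byte, char out[9])
{
    for (int k = 7; k >= 0; k--) {
        /* Shift bit k down to position 0, then mask off everything else. */
        out[7 - k] = (char)('0' + ((byte >> k) & 1));
    }
    out[8] = '\0';
}
```

For example, calling byte_to_bits(0x66, bits) on a char bits[9] leaves the string 01100110 in bits, which matches the 'f' in the base 2 output.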
Output base 2 (Excerpt from first 512 bytes* of a flash video)
Size: 512
00000000 00000000 00000000 - 00100000 01100110 01110100 - ... ft
01111001 01110000 01101001 - 01110011 01101111 01101101 - ypisom
00000000 00000000 00000010 - 00000000 01101001 01110011 - ....is
01101111 01101101 01101001 - 01110011 01101111 00110010 - omiso2
01100001 01110110 01100011 - 00110001 01101101 01110000 - avc1mp
00110100 00110001 00000000 - 00000110 11101000 11100110 - 41....
01101101 01101111 01101111 - 01110110 00000000 00000000 - moov..
00000000 01101100 01101101 - 01110110 01101000 01100100 - .lmvhd
00000000 00000000 00000000 - 00000000 01111100 00100101 - ....|%
10110000 10000000 01111100 - 00100101 10110000 10000000 - ..|%..
00000000 00000000 00000011 - 11101000 00000000 00001100 - ......
*For the sake of simplicity, let us pretend that a byte is always 8-bits.
In any language with bitwise operations, which let you act on the individual bits of a variable, you can do the following. Read the file into a buffer (or line by line); if an encoding is involved, force it to extended ASCII (8 bits, 1 byte per character). Then, for each byte in the buffer, loop from bit 7 down to bit 0, using a shift and a bitwise AND to check each bit value. Let me give an example in C:
// gcc -Wall -Wextra -std=c99 xxd.c
#include <stdio.h>
int main(void) {
    // Whatever buffer size you choose.
    unsigned char buffer[32];
    size_t len;
    // Feel free to replace stdin with a file pointer or any other stream.
    // fread returns the number of bytes actually read, so embedded NUL
    // bytes are handled and the loop ends cleanly at end of file.
    while ((len = fread(buffer, 1, sizeof(buffer), stdin)) > 0) {
        // Iterate through each byte in the buffer.
        for (size_t j = 0; j < len; j++) {
            // Print the most significant bit first.
            for (int i = 7; i >= 0; i--) {
                // Check whether the i-th bit is set.
                putchar(buffer[j] & (1 << i) ? '1' : '0');
            }
        }
    }
    return 0;
}
Related
Is there a function in a C lib to print data packets similar to Wireshark format (position then byte by byte)
I looked up their code and they use trees which was too complex for my task. I could also write my own version from scratch but I don't wanna be reinventing the wheel, so I was wondering if there is some code written that I can utilize. Any suggestions of a lib that I can use?
*The data I have is in a buffer of unsigned ints.
0000 01 02 ff 45 a3 00 90 00 00 00 00 00 00
0010 00 00 00 00 00 00 00 00 00 00 00 00 00
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 ... etc
Thanks!
I doubt such a specific function exists in libc, but the logic is rather simple:
for (unsigned k = 0; k < len; k++)
{
if (k % 0x10 == 0)
printf("\n%04x", k);
if (k % 0x4 == 0)
printf(" ");
printf(" %02x", buffer[k] & 0xff);
}
Replace the first modulus (0x10) with the line length and the second (0x4) with the word length, and you're good. (Of course, try to make one a multiple of the other.)
EDIT:
As I just noticed, you mentioned that the data you have is in a buffer of unsigned ints, so you will have to cast it to an unsigned char buffer for this part.
Of course, you can do it with an unsigned buffer using bitwise shifts and four prints per loop, but that really makes for cumbersome code where it isn't necessary.
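As a rough sketch of the cast-to-unsigned-char approach, the loop can be wrapped so that it accepts any buffer through a void pointer. The names format_hex_line and dump_with_offsets are hypothetical, and the 16-bytes-per-line layout is an assumption rather than the exact Wireshark format.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats up to 16 bytes of `data`, starting at byte
   `offset`, as "0000  01 02 ff ..." into `out`. Returns how many bytes
   were formatted (0 once `offset` reaches `len`). */
size_t format_hex_line(char *out, size_t out_size,
                       const void *data, size_t len, size_t offset)
{
    /* Any object may be inspected through an unsigned char pointer. */
    const unsigned char *bytes = (const unsigned char *)data;
    if (out_size == 0 || offset >= len)
        return 0;
    size_t count = len - offset < 16 ? len - offset : 16;
    int pos = snprintf(out, out_size, "%04zx ", offset);
    for (size_t k = 0; k < count && pos > 0 && (size_t)pos < out_size; k++)
        pos += snprintf(out + pos, out_size - (size_t)pos, " %02x",
                        (unsigned)bytes[offset + k]);
    return count;
}

/* Prints a whole buffer, one line per 16 bytes. */
void dump_with_offsets(const void *data, size_t len)
{
    char line[80];
    size_t offset = 0, n;
    while ((n = format_hex_line(line, sizeof line, data, len, offset)) > 0) {
        puts(line);
        offset += n;
    }
}
```

With a buffer declared as unsigned int words[64], calling dump_with_offsets(words, sizeof words) prints all of it. The order of the bytes within each word then depends on the endianness of the machine.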
Unexpected MAC-address value obtained using snprintf function
Why do I get an unexpected MAC address value? I have a big string (unsigned char data[DATA_LEN]) to parse, and I need to copy the MAC address from it into a structure member. I am getting a completely different string. Please help with this. Thank you.
Input data string:
unsigned char data[512] = "its-STRING: 18 22 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 AC 12 00 20 00 00 00 C8 8C DF 9D 57 12 20 00 00 00 29 \n";
Output of the program
Parsed Mac from string = 8C DF 9D 57 12 20
copied MacAddress == 32:30:3a:33:38:00
From the above string I have to extract the MAC address "8C DF 9D 57 12 20" and then copy it into the following structure:
typedef struct my_stuct_s{
uint8_t mac_addr[18];
}my_stuct_t;
Below is how I have coded it.
#define PARSE_OFFSET 89
#define END_OFFESET 19
#define DATA_LEN 512
#define ADDR_LEN 6
typedef struct my_stuct_s{
uint8_t mac_addr[ADDR_LEN];
uint8_t item;
}my_stuct_t;
int main()
{
unsigned char data[DATA_LEN] = "its-STRING: 18 22 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 AC 12 00 20 00 00 00 C8 8C DF 9D 57 12 20 00 00 00 29 \n";
unsigned char strv6[ADDR_LEN];
unsigned char *data1 = NULL;
my_stuct_t shm_memory;
memset(strv6, 0,sizeof(strv6));
memset(&shm_memory, 0,sizeof(my_stuct_t));
if ((strcmp(data, "") ) != 0)
{
data1 = &data[0];
data1 = data1 + PARSE_OFFSET;
snprintf(strv6, END_OFFESET,"%s\n", data1);
printf("Parsed Mac from string = %s\n", strv6);
snprintf((char *)&shm_memory.mac_addr, ADDR_LEN,
"%02x:%02x:%02x:%02x:%02x:%02x\n",
strv6[0], strv6[1],
strv6[2], strv6[3],
strv6[4], strv6[5]);
printf("copied MacAddress == %02x:%02x:%02x:%02x:%02x:%02x\n ",
shm_memory.mac_addr[0],
shm_memory.mac_addr[1],
shm_memory.mac_addr[2],
shm_memory.mac_addr[3],
shm_memory.mac_addr[4],
shm_memory.mac_addr[5]);
}
else
printf("\n empty string");
return 0;
}
Your parse offset is 89, but the MAC address starts at offset 90 of the string, and the number of chars to be copied is 18 (17 characters plus the terminating NUL).
#define PARSE_OFFSET 89
#define END_OFFESET 19
should be
#define PARSE_OFFSET 90
#define END_OFFESET 18
The char array that stores the address text must be big enough to hold it; a length of 19 is sufficient.
unsigned char strv6[19];
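You need to use sscanf, not snprintf, to convert the text into bytes, as below. Note that plain %x expects a pointer to unsigned int, so use the hh length modifier (C99) when the destination is a uint8_t.
sscanf((const char *)strv6, "%2hhx %2hhx %2hhx %2hhx %2hhx %2hhx",
&shm_memory.mac_addr[0],
&shm_memory.mac_addr[1],
&shm_memory.mac_addr[2],
&shm_memory.mac_addr[3],
&shm_memory.mac_addr[4],
&shm_memory.mac_addr[5]);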
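The sscanf approach above can be isolated into a small function that also checks how many fields were converted. This is a minimal sketch: parse_mac is a hypothetical name, and it assumes uint8_t is unsigned char (true on common platforms), which is what the hh length modifier requires.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: parses text such as "8C DF 9D 57 12 20" into six
   bytes. Returns 1 if all six fields were converted, 0 otherwise. */
int parse_mac(const char *text, uint8_t mac[6])
{
    /* %2hhx reads at most two hex digits and stores them in one byte. */
    int fields = sscanf(text, "%2hhx %2hhx %2hhx %2hhx %2hhx %2hhx",
                        &mac[0], &mac[1], &mac[2],
                        &mac[3], &mac[4], &mac[5]);
    return fields == 6;
}
```

Checking the return value of sscanf catches a wrong offset early: if the text does not start with six hex pairs, fewer than six fields are converted and the function reports failure instead of leaving stale bytes in the structure.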
I'm trying to read a UTF file and decided to read it in binary mode and skip non-ASCII bytes, as the file basically consists of valid English text. I'm stuck at fread returning 1 instead of the number of bytes requested. The first output of print_hex IMHO shows it has read more than 1 char. I've read some examples of reading binary files in C, e.g. Read and write to binary files in C?, and read about fread, e.g. here https://en.cppreference.com/w/c/io/fread and here How does fread really work?, but I'm still puzzled why it returns 1. The file hexdump, complete C code, and output are below.
ADD: compiled by gcc, run on Linux.
File:
00000000 ff fe 41 00 41 00 42 00 61 00 0d 00 0a 00 41 00 |..A.A.B.a.....A.|
00000010 41 00 45 00 72 00 0d 00 0a 00 66 00 73 00 61 00 |A.E.r.....f.s.a.|
00000020 6a 00 0d 00 0a 00 64 00 73 00 61 00 66 00 64 00 |j.....d.s.a.f.d.|
00000030 73 00 61 00 66 00 64 00 73 00 61 00 0d 00 0a 00 |s.a.f.d.s.a.....|
00000040 64 00 66 00 73 00 61 00 0d 00 0a 00 66 00 64 00 |d.f.s.a.....f.d.|
00000050 73 00 61 00 66 00 73 00 64 00 61 00 66 00 0d 00 |s.a.f.s.d.a.f...|
00000060 0a 00 0d 00 0a 00 0d 00 0a 00 |..........|
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void print_hex(const char *s)
{
while(*s)
printf("%02x ", (unsigned char) *s++);
printf("\n");
}
int main(){
#define files_qty 5
const char* files_array[][2]={{"xx","a"},{"zz","b"},{"xxx","d"},{"d","sd"},{"as","sd"}};
const char* file_postfix = ".txt";
char* file_out_name = "XXX_AD.txt";
FILE* file_out = fopen (file_out_name, "w");
printf ("This app reads txt files with hardcoded names and writes to file %s\n",file_out_name);
/* ssize_t bytes_read = 1; //signed size_t */
size_t n_bytes = 10;
unsigned char* string_in;
char* string_out;
char* file_name;
string_in = (char*) malloc (n_bytes+1);
string_out = (char*) malloc (n_bytes+50);
file_name = (char*) malloc (n_bytes+1); /* more error prone would be to loop through array for max file name length */
int i;
size_t n;
for (i=0;i<files_qty;i++)
{
strcpy (file_name,files_array[i][0]);
FILE* file = fopen (strcat(file_name,file_postfix), "rb");
if (file!= NULL)
{
int k=0;
while (n=fread (string_in, sizeof(char), n_bytes, file)>0)
{
printf("bytes read:%lu\n",(unsigned long) n);
print_hex(string_in);
int j;
for (j=0;j<n;j++)
{
switch (string_in[j])
{
case 0x00:
case 0xff:
case 0xfe:
case 0x0a:
break;
case 0x0d:
string_out[k]=0x00;
fprintf (file_out, "%s;%s;%s\n", files_array[i][0], files_array[i][1], string_out);
k=0;
printf("out:\n");
print_hex(string_out);
break;
default:
string_out[k++]=string_in[j];
}
}
}
fclose (file);
}
else
{
perror (file_name); /* why didn't the file open? */
}
}
free (string_in);
free (string_out);
free (file_name);
return 0;
}
Output:
bytes read:1
ff fe 41
bytes read:1
0d
out:
bytes read:1
72
bytes read:1
61
bytes read:1
73
bytes read:1
61
bytes read:1
0d
out:
72 61 73 61
bytes read:1
61
bytes read:1
73
bytes read:1
61
bytes read:1
0a
You have a precedence problem. Simple assignment has lower precedence than comparison, so the following line:
while(n=fread (string_in, sizeof(char), n_bytes, file)>0)
is evaluated as (extra parentheses added)
while (n=(fread (string_in, sizeof(char), n_bytes, file)>0))
Therefore n is assigned 1, the result of the comparison, whenever fread returns a value > 0.
Instead, explicitly add parentheses:
while((n=fread (string_in, sizeof(char), n_bytes, file))>0)
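The difference between the two groupings can be seen without any file I/O. In this sketch, fake_read is a hypothetical stand-in for fread that always reports 10 items read.

```c
#include <stddef.h>

/* Hypothetical stand-in for fread: pretends that 10 items were read. */
static size_t fake_read(void)
{
    return 10;
}

/* Parsed as n = (fake_read() > 0), so n receives the comparison result. */
size_t compare_then_assign(void)
{
    size_t n;
    n = fake_read() > 0;
    return n;
}

/* The inner parentheses force the assignment to happen first. */
size_t assign_then_compare(void)
{
    size_t n;
    int more = ((n = fake_read()) > 0);
    (void)more; /* only the value stored in n matters here */
    return n;
}
```

Here compare_then_assign() returns 1 while assign_then_compare() returns 10, which mirrors the "bytes read:1" lines in the output.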
I am learning fread and fwrite in C and wrote a basic program that writes a structure to a file using fwrite. Output was there on the
#include<stdio.h>
int main()
{
FILE *f;
int i,q=0;
typedef struct {
int a;
char ab[10];
}b;
b var[2];
f=fopen("new.c","w");
printf("Enter values in structure\n");
for(i=0 ; i<2 ; i++)
{
scanf("%d",&var[i].a);
scanf("%s",var[i].ab);
}
fwrite(var,sizeof(var),1,f);
fclose(f);
return 0;
}
The output was not smooth as it contained weird characters inside the file. I opened the file in binary mode too but in vain. Is this some kind of buffer problem?
The "weird" characters inside your file are probably the bytes of the binary integers you're writing out. fwrite is writing the bits of var directly to the file, not converting that into a human readable format. If you want that, use fprintf instead.
Here's an example directly from your code above:
$ ./example
Enter values in structure
5 hello
8 world
$ hexdump -vC new.c
00000000 05 00 00 00 68 65 6c 6c 6f 00 00 00 00 00 00 00 |....hello.......|
00000010 08 00 00 00 77 6f 72 6c 64 00 00 00 00 00 00 00 |....world.......|
00000020
Notice that the first four bytes at offset 0x00 and 0x10 are the numbers entered (little-endian and 32-bit because of my machine), followed by the strings entered, plus a bit of structure padding. All broken down:
File offset  Data (ASCII)  Relationship to source
0            05            var[0].a 7:0
1            00            var[0].a 15:8
2            00            var[0].a 23:16
3            00            var[0].a 31:24
4            68 (h)        var[0].ab[0]
5            65 (e)        var[0].ab[1]
6            6c (l)        var[0].ab[2]
7            6c (l)        var[0].ab[3]
8            6f (o)        var[0].ab[4]
9            00 (NUL)      var[0].ab[5]
10           00 (NUL)      var[0].ab[6]
11           00 (NUL)      var[0].ab[7]
12           00 (NUL)      var[0].ab[8]
13           00 (NUL)      var[0].ab[9]
14           00            structure padding
15           00            structure padding
16           08            var[1].a 7:0
17           00            var[1].a 15:8
18           00            var[1].a 23:16
19           00            var[1].a 31:24
20           77 (w)        var[1].ab[0]
21           6f (o)        var[1].ab[1]
22           72 (r)        var[1].ab[2]
23           6c (l)        var[1].ab[3]
24           64 (d)        var[1].ab[4]
25           00 (NUL)      var[1].ab[5]
26           00 (NUL)      var[1].ab[6]
27           00 (NUL)      var[1].ab[7]
28           00 (NUL)      var[1].ab[8]
29           00 (NUL)      var[1].ab[9]
30           00            structure padding
31           00            structure padding
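If a human-readable file is the goal, the fprintf suggestion might look like the sketch below. The record typedef mirrors the struct from the question, while the function name write_records_as_text and the one-record-per-line layout are assumptions made for illustration.

```c
#include <stdio.h>

/* Mirrors the struct in the question. */
typedef struct {
    int a;
    char ab[10];
} record;

/* Hypothetical helper: writes each record as a text line "a ab".
   Returns 0 on success and -1 if any write fails. */
int write_records_as_text(FILE *f, const record *recs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* fprintf converts the int to decimal digits instead of raw bytes. */
        if (fprintf(f, "%d %s\n", recs[i].a, recs[i].ab) < 0)
            return -1;
    }
    return 0;
}
```

With the inputs from the example session, the file would then contain the two lines "5 hello" and "8 world", with no padding bytes and no dependence on the machine's byte order.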
I thought the shift operator shifts the memory of the integer or char it is applied to, but the output of the following code came as a surprise to me.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(void) {
uint64_t number = 33550336;
unsigned char *p = (unsigned char *)&number;
size_t i;
for (i=0; i < sizeof number; ++i)
printf("%02x ", p[i]);
printf("\n");
//shift operation
number = number<<4;
p = (unsigned char *)&number;
for (i=0; i < sizeof number; ++i)
printf("%02x ", p[i]);
printf("\n");
return 0;
}
The system on which it ran is little endian and produced the following output:
00 f0 ff 01 00 00 00 00
00 00 ff 1f 00 00 00 00
Can somebody provide some reference to the detailed working of the shift operators?
I think you've answered your own question. The machine is little endian, which means the bytes are stored in memory with the least significant byte first (at the lowest address, so it appears leftmost in your output). So your memory represents:
00 f0 ff 01 00 00 00 00 => 0x0000000001fff000
00 00 ff 1f 00 00 00 00 => 0x000000001fff0000
As you can see, the second is the same as the first value, shifted left by 4 bits.
Everything is right:
(1 * (256^3)) + (0xff * (256^2)) + (0xf0 * 256) = 33 550 336
(0x1f * (256^3)) + (0xff * (256^2)) = 536 805 376
33 550 336 * (2^4) = 536 805 376
Shifting left by 4 bits is the same as multiplying by 2^4.
I think your printf confuses you. Here are the values:
33550336 = 0x01FFF000
33550336 << 4 = 0x1FFF0000
Can you read your output now?
It doesn't shift the memory, but the bits. So you have the number:
00 00 00 00 01 FF F0 00
After shifting this number 4 bits (one hexadecimal digit) to the left you have:
00 00 00 00 1F FF 00 00
Which is exactly the output you get, when transformed to little endian.
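Your loop is printing bytes in the order they are stored in memory, and the output would be different on a big-endian machine. If you want to print the value in hex, print the whole integer instead, e.g. with %016llx (cast the uint64_t to unsigned long long, or use the PRIx64 macro from <inttypes.h>). Then you'll see what you expect: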
0000000001fff000
000000001fff0000
The second value is left-shifted by 4.
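A small sketch of printing the value itself rather than its bytes in memory, using the PRIx64 macro from <inttypes.h> so that the format matches uint64_t on every platform. The helper name format_u64_hex is hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: formats `value` as 16 hex digits into `out`,
   which needs room for 17 chars (16 digits plus the terminating NUL). */
void format_u64_hex(uint64_t value, char out[17])
{
    /* PRIx64 expands to the correct conversion specifier for uint64_t. */
    snprintf(out, 17, "%016" PRIx64, value);
}
```

Because this formats the numeric value, the result is the same on little-endian and big-endian machines: 33550336 gives 0000000001fff000, and 33550336 << 4 gives 000000001fff0000.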