Understanding fwrite() behaviour - c

When I use fwrite to write to stdout, I get output I don't understand.
What is the use of the size_t count argument in fwrite and fread function?
#include<stdio.h>
struct test
{
char str[20];
int num;
float fnum;
};
int main()
{
struct test test1={"name",12,12.334};
fwrite(&test1,sizeof(test1),1,stdout);
return 0;
}
output:
name XEA
Process returned 0 (0x0) execution time : 0.180 s
Press any key to continue.
And when I use fwrite to write to a file:
#include<stdio.h>
#include<stdlib.h>
struct test
{
char str[20];
int num;
float fnum;
};
int main()
{
struct test test1={"name",12,12.334};
FILE *fp;
fp=fopen("test.txt","w");
if(fp==NULL)
{
printf("FILE Error");
exit(1);
}
fwrite(&test1,sizeof(test1),1,fp);
return 0;
}
The file also contains this:
name XEA
Why is the output like this?
And when I pass 5 as the size_t count argument, the output is also garbage. What is the purpose of this argument?

In the way that you're using it, fwrite is writing "binary" output, which is a byte-for-byte representation of your struct test as in memory. This representation is not generally human-readable.
When you write
struct test test1 = {"name", 12, 12.334};
you get a structure in memory which might be represented like this (with all byte values in hexadecimal):
test1: str: 6e 61 6d 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
num: 0c 00 00 00
fnum: 10 58 45 41
Specifically: 0x6e, 0x61, 0x6d, and 0x65 are the ASCII codes for the letters in "name". Your str array was size 20, so the string name is 0-terminated and then padded out with 15 more 0 characters. 0x0c is the hexadecimal representation of the number 12, and I'm assuming that type int is 32 bits or 4 bytes on your machine, so there are 3 more 0's there, also. (Also I'm assuming your machine is "little endian", with the least-significant byte of a 4-byte quantity like 0x0000000c coming first in memory.) Finally, in the IEEE-754 format which your machine uses, the number 12.334 is represented in 32-bit single-precision floating point as 0x41455810, which is again stored in the opposite order in memory.
So those 28 bytes (plus possibly some padding, which we won't worry about for now) are precisely the bytes that get written to the screen, or to your file. The string "name" will probably be human-readable, but all the rest will be, literally, "binary garbage". It just so happens that three of the bytes making up the float number 12.334, namely 0x58, 0x45, and 0x41, correspond in ASCII to the capital letters X, E, and A, so that's why you see those characters in the output.
Here's the result of passing the output of your program into a "hex dump" utility:
0 6e 61 6d 65 00 00 00 00 00 00 00 00 00 00 00 00 name............
16 00 00 00 00 0c 00 00 00 10 58 45 41 .........XEA
You can see the letters name at the beginning, and the letters XEA at the end, and all the 0's and other binary characters in between.
If you're on a Unix or Linux (or Mac OS X) system, you can use tools like od or hexdump to get a hex dump similar to the one I've shown.
You asked about the "count" argument to fwrite. fwrite is literally designed for writing out binary structures, just like you're doing. Its function signature is
size_t fwrite(const void *ptr, size_t sz, size_t n, FILE *fp);
As you know, ptr is a pointer to the data structure(s) you're writing, and fp is the file pointer you're writing it/them to. And then you're saying you want to write n (or 'count') items, each of size sz.
You called
fwrite(&test1, sizeof(test1), 1, fp);
The expression sizeof(test1) gives the size of test1 in bytes. (It will probably be 28 or so, as I mentioned above.) And of course you're writing one struct, so passing sizeof(test1) and 1 as sz and n is perfectly reasonable and correct.
It would also not be unreasonable or incorrect to call
fwrite(&test1, 1, sizeof(test1), fp);
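Now you're telling fwrite to write 28 bytes, each of size 1. (A byte is of course always size 1.) The only visible difference is the return value: fwrite returns the number of items successfully written, so on success the first form returns 1 and the second returns sizeof(test1).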
One other thing about fwrite (as noted by @AnttiHaapala in a comment): when you're writing binary data to a file, you should specify the b flag when you open the file:
fp = fopen("test.txt", "wb");
Finally, if this isn't what you want, if you want human-readable text output instead, then fwrite is not what you want here. You could use something like
printf("str: %s\n", test1.str);
printf("num: %d\n", test1.num);
printf("fnum: %f\n", test1.fnum);
Or, to your file:
fprintf(fp, "str: %s\n", test1.str);
fprintf(fp, "num: %d\n", test1.num);
fprintf(fp, "fnum: %f\n", test1.fnum);

Related

Why does C use 5 bytes to write/fwrite the number 10, but 4 bytes to write 9 or 11?

GCC on Windows
#include <stdio.h>
#include <stdlib.h>
struct test {
int n1;
int n2;
};
int main() {
FILE *f;
f = fopen("test.dat", "w");
struct test test1 = {10, 10};
fwrite(&test1, sizeof(struct test), 1, f);
fclose(f);
struct test test2;
f = fopen("test.dat", "r");
while(fread(&test2, sizeof(struct test), 1, f))
printf("n1=%d n2=%d\n", test2.n1, test2.n2);
fclose(f);
return 0;
}
If I set test1 to 10,10 then fwrite will write 10 bytes to file: 0D 0A 00 00 00 0D 0A 00 00 00
(each 4-byte int will be padded with a 0D carriage return character before it)
If I set test1 to 11,11 then fwrite will write 8 bytes to file: 0B 00 00 00 0B 00 00 00
(as I would expect)
If I set test1 to 9,9 then fwrite will write 8 bytes to file: 09 00 00 00 09 00 00 00
(as I would expect)
If I set test1 to 9,10 then fwrite will write 9 bytes to file: 09 00 00 00 0D 0A 00 00 00
The 9 gets 4 bytes as expected, but the 10 gets padded with an extra 0D byte, resulting in 5 bytes. What is so special about the number 10 that requires this padding? And why do both smaller and larger numbers (8, 9, 11, 12, 13, 14, etc.) not get padded? I thought maybe GCC is confusing the number 10 with a new-line character (a new-line is 10 in ASCII), but this does not explain how fread is able to get the number 10 back out correctly.
And how do I write a struct to file without getting this extra padding on the number 10?
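You opened the file in text mode, so on Windows every '\n' byte (value 10, 0x0A) gets a carriage return (0x0D) written in front of it. When the file is read back in text mode, each 0D 0A pair is translated back into a single '\n', which is why fread still returns 10.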
You should write (and read) binary data in binary mode instead (fopen(..., "wb")) -- this will be much faster, and avoids surprises (and also requires only 8 bytes, which is what sizeof(struct test) is).
What is so special about the number 10 that requires this padding?
The number 10 just happens to be the ASCII code for a newline ('\n') character.
You are writing and reading in text mode.
Open the file with the flags "wb" and "rb". This will treat the files as binary files.

How to print data in a buffer irrespective of zeros in between till the given length

I wrote a simple program to read a packet till layer 3 and print the same in hex format.
I gave input in hex format. My output should be same as this.
Input:
45 00 00 44 ad 0b 00 00 40 11 72 72 ac 14 02 fd ac 14
00 06 e5 87 00 35 00 30 5b 6d ab c9 01 00 00 01
00 00 00 00 00 00 09 6d 63 63 6c 65 6c 6c 61 6e
02 63 73 05 6d 69 61 6d 69 03 65 64 75 00 00 01
00 01
I am able to read the packet. Here the hex dump in gdb
(gdb) p packet
$1 = 0x603240 "E"
(gdb) x/32x 0x603240
0x603240: 0x00440045 0x00000000 0x00400b0e 0x00000000
0x603250: 0x00603010 0x0035e587 0xe3200030 0x63206261
0x603260: 0x31302039 0x20303020 0x30203030 0x30302031
0x603270: 0x20303020 0x30203030 0x30302030 0x20303020
0x603280: 0x36203930 0x33362064 0x20333620 0x36206336
0x603290: 0x63362035 0x20633620 0x36203136 0x32302065
0x6032a0: 0x20333620 0x30203337 0x64362035 0x20393620
0x6032b0: 0x36203136 0x39362064 0x20333020 0x36203536
But when I tried to print the packet to the console using %s, I couldn't see the whole packet because of the zero bytes in between. I want to print it up to the length of the packet (which I pass as an argument to the print function).
output on console is:
packet: E
My print function is something like this.
void print(char *packet, int len) {
printf("packet: ");
printf("%s\n\n" , packet );
}
Can you tell me any other way to print the packet up to len (the argument to the print function)?
PS: I haven't finished reading the L3 information yet, so the L3 part of the packet in gdb differs from my input.
A string in C is defined as a sequence of characters ending with '\0' (a 0-byte), and the %s conversion specifier of printf() is for strings. So the solution to your problem is doing something else for printing the binary bytes. If you want for example to print their hexadecimal values, you could change your print function like this:
void print(unsigned char *packet, size_t len)
{
for (size_t i = 0; i < len; ++i) printf("%02hhx ", packet[i]);
puts("");
}
Note I also changed the types here:
char can be signed. If you want to handle raw bytes, it's better to always use unsigned char.
int might be too small for a size, always use size_t which is guaranteed to hold any size possible on your platform.
If you really want to print encoded characters (which is unlikely with binary data), you can use %c in printf(), or use the putchar() function, or fwrite() the whole chunk to stdout.

C program Reading Hex values from Fat MBR

I am new to the C language and am trying to create a forensic tool. So far I have this. It reads a dd file which is a dump of a FAT16 MBR. I am able to read certain bytes properly but not others.
What I need help with is that the sizeOfFat variable needs to get the values of bytes 0x16 and 0x17 read as little endian. How would I have it read FB00, convert it to 00FB, and then print the value?
char buf[64];
int startFat16 = part_entry[0].start_sect,sectorSize = 512,dirEntrySize=32;
fseek(fp, startFat16*512, SEEK_SET);
fread(buf, 1, 64, fp);
int rootDir = *(int*)(buf+0x11);
int sectorPerCluster = *(int*)(buf+0x0D);
int sizeOfFat = *(int*)(buf+0x16);
int fatCopies = *(int*)(buf+0x10);
printf("\n Phase 2 \n no of sectors per cluster : %d \n",(unsigned char)sectorPerCluster);
printf("size of fat : %d \n",(unsigned char)sizeOfFat);
printf("no of Fat copies : %d \n",(unsigned char)fatCopies);
printf("maximum number of root directories : %d \n",(unsigned char)rootDir);
The hex values I'm working with here are:
EB 3C 90 4D 53 44 4F 53 35 2E 30 00 02 08 02 00
02 00 02 00 00 F8 FB 00 3F 00 FF 00 3F 00 00 00
E1 D7 07 00 80 00 29 CD 31 52 F4 4E 4F 20 4E 41
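With int, you are only guaranteed a width of at least 16 bits; on most current platforms it is 32 bits. With your code, you read sizeof(int) bytes for every one of your variables, even though the fields differ in size. The (unsigned char) casts in your printf calls then keep only the lowest byte, which loses the high byte of any two-byte field. Most systems provide the fixed-width types uint8_t, uint16_t, and uint32_t in <stdint.h>. Use those for fixed-width data. Note also that they are unsigned. You don't want negative sectors per cluster, do you?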

What is the difference between these two C functions in terms of handling memory?

typedef unsigned char *byte_pointer;
void show_bytes(byte_pointer start, size_t len) {
size_t i;
for (i = 0; i < len; i++)
printf(" %.2x", start[i]); //line:data:show_bytes_printf
printf("\n");
}
void show_integer(int* p,size_t len){
size_t i;
for(i=0;i<len;i++){
printf(" %d",p[i]);
}
printf("\n");
}
Suppose I have two functions above, and I use main function to test my functions:
int main(int argc, char *argv[])
{
int a[5]={12345,123,23,45,1};
show_bytes((byte_pointer)a,sizeof(a));
show_integer(a,5);
}
I got the following results in my terminal:
ubuntu@ubuntu:~/OS_project$ ./show_bytes
39 30 00 00 7b 00 00 00 17 00 00 00 2d 00 00 00 01 00 00 00
12345 123 23 45 1
Can someone tell me why I got this result? I understand the second function, but I have no idea why I got 39 30 00 00 7b 00 00 00 17 00 00 00 2d 00 00 00 01 00 00 00 for the first function. I know the sequence above is the hexadecimal representation of 12345, 123, 23, 45, 1. What I don't understand is why start[i] doesn't refer to a whole number such as 12345 or 123 in the first function. Instead, start[0] just refers to the least significant byte of the first number, 12345? Can someone help explain why these two functions behave differently?
12345 is 0x3039 in hex. Because int is 32 bits on your machine, it is represented as 0x00003039. Because your machine is little endian, the bytes are stored in memory least-significant first: 39 30 00 00. You can read more about big and little endian at: https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Data/endian.html
The same applies to the other results.
On your platform, sizeof(int) is 4 and your platform uses little endian system. The binary representation of 12345 using a 32-bit representation is:
00000000 00000000 00110000 00111001
In a little endian system, that is captured using the following byte sequence.
00111001 00110000 00000000 00000000
In hex, those bytes are:
39 30 00 00
That's what you are seeing as the output corresponding to the first number.
You can do similar processing of the other numbers in the array to understand the output corresponding to them.

Dissecting a binary file in C

I'm working on an assignment in which I need to dissect a binary file and retrieve the source address from the header data. I was able to print out hex data from the file as we were instructed, but I can't make heads or tails of what I am looking at. Here's the printing code I used.
FILE *ptr_myfile;
char buf[8];
ptr_myfile = fopen("packets.1","rb");
if (!ptr_myfile)
{
printf("Unable to open file!");
return 1;
}
size_t rb;
do {
rb = fread(buf, 1, 8, ptr_myfile);
if( rb ) {
size_t i;
for(i = 0; i < rb; ++i) {
printf("%02x", (unsigned int)buf[i]);
}
printf("\n");
}
} while( rb );
And here's a small portion of the output:
120000003c000000
4500003c195d0000
ffffff80011b60ffffff8115250b
4a7d156708004d56
0001000561626364
65666768696a6b6c
6d6e6f7071727374
7576776162636465
666768693c000000
4500003c00000000
ffffffff01ffffffb5ffffffbc4a7d1567
ffffff8115250b00005556
0001000561626364
65666768696a6b6c
6d6e6f7071727374
7576776162636465
666768693c000000
4500003c195d0000
ffffff8001775545ffffffcfffffffbe29
ffffff8115250108004d56
0001000561626364
65666768696a6b6c
6d6e6f7071727374
7576776162636465
666768693c000000
4500003c195f0000
......
So we are using this diagram to aid in the assignment
I'm really having difficulty translating information from the binary file to some thing useful that I can manage, and searching the website hasn't yielded me much. I just need some help putting me in the right direction.
Ok, it looks like you actually are reversing parts of an IP packet based on the diagram. This diagram is based on 32-bit words, with each bit being shown as the small 'ticks' along the horizontal ruler looking thing at the top. Bytes are shown as the big 'ticks' on the top ruler.
So, if you were to read the first byte of the file, the high-order nibble (the high-order four bits) contains the version, and the low-order nibble contains the number of 32-bit words in the header (assuming we can interpret this as an IP header).
So, from your diagram, you can see that the source address is in the fourth word, so to read it you can advance the file pointer to that position and read four bytes. Roughly, you should be able to do this:
fp = fopen("the file name", "rb");
fseek(fp, 12, SEEK_SET); // advance the file pointer 12 bytes
fread(buf, 1, 4, fp); // read four bytes from the file
Now you should have the source address in buf.
OK, to make this a bit more concrete, here is a packet I captured off my home network:
0000 00 15 ff 2e 93 78 bc 5f f4 fc e0 b6 08 00 45 00 .....x._......E.
0010 00 28 18 c7 40 00 80 06 00 00 c0 a8 01 05 5e 1f .(..@.........^.
0020 1d 9a fd d3 00 50 bd 72 7e e9 cf 19 6a 19 50 10 .....P.r~...j.P.
0030 41 10 3d 81 00 00 A.=...
The first 14 bytes are the Ethernet II header: the first six bytes (00 15 ff 2e 93 78) are the destination MAC address, the next six bytes (bc 5f f4 fc e0 b6) are the source MAC address, and the next two bytes (08 00) denote that the next header is of type IP.
The next twenty bytes is the IP header (which you show in your figure), these bytes are:
0000 45 00 00 28 18 c7 40 00 80 06 00 00 c0 a8 01 05 E..(..@.........
0010 5e 1f 1d 9a ^...
So to interpret this lets look at 4-byte words.
The first 4-byte word (45 00 00 28), according to your figure is:
first byte : version & length, we have 0x45 meaning IPv4, and 5 4-byte words in length
second byte : Type of Service 0x00
3rd & 4th bytes: total length 0x00 0x28 or 40 bytes.
The second 4-byte word (18 c7 40 00), according to your figure is:
1st & 2nd bytes: identification 0x18 0xc7
3rd & 4th bytes: flags (3-bits) & fragmentation offset (13-bits)
flags - 0x02: the third byte, 0x40, is 0100 0000 in binary, and taking the first three bits 010 gives us 0x02 for the flags.
offset - 0x00
The third 4-byte word (80 06 00 00), according to your figure is:
first byte : TTL, 0x80 or 128 hops
second byte : protocol 0x06 or TCP
3rd & 4th bytes: header checksum 0x00 0x00
The fourth 4-byte word (c0 a8 01 05), according to your figure is:
1st to 4th bytes: source address, in this case 192.168.1.5
notice that each byte corresponds to one of the octets in the IP address.
The fifth 4-byte word (5e 1f 1d 9a), according to your figure is:
1st to 4th bytes: destination address, in this case 94.31.29.154
Doing this type of programming is a bit confusing at first. I recommend parsing a packet by hand (like I did above) a few times to get the hang of it.
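One final thing, in this line of code printf("%02x", (unsigned int)buf[i]);, I'd recommend changing it to printf("%02x ", (unsigned char)buf[i]);. Remember that each element in your buf array represents a single byte read from the file. Where char is signed, a byte such as 0x80 is a negative value, and converting it directly to unsigned int sign-extends it, which is where the ffffff80 runs in your output come from.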
Hope this helps,
T.
