Writing structure variables into a file, problem - C

Hi, I have to write the contents of a structure variable into a file. I have a working program, but the output looks distorted, as you will see. A simplified version of the code is below, and the output follows.
Code:
#include <stdio.h>
#include <stdlib.h>

struct mystruct
{
    char xx[20];
    char yy[20];
    int zz;
};

void filewrite(struct mystruct *myvar)
{
    FILE *f;

    f = fopen("trace.bin", "wb");
    if (f == NULL)
    {
        printf("\nUnable to create the file");
        exit(0);
    }
    fwrite(myvar, sizeof(struct mystruct), 1, f);
    fclose(f);
}

int main(void)
{
    struct mystruct myvar = {"Rambo 1", "Rambo 2", 1234};
    filewrite(&myvar);
    return 0;
}
Output (contents of trace.bin):
Rambo 1Rambo 2Ò

1. Where is the integer '1234'? I need that intact.
2. Why does some random character (Ò) appear at the end?

Your program is correct, and so is the output...
You are writing a binary file containing the raw data from memory.
The integer zz gets written to disk as 4 bytes (or 2, depending on the size of an int on your system), with the least significant byte first (an Intel machine, I guess).
1234 (decimal) gets written to disk as 0xD2, 0x04, 0x00, 0x00.
0xD2 looks like Ò when you view it as text. The 0x04 and the 0x00 bytes are non-printable characters, so they don't show.
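If you want to see the 1234 again, read the record back the same way it was written. A minimal sketch, assuming the same struct definition, the same machine and compiler, and the trace.bin file from the question:

#include <stdio.h>
#include <stdlib.h>

struct mystruct
{
    char xx[20];
    char yy[20];
    int zz;
};

int main(void)
{
    struct mystruct myvar;
    FILE *f = fopen("trace.bin", "rb");

    if (f == NULL)
    {
        printf("\nUnable to open the file");
        exit(1);
    }
    /* Read back exactly one record written by fwrite() */
    if (fread(&myvar, sizeof(struct mystruct), 1, f) == 1)
        printf("%s %s %d\n", myvar.xx, myvar.yy, myvar.zz);   /* prints: Rambo 1 Rambo 2 1234 */
    fclose(f);
    return 0;
}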

First, in general it's not good practice to copy non-packed struct types to files, since the compiler can add padding to the struct in order to align it in memory. You end up with either a non-portable implementation or garbled output when someone else tries to read your file, because the bytes are not at the offsets they expect due to the compiler's padding.
Second, I'm not sure how you are reading your file back (it appears you just copied it into a buffer and tried to print that), but the last few bytes are an int ... they are not a null-terminated string, so printing them as a string will not look "correct". Printing non-null-terminated strings as strings can also lead to buffer over-reads, segmentation faults, etc.
In order to read back the contents of the file in a human-readable format, you need to open the file, read the contents back into the correct data structures/types, and then call printf (or some other means of converting the binary data to ASCII) to print them out.
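As a hedged sketch of that advice (the struct and field names are from the question; the byte order chosen for zz is my own convention, not something any format dictates), you could write each member separately with an explicit width and byte order rather than dumping the raw struct:

#include <stdio.h>

struct mystruct { char xx[20]; char yy[20]; int zz; };

/* Write one record field by field: padding never reaches the file,
   and zz is always stored as 4 bytes, least significant byte first. */
static int write_record(FILE *f, const struct mystruct *m)
{
    unsigned char zz_bytes[4];
    unsigned long v = (unsigned long)m->zz;
    int i;

    for (i = 0; i < 4; i++)
        zz_bytes[i] = (unsigned char)((v >> (8 * i)) & 0xFF);

    if (fwrite(m->xx, sizeof m->xx, 1, f) != 1) return -1;
    if (fwrite(m->yy, sizeof m->yy, 1, f) != 1) return -1;
    if (fwrite(zz_bytes, sizeof zz_bytes, 1, f) != 1) return -1;
    return 0;
}

A matching read function would reassemble zz from the 4 bytes in the same order, so the file can be read on any machine regardless of its native endianness or struct layout.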

I don't recommend dumping memory directly into a file; you should use some serialization method (e.g. if you have a pointer in the struct, you are doomed). I recommend Google Protocol Buffers if the data will be shared between multiple applications.

Related

C: unsigned short stored into buffer after fread from a binary file doesn't match original pattern

I have a binary file filled with 2-byte words following this pattern (in hex): 0XY0. This is the part of the code where I call fopen and fread:
unsigned short buffer[bufferSize];
FILE *ptr;

ptr = fopen(fileIn, "rb");
if (ptr == NULL)
{
    fprintf(stderr, "Unable to read from file %s because of %s", fileIn, strerror(errno));
    exit(20);
}

size_t readed = fread(buffer, (size_t)sizeof(unsigned short), bufferSize, ptr);
if (readed != bufferSize)
{
    printf("readed and buffersize are not the same\n");
    exit(100);
}
//---------------------------
If I look at any element of the buffer, for example buffer[0], instead of being a short with pattern 0XY0, it is a short with pattern Y00X.
Where is my error? Is it something regarding endianness?
Of course I checked every element inside buffer. The program executes with no errors.
EDIT: If I read from the file with size char instead of short (the buffer obviously changed to char buffer[bufferSize*2];), the contents of the buffer match the pattern 0XY0, so that (for example) buffer[0] is 0X and buffer[1] is Y0.
Your problem seems to be exactly typical of an endianness mismatch between the program that stored the data in the file and the one that reads it. Remember that different processors use different byte orders: some use big-endian representations, while typical laptop/desktop processors are little-endian.
Another potential explanation is that your file might have been written in text mode by a Windows-based program that converted 0x0A bytes into 0x0D/0x0A pairs, shifting the contents and producing a pattern similar to the one you observe.
Instead of reading the file with fread, you should read it byte by byte and compute the values according to the endianness specified for the file format.
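For example, a minimal sketch of that byte-by-byte approach, assuming the file format specifies big-endian 16-bit words (swap hi and lo if it specifies little-endian); the helper function name is mine:

#include <stdio.h>

/* Read one 16-bit word stored big-endian in the file,
   independent of the host's endianness. Returns 0 on success. */
static int read_u16_be(FILE *f, unsigned short *out)
{
    int hi = fgetc(f);
    int lo = fgetc(f);
    if (hi == EOF || lo == EOF)
        return -1;
    *out = (unsigned short)((hi << 8) | lo);
    return 0;
}

Because the value is assembled arithmetically from individual bytes, the result is the same no matter which endianness the machine running the code uses.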

Explanation of HEX value representation and Endianess

I was working on a script to basically output some sample data as a binary blob.
I'm a new intern in the software field and vaguely remember the idea of endianness.
I realize that for big-endian the most significant byte comes first and the bytes work down the memory block.
If I have 0x03000201 and the data is being parsed to output 0 1 2, how does this happen, and what is being done to make that work in terms of bits, bytes, etc.?
I am wondering, in the example posted below, how the numbers are extracted to form 0 1 2 when printing out the data stored in the variables.
For example, I am creating a couple of lines of the binary blob using this file:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *file;
    int buffer = 0x03000201;
    int buffer2 = 0x010203;

    file = fopen("test.bin", "wb");
    if (file != NULL)
    {
        fwrite(&buffer, sizeof(buffer), 1, file);
        fwrite(&buffer2, sizeof(buffer2), 1, file);
        fclose(file);
    }
    return 0;
}
I then created a Python script to parse this data:
import struct

with open('test.bin', 'rb') as f:
    while True:
        data = f.read(4)
        if not data:
            break
        var1, var2, var3 = struct.unpack('=BHB', data)
        print(var1, var2, var3)
Big or little endianness defines how to interpret a sequence of bytes longer than one byte and how to store those in memory. Wikipedia will help you with that.
I was really just looking to understand how 0x03000201, when read 2 bytes at a time and reprinted, yields 0 1 2.
You don't read 2 bytes at a time, you read 4 bytes: data = f.read(4).
f.read(size) reads up to size bytes of data and returns them as a byte string.
You unpack data using =BHB: one byte, two bytes, one byte. Endianness comes into play only when you unpack the data; all the other I/O calls in your code deal with plain byte sequences.
Experiment with unpack(), and read the struct documentation on byte order, size, and alignment. You may also look at the file data with a hex editor of your choice.
And if, after your research, you have a concrete question, ask it here.
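If it helps to see the raw bytes, here is a small C sketch (my own illustration, in the spirit of the program that wrote test.bin) that prints the bytes of 0x03000201 in the order they sit in memory, which is exactly the order fwrite puts them in the file and the order f.read(4) hands to unpack:

#include <stdio.h>

int main(void)
{
    int buffer = 0x03000201;
    unsigned char *p = (unsigned char *)&buffer;
    size_t i;

    /* On a little-endian machine this prints 01 02 00 03:
       the least significant byte is stored first, and that is
       the byte sequence struct.unpack later sees in the file. */
    for (i = 0; i < sizeof buffer; i++)
        printf("%02x ", p[i]);
    printf("\n");
    return 0;
}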

Most efficient way to write a number in a file in C?

I need to keep track of an int number greater than 255 in a file. It is greater than the largest unsigned char, so using fputc seems unreliable (first question: is that always true?).
I could use fputs, converting the digits into characters to obtain a string; but in the program I need the number as an int too!
So, the question in the title: what is the most efficient way to write that number? Is there any way to avoid the conversion to a string?
Keep in mind that the file will then be read by another process, where the characters should become an int again.
Just write out the binary representation:
int fd;
...
int foo = 1234;
write (fd, &foo, sizeof(foo));
(and add error handling).
Or if you like FILE*
FILE *file;
...
int foo = 1234;
fwrite (&foo, sizeof(foo), 1, file);
(and add error handling).
Note that if your file is to be loaded on a different system, potentially with different endianness, you might want to ensure the byte order is fixed (e.g. most significant byte first or least significant byte first). You can use htonl, htons, etc. for this if you want. If you know the architecture loading the file is the same as the one saving it, there is no need for this.
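A minimal sketch of that, pinning the on-disk byte order to network order (big-endian) with htonl()/ntohl(); the header is the POSIX one, and the file name is illustrative only:

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl, ntohl on POSIX systems */

int main(void)
{
    uint32_t foo = 1234;
    uint32_t wire = htonl(foo);        /* fixed big-endian representation */
    FILE *file = fopen("number.bin", "wb");

    if (file == NULL)
        return 1;
    fwrite(&wire, sizeof wire, 1, file);
    fclose(file);

    /* The reading process does the reverse: */
    file = fopen("number.bin", "rb");
    if (file == NULL)
        return 1;
    if (fread(&wire, sizeof wire, 1, file) == 1)
        printf("read back %u\n", (unsigned)ntohl(wire));
    fclose(file);
    return 0;
}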

How do I point to this piece of memory correctly to treat it as my struct?

I have the following code. The file contains a bitmap image where the first bytes are 0x424d. I would expect the first printf to print BM; instead it prints BM??.
Additionally, the second printf prints 10, and I would expect it to be a larger number since the file is larger than 10 bytes.
fp = fopen("input.bmp", "r");
bmp_header_p = malloc(sizeof(bmp_header_t));
rewind(fp);
fread(bmp_header_p, sizeof(char), 14, fp);
printf("magic number = %s\n", bmp_header_p->magic);
printf("file size = %" PRIu32 "\n", bmp_header_p->filesz);
typedef struct {
    uint8_t magic[2];    /* the magic number used to identify the BMP file:
                            0x42 0x4D (hex code points for B and M).
                            The following entries are possible:
                            BM - Windows 3.1x, 95, NT, ... etc
                            BA - OS/2 Bitmap Array
                            CI - OS/2 Color Icon
                            CP - OS/2 Color Pointer
                            IC - OS/2 Icon
                            PT - OS/2 Pointer. */
    uint32_t filesz;     /* the size of the BMP file in bytes */
    uint16_t creator1;   /* reserved */
    uint16_t creator2;   /* reserved */
    uint32_t bmp_offset; /* the offset of the byte where the bitmap data
                            can be found. */
} bmp_header_t;
%s is for null-terminated strings. magic is just an array of 2 bytes, not a string.
printf("magic number = %c%c\n", bmp_header_p->magic[0], bmp_header_p->magic[1]);
You have several problems:
The compiler is likely adding padding into your data structure between the magic and filesz members. There are compiler-specific extensions such as pragmas and attributes you can use to avoid this behavior and get packed structures (Visual Studio, GCC). But these are not portable and should be avoided if possible.
The %s format specifier expects a null-terminated string, but the string you're passing is not terminated. You should instead use a specifier such as %.2s to print at most 2 characters.
The bitmap file format is little-endian. If your computer is also little-endian, then it will appear to work correctly, but when you compile for a big-endian architecture, you'll find it will suddenly stop working. You must endian-swap any multibyte values like filesz on big-endian platforms.
The call to rewind(3) is not necessary—the file pointer is guaranteed to be at the start of the file after you open it.
There are many ways to solve problem (1); there are countless articles and answers about how to do serialization correctly in C. I'd recommend just reading each member individually so that you don't have to worry about how the compiler lays out your structure, and it will be fully portable to all platforms.
Also, sizeof(char) is guaranteed to be 1 by the C standard, so there's rarely a need to explicitly write out sizeof(char) in e.g. function arguments to fread(3).
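A hedged sketch of that member-by-member approach (the field names follow the struct in the question; the helper function is mine), with the little-endian assembly of filesz spelled out so it works on any host:

#include <stdio.h>
#include <stdint.h>

/* Read the 14-byte BMP file header field by field, so struct padding
   and host endianness never enter the picture. Returns 0 on success. */
static int read_bmp_header(FILE *fp, uint8_t magic[2], uint32_t *filesz)
{
    uint8_t buf[14];

    if (fread(buf, sizeof buf, 1, fp) != 1)
        return -1;

    magic[0] = buf[0];
    magic[1] = buf[1];

    /* filesz is stored little-endian at offset 2 */
    *filesz = (uint32_t)buf[2]
            | ((uint32_t)buf[3] << 8)
            | ((uint32_t)buf[4] << 16)
            | ((uint32_t)buf[5] << 24);
    return 0;
}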
When you print something as a string, printf keeps going until it reaches a null terminator. So although the first two bytes might represent BM, you did nothing to ensure that printing stops there. Unless the next byte in memory is '\0', printf keeps looking for characters and printing them. It turns out the next two are "unprintable" and come out as ??. After that there happens to be a '\0', so output stops...
As for the second point: the format specification says the integers are stored in little-endian format. You are doing nothing to make sure you interpret the number that way; it's possible you are in fact working on a big-endian machine, in which case the number 10 is really 10*256*256*256 = 167772160. Of course there is no guarantee that numbers in a structure are aligned how you think; it's possible there is some (compiler, platform) specific padding going on. And then there's the question of what happens when a binary file is opened in "r" rather than "rb" mode...
Here is a possible way to tackle these things:
#define HEADER_SIZE 14
#define SIZE_START 2

fp = fopen("input.bmp", "rb");
unsigned char *bmp_header_p = malloc(HEADER_SIZE);
fread(bmp_header_p, HEADER_SIZE, 1, fp);
printf("magic number = %c%c\n", bmp_header_p[0], bmp_header_p[1]);

/* filesz is stored little-endian, so assemble it most significant byte first */
long int fileSize = 0, ii;
for (ii = 3; ii >= 0; ii--)
    fileSize = 256 * fileSize + bmp_header_p[ii + SIZE_START];
printf("file size = %ld\n", fileSize);
Just fill a 3-char null-terminated array:
char label[3] = "00";
label[0] = magic[0];
label[1] = magic[1];
printf("%s\n", label);

Conversion from binary file to hex in C

I am trying to write a simple program for uploading files to my server. I'd like to convert binary files to hex. I have written something, but it does not work properly.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

static int bufferSize = 1024;
FILE *source;
FILE *dest;
int n;
int counter;

int main() {
    unsigned char buffer[bufferSize];

    source = fopen("server.pdf", "rb");
    if (source) {
        dest = fopen("file_test", "wb");
        while (!feof(source)) {
            n = fread(buffer, 1, bufferSize, source);
            counter += n;
            strtol(buffer, NULL, 2);
            fwrite(buffer, 1, n, dest);
        }
    }
    else {
        printf("Error");
    }
    fclose(source);
    fclose(dest);
}
I use strtol to convert binary to hex. After running this code I still have strange characters in my file_test file.
I'd like to upload a file to a server, for example a PDF file. But first I have to write a program that will convert this file to a hex file. I'd like the length of a line in the hex file to be equal to 1024. After that, I will upload this file line by line with PL/SQL.
EDIT: I completely misunderstood what the OP was aiming for. He wants to convert his PDF file to its hex representation, as I see now, because he wants to put that file in a text blob field in some database table. I still claim the exercise is a complete waste of time, since blobs can contain binary data: that's what blobs were invented for. Blob means binary large object.
You said: "I'd like to upload a file to a server, for example a PDF file. But first I have to write a program that will convert this file to a hex file."
You don't have to, and must not, write any such conversion program.
You have to first understand and internalize the idea that hex notation is only an easy-to-read representation of binary. If you think, as you seem to, that you have to "convert" a pdf file to hex, then you are mistaken. A pdf file is a binary file is a binary file. You don't "convert" anything, not unless you want to change the binary!
You must abandon, delete, discard, defenestrate, forget about, and expunge your notion of "converting" any binary file to anything else. See, hex exists only as a human-readable presentation format for binary, each hex digit representing four contiguous binary digits.
To put it another way: hex representation is for human consumption only, unsuitable (almost always) for program use.
For an example: suppose your pdf file holds a four-bit string "1100," whose human-readable hex representation can be 'C'. When you "convert" that 1100 to hex the way you want to do it, you replace it by the ASCII character 'C', whose decimal value is 67. You can see right away that's not what you want to do and you immediately see also that it's not even possible: the decimal value 67 needs seven bits and won't fit in your four bits of "1100".
HTH
Your code is fantastically confused.
It's reading in the data, then doing a strtol() call on it, with a base of 2, and then ignoring the return value. What's the point in that?
To convert the first loaded byte of data to hexadecimal string, you should probably use something like:
char hex[8];
sprintf(hex, "%02x", (unsigned int) buffer[0] & 0xff);
Then write hex to the output file. You need to do this for all bytes loaded, of course, not just buffer[0].
Also, as a minor point, you can't call feof() before you've tried reading the file. It's better to not use feof() and instead check the return value of fread() to detect when it fails.
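Putting those points together, a sketch of the whole conversion loop might look like this (file names taken from the question; the chunk size and one-chunk-per-line layout are my own choices, adjust them to get the line length you need):

#include <stdio.h>

int main(void)
{
    unsigned char buffer[1024];
    char hex[3];
    size_t n, i;

    FILE *source = fopen("server.pdf", "rb");
    FILE *dest = source ? fopen("file_test", "w") : NULL;
    if (source == NULL || dest == NULL) {
        printf("Error\n");
        return 1;
    }

    /* Check fread's return value instead of calling feof() up front. */
    while ((n = fread(buffer, 1, sizeof buffer, source)) > 0) {
        for (i = 0; i < n; i++) {
            sprintf(hex, "%02x", (unsigned)buffer[i]);   /* two hex digits per byte */
            fputs(hex, dest);
        }
        fputc('\n', dest);                               /* one chunk per output line */
    }

    fclose(source);
    fclose(dest);
    return 0;
}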
strtol converts a string containing a textual representation of a number (in the given base) to its binary value, if I am not mistaken. You probably want the opposite: to convert something like the binary bytes of OK to 4F 4B... To do that you can use, for example, sprintf(aString, "%x", aChar).
