Converting Binary File to Text File in C

I'm having an issue trying to convert a Binary File into a text file. Right now, I'm getting an output of "hello 16". I should be getting 5 lines of output, in which the first line should be "hello 32". I'm unsure where I went wrong, but I've been trying to figure it out for a few hours now.
void BinaryToText(char *inputFile, char *outputFile) {
unsigned char str[256];
unsigned int num;
int fileLen;
FILE *finp;
FILE *fout;
finp = fopen(inputFile, "r");
fout = fopen(outputFile, "w");
fseek(finp, 0, SEEK_END);
fileLen = ftell(finp);
fseek(finp, 0, SEEK_SET);
while (fread(&fileLen, sizeof(char), 1, finp) == 1) {
fread(&str, sizeof(str), 1, finp);
fread(&num, sizeof(int), 1, finp);
fprintf(fout, "%s %d\n", str, num);
}
fclose(finp);
fclose(fout);
}

Your binary file format seems awkward:
you open the input file with "r": it should be opened in binary mode with "rb".
you first attempt to determine the file size with ftell(). Be aware that this will not work for pipes and devices. In your case it does not matter, because you never use that value: the loop overwrites fileLen.
in the loop:
you read a single byte and store it in one part of fileLen, leaving the other bytes of the int unchanged.
you then always read 256 bytes into str, regardless of the length you just read.
you read a number directly, assuming the byte order and size of int are what you expect.
you then print the string, assuming there was a '\0' in the file; otherwise you invoke undefined behavior.
It is hard to tell what is wrong without seeing the writing code.
Note that the binary file should be opened with "rb" to prevent spurious conversion of linefeed sequences on some platforms, notably Windows.
EDIT:
From the extra information provided in the comments, here is a modified version that should parse your binary file more appropriately:
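void BinaryToText(char *inputFile, char *outputFile) {
unsigned char str[256];
unsigned int num; // assuming 32 bit ints
int len;
FILE *finp = fopen(inputFile, "rb");
FILE *fout = fopen(outputFile, "w");
if (finp == NULL || fout == NULL) { // give up if either file cannot be opened
if (finp) fclose(finp);
if (fout) fclose(fout);
return;
}
while ((len = fgetc(finp)) != EOF) { // first byte of each record: string length
if (fread(str, 1, len, finp) != (size_t)len)
break; // truncated record
str[len] = '\0';
num = (unsigned int)fgetc(finp) << 24; // number is stored most significant byte first
num |= (unsigned int)fgetc(finp) << 16;
num |= (unsigned int)fgetc(finp) << 8;
num |= (unsigned int)fgetc(finp);
fprintf(fout, "%s %u\n", (char*)str, num); // %u matches unsigned int
}
fclose(finp);
fclose(fout);
}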
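The same parsing can be split into a helper that also reports truncated records. This is only a sketch: the record layout (one length byte, the string bytes, then a 32-bit integer stored most significant byte first) is assumed from the modified version above, and the names read_record and convert_records are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Reads one record: 1 length byte, that many string bytes, then a 32-bit
 * big-endian integer (layout assumed, see the lead-in).
 * Returns 1 on success, 0 on end of file or on a truncated record.
 * str must have room for 256 bytes (255 characters plus the terminator). */
int read_record(FILE *in, char *str, uint32_t *num)
{
    unsigned char b[4];
    int len = fgetc(in);

    if (len == EOF)
        return 0;                               /* clean end of file */
    if (fread(str, 1, (size_t)len, in) != (size_t)len)
        return 0;                               /* string cut short */
    str[len] = '\0';

    if (fread(b, 1, sizeof b, in) != sizeof b)
        return 0;                               /* integer cut short */
    /* Assemble the value byte by byte so the host byte order is irrelevant. */
    *num = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    return 1;
}

/* Writes one "string number" line per record; returns the record count. */
int convert_records(FILE *in, FILE *out)
{
    char str[256];
    uint32_t num;
    int count = 0;

    while (read_record(in, str, &num)) {
        fprintf(out, "%s %lu\n", str, (unsigned long)num);
        count++;
    }
    return count;
}
```

The input stream passed to convert_records should come from fopen(inputFile, "rb"); the output can be an ordinary text stream.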

Related

Broken files in C

I wrote a simple program that copies the contents of one file into another.
Here's the code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char filename[1024];
scanf("%s", &filename);
// printf("Filename: '%s'\n", filename);
int bytesToModify; scanf("%d", &bytesToModify);
FILE *fp;
fp = fopen(filename, "r");
fseek(fp, 0, SEEK_END);
int fSize = ftell(fp);
fseek(fp, 0, SEEK_SET);
printf("%d\n", fSize);
char *buf = malloc(fSize*sizeof(char));
for (int i = 0; i < fSize; i++) {
buf[i] = getc(fp);
}
fclose(fp);
FILE *fo;
fo = fopen("out_file.txt", "w");
for (int i = 0; i < fSize; i++) {
fwrite(&buf[i], 1, 1, fo);
}
fclose(fo);
return 0;
}
Even on a small file like this I can see the artifact: the Cyrillic symbol 'я' appears at the end of the file.
If I try to copy an executable file, I get this:
99% of the file just turned into these symbols. What is wrong with my code?
I'm using CodeBlocks with GCC Compiler, version 10.1.0.
My operating system is Windows 10.
Thanks for your help.
You did not open the files in binary mode: "rb" and "wb". Therefore, getc will turn every \r\n into a single \n.
For each line terminator, one character fewer is read than ftell reported. Yet you attempt to read fSize characters nevertheless, so getc eventually returns EOF (note that getc returns an int, not a char). EOF has the value -1 on Windows; stored in a char and written to the file, it becomes the byte 0xFF, which is 'я' in the encoding you're using in Notepad (most likely Windows-1251). In an executable, the first 0x1A (Ctrl+Z) byte is additionally treated as end of file in text mode, so every read after it returns EOF.
Furthermore, since you're using fwrite, you could similarly use fread. There is no need to read and write the characters one at a time; just use
char *buf = malloc(fSize);
size_t bytesRead = fread(buf, 1, fSize, fp); // fread returns a size_t count
fclose(fp);
and
size_t bytesWritten = fwrite(buf, 1, bytesRead, fo);
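Put together, a byte-for-byte copy might look like the sketch below. It reads in fixed-size chunks instead of sizing the buffer with ftell, which is a swapped-in design choice rather than what the question did; the function name copy_file and the 4096-byte chunk size are arbitrary.

```c
#include <stdio.h>

/* Copies src to dst byte for byte, both opened in binary mode.
 * Returns the number of bytes copied, or -1 if a file cannot be opened
 * or a read or write fails. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    size_t n;
    int ok = 1;
    FILE *in, *out;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* fread returns how many bytes it actually delivered; write only those. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
        total += (long)n;
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)       /* a failed flush also counts as an error */
        ok = 0;
    return ok ? total : -1;
}
```

Because no character is ever compared with EOF and no size is computed up front, bytes such as 0x1A or 0xFF and \r\n pairs pass through unchanged.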

Getting excess characters with fread() in C

Okay, so I have tried to read a whole file with fread(), and I can do it successfully, but the longer the file, the more excess characters I get in the output.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main() {
FILE* fpointer = fopen("test.txt", "r");
char* wholeFile;
long int fileSize;
if (fpointer == NULL) return 0;
fseek(fpointer, 0, SEEK_END);
fileSize = ftell(fpointer);
rewind(fpointer);
printf("fileSize == %ld\n", fileSize);
wholeFile = (char*)malloc(fileSize + 1);
if (wholeFile == NULL) return 1;
fread(wholeFile, sizeof(char), fileSize, fpointer);
fclose(fpointer);
wholeFile[fileSize] = '\0';
printf("This is whole file:\n\n%s", wholeFile);
free(wholeFile);
return 0;
}
If the file looks like this:
This is cool file.
I get this as output:
This is cool file.²²²²
And if the file is like this:
This
is
cool
file.
I get this as the output:
This
is
cool
file.═══²²²²
Any idea where I'm wrong?
EDIT: Edited code according to comments.
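You need to allocate one more byte than the size of the file and set the last position in the buffer to 0.
printf's %s expects the character array to be null terminated.
Use "rb" to open the file in binary mode. In text mode on Windows, each \r\n is translated to a single \n, so fread delivers fewer characters than the size ftell reported and the rest of the buffer is left uninitialized. Binary mode makes the two counts match.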
FILE* fpointer = fopen("test.txt", "rb");
wholeFile = (char*)malloc(fileSize + 1);
wholeFile[fileSize] = '\0';
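A slightly more defensive variant places the terminator after the number of bytes fread actually returned, so the string is well formed even if the two counts disagree. The following is a minimal sketch; the function name read_whole_file and its size_out parameter are invented for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file at path into a freshly allocated, null-terminated
 * buffer. Stores the number of bytes read in *size_out when it is not NULL.
 * Returns NULL on failure; the caller frees the result. */
char *read_whole_file(const char *path, size_t *size_out)
{
    FILE *fp;
    long size;
    size_t got;
    char *buf;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    buf = malloc((size_t)size + 1);     /* one extra byte for '\0' */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    buf[got] = '\0';                    /* terminate after the bytes read */
    if (size_out != NULL)
        *size_out = got;
    return buf;
}
```

The result can be printed with printf("%s", ...) as in the question, provided the file itself contains no embedded zero bytes.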

Reading hex value from image in C

I'm trying to read the hex values from an image file using C. In Linux, this code works fine, but with Windows it reads only the first 334 bytes and I don't understand why.
The code to read the file is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
void readHexFile(char* path) {
FILE *fp;
if ((fp = fopen (path, "r")) != NULL) {
struct stat st;
stat(path, &st);
int i;
int ch;
for (i = 0; i < st.st_size; i++) {
ch = fgetc(fp);
printf("%x ", ch);
}
fclose(fp);
}
else {
return NULL;
}
}
st.st_size comes from the <sys/stat.h> header and contains the right value (the size, in bytes, of the image file).
This image shows what my program outputs, and the actual binary content of the file it is reading:
As you see after the sequence of 17, 18, 19 there is also hex values but my program prints ffffffff repeatedly.
You opened the file in text mode, not in binary mode. Different platforms may behave differently.
In this case, Microsoft Windows decided that this plain text file ends at the first occurrence of Ctrl+Z (0x1A), and fgetc returns EOF for every call afterwards.
Explicitly state that you want to open the file as binary:
fp = fopen ("yourfile", "rb");
and the problem goes away.
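For illustration, a dump routine that opens the file in binary mode and stops on the return value of fgetc could be sketched as follows. The name hex_dump and the choice to write to a caller-supplied stream are assumptions of this example, not part of the question.

```c
#include <stdio.h>

/* Prints every byte of the file at path to out as two hex digits followed
 * by a space. Returns the number of bytes printed, or -1 if the file
 * cannot be opened. */
long hex_dump(const char *path, FILE *out)
{
    long count = 0;
    int ch;                     /* int, so EOF is distinct from byte 0xFF */
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((ch = fgetc(fp)) != EOF) {
        fprintf(out, "%02x ", (unsigned)ch);
        count++;
    }
    fclose(fp);
    return count;
}
```

Calling hex_dump(path, stdout) prints to the console; with "rb", a 0x1A byte is reported like any other byte instead of ending the input.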
I think your loop should look like this:
int ch;
while ((ch = fgetc(fp)) != EOF) { // test fgetc's result; feof() only becomes true after a read has already failed
printf("%x ", ch);
}
It's completely unclear to me why you are using st.st_size here.
On Windows, the character 0x1A (Ctrl+Z) is the EOF character for text mode; see this question.
If you're reading from a binary file like a JPEG, first open the file as binary (fopen mode "rb"), then fread into a pre-allocated buffer whose size you determine with ftell while the file pointer is at the end of the file:
size_t i, len;
unsigned char *buffer = NULL; // unsigned, so bytes >= 0x80 do not print as FFFFFFxx
fp = fopen(argv[1], "rb");
if(!fp)
{
// handle error
return 1;
}
fseek(fp, 0, SEEK_END);
len = ftell(fp);
rewind(fp);
buffer = malloc(len + 1);
if(!buffer)
{
// handle error
fclose(fp);
return 1;
}
len = fread(buffer, 1, len, fp); // keep only the bytes actually read
fclose(fp);
for(i = 0; i < len; i++)
{
printf("%.2X ", buffer[i]);
}
free(buffer);
buffer = NULL;

Read and write a buffer of unsigned char to a file in C?

The following code writes an array of unsigned char (defined as byte) to a file:
typedef unsigned char byte;
void ToFile(byte *buffer, size_t len)
{
FILE *f = fopen("out.txt", "w");
if (f == NULL)
{
fprintf(stderr, "Error opening file!\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < len; i++)
{
fprintf(f, "%u", buffer[i]);
}
fclose(f);
}
How do I read the file back from out.txt into a buffer of byte? The goal is to iterate the buffer byte by byte. Thanks.
If you want to read it back, I wouldn't use %u to write it out. %u produces variable-width output, so a 1 takes one character, a 12 takes two, and so on. When you read it back and see 112, you don't know whether that's three values (1, 1, 2), two (11, 2; or 1, 12), or just one (112). If you need an ASCII file, use a fixed-width format such as %03u, so that each byte always takes 3 characters. Then you can read one byte at a time with fscanf(f, "%3hhu", &buffer[i]).
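A round trip with the fixed-width format might look like this sketch. The names write_fixed and read_fixed are hypothetical, and the reader scans into an unsigned int with %3u and range-checks it, a variation chosen here because it does not depend on runtime support for the hh length modifier.

```c
#include <stdio.h>

typedef unsigned char byte;

/* Writes each byte as exactly three decimal digits, e.g. 7 becomes "007".
 * Returns 1 on success, 0 if a write fails. */
int write_fixed(FILE *f, const byte *buffer, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (fprintf(f, "%03u", (unsigned)buffer[i]) != 3)
            return 0;
    }
    return 1;
}

/* Reads back at most len bytes written by write_fixed.
 * Returns the number of bytes stored in buffer. */
size_t read_fixed(FILE *f, byte *buffer, size_t len)
{
    size_t i = 0;
    unsigned value;

    /* %3u consumes at most three digits, so "001001002" parses as 1, 1, 2. */
    while (i < len && fscanf(f, "%3u", &value) == 1 && value <= 255)
        buffer[i++] = (byte)value;
    return i;
}
```

The file is three times larger than the raw data, which is the price of keeping it readable as text.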
Something similar to this should work for you. (Not debugged, doing this away from my compiler)
void FromFile(byte *buffer, size_t len)
{
FILE *fOut = fopen("out.txt", "rb");
int cOut;
int i = 0;
if (fOut == NULL)
{
fprintf(stderr, "Error opening file!\n");
exit(EXIT_FAILURE);
}
cOut = fgetc(fOut);
while(cOut != EOF && (size_t)i < len) // stop at len so the buffer cannot overflow
{
buffer[i++] = cOut; //iterate buffer byte by byte
cOut = fgetc(fOut);
}
fclose(fOut);
}
You could (and should) use fread() and fwrite() (http://www.cplusplus.com/reference/cstdio/fread/) for transferring raw bytes between a FILE and memory.
To determine the size of the file (to advise fread() how many bytes it should read) use fseek(f, 0, SEEK_END) (http://www.cplusplus.com/reference/cstdio/fseek/) to place the cursor to the end of the file and read its size with ftell(f) (http://www.cplusplus.com/reference/cstdio/ftell/). Don't forget to jump back to the beginning with fseek(f, 0, SEEK_SET) for the actual reading process.

Can't copy the beginning (first 64 bytes) of one .wav to an empty one

I have to copy the first 64 bytes from the input file in.wav into the output file out.wav.
(I've downloaded a program which shows a .wav file's header (chunks): the first 44 bytes and the first 20 bytes of data_subchunk.)
My code fills out.wav with some values, but I'm convinced they are garbage. (The values that the program shows don't match.)
I have to copy a part of in.wav file into out.wav:
#include <stdio.h>
#include <stdlib.h>
typedef struct FMT
{
char Subchunk1ID[4];
int Subchunk1Size;
short int AudioFormat;
short int NumChannels;
int SampleRate;
int ByteRate;
short int BlockAlign;
short int BitsPerSample;
} fmt;
typedef struct DATA
{
char Subchunk2ID[4];
int Subchunk2Size;
int Data[441000]; // 10 secs of garbage. he-he)
} data;
struct HEADER
{
char ChunkId[4];
int ChunkSize;
char Format[4];
fmt S1;
data S2;
} header;
int main()
{
FILE *input = fopen("in.wav", "r");
FILE *output = fopen("out.wav", "w");
if(input == NULL)
{
printf("Unable to open wave file\n");
exit(EXIT_FAILURE);
}
fwrite(&input, sizeof(int), 16, output); // 16'ints' * 4 = 64 bytes
fclose(input);
fclose(output);
return 0;
}
What is wrong?
You are trying to write to the opened output file directly from the input FILE pointer:
fwrite(&input, sizeof(int), 16, output);
I am not sure whether you can use a FILE pointer this way. I would do it like this:
unsigned char buf[64];
fread(&buf, sizeof(char), 64, input);
fwrite(&buf, sizeof(char), 64, output);
Also open files in binary mode: fopen("in.wav", "rb") and fopen("out.wav", "wb")
You can't use a FILE pointer as input to fwrite. You actually have to read the data first into a buffer, before you write that buffer.
First, use binary mode for fopen ("rb" and "wb" instead of "r" and "w"). Second, you are making an unsafe assumption, namely that sizeof (int) == 4, which is not always true. Third, input is already a pointer; you don't need the & before input on the fwrite line.
And, most importantly, you cannot use fwrite from one file handle to another. You need to copy the data into a char[] buffer first, like:
char buffer[40];
if (fread(buffer, sizeof (char), 40, input) != 40) {
printf("Error!");
}
if (fwrite(buffer, sizeof (char), 40, output) != 40) {
printf("Error!");
}
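For the 64 bytes the question asks about, the read-then-write step can be wrapped in a small helper. This is a sketch under the assumption that both streams were opened in binary mode; the name copy_prefix is invented for this example.

```c
#include <stdio.h>

/* Copies up to count bytes from in to out through a small buffer.
 * Returns the number of bytes copied, which is less than count if the
 * input is shorter or a read or write fails. */
size_t copy_prefix(FILE *in, FILE *out, size_t count)
{
    unsigned char buf[64];
    size_t copied = 0;

    while (copied < count) {
        size_t want = count - copied;
        size_t got;

        if (want > sizeof buf)
            want = sizeof buf;
        got = fread(buf, 1, want, in);
        if (got == 0)
            break;                          /* end of input or read error */
        if (fwrite(buf, 1, got, out) != got)
            break;                          /* write error */
        copied += got;
    }
    return copied;
}
```

A call such as copy_prefix(input, output, 64), with input from fopen("in.wav", "rb") and output from fopen("out.wav", "wb"), copies the header bytes unchanged; comparing the return value with 64 detects a short file.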

Resources