C in UNIX: Reading/combining files based upon number of bytes - c

I am trying to fix the code below so that it reads only the first N bytes of each file. I would also like to do the same thing for the last N bytes (I assume that would involve just adding a '-' in front of the number of bytes N). I am not sure whether fgets is the correct function for doing so.
I tried changing the 1000 in
while(fgets(buffer, 1000, fp))
but I do not think changing that value will read a specific number of bytes, as I have read that it is only a maximum.
char buffer[1001];
int main(int argc, char** argv) {
bzero(buffer, sizeof(buffer));
for(int x=1; x<argc; x++) {
FILE *fp = fopen(argv[x], "r+");
if (fp) {
while(fgets(buffer, 1000, fp)) {
printf("%s", buffer);
}
} else {
printf("could not open file %s\n", argv[x]);
}
}
}

Assuming that you want the first 1000 bytes and the last 1000 bytes of a file, and largely ignoring problems with files smaller than 2000 bytes (it works, but you might want a different result), you could use:
#include <stdio.h>

enum { NUM_BYTES = 1000 };

int main(int argc, char **argv)
{
    for (int x = 1; x < argc; x++)
    {
        FILE *fp = fopen(argv[x], "r");
        if (fp)
        {
            char buffer[NUM_BYTES];
            /* First NUM_BYTES bytes (fewer if the file is short) */
            size_t nbytes = fread(buffer, 1, NUM_BYTES, fp);
            fwrite(buffer, 1, nbytes, stdout);
            /* Last NUM_BYTES bytes; the seek fails if the file is too small */
            if (fseek(fp, -NUM_BYTES, SEEK_END) == 0)
            {
                nbytes = fread(buffer, 1, NUM_BYTES, fp);
                fwrite(buffer, 1, nbytes, stdout);
            }
            fclose(fp);
        }
        else
        {
            fprintf(stderr, "%s: could not open file %s\n", argv[0], argv[x]);
        }
    }
}
This uses fread(), fwrite() and fseek() as suggested in the comments.
It also takes care to close successfully opened files. It does not demand write permissions on the files since it only reads and does not write those files (using "r" instead of "r+" in the call to fopen()).
If the file is smaller than 1000 bytes, the fseek() will fail because it tries to seek to a negative offset. If that happens, don't bother to read or write another 1000 bytes.
I debated whether to use sizeof(buffer) or NUM_BYTES in the function calls. I decided that NUM_BYTES was better, but the choice is not definitive — there are cogent arguments for using sizeof(buffer) instead.
Note that buffer becomes a local variable. There's no need to zero it; only the entries that are written on by fread() will be written by fwrite(), so there is no problem resolved by bzero(). (There doubly wasn't any point in that when the variable was global; variables with static duration are default initialized to all bytes zero anyway.)
The error message is written to standard error.
The code doesn't check for zero bytes read; arguably, it should.
If the NUM_BYTES becomes a parameter (e.g. you call your program fl19 and use fl19 -n 200 file1 to print the first and last 200 bytes of file1), then you need to do some tidying up as well as command-line argument handling.
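If the byte count does become a run-time value, the read/seek/read core can be factored into a function. The following is only a sketch: print_head_tail() is a hypothetical name (it is not part of the answer above), and it assumes the input stream is seekable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: copy the first n and the last n bytes of `in` to `out`.
 * Returns 0 on success, -1 if n is not positive or allocation fails.
 * Assumes `in` is seekable. If the file is shorter than n bytes the
 * fseek() fails and only the head is written. */
int print_head_tail(FILE *in, FILE *out, long n)
{
    if (n <= 0)
        return -1;

    char *buffer = malloc((size_t)n);
    if (buffer == NULL)
        return -1;

    /* Head: fread() reports how many bytes it actually delivered. */
    size_t nbytes = fread(buffer, 1, (size_t)n, in);
    fwrite(buffer, 1, nbytes, out);

    /* Tail: seek to n bytes before the end, then read again. */
    if (fseek(in, -n, SEEK_END) == 0)
    {
        nbytes = fread(buffer, 1, (size_t)n, in);
        fwrite(buffer, 1, nbytes, out);
    }

    free(buffer);
    return 0;
}
```

In a program such as the fl19 example, the value following -n could be converted with strtol() and passed as n, with stdout as the output stream.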

Related

Copy Function in C not creating matching Checksums

I have written a simple copy program that copies a file and generates an MD5 hash. It runs and generates the MD5 correctly.
However when verifying the file generated by the copy function it does not match the source MD5. I can't see any reason for this in my code, can anyone help?
#include <stdio.h>
#include <openssl/md5.h>
#include <assert.h>
#define BUFFER_SIZE 512
int secure_copy(char *filepath, char *destpath);
int main(int argc, char * argv[]) {
secure_copy(argv[1], argv[2]);
return 0;
}
int secure_copy(char *filepath, char *destpath) {
FILE *src, *dest;
src = fopen(filepath, "r");
assert(src != NULL);
dest = fopen(destpath, "w");
assert(dest != 0);
MD5_CTX c;
char buf[BUFFER_SIZE];
ssize_t bytes, out_writer;
unsigned char out[MD5_DIGEST_LENGTH];
MD5_Init(&c);
while((bytes = fread(buf, 1, BUFFER_SIZE, src)) != 0) {
MD5_Update(&c, buf, bytes);
out_writer = fwrite(buf, 1, BUFFER_SIZE, dest);
assert(out_writer != 0);
}
MD5_Final(out, &c);
printf("MD5: ");
for (int i=0; i < MD5_DIGEST_LENGTH; i++)
{
printf("%02x", out[i]);
}
printf("\n");
fclose(src);
fclose(dest);
return 0;
}
Output
$ ./md5speed doc.txt /home/doc.txt
MD5: 4c55e4b9185eece3cc000c4023f8f6fe
When I verify the copied file with md5sum, I get a completely different hash.
md5sum doc.txt
29cb4da30c3e28fdb81463b5f0a76894 doc.txt
The file still opens, though, and its content appears uncorrupted.
regarding:
while((bytes = fread(buf, 1, BUFFER_SIZE, src)) != 0)
and
out_writer = fwrite(buf, 1, BUFFER_SIZE, dest);
On the last read, the number of bytes read can be less than BUFFER_SIZE, so always use the bytes variable as the number of bytes to write.
Also, errors can occur when calling fread() and fwrite(). Both functions return a size_t, so they never return a negative value; a problem is indicated by a return value smaller than the count requested (the third parameter). After a short read, use feof() and ferror() to tell end of file apart from an error. To be robust, the code must check those values and handle any errors that occur. For the same reason, bytes and out_writer should be declared as size_t rather than ssize_t.
As stated in the comments, changing the fwrite() call to use bytes instead of BUFFER_SIZE, combined with changing the file modes to binary ("rb" and "wb"), resolves the mismatch.
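The corrected loop might look like the sketch below. The MD5 calls are deliberately left out so that the example needs only the standard library, and copy_file() is an illustrative name rather than the asker's secure_copy().

```c
#include <stdio.h>

/* Illustrative sketch: copy src_path to dest_path in binary mode,
 * writing exactly as many bytes as each fread() returned.
 * Returns 0 on success, -1 on any open, read, or write error. */
int copy_file(const char *src_path, const char *dest_path)
{
    FILE *src = fopen(src_path, "rb");
    if (src == NULL)
        return -1;

    FILE *dest = fopen(dest_path, "wb");
    if (dest == NULL)
    {
        fclose(src);
        return -1;
    }

    char buf[512];
    size_t bytes;
    int status = 0;
    while ((bytes = fread(buf, 1, sizeof buf, src)) > 0)
    {
        /* A hash update over buf[0..bytes) would go here. */
        if (fwrite(buf, 1, bytes, dest) != bytes)
        {
            status = -1;
            break;
        }
    }
    if (ferror(src))
        status = -1;

    fclose(src);
    if (fclose(dest) != 0)   /* a failed flush also counts as a write error */
        status = -1;
    return status;
}
```

Because the hash and the output file both see the same `bytes` count, the copy ends up with the same length and content as the source, so its checksum should match.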

not getting all data in file using fopen

I'm using the fopen with fread for this:
FILE *fp;
if (fopen_s(&fp, filePath, "rb"))
{
printf("Failed to open file\n");
//exit(1);
}
fseek(fp, 0, SEEK_END);
int size = ftell(fp);
rewind(fp);
char buffer = (char)malloc(sizeof(char)*size);
if (!buffer)
{
printf("Failed to malloc\n");
//exit(1);
}
int charsTransferred = fread(buffer, 1, size, fp);
printf("charsTransferred = %d, size = %d\n", charsTransferred, strlen(buffer));
fclose(fp);
I'm not getting the file data in the new file. Here is a comparison between the original file (right) and the one that was sent over the network (left):
Any issues with my fopen calls?
EDIT: I can't do away with the null terminators, because this is a PDF. If i get rid of them the file will corrupt.
Be reassured: the way you're doing the read ensures that you're reading all the data.
you're using "rb" so even in windows you're covered against CR+LF conversions
you're computing the size all right using ftell when at the end of the file
you rewind the file
you allocate properly.
BUT you're not storing the right variable type:
char buffer = (char)malloc(sizeof(char)*size);
should be
char *buffer = malloc(size);
(That is very wrong and you should correct it, but since you successfully print some data, it is not the main issue. Next time, enable and read the compiler warnings. And don't cast the return value of malloc; it is error-prone, especially in your case.)
Now, what confuses you is the display using printf and strlen.
Since the file is binary, you meet a \0 somewhere, and printf prints only the start of the file. If you want to print the contents, you have to perform a loop and print each character (using charsTransferred as the limit).
That's the same for strlen which stops at the first \0 character.
The value in charsTransferred is correct.
To display the data, you could use fwrite to stdout (redirect the output or this can crash your terminal because of all the junk chars)
fwrite(buffer, 1, size, stdout);
Or loop and print only if the char is printable (I'd compare ascii codes for instance)
int charsTransferred = fread(buffer, 1, size, fp);
int i;
for (i = 0; i < charsTransferred; i++)
{
    char b = buffer[i];
    putchar((b >= ' ') && (b < 127) ? b : '-');
    if ((i + 1) % 80 == 0) putchar('\n'); // optional linefeed every 80 characters
}
fflush(stdout);
That code prints dashes for characters outside the standard printable ASCII range, and the real character otherwise.
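Putting the corrections together (pointer type, checked calls, and a length that comes from fread() rather than strlen()), reading a whole binary file could be sketched as follows. read_whole_file() is a hypothetical helper name, and the sketch assumes the file fits in memory and that ftell() reports a byte count, which holds for binary streams on common platforms.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an entire binary file into a malloc'd buffer.
 * On success returns the buffer and stores the byte count in *size_out;
 * on failure returns NULL. The caller frees the buffer.
 * The data is NOT null-terminated and may contain '\0' bytes, so use
 * *size_out (never strlen()) as its length. */
char *read_whole_file(const char *path, size_t *size_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0)
    {
        fclose(fp);
        return NULL;
    }

    /* Allocate at least one byte so an empty file does not look like failure. */
    char *buffer = malloc(size > 0 ? (size_t)size : 1);
    if (buffer == NULL)
    {
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buffer, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size)
    {
        free(buffer);
        return NULL;
    }

    *size_out = got;
    return buffer;
}
```

When sending the data over the network, the stored size is the number of bytes to transmit; any '\0' bytes inside a PDF are ordinary data.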

fread is not reading other file formats

I am still fairly new to C. The program below compiles just fine (using gcc), and it even works with text files, but when I use other file formats, e.g. png, I get nothing. The console spits out ?PNG and nothing else. I don't want the image to be displayed as an image (obviously the program does nothing like that), but I would like the data from the png file to be printed. Why is the program not fread-ing properly? Is it because fread refuses any file other than text?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
FILE *fp;
int main() {
char buffer[1000];
fp=fopen("FILE IN QUESTION HERE", "rb");
if(fp==NULL) {
perror("An error occured while opening the file...");
exit(1);
}
fread(buffer, 1000, 1, fp);
printf("%s\n", buffer);
fclose(fp);
return 0;
}
%s in printf() is for printing a null-terminated string, not binary data, and the PNG header contains a signature designed to prevent the data from being transferred as text by mistake.
(Actually, there is no 0x00 in the PNG signature itself; printf() stopped at the 0x00 contained in the length field of the IHDR chunk.)
Use fwrite() to output binary data, or print the bytes one-by-one via putchar().
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    FILE* fp; /* avoid using global variables unless it is necessary */
    char buffer[1000] = {0}; /* initialize to avoid undefined behavior */
    size_t n;
    fp=fopen("FILE IN QUESTION HERE", "rb");
    if(fp==NULL) {
        perror("An error occurred while opening the file...");
        exit(1);
    }
    n = fread(buffer, 1, sizeof buffer, fp); /* n is the number of bytes actually read */
    fwrite(buffer, 1, n, stdout); /* use fwrite instead of printf */
    fclose(fp);
    return 0;
}
fread is not reading other file formats
Code does not check the result of fread(). That is the way to determine if fread() is working.
char buffer[1000];
// fread(buffer, 1000, 1, fp);
size_t sz = fread(buffer, 1000, 1, fp);
if (sz == 0) puts("Did not read an entire block");
fread() returns the number of blocks read. In OP's case, the code attempts to read one 1000-byte block. I recommend reading 1000 blocks of 1 char each rather than 1 block of 1000 char, so that the return value is the number of bytes read. Further, avoid magic numbers.
for (;;) {
size_t sz = fread(buffer, sizeof buffer[0], sizeof buffer, fp);
if (sz == 0) break;
// Somehow print the buffer.
print_it(buffer, sz);
}
OP's call to printf() expects a pointer to a string. A C string is an array of characters up to and including the terminating null character. buffer may or may not contain a null character, and it may hold useful data after one.
// Does not work for OP
// printf("%s\n", buffer);
The data of a .png file is mostly binary and will have little textual meaning. A sample print function for mixed binary data and text follows. Most output will appear meaningless until one learns the .png file format. Untested code.
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int print_it(const unsigned char *x, size_t sz) {
    char buf[5];
    unsigned column = 0;
    while (sz > 0) {
        sz--;
        if (isgraph(*x) && *x != '(') {
            sprintf(buf, "%c", *x);
        } else {
            sprintf(buf, "(%02X)", *x); // non-printing bytes and '(' are shown as hex
        }
        x++; // advance to the next byte
        column += strlen(buf);
        if (column > 80) {
            column = strlen(buf); // buf starts the new line
            fputc('\n', stdout);
        }
        fputs(buf, stdout);
    }
    if (column > 0) fputc('\n', stdout);
    return 0;
}

Read and write a buffer of unsigned char to a file in C?

The following code writes an array of unsigned char (defined as byte) to a file:
typedef unsigned char byte;
void ToFile(byte *buffer, size_t len)
{
FILE *f = fopen("out.txt", "w");
if (f == NULL)
{
fprintf(stderr, "Error opening file!\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < len; i++)
{
fprintf(f, "%u", buffer[i]);
}
fclose(f);
}
How do I read the file back from out.txt into a buffer of byte? The goal is to iterate the buffer byte by byte. Thanks.
If you want to read it back, I wouldn't use %u to write it out. %u produces variable-width output, so a 1 takes one character, a 12 takes two, and so on. When you read it back and see 112, you don't know whether that is three values (1, 1, 2), two (11, 2; or 1, 12) or just one (112). If you need an ASCII file, use a fixed-width output such as %03u. That way each byte always takes 3 characters. Then you can read a byte at a time with fscanf(f, "%3hhu", &buffer[i]): the field width of 3 makes fscanf() consume exactly three digits, and hhu stores the result into an unsigned char.
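The fixed-width idea can be sketched as a round trip. The function names to_text_file() and from_text_file() are illustrative, and the file path is passed in rather than hard-coded as out.txt.

```c
#include <stdio.h>

typedef unsigned char byte;

/* Write each byte as exactly three decimal digits, e.g. 7 -> "007".
 * Returns 0 on success, -1 on error. */
int to_text_file(const char *path, const byte *buffer, size_t len)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    for (size_t i = 0; i < len; i++)
        fprintf(f, "%03u", (unsigned)buffer[i]);
    return fclose(f) == 0 ? 0 : -1;
}

/* Read back up to max_len bytes written by to_text_file().
 * Returns the number of bytes decoded, or -1 if the file cannot be opened. */
long from_text_file(const char *path, byte *buffer, size_t max_len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t i = 0;
    /* The width of 3 makes fscanf() consume one encoded byte per call. */
    while (i < max_len && fscanf(f, "%3hhu", &buffer[i]) == 1)
        i++;
    fclose(f);
    return (long)i;
}
```

With this encoding the bytes 1, 12 and 112 are stored as 001012112, which can be split unambiguously every three characters.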
How do I read the file back from out.txt into a buffer of byte? The goal is to iterate the buffer byte by byte. Thanks.
Something similar to this should work for you. (Not debugged, doing this away from my compiler)
void FromFile(byte *buffer, size_t len)
{
    FILE *fOut = fopen("out.txt", "rb");
    int cOut;
    size_t i = 0;
    if (fOut == NULL)
    {
        fprintf(stderr, "Error opening file!\n");
        exit(EXIT_FAILURE);
    }
    cOut = fgetc(fOut);
    while (cOut != EOF && i < len) // stop at end of file or when the buffer is full
    {
        buffer[i++] = cOut; // iterate buffer byte by byte
        cOut = fgetc(fOut);
    }
    fclose(fOut);
}
You could (and should) use fread() and fwrite() (http://www.cplusplus.com/reference/cstdio/fread/) for transferring raw data between FILE streams and memory.
To determine the size of the file (to advise fread() how many bytes it should read) use fseek(f, 0, SEEK_END) (http://www.cplusplus.com/reference/cstdio/fseek/) to place the cursor to the end of the file and read its size with ftell(f) (http://www.cplusplus.com/reference/cstdio/ftell/). Don't forget to jump back to the beginning with fseek(f, 0, SEEK_SET) for the actual reading process.
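A raw round trip with fwrite() and fread() could look like the sketch below. write_bytes() and read_bytes() are illustrative names; instead of measuring the file first, the reader simply asks for up to the buffer's capacity and relies on fread()'s return value for the actual count.

```c
#include <stdio.h>

typedef unsigned char byte;

/* Write len raw bytes to path; returns 0 on success, -1 on error. */
int write_bytes(const char *path, const byte *buffer, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(buffer, 1, len, f);
    if (fclose(f) != 0 || written != len)
        return -1;
    return 0;
}

/* Read at most max_len raw bytes from path into buffer.
 * Returns the number of bytes read (0 if the file cannot be opened). */
size_t read_bytes(const char *path, byte *buffer, size_t max_len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t got = fread(buffer, 1, max_len, f);
    fclose(f);
    return got;
}
```

The caller can then iterate over buffer[0] through buffer[got - 1] byte by byte, with no text parsing involved.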

copying contents of a text file in c

I want to read a text file and transfer its contents to another text file in C. Here is my code:
char buffer[100];
FILE* rfile=fopen ("myfile.txt","r+");
if(rfile==NULL)
{
printf("couldn't open File...\n");
}
fseek(rfile, 0, SEEK_END);
size_t file_size = ftell(rfile);
printf("%d\n",file_size);
fseek(rfile,0,SEEK_SET);
fread(buffer,file_size,1,rfile);
FILE* pFile = fopen ( "newfile.txt" , "w+" );
fwrite (buffer , 1 ,sizeof(buffer) , pFile );
fclose(rfile);
fclose (pFile);
return 0;
}
The problem I am facing is the appearance of unnecessary data in the receiving file.
I tried the fwrite function with both "sizeof(buffer)" and "file_size". In the first case it writes a greater number of useless characters, while in the second case there are only 3 useless characters. I would really appreciate it if someone pointed out my mistake and told me how to get rid of these useless characters.
You are writing the entire content of buffer (100 chars) to the receiving file. You need to write exactly the amount of data that was read.
fwrite(buffer, 1, file_size, pFile)
Adding more checks for your code:
#include <stdio.h>
#include <stdlib.h>
#define BUFFER_SIZE 100
int main(void) {
    char buffer[BUFFER_SIZE];
    size_t file_size;
    size_t ret;
    FILE* rfile = fopen("input.txt","r");
    if(rfile==NULL)
    {
        printf("couldn't open File \n");
        return 0;
    }
    fseek(rfile, 0, SEEK_END);
    file_size = ftell(rfile);
    fseek(rfile,0,SEEK_SET);
    printf("File size: %zu\n",file_size); /* %zu is the conversion for size_t */
    if(!file_size) {
        printf("Warning! Empty input file!\n");
    } else if( file_size > BUFFER_SIZE ){
        printf("Warning! File size greater than %d. File will be truncated!\n", BUFFER_SIZE);
        file_size = BUFFER_SIZE;
    }
    ret = fread(buffer, sizeof(char), file_size, rfile);
    if(file_size != ret) {
        printf("I/O error\n");
    } else {
        FILE* pFile = fopen ( "newfile.txt" , "w+" );
        if(!pFile) {
            printf("Can not create the destination file\n");
        } else {
            ret = fwrite (buffer , 1 ,file_size , pFile );
            if(ret != file_size) {
                printf("Writing error!");
            }
            fclose (pFile);
        }
    }
    fclose(rfile);
    return 0;
}
You need to check the return values from all calls to fseek(), fread() and fwrite(), even fclose().
In your example, you have fread() read 1 block which is file_size bytes long. It's often a better idea to reverse the parameters, like this: ret = fread(buffer,1,file_size,rfile). The ret value will then show how many bytes it could read, instead of just saying whether it could read a full block.
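The difference between the two argument orders shows up when the file is shorter than the requested amount. The small function below is purely illustrative (compare_fread_orders() is not from any of the answers); it reads the same stream both ways so the two return values can be compared.

```c
#include <stdio.h>

/* Illustrative only: read from `fp` twice to compare the two argument orders.
 * Stores in *as_block the result of asking for ONE block of `want` bytes and
 * in *as_bytes the result of asking for `want` items of ONE byte each.
 * `buffer` must have room for `want` bytes and `want` must be non-zero. */
void compare_fread_orders(FILE *fp, char *buffer, size_t want,
                          size_t *as_block, size_t *as_bytes)
{
    rewind(fp);
    *as_block = fread(buffer, want, 1, fp);  /* counts whole blocks: 0 or 1 */
    rewind(fp);
    *as_bytes = fread(buffer, 1, want, fp);  /* counts bytes actually read */
}
```

For a 10-byte file and want set to 100, the block form returns 0 even though data was read, while the byte form returns 10, which is the value to pass on to fwrite().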
Here is an implementation of an (almost) general purpose file copy function:
void fcopy(FILE *f_src, FILE *f_dst)
{
    char buffer[BUFSIZ];
    size_t n;
    while ((n = fread(buffer, sizeof(char), sizeof(buffer), f_src)) > 0)
    {
        if (fwrite(buffer, sizeof(char), n, f_dst) != n)
            err_syserr("write failed\n");
    }
}
Given an open file stream f_src to read and another open file stream f_dst to write, it copies (the remainder of) the file associated with f_src to the file associated with f_dst. It does so moderately economically, using the buffer size BUFSIZ from <stdio.h>. Often, you will find that bigger buffers (such as 4 KiB or 4096 bytes, even 64 KiB or 65536 bytes) will give better performance. Going larger than 64 KiB seldom yields much benefit, but YMMV.
The code above calls an error reporting function (err_syserr()) which is assumed not to return. That's why I designated it 'almost general purpose'. The function could be upgraded to return an int value, 0 on success and EOF on a failure:
enum { BUFFER_SIZE = 4096 };
int fcopy(FILE *f_src, FILE *f_dst)
{
    char buffer[BUFFER_SIZE];
    size_t n;
    while ((n = fread(buffer, sizeof(char), sizeof(buffer), f_src)) > 0)
    {
        if (fwrite(buffer, sizeof(char), n, f_dst) != n)
            return EOF; // Optionally report write failure
    }
    if (ferror(f_src) || ferror(f_dst))
        return EOF; // Optionally report I/O error detected
    return 0;
}
Note that this design doesn't open or close files; it works with open file streams. You can write a wrapper that opens the files and calls the copy function (or includes the copy code into the function). Also note that to change the buffer size, I simply changed the buffer definition; I didn't change the main copy code. Also note that any 'function call overhead' in calling this little function is completely swamped by the overhead of the I/O operations themselves.
Note ftell returns a long, not a size_t. Shouldn't matter here, though. ftell itself is not necessarily a byte-offset, though. The standard requires it only to be an acceptable argument to fseek. You might get a better result from fgetpos, but it has the same portability issue from the lack of specification by the standard. (Confession: I didn't check the standard itself; got all this from the manpages.)
A more robust way to get a file size on POSIX systems is stat() (or fstat() when you already have an open file descriptor).
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
struct stat stat_buf;
if (stat(filename, &stat_buf) == -1)
    perror(filename), exit(EXIT_FAILURE);
file_size = stat_buf.st_size;
I think the parameters you passed to fwrite are not in the right sequence.
To me it should be like this:
fwrite(buffer,SIZE,1,pFile)
as the syntax of fwrite is
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.
So change the sequence and try again.

Resources