Introduction
I'm writing my own cp program. With the code I currently have I'm able to copy and paste files.
Code
char *buf;
int fd;
int ret;
struct stat sb;
FILE *stream;
/*opening and getting size of file to copy*/
fd = open(argv[1],O_RDONLY);
if(fd == -1)
{
perror("open");
return 1;
}
/*obtaining size of file*/
ret = fstat(fd,&sb);
if(ret)
{
perror("stat");
return 1;
}
/*opening a stream for reading/writing file*/
stream fdopen(fd,"rb");
if(!stream)
{
perror("fdopen");
return 1;
}
/*allocating space for reading binary file*/
buf = malloc(sb.st_size);
/*reading data*/
if(!fread(buf,sb.st_size,1,stream))
{
perror("fread");
return 1;
}
/*writing file to a duplicate*/
fclose(stream);
stream = fopen("duplicate","wb");
if(!fwrite(buf,sb.st_size,1,stream))
{
perror("fwrite");
return 1;
}
fclose(stream);
close(fd);
free(buf);
return 0;
The problem
I'm unable to copy and paste .zip files and .tar.gz files. If i alter the code and give an extension such as 'duplicate.zip' (assuming im copying a zip file) such as .zip and then try and copy a .zip file
everything is copied, however the new duplicated file does not act like a zip file and when i use cat it outputs nothing and this error when i attempt to unzip it anyway:
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
So how do i go about copying zip and pasting zip files and also .tar.gz files. Any pointers will be helpful, thanks in advance.
You are using malloc() incorrectly. You want to allocate sb.st_size bytes.
malloc(sb.st_size * sizeof buf)
should be
malloc(sb.st_size)
The use of fread() is dubious and you are throwing away the result of fread(). Instead of
if(!fread(buf,sb.st_size,1,stream))
you should have
size_t num_bytes_read = fread (buf, 1, sb.st_size, stream);
if (num_bytes_read < sb.st_size)
You are using strlen() incorrectly. The content of buf is not guaranteed to be a string; and anyway you already know how many bytes you have in buf: sb.st_size. (Because if fread() returned a smaller number of bytes read you got angry and terminated the process.) So instead of
fwrite(buf,strlen(buf),1,stream)
you should have
fwrite (buf, 1, sb.st_size, stream)
In addition to AlexP's notes...
/*obtaining size of file*/
ret = fstat(fd,&sb);
if(ret)
{
perror("stat");
return 1;
}
// ...some code...
/*allocating space for reading binary file*/
buf = malloc(sb.st_size);
/*reading data*/
if(!fread(buf,sb.st_size,1,stream))
{
perror("fread");
return 1;
}
You have a race condition here. If the file size changes between your fstat call and malloc or fread you will read too much or too little of the file.
Fixing this leads us to the next issue, you're slurping the entire file into memory. While this might work for small files, it is extremely inefficient with your memory on large ones. For very large files it might be too large for a single malloc, and you're not checking if your malloc succeeds.
Instead, read and write the file a piece at a time. And read until there isn't any more to read.
uint8_t *buffer[4096]; // 4K buffer
size_t num_read;
while( (num_read = fread(buffer, sizeof(uint8_t), sizeof(buffer), in)) != 0 ) {
if( fwrite( buffer, sizeof(uint8_t), num_read, out ) == 0 ) {
perror("fwrite");
}
}
This avoids the race condition by not having to call fstat in the first place. And it avoids allocating a potentially enormous hunk of memory. Instead it can all be done on the stack.
I've used uint8_t to get a hunk of bytes. It's a standard fixed width integer type from stdint.h. You can also use unsigned char to read bytes, and that's probably what uint8_t really is, but uint8_t makes it explicit.
Related
I need to copy the contents of a text file to a dynamically-allocated character array.
My problem is getting the size of the contents of the file; Google reveals that I need to use fseek and ftell, but for that the file apparently needs to be opened in binary mode, and that gives only garbage.
EDIT: I tried opening in text mode, but I get weird numbers. Here's the code (I've omitted simple error checking for clarity):
long f_size;
char* code;
size_t code_s, result;
FILE* fp = fopen(argv[0], "r");
fseek(fp, 0, SEEK_END);
f_size = ftell(fp); /* This returns 29696, but file is 85 bytes */
fseek(fp, 0, SEEK_SET);
code_s = sizeof(char) * f_size;
code = malloc(code_s);
result = fread(code, 1, f_size, fp); /* This returns 1045, it should be the same as f_size */
The root of the problem is here:
FILE* fp = fopen(argv[0], "r");
argv[0] is your executable program, NOT the parameter. It certainly won't be a text file. Try argv[1], and see what happens then.
You cannot determine the size of a file in characters without reading the data, unless you're using a fixed-width encoding.
For example, a file in UTF-8 which is 8 bytes long could be anything from 2 to 8 characters in length.
That's not a limitation of the file APIs, it's a natural limitation of there not being a direct mapping from "size of binary data" to "number of characters."
If you have a fixed-width encoding then you can just divide the size of the file in bytes by the number of bytes per character. ASCII is the most obvious example of this, but if your file is encoded in UTF-16 and you happen to be on a system which treats UTF-16 code points as the "native" internal character type (which includes Java, .NET and Windows) then you can predict the number of "characters" to allocate as if UTF-16 were fixed width. (UTF-16 is variable width due to Unicode characters above U+FFFF being encoded in multiple code points, but a lot of the time developers ignore this.)
I'm pretty sure argv[0] won't be an text file.
Give this a try (haven't compiled this, but I've done this a bazillion times, so I'm pretty sure it's at least close):
char* readFile(char* filename)
{
FILE* file = fopen(filename,"r");
if(file == NULL)
{
return NULL;
}
fseek(file, 0, SEEK_END);
long int size = ftell(file);
rewind(file);
char* content = calloc(size + 1, 1);
fread(content,1,size,file);
return content;
}
If you're developing for Linux (or other Unix-like operating systems), you can retrieve the file-size with stat before opening the file:
#include <stdio.h>
#include <sys/stat.h>
int main() {
struct stat file_stat;
if(stat("main.c", &file_stat) != 0) {
perror("could not stat");
return (1);
}
printf("%d\n", (int) file_stat.st_size);
return (0);
}
EDIT: As I see the code, I have to get into the line with the other posters:
The array that takes the arguments from the program-call is constructed this way:
[0] name of the program itself
[1] first argument given
[2] second argument given
[n] n-th argument given
You should also check argc before trying to use a field other than '0' of the argv-array:
if (argc < 2) {
printf ("Usage: %s arg1", argv[0]);
return (1);
}
argv[0] is the path to the executable and thus argv[1] will be the first user submitted input. Try to alter and add some simple error-checking, such as checking if fp == 0 and we might be ble to help you further.
You can open the file, put the cursor at the end of the file, store the offset, and go back to the top of the file, and make the difference.
You can use fseek for text files as well.
fseek to end of file
ftell the offset
fseek back to the begining
and you have size of the file
Kind of hard with no sample code, but fstat (or stat) will tell you how big the file is. You allocate the memory required, and slurp the file in.
Another approach is to read the file a piece at a time and extend your dynamic buffer as needed:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PAGESIZE 128
int main(int argc, char **argv)
{
char *buf = NULL, *tmp = NULL;
size_t bufSiz = 0;
char inputBuf[PAGESIZE];
FILE *in;
if (argc < 2)
{
printf("Usage: %s filename\n", argv[0]);
return 0;
}
in = fopen(argv[1], "r");
if (in)
{
/**
* Read a page at a time until reaching the end of the file
*/
while (fgets(inputBuf, sizeof inputBuf, in) != NULL)
{
/**
* Extend the dynamic buffer by the length of the string
* in the input buffer
*/
tmp = realloc(buf, bufSiz + strlen(inputBuf) + 1);
if (tmp)
{
/**
* Add to the contents of the dynamic buffer
*/
buf = tmp;
buf[bufSiz] = 0;
strcat(buf, inputBuf);
bufSiz += strlen(inputBuf) + 1;
}
else
{
printf("Unable to extend dynamic buffer: releasing allocated memory\n");
free(buf);
buf = NULL;
break;
}
}
if (feof(in))
printf("Reached the end of input file %s\n", argv[1]);
else if (ferror(in))
printf("Error while reading input file %s\n", argv[1]);
if (buf)
{
printf("File contents:\n%s\n", buf);
printf("Read %lu characters from %s\n",
(unsigned long) strlen(buf), argv[1]);
}
free(buf);
fclose(in);
}
else
{
printf("Unable to open input file %s\n", argv[1]);
}
return 0;
}
There are drawbacks with this approach; for one thing, if there isn't enough memory to hold the file's contents, you won't know it immediately. Also, realloc() is relatively expensive to call, so you don't want to make your page sizes too small.
However, this avoids having to use fstat() or fseek()/ftell() to figure out how big the file is beforehand.
I consider reading file of unknown size that I know doesn't change size in the meantime. So I intend to use fstat() function and struct stat. Now I am considering what the st_size field really means and how should I use it.
If I get the file size's in this way, then allocate a buffer of that size and read exactly that size of bytes there seems to be one byte left over. I come to this conclusion when I used feof() function to check if there really nothing left in FILE *. It returns false! So I need to read (st_size + 1) and only than all bytes have been read and feof() works correctly. Should I always add this +1 value to this size to read all bytes from binary file or there is some hidden reason that this isn't reading to EOF?
struct stat finfo;
fstat(fileno(fp), &finfo);
data_length = finfo.st_size;
I am asking about this because when I add +1 then the number of bytes read by fread() is really -1 byte less, and as the last byte is inserted 00 byte. I could also before checking with feof() do something like this
fread(NULL, 1, 1, fp);
It is the real code, it is a little odd situation:
// reading png bytes from file
FILE *fp = fopen("./test/resources/RGBA_8bits.png", "rb");
// get file size from file info
struct stat finfo;
fstat(fileno(fp), &finfo);
pngDataLength = finfo.st_size;
pngData = malloc(sizeof(unsigned char)*pngDataLength);
if( fread(pngData, 1, pngDataLength, fp) != pngDataLength) {
fprintf(stderr, "%s: Incorrect number of bytes read from file!\n", __func__);
fclose(fp);
free(pngData);
return;
}
fread(NULL, 1, 1, fp);
if(!feof(fp)) {
fprintf(stderr, "%s: Not the whole binary file has been read.\n", __func__);
fclose(fp);
free(pngData);
return;
}
fclose(fp);
This behaviour is normal.
feof will return true only once you have tried to read beyond the file's end which you don't do as you read exactly the size of the file.
I am working on an assignment in socket programming in which I have to send a file between sparc and linux machine. Before sending the file in char stream I have to get the file size and tell the client. Here are some of the ways I tried to get the size but I am not sure which one is the proper one.
For testing purpose, I created a file with content " test" (space + (string)test)
Method 1 - Using fseeko() and ftello()
This is a method I found on https://www.securecoding.cert.org/confluence/display/c/FIO19-C.+Do+not+use+fseek()+and+ftell()+to+compute+the+size+of+a+regular+file
While the fssek() has a problem of "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream", fseeko() is said to have tackled this problem but it only works on POSIX system (which is fine because the environment I am using is sparc and linux)
fd = open(file_path, O_RDONLY);
fp = fopen(file_path, "rb");
/* Ensure that the file is a regular file */
if ((fstat(fd, &st) != 0) || (!S_ISREG(st.st_mode))) {
/* Handle error */
}
if (fseeko(fp, 0 , SEEK_END) != 0) {
/* Handle error */
}
file_size = ftello(fp);
fseeko(fp, 0, SEEK_SET);
printf("file size %zu\n", file_size);
This method works fine and get the size correctly. However, it is limited to regular files only. I tried to google the term "regular file" but I still not quite understand it thoroughly. And I do not know if this function is reliable for my project.
Method 2 - Using strlen()
Since the max. size of a file in my project is 4MB, so I can just calloc a 4MB buffer. After that, the file is read into the buffer, and I tried to use the strlen to get the file size (or more correctly the length of content). Since strlen() is portable, can I use this method instead? The code snippet is like this
fp = fopen(file_path, "rb");
fread(file_buffer, 1024*1024*4, 1, fp);
printf("strlen %zu\n", strlen(file_buffer));
This method works too and returns
strlen 8
However, I couldn't see any similar approach on the Internet using this method. So I am thinking maybe I have missed something or there are some limitations of this approach which I haven't realized.
Regular file means that it is nothing special like device, socket, pipe etc. but "normal" file.
It seems that by your task description before sending you must retrieve size of normal file.
So your way is right:
FILE* fp = fopen(...);
if(fp) {
fseek(fp, 0 , SEEK_END);
long fileSize = ftell(fp);
fseek(fp, 0 , SEEK_SET);// needed for next read from beginning of file
...
fclose(fp);
}
but you can do it without opening file:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
struct stat buffer;
int status;
status = stat("path to file", &buffer);
if(status == 0) {
// size of file is in member buffer.st_size;
}
OP can do it the easy way as "max. size of a file in my project is 4MB".
Rather than using strlen(), use the return value from fread(). stlen() stops on the first null character, so may report too small a value. #Sami Kuhmonen Also we do not know the data read contains any null character, so it may not be a string. Append a null character (and allocate +1) if code needs to use data as a string. But in that case, I'd expect the file needed to be open in text mode.
Note that many OS's do not even use allocated memory until it is written.
Why is malloc not "using up" the memory on my computer?
fp = fopen(file_path, "rb");
if (fp) {
#define MAX_FILE_SIZE 4194304
char *buf = malloc(MAX_FILE_SIZE);
if (buf) {
size_t numread = fread(buf, sizeof *buf, MAX_FILE_SIZE, fp);
// shrink if desired
char *tmp = realloc(buf, numread);
if (tmp) {
buf = tmp;
// Use buf with numread char
}
free(buf);
}
fclose(fp);
}
Note: Reading the entire file into memory may not be the best idea to begin with.
I am currently working on a project in which I have to read from a binary file and send it through sockets and I am having a hard time trying to send the whole file.
Here is what I wrote so far:
FILE *f = fopen(line,"rt");
//size = lseek(f, 0, SEEK_END)+1;
fseek(f, 0L, SEEK_END);
int size = ftell(f);
unsigned char buffer[MSGSIZE];
FILE *file = fopen(line,"rb");
while(fgets(buffer,MSGSIZE,file)){
sprintf(r.payload,"%s",buffer);
r.len = strlen(r.payload)+1;
res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
return -1;
}
}
I think it has something to do with the size of the buffer that I read into,but I don't know what it's the correct formula for it.
One more thing,is the sprintf done correctly?
If you are reading binary files, a NUL character may appear anywhere in the file.
Thus, using string functions like sprintf and strlen is a bad idea.
If you really need to use a second buffer (buffer), you could use memcpy.
You could also directly read into r.payload (if r.payload is already allocated with sufficient size).
You are looking for fread for a binary file.
The return value of fread tells you how many bytes were read into your buffer.
You may also consider to call fseek again.
See here How can I get a file's size in C?
Maybe your code could look like this:
#include <stdint.h>
#include <stdio.h>
#define MSGSIZE 512
struct r_t {
uint8_t payload[MSGSIZE];
int len;
};
int send_message(struct r_t *t);
int main() {
struct r_t r;
FILE *f = fopen("test.bin","rb");
fseek(f, 0L, SEEK_END);
size_t size = ftell(f);
fseek(f, 0L, SEEK_SET);
do {
r.len = fread(r.payload, 1, sizeof(r.payload), f);
if (r.len > 0) {
int res = send_message(&r);
if (res < 0) {
perror("[RECEIVER] Send ACK error. Exiting.\n");
fclose(f);
return -1;
}
}
} while (r.len > 0);
fclose(f);
return 0;
}
No, the sprintf is not done correctly. It is prone to buffer overflow, a very serious security problem.
I would consider sending the file as e.g. 1024-byte chunks instead of as line-by-line, so I would replace the fgets call with an fread call.
Why are you opening the file twice? Apparently to get its size, but you could open it only once and jump back to the beginning of the file. And, you're not using the size you read for anything.
Is it a binary file or a text file? fgets() assumes you are reading a text file -- it stops on a line break -- but you say it's a binary file and open it with "rb" (actually, the first time you opened it with "rt", I assume that was a typo).
IMO you should never ever use sprintf. The number of characters written to the buffer depends on the parameters that are passed in, and in this case if there is no '\0' in buffer then you cannot predict how many bytes will be copied to r.payload, and there is a very good chance you will overflow that buffer.
I think sprintf() would be the first thing to fix. Use memcpy() and you can tell it exactly how many bytes to copy.
I am reaing from a file, and when i read, it takes it line by line, and prints it
what i want exactly is i want an array of char holding all the chars in the file and print it once,
this is the code i have
if(strcmp(str[0],"#")==0)
{
FILE *filecomand;
//char fname[40];
char line[100];
int lcount;
///* Read in the filename */
//printf("Enter the name of a ascii file: ");
//fgets(History.txt, sizeof(fname), stdin);
/* Open the file. If NULL is returned there was an error */
if((filecomand = fopen(str[1], "r")) == NULL)
{
printf("Error Opening File.\n");
//exit(1);
}
lcount=0;
int i=0;
while( fgets(line, sizeof(line), filecomand) != NULL ) {
/* Get each line from the infile */
//lcount++;
/* print the line number and data */
//printf("%s", line);
}
fclose(filecomand); /* Close the file */
You need to determine the size of the file. Once you have that, you can allocate an array large enough and read it in a single go.
There are two ways to determine the size of the file.
Using fstat:
struct stat stbuffer;
if (fstat(fileno(filecommand), &stbuffer) != -1)
{
// file size is in stbuffer.st_size;
}
With fseek and ftell:
if (fseek(fp, 0, SEEK_END) == 0)
{
long size = ftell(fp)
if (size != -1)
{
// succesfully got size
}
// Go back to start of file
fseek(fp, 0, SEEK_SET);
}
Another solution would be to map the entire file to the memory and then treat it as a char array.
Under windows MapViewOfFile, and under unix mmap.
Once you mapped the file (plenty of examples), you get a pointer to the file's beginning in the memory. Cast it to char[].
Since you can't assume how big the file is, you need to determine the size and then dynamically allocate a buffer.
I won't post the code, but here's the general scheme. Use fseek() to navigate to the end of file, ftell() to get size of the file, and fseek() again to move the start of the file. Allocate a char buffer with malloc() using the size you found. The use fread() to read the file into the buffer. When you are done with the buffer, free() it.
Use a different open. i.e.
fd = open(str[1], O_RDONLY|O_BINARY) /* O_BINARY for MS */
The read statement would be for a buffer of bytes.
count = read(fd,buf, bytecount)
This will do a binary open and read on the file.