I am writing some software that corrupts files. It stores the file in a buffer, corrupts that buffer by XORing it with random masks, then writes the modified buffer to stdout so the (Linux) user can pipe it somewhere.
I have opened stdout in binary mode:
FILE *const out = fdopen(dup(fileno(stdout)), "wb");
But how do I actually write the whole buffer to out in one go?
It seems I have two options:
Iterate over all bytes and call fputc
Hope that there are no null bytes in the data and call fputs
I'm looking for a fputb which takes:
A pointer to the data to be written
A stream (FILE *) to write it to
The number of bytes to write
(with better performance than a fputc-loop)
You're looking for fwrite(const void *ptr, size_t size, size_t count, FILE *stream):
Writes an array of count elements, each one with a size of size bytes, from the block of memory pointed by ptr to the current position in the stream.
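For your buffer, a minimal sketch (assuming buf points to the corrupted data and len is its length in bytes):
size_t written = fwrite(buf, 1, len, out);
if (written != len)
    perror("fwrite");   /* short write: report the error */
fflush(out);            /* make sure everything reaches the pipe */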
Related
What I'm trying to do is open a file and read it until the end, in blocks that are 256 bytes long on each call. My dilemma is whether to use fgets() or fread() to do it.
I was using fgets() initially, because it returns a string of the bytes that were read, which is great because I can store that data and work with it. However, in the particular file I'm reading, the 256 bytes often span more than two lines, which is a problem because fgets() stops reading when it hits a newline character or the end of the file.
I then thought of using fread(), but I don't know how to save the data I'm reading with it, because fread() returns only the number of elements successfully read (according to its documentation).
I've searched and thought about solutions for a while now and can't find anything that works with my particular scenario. I would like some guidance on this issue: how would you go about it in my position?
You can use fread() to read each 256-byte block and keep a lineCount variable to track the number of newline characters you have encountered so far in the input. Since you have to process the blocks anyway, this adds very little overhead.
To read a block of 256 chars, which is what I think you are doing, you just need a buffer that can hold 256 of them; in other words, a char array of size 256.
#define BLOCK_SIZE 256
char block[BLOCK_SIZE];
Then if you check the documentation for fread() it shows the following signature:
Following is the declaration for fread() function.
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
Parameters
ptr -- This is the pointer to a block of memory with a minimum size of size*nmemb bytes.
size -- This is the size in bytes of each element to be read.
nmemb -- This is the number of elements, each one with a size of size bytes.
stream -- This is the pointer to a FILE object that specifies an input stream.
So this means it takes a pointer to the buffer where it will write the information it reads, the size of each element it's supposed to read, the maximum number of elements you want it to read, and the file pointer. In your case it would be:
size_t read = fread(block, sizeof(char), BLOCK_SIZE, file);
This will copy the information from the file into the block array, which you can later process while keeping track of the lines. The characters read by fread() are in the block array, so the first char of the last read block is block[0], the second is block[1], and so on. The value returned in read indicates how many elements (in your case, chars) were placed in block by the call; it will equal BLOCK_SIZE on every call unless you reach the end of the file or there is an error.
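To tie it together, a minimal sketch of the block-reading loop described above (the file name and the processing step are placeholders):
#include <stdio.h>

#define BLOCK_SIZE 256

int main(void)
{
    char block[BLOCK_SIZE];
    FILE *file = fopen("input.txt", "rb");
    if (!file)
        return 1;

    size_t nread;
    long lineCount = 0;
    while ((nread = fread(block, sizeof(char), BLOCK_SIZE, file)) > 0) {
        for (size_t i = 0; i < nread; i++)
            if (block[i] == '\n')
                lineCount++;        /* track how many lines the blocks span */
        /* process the nread bytes of this block here */
    }

    fclose(file);
    return 0;
}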
I suggest you read some documentation for a full example, play a little with the code and do some reading on pointers in C to gain a better understanding of how everything works in general. If you still have questions after that, we can take it from there or you can create a new SO question.
I want to read a video file, save it as binary, and write it out as a video file again.
I tested with a 180 MB video. I used the fread function, and it caused a segmentation fault because the array is too small for the video.
These are my questions:
I use a 160*1024-byte char array. What is the maximum size of a char array? How can I solve this problem?
The program needs to work like this:
read 128 bytes of video -> encrypt -> write 128 bytes
read the next 128 bytes -> encrypt -> write them next.
I can't upload my code because of my company's security rules. Any tip would be appreciated.
First use fseek() with SEEK_END, then use ftell() to determine the file size; after that, allocate the needed memory with malloc() and read the data into that memory.
But if I understand you correctly, you don't need to allocate that much memory at all, only 128 bytes:
char buf[128];
size_t ret;
while ((ret = fread(buf, 1, sizeof buf, fp_in)) > 0)
{
    encrypt(buf);                    /* assumed to encrypt the buffer in place */
    fwrite(buf, 1, ret, fp_out);     /* write only the bytes actually read */
}
Isn't it possible to read the bytes left in a file when they are fewer than the buffer size?
char * buffer = (char *)malloc(size);
FILE * fp = fopen(filename, "rb");
while(fread(buffer, size, 1, fp)){
// do something
}
Let's assume size is 4 and the file size is 17 bytes. I thought fread could handle the last operation even if the bytes left in the file are fewer than the buffer size, but apparently it just terminates the while loop without reading the last byte.
I tried using the lower-level system call read(), but I couldn't read any bytes for some reason.
What should I do if fread cannot handle a final chunk that is smaller than the buffer size?
Yep, turn your parameters around.
Instead of requesting one block of size bytes, you should request size blocks of 1 byte each. Then the function will return how many blocks (bytes) it was able to read:
size_t nread;
while( 0 < (nread = fread(buffer, 1, size, fp)) ) ...
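A slightly fuller sketch reusing your variables (with size 4 and a 17-byte file, the loop body runs five times, receiving 4, 4, 4, 4 and finally 1 byte):
size_t nread;
while ((nread = fread(buffer, 1, size, fp)) > 0) {
    /* do something with the nread bytes now in buffer */
}
if (ferror(fp))
    perror("fread");    /* distinguish a read error from plain end-of-file */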
Try running "man fread".
It clearly mentions the following, which answers your question:
SYNOPSIS
size_t fread(void *ptr, size_t size, size_t nitems, FILE *stream);
DESCRIPTION
fread() copies, into an array pointed to by ptr, up to nitems items of
data from the named input stream, where an item of data is a sequence
of bytes (not necessarily terminated by a null byte) of length size.
fread() stops appending bytes if an end-of-file or error condition is
encountered while reading stream, or if nitems items have been read.
fread() leaves the file pointer in stream, if defined, pointing to the
byte following the last byte read if there is one.
The argument size is typically sizeof(*ptr) where the pseudo-function
sizeof specifies the length of an item pointed to by ptr.
RETURN VALUE
fread() returns the number of items read. If size or nitems is 0, no
characters are read or written and 0 is returned.
The value returned will be less than nitems only if a read error or
end-of-file is encountered. The ferror() or feof() functions must be
used to distinguish between an error condition and an end-of-file
condition.
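For example, after a short read you can tell the two cases apart like this (a sketch; buf, size and fp are assumed to exist already):
size_t n = fread(buf, 1, size, fp);
if (n < size) {
    if (ferror(fp)) {
        perror("fread");    /* a genuine read error */
    } else if (feof(fp)) {
        /* end of file: n still holds the bytes of the final partial block */
    }
}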
I'm writing an application that deals with very large user-generated input files. The program will copy about 95 percent of the file, effectively duplicating it and switching a few words and values in the copy, and then appending the copy (in chunks) to the original file, such that each block (consisting of between 10 and 50 lines) in the original is followed by the copied and modified block, and then the next original block, and so on. The user-generated input conforms to a certain format, and it is highly unlikely that any line in the original file is longer than 100 characters.
Which would be the better approach?
To use one file pointer with variables that track how much has been read and where to write, seeking the pointer back and forth to read and write; or
To use multiple file pointers, one for reading and one for writing.
I am mostly concerned with the efficiency of the program, as the input files will reach up to 25,000 lines, each about 50 characters long.
If you have memory constraints, or you want a generic approach, read bytes into a buffer from one file pointer, make changes, and write out the buffer to a second file pointer when the buffer is full. If you reach EOF on the first pointer, make your changes and just flush whatever is in the buffer to the output pointer. If you intend to replace the original file, copy the output file to the input file and remove the output file. This "atomic" approach lets you check that the copy operation took place correctly before deleting anything.
For example, to deal with generically copying over any number of bytes, say, 1 MiB at a time:
#define COPY_BUFFER_MAXSIZE 1048576
/* ... */
unsigned char *buffer = malloc(COPY_BUFFER_MAXSIZE);   /* needs <stdlib.h>; uint64_t below needs <stdint.h> */
if (!buffer)
    exit(-1);

FILE *inFp = fopen(inFilename, "rb");   /* "rb" keeps the byte count exact on non-POSIX platforms too */
if (!inFp)
    exit(-1);
fseek(inFp, 0, SEEK_END);
uint64_t fileSize = (uint64_t) ftell(inFp);
rewind(inFp);

FILE *outFp = stdout; /* change this if you don't want to write to standard output */

uint64_t outFileSizeCounter = fileSize;

/* we fread() bytes from inFp in COPY_BUFFER_MAXSIZE increments, until there is nothing left to fread() */

do {
    if (outFileSizeCounter > COPY_BUFFER_MAXSIZE) {
        fread(buffer, 1, (size_t) COPY_BUFFER_MAXSIZE, inFp);
        /* -- make changes to buffer contents at this stage
           -- if you resize the buffer, then copy the buffer and
              change the following statement to fwrite() the number of
              bytes in the copy of the buffer */
        fwrite(buffer, 1, (size_t) COPY_BUFFER_MAXSIZE, outFp);
        outFileSizeCounter -= COPY_BUFFER_MAXSIZE;
    }
    else {
        fread(buffer, 1, (size_t) outFileSizeCounter, inFp);
        /* -- make changes to buffer contents at this stage
           -- again, make a copy of buffer if it needs resizing,
              and adjust the fwrite() statement to change the number
              of bytes that need writing */
        fwrite(buffer, 1, (size_t) outFileSizeCounter, outFp);
        outFileSizeCounter = 0ULL;
    }
} while (outFileSizeCounter > 0);

free(buffer);
An efficient way to deal with a resized buffer is to keep a second pointer, say, unsigned char *copyBuffer, which is realloc()-ed to twice the size, if necessary, to deal with accumulated edits. That way, you keep expensive realloc() calls to a minimum.
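For instance, a sketch of that doubling strategy (copyBufferSize and neededSize are illustrative names, not part of the code above):
size_t copyBufferSize = COPY_BUFFER_MAXSIZE;
unsigned char *copyBuffer = malloc(copyBufferSize);
if (!copyBuffer)
    exit(-1);

size_t neededSize;   /* set this to the number of bytes required after edits */
if (neededSize > copyBufferSize) {
    while (copyBufferSize < neededSize)
        copyBufferSize *= 2;            /* grow geometrically to keep realloc() calls rare */
    unsigned char *tmp = realloc(copyBuffer, copyBufferSize);
    if (!tmp)
        exit(-1);
    copyBuffer = tmp;
}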
Not sure why this got downvoted, but it's a pretty solid approach for doing things with a generic amount of data. Hope this helps someone who comes across this question, in any case.
25,000 lines * 100 characters = 2.5 MB; that's not really a huge file. The fastest approach will probably be to read the whole file into memory, write your results to a new file, and replace the original with that.
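A rough sketch of that approach (error checking omitted; "input.txt" is a placeholder):
FILE *in = fopen("input.txt", "rb");
fseek(in, 0, SEEK_END);
long size = ftell(in);
rewind(in);

char *data = malloc(size);
fread(data, 1, size, in);
fclose(in);

/* make your changes to data here; if the edited result is larger,
   build it in a second buffer and fwrite() that buffer instead */

FILE *out = fopen("input.txt.tmp", "wb");
fwrite(data, 1, size, out);
fclose(out);
free(data);

rename("input.txt.tmp", "input.txt");   /* replace the original with the result */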
The user should input some file names on the command line, and the program will read each file name from the argv[] array. I have to perform error checking etc.
I want to read each filename. For example, if argv[2] is 'myfile.txt', the program should read the content of 'myfile.txt', store it in char buffer[BUFSIZ], and then write the content of buffer into another file.
However, before the content is written, the program should also write the name of the file and its size, so that the file can be easily extracted later. A bit like tar.
The file I write the buffer contents to, depending on the number of files supplied by the user, should look something like:
myfile.txt256Thisisfilecontentmyfile2.txt156Thisisfile2content..............
My questions are:
1) How do I write the value of argv[2] into the file using write()? I'm having problems writing the char array: what should I put as the size (sizeof(?)) inside write(), since I don't know the length of the file name entered by the user? See the code below.
2) Do I use '&' to write an integer value into the file after the name, for example writing 4 bytes after the file name for the size of the file?
Here is the code I have written,
char buffer[BUFSIZ];
int numfiles=5; //say this is no of files user entered at command
open(file.....
lseek(fdout, 0, SEEK_SET); //start at the beginning of the file and move along for each file in some for loop
for(i=0-; ......
//for each file write filename,filesize,data....filename,filesize,data......
int bytesread=read(argv[i],buffer,sizeof(buffer));
write(outputfile, argv[i], sizeof(argv)); //write filename size of enough to store value of filename
write(outputfile, &bytesread, sizeof(bytesread));
write(outputfile, buffer, sizeof(buffer));
But the code is not working as I expected.
Any suggestions?
Since argv consists of null-terminated strings, the length you can write is strlen(argv[2])+1 to write both the argument and its null terminator:
size_t sz = strlen (argv[2]);
write (fd, argv[2], sz + 1);
Alternatively, if you want the length followed by the characters, you can write the size_t itself returned from strlen followed by that many characters.
size_t sz = strlen (argv[2]);
write (fd, &sz, sizeof (size_t));
write (fd, argv[2], sz);
You probably also need to write the length of the file as well so that you can locate the next file when reading it back.
1. You can write the string the following way:
size_t size = strlen(string);
write(fd, string, size);
However, most of the time it's not this simple: you will need the size of the string so you'll know how much you need to read. So you should write the string size too.
2. An integer can be written the following way:
write(fd, &integer, sizeof(integer));
This is simple, but if you plan to use the file on different architectures, you'll need to deal with endianness too.
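One common way to handle that is to fix both the width and the byte order before writing (a sketch assuming a POSIX system and a length that fits in 32 bits):
#include <stdint.h>
#include <arpa/inet.h>    /* htonl() */

uint32_t len = htonl((uint32_t) strlen(string));   /* always stored big-endian */
write(fd, &len, sizeof(len));
write(fd, string, strlen(string));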
It sounds like your best bet is to use a binary format. In your example, is the file called myfile.txt with a content length of 256, or myfile.txt2 with a content length of 56, or myfile.txt25 with a content length of 6? There's no way to distinguish between the end of the filename and the start of the content length field. Similarly there is no way to distinguish between the end of the content length and the start of the content. If you must use a text format, fixed width fields will help with this. I.e. 32 characters of filename followed by 6 digits of content length. But binary format is more efficient.
You get the filename length using strlen(), don't use sizeof(argv) as you will get completely the wrong result. sizeof(argv[i]) will also give the wrong result.
So write 4 bytes of filename length followed by the filename then 4 bytes of content length followed by the content.
If you want the format to be portable you need to be aware of byte order issues.
Lastly, if the file won't all fit in your buffer then you are stuffed. You need to get the size of the file you are reading to write it to your output file first, and then make sure you read that number of bytes from the first file into the second file. There are various techniques to do this.
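For example, one way to get the size up front and write the record header described above (a sketch using stat(); assumes a POSIX system and sizes that fit in 32 bits):
#include <stdint.h>
#include <sys/stat.h>

struct stat st;
if (stat(argv[i], &st) == 0) {
    uint32_t nameLen    = (uint32_t) strlen(argv[i]);
    uint32_t contentLen = (uint32_t) st.st_size;
    write(outputfile, &nameLen, sizeof(nameLen));        /* 4 bytes: filename length */
    write(outputfile, argv[i], nameLen);                 /* the filename itself */
    write(outputfile, &contentLen, sizeof(contentLen));  /* 4 bytes: content length */
    /* then copy contentLen bytes from the input file in buffer-sized chunks */
}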
Thanks for the replies, guys.
I decided not to use size_t; instead I just used plain int and char types so I know the exact number of bytes to read() out. I.e. I start at the beginning of the file and read 4 bytes (an int) to get the length of the filename, which I then use as the size in the next read().
So when I write the user's input file to the output file (copying the file exactly, with the same name), I write it as one long string; the spaces are only there to make it readable here:
filenamesize filename filecontentsize filecontent
ie 10 myfile.txt 5 hello
So when it comes to reading that data back, I start at the beginning of the file using lseek(), and I know the first 4 bytes are an int holding the length of the filename, so I read that into int namelen using the read() function.
My problem is that I want to use the value read for the filename size (the first 4 bytes) to declare an array that stores the filename with the right length. How do I pass this array to read() so that read() stores the value inside that char array? See below please:
int namelen; //value read from the first 4 bytes of the file: the length of the filename, to use in the next read()
char filename[namelen];
read(fd, filename[namelen], namelen);//filename should have 'myfile.txt' if user entered that filename
So my question is: once I have read those first 4 bytes from the file, giving me the length of the filename stored in namelen, how do I then read namelen bytes to get the filename of the original file, so I can create the copied file inside the directory?
Thanks
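A minimal sketch of the read-back step described above (note that the array itself is passed to read(), not filename[namelen], and one extra byte is reserved for a terminating '\0'):
int namelen;
read(fd, &namelen, sizeof(namelen));   /* first 4 bytes: length of the filename */

char filename[namelen + 1];            /* VLA sized from the value just read */
read(fd, filename, namelen);           /* e.g. fills in "myfile.txt" */
filename[namelen] = '\0';              /* terminate so it can be used with open()/fopen() */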