How to use the read function, with an unknown size

I made a function that opens a file and reads it one character at a time.
int getBlocks(char *file){
char c;
int line = 0;
int pos = 0;
int blocks;
FILE *inptr = fopen(file, "rt");
// Count the amount of blocks/tetriminos
while((c=fgetc(inptr))!=EOF){
if(c == '.' || c == '#'){
pos++;
if(pos == 4){
pos = 0;
line++;
}
}
}
blocks = (line/4);
fclose(inptr);
return blocks;
}
Now I have to rewrite it, because the only functions I am allowed to use are exit, open, close, write, read, malloc and free.
I think I could basically use int inptr = open(file, O_RDONLY); instead of my fopen line, and simply close(inptr); instead of my fclose function.
My only problem now is fgetc. I am pretty sure that I can use read here, but according to its definition, ssize_t read(int fildes, void *buf, size_t nbytes);, I would need to specify a fixed number of bytes in advance, and the file size can differ in my case.
Do you know how I could rewrite fgetc here?

It's pretty similar, with slight changes:
char chr;
int fd = open(file, O_RDONLY);
if (fd == -1)
    return -1; // Or some error integer
while (read(fd, &chr, 1) == 1) {
    /* the rest of your code */
}
close(fd);
Note that one important change is that the type of chr is char and not int.
Instead of checking for EOF, you check that read() returned the expected value. Ideally you should store that value and check it after the loop: 0 means the end of the file was reached; anything else means an error occurred.
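Reading one byte per read() call costs a system call per character. The following is a minimal sketch of the same counting loop that reads in chunks and keeps the return value of read() for the final check. The name count_blocks_fd and the 4096-byte chunk size are assumptions made for illustration; they are not part of the question.

```c
#include <unistd.h>

/* Hypothetical helper: counts 4x4 blocks of '.'/'#' cells on an open
   descriptor. Returns the block count, or -1 if read() failed. */
int count_blocks_fd(int fd)
{
    char buf[4096];              /* chunk size is an arbitrary choice */
    ssize_t n;
    int pos = 0;
    int line = 0;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == '.' || buf[i] == '#') {
                if (++pos == 4) {
                    pos = 0;
                    line++;
                }
            }
        }
    }
    /* n == 0 means end of file, n == -1 means an error */
    return n == 0 ? line / 4 : -1;
}
```

The nbytes argument is only an upper bound on how much read() may deliver, so the file size never has to be known in advance.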

Related

Copy a file with buffers of different sizes for read and write

I have been doing some practice problems for job interviews and I came across a function that I can't work out how to tackle. The idea is to create a function that takes the names of two files, the allowed buffer size for reading from file1, and the allowed buffer size for writing to file2. If the buffer sizes are the same, I know how to approach the question, but I am having problems figuring out how to move data between the buffers when the sizes are different. Part of the constraints is that we always have to fill the write buffer before writing it to the file. If the size of file1 is not a multiple of the write buffer size, we pad the last buffer transfer with zeros.
// input: name of two files made for copy, and their limited buffer sizes
// output: number of bytes copied
int fileCopy(char* file1,char* file2, int bufferSize1, int bufferSize2){
int bytesTransfered=0;
int bytesMoved=0;
char* buffer1, *buffer2;
FILE *fp1, *fp2;
fp1 = fopen(file1, "r");
if (fp1 == NULL) {
printf ("Not able to open this file");
return -1;
}
fp2 = fopen(file2, "w");
if (fp2 == NULL) {
printf ("Not able to open this file");
fclose(fp1);
return -1;
}
buffer1 = (char*) malloc (sizeof(char)*bufferSize1);
if (buffer1 == NULL) {
printf ("Memory error");
return -1;
}
buffer2 = (char*) malloc (sizeof(char)*bufferSize2);
if (buffer2 == NULL) {
printf ("Memory error");
return -1;
}
bytesMoved=fread(buffer1, sizeof(buffer1),1,fp1);
//TODO: Fill buffer2 with maximum amount, either when buffer1 <= buffer2 or buffer1 > buffer2
//How do I iterate trough file1 and ensuring to always fill buffer 2 before writing?
bytesTransfered+=fwrite(buffer2, sizeof(buffer2),1,fp2);
fclose(fp1);
fclose(fp2);
return bytesTransfered;
}
How should I write the while loop for the buffer transfers before the fwrites?

I am having problems figuring out how to move data between the buffers when the sizes are different
Lay out a plan. For "some practice problems for job interviews", a good plan and the ability to justify it are important. Coding, although important, is secondary.
given valid: 2 FILE *, 2 buffers and their sizes
while write active && read active
  while write buffer not full && reading active
    if read buffer empty
      read
      update read active
    append min(read buffer length, write buffer available space) of read to write buffer
  if write buffer not empty
    pad write buffer
    write
    update write active
return file status
Now code it. A more robust solution would use a struct to group the FILE*, buffer, size, offset, length, active variables.
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
// Return true on problem
static bool rw(FILE *in_s, void *in_buf, size_t in_sz, FILE *out_s,
    void *out_buf, size_t out_sz) {
  size_t in_offset = 0;
  size_t in_length = 0;
  bool in_active = true;
  size_t out_length = 0;
  bool out_active = true;
  while (in_active && out_active) {
    // While room for more data
    while (out_length < out_sz && in_active) {
      if (in_length == 0) {
        in_offset = 0;
        // Element size 1, so the return value is a byte count
        in_length = fread(in_buf, 1, in_sz, in_s);
        in_active = in_length > 0;
      }
      // Append a portion of `in` to `out`: the smaller of what is
      // buffered and the room left (C has no standard min())
      size_t room = out_sz - out_length;
      size_t chunk = in_length < room ? in_length : room;
      memcpy((char*) out_buf + out_length, (char*) in_buf + in_offset, chunk);
      out_length += chunk;
      in_length -= chunk;
      in_offset += chunk;
    }
    if (out_length > 0) {
      // Padding only occurs, maybe, on last write
      memset((char*) out_buf + out_length, 0, out_sz - out_length);
      // Element size 1, so success means out_sz bytes were written
      out_active = fwrite(out_buf, 1, out_sz, out_s) == out_sz;
      out_length = 0;
    }
  }
  return ferror(in_s) || ferror(out_s);
}
Other notes:
Casting malloc() results is not needed. @Gerhardh
// buffer1 = (char*) malloc (sizeof(char)*bufferSize1);
buffer1 = malloc (sizeof *buffer1 * bufferSize1);
Use stderr for error messages. @Jonathan Leffler
Open the file in binary mode.
size_t is more robust for array/buffer sizes than int.
Consider sizeof buffer1 vs. sizeof (buffer1): the parentheses are not needed when sizeof is applied to an object.

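size_t offset = 0; /* start of the unwritten part of buffer1 */
while(bytesMoved > 0) {
    for(i=0; i<bytesMoved && i<bufferSize2; i++)
        buffer2[i]=buffer1[offset+i];
    bytesTransfered+=fwrite(buffer2, 1, i, fp2); /* size 1, so the result is a byte count */
    bytesMoved-=i;
    offset+=i;
}
If bufferSize1 is smaller than the file size, you need an outer loop.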

As the comments to your question have indicated, this solution is not the best way to transfer data from one file to another. However, your case has certain restrictions, which this solution accounts for.
(1) Since you are using a buffer, you do not need to read and write 1 char at a time; instead you can make as few calls to those functions as possible.
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
From the man page for fread(): with size set to 1, nmemb can be bufferSize1.
(2) You need to check the return value of fread() (bytesMoved) against both buffer sizes. If (a) bytesMoved equals bufferSize1, there may be more data to read; if (b) bufferSize2 is less than bytesMoved, there is still data from this read left to write. In either case, start the next transfer and, when it completes, return to the step you left off on.
Note: fread() and fwrite() continue from the stream's current file position, so data larger than the buffers is handled across successive calls.
PseudoCode:
/* in while() loop continue reading from file 1 until nothing is left to read;
   size 1 makes fread() return the number of bytes read */
while ((bytesMoved = fread(buffer1, 1, bufferSize1, fp1)) > 0)
{
    /* buffer2 may be too small to hold all of buffer1,
       so transfer it in slices of at most bufferSize2 bytes */
    for (i = 0; i < bytesMoved; i += n)
    {
        n = bytesMoved - i;
        if (n > bufferSize2)
            n = bufferSize2;
        /* transfer from buffer1 to buffer2 */
        for (j = 0; j < n; j++)
            buffer2[j] = buffer1[i + j];
        /* write only the n bytes just copied; the data is binary,
           so no '\0' terminator or strlen() is involved */
        if (fwrite(buffer2, 1, n, fp2) != n)
            return -1;
        bytesTransfered += n;
    }
}

Reading a File as Strings

I want to read the data of the file into a string.
Is there a function that reads the whole file into a character array?
I open the file like this:
FILE *fp;
for(i = 0; i < filesToRead; i++)
{
fp = fopen(name, "r");
// Read into a char array.
}
EDIT: So how do I read it "line by line"? With getchar()?

Here are three ways to read an entire file into a contiguous buffer:
Figure out the file length, then fread() the whole file. You can figure out the length with fseek() and ftell(), or you can use fstat() on POSIX systems. This will not work on sockets or pipes, it only works on regular files.
Read the file into a buffer which you dynamically expand as you read data using fread(). Typical implementations start with a "reasonable" buffer size and double it each time space is exhausted. This works on any kind of file.
On POSIX, use fstat() to get the file size and then mmap() to put the entire file in your address space. This only works on regular files.
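The second approach, a buffer that doubles whenever it fills up, can be sketched as follows. The function name read_all and the 4096-byte starting size are assumptions for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything left on fp into a malloc'd,
   NUL-terminated buffer and stores the byte count in *len_out.
   Returns NULL on read or allocation failure; the caller frees the result. */
char *read_all(FILE *fp, size_t *len_out)
{
    size_t cap = 4096;           /* "reasonable" starting size, an assumption */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Leave one byte free for the terminating NUL */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (len < cap - 1)       /* short read: end of file or error */
            break;
        cap *= 2;                /* buffer full: double it */
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

Because it never asks for the file length, this version also works on pipes and sockets wrapped in a FILE *.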

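You can do the following:
FILE *fp;
int currentBufferSize;
for(i = 0; i < filesToRead; i++)
{
    fp = fopen(name, "r");
    if(fp == NULL)
        continue;
    currentBufferSize = 0;
    filestring[i][0] = '\0';
    // fgets() returns NULL at end of file; append each line after the previous one (strlen() needs <string.h>)
    while(currentBufferSize < BUFFER_SIZE - 1 &&
          fgets(filestring[i] + currentBufferSize, BUFFER_SIZE - currentBufferSize, fp) != NULL)
        currentBufferSize += strlen(filestring[i] + currentBufferSize);
    fclose(fp);
}
Of course you would have to make this more robust, for example by detecting when the file is larger than your buffer.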

You might use something like the following, where you read each line, carefully check the result, and pass it to a data structure of your choosing. I have not shown how to properly allocate memory, but you can malloc up front and realloc when necessary.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#define FILE_BUFFER_SIZE 1024
int file_read_line(FILE *fp, char *buffer)
{
// Read the line to buffer; NULL means a read error or end of file with nothing read
if (fgets(buffer, FILE_BUFFER_SIZE, fp) == NULL)
return ferror(fp) ? -1 : 0;
// Check for End of File
if (feof(fp))
return 0;
return 1;
}
void file_read(FILE *fp)
{
int read;
char buffer[FILE_BUFFER_SIZE];
while (1) {
// Clear buffer for next line
buffer[0] = '\0';
// Read the next line with the appropriate read function
read = file_read_line(fp, buffer);
// file_read_line() returns a negative number only when an error occurred
if (read < 0) {
fprintf(stderr, "failed to read line: %s (%d)\n",
strerror(errno), errno);
exit(EXIT_FAILURE);
}
// Pass the read line `buffer` to whatever you want
// End of File reached
if (read == 0)
break;
}
return;
}

Trying to make program that counts number of bytes in a specified file (in C)

I am currently attempting to write a program that will tell its user how many times the specified 8-bit byte appears in the specified file.
I have some ground work laid out, but when it comes to making sure that the file makes it in to an array or buffer or whatever format I should put the file data into to check for the bytes, I feel I'm probably very far off from using the correct methods.
After that, I need to check whatever the file data gets put in to for the byte specified, but I am also unsure how to do this.
I think I may be over-complicating this quite a bit, so explaining anything that needs to be changed or that can just be scrapped completely is greatly appreciated.
Hopefully didn't leave out any important details.
Everything seems to be running (this code compiles), but when I try to printf the final statement at the bottom, it does not spit out the statement.
I have a feeling I just did not set up the final for loop correctly at all..
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
//#define BUFFER_SIZE (4096)
main(int argc, char *argv[]){ //argc = arg count, argv = array of arguements
char buffer[4096];
int readBuffer;
int b;
int byteCount = 0;
b = atoi(argv[2]);
FILE *f = fopen(argv[1], "rb");
unsigned long count = 0;
int ch;
if(argc!=3){ /* required number of args = 3 */
fprintf(stderr,"Too few/many arguements given.\n");
fprintf(stderr, "Proper usage: ./bcount path byte\n");
exit(0);
}
else{ /*open and read file*/
if(f == 0){
fprintf(stderr, "File could not be opened.\n");
exit(0);
}
}
if((b <= -1) || (b >= 256)){ /*checks to see if the byte provided is between 0 & 255*/
fprintf(stderr, "Byte provided must be between 0 and 255.\n");
exit(0);
}
else{
printf("Byte provided fits in range.\n");
}
int i = 0;
int k;
int newFile[i];
fseek(f, 0, SEEK_END);
int lengthOfFile = ftell(f);
for(k = 0; k < sizeof(buffer); k++){
while(fgets(buffer, lengthOfFile, f) != NULL){
newFile[i] = buffer[k];
i++;
}
}
if(newFile[i] = buffer[k]){
printf("same size\n");
}
for(i = 0; i < sizeof(newFile); i++){
if(b == newFile[i]){
byteCount++;
}
printf("Final for loop is working???\n");
}
}

OP is mixing fgets() with binary reads of a file.
fgets() reads a file up to the buffer size provided or until it reaches a \n byte. It is intended for text processing. The typical way to determine how much data was read via fgets() is to look for a final \n, which may or may not be there. The data read could also have embedded NUL bytes in it, so it becomes problematic to know where to stop scanning the buffer: at a NUL byte or at a \n.
Fortunately this can all be dispensed with, including the file seek and buffers.
// "rb" should be used when looking at a file in binary. C11 7.21.5.3 3
FILE *f = fopen(argv[1], "rb");
b = atoi(argv[2]);
unsigned long byteCount = 0;
int ch;
while ((ch = fgetc(f)) != EOF) {
if (ch == b) {
byteCount++;
}
}
The OP error checking is good. But the for(k = 0; k < sizeof(buffer); k++){ loop and its contents had various issues. OP had if(b = newFile[i]){ which should have been if(b == newFile[i]){
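If calling fgetc() once per byte is a concern, the same count can be done on chunks read with fread(). This is a sketch only: the name count_byte and the 4096-byte chunk size are assumptions, and the caller is expected to have opened the stream in "rb" mode and validated b as the question already does.

```c
#include <stdio.h>

/* Hypothetical helper: counts how often the byte value `b` (0..255)
   occurs in the rest of the stream. */
unsigned long count_byte(FILE *f, int b)
{
    unsigned char buf[4096];     /* unsigned, so values 128..255 compare correctly */
    unsigned long count = 0;
    size_t n;

    /* fread() reports how many bytes it stored, so embedded NUL or
       newline bytes need no special treatment */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        for (size_t i = 0; i < n; i++) {
            if (buf[i] == b)
                count++;
        }
    }
    return count;
}
```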

Not really an ANSWER --
Chux corrected the code, this is just more than fits in a comment.
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
struct stat st;
int rc=0;
if(argv[1])
{
rc=stat(argv[1], &st);
if(rc==0)
printf("bytes in file %s: %lld\n", argv[1], (long long) st.st_size);
else
{
perror("Cannot stat file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
return EXIT_FAILURE;
}
The stat() call is handy for getting file size and for determining file existence at the same time.
Applications use stat instead of reading the whole file, which is great for gigantic files.

copying contents of a text file in c

I want to read a text file and transfer its contents to another text file in C. Here is my code:
char buffer[100];
FILE* rfile=fopen ("myfile.txt","r+");
if(rfile==NULL)
{
printf("couldn't open File...\n");
}
fseek(rfile, 0, SEEK_END);
size_t file_size = ftell(rfile);
printf("%d\n",file_size);
fseek(rfile,0,SEEK_SET);
fread(buffer,file_size,1,rfile);
FILE* pFile = fopen ( "newfile.txt" , "w+" );
fwrite (buffer , 1 ,sizeof(buffer) , pFile );
fclose(rfile);
fclose (pFile);
return 0;
}
The problem I am facing is the appearance of unnecessary data in the receiving file.
I tried the fwrite function with both "sizeof(buffer)" and "file_size". In the first case it writes a larger number of useless characters, while in the second case there are only 3. I would really appreciate it if someone pointed out my mistake and told me how to get rid of these useless characters.

You are writing the entire content of buffer (100 chars) to the receiving file. You need to write exactly the amount of data that was read.
fwrite(buffer, 1, file_size, pFile)
Adding more checks for your code:
#include <stdio.h>
#include <stdlib.h>
#define BUFFER_SIZE 100
int main(void) {
char buffer[BUFFER_SIZE];
size_t file_size;
size_t ret;
FILE* rfile = fopen("input.txt","r+");
if(rfile==NULL)
{
printf("couldn't open File \n");
return 0;
}
fseek(rfile, 0, SEEK_END);
file_size = ftell(rfile);
fseek(rfile,0,SEEK_SET);
printf("File size: %zu\n",file_size);
if(!file_size) {
printf("Warning! Empty input file!\n");
} else if( file_size > BUFFER_SIZE ){
printf("Warning! File size greater than %d. File will be truncated!\n", BUFFER_SIZE);
file_size = BUFFER_SIZE;
}
ret = fread(buffer, sizeof(char), file_size, rfile);
if(file_size != ret) {
printf("I/O error\n");
} else {
FILE* pFile = fopen ( "newfile.txt" , "w+" );
if(!pFile) {
printf("Can not create the destination file\n");
} else {
ret = fwrite (buffer , 1 ,file_size , pFile );
if(ret != file_size) {
printf("Writing error!");
}
fclose (pFile);
}
}
fclose(rfile);
return 0;
}

You need to check the return values from all calls to fseek(), fread() and fwrite(), even fclose().
In your example, you have fread() read 1 block which is 100 bytes long. It's often a better idea to reverse the parameters, like this: ret = fread(buffer,1,file_size,rfile). The ret value will then show how many bytes it could read, instead of just saying it could not read a full block.
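The difference between the two argument orders can be seen in a small sketch. Both helper names below are hypothetical and exist only to contrast the two calls.

```c
#include <stdio.h>

/* Hypothetical demo helpers: both try to read up to `cap` bytes into buf. */

/* One element of `cap` bytes: the result is 1 only if all `cap` bytes
   arrived, otherwise 0, so a short file looks like "nothing read". */
size_t read_as_one_block(FILE *f, char *buf, size_t cap)
{
    return fread(buf, cap, 1, f);
}

/* `cap` elements of one byte each: the result is the number of bytes read. */
size_t read_as_bytes(FILE *f, char *buf, size_t cap)
{
    return fread(buf, 1, cap, f);
}
```

With a 5-byte file and a 100-byte buffer, the first form returns 0 and the second returns 5, which is the count you need to pass on to fwrite().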

Here is an implementation of an (almost) general purpose file copy function:
void fcopy(FILE *f_src, FILE *f_dst)
{
char buffer[BUFSIZ];
size_t n;
while ((n = fread(buffer, sizeof(char), sizeof(buffer), f_src)) > 0)
{
if (fwrite(buffer, sizeof(char), n, f_dst) != n)
err_syserr("write failed\n");
}
}
Given an open file stream f_src to read and another open file stream f_dst to write, it copies (the remainder of) the file associated with f_src to the file associated with f_dst. It does so moderately economically, using the buffer size BUFSIZ from <stdio.h>. Often, you will find that bigger buffers (such as 4 KiB or 4096 bytes, even 64 KiB or 65536 bytes) will give better performance. Going larger than 64 KiB seldom yields much benefit, but YMMV.
The code above calls an error reporting function (err_syserr()) which is assumed not to return. That's why I designated it 'almost general purpose'. The function could be upgraded to return an int value, 0 on success and EOF on a failure:
enum { BUFFER_SIZE = 4096 };
int fcopy(FILE *f_src, FILE *f_dst)
{
char buffer[BUFFER_SIZE];
size_t n;
while ((n = fread(buffer, sizeof(char), sizeof(buffer), f_src)) > 0)
{
if (fwrite(buffer, sizeof(char), n, f_dst) != n)
return EOF; // Optionally report write failure
}
if (ferror(f_src) || ferror(f_dst))
return EOF; // Optionally report I/O error detected
return 0;
}
Note that this design doesn't open or close files; it works with open file streams. You can write a wrapper that opens the files and calls the copy function (or includes the copy code into the function). Also note that to change the buffer size, I simply changed the buffer definition; I didn't change the main copy code. Also note that any 'function call overhead' in calling this little function is completely swamped by the overhead of the I/O operations themselves.
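A wrapper of the kind described might look like the sketch below. The name copy_file and its return convention are assumptions for illustration; the copy loop is repeated only so that the example compiles on its own.

```c
#include <stdio.h>

enum { BUFFER_SIZE = 4096 };

/* Same copy loop as above, repeated so the sketch is self-contained */
static int fcopy(FILE *f_src, FILE *f_dst)
{
    char buffer[BUFFER_SIZE];
    size_t n;
    while ((n = fread(buffer, sizeof(char), sizeof(buffer), f_src)) > 0) {
        if (fwrite(buffer, sizeof(char), n, f_dst) != n)
            return EOF;
    }
    if (ferror(f_src) || ferror(f_dst))
        return EOF;
    return 0;
}

/* Hypothetical wrapper: copies the file named src to the file named dst.
   Returns 0 on success and EOF on any failure. */
int copy_file(const char *src, const char *dst)
{
    FILE *f_src = fopen(src, "rb");
    if (f_src == NULL)
        return EOF;
    FILE *f_dst = fopen(dst, "wb");
    if (f_dst == NULL) {
        fclose(f_src);
        return EOF;
    }
    int rc = fcopy(f_src, f_dst);
    fclose(f_src);
    /* fclose() flushes buffered output, so its result matters too */
    if (fclose(f_dst) != 0)
        rc = EOF;
    return rc;
}
```

Opening both files in binary mode keeps the copy byte-for-byte identical on platforms that translate line endings in text mode.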

Note ftell returns a long, not a size_t. Shouldn't matter here, though. ftell itself is not necessarily a byte-offset, though. The standard requires it only to be an acceptable argument to fseek. You might get a better result from fgetpos, but it has the same portability issue from the lack of specification by the standard. (Confession: I didn't check the standard itself; got all this from the manpages.)
The more robust way to get a file size is with stat (or fstat, if you already have an open file descriptor).
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
struct stat stat_buf;
if (stat(filename, &stat_buf) == -1)
perror(filename), exit(EXIT_FAILURE);
file_size = stat_buf.st_size;

I think the parameters you passed to fwrite are not in the right order.
I think it should look like this:
fwrite(buffer,SIZE,1,pFile)
as the syntax of fwrite is
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.
So change the sequence and try again.

Correct way to read a text file into a buffer in C? [duplicate]

I'm dealing with small text files that I want to read into a buffer while I process them, so I've come up with the following code:
...
char source[1000000];
FILE *fp = fopen("TheFile.txt", "r");
if(fp != NULL)
{
while((symbol = getc(fp)) != EOF)
{
strcat(source, &symbol);
}
fclose(fp);
}
...
Is this the correct way of putting the contents of the file into the buffer, or am I abusing strcat()?
I then iterate through the buffer thus:
for(int x = 0; (c = source[x]) != '\0'; x++)
{
//Process chars
}

char source[1000000];
FILE *fp = fopen("TheFile.txt", "r");
if(fp != NULL)
{
while((symbol = getc(fp)) != EOF)
{
strcat(source, &symbol);
}
fclose(fp);
}
There are quite a few things wrong with this code:
It is very slow (you are extracting the buffer one character at a time).
If the filesize is over sizeof(source), this is prone to buffer overflows.
Really, when you look at it more closely, this code should not work at all. As stated in the man pages:
The strcat() function appends a copy of the null-terminated string s2 to the end of the null-terminated string s1, then add a terminating `\0'.
You are appending a character (not a NUL-terminated string!) to a string that may or may not be NUL-terminated. The only time I can imagine this working according to the man-page description is if every character in the file is NUL-terminated, in which case this would be rather pointless. So yes, this is most definitely a terrible abuse of strcat().
The following are two alternatives to consider using instead.
If you know the maximum buffer size ahead of time:
#include <stdio.h>
#define MAXBUFLEN 1000000
char source[MAXBUFLEN + 1];
FILE *fp = fopen("foo.txt", "r");
if (fp != NULL) {
size_t newLen = fread(source, sizeof(char), MAXBUFLEN, fp);
if ( ferror( fp ) != 0 ) {
fputs("Error reading file", stderr);
} else {
source[newLen++] = '\0'; /* Just to be safe. */
}
fclose(fp);
}
Or, if you do not:
#include <stdio.h>
#include <stdlib.h>
char *source = NULL;
FILE *fp = fopen("foo.txt", "r");
if (fp != NULL) {
/* Go to the end of the file. */
if (fseek(fp, 0L, SEEK_END) == 0) {
/* Get the size of the file. */
long bufsize = ftell(fp);
if (bufsize == -1) { /* Error */ }
/* Allocate our buffer to that size. */
source = malloc(sizeof(char) * (bufsize + 1));
/* Go back to the start of the file. */
if (fseek(fp, 0L, SEEK_SET) != 0) { /* Error */ }
/* Read the entire file into memory. */
size_t newLen = fread(source, sizeof(char), bufsize, fp);
if ( ferror( fp ) != 0 ) {
fputs("Error reading file", stderr);
} else {
source[newLen++] = '\0'; /* Just to be safe. */
}
}
fclose(fp);
}
free(source); /* Don't forget to call free() later! */

Yes - you would probably be arrested for your terrible abuse of strcat!
Take a look at fgets(): it reads the data a line at a time but, importantly, it limits the number of characters you read, so you don't overflow the buffer. (POSIX getline() instead grows the buffer for you.)
strcat is relatively slow because it has to search the entire string for the end on every character insertion.
You would normally keep a pointer to the current end of the string storage and pass that to fgets as the position to read the next line into.
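Keeping a pointer to the current end of the stored text can be sketched as below, here with fgets(), which takes both the position to write to and a size limit. The name read_lines is an assumption, and the sketch assumes a text file without embedded NUL bytes and a capacity that fits in an int.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends lines from fp into buf (capacity cap, cap > 0),
   tracking the end of the stored text instead of rescanning it.
   Returns the number of characters stored, excluding the final NUL. */
size_t read_lines(FILE *fp, char *buf, size_t cap)
{
    char *end = buf;             /* current end of the stored text */
    size_t room = cap;           /* bytes still free, including the NUL */

    buf[0] = '\0';
    /* fgets() writes at most `room` bytes, NUL included */
    while (room > 1 && fgets(end, (int)room, fp) != NULL) {
        size_t got = strlen(end);   /* scans only the new line */
        end += got;
        room -= got;
    }
    *end = '\0';                 /* keep the result terminated even after a read error */
    return (size_t)(end - buf);
}
```

Each strlen() call only walks the line that was just read, so the total work stays proportional to the file size rather than growing quadratically as it does with strcat().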

If you're on a linux system, once you have the file descriptor you can get a lot of information about the file using fstat()
http://linux.die.net/man/2/stat
so you might have
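#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
int main(void)
{
    struct stat st;
    int fd = open("TheFile.txt", O_RDONLY); //get file descriptor
    if (fd == -1 || fstat(fd, &st) == -1)
        return 1;
    //the size of the file is now in st.st_size
    close(fd);
    return 0;
}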
This avoids seeking to the beginning and end of the file.

See this article from JoelOnSoftware for why you don't want to use strcat.
Look at fread for an alternative. Use it with 1 for the size when you're reading bytes or characters.

Why don't you just use the array of chars you have? This ought to do it:
source[i] = getc(fp);
i++;

Not tested, but should work.. And yes, it could be better implemented with fread, I'll leave that as an exercise to the reader.
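#include <stdio.h>
#include <stdlib.h>
#define DEFAULT_SIZE 100
#define STEP_SIZE 100
char *buffer=malloc(DEFAULT_SIZE); //must come from malloc so that realloc can grow it
size_t buffer_sz=DEFAULT_SIZE;
size_t i=0;
int c;
if(buffer==NULL) exit(1);
while((c=fgetc(fp))!=EOF){ //test fgetc's result; feof() only becomes true after a read has failed
    buffer[i]=(char)c;
    i++;
    if(i>=buffer_sz){
        buffer_sz+=STEP_SIZE;
        void *tmp=buffer;
        buffer=realloc(buffer,buffer_sz);
        if(buffer==NULL){ free(tmp); exit(1);} //ensure we don't have a memory leak
    }
}
buffer[i]=0;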

Methinks you want fread:
http://www.cplusplus.com/reference/clibrary/cstdio/fread/

Have you considered mmap()? You can read from the file directly as if it were already in memory.
http://beej.us/guide/bgipc/output/html/multipage/mmap.html
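As a rough sketch of that idea on a POSIX system, the function below maps a regular file and scans it like an ordinary array. The name count_char_mmap and the counting task are assumptions chosen only to give the mapping something to do.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): maps a regular file read-only and counts
   occurrences of `target`. Returns the count, or -1 on failure. */
long count_char_mmap(const char *path, char target)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {       /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] == target)
            count++;
    }
    munmap(data, len);
    return count;
}
```

Note that the mapped bytes are not NUL-terminated, so the length from fstat() has to bound every loop over them.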
