I'm just a young computer science student, and currently I'm a bit confused about the best practice for reading a string from stdin. I know that there are a lot of ways to do that, some safer than others, and so on...
I currently need a function that prevents buffer overflow and appends a null terminator character (\0) to the end of the string. I found fgets really useful for that, but... it stops reading at \n or EOF! What if I want the user to input more than one line at a time? Is there some other function that can help me do that?
I'm sorry if this question can seem silly to some of you, but please, understand me!
Any help would be appreciated.
#define INITALLOC 16 /* #chars initally alloced */
#define STEP 8 /* #chars to realloc by */
#define END (-1) /* returned by getline to indicate EOF */
#define ALLOCFAIL 0 /* returned by getline to indicate allocation failure */
int getline(char **dynline)
{
size_t i;
int c;
size_t nalloced; /* #chars currently alloced */
if ((*dynline = malloc(INITALLOC)) == NULL)
return ALLOCFAIL;
nalloced = INITALLOC;
for (i = 0; (c = getchar()) != EOF; ++i) {
/* buffer is nearly full (one slot is kept for the terminator), request more memory */
if (i + 1 == nalloced)
if ((*dynline = realloc(*dynline, nalloced += STEP)) == NULL)
return ALLOCFAIL;
/* store the newly read character */
(*dynline)[i] = c;
}
/* zero terminate the string */
(*dynline)[i] = '\0';
/* the loop only ends at EOF: report END if nothing at all was read */
if (i == 0)
return END;
return (int)i; /* on success, return #chars read (i already counts them) */
}
This function allocates memory dynamically, so the caller needs to free the memory.
This code is not perfect. If realloc fails, NULL overwrites *dynline, so the previous, perfectly good buffer is leaked and the data read so far is lost.
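The leak described above is usually avoided by assigning the result of realloc to a temporary pointer first. A minimal sketch, assuming a hypothetical helper named grow_buffer (not part of the answer's code):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf by `step` bytes without losing it on failure.
   Returns 0 on success, -1 if realloc fails (*buf and *cap are then unchanged). */
int grow_buffer(char **buf, size_t *cap, size_t step)
{
    assert(buf != NULL && cap != NULL);
    char *tmp = realloc(*buf, *cap + step);  /* keep the old pointer until the result is known */
    if (tmp == NULL)
        return -1;                           /* *buf is still valid; the caller may use or free it */
    *buf = tmp;
    *cap += step;
    return 0;
}
```

On failure the caller still owns the old buffer, so it can return the partial data or free it, whichever suits the program.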
If a newline is encountered and fgets returns, you can run it again as many times as necessary to read as many lines as you want. A loop is useful for this.
If EOF is encountered, you have reached the end of the file(/stream) and there is no point in running it again, because there is nothing left to read.
An example showing the logic to read an entire string to EOF from stdin follows.
There are many ways to do this, and this is just one, but it shows the general logic.
The result buffer grows as the input is read, and there are no bounds on this – so if EOF is never reached you will eventually run out of memory and the program will exit. A simple check could avoid this, or depending on your application you could process the data as it comes in and not need to store it all.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define LINE_BUFFER_SIZE 256
// Each time this is exhausted, the buffer will be increased in size by this amount again.
#define INITIAL_BUFFER_SIZE 2048
int main (int argc, char **argv) {
char *result = malloc(INITIAL_BUFFER_SIZE);
if (!result) {
// Out of memory.
return 1;
}
size_t totalBytesRead = 0;
size_t bytesAllocated = INITIAL_BUFFER_SIZE;
char buf[LINE_BUFFER_SIZE];
while (fgets(buf, LINE_BUFFER_SIZE, stdin)) {
size_t bytesRead = strlen(buf);
size_t bytesNeeded = totalBytesRead + bytesRead + 1;
if (bytesAllocated < bytesNeeded) {
char *newPtr = realloc(result, bytesAllocated + INITIAL_BUFFER_SIZE);
if (newPtr) {
result = newPtr;
bytesAllocated += INITIAL_BUFFER_SIZE;
}
else {
// Out of memory.
free(result);
return 1;
}
}
memcpy(result + totalBytesRead, buf, bytesRead);
totalBytesRead += bytesRead;
}
result[totalBytesRead] = '\0';
// result contains the entire contents from stdin until EOF.
printf("%s", result);
free(result);
return 0;
}
On POSIX systems, you have getline. It can read an arbitrarily long line (until resources are exhausted) into heap-allocated memory.
You can also repeatedly call fgetc ... (BTW, you should define exactly what is a string for you)
On Linux, you can read an editable line from the terminal (that is, stdin when it is a tty) using GNU readline.
To read some kind of strings, you might use fscanf with e.g. %50s or %[A-Z] etc...
And you can read an array (of bytes, or some other binary data) using fread.
You might read an entire line and parse it later (perhaps using sscanf). You could read several lines and build some strings in heap memory (e.g. using asprintf or strdup on systems having it) from them.
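As a rough illustration of "read an entire line and parse it later", the sketch below reads one line with fgets and hands it to sscanf. The two-integer line format and the names parse_pair and read_pair are assumptions made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical line format: two integers separated by whitespace, e.g. "3 4\n".
   Returns 1 if both values were parsed, 0 otherwise. */
int parse_pair(const char *line, int *a, int *b)
{
    assert(line != NULL && a != NULL && b != NULL);
    return sscanf(line, "%d %d", a, b) == 2;
}

/* Reads one line (at most 255 characters) from `in` and parses it.
   Returns 1 on success, 0 on EOF, read error, or a malformed line. */
int read_pair(FILE *in, int *a, int *b)
{
    char line[256];
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    return parse_pair(line, a, b);
}
```

Because the whole line is consumed before parsing, a malformed line cannot leave stray characters in the stream for the next read, which is the usual problem with calling scanf directly.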
I'm making a program that takes a file of unknown size as input and dynamically allocates an array to match its size. Why do we need to subtract 1 from the size of the buffer to get the size of the column?
This is part of the code.
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_SIZE 500
void CountRowCol(FILE* fp);
void allocateMemory();
int main()
{
FILE* fp = NULL;
fp = fopen("test.txt", "r");
if (fp != NULL)
{
CountRowCol(fp);
allocateMemory();
}
else
printf("There is No file\n");
fclose(fp);
return 0;
}
void CountRowCol(FILE* fp)
{
int row = 0; int col = 0;
char buffer[MAX_SIZE];
if (fgets(buffer, 1000, fp))
{
col = strlen(buffer) - 1;
row = 1;
}
while (fgets(buffer, 1000, fp))
{
row++;
}
fclose(fp);
fp = NULL;
}
Why do we need to subtract 1 from the size of the buffer to get the size of the column?
As the code does not use col, col = strlen(buffer) - 1; has no direct usefulness (@shjeff).
Let us assume the code is trying to find the length of the first line without counting a final '\n' (@David Ranieri).
With strlen(buffer) - 1, the code risks two mistakes:
A '\n' may not exist in buffer[], so subtracting 1 for it is wrong. E.g. the line of input may exceed the buffer space, so no '\n' is saved.
Although it is rare for the first character read to be a null character, it is possible. Then strlen(buffer) is an unsigned 0 and strlen(buffer) - 1 is a very large value: SIZE_MAX. Assigning that to an int leads to implementation-defined behavior, commonly yielding -1.
A better way to lop off a potential '\n':
buffer[strcspn(buffer, "\n")] = 0;
col = strlen(buffer);
Wrong size
User code is lying to fgets(), as buffer is smaller than 1000 bytes (@the busybee).
#define MAX_SIZE 500
...
// if (fgets(buffer, 1000, fp))
if (fgets(buffer, sizeof buffer, fp))
fgets(buffer, 1000, fp)
This reads a line (assuming it fits in 999 bytes, one byte is used for string terminator, '\0').
This line includes the newline character, '\n'.
Subtracting 1 excludes that.
You might additionally want to remove the line feed from the string:
size_t len = strlen(buffer);
if (len > 0 && buffer[len-1] == '\n') {
// reduce line length by 1 and overwrite newline with 0
buffer[--len] = '\0';
} else {
// handle line which does not end in newline
}
Missing newline might be because file ended and did not have newline at the end, or because line was longer than the buffer. Here you can probably just... print error message and exit, if you aren't required to handle such input.
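The two reasons for a missing newline can be told apart, at least approximately, by checking whether the buffer was filled. A minimal sketch, assuming a hypothetical helper read_line and status names invented for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum line_status { LINE_COMPLETE, LINE_TRUNCATED, LINE_LAST_NO_NEWLINE, LINE_NONE };

/* Hypothetical helper: reads one line into buf (of size n) and strips the
   newline if one was stored. */
enum line_status read_line(FILE *in, char *buf, size_t n)
{
    assert(in != NULL && buf != NULL && n >= 2);
    if (fgets(buf, (int)n, in) == NULL)
        return LINE_NONE;                 /* EOF before any character, or a read error */
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';              /* normal case: drop the newline */
        return LINE_COMPLETE;
    }
    /* No newline: either the buffer filled up, or the file ended without one.
       A final line that exactly fills the buffer is reported as truncated. */
    return (len + 1 == n) ? LINE_TRUNCATED : LINE_LAST_NO_NEWLINE;
}
```

A caller that is not required to handle long lines can print an error and exit on LINE_TRUNCATED, while accepting LINE_LAST_NO_NEWLINE as a normal final line.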
In the C standard library, the character reading functions such as getchar return a value equal to the symbolic value (macro) EOF to indicate that an end-of-file condition has occurred. The actual value of EOF is implementation-dependent and must be negative (but is commonly −1, such as in glibc). Block-reading functions return the number of bytes read, and if this is fewer than asked for, then the end of file was reached or an error occurred (checking of errno or dedicated function, such as ferror is often required to determine which).
Source
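For the block-reading case, a short count from fread does not say by itself whether the cause was end of file or an error. The sketch below shows one way to ask the stream; read_block and the enum names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>

enum read_result { READ_FULL, READ_EOF, READ_ERROR };

/* Hypothetical helper: tries to read `want` bytes into dst and stores the
   number of bytes actually read in *got. */
enum read_result read_block(FILE *in, void *dst, size_t want, size_t *got)
{
    assert(in != NULL && dst != NULL && got != NULL);
    *got = fread(dst, 1, want, in);
    if (*got == want)
        return READ_FULL;
    /* Short count: the return value alone cannot say why, so ask the stream. */
    return ferror(in) ? READ_ERROR : READ_EOF;
}
```

Note that READ_EOF can still come with a non-zero *got: the last block of a file is usually shorter than requested and its bytes are valid.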
Is there a way to get input with one scanf call where the user only needs to type a sequence of numbers (separated by something if needed, like \n or ' '), and have it stored automatically in an array, the same way it is possible with strings?
Someone mentioned to me that %d... with three dots is the int equivalent of %s in scanf. Is that true?
I haven't seen any similar questions. I think they are simply talking about reading a string array. I am talking about reading an int array at once.
I think the answer to your question is no. (But I am wrong; see below.)
In C you really need to define everything very explicitly.
So if you want 5 numbers in an array you could write
scanf("%d %d %d %d %d", &array[0], &array[1], &array[2], &array[3], &array[4]);
but this is fragile: if the input does not match that format exactly, scanf stops early and leaves the remaining elements unset.
For what you want to achieve I would suggest reading in a whole line as a string and then processing the string to see how many numbers are in it and putting them in your array.
I am wrong in my initial assumption because since C99 there is a function called vscanf, which takes its arguments as a va_list and looks like it could do what you need.
I suggest you look at this question, which goes through it for vscanf. Still, I think even vscanf could be fragile, and it would be better to read the data line by line using fgets. fgets reads everything typed up to the Return key into a string, provided the input does not exceed the maximum number of characters you specify. You can then use sscanf or other functions to scan through the string and pick up the numbers.
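The "read a line, then pick up the numbers" step could look roughly like this. It uses strtol rather than sscanf because strtol reports where each number ended; parse_ints is a hypothetical name, and the line is assumed to have been read already (for example with fgets).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: extracts up to `max` whitespace-separated ints from
   `line`. Stops at the first token that is not a valid int.
   Returns the number of values stored in out[]. */
size_t parse_ints(const char *line, int *out, size_t max)
{
    assert(line != NULL && out != NULL);
    size_t count = 0;
    while (count < max) {
        char *end;
        errno = 0;
        long v = strtol(line, &end, 10);   /* skips leading whitespace itself */
        if (end == line)                   /* no digits found: end of the numbers */
            break;
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
            break;                         /* value does not fit in an int */
        out[count++] = (int)v;
        line = end;                        /* continue right after the parsed number */
    }
    return count;
}
```

The return value tells the caller how many numbers the line actually contained, so the array size does not have to be baked into a format string.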
You can quite easily implement such a function yourself.
Let's steal the interface (mostly) from POSIX.1 getline(), as it is known to work well. So:
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h> /* for strerror(), used in the example main() below */
size_t get_ints(int **dataptr, size_t *sizeptr, FILE *in)
{
int *data;
size_t size, used = 0;
int value;
/* Invalid parameters? None may be NULL. */
if (!dataptr || !sizeptr || !in) {
errno = EINVAL;
return 0;
}
/* Has an error in the input stream already occurred? */
if (ferror(in)) {
errno = EIO;
return 0;
}
/* Has data already been allocated? */
if (!*dataptr || !*sizeptr) {
/* No, not yet. */
*dataptr = NULL;
*sizeptr = 0;
}
data = *dataptr;
size = *sizeptr;
/* Read loop. */
while (1) {
/* Try reading one int. */
if (fscanf(in, " %d", &value) != 1)
break;
/* Have one. Make sure data array has room for value. */
if (used >= size) {
/* Reallocation policy. We need at least size = used + 1,
but realloc() calls are relatively slow, so we want
to allocate in larger chunks. This is just one typical
policy. */
if (used < 255)
size = 256; /* Minimum allocation is 256 ints. */
else
if (used < 1048575)
size = (3 * used) / 2; /* Grow by 50% ... */
else
size = (used | 1048575) + 1048577; /* up to 1048576, after which round up to next multiple of 1048576. */
data = realloc(data, size * sizeof data[0]);
if (!data) {
/* Note: the original data in *dataptr still exists! */
errno = ENOMEM;
return 0;
}
*dataptr = data;
*sizeptr = size;
}
/* Append to array. */
data[used++] = value;
}
/* An actual I/O error? */
if (ferror(in)) {
errno = EIO;
return 0;
}
/* No, either an end of stream, or the next stuff
in the stream is not an integer.
If used == 0, we want to ensure the caller knows
there was no error. For simplicity, we avoid that
check, and simply set errno = 0 always. */
errno = 0;
return used;
}
The interface to the above get_ints() function is simple: it takes a pointer to a dynamically allocated array, a pointer to the size (in ints) allocated for that array, and the stream handle, and returns the number of ints read from the stream. If an error occurs, it returns 0 with errno set to indicate the error. (getline() returns -1 instead.)
If no error occurs, this particular implementation sets errno = 0, but do note that this behaviour is uncommon; normally, you can only expect errno to have a valid error number when the function returns the error code (usually -1, but zero for this function).
The way you use this function is very easy. For example, let's assume you want to read an array of ints from the standard input:
int main(void)
{
int *iarray = NULL; /* No array allocated yet, */
size_t isize = 0; /* so allocated size is zero, and */
size_t icount = 0; /* no ints in it yet. */
icount = get_ints(&iarray, &isize, stdin);
if (!icount) {
/* No ints read. Error? */
if (errno)
fprintf(stderr, "Error reading from standard input: %s.\n", strerror(errno));
else
fprintf(stderr, "No integers in standard input.\n");
return EXIT_FAILURE;
}
printf("Read %zu integers from standard input.\n", icount);
/*
* Do something with the integers...
*/
/* Discard the dynamically allocated array. */
free(iarray);
iarray = NULL;
isize = 0;
icount = 0;
return EXIT_SUCCESS;
}
Note that as long as you initialize your iarray = NULL and isize = 0 when declaring it, the get_ints() function will allocate as much memory as is needed for the ints it reads. It is also then always safe to do free(iarray); iarray = NULL; isize = 0; to discard the array, even if iarray was NULL, because free(NULL) is safe to do (does nothing).
This is an excellent way of doing memory management in C. If you do it this way -- initialize your pointers to NULL, then free() them and reset them to NULL after they are no longer needed -- your programs are far less likely to have memory leaks or crash due to use-after-free or similar bugs. Many instructors won't bother, because they erroneously think that sort of carefulness can be tacked on later, if necessary.
(Do note, however, that even in the above program, in the error cases where the program is about to abort/exit/return, there is no such cleanup done. This is because the operating system will release all resources automatically. [Except for shared memory and filesystem objects.] Simply put, if your program is guaranteed to exit, it is not necessary to free dynamically allocated memory.)
I am trying to write a function that does the following things:
Start an input loop, printing '> ' each iteration.
Take whatever the user enters (unknown length) and read it into a character array, dynamically allocating the size of the array if necessary. The user-entered line will end at a newline character.
Add a null byte, '\0', to the end of the character array.
Loop terminates when the user enters a blank line: '\n'
This is what I've currently written:
void input_loop(){
char *str = NULL;
printf("> ");
while(printf("> ") && scanf("%a[^\n]%*c",&input) == 1){
/*Add null byte to the end of str*/
/*Do stuff to input, including traversing until the null byte is reached*/
free(str);
str = NULL;
}
free(str);
str = NULL;
}
Now, I'm not too sure how to go about adding the null byte to the end of the string. I was thinking something like this:
last_index = strlen(str);
str[last_index] = '\0';
But I'm not too sure if that would work though. I can't test if it would work because I'm encountering this error when I try to compile my code:
warning: ISO C does not support the 'a' scanf flag [-Wformat=]
So what can I do to make my code work?
EDIT: changing scanf("%a[^\n]%*c",&input) == 1 to scanf("%as[^\n]%*c",&input) == 1 gives me the same error.
First of all, scanf format strings do not use regular expressions, so I don't think something close to what you want will work. As for the error you get, according to my trusty manual, the %a conversion specifier is for floating-point numbers, but it only works in C99 (and your compiler is probably configured for C90).
But then you have a bigger problem. scanf expects you to pass it a previously allocated buffer for it to fill in with the input it reads. It does not malloc the string for you, so your attempts at initializing str to NULL and the corresponding frees will not work with scanf.
The simplest thing you can do is to give up on arbitrary-length strings. Create a large buffer and forbid inputs that are longer than that.
You can then use the fgets function to populate your buffer. To check whether it managed to read the full line, check whether your string ends with a "\n".
char str[256+1];
for(;;){
printf("> ");
if(!fgets(str, sizeof str, stdin)){
//error or end of file
break;
}
size_t len = strlen(str);
if(len + 1 == sizeof str && str[len - 1] != '\n'){
//buffer is full and no newline was stored: user typed something too long
exit(1);
}
printf("user typed %s", str);
}
Another alternative is to use a library function that is not part of standard C. For example, on Linux (and other POSIX systems) there is the getline function, which reads a full line of input using malloc behind the scenes.
No error checking, don't forget to free the pointer when you're done with it. If you use this code to read enormous lines, you deserve all the pain it will bring you.
#include <stdio.h>
#include <stdlib.h>
char *readInfiniteString() {
int l = 256;
char *buf = malloc(l);
int p = 0;
int ch; /* int rather than char, so EOF can be told apart from real characters */
ch = getchar();
while(ch != '\n' && ch != EOF) {
buf[p++] = ch;
if (p == l) {
l += 256;
buf = realloc(buf, l);
}
ch = getchar();
}
buf[p] = '\0';
return buf;
}
int main(int argc, char *argv[]) {
printf("> ");
char *buf = readInfiniteString();
printf("%s\n", buf);
free(buf);
}
If you are on a POSIX system such as Linux, you should have access to getline. It can be made to behave like fgets, but if you start with a null pointer and a zero length, it will take care of memory allocation for you.
You can use it in a loop like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // for strcmp
int main(void)
{
char *line = NULL;
size_t nline = 0;
for (;;) {
ssize_t n; // getline returns ssize_t, -1 on EOF or error
printf("> ");
// read line, allocating as necessary
n = getline(&line, &nline, stdin);
if (n < 0) break;
// remove trailing newline
if (n && line[n - 1] == '\n') line[n - 1] = '\0';
// do stuff
printf("'%s'\n", line);
if (strcmp("quit", line) == 0) break;
}
free(line);
printf("\nBye\n");
return 0;
}
The passed pointer and the length value must be consistent, so that getline can reallocate memory as required. (That means that you shouldn't change nline or the pointer line in the loop.) If the line fits, the same buffer is used in each pass through the loop, so that you have to free the line string only once, when you're done reading.
Some have mentioned that scanf is probably unsuitable for this purpose. I wouldn't suggest using fgets, either. Though it is slightly more suitable, there are problems that seem difficult to avoid, at least at first. Few C programmers manage to use fgets right the first time without reading the fgets manual in full. The parts most people manage to neglect entirely are:
what happens when the line is too large, and
what happens when EOF or an error is encountered.
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgets() shall return a null pointer. If a read error occurs, the error indicator for the stream shall be set, fgets() shall return a null pointer...
I don't feel I need to stress the importance of checking the return value too much, so I won't mention it again. Suffice to say, if your program doesn't check the return value your program won't know when EOF or an error occurs; your program will probably be caught in an infinite loop.
When no '\n' is present, the remaining bytes of the line are yet to have been read. Thus, fgets will always parse the line at least once, internally. When you introduce extra logic, to check for a '\n', to that, you're parsing the data a second time.
This allows you to realloc the storage and call fgets again if you want to dynamically resize the storage, or discard the remainder of the line (warning the user of the truncation is a good idea), perhaps using something like fscanf(file, "%*[^\n]");.
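The "discard the remainder of the line" option could be sketched as follows. The helper name read_line_discard and its truncated flag are assumptions for this example; the %n directive is used only to learn whether the scanset skipped anything.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (of size n), dropping the
   newline. If the line does not fit, the excess is discarded and *truncated
   is set to 1. Returns 0 at EOF/error with nothing read, 1 otherwise. */
int read_line_discard(FILE *in, char *buf, size_t n, int *truncated)
{
    assert(in != NULL && buf != NULL && n >= 2 && truncated != NULL);
    *truncated = 0;
    if (fgets(buf, (int)n, in) == NULL)
        return 0;
    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {               /* the whole line fit */
        buf[len] = '\0';
        return 1;
    }
    /* No newline stored: skip the rest of the line. %n stays untouched
       (so skipped stays 0) if the scanset matches nothing. */
    int skipped = 0;
    int rc = fscanf(in, "%*[^\n]%n", &skipped);
    (void)rc;
    *truncated = skipped > 0;
    (void)getc(in);                       /* consume the newline itself, if any */
    return 1;
}
```

The caller can use the truncated flag to warn the user that part of the input was dropped.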
hugomg mentioned using multiplication in the dynamic resize code to avoid quadratic runtime problems. Along this line, it would be a good idea to avoid parsing the same data over and over each iteration (thus introducing further quadratic runtime problems). This can be achieved by storing the number of bytes you've read (and parsed) somewhere. For example:
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL, *temp;
do {
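size_t alloc_size = bytes_read * 2 + 2; /* room for at least one new byte plus the terminator */
temp = realloc(bytes, alloc_size);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
bytes[bytes_read] = '\0'; /* keeps the buffer a valid string if fgets stores nothing */
temp = fgets(bytes + bytes_read, alloc_size - bytes_read, f); /* Parsing data the first time */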
bytes_read += strcspn(bytes + bytes_read, "\n"); /* Parsing data the second time */
} while (temp && bytes[bytes_read] != '\n');
bytes[bytes_read] = '\0';
return bytes;
}
Those who do manage to read the manual and come up with something correct (like this) may soon realise the complexity of an fgets solution is at least twice as poor as the same solution using fgetc. We can avoid parsing data the second time by using fgetc, so using fgetc might seem most appropriate. Alas most C programmers also manage to use fgetc incorrectly when neglecting the fgetc manual.
The most important detail is to realise that fgetc returns an int, not a char. It may return typically one of 256 distinct values, between 0 and UCHAR_MAX (inclusive). It may otherwise return EOF, meaning there are typically 257 distinct values that fgetc (or consequently, getchar) may return. Trying to store those values into a char or unsigned char results in loss of information, specifically the error modes. (Of course, this typical value of 257 will change if CHAR_BIT is greater than 8, and consequently UCHAR_MAX is greater than 255)
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL;
do {
if ((bytes_read & (bytes_read + 1)) == 0) {
void *temp = realloc(bytes, bytes_read * 2 + 1);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
}
int c = fgetc(f);
bytes[bytes_read] = c >= 0 && c != '\n'
? c
: '\0';
} while (bytes[bytes_read++]);
return bytes;
}
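To make the point about fgetc returning an int concrete, here is a small sketch (count_bytes is a hypothetical name). With a plain char, the byte 0xFF could compare equal to EOF where char is signed, and EOF could never match where char is unsigned.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes remaining in `in`. The variable c must be an int so that
   every byte value (0..UCHAR_MAX) stays distinct from EOF. */
size_t count_bytes(FILE *in)
{
    assert(in != NULL);
    size_t n = 0;
    int c;
    while ((c = fgetc(in)) != EOF)
        n++;
    return n;
}
```

Declaring c as char instead would typically make this loop stop early at a 0xFF byte or never stop at all, depending on the platform.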
I'm trying to use the getdelim function to read an entire text file's contents into a string.
Here is the code I am using:
ssize_t bytesRead = getdelim(&buffer, 0, '\0', fp);
This is failing however, with strerror(errno) saying "Error: Invalid Argument"
I've looked at all the documentation I could and just can't get it working, I've tried getline which does work but I'd like to get this function working preferably.
buffer is NULL initialised as well so it doesn't seem to be that
fp is also not reporting any errors and the file opens perfectly
EDIT: My implementation is based on an answer from this stackoverflow question Easiest way to get file's contents in C
Kervate, please enable compiler warnings (-Wall for gcc), and heed them. They are helpful; why not accept all the help you can get?
As pointed out by WhozCraig and n.m. in comments to your original question, the getdelim() man page shows the correct usage.
If you wanted to read records delimited by the NUL character, you could use
FILE *input; /* Or, say, stdin */
char *buffer = NULL;
size_t size = 0;
ssize_t length;
while (1) {
length = getdelim(&buffer, &size, '\0', input);
if (length == (ssize_t)-1)
break;
/* buffer has length chars, including the trailing '\0' */
}
free(buffer);
buffer = NULL;
size = 0;
if (ferror(input) || !feof(input)) {
/* Error reading input, or some other reason
* that caused an early break out of the loop. */
}
If you want to read the contents of a file into a single character array, then getdelim() is the wrong function.
Instead, use realloc() to dynamically allocate and grow the buffer, appending to it using fread(). To get you started -- this is not complete! -- consider the following code:
FILE *input; /* Handle to the file to read, assumed already open */
char *buffer = NULL;
size_t size = 0;
size_t used = 0;
size_t more;
while (1) {
/* Grow buffer when less than 500 bytes of space. */
if (used + 500 >= size) {
size_t new_size = used + 30000; /* Allocate 30000 bytes more. */
char *new_buffer;
new_buffer = realloc(buffer, new_size);
if (!new_buffer) {
free(buffer); /* Old buffer still exists; release it. */
buffer = NULL;
size = 0;
used = 0;
fprintf(stderr, "Not enough memory to read file.\n");
exit(EXIT_FAILURE);
}
buffer = new_buffer;
size = new_size;
}
/* Try reading more data, as much as fits in buffer. */
more = fread(buffer + used, 1, size - used, input);
if (more == 0)
break; /* Could be end of file, could be error */
used += more;
}
Note that the buffer in this latter snippet is not a string. There is no terminating NUL character, so it's just an array of chars. In fact, if the file contains binary data, the array may contain lots of NULs (\0, zero bytes). Assuming there was no error and all of the file was read (you need to check for that, see the former example), buffer contains used chars read from the file, with enough space allocated for size. If used > 0, then size > used. If used == 0, then size may or may not be zero.
If you want to turn buffer into a string, you need to decide what to do with the possibly embedded \0 bytes -- I recommend either convert to e.g. spaces or tabs, or move the data to skip them altogether --, and add the string-terminating \0 at end to make it a valid string.
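The first suggestion (convert embedded zero bytes, then terminate) might be sketched like this. make_string and its fill parameter are hypothetical; the function assumes size > used, which the read loop above guarantees whenever used > 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turns buf[0..used) into a C string in place by
   replacing embedded NUL bytes with `fill` and appending a terminator.
   Requires size > used so that the terminator fits. */
void make_string(char *buf, size_t used, size_t size, char fill)
{
    assert(buf != NULL && size > used);
    for (size_t i = 0; i < used; i++)
        if (buf[i] == '\0')
            buf[i] = fill;
    buf[used] = '\0';
}
```

After the call, strlen(buf) equals used, provided fill itself is not '\0'.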
I want to read the full contents of a file into a buffer. The file actually contains only a string, which I need to compare with another string.
What would be the most efficient option that is also portable to Linux?
ENV: Windows
Portability between Linux and Windows is a big headache, since Linux is a POSIX-conformant system with - generally - a proper, high quality toolchain for C, whereas Windows doesn't even provide a lot of functions in the C standard library.
However, if you want to stick to the standard, you can write something like this:
#include <stdio.h>
#include <stdlib.h>
FILE *f = fopen("textfile.txt", "rb");
fseek(f, 0, SEEK_END);
long fsize = ftell(f);
fseek(f, 0, SEEK_SET); /* same as rewind(f); */
char *string = malloc(fsize + 1);
fread(string, fsize, 1, f);
fclose(f);
string[fsize] = 0;
Here string will contain the contents of the text file as a properly 0-terminated C string. This code is just standard C; it's not POSIX-specific (although that doesn't guarantee it will work or compile on Windows...).
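The snippet above checks none of its calls. A sketch of the same fseek/ftell approach with the checks added might look like this; slurp is a hypothetical name, and the stream is assumed to be already open and seekable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole seekable stream into a malloc'd,
   NUL-terminated buffer. Returns NULL on any failure; on success stores
   the number of bytes read in *len. The caller frees the result. */
char *slurp(FILE *f, size_t *len)
{
    assert(f != NULL && len != NULL);
    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;                      /* not seekable (pipe, terminal, ...) */
    long fsize = ftell(f);
    if (fsize < 0 || fseek(f, 0, SEEK_SET) != 0)
        return NULL;
    char *s = malloc((size_t)fsize + 1);
    if (s == NULL)
        return NULL;
    size_t got = fread(s, 1, (size_t)fsize, f);
    if (got != (size_t)fsize && ferror(f)) {
        free(s);
        return NULL;
    }
    s[got] = '\0';                        /* terminate at what was actually read */
    *len = got;
    return s;
}
```

Terminating at got rather than fsize matters for files opened in text mode on Windows, where newline translation can make fread return fewer bytes than ftell reported.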
Here is what I would recommend.
It should conform to C89, and be completely portable. In particular, it works also on pipes and sockets on POSIXy systems.
The idea is that we read the input in large-ish chunks (READALL_CHUNK), dynamically reallocating the buffer as we need it. We only use realloc(), fread(), ferror(), and free():
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
/* Size of each input chunk to be
read and allocate for. */
#ifndef READALL_CHUNK
#define READALL_CHUNK 262144
#endif
#define READALL_OK 0 /* Success */
#define READALL_INVALID -1 /* Invalid parameters */
#define READALL_ERROR -2 /* Stream error */
#define READALL_TOOMUCH -3 /* Too much input */
#define READALL_NOMEM -4 /* Out of memory */
/* This function returns one of the READALL_ constants above.
If the return value is zero == READALL_OK, then:
(*dataptr) points to a dynamically allocated buffer, with
(*sizeptr) chars read from the file.
The buffer is allocated for one extra char, which is NUL,
and automatically appended after the data.
Initial values of (*dataptr) and (*sizeptr) are ignored.
*/
int readall(FILE *in, char **dataptr, size_t *sizeptr)
{
char *data = NULL, *temp;
size_t size = 0;
size_t used = 0;
size_t n;
/* None of the parameters can be NULL. */
if (in == NULL || dataptr == NULL || sizeptr == NULL)
return READALL_INVALID;
/* A read error already occurred? */
if (ferror(in))
return READALL_ERROR;
while (1) {
if (used + READALL_CHUNK + 1 > size) {
size = used + READALL_CHUNK + 1;
/* Overflow check. Some ANSI C compilers
may optimize this away, though. */
if (size <= used) {
free(data);
return READALL_TOOMUCH;
}
temp = realloc(data, size);
if (temp == NULL) {
free(data);
return READALL_NOMEM;
}
data = temp;
}
n = fread(data + used, 1, READALL_CHUNK, in);
if (n == 0)
break;
used += n;
}
if (ferror(in)) {
free(data);
return READALL_ERROR;
}
temp = realloc(data, used + 1);
if (temp == NULL) {
free(data);
return READALL_NOMEM;
}
data = temp;
data[used] = '\0';
*dataptr = data;
*sizeptr = used;
return READALL_OK;
}
Above, I've used a constant chunk size, READALL_CHUNK == 262144 (256*1024). This means that in the worst case, up to 262145 chars are wasted (allocated but not used), but only temporarily. At the end, the function reallocates the buffer to the optimal size. Also, this means that we do four reallocations per megabyte of data read.
The 262144-byte default in the code above is a conservative value; it works well for even old minilaptops and Raspberry Pis and most embedded devices with at least a few megabytes of RAM available for the process. Yet, it is not so small that it slows down the operation (due to many read calls, and many buffer reallocations) on most systems.
For desktop machines at this time (2017), I recommend a much larger READALL_CHUNK, perhaps #define READALL_CHUNK 2097152 (2 MiB).
Because the definition of READALL_CHUNK is guarded (i.e., it is defined only if it is at that point in the code still undefined), you can override the default value at compile time, by using (in most C compilers) -DREADALL_CHUNK=2097152 command-line option -- but do check your compiler options for defining a preprocessor macro using command-line options.
A portable solution could use getc.
#include <stdio.h>
char buffer[MAX_FILE_SIZE];
size_t i;
for (i = 0; i < MAX_FILE_SIZE - 1; ++i) /* leave room for the terminator */
{
int c = getc(fp);
if (c == EOF)
break;
buffer[i] = c;
}
buffer[i] = 0x00; /* always terminate, even if the file filled the buffer */
If you don't want to have a MAX_FILE_SIZE macro or if it is a big number (such that buffer would be too big to fit on the stack), use dynamic allocation.
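A dynamically allocated variant of the getc loop could look roughly like this. The name read_stream, the initial capacity of 64, and the doubling policy are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads `fp` to EOF into a malloc'd, NUL-terminated
   buffer. Returns NULL on allocation failure or read error.
   The caller frees the result. */
char *read_stream(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (len + 1 == cap) {             /* keep one byte free for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Doubling the capacity keeps the number of realloc calls logarithmic in the file size, at the cost of up to half the buffer being unused at the end.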