Read user input without maxsize in C

In C I can use the char *fgets(char *s, int size, FILE *stream) function to read user input from stdin, but each call reads at most size - 1 characters.
How can I read user input of variable size?

In C you are responsible for your buffers and for their size, so standard C has no ready-made dynamic buffer for you.
The only solution is to read in a loop (with either fgets or fgetc, depending on your processing and on your stop condition) and grow the buffer yourself as needed.
If you go beyond C to C++, you will find that you can read into std::string objects of variable size (there you need to deal with word and/or line termination instead, and loop again).

This function reads from standard input until end-of-file is encountered, and returns the number of characters read. It should be fairly easy to modify it to read exactly one line, or similar.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> // ssize_t
ssize_t read_from_stdin(char **s)
{
char buf[1024];
char *p;
char *tmp;
ssize_t total;
size_t len;
size_t allocsize;
if (s == NULL) {
return -1;
}
total = 0;
allocsize = 1024;
p = malloc(allocsize);
if (p == NULL) {
*s = NULL;
return -1;
}
while(fgets(buf, sizeof(buf), stdin) != NULL) {
len = strlen(buf);
if (total + len >= allocsize) {
allocsize <<= 1;
tmp = realloc(p, allocsize);
if (tmp == NULL) {
free(p);
*s = NULL;
return -1;
}
p = tmp;
}
memcpy(p + total, buf, len);
total += len;
}
p[total] = 0;
*s = p;
return total;
}
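The "read exactly one line" modification mentioned above could look roughly like the following. This is only a sketch under a few assumptions: read_one_line is a hypothetical name, the stream is passed in as a parameter instead of being hard-wired to stdin, the return type is long to avoid depending on ssize_t, and the 128-byte chunk size is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the answer above: reads one line from fp
 * into a malloc'd buffer stored in *s. Returns the line length (newline
 * excluded), or -1 on end-of-file with nothing read or on allocation
 * failure. The caller frees *s. */
long read_one_line(FILE *fp, char **s)
{
    char buf[128];
    size_t total = 0, allocsize = sizeof buf;
    char *p, *tmp;
    int got_any = 0;

    if (s == NULL)
        return -1;
    *s = NULL;
    p = malloc(allocsize);
    if (p == NULL)
        return -1;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        got_any = 1;
        if (total + len + 1 > allocsize) {
            allocsize *= 2;   /* len < sizeof buf, so one doubling is enough */
            tmp = realloc(p, allocsize);
            if (tmp == NULL) {
                free(p);
                return -1;
            }
            p = tmp;
        }
        memcpy(p + total, buf, len);
        total += len;
        if (len > 0 && buf[len - 1] == '\n') {
            total--;          /* the line is complete: drop the newline */
            break;
        }
    }
    if (!got_any) {
        free(p);
        return -1;
    }
    p[total] = '\0';
    *s = p;
    return (long)total;
}
```

Stopping at the newline instead of at end-of-file is the only structural change; the doubling strategy is the same as in the function above.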

Related

dynamic buffer size for reading input

I am trying to create a program that will read line by line from stdin, search that line for the start and end of a given word and output all the matching words. Here is the code:
int main()
{
char buffer[100];
char **words = NULL;
int word_count = 0;
while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
int length = strlen(buffer);
if (buffer[length - 1] == '\n') {
word_count = count_words(buffer, FIRSTCHAR);
if (word_count > 0) {
words = get_words(buffer, FIRSTCHAR, LASTCHAR);
for (int i = 0; i < word_count; ++i) {
printf("%s\n", words[i]);
free(words[i]);
}
free(words);
}
}
}
return 0;
}
I got the basic functionality working, but I am relying on fgets() with a fixed buffer size.
What I would like is to dynamically allocate a memory buffer with a size based on the length of each line.
I can only see one way of going about solving it, which is to iterate over input with fgetc and increment a counter until end of line and use that counter in place of sizeof(buffer), but I don't know how I would get fgetc to read the correct relevant line.
Is there any smart way of solving this?
but I am relying on fgets() with a fixed buffer size. What I would like is to dynamically allocate a memory buffer with a size based on the length of each line
I wrote a version of fgets for another SO answer that reads the whole line and returns a
malloc-allocated pointer to the contents of the whole line. This is the
code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *fgets_long(FILE *fp)
{
size_t size = 0, currlen = 0;
char line[1024];
char *ret = NULL, *tmp;
while(fgets(line, sizeof line, fp))
{
int wholeline = 0;
size_t len = strlen(line);
if(line[len - 1] == '\n')
{
line[len-- - 1] = 0;
wholeline = 1;
}
if(currlen + len >= size)
{
// we need more space in the buffer
size += (sizeof line) - (size ? 1 : 0);
tmp = realloc(ret, size);
if(tmp == NULL)
break; // return all we've got so far
ret = tmp;
}
memcpy(ret + currlen, line, len + 1);
currlen += len;
if(wholeline)
break;
}
if(ret)
{
tmp = realloc(ret, currlen + 1);
if(tmp)
ret = tmp;
}
return ret;
}
The trick is to check if the newline was read. If it was read, then you can
return the buffer, otherwise it reallocates the buffer with sizeof line more
bytes and appends it to the buffer. You could use this function if you like.
Alternatively, if you are on a POSIX system (getline is part of POSIX.1-2008 and was a GNU extension before that), you
can use getline as well.
void foo(FILE *fp)
{
char *line = NULL;
size_t len = 0;
if(getline(&line, &len, fp) < 0)
{
free(line); // man page says even on failure you should free
fprintf(stderr, "could not read whole line\n");
return;
}
printf("The whole line is: '%s'\n", line);
free(line);
return;
}
The function getline() does just what you want. Its prototype:
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
The function is declared in the stdio.h header file and usually requires something like #define _POSIX_C_SOURCE 200809L or #define _GNU_SOURCE before the first #include in the file that calls getline().
I strongly suggest reading and understanding the man page for getline() for all the grubby details.
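To show how getline is typically used across many lines, here is a small sketch that reuses a single buffer for the whole file. It assumes a POSIX.1-2008 system; longest_line is a hypothetical helper name chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* must come before any #include */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line in fp
 * (newline excluded), reusing one getline() buffer for every line. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;      /* capacity of line, maintained by getline */
    ssize_t nread;
    size_t longest = 0;

    while ((nread = getline(&line, &cap, fp)) != -1) {
        size_t len = (size_t)nread;
        if (len > 0 && line[len - 1] == '\n')
            len--;       /* getline keeps the newline; do not count it */
        if (len > longest)
            longest = len;
    }
    free(line);          /* one free after the loop, also after failure */
    return longest;
}
```

Because line and cap persist between calls, getline only reallocates when a line is longer than any it has seen so far.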

How to malloc for getline implementation

I'm trying to add getline support to http-fs-wrapper and I have some malloc problems.
ssize_t _intercept_getdelim(int fd, char **lineptr, size_t *n, int delim)
{
intercept_t *obj = intercept[fd];
int counter;
size_t nc = sizeof(char);
counter = -1;
while (obj->offset < obj->size)
{
++counter;
if (*lineptr) {
*lineptr = realloc(*lineptr, (counter + 2) * nc);
}
else {
*lineptr = malloc(nc);
}
_intercept_read(fd, lineptr[counter], nc);
if (*lineptr[counter] == delim)
{
break;
}
}
*n = counter ? counter + 1 : counter;
*lineptr[counter + 2] = '\0';
// Why do we need a *n when the return value is the same??
return *n;
}
Here's the relevant section of _intercept_read:
size_t _intercept_read(int fd, void *buf, size_t count)
{
memcpy(buf, obj->ra_buf+bo, count);
When I step through this in gdb, the second iteration throws a SIGSEGV (from memcpy -- it's not the ending \0, it's still inside the loop). I also don't quite get what's the difference between the *n of getline/getdelim and the return value.
The difference between n and the return value is that *n is always the buffer size, while the return value is the number of characters read, or -1 for error states per the POSIX spec. You aren't fully handling EOF (it should return -1 if it hits EOF and hasn't read anything yet).
A note: calling realloc for every character is fairly inefficient. The standard pattern is to double the buffer size each time it is necessary. This is another way the return value and n can differ, since n is the buffer size, which can be much larger than the read character count it returns.
You also don't need to special-case a starting null pointer; realloc behaves like malloc in that case.
buf = realloc(buf, ...) is an unsafe pattern: realloc can return null, so you have to save the realloc result to a temp variable and check it before assigning, otherwise you both leak memory and can dereference a null pointer.
I don't think there's actually space for the trailing null you're adding to the buffer at the end there. Also, lineptr[counter] and *lineptr[counter] index the char ** itself rather than the buffer it points to; you want (*lineptr)[counter] or *lineptr + counter, which is the likely source of the SIGSEGV on the second iteration.
This works:
ssize_t _intercept_getdelim(int fd, char **lineptr, size_t *n, int delim)
{
intercept_t *obj = intercept[fd];
int counter = -1;
char *c, *newbuf;
*n = 1;
*lineptr = malloc(*n);
while (obj->offset < obj->size)
{
++counter;
if ((size_t)counter + 1 >= *n) // keep one byte free for the terminating '\0'
{
if ((newbuf = realloc(*lineptr, *n << 1)))
{
*n = *n << 1;
*lineptr = newbuf;
}
else
{
return -1;
}
}
c = *lineptr + counter;
_intercept_read(fd, c, 1); // one char at a time (nc from the first version is not declared here)
if (*c == delim)
{
break;
}
}
if (counter > -1)
{
*(*lineptr + ++counter) = '\0';
}
return counter;
}
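The points from the answer above (n is the capacity, the return value is the count or -1, double the buffer, keep the realloc result in a temporary) can be combined into one sketch. It reads from a FILE * rather than through http-fs-wrapper's intercept layer, which is an assumption made only so the example is self-contained; sketch_getdelim is a hypothetical name and the initial capacity of 16 is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical getdelim-style function (renamed so it cannot clash with the
 * real one). *n is the capacity of *lineptr; the return value is the number
 * of characters stored, or -1 on EOF-with-nothing-read or on error. */
long sketch_getdelim(char **lineptr, size_t *n, int delim, FILE *stream)
{
    size_t used = 0;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL)
        return -1;
    if (*lineptr == NULL)
        *n = 0;                      /* no buffer yet: capacity is zero */

    while ((c = fgetc(stream)) != EOF) {
        if (used + 2 > *n) {         /* room for this char plus the '\0' */
            size_t newcap = *n ? *n * 2 : 16;
            char *tmp = realloc(*lineptr, newcap); /* realloc(NULL, x) acts like malloc(x) */
            if (tmp == NULL)
                return -1;           /* the old buffer is untouched and still the caller's */
            *lineptr = tmp;
            *n = newcap;
        }
        (*lineptr)[used++] = (char)c;
        if (c == delim)
            break;
    }
    if (used == 0)
        return -1;                   /* hit EOF before reading anything */
    (*lineptr)[used] = '\0';
    return (long)used;
}
```

After reading a 3-character line into a fresh buffer, the function returns 3 while *n is 16, which is exactly the difference between the two values that the question asks about.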

Read line from file issue

I wrote this simple readline function. It can return each line's length, but it doesn't return a pointer to the allocated buffer. Another issue is that the last line is ignored (it isn't returned):
FILE *passFile = NULL;
char *current = NULL;
size_t len = 0;
passFile = fopen("pass.txt", "r");
while(readline(passFile, &current, &len) != -1) {
printf("%s\n", current); // SEGMENTATION FAULT
printf("%zu\n", len); // len is a size_t, so %zu rather than %d
free(current);
current = NULL;
}
ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
char c, *buf = NULL;
size_t n = 0;
buf = (char*)malloc(sizeof(char));
while((c = fgetc(file)) != '\n' && (c != EOF)) {
buf[n] = c;
++n;
buf = realloc(buf, n + 1);
}
buf[n] = '\0';
*bufPtr = buf;
*len = n;
if(c == EOF) // reach end of file
return -1;
return 0;
}
Your readline() function is not returning a pointer to allocated memory. In your call, current is never set, so the pointer is invalid and you get the error.
In C, functions are "call by value". Inside readline(), bufPtr is a copy of whatever was passed to readline(). Assigning to bufPtr merely overwrites the local copy and does not return a value that the calling code can see.
In pseudocode:
TYPE a;
define function foo(TYPE x)
{
x = new_value;
}
foo(a); // does not change a
This only changes the local copy of x and does not return a value. If you change the function to take a pointer, it still gets a copy, but now it's a copy of a pointer, and it can use that pointer value to find the original variable. In pseudocode:
TYPE a;
define function foo(TYPE *px)
{
*px = new_value;
}
foo(&a); // does change a
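The two pseudocode cases translate to C roughly as follows. This is an illustrative sketch only; set_by_value and set_by_pointer are hypothetical names, and "hello" is placeholder data.

```c
#include <stdlib.h>
#include <string.h>

/* Receives a copy of the caller's pointer: the assignment to buf is lost
 * when the function returns, so the allocation is freed here to avoid a leak. */
void set_by_value(char *buf)
{
    buf = malloc(6);
    if (buf != NULL)
        strcpy(buf, "hello");
    free(buf);
}

/* Receives the address of the caller's pointer: writing through pbuf
 * changes the caller's variable. Returns 0 on success, -1 on failure. */
int set_by_pointer(char **pbuf)
{
    char *buf = malloc(6);
    if (buf == NULL)
        return -1;
    strcpy(buf, "hello");
    *pbuf = buf;
    return 0;
}
```

A caller with char *a = NULL; still has a == NULL after set_by_value(a), whereas set_by_pointer(&a) leaves a pointing at the new string, which the caller then frees.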
Now, to change your function:
ssize_t
readline(FILE *file, char **pbufPtr, size_t *len)
{
// ...deleted...
buf[n] = '\0';
*pbufPtr = buf;
// ...deleted...
}
And you call it like so:
while(readline(passFile, &current, &len) != -1)
P.S. It is not a good idea to call realloc() the way you do here. It's potentially a very slow function, and for an input string of 65 characters you will call it 65 times. It would be better to use an internal buffer for the initial file input, then use malloc() to allocate a string that is just the right size and copy the string into the buffer. If the string is too long to fit in the internal buffer at once, use malloc() to get a big-enough place to copy out the part of the string you have in the internal buffer, then continue using the internal buffer to copy more of the string, and then call realloc() as needed. Basically I'm suggesting you have an internal buffer of size N, and copy the string in chunks of N characters at a time, thus minimizing the number of calls to realloc() while still allowing arbitrary-length input strings.
EDIT: Your last-line problem is that you return -1 when you hit end of file, even though there is a line to return.
Change your code so that you return -1 only if c == EOF and n == 0, so a final line that ends with EOF will be correctly returned.
You should also make readline() use the feof() function to check if file is at end-of-file, and if so, return -1 without calling malloc().
Basically, when you return -1, you don't want to call malloc(), and when you did call malloc() and copy data into it, you don't want to return -1! -1 should mean "you got nothing because we hit end of file". If you got something before we hit end of file, that's not -1, that is 0. Then the next call to readline() after that will return -1.
In your readline function you pass current by value. So if you change bufPtr inside your function, it doesn't change value of current outside. If you want to change value of current pass it by reference: &current and change readline() parameter to char **bufPTR.
You could pass current the way you did if you wanted to change something it points to, but you want to change where it points in first place.
Replace your readline function with this:
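char* readline(FILE *file, size_t *len)
{
    int c; // int, not char, so that EOF can be told apart from real characters
    char *buf = NULL, *tmp;
    size_t n = 0;
    buf = malloc(1);
    if(buf == NULL)
        return NULL;
    while((c = fgetc(file)) != '\n' && (c != EOF)) {
        buf[n] = (char)c;
        ++n;
        tmp = realloc(buf, n + 1);
        if(tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    buf[n] = '\0';
    *len = n;
    if(c == EOF && n == 0) { // end of file and nothing was read
        free(buf);
        return NULL;
    }
    return buf;
}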
and then in main replace the line while(readline(passFile, current, &len) != -1) with while((current = readline(passFile, &len)) != NULL)
Now it works:
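#define CHUNK 64 // growth step; the value is arbitrary
ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
    int c; // int so that EOF is distinguishable from a valid character
    char *buf, *tmp;
    size_t n = 0, portion = CHUNK;
    buf = malloc(CHUNK);
    if(buf == NULL)
        return -1;
    while((c = fgetc(file)) != '\n' && (c != EOF)) {
        buf[n] = (char)c;
        ++n;
        if(n == portion) { // buffer full: grow by one more chunk
            portion += CHUNK;
            tmp = realloc(buf, portion);
            if(tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
        }
    }
    if(c == EOF && n == 0) { // end of file and nothing was read
        free(buf);
        return -1;
    }
    buf[n] = '\0';
    *bufPtr = buf;
    *len = n;
    return 0;
}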

C Programming getting input

How do I constantly get user input (strings) until enter is pressed in C just like string class in C++?
I don't know the input size so I can't declare a variable of fixed size or even I can't allocate memory dynamically using malloc() or calloc().
Is there any way to implement this as a separate function?
As H2CO3 said, you should allocate a buffer with malloc(), then resize it with realloc() whenever it fills up. Like this:
size_t bufsize = 256;
size_t buf_used = 0;
int c;
char *buf = malloc(bufsize);
if (buf == NULL) { /* error handling here */ }
while ((c = fgetc(stdin)) != EOF) {
if (c == '\n') break;
if (buf_used == bufsize-1) {
bufsize *= 2;
buf = realloc(buf, bufsize);
if (buf == NULL) { /* error handling here */ }
}
buf[buf_used++] = c;
}
buf[buf_used] = '\0';
Use exponential storage expansion:
char *read_a_line(void)
{
size_t alloc_size = LINE_MAX;
size_t len = 0;
char *buf = malloc(LINE_MAX); // should be good for most, euh, *lines*...
if (!buf)
abort();
int c;
while ((c = fgetc(stdin)) != '\n' && c != EOF) {
if (len >= alloc_size) {
alloc_size <<= 1;
char *tmp = realloc(buf, alloc_size);
if (!tmp)
abort(); // or whatever
buf = tmp;
}
buf[len++] = c;
}
if (len >= alloc_size) {
alloc_size++;
char *tmp = realloc(buf, alloc_size);
if (!tmp)
abort(); // or whatever
buf = tmp;
}
buf[len] = 0;
return buf;
}
In C, you have little choice: if you want to input a string of unbounded length, you have to use allocations in a loop. Whether you use realloc() or a linked list of buffers, it comes down to reading (usually through fgets()), reading some more, and so on until the buffer you've just read contains a \n.
Then, depending on the method, you either already have a contiguous buffer (the realloc method) or just need to concatenate them all (the linked list method). Then you can return.
If you're lucky, your platform comes with the extension function getline() that does the realloc method for you. If not, you'll have to write it yourself.
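The linked-list variant described above is the less commonly shown of the two, so here is a minimal sketch of it. The names (struct chunk, read_line_list) and the 64-byte chunk size are assumptions made for illustration, and the stream is passed in as a parameter.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 64            /* arbitrary size of each node's buffer */

struct chunk {
    struct chunk *next;
    size_t len;                  /* characters used in data */
    char data[CHUNK_SIZE];
};

static void free_chunks(struct chunk *c)
{
    while (c != NULL) {
        struct chunk *next = c->next;
        free(c);
        c = next;
    }
}

/* Hypothetical helper: reads one line from fp (newline stripped) by
 * collecting fgets() chunks in a linked list, then joining them.
 * Returns a malloc'd string, or NULL on EOF-with-nothing-read or failure. */
char *read_line_list(FILE *fp)
{
    struct chunk *head = NULL, **tail = &head;
    size_t total = 0;
    int done = 0;

    while (!done) {
        struct chunk *node = malloc(sizeof *node);
        if (node == NULL) {
            free_chunks(head);
            return NULL;
        }
        node->next = NULL;
        if (fgets(node->data, sizeof node->data, fp) == NULL) {
            free(node);
            break;               /* EOF or read error */
        }
        node->len = strlen(node->data);
        if (node->len > 0 && node->data[node->len - 1] == '\n') {
            node->len--;         /* drop the newline; the line is complete */
            done = 1;
        }
        total += node->len;
        *tail = node;            /* append the node to the list */
        tail = &node->next;
    }
    if (head == NULL)
        return NULL;             /* nothing was read */

    char *line = malloc(total + 1);
    if (line != NULL) {
        size_t pos = 0;
        for (struct chunk *c = head; c != NULL; c = c->next) {
            memcpy(line + pos, c->data, c->len);
            pos += c->len;
        }
        line[pos] = '\0';
    }
    free_chunks(head);
    return line;
}
```

Compared with the realloc method this copies every byte exactly once more at the end, but it never moves data that has already been read while the line is still growing.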

C - How to Read String Lines from Stdin or File Memory Safe

I need a version of read line that is memory safe. I have this "working" solution, but I'm not sure how it behaves with memory. When I enable free(text) it works for a few lines and then I get an error. So now neither text nor result is ever freed, although I malloc text. Is that correct? And why is that so?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* readFromIn()
{
char* text = malloc(1024);
char* result = fgets(text, 1024, stdin);
if (result[strlen(result) - 1] == 10)
result[strlen(result) - 1] = 0;
//free(text);
return result;
}
I have A LOT of short lines to read with this and I also need stdin to be replaceable with a FILE* handle. There is no need for me to realloc text because I have only short lines.
fgets returns a pointer to the string, so after the fgets line, result will be the same memory address as text. If you then call free(text) before returning, you return a pointer to memory that has already been freed.
You should free the memory in the calling function when you have finished with result.
You could also avoid the malloc/free stuff by structuring your code to pass in a buffer, something like this:
char *readFromIn(char *buffer); // declared before its first use
void parent_function ()
{
    char buffer[1024]; // an array of char, not an array of pointers
    while (readFromIn(buffer)) {
        // Process the contents of buffer
    }
}
char *readFromIn(char *buffer)
{
    char *result = fgets(buffer, 1024, stdin);
    size_t len;
    // fgets returns NULL on error or end of input,
    // in which case buffer contents will be undefined
    if (result == NULL) {
        return NULL;
    }
    len = strlen (buffer);
    if (len == 0) {
        return NULL;
    }
    if (buffer[len - 1] == '\n') {
        buffer[len - 1] = 0;
    }
    return buffer;
}
Trying to avoid the malloc/free is probably wise if you are dealing with many small, short-lived items, so that the memory doesn't get fragmented, and it should be faster as well.
char *fgets(char *s, int size, FILE *stream) reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
Return Value: returns s on success, and NULL on error or when end of file occurs while no characters have been read.
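The "at most one less than size" rule and the stored newline can be observed with a deliberately tiny buffer. The sketch below is illustrative only: fgets_calls_for_line is a hypothetical helper, and the 4-byte buffer is chosen just to force several calls per line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp through a 4-byte buffer and
 * returns how many successful fgets() calls it took to reach the newline
 * (or end of input). Returns 0 if nothing could be read. */
int fgets_calls_for_line(FILE *fp)
{
    char buf[4];                 /* fgets stores at most 3 characters + '\0' */
    int calls = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        calls++;
        if (len > 0 && buf[len - 1] == '\n')
            break;               /* newline was read and stored: line complete */
    }
    return calls;
}
```

For the input line "hello\n" the first call stores "hel" and the second stores "lo\n", so the helper returns 2; checking for the stored newline is how a caller can tell a complete line from a truncated one.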
So there are 2 critical problems with your code:
1. You don't check the return value of fgets.
2. You want to deallocate the memory where the string is stored and then return a pointer to that memory. Accessing memory through such a (dangling) pointer leads to undefined behaviour.
Your function could look like this:
char* readFromIn(void) {
    char* text = malloc(1024);
    if (text == NULL)
        return NULL;
    if (fgets(text, 1024, stdin) != NULL) {
        size_t textLen = strlen(text);
        if (textLen > 0 && text[textLen - 1] == '\n')
            text[textLen - 1] = '\0'; // getting rid of newline character
        return text;
    }
    else {
        free(text);
        return NULL;
    }
}
The caller of this function is then responsible for deallocating the memory that the return value points to.
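To make the ownership rule concrete, here is a sketch of the caller side. It assumes a variant of the function that takes the stream as a parameter (the question asks for stdin to be replaceable with a FILE * handle); read_from, count_lines and LINE_CAP are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_CAP 1024            /* assumed upper bound on line length */

/* Hypothetical variant of readFromIn that takes the stream as a parameter.
 * Returns a malloc'd line without its newline, or NULL at end of input. */
char *read_from(FILE *fp)
{
    char *text = malloc(LINE_CAP);
    if (text == NULL)
        return NULL;
    if (fgets(text, LINE_CAP, fp) == NULL) {
        free(text);              /* nothing to hand out, so release it here */
        return NULL;
    }
    text[strcspn(text, "\n")] = '\0';   /* cut at the newline, if any */
    return text;
}

/* The caller owns each returned line and frees it after use. */
size_t count_lines(FILE *fp)
{
    size_t count = 0;
    char *line;

    while ((line = read_from(fp)) != NULL) {
        count++;                 /* process the line here */
        free(line);
    }
    return count;
}
```

Each successful call hands out exactly one allocation and the loop frees exactly one, so nothing leaks and nothing is freed twice.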
I know you mentioned that the lines are only short, but none of the solutions provided will work for lines greater than 1024 in length. It is for this reason that I provide a solution which will attempt to read entire lines, and resize the buffer when there's not enough space.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MINIMUM_CAPACITY 16
size_t read_line(char **buffer, size_t *capacity) {
char *buf = *buffer;
size_t cap = *capacity, pos = 0;
if (cap < MINIMUM_CAPACITY) { cap = MINIMUM_CAPACITY; }
for (;;) {
buf = realloc(buf, cap);
if (buf == NULL) { return pos; }
*buffer = buf;
*capacity = cap;
if (fgets(buf + pos, cap - pos, stdin) == NULL) {
break;
}
pos += strcspn(buf + pos, "\n");
if (buf[pos] == '\n') {
break;
}
cap *= 2;
}
return pos;
}
int main(void) {
char *line = NULL;
size_t size = 0;
for (size_t end = read_line(&line, &size); line[end] == '\n'; end = read_line(&line, &size)) {
line[end] = '\0'; // trim '\n' off the end
// process contents of buffer here
}
free(line);
return 0;
}
An ideal solution should be able to operate with a fixed buffer of 1 byte. This requires a more comprehensive understanding of the problem, however. Once that is achieved, adapting such a solution would give the optimal result.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readFromIn(FILE *fp)
{
char text[1024];
size_t len;
if (!fgets(text, sizeof text, fp)) return NULL;
len = strlen(text);
while (len && text[len-1] == '\n') text[--len] = 0;
return strdup(text);
}
Why did no one propose moving the buffer off the heap? This is my solution now, with the buffer in static storage:
char input[1024]; // held ready as buffer for fgets
char* readFromIn()
{
char* result = fgets(input, 1024, stdin);
if (result == NULL)
return "";
if (result[strlen(result) - 1] == '\n')
result[strlen(result) - 1] = 0;
return result;
}
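One consequence of a single shared buffer is worth spelling out: every call overwrites the previous line, so a caller that needs to keep a line has to copy it first. A minimal sketch, assuming a FILE * parameter (read_static is a hypothetical name, not the function above):

```c
#include <stdio.h>
#include <string.h>

static char input[1024];         /* one buffer shared by every call */

/* Hypothetical FILE*-taking variant: returns a pointer into the shared
 * buffer, or NULL at end of input. The result is only valid until the
 * next call. */
char *read_static(FILE *fp)
{
    if (fgets(input, sizeof input, fp) == NULL)
        return NULL;
    input[strcspn(input, "\n")] = '\0';  /* cut at the newline, if any */
    return input;                /* always the same address */
}
```

Two consecutive calls return the same pointer, so after the second call the first line is gone unless it was copied out. That is fine for read-process-discard loops, but not for collecting lines.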

Resources