How to use read() to get input of unknown length from stdin - c

Until now, whenever I wanted to get user input from stdin I used scanf() but this time I can't and have to use read().
Usually, to get input from stdin using read I use:
char buf[128];
read(0, buf, sizeof(buf));
But this time I don't have any length limit to the input and I want to allow input with arbitrary size. In the past I used scanf for this, like so:
char *user_input;
scanf("%ms", &user_input);
How can I do this with read()?
Note: safety isn't important here

The function read returns the number of read bytes. You can take advantage of this information and loop until you read 0 bytes, that is, read returns 0.
char buf[BUF_SIZE]; // Set BUF_SIZE to the maximum number of character you expect to read (e.g. 1000 or 10000 or more).
int bytes_to_read, total_read_bytes, read_bytes;
// Number of bytes to read at each iteration of the loop.
bytes_to_read = 128;
// The following variable counts the number of total read bytes.
total_read_bytes = 0;
while ((read_bytes = read(0, buf + total_read_bytes, bytes_to_read)) != 0) {
if (read_bytes < 0) {
// read() may return -1. You can look at the variable errno to
// have more details about the cause of the error.
return -1;
}
total_read_bytes += read_bytes;
}
Notice that read does not automatically append the null terminator \0 to buf, that is, buf is not a string until you explicitly add \0 at the end of it.
...
// Making buf a string.
buf[total_read_bytes] = '\0';
...
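If the input really has no upper bound, the fixed-size buf above will eventually overflow. A minimal sketch of the same read() loop with a buffer that grows via realloc() is shown below. The function name read_all, the initial capacity of 128 and the doubling strategy are assumptions for illustration, not part of the answer above, and interrupted reads (EINTR) are not retried.

```c
#include <stdlib.h>
#include <unistd.h>

/* Read everything from fd until end of input. Returns a malloc'd,
 * null-terminated buffer (caller frees) and stores the byte count in
 * *out_len, or returns NULL on a read or allocation error. */
char *read_all(int fd, size_t *out_len) {
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Keep one spare byte for the terminator; double when full. */
        if (len + 1 >= cap) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n < 0) {            /* errno describes the failure */
            free(buf);
            return NULL;
        }
        if (n == 0)             /* end of input */
            break;
        len += (size_t)n;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

Calling read_all(0, &len) reads standard input until end of file (Ctrl-D on a terminal); the caller frees the returned buffer.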

One way is to run a loop with getchar() and keep reading characters into an array. Check the array size on each iteration; once the array is full, reallocate it to a larger size. Alternatively, use the getline() function.
See the program below.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
char *line = NULL;
size_t n = 0;
ssize_t res = getline(&line, &n, stdin);
free(line);
}

Related

Are there ways to overcome the constraints of fgets()?

The fgets() function has two problems. The first is that, if the size of the line is longer than that of the passed buffer, the line is truncated. The second is that, if the line read from the file has embedded '\0' characters, then there is no way to know the actual length of the line. I would like to get a replacement for fgets() that dynamically allocates the space for the line read and also provides the size of the line read. I have written the code for dynamically allocating the space. I am unable to figure out how to get the size of the line read. I am a beginner. Thank you so much.
#include <stdio.h>
#include <stdlib.h>
#include <error.h>
#include <errno.h>
char *myfgets(FILE *fptr, int *size);
char *myfgets(FILE *fptr, int *size) {
char *buffer;
char *ret;
buffer = (char *)malloc((*size) * sizeof(char));
if (buffer == NULL)
error(1, 0, "No memory available\n");
ret = fgets(buffer, *size, fptr);
if (ret == NULL)
error(1, 0, "Error in reading the file\n");
return ret;
}
int main(int argc, char *argv[]) {
char *file;
FILE *fptr;
int size;
char *result;
if (argc != 3)
error(1, 0, "Too many or few arguments <File_name>, <Number of bytes to read>\n");
file = argv[1];
size = atoi(argv[2]);
fptr = fopen(file, "r");
if (fptr == NULL)
error(1, 0, "Error in opening the file\n");
result = myfgets(fptr, &size);
printf("The line read is :%s", result);
free(result);
return 0;
}
Use getline(3) to read a complete line of unknown length. It allocates memory as needed to hold it all.
The function can deal with 0 bytes in the line being read too. From the linked man page (emphasis added):
On success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the terminating null byte ('\0'). This value can be used to handle embedded null bytes in the line read.
So you just have to save its return value instead of using strlen().
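A small sketch of that idea follows, assuming a POSIX system where getline() is available. The wrapper name read_line_with_length is hypothetical; it only illustrates keeping getline()'s return value as the line length.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() on glibc and similar */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Read one line from fp with getline(). On success returns the malloc'd line
 * (caller frees) and stores its length, as counted by getline() itself, in
 * *len. Returns NULL at end of file or on error. */
char *read_line_with_length(FILE *fp, size_t *len) {
    char *line = NULL;
    size_t cap = 0;
    ssize_t n = getline(&line, &cap, fp);
    if (n < 0) {
        free(line);   /* getline may allocate even when it fails */
        return NULL;
    }
    *len = (size_t)n;
    return line;
}
```

For a line containing the bytes ab\0cd\n, strlen() on the result would report 2, while *len holds 6, so fwrite(line, 1, len, stdout) prints the whole line.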
You have correctly identified 2 issues in fgets(), but your proposed alternative does not address either of them as you still call fgets().
You should write a loop, calling getc() repeatedly until you get EOF or '\n' and you would store the bytes read into an allocated array, reallocating as needed.
Here is a simplistic version:
// Read a full line from `fptr`
// - return `NULL` at end of file or upon read error like `fgets()`.
// - otherwise return a pointer to an allocated array containing the
// characters read, up to and including the newline and a null terminator.
// - store the number of bytes read into *plength.
// - the buffer is null terminated, and it may contain embedded null bytes
// if such bytes were read from the file
char *myfgets(FILE *fptr, size_t *plength) {
size_t length = 0;
char *buffer = NULL, *newp;
int c;
for (;;) {
if ((c = getc(fptr)) == EOF) {
if (!feof(fptr)) {
/* read error: discard data read so far and return NULL */
free(buffer);
buffer = NULL;
length = 0;
}
break;
}
if ((newp = realloc(buffer, length + 2)) == NULL) {
free(buffer);
error(1, 0, "Out of memory for realloc\n");
return NULL;
}
buffer = newp;
buffer[length] = c;
length++;
if (c == '\n')
break;
}
if (length != 0) {
buffer[length] = '\0';
}
*plength = length;
return buffer;
}
Various approaches for a "fixed" fgets():
1) Use getline(), which is not part of the standard C library, as suggested by @Shawn. It is commonly available on *nix, and source code is easy enough to find. It unfortunately obliges a new type: ssize_t.
2) Roll your own getc() code, as @chqrlie did. Corner cases can be tricky.
3) Repeatedly call fgets() as needed. Pre-fill the buffer with '\n' and look for the first occurrence of '\n', its position, next character to help determine length. (There are only a few cases to consider)
4) Repeatedly call scanf("%99[^\n]%n", buf100, &n) and getc() for the '\n' as needed. Look at the return value and n to determine length.
5) Likely others
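Approach 3 can be sketched for a single fgets() call as follows. The helper name fgets_with_length is hypothetical, and the sketch assumes that fgets() leaves the bytes after its terminating '\0' untouched, which is how common implementations behave. A full line reader would call this repeatedly and grow its buffer between calls.

```c
#include <stdio.h>
#include <string.h>

/* Call fgets() once and report how many bytes it stored, even when the data
 * contains embedded '\0' bytes. Requires 2 <= n <= INT_MAX.
 * Returns buf, or NULL at end of file or on error (like fgets). */
char *fgets_with_length(char *buf, size_t n, FILE *fp, size_t *len) {
    memset(buf, '\n', n);               /* pre-fill with the sentinel */
    if (fgets(buf, (int)n, fp) == NULL)
        return NULL;

    char *nl = memchr(buf, '\n', n);
    if (nl == NULL) {
        /* Buffer completely filled and no newline read yet. */
        *len = n - 1;
    } else {
        size_t i = (size_t)(nl - buf);
        if (i + 1 < n && buf[i + 1] == '\0')
            *len = i + 1;   /* a real newline followed by fgets's '\0' */
        else
            *len = i - 1;   /* a pre-fill byte: the '\0' sits just before it */
    }
    return buf;
}
```

The three branches are the "few cases" mentioned above: the buffer was filled without a newline, a newline was read, or input ended before a newline and the first '\n' found is left over from the pre-fill.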
A good functional test of the design is how well did it report the cases:
Happy path: a line was read, memory allocated, no problems.
End-of-file: Nothing read due to end of file.
Out-of-memory.
Input error occurred.
Other considerations:
Do you really want to save a '\n'?
Performance.
As I see it, "dynamically allocates the space" with no limit gives a nefarious user the ability to overwhelm memory resources by entering a pathologically long line. Rather than give a user that ability, I recommend limiting input to a sane bound. Excessively long input is an attack that should be detected, not enabled.
So I would start with
char *myfgets(FILE *fptr, size_t limit, size_t *size) {
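One way a bounded reader along these lines might look is sketched below. The name read_line_limited and its return conventions are assumptions for illustration, not the answer's own implementation. A fuller design would return distinct status codes for end of file, input error, out of memory and over-long input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of at most `limit` bytes (newline included) from fp.
 * Returns a malloc'd, null-terminated buffer and stores the byte count in
 * *size. Returns NULL at end of file, on a read error, when out of memory,
 * or when the line exceeds `limit`; in every NULL case *size is 0. */
char *read_line_limited(FILE *fp, size_t limit, size_t *size) {
    size_t cap = 0, len = 0;
    char *buf = NULL;
    int c;

    *size = 0;
    while ((c = getc(fp)) != EOF) {
        if (len == limit) {             /* line too long: reject it */
            free(buf);
            return NULL;
        }
        if (len + 2 > cap) {            /* room for this byte and the '\0' */
            size_t new_cap = cap ? cap * 2 : 64;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    if (len == 0 || ferror(fp)) {       /* nothing read, or input error */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *size = len;
    return buf;
}
```

After a NULL return the caller can use feof() and ferror() to tell end of file from an input error. When a line is rejected as too long, the rest of that line is still unread in the stream.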

Program crashing at realloc

Problem
I am currently writing a small (and bad) grep-like program for Windows. In it I want to read files line by line and print out the ones which contain a key. For this to work I need a function which reads each line of a file. Since I am not on Linux I cannot use the getline function and have to implement it myself.
I have found an SO answer where such a function is implemented. I tried it out and it works fine for 'normal' text files. But the program crashes if I try to read a file with a line length of 13 000 characters.
MCVE
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char * getline(FILE *f)
{
size_t size = 0;
size_t len = 0;
size_t last = 0;
char *buf = NULL;
do {
size += BUFSIZ; /* BUFSIZ is defined as "the optimal read size for this platform" */
buf = realloc(buf, size); /* realloc(NULL,n) is the same as malloc(n) */
/* Actually do the read. Note that fgets puts a terminal '\0' on the
end of the string, so we make sure we overwrite this */
if (buf == NULL) return NULL;
fgets(buf + last, size, f);
len = strlen(buf);
last = len - 1;
} while (!feof(f) && buf[last] != '\n');
return buf;
}
int main(int argc, char *argv[])
{
FILE *file = fopen(argv[1], "r");
if (file == NULL)
return 1;
while (!feof(file))
{
char *line = getline(file);
if (line != NULL)
{
printf("%s", line);
free(line);
}
}
return 0;
}
This is the file I am using. It contains three short lines which get read just fine and a long one from one of my Qt projects. When reading this line the getline function reallocates 2 times to a size of 1024 and crashes at the 3rd time. I've put printf around the realloc to make sure it crashes there and it definitely does.
Question
Could anyone explain to me why my program is crashing like that? I just spent hours on this and don't know what to do anymore.
In this fragment
size += BUFSIZ;
buf = realloc(buf, size);
if (buf == NULL) return NULL;
fgets(buf + last, size, f);
you add BUFSIZ to size and allocate that, but then you tell fgets to read that same, increased, size. In effect, on every pass after the first you allow fgets to write more characters than there is room for. The first time around, size equals BUFSIZ and you read at most BUFSIZ characters into a buffer of BUFSIZ bytes, which is fine. If the line is longer than this (the last character is not \n), you increase the size of the memory (size += BUFSIZ) but you also pass its new total size to fgets, even though the first last bytes of the buffer are already in use.
The free space grows by BUFSIZ per loop, but the number of bytes fgets may write is the full size: BUFSIZ on the first pass, 2*BUFSIZ on the second, and so on, each time starting at buf + last, until something important gets overwritten and the program is terminated.
If you read only chunks of exactly BUFSIZ bytes then this should work.
Note that your code expects the last line to end with an \n, which may not always be true. You can catch this with an additional test:
if (!fgets(buf + last, size, f))
break;
so your code won't be trying to read past the end of the input file.
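Putting both fixes together, a corrected line reader might look like the sketch below. The name read_line is an assumption, chosen to avoid clashing with POSIX getline(). Because it measures each chunk with strlen(), it will miscount lines that contain embedded '\0' bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from f. Returns a malloc'd string (caller
 * frees), or NULL at end of file or on allocation failure. */
char *read_line(FILE *f) {
    size_t size = 0;   /* bytes allocated */
    size_t len = 0;    /* bytes used, not counting the '\0' */
    char *buf = NULL;

    for (;;) {
        char *tmp = realloc(buf, size + BUFSIZ);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        size += BUFSIZ;

        /* Only size - len bytes are free starting at buf + len. */
        if (fgets(buf + len, (int)(size - len), f) == NULL)
            break;                      /* end of file or read error */
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            break;                      /* complete line */
    }
    if (len == 0) {                     /* nothing was read */
        free(buf);
        return NULL;
    }
    return buf;
}
```

With this version the caller can loop with while ((line = read_line(file)) != NULL) instead of testing feof() before each read.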

Why are extra garbage characters printed?

I am trying to use read() to get some characters from a file, just to learn this API. I have created a file called "file" in the same directory, and its content is:
1:2:ab:cd:ef
Here is the code:
#include <stdio.h>
#include <error.h>
int read_indent(int sockfd){
int sport, cport;
char user[3], rtype[3], addinfo[3];
char buffer[4+4+3+3+3+1];
if(read(sockfd, buffer, sizeof(buffer)) <= 0) {
perror("read: %m");
return -1;
}
buffer[sizeof(buffer)-1] = '\0';
sscanf(buffer, "%d:%d:%s:%s:%s", &sport, &cport, rtype, user, addinfo);
printf("%d:%d:%s:%s:%s", sport, cport, rtype, user, addinfo);
return 0;
}
int main(){
FILE *file_pt = fopen("file", "r");
if(file_pt == NULL) { printf("fopen error\n"); return -1;}
char buf[128];
int a = read_indent(fileno(file_pt));
fclose(file_pt);
return 0;
}
My printf returns me
1:2:ab:cd:ef::xPvx
where x is some garbage character I cannot recognize. What is the reason for this? int is 4 bytes in my system.
One issue is that you didn't specify a width for the %s parameters. This means that it matches up until the first whitespace character. There are no whitespace characters in your string, so the first %s matches until the end, leaving only garbage data after your string to fill the other variables.
Try this:
sscanf(buffer, "%d:%d:%2s:%2s:%2s", &sport, &cport, rtype, user, addinfo);
The other issue is that you don't null-terminate your buffer properly. read() returns the number of bytes read, so add a null right after that many bytes.
char buffer[4+4+3+3+3+1];
The buffer is bigger than what you plan to read and that's ok, but:
buffer[sizeof(buffer)-1] = '\0';
This is wrong. Put the \0 at index size, where size is the value returned by read(), the actual number of bytes read. Ask read() for at most sizeof(buffer)-1 bytes so that this index always lies inside the buffer.
See here:
The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.
char buffer[4+4+3+3+3+1];
if(read(sockfd, buffer, sizeof(buffer)) <= 0) {
//....
}
buffer[sizeof(buffer)-1] = '\0';
The read function does not add \0 to the buffer after reading. But you read just 12 bytes, and your buffer size is 18. So you still have 5 bytes of garbage in your buffer. This gets added to the last string you read.
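Combining the points from these answers, a corrected version of the function could look like the sketch below. The name read_ident and the output parameters are assumptions made so that the result can be checked by a caller; the field widths follow the 3-byte arrays in the question.

```c
#include <stdio.h>
#include <unistd.h>

/* Read "sport:cport:rtype:user:addinfo" from fd and parse it.
 * Each string field holds at most 2 characters plus the '\0'.
 * Returns 0 on success, -1 on a read error or malformed input. */
int read_ident(int fd, int *sport, int *cport,
               char rtype[3], char user[3], char addinfo[3]) {
    char buffer[4 + 4 + 3 + 3 + 3 + 1];

    /* Leave one byte free so the terminator always fits. */
    ssize_t n = read(fd, buffer, sizeof(buffer) - 1);
    if (n <= 0)
        return -1;
    buffer[n] = '\0';   /* terminate right after the bytes actually read */

    /* %2s never stores more than 2 characters plus the terminator. */
    if (sscanf(buffer, "%d:%d:%2s:%2s:%2s",
               sport, cport, rtype, user, addinfo) != 5)
        return -1;
    return 0;
}
```

Checking that sscanf() returned 5 catches input that does not have all five fields, instead of printing whatever happened to be in the uninitialized arrays.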

Using read() system call

For an assignment in class we were tasked with using the read() function to read a file containing numbers. While I was able to read the numbers into a buffer I have been unable to move them from the buffer into a char *array so that they can be easily accessed and sorted. Any advice is appreciated.
int readNumbers(int hexI, int MAX_FILENAME_LEN, int **array, char* fname) {
int numberRead = 0, cap = 2;
*array = (int *)malloc(cap*sizeof(int));
int n;
int filedesc = open(fname, O_RDONLY, 0);
if(filedesc < 0){
printf("%s: %s\n", "COULD NOT OPEN", fname);
return -1;
}
char * buff = malloc(512);
buff[511] = '\0';
while(n = read(filedesc, buff+totaln, 512 - totaln) > 0) //Appears to loop only once
totaln += n;
int len = strlen(buff);
for (int a = 0; a < len; a++) { //Dynamically allocates array according to input size
if ((&buff[a] != " ") && (&buff[a] != '\n'))
numberRead++;
if (numberRead >= cap){
cap = cap*2;
*array = (int*)realloc(*array, cap*sizeof(int));
}
}
int k = 0;
while((int *)&buff[k]){ //attempts to assign contents of buff to array
array[k] = (int *)&buff[k];
k++;
}
}
Your use of read() is wrong. There are at least two serious errors:
You ignore the return value, except to test for end-of-file.
You seem to assume that read() will append a nul byte after the data it reads. Perhaps even that it will pad out the buffer with nul bytes.
If you want to read more data into the same buffer after read() returns, without overwriting what you already read, then you must pass a pointer to the first available position in the buffer. If you want to know how many bytes were read in total, then you need to add the return values. The usual paradigm is something like this:
/*
* Read as many bytes as possible, up to buf_size bytes, from file descriptor fd
* into buffer buf. Return the number of bytes read, or an error code on
* failure.
*/
int read_full(int fd, char buf[], int buf_size) {
int total_read = 0;
int n_read;
while ((n_read = read(fd, buf + total_read, buf_size - total_read)) > 0) {
total_read += n_read;
}
return ((n_read < 0) ? n_read : total_read);
}
Having done something along those lines and not received an error, you can be assured that read() has not modified any element of the buffer beyond buf[total_read - 1]. It certainly has not filled the rest of the buffer with zeroes.
Note that it is not always necessary or desirable to read until the buffer is full; the example function does that for demonstration purposes, since it appears to be what you wanted.
Having done that, be aware that you are trying to extract numbers as if they were recorded in binary form in the file. That may indeed be the case, but if you're reading a text file containing formatted numbers then you need to extract the numbers differently. If that's what you're after then add a string terminator after the last byte read and use sscanf() to extract the numbers.
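For the text-file case, the extraction step might be sketched as follows, assuming the buffer has already been filled and terminated with buf[total_read] = '\0'. The function name parse_numbers is hypothetical; it uses sscanf() with %n to step through the buffer and grows the array by doubling, as the question's code tried to do.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse whitespace-separated decimal integers from the null-terminated
 * string text. Returns a malloc'd array (caller frees) and stores the
 * element count in *count, or returns NULL on allocation failure. */
int *parse_numbers(const char *text, size_t *count) {
    size_t cap = 2, n = 0;
    int *array = malloc(cap * sizeof *array);
    if (array == NULL)
        return NULL;

    int value, consumed;
    /* %n records how many characters this call consumed, so the next call
     * can continue where this one stopped. */
    while (sscanf(text, "%d%n", &value, &consumed) == 1) {
        if (n == cap) {
            int *tmp = realloc(array, cap * 2 * sizeof *array);
            if (tmp == NULL) {
                free(array);
                return NULL;
            }
            array = tmp;
            cap *= 2;
        }
        array[n++] = value;
        text += consumed;
    }
    *count = n;
    return array;
}
```

Parsing stops at the end of the string or at the first token that is not a number, so *count reflects only the values that were successfully converted.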

How do I use scanf when I don't know how many values it will assign in C?

These are the instructions:
"Read characters from standard input until EOF (the end-of-file mark) is read. Do not prompt the user to enter text - just read data as soon as the program starts."
So the user will be entering characters, but I don't know how many. I will later need to use them to build a table that displays the ASCII code of each value entered.
How should I go about this?
This is my idea
int main(void){
int inputlist[], i = -1;
do {++i;scanf("%f",&inputlist[i]);}
while(inputlist[i] != EOF)
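You said characters, so this might be used:
char arr[10000];
int ch, i = 0, j; // ch is an int so that EOF can be told apart from valid characters
ch = getchar();
while (ch != EOF && i < 9999)
{
arr[i++] = ch;
ch = getchar();
}
//arr[i] = 0; // to make it a string, if necessary
And to print the ASCII codes:
for (j = 0; j < i; j++)
printf("%d\n", arr[j]);
If you specifically want an integer array, use
int arr[1000];
int i = 0;
while (i < 1000 && scanf("%d", &arr[i]) == 1)
i++;
PS: This works only if the input consists of integers separated by whitespace, for example one per line.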
scanf returns EOF on EOF
You have a reasonable attempt at a start to the solution, with a few errors. You can't define an array without specifying a size, so int inputlist[] shouldn't even compile. Your scanf() specifier is %f for float, which is wrong twice (once because you declared inputlist with an integer type, and twice because you said your input is characters, so you should be telling scanf() to use %c or %s), and really if you're reading input unconditionally until EOF, you should use an unconditional input function, such as fgets() or fread(). (or read(), if you prefer).
You'll need two things: A place to store the current chunk of input, and a place to store the input that you've already read in. Since the input functions I mentioned above expect you to specify the input buffer, you can allocate that with a simple declaration.
char input[1024];
However, for the place to store all input, you'll want something dynamically allocated. The simplest solution is to simply malloc() a chunk of storage, keep track of how large it is, and realloc() it if and when necessary.
char *all_input;
int pool_size=16384;
all_input = malloc(pool_size);
Then, just loop on your input function until the return value indicates that you've hit EOF, and on each iteration of the loop, append the input data to the end of your storage area, increment a counter by the size of the input data, and check whether you're getting too close to the size of your input storage area. (And if you are, then use realloc() to grow your storage.)
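The loop described above can be sketched with fread() as the input function. The function name slurp and the doubling growth policy are assumptions for illustration; the chunk size of 1024 and the initial pool size of 16384 follow the declarations shown earlier in this answer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read all of fp into one growing buffer, 1024 bytes at a time.
 * Returns a malloc'd, null-terminated buffer (caller frees) and stores the
 * byte count in *total, or returns NULL on a read or allocation error. */
char *slurp(FILE *fp, size_t *total) {
    char input[1024];               /* current chunk */
    size_t pool_size = 16384;       /* storage for everything read so far */
    size_t used = 0, n;
    char *all_input = malloc(pool_size);
    if (all_input == NULL)
        return NULL;

    while ((n = fread(input, 1, sizeof input, fp)) > 0) {
        /* Grow until the chunk plus a terminator fits. */
        while (used + n + 1 > pool_size) {
            char *tmp = realloc(all_input, pool_size * 2);
            if (tmp == NULL) {
                free(all_input);
                return NULL;
            }
            all_input = tmp;
            pool_size *= 2;
        }
        memcpy(all_input + used, input, n);
        used += n;
    }
    if (ferror(fp)) {
        free(all_input);
        return NULL;
    }
    all_input[used] = '\0';
    *total = used;
    return all_input;
}
```

Calling slurp(stdin, &total) reads standard input to EOF; the ASCII table can then be printed by looping over the first total bytes.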
You could read the input with getchar() until you reach EOF. Since you don't know the size of the input, you should use a dynamically sized buffer on the heap.
char *buf = NULL;
long size = 1024;
long count = 0;
int r; // int, not char, so that EOF can be told apart from valid characters
buf = (char *)malloc(size);
if (buf == NULL) {
fprintf(stderr, "malloc failed\n");
exit(1);
}
while( (r = getchar()) != EOF) {
buf[count++] = r;
// leave one space for '\0' to terminate the string
if (count == size - 1) {
buf = realloc(buf,size*2);
if (buf == NULL) {
fprintf(stderr, "realloc failed\n");
exit(1);
}
size = size * 2;
}
}
buf[count] = '\0';
printf("%s \n", buf);
return 0;
Here is a full solution for your needs, with comments.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
// Number of elements
#define CHARNUM 3
int main(int argc, char **argv) {
// Allocate memory for storing input data
// We calculate requested amount of bytes by the formula:
// NumElement * SizeOfOneElement
size_t size = CHARNUM * sizeof(int);
// Call function to allocate memory
int *buffer = (int *) calloc(1, size);
// Check that calloc() returned valid pointer
// It can: 1. Return a pointer on success or NULL on failure
// 2. Return a pointer or NULL if size is 0
// (implementation dependent).
// We can't use this pointer later.
if (!buffer || !size)
{
exit(EXIT_FAILURE);
}
int curr_char;
int count = 0;
while ((curr_char = getchar()) != EOF)
{
if (count >= size/sizeof(int))
{
// If we put more characters than now our buffer
// can hold, we allocate more memory
fprintf(stderr, "Reallocate memory buffer\n");
size_t tmp_size = size + (CHARNUM * sizeof(int));
int *tmp_buffer = (int *) realloc(buffer, tmp_size);
if (!tmp_buffer)
{
fprintf(stderr, "Can't allocate enough memory\n");
exit(EXIT_FAILURE);
}
size = tmp_size;
buffer = tmp_buffer;
}
buffer[count] = curr_char;
++count;
}
// Here you get buffer with the characters from
// the standard input
fprintf(stderr, "\nNow buffer contains characters:\n");
for (int k = 0; k < count; ++k)
{
fprintf(stderr, "%c", buffer[k]);
}
fprintf(stderr, "\n");
// Todo something with the data
// Free all resources before exit
free(buffer);
exit(EXIT_SUCCESS); }
Compile with -std=c99 option if you use gcc.
You can also use the getline() function, which reads from standard input line by line. It allocates enough memory to store each line. Just call it until end of file.
errno = 0;
ssize_t nread = 0;
char *buffer = NULL;
size_t len = 0;
while ((nread = getline(&buffer, &len, stdin)) != -1)
{ /* Process line */ }
if (errno) { /* Handle error */ }
free(buffer);
// Process later
Note that even with getline() you still need dynamically allocated memory of your own: not for storing the characters, but for storing the pointers to the lines.
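A minimal sketch of that arrangement is shown below, assuming a POSIX system with getline(). The function name read_lines and its interface are hypothetical. Each line gets its own buffer from getline(), and only the array of pointers is grown with realloc().

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Read every line of fp. On success returns 0, stores a malloc'd array of
 * malloc'd strings in *out and the number of lines in *count (for empty
 * input *out is NULL and *count is 0). Returns -1 on allocation failure.
 * The caller frees each string and then the array. */
int read_lines(FILE *fp, char ***out, size_t *count) {
    char **lines = NULL;
    size_t n = 0, cap = 0;
    char *line = NULL;
    size_t len = 0;

    while (getline(&line, &len, fp) != -1) {
        if (n == cap) {
            size_t new_cap = cap ? cap * 2 : 8;
            char **tmp = realloc(lines, new_cap * sizeof *tmp);
            if (tmp == NULL) {
                free(line);
                while (n > 0)
                    free(lines[--n]);
                free(lines);
                return -1;
            }
            lines = tmp;
            cap = new_cap;
        }
        lines[n++] = line;   /* keep this buffer */
        line = NULL;         /* make getline allocate a fresh one next time */
        len = 0;
    }
    free(line);              /* buffer from the final, failed call */
    *out = lines;
    *count = n;
    return 0;
}
```

Resetting line to NULL and len to 0 after storing each pointer is what stops getline() from reusing, and overwriting, the buffer that was just saved.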

Resources