Implementing getchar issue in C

I'm implementing getchar() and have two issues related to BUFF_SIZE. When I use 1024, my getchar() reads all the available characters, returns one character, and discards the rest, so it doesn't work inside a loop.
#ifndef BUFF_SIZE
#define BUFF_SIZE 1024
#endif
int my_getchar(void)
{
static char buff[BUFF_SIZE];
static char *chr;
int ret;
if ((ret = read(STDIN_FILENO, buff, BUFF_SIZE)) > 0)
{
chr = buff;
return (*chr);
}
return (EOF);
}
int get_line(char **line)
{
char *text = malloc(sizeof(char) * 4092);
int position = 0;
int c;
if (!text)
return (-1);
while (1)
{
c = my_getchar();
if (c == EOF || c == '\n')
{
text[position] = '\0';
*line = text;
return (1);
}
else
text[position] = c;
position++;
}
return (0);
}
When I set BUFF_SIZE to 1, it works fine inside a loop, and doesn't work well outside a loop. How can I solve this?
int main()
{
//comment out to test.
//char *sp;
//sp = (char *)malloc(sizeof(char) * 4092);
//get_line(&sp); // this function calls my_getchar inside a while loop.
printf("%c\n", my_getchar()); //calling my_getchar() outside a loop
// printf("%s\n", sp);
return 0;
}

Your my_getchar() skips BUFF_SIZE - 1 characters on each invocation. Every time you call it, it reads BUFF_SIZE characters from stdin (or fewer if we are at the end of the file) and returns the first character. All the others are dropped, because the next call reads BUFF_SIZE characters again instead of returning the next character from the previously read buffer.

When you set BUFF_SIZE to 1, you are calling read like this:
read(STDIN_FILENO, buff, 1)
But, since standard input is connected to a terminal, read() is allowed to wait for a whole line to be typed in before returning the first character. So, the function is behaving correctly (as written).
The good news is that if you use printf("%c\n", getchar());, you will see that getchar() behaves in the same way.
The main problem with your version is that it is built on top of read() -- the real getchar() is in <stdio.h> and meant to be mixable with the calls in there that use FILE* and its buffering. So, unless you are reimplementing all of stdio, it probably makes sense to build it on its primitives (e.g. fread).
If you are reimplementing all of <stdio.h>, then you should be implementing some equivalent to FILE, which is where the buffers would live (not in static variables inside of functions).
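As a rough illustration of keeping the buffer in a FILE-like object rather than in static variables, here is a minimal sketch. The names struct my_file, my_file_init, my_fgetc and MY_BUFSIZ are hypothetical and not part of any library; the sketch only handles reading, and it treats read errors the same as end of file.

```c
#include <stdio.h>    /* EOF, size_t */
#include <unistd.h>   /* read, ssize_t */

#define MY_BUFSIZ 1024  /* hypothetical buffer size */

/* Hypothetical minimal stand-in for FILE: all buffering state lives here. */
struct my_file {
    int fd;                        /* descriptor to read from */
    size_t pos;                    /* index of the next unread byte in buf */
    size_t len;                    /* number of valid bytes in buf */
    unsigned char buf[MY_BUFSIZ];
};

void my_file_init(struct my_file *f, int fd)
{
    f->fd = fd;
    f->pos = 0;
    f->len = 0;
}

/* Returns the next byte as a non-negative int, or EOF at end of file
   or on a read error. Refills the buffer only when it is empty. */
int my_fgetc(struct my_file *f)
{
    if (f->pos >= f->len) {
        ssize_t n = read(f->fd, f->buf, sizeof f->buf);
        if (n <= 0)
            return EOF;
        f->pos = 0;
        f->len = (size_t)n;
    }
    return f->buf[f->pos++];
}
```

Because the state is passed in explicitly, two descriptors can be read independently, which static variables inside the function would not allow.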

OP's code reads BUFF_SIZE chars, and returns only 1.
The code needs to read selectively: only call read() when the previous characters have been consumed.
Note: Better to pass in a structure pointer to the state (buff, index, count) than to use static variables, but staying with OP's approach:
int my_getchar(void) {
static unsigned char buff[BUFF_SIZE];
static int index = 0;
static int count = 0;
if (index >= count) {
index = 0;
count = read(STDIN_FILENO, buff, BUFF_SIZE);
if (count == 0) return EOF; // end-of-file
if (count < 0) return EOF; // I/O error
}
return buff[index++];
}
Note: it is important to read into an unsigned char buffer to distinguish EOF from input characters.
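To make the point about unsigned char concrete, here is a small sketch. The helper name byte_to_int is hypothetical. It converts a stored byte the way getchar() reports characters, as a non-negative int, so that a byte with value 0xFF cannot be mistaken for EOF (which is negative).

```c
#include <stdio.h>   /* EOF, size_t */

/* Hypothetical helper: return byte i of buf as a value in 0..UCHAR_MAX.
   Going through unsigned char keeps 0xFF from turning into a negative
   value on platforms where plain char is signed. */
int byte_to_int(const char *buf, size_t i)
{
    return (unsigned char)buf[i];
}
```

Returning buf[i] directly from a plain char buffer could yield -1 for the byte 0xFF when char is signed, and -1 is the usual value of EOF.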
Depending on the attributes of STDIN_FILENO, a return value of 0 from read() may simply mean that input is not presently available. In that case, use
// count = read(STDIN_FILENO, buff, BUFF_SIZE);
// if (count == 0) return EOF; // end-of-file
do {
count = read(STDIN_FILENO, buff, BUFF_SIZE);
} while (count == 0);

There is an issue in your my_getchar function. The issue is here:
/* When you do return *chr, only the first char is returned */
if ((ret = read(STDIN_FILENO, buff, BUFF_SIZE)) > 0)
{
chr = buff;
return (*chr);
}
Because you do return (*chr); there, you return only the first char in buff. That way a \n is not returned, and your get_line is affected because it depends on a \n being returned. This can be fixed by adding an additional static variable to track the position of the latest char returned, as below.
int my_getchar(void)
{
static char buff[BUFF_SIZE];
static char *chr;
static int pos = 0; /* New static variable to track position */
static int ret = 0; /* Changed this to static */
if (pos >= ret) { /* if all data in buffer has been returned */
if ((ret = read(STDIN_FILENO, buff, BUFF_SIZE)) > 0)
{
chr = buff;
pos = 0;
return *(chr + pos++); /* return one char and update pos */
} else { /* if no more to read from stdin */
return EOF;
}
} else { /* if data still in buffer */
return *(chr + pos++); /* return one char and update pos */
}
}
Now, my_getchar will return a \n when pos reaches the position of \n in the buffer, so get_line will get all characters before the newline character.

Your code reads as many characters as possible:
read(STDIN_FILENO, buff, BUFF_SIZE)
up to BUFF_SIZE and then only returns the first character.
For that to work as intended, your buffer size needs to be 1.

Related

Ways for reading inputs in C without reading newline or using scanf

I have a college exercise to do, that consists in writing a bank in C and I do know that scanf is a really buggy function and fgets reads the '\n', that I don't really want to be read. So I have written new functions trying to solve this problem:
char *readstr(char *buff, size_t bytes) {
char *ptr;
for (ptr = buff;
(*ptr = getchar()) != '\n' && *ptr != EOF && ptr - buff < bytes - 1;
ptr++)
;
int c = *ptr;
*ptr = '\0';
if (c != EOF && c != '\n') {
while ((c = getchar()) != '\n' && c != EOF)
;
}
return buff;
}
int readint() {
char buffer[256] = "";
readstr(buffer, 256);
return atoi(strpbrk(buffer, "0123456789"));
}
double readfloat() {
char buffer[256] = "";
readstr(buffer, 256);
return atof(strpbrk(buffer, "0123456789"));
}
char readchar() {
char buffer[2] = "";
readstr(buffer, 2);
return *buffer;
}
So far, I have written these. Any advice or suggestions? Is there a more elegant or simpler solution? They appear to work, but I don't know if this is the best approach.
scanf is not buggy, it's simply hard to use correctly.
Anyway, you can use fgets and remove manually the newline character. A way to do this is by using the strcspn function like follows:
fgets(str, size, stdin);
str[strcspn(str, "\n")] = 0;
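A minimal sketch of that idea wrapped in a function, with the return value of fgets checked first. The helper name read_trimmed is hypothetical, and it takes a FILE * so it is not tied to stdin.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and strips the newline.
   Returns buf, or NULL at end of file or on error. */
char *read_trimmed(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this assignment is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Unlike indexing with strlen(buf) - 1, the strcspn form is also safe when the line was truncated or the last line has no newline.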
Your goal seems to read a line of input into a buffer with a given size, not storing the trailing newline and discarding any excess characters present on the line. You do need to read the newline so it does not linger in the input stream, but you do not want to store it into the destination array as fgets() does.
Note that this can be achieved with scanf() this way:
char buf[100] = "";
if (scanf("%99[^\n]", buf) == EOF) {
// handle end of file
} else {
// line was read into buf or empty line was detected and buf was not modified
scanf("%*[^\n]"); // consume extra bytes on the line if any
scanf("%*1[\n]"); // consume the newline if present
}
This is very cumbersome, and it gets even worse if the buffer size is a variable value. As many savvy C programmers noted, scanf is not buggy, it is just difficult to use correctly. I would even say its semantics are confusing and error prone, yet it can be used safely, unlike gets(), which was removed from recent versions of the C Standard because any arbitrarily long input can cause undefined behavior with it.
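To show what the variable-size case might look like, here is a sketch that builds the conversion specification at run time. The function name scan_line is hypothetical, it reads from a FILE * rather than stdin, and it assumes size is at least 2 (a field width of 0 is not valid).

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fp into buf (size must be >= 2),
   discarding excess characters and the newline. Returns EOF at end of file,
   1 if text was stored, and 0 for an empty line (buf is then ""). */
int scan_line(FILE *fp, char *buf, size_t size)
{
    char fmt[32];
    int r;

    /* Build "%<size-1>[^\n]" at run time, e.g. "%99[^\n]" for size 100. */
    snprintf(fmt, sizeof fmt, "%%%zu[^\n]", size - 1);
    buf[0] = '\0';               /* fscanf leaves buf untouched on an empty line */
    r = fscanf(fp, fmt, buf);
    if (r == EOF)
        return EOF;
    fscanf(fp, "%*[^\n]");       /* consume extra bytes on the line, if any */
    fscanf(fp, "%*1[\n]");       /* consume the newline, if present */
    return r;
}
```

The extra snprintf step is the price of a variable width: scanf has no equivalent of printf's `*` width, because `*` in a scanf conversion means "suppress assignment".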
Reading a line and discarding the trailing newline can be done with fgets(), but combining reading the line, discarding extra bytes, and consuming the newline may be useful as a separate function.
There are problems in your code:
you store the return value of getchar() directly into the char array, removing the distinction between EOF and the char value '\377' on architectures with signed chars, while *ptr != EOF would never match on architectures where char is unsigned by default. You must store it into an int variable.
there is no way to tell if the end of file was reached: the function returns an empty string at end of file, just like it does for blank lines in the stream.
the buffer size cannot be zero.
truncation cannot be detected: instead of returning the destination array, you could return the number of characters needed for the full line and -1 at end of file.
calling atoi(strpbrk(buffer, "0123456789")) poses multiple problems: strpbrk can return a null pointer, causing undefined behavior, it will skip a leading sign, and atoi is not fully defined for contents representing values out of range for type int.
Here is a modified version:
#include <errno.h>
#include <limits.h> // INT_MIN, INT_MAX, LONG_MIN, LONG_MAX
#include <math.h> // NAN
#include <stdio.h>
#include <stdlib.h>
int readstr(char *buff, size_t bytes) {
size_t pos = 0;
int c;
while ((c = getchar()) != EOF && c != '\n') {
if (pos + 1 < bytes)
buff[pos] = (char)c;
pos++;
}
if (bytes > 0) {
if (pos < bytes)
buff[pos] = '\0';
else
buff[bytes - 1] = '\0';
}
if (c == EOF && pos == 0)
return -1;
return (int)pos;
}
int readint(void) {
char buffer[256];
long n;
if (readstr(buffer, sizeof buffer) < 0)
return -1;
n = strtol(buffer, NULL, 0);
#if LONG_MIN < INT_MIN
if (n < INT_MIN) {
errno = ERANGE;
return INT_MIN;
}
#endif
#if LONG_MAX > INT_MAX
if (n > INT_MAX) {
errno = ERANGE;
return INT_MAX;
}
#endif
return (int)n;
}
double readdouble(void) {
char buffer[256];
if (readstr(buffer, sizeof buffer) < 0)
return NAN;
else
return strtod(buffer, NULL);
}
int readchar(void) {
char buffer[2] = "";
if (readstr(buffer, sizeof buffer) < 0)
return EOF;
else
return (unsigned char)*buffer;
}
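If invalid input should be rejected rather than silently read as 0, the end pointer of strtol can be checked as well. The following is only a sketch of that idea: the helper name parse_int and its return convention are hypothetical, and it assumes base 10 with optional surrounding spaces or tabs.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 int from s.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(const char *s, int *out)
{
    char *end;
    long n;

    errno = 0;
    n = strtol(s, &end, 10);
    if (end == s)                 /* no digits were converted */
        return 0;
    if (errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return 0;                 /* value does not fit in an int */
    while (*end == ' ' || *end == '\t')
        end++;                    /* allow trailing blanks */
    if (*end != '\0')             /* trailing junk such as "12abc" */
        return 0;
    *out = (int)n;
    return 1;
}
```

A caller could combine this with a line-reading function and ask the user again whenever it returns 0.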

Getting "Abort trap 6" using memset()

I am relatively new to C, so please bear with me if this is an obvious question. I've looked all over SO for an answer, and have not been able to figure this out.
I am writing a simple calculator -- it will take a calculation from the user ("1 + 3", for example, and return the result. To keep things simple, I am setting a length for the input buffer and forcing the user to stay within those bounds. If they input too many characters, I want to alert them they have gone over the limit, and reset the buffer so that they can re-input.
This functionality works fine when they stay under the limit. It also correctly gives them a message when they go over the limit. However, when they try to input a valid calculation after having put in an invalid one, I get abort trap: 6. I know this has something to do with how I am resetting the array and managing the memory of that buffer, but my C skills are not quite sharp enough to diagnose the problem on my own.
If anybody could please take a look, I'd really appreciate it! I've pasted my code below.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <stdlib.h>
#define BUFFER_SIZE 50
static void ready_for_input()
{
printf("> ");
}
static char *as_string(char buffer[], int size)
{
char *result = (char *)malloc((size + 1) * sizeof(char));
if (!result)
{
fprintf(stderr, "calculator: allocation error");
exit(EXIT_FAILURE);
}
for (int i = 0; i < size; i++)
{
result[i] = buffer[i];
}
// to make it a valid string
result[size] = '\0';
return result;
}
static char *read_line()
{
// put the input into a buffer
char buffer[BUFFER_SIZE], c;
int len = 0;
while (true)
{
c = getchar();
if (c == EOF || c == '\n')
{
// reset if input has exceeded buffer length
if (len > BUFFER_SIZE)
{
printf("Calculations must be under 100 characters long.\n");
memset(buffer, 0, sizeof(buffer));
len = 0;
ready_for_input();
}
else
{
return as_string(buffer, len);
}
}
else
{
buffer[len++] = c;
}
}
}
static void start_calculator()
{
ready_for_input();
char *line = read_line();
printf("input received : %s", line);
}
int main(int argc, char *argv[])
{
start_calculator();
}
You don't prevent the buffer overflow, because you are checking for it too late. You should check whether the user is about to exceed the buffer's size, before the user hits enter.
The code below improves a bit the way a buffer overflow is checked:
static char *read_line()
{
// put the input into a buffer
char buffer[BUFFER_SIZE];
int c; // getchar should be assigned to an int
int len = 0;
while (true)
{
c = getchar();
if (len >= BUFFER_SIZE)
{
// drop everything until EOF or newline
while (c != EOF && c != '\n')
c = getchar();
printf("Calculations must be under %d characters long.\n", BUFFER_SIZE);
memset(buffer, 0, sizeof(buffer));
len = 0;
ready_for_input();
}
else if (c == EOF || c == '\n')
{
return as_string(buffer, len);
}
else
{
buffer[len++] = c;
}
}
}
Another thing to notice is that the result of getchar() should be assigned to an int variable instead of a char, since you are checking for EOF.
Finally, you may want to look at better ways to read a line in C, such as using fgets, dynamically allocating memory for your buffer and using realloc (or a combination of malloc and memmove) to double the size when a limit is reached, or using getline.
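As a rough sketch of the realloc-doubling approach mentioned above: the function name read_line_dyn and the initial capacity of 16 are hypothetical choices, and the function reads from a FILE * so it works for stdin or any other stream.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd line without its newline, or NULL
   at end of file or on allocation failure. The caller must free the result. */
char *read_line_dyn(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;                                /* int, so EOF can be detected */

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {             /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {           /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

With this shape there is no fixed limit to report to the user; whether an upper bound on line length is still wanted is a separate design decision.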

Read line from stdin to pointer array in C

I'm trying to read lines from stdin and store them in a pointer array. When I print the array I get the last entered value; the other values are replaced by the last entered value. The if statement works the first time only; after that the comparison never comes out true. How can I read lines into a pointer array?
void read_line(int fd, char *s)
{
char line[100];
char *list[100];
int i = 0;
while ((read(1, line, 100)))
{
if (!strncmp(line, s, strlen(line) - 1))
break;
else
list[i++] = line;
}
i = 0;
while (list[i])
{
write(fd, list[i], strlen(list[i]));
i++;
}
}
I call the function
read_line(1, "exit");
//if I type more words before typing exit, then type exit program doesn't terminate
First you should ensure that read did not fail by checking its return value:
while ((read(1, line, 100)) > 0)
Then you must copy line content in list[i++] using strdup, otherwise all list items will end up with the last value pointed by line.
Finally, what do you expect to happen when read reaches the limit of your buffer? You might want to handle that as well.
Code could look like this:
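#define SIZE 100
void read_line(int fd, char *s)
{
char line[SIZE];
char *list[SIZE] = { 0 };
int i = 0;
ssize_t retval;
// read from standard input (fd 0; fd 1 is standard output), keeping one
// byte for the terminator and one list slot for the NULL sentinel
while (i < SIZE - 1 && (retval = read(STDIN_FILENO, line, SIZE - 1)) > 0)
{
line[retval] = '\0'; // read() does not null-terminate
if (retval == SIZE - 1)
// adapt to the desired behavior here... for now we'll abort
break;
else if (!strncmp(line, s, strlen(line) - 1))
break;
else
list[i++] = strdup(line);
}
i = 0;
while (list[i])
{
write(fd, list[i], strlen(list[i]));
free(list[i]); // release the copy made by strdup
i++;
}
}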
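To illustrate why each entry needs its own copy, here is a small sketch. copy_string is a hypothetical stand-in that does what strdup does using only standard C; storing the address of the shared line buffer instead would make every entry point at the same storage, which is overwritten by the next read.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc'd copy of s,
   or NULL if allocation fails. The caller must free the result. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;   /* include the terminating '\0' */
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
```

For example, after copying "one" out of a buffer and then overwriting the buffer with "two", the first copy still holds "one", whereas a stored pointer to the buffer itself would now show "two".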

Read one line of a text file in C on Unix — my read_line is broken?

I want to write a function that reads a chosen line from a given text file. The function takes as parameters the int fd returned by open and an int line_number.
It must be written in C using Unix system calls (read and/or open).
It should also read any spaces, and it must not impose a real limit on line length (i.e. the line may be arbitrarily long).
The function I did is this:
char* read_line(int file, int numero_riga){
char myb[1];
if (numero_riga < 1) {
return NULL;
}
char* myb2 = malloc(sizeof(char)*100);
memset(myb2, 0, sizeof(char));
ssize_t n;
int i = 1;
while (i < numero_riga) {
if((n = read(file, myb, 1)) == -1){
perror("read fail");
exit(EXIT_FAILURE);
}
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return NULL;
}
}
numero_riga++;
int j = 0;
while (i < numero_riga) {
ssize_t n = read(file, myb, 1);
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return myb2;
}else{
myb2[j] = myb[0];
j++;
}
}
return myb2;
}
Until recently, I thought that this would work but it really has some problems.
When I use message queues, the string read by read_line is received as an empty string ( "\0" ). I know the message queues are not the problem, because passing a normal string through them did not cause the problem.
If possible I would like a fix with explanation of why I should correct it in a certain way. This is because if I do not understand my mistakes I risk repeating them in the future.
EDIT 1. Based upon the answers I decided to add some questions.
How do I end myb2? Can someone give me an example based on my code?
How do I know in advance the amount of characters that make up a line of txt to read?
EDIT 2. I don't know the number of char the line have so I don't know how many char to allocate; that's why I use *100.
Partial Analysis
You've got a memory leak at:
char* myb2 = (char*) malloc((sizeof(char*))*100);
memset(myb2, 0, sizeof(char));
if (numero_riga < 1) {
return NULL;
}
Check numero_riga before you allocate the memory.
The following loop is also dubious at best:
int i = 1;
while (i < numero_riga) {
ssize_t n = read(file, myb, 1);
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return NULL;
}
}
You don't check whether read() actually returned anything soon enough, and when you do check, you leak memory (again) and ignore anything that was read beforehand, and you don't detect errors (n < 0). When you do detect a newline, you simply add 1 to i. At no point do you save the character read in a buffer (such as myb2). All in all, that seems pretty thoroughly broken… unless you're trying to read the Nth line in the file from scratch, rather than the next line in the file, which is more usual.
What you need to be doing is:
scan N-1 lines, paying attention to EOF
while another byte is available
if it is newline, terminate the string and return it
otherwise, add it to the buffer, allocating space if there isn't room.
Implementation
I think I'd probably use a function get_ch() like this:
static inline int get_ch(int fd)
{
char c;
if (read(fd, &c, 1) == 1)
return (unsigned char)c;
return EOF;
}
Then in the main char *read_nth_line(int fd, int line_no) function you can do:
char *read_nth_line(int fd, int line_no)
{
if (line_no <= 0)
return NULL;
/* Skip preceding lines */
for (int i = 1; i < line_no; i++)
{
int c;
while ((c = get_ch(fd)) != '\n')
{
if (c == EOF)
return NULL;
}
}
/* Capture next line */
size_t max_len = 8;
size_t act_len = 0;
char *buffer = malloc(max_len);
if (buffer == NULL)
return NULL;
int c;
while ((c = get_ch(fd)) != EOF && c != '\n')
{
if (act_len + 2 >= max_len)
{
size_t new_len = max_len * 2;
char *new_buf = realloc(buffer, new_len);
if (new_buf == 0)
{
free(buffer);
return NULL;
}
buffer = new_buf;
max_len = new_len;
}
buffer[act_len++] = c;
}
if (c == '\n')
buffer[act_len++] = c;
buffer[act_len] = '\0';
return buffer;
}
Test code added:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
extern char *read_nth_line(int fd, int line_no);
…code from main answer…
int main(void)
{
char *line;
while ((line = read_nth_line(0, 3)) != NULL)
{
printf("[[%s]]\n", line);
free(line);
}
return 0;
}
This reads every third line from standard input. It seems to work correctly. It would be a good idea to do more exhaustive checking of boundary conditions (short lines, etc) to make sure it doesn't abuse memory. (Testing lines of lengths 1 — newline only — up to 18 characters with valgrind shows it is OK. Random longer tests also seemed to be correct.)

C - How to Read String Lines from Stdin or File Memory Save

I need a version of read line that is memory save. I have this "working" solution. But I'm not sure how it behaves with memory. When I enable free(text) it works for a few lines and then I get an error. So now neither text nor result is ever freed although I malloc text. Is that correct ? And why is that so ?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* readFromIn()
{
char* text = malloc(1024);
char* result = fgets(text, 1024, stdin);
if (result[strlen(result) - 1] == 10)
result[strlen(result) - 1] = 0;
//free(text);
return result;
}
I have A LOT of short lines to read with this and I also need stdin to be replaceable with a FILE* handle. There is no need for me to realloc text because I have only short lines.
fgets returns a pointer to the string, so after the fgets line, result will be the same memory address as text. Then when you call free (text); you are returning invalid memory.
You should free the memory in the calling function when you have finished with result
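The aliasing described above can be checked directly. This is only an illustrative sketch, and the function name fgets_returns_its_buffer is hypothetical: on success, fgets returns the same address it was given, so result and text would name the same block of memory.

```c
#include <stdio.h>

/* Hypothetical check: returns 1 if fgets() handed back the very buffer
   it was given, 0 otherwise (including end of file or error, where
   fgets returns NULL). */
int fgets_returns_its_buffer(FILE *fp)
{
    char text[64];
    char *result = fgets(text, sizeof text, fp);
    return result == text;
}
```

This is why freeing text and then returning result hands the caller a dangling pointer: there is only one allocation, not two.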
You could also avoid the malloc/free stuff by structuring your code to pass a buffer something like this:
char *readFromIn(char *buffer); // declared before its first use
void parent_function ()
{
char buffer[1024];
while (readFromIn(buffer)) {
// Process the contents of buffer
}
}
char *readFromIn(char *buffer)
{
char *result = fgets(buffer, 1024, stdin);
size_t len;
// fgets returns NULL on error or end of input,
// in which case buffer contents will be undefined
if (result == NULL) {
return NULL;
}
len = strlen (buffer);
if (len == 0) {
return NULL;
}
if (buffer[len - 1] == '\n') {
buffer[len - 1] = 0;
}
return buffer;
}
Trying to avoid the malloc/free is probably wise if you are dealing with many small, short-lived items, so that the memory doesn't get fragmented, and it should be faster as well.
char *fgets(char *s, int size, FILE *stream) reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
Return Value: returns s on success, and NULL on error or when end of file occurs while no characters have been read.
So there are 2 critical problems with your code:
You don't check the return value of fgets
You want to free the memory where the string is stored and then return a pointer to that same memory. Accessing memory through such a pointer (a dangling pointer) leads to undefined behaviour.
Your function could look like this:
char* readFromIn() {
char* text = malloc(1024);
if (text == NULL)
return NULL; // allocation failed
if (fgets(text, 1024, stdin) != NULL) {
size_t textLen = strlen(text);
if (textLen > 0 && text[textLen - 1] == '\n')
text[textLen - 1] = '\0'; // getting rid of newline character
return text;
}
else {
free(text);
return NULL;
}
}
and then caller of this function should be responsible for deallocating the memory that return value of this function points to.
I know you mentioned that the lines are only short, but none of the solutions provided will work for lines greater than 1024 in length. It is for this reason that I provide a solution which will attempt to read entire lines, and resize the buffer when there's not enough space.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MINIMUM_CAPACITY 16
size_t read_line(char **buffer, size_t *capacity) {
char *buf = *buffer;
size_t cap = *capacity, pos = 0;
if (cap < MINIMUM_CAPACITY) { cap = MINIMUM_CAPACITY; }
for (;;) {
buf = realloc(buf, cap);
if (buf == NULL) { return pos; }
*buffer = buf;
*capacity = cap;
buf[pos] = '\0'; // keep buf a valid string if fgets() reads nothing
if (fgets(buf + pos, cap - pos, stdin) == NULL) {
break;
}
pos += strcspn(buf + pos, "\n");
if (buf[pos] == '\n') {
break;
}
cap *= 2;
}
return pos;
}
int main(void) {
char *line = NULL;
size_t size = 0;
for (size_t end = read_line(&line, &size); line[end] == '\n'; end = read_line(&line, &size)) {
line[end] = '\0'; // trim '\n' off the end
// process contents of buffer here
}
free(line);
return 0;
}
An ideal solution should be able to operate with a fixed buffer of 1 byte. This requires a more comprehensive understanding of the problem, however. Once achieved, adapting such a solution would achieve the most optimal solution.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readFromIn(FILE *fp)
{
char text[1024];
size_t len;
if (!fgets(text, sizeof text, fp)) return NULL;
len = strlen(text);
while (len && text[len-1] == '\n') text[--len] = 0;
return strdup(text);
}
Why did no one propose moving the buffer off the heap? This is my solution now (the buffer is a global array, so it has static storage rather than living on the stack):
char input[1024]; // held ready as buffer for fgets
char* readFromIn()
{
char* result = fgets(input, 1024, stdin);
if (result == NULL)
return "";
if (result[strlen(result) - 1] == '\n')
result[strlen(result) - 1] = 0;
return result;
}

Resources