Making getline() in C - c

I wanted to write my own getline()-like function in C, and this is what I came up with:
void getstring(char *string)
{
fgets(string,STRING_LENGTH,stdin);
string[strlen(string)-1]='\0';
}
It works well enough as long as the number of typed characters doesn't exceed STRING_LENGTH (in this case 100), but once it does, everything beyond that stays in the buffer and spills into the next string read.
I've already tried to flush the buffer with the following procedure:
void flush_buffer()
{
char c;
while((c = getchar()) != '\n' && c != EOF)
/* discard */ ;
}
It does its job in the case described, but when the input doesn't exceed STRING_LENGTH I need to type something before the program moves on.
Is there any way to read a string such that I know whether STRING_LENGTH is exceeded or not, so I can make the flushing conditional?
If there's a better way to write a getline()-like function, that would be even better.

Is there any way to read a string such that I know whether STRING_LENGTH is exceeded or not, so I can make the flushing conditional?
The number of characters read by fgets() will never reach or exceed STRING_LENGTH: it stores at most STRING_LENGTH - 1 characters plus the terminating null character.
But to find out whether more of the line remains unread, the usual way is to detect that the string read is of maximal length and does not end with a '\n'.
// return EOF, 0 (not all line was read) or 1 (all the line was read)
int getstring(char *string) {
if (fgets(string,STRING_LENGTH,stdin) == NULL) {
return EOF;
}
int retval = 1;
size_t sz = strlen(string);
if (sz + 1 == STRING_LENGTH && string[sz-1] != '\n') {
int c;
while((c = getchar()) != '\n' && c != EOF) {
retval = 0;
}
}
return retval;
}
More robust to pass in the size
// int getstring(char *string) {
// if (fgets(string,STRING_LENGTH,stdin) == NULL) {
int getstring(char *string, int sz) {
if (fgets(string,sz,stdin) == NULL) {
But to behave the same as getline(), more changes are needed. (@Barmar)
OP's flush_buffer() is an infinite loop when char is unsigned and end-of-file occurs, because c can then never equal EOF. Use int.
void flush_buffer(void) {
// char c;
int c;
while((c = getchar()) != '\n' && c != EOF)
/* discard */ ;
}
string[strlen(string)-1]='\0'; is undefined behavior (UB) should the first character read by fgets() be a null character: strlen() then returns 0 and the index wraps around to SIZE_MAX. A nice little exploit to avoid.
fgets(string,STRING_LENGTH,stdin);
if (*string) {
string[strlen(string)-1]='\0';
}
See also Removing trailing newline character from fgets() input.
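The points above can be combined into one helper. What follows is only a sketch under stated assumptions: read_line is a hypothetical name, it takes the stream and buffer size as parameters (the original uses stdin and STRING_LENGTH), and it does not try to handle null characters in the input. It strips the newline with strcspn(), which is safe even for an empty string, and reports truncation only if at least one character actually had to be discarded.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of any library.
 * Reads one line from `in` into buf (capacity n), drops the newline and
 * discards whatever did not fit.
 * Returns 1 if the whole line fit, 0 if it was truncated,
 * EOF if nothing could be read. */
int read_line(FILE *in, char *buf, size_t n)
{
    assert(in != NULL && buf != NULL && n >= 2 && n <= INT_MAX);

    if (fgets(buf, (int)n, in) == NULL) {
        return EOF;
    }
    size_t len = strcspn(buf, "\n");   /* index of '\n', or of the terminator */
    if (buf[len] == '\n') {
        buf[len] = '\0';               /* complete line: strip the newline */
        return 1;
    }
    /* No newline stored: the buffer filled up or input ended first. */
    int truncated = 0;
    int c;                             /* int, so that EOF is representable */
    while ((c = fgetc(in)) != '\n' && c != EOF) {
        truncated = 1;                 /* at least one character was dropped */
    }
    return truncated ? 0 : 1;
}
```

With this shape the caller never needs a separate flush_buffer() call, so the "have to type something before moving on" symptom from the question does not arise.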

Related

How to accept string input only if it is of a certain length in C, else ask the user to input the string again

How can I accept a set of strings as input in C and prompt the user to re-enter a string if it exceeds a certain length? I tried the following:
#include<stdio.h>
int main()
{
char arr[10][25]; //maximum 10 strings can be taken as input of max length 25
for(int i=0;i<10;i=i+1)
{
printf("Enter string %d:",i+1);
fgets(arr[i],25,stdin);
}
}
But here fgets accepts strings longer than that length too.
If the user hits return, the second string must be taken as input. I'm new to C.
How to accept string input only if it is of a certain length
Form a helper function to handle the various edge cases.
Use fgets(), then drop the potential '\n' (which fgets() retains) and detect long inputs.
Some untested code to give OP an idea:
#include <assert.h>
#include <limits.h> // INT_MAX
#include <stdio.h>
#include <string.h> // strlen()
// Pass in the max string _size_.
// Return NULL on end-of-file without input.
// Return NULL on input error.
// Otherwise return the buffer pointer.
char* getsizedline(size_t sz, char *buf, const char *reprompt) {
assert(sz > 0 && sz <= INT_MAX && buf != NULL); // #1
while (fgets(buf, (int) sz, stdin)) {
size_t len = strlen(buf);
// Lop off potential \n
if (len > 0 && buf[--len] == '\n') { // #2
buf[len] = '\0';
return buf;
}
// OK if next ends the line
int ch = fgetc(stdin);
if (ch == '\n' || feof(stdin)) { // #3
return buf;
}
// Consume rest of line;
while (ch != '\n' && ch != EOF) { // #4
ch = fgetc(stdin);
}
if (ch == EOF) { // #5
return NULL;
}
if (reprompt) {
fputs(reprompt, stdout);
}
}
return NULL;
}
Uncommon: reading null characters remains a TBD issue.
Details for OP, who is a learner. The numbers match the #1 to #5 comments in the code.
1. Some tests for sane input parameters. A size of zero does not allow any input to be saved as a null-character-terminated string. Buffers could be larger than INT_MAX, but fgets() cannot directly handle that. Code could be amended to handle 0 and huge buffers, yet leave that for another day.
2. fgets() does not always read a '\n'. The buffer might fill first, or the last line before end-of-file might lack a '\n'. Uncommonly, a null character might be read, even as the first character (hence the len > 0 test), which renders strlen() insufficient to determine the number of characters read. Code would need significant changes to determine the size if null character input needs detailed support.
3. If the prior fgets() filled its buffer and the next attempt to read a character resulted in end-of-file or '\n', this test is true and all is OK, so return success.
4. If the prior fgetc() resulted in an input error, this loop exits immediately. Otherwise, we need to consume the rest of the line, looking for a '\n' or EOF (which might be due to end-of-file or an input error).
5. If EOF was returned (due to end-of-file or an input error), there is no reason to continue. Return NULL.
Usage
// fgets(arr[i],25,stdin);
if (getsizedline(sizeof arr[i], arr[i], "Too long, try again.\n") == NULL) {
break;
}
This code uses a buffer slightly larger than the required maximum length. If a line of text and its newline can't be read into the buffer, the code reads the rest of the line and discards it. If they can, the line is still discarded when it is too long (or empty).
#include <stdio.h>
#include <stdlib.h> // exit()
#include <string.h>
#include <stdbool.h>
#define INPUTS 10
#define STRMAX 25
int main(void) {
char arr[INPUTS][STRMAX+1];
char buf[STRMAX+4];
for(int i = 0; i < INPUTS; i++) {
bool success = false;
while(!success) {
printf("Enter string %d: ", i + 1);
if(fgets(buf, sizeof buf, stdin) == NULL) {
exit(1); // or sth better
}
size_t index = strcspn(buf, "\n");
if(buf[index] == '\0') { // no newline found
// keep reading until end of line
while(fgets(buf, sizeof buf, stdin) != NULL) {
if(strchr(buf, '\n') != NULL) {
break;
}
}
if(feof(stdin)) {
exit(1); // or sth better
}
continue;
}
if(index < 1 || index > STRMAX) {
continue; // string is empty or too long
}
buf[index] = '\0'; // truncate newline
strcpy(arr[i], buf); // keep this OK string
success = true;
}
}
printf("Results:\n");
for(int i = 0; i < INPUTS; i++) {
printf("%s\n", arr[i]);
}
return 0;
}
The nice thing about fgets() is that it will place the line-terminating newline character ('\n') in the input buffer. All you have to do is look for it. If it is there, you got an entire line of input. If not, there is more to read.
The strategy then, is:
fgets( s, size_of_s, stdin );
char * p = strpbrk( s, "\r\n" );
if (p)
{
// end of line was found.
*p = '\0';
return s; // the complete line of input
}
If p is NULL, then there is more work to do. Since you wish to simply ignore lines that are too long, that is the same as throwing away input. Do so with a simple loop:
int c;
do c = getchar(); while ((c != EOF) && (c != '\n'));
Streams are typically buffered behind the scenes, either by the C Library or by the OS (or both), but even if they aren’t this is not that much of an overhead. (Use a profiler before playing “I’m an optimizing compiler”. Don’t assume bad things about the C Library.)
Once you have tossed everything you didn’t want (to EOL), make sure your input isn’t at EOF and loop to ask the user to try again.
Putting it all together
char * prompt( const char * message, char * s, size_t n )
{
while (!feof( stdin ))
{
// Ask for input
printf( "%s", message );
fflush( stdout ); // This line _may_ be necessary.
// Attempt to get an entire line of input
if (!fgets( s, n, stdin )) break;
char * p = strpbrk( s, "\r\n" );
// Success: return that line (sans newline character(s)) to the user
if (p)
{
*p = '\0';
return s;
}
// Failure: discard the remainder of the line before trying again
int c;
do c = getchar(); while ((c != EOF) && (c != '\n'));
}
// If we get this far it is because we have
// reached EOF or some other input error occurred.
return NULL;
}
Now you can use this utility function easily enough:
char user_name[20]; // artificially small
if (!prompt( "What is your name (maximum 19 characters)? ", user_name, sizeof(user_name) ))
{
complain_and_quit();
// ...because input is dead in a way you likely cannot fix.
// Feel free to check ferror(stdin) and feof(stdin) for more info.
}
This little prompt function is just an example of the kinds of helper utility functions you can write. You can do things like have an additional prompt for when the user does not obey you:
What is your name? John Jacob Jingleheimer Schmidt
Alas, I am limited to 19 characters. Please try again:
What is your name? John Schmidt
Hello John Schmidt.
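A variant with a separate "try again" message might look like the sketch below. The name prompt_retry and the in, out and retry parameters are assumptions added for illustration (the original prompt() is tied to stdin and stdout); passing the streams in also makes the function easy to exercise with temporary files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of prompt(): `in`, `out` and `retry` are illustrative
 * parameters. Returns s on success, NULL on end-of-file or input error. */
char *prompt_retry(FILE *in, FILE *out, const char *message,
                   const char *retry, char *s, int n)
{
    assert(in && out && message && retry && s && n > 1);
    for (;;) {
        fputs(message, out);
        fflush(out);
        if (!fgets(s, n, in)) {
            return NULL;            /* end-of-file or error before any input */
        }
        char *p = strpbrk(s, "\r\n");
        if (p) {
            *p = '\0';              /* complete line: strip the line ending */
            return s;
        }
        /* Too long: discard the rest of the line before trying again. */
        int c;
        do c = fgetc(in); while (c != EOF && c != '\n');
        if (c == EOF) {
            return NULL;            /* input ended while discarding */
        }
        fputs(retry, out);          /* tell the user why we are asking again */
    }
}
```

Called as prompt_retry(stdin, stdout, "What is your name? ", "Alas, I am limited to 19 characters. Please try again:\n", user_name, sizeof user_name), it should produce a dialogue like the one shown above. As in the original, a final line that lacks a newline is treated as a failure.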

How to read multiple lines of string from stdin in C?

I am a novice in C programming. Suppose I want to read multiple lines of string from stdin. How can I keep reading until a line only containing EOL?
example of input
1+2\n
1+2+3\n
1+2+3+4\n
\n (stop at this line)
It seems that when I hit enter(EOL) directly, scanf won't execute until something other than just EOL has been entered. How can I solve that problem?
I'll be really grateful if someone can help me with this. Thank you.
If you want to learn C, you should avoid scanf. The only use cases where scanf actually makes sense are in problems for which C is the wrong language. Time spent learning the foibles of scanf is not well spent, and it doesn't really teach you much about C. For something like this, just read one character at a time and stop when you see two consecutive newlines. Something like:
#include <stdio.h>
int
main(void)
{
char buf[1024];
int c;
char *s = buf;
while( (c = fgetc(stdin)) != EOF && s < buf + sizeof buf - 1 ){
if( c == '\n' && s > buf && s[-1] == '\n' ){
ungetc(c, stdin);
break;
}
*s++ = c;
}
*s = '\0';
printf("string entered: %s", buf);
return 0;
}
to read multiple lines of string from stdin. How can I keep reading until a line only containing EOL?
Keep track of whether the next character read begins a line. If a '\n' is read at the beginning of a line, stop.
getchar() approach:
bool beginning = true;
int ch;
while ((ch = getchar()) != EOF) {
if (beginning) {
if (ch == '\n') break;
}
// Do what ever you want with `ch`
beginning = ch == '\n';
}
fgets() approach - needs more code to handle lines longer than N
#define N 1024
char buf[N+1];
while (fgets(buf, sizeof buf, stdin) && buf[0] != '\n') {
; // Do something with buf
}
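The "more code" for long lines mostly amounts to remembering whether the previous chunk ended a line. A minimal sketch, assuming a hypothetical helper lines_before_blank() that takes the stream as a parameter and uses a deliberately tiny buffer so that the long-line path is actually exercised:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines that precede the first empty line
 * (or end-of-file), reading through a deliberately small buffer. */
size_t lines_before_blank(FILE *in)
{
    assert(in != NULL);
    char buf[8];                 /* small on purpose, to exercise long lines */
    bool at_line_start = true;   /* true when the next chunk begins a line */
    size_t lines = 0;

    while (fgets(buf, sizeof buf, in)) {
        if (at_line_start && buf[0] == '\n') {
            break;               /* a line holding only '\n': stop */
        }
        if (at_line_start) {
            lines++;             /* first chunk of a new line */
        }
        /* "Do something with buf" would go here. */
        at_line_start = strchr(buf, '\n') != NULL;
    }
    return lines;
}
```

Without the at_line_start flag, a line whose text exactly fills the buffer would leave a lone "\n" as the next chunk, and the loop would wrongly stop there.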
If you need to read one character at a time then you can with either getchar or fgetc depending upon whether or not you're reading from stdin or some other stream.
But you said you were reading strings, so I'm assuming fgets is more appropriate.
There are primarily two considerations:
1. maximum line length
2. whether or not to handle Windows versus non-Windows line endings
Even if you are a beginner--and I won't go into #2 here--you should know you can defend against it. I will at least say that if you compile on one platform and read from stdin from a redirected file from another platform, then you might have to write a defense.
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main (int argc, char *argv[]) {
char buf[32]; // relatively small buf makes testing easier
int lineContinuation = 0;
// If no characters are read, then fgets returns NULL.
while (fgets(buf, sizeof(buf), stdin) != NULL) {
int l = strlen(buf); // No newline in buf if line len + newline exceeds sizeof(buf)
if (buf[l-1] == '\n') {
if (l == 1 && !lineContinuation) {
break; // errno should indicate no error.
}
printf("send line ending (len=%d) to the parser\n", l);
lineContinuation = 0;
} else {
lineContinuation = 1;
printf("send line part (len=%d) to the parser\n", l);
}
}
printf("check errno (%d) if you must handle unexpected end of input use cases\n", errno);
}
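For the line-ending consideration (#2) that this answer leaves open, one possible defense is to strip either "\n" or "\r\n" before looking at the line. This is only a sketch; strip_eol is a hypothetical helper name, and a lone '\r' without a following '\n' is deliberately left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes one trailing "\n" or "\r\n" from s.
 * Returns the new length of the string. */
size_t strip_eol(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len > 0 && s[len - 1] == '\n') {
        s[--len] = '\0';                 /* drop the '\n' */
        if (len > 0 && s[len - 1] == '\r') {
            s[--len] = '\0';             /* drop the '\r' of a Windows ending */
        }
    }
    return len;
}
```

After stripping, an empty line has length 0 whichever platform produced the file, so the "stop at a blank line" test no longer depends on the exact line terminator.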

Use and explanation of getchar() function

I am writing a program to read a user input statement and extract all integers from the input. For example, if I enter "h3ll0", the program will output "30". I have used the fgets function to read the user input.
However, I am currently reading about getchar() and would like to know what would be the best way to use getchar() in my program to read user input instead of fgets. I am not really clear on how getchar() works and what situations it can be useful in.
This question is related to a project that specifically asks for getchar() as the method of reading user input. As I was unclear on how getchar() works, I built the rest of the program using fgets to ensure it was working.
#include <stdio.h>
int main()
{
char user_input[100];
int i;
int j = 0;
printf("Please enter your string: ");
fgets(user_input ,100, stdin);
for(i = 0; user_input[i] ; i++)
{
if(user_input[i] >= '0' && user_input[i] <= '9')
{
user_input[j] = user_input[i];
j++;
}
}
user_input[j] = '\0';
printf("Your output of only integers is: ");
printf("%s\n", user_input);
return 0;
}
OP: unclear on how getchar() works
int fgetc(FILE *stream) typically returns 1 of 257 different values.
"If ... a next character is present, the fgetc function obtains that character as an unsigned char converted to an int" C11 §7.21.7.1 2
On end-of-file or input error (rare), EOF is returned.
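Those 257 values are why the result must be stored in an int. On typical implementations where EOF is -1, a char variable either confuses the byte 0xFF with EOF (signed char) or can never equal EOF (unsigned char). A small sketch, with count_bytes as a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `in`. */
size_t count_bytes(FILE *in)
{
    assert(in != NULL);
    size_t n = 0;
    int c;                         /* int, so EOF stays distinct from byte 0xFF */
    while ((c = fgetc(in)) != EOF) {
        n++;
    }
    return n;
}
```

getchar() is equivalent to getc(stdin), so the same reasoning applies to it.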
OP: to use getchar() in my program to read user input instead of fgets.
Create your own my_fgets() with the same function signature and same function as fgets() and then replace.
char *fgets(char * restrict s, int n, FILE * restrict stream);
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array. C11 §7.21.7.2 2
Return the same value
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned. §7.21.7.2 3
Sample untested code
#include <stdbool.h>
#include <stdio.h>
char *my_fgets(char * restrict s, int n, FILE * restrict stream) {
bool something_read = false;
int ch = 0;
char *dest = s;
// Room ("reads at most one less") and EOF not returned?
while (n > 1 && (ch = fgetc(stream)) != EOF) {
n--;
something_read = true;
*dest++ = (char) ch;
if (ch == '\n') {
break; // "No additional characters are read after a new-line character"
}
}
// Did code end the while loop due to EOF?
if (ch == EOF) {
// Was EOF due to end-of-file or rare input error?
if (feof(stream)) {
// "If end-of-file is encountered and no characters ... read into the array ..."
if (!something_read) {
return NULL;
}
} else {
// "If a read error ..."
return NULL; // ** Note 1
}
}
// room for \0?
if (n > 0) {
*dest = '\0'; //" A null character is written immediately after the last character"
}
return s;
}
Perhaps improve fgets() and use size_t for n.
char *my_fgets(char * restrict s, size_t n, FILE * restrict stream);
fgets() with n <= 0 is not clearly defined. Using size_t, an unsigned type, at least eliminates n < 0 concerns.
Note 1: or use s = NULL; instead of return NULL; and let the remaining code null terminate the buffer. We have that option as "array contents are indeterminate".
Something like this should work as a clunky replacement to fgets using only getchar. I don't guarantee the accuracy of the error handling.
I don't think you would ever want to use getchar over fgets in an application. Getchar is more limited and less secure.
#include <stdio.h> // getchar(), EOF, size_t
void your_fgets(char *buffer, size_t buffer_size)
{
int i;
size_t j;
if (buffer_size == 0)
return ;
else if (buffer_size == 1)
{
buffer[0] = '\0';
return ;
}
j = 0;
while ((i = getchar()) != EOF)
{
buffer[j++] = i;
if (j == buffer_size - 1 || i == '\n')
{
buffer[j] = '\0';
return ;
}
}
buffer[j] = '\0';
}
I am baffled by the comments on this post suggesting that fgets is easier to use. Using fgets unnecessarily complicates the issue. Just do:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
int c;
while( ( c = getchar() ) != EOF ) {
if(isdigit(c) && (putchar(c) == EOF)) {
perror("stdout");
return EXIT_FAILURE;
}
}
return ferror(stdin);
}
There is absolutely no reason to use any additional buffering, or read the input one line at a time. Maybe you'll want to output newlines as they come in, but that would be an implementation detail that is left unspecified in the question. Either way, it's utterly trivial (if(( c == '\n' || isdigit(c)) && (putchar(c) == EOF))). Just read a character and decide if you want to output it. The logic is much easier if you don't think about the input as being more complicated than it is. (It's not line-oriented...it's just a stream of bytes.)
If, for some unknown reason you want to make this tool usable only in an interactive setting and load up your output with excess verbosity, you can easily do something like:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
int c;
do {
int want_header = 1;
printf("Please enter your string: ");
while( ( c = getchar() ) != EOF && c != '\n' ) {
if(! isdigit(c)) {
continue;
}
if(want_header) {
want_header=0;
printf("Your output of only integers is: ");
}
if(putchar(c) == EOF) {
perror("stdout");
return EXIT_FAILURE;
}
}
if( c == '\n')
putchar(c);
want_header = 0;
} while(c == '\n');
return ferror(stdin);
}
but, please, don't do that. (Imagine if grep started by emitting a prompt that said "please enter the regex you would like to search for"!)

Dynamically created C string

I'm trying to get an expression from the user and put it in a dynamically created string. Here's the code:
char *get_exp() {
char *exp, *tmp = NULL;
size_t size = 0;
char c;
scanf("%c", &c);
while (c != EOF && c != '\n') {
tmp = realloc(exp, ++size * sizeof char);
if (tmp == NULL)
return NULL;
exp = tmp;
exp[size-1] = c;
scanf("%c", &c);
}
tmp = realloc(exp, size+1 * sizeof char);
size++;
exp = tmp;
exp[size] = '\0';
return exp;
}
However, the first character read is a newline char every time for some reason, so the while loop exits. I'm using XCode, may that be the cause of the problem?
No, XCode is not part of your problem (it is a poor workman who blames his tools).
You've not initialized exp, which is going to cause problems.
Your code to detect EOF is completely broken; you must test the return value of scanf() to detect EOF. You'd do better using getchar() with int c:
int c;
while ((c = getchar()) != EOF && c != '\n')
{
...
}
If you feel you must use scanf(), then you need to test each call to scanf():
char c;
while (scanf("%c", &c) == 1 && c != '\n')
{
...
}
You do check the result of realloc() in the loop; that's good. You don't check the result of realloc() after the loop (and you aren't shrinking your allocation); please check every time.
You should consider using a mechanism that allocates many bytes at a time, rather than one realloc() per character read; that is expensive.
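One common way to do that is to double the capacity whenever the buffer fills up. The following is a sketch rather than a drop-in replacement for get_exp(): read_line_dynamic is a hypothetical name, it takes the stream as a parameter, and the starting capacity of 16 is arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from `in` into a malloc'd string,
 * without the newline. Returns NULL on allocation failure or when
 * end-of-file is hit before any character is read. Caller frees the result. */
char *read_line_dynamic(FILE *in)
{
    assert(in != NULL);
    size_t cap = 16;               /* arbitrary starting capacity */
    size_t len = 0;
    char *line = malloc(cap);
    if (line == NULL) {
        return NULL;
    }
    int c;                         /* int, so that EOF is representable */
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {      /* keep one byte free for the terminator */
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);        /* do not leak the old block */
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
        line[len++] = (char)c;
    }
    if (c == EOF && len == 0) {
        free(line);
        return NULL;               /* nothing left to read */
    }
    line[len] = '\0';
    return line;
}
```

Doubling keeps the number of realloc() calls logarithmic in the line length, at the cost of some unused space at the end of the buffer.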
Of course, if the goal is simply to read a line, then it would be simplest to use POSIX getline(), which handles all the allocation for you. Alternatively, you can use fgets() to read the line. You might use a fixed buffer to collect the data, and then copy that to an appropriately sized dynamically allocated buffer. You would also allow for the possibility that the line is very long, so you'd check that you'd actually got the newline.
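For the POSIX getline() route, a minimal sketch could look like this. It assumes a POSIX system (getline() is not part of standard C), and get_exp_from is a hypothetical wrapper name that takes the stream as a parameter.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper around POSIX getline(): returns a malloc'd line
 * without its trailing newline, or NULL on end-of-file or error.
 * Caller frees the result. */
char *get_exp_from(FILE *in)
{
    assert(in != NULL);
    char *line = NULL;             /* getline() allocates when this is NULL */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, in);
    if (len < 0) {
        free(line);                /* the buffer may be allocated even on failure */
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';      /* getline() keeps the newline; drop it */
    }
    return line;
}
```

When reading many lines, it is usually better to keep line and cap alive across calls so that getline() can reuse the buffer instead of allocating a new one each time.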
Here on Windows XP/cc, like Michael said, it works if exp is initialized to NULL.
Here's the fixed code, with comments explaining what is different from your code in the question:
char *get_exp()
{
// keep variables with narrowest scope possible
char *exp = NULL;
size_t size = 0;
// use a "forever" loop with break in the middle, to avoid code duplication
for(;;) {
// removed sizeof char, because that is defined to be 1 in C standard
char *tmp = realloc(exp, ++size);
if (tmp == NULL) {
// in your code, you did not free already reserved memory here
free(exp); // free(NULL) is allowed (does nothing)
return NULL;
}
exp = tmp;
// Using getchar instead of scanf to get EOF,
// type int required to have both all byte values, and EOF value.
// If you do use scanf, you should also check its return value (read doc).
int ch = getchar();
if (ch == EOF) break; // eof (or error, use feof(stdin)/ferror(stdin) to check)
if (ch == '\n') break; // end of line
exp[size - 1] = ch; // implicit cast to char
}
if (exp) {
// If we got here, for loop above did break after reallocing buffer,
// but before storing anything to the new byte.
// Your code put the terminating '\0' to 1 byte beyond end of allocation.
exp[size-1] = '\0';
}
// else exp = strdup(""); // uncomment if you want to return empty string for empty line
return exp;
}

What is the easiest way to count the newlines in an ASCII file?

Which is the fastest way to get the lines of an ASCII file?
Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.
Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:
while (EOF != (scanf("%*[^\n]"), scanf("%*c")))
++lines;
The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we go to even more work to figure out whether the input we read ended with a newline).
Unfortunately, we do have to break up the scanf into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:
int ch;
while (EOF != (ch=getchar()))
if (ch=='\n')
++lines;
The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char for the code to work correctly.
Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.
#include <stdio.h>
int main()
{
FILE *fp = stdin; /* or use fopen to open a file */
int c; /* Nb. int (not char) for the EOF */
unsigned long newline_count = 0;
/* count the newline characters */
while ( (c=fgetc(fp)) != EOF ) {
if ( c == '\n' )
newline_count++;
}
printf("%lu newline characters\n", newline_count);
return 0;
}
Maybe I'm missing something, but why not simply:
#include <stdio.h>
int main(void) {
int n = 0;
int c;
while ((c = getchar()) != EOF) {
if (c == '\n')
++n;
}
printf("%d\n", n);
}
If you also want to count a final partial line (i.e. [^\n]EOF):
#include <stdio.h>
int main(void) {
int n = 0;
int pc = EOF;
int c;
while ((c = getchar()) != EOF) {
if (c == '\n')
++n;
pc = c;
}
if (pc != EOF && pc != '\n')
++n;
printf("%d\n", n);
}
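Come on, why compare every character? That is very slow: about 3 s for a 10 MB file.
The solution below is faster. Note that fgetline() is not a standard C function; it is assumed here to be a helper that reads one line and returns non-zero while lines remain.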
unsigned long count_lines_of_file(char *file_patch) {
FILE *fp = fopen(file_patch, "r");
unsigned long line_count = 0;
if(fp == NULL){
return 0;
}
while ( fgetline(fp) )
line_count++;
fclose(fp);
return line_count;
}
What about this?
#include <stdio.h>
#include <string.h>
#define BUFFER_SIZE 4096
int main(int argc, char** argv)
{
int count;
int bytes;
FILE* f;
char buffer[BUFFER_SIZE + 1];
char* ptr;
if (argc != 2 || !(f = fopen(argv[1], "r")))
{
return -1;
}
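count = 0;
// Loop on fread()'s result instead of feof(), so a file whose size is an
// exact multiple of BUFFER_SIZE is not reported as an error.
while ((bytes = fread(buffer, sizeof(char), BUFFER_SIZE, f)) > 0)
{
buffer[bytes] = '\0';
// Count each '\n' in this chunk.
for (ptr = strchr(buffer, '\n'); ptr; ptr = strchr(ptr + 1, '\n'))
{
++count;
}
}
fclose(f);
printf("%d\n", count);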
return 0;
}
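A block-based variant that uses memchr() instead of strchr() avoids the need for a terminating null character and keeps counting correctly if the file happens to contain '\0' bytes. This is a sketch under assumptions: count_newlines is a hypothetical name and the 4096-byte block size is arbitrary. Whether it is actually faster than a plain getchar() loop is something to measure rather than assume.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts '\n' bytes from the current position of fp
 * to end-of-file, reading in blocks. */
unsigned long count_newlines(FILE *fp)
{
    assert(fp != NULL);
    char block[4096];
    unsigned long count = 0;
    size_t got;

    while ((got = fread(block, 1, sizeof block, fp)) > 0) {
        const char *p = block;
        const char *end = block + got;
        /* memchr() takes an explicit length, so '\0' bytes are harmless */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            count++;
            p++;                   /* continue just past the newline found */
        }
    }
    return count;
}
```

A caller that needs to distinguish a read error from end-of-file can check ferror(fp) after the call.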
