I read a snippet of code from C Primer Plus, and tried hard to understand *find = '\0';
#include <stdio.h>
#include <string.h>
char *s_gets(char *st, int n);
struct book {
char title[40];
char author[40];
float value;
};
int main(void) {
...
}
char *s_gets(char *st, int n) {
char *ret_val;
char *find;
ret_val = fgets(st, n, stdin);
if (ret_val) {
find = strchr(st, '\n'); //look for newline
if (find) // if address is not null
*find = '\0'; //place a null character there
else
while (getchar() != '\n')
continue; //dispose rest of line
}
return ret_val;
}
For what purpose should find = strchr(st, '\n'); be followed by *find = '\0';
I searched for strchr but found it an odd name, although I could get an idea of its function. Does the name strchr come from "string character"?
The code using find = strchr(s, '\n') and what follows zaps the newline that was read by fgets() and included in the result string, if there is indeed a newline in the string. Often, you can use an alternative, more compact, notation:
s[strcspn(s, "\n")] = '\0';
which is written without any conditional code visible. (If there's no newline, the null byte overwrites the existing null byte.)
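As a minimal sketch, here are the two idioms side by side. The helper names `strip_newline_strchr` and `strip_newline_strcspn` are made up for illustration and do not come from the book.

```c
#include <string.h>

/* Variant 1: locate the newline with strchr() and overwrite it if present. */
void strip_newline_strchr(char *s)
{
    char *find = strchr(s, '\n');   /* pointer to the first '\n', or NULL */
    if (find)
        *find = '\0';               /* the string now ends where the newline was */
}

/* Variant 2: strcspn() returns the index of the first '\n', or the index of
   the terminating null byte when there is no newline at all. */
void strip_newline_strcspn(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```

Both leave a string without a newline unchanged; the first does so by skipping the assignment, the second by overwriting the existing null byte with another one.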
The overall objective seems to be to make s_gets() behave more like an antique, dangerous and no longer standard function, gets(), which reads up to and including a newline, but does not include the newline in the result. The gets() function has other design flaws which make it a function to be forgotten — never use it!
The code shown also detects when no newline was read and then goes into a dangerous loop to read the rest of the line. The loop should be:
else
{
int c;
while ((c = getchar()) != EOF && c != '\n')
;
}
It is important to detect EOF; not all files end with a newline. It is also important to detect EOF reliably, which means this code has to use int c (whereas the original flawed loop could avoid using a variable like c). If this code carelessly used char c instead of int c, it could either fail to detect EOF altogether (if plain char is an unsigned type) or it could give a false positive for EOF when the data being read contains a byte with value 0xFF (if plain char is a signed type).
Note that using strcspn() directly as shown is not an option in this code because then you can't detect whether there was a newline in the data; you merely know there is no newline in the data after the call. As Antti Haapala points out, you could capture the result of strcspn() and then decide whether a newline was found and therefore whether to read to the end of line (or end of file if there is no EOL before the EOF).
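One way to act on that suggestion is sketched below. The name `s_gets_from` and the extra `FILE *` parameter are hypothetical additions (the book's s_gets() always reads stdin); they only make the helper usable on any stream.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into st (at most n-1 characters), strips the newline,
   and discards whatever part of an over-long line did not fit. */
char *s_gets_from(char *st, int n, FILE *fp)
{
    char *ret_val = fgets(st, n, fp);
    if (ret_val) {
        size_t len = strcspn(st, "\n");   /* index of '\n' or of the final '\0' */
        if (st[len] == '\n') {
            st[len] = '\0';               /* newline found: remove it */
        } else {
            int c;                        /* int, so that EOF is distinguishable */
            while ((c = getc(fp)) != EOF && c != '\n')
                ;                         /* drain the rest of the line */
        }
    }
    return ret_val;
}
```

Calling `s_gets_from(buf, sizeof buf, stdin)` should then behave like the original s_gets(), except that the drain loop also stops at end of file.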
Related
I've written the following function to dynamically allocate an input string while typing, without asking the user how many characters it's long.
#include<stdio.h>
#include<stdlib.h>
char* dyninp (char str[], int *n) {
char ch;
int i = 0;
do {
ch = getchar();
str = (char *) realloc(str, (i+1)*sizeof(char));
str[i] = ch;
i++;
} while (ch != '\n');
/*substitute '\n' with '\0'*/
str[i-1] = '\0';
if (n != NULL) {
/*i contains the total length, including '\0'*/
*n = i-1;
}
/*because realloc changes array's address*/
return str;
}
/*it is called in this way:
char *a;
int n;
a = dyninp (a, &n);
*/
This code works, but I have some questions about it:
Why does it work?
I don't understand why, when I execute it, I can also delete characters before pressing enter. The getchar() function reads only one character at each iteration, which is written into the array, so how can I delete any of them?
If getchar() deletes the previous character when receives '\127', then the loop should continue executing as with any other character. But this doesn't happen, because, when loop ends, "i" always contains the exact number of elements.
Is my code efficient? If it's not, how could I make it better (even using built-in functions)?
Unless you put the terminal in "raw" mode, the operating system doesn't make input available to the application until you press return. Input editing is handled by the operating system. When you call getchar(), it reads a character from this input buffer, not directly from the terminal.
There's a POSIX function getline() that does the same thing.
Answer for the first question:
The reason for this is probably buffering in the terminal driver. A short explanation is provided here in the notes section. The getchar() function does not receive any input until a line has been committed by the user in the terminal when it receives the whole string and returns it character by character. Deletion is therefore a feature of your terminal, not your program.
I have written a small program to read the full value of the user input with the getchar() function in C. As getchar() only returns the first character, I tried to loop through it... The code I have tried myself is:
#include <stdio.h>
int main()
{
char a = getchar();
int b = strlen(a);
for(i=0; i<b; i++) {
printf("%c", a[i]);
}
return 0;
}
But this code does not give me the full value of the user input.
You can do the looping part this way:
int c;
while((c = getchar()) != '\n' && c != EOF)
{
printf("%c", c);
}
getchar() returns int, not char, and it only returns one character per call. It does, however, return EOF once the input terminates.
You do not check for EOF (and you cannot reliably detect it once the result of getchar() has been stored in a char).
a is a char, not an array or a string, so you cannot apply strlen() to it.
strlen() returns size_t, which is unsigned.
Enable most warnings, your compiler wants to help you.
Sidenote: char can be signed or unsigned.
Read a C book! Your code is badly broken and confuses several basic concepts (no offense).
For a starter, try this one:
#include <stdio.h>
int main(void)
{
int ch;
while ( 1 ) {
ch = getchar();
x: if ( ch == EOF ) // done if input terminated
break;
printf("%c", ch); // %c takes an int-argument!
}
return 0;
}
If you want to terminate on other characters, too, #include <string.h> and replace line x: by:
if ( ch == EOF || strchr("\n\r\33", ch) )
That will terminate if ch is one of the chars listed in the string literal (here: newline, return, ESCape). However, it will also match the terminating '\0' (not sure if you can enter that anyway).
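If matching the terminating '\0' is not wanted, the test can be wrapped in a small helper. This is only a sketch: the name `is_terminator` is hypothetical, and treating a NUL byte as ordinary data is an assumption about what the program should do.

```c
#include <stdio.h>
#include <string.h>

/* Returns nonzero if ch should end the input loop. */
int is_terminator(int ch)
{
    if (ch == EOF)
        return 1;            /* end of input or read error */
    if (ch == '\0')
        return 0;            /* strchr() would match the literal's own terminator */
    return strchr("\n\r\33", ch) != NULL;
}
```

EOF is checked first because strchr() converts its second argument to char, so passing EOF straight through would search for an unrelated byte value.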
Storing that into an array is shown in good C books (at least you will learn how to do it yourself).
Point 1: In your code, a is not of array type. You cannot use the array subscript operator on it.
Point 2: In your code, strlen(a); is wrong. strlen() calculates the length of a string, i.e., a null-terminated char array. You need to pass a pointer to a string to strlen().
Point 3: getchar() does not loop by itself. You need to put getchar() inside a loop to keep on reading the input.
Point 4: getchar() returns an int. You should change the variable type accordingly.
Point 5: The recommended signature of main() is int main(void).
Keeping the above points in mind, we can write some pseudo-code, which will look something like
#include <stdio.h>
#define MAX 10
int main(void) // nice signature. :-)
{
char arr[MAX] = {0}; //to store the input
int ret = 0;
for(int i=0; i<MAX; i++) //don't want to overrun array
{
if ( (ret = getchar())!= EOF) //yes, getchar() returns int
{
arr[i] = ret;
printf("%c", arr[i]);
}
else
;//error handling
}
return 0;
}
getchar() reads a single character, not a whole string as you want.
Use fgets() to read a string; gets() and scanf() can also do it but are not recommended.
First, though, you need to reserve storage for the string: char S[50];
or allocate it with malloc (#include <stdlib.h>):
char *S;
S=(char*)malloc(50);
It looks like you want to read a line (your question mentions a "full value" but you don't explain what that means).
You might simply use fgets for that purpose, with the limitation that you have to provide a fixed size line buffer (and handle - or ignore - the case when a line is larger than the buffer). So you would code
char linebuf[80];
memset (linebuf, 0, sizeof(linebuf)); // clear the buffer
char* lp = fgets(linebuf, sizeof(linebuf), stdin);
if (!lp) {
// handle end-of-file or error
}
else if (!strchr(lp, '\n')) {
// linebuf was too short to hold the whole line
}
If you are on a POSIX system (e.g. Linux or MacOSX), you could use getline (which dynamically allocates a buffer). If you want line-editing facilities on Linux, consider also readline(3).
Avoid the obsolete gets like the plague.
Once you have read a line into some buffer, you can parse it (e.g. using manual parsing, or sscanf -notice the useful %n conversion specification, and test the result count of sscanf-, or strtol(3) -notice that it can give you the ending pointer- etc...).
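A minimal sketch of both parsing routes follows. The function names `parse_long` and `parse_pair`, and the input formats they accept, are assumptions chosen for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* strtol() variant: returns 1 and stores the value if the whole line
   (apart from trailing whitespace) is one decimal integer, else 0. */
int parse_long(const char *line, long *out)
{
    char *end;
    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line || errno == ERANGE)
        return 0;                       /* no digits, or out of range */
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;                          /* tolerate trailing whitespace */
    if (*end != '\0')
        return 0;                       /* trailing junk such as "12abc" */
    *out = v;
    return 1;
}

/* sscanf() variant: expects "<int> <int>"; %n records how many characters
   were consumed so that leftovers can be detected. */
int parse_pair(const char *line, int *a, int *b)
{
    int used = 0;
    if (sscanf(line, "%d %d %n", a, b, &used) != 2)
        return 0;                       /* fewer than two conversions */
    return line[used] == '\0';          /* reject trailing junk */
}
```

In both cases the line is read first (with fgets or getline) and parsed afterwards, so a malformed line can be reported or skipped without leaving the stream in an unknown state.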
I am really desperate trying to figure out how can I read char with value -1/255 because for most functions this means EOF. For example if I enter all characters from extended ASCII from low to high (decimal value) I end up with -1/255 which is EOF so I will not get it to array. I created small code to express my problem.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define BUFFERSIZE 1024
int main(void){
char c;
unsigned int *array = (unsigned int*)calloc(BUFFERSIZE,sizeof(unsigned int)), i = 0;
while (1){
c = fgetc(stdin);
if (c == EOF)
break;
array[i] = c;
i++;
}
array[i] = 0;
unsigned char *string = (unsigned char *)malloc(i);
for(int j = 0;j < i;j++)
string[j] = array[j];
free(array);
//working with "string"
return 0;
}
I could modify
if (c == EOF)
break;
like this
c = fgetc(stdin);
array[i] = c;
i++;
if (c == EOF)
break;
but of course, the program will then also read the control character that the user enters from the keyboard (for example Ctrl+D on Linux). I tried opening stdin as binary but I found out that POSIX systems treat all files as binary. I am using QT, GCC and Ubuntu. I tried fread, read, fgets but I ended up with the same result. Simply put, I need to read everything I enter on stdin and put it into a char array, except when I enter the control character (Ctrl+D) to end reading. Any advice appreciated.
Edit: As noted in the comment by #TonyB you should not declare c as char because fgetc() returns int, so changing it to int c; should make it possible to store EOF in c.
I didn't see that you declared c as char c; so all the credit goes to #TonyB.
Original Answer: Although the problem was addressed in the comments, and I added the solution to this answer, I think you are confused: EOF is not a character, it's a special value returned by some I/O functions to indicate the end of a stream.
You should never assume that its value is -1. It often is, but there is a macro for a reason, so you should always rely on the fact that these functions return EOF, not -1.
Since there is no ASCII representation for the value -1 you can't input that as a character; you can, however, parse the input string {'-', '1', '\0'} and convert it to a number if you need to.
Also, do not cast the return value of malloc().
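As a sketch of how the reading loop could look with an int variable: the function name `read_all`, the `FILE *` parameter and the starting capacity of 1024 are assumptions for illustration, not part of the question's code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads every byte from fp until end of file. Returns a malloc'd buffer
   (caller frees) and stores the byte count in *len; returns NULL on
   allocation failure. */
unsigned char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 1024, n = 0;
    unsigned char *buf = malloc(cap);
    int c;                               /* int: holds 0..UCHAR_MAX plus EOF */

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF) {
        if (n == cap) {                  /* buffer full: double it */
            unsigned char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (unsigned char)c;     /* a 0xFF byte arrives as 255, not EOF */
    }
    *len = n;
    return buf;
}
```

Because c is an int, a data byte 0xFF is returned as 255 and never compares equal to EOF, so it is stored like any other byte; the loop ends only when fgetc() actually reports end of file or an error.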
I am trying to write a function that does the following things:
Start an input loop, printing '> ' each iteration.
Take whatever the user enters (unknown length) and read it into a character array, dynamically allocating the size of the array if necessary. The user-entered line will end at a newline character.
Add a null byte, '\0', to the end of the character array.
Loop terminates when the user enters a blank line: '\n'
This is what I've currently written:
void input_loop(){
char *str = NULL;
printf("> ");
while(printf("> ") && scanf("%a[^\n]%*c",&input) == 1){
/*Add null byte to the end of str*/
/*Do stuff to input, including traversing until the null byte is reached*/
free(str);
str = NULL;
}
free(str);
str = NULL;
}
Now, I'm not too sure how to go about adding the null byte to the end of the string. I was thinking something like this:
last_index = strlen(str);
str[last_index] = '\0';
But I'm not too sure if that would work though. I can't test if it would work because I'm encountering this error when I try to compile my code:
warning: ISO C does not support the 'a' scanf flag [-Wformat=]
So what can I do to make my code work?
EDIT: changing scanf("%a[^\n]%*c",&input) == 1 to scanf("%as[^\n]%*c",&input) == 1 gives me the same error.
First of all, scanf format strings do not use regular expressions, so I don't think something close to what you want will work. As for the error you get, according to my trusty manual, the %a conversion flag is for floating point numbers, but it only works on C99 (and your compiler is probably configured for C90)
But then you have a bigger problem. scanf expects that you pass it a previously allocated empty buffer for it to fill in with the read input. It does not malloc the string for you, so your attempts at initializing str to NULL and the corresponding frees will not work with scanf.
The simplest thing you can do is to give up on arbitrary-length strings. Create a large buffer and forbid inputs that are longer than that.
You can then use the fgets function to populate your buffer. To check if it managed to read the full line, check if your string ends with a "\n".
char str[256+1];
while(true){
printf("> ");
if(!fgets(str, sizeof str, stdin)){
//error or end of file
break;
}
size_t len = strlen(str);
if(len + 1 == sizeof str && str[len - 1] != '\n'){
//user typed something too long
exit(1);
}
printf("user typed %s", str);
}
Another alternative is you can use a nonstandard library function. For example, in Linux there is the getline function that reads a full line of input using malloc behind the scenes.
No error checking, don't forget to free the pointer when you're done with it. If you use this code to read enormous lines, you deserve all the pain it will bring you.
#include <stdio.h>
#include <stdlib.h>
char *readInfiniteString() {
int l = 256;
char *buf = malloc(l);
int p = 0;
int ch; /* int, not char, so that EOF can be detected */
ch = getchar();
while(ch != '\n' && ch != EOF) {
buf[p++] = ch;
if (p == l) {
l += 256;
buf = realloc(buf, l);
}
ch = getchar();
}
buf[p] = '\0';
return buf;
}
int main(int argc, char *argv[]) {
printf("> ");
char *buf = readInfiniteString();
printf("%s\n", buf);
free(buf);
}
If you are on a POSIX system such as Linux, you should have access to getline. It can be made to behave like fgets, but if you start with a null pointer and a zero length, it will take care of memory allocation for you.
You can use it in a loop like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // for strcmp
int main(void)
{
char *line = NULL;
size_t nline = 0;
for (;;) {
ssize_t n; // getline returns ssize_t
printf("> ");
// read line, allocating as necessary
n = getline(&line, &nline, stdin);
if (n < 0) break;
// remove trailing newline
if (n && line[n - 1] == '\n') line[n - 1] = '\0';
// do stuff
printf("'%s'\n", line);
if (strcmp("quit", line) == 0) break;
}
free(line);
printf("\nBye\n");
return 0;
}
The passed pointer and the length value must be consistent, so that getline can reallocate memory as required. (That means that you shouldn't change nline or the pointer line in the loop.) If the line fits, the same buffer is used in each pass through the loop, so that you have to free the line string only once, when you're done reading.
Some have mentioned that scanf is probably unsuitable for this purpose. I wouldn't suggest using fgets, either. Though it is slightly more suitable, there are problems that seem difficult to avoid, at least at first. Few C programmers manage to use fgets right the first time without reading the fgets manual in full. The parts most people manage to neglect entirely are:
what happens when the line is too large, and
what happens when EOF or an error is encountered.
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgets() shall return a null pointer. If a read error occurs, the error indicator for the stream shall be set, fgets() shall return a null pointer...
I don't feel I need to stress the importance of checking the return value too much, so I won't mention it again. Suffice to say, if your program doesn't check the return value your program won't know when EOF or an error occurs; your program will probably be caught in an infinite loop.
When no '\n' is present, the remaining bytes of the line have yet to be read. Thus, fgets will always parse the line at least once, internally. When you add extra logic to check for a '\n', you're parsing the data a second time.
This allows you to realloc the storage and call fgets again if you want to dynamically resize the storage, or discard the remainder of the line (warning the user of the truncation is a good idea), perhaps using something like fscanf(file, "%*[^\n]");.
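The "discard the remainder" option can be sketched as follows. The names `discard_rest_of_line` and `read_line_fixed`, and the `truncated` out-parameter, are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Consumes the unread remainder of the current line, including its newline. */
void discard_rest_of_line(FILE *f)
{
    if (fscanf(f, "%*[^\n]") == EOF)
        return;                          /* nothing left to read */
    (void)getc(f);                       /* consume the '\n' itself, if any */
}

/* Reads a line into buf without its newline; sets *truncated to 1 if the
   line did not fit and its tail was thrown away. */
char *read_line_fixed(char *buf, int size, FILE *f, int *truncated)
{
    *truncated = 0;
    if (fgets(buf, size, f) == NULL)
        return NULL;                     /* EOF or read error */
    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';                 /* complete line: strip the newline */
    } else {
        int c = getc(f);                 /* look at what follows the buffer */
        if (c != EOF && c != '\n') {
            *truncated = 1;              /* more data on this line: drop it */
            discard_rest_of_line(f);
        }
    }
    return buf;
}
```

The caller can use the `truncated` flag to warn the user; a line that exactly fills the buffer, or a final line without a newline, is not reported as truncated.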
hugomg mentioned using multiplication in the dynamic resize code to avoid quadratic runtime problems. Along this line, it would be a good idea to avoid parsing the same data over and over each iteration (thus introducing further quadratic runtime problems). This can be achieved by storing the number of bytes you've read (and parsed) somewhere. For example:
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL, *temp;
do {
size_t alloc_size = bytes_read * 2 + 2; /* at least 2, so fgets always has room for one character */
temp = realloc(bytes, alloc_size);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
temp = fgets(bytes + bytes_read, alloc_size - bytes_read, f); /* Parsing data the first time */
if (temp == NULL) {
break; /* EOF or error: nothing new was stored, so do not scan it */
}
bytes_read += strcspn(bytes + bytes_read, "\n"); /* Parsing data the second time */
} while (bytes[bytes_read] != '\n');
bytes[bytes_read] = '\0';
return bytes;
}
Those who do manage to read the manual and come up with something correct (like this) may soon realise the complexity of an fgets solution is at least twice as poor as the same solution using fgetc. We can avoid parsing data the second time by using fgetc, so using fgetc might seem most appropriate. Alas most C programmers also manage to use fgetc incorrectly when neglecting the fgetc manual.
The most important detail is to realise that fgetc returns an int, not a char. It may return typically one of 256 distinct values, between 0 and UCHAR_MAX (inclusive). It may otherwise return EOF, meaning there are typically 257 distinct values that fgetc (or consequently, getchar) may return. Trying to store those values into a char or unsigned char results in loss of information, specifically the error modes. (Of course, this typical value of 257 will change if CHAR_BIT is greater than 8, and consequently UCHAR_MAX is greater than 255)
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL;
do {
if ((bytes_read & (bytes_read + 1)) == 0) {
void *temp = realloc(bytes, bytes_read * 2 + 1);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
}
int c = fgetc(f);
bytes[bytes_read] = c >= 0 && c != '\n'
? c
: '\0';
} while (bytes[bytes_read++]);
return bytes;
}
I'm looking to have fscanf identify when a potential overflow happens, and I can't wrap my head around how best to do it.
For example, for a file containing the string
**a**bb**cccc**
I do a
char str[10];
while (fscanf(inputf, "*%10[^*]*", str) != EOF) {
}
because I'm guaranteed that what is between ** and ** is usually less than 10. But sometimes I might get a
**a**bb**cccc*
(without the last *) or even potentially a buffer overflow.
I considered using
while (fscanf(inputf, "*%10[^*]", str) != EOF) {
}
(without the last *) or even
while (fscanf(inputf, "*%10s*", str) != EOF) {
}
but that would return the entire string. I tried seeing if I could check for the presence or lack of a *, but I can't get that to work. I've also seen implementation of fgets, but I'd rather not make it complicated. Any ideas?
While fscanf() seems to have been designed as a general purpose expression parser, few programmers rely on that ability. Instead, use fgets() to read a text line, and then use a parser of your choosing or design to dissect the text buffer.
Using the full features of fscanf() is dodgy: different implementations don't always provide the full functionality, or even implement it correctly.
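A hand-written parser for the star-delimited format might look like the sketch below. The function name `next_field` and its return codes are hypothetical, and it assumes the line has already been read into memory (for instance with fgets()) and had its newline removed.

```c
#include <string.h>

/* Copies the next run of non-'*' characters from *cursor into buf and
   advances *cursor past it. Returns 1 on success, 0 if no field remains,
   and -1 if the field was too long for buf (buf is left untouched). */
int next_field(const char **cursor, char *buf, size_t bufsize)
{
    const char *p = *cursor;
    p += strspn(p, "*");                 /* skip any leading stars */
    size_t len = strcspn(p, "*");        /* length of the field itself */
    *cursor = p + len;
    if (len == 0)
        return 0;                        /* only stars (or nothing) were left */
    if (len >= bufsize)
        return -1;                       /* would overflow: report it instead */
    memcpy(buf, p, len);
    buf[len] = '\0';
    return 1;
}
```

Since the field length is known before anything is copied, an over-long field is reported rather than overflowing the buffer, and a missing final star makes no difference to the result.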
I'm not clear on exactly what you want. Is it to skip over any number of stars, and then read up to 9 non-star characters into a buffer? If so, try this:
void read_field(FILE *fin, char buf[10])
{
int c;
char *ptr = buf;
while ((c = getc(fin)) == '*')
/*continue*/;
while (c != '*' && c != EOF && ptr < buf+9)
{
*ptr++ = c;
c = getc(fin);
}
*ptr = '\0';
/* skip to next star here? */
}
You will note that I am not using fscanf. That is because fscanf is nearly always more trouble than it's worth. The above is more typing, but I can be confident that it does what I described it as doing.