ANSI C - how to read from stdin word by word?

Code:
#include <stdio.h>
int main(void) {
char i[50];
while(scanf("%s ", i)){
printf("You've written: %s \n", i);
}
printf("you have finished writing\n");
return 0;
}
One problem is that the code doesn't do as it is expected to. If I typed in:
abc def ghi.
It would output:
You've written: abc
You've written: def
How can I fix it? The goal is to read every single word from stdin until it reaches "ENTER" or a "." (dot).

@cnicutar is pretty close, but you apparently only want to start reading at something other than whitespace, and want to stop reading a single word when you get to whitespace, so for your scanset you probably want something more like:
while(scanf(" %49[^ \t.\n]%*c", i) == 1) {
In this, the initial space skips across any leading white space. The scan-set then reads until it gets to a space, tab, new-line or period. The %*c then reads (but throws away) the next character (normally the one that stopped the scan).
This can, however, throw away a character when/if you reach the end of the buffer, so you may want to use %c, and supply a character to read into instead. That will let you recover from a single word longer than the buffer you supplied.
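The %c variant described above can be sketched as follows. This is a minimal sketch, not the answerer's code: read_word is a hypothetical helper name, it takes a FILE * so it can be tried on something other than stdin, and it assumes a 50-byte buffer to match the %49 width.

```c
#include <stdio.h>

/* Hypothetical helper: reads one word of at most 49 characters from `in`
 * into `word`, which must hold 50 bytes.
 * Returns  1 if a word was stored and more words may follow,
 *          0 if a word was stored and it ended at '.', newline, or EOF,
 *         -1 if no word could be read. */
int read_word(FILE *in, char word[50])
{
    char stop = '\n';
    int n = fscanf(in, " %49[^ \t.\n]%c", word, &stop);

    if (n < 1)
        return -1;              /* EOF, or the next character was a '.' */
    if (n == 1)
        return 0;               /* the word ran into end of input */
    if (stop == '.' || stop == '\n')
        return 0;               /* end of the sentence or of the line */
    if (stop != ' ' && stop != '\t')
        ungetc(stop, in);       /* word exceeded 49 chars: keep the char */
    return 1;
}
```

Calling read_word(stdin, i) in a loop until it returns something other than 1 prints every word of "abc def ghi." once. Because the leading space in the format also skips newlines, a line that ends in trailing spaces is not detected as finished; handling that case would need extra logic.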

How about:
scanf("%49[^ \n.]", str)
Or something like that.

Ditch scanf altogether and go with fgets:
while (fgets(i, sizeof i, stdin))
{
printf("you've written: %s\n", i);
}
with the following caveats:
If there's room in the target buffer, fgets will store the trailing newline as part of the input;
If you want to stop reading on finding a ., you'll have to add some logic to look for it in the input string, such as the following:
int foundDot = 0;
while (!foundDot && fgets(i, sizeof i, stdin)) // test the flag first so no extra line is consumed
{
char *dot = strchr(i, '.');
char *newline = strchr(i, '\n');
if (dot != NULL)
{
foundDot = 1;
*dot = 0; // overwrite the '.' character with the 0 terminator
}
if (newline != NULL)
{
*newline = 0; // overwrite newline character with 0 terminator
}
/**
* Assuming you don't want to print a blank line if you find a dot
* all by itself.
*/
if (strlen(i) > 0)
printf("you've written: %s\n", i);
}
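Since the goal is individual words, a line obtained with fgets can also be split in memory. A minimal sketch, assuming words are separated by spaces or tabs; split_words is a hypothetical helper and strtok is a technique swapped in here, not part of the answer above:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: cuts `line` at the first '.' or newline, then splits
 * what is left into words in place. Stores at most `max` pointers in
 * `words` and returns how many were stored. */
size_t split_words(char *line, char *words[], size_t max)
{
    size_t n = 0;

    line[strcspn(line, ".\n")] = '\0';      /* stop at '.' or end of line */
    for (char *w = strtok(line, " \t"); w != NULL && n < max;
         w = strtok(NULL, " \t"))
        words[n++] = w;                     /* each w points into line */
    return n;
}
```

The pointers stored in words refer to the original buffer, so they stay valid only until that buffer is overwritten by the next fgets call.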

The easiest way to do this is with flex. Otherwise you are repeating a bunch of difficult, complex work, and are likely to make mistakes.
Also, read lex and yacc, 2nd edition.

Related

C - Ignore spaces in scanf()

I'm trying to do a simple string acquisition. What I need is to read a string from input (stdin), which can contain spaces, and save it without any spaces between words.
So far i've written this simple code which saves everything (also spaces), but i don't know how to make the scanf() ignore the spaces.
int main(){
char str[10];
scanf("%[^\n]s, str);
printf("%s", str;
}
For example:
if my input is: I love C programming! my output should be: IloveCprogramming!
I tried to use %*, used to ignore characters, but without any success.
I also know that I could "rescan" the string once it is saved and remove all the spaces, but I need to do this acquisition as efficiently as possible, and rescanning every string to remove the spaces would increase the computational time a lot (instead of just scanning and ignoring, which has complexity O(n)).
You are using the wrong tool for the job. Use getc instead, as follows:
int ch;
int loop; // declared outside the for so it is still in scope after the loop
char str[10];
// Loop until either loop reaches 9 (need one for null character) or EOF is reached
for (loop = 0; loop < 9 && (ch = getc(stdin)) != EOF; ) {
if (ch != ' ' ) {
str[loop] = ch;
++loop;
}
}
str[loop] = 0;
printf("%s", str);
No re-scan required
scanf() is not useful for your purpose, indeed you do not even need a buffer to strip spaces from a line of input: just read bytes one at a time, ignore the spaces, output the others and stop at newline or EOF:
#include <stdio.h>
int main(void) {
int c;
while ((c = getchar()) != EOF) {
if (c != ' ') {
putchar(c);
}
if (c == '\n') {
break;
}
}
return 0;
}
Note also that your code has problems:
the scanf() format string is unterminated
the trailing s is incorrect, the format is simply %[^\n]
it is safer to specify the maximum number of bytes to store into the array before the null terminator: scanf("%9[^\n]", str);
you should test the return value of scanf() to avoid passing an uninitialized array to printf if the conversion fails, for example on an empty line or an empty file.
You could use scanf() as an inefficient way to read characters while ignoring white space, with char c; while (scanf(" %c", &c) == 1) { putchar(c); } but you would be unable to detect the end of line.
If interested in removing other white space from input (in addition to ' ') you can also incorporate the C library function isspace(), which tests for the following standard white space characters:
' ' (0x20) space (SPC)
'\t' (0x09) horizontal tab (TAB)
'\n' (0x0a) newline (LF)
'\v' (0x0b) vertical tab (VT)
'\f' (0x0c) form feed (FF)
'\r' (0x0d) carriage return (CR)
This example uses the isspace() library function in a helper that removes all standard white space from a C string.
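#include <ctype.h>  // isspace
#include <string.h> // strlen
int clean_whitespace(const char *in, char *out); // defined below main
int main(void)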
{
char string[] = {"this contain's \n whitespace\t"};
int len = strlen(string);
char out[len+1];// +1 for null terminator
//(accommodates when input contains no whitespace)
int count = clean_whitespace(string, out);
return 0;
}
int clean_whitespace(const char *in, char *out)
{
int len, count=0, i;
if((in) && (out))
{
len = strlen(in);
for(i=0;i<len;i++)
{
if(!isspace((unsigned char)in[i])) // cast avoids undefined behavior for negative char values
{
out[count++] = in[i];
}
}
out[count]=0;//add null terminator.
}
return count;
}
So far i've written this simple code which saves everything (also
spaces), but i don't know how to make the scanf() ignore the spaces.
You're coming at this from the opposite direction of what most new C programmers do. The problem is not usually to make scanf skip spaces, as it does that by default for most types of field, and in particular for %s fields. Spaces are ordinarily recognized as field delimiters, so not only are leading spaces skipped, but also spaces are not read inside fields. I presume that it is because you know this that you are using a %[ field.
But you cannot have your cake and eat it too. The field directive %[^\n] says that the data to be read consist of a run of non-newline characters. scanf will faithfully read all such characters and transfer them to the array you designate. You do not have the option to instruct scanf to avoid transferring some of the characters that you told it were part of the field.
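The difference between the two field types can be seen on a string with sscanf. A minimal sketch; scan_word and scan_line are hypothetical helper names and the 32-byte buffer size is an assumption made for the example:

```c
#include <stdio.h>

/* Hypothetical helper: %s skips leading whitespace and stops at the next
 * whitespace character, so only the first word is stored. */
int scan_word(const char *input, char out[32])
{
    return sscanf(input, "%31s", out);
}

/* Hypothetical helper: %[^\n] stores every character up to the newline,
 * spaces included; it cannot be told to drop some of them. */
int scan_line(const char *input, char out[32])
{
    return sscanf(input, "%31[^\n]", out);
}
```

For the input "I love C programming!\n", scan_word stores "I" while scan_line stores the whole sentence with its spaces.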
If you want to continue to use scanf then you have two options:
remove the spaces after you read the data, or
read and transfer the space-separated pieces as separate fields.
Another answer already describes how to do the former. Here's how you might do the latter:
int main(void) {
int field_count;
do {
char str[80];
char tail;
field_count = scanf("%79[^ \n]%c", str, &tail);
if (field_count == 0) {
// No string was scanned this iteration: the first available char
// was a space or newline. Consume it, then proceed appropriately.
field_count = scanf("%c", &tail);
if (field_count != 1 || tail == '\n') {
// newline, end-of-file, or error: break out of the loop
break;
} // else it's a space -- ignore it
} else if (field_count > 0) {
// A string was scanned; print it:
printf("%s", str);
if (field_count == 2) {
// A trailing character was scanned, too; take appropriate action:
if (tail == '\n') {
break;
} else if (tail != ' ') {
putchar(tail);
} // else it is a space; ignore it
}
} // else field_count == EOF
} while (field_count != EOF);
}
Things to note:
The 79-character (maximum) field width in the scanf %79[^ \n] directive. Without a field width, there is a serious risk of overrunning your array bound (which must be at least one character longer than the field to allow for a string terminator).
[ is a field type, not a qualifier. s is a separate field type that also handles strings, but has different behavior; no s field is used here.
scanf's return value tells you how many fields were successfully scanned, which can be fewer than are described in the format string in the event that a mismatch occurs between input and format, or the end of the input is reached, or an I/O error occurs. These possibilities need to be taken into account.
In the event that the second field, %c, is in fact scanned, it allows you to determine whether the preceding string field ended because the field width was exhausted without reaching a space or newline, because a space was observed, or because a newline was observed. Each of these cases requires different handling.
Although scanf skips leading whitespace for most field types, %[ and %c fields are two of the three exceptions.
This approach skips space characters (' ') specifically; it does not skip other whitespace characters such as horizontal and vertical tabs, carriage returns, form feeds, etc. This approach could be adapted to handle those, too, but what is presented is sufficient to demonstrate.
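The note about %c not skipping leading whitespace can be checked in isolation. A minimal sketch using sscanf on a string; both function names are hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: "%c" takes the very next character, whitespace
 * included. Returns '\0' if nothing could be read. */
char first_char_raw(const char *input)
{
    char c = '\0';
    return sscanf(input, "%c", &c) == 1 ? c : '\0';
}

/* Hypothetical helper: a space before %c asks scanf to skip any run of
 * whitespace first. Returns '\0' if only whitespace was available. */
char first_char_skipping(const char *input)
{
    char c = '\0';
    return sscanf(input, " %c", &c) == 1 ? c : '\0';
}
```

The same leading-space trick works in front of %[ directives, which is why several format strings in these answers start with a space.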
I'm posting this to demonstrate that it is also possible to solve this problem just with scanf.
int main() {
char a[10];
for(int i = 0; i < 10 ; i++){
scanf("%c", &a[i]);
if( a[i] == ' ')
i--;
}
}
The code above simply scans 10 characters, skipping the spaces in between.
for(int i = 0; i < 9; i++){
printf("%c,", a[i]);
}
printf("%c", a[9]);
And this is how to print the result if you want something else, for example ',', where the spaces used to be.
If you want the input to consist of more characters, simply define a new variable x and change the 10 into x, and the 9's into x-1.
For completeness, here's a simple version using scanf():
#include <stdio.h>
int main(void)
{
char buff[10];
int r;
r = 1;
scanf("%*[ ]");
while (r == 1) {
r = scanf("%9[^ \n]%*[ ]", buff);
if (r == 1) fputs(buff, stdout);
}
putchar('\n');
return 0;
}
What this does:
scanf("%*[ ]"): skip initial spaces, if any (requested format: non-empty sequence of spaces, without storing it in a variable), ignoring the result.
r = scanf("%9[^ \n]%*[ ]", buff): read two requested formats (explained below) and return number of successful conversions.
%9[^ \n]: requested format: up to 9 characters of contiguous text (stopping at a space or a newline).
%*[ ]: requested format: non-empty sequence of spaces, without storing it in a variable.
if (r == 1) fputs(buff, stdout): check if some text was read (1 successful conversion from scanf()). If it was, output it.
This is executed in a loop until a text slice cannot be read anymore. Optionally, the final \n can be read with getchar().
Example execution:
$ ./scanstring
abcd xyz aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa wwwwwwwwwwwwwwwwwwwwwwwwwwww zzzzz 1234567891011121314 !!!
abcdxyzaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwwwwwwwzzzzz1234567891011121314!!!
scanf() manual: https://man7.org/linux/man-pages/man3/scanf.3.html

Detecting new line in C

My code goes something like this:
char k[1000];
while(1){
scanf("%s",&k);
if(k[0] == '\n'){
exit(0);}
/* Do some processing on k */
memset(k,0,1000);
}
My intention is to process user input per normal and terminate when user inputs empty string or new line. This doesn't seem to work.
Could you guys help on what went wrong?
On related note, I also want to terminate if it is the end of file, how should I do it for EoF?
Thank you in advance for all the help.
First off -- don't use scanf for user input. It is a minefield of subtle issues just waiting to bite new C programmers. Instead, use a line-oriented input function like fgets or POSIX getline. Both read up to (and including) the trailing '\n' every time, as long as the buffer you give fgets is large enough; otherwise fgets returns the line in buffer-sized blocks, one per call, until it encounters the '\n' or EOF.
So to read user input until an empty-string or EOF is encountered, you could simply do something like the following:
#include <stdio.h>
#include <string.h>
#define MAXC 1000
int main (void) {
char k[MAXC] = "";
for (;;) { /* loop until empty-string or EOF */
printf ("input: "); /* prompt for input */
if (fgets (k, MAXC, stdin)) { /* read line (MAXC chars max) */
if (*k == '\n') { /* test for empty-string */
fprintf (stderr, "empty-string! bye.\n");
break;
}
size_t l = strlen (k); /* get length of string */
if (l && k[l - 1] == '\n') /* check if last char is '\n' */
k[--l] = 0; /* overwrite with nul-terminator */
printf ("got input: %s\n", k);
}
else { /* got EOF */
fprintf (stderr, "EOF -- bye.\n");
break;
}
}
return 0;
}
Example Use/Output
>bin\fgets_user_input.exe
input: this
got input: this
input: is some
got input: is some
input: input
got input: input
input:
empty-string! bye.
>bin\fgets_user_input.exe
input: this is more
got input: this is more
input: ^Z
EOF -- bye.
>bin\fgets_user_input_cl.exe
input: it works the same
got input: it works the same
input: compiled by gcc
got input: compiled by gcc
input: or by cl.exe (VS)
got input: or by cl.exe (VS)
input:
empty-string! bye.
(note: for Linux Ctrl+d generates the EOF, I just happened to be on windoze above)
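The remark about fgets handing back an over-long line in buffer-sized blocks can be made visible with a small wrapper. A minimal sketch; read_line is a hypothetical helper, and it takes a FILE * so it can be exercised on a temporary file as well as on stdin:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads with fgets and removes the trailing newline.
 * Returns  1 if a complete line (ending in '\n') was read,
 *          0 if only part of a line fit in buf, or the last line had no
 *            newline before EOF,
 *         -1 on EOF or read error with nothing stored. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strcspn(buf, "\n");    /* index of '\n' or of the '\0' */
    if (buf[len] == '\n') {
        buf[len] = '\0';
        return 1;
    }
    return 0;
}
```

A return value of 0 tells the caller that the rest of the same line is still waiting to be read, which is the case the empty-string test in the program above cannot distinguish on its own.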
As is so often the case, the problem here is inappropriate usage of scanf(). scanf() is not for reading input but for parsing it, and the format string tells it how to parse.
In your case, %s is looking for a sequence of non-whitespace characters (IOW, a word) and it skips any leading whitespace. \n (newline) is just a whitespace character, so it is always skipped -- your scanf() will just wait for more input until it can parse %s.
For more information on scanf() pitfalls, I recommend my beginners' guide away from scanf(). As a rule of thumb, with interactive input (which is the default), scanf() is almost always wrong.
There's another huge problem with scanf("%s", ...): It will happily overflow any buffer you provide it, as long as the input contains non-whitespace characters, just like gets() which was even removed from C for exactly that reason: Buffer overflows are extremely dangerous! Therefore always use a field-width, in your case scanf("%999s", ...). This parses a maximum of 999 characters, leaving one for the necessary 0 byte terminating a string.
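Both points, the skipped newline and the field width, can be checked with sscanf on fixed strings. A minimal sketch; first_word is a hypothetical helper and the 10-byte buffer is chosen only to keep the example short:

```c
#include <stdio.h>

/* Hypothetical helper: parses the first word of `input` into a 10-byte
 * buffer. "%9s" stores at most 9 characters plus the terminating 0 byte. */
int first_word(const char *input, char out[10])
{
    return sscanf(input, "%9s", out);
}
```

Leading newlines are skipped like any other whitespace, an over-long word is cut at 9 characters instead of overflowing, and an input that contains only a newline produces EOF rather than an empty string, which is why the k[0] == '\n' test in the question can never succeed.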
But now for how to do it correctly: There are several functions in C that are indeed for reading input and one of them is for reading a line of input: fgets(). In your code, it would look like this:
char k[1000];
while(fgets(k, 1000, stdin)){
if(k[0] == '\n'){
exit(0);
}
/* Do some processing on k */
memset(k,0,1000);
}
I used your original code here, still some further remarks:
It would be better to define a macro instead of using the magic number 1000, e.g. #define INPUTSIZE 1000 and use this instead, like char k[INPUTSIZE];, fgets(k, INPUTSIZE, stdin) etc.
Clearing the whole array is not needed, so to avoid unnecessary work, replace the memset() with just k[0] = '\0'; or similar. A C string ends at the first 0 byte, so this is enough to make k hold an empty string. If your program does nothing more than shown here, you could even get rid of this completely, as the next fgets() call overwrites the array anyways (or returns NULL on error, which would stop the loop).
Also note that fgets() reads the whole line including the newline character at the end, so keep this in mind when processing the contents of k.
This reads everything up to, but not including, the next newline (or EOF); the field width keeps it from overflowing k:
char k[1000];
scanf("%999[^\n]", k);
When it returns, and provided the line was shorter than 999 characters, the next character is guaranteed to be either a newline or non-existent (EOF reached). Get it like this:
int next_char = getchar();
if (next_char == EOF){
your_eof_process();
}
else if (next_char == '\n'){
your_newline_process();
}
Personally, I would do it using only getchar():
char k[1000];
size_t ind;
int tempc;
for (ind = 0; ind < sizeof k - 1; ind++){ // leave room for the terminator
tempc = getchar();
if (tempc == '\n'){
// Some stuff
break;
}
else if (tempc == EOF){
// Other stuff
break;
}
else {
k[ind] = (char)tempc;
}
}
k[ind] = '\0'; // terminate right after the last stored character

Search whitespace in string in C

problem is when I try to enter a string with space compiler render that as separate 2 strings. But requirement is whenever there is a space in string don't treat it as 2 strings,but rather a single string. The program should print yes only if my four inputs are MAHIRL,CHITRA,DEVI and C. my code is:
#include<stdio.h>
#include<string.h>
int main()
{
char str1[10],str2[10],str3[10],str4[10];
scanf("%s",str1);
scanf("%s",str2);
scanf("%s",str3);
scanf("%s",str4);
if(strcmp(str1,"MAHIRL")==0 && strcmp(str2,"CHITRA")==0 && strcmp(str3,"DEVI")==0 && strcmp(str4,"C")==0 ){
printf("yes");
}
else{
printf("no");
}
return 0;
}
I tried using strtok() and strpbrk(), but I'm not quite sure how to implement them in my code. Any help or suggestion is appreciated. Thanks.
problem is when I try to enter a string with space compiler render that as separate 2 strings
That's not a problem, that's the feature / behaviour of %s format specifier with scanf(). You cannot read space-delimited input using that.
For conversion specifier s, chapter §7.21.6.2, C11
s Matches a sequence of non-white-space characters. [...]
So, the matching ends as soon as it hits a white-space character, here, the space.
If you have to read a line (i.e., input terminated by newline), use fgets() instead.
The %s directive matches characters up to a whitespace before storing them, so it is not possible to get lines of input this way. There are other ways to use scanf() to read lines of input, but these are error-prone, and this is really not the right tool for the job.
Better to use fgets() to fetch a line of input to a buffer, and sscanf() to parse the buffer. Since the requirement here is that four strings are entered, this is a simple problem using this method:
#include <stdio.h>
#include <string.h>
int main(void)
{
char str1[10],str2[10],str3[10],str4[10];
char buffer[100];
if (fgets(buffer, sizeof buffer, stdin) == NULL) {
fprintf(stderr, "Error in fgets()\n");
return 1;
}
if (sscanf(buffer, "%9s%9s%9s%9s", str1, str2, str3, str4) == 4) {
if (strcmp(str1,"MAHIRL") == 0 &&
strcmp(str2,"CHITRA") == 0 &&
strcmp(str3,"DEVI") == 0 &&
strcmp(str4,"C") == 0 ){
printf("yes\n");
} else {
printf("no\n");
}
} else {
printf("Input requires 4 strings\n");
}
return 0;
}
An additional character array is declared, buffer[], with enough extra space to contain extra input; this way, if the user enters some extra characters, it is less likely to interfere with the subsequent behavior of the program. Note that fgets() returns a null pointer if there is an error, so this is checked for; an error message is printed and the program exits if an error is encountered here.
Then sscanf() is used to parse buffer[]. Note here that maximum widths are specified with the %s directives to avoid buffer overflow. The fgets() function stores the newline in buffer[] (if there is room), but using sscanf() in this way avoids needing to further handle this newline character.
Also note that sscanf() returns the number of successful assignments made; if this return value is not 4, the input was not as expected and the values held by str1,..., str4 should not be used.
Update
Looking at this question again, I am not sure that I have actually answered it. At first I thought that you wanted to use scanf() to read a line of input, and extract the strings from this. But you say: "whenever there is a space in string don't treat it as 2 strings", even though none of the test input in your example code contains such spaces.
One option for reading user input containing spaces into a string would be to use a separate call to fgets() for each string. If you store the results directly in str1,...,str4 you will need to remove the newline character kept by fgets(). What may be a better approach would be to store the results in buffer again, and then to use sscanf() to extract the string, this time including spaces. This can be done using the scanset directive:
fgets(buffer, sizeof buffer, stdin);
sscanf(buffer, " %9[^\n]", str1);
The format string here contains a leading space, telling sscanf() to skip over zero or more leading whitespace characters. The %[^\n] directive tells sscanf() to match characters, including spaces, until a newline is encountered, storing them in str1[]. Note that a maximum width of 9 is specified, leaving room for the \0 terminator.
If you want to be able to enter multiple strings, each containing spaces, on the same line of user input, you will need to choose a delimiter. Choosing a comma, this can be accomplished with:
fgets(buffer, sizeof buffer, stdin);
sscanf(buffer, " %9[^,], %9[^,], %9[^,], %9[^,\n]", str1, str2, str3, str4);
Here, there is a leading space as before, to skip over any stray whitespace characters (such as \n characters) that may be in the input. The %[^,] directives tell sscanf() to match characters until a comma is encountered, storing them in the appropriate array (str1[],..., str3[]). The following , tells sscanf() to match one comma and zero or more whitespace characters before the next scanset directive. The final directive is %[^,\n], telling sscanf() to match characters until either a comma or a newline are encountered.
#include <stdio.h>
#include <string.h>
int main(void)
{
char str1[10],str2[10],str3[10],str4[10];
char buffer[100];
if (fgets(buffer, sizeof buffer, stdin) == NULL) {
fprintf(stderr, "Error in fgets()\n");
return 1;
}
/* Each individual string str1,..., str4 may contain spaces */
if (sscanf(buffer, " %9[^,], %9[^,], %9[^,], %9[^,\n]",
str1, str2, str3, str4) == 4) {
if (strcmp(str1,"test 1") == 0 &&
strcmp(str2,"test 2") == 0 &&
strcmp(str3,"test 3") == 0 &&
strcmp(str4,"test 4") == 0 ){
printf("yes\n");
} else {
printf("no\n");
}
} else {
printf("Input requires 4 comma-separated strings\n");
}
return 0;
}
Here is a sample interaction with this final program:
test 1, test 2, test 3, test 4
yes
When reading strings from the user, you can call __fpurge(stdin), declared in stdio_ext.h (a nonstandard extension, not part of standard C), to discard whatever is still buffered in stdin. When you enter a string you finish by pressing Enter, and that newline is also a character in the input; __fpurge is used here to throw it away.
scanf("%s",str1);
__fpurge(stdin);
scanf("%s",str2);
__fpurge(stdin);
scanf("%s",str3);
__fpurge(stdin);
scanf("%s",str4);
__fpurge(stdin);
Also if you want to input a string from user containing spaces, use following:
scanf("%9[^\n]", str1);
This will not ignore the spaces you enter while inputting string.
EDIT: Instead of using fpurge function, one can use following code:
int c; while( (c = getchar()) != '\n' && c != EOF ); /* also stops at end of input */

using fscanf to not read in long blank lines after reading in a sentence

while((fscanf(datafile, " %127[^;] %[^\n]", name, movie)) == 2) {
printf("%s\n", movie);
for(i=0; i<=strlen(movie); i++) {
if(movie[i]!='\0'){
printf("%c", movie[i]);
} else {
printf("44 %d", i);
break;
}
}
printf("%d\n", strlen(movie));
break;
insert_tree(tree, name, movie);
}
I have this code.
fscanf reads in everything after the semicolon, but it also reads in the long run of blank spaces that follows the end of the sentence in the file.
How can I make it stop at just the right point?
Can't. To read a line including spaces, yet stop when the line ends with spaces, requires knowing that the trailing spaces exist without reading them.
Instead, read the line and post-process it.
char buf[127+1+1];
fgets(buf, sizeof buf, stdin);
buf[strcspn(buf, "\n")] = '\0'; // Drop potential \n
// get rid of trailing ' ' characters only; spaces inside the text are kept
size_t len = strlen(buf);
while (len > 0 && buf[len - 1] == ' ') buf[--len] = '\0';
Firstly, never use scanf or fscanf. Read a line from the file and then use sscanf on that. This way your code won't go berserk on input errors and throw away large amounts of data. It'll also generally simplify your format string.
As to the long blank spaces, I'd suggest you just looked for the last non-blank space in the strings and terminated the string there. Sadly you'll have to do this by hand as there's no standard library function.
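Doing this by hand takes only a few lines. A minimal sketch; rtrim is a hypothetical helper name, and it treats every isspace character (not only ' ') as blank, which is an assumption about what the file contains:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: removes trailing whitespace from s in place by
 * moving the terminator back to just after the last non-blank character.
 * Returns s for convenience. */
char *rtrim(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return s;
}
```

Spaces inside the string are left alone, so a multi-word movie title survives intact.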
It'd be helpful to know the format of these lines. If it's just 'name;movie' (and you're guaranteed not to have ; in the name or the movie part) you could use strchr to find the separator.
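The strchr idea can be sketched as below, assuming each line really is 'name;movie' with exactly one separator. split_at_semicolon is a hypothetical helper, and the sample record in the usage note is made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "name;movie" in place at the first ';'.
 * Afterwards `line` holds the name; the return value points at the movie
 * part, or is NULL if the line contains no ';'. */
char *split_at_semicolon(char *line)
{
    char *sep = strchr(line, ';');

    if (sep == NULL)
        return NULL;
    *sep = '\0';        /* terminate the name where the ';' was */
    return sep + 1;     /* movie starts right after the separator */
}
```

For a line such as "Some Actor;Some Movie" this leaves "Some Actor" in line and returns a pointer to "Some Movie"; trailing blanks on the movie part still have to be trimmed separately.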

How to take inputs for strings one by one in C

I have to take inputs like below, and print the same (only the sentences):
2
I can't believe this is a sentence.
aarghhh... i don't see this getting printed.
Digit 2 shows the number of lines to be followed (2 lines after this here).
I used all the options scanf and fgets with various regex used.
int main() {
int t;
char str[200];
scanf ("%d", &t);
while (t > 0){
/*
Tried below three, but not getting appropriate outputs
The output from the printf(), should have been:
I can't believe this is a sentence.
aarghhh... i don't see this getting printed.
*/
scanf ("%[^\n]", str);
//scanf("%200[0-9a-zA-Z ]s", str);
//fgets(str, 200, stdin);
printf ("%s\n", str);
t--;
}
}
I am sorry, i have searched all related posts, but I am not able to find any answer to this:
All versions of scanf() produce no results, and fgets() prints only the first sentence.
Thanks in advance.
You should just use fgets(). Remember that it will keep the linefeed, so you might want to remove that manually after reading the line:
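if(scanf("%d", &t) == 1)
{
int c;
while((c = getchar()) != '\n' && c != EOF)
; /* discard the rest of the line that held the count */
while(t > 0)
{
if(fgets(str, sizeof str, stdin) != NULL)
{
const size_t len = strlen(str);
if(len > 0 && str[len - 1] == '\n')
str[len - 1] = '\0';
printf("You said '%s'\n", str);
--t;
}
else
{
printf("Read failed, weird.\n");
break; /* no more input: stop instead of looping forever */
}
}
}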
To make it easier let's say the input is "2\none\ntwo\n".
When you start your program, before the first scanf() the input buffer has all of it and points to the beginning
2\none\ntwo\n
^
After the first scanf(), the "2" is consumed, leaving the read position at the newline that follows it:
2\none\ntwo\n
 ^^
And now you attempt to read everything but a newline ... but the first thing in the buffer is a newline, so nothing gets read.
Suggestion: always use fgets() to read full lines, and then parse the input as you think is better.
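That suggestion, applied to the count-then-lines input from the question, might look like this. A minimal sketch; read_counted_lines is a hypothetical helper, the 200-byte line size mirrors the question's buffer, and the FILE * parameter is there so the function can be tried on a file as well as on stdin:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads a first line holding a count, then up to that
 * many further lines (capped at max_lines), each with its newline removed.
 * Returns the number of lines stored, or -1 if the count line is missing
 * or is not a number. */
int read_counted_lines(FILE *in, char lines[][200], int max_lines)
{
    char buf[200];
    int count;
    int got = 0;

    /* Read the whole count line, so its newline is consumed with it. */
    if (fgets(buf, sizeof buf, in) == NULL || sscanf(buf, "%d", &count) != 1)
        return -1;
    if (count > max_lines)
        count = max_lines;

    while (got < count && fgets(lines[got], 200, in) != NULL) {
        lines[got][strcspn(lines[got], "\n")] = '\0';   /* drop newline */
        got++;
    }
    return got;
}
```

Because the count is parsed from a complete line, no stray newline is left in the input buffer to be mistaken for an empty first sentence.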
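To use regex in C you must include regex.h. In this case, you do not need regex: scanf() scansets such as %[^\n] are not regular expressions. Where you have "%[^\n]", replace it with " %199[^\n]"; the leading space skips the newline left behind by the previous read, and the width keeps the input within str. Make sure that you include stdio.h.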

Resources