How to write and read (including spaces) from text file - c

I'm using fscanf and fprintf.
I tried to delimit the strings on each line by \t and to read it like so:
fscanf(fp,"%d\t%s\t%s",&t->num,&t->string1,&t->string2);
The file contents:
1[TAB]string1[TAB]some string[NEWLINE]
It does not read properly. If I printf("%d %s %s",t->num,t->string1,t->string2) I get:
1 string1 some
Also I get this compile warning:
warning: format specifies type 'char *' but the argument has type 'char (*)[15]' [-Wformat]
How can I fix this without using binary r/w?

I'm guessing the space in "some string" is the problem. fscanf() reading a string using %s stops at the first whitespace character. To include spaces, use something like:
fscanf(fp, "%d\t%[^\n\t]\t%[^\n\t]", &t->num, &t->string1, &t->string2);
See also a reference page for fscanf() and/or another StackOverflow thread on reading tab-delimited items in C.
[EDIT in response to your edit: You seem to also have a problem with the arguments you're passing into fscanf(). You will need to post the declarations of t->string1 to be sure, but it looks like string1 is an array of characters, and therefore you should remove the & from the fscanf() call...]

The %s conversion specification stops reading at the first white space, and tabs and blanks both count as white space.
If you want to read a string of non-tabs, you can use a 'scan set' conversion specifier:
if (fscanf(fp, "%d\t%[^\t\n]\t%[^\t\n]", &t->num, t->string1, t->string2) != 3)
...oops - format error in input data...
(I'd lay odds that omitting the & from the string arguments is correct.) The question was edited; I win. Dropping the & is necessary to avoid the compiler warning!
This still doesn't quite do what you expect. If there are blanks at the start of the second field, they'll be eaten by the \t in the format string. Any white space in the format string eats any white space (including newlines) in the input. The %[^\t] conversion specification won't get started until there's a character that isn't white space in the input. I'm also assuming you want your input limited by newlines. You can leave out the \n characters if you prefer.
Note that I checked that the fscanf() interpreted 3 fields. It is important to error check your inputs.
If you really want control, you should probably read whole lines with fgets() and then use sscanf() to parse the data.
About fgets() and sscanf(); can you expand about how it will give more control?
Suppose the input data is written
1234
a string with spaces
another string
spread out over multiple lines like that. With raw fscanf(), this is acceptable input even though the three fields are spread over several lines instead of sitting on one. With fgets(), you can read a single line, analyze it with sscanf(), and know that the first line was not in the correct format. You can then decide what to do about it.
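To make the fgets() plus sscanf() idea concrete, here is a minimal sketch. It is not the asker's actual code: the struct layout (two 15-byte arrays) is only inferred from the compiler warning, and the names struct record, parse_record_line and read_record are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical record type; the 15-byte arrays are inferred from the warning. */
struct record {
    int  num;
    char string1[15];
    char string2[15];
};

/* Parses one line that is already in memory.
   Returns 1 if all three fields were found, 0 otherwise. */
int parse_record_line(const char *line, struct record *t)
{
    return sscanf(line, "%d\t%14[^\t\n]\t%14[^\t\n]",
                  &t->num, t->string1, t->string2) == 3;
}

/* Reads exactly one line from fp and parses it.
   Returns 1 on success, 0 on a malformed line, EOF at end of input. */
int read_record(FILE *fp, struct record *t)
{
    char line[256];   /* assumed to be long enough for one record */

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    return parse_record_line(line, t);
}
```

Because sscanf() only ever sees one line, a record that is split across lines (such as the 1234 example above) is reported as malformed instead of being silently accepted.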
Also, since mafso called me on it in his comment, we should ensure that there are no buffer overflows by limiting the size of the strings that the scan sets match.
if (fscanf(fp, "%d\t%14[^\t\n]\t%14[^\t\n]", &t->num, t->string1, t->string2) != 3)
...oops - format error in input data...
I'm using the error message about char (*)[15] to deduce that 14 is the correct number to use. Note that unlike printf(), you can't specify the sizes via * notation (in the scanf()-family, * suppresses assignment), so you have to create the format with the correct sizes. Further, the size you specify is the number of characters before the terminating null byte, so if the array is of size 15, the size you specify in the format string is 14, as shown.
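One common way to keep the width in the format and the array size in sync is to build the format at compile time with preprocessor stringification. This is only a sketch under assumptions: FIELD_MAX, STRINGIFY, TO_STRING, struct entry and parse_entry are names invented here, not anything from the question.

```c
#include <stdio.h>

#define FIELD_MAX 14                 /* characters per field, excluding the '\0' */
#define STRINGIFY(x) #x
#define TO_STRING(x) STRINGIFY(x)    /* TO_STRING(FIELD_MAX) expands to "14" */

struct entry {                       /* hypothetical record type */
    int  num;
    char string1[FIELD_MAX + 1];
    char string2[FIELD_MAX + 1];
};

/* Returns 1 if three fields were converted, 0 otherwise. */
int parse_entry(const char *line, struct entry *t)
{
    /* Adjacent string literals are concatenated, giving
       "%d\t%14[^\t\n]\t%14[^\t\n]" at compile time. */
    return sscanf(line,
                  "%d\t%" TO_STRING(FIELD_MAX) "[^\t\n]"
                  "\t%"   TO_STRING(FIELD_MAX) "[^\t\n]",
                  &t->num, t->string1, t->string2) == 3;
}
```

Note that the width only prevents the overflow. A field longer than 14 characters is cut off and the remainder is picked up by the next conversion, so if overlong fields matter you would still have to detect them yourself.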

Related

Why is this program not printing the input I provided? (C)

Code I have:
int main(){
char readChars[3];
puts("Enter the value of the card please:");
scanf(readChars);
printf(readChars);
printf("done");
}
All I see is:
"done"
after I enter some value to terminal and pressing Enter, why?
Edit:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
The actual problem is that you are passing an uninitialized array as the format to scanf().
Also, you are invoking scanf() the wrong way. Try this:
if (scanf("%2s", readChars) == 1)
printf("%s\n", readChars);
scanf() and printf() both take a format string; that is what the f in their names stands for.
And yes, you are able to call it with just one argument. scanf() scans input according to the format string, and the format string uses conversion specifications that are matched against the input. If you don't specify at least one, scanf() will only be useful for input validation.
The following was extracted from C11 draft
7.21.6.2 The fscanf function
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
An optional assignment-suppressing character *.
An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
An optional length modifier that specifies the size of the receiving object.
A conversion specifier character that specifies the type of conversion to be applied.
As you can read above, to store anything you need at least one conversion specifier, and for each one a corresponding argument that receives the converted value. If you pass a conversion specifier but don't give an argument for it, the behavior is undefined.
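As a small illustration of the corrected call, here is a sketch that uses sscanf() on a fixed string instead of scanf() on stdin, only so that it can be tried without typing input. The function name parse_card is made up.

```c
#include <stdio.h>

/* Reads at most two non-whitespace characters from input into card.
   card must have room for 3 bytes (2 characters plus the '\0').
   Returns 1 if a value was read, 0 otherwise. */
int parse_card(const char *input, char card[3])
{
    return sscanf(input, "%2s", card) == 1;
}
```

With stdin the equivalent check would be if (scanf("%2s", readChars) == 1), as in the answer above.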
Yes, it is possible to call scanf with just one parameter, and it may even be useful on occasion. But it wouldn't do what you apparently thought it would. (It would just expect the characters in the argument in the input stream and skip them.) You didn't notice because you failed to do due diligence as a programmer. I'll list what you should do:
RTFM. scanf's first parameter is a format string. Plain characters which are not part of conversion sequences and are not whitespace are expected literally in the input. They are read and discarded. If they do not appear, conversion stops there, and the position in the input stream where the unexpected character occurred is the start of subsequent reads. In your case probably no character was ever successfully read from the input, but you don't know for sure, because you didn't initialize the format string (see below).
Another interesting detail is scanf's return value, which indicates the number of items successfully read. I'll discuss that below together with the importance of checking return values.
Initialize locals. C doesn't automatically initialize local data for performance reasons (in today's light one would probably enforce user initialization like other languages do, or make auto initialization a default with an opt-out possibility for the few inner loops where it would hurt). Because you didn't initialize readChars, you don't know what's in it, so you don't know what scanf expected in the input stream. On top of that, it probably is nominally undefined behaviour. (But on your PC it shouldn't do anything unexpected.)
Check return values. scanf probably returned 0 in your example. The manual states that scanf returns the number of items successfully read, here 0, i.e. no input conversion took place. This type of undetected failure can be fatal in long sequences of read operations because the following scanfs may read in one-off indexes from a sequence of tokens, or may stall as well (and not update their pointees at all), etc.
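Both points (literal characters in the format, and the return value) can be seen in a tiny sketch. The function name parse_point and the "(x,y)" input format are invented for illustration only.

```c
#include <stdio.h>

/* Expects input of the form "(3,4)". The '(' and ',' in the format are
   ordinary characters that must appear literally in the input.
   Returns sscanf's result: the number of values converted and stored. */
int parse_point(const char *input, int *x, int *y)
{
    return sscanf(input, "(%d,%d)", x, y);
}
```

Here "(3,4)" gives 2, "(3;4)" gives 1 because matching stops at the unexpected ';', and "3,4" gives 0 because the leading '(' is missing. Only a return value of 2 means both variables were actually set.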
Please bear with me -- I do not always read the manual, check return values or (by error) initialize variables for little test programs. But if it doesn't work, it's part of my investigation. And before I ask anybody, let alone the world, I make damn sure that I have done my best to find out what I did wrong, beforehand.
You're not using scanf correctly:
scanf(formatstring, address_of_destination,...)
is the right way to do it.
EDIT:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
No, you should not. Please read documentation on scanf; format is a string specifying what scanf should read, and the ... are the things that scanf should read into.
The first argument to scanf is the format string. What you need is:
scanf("%2s", readChars);
You should provide a format specifier in the scanf call:
char readChars[3];
puts("Enter the value of the card please:");
scanf("%2s", readChars); /* width 2 leaves room for the terminating '\0' */
printf("%s", readChars);
printf("done");
http://www.cplusplus.com/reference/cstdio/scanf/ more info...

scanf("%[^\n]s",a) vs gets(a)

I have been told by most of the experts, and also by users on StackOverflow, that scanf should not be used when the user inputs a string, and that I should go for gets() instead. I never asked on StackOverflow why one should not use scanf over gets for strings. This is not the actual question, but an answer to it is greatly appreciated.
Now coming to the actual question. I came across this type of code -
scanf("%[^\n]s",a);
This reads a string until user inputs a new line character, considering the white spaces also as string.
Is there any problem if I use
scanf("%[^\n]s",a);
instead of gets?
Is gets more optimized than the scanf function? It sounds as if gets is purely dedicated to handling strings. Please let me know about this.
Update
This link helped me to understand it better.
gets(3) is dangerous and should be avoided at all costs. I cannot envision a use where gets(3) is not a security flaw.
scanf(3)'s %s is also dangerous -- you must use the "field width" specifier to indicate the size of the buffer you have allocated. Without the field width, this routine is as dangerous as gets(3):
char name[64];
scanf("%63s", name);
The GNU C library provides the a modifier to %s that allocates the buffer for you. This non-portable extension is probably less difficult to use correctly:
The GNU C library supports a nonstandard extension that
causes the library to dynamically allocate a string of
sufficient size for input strings for the %s and %a[range]
conversion specifiers. To make use of this feature, specify
a as a length modifier (thus %as or %a[range]). The caller
must free(3) the returned string, as in the following
example:
char *p;
int n;
errno = 0;
n = scanf("%a[a-z]", &p);
if (n == 1) {
printf("read: %s\n", p);
free(p);
} else if (errno != 0) {
perror("scanf");
} else {
fprintf(stderr, "No matching characters\n");
}
As shown in the above example, it is only necessary to call
free(3) if the scanf() call successfully read a string.
Firstly, it is not clear what that s is doing in your format string. The %[^\n] part is a self-sufficient format specifier. It is not a modifier for the %s format, as you seem to believe. This means that the "%[^\n]s" format string will be interpreted by scanf as two independent directives: %[^\n] followed by a lone s. This will direct scanf to read everything until \n is encountered (leaving \n unread), and then require that the next input character is s. This just doesn't make any sense. No input will match such a self-contradictory format.
Secondly, what was apparently meant is scanf("%[^\n]", a). This is somewhat close to [no longer available] gets (or fgets), but it is not the same. scanf requires that each format specifier matches at least one input character. scanf will fail and abort if it cannot match any input characters for the requested format specifier. This means that scanf("%[^\n]",a) is not capable of reading empty input lines, i.e. lines that contain the \n character immediately. If you feed such a line into the above scanf, it will return 0 to indicate failure and leave a unchanged. That's very different from how typical line-based input functions work.
(This is a rather surprising and seemingly illogical property of the %[] format. Personally, I'd prefer %[] to be able to match empty sequences and produce empty strings, but that's not how standard scanf works.)
If you want to read the input in line-by-line fashion, fgets is your best option.
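The empty-line difference is easy to see in a short sketch. The function names scan_line and read_line are made up, sscanf() is used in place of scanf() so that the input can be a fixed string, and the 100-byte buffer is an arbitrary assumption.

```c
#include <stdio.h>
#include <string.h>

/* scanf-style: out must hold at least 100 bytes.
   Returns sscanf's result: 1 on success, 0 if the line is empty. */
int scan_line(const char *input, char *out)
{
    return sscanf(input, "%99[^\n]", out);
}

/* fgets-style: returns 1 when a line (possibly empty) was read,
   0 at end of input. The trailing newline is removed. */
int read_line(FILE *fp, char *out, int size)
{
    if (fgets(out, size, fp) == NULL)
        return 0;
    out[strcspn(out, "\n")] = '\0';   /* cut at the newline, if there is one */
    return 1;
}
```

scan_line("hello world\nnext", buf) stores "hello world" and returns 1, while scan_line("\nnext", buf) returns 0 because %[^\n] cannot match an empty sequence. read_line() on the same input simply yields an empty string for the first line.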

How To Read in Strings that only Contain Alphabet letters with fscanf?

I have been struggling to figure out the fscanf formatting. I just want to read in a file of words delimited by spaces. And I want to discard any strings that contain non-alphabetic characters.
char temp_text[100];
while(fscanf(fcorpus, "%101[a-zA-Z]s", temp_text) == 1) {
printf("%s\n", temp_text);
}
I've tried the above code both with and without the 's'. I read in another stackoverflow thread that the s when used like that will be interpreted as a literal 's' and not as a string. Either way - when I include the s and when I do not include the s - I can only get the first word from the file I am reading through to print out.
The %[ scan specifier does not skip leading spaces. Either add a space before it or at the end in place of your s. Also you have your 100 and 101 backwards and thus a serious buffer overflow bug.
The s isn't needed.
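Here is a sketch that reproduces the "only the first word" symptom and the fix. It scans a string with sscanf() and %n instead of a file with fscanf(), purely so it can be run on fixed input; the function name count_alpha_runs and its flag are invented. It also assumes that a-z inside a scan set means the range of letters, which is implementation-defined but works that way on common C libraries.

```c
#include <stdio.h>

/* Counts the runs of letters that a scan loop over text can read.
   With skip_leading_space == 0 the format is "%100[a-zA-Z]", which
   fails as soon as the next character is a space. With a leading
   space in the format, whitespace before each word is skipped. */
int count_alpha_runs(const char *text, int skip_leading_space)
{
    char word[101];   /* width 100 plus the terminating '\0' */
    int used;         /* set by %n: characters consumed by this call */
    int count = 0;

    for (;;) {
        int n = skip_leading_space
              ? sscanf(text, " %100[a-zA-Z]%n", word, &used)
              : sscanf(text, "%100[a-zA-Z]%n", word, &used);
        if (n != 1)
            break;
        ++count;
        text += used;   /* continue after what was just read */
    }
    return count;
}
```

For "hello world foo" the version without the space stops after one word, while the version with the space reads all three. Note that the loop still stops at the first non-letter such as a digit, so discarding words like abc123 needs extra logic (for example reading with %100s and checking each character).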
Here are a few things to try:
Print out the return value from fscanf, and make sure it is 1.
Make sure that the fscanf is consuming the whitespace by using fgetc to get the next character and printing it out.

Using scanf to read in certain amount of characters in C?

I am having trouble accepting input from a text file. My program is supposed to read in a string specified by the user and the length of that string is determined at runtime. It works fine when the user is running the program (manually inputting the values) but when I run my teacher's text file, it runs into an infinite loop.
For this example, it fails when I am taking in 4 characters and his input in his file is "ABCDy". "ABCD" is what I am supposed to be reading in and 'y' is supposed to be used later to know that I should restart the game. Instead when I used scanf to read in "ABCD", it also reads in the 'y'. Is there a way to get around this using scanf, assuming I won't know how long the string should be until runtime?
Normally, you'd use something like "%4c" or "%4s" to read a maximum of 4 characters (the difference is that "%4c" reads the next 4 characters, regardless, while "%4s" skips leading whitespace and stops at a whitespace if there is one).
To specify the length at run-time, however, you have to get a bit trickier since you can't use a string literal with "4" embedded in it. One alternative is to use sprintf to create the string you'll pass to scanf:
char buffer[128];
sprintf(buffer, "%%%dc", max_length);
scanf(buffer, your_string);
I should probably add: with printf you can specify the width or precision of a field dynamically by putting an asterisk (*) in the format string, and passing a variable in the appropriate position to specify the width/precision:
int width = 10;
int precision = 7;
double value = 12.345678910;
printf("%*.*f", width, precision, value);
Given that printf and scanf format strings are quite similar, one might think the same would work with scanf. Unfortunately, this is not the case: with scanf, an asterisk in the conversion specification means the input is scanned but not assigned. That is to say, it is something that must be present in the input, but its value won't be placed in any variable.
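Putting the run-time format idea into a complete function might look like the sketch below. The name scan_prefix is made up, and sscanf() stands in for scanf() so the example works on a fixed string. One detail that is easy to miss: %c does not store a terminating '\0', so the function adds it.

```c
#include <stdio.h>

/* Reads exactly max_length characters from input into out.
   out must have room for max_length + 1 bytes.
   Returns 1 on success, 0 otherwise. */
int scan_prefix(const char *input, char *out, int max_length)
{
    char format[32];

    if (max_length < 1)
        return 0;
    /* Builds e.g. "%4c" when max_length is 4. */
    snprintf(format, sizeof format, "%%%dc", max_length);
    if (sscanf(input, format, out) != 1)
        return 0;
    out[max_length] = '\0';   /* %c does not null-terminate */
    return 1;
}
```

For the input "ABCDy" and a length of 4 this stores "ABCD" and leaves the 'y' unread, which is the behaviour asked for in the question.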
Try
scanf("%4s", str)
You can also use fread, where you can set a read limit:
char string[5]={0};
if( fread(string,(sizeof string)-1,1,stdin) )
printf("\nfully read: %s",string);
else
puts("error");
You might consider simply looping over calls to getc().

fscanf problem with reading in String

I'm reading in a .txt file. I'm using fscanf to get the data as it is formatted.
The line I'm having problems with is this:
result = fscanf(fp, "%s", ap->name);
This is fine until I have a name with a whitespace eg: St Ives
So I use this to read in the white space:
result = fscanf(fp, "%[^\n]s", ap->name);
However, when I try to read in the first name (with no white space) it just doesn't work and messes up the other fscanf.
But when I use [^\n] in a different file I'm working with, it works fine. Not sure what is happening.
If I use fgets in the place of the fscanf above I get "\n" in the variable.
Edit//
Ok, so if I use:
result = fscanf(fp, "%s", ap->name);
result = fscanf(fp, "%[^\n]s", ap->name);
This allows me to read in a string with no white space. But When I get a "name" with whitespace it doesn't work.
One problem with this:
result = fscanf(fp, "%[^\n]s", ap->name);
is that you have an extra s at the end of your format specifier. The entire format specifier should just be %[^\n], which says "read in a string which consists of characters which are not newlines". The extra s is not part of the format specifier, so it's interpreted as a literal: "read the next character from the input; if it's an "s", continue, otherwise fail."
The extra s doesn't actually hurt you, though. You know exactly what the next character of input is: a newline. It doesn't match, and input processing stops there, but it doesn't really matter since it's the end of your format specifier. This would cause problems, though, if you had other format specifiers after this one in the same format string.
The real problem is that you're not consuming the newline: you're only reading in all of the characters up to the newline, but not the newline itself. To fix that, you should do this:
result = fscanf(fp, "%[^\n]%*c", ap->name);
The %*c specifier says to read in a character (c), but don't assign it to any variable (*). If you omitted the *, you would have to pass fscanf() another parameter containing a pointer to a character (a char*), where it would then store the resulting character that it read in.
You could also use %[^\n]\n, but that would also read in any whitespace which followed the newline, which may not be what you want. When fscanf finds whitespace in its format specifier (a space, newline, or tab), it consumes as much whitespace as it can (i.e. you can think of it consuming the longest string that matches the regular expression [ \t\n]*).
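The difference between a trailing \n and %*c shows up when the next line starts with spaces. A minimal sketch, with made-up function names and 64-byte buffers assumed for both outputs:

```c
#include <stdio.h>

/* Reads two lines. The "\n" directive skips ALL whitespace after the
   first line, including leading blanks of the second line. */
int two_lines_ws(const char *input, char *first, char *second)
{
    return sscanf(input, "%63[^\n]\n%63[^\n]", first, second);
}

/* Same, but %*c consumes exactly one character (the newline). */
int two_lines_exact(const char *input, char *first, char *second)
{
    return sscanf(input, "%63[^\n]%*c%63[^\n]", first, second);
}
```

For the input "St Ives\n  indented", the first version stores "indented" as the second line, while the second version keeps the two leading spaces.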
Finally, you should also specify a maximum length to avoid buffer overruns. You can do this by placing the buffer length in between the % and the [. For example, if ap->name is a buffer of 256 characters, you should do this:
result = fscanf(fp, "%255[^\n]%*c", ap->name);
This works great for statically allocated arrays; unfortunately, if the array is dynamically sized at runtime, there's no easy way to pass the buffer size to fscanf. You'll have to create the format string with snprintf, e.g.:
char format[256];
snprintf(format, sizeof(format), "%%%d[^\n]%%*c", buffer_size - 1);
result = fscanf(fp, format, ap->name);
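Wrapped into a function, the run-time format could look roughly like this. It is a sketch only: the name read_field and the size check are my own additions.

```c
#include <stdio.h>

/* Reads one non-empty line (without its newline) into buffer, which
   holds buffer_size bytes. Returns 1 on success, 0 on failure or at
   end of input. */
int read_field(FILE *fp, char *buffer, size_t buffer_size)
{
    char format[32];

    if (buffer_size < 2)
        return 0;   /* need room for one character plus the '\0' */
    /* Builds e.g. "%63[^\n]%*c" for a 64-byte buffer. */
    snprintf(format, sizeof format, "%%%zu[^\n]%%*c", buffer_size - 1);
    return fscanf(fp, format, buffer) == 1;
}
```

Keep in mind that an empty line still makes %[^\n] fail, in which case the newline is left unread, and that a line longer than the buffer is cut off with %*c swallowing one ordinary character instead of the newline.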
Jumm wrote:
If I use fgets in the place of the fscanf above I get "\n" in the variable.
Which is a far easier problem to solve so go with it:
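char *nlptr ;
if( fgets( ap->name, MAX, fp ) != NULL )
{
    nlptr = strrchr( ap->name, '\n' ) ;  /* find the trailing newline, if any */
    if( nlptr != NULL )
    {
        *nlptr = '\0' ;                  /* overwrite it with the terminator */
    }
}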
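A small variation on the same idea, as a sketch: a helper that strips the newline and also reports whether one was there. The name strip_newline is made up. The return value is useful right after fgets, because a missing newline means either that the line was longer than the buffer or that the file ended without a final newline.

```c
#include <string.h>

/* Removes a trailing newline from s.
   Returns 1 if a newline was removed, 0 if s did not end with one. */
int strip_newline(char *s)
{
    size_t len = strlen(s);

    if (len > 0 && s[len - 1] == '\n') {
        s[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

After a successful fgets( ap->name, MAX, fp ), calling strip_newline( ap->name ) leaves "St Ives" in the buffer for the line "St Ives\n".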
I'm not sure how you think [^\n] is supposed to work. %[...] is a conversion specifier which says "accept a run of characters that are all in the set inside the brackets". A leading ^ inverts the set, so %[^\n] accepts everything up to, but not including, a newline. %s with fscanf only reads until it comes across whitespace. For strings with spaces and newlines in them, use a combination of fgets and sscanf instead, and specify a restriction on the length.
As I gather, you are trying to use a regular expression in the fscanf format. fscanf has no regular expression support, at least not to my knowledge, nor have I seen it anywhere. Enlighten me on this.
The format specifier for reading a string is %s, it could be that you need to do it this way, %s\n which will pick up the newline.
But for pete's sake do not use the old standard gets family of functions, as specified by Clifford's answer above. That is where buffer overflows happen, and gets was exploited by an infamous worm of the late 1980s, the Morris Worm, more specifically through the fingerd daemon, which called gets and caused chaos. Fortunately, that has since been patched. Furthermore, a lot of programmers have been drilled into the mentality not to use the function.
Even Microsoft has adopted a safe version of gets family of functions, that specifies a parameter to indicate the length of buffer instead.
EDIT
My bad - I did not realize that Clifford indeed has specified the max length for input...Whoops! Sorry! Clifford's answer is correct! So +1 to Clifford's answer.
Thanks Neil for pointing out my error...
Hope this helps,
Best regards,
Tom.
I found the problem.
As Paul Tomblin said, I had an extra new line character in the field above. So using what tommieb75 said I used:
result = fscanf(fp, "%s\n", ap->code);
result = fscanf(fp, "%[^\n]s", ap->name);
And this fixed it!
Thanks for your help.

Resources