Problem with reading string from a file in C

I've tried to look for a similar problem but was unable to find it so I'm posting this. Here's the thing.
Let's say I have a file named text.txt. Now, the file consists of 3 integers and string, something like this:
4 59 32 This is sentence 1
5 9 130 Grass is green
3 12 149 I need help
I'm still learning C so I'm sorry if this is some easy type question etc. Here's the problem. I don't know how to read this. The same is if the string is at the beginning of the file like this
This is sentence 1 4 59 32
Grass is green 5 9 130
I need help 3 12 149
I know how to read it if I know how many words it will contain (for example, if the file looked like Name Surname Number Number Number), but when I need to read an entire sentence of unknown length, I have no idea.
Here's the code from the comments. However, as @john pointed out, it's false right from the start since i gets first character and then I do scanf (still, I tried it, and with only numbers involved fscanf gets the correct values even though the first character has already been read).
I was also thinking about some while loop with isalpha() and isspace() involved but to no avail.
while((i = fgetc(input)) != (int)(EOF))
{
fscanf(input, "%d %d %d", &num1, &num2, &num3);
j=0;
while(i != (int)('\n'))
{
string[j++]=(char)i;
i = fgetc(input);
}
string[j] = '\0';
printf("%d %d %d %s\n", num1, num2, num3, string);
}

it's false right from the start since i gets first character and then I do scanf
You're right, it's wrong. There's no need to consume a line's first character just to test for EOF - we can leave that to fscanf().
There's also no need to read the string's characters one by one: fscanf() can read a string ending at \n with the conversion specification %[^\n].
Since the length of the input string is not limited, we have to prevent a buffer overflow when the input needs more memory than string provides. We do that by adding a maximum field width. If you defined e.g. char string[100];, the maximum field width is 99, since the last char is needed for the terminating null character.
So we could write e.g.
while (fscanf(input, "%d %d %d %99[^\n]", &num1, &num2, &num3, string) == 4)
printf("%d %d %d %s\n", num1, num2, num3, string);

Related

Program To Check If A Number Is Present In An Array

I wrote the C code below to check if a number is present in an array whose elements are input by the user. But weirdly it's skipping the third printf statement, directly taking the input and printing Enter the number you wish to look for after taking that input. What is causing this? I've included the input and output below the code.
CODE:
#include <stdio.h>
#include <stdlib.h>
void main() {
int arr[30], size, i, num, flag=0;
printf("Enter size of array. \n");
scanf("%d",&size);
printf("Enter %d array elements one by one. \n",size);
for (i=0; i<size; i++) {
scanf("%d \n",&arr[i]);
}
printf("Enter the number you wish to look for. \n");
scanf("%d",&num);
for(i=0;i<size;i++) {
if (num == arr[i]) {
flag++;
}
}
if (flag>0) {
printf("The number %d is present in the array.",num);
} else {
printf("The number %d is not present in the array.",num);
}
}
INPUT/OUTPUT:
Enter size of array.
5
Enter 5 array elements one by one.
1
2
3
4
5
5
Enter the number you wish to look for.
The number 5 is present in the array.
You can see that Enter the number you wish to look for. should come before 5, but it is not so.
Solved
Simply fixed by removing \n from scanf.
In a scanf format string, a whitespace character already matches any amount of whitespace in the input. So with "%d \n", the function consumes the newline right after the last digit and then keeps reading until it sees a non-whitespace character.
This makes the program wait for yet another line. Once that line is entered, the program continues and asks for the number to search for, but by that point the number has already been typed.
Use at most one space in the format, ideally before the %d itself, so that no extra line is needed to complete the operation:
scanf(" %d", arr + i);
The space in the input format string "%d \n" tells the input system to
consume... all available consecutive whitespace characters from the input
(described here)
So when you enter your last number 5, the system now tries to consume all whitespace characters. To do that, it waits for additional input, until it's not a whitespace. So, paradoxically or not, to consume spaces, the system has to read a non-space, which is the second 5 you input.
To fix this behavior, you can tell your system to input only a number, without consuming whitespace:
scanf("%d",&arr[i]);
However, this will leave the whitespace in the buffer, which may interfere with later input. To discard the whitespace, you can use various techniques, described e.g. here.
In my opinion, the most correct technique (however, maybe the most cryptic one) is
scanf("%d%*[^\n]%*c",&arr[i]);
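%d - read the number
%*[^\n] - read a run of characters up to (but not including) the newline; discard it and don't store it anywhere. This conversion needs at least one character to match, so if the newline directly follows the number, it fails and scanf returns without executing %*c
%*c - read a byte (which is a newline byte); discard it and don't store it anywhere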
BTW in your format string "%d \n", there are two whitespaces: a regular space and an end-of-line. They both tell scanf to consume all whitespaces in input. The effect is exactly the same as with one space "%d " or with one end-of-line "%d\n", so this particular format string may be highly confusing to whoever reads your code (including yourself).

why does fscanf read garbage values

I have a simple file as below:
1
3
a 7
and when I run the code below, I get some unexpected result. I initially try to read the first two integers and then read the character a and number 7. There is no white space after the number 1 or 3.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
main(int argc, char **argv)
{
FILE *f;
f = fopen(argv[1],"r");
int num1, num2, num3;
char t;
fscanf(f, "%d",&num1);
fscanf(f, "%d",&num2);
fscanf(f, "%c %d", &t, &num3);
printf("%c %d\n", t, num3);
}
EDIT:
Input is the file with the content:
1
3
a 7
and the output is a new line and some garbage. Expected output should be a 7
EDIT 2: it reads 1 and 3 correctly. Then, when trying to read the single character a, it fails
Look at what happens when you run this:
fscanf(f, "%d",&num1);
skips whitespace (of which there is none), then reads an integer (1)
fscanf(f, "%d",&num2);
skips whitespace (the newline at the end of the first line) then reads an integer (3)
fscanf(f, "%c %d", &t, &num3);
reads the next character from the input (a newline), then skips whitespace (none) and tries to read an integer. The next input character is 'a', so this fails, and the fscanf call returns 1 without writing anything into num3.
So the problem you are having is that, because %c does NOT skip whitespace, you are reading the whitespace (newline) instead of the character you expect. The most obvious solution is to add a space (" ") to the format before %c to skip whitespace:
fscanf(f, " %c%d", &t, &num3);
Note that I've also removed the space before %d, as it is redundant (%d always skips whitespace).
In addition, it is always a good idea to check the return value of fscanf to make sure it is reading the number of input items you expect.

Regarding about file input and output in C

I normally don't ask questions on here unless I'm really stuck! I was wondering if anyone can please explain why my code prints out a '5 47'. I understand why there is a 5, but not why there is a 47? I looked up the ASCII values for blankspace (32) and I tried changing the second letter to e, f, g, for example but the output remains '5 47' unchanged.
In general, when I use fscanf(fp, "%d", &variablename), does the fscanf skip over miscellaneous characters? For example: in my file test.txt I had "5 hello 6 ben jerry\n". How would I scan in the 5 and the 6? Would fscanf(fp, "%d %d", &test1, &test2) scan in the 5 and 6, skipping over the word "hello"?
Here is my simple code I am using to test output:
int main(int argc, char *argv[]) {
int blah, test;
FILE * fp;
fp = fopen(argv[1], "r");
fscanf(fp, "%d %d", &blah, &test);
printf("%d %d\n", blah, test);
return 0;
}
My file I am using as argv[1] contents:
5g
P.S. is FILE *fp an actual pointer to each character/number and does it work as a placeholder when it scans through the file? Is that why we need rewind(fp) once it hits the end of the file?
The conversion specifier %d looks for an integer, not a character. Because g is not part of an integer, the second %d fails to match, and the output will not always be 5 47. The 47 could be anything: you might see 5 7, 5 23, etc. This is because fscanf does not read a second number, so no value is assigned to test. test is never initialized, so printing it shows whatever value happened to be in that piece of memory.
To fix this, replace %d with %c and change the type of blah and test to char. Also, as WhozCraig said, it is good practice to check the return value of fscanf to confirm that two values were read. This way, you can be sure that everything you're looking for has been found.
Note that the scanf() family of functions stop reading when they come across a character that is not expected by the format string. The unexpected character is left in the input for the next input operation to process.
If you want to read two integers that are definitely separated by a 'word' that is not an integer, then you will need to skip the word. If you don't know in advance what the word will be, you need to use assignment suppression (see the POSIX scanf() page for lots of information).
Hence, your code to read the two integers from input containing
5 hello 6 ben jerry
should be:
if (fscanf(fp, "%d %*s %d", &blah, &test) != 2)
…Oops; format error?…
Note that the code tests that it got the expected result. However, if you don't know whether there'll be a word between the two numbers, you are much better off using fgets() and sscanf() because you can try different parses of the same line:
char buffer[4096];
while (fgets(buffer, sizeof(buffer), fp) != 0)
{
if (sscanf(buffer, "%d %*s %d", &blah, &test) == 2)
…got two numbers with a word — let's go!
else if (sscanf(buffer, "%d %d", &blah, &test) == 2)
…got two numbers but no word — let's go!
else
…didn't recognize the format…
}
One of the major advantages of this is that you can report the error in terms of the complete line of input, rather than just the part that fscanf() didn't manage to work on.
As for your last question: a FILE * is not a pointer to each character in the file. It is a handle which allows you to invoke functions that take a file pointer argument to read from or write to the associated file. However, you can't use indexing based on the file pointer (so fp[1024] does not identify the character at offset 1024 in the file or anything useful like that). If you want that sort of behaviour, you need a memory-mapped file (mmap() for POSIX systems).

fscanf in C - how to determine comma?

I am reading a set of numbers from a file with fscanf(), and I want to put each number into an array. The problem is that the numbers are separated by ",". How do I tell fscanf to read several digits and, when it finds a "," in the file, save them as one whole number? Thanks
This could be a start:
#include <stdio.h>
int main() {
int i = 0;
FILE *fin = fopen("test.txt", "r");
while (fscanf(fin, "%i,", &i) > 0)
printf("%i\n", i);
fclose(fin);
return 0;
}
With this input file:
1,2,3,4,5,6,
7,8,9,10,11,12,13,
...the output is this:
1
2
3
4
5
6
7
8
9
10
11
12
13
What exactly do you want to do?
I'd probably use something like:
while (fscanf(file, "%d%*[, \t\n]", &numbers[i++]))
;
The %d converts a number, and the "%*[, \t\n]" reads (but does not assign) any consecutive run of separators -- which I've defined as commas, spaces, tabs, newlines, though that's fairly trivial to change to whatever you see fit.
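fscanf(file, "%d,%d,%d,%d", &n1, &n2, &n3, &n4); reads exactly four numbers, but it won't work if there are spaces before the commas (spaces after a comma are fine, because %d skips leading whitespace). This answer shows how to do it (since there aren't library functions for this)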
Jerry Coffin's answer is nice, though there are a couple of caveats to watch for:
fscanf returns EOF (a negative value) at the end of the file. That value is nonzero, so the loop won't terminate properly.
i is incremented even when nothing was read, so it will end up pointing one past the end of the data.
Also, fscanf skips all whitespace (including \t and \n) if you leave a space between format parameters.
I'd go for something like this.
int numbers[5000];
int i=0;
while (i < (int)(sizeof numbers / sizeof numbers[0]) && fscanf(file, "%d %*[,] ", &numbers[i]) > 0)
{
i++;
}
printf("%d numbers were read.\n", i);
Or if you want to enforce there being a comma between the numbers you can replace the format string with "%d , ".

ignoring whitespace with sscanf in C

I have a string such as "4 Tom Tim 6", and I am trying to scan those values with sscanf like this
sscanf(string, "%d %s %d", &NUMBER1, NAME, &NUMBER2 )
is there any way to do this and deposit in NUMBER1 the value 4, in NUMBER2 the value 6, and in NAME the value "Tom Tim"?
I tried, but sscanf splits "Tom" and "Tim" because there is whitespace between them, and thus it also returns an incorrect value for NUMBER2.
Update:
Let me be more specific. There will always be a number at the beginning and at the end of my string, and a substring between those numbers, which could have any length and any amount of whitespace. What I'm trying to get is that substring in a single variable, and the numbers at the beginning and the end.
You read it in as
sscanf(string, "%d %s %s %d", &NUMBER1, NAME, SECONDNAME, &NUMBER2);
then concatenate them
strcat(NAME," "); // Add space
strcat(NAME,SECONDNAME); // Add second name
Make sure that NAME has enough space to hold both the first and second name. You will also have to:
#include <string.h>
In order to come up with the solution (and tell whether it is even possible with sscanf), you need to provide more information about the format of your string. It is not possible to derive anything conclusive from a single example you provided so far.
In your particular case one needs to know where the name ends and the next number begins. How do you define that in your case? Are we supposed to assume that the first decimal digit character means the end of the name and the beginning of the number2? Or is it something more complicated? If the input string contains a "Tom16" sequence, is the entire "Tom16" supposed to be the name, or should we split it into "Tom" and leave 16 for number2?
Basically, your question, as stated, allows for no meaningful answer, only for random suggestions.
Update: Your description of the format of the string is still far from being complete, but I can suggest using the following format specifier in sscanf
sscanf(string, "%d %[^0123456789]%d", &number1, name, &number2)
This will work, assuming that the "numbers" you are referring to are composed of decimal digits only and that name cannot contain any decimal digits. Also note that it will not include the leading space in the name, but it will include the trailing space. If you don't want that, you'll have to trim the trailing space from the name yourself.
In any case, the parsing capabilities of sscanf are rather limited. They are normally inadequate for solving problems like yours. What I have above is probably the best you can get out of sscanf. If you need something even a little more elaborate, you'll have to parse your string manually, token by token, instead of trying to parse the whole thing in one shot with sscanf.
No, not with sscanf().
You can do it 'easily' with fgets(), and parsing the line character by character
/* basic incomplete version; no error checking; name ends with whitespace */
#include <ctype.h>
#include <stdio.h>
int num1, num2;
char name[250], line[8192], *p, *q;
fgets(line, sizeof line, stdin);
num1 = num2 = 0;
p = line;
q = name; /* name is an array and cannot be incremented, so write through q */
while (isdigit((unsigned char)*p)) {num1 = num1*10 + *p - '0'; p++;}
while (isspace((unsigned char)*p)) p++;
while (*p && !isdigit((unsigned char)*p)) *q++ = *p++; /* stop at end of string too */
*q = '\0';
while (isdigit((unsigned char)*p)) {num2 = num2*10 + *p - '0'; p++;}
You can't do this work with the sscanf function and a "central" string with an arbitrary number of spaces in it, since the whitespace is also your delimiter for the next field; if that %s matched strings with whitespace, it would "eat" also the 6.
If it's only your "central" field that is "special" and you have only those three fields, you should read your string backwards to find the beginning of the third field and convert it to a number; then you replace the character before the 6 with a \0, thus truncating the string before the third field.
Then you can use strtoul to convert the first field and to determine where it ends (using its second parameter); considering the string that starts from there and goes to the end of the truncated string you get the second field.
@AndreyT is pretty much correct. I'm going to guess that the middle field should stop at any digit. If that's the case, then yes sscanf can do the job:
sscanf(string, "%d %[^0-9] %d", &NUMBER1, NAME, &number2);
You really want to limit the amount that's read to the length of your buffer though:
char name[32];
sscanf(string, "%d %31[^0-9] %d", &number1, name, &number2);
I should add that technically this isn't portable as-is. To be entirely portable, you should use [^0123456789] instead of [^0-9]. Old versions of Borland compilers actually treated "0-9" as meaning the three characters '0', '-' and '9'. The standard permits this, though I don't know of any current compiler that takes its permission to be stupid.
You could :
sscanf(string, "%d %s %s %d", &NUMBER1, NAME1 , NAME2, &number2 );
strcpy(NAME , NAME1); /* copy first; strcat would need NAME to already hold a string */
strcat(NAME , " ");
strcat(NAME , NAME2);
But this would result in undefined behaviour, if NAME is not big enough.
I can think of a couple of ways:
1) If you always know the size of the "Tom Tim" field, use the %c format with a length specifier:
int num1;
int num2;
char name[8];
sscanf(string, "%d %7c %d", &num1, name, &num2);
name[7] = '\0';
Note that name needs to be large enough to hold the characters read and that it won't be null terminated so that has to be done manually.
2) If you know there are always two fields, use two string specifiers and strncat() them together:
char name1[40];
char name2[20];
int num1;
int num2;
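sscanf(string, "%d %19s %19s %d", &num1, name1, name2, &num2); /* widths keep both words within name2's size */
strcat(name1, " "); /* restore the space between the two words */
strncat(name1, name2, sizeof(name1) - strlen(name1) - 1);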
You could also parse the string using strtok_r(). I'll leave that as an exercise for the reader.
