Retrieving a string with spaces from a file in C

We were given an assignment that involves taking information from a file and storing the data in an array. The data in the file is laid out as follows:
New York 40 43 N 74 01 W
The first 20 characters are the name of the city, followed by the latitude and longitude. Latitude and longitude should be easy with a few
fscanf(infile, "%d(or %c depending on which one i'm getting)", pointer)
operations so they won't be a problem.
My problem is that I do not know how to collect the string for the name of the city, because some of the city names have spaces. I read something about using delimiters, but from what I read, it seems like that is used more for reading an entire line. Is there any way to read the city name from a file and store the entire name, with spaces, in a character array? Thanks.

Here's a hint: With spaces as your only delimiter, how would you tell fscanf() where the city name ends and the latitude starts? You're getting close with your "it seems like that is used more for reading an entire line". Explore that, perhaps with fgets().

scanf() can read a fixed number of characters with the "%c" specifier and a field width, e.g. "%20c".
ATTENTION: It will not add a terminating null byte.
char cityname[21];
scanf("%20c%d%d %c%d%d %c", cityname,
&lat1, &lat2, &lat_letter,
&lon1, &lon2, &lon_letter);
cityname[20] = 0;
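But you're better off using fgets() and parsing the string "manually". Otherwise you're going to have END-OF-LINE issues: %c does not skip white space, so the newline left over from the previous line becomes the first character of the next city name.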

G'day,
As you're scanning up until the first number (the latitude) for your city name, maybe use a scan set that excludes digits, e.g. %[^0123456789], for the first item?

If you have spaces in your city name, you either need to use delimiters or define the city name field to be fixed length. Otherwise trying to parse a three-word city name, e.g. "Salt Lake City", will kill the next field.

Just a hint: read the entire line into memory and then take the first 20 chars for the city name, the next, say, 10 chars for latitude and so on.

You can specify a size for %c, which will collect a block of characters of the specified size. In your case, if the city name is 20 characters long, put %20c in the scanf format string.
Then you have to put a terminator at the end and trim the trailing spaces from the string.

From "man fgets":
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an
EOF or a newline. If a newline is read, it is stored into the buffer.
A '\0' is stored after the last character in the buffer.
fgets() returns s on success, and NULL on error or when end
of file occurs while no characters have been read.
This means that you need a char array of 21 chars to store a 20 char string (the 21st char will be the '\0' terminator at the end, and fgets will put it there automagically).
Good luck!
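One way to put the fgets() advice together, as a minimal sketch. It assumes every line has the city name padded with spaces to exactly 20 columns, followed by the coordinates in the order shown in the question. The struct city type and the parse_city_line() name are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define NAME_WIDTH 20   /* assumed fixed width of the city name column */

struct city {
    char name[NAME_WIDTH + 1];   /* +1 for the '\0' terminator */
    int  lat_deg, lat_min;
    char lat_dir;
    int  lon_deg, lon_min;
    char lon_dir;
};

/* Parses one line already read with fgets().
   Returns 1 on success, 0 if the line does not fit the layout. */
int parse_city_line(const char *line, struct city *out)
{
    if (strlen(line) < NAME_WIDTH)
        return 0;

    /* Copy the fixed-width name column, spaces included. */
    memcpy(out->name, line, NAME_WIDTH);
    out->name[NAME_WIDTH] = '\0';

    /* Trim the trailing padding spaces. */
    size_t end = NAME_WIDTH;
    while (end > 0 && out->name[end - 1] == ' ')
        out->name[--end] = '\0';

    /* The rest of the line holds the six coordinate fields. */
    return sscanf(line + NAME_WIDTH, "%d %d %c %d %d %c",
                  &out->lat_deg, &out->lat_min, &out->lat_dir,
                  &out->lon_deg, &out->lon_min, &out->lon_dir) == 6;
}
```

Each line would come from something like fgets(line, sizeof line, infile), with a buffer comfortably longer than the longest record. A return value of 0 flags a line that does not match the layout.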

Related

What does the format specifier %[^,] mean in C?

I have a code snippet like this:
while( fscanf(b,"%c,%[^,],%[^,],%f",&book.type,book.title,book.author,&book.price)!=EOF)
Read up on the format string part of the fscanf documentation, specifically this part:
matches a non-empty sequence of character from set of characters. If
the first character of the set is ^, then all characters not in the
set are matched.
So, that format specifier matches all characters except the , character (which in your format string is matched afterward). So, if you had a struct like
typedef struct Book_t {
char type;
char title[100];
char author[100];
float price;
}Book ;
and then had a file that had the data in the schema:
BookType,BookTitle,BookAuthor,BookPrice
Then one could read each line into book as
fscanf(b,"%c,%[^,],%[^,],%f",&book.type,book.title,book.author,&book.price)
For a line of the file as:
A,old man and the sea,ernest hemingway,12.5
A would be read into book.type and then all characters not matching a comma would be read, so that would read in till sea and stop since the next character is a ,. This , would be matched with the , in the format string. The same process would repeat for the author field.
Note the caveat that reading an unspecified number of characters until the matching stops before a comma is a bad idea, because the buffer that it's reading into is usually of a fixed length. That's why it's better to specify the maximum width (accounting for the null character) while doing so. Continuing with the above example, this would look something like
fscanf(b,"%c,%99[^,],%99[^,],%f",&book.type,book.title,book.author,&book.price)
The 99 means that at most 99 characters are matched, which avoids a buffer overflow: the title buffer can hold only 100 characters, and one byte is required for the \0 terminator.
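As a small sketch of the bounded version in use, here it is applied to a line that has already been read into memory. The parse_book() name is made up for illustration. It compares the return value with 4 rather than with EOF, because a malformed line makes the scanf family return a short count, not EOF, and a loop that only tests != EOF can then spin forever.

```c
#include <stdio.h>

typedef struct {
    char  type;
    char  title[100];
    char  author[100];
    float price;
} Book;

/* Parses one "type,title,author,price" record.
   Returns 1 only if all four fields were converted, 0 otherwise. */
int parse_book(const char *line, Book *book)
{
    /* %99[^,] reads up to 99 non-comma characters and adds the '\0'. */
    return sscanf(line, "%c,%99[^,],%99[^,],%f",
                  &book->type, book->title, book->author, &book->price) == 4;
}
```

Note that a scan set must match at least one character, so a record with an empty title or author is reported as a failure here.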

Reading tab delimited record using fscanf

Data file:
Newton 30 United Kingdom Scientist
Maxwell 25 United Kingdom Mathematician
Edison 60 United States Engineer
Code to read it:
#define MAX_NAME 50
#define MAX_COUNTRY 25
#define MAX_PROFILE 20
struct person
{
char *name;
int age;
char *country;
char *profile;
};
struct person *pObj = malloc(sizeof *pObj);
pObj->name = malloc(MAX_NAME);
pObj->country = malloc(MAX_COUNTRY);
pObj->profile = malloc(MAX_PROFILE);
fscanf(fPtr,"%s\t%d\t%s\t%s\n",pObj->name,&pObj->age,pObj->country,pObj->profile);
I wrote a program to read a tab-delimited record into a structure using fscanf(). I can do the same thing with the strtok() or strsep() functions. But if I use strtok(), I am forced to use atoi() to load the age field, and I don't want to use atoi(). So I simply used fscanf() to read age as an integer directly from the FILE stream buffer. It works fine. BUT for some records the country field is empty, as shown below.
Newton 30 United Kingdom Scientist
Maxwell 25 Mathematician
Edison 60 United States Engineer
When I read the second record, fscanf() doesn't store an empty string in the country field; instead it fills it with the profile data. We understand fscanf() works that way. But is there any option to scan the country field even though it is empty in the file? Can I do this without using atoi() for age, i.e. reading each field as its respective type rather than reading all the fields as strings?
Original format
The %s conversion specification skips any white space (blanks, tabs, newlines, etc) in the input, and then reads non-white-space up to the next white space character. The \t appearing in the format string causes fscanf() to skip zero or more white space characters (not just tabs).
You have:
fscanf(fPtr,"%s\t%d\t%s\t%s\the n", pObj->name, pObj->age, pObj->country, pObj-profile);
You need to pass a pointer to the age and you need an arrow -> between pObj and profile (please post code that could compile; it doesn't inspire confidence when there are errors like this):
fscanf(fPtr,"%s\t%d\t%s\t%s\the n", pObj->name, &pObj->age, pObj->country, pObj->profile);
Given the first input line:
Newton 30 United Kingdom Scientist
fscanf() will read Newton into pObj->name, 30 into pObj->age, United into pObj->country and Kingdom into pObj->profile. fscanf() and family are very casual about white space, in general. Most conversions skip leading white space.
After the 4 values are assigned, you have \the n" at the end of the format. The tab skips the white space between Kingdom and Scientist, but the data doesn't match he n, so the scanning stops — not that you're any the wiser for that.
The next operation will pick up where this one stopped, so the next pObj->name will be assigned Scientist and then the pObj->age conversion will fail because Maxwell doesn't represent an integer. The conversions stop there on that fscanf().
And so the problems continue. Your claimed output can't be attained with the code you show in the question.
If you're adamant that you must use fscanf(), you'll need to use scan sets such as %24[^\t] to read the country. But you'd do better using fgets() or POSIX function getline() to read whole lines of input, and then perhaps use sscanf() but more likely use strcspn() or strpbrk() from Standard C (or perhaps strtok() or — far better — POSIX strtok_r() or Windows strtok_s(), or non-standard strsep()) to split the line into fields at tabs. Note that strtok_r() et al don't care how many repeats there are of the delimiter (tabs in your case) between the fields; you can't have empty fields with them. You can identify empty fields with strcspn(), strpbrk() and strsep().
Cleaned up format
The format string has been revised to:
fscanf(fPtr,"%s\t%d\t%s\t%s\n", pObj->name, &pObj->age, pObj->country, pObj->profile);
This won't work, but can now be adapted so it will work.
if (fscanf(fPtr," %49[^\t]\t%d\t%24[^\t]\t%19[^\n]", pObj->name, &pObj->age, pObj->country, pObj->profile) != 4)
…handle a format error…
Beware trailing white space in scanf() format strings. The leading blank skips any newline left over from previous lines, and skips any leading white space on a line. The %49[^\t] looks for up to 49 non-tabs; the tab is optional and matches any sequence of white space, but the first character will be a tab unless the name was too long. Then it reads a number, more optional white space (it doesn't have to be a tab, but it will be unless the data is malformatted), then up to 24 non-tabs, white space again (of which the first character will be a tab unless there's a formatting problem), and up to 19 non-newlines. The next character should be a newline, unless there's a formatting problem.
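For the empty-country case, the strcspn() route mentioned above could look roughly like this. It is only a sketch: split_tabs() is a made-up helper name, and it assumes the line was read with fgets() into a writable buffer.

```c
#include <string.h>

/* Splits line in place at each tab, keeping empty fields.
   Stores pointers to the fields in fields[] and returns how many were found
   (at most max_fields; anything after that is dropped). */
int split_tabs(char *line, char *fields[], int max_fields)
{
    int n = 0;
    char *p = line;

    /* Drop the trailing newline that fgets() leaves in the buffer. */
    line[strcspn(line, "\r\n")] = '\0';

    while (n < max_fields) {
        size_t len = strcspn(p, "\t");   /* length up to the next tab or the end */
        fields[n++] = p;
        if (p[len] == '\0')
            break;                       /* that was the last field */
        p[len] = '\0';                   /* terminate this field */
        p += len + 1;                    /* step past the tab */
    }
    return n;
}
```

Two adjacent tabs produce an empty string for the field between them, so a record with a missing country still yields four fields. The age can then be converted with strtol(), which also lets you detect bad input, rather than with atoi().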

Reading the string with defined number of characters from the input

So I am trying to read a defined number of characters from the input. Let's say that I want to read 30 characters and put them in to a string. I managed to do this with a for loop, and I cleaned the buffer as shown below.
for(i=0;i<30;i++){
string[i]=getchar();
}
string[30]='\0';
while(c!='\n'){
c=getchar(); // c is some defined variable type char
}
And this is working for me, but I was wondering if there is another way to do this. I was researching and some of them are using sprintf() for this problem, but I didn't understand that solution. Then I found that you can use scanf with %s. And some of them use %3s when they want to read 3 characters. I tried this myself, but this command only reads the string till the first empty space. This is the code that I used:
scanf("%30s",string);
And when I run my program with this line, if I for example write: "Today is a beatiful day. It is raining, but it's okay i like rain." I thought that the first 30 characters would be saved in the string. But when I try to print this string with puts(string); it only shows "Today".
If I use scanf("%s",string) or gets(string), that could overwrite some parts of my memory if the number of characters on input is greater than 30.
You can use scanf("%30[^\n]",s)
This is a scan set, which lets you choose which characters are accepted. Here the caret '^' denotes negation, i.e. this reads all characters except \n. The 30 limits the read to at most 30 characters, so s needs room for 31 with the terminating null byte. So, there you are.
The API you're looking for is fgets(). The man page describes it as:
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
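A rough sketch of how fgets() could replace both the getchar() loop and the buffer-cleaning loop from the question. The read_line_truncated() name is made up. It keeps at most size - 1 characters of the line and throws the rest away.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in, keeping at most size - 1 characters.
   Any excess characters on the same line are discarded.
   Returns 1 if something was read, 0 on end of file or error. */
int read_line_truncated(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';            /* whole line fit: strip the newline */
    } else {
        int c;                      /* line was longer: skip what is left */
        while ((c = getc(in)) != EOF && c != '\n')
            ;
    }
    return 1;
}
```

For the 30-character case in the question, the call would be read_line_truncated(stdin, string, 31) with string declared as char string[31].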

Adding Strings to a Text File in C

I am making a grocery list program and I want to take a user-input string and put it on the next available line in the text file. Right now this adds the string to the file, but it then puts in random characters for a number of spaces and then, if I input something else, it won't be on the next line.
void AddToFile(FILE *a) {
char addItem[20];
printf("Enter item: ");
scanf("%s", &addItem);
fwrite(addItem, sizeof(addItem), 1, a);
}
This line should be:
// strlen instead of sizeof
fwrite(addItem, sizeof(char), strlen(addItem), a);
Your code always writes 20 characters, instead of the actual number of characters in the string.
Apart from the correction stated by pivotnig, fwrite() does not write a new line character to the file. Either write the new line after the fwrite() or append it to the addItem buffer before the fwrite().
You should prevent buffer overrun by limiting the number of characters read into addItem:
scanf("%19s", addItem);
In the current example, if you enter an item of 20 or more characters you will get into trouble, as scanf will write past the end of the array. A slightly improved version would be:
scanf("%19s", addItem);
In this way, at least you will not overwrite random memory.
Edit: Thanks to Jack for pointing out \0 has to be written also.
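Putting the suggestions from this thread together, a minimal sketch might look like the following. The add_item() name and the extra input stream parameter are my own additions so the function can be tried without a keyboard. The original prompts with printf() and reads from stdin.

```c
#include <stdio.h>

/* Reads one whitespace-delimited item (at most 19 characters) from in
   and writes it to out on its own line.
   Returns 1 on success, 0 if nothing could be read or the write failed. */
int add_item(FILE *in, FILE *out)
{
    char item[20];

    /* %19s leaves room for the '\0' that scanf appends. */
    if (fscanf(in, "%19s", item) != 1)
        return 0;

    /* fprintf writes only the characters of the string, then the newline. */
    return fprintf(out, "%s\n", item) >= 0;
}
```

This assumes the list file was opened in append mode, e.g. fopen(path, "a"), so each item lands after the existing ones. Like the original, %s stops at the first space, so a multi-word item would need fgets() instead.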

using scanf and family to read in two strings separated by a space from a file in c

I am trying to read in two strings that are separated by a space from a file.
Whatever I try I keep getting the 1st string initialized but the second string is always NULL.
Some of the formatters that I have tried are "%s%s" , "%s %s" , "%s[\n\t ]%s"
Any ideas of what I am doing wrong?
I think it has to do with the internal buffer of scanf -- reads the first %s then puts some invisible character in buffer reads that with 2nd %s which then is read and the second string is NULL when complete.
What do your strings look like?
I don't think your hypothesis of fscanf() modifying the input data by placing "some invisible character in buffer" is true.
It seems more likely that your strings don't conform to the requirements of the %s format specifier.
If there is just one space between your strings:
fscanf ( ..., "%s %s", ... ) ; // you know how to fill in the parts marked with ...
But if the amount of white space is not known:
char stack[YourscreenSize];
fscanf ( ..., "%[^\n]", stack ); // read the whole line into one buffer; add a field width of at most YourscreenSize - 1
then parse it.
If the amount of white space is not known (second way):
read the data from the file one character at a time;
pass each character to isgraph() to check whether it is white space;
if it is white space, discard it;
repeat until you see EOF (you can check the return value of fscanf to know whether it read a character or hit EOF).
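For what it's worth, %s already skips any run of leading white space (spaces, tabs, newlines), so two bounded %s conversions are usually enough as long as neither string contains white space itself. A minimal sketch, with read_pair() and WORD_MAX as made-up names:

```c
#include <stdio.h>

#define WORD_MAX 32   /* buffer size; the 31 in the format must stay WORD_MAX - 1 */

/* Reads two whitespace-separated words from in.
   Returns 1 if both were read, 0 otherwise (including end of file). */
int read_pair(FILE *in, char first[WORD_MAX], char second[WORD_MAX])
{
    return fscanf(in, "%31s %31s", first, second) == 2;
}
```

Checking that the return value is 2 tells you immediately whether the second string was actually read, instead of finding out later from an unset buffer.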
