String delimiter using C - c

I've started learning how to use strings, but I'm a little bit confused about the whole concept. I'm trying to read word by word from a file that contains strings.
Here is the file:
Row, row, row your boat,
Gently down the stream.
Merrily, merrily, merrily, merrily,
Life is but a dream.
My approach was to use
char hold[25];
// Statement
while(fscanf(fpRow, "%s", hold) != EOF)
printf("%s %d\n", hold, strlen(hold));
So my task is to read each string and exclude all the , and . in the file. To do so, the approach would be to use %[^,.] instead of %s, correct? But when I tried this approach, it only reads the first word of the file and the loop never exits. Can someone explain what I'm doing wrong? Plus, if it's not too much to ask, what's the difference between fscanf and fgets? Thanks
while(fscanf(fpRow, "%24[^,.\n ]", hold) != EOF)
{
fscanf(fpRow, "%*c", hold);
printf("%s %d\n", hold, strlen(hold));
}

Yes, %[^,. ] should work -- but keep in mind that when you do that, it will stop reading when it encounters any of those characters. You then need to read that character from the input buffer, before trying to read another word.
Also note that when you use either %s or %[...], you want to specify the length of the buffer, or you end up with something essentially like gets, where the wrong input from the user can/will cause buffer overflow.
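As a rough sketch of both points, the loop below first discards any run of delimiters and then reads one word with a width limit that matches the 25-byte buffer from the question. The function name print_words and the choice of delimiters (comma, period, space, tab, newline) are assumptions for illustration, not part of the original code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each word of fp with its length and
   returns the number of words read. */
int print_words(FILE *fp)
{
    char hold[25];
    int count = 0;

    for (;;) {
        /* Discard a run of delimiters; '*' suppresses assignment.
           EOF here means nothing is left to read. */
        if (fscanf(fp, "%*[,. \t\n]") == EOF)
            break;
        /* Read at most 24 non-delimiter characters plus the '\0'. */
        if (fscanf(fp, "%24[^,. \t\n]", hold) != 1)
            break;
        printf("%s %zu\n", hold, strlen(hold));
        count++;
    }
    return count;
}
```

With this layout, a word longer than 24 characters is split across iterations rather than overflowing hold.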

Related

Stack Smashing and using malloc

I'm making a program that counts the number of words contained within a file. My code works for certain test cases with files that have less than a certain amount of words/characters...But when I test it with, let's say, a word like:
"loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong", (this is not random--this is an actual test case I'm required to check), it gives me this error:
*** stack smashing detected ***: ./wcount.out terminated
Abort (core dumped)
I know what the error means and that I have to implement some sort of malloc line of code to be able to allocate the right amount of memory, but I can't figure out where in my function to put it or how to do it:
int NumberOfWords(char* argv[1]) {
FILE* inFile = NULL;
char temp_word[20]; <----------------------I think this is the problem
int num_words_in_file;
int words_read = 0;
inFile = fopen(argv[1], "r");
while (!feof(inFile)) {
fscanf(inFile, "%s", temp_word);
words_read++;
}
num_words_in_file = words_read;
printf("There are %d word(s).\n", num_words_in_file - 1);
fclose(inFile);
return num_words_in_file;
}
As you've correctly identified by rendering your source code invalid (future tip: /* put your arrows in comments */), the problem is that temp_word only has enough room for 20 characters (one of which must be a terminal null character).
In addition, you should check the return value of fopen. I'll leave that as an exercise for you. I've answered this question in other questions (such as this one), but I don't think just shoving code into your face will help you.
In this case, I think it may pay to better analyse the problem you have, to see if you actually need to store words to count them. As we define a word (the kind read by scanf("%s", ...)) as a sequence of non-whitespace characters followed by a sequence of (zero or more) whitespace characters, we can see that a counting program such as yours needs to follow this procedure:
Read as much whitespace as possible
Read as much non-whitespace as possible
Increment the "word" counter if all was successful
You don't need to store the non-whitespace any more than you do the whitespace, because once you've read it you'll never revisit it. Thus you could write this as two loops embedded into one: one loop which reads as much whitespace as possible, another which reads non-whitespace, followed by your incrementation and then the outer loop repeats the whole lot... until EOF is reached...
This will be best achieved using the %*s directive, which tells scanf-related functions not to try to store the word. For example:
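size_t word_count = 0;
/* "%*s" skips white space, then reads and discards one word. fscanf
   returns EOF only when no word is left or a read error occurs, so the
   last word is counted even without a trailing newline. */
while (fscanf(inFile, "%*s") != EOF) {
    ++word_count;
}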
You are limited by the size of your array. A simple solution would be to increase the size of your array. But you are always susceptible to stack smashing if someone enters a long word.
A word is delimited by spaces.
You can simply keep a counter variable initialized to zero, and an int variable that holds the character you are currently looking at. Every time you read a character with temp = fgetc(inFile) that is a space, you increment the counter. (fgetc takes only the stream as its argument and returns the character as an int.)
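A minimal sketch of this character-by-character approach, assuming that any white space (not only a space) separates words and that a run of white space counts once. The name count_words is hypothetical. Instead of counting spaces directly, it counts each transition from white space into a word, so no word buffer is needed at all.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts white-space-separated words in the stream. */
size_t count_words(FILE *in)
{
    size_t words = 0;
    int in_word = 0;   /* 1 while the current character is inside a word */
    int c;             /* int, so that EOF can be told apart from any char */

    while ((c = fgetc(in)) != EOF) {
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Because nothing is stored, a word of any length is handled without risk of overflow.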
In your current code you simply want to count the words. Therefore you are not interested in the words themselves. You can suppress the assignment with the optional * character:
fscanf(inFile, "%*s");

How to read numbers from a text file properly?

I would like to write a lottery program in C, that reads the chosen numbers of former weeks into an array. I have got a text file in which there are 5 columns that are separated with tabulators. My questions would be the following:
What should I separate the columns with? (e.g. a comma, a semicolon, a tabulator or something else)
Should I include a kind of EOF in the last row? (e.g. -1, "EOF") Is there any accepted or "official" convention to do this?
Which function should I use for reading the numbers? Is there any proper or "accepted" way of reading data from text files?
I once wrote a C program for a "Who Wants to Be a Billionaire" game. In that one I used a function that read each line into an array that was big enough to hold a whole line. After that I separated its data into variables like this:
line: "text1";"text2";"text3";"text4"endline (-> line loaded into a buffer array)
text1 -> answer1 (until reaching the semicolon)
text2 -> answer2 (until reaching the semicolon)
text3 -> answer3 (until reaching the semicolon)
text4 -> answer4 (until reaching the end of the line)
endline -> start over, that is read a new line and separate its contents into variables.
It worked properly, but I don't know if it was good enough for a programmer. (btw I'm not a programmer yet, I study Computer Science at a university)
Any answers and advice are welcome. Thanks in advance for your kind help!
The scanf() family of functions don't care about newlines, so if you want to process lines, you need to read the lines first and then process the lines with sscanf(). The scanf() family of functions also treats white space — blanks, tabs, newlines, etc. — interchangeably. Using tabs as separators is fine, but blanks will work too. Clearly, if you're reading and processing a line at a time, newlines won't really factor into the scanning.
int lottery[100][5];
int line;
char buffer[4096];
/* Check the line limit first so that no extra line is read and discarded. */
for (line = 0; line < 100 && fgets(buffer, sizeof(buffer), stdin) != 0; line++)
{
    if (sscanf(buffer, "%d %d %d %d %d", &lottery[line][0], &lottery[line][1],
               &lottery[line][2], &lottery[line][3], &lottery[line][4]) != 5)
    {
        /* Report the text of the offending line, not the line counter. */
        fprintf(stderr, "Faulty line: [%s]\n", buffer);
        break;
    }
}
This stops on EOF, too many lines, and a faulty line (one which doesn't start with 5 numbers; you can check their values etc in the loop if you want to — but what are the tests you need to run?). If you want to validate the white space separators, you have to work harder.
Maybe you want to test for nothing but spaces and newlines after the 5 numbers; that's a bit trickier (it can be done; look up the %n conversion specification in sscanf()).
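One possible way to do that check with %n is sketched here. The helper name parse_five is an assumption for illustration. %n stores how many characters sscanf has consumed so far and does not count towards the return value, so the rest of the line can be inspected afterwards.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if line holds exactly five integers
   followed only by white space, otherwise 0. */
int parse_five(const char *line, int out[5])
{
    int consumed = 0;

    if (sscanf(line, "%d %d %d %d %d%n", &out[0], &out[1], &out[2],
               &out[3], &out[4], &consumed) != 5)
        return 0;

    /* Anything other than white space after the fifth number is an error. */
    for (const char *p = line + consumed; *p != '\0'; p++) {
        if (!isspace((unsigned char)*p))
            return 0;
    }
    return 1;
}
```

This still accepts any mix of blanks and tabs between the numbers; insisting on a single tab would need a stricter format or manual parsing.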

what does fscanf being == 1 do

There is more to this code obviously but I am just curious as to what this line of code actually does. I know the while loop and such but am new to the fscanf()
while (fscanf(input_file, "%s", curr_word) == 1)
fscanf() returns the number of input items successfully scanned and stored, as per the man page:
Return Value
These functions return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.
In your case
while (fscanf(input_file, "%s", curr_word) == 1)
fscanf() will return 1 if it successfully scans a string (as per the %s format specifier) from input_file and stores it in curr_word.
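The different return values are easy to see with sscanf, which follows the same rules but reads from a string. This is only an illustrative sketch; scan_two is a hypothetical helper that asks for two integers and passes back whatever sscanf reports.

```c
#include <stdio.h>

/* Hypothetical helper: tries to read two integers from s and returns
   the value reported by sscanf. */
int scan_two(const char *s)
{
    int a, b;
    return sscanf(s, "%d %d", &a, &b);
}
```

Here scan_two("12 34") gives 2, scan_two("12 abc") gives 1 because the second item fails to match, scan_two("abc") gives 0, and scan_two("") gives EOF because the input ends before any conversion. A loop condition of == 1 with a single %s therefore stops on both a matching failure and end of file.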
fscanf(input_file, "%s", curr_word) reads the input stream input_file, stores the next sequence of non-whitespace characters into the array pointed to by curr_word, and appends a '\0' byte. As you can see, the size of this array is not passed to fscanf. This is a classic case of potential buffer overflow, a security flaw that can be exploited by an attacker who stores appropriate contents in the input stream.
After gets, the scanf family of library functions is the best source of buffer overflow bugs one can find.
It is very difficult to use fscanf correctly. Most C programmers should avoid it.
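If fscanf is used anyway, the overflow can at least be bounded by giving %s a field width that is one less than the array size. The sketch below assumes a 32-byte buffer; the buffer size and the name count_bounded_words are illustrative choices, not taken from the question.

```c
#include <stdio.h>

/* Hypothetical helper: counts the pieces read by a width-limited %s. */
int count_bounded_words(FILE *input_file)
{
    char curr_word[32];
    int count = 0;

    /* "%31s" stores at most 31 characters plus the terminating '\0'. */
    while (fscanf(input_file, "%31s", curr_word) == 1)
        count++;
    return count;
}
```

The trade-off is that a word longer than 31 characters is read in several pieces, so it is counted more than once instead of overrunning the buffer.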

Use of fgets() and gets()

#include <stdlib.h>
#include <stdio.h>
int main() {
char ch, file_name[25];
FILE *fp;
printf("Enter the name of file you wish to see\n");
gets(file_name);
fp = fopen(file_name,"r"); // is for read mode
if (fp == NULL) {
printf(stderr, "There was an Error while opening the file.\n");
return (-1);
}
printf("The contents of %s file are :\n", file_name);
while ((ch = fgetc(fp)) != EOF)
printf("%c",ch);
fclose(fp);
return 0;
}
This code seems to work but I keep getting a warning stating "warning: this program uses gets(), which is unsafe."
So I tried to use fgets() but I get an error which states "too few arguments to function call expected 3".
Is there a way around this?
First: never use gets(); it can cause buffer overflows.
Second: show us how you used fgets(). The correct way should look something like this:
fgets(file_name,sizeof(file_name),fp); // if fp has been opened
fgets(file_name,sizeof(file_name),stdin); // if you want to input the file name on the terminal
// argument 1 -> name of the array which will store the value
// argument 2 -> size of the input you want to take ( size of the input array to protect against buffer overflow )
// argument 3 -> input source
FYI:
fgets turns the input into a string by putting a \0 character at the end.
If there was enough space, fgets also stores the \n from your input (stdin). To get rid of the \n and still have a valid string, do this (strcspn needs <string.h>):
fgets(file_name,sizeof(file_name),stdin);
file_name[strcspn(file_name, "\n")] = '\0';
Yes: fgets expects 3 arguments: the buffer (same as with gets), the size of the buffer and the stream to read from. In your case your buffer-size can be obtained with sizeof file_name and the stream you want to read from is stdin. All in all, this is how you'll call it:
fgets(file_name, sizeof file_name, stdin);
The reason gets is unsafe is because it doesn't (cannot) know the size of the buffer that it will read into. Therefore it is prone to buffer-overflows because it will just keep on writing to the buffer even though it's full.
fgets doesn't have this problem because it makes you provide the size of the buffer.
ADDIT: your call to printf inside the if( fp == NULL ) is invalid. printf expects as its first argument the format, not the output stream. I think you want to call fprintf instead.
Finally, in order to correctly detect EOF in your while-condition you must declare ch as an int. EOF may not necessarily fit into a char, but it will fit in an int (and getc also returns an int). You can still print it with %c.
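Putting these points together, a corrected version might look roughly like the sketch below. The helper names strip_newline, copy_stream and show_file, and the 256-byte name buffer, are assumptions made for illustration. The trailing newline that fgets keeps has to be removed before the name is passed to fopen.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: removes the newline kept by fgets, if present. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Hypothetical helper: copies in to out, returns the number of characters. */
long copy_stream(FILE *in, FILE *out)
{
    int ch;        /* int rather than char, so EOF is detected reliably */
    long n = 0;

    while ((ch = fgetc(in)) != EOF) {
        fputc(ch, out);
        n++;
    }
    return n;
}

/* Hypothetical helper: reads a file name from name_source and prints
   that file to stdout. Returns 0 on success, -1 on failure. */
int show_file(FILE *name_source)
{
    char file_name[256];
    FILE *fp;

    printf("Enter the name of file you wish to see\n");
    if (fgets(file_name, sizeof file_name, name_source) == NULL)
        return -1;
    strip_newline(file_name);

    fp = fopen(file_name, "r");
    if (fp == NULL) {
        fprintf(stderr, "There was an error while opening the file.\n");
        return -1;
    }
    printf("The contents of %s file are :\n", file_name);
    copy_stream(fp, stdout);
    fclose(fp);
    return 0;
}
```

Calling show_file(stdin) from main would reproduce the behaviour of the original program without gets.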
Rather than ask how to use fgets() you should either use google, or look at the Unix/Linux man page or the VisualStudio documentation for the function. There are hundreds of functions in C, C++ and lots of class objects. You need to first figure out how to answer the basics yourself, so that your real questions stand a chance of being answered.
If you are new to C, you are definitely doing the right thing of experimenting, but take a look at other code, as you go along, to learn some of the tips/tricks of how code is written.

Printf a buffer of char with length in C

I have a buffer which I receive through a serial port. When I receive a certain character, I know a full line has arrived, and I want to print it with printf method. But each line has a different length value, and when I just go with:
printf("%s", buffer);
I'm printing the line plus additional chars belonging to the former line (if it was longer than the current one).
I read here that it is possible, at least in C++, to tell how many chars you want to read given a %s, but it has no examples and I don't know how to do it in C. Any help?
I think I have three solutions:
printing char by char with a for loop
using the termination character
or using .*
QUESTION IS: Which one is faster? Because I'm working on a microchip PIC and I want it to happen as fast as possible
You can either add a null character after your termination character, and your printf will work, or you can add a '.*' in your printf statement and provide the length
printf("%.*s",len,buf);
In C++ you would probably use the std::string and the std::cout instead, like this:
std::cout << std::string(buf,len);
If all you want is the fastest speed and no formatting -- then use
fwrite(buf,1,len,stdout);
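Both calls can be wrapped in small functions to compare them side by side. This is only a sketch; the function names are hypothetical, and which variant is faster on a given PIC toolchain would have to be measured there.

```c
#include <stdio.h>

/* Hypothetical helper: prints the first len bytes of buf using a
   precision, so buf does not need a terminating '\0'. */
int print_with_precision(FILE *out, const char *buf, int len)
{
    return fprintf(out, "%.*s", len, buf);
}

/* Hypothetical helper: writes len raw bytes with no format parsing. */
size_t print_with_fwrite(FILE *out, const char *buf, size_t len)
{
    return fwrite(buf, 1, len, out);
}
```

With "%.*s" the precision must be passed as an int, and output stops early if a '\0' appears before len bytes; fwrite writes exactly len bytes regardless of their values.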
The string you have is not null-terminated, so, printf (and any other C string function) cannot determine its length, thus it will continue to write the characters it finds there until it stumbles upon a null character that happens to be there.
To solve your problem you can either:
use fwrite over stdout:
fwrite(buffer, buffer_length, 1, stdout);
This works because fwrite is not meant only for printing strings but for writing any kind of data, so it doesn't look for a terminating null character; instead it takes the length of the data to be written as a parameter;
null-terminate your buffer manually before printing:
buffer[buffer_length]=0;
printf("%s", buffer); /* or, slightly more efficient: fputs(buffer, stdout); */
This could be a better idea if you have to do any other string processing over buffer, that will now be null-terminated and so manageable by normal C string processing functions.
Once you've identified the end of the line, you must append a '\0' character to the end of the buffer before sending it to printf.
You can put a NUL (0x0) in the buffer after receiving the last character.
buffer[i] = 0;
