Trouble reading in tokens in C

I can't figure out how to correctly read in a .txt file that looks like this (example):
+ 1
+ 2
- 2
+ 5
p -1
? 5
and so on...
What I need is to store the operator/token (which can be '+', '-', 'p' or something similar) and the int that follows it in two different variables, because I need to check them later on.
char oprtr[1];
int value;
FILE *fp = fopen(args[1], "r");
while(!feof(fp) && !ferror(fp)){
if(fscanf(fp, "%s %d\n", oprtr, &value) < 1){
printf("fscanf error\n");
}
if(strcmp(oprtr, "+") == 0){
function1(bst, value);
} else if(strcmp(oprtr, "-") == 0){
function2(bst, value);
} else if((strcmp(oprtr, "p") == 0) && value == -1){
function3(root);
//some other functions and so on...
}
Printing out oprtr and value in the loop shows that they are not being read in correctly, although the code does compile. Does someone have a solution?

You have single characters, you can use == to compare them instead of strcmp. Just read the input in pairs and use a switch for example.
char c;
int x;
while(fscanf(fp, " %c %d", &c, &x) == 2) // leading space skips the newline left by the previous line
{ switch(c)
{ case '+': /* ... */
}
}
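A fuller sketch of that loop, assuming the input looks like the sample in the question. The names read_command and sum_additions are made up for illustration and are not part of the original code; summing the '+' values just stands in for whatever function1 and friends really do.

```c
#include <stdio.h>

/* Hypothetical helper: reads one "<op> <int>" pair from fp.
 * Returns 1 on success, 0 at end of input or on malformed data. */
static int read_command(FILE *fp, char *op, int *value)
{
    /* The leading space skips any whitespace, including the newline
     * left behind by the previous line, before %c reads the operator. */
    return fscanf(fp, " %c %d", op, value) == 2;
}

/* Example dispatch (an assumption): sums the values of all '+' commands. */
static int sum_additions(FILE *fp)
{
    char op;
    int value, sum = 0;

    while (read_command(fp, &op, &value)) {
        switch (op) {
        case '+': sum += value; break;
        case '-': /* handle removal here */ break;
        default:  /* 'p', '?', ... */ break;
        }
    }
    return sum;
}
```

Because the loop condition is the fscanf result itself, there is no need for feof or ferror in the loop header.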

Your string oprtr is too small to hold anything but an empty string (remember that C strings need a terminating 0 character!). So:
char oprtr[1];
needs to be at least:
char oprtr[2]; // string of maximum size 1
or more defensively:
char oprtr[256]; // string of maximum size 255
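If you keep the string version, a field width in the format also stops the scanf family from writing past the buffer. A minimal sketch, assuming one-character operators; parse_op is a hypothetical name, and it parses from a string so it can be tried without a file:

```c
#include <stdio.h>

/* Hypothetical helper: parses "<op> <int>" from a line of text.
 * op needs room for 2 chars: one operator plus the terminating '\0'.
 * Returns 1 if both fields were read, 0 otherwise. */
static int parse_op(const char *line, char op[2], int *value)
{
    /* %1s stores at most one non-whitespace character, then a '\0'. */
    return sscanf(line, "%1s %d", op, value) == 2;
}
```

With a larger buffer such as char oprtr[256], the matching width would be %255s.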

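You can use the fscanf function to read the input from the file. The space before %c skips any leading whitespace, including the newline left over from the previous line: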
int fscanf(FILE *stream, const char *format, ...);
fscanf(fp," %c %d",&c,&d);

Related

How do I read a whole line to a string?

int run_add_line_after(Document *doc, char *command) {
int paragraph_num, line_num;
char com[MAX_STR_SIZE + 1], extra[MAX_STR_SIZE + 2], line[MAX_STR_SIZE + 1];
if (sscanf(command, " %s %d %d %s", com, &paragraph_num, &line_num, extra)
== 4 && paragraph_num > 0 && line_num >= 0 && extra[0] == '*') {
strcpy(line, &(extra[1]));
if (add_line_after(doc, paragraph_num, line_num, line) == FAILURE) {
printf("add_line_after failed\n");
}
return SUCCESS;
}
return FAILURE;
}
I want sscanf to read everything left in command to extra but it's only taking the first word. For example, if I have:
command: "add_line_after 1 0 *first line of the document"
I want to see:
com: "add_line_after"
paragraph_num: 1
line_num: 0
extra: "first line of the document"
but instead I get:
com: "add_line_after"
paragraph_num: 1
line_num: 0
extra: "first"
because %s stops when it hits the space. How do I read the rest of the line while still ignoring any whitespace between '0' and '*'?
For reference, MAX_STR_SIZE is 80 and command is a 1025 character array (though I don't think that matters). Just assume extra is large enough to hold the rest of the line.
sscanf is really the wrong tool to use here. It can be done, but probably should not be used the way you are trying. Fortunately, you are not passing com to add_line_after, which means it is not necessary to ensure that com is a properly null-terminated string, and this allows you to avoid all of that unnecessary string copying. (If you were passing com, you would either have to copy it, or write a null terminator into command.) You don't want or need to use sscanf to move data at all. You can just use it to parse the numeric values. It's not clear to me if you want to discard any whitespace that follows the *, and this code does not. If you want to do so, removing that whitespace is trivial and left as an exercise for the reader:
int
run_add_line_after(struct document *doc, const char *command)
{
int paragraph_num, line_num, n;
const char *line;
if( 2 == sscanf(command, "%*s %d %d %n", &paragraph_num, &line_num, &n)
&& paragraph_num > 0 && line_num >= 0 && command[n] == '*' )
{
line = command + n + 1;
if( add_line_after(doc, paragraph_num, line_num, line) == FAILURE) {
fprintf(stderr, "add_line_after failed\n");
} else {
return SUCCESS;
}
}
return FAILURE;
}
The idea here is to simply use sscanf to figure out where the extra data is in the string and to parse the integer values. This is absolutely the wrong tool to use (the scanf family is (almost) always the wrong tool), and is used here only for demonstration.
-- edit --
But, of course, this doesn't do exactly what you want. It would be much cleaner to move some functionality into add_line_after to handle the newline, but since you also need to remove the newline, it becomes necessary to do something like:
int
run_add_line_after(struct document *doc, char *command)
{
int paragraph_num, line_num, n;
char *line;
if( 2 == sscanf(command, "%*s %d %d %n", &paragraph_num, &line_num, &n)
&& paragraph_num > 0 && line_num >= 0 && command[n] == '*' )
{
line = command + n + 1;
line[strcspn(line, "\n")] = '\0';
if( add_line_after(doc, paragraph_num, line_num, line) == FAILURE) {
fprintf(stderr, "add_line_after failed\n");
} else {
return SUCCESS;
}
}
return FAILURE;
}
This is not ideal. It would be better if you modify the API so that you can avoid both copying the data and modifying the string that you are given.
Got it. The format string should be " %s %d %d %[^\n]" (with no trailing s, because %[...] is a complete conversion on its own). It keeps reading into the last string variable until it reaches a newline.
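A hedged sketch of that idea which also bounds the read and lets the format consume the '*' itself. parse_add_line is a hypothetical name; MAX_STR_SIZE is taken to be 80 as stated in the question, and the width in the format has to be kept in sync with it by hand.

```c
#include <stdio.h>

#define MAX_STR_SIZE 80

/* Hypothetical helper: splits "cmd <para> <line> *<text>" into its parts.
 * text must have room for MAX_STR_SIZE + 1 chars.
 * Returns 1 if all three values were parsed, 0 otherwise. */
static int parse_add_line(const char *command, int *para, int *line_num,
                          char *text)
{
    /* The first conversion skips the command word without storing it.
     * The space before the literal '*' skips any whitespace in front of it.
     * The scanset conversion stores up to 80 characters and stops at a
     * newline, so embedded spaces are kept. */
    return sscanf(command, " %*s %d %d *%80[^\n]",
                  para, line_num, text) == 3;
}
```

Note that a scanset conversion fails if there is nothing to store, so a command ending right after the '*' is reported as a failure here.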

C Reading a file of digits separated by commas

I am trying to read in a file that contains numbers separated by commas and store them in an array without the commas present.
For example: processes.txt contains
0,1,3
1,0,5
2,9,8
3,10,6
And an array called numbers should look like:
0 1 3 1 0 5 2 9 8 3 10 6
The code I had so far is:
FILE *fp1;
char c; //declaration of characters
fp1=fopen(argv[1],"r"); //opening the file
int list[300];
c=fgetc(fp1); //taking character from fp1 pointer or file
int i=0,number,num=0;
while(c!=EOF){ //iterate until end of file
if (isdigit(c)){ //if it is digit
sscanf(&c,"%d",&number); //changing character to number (c)
num=(num*10)+number;
}
else if (c==',' || c=='\n') { //if it is new line or ,then it will store the number in list
list[i]=num;
num=0;
i++;
}
c=fgetc(fp1);
}
But this is having problems if it is a double digit. Does anyone have a better solution? Thank you!
For the data shown with no space before the commas, you could simply use:
while (i < 300 && fscanf(fp1, "%d,", &num) == 1)
list[i++] = num;
This will read the comma after the number if there is one, silently ignoring when there isn't one. If there might be white space before the commas in the data, add a blank before the comma in the format string. The test on i prevents you writing outside the bounds of the list array. The ++ operator comes into its own here.
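Wrapped up as a function, the same loop might look like the sketch below. read_csv_ints is a hypothetical name, and the caller supplies the array and its capacity.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads comma- or newline-separated integers from fp
 * into out[], storing at most max values. Returns the number stored. */
static size_t read_csv_ints(FILE *fp, int *out, size_t max)
{
    size_t n = 0;
    int num;

    /* The bound is checked first so no value is read and then dropped.
     * "%d," reads an integer and, if present, the comma after it. */
    while (n < max && fscanf(fp, "%d,", &num) == 1)
        out[n++] = num;
    return n;
}
```

The newline between rows needs no special handling, because %d skips leading whitespace before each number.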
First, fgetc returns an int, so c needs to be an int.
Other than that, I would use a slightly different approach. I admit that it is slightly overcomplicated. However, this approach may be useful if you have several different types of fields that require different actions, as in a parser. For your specific problem, I recommend Jonathan Leffler's answer.
int c=fgetc(f);
while(c!=EOF && i<300) {
if(isdigit(c)) {
fseek(f, -1, SEEK_CUR);
if(fscanf(f, "%d", &list[i++]) != 1) {
// Handle error
}
}
c=fgetc(f);
}
Here I don't care about commas and newlines. I take ANYTHING other than a digit as a separator. What I do is basically this:
read next byte
if byte is digit:
back one byte in the file
read number, regardless of length
else continue
The added condition i<300 guards against writing past the end of the array. If you really want to check that nothing other than commas and newlines appears (I did not get the impression that you found that important), you could easily add an else if (c == ... to handle the error.
Note that you should always check the return value for functions like sscanf, fscanf, scanf etc. Actually, you should also do that for fseek. In this situation it's not as important since this code is very unlikely to fail for that reason, so I left it out for readability. But in production code you SHOULD check it.
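A variant of the same idea, sketched with ungetc instead of fseek so that it also works on streams that cannot seek, such as stdin or a pipe. read_ints_any_sep is a hypothetical name. As in the code above, a minus sign counts as a separator, so negative numbers are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: stores every run of digits found in fp as an int,
 * treating anything that is not a digit as a separator.
 * Returns the number of values stored (at most max). */
static size_t read_ints_any_sep(FILE *fp, int *out, size_t max)
{
    size_t n = 0;
    int c;

    while (n < max && (c = fgetc(fp)) != EOF) {
        if (!isdigit(c))
            continue;               /* separator: keep scanning */
        ungetc(c, fp);              /* push the digit back ... */
        if (fscanf(fp, "%d", &out[n]) != 1)
            break;                  /* ... so fscanf can read the number */
        n++;
    }
    return n;
}
```

The index is only advanced after fscanf reports success, so a failed conversion never leaves a garbage slot in the array.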
My solution is to read the whole line first and then parse it with strtok_r with comma as a delimiter. If you want portable code you should use strtok instead.
A naive implementation of readline would be something like this:
static char *readline(FILE *file)
{
char *line = malloc(sizeof(char));
int index = 0;
int c = fgetc(file);
if (c == EOF) {
free(line);
return NULL;
}
while (c != EOF && c != '\n') {
line[index++] = c;
char *l = realloc(line, (index + 1) * sizeof(char));
if (l == NULL) {
free(line);
return NULL;
}
line = l;
c = fgetc(file);
}
line[index] = '\0';
return line;
}
Then you just need to parse the whole line with strtok_r, so you would end with something like this:
int main(int argc, char **argv)
{
FILE *file = fopen(argv[1], "r"); // plain "r" is portable; the "e" flag is a glibc extension
int list[300];
if (file == NULL) {
return 1;
}
char *line;
int numc = 0;
while((line = readline(file)) != NULL) {
char *saveptr;
// Get the first token
char *tok = strtok_r(line, ",", &saveptr);
// Now start parsing the whole line
while (tok != NULL) {
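// Convert the token to a long if possible (errno needs <errno.h>)
char *end;
errno = 0; // strtol only sets errno on failure, so clear it first
long num = strtol(tok, &end, 10);
if (errno != 0 || end == tok) {
// Handle out-of-range value or no conversion
// ...
}
if (numc < 300) {
list[numc++] = (int) num;
}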
// Get next token
tok = strtok_r(NULL, ",", &saveptr);
}
free(line);
}
fclose(file);
return 0;
}
And for printing the whole list just use a for loop:
for (int i = 0; i < numc; i++) {
printf("%d ", list[i]);
}
printf("\n");
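For comparison, here is a sketch of the per-line parsing step on its own, using the portable strtok and a stricter strtol check. parse_csv_line is a hypothetical name. Note that strtok collapses consecutive delimiters, so an empty field such as "1,,2" is silently skipped rather than reported.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits line on commas (modifying it in place) and
 * converts each field to long. Returns the number of values stored,
 * or -1 if a field is not a valid in-range integer. */
static int parse_csv_line(char *line, long *out, int max)
{
    int n = 0;

    for (char *tok = strtok(line, ",\n"); tok != NULL && n < max;
         tok = strtok(NULL, ",\n")) {
        char *end;

        errno = 0;                    /* strtol sets errno only on error */
        long v = strtol(tok, &end, 10);
        if (end == tok || *end != '\0' || errno == ERANGE)
            return -1;                /* no digits, trailing junk, overflow */
        out[n++] = v;
    }
    return n;
}
```

Checking end as well as errno matters because strtol does not set errno when it simply finds no digits to convert.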

proper use of scanf in a while loop to validate input

I made this code:
/*here is the main function*/
int x , y=0, returned_value;
int *p = &x;
while (y<5){
printf("Please Insert X value\n");
returned_value = scanf ("%d" , p);
validate_input(returned_value, p);
y++;
}
the function:
void validate_input(int returned_value, int *p){
getchar();
while (returned_value!=1){
printf("invalid input, Insert Integers Only\n");
getchar();
returned_value = scanf("%d", p);
}
}
It generally works very well, but when I enter, for example, "1f1", it accepts the "1" and does not report any error. When I enter "f1f1f", it reads it twice and ruins the second read: the first read prints "invalid input, Insert Integers Only", and instead of waiting for new input from the user, it continues to the second read and prints "invalid input, Insert Integers Only" again.
It needs a final touch and I read many answers but could not find it.
If you don't want to accept 1f1 as valid input then scanf is the wrong function to use as scanf returns as soon as it finds a match.
Instead, read the whole line and then check that it contains only digits. After that you can call sscanf on the line.
Something like:
#include <ctype.h> // isdigit
#include <stdio.h>
#include <stdlib.h> // exit
int validateLine(char* line)
{
int ret=0;
// Allow negative numbers
if (*line && *line == '-') line++;
// Check that remaining chars are digits
while (*line && *line != '\n')
{
if (!isdigit(*line)) return 0; // Illegal char found
ret = 1; // Remember that at least one legal digit was found
++line;
}
return ret;
}
int main(void) {
char line[256];
int i;
int x , y=0;
while (y<5)
{
printf("Please Insert X value\n");
if (fgets(line, sizeof(line), stdin)) // Read the whole line
{
if (validateLine(line)) // Check that the line is a valid number
{
// Now it should be safe to call scanf - it shouldn't fail
// but check the return value in any case
if (1 != sscanf(line, "%d", &x))
{
printf("should never happen");
exit(1);
}
// Legal number found - break out of the "while (y<5)" loop
break;
}
else
{
printf("Illegal input %s", line);
}
}
y++;
}
if (y<5)
printf("x=%d\n", x);
else
printf("no more retries\n");
return 0;
}
Input
1f1
f1f1
-3
Output
Please Insert X value
Illegal input 1f1
Please Insert X value
Illegal input f1f1
Please Insert X value
Illegal input
Please Insert X value
x=-3
Another approach - avoid scanf
You could let your function calculate the number and thereby bypass scanf completely. It could look like:
#include <ctype.h> // isdigit
#include <stdio.h>
int line2Int(char* line, int* x)
{
int negative = 0;
int ret=0;
int temp = 0;
if (*line && *line == '-')
{
line++;
negative = 1;
}
else if (*line && *line == '+') // If a + is to be accepted
line++; // If a + is to be accepted
while (*line && *line != '\n')
{
if (!isdigit(*line)) return 0; // Illegal char found
ret = 1;
// Update the number
temp = 10 * temp;
temp = temp + (*line - '0');
++line;
}
if (ret)
{
if (negative) temp = -temp;
*x = temp;
}
return ret;
}
int main(void) {
char line[256];
int i;
int x , y=0;
while (y<5)
{
printf("Please Insert X value\n");
if (fgets(line, sizeof(line), stdin))
{
if (line2Int(line, &x)) break; // Legal number - break out
printf("Illegal input %s", line);
}
y++;
}
if (y<5)
printf("x=%d\n", x);
else
printf("no more retries\n");
return 0;
}
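The hand-rolled conversion above does not guard against overflow when the user types a very long number. One possible alternative, sketched here under the assumption that the line comes from fgets, is to let strtol do the conversion and then check where it stopped. line_to_int is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts a line such as "-3\n" to int.
 * Returns 1 on success, 0 if the line has no digits, has trailing junk,
 * or the value does not fit in an int. */
static int line_to_int(const char *line, int *out)
{
    char *end;

    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line)
        return 0;                   /* no digits at all */
    if (*end == '\n')
        end++;                      /* allow the newline kept by fgets */
    if (*end != '\0')
        return 0;                   /* trailing junk, e.g. "1f1" */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                   /* out of range */
    *out = (int)v;
    return 1;
}
```

Unlike the digit-by-digit loop, strtol also accepts leading whitespace and a leading '+', which may or may not be what you want.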
Generally speaking, it is my opinion that you are better to read everything from the input (within the range of your buffer size, of course), and then validate the input is indeed the correct format.
In your case, you are seeing errors with a string like f1f1f because you are not reading the entire stdin buffer. When you call scanf(...) again, there is still data in stdin, so that is read first instead of prompting the user for more input. To read all of stdin, you should do something like the following (part of the code is borrowed from Paxdiablo's answer here: https://stackoverflow.com/a/4023921/2694511):
#include <ctype.h> // Used for isdigit
#include <stdio.h>
#include <string.h>
#include <stdlib.h> // Used for strtol
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
#define NaN 3 // Not a Number (NaN)
int strIsInt(const char *ptrStr){
// Check if the string starts with a positive or negative sign
if(*ptrStr == '+' || *ptrStr == '-'){
// First character is a sign. Advance pointer position
ptrStr++;
}
// Now make sure the string (or the character after a positive/negative sign) is not empty
if(*ptrStr == '\0'){
return NaN;
}
while(*ptrStr != '\0'){
// Check if the current character is a digit
// isdigit() returns zero for non-digit characters
if(isdigit( *ptrStr ) == 0){
// Not a digit
return NaN;
} // else, we'll increment the pointer and check the next character
ptrStr++;
}
// If we have made it this far, then we know that every character inside of the string is indeed a digit
// As such, we can go ahead and return a success response here
// (A success response, in this case, is any value other than NaN)
return 0;
}
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
// (Per Chux suggestions in the comments, the "buff[0]" condition
// has been added here.)
if (buff[0] && buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strcspn(buff, "\n")] = '\0'; // safe even if buff is empty
return OK;
}
void validate_input(int responseCode, char *prompt, char *buffer, size_t bufferSize){
while( responseCode != OK ||
strIsInt( buffer ) == NaN )
{
printf("Invalid input.\nPlease enter integers only!\n");
fflush(stdout); /* It might be unnecessary to flush here because we'll flush STDOUT in the
getLine function anyway, but it is good practice to flush STDOUT when printing
important information. */
responseCode = getLine(prompt, buffer, bufferSize); // Read entire STDIN
}
// Finally, we know that the input is an integer
}
int main(int argc, char **argv){
char *prompt = "Please Insert X value\n";
int iResponseCode;
char cInputBuffer[100];
int x, y=0;
int *p = &x;
while(y < 5){
iResponseCode = getLine(prompt, cInputBuffer, sizeof(cInputBuffer)); // Read entire STDIN buffer
validate_input(iResponseCode, prompt, cInputBuffer, sizeof(cInputBuffer));
// Once validate_input finishes running, we should have a proper integer in our input buffer!
// Now we'll just convert it from a string to an integer, and store it in the P variable, as you
// were doing in your question.
sscanf(cInputBuffer, "%d", p);
y++;
}
}
Just as a disclaimer/note: I have not written C for a very long time, so I apologize in advance if there are any errors in this example. I also did not have an opportunity to compile and test this code before posting.
If you're reading an input stream that you know is a text stream, but that you are not sure only consists of integers, then read strings.
Also, once you've read a string and want to see if it is an integer, use the standard library conversion routine strtol(). By doing this, you both get a confirmation that it was an integer and you get it converted for you into a long.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
bool convert_to_long(long *number, const char *string)
{
char *endptr;
*number = strtol(string, &endptr, 10);
/* endptr will point to the first position in the string that could
* not be converted. If this position holds the string terminator
* '\0' the conversion went well. An empty input string will also
* result in *endptr == '\0', so we have to check this too, and fail
* if this happens.
*/
if (string[0] != '\0' && *endptr == '\0')
return false; /* conversion successful */
return true; /* problem in conversion */
}
int main(void)
{
char buffer[256];
const int max_tries = 5;
int tries = 0;
long number;
while (tries++ < max_tries) {
puts("Enter input:");
scanf("%255s", buffer); /* the width keeps scanf from overflowing buffer[256] */
if (!convert_to_long(&number, buffer))
break; /* returns false on success */
printf("Invalid input. '%s' is not integer, %d tries left\n", buffer,
max_tries - tries);
}
if (tries > max_tries)
puts("No valid input found");
else
printf("Valid input: %ld\n", number);
return EXIT_SUCCESS;
}
ADDED NOTE: If you change the base (the last parameter to strtol()) from 10 to zero, you'll get the additional feature that your code converts hexadecimal numbers and octal numbers (strings starting with 0x and 0 respectively) into integers.
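A tiny sketch of what base 0 does in practice. parse_any_base is a hypothetical wrapper; error checking is left out to keep the point visible.

```c
#include <stdlib.h>

/* Hypothetical helper: converts with base 0, so the prefix picks the base:
 * "0x..." is hexadecimal, a leading "0" is octal, anything else decimal. */
static long parse_any_base(const char *s)
{
    return strtol(s, NULL, 0);
}
```

With this, "0x1A" converts to 26, "010" converts to 8, and "10" converts to 10. The octal case is worth keeping in mind if users might type numbers with leading zeros.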
I took @4386427's idea and added code to cover what it missed (leading spaces and a + sign). I tested it many times and it worked in every case I tried.
#include<stdio.h>
#include <ctype.h>
#include <stdlib.h>
int validate_line (char *line);
int main(){
char line[256];
int y=0;
long x;
while (y<5){
printf("Please Insert X Value\n");
if (fgets(line, sizeof(line), stdin)){//return 0 if not execute
if (validate_line(line)>0){ // check if the string contains only numbers
x =strtol(line, NULL, 10); // change the authentic string to long and assign it
printf("This is x %ld" , x); // x is a long, so use %ld
break;
}
else if (validate_line(line)==-1){printf("You Have Not Inserted Any Number!.... ");}
else {printf("Invalid Input, Insert Integers Only.... ");}
}
y++;
if (y==5){printf("NO MORE RETRIES\n\n");}
else{printf("%d Retries Left\n\n", (5-y));}
}
return 0;}
int validate_line (char *line){
int returned_value =-1;
/*first remove spaces from the entire string*/
char *p_new = line;
char *p_old = line;
while (*p_old != '\0'){// loop as long as has not reached the end of string
*p_new = *p_old; // assign the current value the *line is pointing at to p
if (*p_new != ' '){p_new++;} // check if it is not a space , if so , increment p
p_old++;// increment p_old in every loop
}
*p_new = '\0'; // add terminator
if (*line== '+' || *line== '-'){line++;} // check if the first char is (-) or (+) sign to point to next place
while (*line != '\n'){
if (!(isdigit(*line))) {return 0;} // Illegal char found , will return 0 and stop because isdigit() returns 0 if the it finds non-digit
else if (isdigit(*line)){line++; returned_value=2;}//check next place and increment returned_value for the final result and judgment next.
}
return returned_value; // it will return -1 if there is no input at all because while loop has not executed, will return >0 if successful, 0 if invalid input
}

C - How to scan an int only entered after symbol

I am having difficulty scanning from user input an integer (and storing it) only if printed directly after a !:
char cmd[MAX_LINE/2 + 1];
if (strcmp(cmd, "history") == 0)
history(hist, current);
else if (strcmp(cmd, "!!") == 0)
execMostRecHist(hist, current-1);
else if (strcmp(cmd, "!%d") == 0)
num = %d;
else
{//do stuff}
I understand this is completely wrong syntax for strcmp(), but just as an example of how I am gathering user input.
strcmp doesn't know about format specifiers, it just compares two strings. sscanf does what you want: It tests whether a string has a certain format and converts parts of the string to other types.
For example:
int num = 0;
if (sscanf(cmd, " !%d", &num) == 1) {
// Do stuff; num has already been assigned
}
The format specifier %d tells sscanf to look for a valid decimal integer. The exclamation mark has no special meaning and matches only a literal exclamation mark. The space at the front means that the command may have leading white space. Note that %d skips white space itself, so there may be white space between the exclamation mark and the number, and that the number may well be negative.
The format specifier is special to the scanf family and is related to, but different from, the %d format of printf. It usually has no meaning in other strings, and certainly not when it is found unquoted in the code.
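To make those matching rules concrete, here is a small sketch. parse_history_ref is a hypothetical name; it parses into a local first so the caller's variable is only touched on success.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the number if cmd looks like
 * "!<int>", otherwise returns 0 and leaves *num untouched. */
static int parse_history_ref(const char *cmd, int *num)
{
    int value;

    if (sscanf(cmd, " !%d", &value) != 1)
        return 0;
    *num = value;
    return 1;
}
```

With this, "!12" yields 12, "! 5" yields 5 and "!-3" yields -3, while "!!" and "history" are rejected. Trailing characters after the number, as in "!12abc", are still accepted by this format.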
You could also write a checker yourself:
#include <ctype.h>
#include <stdio.h>
int check(const char *code) {
if (code == NULL || code[0] != '!' || code[1] == '\0') return 0; /* also reject a bare "!" */
while(*(++code) != '\0') {
if (!isdigit(*code)) return 0;
}
return 1;
}
/* ... */
if (check(cmd))
sscanf(cmd + 1, "%d", &num);
Use sscanf() and check its results.
char cmd[MAX_LINE/2 + 1];
num = 0; // Ensure `num` has a known value
if (strcmp(cmd, "history") == 0)
history(hist, current);
else if (strcmp(cmd, "!!") == 0)
execMostRecHist(hist, current-1);
else if (sscanf(cmd, "!%d", &num) == 1)
;
else
{//do stuff}

Using fscanf to scan a value or use default if no value exists

I have a function to read a text file with the following format
string int int
string int int
string int int
I want to write a function that will assign the values from the text file into variables, but there will also be some cases where the format of the text file will be
string int
string int
string int
In that case, I'd like to set the value of the last int variable to 1. My code I have so far works with the first example but I'm a bit stuck on getting the second scenario to work:
void readFile(LinkedList *inList, char* file)
{
char tempName[30];
int tempLoc, tempNum;
FILE* f;
f = fopen(file, "r");
if(f==NULL)
{
printf("Error: could not open file");
}
else
{
while (fscanf(f, "%s %d %d\n", tempName, &tempLoc, &tempNum) != EOF)
{
insertFirst (inList, tempName, tempLoc, tempNum);
}
}
}
In the second case, fscanf will return 2 instead of 3. So you can rewrite the code like this:
while (1) {
int ret = fscanf(f, "%s %d %d\n", tempName, &tempLoc, &tempNum);
if (ret == EOF) {
break;
}
if (ret == 2) {
tempNum = 1;
} else if (ret != 3) {
// line appear invalid, deal with the error
}
insertFirst (inList, tempName, tempLoc, tempNum);
}
A more hacky way would be to set tempNum to 1 before calling fscanf and just check for EOF as you did above. But I think the code above is clearer.
Edit: to avoid overflows, this would be better. The code would perform better but this is harder to write. Just like above, I did not write any code for the error conditions but you definitely want to handle them
char lineBuf[255];
while (fgets(lineBuf, sizeof(lineBuf), f) != NULL) {
int spaceIdx, ret;
int len = (int)strlen(lineBuf);
if (len == (int)sizeof(lineBuf) - 1 && lineBuf[len - 1] != '\n') {
// line is too long: either your buffer is too small, or the input is bad
// and you should tell the user.
// I recommend treating this as an error
}
if (len > 0 && lineBuf[len - 1] == '\n') {
lineBuf[--len] = '\0'; // remove \n and update len
}
if (isspace(*lineBuf)) {
// error, line should not start with a space
}
spaceIdx = strcspn(lineBuf, "\t ");
if (spaceIdx == len) {
// error, no space in this line
}
// Ok, we've found the space. Deal with the rest.
// Note that for this purpose, sscanf is a bit heavy handed (but makes the code
// simpler). You could do it with strtol.
// Also, the first space in the format string is important, so sscanf skips
// all the space at the beginning of the string. If your format requires only
// one space between fields, you can do sscanf(lineBuf + spaceIdx + 1, "%d %d"...
ret = sscanf(lineBuf + spaceIdx, " %d %d", &tempLoc, &tempNum);
if (ret < 1) { // 0 means no match; EOF (negative) means nothing followed the name
// error, no ints
}
else if (1 == ret) {
tempNum = 1;
}
// at that point, you could copy the first part of lineBuf to tempName, but then
// you have to deal with a potential overflow (and spend time on an useless copy),
// so use lineBuf instead
lineBuf[spaceIdx] = '\0';
insertFirst (inList, lineBuf, tempLoc, tempNum);
}
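If the names never contain spaces, a shorter variant is possible: read each line with fgets and hand it to a small parser that applies the default. This is only a sketch; parse_record is a hypothetical name, and the 29 in the format assumes the 30-byte tempName from the question.

```c
#include <stdio.h>

/* Hypothetical helper: parses "name loc [num]" from one line.
 * name must have room for 30 chars. Returns 1 on success, 0 otherwise. */
static int parse_record(const char *line, char *name, int *loc, int *num)
{
    int fields = sscanf(line, "%29s %d %d", name, loc, num);

    if (fields == 2)
        *num = 1;       /* third field missing: fall back to the default */
    return fields == 2 || fields == 3;
}
```

Because each call sees exactly one line, a missing third field can never be filled in by accident from the start of the next line.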

Resources