C String parsing - c

It's a relatively simple thing to do, but I've been having problems separating a string from a file into multiple variables. I've tried strtok and sscanf with delimiters, but I seem to be doing something wrong.
#define MAX 40
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
int main()
{
char brd_str[26];
char board[26], Res[26], Ind[26], Cap[26];
int i=0, n=0;
FILE *data;
data = fopen ("C:\\datafile.txt", "rt");
fgets(brd_str, 26, data);
sscanf(brd_str,"%d[^,],%f[^,],%e[^,],%e[^,]", &board, &Res, &Ind, &Cap);
printf("%3d %6d %8e %8e", board, Res, Ind, Cap);
fclose(data);
printf("\nPlease enter something for the program to exit");
scanf("%d", &i);
return(0);
}
The string itself looks like this 2,4.57,2.01e-2,5.00e-8. The comma would be the delimiter in this case. When I compile it I have really large numbers which are incorrect.
This would have to be done multiple times (up to 40), and the variables themselves will be used for calculations.
There seems to be something wrong with the sscanf statement I've put in. I'm not really sure what the problem is.

Change:
char board[26], Res[26], Ind[26], Cap[26];
to:
int board;
float Res;
float Ind;
float Cap;
And, change:
printf("%3d %6d %8e %8e", board, Res, Ind, Cap);
to (perhaps):
printf("%3d %6f %8e %8e", board, Res, Ind, Cap);

Basically, your most immediate problem is
char board[26], Res[26], Ind[26], Cap[26];
So, you have strings ...
sscanf(brd_str,"%d[^,],%f[^,],%e[^,],%e[^,]", &board, &Res, &Ind, &Cap);
you can't read with "%e" into (addresses of) strings!
printf("%3d %6d %8e %8e", board, Res, Ind, Cap);
you can't print strings with "%e"
Also there are quite a few more problems with your code.

Just for a hint at coding style, take a look at the following code...
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char **argv )
{
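    FILE *file;
    int c, cp;
    /* calloc zero-fills the buffer, so it stays null-terminated */
    char *buf = calloc( 100, sizeof( char ) );
    if ( buf == NULL )
    {
        return 1;
    }
    file = fopen( "input.txt", "r" );
    if ( file == NULL )
    {
        free( buf );
        return 1;
    }
    cp = 0;
    while( ( c = fgetc( file ) ) != EOF )
    {
        /* store everything except commas, leaving room for the terminator */
        if ( c != ',' && cp < 99 )
        {
            buf[cp] = (char)c;
            cp++;
        }
        printf( "%c", c );
    }
    fclose( file );
    /* print only the characters that were actually stored */
    for (c = 0; c < cp; c++ )
    {
        printf( "Buf%d: %c\n", c, buf[c] );
    }
    free( buf );
    return 0;
}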
This code loads characters from a file. If you want strings, remember that strings are simply arrays of characters. There are quite a few examples online that you can look at, including the following:
Read .CSV file in C
I hope this helps...

You may not combine a scanset format-specifier with other types of format-specifiers like that. Just like you may not have "%df" as a format-specifier for integer float (???), or "%si" for string integer (???), you may not have things like "%d[^,]".
"%d" alone would already abort reading when it encounters a ',' or any other non-digit character, so what you are trying to do there is extra and invalid precaution. Having that "[^,]" next to it, will cause sscanf to look for a '[' then '^' then ',' then ']' inside that string.
So, in short, you should use something simple like the following:
#include<stdio.h>
int main( ){
int board;
float Res, Ind, Cap;
scanf( "%d,%f,%e,%e", &board, &Res, &Ind, &Cap );
// reads digit sequence until the non-digit input
// reads that number into board as an integer
// consumes a comma character
// and so on...
printf( "%d\n%f\n%e\n%e", board, Res, Ind, Cap );
return 0;
}
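Applied to the question's setup (one line per record read with fgets, up to 40 records), the same format string can be used with sscanf, and checking its return value catches malformed lines. The following is only a sketch: struct record, parse_record and read_records are names invented for this illustration, and double with %lf is used instead of float so the small exponent values keep more precision.

```c
#include <stdio.h>

/* One record from a line such as "2,4.57,2.01e-2,5.00e-8".
   The struct and function names are hypothetical. */
struct record {
    int    board;
    double res, ind, cap;
};

/* Returns 1 if all four fields were converted, 0 otherwise. */
int parse_record(const char *line, struct record *out)
{
    /* %lf converts into a double and accepts both plain (4.57)
       and exponent (2.01e-2) notation. */
    int got = sscanf(line, "%d,%lf,%lf,%lf",
                     &out->board, &out->res, &out->ind, &out->cap);
    return got == 4;
}

/* Reads up to max records from an open stream.
   Returns the number of records stored; malformed lines are skipped. */
int read_records(FILE *fp, struct record *recs, int max)
{
    char line[128];
    int n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (parse_record(line, &recs[n]))
            n++;
    }
    return n;
}
```

A caller would open the file, check the result of fopen, and then call read_records(fp, recs, 40) with an array of 40 struct record; the returned count says how many entries are valid for the later calculations.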

Addressing only:
...I've tried strtok ... but I seem to be doing something wrong....
And
This would have to be done multiple times (up to 40)
It sounds like you have an input file with a variable number of inputs, up to 40?
So the way the data is read should accommodate, and stop reading at end of data.
Here is an example doing these things using strtok():
With a file containing these values:
1.2,345,23,78,234,21.4567,2.45566,23,45,78,12,34,5.678
I also verified it works with exponential notation, such as:
1.2,345,23,7.8e3,2.34e-2,21.4567,2.45e-8,2.3e3,45,78,12,34,5.678
And using this code, strtok will parse through using ", \n" as delimiters:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NUMBERS 100
int main(void)
{
    FILE *fp;
    char *tok;
    double numbers[MAX_NUMBERS] = {0}; /* zero-initialize the whole array */
    char lineBuf[260];
    int i = 0;
    fp = fopen("F:\\play3\\numbers.txt", "r");
    if (fp == NULL)
    {
        perror("fopen");
        return 1;
    }
    while (i < MAX_NUMBERS && fgets(lineBuf, sizeof(lineBuf), fp))
    {
        /* split on commas, spaces and the trailing newline */
        tok = strtok(lineBuf, ", \n");
        while (tok && i < MAX_NUMBERS)
        {
            /* strtok never returns an empty token, so convert directly */
            numbers[i++] = strtod(tok, NULL);
            tok = strtok(NULL, ", \n");
        }
    }
    fclose(fp);
    return 0;
}

Related

Using spaces while taking integers from the user [duplicate]

I'm trying to practice working with text files, printing to them and reading from them. I need to take input from the user (maybe their phone number or someone else's), but I want them to be able to use spaces between the numbers.
Let's say my phone number is: 565 856 12
I want them to be able to give me this number with spaces, instead of a squished version like 56585612.
So far I've tried scanf(), and I don't know how to make scanf() do something like this. I've tried going for chars and for loops, but it's a tangle.
When I type 565 856 12 and press enter, only 565 is counted as the phone number, and 856 12 goes to the next scanf.
struct Student{
unsigned long long student_phone_number;
}
int main(){
FILE *filePtr;
filePtr = fopen("std_info.txt","w");
struct Student Student1;
printf("\nEnter Student's Phone Number: ");
scanf("%llu",&Student1.student_phone_number);
fprintf(filePtr,"%llu\t",Student1.student_phone_number);
}
To solve this problem, I modified the Student structure to store both an unsigned long long integer and a character array. The program reads the character array from stdin. The data is validated by the isValid() function, and the string is converted to an unsigned long long integer by the convertToNumber() function.
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
struct Student{
    unsigned long long numberForm;
    char textForm[32]; // fixed-size buffer so fgets() has storage to write into
};
// Converts character array to unsigned long long integer.
void convertToNumber(struct Student * const student);
// Validates the data in the Student.textForm variable.
bool isValid(const char * const input, const char * const format);
// Returns the number of characters in a character array.
size_t getSize(const char * const input);
// This function returns the "base ^ exponent" result.
unsigned long long power(int base, unsigned int exponent);
int main()
{
struct Student student;
char format[] = "NNN NNN NN"; /* "123 456 78" */
printf("Enter phone number: ");
fgets(student.textForm, getSize(format) + 1, stdin);
// fgets() is used because gets() cannot limit the input length and was removed in C11.
if(isValid(student.textForm, format))
{
convertToNumber(&student);
printf("Result: %llu", student.numberForm);
}
return 0;
}
void convertToNumber(struct Student * const student)
{
    student->numberForm = 0ull;
    // Accumulate the digits left to right; anything else (the spaces) is skipped.
    for(size_t i = 0 ; student->textForm[i] != '\0' ; ++i)
        if(isdigit((unsigned char)student->textForm[i]))
            student->numberForm = student->numberForm * 10
                                + (unsigned long long)(student->textForm[i] - '0');
}
bool isValid(const char * const input, const char * const format)
{
if(getSize(input) == getSize(format))
{
size_t i;
for(i = 0 ; i < getSize(input) ; ++i)
{
if(format[i] == 'N')
{
if(!isdigit(input[i]))
break;
}
else if(format[i] == ' ')
{
if(input[i] != format[i])
break;
}
}
if(i == getSize(input))
return true;
}
return false;
}
unsigned long long power(int base, unsigned int exponent)
{
unsigned long long result = 1;
for(size_t i = 0 ; i < exponent ; ++i)
result *= base;
return result;
}
size_t getSize(const char * const input)
{
    size_t size = 0;
    // Test the current character before advancing, so an empty string yields 0.
    while(input[size] != '\0')
        ++size;
    return size;
}
This program works as follows:
Enter phone number: 123 465 78
Result: 12346578
You can use fgets to parse an input with spaces and all:
#include <stdio.h>
#define SIZE 100
int main() {
char str[SIZE];
fgets(str, sizeof str, stdin); // parses input string with spaces
// and checks destination buffer bounds
}
If you then want to remove the spaces you can do that easily:
#include <ctype.h>
void remove_white_spaces(char *str)
{
int i = 0, j = 0;
while (str[i])
{
if (!isspace((unsigned char)str[i])) // cast avoids undefined behavior for negative char values
str[j++] = str[i];
i++;
}
str[j] = '\0';
}
Presto, this function will remove white spaces from the string you pass as an argument.
Input:
12 4 345 789
Output:
124345789
After that it's easy to convert this into an unsigned integer value; you can use strtoul. But why would you store a phone number in a numeric type? Would you be performing arithmetic operations on it? Doubtful. And since you then save it to a file, it really doesn't matter whether it's a string or a numeric type. I would just keep it as a string.
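If a numeric value is still wanted, the conversion can be sketched as follows. This is only an illustration: digits_to_ull is a made-up helper name, and strtoull is used rather than strtoul because the question stores the number in an unsigned long long. The helper skips whitespace, rejects any other non-digit, and reports overflow through its return value.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Converts text made of digits and whitespace (e.g. "565 856 12\n")
   to a number. Returns 1 on success, 0 if the text contains any other
   character, has no digits, or does not fit in unsigned long long.
   The function name is hypothetical. */
int digits_to_ull(const char *text, unsigned long long *out)
{
    char digits[32];
    size_t n = 0;

    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;
        if (isspace(ch))
            continue;                       /* drop spaces and the newline */
        if (!isdigit(ch) || n == sizeof digits - 1)
            return 0;                       /* bad character or too long */
        digits[n++] = (char)ch;
    }
    if (n == 0)
        return 0;
    digits[n] = '\0';

    errno = 0;
    char *end;
    unsigned long long value = strtoull(digits, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return 0;
    *out = value;
    return 1;
}
```

Note that a number such as "012 345" converts to 12345: leading zeros are lost, which is one more reason to keep phone numbers as strings.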
It is generally better to store a telephone number as a string, not as a number.
In order to read a whole line of input (not just a single word) as a string, I recommend that you use the function fgets.
Here is an example based on the code in your question:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct Student{
char phone_number[50];
};
int main( void )
{
//open output file (copied from code in OP's question)
FILE *filePtr;
filePtr = fopen("std_info.txt","w");
//other variable declarations
struct Student Student1;
char *p;
//prompt user for input
printf( "Enter student's phone number: ");
//attempt to read one line of input
if ( fgets( Student1.phone_number, sizeof Student1.phone_number, stdin ) == NULL )
{
printf( "input error!\n" );
exit( EXIT_FAILURE );
}
//attempt to find newline character, in order to verify that
//the entire line was read in
p = strchr( Student1.phone_number, '\n' );
if ( p == NULL )
{
printf( "line too long for input!\n" );
exit( EXIT_FAILURE );
}
//remove newline character by overwriting it with terminating
//null character
*p = '\0';
//write phone number to file
fprintf( filePtr, "%s\n", Student1.phone_number );
//cleanup
fclose( filePtr );
}
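Use a string for that purpose: trim the spaces from the string and then convert the result to an integer. Note that atoi returns an int, which is too small for many phone numbers, so strtoull is the better fit for an unsigned long long. To use it you must include the stdlib.h header file. For example
#include<stdio.h>
#include<stdlib.h>
int main(){
    char str[] = "123456";
    /* strtoull converts directly to unsigned long long (base 10 here) */
    unsigned long long int num = strtoull(str, NULL, 10);
    printf("%llu", num);
}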

fscanf() to read in only characters with no punctuation marks

I would like to read in some words (in this example first 20) from a text file (name specified as an argument in the command line). As the below code runs, I found it takes punctuation marks with characters too.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char * argv[]){
int wordCap = 20;
int wordc = 0;
char** ptr = (char **) calloc (wordCap, sizeof(char*));
FILE *myFile = fopen (argv[1], "r");
if (!myFile) return 1;
rewind(myFile);
for (wordc = 0; wordc < wordCap; wordc++){
ptr[wordc] = (char *)malloc(30 * sizeof( char ) );
fscanf(myFile, "%s", ptr[wordc]);
int length = strlen(ptr[wordc]);
ptr[wordc][length] = '\0';
printf("word[%d] is %s\n", wordc, ptr[wordc]);
}
return 0;
}
As I pass through the sentence: "Once when a Lion was asleep a little Mouse began running up and down upon him;", "him" will be followed with a semicolon.
I changed the fscanf() to be fscanf(myFile, "[a-z | A-Z]", ptr[wordc]);, it takes the whole sentence as a word.
How can I change it to make the correct output?
You could accept the semicolon and then remove it later, like so,
after you've stored the word in ptr[wordc]:
size_t i = 0;
while (i < strlen(ptr[wordc]))
{
if (strchr(".;,!?", ptr[wordc][i])) //add any char you wanna delete to that string
memmove(&ptr[wordc][i], &ptr[wordc][i + 1], strlen(ptr[wordc]) - i);
else
i++;
}
if (strlen(ptr[wordc]) > 0) // to not print any word that was just punctuations beforehand
printf("word[%d] is %s\n", wordc, ptr[wordc]);
I haven't tested this code, so there might be a typo or something in it.
Alternatively you could switch
fscanf(myFile, "%s", ptr[wordc]);
for
fscanf(myFile, "%29[a-zA-Z]%*[^a-zA-Z]", ptr[wordc]);
to capture only letters. The 29 limits the word length so that you don't overflow the buffer, since you're allocating only 30 chars per word.
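To see how these scanset directives behave without needing a file, here is a small sketch that applies the same idea to an in-memory string with sscanf. The helper next_word is invented for this illustration; %n records how many characters a call consumed so the cursor can be advanced. One assumption to be aware of: the meaning of a range such as a-z inside a scanset is implementation-defined, although common C libraries treat it as the letters a through z.

```c
#include <stdio.h>

/* Copies the next run of letters from *cursor into word (at most 29
   characters plus the terminator) and advances *cursor past it.
   Returns 1 if a word was stored, 0 when no letters remain.
   The function name is hypothetical. */
int next_word(const char **cursor, char word[30])
{
    int used = 0;

    /* Skip leading non-letters. If the text already starts with a
       letter (or is empty) the directive fails and used stays 0. */
    sscanf(*cursor, "%*[^a-zA-Z]%n", &used);
    *cursor += used;

    used = 0;
    if (sscanf(*cursor, "%29[a-zA-Z]%n", word, &used) != 1)
        return 0;
    *cursor += used;
    return 1;
}
```

Calling next_word repeatedly on "upon him; and" yields "upon", "him" and "and", with the semicolon discarded. A word longer than 29 letters is split across two calls rather than overflowing the buffer.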

Detecting single character in string

So, I'm trying to detect a single character in a string. There must be no other characters besides whitespace and a null character. This is my first issue, as my code detects the character in a string with other characters (besides the whitespace).
My second issue, is I can't seem to figure out how best to read matrices from a file. I'm supposed to read the first line and get the ROWS x COLUMNS. Then I'm supposed to read the data into the a matrix array that is stored globally. Then reading the second matrix into a second matrix array (stored globally as well).
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#define MAXLINE 100
typedef struct matrixStruct{
int rows;
int columns;
}matrixStruct;
typedef int bool;
enum{
false,
true
};
/*
*
*/
int aMatrix1[10][10];
int aMatrix2[10][10];
int multiMatrix[10][10];
int main(int argc, char** argv){
FILE *inputFile;
char tempLine[MAXLINE], *tempChar, *tempString;
char *endChar;
endChar = (char *)malloc(sizeof(char));
(*endChar) = '*';
bool readFile = true;
inputFile = fopen(argv[1], "r");
if(inputFile == NULL){
printf("File %s not found.\n", argv[1]);
perror("Error");
exit(EXIT_FAILURE);
}else{
printf("File opened!\n");
}
int numRow, numColumn, i, j, tempNum, count = 0;
do{
fgets(tempLine, MAXLINE, inputFile);
tempChar = strchr(tempLine, '*');
if(tempChar != NULL){
printf("True # %s\ncount=%d\n",tempChar,count);
readFile = false;
}else{
sscanf(tempLine, "%d %d", &numRow, &numColumn);
count++;
for(i=0;i<numRow;i++){
fgets(tempLine, MAXLINE, inputFile);
for(j=0;j<numColumn;j++){
aMatrix1[i][j] = atoi(tempNum);
}
}
}
}
while(readFile);
printf("aMatrix1[%d][%d]= \n", numRow, numColumn);
for(i=0; i < numRow;i++){
for(j=0; j < numColumn; j++){
printf("aMatrix[%d][%d] = %d\t", i, j, aMatrix1[i][j]);
}
printf("\n");
}
return (EXIT_SUCCESS);
}
For the first issue you could do what you suggested in your comment (regexps are overkill here): loop through the string, break on any non-whitespace char that's not what you expect, and count the ones that do match. You don't want 0 matches, and I guess also no more than 1.
However, I suggest you read the man page for strtok. I normally wouldn't suggest it, as it's not thread-safe and has strange behaviors, but in this simple case it could work fine: provide the whitespace chars as delimiters, and it will return the first non-whitespace token. If strcmp of that token with "*" is nonzero, or if the next call to strtok doesn't return NULL, then it's not a match.
By the way, what do you plan to do with lines that aren't " .. * .. " or " ROWS x COLUMNS "? You're not handling them right now.
As for the second issue, strtok could again come to the rescue: repeated calls give you the whitespace-delimited numbers (as strings), so you can convert each one (for example with atoi) and store it in the matrix on each iteration.
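Both suggestions can be sketched with two small helpers. These are illustrations only: is_end_marker and parse_row are names made up here, and strtol is used for the conversion instead of atoi. Because strtok writes terminators into the string it splits, each helper needs a writable line, and the caller should work on a copy if the original text is needed afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if line holds exactly one whitespace-delimited token and
   that token is "*", otherwise 0. Modifies line.
   The function name is hypothetical. */
int is_end_marker(char *line)
{
    const char *ws = " \t\r\n";
    char *first = strtok(line, ws);

    if (first == NULL || strcmp(first, "*") != 0)
        return 0;
    /* Any second token means the '*' was not alone on the line. */
    return strtok(NULL, ws) == NULL;
}

/* Splits line on whitespace and stores up to max integers in row.
   Returns the number of values stored. Modifies line.
   The function name is hypothetical. */
int parse_row(char *line, int *row, int max)
{
    const char *ws = " \t\r\n";
    int n = 0;

    for (char *tok = strtok(line, ws); tok != NULL && n < max;
         tok = strtok(NULL, ws)) {
        row[n++] = (int)strtol(tok, NULL, 10);
    }
    return n;
}
```

In the question's loop, each matrix line read with fgets could be passed to parse_row(tempLine, aMatrix1[i], numColumn), and a return value different from numColumn would indicate a short or malformed row.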

Find int in a string (char*) in pure c

There is a string with a line of text. Let's say:
char * line = "Foo|bar|Baz|23|25|27";
I would have to find the numbers.
I was thinking of something like this:
If the given char is a number, let's put it into a temporary char array. (buffer)
If the next character is NOT a number, let's make the buffer a new int.
The problem is... how do I find numbers in a string like this?
(I'm not familiar with C99/gcc that much.)
Compiler used: gcc 4.3 (Environment is a Debian Linux stable.)
I would approach as the following:
Considering '|' as the separator, tokenize the line of text, i.e. split the line into multiple fields.
For each token:
If the token is numeric:
Convert the token to a number
Some library functions that might be useful are strtok, isdigit, atoi.
One possible implementation for the approach suggested in this answer, based on sscanf.
#include <stdio.h>
#include <string.h>
void find_integers(const char* p) {
size_t s = strlen(p)+1;
char buf[s];
const char * p_end = p+s;
int n;
/* tokenize string */
for (; p < p_end && sscanf(p, "%[^|]%n", buf, &n) == 1; p += (n+1))
{
int x;
/* try to parse an integer */
if (sscanf(buf, "%d", &x) == 1) {
printf("got int :) %d\n", x);
}
else {
printf("got str :( %s\n", buf);
}
}
}
int main() {
const char * line = "Foo|bar|Baz|23|25|27";
find_integers(line);
}
Output:
$ gcc test.c && ./a.out
got str :( Foo
got str :( bar
got str :( Baz
got int :) 23
got int :) 25
got int :) 27
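The tokenizing approach described earlier (split on '|', then test each field) can also be sketched with strtok and strtol. This is one possible reading of that outline rather than anyone's definitive code, and extract_ints is a name invented here. Using strtol's end pointer means a field such as "23abc" is rejected, which a plain atoi call would not detect.

```c
#include <stdlib.h>
#include <string.h>

/* Splits a writable, '|'-separated line and stores every field that is
   entirely a decimal integer into out (at most max values).
   Returns the number of integers found. Modifies line.
   The function name is hypothetical. */
int extract_ints(char *line, long *out, int max)
{
    int n = 0;

    for (char *tok = strtok(line, "|"); tok != NULL && n < max;
         tok = strtok(NULL, "|")) {
        char *end;
        long value = strtol(tok, &end, 10);

        /* end == tok: no digits were read; *end != '\0': trailing junk. */
        if (end != tok && *end == '\0')
            out[n++] = value;
    }
    return n;
}
```

Since strtok modifies its input, the line has to be a writable array, for example char line[] = "Foo|bar|Baz|23|25|27"; rather than a pointer to a string literal as in the question.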

Working with Char in c programming

I am a newbie in the C programming language, and I have a university tutorial assignment about working with chars (I won't be graded for it) where you have to count words. I have to compile and submit my answers in an online web environment, where my code is run against test cases that are not visible to me. Here is my assignment:
Write the function 'wc' which returns a string formatted as follows: "NUMLINES NUMWORDS NUMCHARS NUMBYTES".
Whitespace characters are blanks, tabs (\t) and new lines (\n). A character is anything that is not whitespace. The given string is null-char (\0) terminated.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* wc(char* data) {
char* result ;
int numLine ;
int numWords ;
int numChars ;
int i;
int numBytes =strlen(data);
char* empty=NULL;
while(strstr(data,empty)>0){
numWords=1;
for (i = 0; i < sizeof(data); i++) {
if(data[i]=='\n'){
numLine++;
}
if(data[i]==' ' ){
numWords++;
}
if(data[i]!=' '){
numChars++;
}
}
}
sprintf(result, "%d %d %d %d", numLine, numWords, numChars, numBytes);
return result;
}
This code gives me the correct output, but I am missing something here; at least, the test tells me so.
You've got a very serious error:
char* result;
...
sprintf(result, "%d %d %d %d", numLine, numWords, numChars, numBytes);
This is not allowed in C: result is an uninitialized pointer, so sprintf writes to an indeterminate address. You need to allocate sufficient memory for the string first. Declare result as a large enough static array, or use malloc if you've covered that in your course.
e.g.
char buf[100]; // temporary buffer
sprintf(buf, "%d %d %d %d", numLine, numWords, numChars, numBytes);
char *result = malloc(strlen(buf) + 1); // just enough for the string
strcpy(result, buf); // store the string
return result;
What if you have this input?
Two Words.
You have to count the transitions between whitespace/non-whitespace, not just count spaces.
Also, I'm pretty sure strstr(data,NULL) will not do anything useful.
You also appear to be missing the \t for tab in your whitespace check, and you're not correctly tracking whether you're inside or outside a word. You can use the boolean type bool, defined in stdbool.h, for this.
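Putting these points together, a transition-counting version might look like the sketch below. It rests on assumptions the assignment text does not spell out: NUMLINES is taken to be the number of '\n' characters and NUMBYTES the strlen of the input, as in the question's code. The parameter is declared const char * here for safety; an online grader may require the exact char* signature from the assignment.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd string "NUMLINES NUMWORDS NUMCHARS NUMBYTES"
   (the caller frees it), or NULL if allocation fails. */
char *wc(const char *data)
{
    size_t lines = 0, words = 0, chars = 0;
    size_t bytes = strlen(data);
    bool in_word = false;

    for (size_t i = 0; i < bytes; i++) {
        char c = data[i];

        if (c == '\n')
            lines++;
        if (c == ' ' || c == '\t' || c == '\n') {
            in_word = false;
        } else {
            chars++;
            if (!in_word)
                words++;        /* whitespace -> non-whitespace transition */
            in_word = true;
        }
    }

    char buf[100];
    snprintf(buf, sizeof buf, "%zu %zu %zu %zu", lines, words, chars, bytes);

    char *result = malloc(strlen(buf) + 1);
    if (result != NULL)
        strcpy(result, buf);
    return result;
}
```

With this version, two words separated by several blanks still count as two words, and an empty string gives "0 0 0 0" instead of reporting one word.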
Source code of wc unix command:
http://www.gnu.org/software/cflow/manual/html_node/Source-of-wc-command.html
All test cases handled.
1) sizeof is wrong:
Instead of the sizeof operator you need to use strlen() in the for loop, like:
for (i = 0; i < strlen(data); i++)
sizeof(data) returns only the size of the pointer itself (typically 4 or 8 bytes), not the length of the text. Because you need to read every char in data[], use strlen(), which returns the number of chars in data[] before the terminating null.
2) memory error:
The next error is that no memory is allocated for result. It is declared like:
char* result ;
and never pointed at any storage, so writing to it with sprintf causes undefined behavior.
3) while(strstr(data,empty)>0) is wrong
strstr() searches for the position of one string inside another, but your empty is NULL, not an empty string. Check the prototype:
char *strstr(const char *s1, const char *s2);
Passing NULL as s2 is undefined behavior, and with an actual empty string "" strstr() would always return data. Why are you calling this? I believe you don't need this while() loop.
I improved your code to some extent, as below. Only the three errors mentioned above are corrected (read the comments to follow the changes); your basic algorithm is otherwise unchanged:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SIZE 256 // added size macro
char* wc(char* data)
{
    char* result = malloc(SIZE*sizeof(char)); //(2) allocated memory for result
    int numLine = 0;  // counters must start from a known value
    int numWords = 1;
    int numChars = 0;
    int i;
    int numBytes = strlen(data);
    if (result == NULL) {
        return NULL;
    }
    // (3) remove while loop
    for (i = 0; i < numBytes; i++) { //(1) use the string length, not sizeof
        if(data[i]=='\n'){
            numLine++;
        }
        if(data[i]==' ' ){
            numWords++;
        }
        if(data[i]!=' '){
            numChars++;
        }
    }
    sprintf(result, "%d %d %d %d", numLine, numWords, numChars, numBytes);
    return result;
}
int main(){
    char *result = wc("q toei lxlckmc \t \n ldklkjjls \n i \t nn ");
    if (result != NULL) {
        printf("\nresult: %s\n", result);
        free(result);
    }
    return 0;
}
Output:
result: 2 14 28 41

Resources