Comparing String Arrays in C

This is the code:
#include <stdio.h>
int main(void)
{
char words[256];
char filename[64];
int count = 0;
printf("Enter the file name: ");
scanf("%s", filename);
FILE *fileptr;
fileptr = fopen(filename, "r");
if(fileptr == NULL)
printf("File not found!\n");
while ((fscanf(fileptr, " %s ", words))> 0)
{
if (words==' ' || words == '\n')
count++;
}
printf("%s contains %d words.\n", filename, count);
return 0;
}
I keep getting this error:
warning: comparison between pointer and integer [enabled by default]
if (words==' ' || words == '\n')
^
The warning goes away when I change words to *words, but the program still does not give the correct result. I am trying to count the number of words in a file.

The comparison is not necessary, because a string read with %s (words) never contains whitespace (e.g. ' ' or '\n').
Try this:
while (fscanf(fileptr, "%s", words)> 0) {
count++;
}

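words is a char array, which decays to a char pointer in this expression, while ' ' is a char; that mismatch is what the warning reports. *words is the same as words[0], so it tests only the first character.
To examine every character, you would usually define a new pointer, as below:
char *p = words;
while(*p != '\0' )
{
// do whatever you need with *p here
p++;
}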
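As a minimal sketch of that pointer walk applied to the original goal, the helper below counts the words in a string that is already in memory by noting each transition from whitespace to non-whitespace. The name count_words_in is hypothetical (it does not appear in the question), and the sketch assumes words are separated by whatever isspace() classifies as whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts whitespace-separated words in s. */
size_t count_words_in(const char *s)
{
    size_t count = 0;
    int in_word = 0;                      /* are we inside a word right now? */

    for (const char *p = s; *p != '\0'; p++) {
        if (isspace((unsigned char)*p)) {
            in_word = 0;                  /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;                  /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

Calling count_words_in(words) on a line read with fgets() would give the number of words on that line.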

C has no built-in string type. Every string (including a string literal) is an array of chars terminated by '\0'. Use strcmp to compare them.

Note that the array name words, used by itself, evaluates to a pointer to the first element of the array. If you need to compare two strings in C, strcmp is what you are looking for.
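A short sketch of the difference, assuming a hypothetical wrapper named same_string (not part of the question): strcmp compares the characters, whereas == on two arrays or pointers compares only their addresses.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when a and b hold the same characters. */
int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;   /* strcmp returns 0 for equal strings */
}
```

With this, same_string(words, "hello") is true whenever the buffer contains "hello", regardless of where the two strings live in memory.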

You cannot compare strings in C with the == operator. Use the standard library function strcmp instead, which compares them character by character. Here is its prototype, declared in the string.h header.
int strcmp(const char *s1, const char *s2);
The strcmp function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.
The fscanf format string " %s " (note the leading and trailing spaces) reads and discards any amount of whitespace, which "%s" does for leading whitespace anyway. This means fscanf never writes whitespace into the buffer words: it stores only non-whitespace characters and stops as soon as it encounters whitespace. So, to count the number of words, just increase the counter for each successful fscanf call.
Also, your program should guard against buffer overflow in the scanf and fscanf calls. If the input string is too big for the buffer, the behaviour is undefined and the program may crash with a segfault. You can prevent this by giving a maximum field width in the format string. scanf("%63s", filename); means scanf will read from stdin until it encounters whitespace, write at most 63 non-whitespace characters into the buffer filename, and then add a terminating null byte at the end.
#include <stdio.h>
#include <string.h>
int main(void) {
// assuming max word length is 256
// +1 for the terminating null byte added by scanf
char words[256 + 1];
// assuming max file name length is 64
// +1 for the terminating null byte
char filename[64 + 1];
int count = 0; // counter for number of words
printf("Enter the file name: ");
scanf("%64s", filename);
FILE *fileptr;
fileptr = fopen(filename, "r");
if(fileptr == NULL) {
printf("File not found!\n");
return 1; // nothing to read, so do not call fscanf on a null pointer
}
while((fscanf(fileptr, "%256s", words)) == 1)
count++;
printf("%s contains %d words.\n", filename, count);
return 0;
}

Related

Write a program that reads strings and writes them to a file

Here's my task, below is most of the code done and finally my specific question
Write a program that reads strings and writes them to a file. The string must be dynamically allocated and the string can be of arbitrary length. When the string has been read it is written to the file. The length of the string must be written first then a colon (‘:’) and then the string. The program stops when user enters a single dot (‘.’) on the line.
For example:
User enters: This is a test
Program writes to file: 14:This is a test
Hint: fgets() writes a line feed at the end of the string if it fits in the string. Start with a small length, for example 16 characters, if you don’t see a line feed at the end then realloc the string to add more space and keep on adding new data to the string until you see a line feed at the end. Then you know that you have read the whole line. Then remove any ‘\r’ or ‘\n’ from the string and write the string length and the string to the file. Free the string before asking for a new string.
MY CODE:
#pragma warning(disable: 4996)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME_SZ 256
int main()
{
char key[] = ".\n";
char* text;
text = (char*)malloc(MAX_NAME_SZ);
if (text == NULL)
{
perror("problem with allocating memory with malloc for *text");
return 1;
}
FILE* fp;
fp = fopen("EX13.txt", "w");
if (fp == NULL)
{
perror("EX13.txt not opened.\n");
return 1;
}
printf("Enter text or '.' to exit: ");
while (fgets(text, MAX_NAME_SZ, stdin) && strcmp(key, text))
{
fprintf(fp, "%ld: %s", strlen(text) - 1, text);
printf("Enter text or '.' to exit: ");
}
free((void*)text);
fclose(fp);
puts("Exit program");
return 0;
}
SPECIFIC QUESTION:
How can I make the program accept arbitrarily long lines, so that there is no limit at all on line length? Thanks
You could declare a pointer to char, read character by character, and keep reallocating the buffer until you reach the '\n':
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char key[] = "."; //Excluded the \n since I'm not using fget
char* text;
FILE* fp;
fp = fopen("EX13.txt", "w");
if (fp == NULL)
{
perror("EX13.txt not opened.\n");
return 1;
}
printf("Enter text or '.' to exit: ");
int cont = 0;
while (1) //read all chars
{
if(!cont) //if it is the first, allocate space for 1
text = (char*) malloc(sizeof (char));
else //otherwise increase the space allocated by 1
text = (char*) realloc(text, (cont + 1) * sizeof(char));
if (scanf("%c", &text[cont]) != 1) //read a single char; stop at end of input or on a read error
break;
if(text[cont] == '\n') //see if it is the end of line
{
text[cont] = 0; //if it is the end of line, then it is the end of the string
if(!strcmp(key, text)) //if the string is just a dot, end the loop
break;
fprintf(fp, "%d:%s\n", cont, text); //cont is an int, so %d; the task asks for "length:string"
printf("Enter text or '.' to exit: ");
cont = 0; //restarting the counter for the next input
free(text); // freeing after each iteration. you can optimize to maintain the space and only increase after getting to a bigger string than the previous you had so far
}
else //if it is not the end of the string, increase its size by 1
cont++;
}
free((void*)text);
fclose(fp);
puts("Exit program");
return 0;
}
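The approach described in the assignment's hint (fgets() into a small buffer, then realloc until a line feed shows up) can be sketched as follows. This is only one possible shape: the function name read_line, the starting size of 16, and the doubling growth policy are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
 * Returns a malloc'd string without the line ending (caller frees),
 * or NULL at end of input with nothing read or on allocation failure. */
char *read_line(FILE *fp)
{
    size_t cap = 16;                 /* small initial size, as in the hint */
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* each fgets call fills the free space after what we already hold */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[strcspn(buf, "\r\n")] = '\0';   /* remove '\r' and '\n' */
            return buf;
        }
        /* no line feed yet: the buffer was too small, so double it */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (len == 0) {                  /* end of input before any character */
        free(buf);
        return NULL;
    }
    return buf;                      /* last line had no trailing line feed */
}
```

A caller would loop on read_line(stdin), stop when the result equals ".", write strlen() and the text to the file, and free() each returned string.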
Suggest using getline()
This seems to be a class room assignment, so I will not be writing the code for you.
Note: for the getline() function to be visible in linux, at the beginning of your code, you will need a statement similar to:
#define _GNU_SOURCE
or
#define _POSIX_C_SOURCE 200809L
getline(3)
NAME
getdelim, getline -- get a line from a stream
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <stdio.h>
ssize_t
getdelim(char ** restrict linep, size_t * restrict linecapp,
int delimiter, FILE * restrict stream);
ssize_t
getline(char ** restrict linep, size_t * restrict linecapp,
FILE * restrict stream);
DESCRIPTION
The getdelim() function reads a line from stream, delimited by the char-
acter delimiter. The getline() function is equivalent to getdelim() with
the newline character as the delimiter. The delimiter character is
included as part of the line, unless the end of the file is reached.
The caller may provide a pointer to a malloced buffer for the line in
*linep, and the capacity of that buffer in *linecapp. These functions
expand the buffer as needed, as if via realloc(). If linep points to a
NULL pointer, a new buffer will be allocated. In either case, *linep and
*linecapp will be updated accordingly.
RETURN VALUES
The getdelim() and getline() functions return the number of characters
written, excluding the terminating NUL character. The value -1 is
returned if an error occurs, or if end-of-file is reached.
EXAMPLES
The following code fragment reads lines from a file and writes them to
standard output. The fwrite() function is used in case the line contains
embedded NUL characters.
char *line = NULL;
size_t linecap = 0;
ssize_t linelen;
while ((linelen = getline(&line, &linecap, fp)) > 0)
fwrite(line, linelen, 1, stdout);
ERRORS
These functions may fail if:
[EINVAL] Either linep or linecapp is NULL.
[EOVERFLOW] No delimiter was found in the first SSIZE_MAX characters.
These functions may also fail due to any of the errors specified for
fgets() and malloc().
Note: you will need to pass to free() the line, when the code is through with it, to avoid a memory leak.
Note: to remove any trailing '\n' you can use:
line[ strcspn( line, "\n" ) ] = '\0';
Note: after removing any trailing '\n' you can use:
size_t length = strlen( line );
To get the length of the line in bytes.
Then print that length and the line using:
printf( "%zu:%s", length, line );
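Purely as an illustration of how the notes above fit together, here is a minimal sketch, not a full solution to the assignment. It assumes a POSIX system where getline() is available; the function name write_prefixed_lines and its return value are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* must precede the includes for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>            /* ssize_t */

/* Hypothetical helper: copies lines from in to out as "length:text",
 * stopping at a line that holds a single dot.
 * Returns the number of lines written. */
int write_prefixed_lines(FILE *in, FILE *out)
{
    char *line = NULL;      /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int written = 0;

    while (getline(&line, &cap, in) != -1) {
        line[strcspn(line, "\r\n")] = '\0';     /* drop the line ending */
        if (strcmp(line, ".") == 0)
            break;
        fprintf(out, "%zu:%s\n", strlen(line), line);
        written++;
    }
    free(line);             /* one free() covers every reallocation */
    return written;
}
```

Here the same buffer is reused for every line and freed once at the end; the assignment's wording asks for a free before each new string, so a submission may need to free and reset line and cap inside the loop instead.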

Splitting user input into strings of specific length

I'm writing a C program that parses user input into a char, and two strings of set length. The user input is stored into a buffer using fgets, and then parsed with sscanf. The trouble is, the three fields have a maximum length. If a string exceeds this length, the remaining characters before the next whitespace should be consumed/discarded.
#include <stdio.h>
#define IN_BUF_SIZE 256
int main(void) {
char inputStr[IN_BUF_SIZE];
char command;
char firstname[6];
char surname[6];
fgets(inputStr, IN_BUF_SIZE, stdin);
sscanf(inputStr, "%c %5s %5s", &command, firstname, surname);
printf("%c %s %s\n", command, firstname, surname);
}
So, with an input of
a bbbbbbbb cc
the desired output would be
a bbbbb cc
but instead the output is
a bbbbb bbb
Using a format specifier "%c%*s %5s%*s %5s%*s" runs into the opposite problem, where each substring needs to exceed the set length to get to the desired outcome.
Is there a way to achieve this by using format specifiers, or is the only way to save the substrings in buffers of their own before cutting them down to the desired length?
In addition to the other answers: never forget, when facing string parsing problems, that you always have the option of simply walking a pointer down the string to accomplish any type of parsing you require. When you read your string into a buffer (buf in the code below), you have an array of characters you are free to analyze manually (either with array indexes, e.g. buf[i], or by assigning a pointer to the beginning, e.g. char *p = buf;). With your string, you have the following in the buffer, with p pointing to the first character:
--------------------------------
|a| |b|b|b|b|b|b|b|b| |c|c|\n|0| contents
--------------------------------
 0 1 2 3 4 5 6 7 8 9 0 1 2 3  4  index
 |
 p
To test the character pointed to by p, you simply dereference the pointer, e.g. *p. So to test whether you have an initial character between a-z followed by a space at the beginning of buffer, you simply need do:
/* validate first char is 'a-z' and followed by ' ' */
if (*p && 'a' <= *p && *p <= 'z' && *(p + 1) == ' ') {
cmd = *p;
p += 2; /* advance pointer to next char following ' ' */
}
note: you are testing *p first (which is shorthand for *p != 0 or the equivalent *p != '\0') to validate that the string is not empty (i.e. the first char isn't the nul-byte) before proceeding with further tests. You would also include an else { /* handle error */ } in the event any one of the tests failed (meaning you have no command followed by a space).
When you are done, you are left with p pointing to the third character in the buffer, e.g.:
--------------------------------
|a| |b|b|b|b|b|b|b|b| |c|c|\n|0| contents
--------------------------------
 0 1 2 3 4 5 6 7 8 9 0 1 2 3  4  index
     |
     p
Now your job is simple: advance by no more than 5 characters (or until the next space is encountered), assigning the characters to firstname, and then nul-terminate following the last character:
/* read up to NMLIM chars into fname */
for (n = 0; n < NMLIM && *p && *p != ' ' && *p != '\n'; p++)
fname[n++] = *p;
fname[n] = 0; /* nul terminate */
note: since fgets reads and includes the trailing '\n' in the buffer, you should also test for the newline.
When you exit the loop, p is pointing to the character at index 7 in the buffer, as follows:
--------------------------------
|a| |b|b|b|b|b|b|b|b| |c|c|\n|0| contents
--------------------------------
 0 1 2 3 4 5 6 7 8 9 0 1 2 3  4  index
               |
               p
You now simply read forward until you encounter the next space and then advance past it, e.g.:
/* discard remaining chars up to next ' ' */
while (*p && *p != ' ') p++;
if (*p) p++; /* advance past the ' ', but never past the nul-byte */
note: if you exited the firstname loop pointing at a space, the while loop does not execute and only the final advance runs.
Finally, all you do is repeat the same loop for surname that you did for firstname. Putting all the pieces of the puzzle together, you could do something similar to the following:
#include <stdio.h>
enum { NMLIM = 5, BUFSIZE = 256 };
int main (void) {
char buf[BUFSIZE] = "";
while (fgets (buf, BUFSIZE, stdin)) {
char *p = buf, cmd, /* start & end pointers */
fname[NMLIM+1] = "",
sname[NMLIM+1] = "";
size_t n = 0;
/* validate first char is 'a-z' and followed by ' ' */
if (*p && 'a' <= *p && *p <= 'z' && *(p + 1) == ' ') {
cmd = *p;
p += 2; /* advance pointer to next char following ' ' */
}
else { /* handle error */
fprintf (stderr, "error: no single command followed by space.\n");
return 1;
}
/* read up to NMLIM chars into fname */
for (n = 0; n < NMLIM && *p && *p != ' ' && *p != '\n'; p++)
fname[n++] = *p;
fname[n] = 0; /* nul terminate */
/* discard remaining chars up to next ' ' */
while (*p && *p != ' ') p++;
if (*p) p++; /* advance past the ' ', but never past the nul-byte */
/* read up to NMLIM chars into sname */
for (n = 0; n < NMLIM && *p && *p != ' ' && *p != '\n'; p++)
sname[n++] = *p;
sname[n] = 0; /* nul terminate */
printf ("input : %soutput : %c %s %s\n",
buf, cmd, fname, sname);
}
return 0;
}
Example Use/Output
$ echo "a bbbbbbbb cc" | ./bin/walkptr
input : a bbbbbbbb cc
output : a bbbbb cc
Look things over and let me know if you have any questions. No matter how elaborate the string or what you need from it, you can always get what you need by simply walking a pointer (or a pair of pointers) down the length of the string.
One way to split the input buffer as OP desires is to use multiple calls to sscanf(), and to use the %n conversion specifier to keep track of the number of characters read. In the code below, the input string is scanned in three stages.
First, the pointer strPos is assigned to point to the first character of inputStr. Then the input string is scanned with " %c%n%*[^ ]%n". This format string skips over any initial whitespace that a user might enter before the first character, and stores the first character in command. The %n directive tells sscanf() to store the number of characters read so far in the variable n; then the %*[^ ] directive tells sscanf() to read and ignore any characters until a space is encountered. This effectively skips over any remaining characters that were entered after the initial command character. The %n directive appears again, and overwrites the previous value with the number of characters read up to this point. The reason for using %n twice is that, if the user enters a character followed by a space (as expected), the %*[^ ] directive will find no matches, and sscanf() will return without ever reaching the final %n directive.
The pointer strPos is moved to the beginning of the remaining string by adding n to it, and sscanf() is called a second time, this time with "%5s%n%*[^ ]%n". Here, up to 5 characters are read into the character array firstname[], the number of characters read is saved by the %n directive, any remaining non-whitespace characters are read and ignored, and finally, if the scan made it this far, the number of characters read is saved again.
strPos is increased by n again, and the final scan only needs "%s" to complete the task.
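The double-%n technique can be isolated in a small sketch. The helper name take_field and its interface are hypothetical, and the scanset here stops at a space, tab, or newline rather than only at a space.

```c
#include <stdio.h>

/* Hypothetical helper: copies at most 5 characters of the next word of s
 * into out and reports in *consumed how far the scan got, including any
 * discarded excess characters of an over-long word.
 * Returns 1 on success, 0 if s holds no word. */
int take_field(const char *s, char out[6], int *consumed)
{
    int n = 0;

    /* The first %n records the offset after the kept part. The second %n
     * only runs, and overwrites it, when excess characters were discarded. */
    if (sscanf(s, "%5s%n%*[^ \t\n]%n", out, &n, &n) < 1)
        return 0;
    *consumed = n;
    return 1;
}
```

For the input "bbbbbbbb cc" this stores "bbbbb" and reports 8 consumed characters, so adding the count to the scan pointer lands on the space before "cc".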
Note that the return value of fgets() is checked to be sure that it was successful. The call to fgets() was changed slightly to:
fgets(inputStr, sizeof inputStr, stdin)
The sizeof operator is used here instead of IN_BUF_SIZE. This way, if the declaration of inputStr is changed later, this line of code will still be correct. Note that the sizeof operator works here because inputStr is an array, and arrays do not decay to pointers in sizeof expressions. But, if inputStr were passed into a function, sizeof could not be used in this way inside the function, because arrays decay to pointers in most expressions, including function calls. Some, such as @DavidC.Rankin, prefer constants as OP has used. If this seems confusing, I would suggest sticking with the constant IN_BUF_SIZE.
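The difference between the two cases can be seen in a small sketch; both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define IN_BUF_SIZE 256

/* Inside a function the parameter is a pointer, so sizeof yields the
 * size of a pointer, not the size of the caller's array. */
size_t size_seen_by_callee(char *buf)
{
    return sizeof buf;
}

/* Here the operand is a real array, so sizeof yields its full size. */
size_t size_of_local_array(void)
{
    char inputStr[IN_BUF_SIZE];
    return sizeof inputStr;
}
```

This is why a function that receives a buffer usually takes its size as a separate parameter.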
Also note that the return values for each of the calls to sscanf() are checked to be certain that the input matches expectations. For example, if the user enters a command and a first name, but no surname, the program will print an error message and exit. It is worth pointing out that, if the user enters, say, a command character and first name only, after the second sscanf() the match may have failed on \n, and strPos is then incremented to point to the \0 and so is still in bounds. But this relies on the newline being in the string. With no newline, the match might fail on \0, and then strPos would be incremented out of bounds before the next call to sscanf(). Fortunately, fgets() retains the newline, unless the input line is larger than the specified size of the buffer. Then there is no \n, only the \0 terminator. A more robust program would check the input string for \n, and add one if needed. It would not hurt to increase the size of IN_BUF_SIZE.
#include <stdio.h>
#include <stdlib.h>
#define IN_BUF_SIZE 256
int main(void)
{
char inputStr[IN_BUF_SIZE];
char command;
char firstname[6];
char surname[6];
char *strPos = inputStr; // next scan location
int n = 0; // holds number of characters read
if (fgets(inputStr, sizeof inputStr, stdin) == NULL) {
fprintf(stderr, "Error in fgets()\n");
exit(EXIT_FAILURE);
}
if (sscanf(strPos, " %c%n%*[^ ]%n", &command, &n, &n) < 1) {
fprintf(stderr, "Input formatting error: command\n");
exit(EXIT_FAILURE);
}
strPos += n;
if (sscanf(strPos, "%5s%n%*[^ ]%n", firstname, &n, &n) < 1) {
fprintf(stderr, "Input formatting error: firstname\n");
exit(EXIT_FAILURE);
}
strPos += n;
if (sscanf(strPos, "%5s", surname) < 1) {
fprintf(stderr, "Input formatting error: surname\n");
exit(EXIT_FAILURE);
}
printf("%c %s %s\n", command, firstname, surname);
}
Sample interaction:
a Zaphod Beeblebrox
a Zapho Beebl
The scanf() family of functions has a reputation for being subtle and error-prone; the format strings used above may seem a little bit tricky. By writing a function to skip to the next word in the input string, the calls to sscanf() can be simplified. In the code below, skipToNext() takes a pointer to a string as input; if the first character of the string is a \0 terminator, the pointer is returned unchanged. All initial non-whitespace characters are skipped over, then any whitespace characters are skipped, up to the next non-whitespace character (which may be a \0). A pointer is returned to this non-whitespace character.
The resulting program is a little bit longer than the previous program, but it may be easier to understand, and it certainly has simpler format strings. This program does differ from the first in that it no longer accepts leading whitespace in the string. If the user enters whitespace before the command character, this is considered erroneous input.
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#define IN_BUF_SIZE 256
char * skipToNext(char *);
int main(void)
{
char inputStr[IN_BUF_SIZE];
char command;
char firstname[6];
char surname[6];
char *strPos = inputStr; // next scan location
if (fgets(inputStr, sizeof inputStr, stdin) == NULL) {
fprintf(stderr, "Error in fgets()\n");
exit(EXIT_FAILURE);
}
if (sscanf(strPos, "%c", &command) != 1 || isspace((unsigned char)command)) {
fprintf(stderr, "Input formatting error: command\n");
exit(EXIT_FAILURE);
}
strPos = skipToNext(strPos);
if (sscanf(strPos, "%5s", firstname) != 1) {
fprintf(stderr, "Input formatting error: firstname\n");
exit(EXIT_FAILURE);
}
strPos = skipToNext(strPos);
if (sscanf(strPos, "%5s", surname) != 1) {
fprintf(stderr, "Input formatting error: surname\n");
exit(EXIT_FAILURE);
}
printf("%c %s %s\n", command, firstname, surname);
}
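char * skipToNext(char *c)
{
/* skip the rest of the current word, stopping at the \0 terminator */
while (*c != '\0' && !isspace((unsigned char)*c)) {
++c;
}
/* skip the whitespace that follows, up to the next word or the \0 */
while (isspace((unsigned char)*c)) {
++c;
}
return c;
}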

Scanning string from byte array

Is there a way to use scanf() to scan a string from an array of bytes?
i.e: scan any number of bytes before a specific value is found, and after that scan the subsequent string?
The main problem I'm having is dealing with the '\0' value. Is there a way to make scanf() bypass the NUL terminator in a controlled way?
Why don't you just iterate over the string?
char str[30];
char str1[30];
char str2[30];
//Initialize str to some string
int i=0;
while(str[i]!='\0' && str[i]!='x') //say scan till you find x (or the string ends)
{
i++;
}
if(str[i]=='x')
i++; //include the x itself
memcpy(str1, str, i); //extract this substring up to and including x
str1[i]='\0';
int j=0;
while(str[i]!='\0') //now measure the rest
{
j++; //count the chars after x
i++;
}
memcpy(str2, &str[i-j], j); //extract the rest
str2[j]='\0';
You can use sscanf() instead:
char string[100] = "720 11 43";
int x, y, z;
sscanf(string, "%d %d %d", &x, &y, &z);
This cannot be done on an array of bytes with sscanf(), because sscanf() stops when it reaches a '\0'. @Joseph Quinsey
If code is reading from a file or stdin, there is a solution.
Both fscanf("%[^something]") and fscanf("%c") will scan a '\0'. Even fgets() will scan a '\0'.
OP: "scan any number of bytes before a specific value is found, and after that scan the subsequent string?"
The following will 1) scan over any number of bytes until an x is found, 2) scan the x, 3) scan over white-space and 4) scan and save non-white-space. Unfortunately this last step treats an embedded '\0' as non-white-space.
char buf[100];
if (1 == fscanf(inf, "%*[^x]x%99s", buf)) string_after_x_is_found(buf);
To use fscanf("%c") in a general sense
FILE *inf;
inf = fopen("something", "rb");
char ch;
while (fscanf(inf,"%c", &ch) == 1) {
foo(ch);
}
fclose(inf);
Embedded '\0' really messes up the scanf() family. Careful use of format can work, but I recommend simply using fread() or fgetc() instead.
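For an array of bytes already in memory (as opposed to a stream), one option not used in the answers above is memchr(), which takes an explicit length and therefore does not stop at '\0'. The sketch below is only an illustration under that assumption; the name string_after is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: searches the first len bytes of data for marker and
 * returns a pointer to the string that starts right after it. Returns NULL
 * if marker is absent, is the last byte, or is not followed by a '\0'
 * inside the array. */
const char *string_after(const unsigned char *data, size_t len,
                         unsigned char marker)
{
    const unsigned char *hit = memchr(data, marker, len);
    if (hit == NULL)
        return NULL;

    const unsigned char *start = hit + 1;
    size_t remaining = len - (size_t)(start - data);

    /* only hand back a string that terminates inside the array */
    if (remaining == 0 || memchr(start, '\0', remaining) == NULL)
        return NULL;
    return (const char *)start;
}
```

The returned pointer can then be passed to sscanf() or any other string function, since everything from that point up to the terminator is an ordinary C string.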

help with C string i/o

My program reads through a .txt file and puts all chars that pass isalpha() into a char array. For spaces it puts \0, so the array holds the words as a series of strings. This works.
I need help with the second part. I need to read a string that the user inputs (let's call it the target string), find all instances of the target string in the char array, and then for each instance:
1. print the 5 words before the target string
2. print the target string itself
3. and print the 5 words after the target string
I can't figure it out. I'm new to C in general, and I find this I/O really difficult after coming from Java. Any help would be appreciated; here's the code I have right now:
#include <stdio.h>
#include <string.h>
main(argc, argv)
int argc;
char *argv[];
{
FILE *inFile;
char ch, ch1;
int i, j;
int arrayPointer = 0;
char wordArray [150000];
for (i = 0; i < 150000; i++)
wordArray [i] = ' ';
/* Reading .txt, strip punctuation, conver to lowercase, add char to array */
void extern exit(int);
if(argc > 2) {
fprintf(stderr, "Usage: fread <filename>\n");
exit(-1);
}
inFile = fopen(argv[1], "r");
ch = fgetc(inFile);
while (ch != EOF) {
if(isalpha(ch)) {
wordArray [arrayPointer] = tolower(ch);
arrayPointer++;
}
else if (isalpha(ch1)) {
wordArray [arrayPointer] = '\0';
arrayPointer++;
}
ch1 = ch;
ch = fgetc(inFile);
}
fclose;
/* Getting the target word from the user */
char str [20];
do {
printf("Enter a word, or type \"zzz\" to quit: ");
scanf ("%s", str);
char* pch;
pch = strstr(wordArray, str);
printf("Found at %d\n", pch - wordArray + 1);
pch = strstr(pch + 1, str);
} while (pch != NULL);
}
There are a number of problems here, but the one that is probably tripping you up the most is the use of strstr as you've got it. Both parameters are strings; the first is the haystack, and the second is the needle. The definition of a C string is (basically) a sequence of characters terminated by '\0'. Take a look at how you've constructed your wordArray; it's effectively a series of strings one right after the other. So when you are using strstr the first time, you are only ever looking at the first string.
I realize this isn't the entire answer you are looking for, but hopefully it points you in the right direction. You may want to consider building up an array of char * that points into your wordArray at each word. Iterate over that new array checking for the string the user is looking for. If you find it, you now have an index you can work backwards and forwards from.
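A rough sketch of that idea follows. The function names index_words and find_word are hypothetical, and the sketch assumes the packed region ends with a '\0' so that the last word is terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: packed holds len bytes made of '\0'-terminated words
 * stored back to back. Stores a pointer to each word in words[] (at most
 * max of them) and returns how many were stored. */
size_t index_words(char *packed, size_t len, char **words, size_t max)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len && count < max) {
        words[count++] = &packed[i];
        i += strlen(&packed[i]) + 1;   /* jump past this word and its '\0' */
    }
    return count;
}

/* Returns the index of the first word equal to target at or after start,
 * or count if there is none. */
size_t find_word(char **words, size_t count, size_t start, const char *target)
{
    for (size_t i = start; i < count; i++)
        if (strcmp(words[i], target) == 0)
            return i;
    return count;
}
```

Given a match at index i, the five words before it are words[i-5] through words[i-1] and the five after it are words[i+1] through words[i+5], each clamped to the range 0 to count-1; calling find_word again with start set to i+1 finds the next instance.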

Parsing text in C

I have a file like this:
...
words 13
more words 21
even more words 4
...
(General format is a string of non-digits, then a space, then any number of digits and a newline)
and I'd like to parse every line, putting the words into one field of the structure, and the number into the other. Right now I am using an ugly hack of reading the line while the chars are not numbers, then reading the rest. I believe there's a clearer way.
Edit: You can use pNum-buf to get the length of the alphabetical part of the string, and use strncpy() to copy that into another buffer. Be sure to add a '\0' to the end of the destination buffer. I would insert this code before the pNum++.
int len = pNum-buf; // number of chars before the last space
strncpy(newBuf, buf, len); // copy exactly those chars
newBuf[len] = '\0';
You could read the entire line into a buffer and then use:
char *pNum;
if ((pNum = strrchr(buf, ' ')) != NULL) {
pNum++;
}
to get a pointer to the number field.
fscanf(file, "%s %d", word, &value);
This gets the values directly into a string and an integer, and copes with variations in whitespace and numerical formats, etc.
Edit
Ooops, I forgot that you had spaces between the words.
In that case, I'd do the following. (Note that it truncates the original text in 'line')
// Scan to find the last space in the line
char *p = line;
char *lastSpace = NULL;
while(*p != '\0')
{
if (*p == ' ')
lastSpace = p;
p++;
}
if (lastSpace == NULL)
return("parse error");
// Replace the last space in the line with a NUL
*lastSpace = '\0';
// Advance past the NUL to the first character of the number field
lastSpace++;
char *word = line; // line now ends where the number field began
int number = atoi(lastSpace);
You can solve this using stdlib functions, but the above is likely to be more efficient as you're only searching for the characters you are interested in.
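For comparison, the same split can be written with standard library calls. This is a sketch under the question's stated format (text, one space, then digits); the name split_words_number is hypothetical, and strtol() replaces atoi() so that a non-numeric last field can be detected.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "some words 42" in place. On success the
 * text before the last space stays in line, *number receives the value,
 * and 1 is returned; on a malformed line 0 is returned. */
int split_words_number(char *line, long *number)
{
    line[strcspn(line, "\r\n")] = '\0';      /* drop any line ending */

    char *last_space = strrchr(line, ' ');
    if (last_space == NULL || last_space[1] == '\0')
        return 0;                            /* no space, or nothing after it */

    char *end;
    long value = strtol(last_space + 1, &end, 10);
    if (end == last_space + 1 || *end != '\0')
        return 0;                            /* last field is not a number */

    *last_space = '\0';                      /* cut the line before the number */
    *number = value;
    return 1;
}
```

Like the loop above, this truncates the original text in the line buffer.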
Given the description, I think I'd use a variant of this (now tested) C99 code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct word_number
{
char word[128];
long number;
};
int read_word_number(FILE *fp, struct word_number *wnp)
{
char buffer[140];
if (fgets(buffer, sizeof(buffer), fp) == 0)
return EOF;
size_t len = strlen(buffer);
if (buffer[len-1] != '\n') // Error if line too long to fit
return EOF;
buffer[--len] = '\0';
char *num = &buffer[len-1];
while (num > buffer && !isspace((unsigned char)*num))
num--;
if (num == buffer) // No space in input data
return EOF;
char *end;
wnp->number = strtol(num+1, &end, 0);
if (*end != '\0') // Invalid number as last word on line
return EOF;
*num = '\0';
if ((size_t)(num - buffer) >= sizeof(wnp->word)) // Non-number part too long
return EOF;
memcpy(wnp->word, buffer, num - buffer + 1); // +1 copies the terminating '\0' as well
return(0);
}
int main(void)
{
struct word_number wn;
while (read_word_number(stdin, &wn) != EOF)
printf("Word <<%s>> Number %ld\n", wn.word, wn.number);
return(0);
}
You could improve the error reporting by returning different values for different problems.
You could make it work with dynamically allocated memory for the word portion of the lines.
You could make it work with longer lines than I allow.
You could scan backwards over digits instead of non-spaces - but this allows the user to write "abc 0x123" and the hex value is handled correctly.
You might prefer to ensure there are no digits in the word part; this code does not care.
You could try using strtok() to tokenize each line, and then check whether each token is a number or a word (a fairly trivial check once you have the token string - just look at the first character of the token).
Assuming that the number is immediately followed by '\n'.
You can read each line into a char buffer, use sscanf() with a format that skips the non-digit text (for example "%*[^0123456789]%d") to get the number, and then calculate how many chars that number occupies at the end of the line to find where the text ends.
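One way to make the sscanf() idea concrete, assuming the question's format where the text part contains no digits, is to let a scanset capture the text directly. The helper name parse_words_number and the 128-byte limit are assumptions made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: word must have room for 128 bytes. Returns 1 when
 * both the text and the number were read, 0 otherwise. */
int parse_words_number(const char *line, char word[128], int *number)
{
    /* %127[^0123456789] reads everything up to the first digit */
    if (sscanf(line, "%127[^0123456789]%d", word, number) != 2)
        return 0;

    /* the scanset also captured the space before the number; trim it */
    size_t len = strlen(word);
    while (len > 0 && word[len - 1] == ' ')
        word[--len] = '\0';
    return 1;
}
```

This variant leaves the input line untouched, at the cost of failing on lines whose text part contains a digit.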
Depending on how complex your strings become you may want to use the PCRE library. At least that way you can compile a perl'ish regular expression to split your lines. It may be overkill though.
Given the description, here's what I'd do: read each line as a single string using fgets() (making sure the target buffer is large enough), then split the line using strtok(). To determine if each token is a word or a number, I'd use strtol() to attempt the conversion and check the error condition. Example:
#include <stdlib.h>
#include <stdio.h>
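#include <string.h>
/* assumed limits, not given in the original answer; adjust to your data */
#define MAX_LINE_LENGTH 256
#define MAX_STR_SIZE 64
#define MAX_NUM_STRINGS 16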
/**
* Read the next line from the file, splitting the tokens into
* multiple strings and a single integer. Assumes input lines
* never exceed MAX_LINE_LENGTH and each individual string never
* exceeds MAX_STR_SIZE. Otherwise things get a little more
* interesting. Also assumes that the integer is the last
* thing on each line.
*/
int getNextLine(FILE *in, char (*strs)[MAX_STR_SIZE], int *numStrings, int *value)
{
char buffer[MAX_LINE_LENGTH];
int rval = 1;
if (fgets(buffer, sizeof buffer, in))
{
char *token = strtok(buffer, " ");
*numStrings = 0;
while (token)
{
char *chk;
*value = (int) strtol(token, &chk, 10);
if (*chk != 0 && *chk != '\n')
{
strcpy(strs[(*numStrings)++], token);
}
token = strtok(NULL, " ");
}
}
else
{
/**
* fgets() hit either EOF or error; either way return 0
*/
rval = 0;
}
return rval;
}
/**
* sample main
*/
int main(void)
{
FILE *input;
char strings[MAX_NUM_STRINGS][MAX_STR_SIZE];
int numStrings;
int value;
input = fopen("datafile.txt", "r");
if (input)
{
while (getNextLine(input, strings, &numStrings, &value))
{
/**
* Do something with strings and value here
*/
}
fclose(input);
}
return 0;
}

Resources