Reading text files into an array in C

I have a question about reading lines of text from a text file: how would you separate the words and store them in an array?
For example, if I have two lines of text in my text file that look like this:
1005; AndyCool; Andy; Anderson; 23; LA
1006; JohnCool; John; Anderson; 23; LA
How would you split them on the ';' character
and then store the fields in a 2D array?
Sorry, I haven't started coding yet, so I have nothing to paste here.
Cheers ...

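Use the strsep function (a BSD/glibc extension declared in <string.h>, not part of standard C):
char *token;
char *line;
/* I assume line points to a modifiable line loaded from the file */
if (line != NULL) {
    /* strsep writes '\0' over each ';' and advances line; line is NULL at the end */
    while ((token = strsep(&line, ";")) != NULL)
    {
        /*
           token points to the current extracted string,
           use it to fill your array
        */
    }
}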

First read each line using fgets, then use strtok to split it: http://www.cplusplus.com/reference/cstring/strtok/
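The fgets-then-strtok approach could look roughly like the sketch below for one record of the question's sample data. The helper name split_record and the limits MAX_FIELDS and MAX_LEN are made up for illustration, and the sketch assumes the line has already been read into a modifiable buffer (for example by fgets).

```c
#include <string.h>

#define MAX_FIELDS 8   /* assumed upper bound on fields per line */
#define MAX_LEN    32  /* assumed upper bound on field length, including '\0' */

/* Hypothetical helper: splits one ';'-separated line into fields[][]
   and returns the number of fields stored. The line is modified. */
static int split_record(char *line, char fields[][MAX_LEN], int max_fields)
{
    int count = 0;

    /* Drop the newline that fgets leaves at the end of the line. */
    line[strcspn(line, "\r\n")] = '\0';

    for (char *tok = strtok(line, ";");
         tok != NULL && count < max_fields;
         tok = strtok(NULL, ";")) {
        /* Skip the spaces that follow each ';' in the sample data. */
        tok += strspn(tok, " ");

        /* Copy the token, truncating it if it does not fit. */
        strncpy(fields[count], tok, MAX_LEN - 1);
        fields[count][MAX_LEN - 1] = '\0';
        count++;
    }
    return count;
}
```

To hold several lines, one option is a three-dimensional array such as char table[MAX_ROWS][MAX_FIELDS][MAX_LEN] (MAX_ROWS being another assumed limit), passing table[row] to split_record for each line that fgets returns.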

Look at the manual pages for the fopen, fgets, strstr, strchr and strspn functions. The strtok and strsep functions also work for most things you will do.

Related

C - Nested loop using strtok

I am trying to use strtok to split up a text file into strings that I can pass to a spell check function, the text file includes characters such as '\n', ' ?!,.' etc...
I need to print any words that fail the spell check and the line number that they are on. Keeping track of the line is what I'm struggling with.
I have tried this so far but it only returns results for the first line of the text file:
char str[409377];
fread(str, noOfChars, 1, file);
fclose(file);
int lines=1;
char *token;
char *line;
char splitLine[] = "\n";
char delimiters[] = " ,.?!(){}*&^%$£_-+=";
line = strtok(str, splitLine);
while(line!=NULL){
    token = strtok(line, delimiters);
    while(token != NULL){
        //print is just to test if I can loop through all the words
        printf("%s", token);
        //spellCheck function & logic here
        token = strtok(NULL, delimiters);
    }
    line = strtok(NULL, splitLine);
    lines++;
}
Is using the nested while loop and strtok possible? Is there a better way to keep track of the line number?
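The strtok function is not reentrant! It cannot be used to tokenize multiple strings simultaneously, because it keeps internal state about the string currently being tokenized. In your code the inner strtok(line, delimiters) call overwrites the state of the outer one.
If your compiler and standard library provide one, you could use a reentrant variant instead: strtok_s (Microsoft's C runtime, or the optional Annex K of C11, which takes different arguments) or strtok_r (POSIX). Otherwise you have to come up with another solution.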
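As a rough sketch of the reentrant route, the nested loop can be written with POSIX strtok_r, which keeps its state in a caller-supplied pointer so the two loops no longer interfere. The struct word_ref and the function collect_words are invented for this example, and the delimiter list is shortened. One caveat: strtok_r treats consecutive newlines as a single separator, so the counter below numbers non-empty lines only.

```c
#define _POSIX_C_SOURCE 200809L  /* asks for POSIX declarations such as strtok_r */
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a word and the (non-empty) line it was found on. */
struct word_ref {
    const char *word;
    int line;
};

/* Hypothetical helper: tokenizes text in place and stores up to max words.
   Returns the number of words stored. */
static size_t collect_words(char *text, struct word_ref *out, size_t max)
{
    size_t n = 0;
    int line_no = 0;
    char *line_state;  /* saved position of the outer (line) tokenizer */
    char *word_state;  /* saved position of the inner (word) tokenizer */

    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        line_no++;  /* counts non-empty lines, since empty ones are skipped */
        for (char *w = strtok_r(line, " ,.?!", &word_state);
             w != NULL;
             w = strtok_r(NULL, " ,.?!", &word_state)) {
            if (n < max) {
                out[n].word = w;
                out[n].line = line_no;
                n++;
            }
        }
    }
    return n;
}
```

The text buffer must be NUL-terminated before it is passed in, which fread alone does not guarantee.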
You can use strtok, but it's not very easy to use. It's a stupid function: all it really does is replace delimiters with NUL characters and return a pointer to the start of the sequence it has delimited. So it's destructive. It can't handle special cases such as English words being allowed one apostrophe (we're is a word, we'r'e is not), and you have to make sure you list all the delimiters explicitly.
It's probably best to write mystrtok yourself, so you understand how it works. Then use that as the basis for your own word extractor.
The reason for your bug is that the outer call chops off the first line, and the inner calls then make that line the string strtok remembers. The later strtok(NULL, splitLine) call therefore continues inside the first line, which is already exhausted, and returns NULL.
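A hand-written word extractor along those lines might look like the sketch below. It is only one possible design: the name next_word and its parameters are invented here, it does not modify the text, and it counts every '\n' it passes, so blank lines still advance the line number.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the next word at or after *cursor,
   stores its length in *len, and adds 1 to *line for each '\n' skipped.
   Returns NULL when no words are left. The text itself is not modified. */
static const char *next_word(const char **cursor, const char *delims,
                             int *line, size_t *len)
{
    const char *p = *cursor;

    /* Skip delimiters and newlines, counting the newlines as we pass them. */
    while (*p != '\0' && (*p == '\n' || strchr(delims, *p) != NULL)) {
        if (*p == '\n')
            (*line)++;
        p++;
    }
    if (*p == '\0') {
        *cursor = p;
        return NULL;
    }

    /* The word runs until the next delimiter, newline, or end of string. */
    const char *start = p;
    while (*p != '\0' && *p != '\n' && strchr(delims, *p) == NULL)
        p++;

    *len = (size_t)(p - start);
    *cursor = p;
    return start;
}
```

A caller would start with cursor pointing at the text and line set to 1, then loop until next_word returns NULL. Because the words are not NUL-terminated, they can be printed with printf("%.*s", (int)len, word) or copied into a buffer before being passed to the spell checker.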

Splitting a string into words

I have following problem:
// Basically I am reading from a file and storing in local array.
char myText[100] = "This is text of movie Jurassic Park";
// here I want to store each word in a dictionary
st.insert(&myText[0]); // should insert only "This", not the rest of the sentence.
// similarly for the next words "is", "text" and so on.
How do I do that in C?
For this, you would use the strtok function:
char myText[100] = "This is text of movie Jurassic Park";
char *p;
for (p = strtok(myText," "); p != NULL; p = strtok(NULL," ")) {
    st.insert(p);
}
Note that this function modifies the string it's parsing by adding NUL bytes where the delimiters are.
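To make the in-place modification concrete, here is a small sketch that collects a pointer to each word instead of calling st.insert. The helper split_words and its parameters are hypothetical. After the call, every pointer refers to a position inside the original buffer, and the buffer itself now ends after the first word.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores a pointer to each space-separated word of text
   in words[] and returns how many were stored. text is modified in place:
   strtok overwrites the delimiter after each word with '\0'. */
static size_t split_words(char *text, char *words[], size_t max_words)
{
    size_t count = 0;

    for (char *p = strtok(text, " ");
         p != NULL && count < max_words;
         p = strtok(NULL, " ")) {
        words[count++] = p;  /* points into text, no copy is made */
    }
    return count;
}
```

If the original sentence is still needed afterwards, copy it into a second buffer first and tokenize the copy.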
You could use strtok(). http://www.cplusplus.com/reference/cstring/strtok/
For C you will need to include <string.h>.
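If you just want to split on spaces, you basically want strtok or a strsplit-style helper (strsplit is not part of standard C, so you would have to write it or take it from a library).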
Have a look at Split string with delimiters in C

Semicolon-separated contents in a text file

I need a C program that can read the contents of a text file. The contents of the file are separated by semicolons, as shown:
CatId;1;CatName;CLOTHS;Prefix;CH;ActiveStatus;Y;......
Can anyone suggest a good, simple way to read the contents and store them in a buffer?
Thanks in advance
I'm not sure if it's the best way to do it but I would:
Use fgets to read the file line by line
Use strtok to tokenize the string (or do it manually depending on how lazy I feel)
Something like this:
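char line[MAXLINE]; /* MAXLINE is an assumed maximum line length */
char *p;
/* fp is assumed to be a FILE * already opened with fopen */
while (fgets(line, MAXLINE, fp)) {
    /* '\n' is listed too, so the last token does not keep the newline */
    p = strtok(line, ";\n");
    while (NULL != p) {
        /* p is a token */
        p = strtok(NULL, ";\n");
    }
}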
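A fuller sketch of the same loop is shown below. It rests on an assumption the question does not confirm: that the fields alternate between a name and a value (CatId then 1, CatName then CLOTHS, and so on). The struct pair, the function read_pairs and the size limits are all made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE  256  /* assumed maximum line length */
#define MAXFIELD 32   /* assumed maximum field length, including '\0' */

/* Hypothetical record for one name/value pair. */
struct pair {
    char key[MAXFIELD];
    char value[MAXFIELD];
};

/* Hypothetical helper: reads ';'-separated name/value pairs from fp into out[]
   and returns how many pairs were stored (at most max). */
static size_t read_pairs(FILE *fp, struct pair *out, size_t max)
{
    char line[MAXLINE];
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the trailing newline */

        char *key = strtok(line, ";");
        while (key != NULL && n < max) {
            char *value = strtok(NULL, ";");
            if (value == NULL)
                break;  /* a name without a value: ignore it */

            /* snprintf truncates safely if a field is too long. */
            snprintf(out[n].key, sizeof out[n].key, "%s", key);
            snprintf(out[n].value, sizeof out[n].value, "%s", value);
            n++;

            key = strtok(NULL, ";");
        }
    }
    return n;
}
```

Note that strtok treats consecutive semicolons as one separator, so an empty field would shift the pairing. If empty fields can occur, a manual scan with strchr would be a safer basis.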

How I can skip a blank line in an input file when using strtok?

I want to parse the lines of a file using strtok; the values are comma separated. However, strtok also returns a token for blank lines that contain only spaces. Isn't it supposed to return a null pointer in such a situation?
How can I ignore such a line? I tried checking for NULL, but as mentioned above it doesn't work.
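void function_name(FILE *fp)
{
    const char delimiter[] = ",";
    char line_read[9000];
    char *token;
    while (fgets(line_read, sizeof(line_read), fp) != NULL)
    {
        /*
         * Skip the line if it contains nothing but whitespace:
         * strspn gives the length of the leading run of whitespace,
         * and if that run ends at '\0' the line is blank.
         */
        if (line_read[strspn(line_read, " \t\r\n")] == '\0')
            continue;
        token = strtok(line_read, delimiter);
        while (token != NULL) {
            /* token is one comma-separated value; copy it if it must outlive line_read */
            token = strtok(NULL, delimiter);
        }
    }
}
So to explain.
The while loop reads the entire file line by line (fgets) into the array line_read; fp is assumed to be a stream already opened with fopen.
Comparing line_read with NULL cannot detect a blank line: line_read is an array, so the comparison is always true. strtok does not return NULL for a line of spaces either, because a space is not one of the delimiters, so the whole line comes back as a single token.
A blank line therefore has to be detected explicitly, here with strspn, before the line is handed to strtok. Lines that pass the check are then split into values with strtok; since each token points into line_read, it has to be copied if it is needed after the next fgets call.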
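The blank-line test can also be packaged as a small predicate, sketched here with isspace from <ctype.h>. The name is_blank_line is made up for this example. The cast to unsigned char is there because passing a negative char value to isspace is undefined behavior.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if s is empty or holds only whitespace
   (spaces, tabs, newlines and the other characters isspace accepts). */
static bool is_blank_line(const char *s)
{
    while (*s != '\0') {
        if (!isspace((unsigned char)*s))
            return false;  /* found a character worth parsing */
        s++;
    }
    return true;
}
```

Inside the fgets loop it would be called before strtok, skipping to the next iteration whenever it returns true.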

How to reformat space delimited lines of text into separate <div> tags in C language?

I need to be able to (in C language) loop over a few lines of text where each line has some text in it where words are delimited by a variable number of white spaces. How can I detect the spaces and split each line into some kind of array so that I can put each word in a separate word tag in each line?
Any advice would be much appreciated.
Thanks
One way:
char* cp = strtok(inputString, " \t\n");
while (cp) {
    // cp points to word in inputString, do something with it
    cp = strtok(NULL, " \t\n"); // get next word
}
If you can't modify inputString -- as strtok() does -- you can loop over the string, testing each character with isspace(), from ctype.h.
You can use the strtok() function to split each line into tokens. The strtok reference page shows how to split a line into space-delimited words.
You could do this:
size_t start = 0, end = 0;
while (str[end]) {
    // skip whitespace before the word
    while (str[end] && isspace((unsigned char)str[end])) {
        end++;
    }
    start = end;
    // extract word
    while (str[end] && !isspace((unsigned char)str[end])) {
        end++;
    }
    if (end > start) {
        // word found between str[start] and str[end] (end not included)
        // do something with it
    }
}
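Putting the scanning loop to work on the original goal, the sketch below wraps each word of one line in a <div> tag and writes the result to a caller-supplied buffer. The function name wrap_words and its parameters are hypothetical, the input line is left untouched, and no HTML escaping is done, so it assumes the words contain no characters such as < or &.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "<div>word</div>" for each whitespace-separated
   word of str into out (at most out_size bytes, always NUL-terminated).
   Returns the number of characters written, not counting the '\0'.
   Stops early if the next word would not fit. */
static size_t wrap_words(const char *str, char *out, size_t out_size)
{
    static const char open_tag[] = "<div>";
    static const char close_tag[] = "</div>";
    size_t used = 0, end = 0;

    if (out_size == 0)
        return 0;

    while (str[end]) {
        /* Skip any run of whitespace before the word. */
        while (str[end] && isspace((unsigned char)str[end]))
            end++;
        size_t start = end;
        /* Advance to the end of the word. */
        while (str[end] && !isspace((unsigned char)str[end]))
            end++;

        size_t len = end - start;
        if (len == 0)
            break;  /* only trailing whitespace was left */

        size_t need = strlen(open_tag) + len + strlen(close_tag);
        if (need >= out_size - used)
            break;  /* not enough room, keeping one byte for '\0' */

        memcpy(out + used, open_tag, strlen(open_tag));
        used += strlen(open_tag);
        memcpy(out + used, str + start, len);
        used += len;
        memcpy(out + used, close_tag, strlen(close_tag));
        used += strlen(close_tag);
    }
    out[used] = '\0';
    return used;
}
```

Calling it once per line read with fgets gives one run of <div> elements per line, which can then be wrapped in whatever per-line markup is needed.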
