C parsing input text file into words - c

I am trying to parse input file (containing a text document with multiple lines and delimiters, i.e. "!,.?") into words. My function 'splitting function' is:
int splitInput(fp) {
int i= 0;
char line[255];
char *array[5000];
int x;
while (fgets(line, sizeof(line), fp) != NULL) {
array[i] = strtok(line, ",.!? \n");
printf("Check print - word %i:%s:\n",i, array[i]);
i++;
}
return 0;
}

Here's the corrected function [sorry for the extra style cleanup]:
int
splitInput(FILE *fp)
{
int i = 0;
char *cp;
char *bp;
char line[255];
char *array[5000];
int x;
while (fgets(line, sizeof(line), fp) != NULL) {
bp = line;
while (1) {
cp = strtok(bp, ",.!? \n");
bp = NULL;
if (cp == NULL)
break;
array[i++] = cp;
printf("Check print - word %i:%s:\n",i-1, cp);
}
}
return 0;
}
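Now take a look at the man page for strtok to understand the bp trick: the first call on each line gets the buffer, and every later call gets NULL so that strtok continues where it left off.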
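The bp trick can be seen in isolation in the following minimal sketch. The helper name count_words is hypothetical and not part of the answer above; it only counts the tokens that strtok finds in one line.

```c
#include <string.h>

/* Hypothetical helper: counts the words in one line.
 * strtok writes '\0' over delimiters, so `line` must be writable. */
size_t count_words(char *line)
{
    const char *delims = ",.!? \n";
    size_t count = 0;

    /* The first call passes the buffer; later calls pass NULL so that
     * strtok resumes right after the previous token. */
    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    return count;
}
```

Because strtok modifies its input, the argument has to be a writable array such as `char line[] = "Hello, world!";`, not a string literal.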

If I understand your question correctly you want to read every line and split each line into words and add that into an array.
array[i] = strtok(line, ",.!? \n");
That will not work: it only returns the first word of each line, and because you never allocate memory for the words, every stored pointer refers to the line buffer that the next fgets overwrites.
This is probably what you want.
char *pch;
pch = strtok(line, ",.!? \n");
while(pch != NULL) {
array[i++] = strdup(pch); // put the content of pch into array at position i and increment i afterwards.
pch = strtok(NULL, ",.!? \n"); // look for remaining words at the same line
}
Don't forget to free your array elements afterwards though using free.
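Put together, the approach might look like the sketch below. The names split_input, free_words and dup_string are assumptions made for illustration. dup_string is a small stand-in for strdup, which is POSIX rather than ISO C before C23; where strdup is available it can be used directly.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for strdup: returns a heap copy of s, or NULL on failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: reads every line from fp and stores up to max
 * heap-allocated words in words[]. Returns the number stored. */
size_t split_input(FILE *fp, char *words[], size_t max)
{
    const char *delims = ",.!? \n";
    char line[255];
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        for (char *tok = strtok(line, delims);
             tok != NULL && n < max;
             tok = strtok(NULL, delims)) {
            char *copy = dup_string(tok);
            if (copy == NULL)
                return n;          /* out of memory: keep what we have */
            words[n++] = copy;
        }
    }
    return n;
}

/* Releases the copies made by split_input. */
void free_words(char *words[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(words[i]);
}
```

A caller would declare something like `char *words[5000];`, call `split_input(fp, words, 5000)`, use the words, and finally pass the returned count to free_words.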

Related

How to read from the file and write it in the structure? I have a little trouble with my code

I have to write code that reads the students' names and marks from a file and then sorts the students by increasing mark. For now I just want to output the marks, using structures. I don't know where the problem is.
text.file
Jon 3
Alina 5
Ron 1
#include <stdio.h>
#define _CRT_SECURE_NO_WARNINGS
#include <string.h>
#include <stdlib.h>
int main()
{
const int N = 3;
int i = 0;
struct student {
char surname[50];
int mark;
};
struct student PI1[N];
char str[50];
const char s[1] = " ";
char* token;
FILE* ptr;
token = strtok(str, s);
ptr = fopen("test.txt", "r");
if (NULL == ptr) {
printf("file can't be opened \n");
}
while (fgets(str, 50, ptr) != NULL){
token = strtok(str, s);
strcpy(PI1[i].surname, token);
token = strtok(NULL, s);
PI1[i].mark = atoi(token);
i++;
}
fclose(ptr);
printf("The marks is:\n");
printf("%d %d %d", PI1[0].mark, PI1[1].mark, PI1[2].mark);
return 0;
}
You need to prevent the program from reading from the file pointer if opening the file fails:
ptr = fopen("test.txt", "r");
if (NULL == ptr) {
perror("test.txt");
return 1; // this could be one way
}
The second argument to strtok should be a null-terminated string. const char s[1] = " "; only has room for one character, so there is no room for the null terminator (\0). Make it:
const char s[] = " "; // or const char s[2] = " "; or const char *s = " ";
Don't iterate out of bounds. You need to check so that you don't try to put data in PI1[N] etc.
while (i < N && fgets(str, sizeof str, ptr) != NULL) {
// ^^^^^^^^
Check that strtok actually returns a pointer to a new token. If it doesn't, the line you've read doesn't fulfill the requirements.
while (i < N && fgets(str, sizeof str, ptr) != NULL) {
token = strtok(str, s);
if(!token) break; // token check
strcpy(PI1[i].surname, token);
token = strtok(NULL, s);
if (token) // token check
PI1[i].mark = atoi(token);
else
break;
i++;
}
You could also skip the strcpy by reading directly into your struct student since char str[50]; has the same length as surname. str should probably be larger though, but for now:
while (i < N && fgets(PI1[i].surname, sizeof PI1[i].surname, ptr) != NULL) {
token = strtok(PI1[i].surname, s);
if(!token) break;
token = strtok(NULL, s);
if (token)
PI1[i].mark = atoi(token);
else
break;
i++;
}
Only print as many marks as you successfully read:
printf("The marks are:\n");
for(int idx = 0; idx < i; ++idx) {
printf("%d ", PI1[idx].mark);
}
putchar('\n');
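The token checks above can also be collected in one function, so that the reading loop only has to test a single return value. This is a sketch under assumptions: the name parse_student is hypothetical, and the delimiter set also includes tab, carriage return and newline so that the line ending left by fgets does not end up in a token.

```c
#include <stdlib.h>
#include <string.h>

struct student {
    char surname[50];
    int mark;
};

/* Hypothetical helper: parses "surname mark" from a writable line.
 * Returns 1 on success, 0 if a field is missing or the name is too long. */
int parse_student(char *line, struct student *out)
{
    const char *delims = " \t\r\n";

    char *name = strtok(line, delims);
    char *mark = strtok(NULL, delims);
    if (name == NULL || mark == NULL)
        return 0;
    if (strlen(name) >= sizeof out->surname)
        return 0;

    strcpy(out->surname, name);   /* safe: length checked above */
    out->mark = atoi(mark);
    return 1;
}
```

The loop then becomes `while (i < N && fgets(str, sizeof str, ptr) != NULL) { if (parse_student(str, &PI1[i])) i++; }`, which skips malformed lines instead of stopping at the first one.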

using strtok to get input from file

I am trying to create a program that takes in a number of processes (name, start time, remaining time) from a file, then uses a round robin algorithm to handle the queue.
The issue is, when I try to tokenize each line of the file by using strtok() and fgets(), the name of the process is always wrong.
For example, if the first line is P1 0 3 the output is like this:
void RoundRobin(char *filename) {
Queue *q = initQueue();
char string[MAX_SIZE];
FILE *file;
Process process[20];
char *token;
file = fopen(filename, "r");
if (!file) {
printf("File Cannot Be Opened");
}
fgets(string, 150, file);
token = strtok(string, "=");
token = strtok(NULL, "+");
int time_quantum = atoi(token);
int process_count = 0;
while (fgets(string, 150, file)) {
char *token1;
token1 = strtok(string, " ");
process[process_count].name = token1;
token1 = strtok(NULL, " ");
process[process_count].starting_time = atoi(token1);
token1 = strtok(NULL, " ");
process[process_count++].remaining_time = atoi(token1);
token1 = strtok(NULL, " ");
}
for (int i = 0; i < process_count; i++) {
printf("%s %d %d\n", process[i].name, process[i].starting_time, process[i].remaining_time);
}
fclose(file);
}
You are reusing a single char[] for all of your token parsing. fgets() will overwrite the contents of that char[] each time, and strtok() will return pointers to memory inside of that char[]. Thus, each time you read a new line from the file, the previous pointers you already stored in the process[] array are still pointing at the same memory, but the contents of that memory have been altered.
You need to instead allocate a separate char[] string for each name that you want to save in the process[] array. You can use strdup() for that, eg:
while (fgets(string, 150, file)){
char *token1 = strtok(string, " ");
process[process_count].name = strdup(token1); // <-- HERE
...
}
// use process[] as needed...
for(int i = 0; i < process_count; i++){
free(process[i].name);
}
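The aliasing described above can be reproduced without a file. In this sketch a strcpy into one shared buffer plays the role of fgets, and the names read_two_names and dup_string are hypothetical (dup_string is a portable stand-in for strdup).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for strdup: returns a heap copy of s, or NULL on failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Parses two lines through one shared buffer. With copy == 0 the stored
 * names point into the buffer; with copy != 0 each name is duplicated.
 * The names as they read afterwards are written to the output buffers. */
void read_two_names(int copy, char *first_out, char *second_out, size_t out_size)
{
    const char *lines[2] = { "P1 0 3\n", "P2 1 5\n" };
    char buffer[32];
    char *names[2];

    for (int i = 0; i < 2; i++) {
        strcpy(buffer, lines[i]);            /* plays the role of fgets */
        char *tok = strtok(buffer, " ");
        names[i] = copy ? dup_string(tok) : tok;
    }

    snprintf(first_out, out_size, "%s", names[0] ? names[0] : "");
    snprintf(second_out, out_size, "%s", names[1] ? names[1] : "");

    if (copy) {
        free(names[0]);
        free(names[1]);
    }
}
```

Without copying, both stored names read back as the name from the last line, because both pointers refer to the start of the same buffer. With copying, each name keeps its own value.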
The problem is strtok() returns a pointer into the line that it parses. Hence all entries in the process array point to the same string array that is modified by the call to fgets().
You must duplicate the string you store in the process description structure:
void RoundRobin(const char *filename) {
char string[MAX_SIZE];
Process process[20];
Queue *q = initQueue();
char *token;
FILE *file;
file = fopen(filename, "r");
if (!file) {
printf("Cannot open file %s\n", filename);
return;
}
int time_quantum = 0;
int process_count = 0;
if (fgets(string, sizeof string, file)
&& (token = strtok(string, "=")) != NULL
&& (token = strtok(NULL, "+")) != NULL) {
time_quantum = atoi(token);
}
while (process_count < 20 && fgets(string, sizeof string, file)) {
char *name, *start, *remaining;
// parse all three fields first; skip the line if any is missing
if ((name = strtok(string, " \n")) == NULL)
continue;
if ((start = strtok(NULL, " \n")) == NULL)
continue;
if ((remaining = strtok(NULL, " \n")) == NULL)
continue;
process[process_count].name = strdup(name);
process[process_count].starting_time = atoi(start);
process[process_count].remaining_time = atoi(remaining);
process_count++;
}
for (int i = 0; i < process_count; i++) {
printf("%s %d %d\n", process[i].name, process[i].starting_time, process[i].remaining_time);
}
for (int i = 0; i < process_count; i++) {
free(process[i].name);
}
fclose(file);
}

Error when reading strings from CSV (Core Dumped)

I keep getting the same error, I'm new to programming so I'm not so sure if the Syntax is correct.
Every time I run it, it returns Segmentation Fault(core dumped), I'm not even sure If I can open a file with a string (address) instead of the filename in extense.
Also the files I'm reading from are CSV but in txt format.
I'm using C99
#define BUFFER_SIZE 1024
#define TAM_PERGUNTAS 128
struct question{
char category[TAM_PERGUNTAS];
char question[TAM_PERGUNTAS];
char option1[TAM_PERGUNTAS];
char option2[TAM_PERGUNTAS];
char option3[TAM_PERGUNTAS];
char correct[TAM_PERGUNTAS];
};
struct question vec_question[BUFFER_SIZE];
void questions() {
FILE *perguntas;
int numaleat=0;
int num_questions, counter = 0, index, temp_randomizer=0;
char line[BUFFER_SIZE];
char answer[32];
char address[TAM_PERGUNTAS];
address[0] = '\0';
srand(time(NULL));
printf("Digite agora o numero de perguntas desejadas.(MAX 20) : "); //Insert Number of questions
scanf("%d", &num_questions);
printf("\n");
for (counter = 0; counter < num_questions; counter++) {
temp_randomizer = rand() % j; //j Represents the number o CATEGORIES at play and acts as a marker in the SELECTION string
sprintf(address, "%s.txt", SELECTION[temp_randomizer]);
perguntas = fopen(address, "r");
if (perguntas == NULL) {
printf("ERROR OPENING FILE!");
}
index = 0;
while (fgets(line, sizeof(line), perguntas) != NULL) {
strcpy(vec_question[index].category, strtok(line, ";"));
strcpy(vec_question[index].question, strtok(NULL, ";"));
strcpy(vec_question[index].option1, strtok(NULL, ";"));
strcpy(vec_question[index].option2, strtok(NULL, ";"));
strcpy(vec_question[index].option3, strtok(NULL, ";"));
strcpy(vec_question[index].correct, strtok(NULL, ";"));
vec_question[index].correct[strlen(vec_question[index].correct) - 1] = '\0';
index++;
}
fclose(perguntas);
index = 20;
numaleat = rand() % index;
printf("%s : %s\n%s\n%s\n%s",vec_question[numaleat].category,vec_question[numaleat].question,vec_question[numaleat].option1,vec_question[numaleat].option2,vec_question[numaleat].option3);
for (int i = 0; i < num_users; i++) {
printf("\n%s: ", &users[i][20]);
scanf("%s", &answer[32]);
if (answer == vec_question[numaleat].correct)
userspoints[i] += 1;
}
}
}
In general one should assume that functions like strtok can fail.
Sometimes it fails and returns a NULL value. A short record in your input is a likely cause.
Consider using it with a loop, and breaking out of the loop once strtok returns NULL.
I found a simple example here.
#include <string.h>
#include <stdio.h>
int main () {
char str[80] = "This is - www.tutorialspoint.com - website";
const char s[2] = "-";
char *token;
/* get the first token */
token = strtok(str, s);
/* walk through other tokens */
while( token != NULL ) {
printf( " %s\n", token );
token = strtok(NULL, s);
}
return(0);
}
Note that it does one strtok to get the first token. That might return NULL in which case the loop doesn't run. If it doesn't return NULL then it prints that token, and asks strtok for the next token. It keeps doing that until strtok returns NULL.
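Applied to the semicolon-separated records in the question, the same check could look like the sketch below. The helper name split_record and the constants FIELD_SIZE and FIELD_COUNT are assumptions chosen to mirror the six 128-byte members of struct question; the point is only that every strtok result is tested before strcpy uses it.

```c
#include <string.h>

#define FIELD_SIZE 128
#define FIELD_COUNT 6

/* Hypothetical helper: splits a writable "a;b;c;d;e;f" record into
 * fields[]. Returns 1 if all FIELD_COUNT fields were present and fit,
 * 0 otherwise, so the caller can skip a short or oversized record. */
int split_record(char *line, char fields[FIELD_COUNT][FIELD_SIZE])
{
    line[strcspn(line, "\r\n")] = '\0';      /* drop the line ending */

    char *tok = strtok(line, ";");
    for (int i = 0; i < FIELD_COUNT; i++) {
        if (tok == NULL || strlen(tok) >= FIELD_SIZE)
            return 0;
        strcpy(fields[i], tok);
        tok = strtok(NULL, ";");
    }
    return 1;
}
```

A record with fewer than six fields now makes the function return 0 instead of passing NULL to strcpy, which is one way the reported segmentation fault can arise.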

Reading a csv file to a struct in C

The csv files will all have the following format:
Number of Spaces,10,,,,,,,,
,,,,,,,,,
,,,,,,,,,
Type,Set Id,Intraset Id,Name,Property Cost,House Cost,Hotel Cost,Rent,Rent with House,Rent With Hotel
Go,400,MU,,,,,,,
Property,0,0,0A,500,50,50,5,50,2000
Property,0,1,0B,1000,50,50,10,75,2500
Property,1,0,1A,2000,200,200,20,100,3000
Property,1,1,1B,2500,200,200,25,150,3000
Property,1,2,1C,3000,200,200,30,200,3500
Property,2,0,2A,4000,400,400,40,300,4000
Property,2,1,2B,4500,400,400,45,400,4000
Property,2,2,2C,5000,400,400,50,500,4000
Property,2,3,2D,5500,400,400,55,600,4500
The fourth line describes what each field in the lines below it is, i.e. for line 6, type is Property, set id is 0, name is 0A, etc. I have a struct Space that contains variables for all this information. The 5th line is special: it has type Go, you get $400 for passing Go, its name is MU, and none of the other fields apply. (This is a version of Monopoly.)
Where I'm struggling is how to get the values that I need. So far I have only managed to get the number of spaces value (this determines the number of rows on the board) with this:
void openSpecs(char *fileName) {
FILE* file = fopen(fileName, "r");
if (file == NULL) {
printf("Could not open %s\n", fileName);
}
char c;
do {
fscanf(file, "%c", &c);
//printf("%c", c);
} while (!feof(file) && c != ',');
//printf("\n\n");
int numSpaces;
fscanf(file, "%d", &numSpaces);
//printf("there are %d spaces\n", numSpaces);
// note: the printf statements are there to help me see where I'm at in the file
fclose(file);
}
I'm conflicted on how to approach the rest of the file. I'm thinking of using a while loop to just skip the rest of the commas, and then just reading through line 4, as I don't need to save any of that. From there, I'm not sure what to do. If I use strtok, I need to have a line from the file already as a C string, correct? I can't statically allocate a C string and then use fscanf (no static allocation allowed), so how do I dynamically allocate for a string whose length is unknown?
Edit:
char str[4096];
fgets(str, 4096, file);
printf("%s\n", str);
int goCash = 0;
char* name = NULL;
char delim[2] = ",";
char* token;
token = strtok(str, delim); // this is Go
token = strtok(str, delim);
goCash = (int) token;
token = strtok(str, delim);
strcpy(name, token);
printf("you get %d for going past %s\n", goCash, name);
Be careful, as strtok can run into problems.
For example, consider the following lines:
Property,0,0,0A,500,50,50,5,50,2000
Property,,0,0A,500,50,50,5,50,2000
Note that in the second line, the second field is missing and you have two consecutive delimiters: ",,". strtok doesn't give you any indication that there was a field missing, it just skips to the next available field.
You can fix this by replacing the occurrences of ,, with , ,
Another issue is that fgets includes the end of line character and you want to remove that.
enum { Type, SetId, IntrasetId, Name, PropertyCost, HouseCost, HotelCost,
Rent, RentwithHouse, RentWithHotel, total };
FILE *fp = fopen("test.txt", "r");
char buf[2000], temp[1000]; //buf gains one extra character per empty field
while(fgets(temp, sizeof(temp), fp))
{
//remove end of line
temp[strcspn(temp, "\r\n")] = 0;
//replace ",," with ", ,"
int j = 0;
for(int i = 0, len = strlen(temp); i < len; i++)
{
buf[j++] = temp[i];
if (temp[i]==',' && temp[i+1]==',')
buf[j++] = ' ';
}
buf[j] = 0;
//read all the fields
char field[total][100];
for(int i = 0; i < total; i++) *field[i] = 0;
int i = 0;
char *ptr = strtok(buf, ",");
while(ptr && i < total)
{
strcpy(field[i++], ptr);
ptr = strtok(NULL, ",");
}
for(int i = 0; i < total; i++)
printf("%s, ", field[i]);
printf(" rent(%d)\n", atoi(field[RentWithHotel]));
}
fclose(fp);
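An alternative that avoids rewriting the buffer is to split the line by hand, so that empty fields are kept as empty strings. This is a different technique from the one in the answer above, sketched with strcspn; the helper name split_keep_empty is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: splits a writable CSV line in place on ',' and
 * keeps empty fields. Stores up to max field pointers in fields[] and
 * returns the number of fields stored. */
size_t split_keep_empty(char *line, char *fields[], size_t max)
{
    size_t n = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0';      /* drop the line ending */

    while (n < max) {
        size_t len = strcspn(p, ",");        /* length of this field */
        fields[n++] = p;
        if (p[len] == '\0')
            break;                           /* last field reached */
        p[len] = '\0';                       /* terminate the field */
        p += len + 1;                        /* step past the comma */
    }
    return n;
}
```

For the line Property,,0,0A,500 this yields five fields, the second of which is an empty string, so field indices keep matching the column headers. The sketch does not handle quoted fields that contain commas.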

How to get position of delimited separated string in C

How do I get the position of a field in a delimiter-separated string?
My text file looks like
at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
avahi:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
beagleindex:x:110:112:User for Beagle indexing:/var/cache/beagle:/bin/bash
My C code looks like
#include<stdio.h>
int main(int argc, char *argv[])
{
char *str, *saveptr;
char ch[100];
char *sp;
FILE *f;
int j;
char searchString[20];
char *pos;
f = fopen("passwd", "r");
if (f == NULL)
{
printf("Error while opening the file");
}
while (fgets(ch, sizeof ch, f)!= NULL)
{
/*printf("%s\n", ch); */
for (j = 1, str = ch; ; j++, str= NULL)
{
char *token = strtok_r(str, ": ", &saveptr);
if (token == NULL)
break;
//printf("%s---\n---", token);
printf("%s",token);
}
}
fclose(f);
Well, using strtok_r(str, ": ", &saveptr) will split your string on spaces as well as colons, which is probably not what you want. In addition, strtok and strtok_r treat multiple consecutive delimiter characters as a single delimiter (so they will never return an empty string between two colons), which is not what you want for parsing passwd.
Instead, you probably just want to use strchr:
while (fgets(ch, sizeof ch, f)!= NULL) {
char *token, *end;
for (j = 1, token = ch; token; j++, token = end) {
if ((end = strchr(token, ':'))) *end++ = 0;
...do something with token and j
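A complete version of that loop might look as follows. It assumes that "position" means the 1-based index of the field in the line and that the goal is to find a given search string; the helper name field_position is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns the 1-based position of the field equal
 * to `wanted` in a colon-separated line, or 0 if no field matches.
 * The line is modified in place, as with strtok. */
int field_position(char *line, const char *wanted)
{
    char *token, *end;
    int j;

    line[strcspn(line, "\r\n")] = '\0';      /* drop the line ending */

    for (j = 1, token = line; token != NULL; j++, token = end) {
        if ((end = strchr(token, ':')) != NULL)
            *end++ = '\0';                   /* cut here, continue after ':' */
        if (strcmp(token, wanted) == 0)
            return j;
    }
    return 0;
}
```

Since strchr stops at every single colon, an empty field between two colons is reported as its own (empty) token and the later fields keep their expected positions.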
I do not think you have to use strtok() just to get the position of a token separated by delimiters, rather simply walk through each line, and do a char by char comparison for the delimiter... (hope this will help you)
I prepared an input file called GetDelimPosition.txt:
at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
avahi:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
jamil:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
javier:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
jiame:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
jose:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
And used the following code: (of course you will modify as needed)
#include <stdio.h>
#include <string.h>
//edit this line as needed:
#define FILE_LOC "C:\\dev\\play\\GetDelimPosition.txt"
int main(void)
{
FILE * fp;
char ch[260];
int line=-1;
int position[80][100]={0}; //lines x DelimPosition
memset(position, 0, 80*100*sizeof(int));
int i=-1,j=0, k=0;
int len;
fp = fopen(FILE_LOC, "r");
if (fp == NULL)
return 1; //cannot open the file
while (line < 79 && fgets(ch, sizeof ch, fp)!= NULL)
{
line++; //increment line
len = strlen(ch);
for(j=0;j<len;j++)
{
if(ch[j] == ':' && k < 100)
{
position[line][k] = j+1;//position of token (1 after delim)
k++; //increment position index for next token
}
}
k=0; //getting new line, zero position index
}
fclose(fp);
return 0;
}
To get the following results: (rows are lines in file, columns are positions of each token. First token is assumed at position 0, and not reported)
