The csv files will all have the following format:
Number of Spaces,10,,,,,,,,
,,,,,,,,,
,,,,,,,,,
Type,Set Id,Intraset Id,Name,Property Cost,House Cost,Hotel Cost,Rent,Rent with House,Rent With Hotel
Go,400,MU,,,,,,,
Property,0,0,0A,500,50,50,5,50,2000
Property,0,1,0B,1000,50,50,10,75,2500
Property,1,0,1A,2000,200,200,20,100,3000
Property,1,1,1B,2500,200,200,25,150,3000
Property,1,2,1C,3000,200,200,30,200,3500
Property,2,0,2A,4000,400,400,40,300,4000
Property,2,1,2B,4500,400,400,45,400,4000
Property,2,2,2C,5000,400,400,50,500,4000
Property,2,3,2D,5500,400,400,55,600,4500
The fourth line describes what each field in the lines below it is, e.g. for line 6, type is Property, set id is 0, intraset id is 0, name is 0A, etc. I have a struct Space that contains variables for all this information. The 5th line is special: it has type Go, you get $400 for passing Go, the name is MU, and none of the other fields apply. (This is a version of Monopoly.)
Where I'm struggling is how to get the values that I need. So far I have only managed to get the number of spaces value (this determines the number of rows on the board) with this:
void openSpecs(char *fileName) {
FILE* file = fopen(fileName, "r");
if (file == NULL) {
printf("Could not open %s\n", fileName);
}
char c;
do {
fscanf(file, "%c", &c);
//printf("%c", c);
} while (!feof(file) && c != ',');
//printf("\n\n");
int numSpaces;
fscanf(file, "%d", &numSpaces);
//printf("there are %d spaces\n", numSpaces);
// note: the printf statements are there to help me see where I'm at in the file
fclose(file);
}
I'm conflicted on how to approach the rest of the file. I'm thinking of using a while loop to just skip the rest of the commas, and then just reading through line 4, as I don't need to save any of that. From there, I'm not sure what to do. If I use strtok, I need to have a line from the file already as a C string, correct? I can't statically allocate a C string and then use fscanf (no static allocation allowed), so how do I dynamically allocate for a string whose length is unknown?
Edit:
char str[4096];
fgets(str, 4096, file);
printf("%s\n", str);
int goCash = 0;
char* name = NULL;
char delim[2] = ",";
char* token;
token = strtok(str, delim); // this is Go
token = strtok(str, delim);
goCash = (int) token;
token = strtok(str, delim);
strcpy(name, token);
printf("you get %d for going past %s\n", goCash, name);
Be careful, as strtok can run into problems.
For example, consider the following lines:
Property,0,0,0A,500,50,50,5,50,2000
Property,,0,0A,500,50,50,5,50,2000
Note that in the second line, the second field is missing and you have two consecutive delimiters: ",,". strtok doesn't give you any indication that there was a field missing, it just skips to the next available field.
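To see the skipping behaviour in isolation, here is a minimal sketch that only counts the tokens strtok reports for a line. The helper name count_tokens is made up for illustration and is not part of the code in this answer.

```c
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() reports for one line.
   The line is modified in place, as strtok() always does. */
int count_tokens(char *line, const char *delims)
{
    int n = 0;
    /* first call passes the buffer, every later call passes NULL */
    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

Called on the two example lines with "," as the delimiter, this should report one token fewer for the line with the missing field, and nothing tells you which field was dropped.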
You can fix this by replacing the occurrences of ,, with , ,
Another issue is that fgets includes the end of line character and you want to remove that.
enum { Type, SetId, IntrasetId, Name, PropertyCost, HouseCost, HotelCost,
Rent, RentwithHouse, RentWithHotel, total };
FILE *fp = fopen("test.txt", "r");
char temp[1000], buf[2000]; // buf can grow to almost twice temp when padding is inserted
while(fgets(temp, sizeof(temp), fp))
{
//remove end of line
temp[strcspn(temp, "\r\n")] = 0;
//replace ",," with ", ,"
int j = 0;
for(int i = 0, len = strlen(temp); i < len; i++)
{
buf[j++] = temp[i];
if (temp[i]==',' && temp[i+1]==',')
buf[j++] = ' '; // pad the empty field so strtok does not skip it
}
buf[j] = 0;
//read all the fields
char field[total][100];
for(int i = 0; i < total; i++) *field[i] = 0;
int i = 0;
char *ptr = strtok(buf, ",");
while(ptr && i < total) // never write past the last field
{
strcpy(field[i++], ptr);
ptr = strtok(NULL, ",");
}
for(int i = 0; i < total; i++)
printf("%s, ", field[i]);
printf(" rent(%d)\n", atoi(field[RentWithHotel]));
}
fclose(fp);
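The question also asks how to hold a line whose length is not known in advance. One possible approach, sketched here under the assumption that only standard C is available, is to grow a heap buffer with realloc while fgets keeps filling it. read_line is a hypothetical helper name and does not appear in the code above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
   Returns a malloc'd string without the trailing newline, or NULL at
   end of file or on allocation failure. The caller must free() it. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {  /* whole line read */
            buf[len - 1] = '\0';
            return buf;
        }
        if (len + 1 < cap)      /* buffer not full: last line, no newline */
            return buf;
        /* buffer full and no newline yet: double it and keep reading */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len == 0) {             /* nothing read: end of file */
        free(buf);
        return NULL;
    }
    return buf;                 /* final line that filled the buffer exactly */
}
```

Each returned line can then be handed to strtok (or any other splitter) and released with free once its fields have been copied out. This sketch strips '\n' only; a '\r' from Windows line endings would still need removing.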
I seem to be losing the reference to my pointers here. I don't know why, but I suspect it's the pointer returned by fgets that messes this up.
I was told a good way to read words from a file was to get the line and then separate the words with strtok, but how can I do this if the pointers inside words[i] keep disappearing?
text
Natural Reader is
john make tame
Result I'm getting:
array[0] = john
array[1] = e
array[2] =
array[3] = john
array[4] = make
array[5] = tame
int main(int argc, char *argv[]) {
FILE *file = fopen(argv[1], "r");
int ch;
int count = 0;
while ((ch = fgetc(file)) != EOF){
if (ch == '\n' || ch == ' ')
count++;
}
fseek(file, 0, SEEK_END);
size_t size = ftell(file);
fseek(file, 0, SEEK_SET);
char** words = calloc(count, size * sizeof(char*) +1 );
int i = 0;
int x = 0;
char ligne [250];
while (fgets(ligne, 80, file)) {
char* word;
word = strtok(ligne, " ,.-\n");
while (word != NULL) {
for (i = 0; i < 3; i++) {
words[x] = word;
word = strtok(NULL, " ,.-\n");
x++;
}
}
}
for (i = 0; i < count; ++i)
if (words[i] != 0){
printf("array[%d] = %s\n", i, words[i]);
}
free(words);
fclose(file);
return 0;
}
strtok does not allocate any memory; it returns a pointer to a delimited string inside the buffer you passed it.
Because every fgets call overwrites ligne, you need to allocate memory for each word if you want to keep it between loop iterations,
e.g.
word = strtok(ligne, " ,.-\n");
if (word != NULL) word = strdup(word); // strdup(NULL) is undefined, so check first
You could also handle this by using a unique ligne for each line read, so make it an array of strings like so:
char ligne[20][80]; // no need to make the string 250 since fgets limits it to 80
Then your while loop changes to:
int lno = 0;
while (fgets(ligne[lno], 80, file)) {
char *word;
word = strtok(ligne[lno], " ,.-\n");
while (word != NULL) {
words[x++] = word;
word = strtok(NULL, " ,.-\n");
}
lno++;
}
Adjust the first subscript as needed for the maximum size of the file, or dynamically allocate the line buffer during each iteration if you don't want such a low limit. You could also use getline instead of fgets, if your implementation supports it; it can handle the allocation for you, though you then need to free the blocks when you are done.
If you are processing real-world prose, you might want to include other delimiters in your list, like colon, semicolon, exclamation point, and question mark.
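The copy-each-token idea from these answers could look roughly like the sketch below. It uses a small stand-in for strdup so that it depends only on standard C; copy_string and copy_words are names invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: portable stand-in for strdup(). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Splits line in place and stores a private copy of each word in words[].
   Returns the number of words stored (at most max_words).
   Every stored pointer must later be passed to free(). */
size_t copy_words(char *line, char *words[], size_t max_words)
{
    const char *delims = " ,.-\n";
    size_t n = 0;
    char *word = strtok(line, delims);
    while (word != NULL && n < max_words) {
        char *copy = copy_string(word);
        if (copy == NULL)       /* out of memory: stop early */
            break;
        words[n++] = copy;
        word = strtok(NULL, delims);
    }
    return n;
}
```

Because each entry in words[] owns its own memory, the line buffer can be reused for the next fgets call without the earlier words changing.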
I have a hard time understanding how you process ASCII files in C. I have no problem opening files and closing them or reading files with one value on each line. However, when the data is separated with characters, I really don't understand what the code is doing at a lower level.
Example: I have a file containing names separated with commas that looks like this:
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER"
I have created an array to store them:
char names[6000][20];
And now, my code to process it is while (fscanf(data, "\"%s\",", names[index]) != EOF) { index++; }
The code executes for the 1st iteration and names[0] contains the whole file.
How can I separate all the names?
Here is the full code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int index = 0;
int nbNames;
while (fscanf(data, "\"%s\",", names[index]) != EOF) {
index++;
}
nbNames = index;
fclose(data);
printf("%d\n", index);
for (index=0; index<nbNames; index++) {
printf("%s \n", names[index]);
}
printf("\n");
return 0;
}
PS: I am thinking this might also be because of the data structure of my array.
If you want a simple solution, you can read the file character by character using fgetc. Since there are no newlines in the file, just ignore quotation marks and move to the next index when you find a comma.
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int name_count = 0, current_name_ind = 0;
int c;
while ((c = fgetc(data)) != EOF) {
if (c == ',') {
names[name_count][current_name_ind] = '\0';
current_name_ind = 0;
++name_count;
} else if (c != '"') {
names[name_count][current_name_ind] = c;
++current_name_ind;
}
}
names[name_count][current_name_ind] = '\0';
fclose(data);
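As for why the original loop put the whole file in names[0]: %s keeps reading until it meets whitespace, and this file contains none, so the first conversion consumes everything. If you would rather stay with fscanf, a scanset that stops at the closing quote is one option. This is only a sketch; read_names is a hypothetical function, and the width 19 is hard-coded to match the 20-character rows.

```c
#include <stdio.h>

#define NAME_LEN 20   /* 19 characters plus the terminating null */

/* Reads quoted, comma-separated names from fp into names[].
   Returns the number of names stored. Parsing stops at the first
   name that is empty or longer than 19 characters. */
int read_names(FILE *fp, char names[][NAME_LEN], int max_names)
{
    int count = 0;
    /* " \"" skips whitespace and the opening quote, %19[^\"] reads up to
       19 characters that are not a quote, "\"" consumes the closing quote */
    while (count < max_names &&
           fscanf(fp, " \"%19[^\"]\"", names[count]) == 1) {
        count++;
        fscanf(fp, " ,");   /* skip optional whitespace and the comma */
    }
    return count;
}
```

The width limit is what keeps a long name from overflowing its row, which the character-by-character loop above does not check for.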
"The code executes for the 1st iteration and names[0] contains the whole file...., How can I separate all the names?"
Regarding the first few statements:
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
What if there are 6001 names? Or one of the names has more than 19 characters?
Or what if there are far fewer than 6000 names?
The point is that by enumerating the tasks you have listed, and mapping out what information is needed to write code that meets your criteria, you can create a better product. The following is derived from your post:
Process ascii files in c
Read file content that is separated by characters
input is a comma separated file, with other delimiters as well
Choose a method best suited to parse a file of variable size
As mentioned in the comments under your question, there are ways to write your algorithm so that it flexibly allows for extra-long names or for a variable number of names. This can be done using a few C standard functions commonly used in parsing files. (Although fscanf() has its place, it is not the best option for parsing file contents into array elements.)
The following approach performs these steps to meet the needs enumerated above:
Read the file to determine the number of elements and the length of the longest element
Create an array sized to hold the exact contents of the file, using the element count and the longest length, as a variable length array (VLA)
Create a function to parse the file contents into the array (passing the VLA as a function argument)
Following is a complete example of how to implement each of these, breaking the tasks into functions where appropriate.
Note, code below was tested using the following input file:
names.txt
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER",
"Joseph","Bart","Daniel","Stephan","Karen","Beth","Marcia",
"Calmazzothoulumus"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Prototypes
int count_names(const char *filename, size_t *count);
size_t filesize(const char *fn);
void populateNames(const char *fn, int longest, char arr[][longest]);
char *filename = ".\\names.txt";
int main(void)
{
size_t count = 0;
int longest = count_names(filename, &count);
char names[count][longest+1];//VLA (variable length array)
// +1 is room for null termination
memset(names, 0, sizeof names);
populateNames(filename, longest+1, names);
return 0;
}
//populate VLA with names in file
void populateNames(const char *fn, int longest, char names[][longest])
{
char line[80] = {0};
char *delim = "\",\n ";
char *tok = NULL;
FILE * fp = fopen(fn, "r");
if(fp)
{
int i=0;
while(fgets(line, sizeof line, fp))
{
tok = strtok(line, delim);
while(tok)
{
strcpy(names[i], tok);
tok = strtok(NULL, delim);
i++;
}
}
fclose(fp);
}
}
//passes back count of tokens in file, and return longest token
int count_names(const char *filename, size_t *count)
{
int len=0, lenKeep = 0;
FILE *fp = fopen(filename, "r");
if(fp)
{
char *tok = NULL;
char *delim = "\",\n ";
int cnt = 0;
size_t fSize = filesize(filename);
char *buf = calloc(fSize + 1, 1); // +1 so a line as long as the file still fits with its null
while(buf && fgets(buf, (int)fSize + 1, fp)) //goes to newline for each get
{
tok = strtok(buf, delim);
while(tok)
{
cnt++;
len = strlen(tok);
if(lenKeep < len) lenKeep = len;
tok = strtok(NULL, delim);
}
}
*count = cnt;
fclose(fp);
free(buf);
}
return lenKeep;
}
//return file size in bytes (binary read)
size_t filesize(const char *fn)
{
size_t size = 0;
FILE*fp = fopen(fn, "rb");
if(fp)
{
fseek(fp, 0, SEEK_END);
size = ftell(fp);
fseek(fp, 0, SEEK_SET);
fclose(fp);
}
return size;
}
You can use the built-in strtok() function, which is easy to use.
I have used the tok+1 instead of tok to omit the first " and strlen(tok) - 2 to omit the last ".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int index = 0;
int nbNames;
char *str = (char*)malloc(120000*sizeof(char));
while (fscanf(data, "%s", str) != EOF) {
char *tok = strtok(str, ",");
while(tok != 0){
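size_t len = strlen(tok) - 2; // length without the two quotes
strncpy(names[index], tok+1, len);
names[index++][len] = '\0'; // strncpy does not add the terminator here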
tok = strtok(0, ",");
}
}
nbNames = index;
fclose(data);
free(str); // just to free the memory occupied by the str variable in the heap.
printf("%d\n", index);
for (index=0; index<nbNames; index++) {
printf("%s \n", names[index]);
}
printf("\n");
return 0;
}
Also, the parameter 120000 is just the maximum number of characters that can be in the file. It is just 6000 * 20 as you mentioned.
I am reading a file called "dictionary.txt" with fgets and printing it out, but about 10% of the text at the head of "dictionary.txt" is lost when I run the program.
I suspected that the buffer was too small, but changing MAX_INT to a bigger number doesn't help either.
#include <stdio.h>
#include<string.h>
#define MAX_INT 50000
void main() {
FILE *fp;
char* inp = (char*)malloc(sizeof(char)*MAX_INT);
int i;
int isKorean = 0;
char* buffer[MAX_INT];
char* ptr = (char*)malloc(sizeof(char)*MAX_INT);
if (fp = fopen("C://Users//user//Desktop//dictionary.txt", "r")) {
while (fgets(buffer, sizeof(buffer), fp)) {
ptr = strtok(buffer, "/"); //a line is looking like this : Umberto/영어("English" written in Korean)
for (i = 0; i < strlen(ptr); i++) {
if ((ptr[i] & 0x80) == 0x80) isKorean = 1; //check whether it's korean
if (!isKorean) printf("%c", ptr[i]); //if it's not korean, then print one byte
else {
printf("%c%c", ptr[i], ptr[i + 1]); //if it's korean, then print two bytes
i++;
}
isKorean = 0;
printf("\n");
}
ptr = strtok(NULL, " ");
printf("tagger:%s\n", ptr); //print the POS tagger of the word(it's in dictionary)
}
fclose(fp);
}
}
The function fgets has this synopsis:
char *
fgets(char * restrict str, int size, FILE * restrict stream);
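So why declare buffer as an array of pointers?
char buffer[MAX_INT] is what we need.
And the following statement:
if (fp = fopen("/Users/weiyang/code/txt", "r")) is easy to misread as a comparison; it’s better to wrap the assignment in its own parentheses and compare the result explicitly, i.e. if ((fp = fopen("/Users/weiyang/code/txt", "r")) != NULL).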
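Putting both remarks together, a read loop might be shaped like the following sketch. count_lines is a made-up example function and the buffer size of 1024 is arbitrary; neither comes from the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in a text file.
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp;
    char buffer[1024];      /* an array of char, not an array of char* */
    long lines = 0;
    int partial = 0;        /* 1 while the current line has no '\n' yet */

    /* the assignment sits in its own parentheses and is compared to NULL */
    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t len = strlen(buffer);
        partial = (len == 0 || buffer[len - 1] != '\n');
        if (!partial)
            lines++;        /* a complete line ended in this chunk */
    }
    if (partial)
        lines++;            /* last line had no trailing newline */

    fclose(fp);
    return lines;
}
```

With buffer declared as char[], sizeof buffer is the number of bytes fgets may fill, which is what its second argument expects.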
Okay, I found the answer.
Adding the code below after the "ptr = strtok(NULL, " ");" line just worked. I also had to do something with the tagger part because it is also written in Korean.
ptr = strtok(NULL, " ");
for (i = 0; i < strlen(ptr); i++) {
printf("%c%c", ptr[i], ptr[i + 1]); //if it's korean, then print two bytes
i++;
}
I am trying to parse input file (containing a text document with multiple lines and delimiters, i.e. "!,.?") into words. My function 'splitting function' is:
int splitInput(fp) {
int i= 0;
char line[255];
char *array[5000];
int x;
while (fgets(line, sizeof(line), fp) != NULL) {
array[i] = strtok(line, ",.!? \n");
printf("Check print - word %i:%s:\n",i, array[i]);
i++;
}
return 0;
}
Here's the corrected function [sorry for the extra style cleanup]:
int
splitInput(FILE *fp)
{
int i = 0;
char *cp;
char *bp;
char line[255];
char *array[5000];
int x;
while (fgets(line, sizeof(line), fp) != NULL) {
bp = line;
while (1) {
cp = strtok(bp, ",.!? \n");
bp = NULL;
if (cp == NULL)
break;
array[i++] = cp;
printf("Check print - word %i:%s:\n",i-1, cp);
}
}
return 0;
}
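Now, take a look at the man page for strtok to understand the bp trick: the first call for a line passes the buffer, and every later call passes NULL to continue in the same buffer.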
If I understand your question correctly you want to read every line and split each line into words and add that into an array.
array[i] = strtok(line, ",.!? \n");
That will not work, because it only returns the first word of each line, and the pointer it returns refers to line, which the next fgets overwrites; nothing is ever copied.
This is probably what you want.
char *pch;
pch = strtok(line, ",.!? \n");
while(pch != NULL) {
array[i++] = strdup(pch); // put the content of pch into array at position i and increment i afterwards.
pch = strtok(NULL, ",.!? \n"); // look for remaining words at the same line
}
Don't forget to free your array elements afterwards though using free.
How do I get the position of a delimiter-separated string?
My text file looks like
at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
avahi:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
beagleindex:x:110:112:User for Beagle indexing:/var/cache/beagle:/bin/bash
My C code looks like
#include<stdio.h>
int main(int argc, char *argv[])
{
char *str, *saveptr;
char ch[100];
char *sp;
FILE *f;
int j;
char searchString[20];
char *pos;
f = fopen("passwd", "r");
if (f == NULL)
{
printf("Error while opening the file");
}
while (fgets(ch, sizeof ch, f)!= NULL)
{
/*printf("%s\n", ch); */
for (j = 1, str = ch; ; j++, str= NULL)
{
char *token = strtok_r(str, ": ", &saveptr);
if (token == NULL)
break;
//printf("%s---\n---", token);
printf("%s",token);
}
}
fclose(f);
Well, using ": " as the delimiter string will split your string on spaces as well as colons, which is probably not what you want. In addition, strtok treats multiple consecutive delimiter characters as a single delimiter (so it will never return an empty string between two colons), which is not what you want for parsing passwd.
Instead, you probably just want to use strchr:
while (fgets(ch, sizeof ch, f)!= NULL) {
char *token, *end;
for (j = 1, token = ch; token; j++, token = end) {
if ((end = strchr(token, ':'))) *end++ = 0;
...do something with token and j
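The loop above is left open, so as a self-contained illustration of the same strchr idea, here is a sketch that splits one line and also records where each field starts. split_fields and its parameters are hypothetical names chosen for this example, and the trailing newline from fgets is assumed to have been removed already.

```c
#include <string.h>

/* Hypothetical helper: splits line in place at every sep character.
   fields[i] points at the start of field i (empty fields are kept) and
   offsets[i] is that field's character position within the line.
   Returns the number of fields found (at most max_fields). */
int split_fields(char *line, char sep, char *fields[], int offsets[], int max_fields)
{
    int n = 0;
    char *token = line;
    while (token != NULL && n < max_fields) {
        char *end = strchr(token, sep);
        if (end != NULL)
            *end++ = '\0';  /* terminate this field, step past the separator */
        fields[n] = token;
        offsets[n] = (int)(token - line);
        n++;
        token = end;        /* NULL after the last field ends the loop */
    }
    return n;
}
```

Unlike strtok, two adjacent colons produce an empty string in fields[], so the field numbers stay aligned with the passwd columns.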
I do not think you have to use strtok() just to get the position of a token separated by delimiters, rather simply walk through each line, and do a char by char comparison for the delimiter... (hope this will help you)
I prepared an input file called GetDelimPosition.txt:
at:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
avahi:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
jamil:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
javier:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
jiame:x:25:25:Batch jobs daemon:/var/spool/atjobs:/bin/bash
jose:x:109:111:User for Avahi:/var/run/avahi-daemon:/bin/false
And used the following code: (of course you will modify as needed)
#include <stdio.h>
#include <string.h> //standard headers in place of the non-standard ansi_c.h
//edit this line as needed:
#define FILE_LOC "C:\\dev\\play\\GetDelimPosition.txt"
int main(void)
{
FILE * fp;
char ch[260];
int line=-1;
int position[80][100]={0}; //lines x DelimPosition
memset(position, 0, 80*100*sizeof(int));
int i=-1,j=0, k=0;
int len;
fp = fopen(FILE_LOC, "r");
if (fp == NULL) return 1; //nothing to parse if the file cannot be opened
while (fgets(ch, sizeof ch, fp)!= NULL)
{
line++; //increment line
len = strlen(ch);
for(j=0;j<len;j++)
{
if(ch[j] == ':')
{
position[line][k] = j+1;//position of token (1 after delim)
k++; //increment position index for next token
}
}
k=0; //getting new line, zero position index
}
fclose(fp);
return 0;
}
To get the following results: (rows are lines in file, columns are positions of each token. First token is assumed at position 0, and not reported)