I'm trying to use C to read a CSV file, iterate line by line (until EOF), and split each line on the comma. Then I want to separate each column into "bins" and add them to a struct (which isn't shown here; I defined it in a helper file) based on type.
For example, if I have 1,Bob, I'd like to split 1 and Bob into two variables. Here's what I've written so far.
void readFile(char file[25]) {
FILE *fp;
char line[1000];
fp = fopen(file, "r");
while(fgets(line, 1000, fp)) {
char* tmp = strdup(line);
char* token;
while((token = strsep(&tmp, ","))) {
printf("%s\n", token); // I want to split token[0] and token[1]
}
}
fclose(fp);
}
The above code does compile and run. I just don't know how to access each split of the token, like token[0] or token[1]. In python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
For testing purposes, all I'm doing right now is printing each line (in the second while loop), just to see how each split looks. I haven't implemented the code where I put each split line into its respective struct member.
I've searched Stack Overflow and found a multitude of threads on this topic. None of them seemed to help me except for this one, which I have drawn from. But I wasn't able to get the storing of split columns working.
In python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
Yes, you can, if only you define the array.
while (fgets(line, sizeof line, fp))
{
char *tmp = strchr(line, '\n');
if (tmp) *tmp = '\0'; // remove the '\n'
tmp = strdup(line);
#define MAXCOLUMNS 2
char *token[MAXCOLUMNS];
int c = 0;
while (tmp)
{
if (c == MAXCOLUMNS) puts("too many columns"), exit(1);
token[c++] = strsep(&tmp, ",");
}
if (1 <= c) printf("column 1: %s\n", token[0]);
if (2 <= c) printf("column 2: %s\n", token[1]);
// ONLY if the line's tokens are no longer needed:
free(*token);
}
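To go from the split columns to struct members, something along these lines might work. This is only a sketch: `struct person` and `parse_person` are made-up names (the question's real struct isn't shown), and it assumes exactly two columns, an integer followed by a name. It uses `strchr` and `strtol` instead of `strsep`, so the line is neither modified nor duplicated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; the struct from the question is not shown. */
struct person {
    int id;
    char name[64];
};

/* Parses "id,name" into *out. Returns 0 on success, -1 on malformed input. */
int parse_person(const char *line, struct person *out)
{
    const char *comma;
    char *end;
    long id;
    size_t len;

    assert(line != NULL && out != NULL);
    comma = strchr(line, ',');
    if (comma == NULL)
        return -1;                      /* no second column */
    id = strtol(line, &end, 10);
    if (end == line || end != comma)
        return -1;                      /* first column is not a plain integer */
    len = strcspn(comma + 1, "\r\n");   /* name length without the line ending */
    if (len >= sizeof out->name)
        return -1;                      /* name would not fit */
    memcpy(out->name, comma + 1, len);
    out->name[len] = '\0';
    out->id = (int)id;
    return 0;
}
```

Inside the `fgets` loop you would call `parse_person(line, &p)` and skip or report the line when it returns -1.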
I have a dynamically updated text file with names of people, I want to parse the file to extract "Caleb" and the string that follows his name. However, his name may not always be in the list and I want to account for that.
I could do it in Java, but not even sure what to do in C. I could start by reading in the text file line by line, but then how would I check if "Caleb" is a substring of the string I just read in and handle the case when he isn't? I want to do this without using external libraries - what would be the best method?
Barnabas: Followed by a string
Bart: Followed by a string
Becky: Followed by a string
Bellatrix: Followed by a string
Belle: Followed by a string
Caleb: I want this string
Benjamin: Followed by a string
Beowul: Followed by a string
Brady: Followed by a string
Brick: Followed by a string
returns: "Caleb: I want this string" or "Name not found"
but then how would I check if "Caleb" is a substring of the string
The heart of the question as I read it. strstr does the job.
char *matchloc;
if ((matchloc = strstr(line, "Caleb:"))) {
// You have a match. Code here.
}
However, in this particular case you really want "starts with Caleb", so strncmp does better:
if (!strncmp(line, "Caleb:", 6)) {
// You have a match. Code here.
}
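Putting the prefix check and the "string that follows" together could look like the sketch below. `value_for_name` is a hypothetical helper, and it assumes each line has the form `Name: text` as in the sample file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If line starts with "<name>:", returns a pointer to the text after the
 * colon (leading spaces skipped); otherwise returns NULL.
 */
const char *value_for_name(const char *line, const char *name)
{
    size_t n;

    assert(line != NULL && name != NULL);
    n = strlen(name);
    if (strncmp(line, name, n) != 0 || line[n] != ':')
        return NULL;            /* different name, or only a longer name */
    line += n + 1;              /* step past "<name>:" */
    while (*line == ' ')
        line++;
    return line;
}
```

Call it on every line you read; if it never returns non-NULL before EOF, print "Name not found".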
So, to check whether the user Caleb exists, you can simply call strstr on each line, and if it matches, use strtok to get only the string that follows.
I don't know how you are opening the file, but you can use getline to read it line by line.
You can do something like this:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main(void){
    FILE *file;
    char *fich = "FILE.TXT";
    char *line = NULL;      // allocated and grown by getline
    char StringFile[100];   // copy of the matching line
    size_t len = 0;
    ssize_t stringLength;
    const char s[2] = ":";  // delimiter to split the string on
    char *token;
    int check = 0;
    file = fopen(fich, "r");
    if(file == NULL){
        fprintf(stderr, "[ERROR]: cannot open file <%s> ", fich);
        perror("");
        exit(1);
    }
    while((stringLength = getline(&line, &len, file)) != -1){
        if(stringLength > 0 && line[stringLength-1] == '\n'){
            line[stringLength-1] = '\0'; // remove the \n if it exists
        }
        if(strstr(line, "Caleb:")){
            check = 1;
            // copy the line into the array (truncated if it is too long)
            snprintf(StringFile, sizeof StringFile, "%s", line);
            token = strtok(StringFile, s);  // the name before the ':'
            token = strtok(NULL, s);        // the string after the ':'
            if(token != NULL){
                printf("%s\n", token);
            }
            break;
        }
    }
    if(check == 0){
        printf("Name not found\n");
    }
    free(line);
    fclose(file);
    return 0;
}
The code may contain some errors, but that is the idea: when it finds the name, it copies the line into an array and then splits it.
Here is the format of the file that is being read:
type Extensions
application/mathematica nb ma mb
application/mp21 m21 mp21
I am able to read each entry from the file.
Now I want to make a key-value pair where the output for the above entries would be like
{nb: application/mathematica}
{ma:application/mathematica} and so on.
This is my current code, which simply reads through the entries:
char buf[MAXBUF];
while (fgets(buf, sizeof buf, ptr) != NULL)
{
if(buf[0] == '#' || buf[0] == '\n')
continue; // skip the rest of the loop and continue
printf("%s\n", buf);
}
"How to split the a string into separate words in C, where the space is not constant?"
A simple answer is strtok() (declared in string.h), but keep in mind that this function modifies your initial string. So, if you need the original again, call strtok on a copy of it.
char *strtok(char *str, const char *delim);
where, in place of delim, you pass " " (a single space).
Basically, strtok splits your string according to the given delimiters.
Here is an example of strtok:
char* p = strtok(str," ");
while(p != NULL){
printf("%s\n",p);
p=strtok(NULL," ");
}
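Applied to the MIME-type file from the question, the same loop can emit one pair per extension. The following is a sketch under assumptions: `print_pairs` is a made-up name, and tabs are added to the delimiter set in case the columns are tab-separated. Because strtok treats a run of delimiters as one, a varying number of spaces is not a problem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Splits "type ext1 ext2 ..." in place and writes one "{ext: type}" line
 * per extension to out. Returns the number of pairs written.
 */
int print_pairs(char *line, FILE *out)
{
    const char *delims = " \t\r\n";
    char *type, *ext;
    int count = 0;

    assert(line != NULL && out != NULL);
    type = strtok(line, delims);        /* first word is the MIME type */
    if (type == NULL)
        return 0;                       /* blank line */
    while ((ext = strtok(NULL, delims)) != NULL) {
        fprintf(out, "{%s: %s}\n", ext, type);
        count++;
    }
    return count;
}
```

In the question's loop, `print_pairs(buf, stdout)` would replace the `printf`; the `type Extensions` header line needs to be skipped separately, since it would otherwise be treated as data.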
I'm writing a game in SDL2 for a school project, in C, I have a config that lists key-values pairs as such:
groundTiles: images/Overworld/groundTiles.png
and
cellHeight: 32
How should I go about parsing this data? Because my attempts result in the integers being read correctly but strings are either missing chars or are completely corrupt. I'm somewhat of a beginner to C, at least in terms of file i/o
I need another set of eyes on this code because I've spent too many hours on this already.
Could it have something to do with this struct in my header and how I'm using it to store temporary data?
typedef struct TileMapData_S
{
Uint32 col, row, cellWidth, cellHeight, numCells;
char *mapName;
char *emptyTileName;
Bool flag;
SDL_Color *colors;
Tile* tileTypes;
char *colorMap;
}TileMapData;
I've tried making it an unnamed struct in the function, then the source. No luck. I tried just not using a struct and fscanf'ing each piece of data into a separate variable. Same thing, no luck. If I did fscanf(file, "%s %s", buf, temp) with temp being the value of the key I'm parsing, then I get the first encounter of the string I'm looking for, then it copies itself to the other two char* that are holding the names of my sprites/files.
EDIT: This is my attempt based on comments, which does not work, any insight would be appreciated
while (!data->flag)
{
while (tempString != EOF)
{
tempString = strtok(buf, " \n");
if (strcmp(tempString, "width:") == 0)
{
tempString = strtok(buf, "\n\0 ");
map->numColumns = atoi(tempString);
continue;
}
.
.
.
if (strcmp(tempString, "groundTiles:") == 0)
{
data->mapName = strtok(buf, "\n\0 ");
data->mapName = tempString;
if (data->mapName != NULL)
{
data->flag = true;
}
else
{
data->flag = false;
}
continue;
}
.
.
.
tempString = fgets(buf, sizeof(buf), file);
slog(buf);
}
rewind(file);
}
I was expecting to get the string I wanted, without the whitespace/null-terminating char, but ended up with an infinite loop
END EDIT
I expect that when I parse groundTiles: images/Overworld/groundTiles.png
using fscanf(file, "%s", buf), doing strcmp on that and a known string (groundTiles:), then a second fscanf should provide the string images/Overworld/groundTiles.png
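One possible cause of the corrupted strings (an assumption, since the complete code isn't shown) is that pointers such as `data->mapName` end up pointing into `buf`, which the next read overwrites. A sketch of an alternative: read a whole line with fgets, split it at the colon, and store a heap copy of the value. `split_key_value` and `copy_string` are hypothetical helpers, not part of SDL or the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Splits "key: value\n" in place. On success sets *key and *value to point
 * into line and returns 0; returns -1 if the line has no colon.
 */
int split_key_value(char *line, char **key, char **value)
{
    char *colon;

    assert(line != NULL && key != NULL && value != NULL);
    line[strcspn(line, "\r\n")] = '\0';             /* drop the line ending */
    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;
    *colon = '\0';                                  /* terminate the key */
    *key = line;
    *value = colon + 1 + strspn(colon + 1, " \t");  /* skip blanks after ':' */
    return 0;
}

/* Returns a heap copy of s that outlives the line buffer, or NULL. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
```

With this, a line whose key compares equal to `groundTiles` would be stored as `data->mapName = copy_string(value)`, and numeric values such as `cellHeight` can go through `atoi` or `strtol` directly.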
I am trying to read a file line by line and split it into words. Those words should be saved into an array. However, the program only gets the first line of the text file and when it tries to read the new line, the program crashes.
FILE *inputfile = fopen("file.txt", "r");
char buf [1024];
int i=0;
char fileName [25];
char words [100][100];
char *token;
while(fgets(buf,sizeof(buf),inputfile)!=NULL){
token = strtok(buf, " ");
strcpy(words[0], token);
printf("%s\n", words[0]);
while (token != NULL) {
token = strtok(NULL, " ");
strcpy(words[i],token);
printf("%s\n",words[i]);
i++;
}
}
After the good answer from xing, I decided to write a full, simple program that does your task and to say something about my solution. My program reads a file, given as an input argument, line by line and appends each line to a buffer.
Code:
#include <assert.h>
#include <errno.h>
#define _WITH_GETLINE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define assert_msg(x) for ( ; !(x) ; assert(x) )
int
main(int argc, char **argv)
{
    FILE *file;
    char *buf, *token;
    size_t length, size;
    ssize_t read;   /* getline(3) returns -1 at EOF, so this must be signed */

    assert(argc == 2);
    file = fopen(argv[1], "r");
    assert_msg(file != NULL) {
        fprintf(stderr, "Error occurred: %s\n", strerror(errno));
    }
    buf = token = NULL; /* realloc(3) and getline(3) both start from NULL */
    length = size = 0;
    while ((read = getline(&token, &length, file)) != -1) {
        if (token[read - 1] == '\n')
            token[read - 1] = ' ';
        buf = realloc(buf, size + read + 1);    /* +1 for the terminating \0 */
        assert(buf != NULL);
        if (size == 0)
            buf[0] = '\0';  /* strncat(3) needs a string to append to */
        (void)strncat(buf, token, read);
        size += read;
    }
    printf("%s\n", buf != NULL ? buf : "");
    fclose(file);
    free(buf);
    free(token);
    return (EXIT_SUCCESS);
}
For file file.txt:
that is a
text
which I
would like to
read
from file.
I got a result:
$ ./program file.txt
that is a text which I would like to read from file.
A few things are worth saying about this solution:
Instead of fgets(3) I used getline(3), because it reports the length of the line it read (the read variable) and allocates memory for the string automatically (token). It is important to remember to free(3) it. On some systems (FreeBSD, for example) getline(3) is not declared by default in order to avoid compatibility problems. Therefore, the _WITH_GETLINE macro is defined before the <stdio.h> header to make the function available.
buf holds only as much space as is needed to store the string. After each line is read from the file, buf is extended by the required amount with realloc(3). It is a somewhat more "universal" solution. It is important to remember to free objects allocated on the heap.
I also used strncat(3), which ensures that no more than read characters (the length of token) are appended to buf. This is still not the best way to use strncat(3), because we should also test for string truncation. But in general it is better than plain strcat(3), which is not recommended because it lets malicious users arbitrarily change a running program's functionality through a buffer overflow attack. strcat(3) and strncat(3) also add the terminating \0.
getline(3) returns token with the newline character included, so I decided to replace the newline with a space (in the context of building sentences from the words in the file). I should also remove the last space, but I did not want to complicate the source code.
As an optional extra, I also defined my own macro assert_msg(x), which runs assert(3) and prints a text message with the error. It is only a nicety, but thanks to it we can see the error message produced by a failed attempt to open the file.
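The grow-and-append step described above can also be written with an explicit length and memcpy, which avoids rescanning buf on every append. This is a sketch of a variation, not the program above; `append_line` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Appends line to the heap string *buf (currently *len characters long),
 * turning a trailing newline into a space. Returns 0 on success, or -1 if
 * realloc fails, in which case *buf and *len are left unchanged.
 */
int append_line(char **buf, size_t *len, const char *line)
{
    size_t n;
    char *p;

    assert(buf != NULL && len != NULL && line != NULL);
    n = strlen(line);
    p = realloc(*buf, *len + n + 1);    /* +1 for the terminating '\0' */
    if (p == NULL)
        return -1;
    memcpy(p + *len, line, n + 1);      /* copies the '\0' as well */
    if (n > 0 && p[*len + n - 1] == '\n')
        p[*len + n - 1] = ' ';
    *buf = p;
    *len += n;
    return 0;
}
```

Start with `char *buf = NULL; size_t len = 0;`, call `append_line(&buf, &len, token)` for each line that getline returns, and `free(buf)` at the end.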
The problem is getting the next token in the inner while loop and passing the result to strcpy without any check for a NULL result.
while(fgets(buf,sizeof(buf),inputfile)!=NULL){
token = strtok(buf, " ");
strcpy(words[0], token);
printf("%s\n", words[0]);
while (token != NULL) {//not at the end of the line. yet!
token = strtok(NULL, " ");//get next token. but token == NULL at end of line
//passing NULL to strcpy is a problem
strcpy(words[i],token);
printf("%s\n",words[i]);
i++;
}
}
By incorporating the check into the while condition, passing NULL as the second argument to strcpy is avoided.
while ( ( token = strtok ( NULL, " ")) != NULL) {//get next token != NULL
//if token == NULL the while block is not executed
strcpy(words[i],token);
printf("%s\n",words[i]);
i++;
}
Sanitize your loops, and don't repeat yourself:
#include <stdio.h>
#include <string.h>
int main(void)
{
FILE *inputfile = fopen("file.txt", "r");
char buf [1024];
int i=0;
char fileName [25];
char words [100][100];
char *token;
for(i=0; fgets(buf,sizeof(buf),inputfile); ) {
for(token = strtok(buf, " "); token != NULL; token = strtok(NULL, " ")){
strcpy(words[i++], token);
}
}
return 0;
}
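The loop above still trusts the input to fit in `words`. A bounded variant might look like the sketch below; `add_words` and the two macros are names made up here, chosen to mirror the `char words[100][100]` array from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 100
#define MAX_WORD_LEN 100

/*
 * Splits line in place and appends its words to words[], starting at index
 * count. Words that do not fit in a row are skipped. Returns the new count.
 */
size_t add_words(char *line, char words[][MAX_WORD_LEN], size_t count)
{
    const char *delims = " \t\r\n";
    char *token;

    assert(line != NULL && words != NULL);
    for (token = strtok(line, delims);
         token != NULL && count < MAX_WORDS;
         token = strtok(NULL, delims)) {
        if (strlen(token) >= MAX_WORD_LEN)
            continue;                   /* too long for one row */
        strcpy(words[count++], token);
    }
    return count;
}
```

In the reading loop this becomes `i = add_words(buf, words, i);`. Including `\n` in the delimiters also keeps the newline that fgets leaves behind out of the last word.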
Hi guys I have this file struct:
0
2 4
0: 1(ab) 5(b)
1: 2(b) 6(a)
2: 0(a) 2(b)
3: 2(a) 6(b)
4: 5(ab)
5: 2(a) 6(b)
6: 4(b) 6(ab)
Each line will feed a struct with its data (numbers + letters).
What's the best way to read the line and get the strings I want?
Example:
0
2 4
0,1,ab,5,b
1,2,b,6,a
...
The lines may vary in size because we can have 1, 2, 3, .... numbers.
Here is what I have so far:
//struct
#define MAX_ 20
struct otherstats{ //struct otherStats
int conectstat[MAX_];//conection with others stats
int transitions[MAX_];//Symbols betwen conection ASCI
}tableStats[MAX_];
struct sAutomate{
int stat_initial; //initial
int stats_finals[MAX_]; //final orfinals
struct otherstats tableStats[MAX_]; //otherStats 0 1 2 3 4 5 6
};
/* eXample that what i want ..using the example
sAutomate.stat_initial=0
sAutomate.stats_finals[0]=2
sAutomate.stats_finals[1]=4
Others Stats table
//0
sAutomate.tableStats[0].conectstat[0]=1;
sAutomate.tableStats[0].conectstat[1]=5;
sAutomate.tableStats[0].transitions[0]=ab;
sAutomate.tableStats[0].transitions[1]=b;
//1
sAutomate.tableStats[1].conectstat[0]=2;
sAutomate.tableStats[1].conectstat[1]=6;
sAutomate.tableStats[1].transitions[0]=b;
sAutomate.tableStats[1].transitions[1]=a;
///etc
*/
void scanfile(){ //function to read the file
struct sAutomate st; //initialize st struct
char filename[] = "txe.txt";
FILE *file = fopen ( filename, "r" );
char buf[81];
char parts[5][11];
fscanf(file,"%d", &st.stat_initial);//read first line
printf("initial state : %d \n", st.stat_initial);
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
while (fgets(buf, sizeof(buf), stdin) != NULL)
{
if (sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
parts[0], parts[1], parts[2], parts[3], parts[4]) == 5)
{
printf("parts: %s, %s, %s, %s, %s\n",
parts[0], parts[1], parts[2], parts[3], parts[4]);
}
else
{
printf("Invalid input: %s", buf);
}
}
//fclose
First problem I see is you're overwriting stats_finals:
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
What you wanted to do here was:
fscanf(file,"%d",&st.stats_finals[0]);
fscanf(file,"%d",&st.stats_finals[1]);
To save off both the "2" and the "4" from the text file.
Second major problem is you're reading from stdin:
while (fgets(buf, sizeof(buf), stdin) != NULL)
That doesn't read your text file, that reads input from the keyboard... So you wanted that to be:
while (fgets(buf, sizeof(buf), file) != NULL)
Third (minor) problem is that fscanf() will not read newlines, and fgets() will. This means when you go from reading your second stats_finals to the first read in the while loop, your first input will just be the left over newline character. That's not a big deal since you check for "invalid input", but it's worth noting.
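One way to sidestep the leftover newline is to read the two header lines with fgets as well and pull the numbers out with strtol. A sketch, with `read_int_line` as a hypothetical helper:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Reads one line from fp and stores up to max integers from it in values.
 * Returns the number of integers stored, or -1 at end of file or on error.
 */
int read_int_line(FILE *fp, int *values, int max)
{
    char buf[256];
    char *p, *end;
    int n = 0;

    assert(fp != NULL && values != NULL);
    if (fgets(buf, sizeof buf, fp) == NULL)
        return -1;
    for (p = buf; n < max; p = end) {
        long v = strtol(p, &end, 10);
        if (end == p)
            break;                      /* no more numbers on this line */
        values[n++] = (int)v;
    }
    return n;
}
```

Calling `read_int_line(file, &st.stat_initial, 1)` and then `read_int_line(file, st.stats_finals, MAX_)` consumes both header lines completely, so the while loop's first fgets sees the `0: 1(ab) 5(b)` line. It also copes with any number of final states on the second line.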
Finally, your sscanf looks wrong to me:
sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
The 10 in %10[^:] is a width of 10; I don't think that's what you wanted...
And why are you checking for commas? You didn't have any in your text file.
I think this is more what you were looking for:
sscanf(buf, "%[0-9]: %[0-9](%[^)]) %[0-9](%[^)])",
Here %[0-9] takes digits (0 to 9).
EDIT
Missed your original point. If you don't know how many entries each line you're reading will hold, you can't use a single sscanf() call with a fixed format. It's that simple. :)
The scanf family assumes you know how many objects you'll be parsing and the format string takes in that many. There are other options however.
Read a single line with fgets as you're doing, but then you can tokenize it. Either with the C function strtok or by your own hand with a for loop.
One note however:
Since you don't know how long it is, this: char parts[5][11]; is not your best bet. This limits you to 2 entries... probably it would be better to do this dynamically (read the line then allocate the correct size to store your tokens in.)
If you really don't know how many numbers and letters the line will contain, why are you reading a fixed amount of numbers and letters?
You could read the whole line with fgets and then parse it with a tokenizer like strtok, something like this:
const char* const DELIMITERS = " ";
int i; // index for tableStats
char* token;
token = strtok(line, DELIMITERS);
// first integer
if (token == NULL || sscanf(token, "%d:", &i) < 1) {
    // error
}
/* it seems like you should have at least one element in your "list",
 * otherwise this is not necessary
 */
token = strtok(NULL, DELIMITERS);
// NOTE: %[^)] stores a string, so this assumes transitions[j] is a char
// buffer (e.g. char transitions[MAX_][11]), not an int as in the question
if (token == NULL || sscanf(token, "%d(%10[^)])",
        &(tableStats[i].conectstat[0]),
        tableStats[i].transitions[0]) < 2) {
    // error
}
// read optional part
for (int j = 1; (token = strtok(NULL, DELIMITERS)) != NULL; ++j)
    if (sscanf(token, "%d(%10[^)])", &(tableStats[i].conectstat[j]),
            tableStats[i].transitions[j]) < 2) // sscanf fills two items here
        break;
Remember that strtok changes the string; make a copy of it if you still need it.
Obviously this code is for the arbitrarily long lines; reading the first two lines is trivial.
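For reference, here is a compact version of that approach as one function that reports how many transitions it found. It is a sketch only: `struct edge`, `MAX_SYMBOLS` and `parse_state_line` are hypothetical and differ from the question's struct, which stores transitions as int.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 11  /* up to 10 symbols plus the terminating '\0' */

/* Hypothetical transition record: target state and the symbols leading to it. */
struct edge {
    int target;
    char symbols[MAX_SYMBOLS];
};

/*
 * Parses a line such as "0: 1(ab) 5(b)" in place. Stores the state number
 * in *state and up to max_edges transitions in edges. Returns the number
 * of transitions, or -1 on malformed input or too many transitions.
 */
int parse_state_line(char *line, int *state, struct edge *edges, int max_edges)
{
    const char *delims = " \t\r\n";
    char *token;
    int n = 0;

    assert(line != NULL && state != NULL && edges != NULL);
    token = strtok(line, delims);
    if (token == NULL || sscanf(token, "%d:", state) != 1)
        return -1;
    while ((token = strtok(NULL, delims)) != NULL) {
        if (n == max_edges)
            return -1;
        if (sscanf(token, "%d(%10[^)])",
                   &edges[n].target, edges[n].symbols) != 2)
            return -1;
        n++;
    }
    return n;
}
```

The return value tells the caller how many entries of `edges` are valid for that state, so lines with one, two or more transitions are handled by the same call.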