C - Reading a CSV file into char arrays

I have very little experience in C programming, particularly with file handling. I am developing a project in which I'm supposed to create a sign-up/log-in system. I have a .csv file in which the fields are separated by commas.
What I am trying to do is read the first and second columns into two char arrays, respectively.
char userLogin[100];
char userPassword[100];
FILE *file3 = fopen("C:\\Users\\Kshitiz\\Desktop\\BAAS\\signup_db.csv","r");
if(file3 != NULL){
while(!feof(file3)){
fscanf(file3,"%[^,],%s",userLogin,userPassword);
puts(userLogin);
puts(userPassword);
}
}
fclose(file3);
Content of signup_db.csv:
Username,Password
SBI063DDN,Qazwsx1234
ICICIDDN456,WSXEDC1234r
Expected Output:
Username
Password
SBI063DDN
Qazwsx1234
ICICIDDN456
WSXEDC1234r
Output which I'm getting:
Username
Password
SBI063DDN
Qazwsx1234
ICICIDDN456
WSXEDC1234r
WSXEDC1234r
Can anyone please help me resolve this issue? Thank you!

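The fscanf() function returns the number of items in the argument list that were successfully filled. So try this instead:
while(fscanf(file3," %99[^,],%99s",userLogin,userPassword) == 2)
{
puts(userLogin);
puts(userPassword);
}
The problem you mentioned is most likely caused by the newline character at the end of your file. feof() only becomes true after a read has actually failed, so once the last line has been read you have not yet reached end of file and the loop body runs one more time. On that extra pass fscanf() fails to fill userPassword, which still holds the previous value and is printed again. Testing the return value of fscanf() stops the loop as soon as a line cannot be parsed. The leading space in the format skips the newline left over from the previous line, and the 99 widths keep the input from overflowing the 100-byte arrays.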
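The same return-value check can be tried on a single line with sscanf. This is only a sketch under assumptions: parse_login_line is a hypothetical helper name, and both output buffers are assumed to be at least 100 bytes, matching the arrays in the question.

```c
#include <stdio.h>

/* Hypothetical helper: splits "login,password" into two buffers.
   Both buffers are assumed to hold at least 100 bytes.
   Returns 1 when both fields were filled, 0 otherwise. */
int parse_login_line(const char *line, char *login, char *password)
{
    /* The leading space skips whitespace such as a leftover newline;
       the 99 widths leave room for the terminating '\0'. */
    return sscanf(line, " %99[^,],%99s", login, password) == 2;
}
```

A caller could read each line with fgets and pass it to this helper, skipping any line for which it returns 0.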

In my case I have the expected results, but I don't know if there is a difference with the compiler or if my csv file is different (I've tried to recreate it). Here is another way to parse the file, check if you have the expected results:
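#include <stdio.h>
#include <string.h>
#define LINE_LENGTH 1000
int main(void) {
char line[LINE_LENGTH];
const char *delimiter = ",\r\n"; /* the line ending is a delimiter too, so it is stripped from the last field */
char *token;
FILE *file3 = fopen("signup_db.csv", "r");
if (file3 == NULL) {
perror("signup_db.csv");
return 1;
}
while (fgets(line, LINE_LENGTH, file3) != NULL) {
token = strtok(line, delimiter); /* first column */
if (token == NULL) {
continue; /* blank line, nothing to print */
}
printf("%s\n", token);
token = strtok(NULL, delimiter); /* second column */
if (token != NULL) {
printf("%s\n", token);
}
}
fclose(file3);
return 0;
}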

Related

File I/O Extraction with structures in C

The task is to read in a .txt file given as a command line argument. The file contains a list of unstructured information on every airport in the state of Florida; note that this is only a snippet of the total file. Some data must be ignored, such as ASO ORL PR A 0 18400: anything that does not pertain to the structured variables within airPdata.
The assignment is asking for the site number, locID, fieldname, city, state, latitude, longitude, and if there is a control tower or not.
INPUT
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL ASO ORL PR 28-26-08.0210N 081-28-23.2590W PR NON-NPIAS N A 0 18400
03406.18*H 32FL MEYER- INC ORLANDO FL ASO ORL PR 28-30-05.0120N 081-22-06.2490W PR NON-NPAS N 0 0
OUTPUT
Site# LocID Airport Name City ST Latitude Longitude Control Tower
------------------------------------------------------------------------
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL 28-26-08.0210N 081-28-23.2590W N
03406.18*H 32FL MEYER ORLANDO FL 28-30.05.0120N 081-26-39.2560W N
etc.. etc. etc.. etc.. .. etc.. etc.. ..
etc.. etc. etc.. etc.. .. etc.. etc.. ..
My code so far looks like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct airPdata{
char *siteNumber;
char *locID;
char *fieldName;
char *city;
char *state;
char *latitude;
char *longitude;
char controlTower;
} airPdata;
int main (int argc, char* argv[])
{
char text[1000];
FILE *fp;
char firstwords[200];
if (strcmp(argv[1], "orlando5.txt") == 0)
{
fp = fopen(argv[1], "r");
if (fp == NULL)
{
perror("Error opening the file");
return(-1);
}
while (fgets(text, sizeof(text), fp) != NULL)
{
printf("%s", text);
}
}
else
printf("File name is incorrect");
fflush(stdout);
fclose(fp);
}
So far I'm able to read the whole file and output the unstructured input to the command line.
The next thing I tried to figure out is how to extract the strings piece by piece and store them into the variables within the structure. Currently I'm stuck at this phase. I've looked up information on strcpy and other string library functions, data extraction methods, and ETL; I'm just not sure which function to use within my code.
I've done something very similar to this in Java using substrings. If there is a way to take a substring of the massive string of text, and set parameters on which substrings are held in which variable, that would potentially work. For example, LocID is never more than 4 characters long, so anything with a number/letter combination that is four characters long could be stored into airPdata.locID.
After the variables are stored within the structures, I know I have to use strtok to organize them within the list under site#, locID, etc. However, that's my best guess at how to approach this problem; I'm pretty lost.
I don't know what the format is. It can't be space-separated, since some of the fields have spaces in them. It doesn't look fixed-width. Because you mentioned strtok, I'm going to assume it's tab-separated.
You can use strsep for that. strtok has a lot of problems that strsep solves, but strsep isn't standard C. I'm going to assume this is an assignment requiring standard C, so I'll begrudgingly use strtok.
The basic thing to do is to read each line, and then split it into columns with strtok or strsep.
char line[1024];
while (fgets(line, sizeof(line), fp) != NULL) {
char *column;
int col_num = 0;
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
printf("%d: %s\n", col_num, column);
}
}
fclose(fp);
strtok is funny. It keeps its own internal state of where it is in the string. The first time you call it, you pass it the string you're looking at. To get the rest of the fields, you call it with NULL and it will keep reading through that string. That's why there's that funny for loop that looks like it's repeating itself.
Global state is dangerous and very error prone. strsep and strtok_r fix this. If you're being told to use strtok, find a better resource to learn from.
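If neither strsep nor strtok_r is available, a similar helper can be written in standard C. The following is only a sketch: next_field is a hypothetical name, it assumes a single non-zero delimiter character, and it modifies the string it is given. Unlike strtok, it keeps its state in the caller's pointer and returns empty fields instead of skipping them, which matters when column numbers are significant.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the next field and advances *cursor.
   It keeps no hidden state and preserves empty fields.
   delim is assumed to be a non-zero character. */
char *next_field(char **cursor, char delim)
{
    char *start = *cursor;
    char *end;

    if (start == NULL)
        return NULL;          /* no fields left */

    end = strchr(start, delim);
    if (end != NULL) {
        *end = '\0';          /* terminate this field */
        *cursor = end + 1;    /* the next call resumes after the delimiter */
    } else {
        *cursor = NULL;       /* this was the last field */
    }
    return start;
}
```

To use it, set a char pointer to the start of the line and call next_field(&pointer, '\t') until it returns NULL.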
Now that we have each column and its position, we can do what we like with it. I'm going to use a switch to choose only the columns we want.
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
switch( col_num ) {
case 1:
case 2:
case 3:
case 4:
case 5:
case 9:
case 10:
case 13:
printf("%s\t", column);
break;
default:
break;
}
}
puts("");
You can do whatever you like with the columns at this point. You can print them immediately, or put them in a list, or a structure.
Just remember that column points into line, and line will be overwritten by the next fgets. If you want to store column, you'll have to copy it first. You can do that with strdup but *sigh* that isn't standard C. strcpy is really easy to use wrong. If you're stuck with standard C, write your own strdup.
/* needs <stdlib.h> and <string.h> */
char *mystrdup( const char *src ) {
/* strlen gives the string length; sizeof(src) would only be the size of a pointer */
char *dst = malloc( strlen(src) + 1 );
if( dst != NULL )
strcpy( dst, src );
return dst;
}
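Putting the pieces together, each selected column can be copied into the struct from the question. This is a sketch under assumptions: store_column and copy_string are hypothetical names, and the column numbers are taken from the switch above rather than verified against the real file.

```c
#include <stdlib.h>
#include <string.h>

typedef struct airPdata {
    char *siteNumber;
    char *locID;
    char *fieldName;
    char *city;
    char *state;
    char *latitude;
    char *longitude;
    char controlTower;
} airPdata;

/* Returns a heap-allocated copy of src, or NULL if malloc fails. */
static char *copy_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);
    if (dst != NULL)
        strcpy(dst, src);
    return dst;
}

/* Hypothetical helper: stores one column in the matching field.
   The column numbers are assumed from the switch shown earlier.
   Columns that are not needed are ignored. */
void store_column(airPdata *rec, int col_num, const char *column)
{
    switch (col_num) {
    case 1:  rec->siteNumber   = copy_string(column); break;
    case 2:  rec->locID        = copy_string(column); break;
    case 3:  rec->fieldName    = copy_string(column); break;
    case 4:  rec->city         = copy_string(column); break;
    case 5:  rec->state        = copy_string(column); break;
    case 9:  rec->latitude     = copy_string(column); break;
    case 10: rec->longitude    = copy_string(column); break;
    case 13: rec->controlTower = column[0];           break;
    default: break;
    }
}
```

The record should be zero-initialized before the first call, and every copied string must eventually be released with free.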

Breaking a string in C with multiple spaces

Ok, so my code currently splits a single string like this: "hello world" into:
hello
world
But when I have multiple spaces in between, before or after within the string, my code doesn't behave. It takes that space and counts it as a word/number to be analyzed. For example, if I put in two spaces in between hello and world my code would produce:
hello
(a space character)
world
The space is actually counted as a word/token.
int counter = 0;
int index = strcur->current_index;
char *string = strcur->myString;
char token_buffer = string[index];
while(strcur->current_index <= strcur->end_index)
{
counter = 0;
token_buffer = string[counter+index];
while(!is_delimiter(token_buffer) && (index+counter)<=strcur->end_index)//delimiters are: '\0','\n','\r',' '
{
counter++;
token_buffer = string[index+counter];
}
char *output_token = malloc(counter+1);
strncpy(output_token,string+index,counter);
printf("%s \n", output_token);
TKProcessing(output_token);
//update information
counter++;
strcur->current_index += counter;
index += counter;
}
I can see the problem area in my loop, but I'm a bit stumped as to how to fix this. Any help would be much appreciated.
From a coding standpoint, if you want to know how to do this without a library as an exercise: your inner loop stops when it runs into the first delimiter. On the next pass you are sitting on the second delimiter, so you never enter the inner while loop and you print an empty token. You can replace the single counter++ under //update information with a loop that skips the whole run of delimiters:
//update information
while(is_delimiter(token_buffer) && (index+counter)<=strcur->end_index)
{
counter++;
token_buffer = string[index+counter];
}
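The same idea, skipping a run of delimiters before measuring a token, can also be written with strspn and strcspn from the standard library. This is a sketch only: next_token is a hypothetical name, and the delimiter set is taken from the comment in the question (the terminating '\0' ends the scan by itself).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token starting at *pos.
   Returns a pointer to the token and stores its length in *len,
   or returns NULL when only delimiters (or nothing) remain.
   The input string is not modified. */
const char *next_token(const char **pos, size_t *len)
{
    static const char delims[] = " \r\n";
    const char *start = *pos + strspn(*pos, delims); /* skip leading delimiters */

    *len = strcspn(start, delims);                   /* length of the token */
    *pos = start + *len;                             /* resume here next time */
    return *len > 0 ? start : NULL;
}
```

Because the token is returned as a pointer plus a length, the caller still has to copy it and add the terminating '\0' before treating it as a C string.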
Use the standard C library function strtok() rather than redeveloping such a standard function.
See the related manual page.
You can use it as follows in your case:
#include <string.h>
char *token;
token = strtok (string, " \r\n");
while (token != NULL)
{
// do something with the current token
token = strtok (NULL, " \r\n");
}
As you can see, the first call takes the string to parse; each subsequent call passes NULL as the first argument and returns a char* pointing to the next token, or NULL when none are left. Consecutive delimiters are treated as one, so multiple spaces do not produce empty tokens. Note that strtok() modifies the string it parses.
If you're working on a threaded program, you can use the strtok_r() function (POSIX) instead.
It is called the same way as strtok(), with NULL as the first argument on subsequent calls, plus a saveptr argument that holds the parsing state:
#include <string.h>
char *token;
char *saveptr;
token = strtok_r(string, " \r\n", &saveptr);
while (token != NULL)
{
// do something with the current token
token = strtok_r(NULL, " \r\n", &saveptr);
}
Just put the token-processing logic inside an if(counter > 0){...}, so that malloc happens only when there was a real token, like this:
if(counter > 0){ // there is a real word, not just delimiters
char *output_token = malloc(counter+1);
strncpy(output_token,string+index,counter);
output_token[counter] = '\0'; // strncpy does not add the terminator here
printf("%s \n", output_token);
TKProcessing(output_token);
}

Regarding FOPEN in C

I am having a problem regarding FOPEN in C.
I have this code which reads a particular file from a directory
FILE *ifp ;
char directoryname[50];
char result[100];
char *rpath = "/home/kamal/samples/pipe26/divpipe0.f00001";
char *mode = "r";
ifp = fopen("director.in",mode); /* director file contains path of directory */
while (fscanf(ifp, "%s", directoryname) != EOF)
{
strcpy(result,directoryname); /* Path of directory /home/kamal/samples/pipe26 */
strcat(result,"/"); /* forward slash for path */
strcat(result,name); /* name of the file divpipe0.f00001 */
}
}
Up to this point my code works perfectly, creating a string that looks like "/home/kamal/samples/pipe26/divpipe0.f00001".
The problem arises when I try to use 'result' to open a file: it gives me an error. If I use 'rpath' instead, it works fine, even though both strings contain the same information.
if (!(fp=fopen(rpath,"rb"))) /* This one works fine */
{
printf("fopen failure2!\n");
return;
}
if (!(fp=fopen(result,"rb"))) /* This does not work */
{
printf("fopen failure2!\n");
return;
}
Could someone please tell me why I am getting this error?
I think you mean char result[100];; i.e. without the asterisk. (Ditto for directoryname.)
You're currently stack-allocating an array of 100 pointers. This will not end well.
Note that rpath and mode point to read-only memory. Really you should use const char* for those two literals.
The error is the array 'char* result[100]', here you are allocating an array of 100 pointers to strings, not 100 bytes / characters, which was your intent.
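With result declared as a plain char array, the path can also be assembled in one bounded step with snprintf (C99). This is a sketch only: build_path is a hypothetical helper name, not something from the question.

```c
#include <stdio.h>

/* Hypothetical helper: writes "dir/name" into out.
   Returns 1 on success, 0 if the result did not fit in size bytes. */
int build_path(char *out, size_t size, const char *dir, const char *name)
{
    int n = snprintf(out, size, "%s/%s", dir, name);

    /* snprintf returns the length it wanted to write; a value that
       reaches size means the output was truncated. */
    return n >= 0 && (size_t)n < size;
}
```

Called as build_path(result, sizeof result, directoryname, name), it never writes past the end of result, unlike an unchecked strcpy followed by strcat.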

How can I split an input using multiple delimiters?

I want to split a lessons.txt file into tokens. This file lists some people and their lessons. How can I do it?
Here is my lessons.txt file:
George Adam :Math,Science,Germany
Elizabeth McCurry :Music,Math,History
Tom Hans :Science,Music
First, I want to split each line at ":" and store the names in an array. Second, I want to split the rest at "," and store these lessons in a different array. How can I do this?
Here is my code:
char names[100] , *token, *lecture;
file=fopen("C:\\lessons.txt","r");
while(!feof(file))
{
fgets(names,sizeof(names),file);
printf("%s",names);
token=strtok(names,":");
while(token!=NULL)
{
token=strtok(NULL,":");
printf(" \n %s",token);
lecture=strtok(token,",");
while(lecture!=NULL)
{
lecture=strtok(NULL,",");
printf(" \n\n %s",lecture);
}
}
}
fclose(file);
So you want names to be stored in a separate array, and lessons to be stored in another?
You will need two separate tokens, you are using the same token for names and lessons.
Try this :
FILE *file;
file = fopen("C:\\lessons.txt", "r");
char names[100], *token, *difftok;
while (fgets(names, sizeof(names), file) != NULL) {
token = strtok(names, ":");
//puts(token); ---> George Adam
difftok = strtok(NULL, ",");
//puts(difftok); ---> Math
difftok = strtok(NULL, ",");
//puts(difftok); ---> Science
difftok = strtok(NULL, "\n");
//puts(difftok); ---> Germany
}
fclose(file);
In my excerpt, token will always represent names, and difftok will always be lectures, from here I think you can figure out how to store the tokens into an array. Token goes into one, difftok into another.
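One way to store the tokens, sketched under assumptions: parse_lessons is a hypothetical helper, the name has its trailing spaces trimmed, and the stored pointers refer into the line buffer, so they must be copied if the buffer is reused for the next line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "Name :Lesson1,Lesson2,..." in place.
   Stores the name in *name and up to max_lessons lesson pointers in
   lessons[]. Returns the number of lessons, or -1 if the line has no
   name. All pointers refer into line, which is modified. */
int parse_lessons(char *line, char **name, char *lessons[], int max_lessons)
{
    int count = 0;
    size_t len;
    char *tok = strtok(line, ":");

    if (tok == NULL)
        return -1;

    /* Trim the spaces between the name and the ':' separator. */
    len = strlen(tok);
    while (len > 0 && tok[len - 1] == ' ')
        tok[--len] = '\0';
    *name = tok;

    /* The remaining tokens are lessons; the line ending is a delimiter too. */
    while (count < max_lessons && (tok = strtok(NULL, ",\r\n")) != NULL)
        lessons[count++] = tok;

    return count;
}
```

Unlike the fixed sequence of three strtok calls, the loop copes with a person who has fewer or more lessons.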
Also, your loop condition is unreliable. feof only returns non-zero after a read has already failed at end of file, so a loop like this runs its body one extra time with stale data in names:
while(!feof(file))
Writing it as
while(feof(file) == 0)
is the same test and has the same problem.
That is why I used fgets(...) != NULL instead: fgets returns NULL when it reaches end of file, so the loop stops before any tokens are parsed from a line that was never read.
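The read-then-test pattern can be checked in isolation. This is a minimal sketch: count_lines is a hypothetical helper, and it assumes every line is shorter than the 100-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: counts the lines read from fp.
   fgets returns NULL at end of file or on error, so the loop body
   never runs after a failed read. Assumes each line fits in the
   100-byte buffer. */
int count_lines(FILE *fp)
{
    char line[100];
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL)
        count++;
    return count;
}
```

For a file of three lines this returns 3, whether or not the last line ends with a newline.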

The last character is not printed to a file

I am trying to figure out why the C function strtok is not working properly for me. Here's the problem:
I have a file which contains two types of information: headers and text descriptions. Each line in the file is either a header or part of a text description. A header starts with '>'. The description text follows the header and can span multiple lines. At the end of the text there is an empty line which separates the description from the next header. My aim is to write two separate files: one contains the headers, one per line, and the other contains each corresponding description on a line by itself. To implement this in C, I used fgets to read the file one line at a time into dynamically allocated memory. In order to write the description text on one single line, I used strtok to get rid of any newline characters in the text.
My code is working properly for the header files. However, for the descriptions file, I noticed that the last character of the text is not printed out to the file even though it is printed to the stdout.
FILE *headerFile = fopen("Headers", "w"); //to write headers
FILE *desFile = fopen("Descriptions", "w"); //to write descriptions
FILE *pfile = fopen("Data","r");
if ( pfile != NULL )
{
int numOfHeaders =0;
char **data1 = NULL; //an array to hold a header line
char **data2 = NULL; //an array to hold a description line
char line[700] ; //maximum size for the line
while (fgets(line, sizeof line, pfile ))
{
if(line[0] =='>') //It is a header
{
data1 = realloc(data1,(numOfHeaders +1)* sizeof(*data1));
data1[numOfHeaders]= malloc(strlen(line)+1);
strcpy(data1[numOfHeaders],line);
fprintf(headerFile, "%s",line);//writes the header
if(numOfHeaders >0)
fprintf(desFile, "\n");//writes a new line in the desc file
numOfHeaders++;
}
//it is not a header and not an empty line
if(line[0] != '>' && strlen(line)>2)
{
data2 = realloc(data2,(numOfHeaders +1)* sizeof(*data2));
data2[numOfHeaders]= malloc(strlen(line)+1);
char *s = strtok(line, "\n ");
strcpy(data2[numOfHeaders],s);
fprintf(desFile, "%s",data2[numOfHeaders]);
printf(desFile, "%s",data2[numOfHeaders]);
}
} //end-while
fclose(desFile);
fclose(headerFile);
fclose(pfile );
printf("There are %d headers in the file.\n",numOfHeaders);
}
As mentioned in the comments:
fprintf(desFile, "%s",data2[numOfHeaders]); //okay
printf(desFile, "%s",data2[numOfHeaders]); //wrong
Second line should be:
printf("%s",data2[numOfHeaders]); //okay
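Or, you could do this (buffer must be a char array large enough for the line):
sprintf(buffer, "%s",data2[numOfHeaders]);
fprintf(desFile, "%s", buffer); //never pass data as the format string
printf("%s", buffer);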
Other possible issues:
Without an input file it is not possible to know for certain what strtok() is doing, but here is a guess based on what you have described:
In these two lines:
data2[numOfHeaders]= malloc(strlen(line)+1);
char *s = strtok(line, "\n ");
if the string contained in line has any embedded spaces, s will only contain the segment occurring before the first space. And because you are only calling strtok() once before line gets refreshed:
while (fgets(line, sizeof line, pfile ))
only one token (the very first segment) will be read.
Not always, but normally, strtok() is called in a loop:
char *s = NULL;
s= strtok(stringToParse, "\n ");//make initial call before entering loop
while(s)//ALWAYS test to see if s contains new content, else NULL
{
//do something with s
strcpy(data2[numOfHeaders],s);
//get next token from string
s = strtok(NULL, "\n ");//continue to tokenize string until s is null
}
But, as I said above, you are calling it only once on that string before the content of the string is changed. It is possible then, that the segment not printing has simply not yet been tokenized by strtok().
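If the goal is only to drop the line ending, strcspn can do that without cutting the text at spaces. This is a sketch: strip_newline is a hypothetical helper name, not part of the original code.

```c
#include <string.h>

/* Hypothetical helper: cuts the line at the first '\r' or '\n'.
   Spaces inside the text are left alone. */
void strip_newline(char *line)
{
    /* strcspn returns the index of the first line-ending character,
       or the string length if there is none. */
    line[strcspn(line, "\r\n")] = '\0';
}
```

After calling it on line, the whole description line can be copied with strcpy and written with fprintf.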
