Writing to CSV from another CSV with fgets() and skipping lines - c

FULL DISCLOSURE: SCHOOL ASSIGNMENT
I've been working on some code to pull data from a CSV and move it to another CSV, however I keep encountering this error I can't seem to overcome.
For the part I'm working on, the user supplies a command line argument with 'DELETE OPT1' where OPT1 is the ID of an entry in the CSV. The deleteStuff() function should go through the database.csv and delete the first entry/row of a matching ID. It should achieve this by creating a database.tmp, then copying the database.csv over to the tmp, excluding the first matching entry. The csv is then deleted and then the tmp is renamed to database.csv, as if nothing happened.
However, the source code (database.c) seems to be doing something wrong and deleting everything but the IDs. Below, I've posted the source code and the starting database.csv, as well as what the code outputs after running DELETE 10. Any help would be appreciated, especially in understanding how to move on to the next line with fgets().
Note that the noSpaces() function just removes any empty spaces since it is possible for them to be included in the input according to our prof.
database.c:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char show[] = "SHOW";
char delete[] = "DELETE";
char add[] = "ADD";
void noSpaces(char* s) {
const char* d = s;
do {
while (*d == ' ') {
++d;
}
} while (*s++ = *d++);
}
void showStuff() {
FILE* csv = fopen("database.csv", "rt");
if (csv == NULL) {
printf("\n File opening failed");
exit(1);
}
char buffer[800];
char *Gptr, *ID, *name, *cAge, *cGPA;
int counter = 1;
while (fgets(buffer, sizeof(buffer), csv)) {
ID = strtok(buffer, ",");
noSpaces(ID);
name = (strtok(NULL, ","));
noSpaces(name);
cAge = (strtok(NULL, ","));
noSpaces(cAge);
int age = atoi(cAge);
cGPA = (strtok(NULL, ","));
noSpaces(cGPA);
double GPA = strtod(cGPA, &Gptr);
printf("Record %d: ID=%-5s NAME:%-5s AGE:%-5d GPA:%.1f\n", counter, ID, name, age, GPA);
counter++;
}
fclose(csv);
}
void deleteStuff(char givenID[]) {
FILE* csvread = fopen("database.csv", "rt");
if (csvread == NULL) {
printf("\n File opening failed");
exit(1);
}
FILE* csvwrite = fopen("database.tmp", "wt");
char buffer[800];
char* ID;
int oneAndDone = 0;
while (fgets(buffer,sizeof(buffer),csvread)) {
ID = strtok(buffer, ",");
noSpaces(ID);
if ((strcmp(ID, givenID) == 0) && (oneAndDone == 0)) {
oneAndDone++;
continue;
}
fprintf(csvwrite, "%s", buffer);
}
system("rm database.csv");
system("mv database.tmp database.csv");
fclose(csvread);
fclose(csvwrite);
if (oneAndDone == 0) {
printf("Sorry, the user was not found. Nothing was deleted.\n\n");
exit(1);
}
}
void addStuff(char gID[], char gName[], char gAge[], char gGPA[]) {
char* Gptr;
int age = atoi(gAge);
double GPA = strtod(gGPA, &Gptr);
FILE* csvappend = fopen("database.csv", "at");
if (csvappend == NULL) {
printf("\n File opening failed");
exit(1);
}
fprintf(csvappend, "%s,%s,%d,%.1f", gID, gName, age, GPA);
fclose(csvappend);
}
void main(int argc, char* argv[]) {
if (argc == 1) {
printf("Your did not provide any arguments. Please enter: ./database CMD OPT1 OPT2 OPT3 OPT4 \n\n");
exit(1);
}
if (strcmp(argv[1], show) == 0) showStuff();
else if (strcmp(argv[1], delete) == 0) {
if (argc <= 2) {
printf("Name of record to delete is missing\n\n");
exit(1);
}
deleteStuff(argv[2]);
}
else if (strcmp(argv[1], add) == 0) {
if (argc <= 5) {
printf("Missing ID, Name, AGE, and GPA Arguments\n\n");
exit(1);
}
addStuff(argv[2], argv[3], argv[4], argv[5]);
}
else printf("The command you requested in invalid. Please select from one of these: SHOW, DELETE, ADD\n\n");
}
Original database.csv:
10,bob,18, 3.5
15,mary,20,4.0
5,tom, 17, 3.8
After database.csv:
155

When using strtok you must understand it modifies the string that you are tokenizing. If you will need to use the string after calling strtok, you should make a copy of the string first. Always check the man page for any function you are using if you are unsure exactly how it works, e.g. man 3 strtok
Be cautious when using these functions. If you do use them, note
that:
* These functions modify their first argument.
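To make the effect concrete, here is a minimal sketch (the helper name length_after_strtok is made up for this illustration and is not part of the question's code). strtok overwrites the first delimiter with '\0', so the buffer that is later handed to fprintf ends right after the ID. That is why the output file contains only "155": the row for ID 10 is skipped, and "15" and "5" are written without their trailing fields or newlines.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration: tokenizes `line` on ',' and
 * returns the length strlen() reports for `line` afterwards. */
size_t length_after_strtok(char *line)
{
    strtok(line, ",");    /* overwrites the first ',' with '\0' */
    return strlen(line);  /* the string now ends where the ',' was */
}
```

Called on a buffer holding "10,bob,18, 3.5\n", the function returns 2 and the buffer now reads as just "10"; the rest of the row is still in memory but sits behind the inserted terminator.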
Unless you are programming for a microcontroller or some other device without an operating system (a "freestanding" system) the use of void main() is wrong. See: C11 Standard - §5.1.2.2.1 Program startup(p1). See also: What should main() return in C and C++?
You make use of the flag int oneAndDone = 0; -- which is fine. However, after you have found the ID provided on the command line and incremented oneAndDone++; (or just set oneAndDone = 1;), there is no longer a need to call strtok at all. Wouldn't it make more sense to completely skip the strtok call after oneAndDone is no longer 0? Something like:
char buffer[800];
char* ID;
int oneAndDone = 0;
while (fgets(buffer,sizeof(buffer),csvread)) {
if (oneAndDone == 0) {
char bcopy[800];
strcpy (bcopy, buffer);
ID = strtok(bcopy, ",");
noSpaces(ID);
if (strcmp(ID, givenID) == 0) {
oneAndDone = 1;
continue;
}
}
fprintf(csvwrite, "%s", buffer);
}
You have done a good job Not Skimping On Buffer Size with char buffer[800]; -- but don't use Magic-Numbers in your code either. Better:
...
#include <string.h>
#define MAXC 800 /* if you need a constant, #define one (or more) */
...
char buffer[MAXC];
char* ID;
int oneAndDone = 0;
while (fgets(buffer,sizeof(buffer),csvread)) {
if (oneAndDone == 0) {
char bcopy[MAXC];
...
You certainly want to close the files you are reading and writing from before you make system calls to rm and mv. For example:
fclose(csvread);
fclose(csvwrite);
system("rm database.csv");
system("mv database.tmp database.csv");
You always want to check the return of fclose after a write to catch any stream error or problem with the file that may not be caught by checking only the write itself, e.g.
if (fclose(csvwrite) == EOF)
perror ("fclose-csvwrite");
That way you at least have an indication on whether the stream was flushed successfully and there were no errors on close.
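Putting the last two points together, one possible sketch is shown below. It swaps the system("rm ...") and system("mv ...") calls for the standard remove() and rename() functions from <stdio.h>, which is my substitution rather than something the assignment asks for; the function name finish_replace and its parameters are made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: closes both streams first, then replaces `path`
 * with `tmp_path`. Returns 0 on success, -1 on any failure. */
int finish_replace(FILE *in, FILE *out, const char *path, const char *tmp_path)
{
    int ok = 1;

    if (fclose(in) == EOF)  { perror("fclose-read");  ok = 0; }
    if (fclose(out) == EOF) { perror("fclose-write"); ok = 0; }
    if (!ok)
        return -1;          /* keep the original file if the copy is suspect */

    if (remove(path) != 0)           { perror("remove"); return -1; }
    if (rename(tmp_path, path) != 0) { perror("rename"); return -1; }
    return 0;
}
```

In deleteStuff this would be called as finish_replace(csvread, csvwrite, "database.csv", "database.tmp") once the copy loop is done. On POSIX systems rename() alone replaces an existing target, so the remove() step is only there for platforms where it does not.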
Lastly, Enable Compiler Warnings and do not accept code until it compiles without warning. For gcc/clang use as a minimum -Wall -Wextra -pedantic (consider adding -Wshadow as well). For VS use /W3. For any other compiler, check the documentation and enable comparable warnings. That would point you to add enclosing parentheses around:
} while ((*s++ = *d++));
in void noSpaces(char* s).
That should get you going. Think it through, make the changes, and let me know if you have further problems.

Related

Matching text from 2 files

I have written a program that is designed to recover linux system passwords by searching for matching hashes which are present in two text files
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#define MAXCHAR 1000
//Declaring Functions to match word in file
int matchfile(char *shadowfilename, char*hashtablefilename);
//shadowfilename for shadow.txt hashtablefilename for hash table
void UsageInfo(char *shadowfile, char * hashtablefile );
//Display usage info on arguments for program
void UsageInfo(char *shadowfile, char * hashtablefile) {
printf("Usage: %s %s <shadowfile> <hashtable>\n", shadowfile,hashtablefile);
}
//main function.
int main(int argc, char *argv[]) {
int result, errcode;
//Display format for user to enter arguments and
//End program if user does not enter exactly 3 arguments
if(argc < 3 || argc > 3) {
UsageInfo(argv[1],argv[2]);
exit(1);
}
system("cls");
//Pass command line arguments into searchstringinfile
result = matchfile(argv[1], argv[2]);
//Display error message
if(result == -1) {
perror("Error");
printf("Error number = %d\n", errcode);
exit(1);
}
return(0);
}
//Declaring Functions to match word in file
//int matchfile(char *shadowfilename, char *hashtablefilename);
//shadowfilename for shadow.txt hashtablefilename for hash table
int matchfile(char *shadowfilename, char *hashtablefilename){
FILE *shadowfile;
FILE *hashtable;
char strshadow[MAXCHAR];
char strhash[MAXCHAR];
shadowfile = fopen(shadowfilename, "r");
if (shadowfile == NULL){
printf("Could not open file %s",shadowfilename);
return 1;
}
hashtable = fopen(hashtablefilename, "r");
if (hashtable == NULL){
printf("Could not open file %s",hashtablefilename);
return 1;
}
//Getting text from the 2 files
while (fgets(strshadow, MAXCHAR, shadowfile) != NULL &&fgets(strhash,MAXCHAR,
hashtable) != NULL){
printf("%s", strshadow);
printf("%s", strhash);
int linenumber = 1;
int search_result = 0;
//Matching words line-by-line
if((strstr(strshadow,strhash)) != NULL) {
//Display line in which matched word is found
printf("A match found on line: %d\n", linenumber);
printf("\n%s\n", strhash);
search_result++;
}
linenumber++;
}
fclose(shadowfile);
return 0;
}
However, I am unable to match the two hash values present in the two files because of the characters in front of them.
hashtable.txt.
This file contains the missing password in plain text and its corresponding hash value.
The format is as follows: (password):(hash)
banana:$1$$Tnq7a6/C1wwyKyt0V/.BP/:17482:0:99999:7:::
shadow.txt. This file contains the account username in plain text and its corresponding hash value.
The format is as follows: (user):(hash)
pyc1:$1$$Tnq7a6/C1wwyKyt0V/.BP/:17482:0:99999:7:::
As seen above, the words 'banana' and 'pyc1' prevent the program from detecting that the two hashes match.
Could someone tell me the changes I need to make to overcome this ?
Thank you.
Edit:Clarified format of shadow.txt and hashtable.txt
The simplest way to skip characters in string until some condition is met is:
char someString[MAXCHAR];
for (char* ptr = someString; *ptr != '\0'; ptr++) {
if (conditionIsMet(ptr)) {
doSomething();
break;
}
}
In your case, conditionIsMet(ptr) should be comparing *ptr to ':' and in that case, the password hash is under (ptr + 1) (string starting from the next character). I think you can write the rest of the code yourself.
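If you get stuck, here is one possible sketch of that idea. It uses strchr instead of the hand-written loop to find the ':' and compares only the field between the first and second ':'. The names after_colon and same_hash are made up for this example, and it assumes the hash itself never contains a ':' (true for the sample lines above).

```c
#include <string.h>

/* Returns a pointer to the text after the first ':' in `line`,
 * or NULL if the line contains no ':'. */
const char *after_colon(const char *line)
{
    const char *p = strchr(line, ':');
    return p ? p + 1 : NULL;
}

/* Returns 1 if both lines carry the same hash field, 0 otherwise.
 * The hash field is taken to run up to the next ':' or newline. */
int same_hash(const char *shadow_line, const char *table_line)
{
    const char *a = after_colon(shadow_line);
    const char *b = after_colon(table_line);

    if (a == NULL || b == NULL)
        return 0;

    size_t la = strcspn(a, ":\n");
    size_t lb = strcspn(b, ":\n");
    return la == lb && strncmp(a, b, la) == 0;
}
```

In matchfile you would call same_hash(strshadow, strhash) in place of the strstr test. Note that reading both files in lockstep only compares line N of one file with line N of the other; matching every shadow entry against every table entry needs a nested loop (rewinding the hash table file for each shadow line).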

Get the user to enter a name but using file stream *fp

I am a beginner in C, and I am having trouble getting the user to input a last name, a comma, and then a first name. The input has to be passed to the function
int get_name(FILE *fp)
which is called from my main function. I am also unsure whether main has to take arguments,
for example main(int argc, char *argv[]), or just main(void).
From what I have found so far, a FILE *fp cannot get input from the user via stdin and is only used to read an opened file(?), BUT I am required to get the user's input from the keyboard and pass it to the function. I have written some code, but it doesn't seem to work. I am going to post the version that I think is closest to working.
#define LINESIZE1024
int main(void){
FILE *fp;
char line[LINESIZE];
char first;
char last;
char comma;
while(1){
if(!fgets(line,LINESIZE,stdin)){
clearerr(stdin);
break;
}
if(fp = (sscanf(line,"%s %s %s",&last,&comma,&first)==3))
get_name(fp);
if(get_last_first(fp)== -1)
break;
printf("Please enter first name a comma and then last name");
}
BUT I got an error about converting between a pointer and an integer, and many MORE, but I accidentally closed my console and all the errors that appeared while I was trying to fix it are gone. So please give me some ideas.
What about this second version of the code?
while(1){
if(!fgets(line,LINESIZE,fp)){
clearerr(stdin);
break;
}
if(sscanf(line,"%s %s %s",last,comma,first)==3)
get_last_first(fp);
return 0;
}
It gave me errors too: fp, last, first, and comma are used uninitialized in this function.
OK, so I think I have fixed the previous problem now. However, it doesn't print the name back if the name is given correctly. Here is my fixed main code.
int main(void){
FILE *fp = stdin;
char line[LINESIZE];
char first[16];
char last[16];
while(1){
if(!fgets(line,LINESIZE,stdin)){
clearerr(stdin);
break;
}
if(sscanf(line,"%s ,%s",last,first)==2)
if(get_name(fp)==2)
printf("Your name is: %s %s\n", first, last);
}
return 0;
}
here is my function.
int get_name(FILE *fp){
char line[LINESIZE];
char last[16], first[16];
int n;
/* returns -1 if the input is not in the correct format
or the name is not valid */
if(fgets(line, LINESIZE, fp) == NULL) {
return -1;
}
/* returns 0 on EOF */
if((n = sscanf(line, " %[a-zA-Z-] , %[a-zA-Z-]", last, first)) == EOF) {
return 0;
}
/* prints the name if it's valid */
if((n = sscanf(line, " %[a-zA-Z-] , %[a-zA-Z-]", last, first)) == 2) {
return 2;
}
return 1;
}
I thank you people so much for taking time to read and help me. Please don't be mean :)
Seems that you are making it more complicated than needed. Don't call fgets and scanf in main. Only do that in the function get_name.
It can be something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LINESIZE 1024
int get_name(FILE *fp)
{
char line[LINESIZE];
char* t;
if(!fgets(line, LINESIZE,fp))
{
printf("Error reading input\n");
return 0;
}
t = strstr(line, ",");
if (t)
{
*t = '\0';
++t;
printf("Last: %s - First: %s\n", line, t); /* input order is last name, comma, first name */
return 2;
}
printf("Illegal input\n");
return 0;
}
int main(int argc, char **argv)
{
get_name(stdin);
return 0;
}
If you later decide that you want to read from a file, you can reuse the function get_name without changing it at all. All you need is to change main. Like:
int main(int argc, char **argv)
{
FILE* f = fopen("test.txt", "r");
if (f)
{
get_name(f);
fclose(f);
}
else
{
printf("Open file failed\n");
}
return 0;
}
If you want to read from the keyboard, read from stdin or use scanf, which internally reads from stdin. If you want to read from a file instead, use FILE *fp, but don't forget to open the file and check if it was successful (you'll find lots of tutorials for this).
Further, when reading in strings, you need an array of characters, not a single one. Note further that scanf can already deal with formats like "everything that is not a ',', then a ',', then a string". The format "%[^,]" means "any sequence of characters other than ','":
So you could adapt the code as follows:
#define LINESIZE 1024
int main(void){
char line[LINESIZE];
char first[LINESIZE];
char last[LINESIZE];
while(fgets(line,LINESIZE,stdin)) {
if(sscanf(line,"%[^,],%s",last,first)==2) {
printf("Read in %s ... %s\n",last,first);
}
else {
printf("Please enter last name, a comma, and then first name\n");
}
}
return 0;
}
And if your professor is picky concerning the "use FILE*", you could write:
FILE *fp = stdin;
...
while(fgets(line,LINESIZE,fp)) {
...

Application crashes with segmentation fault

I know what this error means, but I can't see what I did wrong here. I implemented a function to delete files and directories, and I also want to implement the "-i" functionality for a "rm" command. Command used: rm -i filename.
I get asked if I want to remove it, I type "y", and the file I want to remove gets deleted, but the program crashes afterwards and I don't know why.
int has_args(char * argv[], char arg[])
{
int i = 0;
while(argv[i] != NULL) {
if(strcmp(argv[i], arg) == 0) {
return 1;
}
i++;
}
return 0;
}
void print_error(char *this, char *filename)
{
fprintf(stderr, "%s: Could not delete file: %s;\n%s\n", this, filename, strerror(errno));
}
int cmd_rm(int argc, char *argv[])
{
errno = 0;
if(argc > 1) {
if(has_args(argv, "-r")) {
remove_directory(argv[2]);
}
else if(has_args(argv, "-i")) {
char ans[2];
fprintf(stdout, "Delete file: '%s'?\n", argv[2]);
scanf("%s", ans);
trim(ans);
if((ans[0] == 'Y') || (ans[0] == 'y')) {
if(remove(argv[2])) {
print_error(argv[0], argv[2]);
}
}
}
else {
if(remove(argv[1])) {
print_error(argv[0], argv[1]);
}
}
}
else {
puts("ERROR: ");
print_usage(argv[0]);
}
return 0;
}
Why don't you try scanf(" %c", &ch); to read the answer if the user can only input y or n? With char ans[2]; and an unbounded %s, any answer longer than one character writes past the end of the array, which is undefined behavior and a possible cause of the crash. If you would like to support yes or no, declare the buffer as char ans[4]; and limit the field width:
scanf("%3s", ans); /* reads at most 3 characters, leaving room for the terminating '\0' */
As others have pointed out, cmd_rm() should also return 1 or some other value to indicate whether the file was deleted successfully. At the moment every path through cmd_rm() returns 0, so the caller cannot tell success from failure. It would also help if you included your trim function in the question.
In general, segmentation faults are caused by pointers that do not point to valid memory (tricky pointers). A debugger comes in handy for finding them.
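As a sketch of a bounded way to read the confirmation, the following uses fgets instead of scanf so the input can never overrun the buffer. The function name confirm and the 16-byte buffer are choices made for this example only; this is one option, not the only correct one.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` and returns 1 if the
 * first non-blank character is 'y' or 'Y', otherwise 0 (also on EOF
 * or read error). */
int confirm(FILE *in)
{
    char ans[16];

    if (fgets(ans, sizeof ans, in) == NULL)
        return 0;

    const char *p = ans;
    while (*p == ' ' || *p == '\t')   /* skip leading blanks */
        ++p;

    return *p == 'y' || *p == 'Y';
}
```

In cmd_rm the "-i" branch would then become if (confirm(stdin)) { ... remove(argv[2]) ... }, and the separate trim call is no longer needed for this check.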

Program works with string literals but not with string arrays

I have a hashtable ADT which has two functions, insert and lookup. I put in to the insert function a hash table, hash table size, ID #, and book title and that inserts it into the hash table. This works fine when I pass it a string literal, i.e. insert(...,"Hello, world!"...); It doesn't work when I read in strings from a file, store them in an array, and try and use my insert and lookup functions.
I have all of my code here but the most important files are main.c and hash.c. Hash.c has the newHash(), hash(), insert(), and lookup() functions and main.c reads from two files, in this case test1.lib.in and test1.req.in, and from the first file will get the library id and title of a book from each line and then put it in the hash table. From the second file, it gets requests for a book title and should print the ids in its linked list.
List of links to files https://docs.google.com/document/d/1tFNs-eVkfnCfjwAHcAUdHtUl1KVv_WcnW2IS0SRFvcM/edit?usp=sharing
Example of code that works.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include "list.h"
#include "hash.h"
int main(){
ListHndl* temp = newHash(10);
insert(442440, "cvyaqbznxel", 10,temp);
lookup(temp,"cvyaqbznxel", 10);
return 0;
}
Code that doesn't work
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include "list.h"
#include "hash.h"
int main(int argc, char * argv[]) {
if (argc != 3) {
printf("Incorrect arguments, please specify 2 files to be read\n");
return EXIT_FAILURE;
}
FILE *file = fopen( argv[1], "r");
FILE *secondFile = fopen(argv[2], "r");
if (file == 0 || secondFile == 0) {
printf("Could not open a file\n");
return EXIT_FAILURE;
}
int numDataLines2;
int numDataLines;
int hashTableSize;
//First line of first file gives number of lines in file and
//size of hash table to be made
if(fscanf(file, "%d%d", &numDataLines, &hashTableSize) < 2) {
printf("Unable to parse first line of first file\n");
return EXIT_FAILURE;
}
ListHndl* theHash = newHash(hashTableSize);
int libraryID;
char *tempString = calloc(numDataLines,41*sizeof(char));
char lineHolder[129];
//discard the new line which always shows up
fgets(lineHolder, 128, file);
for(int i = 0; i < numDataLines; i++) {
//Gets the whole line to be scanned with sscanf
fgets(lineHolder, 128, file);
//If the line consists of just a newline char, continue
if(strcmp(lineHolder, "\n") == 0 ) {
continue;
}
//Scans the line retrieved from fgets and placed in lineHolder
if(sscanf(lineHolder, "%d, %40[^\n]", &libraryID,&tempString[i]) == 0){
printf("Unable to parse line %d of first file\n",i+2);
return EXIT_FAILURE;
}
insert(libraryID, &tempString[i], hashTableSize, theHash);
}
char String[41];
fgets(String, 40, secondFile);
numDataLines2 = atoi(String);
char *storeSecondFileStuff = calloc(numDataLines2,41*sizeof(char));
for(int i = 0; i< numDataLines2; i++) {
fgets(lineHolder, 128, secondFile);
if(strcmp(lineHolder, "\n") == 0) {
continue;
}
if(sscanf(lineHolder, "%40[^\n]",&storeSecondFileStuff[i]) == 0) {
printf("Unable to parse line %d of second file\n",i+2);
return EXIT_FAILURE;
}
lookup(theHash, &storeSecondFileStuff[i], hashTableSize);
}
printf("\n");
fclose(file);
fclose(secondFile);
return 0;
}
Thanks!
I think you have multiple problems. To start with, you might not be scanning your input line correctly. Change your line
if(sscanf(lineHolder, "%d, %40[^\n]", &libraryID,&tempString[i]) == 0)
to
if(sscanf(lineHolder, "%d, %40[^\n]", &libraryID, tempString) < 2)
that way, you will trap the situation where the sscanf function did not successfully convert both arguments - for example, if there is no comma in the input line. Note that sscanf returns the number of successful conversions; success would return a value of 2, so testing for <2 is the right way to go.
Note also that I changed &tempString[i] to tempString. The former points i bytes into the block returned by calloc, not to the i-th 41-byte slot, so each title starts one byte after the previous one and overwrites most of it. If this is only a temporary variable, there is no sense in doing this. Just scan the input into the temp variable, then do whatever you need to do with it (this assumes insert copies the string rather than keeping the pointer).
This means that your insert also changes, from
insert(libraryID, &tempString[i], hashTableSize, theHash);
to
insert(libraryID, tempString, hashTableSize, theHash);
Again, you need to do the same thing lower down in your code.
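If you do want to keep every title in one block, each title needs its own 41-byte slot. A minimal sketch of that layout follows; the names Title, alloc_titles, and parse_book are made up for this example and are not part of your hash ADT.

```c
#include <stdio.h>
#include <stdlib.h>

#define TITLE_LEN 41   /* 40 characters plus the terminating '\0' */

typedef char Title[TITLE_LEN];

/* Allocates n zeroed slots; titles[i] is the i-th 41-byte slot,
 * so consecutive titles cannot overlap. Returns NULL on failure. */
Title *alloc_titles(size_t n)
{
    return calloc(n, sizeof(Title));
}

/* Parses "id, title" from `line`. Returns 1 only when both fields
 * were converted, 0 otherwise. */
int parse_book(const char *line, int *id, char *title)
{
    return sscanf(line, "%d, %40[^\n]", id, title) == 2;
}
```

Inside the loop this would be used as if (!parse_book(lineHolder, &libraryID, titles[i])) { ... } followed by insert(libraryID, titles[i], hashTableSize, theHash);. Whether the pointers stay valid for lookup then depends only on not freeing titles while the table is in use.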
Here is an attempt at making the code work for you - see if this hits the spot. Note that all I really did was change the type of tempString and storeSecondFileStuff, and modified the way they were used in various function calls accordingly. I did not attempt to compile / run because of the complexity of the other files involved - but this should help a bit:
int main(int argc, char * argv[]) {
if (argc != 3) {
printf("Incorrect arguments, please specify 2 files to be read\n");
return EXIT_FAILURE;
}
FILE *file = fopen( argv[1], "r");
FILE *secondFile = fopen(argv[2], "r");
if (file == 0 || secondFile == 0) {
printf("Could not open a file\n");
return EXIT_FAILURE;
}
int numDataLines2;
int numDataLines;
int hashTableSize;
//First line of first file gives number of lines in file and
//size of hash table to be made
if(fscanf(file, "%d%d", &numDataLines, &hashTableSize) < 2) {
printf("Unable to parse first line of first file\n");
return EXIT_FAILURE;
}
ListHndl* theHash = newHash(hashTableSize);
int libraryID;
char **tempString = calloc(numDataLines,sizeof(char*)); // <<< ARRAY of pointers
char lineHolder[129];
//discard the new line which always shows up
fgets(lineHolder, 128, file);
for(int i = 0; i < numDataLines; i++) {
//Gets the whole line to be scanned with sscanf
fgets(lineHolder, 128, file);
tempString[i] = calloc(1, 41 * sizeof(char)); // <<< space for this string
//If the line consists of just a newline char, continue
if(strcmp(lineHolder, "\n") == 0 ) {
continue;
}
//Scans the line retrieved from fgets and placed in lineHolder
if(sscanf(lineHolder, "%d, %40[^\n]", &libraryID, tempString[i]) < 2){ // <<< changed: both conversions must succeed
printf("Unable to parse line %d of first file\n",i+2);
return EXIT_FAILURE;
}
insert(libraryID, tempString[i], hashTableSize, theHash); // <<< changed
}
char String[41];
fgets(String, 40, secondFile);
numDataLines2 = atoi(String);
char **storeSecondFileStuff = calloc(numDataLines2, sizeof(char*)); // changed: again char **
for(int i = 0; i< numDataLines2; i++) {
fgets(lineHolder, 128, secondFile);
storeSecondFileStuff[i] = calloc(1, 41 * sizeof(char));
if(strcmp(lineHolder, "\n") == 0) {
continue;
}
if(sscanf(lineHolder, "%40[^\n]",storeSecondFileStuff[i]) == 0) {
printf("Unable to parse line %d of second file\n",i+2);
return EXIT_FAILURE;
}
lookup(theHash, storeSecondFileStuff[i], hashTableSize); // <<<< changed
}
printf("\n");
fclose(file);
fclose(secondFile);
return 0;
}

read csv file line by line and output the fifth column to use for matching in c

I am new to C, and I'm trying to find a way to read a CSV file and output the fifth field of each line until EOF.
My data looks like this:
05/02/2012 00:00:01.548,XOLT,1ZE86V280394811433,trackthepack,23.22.11.82,en_US,
05/02/2012 00:00:01.605,XOLT,1ZVzVrZVhOaGNtUnZi,hadees,50.16.47.103,en_US,VE
05/02/2012 00:00:01.647,XOLT,1ZbWhoY21GMGFHRnVY,hadees,50.19.203.230,en_US,VE
05/02/2012 00:00:02.275,XOLT,1Z4217060300279193,trackthepack,107.21.159.246,en_US,
05/02/2012 00:00:02.599,XOLT,1Z9X98040398954479,Cascademfg,66.117.15.81,en_US,NF
05/02/2012 00:00:02.639,XOLT,1Z3X252W0363295735,trackthepack,107.22.101.79,en_US,
I would need to read this file and store the value of the fifth text (e.g. 23.22.11.82) and use it further processing of a match.
In java, I use the following code to split out the csv line
String delims = "[,]";
while ((s1 = in.readLine()) != null && s1.length() != 0){
String[] tokens = s1.split(delims);
Is there a similar way in C? My code works faster if I run it in C, that is the reason.
I tried some C code and was able to read the file (3 records), but it seems that it is not seeing the end of the line, and I am hitting a segmentation fault.
I am using fgets and strtok.
The input file is a variable-length file delimited by commas (,), and I want to get the fifth token in each line and then use it as a lookup key.
here is the code :
#include "GeoIP.h"
#include "GeoIPCity.h"
static const char * _mk_NA( const char * p ){
return p ? p : "N/A";
}
int
main(int argc, char *argv[])
{
FILE *f;
FILE *out_f;
GeoIP *gi;
GeoIPRecord *gir;
int generate = 0;
char iphost[50];
char *nextWordPtr = NULL;
int wordCount =0;
char *rechost;
char recbuffer[1000];
char delims[]=",";
const char *time_zone = NULL;
char **ret;
if (argc == 2)
if (!strcmp(argv[1], "gen"))
generate = 1;
gi = GeoIP_open("../data/GeoIPCity.dat", GEOIP_MEMORY_CACHE);
if (gi == NULL) {
fprintf(stderr, "Error opening database\n");
exit(1);
}
f = fopen("city_test.txt", "r");
if (f == NULL) {
fprintf(stderr, "Error opening city_test.txt\n");
exit(1);
}
out_f = fopen("out_city_lookup_test.txt", "w");
if (out_f == NULL) {
fprintf(stderr, "Error opening out_city_lookup_test.txt\n");
exit(1);
}
//** Read the file line by line and get the ip address to use to lookup GeoIP **//
//* while (!feof(f)) {
while (fgets(recbuffer,1001,f) != NULL) {
nextWordPtr = strtok (recbuffer,delims);
while (nextWordPtr != NULL & wordCount < 5) {
printf("word%d %s\n",wordCount,nextWordPtr);
if (wordCount == 4 ) {
printf("nextWordPtr %s\n",nextWordPtr);
strcpy(iphost, nextWordPtr);
printf("iphost %s\n",iphost);
}
wordCount++;
nextWordPtr = strtok(NULL,delims);
}
gir = GeoIP_record_by_name(gi, (const char *) iphost);
if (gir != NULL) {
ret = GeoIP_range_by_ip(gi, (const char *) iphost);
time_zone = GeoIP_time_zone_by_country_and_region(gir->country_code, gir->region);
printf("%s\t%s\t%s\t%s\t%s\t%s\t%f\t%f\t%d\t%d\t%s\t%s\t%s\n", iphost,
_mk_NA(gir->country_code),
_mk_NA(gir->region),
_mk_NA(GeoIP_region_name_by_code(gir->country_code, gir->region)),
_mk_NA(gir->city),
_mk_NA(gir->postal_code),
gir->latitude,
gir->longitude,
gir->metro_code,
gir->area_code,
_mk_NA(time_zone),
ret[0],
ret[1]);
fprintf(out_f,"%s\t%s\t%s\t%s\t%s\t%s\t%f\t%f\t%d\t%d\t%s\t%s\t%s\n", iphost,
_mk_NA(gir->country_code),
_mk_NA(gir->region),
_mk_NA(GeoIP_region_name_by_code(gir->country_code, gir->region)),
_mk_NA(gir->city),
_mk_NA(gir->postal_code),
gir->latitude,
gir->longitude,
gir->metro_code,
gir->area_code,
_mk_NA(time_zone),
ret[0],
ret[1]);
GeoIP_range_by_ip_delete(ret);
GeoIPRecord_delete(gir);
}
}
GeoIP_delete(gi);
fclose(out_f);
return 0;
Yes, not as elegant but you can use strtok to get the job done.
For what you want, a better approach is a lexer. If your end goal is complex, you might want a parser as well.
I've got an example lexer and parser here. It is more complex than what you need though. If you want something simple, strtok will do the job, but you will have several nasty surprises to watch out for. It will also be difficult to use outside the simple case you have presented here.
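For the simple case in the question, a strtok-based sketch might look like the following. The function name nth_field and its parameters are made up for this example. Two of the surprises are worth calling out: the field counter has to start from zero for every line (in the posted loop wordCount is never reset, so only the first record ever reaches the copy into iphost), and strtok treats consecutive commas as one delimiter, so empty fields are silently skipped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the n-th (1-based) comma-separated field
 * of `line` into `out`. Returns 1 on success, 0 if the line has fewer
 * than n non-empty fields. `line` is modified by strtok(). */
int nth_field(char *line, int n, char *out, size_t outsz)
{
    int count = 0;   /* starts at zero for every line */

    for (char *tok = strtok(line, ",\n"); tok != NULL;
         tok = strtok(NULL, ",\n")) {
        if (++count == n) {
            snprintf(out, outsz, "%s", tok);   /* bounded copy */
            return 1;
        }
    }
    return 0;
}
```

In the read loop you would call nth_field(recbuffer, 5, iphost, sizeof iphost) and only do the GeoIP lookup when it returns 1. The fgets call should also pass sizeof recbuffer rather than 1001, since the buffer holds 1000 bytes.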

Resources