Capture quoted strings separated with commas from a file - c

Let's say I want to read input from a file like this:
"8313515769001870,GRKLK,03/2023,eatcp,btlzg"
"6144115684794523,ZEATL,10/2033,arnne,drrfd"
for a structure I made as follows
typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;
This is my attempt to read the input from a file named 'file' opened in read mode. The str filled by fgets holds the right string, but it isn't getting stored into c[i]:
FILE * fptr;
int count=0;
fptr= fopen("file","r");
Card *c = (Card*)calloc(10,sizeof(Card));
printf("StartAlloc\n");
int i=0;
char str[1000];
fgets(str,80,fptr);
if(fptr==NULL)
{return 0;}
do{
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^,]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
i++;
}while(fgets(str,80,fptr)!=NULL);
I do not understand why the %[^,] pattern is not capturing the individual elements. I have wasted a lot of time on this, and any help would be greatly appreciated.

The last token doesn't end with a ',', so you can't use %[^,] for it. It is however followed by a '\"', so you can use %[^\"] instead :
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
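As a minimal sketch of the corrected format in use, the helper below parses one line and reports whether all five fields were converted. The name parse_card_line is hypothetical, and the field widths are an addition not present in the answer above; each is the array size minus one.

```c
#include <stdio.h>

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

/* Hypothetical helper: parse one quoted, comma-separated line into *out.
   Returns 1 when all five fields were converted, 0 otherwise. */
int parse_card_line(const char *line, Card *out)
{
    /* Each width is the array size minus one, leaving room for '\0'.
       The last field stops at the closing quote instead of a comma. */
    int n = sscanf(line, "\"%19[^,],%5[^,],%7[^,],%29[^,],%29[^\"]\"",
                   out->Card_Number, out->Bank_Code, out->Expiry_Date,
                   out->First_Name, out->Last_Name);
    return n == 5;
}
```

In the question's loop this could be called as parse_card_line(str, &c[i]), incrementing i only when it returns 1.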

Using fscanf() with the proper format, you can retrieve the desired elements from each line:
"\"%19[^,]%*c %5[^,]%*c %7[^,]%*c %29[^,]%*c %29[^\"]%*c\n"
With this format, the opening quote is matched and skipped (\"), each comma-terminated field is captured and its comma discarded (%[^,]%*c), and the last field is read up to the closing quote, which is then discarded (%[^\"]%*c). The trailing \n consumes the line break so that the next call starts on the next line. The widths (19, 5, 7, 29, 29) are one less than each array's size, so a long field cannot overflow its buffer.
This is how you can integrate it into your code (fscanf returns the number of fields converted, so compare against 5, and stop before the 10-element array is full):
while (i < 10 && fscanf(file, "\"%19[^,]%*c %5[^,]%*c %7[^,]%*c %29[^,]%*c %29[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5) i++;
Complete code snippet for testing purposes:
#include <stdio.h>
#include <stdlib.h>

#define MAX_CARDS 10

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

int main(void) {
    FILE *file = fopen("data.csv", "r");
    if (file == NULL) {               /* check before any read */
        perror("data.csv");
        return EXIT_FAILURE;
    }
    Card *c = calloc(MAX_CARDS, sizeof *c);
    if (c == NULL) {
        fclose(file);
        return EXIT_FAILURE;
    }
    int i = 0;
    /* Stop when the array is full or a line does not yield all five fields. */
    while (i < MAX_CARDS &&
           fscanf(file, "\"%19[^,]%*c %5[^,]%*c %7[^,]%*c %29[^,]%*c %29[^\"]%*c\n",
                  c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date,
                  c[i].First_Name, c[i].Last_Name) == 5) {
        printf("%s | %s | %s | %s | %s \n", c[i].Card_Number, c[i].Bank_Code,
               c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name);
        i++;
    }
    free(c);
    fclose(file);
    return 0;
}

If you just need to read from the file, you can use fscanf() directly instead of reading each line into a character array and then calling sscanf() on that string.
You also don't need to cast the return value of calloc() explicitly. See "Is it necessary to type-cast malloc and calloc?".
You are doing
if(fptr==NULL)
{return 0;}
after you have already tried to read from the file. If the file couldn't be opened, fgets() is called with a null pointer, which is undefined behaviour and typically crashes the program well before control reaches this if statement.
Place this check right after opening the file like
FILE *fptr = fopen("file", "r");
if(fptr==NULL)
{
return EXIT_FAILURE;
}
A return value of 0 is usually taken to mean success. Since failing to open the input file is an error, return EXIT_FAILURE (from <stdlib.h>) instead.
The last %[^,] in your sscanf() format string is also a problem: the last entry on each line of the input file is not followed by a comma. Change it to read until the closing " is found, i.e. %[^\"].
Also, the format string ends with a space followed by \n. The \n is redundant, because the space already covers it: "One white-space character in format-string matches any combination of white-space characters in the input".
So the final format string could be
"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" "
And don't forget to close the files you've opened and free the memory you've allocated before the end of the program, like this:
free(c); //for the Card pointer
fclose(fptr);
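Putting these points together, a rough sketch might look like the following. The function name load_cards, its parameters, and the field widths are assumptions made for illustration; the format string is the one given above with widths added.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

/* Hypothetical helper: read up to max cards from the file at path.
   Returns a calloc'ed array (the caller frees it) and stores the number
   of records read in *count. Returns NULL if the file cannot be opened
   or the memory cannot be allocated. */
Card *load_cards(const char *path, size_t max, size_t *count)
{
    *count = 0;

    FILE *fptr = fopen(path, "r");
    if (fptr == NULL)                    /* check right after fopen() */
        return NULL;

    Card *c = calloc(max, sizeof *c);    /* no cast needed in C */
    if (c == NULL) {
        fclose(fptr);
        return NULL;
    }

    /* The trailing space in the format skips the newline. */
    while (*count < max &&
           fscanf(fptr, "\"%19[^,],%5[^,],%7[^,],%29[^,],%29[^\"]\" ",
                  c[*count].Card_Number, c[*count].Bank_Code,
                  c[*count].Expiry_Date, c[*count].First_Name,
                  c[*count].Last_Name) == 5)
        (*count)++;

    fclose(fptr);
    return c;
}
```

A caller would return EXIT_FAILURE when load_cards() yields NULL, and call free() on the array once it is done with it.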

Related

How to read comma delimited data from txt file to struct

I have a text file with the following content:
(image: input file)
I use a function to read the data from this comma-delimited input file.
I want to read the data, strip the commas, and store the values in struct resistor_struct.
I have tried the following code.
'''
#include<stdio.h>
//functions header
int blown_ressistors();
struct resistor_struct
{
char ID_LEN[5];
char id;
float max_poewr;
int resistance;
};
struct resistor_struct rs[100];
int blown_ressistors()
{
FILE *fp = fopen("input.txt", "r");
int i = 0;
if(fp!=NULL)
{
while(fscanf(fp, "%s[^,], %d[^,], %f[^,]",rs[i].ID_LEN,rs[i].resistance, rs[i].max_poewr)!=EOF)
{
printf("%s\t", rs[i].ID_LEN);
printf("%d\t", rs[i].resistance);
printf("%d\t\n", rs[i].max_poewr);
i++;
}
}
else
{
perror("Input.txt: ");
}
'''
Output:
(image: program output)
You don't want to compare the value returned from scanf with EOF; compare it with the number of conversions you expect instead.
In your case, the format string is such that scanf can never match more than one conversion specifier. %s[^,] means "%s followed by the literal characters [^,]". Since %s only stops consuming at whitespace (or end of input), the next input character can never be [, and matching fails there. Note also that %d and %f need pointers to the variables, hence the & in the call below. Try something like:
while(fscanf(fp, " %4[^,], %d, %f", rs[i].ID_LEN, &rs[i].resistance, &rs[i].max_poewr) == 3 )
but note that this will behave oddly on whitespace in the first column. You might want to try: " %4[^, \t\n] , %d, %f", but quite frankly the better solution is to stop using scanf. Even with something trivial like this, your behavior will be undefined on an input like foo, 9999...9999 (where the 2nd column is any value that exceeds the capacity of an int). Just stop using scanf. Read the data and parse it with strtol and strtod.
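A minimal sketch of the strtol/strtod approach, assuming each line has the form "ID, resistance, power" (the real input file is only shown as an image, so this layout is a guess based on the format string above). The struct and the name parse_resistor are illustrative, not taken from the question.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct resistor {            /* illustrative; mirrors the question's fields */
    char  id[5];
    int   resistance;
    float max_power;
};

/* Parse "ID, resistance, power". Returns 1 on success, 0 if the line is
   malformed or the resistance does not fit in an int. Leading whitespace
   in the ID is not trimmed. */
int parse_resistor(const char *line, struct resistor *out)
{
    const char *comma = strchr(line, ',');
    if (comma == NULL)
        return 0;

    size_t len = (size_t)(comma - line);
    if (len == 0 || len >= sizeof out->id)   /* empty or too long for id */
        return 0;
    memcpy(out->id, line, len);
    out->id[len] = '\0';

    char *end;
    errno = 0;
    long r = strtol(comma + 1, &end, 10);    /* skips leading whitespace */
    if (end == comma + 1 || errno == ERANGE || r < INT_MIN || r > INT_MAX)
        return 0;
    if (*end != ',')                         /* expect the second comma */
        return 0;
    out->resistance = (int)r;

    const char *p = end + 1;
    errno = 0;
    double w = strtod(p, &end);
    if (end == p || errno == ERANGE)
        return 0;
    out->max_power = (float)w;
    return 1;
}
```

Each line would first be read with fgets() and then passed to parse_resistor(); unlike the scanf version, an out-of-range number is reported as an error rather than causing undefined behaviour.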

How to scan strings from a text file with delimiter and then display it?

I tried to make a program that scans strings from a text file and then displays them in a loop. But somehow my program does not work and it displays weird symbols. I am new to text files, and I would appreciate it a lot if someone could explain what is wrong with my code.
My code :
#include <stdio.h>
#include <string.h>
int main()
{
FILE *fPtr;
fPtr = fopen("alumni.txt", "r");
if (fPtr == NULL) {
printf("There is a error opening the file.");
exit(-1);
}
char name[20], design[50], category[20], location[20];
while (fscanf(fPtr, "%s:%[^\n]:%[^\n]:%[^\n]", &name, &design, &category, &location) != EOF) {
printf("Name : %s\n", name);
printf("Designation : %s\n", design);
printf("Category : %s\n", category);
printf("Location : %s\n", location);
}
}
and this is my text file,
Shanie:Programmer:Full Time:Kuala Lumpur
Andy:Sales Agent:Part Time:Johor Bahru
Elaine:Database Administrator Full Time Melaka
Stephanie:MIS manager:Full Time:Penang
You have two problems. The first is that %s reads whitespace-delimited "words"; it won't stop at the :. The second is that the format %[^\n] reads everything up to the newline.
So you need a scanset for the name as well, and every scanset except the last should read up to the next :, which is done with the format %[^:].
So please change to:
while (fscanf(fPtr, " %19[^:]:%49[^:]:%19[^:]:%19[^\n]", name, design, category, location) == 4) {
...
}
Please note a couple of other changes I made to your call and loop condition: First of all, I have added length specifiers to the formats, so fscanf will not write out of bounds of your arrays.
Secondly, both the %s and %[] formats expect a char * argument, while you provided pointers to arrays (&name is of type char (*)[20], not char *). Arrays naturally decay to pointers to their first element, so e.g. name decays to &name[0], which has the correct type char *.
Thirdly I changed the comparison to compare against 4, which is what fscanf will return if it successfully parsed the input.
Lastly I added a space before the first format, to skip any leading space (like the newline from the previous line).
To be sure to be able to continue even in the case of malformed input, I recommend you read full lines instead (using e.g. fgets), and then possibly use sscanf to parse each line.
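The fgets-plus-sscanf variant could be sketched as below. The Alumni struct and both function names are hypothetical; the format string is the one from this answer.

```c
#include <stdio.h>

typedef struct {             /* hypothetical grouping of the four arrays */
    char name[20];
    char design[50];
    char category[20];
    char location[20];
} Alumni;

/* Returns 1 if the line holds all four colon-separated fields. */
int parse_alumni_line(const char *line, Alumni *out)
{
    return sscanf(line, " %19[^:]:%49[^:]:%19[^:]:%19[^\n]",
                  out->name, out->design, out->category, out->location) == 4;
}

/* Reads fp line by line, prints well-formed records, and returns the
   number of lines that could not be parsed (blank lines included). */
int print_alumni(FILE *fp)
{
    char line[256];
    int skipped = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        Alumni a;
        if (parse_alumni_line(line, &a)) {
            printf("Name : %s\n", a.name);
            printf("Designation : %s\n", a.design);
            printf("Category : %s\n", a.category);
            printf("Location : %s\n", a.location);
        } else {
            skipped++;       /* malformed line: reading simply continues */
        }
    }
    return skipped;
}
```

Because every line is consumed by fgets() regardless of its content, the malformed "Elaine" line in the sample file is skipped without disturbing the records after it.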

How to Split strings with fgets or fscanf in C?

I understand how to read in a text file and scan/print the entire file, but how can a line be split into several strings? Also, can variables be assigned to those strings to be called later?
My code so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fPointer;
fPointer = fopen("p1customer.txt", "r");
char singleLine[150];
int id;
while (!feof(fPointer)){
fscanf(fPointer, "%d", &id);
printf("%d",id);
}
fclose(fPointer);
return 0;
}
Example Text File to be read:
99999 John Doe Basketball
Example Output:
John Doe has ID number 99999 and plays Basketball
I am attempting to split/tokenize those strings and assign them variables (IDnumber, Name, Sport) and print the output in a new file.
You can use the library function strtok(str, chrs).
A sequence of calls to strtok(str, chrs) splits str into tokens, each delimited by a character from chrs.
The first call in a sequence passes a non-NULL str. It finds the first token in str consisting of characters not in chrs, terminates it by overwriting the next character of str with \0, and returns a pointer to the token. Each subsequent call, indicated by a NULL value of str, returns a pointer to the next such token, searching from just past the end of the previous one. strtok returns NULL when no further token is found.
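As a small sketch of that calling sequence, assuming every line has exactly four space-separated tokens (ID, first name, last name, sport) as in the question's example. The function name split_customer and its parameters are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "ID First Last Sport" in place.
   strtok modifies line, so it must be a writable buffer.
   Returns 1 on success, 0 if a token is missing or the ID is not numeric. */
int split_customer(char *line, long *id,
                   char *name, size_t name_size,
                   char *sport, size_t sport_size)
{
    const char *delims = " \t\r\n";

    char *id_tok = strtok(line, delims);   /* first call: pass the string */
    char *first  = strtok(NULL, delims);   /* later calls: pass NULL */
    char *last   = strtok(NULL, delims);
    char *game   = strtok(NULL, delims);
    if (id_tok == NULL || first == NULL || last == NULL || game == NULL)
        return 0;

    char *end;
    *id = strtol(id_tok, &end, 10);
    if (*end != '\0')                      /* ID token had non-digit text */
        return 0;

    snprintf(name, name_size, "%s %s", first, last);
    snprintf(sport, sport_size, "%s", game);
    return 1;
}
```

After a successful call, something like fprintf(out, "%s has ID number %ld and plays %s\n", name, id, sport) would produce the requested output line.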
You should post an example of the input file so that we can help in more detail.
I see you have also declared a string (singleLine); I guess you want to fill it with something, but you did not specify what.
If you wanted to treat the file as a list of numbers, the sample of the code might be the following.
#include <stdio.h>
int main() {
FILE *infile;
char buf[100];
int len_file=0;
if(!(infile = fopen("p1customer.txt", "r"))) { /*checks the correct opening of the file*/
printf("Error in open p1customer.txt\n");
return 1;
}
while(fgets(buf,sizeof(buf),infile)!=NULL) /* count the lines (rows) in the file */
len_file++;
int id[len_file];
int i=0;
rewind(infile);
while(fgets(buf,sizeof(buf),infile)!=NULL) {
sscanf(buf,"%i",&id[i]);
i++;
}
for(i=0;i<len_file;i++)
printf("%i\n",id[i]);
fclose(infile);
return 0;
}
If you want to treat the file as an indefinite list of numbers on each row separated by spaces, you can parse each line with sscanf using the format %31[^ ], which reads a number until it encounters a space. You can also add a variable that is incremented for each number read.
You can then refine the code by using the isalpha function from ctype.h to check whether the line contains any letters, and copy them into a string until you reach the terminating character '\0'.
The possibilities are endless, so it would be useful to have the input file. When you provide it, I'll update the answer.
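For the "several numbers per row" case, one possible sketch is shown below. Note that it swaps in strtol with its end pointer instead of the %31[^ ] scanset described above; the name parse_ints is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: store up to max whitespace-separated integers
   from line in out and return how many were stored. Parsing stops at
   the first token that is not a number. */
size_t parse_ints(const char *line, long *out, size_t max)
{
    size_t n = 0;
    const char *p = line;

    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading whitespace */
        if (end == p)                   /* nothing converted: stop */
            break;
        out[n++] = v;
        p = end;                        /* continue after this number */
    }
    return n;
}
```

The returned count plays the role of the "variable that is incremented for each number read".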

How to save every line in file (IN C) in a variable? :)

I need to save every line of text file in c in a variable.
Here's my code
int main()
{
char firstname[100];
char lastname[100];
char string_0[256];
char string[256] = "Vanilla Twilight";
char string2[256];
FILE *file;
file = fopen("record.txt","r");
while(fgets(string_0,256,file) != NULL)
{
fgets(string2, 256, file);
printf("%s\n", string2);
if(strcmp(string, string2)==0)
printf("A match has been found");
}
fclose(file);
return 0;
}
Some lines are stored in the variable and printed to the console, but some are skipped.
What should I do? When I tried sscanf(), no lines were skipped, but only the first word of each line was printed. I also tried fscanf(), but that didn't work either. With fgets(), each line is complete, but as I've said, some lines are skipped (even the first line).
I'm just a beginner in programming, so I really need help. :(
You're skipping every other line, because you have two successive fgets() calls and only one strcmp(): the fgets() in the loop condition reads a line into string_0, which is never examined, and the fgets() in the body reads the next line into string2. Reduce your code to
while(fgets(string_0,256,file) != NULL)
{
    if( ! strcmp(string_0, string) )
        printf("A match has been found\n");
}
FWIW, fgets() reads and stores the trailing newline, which can cause problems in string comparison; you need to take care of that, too.
As a note, you should always check the return value of fopen() for success before using the returned pointer.
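One common way to deal with the trailing newline is to cut the line at the first line-ending character before comparing. A minimal sketch, with line_matches as a made-up helper name:

```c
#include <string.h>

/* Hypothetical helper: strip the line ending left by fgets() in place,
   then compare with target. Returns 1 on an exact match. */
int line_matches(char *line, const char *target)
{
    /* strcspn returns the index of the first '\r' or '\n', or the
       string length if there is none, so this is always in bounds. */
    line[strcspn(line, "\r\n")] = '\0';
    return strcmp(line, target) == 0;
}
```

Inside the loop, the test would then read if (line_matches(string_0, string)).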

Not understanding the C format specifiers when using fscanf()

So I am reading a text file in this format:
ABC 51.555 31.555
DEF 23.445 45.345
I am trying to use fscanf() to parse the data. Because this file could grow or shrink, loading needs to be dynamic, which is why I used malloc, and I also want to store the data in the struct below. I think the issue is with a space, or possibly with the format specifier not being written correctly. Here is my code.
typedef struct data
{
char name[4];
char lat[7];
char lng[7];
}coords;
int main(int argc, char *argv[])
{
////////////CREATES FILE POINTER/////////
FILE* fp;
///////////CREATES MALLOC POINTER TO STORE STRUCTS/////////////
coords* cp;
//////////OPENS FILE//////////
fp = fopen(argv[1], "r");
/////////GET THE TOTAL AMMOUNT OF LINES IN THE FILE/////////
fseek(fp, 0, SEEK_END);
long size = ftell(fp);
rewind(fp);
//////SKIPS FIRST LINE//////////
while(fgetc(fp) != (int)'\n')
{};
/////////ASSIGNS MEMORY THE SIZE OF THE FILE TO //////////
cp = malloc(sizeof(coords) * size);
//////////READS FILE AND STORES DATA///////
fscanf(fp,"%s[^ ] %s[^ ] %s[^\n]", cp->name, cp->lat, cp->lng);
printf("%s\n%lf\n%lf\n", cp->name, cp->lat, cp->lng);
fclose(fp);
return 0;
}
And yes, I am aware I did not include the header files here, but I do have the right ones (stdlib.h and stdio.h).
UPDATE 1:
I have tried both replies and I get this on my screen:
ABC51.555
0.000000
0.000000
How come the 51.555 has not gone to the next item in the struct?
Thanks
UPDATE 2:
Okay I have modified my code to do the following.
typedef struct data
{
char name[4];
char lat[6];
char lng[6];
}coords;
int main(int argc, char *argv[])
{
////////////CREATES FILE POINTER/////////
FILE* fp;
///////////CREATES MALLOC POINTER TO STORE STRUCTS/////////////
coords* cp;
//////////OPENS FILE//////////
fp = fopen(argv[1], "r");
/////////GET THE TOTAL SIZE OF THE FILE/////////
fseek(fp, 0, SEEK_END);
long size = ftell(fp);
long lines = -1;
rewind(fp);
//////GETS TOTAL AMMOUNT OF LINES/////////
char c;
while(c != EOF)
{
c = fgetc(fp);
if(c == '\n')
{
lines++;
}
}
rewind(fp);
////////////SKIPS FIRST LINE//////////
while(fgetc(fp) != (int)'\n')
{};
/////////ASSIGNS MEMORY THE SIZE OF THE FILE TO //////////
cp = malloc(sizeof(coords) * size);
//////////READS FILE AND STORES DATA///////
printf("Lines of text read: %d\n", lines);
fscanf(fp,"%s %s %s[^\n]", cp[0].name, cp[0].lat, cp[0].lng);
printf("%s\n", cp[0].name);
fclose(fp);
return 0;
}
Now when I try to print cp[0].name, I get the whole of the first line with no spaces in it, like this:
ABC51.55531.555
If I print cp[0].lat, I get this:
51.55531.555
And when I print cp[0].lng, I get this:
31.555
This is the only correct one, and I cannot understand this behaviour. Why is it behaving like this? All the posts suggest (as I first thought) that each %s in fscanf would put its input into its own variable, not concatenate them. No matter whether I use dot notation or ->, the result is the same.
Thanks :)
The format specifier "%s[^... attempts to read a whitespace-delimited string, followed by the character [ and then the character ^. Since the string always ends at whitespace, the next character is always whitespace, which won't match the [, and none of the rest of the format specifier will match.
ALWAYS check the return value of fscanf to make sure you read all the things you think you did. If the return value is wrong, give a diagnostic.
ALWAYS use field size limits when reading into fixed size string arrays.
So in your case what you want is:
if (fscanf(fp, "%3s%6s%6s", cp->name, cp->lat, cp->lng) != 3) {
    fprintf(stderr, "Incorrect data in input file, exiting!\n");
    abort();
}
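The width limits only help if each array has room for the terminator. "51.555" is six characters, so it needs seven bytes. With the char lat[6] from UPDATE 2, an unlimited %s writes the '\0' one byte past the array, which would be consistent with the run-together output shown there. A minimal sketch with matching sizes and widths (parse_coords is a hypothetical name):

```c
#include <stdio.h>

typedef struct {
    char name[4];   /* "ABC"    + '\0' */
    char lat[7];    /* "51.555" + '\0' */
    char lng[7];    /* "31.555" + '\0' */
} coords;

/* Hypothetical helper: parse "NAME LAT LNG" from one line.
   Each width is the array size minus one, so no field can overflow.
   Returns 1 only if all three fields were read. */
int parse_coords(const char *line, coords *out)
{
    return sscanf(line, "%3s %6s %6s", out->name, out->lat, out->lng) == 3;
}
```

A token longer than its width is split across fields rather than overflowing, so a stricter check may still be wanted for untrusted input.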
I'm not sure that you want the [^ ] scanset as a space delimiter. fscanf already splits %s input on whitespace by default. Try this and see if the string is correctly parsed:
fscanf(fp, "%s %s %s", cp->name, cp->lat, cp->lng);
The output should be:
cp->name ---- ABC
cp->lat ----- 51.555
cp->lng ----- 31.555
