Not understanding the C format specifiers when using fscanf() - c

So I am reading a text file in this format:
ABC 51.555 31.555
DEF 23.445 45.345
I am trying to use fscanf() to parse the data. Because this file could grow or shrink, the loading needs to be dynamic, which is why I used malloc; I also want to store the data in the struct below. I think the issue is with a space, or possibly with not writing the whole format specifier correctly. Here is my code.
typedef struct data
{
char name[4];
char lat[7];
char lng[7];
}coords;
int main(int argc, char *argv[])
{
////////////CREATES FILE POINTER/////////
FILE* fp;
///////////CREATES MALLOC POINTER TO STORE STRUCTS/////////////
coords* cp;
//////////OPENS FILE//////////
fp = fopen(argv[1], "r");
/////////GET THE TOTAL AMMOUNT OF LINES IN THE FILE/////////
fseek(fp, 0, SEEK_END);
long size = ftell(fp);
rewind(fp);
//////SKIPS FIRST LINE//////////
while(fgetc(fp) != (int)'\n')
{};
/////////ASSIGNS MEMORY THE SIZE OF THE FILE TO //////////
cp = malloc(sizeof(coords) * size);
//////////READS FILE AND STORES DATA///////
fscanf(fp,"%s[^ ] %s[^ ] %s[^\n]", cp->name, cp->lat, cp->lng);
printf("%s\n%lf\n%lf\n", cp->name, cp->lat, cp->lng);
fclose(fp);
return 0;
}
And yes I am aware I did not include the header files but I have got the right ones stdlib and stdio
UPDATE 1:
I have tried both replies and I get this on my screen:
ABC51.555
0.000000
0.000000
How come the 51.555 has not gone to the next item in the struct?
Thanks
UPDATE 2:
Okay I have modified my code to do the following.
typedef struct data
{
char name[4];
char lat[6];
char lng[6];
}coords;
int main(int argc, char *argv[])
{
////////////CREATES FILE POINTER/////////
FILE* fp;
///////////CREATES MALLOC POINTER TO STORE STRUCTS/////////////
coords* cp;
//////////OPENS FILE//////////
fp = fopen(argv[1], "r");
/////////GET THE TOTAL SIZE OF THE FILE/////////
fseek(fp, 0, SEEK_END);
long size = ftell(fp);
long lines = -1;
rewind(fp);
//////GETS TOTAL AMMOUNT OF LINES/////////
char c;
while(c != EOF)
{
c = fgetc(fp);
if(c == '\n')
{
lines++;
}
}
rewind(fp);
////////////SKIPS FIRST LINE//////////
while(fgetc(fp) != (int)'\n')
{};
/////////ASSIGNS MEMORY THE SIZE OF THE FILE TO //////////
cp = malloc(sizeof(coords) * size);
//////////READS FILE AND STORES DATA///////
printf("Lines of text read: %d\n", lines);
fscanf(fp,"%s %s %s[^\n]", cp[0].name, cp[0].lat, cp[0].lng);
printf("%s\n", cp[0].name);
fclose(fp);
return 0;
}
Now when I try to print cp[0].name, I get the whole of the first line with no spaces in it, like this:
ABC51.55531.555
If I print cp[0].lat, I get this:
51.55531.555
And when I print cp[0].lng, I get this:
31.555
This is the only correct one, and I cannot understand this behaviour. Why is it behaving like this? All the posts suggest (as I first thought) that each %s in fscanf would put its text into its own variable, not concatenate them. No matter whether I use dot notation or ->, I get the same result.
Thanks :)

The format specifier "%s[^... attempts to read a whitespace-delimited string, followed by the character [ and then the character ^. Since the string will always end at whitespace, the next character will always be whitespace, which won't match the [, and none of the rest of the format specifier will match.
ALWAYS check the return value of fscanf to make sure you read all the things you think you did. If the return value is wrong, give a diagnostic.
ALWAYS use field size limits when reading into fixed size string arrays.
So in your case what you want is:
if (fscanf(fp, "%3s%6s%6s", cp->name, cp->lat, cp->lng) != 3) {
fprintf(stderr, "Incorrect data in input file, exiting!\n");
abort(); }
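As a minimal sketch of both rules applied together, the helper below parses one line of text into the struct from the question. The name parse_coords_line is hypothetical, and the sketch assumes the first version of the struct (char lat[7] and char lng[7]); each field width is one less than the array size so the terminating null always fits.

```c
#include <stdio.h>

typedef struct data
{
    char name[4];
    char lat[7];
    char lng[7];
} coords;

/* Hypothetical helper: parse "NAME LAT LNG" from one line of text.
   Returns 1 if all three fields were converted, 0 otherwise. */
int parse_coords_line(const char *line, coords *out)
{
    /* Widths are array size minus one, leaving room for the '\0'. */
    return sscanf(line, "%3s %6s %6s", out->name, out->lat, out->lng) == 3;
}
```

In a reading loop one would typically fgets() each line and pass it to this helper, printing a diagnostic and stopping when it returns 0.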

I'm not sure that you want to use the space delimiter [^ ]. fscanf already parses the string on whitespace as default. Try this and see if the string is correctly parsed:
fscanf(fp, "%s %s %s[^\n]", cp->name, cp->lat, cp->lng);
output should result in:
cp->name ---- ABC
cp->lat ----- 51.555
cp->lng ----- 31.555

Related

How to read a complete file with scanf maybe something like %[^\EOF] without loop in single statement

I want to know if I can read a complete file with single scanf statement. I read it with below code.
#include<stdio.h>
int main()
{
FILE * fp;
char arr[200],fmt[6]="%[^";
fp = fopen("testPrintf.c","r");
fmt[3] = EOF;
fmt[4] = ']';
fmt[5] = '\0';
fscanf(fp,fmt,arr);
printf("%s",arr);
printf("%d",EOF);
return 0;
}
It produced this message after everything else had run:
"*** stack smashing detected ***: terminated
Aborted (core dumped)"
Interestingly, printf("%s",arr); worked but printf("%d",EOF); is not showing its output.
Can you let me know what has happened when I tried to read upto EOF with scanf?
If you really, really must (ab)use fscanf() into reading the file, then this outlines how you could do it:
open the file
use fseek() and ftell() to find the size of the file
rewind() (or fseek(fp, 0, SEEK_SET)) to reset the file to the start
allocate a big buffer
create a format string that reads the correct number of bytes into the buffer and records how many characters are read
use the format with fscanf()
add a null terminating byte in the space reserved for it
print the file contents as a big string.
If there are no null bytes in the file, you'll see the file contents printed. If there are null bytes in the file, you'll see the file contents up to the first null byte.
I chose the anodyne name data for the file to be read — there are endless ways you can make that selectable at runtime.
There are a few assumptions made about the size of the file (primarily that the size isn't bigger than can be fitted into a long with signed overflow, and that it isn't empty). It uses the fact that the %c format can accept a length, just like most of the formats can, and it doesn't add a null terminator at the end of the string it reads and it doesn't fuss about whether the characters read are null bytes or anything else — it just reads them. It also uses the fact that you can specify the size of the variable to hold the offset with the %n (or, in this case, the %ln) conversion specification. And finally, it assumes that the file is not shrinking (it will ignore growth if it is growing), and that it is a seekable file, not a FIFO or some other special file type that does not support seeking.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
const char filename[] = "data";
FILE *fp = fopen(filename, "r");
if (fp == NULL)
{
fprintf(stderr, "Failed to open file %s for reading\n", filename);
exit(EXIT_FAILURE);
}
fseek(fp, 0, SEEK_END);
long length = ftell(fp);
rewind(fp);
char *buffer = malloc(length + 1);
if (buffer == NULL)
{
fprintf(stderr, "Failed to allocate %ld bytes\n", length + 1);
exit(EXIT_FAILURE);
}
char format[32];
snprintf(format, sizeof(format), "%%%ldc%%ln", length);
long nbytes = 0;
if (fscanf(fp, format, buffer, &nbytes) != 1 || nbytes != length)
{
fprintf(stderr, "Failed to read %ld bytes (got %ld)\n", length, nbytes);
exit(EXIT_FAILURE);
}
buffer[length] = '\0';
printf("<<<SOF>>\n%s\n<<EOF>>\n", buffer);
free(buffer);
return(0);
}
This is still an abuse of fscanf() — it would be better to use fread():
if (fread(buffer, sizeof(char), length, fp) != (size_t)length)
{
fprintf(stderr, "Failed to read %ld bytes\n", length);
exit(EXIT_FAILURE);
}
You can then omit the variable format and the code that sets it, and also nbytes. Or you can keep nbytes (maybe as a size_t instead of long) and assign the result of fread() to it, and use the value in the error report, along the lines of the test in the fscanf() variant.
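For reference, the fread() variant could be wrapped up roughly as follows. This is only a sketch under the same assumptions as above (a seekable file that is not changing size); the function name read_whole_file and its out_len parameter are hypothetical, and it opens the file in binary mode so that the byte count from ftell() matches what fread() returns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an entire seekable file into a malloc'd,
   null-terminated buffer. Returns NULL on any failure; the caller frees. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: no newline translation */
    if (fp == NULL)
        return NULL;

    char *buffer = NULL;
    long length = 0;
    if (fseek(fp, 0, SEEK_END) == 0 && (length = ftell(fp)) >= 0)
    {
        rewind(fp);
        buffer = malloc((size_t)length + 1);
        if (buffer != NULL)
        {
            size_t nbytes = fread(buffer, 1, (size_t)length, fp);
            if (nbytes != (size_t)length)
            {
                free(buffer);       /* short read: treat as failure */
                buffer = NULL;
            }
            else
            {
                buffer[nbytes] = '\0';
                if (out_len != NULL)
                    *out_len = nbytes;
            }
        }
    }
    fclose(fp);
    return buffer;
}
```

A caller would check the result for NULL, use the buffer, and free() it when done.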
You might get warnings from GCC about a non-literal format string for fscanf(). It's correct, but this isn't dangerous because the programmer is completely in charge of the content of the format string.

Capture quoted strings separated with commas from a file

let's say I want to take an input from a file like this :-
"8313515769001870,GRKLK,03/2023,eatcp,btlzg"
"6144115684794523,ZEATL,10/2033,arnne,drrfd"
for a structure I made as follows
typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;
This is my attempt to read the input from a file named 'file' in reading mode. The str filled by fgets holds the right string, but it isn't getting parsed into c[i]:
FILE * fptr;
int count=0;
fptr= fopen("file","r");
Card *c = (Card*)calloc(10,sizeof(Card));
printf("StartAlloc\n");
int i=0;
char str[1000];
fgets(str,80,fptr);
if(fptr==NULL)
{return 0;}
do{
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^,]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
i++;
}while(fgets(str,80,fptr)!=NULL);
I do not understand why the regex %[^,] is not capturing the individual elements. I have wasted a lot of time on this, and any help would be greatly appreciated.
The last token doesn't end with a ',', so you can't use %[^,] for it. It is however followed by a '\"', so you can use %[^\"] instead :
sscanf(str,"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" \n",c[i].Card_Number,c[i].Bank_Code,c[i].Expiry_Date,c[i].First_Name,c[i].Last_Name);
Using fscanf() with the proper format you can retrieve the desired elements from each line :
"\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n"
With this format, the opening quote is skipped (\"), and the strings separated by commas are captured (%[^,]%*c). Finally, the last field is read up to the closing quote, which is then discarded (%[^\"]%*c), and the line break is consumed (\n) so that the next line can be read.
This is how you can integrate it in your code :
while (i < 10 && fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5) i++;
Complete code snippet for testing purposes :
#include <stdio.h>
#include <stdlib.h>
typedef struct{
char Card_Number[20];
char Bank_Code[6];
char Expiry_Date[8];
char First_Name[30];
char Last_Name[30];
}Card;
int main(){
FILE *file;
file = fopen("data.csv", "r");
int i=0;
Card *c = (Card*)calloc(10,sizeof(Card));
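/* stop at end of file, at the first malformed line, or when the 10 allocated cards are full */
while (i < 10 && fscanf(file, "\"%[^,]%*c %[^,]%*c %[^,]%*c %[^,]%*c %[^\"]%*c\n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name) == 5) {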
printf("%s | %s | %s | %s | %s \n", c[i].Card_Number, c[i].Bank_Code, c[i].Expiry_Date, c[i].First_Name, c[i].Last_Name);
i++;
}
fclose(file);
return 0;
}
If you just need to read from the file, you could just use fscanf() instead of reading from file to a character array and then use sscanf() for that string.
And you needn't explicitly type cast the return value of calloc(). See is it necessary to type-cast malloc and calloc.
You are doing
if(fptr==NULL)
{return 0;}
after you tried to read from the file. If the file couldn't be opened the program would crash well before the control reaches this if statement.
Place this check right after opening the file like
FILE *fptr = fopen("file", "r");
if(fptr==NULL)
{
return EXIT_FAILURE;
}
Also, a return value of 0 is usually taken to mean success. Since the input file not being found is an error, try returning EXIT_FAILURE instead.
As for the last %[^,] in the format string of sscanf in your program: there is no comma after the last entry of each line in the input file, so change it to read until the closing " is found.
Also, at the end of the format string there's a space followed by a \n. The \n is redundant here, because a single space already matches any run of whitespace: "One white-space character in format-string matches any combination of white-space characters in the input".
So the final format string could be
"\"%[^,],%[^,],%[^,],%[^,],%[^\"]\" "
And don't forget to close the files you've opened and free the memory you've allocated before the end of the program like
free(c); //for the Card pointer
fclose(fptr);
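Putting the points from this answer together, a line parser might look like the sketch below. The function name parse_card is hypothetical, and the field widths are an addition not present in the answer: each one is the array size minus one, so an over-long field cannot overflow its array. The return value of sscanf is checked against 5 so that a malformed line is rejected rather than silently half-filled.

```c
#include <stdio.h>

typedef struct {
    char Card_Number[20];
    char Bank_Code[6];
    char Expiry_Date[8];
    char First_Name[30];
    char Last_Name[30];
} Card;

/* Hypothetical helper: parse one quoted, comma-separated line into *out.
   Returns 1 only if all five fields were converted, 0 otherwise. */
int parse_card(const char *line, Card *out)
{
    int n = sscanf(line, "\"%19[^,],%5[^,],%7[^,],%29[^,],%29[^\"]\"",
                   out->Card_Number, out->Bank_Code, out->Expiry_Date,
                   out->First_Name, out->Last_Name);
    return n == 5;
}
```

In the original loop, each line read by fgets would be passed to this helper, and i would only be incremented when it returns 1 and i is still below the 10 allocated cards.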

Cannot get Call to Function Working

I found this piece of code at Reading a file character by character in C and it compiles and is what I wish to use. My problem that I cannot get the call to it working properly. The code is as follows:
char *readFile(char *fileName)
{
FILE *file = fopen(fileName, "r");
char *code;
size_t n = 0;
int c;
if (file == NULL)
return NULL; //could not open file
code = malloc(1500);
while ((c = fgetc(file)) != EOF)
{
code[n++] = (char) c;
}
code[n] = '\0';
return code;
}
I am not sure of how to call it. Currently I am using the following code to call it:
.....
char * rly1f[1500];
char * RLY1F; // This is the Input File Name
rly1f[0] = readFile(RLY1F);
if (rly1f[0] == NULL) {
printf ("NULL array); exit;
}
int n = 0;
while (n++ < 1000) {
printf ("%c", rly1f[n]);
}
.....
How do I call the readFile function such that I have an array (rly1f) which is not NULL? The file RLY1F exists and has data in it. I have successfully opened it previously using 'in line code' not a function.
Thanks
The error you're experiencing is that you forgot to pass a valid filename. So either the program crashes, or fopen tries to open a trashed name and returns NULL
char * RLY1F; // This is not initialized!
RLY1F = "my_file.txt"; // initialize it!
The next problem you'll have will be in your loop to print the characters.
You have defined an array of pointers char * rly1f[1500];
You read 1 file and store it in the first pointer of the array rly1f[0]
But when you display it you display the pointer values as characters which is not what you want. You should just do:
while (n < 1000) {
printf ("%c", rly1f[0][n]);
n++;
}
note: that would not crash but would print trash if the file read is shorter than 1000.
(BLUEPIXY suggested the post-incrementation fix for n BTW or first character is skipped)
So do it more simply since your string is nul-terminated, pass the array to puts:
puts(rly1f[0]);
EDIT: you have a problem when reading your file too. You malloc 1500 bytes, but you read the file fully. If the file is bigger than 1500 bytes, you get buffer overflow.
You have to compute the length of the file before allocating the memory. For instance like this (using stat would be a better alternative maybe):
char *readFile(char *fileName, unsigned int *size) {
...
fseek(file,0,SEEK_END); // set pos to end of file
*size = ftell(file); // get pos, i.e. size
rewind(file); // set pos to 0
code = malloc(*size+1); // allocate the proper size plus one
notice the extra parameter which allows you to return the size as well as the file data.
Note: on Windows systems, text files use \r\n (CRLF) to delimit lines, so the allocated size will be larger than the number of characters read if you use text mode (each \r\n is converted to \n, so there are fewer chars in your buffer; you could consider a realloc once you know the exact size to shave off the unused allocated space).
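A complete version along these lines could look like the following sketch. It departs from the fragment above in a few places, all of which are assumptions: the size parameter is a size_t rather than an unsigned int, the file name is const, the data is read with a single fread instead of the fgetc loop, and the buffer is trimmed with realloc as suggested in the note.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch: size the buffer from the file length, then shrink it to what
   was actually read. Returns NULL on failure; the caller frees the result. */
char *readFile(const char *fileName, size_t *size)
{
    FILE *file = fopen(fileName, "r");
    if (file == NULL)
        return NULL;                        /* could not open file */

    fseek(file, 0, SEEK_END);               /* set pos to end of file */
    long length = ftell(file);              /* get pos, i.e. size */
    rewind(file);                           /* set pos to 0 */
    if (length < 0) {
        fclose(file);
        return NULL;
    }

    char *code = malloc((size_t)length + 1);    /* plus one for '\0' */
    if (code == NULL) {
        fclose(file);
        return NULL;
    }

    size_t n = fread(code, 1, (size_t)length, file);
    fclose(file);
    code[n] = '\0';

    /* In text mode on Windows n can be smaller than length, so trim. */
    char *trimmed = realloc(code, n + 1);
    if (trimmed != NULL)
        code = trimmed;

    *size = n;
    return code;
}
```

The caller would then hold a single char pointer rather than an array of pointers: call readFile with the file name and the address of a size_t, check the result against NULL, pass it to puts, and free it afterwards.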

How to read in the last word in a text file and into another text file in C?

SO i'm supposed to write a block of code that opens a file called "words" and writes the last word in the file to a file called "lastword". This is what I have so far:
FILE *f;
FILE *fp;
char string1[100];
f = fopen("words","w");
fp=fopen("lastword", "w");
fscanf(f,
fclose(fp)
fclose(f);
The problem here is that I don't know how to read in the last word of the text file. How would I know which word is the last word?
This is similar to what the tail tool does, you seek to a certain offset from the end of the file and read the block there, then search backwards, once you meet a whitespace or a new line, you can print the word from there, that is the last word. The basic code looks like this:
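char string[1025];
char *last = string; /* default: the whole block is one word */
f = fopen("words","r");
fseek(f, -1024, SEEK_END); /* offset first, then origin; if the file is shorter the seek fails and the read normally starts at the beginning */
size_t nread = fread(string, 1, sizeof string - 1, f);
string[nread] = '\0'; /* terminate so it can be printed with %s */
while (nread > 0 && isspace((unsigned char)string[nread - 1])) { /* isspace() needs <ctype.h> */
string[--nread] = '\0'; /* drop trailing whitespace such as the final newline */
}
for (size_t i = nread; i > 0; i--) {
if (isspace((unsigned char)string[i - 1])) {
last = &string[i]; /* the last word starts just after the last whitespace */
break;
}
}
fprintf(fp, "%s", last);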
If the word boundary is not found in the first block, you continue by reading the second-to-last block and searching it, then the third, and so on until you find it; then print all the characters after that position.
There are plenty of ways to do this.
Easy way
One easy approach would be to loop on reading words:
f = fopen("words.txt","r"); // attention !! open in "r" mode !!
...
int rc;
do {
rc=fscanf(f, "%99s", string1); // attempt to read
} while (rc==1 && !feof(f)); // while it's successful.
... // here string1 contains the last successful string read
However, this treats any run of characters separated by whitespace as a word. Note the use of the width field in the fscanf() format to make sure that there will be no buffer overflow.
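The same idea can be packaged as a small function. This is a sketch only: the name last_word is hypothetical, and instead of testing feof() it loops directly on the return value of fscanf(), copying each successfully read word so that the last one survives when the loop ends.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the last whitespace-delimited word of the
   stream into out (which must hold at least 100 bytes).
   Returns 1 if at least one word was read, 0 if there were none. */
int last_word(FILE *f, char out[100])
{
    char word[100];
    int found = 0;

    /* Loop on the return value of fscanf() itself. */
    while (fscanf(f, "%99s", word) == 1) {
        strcpy(out, word);   /* safe: both buffers hold 100 bytes */
        found = 1;
    }
    return found;
}
```

When it returns 1, the result can be written to the "lastword" file, opened in "w" mode, with fprintf.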
More exact way
Building on the previous attempt, if you want a stricter definition of words, you can just replace the call to fscanf() with a function of your own:
rc=read_word(f, string1, 100);
The function would be something like:
int read_word(FILE *fp, char *s, int szmax) {
int started=0, c;
while ((c=fgetc(fp))!=EOF && szmax>1) {
if (isalpha(c)) { // copy only alphabetic chars to string
started=1;
*s++=c;
szmax--;
}
else if (started) // first char after the alphabetics
break; // will end the word.
}
if (started)
*s=0; // if we have found a word, we end it.
return started;
}

Read Magic Number from .au File

I wrote a small program to get the magic number from an .au file and print it to console, but every time I try, instead of getting the intended .snd, I get .snd$ instead.
I'm not sure why this is happening, considering that I'm only reading in 4 bytes, which is what the magic number is comprised of. So, where is the extra character coming from?
#include <stdio.H>
int main()
{
FILE *fin;
int r;
char m[4], path[20];
scanf("%s", path);
fin = fopen(path, "r");
r = fread(&m, sizeof(char), 4, fin);
printf("magic number is %s\n", m);
return 0;
}
You're printing it as though it were a string, which in C, means that it's NUL-terminated. Change your code like this and it will work as you expect:
char m[5];
m[4] = '\0'; /* add terminating NUL */
Also, you should be aware that scanf("%s", ...) without a field width is dangerous, because input longer than path overflows the buffer. Use a command line argument instead.
The problem is not how you are reading.
The problem is that your variable is only 4 chars long, and it needs room for a null character to mark the end.
printf with %s prints the contents of the variable until it reaches a null character; if the variable is not correctly terminated, it keeps going and can print garbage.
To fix this, make the variable one char bigger and set element [4] to the null character.
This is what the new code should look like:
#include <stdio.h>
int main()
{
FILE *fin;
int r;
char m[5], path[20];
scanf("%19s", path);
/* An unbounded %s can write past the end of path (a buffer overflow),
which can be exploited in buffer overflow attacks; the width 19 leaves
room for the terminating null in path[20].
See more: http://en.wikipedia.org/wiki/Buffer_overflow */
fin = fopen(path, "r");
r = fread(&m, sizeof(char), 4, fin);
m[4] = '\0';
printf("magic number is %s\n", m);
return 0;
}
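An alternative that avoids the extra byte altogether is to treat the magic number as four raw bytes rather than as a string. The sketch below is hypothetical (the function name has_au_magic and its return convention are assumptions); it compares the bytes with memcmp and prints them with a %.4s precision, which tells printf to stop after at most four characters even without a terminating null.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the file at path starts with the
   four bytes ".snd", 0 if it does not, and -1 if it cannot be read. */
int has_au_magic(const char *path)
{
    unsigned char m[4];
    FILE *fin = fopen(path, "rb");      /* binary mode for a binary file */
    if (fin == NULL)
        return -1;

    size_t r = fread(m, 1, sizeof m, fin);
    fclose(fin);
    if (r != sizeof m)
        return -1;                      /* file shorter than 4 bytes */

    /* %.4s prints at most 4 chars, so no terminating null is needed. */
    printf("magic number is %.4s\n", (const char *)m);
    return memcmp(m, ".snd", 4) == 0;
}
```

Checking the return value of fread also catches files that are too short to contain a magic number at all.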

Resources