Beginner help: Parsing in C

I'm relatively new to the concept of parsing, and here is an example that is simple, yet mind-breaking for me.
I have a text file containing a series of numbers and letters. Each line of the text has three elements: a letter, another letter, and a number. Consider the first as the source, the second as the destination, and the number as the size. I need to read them, put them into a structure array, and be able to arrange them according to size: "a, b, 1" for the first line, "q, s, 5" for the second, etc. Lastly, I need to print them in an arranged format (that is, ordered by size).
Mind giving me some clues or starting points?
Update:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
int main(){
    FILE *fp;
    fp= fopen("file.txt", "O");
    int i;
    struct arrangement{
        char source;
        char dest;
        int cost;
    };
    struct arrangement rng[22];
    for(i=0; i<22 ; i++){
        fscanf(fp, "%c, %c, %d", rng[i].source, rng[i].dest, rng[i].cost);
        printf("%c, %c, %d", rng[i].source, rng[i].dest, rng[i].cost);
    }
    getch();
    return 0;
}
Will this be able to store all the elements in the array? I still don't have any idea how to arrange these according to size/cost without the source and destination being left out.

fscanf requires pointers to the variables, not the variables themselves. Your code may produce strange results, depending on the compiler (gcc may emit a warning or error) and the platform.
You should also break out of the loop on reaching EOF. i then indicates the last entry used (which may be only partially valid, depending on the input).
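A minimal sketch of both points, plus the sorting step the question asks about. It parses from strings with sscanf so it is easy to try out; the helper names parse_edge and sort_by_cost are made up for this illustration, and qsort is just one reasonable choice for the sort, not something the original code uses.

```c
#include <stdio.h>
#include <stdlib.h>

struct arrangement {
    char source;
    char dest;
    int cost;
};

/* Parse one "a, b, 1" style line. Returns 1 only if all three
   fields were converted; note the & on every argument. */
int parse_edge(const char *line, struct arrangement *out)
{
    return sscanf(line, " %c , %c , %d",
                  &out->source, &out->dest, &out->cost) == 3;
}

/* qsort comparator: order by cost, ascending. */
static int by_cost(const void *a, const void *b)
{
    const struct arrangement *x = a;
    const struct arrangement *y = b;
    return (x->cost > y->cost) - (x->cost < y->cost);
}

/* Whole structs are moved, so source and dest stay with their cost. */
void sort_by_cost(struct arrangement *rng, size_t n)
{
    qsort(rng, n, sizeof rng[0], by_cost);
}
```

With a file, the same format string should work with fscanf(fp, ...) in a loop that stops as soon as the return value is not 3.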

Related

fscanf does not store data into variables correctly

What the code actually does is store the whole line into utenti[i].username. Say the file contains "Pierluigi,Pierluigi#gmail.com,1983,messicana,30,6.5": the whole line is stored into utenti[i].username, even though the maximum username length is 20. Obviously that wasn't the purpose of the code, which is meant to store each value in the right variable. I have already used this kind of fopen and fscanf code in another program and it works there, storing the data in the right variables, but here it won't. I have been trying to understand why, but I can't figure it out, so I'm asking here for help.
#include <stdio.h>
#include <stdlib.h>
#include "mystruct.h"//where i get my structures
#define MAX_DIM 1000
utente utenti[MAX_DIM];
int main()
{
    FILE *utente;
    int i = 0;
    if ((utente = fopen("./Dati/utenti.csv", "r")) == NULL)
        printf("Impossibile aprire il file.\n");
    else{
        while(!feof(utente)){
            fscanf(utente,"%s,%s,%d,%s,%f,%f", utenti[i].username, utenti[i].email, &utenti[i].n_anno, utenti[i].tradizione, &utenti[i].fasciadiprezzo, &utenti[i].media_voti);
            printf("utenti : %s\n", utenti[i].username);
            i++;
        }
    }
    fclose(utente);
    return 0;
}
And here's the struct contained in mystruct.h:
typedef struct utente{
    char username[20];
    char email[30];
    int n_anno;// year of birth
    char tradizione[20];
    float fasciadiprezzo;
    float media_voti;
    struct prenotazione *p_prenotabili;
    struct recensione *valutazioni;
}utente;
where "prenotazione ..." and "recensione ..." are linked lists.
I'm programming with VS Code.
The problem is with your format string:
fscanf(utente,"%s,%s,%d,%s,%f,%f",
utenti[i].username, utenti[i].email,
&utenti[i].n_anno, utenti[i].tradizione,
&utenti[i].fasciadiprezzo, &utenti[i].media_voti);
The %s format specifier reads all non-whitespace characters. Since commas and digits are not whitespace, they are all read by the first %s.
Instead of %s, you want to use %[. This allows you to specify a set of characters to capture or not capture. Since you want to read everything up to a comma, you want %[^,]. There should also be a space at the start of the format string to absorb any newlines from the prior line.
fscanf(utente, " %[^,],%[^,],%d,%[^,],%f,%f",
utenti[i].username, utenti[i].email,
&utenti[i].n_anno, utenti[i].tradizione,
&utenti[i].fasciadiprezzo, &utenti[i].media_voti);
Also, see "Why is while (!feof(file)) always wrong?".
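Putting the two points together, here is a hedged sketch: the scansets get field widths so an overlong value cannot overflow the arrays, and the loop is driven by the read result instead of feof. The function names parse_utente and read_utenti, the 256-byte line buffer, and the trimmed struct (list pointers left out) are assumptions made for the example.

```c
#include <stdio.h>

/* Trimmed copy of the struct from the question (list pointers omitted). */
typedef struct utente {
    char username[20];
    char email[30];
    int n_anno;
    char tradizione[20];
    float fasciadiprezzo;
    float media_voti;
} utente;

/* Parse one CSV line into *u. Returns 1 if all six fields converted.
   The widths (19, 29, 19) leave room for the terminating '\0'. */
int parse_utente(const char *line, utente *u)
{
    int n = sscanf(line, " %19[^,],%29[^,],%d,%19[^,],%f,%f",
                   u->username, u->email, &u->n_anno,
                   u->tradizione, &u->fasciadiprezzo, &u->media_voti);
    return n == 6;
}

/* Read records until the array is full, input ends, or a line fails
   to parse. Testing the fgets result replaces while (!feof(...)). */
size_t read_utenti(FILE *fp, utente *out, size_t max)
{
    char line[256];
    size_t count = 0;
    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        if (!parse_utente(line, &out[count]))
            break;
        count++;
    }
    return count;
}
```

Reading a line with fgets first and then parsing it with sscanf also keeps one malformed line from derailing the lines after it.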
If my eyes do not deceive me, utente is used in two different contexts: as the name of the structure type and as the name of a local variable. Inside main the local variable hides the type name, so this compiles, but it is confusing and the type can no longer be used there. I advise you to use a suffix such as _t (t means type) for type names, for example, utente_t.
In addition, if each line in the file is wrapped in quotation marks ", they should also be specified in the fscanf format using the escape character \ :
fscanf (file, " \"%[^,],%[^,],%d,%[^,],%f,%f\"", ...);

fprintf prints strange integers

I would expect the code below to produce a file named stackexchangeiscool.txt and write
Hello!
I changed my mind: 17291729
But it doesn't. It produces a file named stackexchangeiscool.txt that contains the text
Hello!
I changed my mind: 17293748407
This code sure doesn't make sense; it is part of a larger code where I have strange problems.
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    FILE * file = NULL;
    file = fopen("stackexchangeiscool.txt","w");
    int n = 1729;
    char myFirstString[] = "This is a string.";
    fprintf(file, "Hello!\n");
    sprintf(myFirstString, "I changed my mind: %d", n);
    fprintf(file, "%s", myFirstString);
    fprintf(file, "%d", n);
    fclose(file);
    return(0);
}
EDIT: A comment below suggested that the problem might come from the fact that myFirstString is too short to contain the new string with the integer in it. However, when I remove the
fprintf(file, "%d", n);
I get
Hello!
I changed my mind: 1729
So myFirstString seems to be able to contain the string and the integer.
The problem is, myFirstString only has enough room for 18 characters, but you're attempting to copy ~23 characters in. This means you're writing off the end of the array. I think what's happening is n is stored directly after myFirstString in memory, and by writing off the end of the array, you're "clobbering" n.
Make sure the array is large enough ahead of time so that everything can fit:
char myFirstString[100] = "This is a string.";
Here, I'm making it an arbitrarily large 100 length. You can either set it to an arbitrary fixed-length like this, or calculate the length you need and malloc it. Either way, you need to ensure that you have enough memory to work with before attempting to copy data.
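One way to "calculate the length you need and malloc it" is to let snprintf do the measuring. This is only a sketch, assuming a C99 or later library; format_message is a name invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format the message into a freshly allocated buffer of exactly the
   right size. Returns NULL on failure; the caller frees the result. */
char *format_message(int n)
{
    /* With a NULL buffer and size 0, snprintf only reports the length
       (not counting the terminating '\0') that the output needs. */
    int len = snprintf(NULL, 0, "I changed my mind: %d", n);
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass writes the text; the size argument caps the write. */
    snprintf(buf, (size_t)len + 1, "I changed my mind: %d", n);
    return buf;
}
```

Even with a fixed-size array, using snprintf(buf, sizeof buf, ...) instead of sprintf means an oversized message gets truncated rather than written past the end of the array.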

Reading exact columns of a text file and putting the values/text into a vector

I'm writing a program that has to read a text file with many types of data. Here is a part of the file:
1000000 923475248-18 Ramiro A. Xavier
999999 923501748-58 Ramiro A. Wolski
999998 923517472-32 Ramiro Q. Wollinger
(It has 1000000 lines)
After reading it, I have to select the kind of data I want to work with and sort it, alphabetically or in ascending/descending order in the case of the numbers.
I already have the code to sort it (Bubble Sort), but in my code, I have to type the data myself. So my question is:
How do I read one specific column of this file and save its content in an array?
I'm working on this as a project, and I am not allowed to use C++ or overly complex constructs such as ("cout <<, buf, aot, tmpline").
I managed to read the file and print what's in it with this code:
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <string.h>
#include <time.h>
int main()
{
    FILE *fp;
    int c; /* int, not char, so that EOF can be told apart from data */
    fp = fopen("list1000000.txt","r");
    if(!fp)
    {
        printf( "Error");
        exit(0);
    }
    while((c = getc(fp) ) != EOF)
        printf("%c", c);
    fclose(fp);
    getch();
    return 0;
}
Thanks,
Eduardo
Read a line in.
Scan it for the various parts (numbers, first name, middle initial, last name, etc.)
Put the elements into a structure that holds all of them. Put pointers to the structures in an array.
Using one or more elements as keys, sort the structures as needed.
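The steps above can be sketched roughly as follows. The struct layout, the field sizes, and the names record, parse_record, and sort_by_name are guesses based on the three sample lines, and qsort stands in for the Bubble Sort mentioned in the question.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One record per line: "<number> <code> <full name>". */
struct record {
    long number;
    char code[20];
    char name[64];
};

/* Scan one line into *r. Returns 1 if all three parts were found.
   %63[^\n] takes the rest of the line, spaces included, as the name. */
int parse_record(const char *line, struct record *r)
{
    return sscanf(line, "%ld %19s %63[^\n]",
                  &r->number, r->code, r->name) == 3;
}

/* Comparator for an array of pointers to records, keyed on the name. */
static int by_name(const void *a, const void *b)
{
    const struct record *x = *(const struct record *const *)a;
    const struct record *y = *(const struct record *const *)b;
    return strcmp(x->name, y->name);
}

/* Sorting the pointers avoids copying whole structs around. */
void sort_by_name(struct record **ptrs, size_t n)
{
    qsort(ptrs, n, sizeof ptrs[0], by_name);
}
```

To sort on a different column, only the comparator needs to change, for example comparing the number fields instead of calling strcmp on the names.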

reading multiple variable types from single line in file C

Alright I've been at this all day and can't for the life of me get this down, maybe you chaps can help. I have a file that reads as follows
1301,105515018,"Boatswain","Michael R.",ABC, 123,="R01"
1301,103993269,"Castille","Michael Jr",ABC, 123,="R03"
1301,103993267,"Castille","Janice",ABC, 123,="R03"
1301,104727546,"Bonczek","Claude",ABC, 123,="R01"
1301,104731479,"Cruz","Akeem Mike",ABC, 123,="R01"
1301,105415888,"Digiacomo","Stephen",ABC, 123,="R02"
1301,106034479,"Annitto Grassis","Susan",ABC, 123,="R04"
1301,106034459,"Als","Christian",ABC, 123,="R01"
And here is my code...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME 15
#define MAX_SUBSEC 3
#define N 128
//void printArr(struct *students);
struct student{
    int term;
    int id;
    char lastname[MAX_NAME];
    char firstname[MAX_NAME];
    char subjectname[MAX_SUBSEC];
    int catalog;
    char section[MAX_SUBSEC];
}students[10];
int main(){
    int term;
    int id;
    char lastname[MAX_NAME];
    char firstname[MAX_NAME];
    char sub[MAX_SUBSEC];
    int cat;
    char sec[MAX_SUBSEC];
    char fname[N];
    FILE *inputf;
    printf("Enter the name of the text file: ");
    scanf("%123s",fname);
    strcat(fname,".txt");
    inputf = fopen(fname,"r");
    if (inputf == NULL){
        printf("I couldn't open the file for reading.\n");
        exit(0);
    }
    //TROUBLE HERE!
    fscanf(inputf, "%d,%d,%[^,]s", &students[0].term, &students[0].id,students[0].lastname);
    printf("%d\n", students[0].term);
    printf("%d\n", students[0].id);
    printf("%s\n", students[0].lastname);
    /*for (j = 1 ; j <= 10-1 ; j++){
        for(k = 0 ; k <= 10-2 ; k++){
            if(students[k] > students[k+1]){
                temp = students[k];
                students[k] = students[k+1];
                students[k+1] = temp;
            }
        }
    }*/
    fclose(inputf);
    system("pause");
    return 0;
}
void printArr(int a[], int tally){
    int i;
    for(i = 0 ; i < tally ; i++){
        printf("%d ", a[i]);
    }
    printf("\n");
}
My objective is to take each of the values in the text file and put it where it belongs in the struct, and subsequently in the struct array, but I can't get past the first 2 ints.
When getting the lastname string, because it holds a maximum of 15 characters, it spills over into the first name string right after it and takes as many of the remaining characters as it needs to fill up the lastname char array. Obviously I do not want this. As you can see I have tried strtok, but it doesn't do anything; I'm not sure what I have to do, as I have never used it before. I have also tried including all the variables in the fscanf statement, but I either get the same output or it becomes a mess. As it is, I am extremely lost: how do I get these values into the variables they belong in?!
EDIT: updated my code. I have gotten a little farther, but not much. I can now print out just the last name but cannot move farther from there; I can't get to the firstname string or any of the variables beyond it.
What you have there is a CSV file with quoted strings, and so I would recommend you use a CSV parser (or roll your own) rather than trying to do it all with scanf (since scanf cannot deal with quotes, e.g. commas within quoted strings). A quick Google search turns up libcsv.c which you may be able to use in your project.
With the fscanf format string "%d,%d,\"%[^\"]\",\"%[^\"]\",%[^,],%d,=\"%[^\"]\"" we can read a whole line's data. Besides, you have to define
char lastname[MAX_NAME+1];
char firstname[MAX_NAME+1];
char subjectname[MAX_SUBSEC+1];
int catalog;
char section[MAX_SUBSEC+1];
The +1 accounts for the terminating '\0' character.
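As a sketch of how that format string might be used, here it is applied to one line with sscanf. The field widths (15 and 3) are an addition not present in the format above; they are there so an overlong field stops the match instead of overflowing the array. parse_student is a name made up for the example.

```c
#include <stdio.h>

#define MAX_NAME 15
#define MAX_SUBSEC 3

struct student {
    int term;
    int id;
    char lastname[MAX_NAME + 1];
    char firstname[MAX_NAME + 1];
    char subjectname[MAX_SUBSEC + 1];
    int catalog;
    char section[MAX_SUBSEC + 1];
};

/* Returns the number of fields converted; 7 means the whole line matched.
   The widths 15 and 3 must be kept in step with MAX_NAME and MAX_SUBSEC. */
int parse_student(const char *line, struct student *s)
{
    return sscanf(line,
                  "%d,%d,\"%15[^\"]\",\"%15[^\"]\",%3[^,],%d,=\"%3[^\"]\"",
                  &s->term, &s->id, s->lastname, s->firstname,
                  s->subjectname, &s->catalog, s->section);
}
```

One limitation to keep in mind: a scanset has to match at least one character, so a line with an empty quoted field ("") makes the conversion stop early and the return value falls short of 7.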
I have a question for you... If you want to know how to use a diamond cutter, do you try it and see, or do you consult the manual? The problem here isn't the result of your choice, but your choice itself. Believe it or not, I have answered these questions so often that I'm tired of repeating myself. The answer is all in the manual.
Read the POSIX 2004 scanf manual (or the POSIX 2008/2013 version) and the answers to this question, and you'll have some idea of what you're not doing that you should be. Even fscanf code should use assert as a debugging aid to ensure the number of items read was correct.
%[^,]s: it seems as though there's a mistake here. Perhaps you meant %[^,]. The %[ format specifier is a different specifier from %s, so the presumably mistaken code contains two directives: %[^,] and s. The s directive tells scanf to read an 's' and discard it.
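A small illustration of the difference, with the return-value check mentioned above. The function names and the 16-byte buffers they expect are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Format with the stray 's': after %[^,] stops at the comma, the
   literal 's' directive fails to match it and scanning ends there. */
int with_stray_s(const char *in, char *first, char *second)
{
    return sscanf(in, "%15[^,]s,%15s", first, second);
}

/* Same format without the 's': both fields are converted. */
int without_stray_s(const char *in, char *first, char *second)
{
    return sscanf(in, "%15[^,],%15s", first, second);
}

/* Debug-build check on the number of items read. */
void read_name(const char *in, char *first, char *second)
{
    int n = without_stray_s(in, first, second);
    assert(n == 2);
    (void)n; /* avoids an unused-variable warning when NDEBUG is set */
}
```

For input such as Castille,Janice the first function converts only one field, which is exactly the kind of shortfall a check on the return value would catch.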
1. There is a syntax error in
while(result != NULL){
    printf(".....);
    ......
}
}//error
2. fscanf(inputf, "%s", lastname); can't read a whole line; with %s, fscanf stops when it comes across a space.
fscanf does not read line by line; it consumes input according to the format string. Here you can still capture the contents of each line easily because your file is formatted pretty nicely, especially due to the comma separation (really useful if none of your separated values contain a comma).
You can pass fscanf a format like you're doing with "%d" to capture an int, "%s" to capture a string (ends at white space; be wary of this when, for example, trying to read a name like "Annitto Grassis", which would require two %s's), etc. You can be more advanced and use a scanset, which resembles a regex character class, to define the characters you want captured, such as "Boatswain", a sequence composed of chars from the sets {A-Z}, {a-z}, and {"}. You'll want to keep scanning for as long as every field is converted, capturing the contents of the line and assigning the values to variables like so:
while( fscanf(inputf, "%d,%d,%[\"A-Za-z ],%[\"A-Za-z .]", &term, &id, lastname, firstname) == 4) {
    .... //do something with term, id, lastname, firstname - put them in a student struct
}
For more about regex, Mastering Regular Expressions by Jeffrey Friedl is a good book for learning about the topic.

Parsing wtmp logs with C

For our assignment we are given a copy of a wtmp log, and are expected to parse it, and output it in a sorted format, similar to the output of last.
Now, I know that the file wtmp consists of a list of utmp structures. The file provided is guaranteed to contain at least one utmp structure and I'm supposed to assume all structures in the binary file are constructed correctly.
I've read through man utmp, and I have successfully written a program to read in the structs from the binary file provided. (My apologies for the lengthy print method)
#include <stdio.h>
#include <string.h>
#include <utmp.h>
#include <stdlib.h>
void utmpprint(struct utmp *log);
int main() {
    int logsize = 10;
    FILE *file;
    struct utmp log[logsize];
    int i = 0;
    file = fopen("wtmp", "rb");
    if (file) {
        fread(&log, sizeof(struct utmp), logsize, file);
        for(i = 0; i < logsize; i++) {
            utmpprint(&log[i]);
        }
    } else {
        return(0);
    }
    return(0);
}
void utmpprint(struct utmp *log) {
    /* Adjacent string literals are joined by the compiler. */
    printf("{ ut_type: %i, ut_pid: %i, ut_line: %s, ut_id: %s, "
           "ut_user: %s, ut_host: %s, ut_exit: { e_termination: %i, "
           "e_exit: %i }, ut_session: %i, timeval: { tv_sec: %i, tv_usec: %i }, "
           "ut_addr_v6: %i }\n\n", log->ut_type, log->ut_pid, log->ut_line,
           log->ut_id, log->ut_user, log->ut_host, log->ut_exit.e_termination,
           log->ut_exit.e_exit, log->ut_session, log->ut_tv.tv_sec,
           log->ut_tv.tv_usec, log->ut_addr_v6[0]); /* first word of the address array */
}
Now, the problem I'm having is that when I run this, the output for ut_id is different than what I expect it to be.
From: man utmp
char ut_id[4]; /*Terminal name suffix, or inittab(5) ID */
My output:
... ut_line: pts/2, ut_id: ts/2jsmith, ut_user: jsmith, ...
I'm not quite sure what is going on here. What I think might be happening is that the ut_id field just might not exist in the struct that I am reading in. I think that might explain why the ut_id field is being displayed as the fields on either side of it squashed together.
I thought that I could possibly use fprintf formatting to get the field to display correctly, but it seems that you can only format text to one side of a char array or another, not grab specific parts from inside the string.
Otherwise, I'm pretty lost. Is this just a gap in my understanding of structs?
Not looking for answers, more so just some prodding in the right direction.
Also, what exactly is the terminal name suffix? Is that just the number that follows after pts/?
man utmp says "String fields are terminated by a null byte ('\0') if they are shorter than the size of the field." So, in particular, if they are the same size as the field then they are not terminated by a null byte. Well formed C strings must be terminated by a null byte. The fact that it looks like the ut_id field is 4 characters long "ts/2" suggests that it does not have a terminating null byte.
You're printing the char arrays using the %s formatting argument to printf. This keeps printing until it reaches a null byte. I suggest that you need to copy each field of the utmp to a temporary char array, which is one bigger than the size in the utmp structure. Make sure the last byte of that temporary array is a null byte, and it should print out OK.
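A minimal sketch of that temporary-copy approach. To keep it independent of <utmp.h>, it uses a stand-in struct with two adjacent fields; fake_utmp, field_to_string, and format_id are all hypothetical names, and the 32-byte user field is an illustrative size only.

```c
#include <stdio.h>
#include <string.h>

/* Copy a fixed-width field that may lack a terminating '\0' into dst,
   which must hold at least field_size + 1 bytes, and terminate it. */
void field_to_string(char *dst, const char *field, size_t field_size)
{
    memcpy(dst, field, field_size);
    dst[field_size] = '\0';
}

/* Stand-in for two adjacent utmp fields (not the real layout). */
struct fake_utmp {
    char ut_id[4];
    char ut_user[32];
};

/* Writes "ut_id: <id>" into out via a temporary one byte larger
   than the field, so %s always finds a terminator. */
int format_id(char *out, size_t out_size, const struct fake_utmp *log)
{
    char id[sizeof log->ut_id + 1];
    field_to_string(id, log->ut_id, sizeof log->ut_id);
    return snprintf(out, out_size, "ut_id: %s", id);
}
```

printf also accepts a precision on %s, for example printf("%.4s", log->ut_id), which writes at most four bytes and so does not need the copy.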
