C - parse key-value pairs from ASCII file [closed] - c

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 months ago.
I am trying to write a C function that reads the key-value pairs stored in an ASCII file that looks as follows (the text that indicates to ignore a certain line or part of a line is not in the file, of course: I added it for clarity):
[SECTION 1 HEADER - ignore this line]
VAR1 = 'xyz'
VAR2 = 3.0
VARIAB3 = 'blabla'
! COMMENT: ignore this line VAR7 = 'ABC123'
$---------another line to ignore-----------------
[SECTION 2 HEADER - ignore me!]
varname8 = 'abcd'
$---------yet another line to ignore-----------
[SECTION HEADER - ignore me]
VARIABLE10 = 4.05101e+05 $ignore from the dollar sign to eol
VARIABLE13 = 7e-06 $ignore from the dollar sign to eol
param_1=123
param_2=321
As you can see, not all the lines contain (key,value) pairs I would like to retain. Also, the names of the keys don't always have the same length and the values can be strings or numbers... Moreover, the '=' sign can be preceded and/or followed by zero or more spaces. Finally, comments appearing after the key,value pair should be ignored.
I've tried the following code to read from the file and parse the key-value pairs:
fh = fopen(filename, "r");
if (NULL == fh) {
    perror(filename);
}
else
{
    printf("File opened succesfully.\n");
    while (fgets(buffer, 100, fh) != NULL)
    {
        int offset;
        int res = sscanf(buffer, "%s = %s%n", key, value, &offset);
        if (res == 2)
        {
            printf("Found: %s = %s\n", key, value);
        }
    }
}
which gives the following output, clearly failing at capturing the last two key-value pairs:
Filename is: sample.tir
File opened succesfully.
Found: VAR1 = 'xyz'
Found: VAR2 = 3.0
Found: VARIAB3 = 'blabla'
Found: varname8 = 'abcd'
Found: VARIABLE10 = 4.05101e+05
Found: VARIABLE13 = 7e-06
File closed.
I believe the problem might be in the format string for sscanf: any ideas?

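The problem is that %s reads everything up to the next whitespace, so for a line such as param_1=123 it stores the whole line in key and the literal = in the format never matches. I suggest you check for and overwrite the '=' character to separate the key and value, perhaps like this snippet: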
char *ptr = strchr(buffer, '=');    // find the '='
if (ptr == NULL)
    continue;
*ptr = ' ';                         // replace with a space
int res = sscanf(buffer, "%s%s", key, value);
if (res != 2)
    continue;
printf("Found: %s = %s\n", key, value);
Output
Found: VAR1 = 'xyz'
Found: VAR2 = 3.0
Found: VARIAB3 = 'blabla'
Found: varname8 = 'abcd'
Found: VARIABLE10 = 4.05101e+05
Found: VARIABLE13 = 7e-06
Found: param_1 = 123
Found: param_2 = 321
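The snippet above still relies on unbounded %s conversions and on the header and comment lines happening to fail the scan. As a minimal sketch of a stricter alternative, the hypothetical helper parse_pair() below skips '[' and '!' lines explicitly, cuts the line at '$', and uses a bounded scanset for the key. The name, the 64-byte buffers, and the assumption that values contain neither spaces nor '$' are mine, not part of the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts one key/value pair from a line.
 * Returns 1 on success, 0 if the line holds no pair.
 * key and value must each have room for 64 bytes.
 * Assumption: values contain no spaces and no '$'. */
int parse_pair(char *line, char key[64], char value[64])
{
    /* Skip leading whitespace. */
    line += strspn(line, " \t");

    /* Section headers and '!' comment lines carry no pair. */
    if (*line == '[' || *line == '!')
        return 0;

    /* Cut the line at '$' so trailing comments are ignored. */
    line[strcspn(line, "$")] = '\0';

    /* Key: everything up to whitespace or '='. The spaces around '='
     * in the format match zero or more whitespace characters. */
    return sscanf(line, "%63[^ \t\n=] = %63s", key, value) == 2;
}
```

Inside the fgets() loop this would be called as parse_pair(buffer, key, value), printing the pair only when it returns 1.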


Splitting by comma doesn't work as expected

I read some data from a text file. I am trying to iterate line by line and split each line by comma, ignoring lines that start with #. Here is the text file content:
#this is the simulation file for your exercise, please read it carefully.
#every line that begins with a pound sign [now known as "the" hashtag (#)] is a comment line. you can automatically skip it.
#here are a few examples.
#there will be 5 categories : Comedy, Adventure, Educational, SciFi, Fantasy.
#it is recommended that when you save in the main program, that you follow this convetion.
#the input syntax :
#id,book name,author,pages,yearofpublishing,category
CNV301,Treasure Island,Robert Louis Stevenson,304,1882,Adventure
8T88FF,Heir to The Empire,Timothy Zahn,416,1992,SciFi
911MAR10,Plumbing for Dummies,Gene Hamilton,242,1999,Educational
6U754E,Berserk,Kenturo Miura,224,1989,Fantasy
7R011,The Troll Cookbook : Human Delights,Underchief Trogdor,7,-35,Educational
M140,Funny Cats,Jean-Claude Suarès,78,1995,Comedy
V269W7,Linus the Vegetarian T. rex,Robert Neubecker,40,2013,Adventure
UFF404,Algebra 3,Nebi Rogen,300,0,Educational
424242,The Hitchhiker's Guide to the Galaxy,Douglas Adams,224,1979,Comedy
#add your own. you can use sites like : http://www.generatedata.com/ to create quick lists.
Here is my code:
FILE* file = fopen(filepath, "r");
char line[256] = "";
while (fgets(line, sizeof(line), file) != NULL) {
    if (!starts_with(line, "#") && !starts_with(line, " "))
    {
        if (line[0] == '#' || line[0] == '\n')
            continue; // skip the rest of the loop and continue
        printf("%s", line);
        char* p;
        p = strtok(line, ",");
        while (p != NULL)
        {
            //printf("%s\n", p); //<-- line*******
            p = strtok(NULL, ",");
        }
    }
}
fclose(file);
where:
int starts_with(const char* line, const char* c)
{
    size_t lenpre = strlen(c),
           lenstr = strlen(line);
    return lenstr < lenpre ? 0 : strncmp(c, line, lenpre) == 0;
}
When I run the code, it prints the first line with some weird characters, like: #this is the simulation file for your exercise, please read it carefully.
If I enable the commented line (//<-- line*******),
I get the error "Access violation reading location". I only want to see the split values.
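The question does not show enough to pin down either symptom, so the following is only a sketch under two assumptions: that the weird characters are a UTF-8 byte order mark at the start of the file, and that the access violation comes from calling strtok() without including <string.h> (so its return value is treated as an int). The helper names skip_bom() and split_fields() are hypothetical.

```c
#include <stdio.h>
#include <string.h>   /* declares strtok(), strncmp(), strcspn() */

/* Hypothetical helper: returns a pointer past a UTF-8 byte order mark
 * if the line starts with one; otherwise returns the line unchanged. */
char *skip_bom(char *line)
{
    if (strncmp(line, "\xEF\xBB\xBF", 3) == 0)
        return line + 3;
    return line;
}

/* Hypothetical helper: splits a line in place on commas and stores up
 * to max_fields pointers in fields. Returns the number stored.
 * Note: strtok() skips empty fields (two commas in a row). */
int split_fields(char *line, char *fields[], int max_fields)
{
    int count = 0;

    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
    for (char *p = strtok(line, ",");
         p != NULL && count < max_fields;
         p = strtok(NULL, ","))
        fields[count++] = p;
    return count;
}
```

In the reading loop, skip_bom() would only be needed for the first line; each non-comment line can then be passed to split_fields() and the resulting pointers printed with "%s\n".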

Parsing item list line by line then character by character

I have to parse a game file that has this format:
ItemID = 3288 # This is a comments and begins with '#' character.
Name = "a magic sword"
Description = "It has some ancient runic inscriptions."
Flags = {MultiUse,Take,Weapon}
Attributes = {Weight=4200,WeaponType=1,WeaponAttackValue=48,WeaponDefendValue=35}
# A line can also begin with this character and it should be ignored.
and I have to parse its data and put it into variables. I have tried many things, and I've been told that I will have to read the file line by line, then read each line character by character (so I'm able to read until the '#' character) and then read the result word by word following the pattern. I have done this:
void ParseScriptFile(FILE* File) {
    char Line[1024];
    while (fgets(Line, sizeof(Line), File)) {
    }
    fclose(File);
}
I think I should process the lines inside the while loop, but I don't know how to read until the # character is reached and, if it does not exist, just continue looping line by line. Is there an easy way to do this?
Use sscanf, as I did for two of the fields:
void ParseScriptFile(FILE* File) {
    char Line[1024];
    int ItemID = 0;        // variable to store ItemID
    char name[40] = "";    // string to store Name
    while (fgets(Line, sizeof(Line), File)) {
        sscanf(Line, "ItemID = %d", &ItemID);
        sscanf(Line, "Name = %39[^\n]", name); // [^\n] reads up to the newline, at most 39 characters
    }
    printf("ItemId= %d\n", ItemID);
    printf("Name= %s\n", name);
    fclose(File);
}
Here's more or less workable code. Reading lines with fgets() is correct. You can then eliminate empty lines and comment lines trivially. If the line ends with a comment, you can convert the # into a null byte to ignore the comment. Then you need to scan for the entry's name field (assuming there are no spaces in the name part, to the left of the equals sign), the =, and the value on the right.
#include <stdio.h>
#include <string.h>

static
void ParseScriptFile(FILE *File)
{
    char Line[1024];
    while (fgets(Line, sizeof(Line), File))
    {
        if (Line[0] == '#' || Line[0] == '\n')
            continue;
        char *comment_start = strchr(Line, '#');
        if (comment_start != NULL)
            *comment_start = '\0';
        char name[64];
        char value[1024];
        if (sscanf(Line, " %63s = %1023[^\n]", name, value) == 2)
            printf("Name = [%s] Value = [%s]\n", name, value);
        else
            printf("Mal-formed line: [%s]\n", Line);
    }
    fclose(File);
}

int main(void)
{
    ParseScriptFile(stdin);
    return 0;
}
The program reads from standard input. An example run from your data file yielded:
Name = [ItemID] Value = [3288 ]
Name = [Name] Value = ["a magic sword"]
Name = [Description] Value = ["It has some ancient runic inscriptions."]
Name = [Flags] Value = [{MultiUse,Take,Weapon}]
Name = [Attributes] Value = [{Weight=4200,WeaponType=1,WeaponAttackValue=48,WeaponDefendValue=35}]
Note the space at the end of the ItemID value; there was a space before the # symbol.
If you need to handle strings that could themselves contain # symbols, you have to work harder (Curse = "You ###$%!" # Language, please!). Parsing an entry such as the value for Attributes is a separate task for a separate function (callable from this one). Indeed, you should be calling one or more functions to process each name/value pair. You probably also need some context passed to the ParseScriptFile() function so that the data can be saved appropriately. You wouldn't want to contaminate clean code with unnecessary global variables, would you?
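One way to "work harder" on the # problem is to track whether the scan is currently inside a double-quoted string and only treat # as a comment start outside of one. The sketch below is a minimal illustration; the function name strip_comment() is hypothetical, and it assumes quotes are never escaped inside a string, which the file format in the question does not specify.

```c
/* Hypothetical helper: truncates line at the first '#' that is not
 * inside a double-quoted string.
 * Assumption: a '"' always opens or closes a string (no escapes). */
void strip_comment(char *line)
{
    int in_quotes = 0;

    for (char *p = line; *p != '\0'; p++) {
        if (*p == '"') {
            in_quotes = !in_quotes;      /* entering or leaving a string */
        } else if (*p == '#' && !in_quotes) {
            *p = '\0';                   /* comment starts here */
            return;
        }
    }
}
```

It could replace the strchr(Line, '#') step in ParseScriptFile(), leaving the rest of that function as it is.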

Generating username from first name and last name in C [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I am writing a C program to generate user IDs from a given file (users). The file has one user's first and last name on each line (e.g. "John Smith", "Steve Mathews" etc.). The following while loop reads each line from users and converts it to lowercase, so single_line holds the name in lower case.
while(!feof(fp)) {
    fgets(single_line, 80, fp);
    for(int i = 0; single_line[i]; i++){
        single_line[i] = tolower(single_line[i]);
    }
    char f_letter = single_line[0];
    char r_letters[20];
}
Now, I want to create a username for each single_line with the first letter of the first name and remaining letters of last name. So far, f_letter holds the first letter of first name, how can I make r_letters hold the remaining letters of last name?
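char first_name[81];
char last_name[81];
char user_name[82];
// Loop on the fscanf result; feof() only becomes true after a read has already failed
while (fscanf(fp, "%80s %80s", first_name, last_name) == 2) {
    for (size_t i = 0; first_name[i] != '\0'; i++) {
        first_name[i] = tolower((unsigned char)first_name[i]);
    }
    for (size_t i = 0; last_name[i] != '\0'; i++) {
        last_name[i] = tolower((unsigned char)last_name[i]);
    }
    // first letter of the first name followed by the whole last name
    sprintf(user_name, "%c%s", first_name[0], last_name);
}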
You can use strtok to extract tokens from strings using a delimiter.
For example, for a line of text read from a file using fgets (assuming each line only contains two words, as is your case), you can extract the first and second words as:
char *first_name = strtok(single_line, " ");
char *last_name = strtok(NULL, "\n");
The newline character is used as the second delimiter because fgets preserves it when reading a line, so it can be used to extract the last token before the newline.
Consider the following function for creating usernames:
#include <stdio.h>
#include <string.h>
void create_usernames(const char *filename) {
    char line[80];
    char *last;
    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        perror(filename);
        return;
    }
    // fgets returns a NULL pointer upon EOF or error
    while (fgets(line, sizeof line, fp) != NULL) {
        strtok(line, " ");          // terminates the first name in place
        last = strtok(NULL, "\n");  // the rest of the line is the last name
        if (last != NULL)           // skip lines without a second word
            printf("%c%s\n", line[0], last);
    }
    fclose(fp);
}
For example, with a file file.txt containing names:
John Smith
John Diggle
Bruce Wayne
Steve Mathews
you would have:
create_usernames("file.txt");
JSmith
JDiggle
BWayne
SMathews
Find the index of the space between the first and last name, say j. The index of the first letter of the last name is then j+1.
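The index-based idea can be sketched with strchr(), which returns a pointer to the space, so the last name starts one character after it. The helper name make_username() and its signature are hypothetical; the sketch assumes exactly one space between the two names and a non-empty output buffer.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds a lowercase username from "First Last".
 * Returns 1 on success, 0 if the line has no first or last name.
 * Assumption: out_size is at least 1. */
int make_username(const char *line, char *out, size_t out_size)
{
    const char *space = strchr(line, ' ');   /* position j */

    if (space == NULL || space == line ||
        space[1] == '\0' || space[1] == '\n')
        return 0;

    /* First letter of the first name, then the last name from j+1. */
    snprintf(out, out_size, "%c%s", line[0], space + 1);
    out[strcspn(out, "\r\n")] = '\0';        /* drop the newline kept by fgets */

    for (char *p = out; *p != '\0'; p++)
        *p = (char)tolower((unsigned char)*p);
    return 1;
}
```

With a line read by fgets(), make_username(single_line, r_letters, sizeof r_letters) would fill the buffer with the username, truncated if the buffer is too small.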

Parsing a line from file [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have a file with the following line:
numOfItems = 100
I want to read the line from the file and initialize the attribute "numOfItems" to 100. How could I do it, i.e., delete the unnecessary spaces and read only the values I need?
Also, I have another line which is:
num Of Items = 100
which I need to reject as an error (attributes and values cannot contain spaces).
In the first case I know how to remove the spaces at the beginning, but not the intervening spaces. In the second case I don't know what to do.
I thought to use strtok, but couldn't manage to get what I needed.
Please help, thanks!
Using fgets and sscanf with %s, %d and %n should parse lines of the format "item = value":
#include <stdio.h>
#include <string.h>

int main( void){
    char line[256] = { '\0'};
    char item[50] = { '\0'};
    int value = 0;
    int used = 0;

    printf ( "enter string as \"item = value\" or x to exit\n");
    while ( ( fgets ( line, 256, stdin))) {
        if ( strcmp ( line, "x\n") == 0) {
            break;
        }
        //%49s to prevent too many characters in item[50]
        //%n will report the number of characters processed by the scan
        //line[used] == '\n' will succeed if the integer is followed by a newline
        if ( ( ( sscanf ( line, "%49s = %d%n", item, &value, &used)) == 2) && line[used] == '\n') {
            printf ( "parsed item \"%s\" value \"%d\"\n", item, value);
        }
        else {
            printf ( "problem parsing \n\t%s\n", line);
        }
        printf ( "enter string as \"item = value\" or x to exit\n");
    }
    return 0;
}
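One limitation of %49s is that it stops only at whitespace, so a line written as numOfItems=100 (no spaces around the =) is rejected. A possible variation, sketched below with a hypothetical helper parse_item(), uses a scanset that also stops at '=' and checks with %n that nothing but whitespace follows the number. It still rejects "num Of Items = 100", because the scan expects '=' right after the first word.

```c
#include <stdio.h>

/* Hypothetical helper: parses "item = value" with optional whitespace
 * around '='. Returns 1 on success, 0 on any malformed line.
 * item must have room for 50 bytes. */
int parse_item(const char *line, char item[50], int *value)
{
    int used = 0;

    /* %49[^ \t\n=] ends the name at whitespace or '='.
     * The space before %n skips the newline and any trailing blanks,
     * so used ends up at the first unparsed character. */
    if (sscanf(line, " %49[^ \t\n=] = %d %n", item, value, &used) != 2)
        return 0;

    return line[used] == '\0';   /* reject trailing junk such as "100abc" */
}
```

Because the trailing whitespace is consumed by the format, the check also accepts a last line that has no newline at the end.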

Parsing a line in C with multiple spaces in the same field [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have a text file as follows:
id name area dist
1 surya kumar 1 2
When I try to parse this line in C using the strtok() function with a space as the delimiter, I get the following output:
1
surya
Kumar
1
2
The second field is actually a name, so it can contain multiple spaces. Is there a way to treat the second field as a whole word and still be able to parse the entire line?
As your name may include numbers, I suggest counting the tokens.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define MAXFIELDS 10

int main(void) {
    //char input[] = "1 Ludwig 2 3";
    char input[] = "1 Ludwig 99 Beethoven 2 3";
    char *token[MAXFIELDS];
    char *tok;
    char name [100];
    int fields, index;
    int id, area, dist;

    fields = 0;
    tok = strtok(input, " ");
    while(tok != NULL) {
        if (fields >= MAXFIELDS)
            return 1;                       // error
        token[fields++] = tok;
        tok = strtok(NULL, " ");
    }
    if (fields < 4)
        return 1;                           // error

    index = 0;
    id = atoi(token[index++]);              // id field
    strcpy(name, token[index++]);           // name field
    while(index < fields - 2) {
        strcat(name, " ");                  // append to name
        strcat(name, token[index++]);
    }
    area = atoi(token[index++]);            // area field
    dist = atoi(token[index++]);            // dist field
    printf("%d, %s, %d, %d\n", id, name, area, dist);
    return 0;
}
Program output
1, Ludwig 99 Beethoven, 2, 3
Before calling strtok, count the number of spaces in the string to find where the third field starts (i.e. where the second field ends).
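A variation on that idea avoids counting altogether: the id ends at the first space, and area and dist are delimited by the last two spaces, so everything in between is the name. The sketch below uses strchr() and strrchr() for this; the helper name split_record() is hypothetical, and it assumes fields are separated by single spaces with no trailing space on the line.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "id name... area dist" in place.
 * *name points into line. Returns 1 on success, 0 if the line has
 * fewer than four fields (line may be partly modified in that case).
 * Assumption: single spaces between fields, none at the end. */
int split_record(char *line, int *id, char **name, int *area, int *dist)
{
    line[strcspn(line, "\r\n")] = '\0';      /* drop the line ending */

    char *first = strchr(line, ' ');         /* space after id */
    char *last = strrchr(line, ' ');         /* space before dist */
    if (first == NULL || last == first)
        return 0;

    *last = '\0';
    char *prev = strrchr(line, ' ');         /* space before area */
    if (prev == first)
        return 0;                            /* no room for a name */

    *first = '\0';
    *prev = '\0';
    *id = atoi(line);
    *name = first + 1;
    *area = atoi(prev + 1);
    *dist = atoi(last + 1);
    return 1;
}
```

Unlike the token-counting program, this leaves the name in the original buffer instead of rebuilding it with strcat, so no separate name array is needed.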
