Printing Garbage [C] - c

I am having some problems with a buffer. Short story: I have to iterate over the lines in a text file, where each line holds pieces of information separated by a space. The problem is that a piece of information can itself contain a space, so I wrote code that looks at every space in a string, checks whether it is a separator and, if it is, replaces it with a ";". The problem: I write the result to another variable whose space I allocate with malloc, but it ends up printing garbage. Can somebody point out what's wrong in the code?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(){
int i;
char *destination;
char* str = "S:15 B:20 B A:15",letra;
destination = (char *)malloc(strlen(str)*sizeof(char));
for (i=0;i<strlen(str);i++){
printf("%c \n",str[i]);
letra = str[i];
if (i == 0){
destination[i] = letra;
}
else if (letra != ' '){
destination[i] = letra;
}
else if (letra == ' ' ){
if (isdigit(str[i-1])){
destination[i] = ";";
}
else{
destination[i] = letra;
}
}
}
printf("%s",destination);
return 0;
}

Here is how I would do it -- a simple, one-directional loop copying just the characters needed, discarding spaces and inserting a ; where necessary.
The space to store a string of length x in (according to strlen) is x+1 -- one extra position for the additional ending zero.
The check of when to insert a ; is easy: after initially skipping spaces (if your input string starts with those) but before any valid text is copied, d will still be 0. If it is not and you encountered a space, then insert a ;.
Only one scenario is not checked here: if the input string ends with one or more spaces, then the final character in destination will also be a ;. It can be trivially discarded by checking just before the Copy loop #3, or just before ending at #4 (and you need to test if d != 0 as well).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main (void)
{
int i, d;
char *destination;
char *str = "S:15 B:20 B A:15";
destination = malloc(strlen(str)+1);
i = 0;
d = 0;
while (str[i])
{
/* 1. Skip spaces at the start */
while (isspace (str[i]))
i++;
/* 2. Do we need to add a ';' here? */
if (d)
{
destination[d] = ';';
d++;
}
/* 3. Copy while not space or -zero- */
while (str[i] && !isspace (str[i]))
{
destination[d] = str[i];
d++;
i++;
}
}
/* 4. Close off. */
destination[d] = 0;
printf ("%s\n", destination);
return 0;
}

Related

'SER_' or '_' character appearing in end of (string)output in c

I am trying to print each word of a given sentence on its own line. It works fine, but a '_' appears at the end of the output. Please help me with it, and also show me the proper manner to write it.
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
char *s,i,check=0;
s = malloc(1024 * sizeof(char));
scanf("%[^\n]", s);
s = realloc(s, strlen(s) + 1);
for(i=0;i<1024;i++ ||check<=2)
{
if(*(s+i)!=' ')
{
printf("%c",*(s+i));
check=0;
}
else
{
printf("\n");
check++;
}
// fflush(stdin);
}
return 0;
}
Output:
dkf fja fjlak d
dkf
fja
fjlak
d SER_
Output2:
-for(i=0;i<20;i++ ||check<=2)-
hello I am suraj Ghimire
hello
I
am
suraj
Ghi
I am not sure your code works as you say.
i is declared as a char (only s is a char *); it should be an int.
You process the input string without considering the terminating null character, which leads to a lot of garbage being printed.
You do not release the allocated memory.
I suggest this slightly modified version:
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
char *s, *p;
/* Allocate a new string and verify the allocation has succeeded. */
s = malloc(1024 * sizeof(char));
if (!s) {
printf("malloc failed\n");
return 1;
}
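/* Read one line from the user; the width keeps scanf inside the 1024-byte buffer. */
if (scanf("%1023[^\n]", s) != 1) {
s[0] = '\0'; /* empty line or EOF: nothing was stored */
}
/* Walk the string through a second pointer (much simpler and faster than indexed access); `s` is kept for free(). */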
p = s;
while (*p) {
if (*p != ' ') {
printf("%c",*p);
}else{
printf("\n");
}
p++;
}
free(s);
return 0;
}
Example output:
$ ./a.out
abc def gh i j kmlm opqrst
abc
def
gh
i
j
kmlm
opqrst
EDIT: As requested by the OP, further details regarding the terminating null character.
By convention, a string (an array of characters) ends with a specific character called the null terminator. This character has the value 0 ('\0') and marks the end of the string data.
In your example, the buffer which stores the string is dynamically allocated in RAM. If you do not check for the terminating null character, you keep processing the bytes after it as if they were part of the string (but they are not).
Going beyond this character makes you read whatever follows in memory (which is part of your program's RAM data). Since these bytes can hold anything (any value from 0 to 255), printing them may produce "gibberish", because they may not be printable and are certainly not related to your string.
In the "best" case the program halts with a "segmentation fault" because you are accessing a memory region you are not allowed to touch. In the "worst" case you print a lot of things before crashing.
This is typically called a data leak (whether it is RAM or ROM) because it exposes internal data of your program. In the specific case of your example there is no sensitive data. But imagine leaking passwords or private keys stored in your program: this can be a severe security issue!
There are a couple of issues with your code.
Firstly, you need to check that the for loop does not run past the end of the string.
Your loop condition is effectively always true: i is a char, so on typical platforms i<1024 can never become false, and check<=2 sits in the increment expression (i++ || check<=2), where its result is simply discarded. Because of this the loop keeps running unless it is stopped with break.
Lastly, nothing stops the loop once check exceeds 2.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
char *s, check = 0;
size_t i; /* same type as the value returned by strlen() */
s = malloc(1024 * sizeof(char));
scanf("%[^\n]", s);
s = realloc(s, strlen(s) + 1);
for(i=0; i<strlen(s); i++) {
if(*(s+i) != ' ') {
printf("%c",*(s+i));
check=0;
} else {
printf("\n");
check++;
if (check > 2) break;
}
}
return 0;
}
Output:
Hello, this is a test
Hello,
this
is
a
test
for(i=0;i<1024;i++ ||check<=2)
There are two issues. One is that the string will not always be 1024 characters long, so determine its length before printing it. The other is check<=2, which has to go in the second part of the for statement (the condition) so that it is actually tested; it is combined with the length test using &&, so the loop stops as soon as either test fails. It is also better to calculate the length of the string only once, so I store it in len.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *s, check = 0;
size_t i; /* same type as len */
s = malloc(1024 * sizeof(char));
scanf("%[^\n]", s);
s = realloc(s, strlen(s) + 1);
size_t len = strlen(s);
for (i = 0; i < len && check <= 2; i++) {
if (*(s + i) != ' ') {
printf("%c", *(s + i));
check = 0;
} else {
printf("\n");
check++;
}
// fflush(stdin);
}
return 0;
}

Writing a C program that removes every occurrence of a char except the last one

I'm trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence. For example, if I had the string
char word[]="Hihxiivaeiavigru";
output should be:
printf("%s",word);
hxeavigru
What I have so far:
#include <stdio.h>
#include <string.h>
int main()
{
char word[]="Hihxiiveiaigru";
for (int i=0;i<strlen(word);i++){
if (word[i+1]==word[i]);
memmove(&word[i], &word[i + 1], strlen(word) - i);
}
printf("%s",word);
return 0;
}
I am not sure what I am doing wrong.
With short strings, any algorithm will do. OP's attempt is O(n*n) (as are the other working answers, including the one from @David C. Rankin that identified OP's shortcomings).
But what if the string were thousands or millions of characters long?
Consider the following algorithm: @paulsm4
Form a `bool` array used[CHAR_MAX - CHAR_MIN + 1] and set each false.
i,unique = n - 1;
From the end of the string (n-1 to 0) to the front:
if (character never seen yet) { // used[] look-up
array[unique] = array[i];
unique--;
}
Mark used[array[i]] as true (index from CHAR_MIN)
i--;
Shift the string "to the left" (unique - i) places
Solution is O(n)
Coding goal is too fun to just post a fully coded answer.
I would first write a function to determine whether the char ch at a given position p is the last occurrence of ch in a given char *. Like,
bool isLast(char *word, char ch, int p) {
p++;
ch = tolower(ch);
while (word[p] != '\0') {
if (tolower(word[p]) == ch) {
return false;
}
p++;
}
return true;
}
Then you can use that to iteratively emit your desired characters like
int main() {
char *word = "Hihxiivaeiavigru";
for (int i = 0; word[i] != '\0'; i++) {
if (isLast(word, word[i], i)) {
putchar(word[i]);
}
}
putchar('\n');
}
And (for completeness) I used
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
Outputs (as requested)
hxeavigru
Additional areas where you are currently hurting yourself.
Your for loop must NOT increment the index, e.g. for (int i=0; word[i];). This is because when you memmove() by 1, you have just incremented the indexes. That also means the value to save for last is now i - 1.
There should only be one call to strlen() in the program. You can simply subtract one from length each time memmove() is called.
Only increment your loop counter variable when memmove() is not called.
Additionally, avoid hardcoding strings. You shouldn't have to recompile your code just to test the results of "Hihxiivaeiaigrui" instead of "Hihxiivaeiaigru". You shouldn't have to recompile just to remove all but the last 'a' instead of the 'i'. Either pass the string and character to find as arguments to your program (that's what int argc, char **argv are for), or prompt the user for input.
Putting it all together, you could do (presuming word is 1023 characters or fewer):
#include <stdio.h>
#include <string.h>
#define MAXC 1024
int main (int argc, char **argv) {
char word[MAXC]; /* storage for word */
strcpy (word, argc > 1 ? argv[1] : "Hihxiivaeiaigru"); /* copy to word */
int find = argc > 2 ? *argv[2] : 'i', /* character to find */
last = -1; /* last index where find found */
size_t len = strlen (word); /* only compute strlen once */
printf ("%s (removing all but last %c)\n", word, find);
for (int i=0; word[i];) { /* loop over each char -- do NOT increment */
if (word[i] == find) { /* is this my character to find? */
if (last != -1) { /* if last is set */
/* overwrite last with rest of word */
memmove (&word[last], &word[last + 1], (int)len - last);
last = i - 1; /* last now i - 1 (we just moved it) */
len = len - 1;
}
else { /* last not set */
last = i; /* set it */
i++; /* increment loop counter */
}
}
else /* all other chars */
i++; /* just increment loop counter */
}
puts (word); /* output result -- no need for printf (no conversions) */
}
Example Use/Output
$ ./bin/rm_all_but_last_occurrence
Hihxiivaeiaigru (removing all but last i)
Hhxvaeaigru
What if you want to use "Hihxiivaeiaigrui"? Just pass it as the 1st argument:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui
Hihxiivaeiaigrui (removing all but last i)
Hhxvaeagrui
What if you want to use "Hihxiivaeiaigrui" and remove duplicate 'a' characters? Just pass the string to search as the 1st argument and the character to find as the second:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui a
Hihxiivaeiaigrui (removing all but last a)
Hihxiiveiaigrui
Nothing removed if only one of the characters:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui H
Hihxiivaeiaigrui (removing all but last H)
Hihxiivaeiaigrui
Let me know if you have further questions.
Im trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence.
Process the string (or word) from its last character towards its first. Seen from that direction, the task becomes removing every occurrence of a character except the first one encountered. Because the scan runs from the end, the characters that survive are collected at the end of the buffer; once the whole string has been processed, and if any duplicates were found, they have to be moved back to the start of the string. The complexity of this algorithm is O(n).
Implementation:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define INDX(x) (tolower(x) - 'a')
void remove_dups_except_last (char str[]) {
int map[26] = {0}; /* to keep track of a character processed */
size_t len = strlen (str);
char *p = str + len; /* pointer pointing to null character of input string */
size_t i = 0;
for (i = len; i != 0; --i) {
if (map[INDX(str[i - 1])] == 0) {
map[INDX(str[i - 1])] = 1;
*--p = str[i - 1];
}
}
/* copy back to the front only if duplicate characters were removed */
if (p != str) {
for (i = 0; *p; ++i) {
str[i] = *p++;
}
str[i] = '\0';
}
}
int main(int argc, char* argv[])
{
if (argc != 2) {
printf ("Invalid number of arguments\n");
return -1;
}
char str[1024] = {0};
/* Assumption: the input string/word will contain characters A-Z and a-z
* only and size of input will not be more than 1023.
*
* Leaving it up to you to check the valid characters in input string/word
*/
strcpy (str, argv[1]);
printf ("Original string : %s\n", str);
remove_dups_except_last (str);
printf ("Removed duplicated characters except the last one, modified string : %s\n", str);
return 0;
}
Testcases output:
# ./a.out Hihxiivaeiavigru
Original string : Hihxiivaeiavigru
Removed duplicated characters except the last one, modified string : hxeavigru
# ./a.out aa
Original string : aa
Removed duplicated characters except the last one, modified string : a
# ./a.out a
Original string : a
Removed duplicated characters except the last one, modified string : a
# ./a.out TtYyuU
Original string : TtYyuU
Removed duplicated characters except the last one, modified string : tyU
You can iterate over each character of your string and copy it to a new string if it is not an "i", or if it is the last occurrence of "i".
#include <stdio.h>
#include <string.h>
int main() {
char word[]="Hihxiiveiaigru";
char newword[10000];
char* ptr = strrchr(word, 'i');
int index=0;
int index2=0;
while (index < strlen(word)) {
if (word[index]!='i' || index ==(ptr - word)) {
newword[index2]=word[index];
index2++;
}
index++;
}
newword[index2] = '\0'; /* terminate the copy before printing it */
printf("%s",newword);
return 0;
}

Can't dynamically allocate a string

I tried to dynamically allocate a string using a function I named ALLO, but when I run the program it does not work: ALLO cannot read the string with getc, and the input step gets skipped.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void ALLO(char *str){
char c=0;
int i = 0, j = 1;
str = (char*)malloc(sizeof(char));
printf("Enter String : ");
while (c != '\n') {
// read the input from keyboard standard input
c = getc(stdin);
// re-allocate (resize) memory for character read to be stored
str = (char*)realloc(str, j * sizeof(char));
// store read character by making pointer point to c
str[i] = c;
i++;
j++;
}
str[i] = '\0'; // at the end append null character to mark end of string
printf("\nThe entered string is : %s", str);
free(str); // important step the pointer declared must be made free
}
int main(){
char *NomAF;
int NAF;
printf("Entrer le nombre des ateliers : ");
scanf("%d",&NAF);
ALLO(NomAF);
return 0 ;
}
The semantics are wrong.
You ask the user for the names of the athletes, and then you scan it into an integer. You should ask for the number of athletes first. Then, after that, you allocate memory to accommodate each name.
int num_names;
scanf("%d", &num_names);
After you know the number of names, you then allocate a buffer for each name, separately.
char **names;
names = malloc(num_names * sizeof *names); /* one char * per name */
for(int i = 0; i < num_names; i++)
ALLO(&names[i]);
Also, you shouldn't be using scanf for user input. Use fgets instead, which is a little better.
Then, you also should be using a pointer to pointers to get those strings.
A little modified version of your code (which you should review and fix, as needed):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void ALLO(char **str){
/* use INT for getc() return */
int c=0, i = 0;
/* you are getting 1 byte of memory */
*str = malloc(sizeof **str);
/* should use fprintf(stderr...) or fflush(stdout) to guarantee
* the sentence will be seen by user
*/
printf("Enter String : ");
while (c != '\n') {
// read the input from keyboard standard input
c = getc(stdin);
// re-allocate (resize) memory for character read to be stored
/* i = 0 in the first run,
*
* and you have 1 byte alloced in the first run.
*
* so you get 1 byte for actual getc() return
* 1 byte for next character + NULL byte
*
* NOTE: you are STORING the newline in your string. You only
* check for it AFTER you do the assignment, so your strings
* contain a newline before the null byte.
*/
*str = (char*)realloc(*str, (i + 2) * sizeof **str);
// store read character by making pointer point to c
(*str)[i] = c;
// you can use only 'i' for this...
i++;
/* i
*
* Using only 'i' requires that you understand what i is doing
* during execution. i keeps the current buffer position,
* and you know you need one more position for the next
* character and one more for the null byte.
*
* Therefore, in your realloc statement, you need (i + 2)
*/
}
(*str)[i] = '\0'; // at the end append null character to mark end of string
printf("\nThe entered string is : %s", *str);
// if you free here, you can't get the string in main for printing.
// free is the last step
//free(str); // important step the pointer declared must be made free
}
int main(){
char **NomAF;
int NAF, i;
char buf[100];
printf("Number of athletes : ");
fgets(buf, sizeof(buf), stdin);
NAF = atoi(buf);
NomAF = malloc(NAF * sizeof *NomAF);
// check malloc errors
// get names
for(i = 0; i < NAF; i++) {
ALLO(&NomAF[i]);
printf("New name: %s\n", NomAF[i]);
}
// print names, then free() them
for(i = 0; i < NAF; i++) {
printf("Name: %s\n", NomAF[i]);
free(NomAF[i]);
}
// free the base pointer
free(NomAF);
return 0 ;
}
Add this
while ((c = getchar()) != '\n' && c != EOF);
before the getc call; it discards whatever is left on the current input line (c must be an int for the EOF test to work).
The reason: after scanf("%d",&NAF); you type 5 followed by Enter, so the input is 5'\n'. scanf consumes the 5, the '\n' stays in the buffer, and that newline is what your getc reads.
Also change str[i] = '\0'; to str[i-1] = '\0';. This replaces the stored newline with the null terminator, and you only allocated memory for i characters.
You can return the string with return str; and change the function's return type to char*, or, if you don't want that, take a parameter that was allocated with malloc.
See this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
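void ALLO(char **str) /* pointer to the caller's pointer, so the realloc result reaches main */
{
int c = 0; /* int, so EOF can be told apart from a character */
int i = 0;
printf("Enter String : ");
while (c != '\n')
{
char *tmp;
/* discard the rest of the current input line */
while ((c = getchar()) != '\n' && c != EOF);
c = getc(stdin);
if (c == EOF) c = '\n'; /* treat end of input like an empty line */
tmp = realloc(*str, (i+1) * sizeof(char));
if(!tmp) exit(1);
*str = tmp;
(*str)[i] = c;
i++;
}
(*str)[i-1] = '\0'; /* overwrite the stored newline */
}
int main()
{
char *NomAF;
int NAF;
NomAF=malloc(sizeof(char));
if(!NomAF) exit(1);
printf("Entrer le nombre des ateliers : ");
if (scanf("%d",&NAF) != 1) exit(1);
ALLO(&NomAF);
printf("%s", NomAF); /* never pass user input as the format string */
free(NomAF);
return 0 ;
}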
output:
Entrer le nombre des ateliers : 5
Enter String : a
s
d
f
g
----->newline to stop the loop
asdfg
Process returned 0 (0x0) execution time : 8.205 s
Press any key to continue.
I entered it as a string, not character by character; it's not practical to ask the user to enter it letter by letter.
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
char * inputword(char **);
int main()
{
char *word;
printf("Enter Word:");
word=inputword(NULL);
printf("Word Entered:%s",word);
free(word);
printf("\nEnter Word 2:");
inputword(&word);
printf("Word Entered:%s",word);
free(word);
return 0 ;
}
char *inputword(char **word)
{
char *str=NULL,ch,*memerr="Memory Error";
int i=0,flag=1;
do
{
str=realloc(str,((i+1)*sizeof(char)));
if(!str)
{
printf(memerr);
exit(1);
}
ch = getch();
switch(ch)
{
case 13:
str[i] = '\0';
putc('\n',stdout);
flag=0;
break;
case '\b':
if(i>0) i--;
str[i--]='\0';
printf("\b \b");
break;
default:
str[i] = ch;
putc(ch,stdout);
}
i++;
}while(flag);
if(word!=NULL)
*word=str;
return str;
}
output:
Enter Word:Hai, How are You?(1 String)
Word Entered:Hai, How are You?(1 String)
Enter Word 2:Hai, How are You?(2 string)
Word Entered:Hai, How are You?(2 string)
Process returned 0 (0x0) execution time : 58.883 s
Press any key to continue.

Count how many words in a line of text? (in C Programming Language)

QUESTION:
What is wrong with this code example, what is missing?
Current incorrect output is:
There are 0 words in ""
Code Explanation:
Write a program that reads in a line of text, and prints out the number of words in that line of text. A word contains characters that are alphanumeric. Hint: Use the fgets() function.
Sample run:
Input:
from here to eternity
Output:
4
Input:
start here and turn 180 degrees
Output:
6
Code Snippet:
https://onlinegdb.com/H1rBwB83V
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#define MAXLEN 100
int countWords(char str[])
{
int i=0;
int count = 0;
bool flag = false;
while (str[i] != '\0')
{
if (isalnum(str[i]))
{
if (!flag)
{
count++;
flag = true;
}
}
else
flag = false;
i++;
}
return count;
}
int main(int argc, char **argv) {
char str[MAXLEN];
int count;
while (fgets(str, sizeof(str), stdin) != NULL)
{
str[strlen(str-1)] = '\0'; // the last character is the newline. replace with null
count = countWords(str);
printf("There are %d words in \"%s\"\n", count, str);
}
return 0;
}
Similar Tutorial:
https://www.sanfoundry.com/c-program-count-words-in-sentence/
You have an error here:
str[strlen (str - 1)] = '\0'; // the last character is the newline. replace with null
Using the pointer str - 1 leads to undefined behavior, as it points to memory outside the original string.
You actually meant to do this: strlen(str) - 1 (notice the -1 is moved outside the parentheses)

Parsing simple name/value pair settings in config file with leading and terminating spaces - C

This is the code I made so far. I apologize if my buffer sizes are an overkill.
The idea is to read the entire configuration file (in this example, it's file.conf), and for now we assume it exists. I'll add error checking later.
Once the file is read into stack space, then the getcfg() function searches the configuration data for the specified name, and if it's found, returns the corresponding value. My function works when the configuration file contains leading spaces before names or values; such spaces are ignored.
Say this is my configuration file:
something=data
apples=oranges
fruit=banana
animals= cats
fried =chicken
My code will work correctly with the first four entries of the config file. for example, if I use "something" as the name, then "data" will be returned.
The last item won't work as of yet because of the trailing spaces after "fried" and before the =. I want to be able to have my function automatically remove those spaces, too, especially in case an option format such as
somethingelse = items
begins to be used. (Note the spaces on both sides of the = sign.)
What can I do to make a less CPU-intensive version of my program that also detects and removes trailing spaces from the name and value when processing the name and values?
Here's my current code:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int getcfg(char* buf, char *name, char *val) {
int fl = 0, n = 0;
char cfg[2][10000], *p = buf; /* cfg[0] = name, cfg[1] = value */
memset(cfg, 0, sizeof(cfg));
while (*p) {
if (*p == '\n') {
if (strcmp(cfg[0], name) == 0) {
strcpy(val, cfg[1]);
return 1;
}
memset(cfg, 0, sizeof(cfg));
n = 0;
fl = 0;
} else {
if (*p == '=') {
n = 0;
fl = 1;
} else {
if (n != 0 || *p != ' ') {
cfg[fl][n] = *p;
n++;
}
}
}
p++;
}
return 0;
}
int main() {
char val[10000], buf[100000]; //val=value of config item, buf=buffer for entire config file ( > 100KB config file is nuts)
memset(buf, 0, sizeof(buf));
memset(val, 0, sizeof(val));
int h = open("file.conf", O_RDONLY);
if (read(h, buf, sizeof(buf)) < 1) {
printf("Can't read\n");
}
close(h);
printf("Value stat = %d ", getcfg(buf, "Item", val));
printf("Result = '%s'\n", val);
return 0;
}
Below is a small (~15 lines) sscanf-based read_params() function which does the job. As a bonus, it understands comments and complains about erroneous lines (if any):
$ cat config_file.c
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
#define ARRAY_SIZE(a) ((sizeof (a)) / (sizeof (a)[0]))
enum { MAX_LEN=128 };
struct param {
char name[MAX_LEN];
char value[MAX_LEN];
};
void strtrim(char *s)
{
char *p = s + strlen(s);
while (--p >= s && isspace(*p))
*p = '\0';
}
int read_params(FILE *in, struct param *p, int max_params)
{
int ln, n=0;
char s[MAX_LEN];
for (ln=1; max_params > 0 && fgets(s, MAX_LEN, in); ln++) {
if (sscanf(s, " %[#\n\r]", p->name)) /* empty line or comment */
continue;
if (sscanf(s, " %[a-z_A-Z0-9] = %[^#\n\r]",
p->name, p->value) < 2) {
fprintf(stderr, "error at line %d: %s\n", ln, s);
return -1;
}
strtrim(p->value);
printf("%d: name='%s' value='%s'\n", ln, p->name, p->value);
p++, max_params--, n++;
}
return n;
}
int main(int argc, char *argv[])
{
FILE *f;
struct param p[32];
f = argc == 1 ? stdin : fopen(argv[1], "r");
if (f == NULL) {
fprintf(stderr, "failed to open `%s': %s\n", argv[1],
strerror(errno));
return 1;
}
if (read_params(f, p, ARRAY_SIZE(p)) < 0)
return 1;
return 0;
}
Let's see how it works (quotes mark the beginning and the end of each line for clarity):
$ cat bb | sed -e "s/^/'/" -e "s/$/'/" | cat -n
1 'msg = Hello World! '
2 'p1=v1'
3 ' p2=v2 # comment'
4 ' '
5 'P_3 =v3'
6 'p4= v4#comment'
7 ' P5 = v5 '
8 ' # comment'
9 'p6 ='
$ ./config_file bb
1: name='msg' value='Hello World!'
2: name='p1' value='v1'
3: name='p2' value='v2'
5: name='P_3' value='v3'
6: name='p4' value='v4'
7: name='P5' value='v5'
error at line 9: p6 =
Note: as an additional bonus, the value can be anything except the # \n \r characters, including spaces, as can be seen above with the 'Hello World!' example. If that is not what is needed, add space and tab to the exclusion list in the second sscanf() for the value (or specify the accepted characters there instead) and drop the strtrim() function.
I'll provide a straightforward version, with everything done in main and no key/value saving: the program only recognizes where they are and prints them. I used the input file you gave and added one more line at the end, something = more_data.
This version of the parser does not recognize multiple data items (items separated by spaces in the data fields); you'll have to figure that out as an exercise.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(void)
{
int fd = open("file.conf", O_RDONLY, 0);
int i = 0;
char kv[100];
char c;
while (read(fd,&c,1) == 1) {
/* ignoring spaces and tabs */
if (c == '\t' || c == ' ') continue;
else if (c == '=') {
/* finished reading a key */
kv[i] = 0x0;
printf("key found [%s] ", kv);
i = 0;
continue;
} else if (c == '\n') {
/* finished reading a value */
kv[i] = 0x0;
printf(" with data [%s]\n", kv);
i = 0;
continue;
}
kv[i++] = c;
}
close(fd);
return 0;
}
And the output is:
key found [something] with data [data]
key found [apples] with data [oranges]
key found [fruit] with data [banana]
key found [animals] with data [cats]
key found [fried] with data [chicken]
key found [something] with data [more_data]
Explanation
while (read(fd,&c,1) == 1): reads one character at a time from the file.
if (c == '\t' || c == ' ') continue;: this is responsible for ignoring the white-spaces and tabs wherever they are.
else if (c == '='): If the program finds a = character, it concludes that what it just read was a key and treats it. What's inside that if should be easy to understand.
else if (c == '\n'): Then it uses a new-line character to recognize the end of a value. Again, what's inside the if is not hard to understand.
kv[i++] = c;: This is where we save the char value into the buffer kv.
So, with some minor changes, you can adapt this bit of code to become a parsing function that will suit your needs.
Edit and new code
As pointed out by John Bollinger in the comments, using read inside a while to read one character at a time is very costly. I'll post a second version of the program using the same input method OP was using (reading the whole file at once into a buffer) and then parsing it with another function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
void parse(char *s)
{
char c, kv[100];
int i = 0; /* write position in kv; must start at 0 */
while ((c = *s++)) {
/* ignoring spaces and tabs */
if (c == '\t' || c == ' ') continue;
else if (c == '=') {
/* finished reading a key */
kv[i] = 0x0;
printf("key found [%s] ", kv);
i = 0;
continue;
} else if (c == '\n') {
/* finished reading a value */
kv[i] = 0x0;
printf(" with data [%s]\n", kv);
i = 0;
continue;
}
kv[i++] = c;
}
}
int main(void)
{
int fd = open("file.conf", O_RDONLY, 0);
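char buffer[1000];
ssize_t n;
/* use the reading method that suits you best */
n = read(fd, buffer, sizeof buffer - 1);
if (n < 0) n = 0; /* read error: parse an empty string */
buffer[n] = '\0'; /* read() does not terminate the data */
/* only thing parse() expects is a null-terminated string */
parse(buffer);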
close(fd);
return 0;
}
It is very unusual to read a whole config file into memory as a flat image, and especially to keep such an image as the internal representation. One would ordinarily parse the file contents into key/value pairs as you go, and store a representation of those pairs.
Also, your use of read() is incorrect, as you cannot safely assume that it will read all bytes of the file in one call. One normally must call read() in a loop, keeping track of the return value from each call to know both when the end of the file is reached and where in the buffer to put the next bytes read.
If the configuration is supposed to be completely generic, so that you don't know in advance what keywords to expect, then you might organize the configuration data in a hash table or a binary search tree, with the parameter names as the keys. If you do know what parameters to expect (or at least which to allow), then you might have a variable or a struct member for each one.
Naturally, the approach to parameter lookup must be paired correctly with the data structure in which you store the parameters. Any of the approaches I suggested will make looking up multiple configuration parameters far faster. They would also avoid wasting memory, and would adapt to extremely large configurations (or at least could do so).
How best to approach reading the file depends on details of your config file format, such as whether keys and/or values are permitted to contain internal spaces, whether more than one key/value pair may appear on the same line, and whether there is an upper bound on the allowed length of config file lines or of keys and values. Here's an approach that expects one key/value pair per line, supports keys and values that contain internal whitespace (but not newlines), but neither of which is longer than 1023 characters, and where keys are not permitted to contain the '=' character:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <assert.h>
int main() {
char key[1024];
char value[1024];
FILE *config;
int done;
config = fopen("file.conf", "r");
if (!config) {
perror("while opening file.conf");
return 1;
}
do {
char nl = '\0';
int nfields = fscanf(config, " %1023[^=\n]= %1023[^\n]%c", key, value, &nl);
int i;
done = 1;
if (nfields == EOF) {
if (ferror(config)) {
/* handle read error ... */
perror("while reading file.conf");
} else {
/* trailing empty line(s); ignore ... */
}
break;
} else if (nfields == 3) {
if (nl != '\n') {
/* handle excessive-length value ... */
} else {
done = 0;
}
} else if (nfields == 1) {
/* handle excessive-length key ... */
break;
} else {
assert(nfields == 2);
/* last key/value pair, not followed by a newline */
}
if (key[0] == '=') {
/* handle missing key ... */
break;
}
/* successfully read a key / value pair; truncate trailing whitespace */
for (i = strlen(key); key[--i] == ' '; ) {
/* nothing */
}
key[i + 1] ='\0';
for (i = strlen(value); value[--i] == ' '; ) {
/* nothing */
}
value[i + 1] ='\0';
/* record the key / value pair somewhere (but here we just print it) ... */
printf("key: [%s] value: [%s]\n", key, value);
} while (!done);
fclose(config);
return 0;
}
Important points to note about that include:
No mechanism for storing the key / value pairs is provided. I gave you a few options, and there are others, but you must decide what's best for your own purposes. Rather, the program above addresses the problem of parsing your config data once for all, so that you can avoid parsing it de novo every time you perform a lookup.
The code relies on fscanf() to consume any leading whitespace before the key and value, but in order to accommodate internal whitespace in the key and value, it cannot do the same for trailing whitespace.
Instead, it manually trims trailing whitespace from key and value.
The fscanf() format uses explicit field widths to avoid buffer overruns. It uses the %[ and %c field descriptors to scan data that may be or include whitespace.
Although it may look longish, do note how much of that code is dedicated to error handling.
Divide and conquer.
Getting the data and parsing it are best handled with 2 separate routines.
1) Use fgets() or other code with read() to read a line
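int Parse_KeyValue(char *line, size_t *key_offset, size_t *value_offset);
int foo(FILE *inf) {
char buffer[1000];
size_t key_offset, value_offset; /* filled in by Parse_KeyValue() */
while (fgets(buffer, sizeof buffer, inf)) {
if (Parse_KeyValue(buffer, &key_offset, &value_offset)) {
fprintf(stderr, "Bad Line '%s'\n", buffer);
return 1;
}
printf("'%s'='%s'\n", &buffer[key_offset], &buffer[value_offset]);
}
return 0; /* every line parsed */
}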
2) Parse the line. (Sample unchecked code)
// 0: Success
// 1: failure
int Parse_KeyValue(char *line, size_t *key_offset, size_t *value_offset) {
char *p = line;
while (isspace((unsigned char) *p)) p++;
*key_offset = p - line;
char *end = p; /* not const: '\0' is written through it below */
while (*p != '=') {
if (*p == '\0') return 1; // fail, no `=` found
if (!isspace((unsigned char) *p)) {
end = p+1;
}
p++;
}
*end = '\0';
p++; // consume `=`
while (isspace((unsigned char) *p)) p++;
*value_offset = p - line;
end = p;
while (*p) {
if (!isspace((unsigned char) *p)) {
end = p+1;
}
p++;
}
*end = '\0';
return 0;
}
This does allow for valid "" key and value. Adjust as needed.
