File Parsing using sscanf in C

I am trying to read from a file and essentially split each line into 2 strings. However, my current code is giving me a seg fault.
char* lhsdata = (char *)malloc(5 * sizeof(char));
char* rhsdata = (char *)malloc(15 * sizeof(char));
while(fgets(line,sizeof line,filePtr)){
if (line[0] == '\n') break;
//Do operations on each line
sscanf(line, "%s %s\n", lhsdata, rhsdata);
printf("%s %s\n", lhsdata, rhsdata);
rhsdata[strlen(rhsdata)-1] = '\0';
}
The file looks like this
add $s1,$s0,$s0
add $t2,$s0,$s5
addi $t4,$s3,70
The last line of the file is empty
I tried playing around with the sizes of lhsdata and rhsdata, but to no avail. I'm pretty stubborn, so I want to know how to fix this issue using sscanf (I know there are other options).
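A minimal sketch of the sscanf approach, assuming the mnemonic fits in 7 characters and the operand list in 31. The helper name split_instruction and both buffer sizes are illustrative and do not come from the question. The field widths keep sscanf from writing past the buffers, and the return value tells you whether both fields were actually found. Whether this addresses the crash in the question cannot be told from the code shown, because the declaration of line and the check of filePtr are not included.

```c
#include <stdio.h>

/* Hypothetical helper: splits a line such as "add $s1,$s0,$s0\n" into the
 * mnemonic (lhs) and the operand list (rhs).
 * Returns 1 when both fields were found, 0 otherwise. */
int split_instruction(const char *line, char lhs[8], char rhs[32])
{
    /* %7s and %31s stop one byte short of the buffer sizes, leaving room
     * for the terminating '\0'. %s skips leading whitespace and stops at
     * the newline, so the result needs no trimming afterwards. */
    return sscanf(line, "%7s %31s", lhs, rhs) == 2;
}
```

Inside the fgets loop this would be called as if (split_instruction(line, lhs, rhs)) printf("%s %s\n", lhs, rhs); so that a blank or malformed line is skipped instead of printing stale buffers.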

Related

How to print a string with its \n characters included?

Let's say we have char* str = "Hello world!\n". Obviously when you print this you will see Hello world!, but I want to make it so it will print Hello world!\n. Is there any way to print a string with its line break characters included?
Edit: I want to print Hello world!\n without changing the string itself. Obviously I could just do char* str = "Hello world!\\n".
Also, the reason I'm asking this question is because I'm using fopen to open a txt file with a ton of line breaks. After making the file into a string, I want to split the string by each of its line breaks so I can modify each line individually.
I think it's a typical case of an XY Problem: you ask about a particular solution without really focusing on the original problem first.
After making the file into a string
Why do you think you need to read the entire file in at once? That's not normally necessary.
I want to split the string by each of its line breaks so I can modify each line individually.
You don't need to print the string to do that (you wanted "to make it so it will print Hello world!\n"). You don't need to modify the string. You just need to read it in line by line! That's what fgets is for:
void printFile(void)
{
FILE *file = fopen("myfile.txt", "r");
if (file) {
char linebuf[1024];
int lineno = 1;
while (fgets(linebuf, sizeof(linebuf), file)) {
// here, linebuf contains each line
char *end = linebuf + strlen(linebuf) - 1;
if (*end == '\n')
*end = '\0'; // remove the '\n'
printf("%5d:%s\\n\n", lineno ++, linebuf);
}
fclose(file);
}
}
I want to make it so it will print Hello world!\n
If you really wanted to do it, you'd have to translate each ASCII LF (that's what \n represents) to the two characters \ and n on output, for example like this:
#include <stdio.h>
#include <string.h>
void fprintWithEscapes(FILE *file, const char *str)
{
const char *cr;
while ((cr = strchr(str, '\n'))) {
fprintf(file, "%.*s\\n", (int)(cr - str), str);
str = cr + 1;
}
if (*str) fprintf(file, "%s", str);
}
int main() {
fprintWithEscapes(stdout, "Hello, world!\nA lot is going on.\n");
fprintWithEscapes(stdout, "\nAnd a bit more...");
fprintf(stdout, "\n");
}
Output:
Hello, world!\nA lot is going on.\n\nAnd a bit more...

How to parse each column in a CSV file using C

I'm trying to use C to read a CSV file, iterate line by line (until EOF), and split each line on the comma. Then I wish to separate each column into "bins" and add them to a struct (which isn't shown here; I defined it in a helper file) based on type.
For example, if I have 1,Bob, I'd like to split 1 and Bob into two variables. Here's what I've written so far.
void readFile(char file[25]) {
FILE *fp;
char line[1000];
fp = fopen(file, "r");
while(fgets(line, 1000, fp)) {
char* tmp = strdup(line);
char* token;
while((token = strsep(&tmp, ","))) {
printf("%s\n", token); // I want to split token[0] and token[1]
}
}
fclose(fp);
}
The above code does compile and run. I just don't know how to access each split of the token, like token[0] or token[1]. In Python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
For testing purposes, all I'm doing right now is printing each line (in the second while loop), just to see how each split looks. I haven't implemented the code where I put each split line into its respective struct member.
I've searched Stack Overflow and found a multitude of threads on this topic. None of them seemed to help me except for this one, which I have drawn from. But I wasn't able to get the storing of split columns working.
In python, this would be simple enough. I could just access 1 using token[0] and Bob using token[1] for each line. But here in C, I can't do that.
Yes, you can, if only you define the array.
while (fgets(line, sizeof line, fp))
{
char *tmp = strchr(line, '\n');
if (tmp) *tmp = '\0'; // remove the '\n'
tmp = strdup(line);
#define MAXCOLUMNS 2
char *token[MAXCOLUMNS];
int c = 0;
while (tmp)
{
if (c == MAXCOLUMNS) puts("too many columns"), exit(1);
token[c++] = strsep(&tmp, ",");
}
if (1 <= c) printf("column 1: %s\n", token[0]);
if (2 <= c) printf("column 2: %s\n", token[1]);
// ONLY if the line's tokens are no longer needed; token[0] points at
// the start of the buffer that strdup returned:
free(token[0]);
}
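To go from the token array to struct members, each column has to be converted to its type. The following is a minimal sketch under stated assumptions: struct record and parse_record are hypothetical names (the question's real struct is not shown), the line has exactly two columns, and it splits on the first comma with strchr rather than strsep, which is not part of standard C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; the struct from the question is not shown. */
struct record {
    int id;
    char name[32];
};

/* Parses a line such as "1,Bob\n" into *out. The line buffer is modified.
 * Returns 1 on success, 0 if the line is not "<integer>,<text>". */
int parse_record(char *line, struct record *out)
{
    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending, if any */

    char *comma = strchr(line, ',');
    if (comma == NULL)
        return 0;                         /* no second column */
    *comma = '\0';                        /* split into two C strings */

    char *end;
    long id = strtol(line, &end, 10);
    if (end == line || *end != '\0')
        return 0;                         /* first column is not a number */

    out->id = (int)id;
    /* Copy at most 31 characters and always terminate the name. */
    strncpy(out->name, comma + 1, sizeof out->name - 1);
    out->name[sizeof out->name - 1] = '\0';
    return 1;
}
```

Because the name is copied into the struct, the line buffer can be reused for the next fgets call without a strdup.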

C - Reads new line in a .txt

I am making a program that reads a file and counts how many words the .txt has. The program works fine, except that if the text contains a line break it stops reading, so I have to put all my text on one line. As far as I know, the problem is fgets, which stops reading when it reaches EOF or a newline. My question is: how do I read my text even when it contains newlines? Do I have to use fread()? If so, how would I do it? Below is the part of the code where I read the .txt and put the words in an array.
char linha[10000];
int grandezaStrings = 100000;
int i = 0;
int contadorString = 0;
// This line reads the file.
fgets (linha, grandezaStrings,myFile);
// Used for special characters
setlocale (LC_ALL,"PORTUGUESE");
// Dynamic array to hold words
char ** strings = (char **)malloc(grandezaStrings * sizeof (char*));
char * pch;
for (i=0;i<grandezaStrings; i++){
strings[i] = (char *)malloc(100+1);
}
// Transfer all the words to my array.
i = 0;
pch = strtok(linha, " ,.!?:;()\n");
while (pch != NULL){
strlwr(pch);
strings[i] = pch;
contadorString++;
pch = strtok (NULL, " ,.!?:;()\n");
i++;
}
Thanks a lot!
fgets reads up to the next newline; that is by design. You can loop (while (fgets(...) != NULL)), or, if you want to load everything into memory in a single read, you can use fread:
// This line reads the whole file (up to the size of the buffer).
// fread returns a size_t and never a negative value, so check ferror.
i = (int)fread(linha, 1, sizeof(linha) - 1, myFile);
if (ferror(myFile)) { // test that the read was OK
    perror("fread");
    return 1;
}
linha[i] = '\0'; // add the terminating null
Use fgets() to get line by line from the file and perform your operation on the line.
while(fgets (linha, sizeof(linha),myFile) != NULL)
{
// perform the action on the line.
}
Use sizeof(linha) as the number of characters to be read.
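Putting the loop together with the strtok call from the question, the counting part could look like the sketch below. The function names count_words and count_words_in_file are made up for this example, and the delimiter set is the one from the question plus '\r'. It only counts; it does not store the words. A line longer than the buffer is read in pieces, so a word that straddles the boundary would be counted twice.

```c
#include <stdio.h>
#include <string.h>

/* Counts the words in one line. strtok writes '\0' bytes into the buffer,
 * so the line is modified. */
size_t count_words(char *line)
{
    const char *delims = " ,.!?:;()\r\n";
    size_t n = 0;
    for (char *w = strtok(line, delims); w != NULL; w = strtok(NULL, delims))
        n++;
    return n;
}

/* Adds up the word counts of every line of an already opened file. */
size_t count_words_in_file(FILE *fp)
{
    char linha[10000];
    size_t total = 0;
    /* fgets returns NULL at end of file, which ends the loop. */
    while (fgets(linha, sizeof linha, fp) != NULL)
        total += count_words(linha);
    return total;
}
```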

Segmentation fault when calling fgets to read lines from file

I'm getting a seg fault after calling fgets about 20 times. I'm opening a file (fopen does not return NULL). It is in the format:
num1: value1
num2: value2
num3: value3
then reading lines from the file, storing values into an array, using nums as positions. Here is the code that seg faults:
edit: declaring myArray and line:
char myArray[3000];
char * line;
char * word;
line = (char *) malloc(100);
word = (char *) malloc(16);
while(fgets(line, 99, file)) {
printf("%s\n", line);
word = strtok(line, " :");
name = (int) strtol(word, NULL, 16);
word = strtok(NULL, " \n");
myArray[name] = word;
}
You'll notice I print out the line immediately after getting it. The file has 26 lines, but it only prints 23 lines and then seg faults. Now, is it something I don't fully understand about fgets, or am I getting some syntax incorrect? I've tried allocating more memory to line, or more to word. I've also tried malloc-ing more memory after every call to strtok, but nothing seems to fix the seg fault.
The problem is the line myArray[name] = word; you're taking an array index from your input line and then setting the character at that position to the low bits of the address of your word... I doubt that's actually what you want to do.
There are some other problems with your code: you're leaking the memory from the line word = (char *) malloc(16); because strtok returns a pointer into the string you initially pass it. You don't actually need to malloc anything for the code as written in the question, so you could have:
char myArray[3000];
char line[100];
char *word = NULL;
word needs to be a pointer since it's holding the result of strtok()
You clearly don't understand pointers, you need to review that before you can understand why your code isn't working the way you expect.
If you say what your code is actually meant to be doing I can give you some hints on how to fix it, but at the moment I can't quite tell what the intended result is.
EDIT: Did you intend to read in your numbers in hexadecimal? The last argument to strtol() is the base to be used for conversion... you could also just use atoi()
so your loop could look like:
char myArray[3000];
char line[100];
char *word = NULL;
while(fgets(line, 100, file)) {
printf("%s\n", line);
word = strtok(line, " :");
if(word == NULL) continue;
name = atoi(word); /* only if you didn't actually want hexadecimal */
word = strtok(NULL, " \n");
if(word == NULL) continue;
if(name > 0 && name < 3000) { /* as I said in a comment below */
strncpy(myArray + name, word, 3000 - name);
}
}
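If the intent was to keep each value as a string under its number, one possible shape is a table of fixed-size strings. This is only a sketch under assumptions the question does not confirm: the numbers are decimal (the question passed base 16 to strtol), values contain no spaces and fit in 15 characters, and store_entry, MAX_ENTRIES and MAX_VALUE are names invented here.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 3000   /* same bound as myArray in the question */
#define MAX_VALUE   16     /* assumed maximum value length, including '\0' */

/* Parses a line such as "12: value12\n" and copies the value into
 * table[12]. Returns the index on success, -1 on a malformed line or an
 * index outside the table. */
int store_entry(const char *line, char table[][MAX_VALUE])
{
    int index;
    char value[MAX_VALUE];

    /* %d reads the number, ':' must match literally, and %15s reads the
     * value without overflowing the 16-byte buffer. */
    if (sscanf(line, "%d : %15s", &index, value) != 2)
        return -1;
    if (index < 0 || index >= MAX_ENTRIES)
        return -1;

    strcpy(table[index], value);   /* safe: value holds at most 15 chars */
    return index;
}
```

The bounds check on index is the part that prevents the out-of-range write; the table itself would be declared as char table[MAX_ENTRIES][MAX_VALUE], preferably static or global because it takes about 48 KB.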

Read Txt file Language C

Hi guys I have this file struct:
0
2 4
0: 1(ab) 5(b)
1: 2(b) 6(a)
2: 0(a) 2(b)
3: 2(a) 6(b)
4: 5(ab)
5: 2(a) 6(b)
6: 4(b) 6(ab)
Each line will feed a struct with its data (numbers + letters).
What's the best way to read the line and get the strings I want?
Example:
0
2 4
0,1,ab,5,b
1,2,b,6,a
...
The lines may vary in size because we can have 1, 2, 3, .... numbers.
I already did it :
//struct
#define MAX_ 20
struct otherstats{ //struct otherStats
int conectstat[MAX_];//conection with others stats
int transitions[MAX_];//Symbols betwen conection ASCI
}tableStats[MAX_];
struct sAutomate{
int stat_initial; //initial
int stats_finals[MAX_]; //final orfinals
struct otherstats tableStats[MAX_]; //otherStats 0 1 2 3 4 5 6
};
/* eXample that what i want ..using the example
sAutomate.stat_initial=0
sAutomate.stats_finals[0]=2
sAutomate.stats_finals[1]=4
Others Stats table
//0
sAutomate.tableStats[0].conectstat[0]=1;
sAutomate.tableStats[0].conectstat[1]=5;
sAutomate.tableStats[0].transitions[0]=ab;
sAutomate.tableStats[0].transitions[1]=b;
//1
sAutomate.tableStats[1].conectstat[0]=2;
sAutomate.tableStats[1].conectstat[1]=6;
sAutomate.tableStats[1].transitions[0]=b;
sAutomate.tableStats[1].transitions[1]=a;
///etc
*/
void scanfile(){ //function to read the file
struct sAutomate st; //initialize st struct
char filename[] = "txe.txt";
FILE *file = fopen ( filename, "r" );
char buf[81];
char parts[5][11];
fscanf(file,"%d", &st.stat_initial);//read first line
printf("initial state : %d \n", st.stat_initial);
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
while (fgets(buf, sizeof(buf), stdin) != NULL)
{
if (sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
parts[0], parts[1], parts[2], parts[3], parts[4]) == 5)
{
printf("parts: %s, %s, %s, %s, %s\n",
parts[0], parts[1], parts[2], parts[3], parts[4]);
}
else
{
printf("Invalid input: %s", buf);
}
}
//fclose
First problem I see is you're overwriting stats_finals:
fscanf(file,"%d",&st.stats_finals);
fscanf(file,"%d",&st.stats_finals);
What you wanted to do here was:
fscanf(file,"%d",&st.stats_finals[0]);
fscanf(file,"%d",&st.stats_finals[1]);
To save off both the "2" and the "4" from the text file.
Second major problem is you're reading from stdin:
while (fgets(buf, sizeof(buf), stdin) != NULL)
That doesn't read your text file, that reads input from the keyboard... So you wanted that to be:
while (fgets(buf, sizeof(buf), file) != NULL)
Third (minor) problem is that fscanf() will not read newlines, and fgets() will. This means when you go from reading your second stats_finals to the first read in the while loop, your first input will just be the left over newline character. That's not a big deal since you check for "invalid input", but it's worth noting.
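One way to avoid that stray first read is to discard the rest of the current line before switching from fscanf to fgets. A small sketch, with skip_rest_of_line being a name chosen for this example:

```c
#include <stdio.h>

/* Discards everything up to and including the next '\n', or up to EOF if
 * the file ends first. */
void skip_rest_of_line(FILE *fp)
{
    int ch;   /* int, not char, so that EOF can be told apart from data */
    while ((ch = getc(fp)) != EOF && ch != '\n')
        ;
}
```

Calling skip_rest_of_line(file) once after the last fscanf means the first fgets in the while loop returns the "0: ..." line instead of a lone newline.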
Finally, your sscanf looks wrong to me:
sscanf(buf, "%10[^:]: (%10[^(], %10[^)]), (%10[^(], %10[^)])",
Two things stand out. First, %10 is a field width of 10; I don't think that's what you wanted. Second, why are you checking for commas? You don't have any in your text file.
I think this is more what you were looking for:
sscanf(buf, "%[0-9]: %[0-9](%[^)]) %[0-9](%[^)])",
Here %[0-9] takes a run of digits (0 to 9).
EDIT
Missed your original point. If you don't know how long the strings will be that you're reading, you can't use sscanf(). It's that simple. :)
The scanf family assumes you know how many objects you'll be parsing and the format string takes in that many. There are other options however.
Read a single line with fgets as you're doing, but then you can tokenize it. Either with the C function strtok or by your own hand with a for loop.
One note however:
Since you don't know how long it is, this: char parts[5][11]; is not your best bet. This limits you to 2 entries... probably it would be better to do this dynamically (read the line then allocate the correct size to store your tokens in.)
If you really don't know how many numbers and letters the line will contain, why are you reading a fixed amount of numbers and letters?
You could read the whole line with fgets and then parse it with a tokenizer like strtok, something like this:
const char* const DELIMITERS = " ";
int i; // index for tableStats
char* token;
token = strtok(line, DELIMITERS);
// first integer
if (token == NULL || sscanf(token, "%d:", &i) < 1) {
    // error
}
/* it seems like you should have at least one element in your "list",
 * otherwise this is not necessary
 */
token = strtok(NULL, DELIMITERS);
// NOTE: %[^)] stores a string, so this assumes transitions is declared as
// an array of char buffers (e.g. char transitions[MAX_][11]), not as the
// int array shown in the question.
if (token == NULL || sscanf(token, "%d(%10[^)])",
        &(tableStats[i].conectstat[0]),
        tableStats[i].transitions[0]) < 2) {
    // error
}
// read optional part; two conversions are expected per token
for (int j = 1; j < MAX_ && (token = strtok(NULL, DELIMITERS)) != NULL; ++j)
    if (sscanf(token, "%d(%10[^)])", &(tableStats[i].conectstat[j]),
            tableStats[i].transitions[j]) < 2)
        break;
Remember that strtok changes the string, so make a copy of it if you still need it.
Obviously, this code is for the arbitrarily long lines; reading the first two lines is trivial.
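For what it's worth, sscanf alone can also handle a variable number of entries if it is called in a loop and %n is used to find out how far each call got. The sketch below is one way to do that, not the approach from either answer. struct transition, parse_state_line, MAX_TRANS and MAX_SYMS are names invented here, and the symbols are stored as a short string rather than in the int array from the question.

```c
#include <stdio.h>

#define MAX_TRANS 20   /* assumed upper bound on entries per line */
#define MAX_SYMS  11   /* up to 10 symbol characters plus '\0' */

/* Hypothetical type: one "target(symbols)" entry of a state line. */
struct transition {
    int target;
    char symbols[MAX_SYMS];
};

/* Parses a line such as "0: 1(ab) 5(b)\n". Stores the state number in
 * *state and the entries in out[]. Returns the number of entries read, or
 * -1 if the line does not start with "<number>:". Parsing stops silently
 * at the first entry that does not match. */
int parse_state_line(const char *line, int *state,
                     struct transition out[MAX_TRANS])
{
    int used = 0;
    /* %n stores how many characters were consumed so far. It stays 0 if
     * the ':' was not matched, because sscanf stops before reaching %n. */
    if (sscanf(line, "%d :%n", state, &used) != 1 || used == 0)
        return -1;

    const char *p = line + used;
    int count = 0;
    while (count < MAX_TRANS) {
        int n = 0;
        if (sscanf(p, "%d(%10[^)])%n",
                   &out[count].target, out[count].symbols, &n) != 2 || n == 0)
            break;   /* end of line, or a malformed entry */
        p += n;      /* continue right after the closing ')' */
        count++;
    }
    return count;
}
```

Unlike strtok, this leaves the line untouched, so no copy is needed.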
