I'm trying to read a file and use fscanf to store some values in an array. However, some lines in the file start with '#' (e.g. #this is just a command), and I want to skip them. How should I do that? The lines that start with # can appear anywhere in the file. Here is some of my code:
//do line counts of how many lines contain parameters
while(!EOF) {
fgets(lines, 90, hi->agentFile);
count++;
if (lines[0] == '#') {
count--;
}
}
//mallocing an array of struct.
agentInfo* array = malloc(count*sizeof(agentInfo));
for (i = 0; i < count; i++) {
fscanf(hi->agentFile,"%d %d %c %s %c",&array[i].r,&array[i].c,
&array[i].agent_name,&array[i].function[80],
&array[i].func_par);
So I need to add something to skip lines that start with '#'. How?
Your EOF test is wrong. You also need to rewind the file between the fgets() loop and the fscanf() loop. And you need to replace the fscanf() loop with a second fgets() loop using sscanf() to read the data. Or you need to allocate the memory as you go while reading the file once. Let's leave that for later, though:
// fgets() returns NULL (not EOF) at end of file or on error
while (fgets(lines, sizeof(lines), hi->agentFile) != NULL)
{
    if (lines[0] != '#')
        count++;
}
agentInfo *array = malloc(count*sizeof(agentInfo));
if (array != 0)
{
    int i = 0;
    rewind(hi->agentFile);
    while (i < count && fgets(lines, sizeof(lines), hi->agentFile) != NULL)
    {
        if (lines[0] != '#')
        {
            // %s needs the start of the buffer, so pass function itself
            // (assuming it is a char array), not &function[80]
            if (sscanf(lines, "%d %d %c %s %c",&array[i].r,&array[i].c,
                       &array[i].agent_name,array[i].function,
                       &array[i].func_par) != 5)
                ...format error in non-comment line...
            i++; // advance only for non-comment lines
        }
    }
    assert(i == count); // else someone changed the file, or ...
}
Note that this checks for a memory allocation error and for format errors for non-comment lines.
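The other option mentioned above, allocating the memory as you go while reading the file once, could look roughly like the sketch below. The agentInfo layout, the function name read_agents and the buffer sizes are assumptions made for illustration, since the question does not show them; blank lines are treated as format errors here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record layout; the question does not show agentInfo. */
typedef struct {
    int r, c;
    char agent_name;
    char function[81];
    char func_par;
} agentInfo;

/* Reads every non-comment line of fp into a growing array (single pass).
   On success returns 0 and stores the array and its length in *out and
   *out_count; the caller frees *out. Returns -1 on allocation failure or
   on a malformed non-comment line. */
int read_agents(FILE *fp, agentInfo **out, size_t *out_count)
{
    char line[256];              /* assumed maximum line length */
    agentInfo *array = NULL;
    size_t count = 0, capacity = 0;

    *out = NULL;
    *out_count = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (line[0] == '#')
            continue;            /* skip comment lines */
        if (count == capacity) { /* grow the array geometrically */
            size_t new_cap = capacity ? capacity * 2 : 8;
            agentInfo *tmp = realloc(array, new_cap * sizeof *tmp);
            if (tmp == NULL) {
                free(array);
                return -1;
            }
            array = tmp;
            capacity = new_cap;
        }
        /* %80s limits the string to the size of function[] minus one */
        if (sscanf(line, "%d %d %c %80s %c", &array[count].r, &array[count].c,
                   &array[count].agent_name, array[count].function,
                   &array[count].func_par) != 5) {
            free(array);
            return -1;
        }
        count++;
    }
    *out = array;
    *out_count = count;
    return 0;
}
```

With this approach there is no need to count lines first or to rewind the file.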
I have a text file "123.txt" with this content:
123456789
I want the output to be:
123
456
789
This means, a newline character must be inserted after every 3 characters.
void convert1 (){
FILE *fp, *fq;
int i,c = 0;
fp = fopen("~/123.txt","r");
fq = fopen("~/file2.txt","w");
if(fp == NULL)
printf("Error in opening 123.txt");
if(fq == NULL)
printf("Error in opening file2.txt");
while (!feof(fp)){
for (i=0; i<3; i++){
c = fgetc(fp);
if(c == 10)
i=3;
fprintf(fq, "%c", c);
}
if(i==4)
break;
fprintf (fq, "\n");
}
fclose(fp);
fclose(fq);
}
My code works fine, but prints a newline character also at the end of file, which is not desired. This means, a newline character is added after 789 in the above example. How can I prevent my program from adding a spurious newline character at the end of the output file?
As indicated in the comments, your while loop is not correct. Try replacing your while loop with the following code:
i = 0;
while(1)
{
// Read a character and stop if reading fails.
c = fgetc(fp);
if(feof(fp))
break;
// When a line ends, start counting over (similar to what you did).
if(c == '\n')
i = -1;
// Just before a "fourth" character is written, write an additional newline character.
// This solves your main problem of a newline character at the end of the file.
if(i == 3)
{
fprintf(fq, "\n");
i = 0;
}
// Write the character that was read and count it.
fprintf(fq, "%c", c);
i++;
}
Example: A file containing:
12345
123456789
is turned into a file containing:
123
45
123
456
789
I think you should write your newline at the beginning of the loop:
i = 0;
// fgetc returns EOF at end of file, so test its return value directly
// (c must be an int for this comparison to work)
while ((c = fgetc(fp)) != EOF)
{
    // i % 3 == 0 means i is a multiple of 3, so this is true before every
    // third character; there is no need to reset i, but the first
    // iteration (i == 0) has to be ignored
    if (i % 3 == 0 && i != 0)
    {
        fprintf(fq, "\n");
    }
    // Write the character
    fprintf(fq, "%c", c);
    // and increase i
    i++;
}
I can't test it right now, so there may be some mistakes, but you see what I mean.
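Both answers rely on the same idea: write the separator before the next group starts rather than after each group ends. A minimal sketch of that idea as a reusable function follows; the name split_groups and the group parameter are illustrative and not taken from the question.

```c
#include <stdio.h>

/* Copies in to out, writing '\n' just before each character that would
   overflow a group of `group` characters. Because the separator is written
   before a character rather than after a group, nothing is appended after
   the final group. Newlines already present in the input restart the count. */
void split_groups(FILE *in, FILE *out, int group)
{
    int c;       /* int so that EOF can be distinguished from data */
    int n = 0;   /* characters written on the current output line */

    while ((c = fgetc(in)) != EOF) {
        if (c == '\n') {
            fputc(c, out);
            n = 0;
            continue;
        }
        if (n == group) {
            fputc('\n', out);
            n = 0;
        }
        fputc(c, out);
        n++;
    }
}
```

Called with group set to 3 on a file containing 123456789, this writes 123, 456 and 789 on separate lines with no newline after 789.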
I have a matrix G in MATLAB that I have printed into a text file using:
file = fopen('G.dat','w');
fprintf(file, [repmat('%f\t', 1, size(G, 2)) '\n'], G');
fclose(file);
The dimension of this matrix is 100 x 500. If I count rows and columns using awk, for instance, using
cat G.dat | awk '{print NF}END{print NR}'
I see that the dimensions correspond to the original one.
Now, I want to read this file, G.dat, from a C program that counts the columns of the first row just to understand the columns' dimension as in:
while (!feof(file) && (fscanf(file, "%lf%c", &k, &c) == 2) ) {
Ng++;
if (c == '\n')
break;
}
Unfortunately, it gives me Ng = 50000 and never recognizes any of the '\n' characters.
If I instead create the text file by copying and pasting the data, it works. Can you explain why? Thanks!
Are you working in Windows? Try opening your output file in text mode:
file = fopen('G.dat','wt');
This will automatically insert a carriage return before each newline when writing to the file.
The code's approach is too fragile to count the columns of the first row. fscanf(file, "%lf%c", ...) is too sensitive to variations in white-space delimiters and end-of-line sequences to detect the '\n' reliably.
I recommend examining the white space explicitly to determine the width:
// return 0 on success, 1 on error
int GetWidth(FILE *file, size_t *width) {
*width = 0;
for (;;) {
int ch;
while (isspace(ch = fgetc(file))) {
if (ch == '\n') return 0;
}
if (ch == EOF) return 0;
ungetc(ch, file);
double d;
if (fscanf(file, "%lf", &d) != 1) {
return 1; // unexpected non convertible text
}
(*width)++;
}
}
//Sample, usage
size_t width;
if (GetWidth(file, &width)) return 1;
// read entire file
rewind(file);
for (size_t line = 0; foo(); line++) {
for (size_t column = 0; column<width; column++) {
double d;
if (fscanf(file, "%lf", &d) != 1) {
break; // EOF, unexpected non convertible text or input error
}
}
...
}
Matlab writes rows as
%f\t%f ... %f\t\n
which is a problem. I have used
dlmwrite('G.dat', G, '\t');
and it is fine!
If I understand the MATLAB syntax correctly, this expands to a format string like %f\t%f\t%f\t\n%f\t%f\t%f\t\n for a 3x2 matrix. Note the extra \t at the end of each line. If this assumption is correct, the last fscanf() call in a line assigns that trailing \t to c, so the test c == '\n' never succeeds. The next fscanf() call then silently skips the \n, because %lf discards leading white space.
I'd propose using fgets() to read each line and then looping over the fields with strtok(), reading the values with atof(), e.g.
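char buf[8192];
if (fgets(buf, sizeof buf, file))
{
    /* strtok() takes a string of delimiter characters, not a single char;
       '\n' is included so the trailing "\t\n" does not count as a field */
    if (strtok(buf, "\t\n"))
    {
        ++Ng;
        while (strtok(0, "\t\n")) ++Ng;
    }
}
else
{
    /* error reading ... */
}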
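As a self-contained variant of the fgets()/strtok() idea, the sketch below counts the fields on one line and also tolerates a trailing tab and a Windows-style "\r\n" line ending. The function name count_columns is made up for illustration, and the sketch assumes a whole row fits in the 8192-byte buffer, which is not guaranteed for very wide matrices.

```c
#include <stdio.h>
#include <string.h>

/* Returns the number of white-space separated fields on the next line of
   fp, or -1 if no line could be read. Assumes the line fits in buf. */
int count_columns(FILE *fp)
{
    char buf[8192];
    int n = 0;

    if (fgets(buf, sizeof buf, fp) == NULL)
        return -1;
    /* Treat space, tab, CR and LF all as delimiters, so a trailing "\t\n"
       or "\t\r\n" does not produce an extra (empty) field. */
    for (char *tok = strtok(buf, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n"))
        n++;
    return n;
}
```

To convert the fields as well, each token could be passed to strtod() inside the loop.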
I have a file in which each line contains either two numbers separated by a variable amount of whitespace, or nothing at all (a blank line). I need to know when the input is just a blank line, and in that case not assign to those variables using fscanf.
Doing:
FILE *pFile = fopen (my_file, "r");
if (pFile == NULL) perror ("Error opening file");
int succ = 1, num1 = 0, num2 = 0;
while (succ != EOF)
{
succ = fscanf(pFile, "%d %d", &num1, &num2);
}
Works very well for detecting all of the numbers properly, but not the newline.
I tried:
fscanf(pFile, "%d %d %*[^\n]", &num1, &num2);
But it always properly assigns to both numbers. I want to be able to write a switch statement that does other logic based on whether succ is 2 (indicating both numbers were assigned to) or 0 (indicating a blank line).
I'd prefer avoiding getline if possible; seems like that would be inefficient to use iostream stuff mixed with stdio functions.
I tried this guy's suggestion, but it didn't work out, although logically I thought it would.
Honestly, I don't even understand why something like
"%d %d \n"
Wouldn't work. It's simple... scan the file until a newline, return back how many assignments were done, that's it!
"%d %d \n" will not achieve OP's goals, because any white-space directive consumes any white space. Using ' ' or '\n' makes no difference: both consume zero or more white-space characters (except inside '[]' specifiers).
OP wants to detect a '\n', and fscanf() is not a good choice for that. fgets() is better.
#define N (100 /* longest line */)
char buf[N];
while (fgets(buf, sizeof buf, pFile) != NULL) {
    int cnt = sscanf(buf, "%d%d", &num1, &num2);
    switch (cnt) {
        case EOF: // blank line: input ran out before any conversion
        case 0: Handle_NoNumbers(); break;
        case 1: Handle_1Number(); break;
        case 2: Handle_Success(); break;
    }
}
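One detail worth spelling out: when a line holds only white space, sscanf() returns EOF rather than 0, because the input ends before the first conversion can be attempted. A small sketch of a helper that folds both cases together is shown below; the name parse_pair is hypothetical.

```c
#include <stdio.h>

/* Returns how many integers (0, 1 or 2) were parsed from line.
   sscanf() reports EOF, not 0, for a line containing only white space,
   so that case is mapped to 0 here. */
int parse_pair(const char *line, int *a, int *b)
{
    int n = sscanf(line, "%d%d", a, b);
    return n == EOF ? 0 : n;
}
```

The result can then be used directly in a switch with cases 0, 1 and 2.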
The '%d' in fscanf(stream, "%d", &num) specifies to consume all leading white-space before a number without regard to '\n' or ' ', etc.
If code must use fscanf(), then code needs to consume leading white-space before calling fscanf(... "%d") in a way to distinguish '\n' from ' '.
// Consume white space except \n and EOF
int consume_ws(FILE *pFile) {
    int c; // declared outside the loop so the while condition can see it
    do {
        c = fgetc(pFile); // could use fscanf(...%c...) here
        if (c == '\n' || c == EOF) return c;
    } while (isspace(c));
    ungetc(c, pFile); // put the char back
    return 0;
}
...
while(1) {
int num[2];
int i;
for (i = 0; i < 2; i++) {
if (consume_ws(pFile)) {
Handle_ScantData(); // Not enough data
return;
}
if (1 != fscanf(pFile, "%d", &num[i])) {
Handle_NonnumericData();
return;
}
}
int ch = consume_ws(pFile);
if (ch == 0) Handle_ExtraData();
// Else do something with the 2 numbers
if (ch == EOF) return;
}
I read in a temp variable from a file, this is one word, e.g. "and", however, when I extract the first character, e.g. temp[1], the program crashes when running, I have tried break points, and it is on this line.
This is what happens when I run the code: http://prntscr.com/2vzkmp
These are the words when I don't try to extract a letter: http://prntscr.com/2vzktn
This is the error when I use breakpoints: http://prntscr.com/2vzlr3
This is the line that is messing up: " printf("\n%s \n",temp[0]);"
Here is the code:
int main(void)
{
char **dictmat;
char temp[100];
int i = 0, comp, file, found = 0, j = 0, foundmiss = 0;
FILE* input;
dictmat = ReadDict();
/*opens the text file*/
input = fopen("y:\\textfile.txt", "r");
/*checks if we can open the file, otherwise output error message*/
if (input == NULL)
{
printf("Could not open textfile.txt for reading \n");
}
else
{
/*allocates the memory location to the rows using a for loop*/
do
{
/*temp_line is now the contents of the line in the file*/
file = fscanf(input, "%s", temp);
if (file != EOF)
{
lowercase_remove_punct(temp, temp);
for (i = 0; i < 1000; i++)
{
comp = strcmp(temp, dictmat[i]);
if (comp == 0)
{
/*it has found the word in the dictionary*/
found = 1;
}
}
/*it has not found a word in the dictionay, so the word must be misspelt*/
if (found == 0 && (strcmp(temp, "") !=0))
{
/*temp is the variable that is misspelt*/
printf("\n%s \n",temp[0]);
/*checks for a difference of one letter*/
//one_let(temp);
}
found = 0;
foundmiss = 0;
}
} while (file != EOF);
/*closes the file*/
fclose(input);
}
free_matrix(dictmat);
return 0;
}
When printing a character, use %c, not %s.
There is a fundamental difference between the two. The latter is for strings.
When printf encounters a %c it inserts one byte in ASCII format into the output stream from the variable specified.
When it sees a %s it will interpret the variable as a character pointer, and start copying bytes in ASCII format from the address specified in the variable, until it encounters a byte that contains zero.
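A minimal sketch of the difference, using snprintf() so the result can be inspected; the helper name describe is made up for illustration:

```c
#include <stdio.h>

/* Formats the first character of word with %c and the whole word with %s. */
void describe(const char *word, char *out, size_t size)
{
    /* word[0] is a single char, so it goes with %c.
       word is a pointer to a zero-terminated string, so it goes with %s.
       Passing word[0] to %s would make printf treat the character's
       numeric value as an address, which is undefined behavior. */
    snprintf(out, size, "%c|%s", word[0], word);
}
```

For the word "and", this produces a|and.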
Print a char, not a string:
printf("\n%c \n",temp[0]);
temp[0] is a character. Thus if you are using
printf("\n%s \n",temp[0]);
printf will treat the value of temp[0] as an address and try to print a string from there. That location is probably not accessible, so the program crashes.
So change it to
printf("\n%c \n",temp[0]);
Why are you using %s as the conversion specifier? Use %c.
EDIT:
complete code with main is here http://codepad.org/79aLzj2H
and once again, this is where the weird behaviour is happening
for (i = 0; i<tab_size; i++)
{
//CORRECT OUTPUT
printf("%s\n", tableau[i].capitale);
printf("%s\n", tableau[i].pays);
printf("%s\n", tableau[i].commentaire);
//WRONG OUTPUT
//printf("%s --- %s --- %s |\n", tableau[i].capitale, tableau[i].pays, tableau[i].commentaire);
}
I have an array of the following structure
struct T_info
{
char capitale[255];
char pays[255];
char commentaire[255];
};
struct T_info *tableau;
This is how the array is populated
int advance(FILE *f)
{
char c;
c = getc(f);
if(c == '\n')
return 0;
while(c != EOF && (c == ' ' || c == '\t'))
{
c = getc(f);
}
return fseek(f, -1, SEEK_CUR);
}
int get_word(FILE *f, char * buffer)
{
char c;
int count = 0;
int space = 0;
while((c = getc(f)) != EOF)
{
if (c == '\n')
{
buffer[count] = '\0';
return -2;
}
if ((c == ' ' || c == '\t') && space < 1)
{
buffer[count] = c;
count ++;
space++;
}
else
{
if (c != ' ' && c != '\t')
{
buffer[count] = c;
count ++;
space = 0;
}
else /* more than one space*/
{
advance(f);
break;
}
}
}
buffer[count] = '\0';
if(c == EOF)
return -1;
return count;
}
void fill_table(FILE *f,struct T_info *tab)
{
int line = 0, column = 0;
fseek(f, 0, SEEK_SET);
char buffer[MAX_LINE];
char c;
int res;
int i = 0;
while((res = get_word(f, buffer)) != -999)
{
switch(column)
{
case 0:
strcpy(tab[line].capitale, buffer);
column++;
break;
case 1:
strcpy(tab[line].pays, buffer);
column++;
break;
default:
strcpy(tab[line].commentaire, buffer);
column++;
break;
}
/*if I printf each one alone here, everything works ok*/
//last word in line
if (res == -2)
{
if (column == 2)
{
strcpy(tab[line].commentaire, " ");
}
//wrong output here
printf("%s -- %s -- %s\n", tab[line].capitale, tab[line].pays, tab[line].commentaire);
column = 0;
line++;
continue;
}
column = column % 3;
if (column == 0)
{
line++;
}
/*EOF reached*/
if(res == -1)
return;
}
return ;
}
Edit :
trying this
printf("%s -- ", tab[line].capitale);
printf("%s --", tab[line].pays);
printf("%s --\n", tab[line].commentaire);
gives me as result
-- --abi -- Emirats arabes unis
I expect to get
Abu Dhabi -- Emirats arabes unis --
Am I missing something?
Does printf have side effects?
Well, it prints to the screen. That's a side effect. Other than that: no.
is printf changing its parameters
No
I get wrong results [...] what is going on?
If by wrong results you mean that the output does not appear when it should, this is probably just a line buffering issue (your second version does not print newline which may cause the output to not be flushed).
It's highly unlikely that printf is your problem. What is far, far more likely is that you're corrupting memory and your strange results from printf are just a symptom.
There are several places I see in your code which might result in reading or writing past the end of an array. It's hard to say which of them might be causing you problems without seeing your input, but here are a few that I noticed:
get_lines_count won't count the last line if it doesn't end in a newline, but your other methods will process that line
advance will skip over a newline if it is preceded by spaces, which will cause your column-based processing to get off, and could result in some of your strings being uninitialized
get_word doesn't do any bounds checks on buffer
There may be others, those were just the ones that popped out at me.
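To illustrate the bounds-check point, here is a sketch of a reader that never writes past the end of its buffer. It is not a drop-in replacement for get_word (it reads a whole line rather than splitting on runs of spaces), and the name read_field and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Reads characters up to a newline or EOF into buffer, storing at most
   size - 1 of them and always terminating the string (size must be >= 1).
   Characters that do not fit are read and discarded.
   Returns the number of characters stored, or -1 if EOF was reached
   before any character was read. */
int read_field(FILE *f, char *buffer, size_t size)
{
    size_t count = 0;
    int seen = 0;   /* did we read anything at all? */
    int c;          /* int, not char, so EOF is distinguishable */

    while ((c = getc(f)) != EOF && c != '\n') {
        seen = 1;
        if (count + 1 < size)        /* leave room for the terminator */
            buffer[count++] = (char)c;
    }
    buffer[count] = '\0';
    if (c == EOF && !seen)
        return -1;
    return (int)count;
}
```

Passing sizeof buffer at each call site keeps the limit in sync with the actual array, e.g. the 255-byte members of struct T_info.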
I tested your code, adding the missing parts (MAX_LINE constant, main function and a sample datafile with three columns separated by 2+ whitespace), and the code works as expected.
Perhaps the code you posted is still not complete (fill_table() looks for a -999 magic number from get_word(), but get_word() never returns that), your main function is missing, so we don't know if you are properly allocating memory, etc.
Unrelated but important: it is not recommended (and also not portable) to do relative movements with fseek in text files. You probably want to use ungetc instead in this case. If you really want to move the file pointer while reading a text stream, you should use fgetpos and fsetpos.
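As a rough sketch of the ungetc approach, a whitespace-skipping helper could push back the one character it read too far instead of seeking backwards. The name skip_blanks is illustrative and this is not a drop-in replacement for advance().

```c
#include <stdio.h>

/* Skips spaces and tabs, leaving the stream positioned at the first other
   character. Returns that character, or EOF if the input ended. */
int skip_blanks(FILE *f)
{
    int c;   /* int, not char, so EOF can be told apart from data */

    while ((c = getc(f)) == ' ' || c == '\t')
        ;    /* keep reading while we see blanks */
    if (c != EOF)
        ungetc(c, f);   /* one character of pushback is always guaranteed */
    return c;
}
```

Because the newline is pushed back rather than consumed, the caller still sees the end of the line on its next read.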
Your approach for getting help is very wrong. You assumed that printf had side effects without even understanding your code. The problem is clearly not in printf, but you held information unnecessarily. Your code is not complete. You should create a reduced testcase that compiles and displays your problem clearly, and include it in full in your question. Don't blame random library functions if you don't understand what is really wrong with your program. The problem can be anywhere.
From your comments, I am assuming that if you use these printf statements,
printf("%s\n", tableau[i].capitale);
printf("%s", tableau[i].pays);
printf("%s\n", tableau[i].commentaire);
then everything works fine...
So try replacing your single printf statement with this. (Line no. 173 in http://codepad.org/79aLzj2H)
printf("%s\n %s %s \n", tableau[i].capitale, tableau[i].pays, tableau[i].commentaire);