I am trying to read a binary file for its content.
It has two sets of lines for each component
Second last character on the first line indicates the type of the component file (^A for assembly and ^B for part)
If the type is ^A I need to parse the file specified in next line which starts with name^#
àtype^#^Aà
name^#assembly1
àtype^#^Aà
name^#assembly2
àtype^#^Bà
name^#apart1
àtype^#^Bà
name^#apart2
When I try to parse this file, I cannot read past the binary characters in it.
First line contains a binary character (à) so I get an empty line. Second line has ^# after name, so I only get 'name' and the len is 4.
This is my code snippet
FILE *fp;
char line[256];
fp = fopen(name, "rb");
fgets(line, 256, fp);
printf("line %s\n", line);
printf("len %zu\n\n", strlen(line));
fgets(line, 256, fp);
printf("line %s\n", line);
printf("len %zu\n\n", strlen(line));
This is the output
line
len 0
line name
len 4
My aim is to parse the type of component (^A or ^B) and then get the name of the component.
Please help in pointing out how to solve this.
fgets and the other string-oriented functions (strlen, printf with %s) work with text, not binary data.
The "character" ^# has, I think, the binary value 0, which messes up all the string handling functions.
You need to read character-by-character and/or not use string functions with objects containing embedded zero bytes.
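A minimal sketch of the character-by-character approach, assuming the layout described in the question: records end in '\n', ^A and ^B are the bytes 0x01 and 0x02, and the type is the second-to-last byte of the first record. The helper names read_record and component_type are hypothetical, not part of any library. The key point is that the length is tracked separately, so embedded zero bytes do not cut the record short.

```c
#include <stdio.h>
#include <stddef.h>

/* Read bytes up to (not including) the next '\n' into buf.
   Returns the number of bytes stored, or -1 if EOF is hit before
   any byte is read. Zero bytes are stored like any other byte,
   so the result is NOT a C string: use the returned length. */
long read_record(FILE *fp, unsigned char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (n < cap)
            buf[n++] = (unsigned char)c;   /* bytes beyond cap are dropped */
    }
    if (c == EOF && n == 0)
        return -1;
    return (long)n;
}

/* Second-to-last byte of a type record, or -1 if the record is too short. */
int component_type(const unsigned char *rec, long len)
{
    return len >= 2 ? rec[len - 2] : -1;
}
```

Open the file with "rb", call read_record twice per component, and compare the result of component_type against 0x01 or 0x02. If the separator after "name" really is a single byte, the component name would start at offset 5 of the second record and run to the returned length; that offset is an assumption based on the sample shown.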
I have a weird bug.
I wrote a function that gets a file and returns the length of each line:
void readFile1(char *argv[], int fileNumber,
char *array_of_lines[NUMBER_OF_LINES+1], int *currNumOfLines)
{
FILE* fp;
fp = fopen(argv[fileNumber], "r");
if (fp == NULL)
{
fprintf(stderr, MSG_ERROR_OPEN_FILE, argv[fileNumber]);
exit(EXIT_FAILURE);
}
char line[256];
while (fgets(line, sizeof(line), fp))
{
printf("\n line contains : %s size is : %lu\n",line,strlen(line));
}
}
The function always prints the right number + 2,
For example if file.txt contains only one line with "AAAAA" the function would print that the length is 7 instead of 5,
line contains : AAAAA
size is : 7
Does someone know where the bug is?
Don't forget that fgets leaves the newline in the buffer.
It seems you're reading a file created in Windows (where a newline is the two characters "\r\n") on a system where newline is only "\n". Those two characters are also part of the string and will be counted by strlen.
I'm guessing you're reading a Windows-created file on a non-Windows system because, for files opened in text mode (the default), the functions that read and write strings translate newlines from and to the operating-system-dependent format.
For example on Windows, when writing plain "\n" it will be translated and actually written as "\r\n". When reading the opposite translation happens ("\r\n" becomes "\n").
On a system with plain "\n" line endings (like Linux or macOS), no translation is needed, and the "\r" part will be treated as any other character.
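One common way to make the length independent of the line-ending style is to cut the string at the first '\r' or '\n' before measuring it. This is a small sketch; the helper name chomp is hypothetical, and it assumes a '\r' never appears in the middle of a line you care about.

```c
#include <string.h>

/* Remove a trailing "\n" or "\r\n" in place and return the new length.
   strcspn gives the index of the first '\r' or '\n', or the full
   length if neither is present. */
size_t chomp(char *line)
{
    size_t len = strcspn(line, "\r\n");
    line[len] = '\0';
    return len;
}
```

Calling chomp(line) right after fgets would report 5 for a line holding "AAAAA\r\n".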
printf("\n line contains : %s size is : %lu\n",line,strlen(line));
Here's a giveaway
Obtained output
line contains : AAAAA
size is : 7
Expected output
line contains : AAAAA size is : 7
I am reading a raw text file into a character array and then I want to split the data line by line based on "\n". My code is attached, but I get very strange output.
INPUT FILE is a .txt file created with VIM on Windows, and it looks like:
london
manchester
britain
...
CODE (ignoring some var declarations):
....
char * buffer = 0;
long length;
fl = fopen ("file.txt", "r");
if (fl){
fseek (fl, 0, SEEK_END);
length = ftell (fl);
fseek (fl, 0, SEEK_SET);
buffer = malloc (length);
if (buffer){
fread (buffer, 1, length, fl);
}
fclose (fl);
printf(buffer);
}else{
printf("data file not found");
return -1;
}
char str[80] = "london\nmanchester\nbritain";
char* entity = strtok(buffer, "\n"); //LINE-A, replacing 'buffer' with 'str' the output is correct.
while (entity != NULL) {
printf("%s\n", entity); //this prints strange output as shown below
entity = strtok(NULL, "\n");
}
....
OUTPUT:
ondon
anchester
ritain
The first character is always missing.
However, if I replace "buffer" with "str" the declared character array with the same content as the file everything works as expected.
I do not understand why I am getting this error. Any advice please.
Many thanks!
This may be related to the fact that you aren't terminating your string buffer. I'm unable to produce your exact problem but I do get unexpected extra characters in the output. Amend the relevant lines of your code as follows and see if that helps:
buffer = malloc(length + 1);
if (buffer) {
fread(buffer, sizeof(char), length, fl);
buffer[length] = '\0';
}
fclose(fl);
If that doesn't solve your problem, you might have some weird (invisible) control characters in your text file. Try creating a brand new one and see if the problem still occurs.
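For reference, here is a sketch of the whole read wrapped in one function. The name slurp is hypothetical. It terminates the buffer at the number of bytes fread actually returned rather than at the size reported by ftell, because in text mode on Windows the two can differ once "\r\n" pairs are translated.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire open stream into a freshly allocated,
   NUL-terminated buffer. Returns NULL on failure; the caller
   is responsible for calling free() on the result. */
char *slurp(FILE *fp)
{
    long size;
    char *buf;
    size_t got;

    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0)
        return NULL;
    rewind(fp);

    buf = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (buf == NULL)
        return NULL;

    got = fread(buf, 1, (size_t)size, fp);
    buf[got] = '\0';                  /* terminate at what was actually read */
    return buf;
}
```

The result can then be passed to strtok. Using "\r\n" as the delimiter set instead of "\n" also discards any stray carriage returns.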
Okay, so after reading both: How to read a specific line in a text file in C (integers) and What is the easiest way to count the newlines in an ASCII file? I figured that I could use the points mentioned in both to both efficiently and quickly read a single line from a file.
Here's the code I have:
char buf[BUFSIZ];
intmax_t lines = 2; // when set to zero, reads two extra lines.
FILE *fp = fopen(filename, "r");
while ((fscanf(fp, "%*[^\n]"), fscanf(fp, "%*c")) != EOF)
{
/* globals.lines_to_feed__queue is the line that we _do_ want to print,
that is we want to ignore all lines up to that point:
feeding them into "nothingness" */
if (lines == globals.lines_to_feed__queue)
{
fgets(buf, sizeof buf, fp);
}
++lines;
}
fprintf(stdout, "%s", buf);
fclose(fp);
Now the above code works wonderfully, and I'm extremely pleased with myself for figuring out that you can fscanf a file up to a certain point, and then use fgets to read whatever data is at said point into a buffer, instead of having to fgets every single line and then fprintf the buf, when all I care about is the line that I'm printing: I don't want to be storing strings that I couldn't care less about in a buffer that I'm only going to use once for a single line.
However, the only issue I've run into, as noted by the // when set to zero, reads two extra lines comment: when lines is initialized with a value of 0, and the line I want is like 200, the line I'll get will actually be line 202. Could someone please explain what I'm doing wrong here/why this is happening and whether my quick fix lines = 2; is fine or if it is insufficient (as in, is something really wrong going on here, and it just happens to work?)
There are two reasons why you have to set the lines to 2, and both can be derived from the special case where you want the first line.
On one hand, in the while loop the first thing you do is use fscanf to consume a line, then you check if the lines counter matches the line you want. The thing is that if the line you want is the one you just consumed you are out of luck. On the other hand you are basically moving through lines by finding the next \n and incrementing lines after you check if the current line is the one you're after.
These two factors combined cause the offset in the lines count, so the following is a version of the same function taking them into account. Additionally it also contains a break; statement once you get to the line you are looking for, so that the while loop stops looking further into the file.
void read_and_print_line(char * filename, int line) {
char buf[BUFFERSIZE];
int lines = 0;
int found = 0;
FILE *fp = fopen(filename, "r");
if (fp == NULL)
return;
do
{
if (++lines == line) {
/* fgets returns NULL if the file ends before the requested line */
found = fgets(buf, sizeof buf, fp) != NULL;
break;
}
}while((fscanf(fp, "%*[^\n]"), fscanf(fp, "%*c")) != EOF);
if(found)
printf("%s", buf);
fclose(fp);
}
Just as another way of looking at the problem… Assuming that your global specifies 1 when the first line is to be printed, 2 for the second, etc, then:
char buf[BUFSIZ];
FILE *fp = fopen(filename, "r");
if (fp == 0)
return; // Error exit — report error.
for (int lineno = 1; lineno < globals.lines_to_feed__queue; lineno++)
{
fscanf(fp, "%*[^\n]");
if (fscanf(fp, "%*c") == EOF)
break;
}
if (fgets(buf, sizeof(buf), fp) != 0)
fprintf(stdout, "%s", buf);
else
…requested line not present in file…
fclose(fp);
You could replace the break with fclose(fp); and return; if that's appropriate (but do make sure you close the file before exiting; otherwise, you leak resources).
If your line numbers are counted from 0, then change the lower limit of the for loop to 0.
First, about what is wrong here: this code is unable to read the very first line in the file (what happens if globals.lines_to_feed__queue is 0?). It would also miscount lines should the file contain successive newlines.
Second, you must realize that there is no magic. Since you don't know at which offset the line in question lives, you have to patiently read the file character by character, counting line endings along the way. It doesn't matter if you delegate the reading/counting to fgets/fscanf, or fgetc each character for manual inspection - either way an uninteresting piece of file will make its way from the disk into the OS buffers, and then into the userspace for interpretation.
Your gut feeling is absolutely correct: the code is broken.
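As an illustration of the count-the-newlines approach, here is a sketch that skips lines with fgetc and only stores the one requested. The function name read_nth_line and the 1-based numbering are assumptions made for the example. Empty lines are counted like any other line.

```c
#include <stdio.h>

/* Copy line number `target` (1-based) into buf, including its newline
   if it fits. Returns 1 on success, 0 if the file has fewer lines. */
int read_nth_line(FILE *fp, long target, char *buf, size_t cap)
{
    long current = 1;
    int c;

    if (target < 1 || cap == 0)
        return 0;

    /* Skip whole lines without storing them. */
    while (current < target) {
        c = fgetc(fp);
        if (c == EOF)
            return 0;
        if (c == '\n')
            ++current;
    }
    return fgets(buf, (int)cap, fp) != NULL;
}
```

The stream is expected to be positioned at the start of the file; call rewind(fp) first when reusing it.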
I have to write to a file as follows:
A
B
C
D
...
Each character of the alphabet needs to be written to a different line in the file. I have the following program which writes characters one after another:
FILE* fp;
fp = fopen("file1","a+");
int i;
char ch= 'A';
for(i=0; i<26; i++){
fwrite(&ch, sizeof(char), 1, fp);
ch++;
}
fclose(fp);
How should I change the above program to write each character to a new line? (I tried writing "\n" after each character, but when I view the file using the vi editor or the ghex tool, I see extra characters; I am looking for a way so that vi will show the file exactly as shown above.)
I tried using the following after first fwrite:
fwrite("\n", sizeof("\n"), 1, fp);
Thanks.
fwrite("\n", sizeof("\n"), 1, fp);
should be
fwrite("\n", sizeof(char), 1, fp);
Otherwise, you are writing an extra \0 that is part of zero-termination of your "\n" string constant (sizeof("\n") is two, not one).
What "extra characters" do you see? Note that the "a+" mode passed to fopen opens the file for appending, so you are writing after whatever the file already contains. Did you perhaps mean "w+", which truncates the file first?
You could use:
fputc((int)ch, fp);
fputc((int)'\n', fp);
Or even fprintf(fp, "%c\n", ch);
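Putting the fputc suggestion together, a sketch of the full loop might look like this. The function name write_alphabet is hypothetical, and the loop assumes the letters 'A' to 'Z' are contiguous in the character set (true for ASCII), as the original ch++ loop already does.

```c
#include <stdio.h>

/* Write 'A'..'Z', one letter per line.
   Returns 0 on success, -1 if any write fails. */
int write_alphabet(FILE *fp)
{
    for (char ch = 'A'; ch <= 'Z'; ch++) {
        if (fputc(ch, fp) == EOF || fputc('\n', fp) == EOF)
            return -1;
    }
    return 0;
}
```

Opening the file with "w" instead of "a+" before calling write_alphabet(fp) keeps repeated runs from piling up at the end of the file.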
I have a file that I want to be read from and printed out to the screen. I'm using XCode as my IDE. Here is my code...
fp=fopen(x, "r");
char content[102];
fread(content, 1, 100, fp);
printf("%s\n", content);
The content of the file is "Bacon!" What it prints out is \254\226\325k\254\226\234.
I have Googled all over for this answer, but the documentation for file I/O in C seems to be sparse, and what little there is is not very clear. (To me at least...)
EDIT: I switched to just reading, not appending and reading, and switched the two middle arguments in fread(). Now it prints out Bacon!\320H\320 What do these things mean? Things as in backslash number number number or letter. I also switched the way to print it out as suggested.
You are opening the file for appending and reading. You should be opening it for reading, or moving your read pointer to the place from which you are going to read (the beginning, I assume).
FILE *fp = fopen(x, "r");
or
FILE *fp = fopen(x, "a+");
rewind(fp);
Also, fread(...) does not zero-terminate your string, so you should terminate it before printing:
size_t len = fread(content, 1, 100, fp);
content[len] = '\0';
printf("%s\n", content);
I suppose you meant this:
printf("%s\n", content);
Maybe:
fp = fopen(x, "a+");
if(fp)
{
char content[102];
memset(content, 0 , 102);
// arguments are swapped.
// See : http://www.cplusplus.com/reference/clibrary/cstdio/fread/
// You want to read 1 byte, 100 times
fread(content, 1, 100, fp);
printf("%s\n", content);
}
A possible reason is that you do not terminate the data you read, so printf prints the buffer until it finds a string terminator.
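The termination step can be folded into a small helper so it is never forgotten. This is a sketch with a hypothetical name, read_text. It reserves one byte of the buffer for the terminator and uses the count returned by fread to place it.

```c
#include <stdio.h>

/* Read up to cap-1 bytes from fp into buf and NUL-terminate the result.
   Returns the number of bytes read (which may be 0). */
size_t read_text(FILE *fp, char *buf, size_t cap)
{
    size_t got;

    if (cap == 0)
        return 0;
    got = fread(buf, 1, cap - 1, fp);
    buf[got] = '\0';   /* fread does not add a terminator itself */
    return got;
}
```

With char content[102], calling read_text(fp, content, sizeof content) and then printf("%s\n", content) should print only the bytes that were in the file.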