Reading from binary file C programming - c

I need to read binary which contains 3d dots coordinates but first there are 5 lines which are written to file in normal mode, so first I need to skip that part and read the dots. I tried the fread but failed. What am i doing wrong
VERSION 1
DOTS x y z
DOTCOUNT 10
DATA binary
33ËB3³ÊB33ÊBfæÊBffÊBfæÉBš™ÊBšÊBš™ÉBÍLÊBÍÌÉBÍLÉB

You are actually looking for the data after the first five '\n' characters.
You can read the first 256 bytes of the file, and look for the new lines. if they are there, you start reading your binary data, right after the fifth occurrence. if you did not find five '\n' character, continue reading the next chunk of 256 bytes, and look for the remaining number of '\n' characters. The binary data shall be read after the first five lines were consumed.
That's all

Related

Reading input from a file in C

I came across the following question:
If a file contains the line "I am a boy\r\n" then on reading this line into the array str using fgets(). What will str contain?
[A]. "I am a boy\r\n\0"
[B]. "I am a boy\r\0"
[C]. "I am a boy\n\0"
[D]. "I am a boy"
The answer has been given as option c with the explanation
Declaration: char *fgets(char *s, int n, FILE *stream);
fgets reads characters from stream into the string s. It stops when it reads either n - 1 characters or a newline character, whichever comes first.
However, I couldn't understand how will \r (carriage return) influence fgets. I mean, shouldn't it be that first "I am a boy" is read, then on encountering \r cursor is set at the initial position and "I" from "I am a body" is overwritten by \n and space following "I" is overwritten by \0.
Any help is deeply appreciated.
P.s: My claim is based on the explanation given on this link: https://www.quora.com/What-exactly-is-r-in-the-C-language
First, every time you see a multiple choice quiz on some programming website, I recommend you close the tab and do something productive instead such as watching videos of kittens. Because the questions seem to be just some variants of
Which of these is the first letter of the alphabet (only one is right)
A
a
6
a
the letter a
all of the above.
Carriage returns and line feeds do not affect the input read by a C program in that way. Each additional byte is just on top of the other bytes. Otherwise, this is very badly phrased question, as the answer be any of A, B, C or D, or maybe none of them. Saying that C is the only one that is right is wrong.
First question is what it means if "the file contains \r"? Here I assume that the author meant that the file contains the 10 characters I am a boy followed by ASCII 13 and ASCII 10 (carriage return and line feed).
In C there are two translation modes for reading files, text mode and binary mode. On POSIX systems (all those operating systems with X in their name, except for Windows eXcePtion) these are equal - the text mode is ignored. So when you read the line into a buffer with fgets on POSIX, it will look for that line feed and store all letters as is including the , so the buffer will have the following sequence of bytes I am a boy\r\n\0. Therefore A could be true.
But on Windows, the text mode translates the carriage return and the linefeed to one newline character with ASCII value 10 in memory, so what you will have is I am a boy\n\0. Therefore C could be true. If your file was opened in binary mode, you'll still have I am a boy\r\n\0 - so how'd you claim that C is the only one that can be true?
If the string that you'd read with fgets would be I am a boy\r\n (POSIX or binary mode) but you told fgets your buffer has space for only 12 characters, then you'd get 11 characters of the input and terminating \0, and therefore you'd have I am a boy\r\0. The carriage return character would remain in the stream. Therefore B could be true. B cannot be true if you indicated that the buffer will have more space.
Finally any of these array contents does contain the string I am a boy, therefore D would be true in all of the cases above.
And if your buffer didn't have enough space for 10 characters and the terminator then you'd have some prefix of the contents, such as I am a bo followed by \0 which means that none of these was true.

fread() a struct in c

For my assignment, I'm required to use fread/fwrite. I wrote
#include <stdio.h>
#include <string.h>
struct rec{
int account;
char name[100];
double balance;
};
int main()
{
struct rec rec1;
int c;
FILE *fptr;
fptr = fopen("clients.txt", "r");
if (fptr == NULL)
printf("File could not be opened, exiting program.\n");
else
{
printf("%-10s%-13s%s\n", "Account", "Name", "Balance");
while (!feof(fptr))
{
//fscanf(fptr, "%d%s%lf", &rec.account, rec.name, &rec.balance);
fread(&rec1, sizeof(rec1),1, fptr);
printf("%d %s %f\n", rec1.account, rec1.name, rec1.balance);
}
fclose(fptr);
}
return 0;
}
clients.txt file
100 Jones 564.90
200 Rita 54.23
300 Richard -45.00
output
Account Name Balance
540028977 Jones 564.90
200 Rita 54.23
300 Richard -45.00╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠ü☻§9x°é -92559631349317831000000000000000000000000000000000000000000000.000000
Press any key to continue . . .
I can do this with fscanf (which Ive commented out), but I'm required to use fread/fwrite.
Why does it start with a massive number for Jone's account?
Why is there garbage after? Shouldn't feof stop this?
Are there any drawbacks using this method? or fscanf method?
How can I fix these?
Many thanks in advance
As the comments say, fread reads the bytes in your file without any interpretation. The file clients.txt consists of 50 characters, 16 in the first line plus 14 in the second plus 18 in the third line, plus two newline characters. (Your clients.txt does not contain a newline after the third line, as you will soon see.) The newline character is a single byte \n on UNIX or Mac OS X machines, but (probably) two bytes \r\n on Windows machines - hence either 50 or 51 characters. Here is the sequence of ASCII bytes in hexadecimal:
3130 3020 4a6f 6e65 7320 3536 342e 3930 100 Jones 564.90
0a32 3030 2052 6974 6120 3534 2e32 330a \n200 Rita 54.23\n
3330 3020 5269 6368 6172 6420 2d34 352e 300 Richard -45.
3030 00
Your fread statement copies these bytes without any interpretation directly into your rec1 data structure. That structure begins with int account;, which says to interpret the first four bytes as an int. As one of the comments noted, you are running your program on a little-endian machine (most likely an Intel machine), so the least significant byte is the first and the most significant byte is the fourth. Thus, your fread said to interpret the sequence of four ASCII characters "100 " as the four byte integer 0x20303031, which equals, in decimal, 540028977. The next member of your struct is char name[100];, which means that the next 100 bytes of data in rec1 will be the name. But the fread was told to read sizeof(rec1)=112 bytes (4 byte account, 100 byte name, 8 byte balance). Since your file is only 50 (or 52) characters, fread will have only been able to fill in that many bytes of rec1. The return value of fread, had you not discarded it, would have told you that the read stopped short of the number of bytes you requested. Since you hit EOF, the feof call breaks out of the loop after that first pass, having consumed the entire file in one gulp.
All of your output was produced by the first and only call to fprintf. The number 540028977 and the following space were produced by the "%d " and the rec1.account argument. The next bit is only partly determinate, and you got lucky: The "%s" specifier and the corresponding rec1.name argument will print the next characters as ASCII until a \0 byte is found. Thus, the output will begin with the 50-4 (or 52-4) remaining characters of your file -- including the two newlines -- and potentially continue forever, because there are no \0 bytes in your file (or in any text file), which means that after printing the last character of your file, what you are seeing is whatever garbage happened to be in the automatic variable rec1 when your program started. (That kind of unintentional output is similar to the famous heartbleed bug in OpenSSL.) You were lucky the garbage included a \0 byte after only a few dozen more characters. Note that printf has no way to know that rec1.name was declared to be only a 100 byte array -- it only got the pointer to the beginning of name -- it was your responsibility to guarantee that rec1.name contained a terminating \0 byte, and you never did that.
We can tell a little bit more. The number -9.2559631349317831e61 (which is pretty ugly in "%f" format) is the value of rec1.balance. The 8 bytes for that double value on an IEEE 754 machine (like your Intel and all modern computers) are in hex 0xcccccccccccccccc. Sixty four of the peculiar ╠ symbol appear in the "%s" output corresponding to rec1.name, while only 100-46 = 54 characters remain of the 100, so your "%s" output has run off the end of rec1.name, and includes rec1.balance into the bargain, and we learn that your terminal program interpreted the non-ASCII character 0xcc as ╠. There are many ways to interpret bytes bigger than 127 (0x7f); in latin-1 it would have been Ì for example. The graphical character ╠ is the representation of the 0xcc (204) byte in the ancient MS-DOS character set, Windows code page 437. Not only are you running on an Intel machine, it is a Windows machine (of course the mostly likely possibility to begin with).
That answers your first two questions. I'm not sure I understand your third question. The "drawbacks" I hope are obvious.
As for how to fix it, there is no reasonably simple way to read and interpret a text file using fread. To do so, you would need to duplicate much of the code in the libc fscanf function. The only sensible way is to first use fwrite to create a binary file; then fread will work naturally to read it back. So there have to be two programs -- one to write a binary clients.bin file, and a second to read it back. Of course, that does not solve the problem of where the data for that first program should come from in the first place. It could come from reading clients.txt using fscanf. Or it could be included in the source code of the fwrite program, for example by initializing an array of struct rec like this:
struct rec recs[] = {{100, "Jones", 564.90},
{200, "Rita", 54.23},
{300, "Richard", -45.00}};
Or it could come from reading a MySQL database, or... The one place it is unlikely to originate is in a binary file (easily) readable with fread.

How do you scan redirected files in C (STDIN)?

Say I'm calling a program:
$ ./dataset < filename
where filename is any file with x amount of line pairs where the first line contains a string and second line contains 10 numbers separated by spaces. The last line ends with "END"
How can I then start putting the first lines of pairs (string) into:
char *experiments[20] // max of 20 pairs
and the second lines of the pairs (numbers) into:
int data[10][20] // max of 20, 10 integers each
Any guidance? I don't even understand how I'm supposed to scan the file into my arrays.
Update:
So say this is my file:
Test One
0 1 2 3 4 5 6 7 8 9
END
Then redirecting this file would mean if I want to put the first line into my *experiments, that I would need to scan it as such?
scanf("%s", *experiments[0]);
Doing so gives me an error: Segmentation fault (core dumped)
What is incorrect about this?
Say my file is simply numbers, for ex:
0 1 2 3 4 5 6 7 8 9
Then,
scanf("%d", data[0][0]); works, and will hold value of '1'. Is there an easier way to do this for the whole line of data? i.e. data[0-9][0].
find the pseudo-code, code explains how to read the input
int main()
{
char str[100]; // make sure that this size is enough to hold the single line
int no_line=1;
while(gets(str) != NULL && strcmp(str,"END"))
{
if(no_line % 2 == 0)
{
/*read integer values from the string "str" using sscanf, sscanf can be called in a loop with %d untill it fails */
}
else
{
/*strore string in your variable "experiments" , before copying allocate a memory for the each entry */
}
no_line++;
}
}
The redirected file is associated with the FILE * stdin. It's already opened for you...
otherwise, you can treat it the same as any other text file, and/or use the functions that are dedicated to standard input - with the only exception that you cannot seek in the file and not retrieve the size of the input.
For the data sizes you're talking about, by far the easiest thing to do is just slurp all of the content into a buffer and work on that: you don't have to be super-stingy, just make sure that you don't overrun.
If you want to be super-stingy with memory, preallocate a 4kB buffer with malloc(), progressively read() into it from stdin, and realloc() another 4kB every time the input exceeds what you've already read. If you don't care so much about being stingy with memory (e.g. on a modern machine with gigabytes of memory), just malloc() something much bigger than the expected input (e.g. a megabyte) and bug out if the input is more than that: this is far simpler to implement but less general/elegant.
You then have all of the input in a buffer and you can do what you like with it, which depends too strongly on the format of the input for me to say how you should approach that part.

Postscript file reading ahead

I am a newbie to programming in postscript directly.
I am reading from a file and I am using the read command to parse symbols.
Most of the symbols I am checking for are 2 characters in length and one is 3 characters in length. I would change the one that is 3 characters in length into two characters, but that won't help for the following reasons:
The symbols are a standard so I can't say.. well, thanks for making
just one of them 3 characters unlike all the rest!
The symbol that has 3 characters has the first 2 characters the same as another symbol.
I need to be able to "peek" at the next character in the file if the first two symbols match and not modify the the read order if the character peeked at doesn't match what I want to make the 3 character symbol.
Example:
File text as a string: "AB3;ABB4;"
In this case I will read A, then B and because the two characters together are "AB", I need to see if reading the next character produces a B.. if not, I don't want to have modified my reading order so I can proceed in the code normally to extracting the value. If it does, I now have my 3 character symbol and can proceed to extract the value normally.
Thank you.
Postscript doesn't have an unget function, so you can't push a byte back onto the stream to read it again later. But, depending upon OS-support, Level 2 Postscript lets you reposition a file. So you could implement your peek function like this:
% file peek int true
% false
/peek {
dup fileposition exch dup % pos file file
read { %true: pos file int
3 1 roll exch setfileposition true
}{ %false: pos file
pop pop false
} ifelse
} def

How to read till end of file in MATLAB?

I have a file a.txt:
03,17.406199
05,14.580129
07,13.904058
11,14.685388
15,14.062603
20,14.364573
25,18.035175
30,21.681789
50,22.662820
The number of rows in the file are not known. I want to read the file and store
3
5
7
11
15
20
30
50
in one array and the float values in another.
How do I read in a file when the length of data is not known?
If the number of entries is the same in every row, and if all the entries are numeric, then
you can simply do
a = load('a.txt');
a will be a matrix with two columns.
Read line-by-line until you hit the EOF marker.
Certain functions (like TEXTSCAN) will continue recycling the format string until the end of the file is reached. Other functions (like FSCANF) can take Inf as a size option, indicating that it should continue reading until the end of the file. If you are reading data line-by-line in a loop, you can use the FEOF function to test if the end of the file has been reached.
Since your elements are separated by commas, take a look at csvread. This should read the entire file into a single matrix, which you can then split into the two vectors you want.
Disclaimer: not tested!
fileContents = csvread('a.txt');
integerColumn = fileContents(:, 1);
doubleColumn = fileContents(:, 2);

Resources