I am trying to read 256 bytes of arbitrary data from my input file, build a struct from them, and then write that struct to my output file. Since I can't simply open and read the output file, what should I do to make sure I have successfully written the struct to it?
Check the return value from fwrite, i.e. read the manual page: http://www.cplusplus.com/reference/cstdio/fwrite/
You can read the file back, cast the data to your struct, and check its values.
You've got several basic issues. Firstly, you need to verify that the data
you have read in is valid. Secondly, having constructed your data structure
in memory you need to write it out to a different file. Thirdly, you want
to guarantee that this has been written out correctly - but you don't say
what you are allowed to do in order to verify the output file's correctness.
Reading the data in is easy - fread() is your friend here. I would read it
all into a (void *) of the appropriate size.
What I've done in similar data in / data out use-cases in the past is to
include a simple (depending on the application) checksum as the first element
in your output data structure. Use bcopy() or memcpy() to transfer your read-in
data to your output structure, then calculate the checksum on the data and
update the checksum field.
Finally, you fwrite() all that data to the output file, and check the return
value - the number of objects written out. If there is an error (number written
is less than desired), you need to check errno and handle the error case.
In my copy of the manual (Solaris 11.x), the error codes possible for
fwrite(3c) are those for fputc(3c).
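The steps so far could be sketched roughly as below. This is only an illustration: struct record, simple_checksum and copy_record are hypothetical names, and the additive checksum is a stand-in for whatever the application actually needs.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAYLOAD_SIZE 256

/* Hypothetical output record: checksum first, then the 256 raw bytes. */
struct record {
    uint32_t checksum;
    unsigned char payload[PAYLOAD_SIZE];
};

/* Simple additive checksum; a placeholder, not a recommendation. */
uint32_t simple_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Reads PAYLOAD_SIZE bytes from in, builds a record, writes it to out.
 * Returns 0 on success, -1 on a short read or a failed write. */
int copy_record(FILE *in, FILE *out)
{
    unsigned char buf[PAYLOAD_SIZE];
    struct record rec;

    if (fread(buf, 1, sizeof buf, in) != sizeof buf)
        return -1;                      /* short read or read error */

    memcpy(rec.payload, buf, sizeof buf);
    rec.checksum = simple_checksum(rec.payload, sizeof rec.payload);

    if (fwrite(&rec, sizeof rec, 1, out) != 1)
        return -1;                      /* caller can inspect errno */

    /* fflush hands buffered data to the OS, so late write errors surface here. */
    return fflush(out) == 0 ? 0 : -1;
}
```

A reader of the output file can recompute the checksum over payload and compare it with the stored field.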
Finally finally, you can determine whether sufficient bytes have been written
to the output file by comparing the stat buffer from a stat() call made
immediately after opening the output file with one made immediately after
your fwrite() + fclose() has returned.
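A minimal sketch of that size comparison, assuming a POSIX system. append_and_measure is a hypothetical helper name; it opens the file in append mode so that the "before" size is well defined.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Appends len bytes to path and returns how much the file grew according
 * to stat(), or -1 on any error. */
long long append_and_measure(const char *path, const void *data, size_t len)
{
    struct stat before, after;
    FILE *fp = fopen(path, "ab");

    if (fp == NULL)
        return -1;
    if (stat(path, &before) != 0) {     /* size right after opening */
        fclose(fp);
        return -1;
    }

    size_t written = fwrite(data, 1, len, fp);

    /* fclose flushes the stdio buffer, so its return value matters too. */
    if (fclose(fp) != 0 || written != len)
        return -1;

    if (stat(path, &after) != 0)        /* size after fwrite() + fclose() */
        return -1;
    return (long long)(after.st_size - before.st_size);
}
```

This only confirms the byte count, not the content, so it complements the fwrite() return value rather than replacing a checksum.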
If you opened the file for reading and writing, you should also be able to fseek to the beginning (offset 0) and re-read the file from there. That's assuming you're ok with opening the file for reading and writing together.
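A rough sketch of that write-then-re-read check, assuming the stream was opened with a mode such as "w+b". write_and_verify is a hypothetical helper name, and the caller supplies a scratch buffer of at least len bytes.

```c
#include <stdio.h>
#include <string.h>

/* Writes len bytes to fp, then seeks back and re-reads them for comparison.
 * fp must be open for both reading and writing (e.g. "w+b").
 * Returns 1 if the re-read data matches, 0 otherwise. */
int write_and_verify(FILE *fp, const void *data, size_t len, void *scratch)
{
    long start = ftell(fp);
    if (start < 0)
        return 0;
    if (fwrite(data, 1, len, fp) != len)
        return 0;
    /* A seek (or fflush) is required between writing and reading a stream. */
    if (fseek(fp, start, SEEK_SET) != 0)
        return 0;
    if (fread(scratch, 1, len, fp) != len)
        return 0;
    return memcmp(data, scratch, len) == 0;
}
```

Note that the re-read may well be served from the OS cache, so this checks what the file system reports rather than what physically reached the disk.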
Check the return code as chux suggested. Open the input and output files in a hex editor and compare the first 256 bytes. Emacs has one that works well enough.
I have recently started to learn Go. To start with I decided that I would write some code to open a file and output its contents on the terminal window. So far I have been writing code like this:
file, err := os.Open("./blah.txt")
data := make([]byte, 100)
count, err := file.Read(data)
To obtain up to 100 bytes from a file. Is there any way to ascertain the byte count on a file, such that you could set the correct (or more sensible) byte array length just using the standard Go library?
I understand you could use a slice with something like Append() once the extremities of the array have been reached, but I just wondered whether the file size/length/whatever could be accessed prior to instantiating an array through file metadata or something similar.
While you could certainly get the file's size prior to reading
from it (see the other answer), doing this is usually futile
for a number of reasons:
A filesystem is an inherently racy medium: any number of processes
might update a given file simultaneously, and even remove it.
On a filesystem with POSIX semantics (most commodity OSes
excluding Windows) the only guarantee a successful opening of a file
gives you is that it's possible to read data from it,
and that's basically all. (Well, reading may fail due to the error
in the underlying media but let's not digress further).
What would you do if you did the equivalent of a fstat(2) call,
as suggested, and it told you the file contains 42 terabytes of data?
Would you try to allocate a sufficiently large array to hold its contents?
Would you implement some custom logic which classifies the file's
size into several ranges and performs custom processing based on that—like,
say, slurping files less than N megabytes in length and reading
bigger files piecemeal?
What if the file grew bigger (was appended to) after you obtained its size?
What if you later decide to follow the Unix way more closely and make it
possible to read the data from your program's standard input stream, like
the cat program on Unix (or its Windows cousin, type) does?
You can't know how much data will be piped through that stream;
and potentially it might be of indefinite length (consider being piped
the contents of some busy log file on a continuously running system).
Sure, in some applications you assume the contents of files do not
change under your feet; one example is archivers like zip or tar which
record the file's metadata, including its size, along with the file.
(By the way, tar detects a file might have changed while the program
was reading its contents and warns the user in that case).
But what I'm leading you to, is that for a task as simple as yours,
there's little point in doing it the way you've come up with.
Instead, just use a buffer of some "sensible" size and gateway the data
between its source and destination through that buffer.
That is, you allocate the buffer, enter a loop, and on each iteration of
it you try to read as much data as fits in the buffer, process whatever
the Read function indicated it was able to read, then handle an
end-of-file condition or an error, if it was indicated.
To round up this small crash course, I'd hint that the standard library
already has io.Copy which, in your
case, may be called like
_, err := io.Copy(os.Stdout, f)
and will shovel all the contents of f to the standard output of your
program until EOF or an error is detected.
Last time I checked, this function used an internal buffer of 32 KiB in size,
but you may always check the source code of your Go installation.
I assume what you need is a way to get file size in bytes to create a slice of the same size:
fi, err := f.Stat() // f is the *os.File returned by os.Open
// handle error
// ...
size := fi.Size()
(see FileInfo for more)
You can then use this size to initialise a slice.
data := make([]byte, size)
You can also consider reading the whole file in one call using ioutil.ReadFile (os.ReadFile since Go 1.16).
I am working on a flat-file database project in C. I have created a structure with a few members. I am using fwrite() to write them to a binary file and fread() to fetch the data. My two major questions:
1st: can we write a structure to a text file? I have seen no good example. Is it practically wrong to write it in text format? When I write using "w" instead of "wb" I get the text format but with some extra words.
2nd: how do fread() and fwrite() work? They operate on a block of data, so how do they get the address of the next block? I mean, we do have the pointer, but the file doesn't have any address, so how does the pointer go to the next block?
1st can we write structure in text file ? i have seen no good example
.is it practically wrong to write it in text format ?when i write
using "w" instead of "wb" i get the text format but with some extra
words
Imagine your structure contains some integers inside. Now if you write them using fwrite, these integers will be written to the file in binary format.
If you try to interpret this as text, it won't work: a text editor will try to interpret the binary values as characters, which will most likely not look the way you expect.
e.g. if your structure contains the integer 3, when written using fwrite it will be stored as
00000000 00000000 00000000 00000011 (3 in binary)
assuming a 32-bit int in big-endian byte order. Now if you try to read the above using a text editor, of course you will not get the desired effect.
And that is before mentioning the padding bytes which may be inserted into your structure by the compiler.
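To make the difference concrete, here is a small sketch that writes the same record both ways. struct item and the three helper names are hypothetical, and the text format (one "id name" line per record, names without spaces) is just one possible choice.

```c
#include <stdio.h>

/* Hypothetical record type, used only for illustration. */
struct item {
    int  id;
    char name[32];
};

/* Binary form: the raw bytes of the struct, padding included. */
int write_binary(FILE *fp, const struct item *it)
{
    return fwrite(it, sizeof *it, 1, fp) == 1 ? 0 : -1;
}

/* Text form: one human-readable line per record, e.g. "3 pencil". */
int write_text(FILE *fp, const struct item *it)
{
    return fprintf(fp, "%d %s\n", it->id, it->name) > 0 ? 0 : -1;
}

/* Parses one line written by write_text. Names must not contain spaces. */
int read_text(FILE *fp, struct item *it)
{
    return fscanf(fp, "%d %31s", &it->id, it->name) == 2 ? 0 : -1;
}
```

The text file can be opened in any editor; the binary file is smaller to parse but depends on the compiler's layout and the machine's byte order.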
2nd how these fread() & fwrite works(). They operate on a block of
data how they get the address of next block. I mean we do have the
pointer but file doesnt have any address so how the pointer go to next
block?
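This is taken care of by the stdio library together with the OS: every open stream keeps a file position indicator, which advances by the number of bytes read or written, so the next fread() or fwrite() picks up where the previous one stopped.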
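A small sketch of how the stream position moves, using ftell() to observe it and fseek() to set it. struct rec, write_all and read_nth are hypothetical names chosen for the example.

```c
#include <stdio.h>

/* Hypothetical fixed-size record. */
struct rec {
    int  id;
    char tag[12];
};

/* Writes count records back to back; returns the final stream position
 * as reported by ftell(), or -1 on a failed write. */
long write_all(FILE *fp, const struct rec *recs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (fwrite(&recs[i], sizeof recs[i], 1, fp) != 1)
            return -1;
        /* The position has advanced by sizeof(struct rec) bytes. */
    }
    return ftell(fp);
}

/* Reads the n-th record (0-based) by seeking to its byte offset. */
int read_nth(FILE *fp, long n, struct rec *out)
{
    if (fseek(fp, n * (long)sizeof *out, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

The FILE object, not the caller's buffer pointer, remembers where in the file the next transfer happens.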
PS. I suggest you read more about serialization, and try to understand difference between text and binary files.
I'm trying to understand Linux (UNIX) low-level interfaces and as an exercise want to write code which will copy a file with holes into a new file (again with holes).
So my question is, how to read from the first file not till the first hole, but till the very end of the file?
If I'm not mistaken, read() returns 0 when it reaches the first hole (EOF).
I was thinking about seeking forward byte by byte and trying to read each byte, but then I would have to know the number of holes in advance.
If by holes you mean sparse files, then you have to find the holes in the input file and recreate them using lseek when writing the output file. Since Linux 3.1, you can even use lseek to jump to the beginning or end of a hole, as described in great detail in the man page.
As ThiefMaster already pointed out, normal file operations will treat holes simply as sequences of zero bytes, so you won't see the EOF you mention.
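A rough sketch of the lseek approach on Linux follows. copy_sparse_fd and copy_sparse are hypothetical helper names. It assumes a file system that supports SEEK_DATA and SEEK_HOLE; where it does not, lseek reports the whole file as data and this degrades to a plain copy. Short writes are simply treated as errors here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes SEEK_DATA and SEEK_HOLE on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA               /* fallback: Linux values, if the macro came too late */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Copies in_fd to out_fd, seeking over holes instead of writing zeros.
 * Returns 0 on success, -1 on error. */
int copy_sparse_fd(int in_fd, int out_fd)
{
    struct stat st;
    char buf[65536];
    off_t data = 0;

    if (fstat(in_fd, &st) != 0)
        return -1;

    while (data < st.st_size) {
        /* Find the next data region at or after the current offset. */
        data = lseek(in_fd, data, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;          /* only a trailing hole remains */
            return -1;
        }
        off_t hole = lseek(in_fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        /* Position both descriptors at the start of the data region. */
        if (lseek(in_fd, data, SEEK_SET) < 0 || lseek(out_fd, data, SEEK_SET) < 0)
            return -1;

        while (data < hole) {
            size_t want = sizeof buf;
            if ((off_t)want > hole - data)
                want = (size_t)(hole - data);
            ssize_t got = read(in_fd, buf, want);
            if (got <= 0)
                return -1;
            if (write(out_fd, buf, (size_t)got) != got)
                return -1;
            data += got;
        }
    }
    /* Extend the output to the full length so a trailing hole is kept. */
    return ftruncate(out_fd, st.st_size);
}

/* Convenience wrapper: opens both paths and copies src to dst. */
int copy_sparse(const char *src, const char *dst)
{
    int in_fd = open(src, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }
    int rc = copy_sparse_fd(in_fd, out_fd);
    close(in_fd);
    if (close(out_fd) != 0)
        rc = -1;
    return rc;
}
```

Seeking past a region on the output side and writing after it is what leaves the hole in the destination file.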
For copies of sparse files, from the cp manual:
By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well. That is the behavior selected by --sparse=auto. Specify --sparse=always to create a sparse DEST file whenever the SOURCE file contains a long enough sequence of zero bytes. Use --sparse=never to inhibit creation of sparse files.
Thus, try --sparse=always if you need to copy a sparse file 'as-is' (though the result still seems to depend on a heuristic).
A file is not presented as if it has any gaps. If your intention is to say that the file has sections on one area of the disk, then more on another, etc., you are not going to be able to see this through a call to open() on that file and a series of read() calls. You would instead need to open() and read() the raw disk instead, seeking to sectors on your own.
If your meaning of "holes" in a file is as @ThiefMaster says, just areas of 0 bytes -- these are only "holes" according to your application's use of the data; to the file system they're just bytes in a file, no different than any other. In this case, you can copy it through a simple read of the data source and write to the data target, and you will get a full copy (along with what you're calling holes).
I'm having some issues with a networking assignment. The end goal is to have a C program that grabs a file from a given URL via HTTP and writes it to a given filename. I've got it working fine for most text files, but I'm running into some issues, which I suspect all come from the same root cause.
Here's a quick version of the code I'm using to transfer the data from the network file descriptor to the output file descriptor:
unsigned long content_length; // extracted from HTTP header
unsigned long successfully_read = 0;
while(successfully_read != content_length)
{
char buffer[2048];
int extracted = read(connection,buffer,2048);
fprintf(output_file,buffer);
successfully_read += extracted;
}
As I said, this works fine for most text files (though the % symbol confuses fprintf, so it would be nice to have a way to deal with that). The problem is that it just hangs forever when I try to get non-text files (a .png is the basic test file I'm working with, but the program needs to be able to handle anything).
I've done some debugging and I know I'm not going over content_length, getting errors during read, or hitting some network bottleneck. I looked around online but all the C file i/o code I can find for binary files seems to be based on the idea that you know how the data inside the file is structured. I don't know how it's structured, and I don't really care; I just want to copy the contents of one file descriptor into another.
Can anyone point me towards some built-in file i/o functions that I can bludgeon into use for that purpose?
Edit: Alternately, is there a standard field in the HTTP header that would tell me how to handle whatever file I'm working with?
You are using the wrong tool for the job. fprintf takes a format string and extra arguments, like this:
fprintf(output_file, "hello %s, today is the %d", cstring, dayoftheweek);
If you pass the second argument from an unknown source (like the web, which you are doing) you can accidentally have %s or %d or other format specifiers in the string. Then fprintf will try to read more arguments than it was passed, and cause undefined behaviour.
Use fwrite for this:
fwrite(buffer, 1, extracted, output_file);
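Put into the transfer loop, that fwrite call might look roughly like the sketch below. copy_body is a hypothetical helper name; it also stops when read() returns 0 or -1, and never asks for more than the remaining byte count. Retrying on EINTR is left out for brevity.

```c
#include <stdio.h>
#include <unistd.h>

/* Copies up to content_length bytes from fd to out as raw binary data.
 * Returns the number of bytes copied; the caller compares it with
 * content_length to detect a connection that closed early. */
unsigned long copy_body(int fd, FILE *out, unsigned long content_length)
{
    char buffer[2048];
    unsigned long total = 0;

    while (total < content_length) {
        size_t want = sizeof buffer;
        if (content_length - total < want)
            want = (size_t)(content_length - total);

        ssize_t got = read(fd, buffer, want);
        if (got <= 0)
            break;              /* 0: peer closed the connection, -1: error */

        /* fwrite copies exactly got bytes; '%' and '\0' are not special. */
        if (fwrite(buffer, 1, (size_t)got, out) != (size_t)got)
            break;
        total += (unsigned long)got;
    }
    return total;
}
```

Using total < content_length rather than != means the loop cannot run forever even if the counts disagree.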
A couple things with your code:
For fprintf - you are using the data as the second argument, when in fact it should be the format, and the data should be the third argument. This is why you are getting problems with the % character, and why it is struggling when presented with binary data, because it is expecting a format string.
You need to use a different function, such as fwrite, to output the file.
As a side note this is a bit of a security problem - if you fetch a specially crafted file from the server it is possible to expose random areas of your memory.
In addition to Seth's answer: unless you are using a third-party library for handling all the HTTP stuff, you need to deal with the Transfer-Encoding header and the possible compression, or at least detect them and throw an error if you don't know how to handle that case.
In general, it may (or may not) be a good idea to parse the HTTP response headers, and only if they contain exclusively stuff that you understand should you continue to interpret the data that follows the header.
I bet your program is hanging because it's expecting X bytes but receiving Y instead, with X < Y (most likely, sans compression - but PNG don't compress well with gzip). You'll get chunks [*] of data, with one of the chunks most likely spanning content_length so your condition while(successfully_read != content_length) is always true.
You could try running your program under strace, or whatever its equivalent is for your OS, if you want to see how your program keeps trying to read data that it will never get (because you've likely made an HTTP/1.1 request that holds the connection open, and you haven't made a second request) or that has ended (if the server closes the connection, your repeated calls to read(2) will just return 0, which leaves your still-true loop condition unchanged).
If you are sending your program's output to stdout, you may find that it produces no output - this can happen if the resource you are retrieving contains no newline or other flush-forcing control characters. Other stdio buffering regimes may apply when output goes to a file. (For example, the file will remain empty until the stdio buffers have accumulated at least 4096 bytes.)
[*] Then there's also Transfer-Encoding: chunked, as @roland-illig alludes to, which will ruin the exact equivalence between content_length (presumably derived from the eponymous HTTP header) and the actual number of bytes transferred over the socket.
You are opening the file as a text file. On platforms that distinguish text mode from binary mode (notably Windows), this means every \n you write is translated to \r\n. Try opening the file as binary ("wb"), and those errors in size should go away.
I am reading and writing a structure to a text file, but the file is not readable. I have to write readable data into the file from the structure object.
Here is a little more detail about my code:
I have code which reads and writes a list of item names and codes to a file (file.txt). The code uses a linked list to read and write the data.
The data is stored in a structure object and then written to a file using fwrite.
The code works fine, but I need to write readable data into the text file.
Now file.txt looks like below:
㵅㡸䍏䥔䥆㘸䘠㵅㩃䠀\䵏㵈䑜㵅㡸䍏䥔䥆㘸䘠\㵅㩃䠀䵏㵈䑜㵅㡸䍏䥔䥆㘸䘠㵅㩃䠀䵏㵈\䑜㵅㡸䍏䥔䥆㘸䘠㵅㩃䠀䵏㵈䑜㵅㡸䍏䥔\䥆㘸䘠㵅㩃䠀䵏㵈
I expect the file to look like this:
pencil aaaa
Table bbbb
pen cccc
notebook nnnn
Here is the snippet:
struct Item
{
char itemname[255];
char dspidc[255];
struct Item *ptrnext;
};
// Writing into the file
printf("\nEnter Itemname: ");
gets(ptrthis->itemname);
printf("\nEnter Code: ");
gets(ptrthis->dspidc);
fwrite(ptrthis, sizeof(*ptrthis), 1, fp);
// Reading from the file
while(fread(ptrthis, sizeof(*ptrthis), 1, fp) ==1)
{
printf("\n%s %s", ptrthis->itemname,ptrthis->dspidc);
ptrthis = ptrthis->ptrnext;
}
Writing the size of an array that is 255 bytes will write 255 bytes to file (regardless of what you have stuffed into that array). If you want only the 'textual' portion of that array you need to use a facility that handles null terminators (i.e. printf, fprintf, ...).
Reading is then more complicated as you need to set up the idea of a sentinel value that represents the end of a string.
This says nothing of the fact that you are writing the value of a pointer (initialized or not) that will have no context or validity on the next read. Pointers (i.e. memory locations) have meaning only within the currently executing process. Trying to use one process's memory address in another is definitely a bad idea.
The code works fine
Not really:
a) You are dumping the raw contents of the struct to a file, including the pointer to another instance of "Item". You cannot expect to read a pointer back in from disk and use it as you do with ptrthis = ptrthis->ptrnext (I mean, this works as you "use" it in the given snippet, but only because that snippet does nothing meaningful at all).
b) You are writing 2 * 255 bytes of potential garbage to the file. The reason you see these strange-looking "blocks" in your file is that you write all 255 bytes of itemname and all 255 bytes of dspidc to disk, including the terminating \0 (which shows up as the blocks, depending on your editor). The real "string" is something meaningful at the beginning of either itemname or dspidc, followed by a \0, followed by whatever was in memory before.
The term you need to look up and read about is serialization. There are already libraries which solve the task of dumping data structures to disk (or network, or anything else) and reading them back in, e.g. tpl.
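For this particular struct, a hand-rolled text serialization could look roughly like the sketch below. save_items and load_items are hypothetical names, and the format (one "itemname code" line per node) assumes that neither field contains whitespace.

```c
#include <stdio.h>
#include <stdlib.h>

struct Item {
    char itemname[255];
    char dspidc[255];
    struct Item *ptrnext;
};

/* Writes one "itemname dspidc" line per node; the pointer is never stored. */
int save_items(FILE *fp, const struct Item *head)
{
    for (const struct Item *it = head; it != NULL; it = it->ptrnext)
        if (fprintf(fp, "%s %s\n", it->itemname, it->dspidc) < 0)
            return -1;
    return 0;
}

/* Rebuilds the list from the text file, allocating fresh nodes.
 * Returns the new head, or NULL if nothing could be read. */
struct Item *load_items(FILE *fp)
{
    struct Item *head = NULL, **tail = &head;
    struct Item tmp = { "", "", NULL };

    /* %254s leaves room for the terminating '\0' in each 255-byte field. */
    while (fscanf(fp, "%254s %254s", tmp.itemname, tmp.dspidc) == 2) {
        struct Item *node = malloc(sizeof *node);
        if (node == NULL)
            break;
        *node = tmp;
        node->ptrnext = NULL;
        *tail = node;           /* links are rebuilt in memory, not read from disk */
        tail = &node->ptrnext;
    }
    return head;
}
```

The resulting file contains lines such as "pencil aaaa", and the caller is responsible for freeing the nodes returned by load_items.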
First of all, I would only serialize the data, not the pointers.
Then, in my opinion, you have 2 choices:
write a parser for your syntax (with yacc for instance)
use a data dumping format such as the RMI serialization mechanism.
Sorry I can't find online docs, but I know I have the grammar on paper.
Both of those solutions will be platform independent, whether the machine is big endian or little endian.