I have a piece of code that writes to a FILE* with fwrite:
test = fwrite(&object, sizeof(object), 1, file);
I want to serialize an internal data structure together with an indexing structure (so I'm using neither Google's Protocol Buffers nor Cap'n Proto, since this is a custom data structure with specific indexing requirements). Now, inside my project, I want to use Google Test to test the serialization: to check that whatever has been serialized can be deserialized and easily retrieved again. In the testing phase, I want to pass fwrite a FILE* that is not a file but a handle to some allocated main memory, so that no file is produced and I can inspect main memory directly for the results of the serialization. Is it possible to virtualize the FILE* and write directly into main memory? I would like to keep fwrite for writing the data structures for performance reasons, without being forced to write two different serialization methods (sometimes I'm writing on the fly, with no further memory spent on transcoding). Thanks in advance.
One way is to create a dynamic library with all those fopen/fwrite functions (doing something special for your magic filename and falling back to the originals otherwise) and load it with LD_PRELOAD. To fall back to the originals, resolve them with dlsym and RTLD_NEXT.
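A minimal sketch of that interposer, assuming a made-up magic path and a fixed-size backing buffer (a real hook would also need to cover fwrite, fclose, and friends):

/* hook.c -- build with: gcc -shared -fPIC hook.c -o hook.so -ldl
   run with:  LD_PRELOAD=./hook.so ./your_test */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

#define MAGIC_PATH "/virtual/serialization-buffer"  /* hypothetical */

static char g_buf[1 << 20];  /* in-memory backing store for the magic file */

FILE *fopen(const char *path, const char *mode)
{
    /* Resolve the real fopen once, lazily. */
    static FILE *(*real_fopen)(const char *, const char *);
    if (!real_fopen)
        real_fopen = (FILE *(*)(const char *, const char *))
                     dlsym(RTLD_NEXT, "fopen");

    if (strcmp(path, MAGIC_PATH) == 0)
        return fmemopen(g_buf, sizeof g_buf, mode);  /* divert to memory */
    return real_fopen(path, mode);                   /* everything else */
}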
Another way is to include a special header at the top of the library/test with a line like "#define fopen my_fopen". Inside the file with the implementation of my_fopen, put "#undef fopen" before including the original "stdio.h". This approach only works for your own source files that include the header; it will not hook the functions for the binary libraries you link against.
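A sketch of that header trick (the magic path is again hypothetical):

/* hook.h -- include this at the top of every file you want to redirect */
#include <stdio.h>
FILE *my_fopen(const char *path, const char *mode);
#define fopen my_fopen

/* my_fopen.c -- the implementation; it must see the real fopen */
#define _GNU_SOURCE
#undef fopen          /* in case hook.h was pulled in first */
#include <stdio.h>
#include <string.h>

static char g_buf[4096];   /* in-memory destination for the magic name */

FILE *my_fopen(const char *path, const char *mode)
{
    if (strcmp(path, "/virtual/test-buffer") == 0)  /* hypothetical */
        return fmemopen(g_buf, sizeof g_buf, mode);
    return fopen(path, mode);                       /* the real fopen */
}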
fopencookie did the job I was looking for.
http://man7.org/linux/man-pages/man3/fopencookie.3.html
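For reference, a minimal sketch of how fopencookie can back a FILE* with a growable memory buffer (only the write hook is implemented here; the read/seek/close hooks are left NULL for brevity):

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Grow-on-demand buffer used as the stream's "cookie". */
typedef struct {
    char  *data;
    size_t size;
} membuf;

static ssize_t membuf_write(void *cookie, const char *buf, size_t n)
{
    membuf *m = cookie;
    char *p = realloc(m->data, m->size + n);   /* append-only stream */
    if (!p)
        return -1;                             /* signal a write error */
    memcpy(p + m->size, buf, n);
    m->data = p;
    m->size += n;
    return (ssize_t)n;
}

int main(void)
{
    membuf m = { NULL, 0 };
    cookie_io_functions_t io = { .write = membuf_write };

    FILE *f = fopencookie(&m, "w", io);
    int object = 42;
    fwrite(&object, sizeof(object), 1, f);  /* serialization code unchanged */
    fclose(f);                              /* flushes stdio's buffer */

    /* m.data / m.size now hold the serialized bytes for your assertions. */
    printf("wrote %zu bytes\n", m.size);
    free(m.data);
    return 0;
}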
Can I create a FILE instance (FILE*) from byte[] data in memory, without writing a file?
(C, Linux)
I need to parse 'MiniSEED'-format data with the official MiniSEED library.
The library supports parsing 'MiniSEED' packet data that has been written to a file.
But I need to parse 'MiniSEED' data from a byte[] array directly, without creating a real file
(because I receive the 'MiniSEED' data continuously, in real time, over TCP,
and the library only seems to support parsing data from a written file).
So I am trying to solve the problem by creating a FILE instance directly from the byte[] data.
I think this is the easiest solution that avoids changing the library.
You can create a FILE handle from in-memory data in Linux, because the Linux C libraries do support fmemopen() from POSIX.1-2008.
Calling fmemopen(buffer, size, "r") yields a read-only FILE handle to an in-memory object containing size bytes at buffer.
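A minimal sketch (the buffer contents are a stand-in for a real record received over TCP):

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

int main(void)
{
    /* Pretend these bytes just arrived over the TCP connection. */
    unsigned char packet[] = { 'M', 'S', 'E', 'D', 0x00, 0x01 };

    FILE *f = fmemopen(packet, sizeof packet, "r");
    if (!f)
        return 1;

    unsigned char header[4];
    if (fread(header, 1, sizeof header, f) == sizeof header)
        printf("first bytes: %c%c%c%c\n",
               header[0], header[1], header[2], header[3]);

    fclose(f);   /* no file was ever created on disk */
    return 0;
}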
However, I don't understand why you'd need such a thing.
The official Mini-SEED library does provide function msr_unpack() (and msr_unpack_data()) to parse Mini-SEED data records.
The functions you are probably looking at using, ms_readmsr() and ms_readtraces() (or their thread-safe variants ms_readmsr_r() and ms_readtraces_r()), just read each record from the file, passing each to msr_unpack() (and, in the case of traces, to mst_addmsrtogroup() or mstl_addmsr()).
In other words, the library does support parsing in-memory data. Your assertion that it only supports parsing files is clearly incorrect.
The man pages describing the library functions do not seem to be available on the net, but if you download libmseed sources, you can read the library function man pages using man -l libmseed/doc/[function].3.
As a compromise, you might use mmap to create a direct mapping between memory and the file. This allows you to update the contents directly (by writing to the memory) while the library accesses the same data through the file interface. On Unix systems, depending upon the size of the data, the file may never actually need to be written to disk; it may reside in the kernel's page cache for faster access (this happens by default; there is nothing extra you need to do).
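A sketch of that compromise (the path under /dev/shm is one hypothetical way to keep the file in memory on Linux, since tmpfs never needs to touch the disk):

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/shm/scratch.bin", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return 1;

    size_t len = 4096;
    if (ftruncate(fd, (off_t)len) != 0)
        return 1;

    char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED)
        return 1;

    /* Update "the file" by writing to memory... */
    memcpy(mem, "data to parse", 13);

    /* ...while a library can read the very same bytes via the fd or path. */
    munmap(mem, len);
    close(fd);
    return 0;
}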
No, there's no portable, standard way of creating a FILE * that represents an in-memory stream of bytes.
The typical solution is to instead make the read and write function(s) hookable, so that instead of hard-coding e.g. read() you make the library call an (optionally) application-supplied function.
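For illustration, a hookable reader might look like this (every name here is hypothetical, not part of any real library):

#include <stddef.h>
#include <string.h>

/* Application-supplied reader: fill buf with up to n bytes, return count. */
typedef size_t (*read_fn)(void *ctx, void *buf, size_t n);

/* The library consumes bytes via the callback instead of a hard-coded read(). */
size_t parse_stream(read_fn rd, void *ctx)
{
    unsigned char chunk[256];
    size_t total = 0, got;
    while ((got = rd(ctx, chunk, sizeof chunk)) > 0)
        total += got;   /* real parsing of `chunk` would happen here */
    return total;
}

/* One possible application hook: serve bytes from an in-memory buffer. */
typedef struct { const unsigned char *p; size_t left; } memctx;

static size_t mem_read(void *ctx, void *buf, size_t n)
{
    memctx *m = ctx;
    if (n > m->left)
        n = m->left;
    memcpy(buf, m->p, n);
    m->p += n;
    m->left -= n;
    return n;
}

int main(void)
{
    unsigned char blob[] = { 1, 2, 3, 4, 5 };
    memctx m = { blob, sizeof blob };
    return parse_stream(mem_read, &m) == sizeof blob ? 0 : 1;
}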
So far, this is what I have understood:
Files have some information in their 'headers' which programs use to detect the file type.
Is that always true? If so, how can I see the header?
Is it the only way of defining a file's type?
Is there a way to manually create an empty file (at least in Linux) with no headers at all?
If so, can I manually write its header and create a simple 'jpg' file?
No. Files simply contain bytes, plus some metadata like a filename, permissions, and a last-modified time. The format of those bytes is completely free; no convention is imposed. Certainly some file types like JPEGs, GIFs, audio and video files have headers specified by their formats. Viewing a header depends completely on the format involved. Headers are normally composed of byte codes meaningless to the human eye, so some software is normally required to decode and view them.
Yes:
touch emptyFile
Sounds painful. Use a library to write a JPEG; headers are not necessarily easy to create. Someone else has done this hard work for you, so I'd use it.
A file is nothing more than a sequence of bytes, and it has no default, internal structure. It is an abstraction made by the OS to make it more convenient to store and manipulate data.
Files may represent different types of things like images, videos, audio and plain text, so they need to be interpreted in a certain way in order to interact with their contents. For instance, an image is opened in an image viewer, a PDF document in a PDF viewer, and an audio file in a media player. That doesn't mean you cannot open an image in a text editor; the file's contents will simply be interpreted differently.
The closest things to file metadata in UNIX and Linux are the inode, which stores information about a file but is not part of the file itself, and the file's magic number. Use stat to inspect the inode and use file to determine a file's type (sometimes based on its magic number).
Also check out man file for more information about file types.
I'm writing an AVI demuxer library in which I have exported multiple APIs for different functionality. When aviopen() is first called with an input filename, I parse the whole file and save some info in a structure I have malloc'd. When any other API is then called for that file, it should use that structure's info to do its work.
I do NOT want to expose that structure to the library user; I don't even want to give them a pointer to it. In that case, how can I keep track of the structure?
I also want to support multiple files in my library, so that an application can open more than one file at a time.
So how can I maintain, for each opened file, a handle to its allocated structure?
An opaque pointer is the usual way to implement this.
If you don't want to pass a pointer at all for some reason, you could keep a global ("private") array/hash of your structures, and give your users an index in that global container (could be a plain int). That's much more work (and much more failure-prone and potentially racy) than just handing out opaque pointers though.
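A minimal sketch of the opaque-pointer pattern (the names avi_demuxer/avi_open are hypothetical stand-ins for the question's aviopen()):

/* avi.h -- public header: the struct is declared but never defined,
   so users can hold the pointer without seeing what's inside */
typedef struct avi_demuxer avi_demuxer;

avi_demuxer *avi_open(const char *filename);
void         avi_close(avi_demuxer *d);

/* avi.c -- private implementation: the full definition lives only here */
#include <stdlib.h>

struct avi_demuxer {
    long frame_count;   /* parsed header info, stream tables, ... */
};

avi_demuxer *avi_open(const char *filename)
{
    (void)filename;                      /* parsing elided in this sketch */
    avi_demuxer *d = malloc(sizeof *d);
    if (d)
        d->frame_count = 0;
    return d;     /* each open file gets its own independent handle */
}

void avi_close(avi_demuxer *d)
{
    free(d);
}

Multiple open files then come for free: the application simply holds several avi_demuxer* values, one per file.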
I have always been able to read and write basic text files in C++, but so far no one has discussed much more than that.
My question is this:
If I develop a file type myself for use by an application I also create, how would I go about writing the data to a file while preserving its layout, formatting, etc.? Are there any standards, or does it just depend on the creativity of the programmer?
You basically have to come up with your own file format and write binary data.
You can also serialize your object model and write the output to a file, but that's usually less efficient.
Better to use an existing database, or use XML (or another text format) for simple needs. If you want to write a file in a format that already exists, find a library that supports it.
You have to know the binary file format for the file you are trying to create. Consider Joel's post on this topic: the 97-2003 file format is a 349-page spec.
Nearly all the time, you'd use an API to do something like that, to avoid the grunt work. Be careful, however, because figuring out "what works" by trial and error can result in an upgrade of the program breaking your code. Plus you have to take into account other operating systems, minor version differences, patches, etc.
There are a number of standards, of course. The likely one to use is some flavor of XML, since libraries and tools already exist to help you work with it, but nothing is stopping you from inventing your own.
Well you could store the data in a format you could read, but which maintained the integrity of your data (XML or JSON for instance).
Or (shudder) you could come up with your own proprietary binary format, and use that.
You would go at it exactly the same way as you would a text file: writing your data byte by byte, encoded in such a way that when you read the file back you know what you are reading.
For a spreadsheet application you could even use a text format (OOXML, OpenDocument) to store presentation and content information.
Or you could define binary data structures and write them directly to the file.
The choice between a text and a binary format depends on the application. For a configuration file you may prefer a text file that can be modified outside your app; for a database you will most likely choose a binary format for performance reasons.
See wotsit.org for information on file formats for various file types. Example: You can figure out exactly how to write out a .BMP file and how it is composed.
Writing to a database can be done by using a wrapper class in your language, mainly passing it SQL commands.
If you create a binary file, you can write any data to it. The only drawback is that you have to know exactly where each piece starts and where it ends.
Use XML (something open, descriptive, and validatable), and stick with text. There are standards for this sort of thing as well, including ODF.
You can open the file as binary instead of text (how one does this depends somewhat on the platform), and from there you can write the data directly out to disk. The only real caveat is endianness, which can become an issue when moving files from one architecture to another (x86 to PPC, for instance).
Writing binary data to disk is really no harder than writing text, and really, your creativity is key for how you store the data.
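For instance, a minimal sketch of writing an integer in a fixed byte order, so the file reads back the same on any CPU (the output file name is made up):

#include <stdint.h>
#include <stdio.h>

/* Write a 32-bit value in little-endian order regardless of host CPU. */
static int write_u32_le(FILE *f, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)( v        & 0xff),
        (unsigned char)((v >>  8) & 0xff),
        (unsigned char)((v >> 16) & 0xff),
        (unsigned char)((v >> 24) & 0xff),
    };
    return fwrite(b, 1, sizeof b, f) == sizeof b ? 0 : -1;
}

int main(void)
{
    FILE *f = fopen("out.bin", "wb");
    if (!f)
        return 1;
    int rc = write_u32_le(f, 0xDEADBEEF);
    fclose(f);
    return rc;
}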
The general problem is usually referred to as serialization of your application state and in your case with a source/target of a file in whatever format makes sense for you. These days the preferred input/output format is XML, and you may want to look into the existing standards in this field. The problem then becomes how do I map from the state of my system to the particular schema. Boost has a serialization framework that you may want to check out.
/Allan
There are a variety of approaches you can take, but in general you'll want some sort of serialization library. Boost.Serialization or Google's Protocol Buffers are good examples. The basic idea is that you have memory structures (classes and objects) that represent your data, and you want to write that data to a file in a way that can be used to reconstruct those structures again.
If you're hesitant to use a library, you can do it all manually, but realize that you can end up writing a lot of redundant code, or developing your own library. See fopen, fread, fwrite and fclose for a starting point.
A typical binary file format for custom data is an "indexed file format" consisting of
-------
|index|
-------
|data |
-------
Where the index contains records "pointing" to the data.
The index consists of records containing an offset and a size. The offset tells you where in the file the data is stored and the size tells you the size of the data at that offset (i.e. the number of bytes to read).
typedef struct {
    size_t offset;   /* where in the file the record's data starts */
    size_t size;     /* how many bytes to read at that offset */
} Index;

typedef struct {
    int  ID;
    char First[20];
    char Last[20];
    char *RandomInfo;  /* a pointer can't be stored as-is: the bytes it
                          points to must be written out separately */
} Data;
Suppose you want to store 50 records in the file you would create 50 indices and 50 data structures. The 50 index structures would be written to the file first, followed by the 50 data structures.
To read the file you would read in the 50 index structures, then from the data in the read-in index structures you could tell where to "seek" to read the data records.
Look up (fopen, fread, fwrite, fclose, ftell) for functions to read/write the data.
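A minimal sketch of the write/read round trip, assuming the structures above (the variable-length RandomInfo field is left out for brevity, offsets are stored as long to match what ftell returns, and the file name is made up):

#include <stdio.h>
#include <string.h>

#define NRECORDS 2   /* the text above uses 50; 2 keeps the sketch short */

typedef struct { long offset; long size; } Index;
typedef struct { int ID; char First[20]; char Last[20]; } Data;

int main(void)
{
    Data recs[NRECORDS] = {
        { 1, "Ada",  "Lovelace" },
        { 2, "Alan", "Turing"   },
    };
    Index idx[NRECORDS];
    memset(idx, 0, sizeof idx);

    FILE *f = fopen("records.bin", "wb");
    if (!f)
        return 1;

    /* Pass 1: reserve room for the index with placeholder values. */
    fwrite(idx, sizeof idx[0], NRECORDS, f);

    /* Pass 2: write each record, remembering where it landed. */
    for (int i = 0; i < NRECORDS; i++) {
        idx[i].offset = ftell(f);
        idx[i].size   = (long)sizeof recs[i];
        fwrite(&recs[i], sizeof recs[i], 1, f);
    }

    /* Pass 3: rewind and overwrite the placeholders with the real index. */
    rewind(f);
    fwrite(idx, sizeof idx[0], NRECORDS, f);
    fclose(f);

    /* To read: load the index, then seek straight to any record. */
    f = fopen("records.bin", "rb");
    if (!f)
        return 1;
    fread(idx, sizeof idx[0], NRECORDS, f);

    Data d;
    fseek(f, idx[1].offset, SEEK_SET);   /* jump directly to record #2 */
    fread(&d, sizeof d, 1, f);
    printf("record 2: %d %s %s\n", d.ID, d.First, d.Last);
    fclose(f);
    return 0;
}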
You usually use a third-party library for these things. For example, you would link in a database library for, say, Oracle that would allow you to talk to the database. Because the underlying file types (i.e. Excel spreadsheet vs. OpenOffice, Oracle vs. MySQL, etc.) differ, these libraries abstract away your need to care how the file is constructed.
Hope that helps you find what you're looking for!
1985 called and said they have some help IFF you are willing to read up. The Interchange File Format is still in use today and provides some basic metadata around binary files, such as RIFF or WAV audio. (Unfortunately, TIFF is a false friend.) It allegedly even inspired PNG, so it can't be that bad.