Binary compatibility of FILE* - c

I am designing a C library that does some mathematical calculations. I need to specify a serialization interface to be able to save and then load some data. The question is: is it correct (from a binary compatibility point of view) to use a FILE* pointer in the public API of the library?
Target platforms are:
Linux x86, x86_64 with gcc >= 3.4.6
Windows x86, x86_64 >= WinXP with VS >= 2008sp1
I need to be as binary compatible as possible, so at the moment my variant is the following:
void SMModuleSave(SMModule* module, FILE* dest);
SMModule* SMModuleLoad(FILE* src);
So I am curious whether it is correct to use FILE*, or whether it is better to switch to wchar_t*/char* file names?

I don't agree with ThiefMaster: there's no benefit in going native (i.e. using file descriptors of type int on Linux and handles of type void * on Windows) when there's an equivalent portable solution.
I'd probably go with FILE * instead of opening the files by name from within the library. It might be more of a hassle for library users, but it's also more flexible, as most libc implementations provide various ways of opening files (fopen(), _wfopen(), _fdopen(), fdopen(), fmemopen(),...) and you don't have to maintain separate wide-char APIs yourself.
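To make the flexibility argument concrete, here is a minimal sketch of the two functions from the question. The SMModule layout (a count plus an array of doubles), the raw native-endian format, and the sm_roundtrip_demo helper are all assumptions made for illustration; the question does not show the real structure. The caller decides how the stream is opened, here with tmpfile(), but fopen() or fmemopen() would work the same way.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layout; the real SMModule is not shown in the question. */
typedef struct SMModule {
    size_t count;
    double *values;
} SMModule;

/* Writes count followed by the raw doubles. The caller checks ferror(dest). */
void SMModuleSave(SMModule *module, FILE *dest)
{
    if (fwrite(&module->count, sizeof module->count, 1, dest) != 1)
        return;
    if (module->count != 0)
        fwrite(module->values, sizeof *module->values, module->count, dest);
}

/* Reads back what SMModuleSave wrote; returns NULL on any failure. */
SMModule *SMModuleLoad(FILE *src)
{
    SMModule *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->values = NULL;
    if (fread(&m->count, sizeof m->count, 1, src) != 1)
        goto fail;
    if (m->count != 0) {
        /* A real loader would validate count before trusting it. */
        m->values = malloc(m->count * sizeof *m->values);
        if (m->values == NULL ||
            fread(m->values, sizeof *m->values, m->count, src) != m->count)
            goto fail;
    }
    return m;
fail:
    free(m->values);
    free(m);
    return NULL;
}

/* Illustrative round trip through a temporary stream; 0 means success. */
int sm_roundtrip_demo(void)
{
    double data[] = {1.5, 2.5, 3.5};
    SMModule in = {3, data};
    SMModule *out;
    int ok;
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    SMModuleSave(&in, f);
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    rewind(f);
    out = SMModuleLoad(f);
    fclose(f);
    if (out == NULL)
        return -1;
    ok = out->count == 3 && out->values[0] == 1.5 && out->values[2] == 3.5;
    free(out->values);
    free(out);
    return ok ? 0 : -1;
}
```

Writing size_t and double in native form keeps the sketch short, but such files are not portable between x86 and x86_64 builds; a real format would fix the field widths and byte order.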

I'd use neither but let the user pass a file descriptor as an int.
Then you can fdopen() it in your code to get a FILE*.
However, on Windows this might not be the best solution, even though the C runtime there does have some helper functions to get a numeric file descriptor.
However, passing a FILE* or a const char* should be fine, too. I'd prefer passing a filename as it's less code to write if a library takes care of opening/closing a file.
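The descriptor-based variant could look roughly like the sketch below on a POSIX system. The function name lib_write_greeting and the fd_demo helper are hypothetical, and the choice that the library takes ownership of the descriptor is an assumption; a real API would have to document who closes it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical library entry point: takes ownership of fd. */
int lib_write_greeting(int fd)
{
    int rc;
    FILE *fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        return -1;
    }
    rc = (fputs("hello\n", fp) == EOF) ? -1 : 0;
    if (fclose(fp) != 0)        /* fclose() also closes fd */
        rc = -1;
    return rc;
}

/* Illustrative caller: opens the file itself and hands over the descriptor. */
int fd_demo(void)
{
    char path[] = "/tmp/fd_demo_XXXXXX";
    char buf[16] = {0};
    char *got;
    FILE *in;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (lib_write_greeting(fd) != 0) {
        unlink(path);
        return -1;
    }
    in = fopen(path, "r");
    if (in == NULL) {
        unlink(path);
        return -1;
    }
    got = fgets(buf, sizeof buf, in);
    fclose(in);
    unlink(path);
    return (got != NULL && strcmp(buf, "hello\n") == 0) ? 0 : -1;
}
```

On Windows the rough equivalents are _open_osfhandle() to turn a HANDLE into a descriptor and _fdopen() to wrap it, which is why this route is less convenient there.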

Yes it is correct, from a stable binary interface perspective, to use FILE * here. I think perhaps you're confusing this with using FILE as opposed to a pointer to it. Notice that your standard library's fopen, fgets, etc. functions all use (both as arguments and return values) the FILE * type as part of their public interfaces.

A FILE * is a standard ANSI/ISO C89 and C99 (even K&R) type. It is a portability dream and I'd prefer it over anything else. You're safe to go with it. It won't get any better than that.

Related

Is it possible to write fopen(), fscanf() or fprintf() functions using only C?

I have no issues with the library functions. I know that they work well. I am interested in their implementation. My question is: Can I write working versions of these functions for Windows x64 using only C?
Many of the standard library functions are written in C, and fopen, fread etc. are no exception. You can write a wrapper around open, read, write etc., which are usually lower-level functions.
If those are not available, you can do the same by calling the respective OS functions and wrapping them with your own implementation; you just have to make sure that they comply with the standard.
Just as an example you can find a source for fopen here.

Is it possible to use functions that acts on FILE* on custom structures?

Very often I see libraries that implement their own stream functionality instead of using FILE*. The typical interface will have a close function, similar to fclose(), and several open functions, one of which usually mimics fopen() and one of which usually accepts a few callbacks that should be used to open/close the stream and to read from/write to the stream.
As a reference, good examples of what I am talking about are
http://www.xmlsoft.org/xmlio.html or https://developer.gnome.org/gio/.
The approach, in general, seems very straightforward to me, however these libraries do not usually implement a replacement for all the functions in the standard library (e.g., fscanf(), fprintf(), ...).
Thus I wonder if an extension mechanism is available for standard library FILE* as well (e.g.: opening by providing callbacks for some low-level required functionalities). I was not able to find any reference about this capability, so I guess it is not part of any standard.
Anyway, here is my question: is this functionality available in the C standard library (any standard is fine, as long as it is portable)? If not, is there an easy option (i.e., one that does not require re-implementing all of the stdio.h functions) for implementing it on top of the standard library?
It depends on the C library you're using. Glibc, for example, supports custom streams through fopencookie (further documentation here). FreeBSD (and probably other BSDs as well, including OS X) have funopen. It doesn't look like Microsoft's C library supports such a feature.
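As a rough illustration of the glibc route, the sketch below uses fopencookie to back a FILE* with a fixed-size in-memory buffer, so that fprintf() works on a custom structure. The fixed_buf type and the function names are made up for this example, and the code is glibc-specific (it needs _GNU_SOURCE).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical custom structure that should receive the stream output. */
typedef struct {
    char data[64];
    size_t len;
} fixed_buf;

/* Write callback: appends to the buffer; returns 0 (an error) when full. */
static ssize_t fixed_buf_write(void *cookie, const char *buf, size_t size)
{
    fixed_buf *fb = cookie;
    size_t room = sizeof fb->data - 1 - fb->len;
    if (size > room)
        size = room;
    memcpy(fb->data + fb->len, buf, size);
    fb->len += size;
    fb->data[fb->len] = '\0';
    return (ssize_t)size;
}

/* Opens a write-only stream whose output lands in *fb. */
FILE *fixed_buf_open(fixed_buf *fb)
{
    cookie_io_functions_t io = {0};   /* no read, seek, or close hooks */
    io.write = fixed_buf_write;
    fb->len = 0;
    fb->data[0] = '\0';
    return fopencookie(fb, "w", io);
}

/* Illustrative use: ordinary stdio formatting into the custom buffer. */
int cookie_demo(void)
{
    fixed_buf fb;
    FILE *fp = fixed_buf_open(&fb);
    if (fp == NULL)
        return -1;
    fprintf(fp, "x=%d", 42);
    fclose(fp);                       /* flushes through fixed_buf_write */
    return strcmp(fb.data, "x=42") == 0 ? 0 : -1;
}
```

funopen on the BSDs follows the same idea with a different signature, so a portable library would still need a per-platform shim.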

string FILE stdio compatible?

Is there anything like a string file in stdio/string/stdlib ? I mean a special way to fopen a FILE stream, which actually directs the writes to an internal buffer and takes care of buffer allocation/reallocation ? After fclose, the text should be available as null-terminated char[] or similar.
I need to interface to legacy code that receives a FILE* as an argument and writes to it, and I'd prefer to avoid writing to a temporary disk file.
Other forms of storage could do instead of char[] (f.i. string), but a FILE* pointer must be available.
I am looking for an alternative to creating a temporary disk file.
fmemopen & open_memstream are in the POSIX 2008 standard, probably inspired by GNU libc string streams, and give in-memory FILE* streams.
See also this question quite similar to yours, and also that answer.
BTW, many operating systems have RAM based or virtual memory based filesystems (à la tmpfs)
If you are coding in C++11 (not in C) and perhaps for some earlier C++ standard you can of course use std::stringstream-s
So you could use open_memstream on POSIX systems, and some other solution on Windows (just with #if _POSIX_C_SOURCE >= 200809L per feature_test_macros(7) ...)
The C standard does not provide (yet) any in-memory FILE streams, so if you need them you have to code or use platform-specific functions.
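A minimal sketch of the open_memstream approach on a POSIX 2008 system follows. The legacy_report function stands in for the legacy code that only accepts a FILE*; it and the other names are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for legacy code that can only write to a FILE*. */
static void legacy_report(FILE *out)
{
    fprintf(out, "total=%d\n", 7);
}

/* Captures the legacy output in a malloc'd, null-terminated string. */
char *capture_report(void)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *mem = open_memstream(&buf, &len);
    if (mem == NULL)
        return NULL;
    legacy_report(mem);
    if (fclose(mem) != 0) {     /* buf and len are final after fclose() */
        free(buf);
        return NULL;
    }
    return buf;                 /* the caller must free() it */
}

/* Illustrative check of the captured text; 0 means success. */
int memstream_demo(void)
{
    int ok;
    char *text = capture_report();
    if (text == NULL)
        return -1;
    ok = strcmp(text, "total=7\n") == 0;
    free(text);
    return ok ? 0 : -1;
}
```

The stream grows its buffer as needed, which matches the allocation/reallocation behaviour asked for in the question.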
Create the temporary file using CreateFile(... FILE_ATTRIBUTE_TEMPORARY, FILE_FLAG_DELETE_ON_CLOSE ...) and then convert the HANDLE to FILE*.
You said you didn't like writing to a temporary file, so these flags to CreateFile are a strong hint to Windows to keep the file in cache if possible. And if Windows were to run out of RAM, even a char[] could end up in a swap file anyway.

How to deal with Unicode paths in a cross-platform C library?

I'm contributing to a C library. It has a function that takes a char* parameter for a file path name. The authors are mostly UNIX developers, and this works fine on unixes where char* mostly means UTF-8. (At least in GCC, the character set is configurable and UTF-8 is the default.)
However, char* means ANSI on Windows, which implies that it is currently impossible to use Unicode path names with this library on Windows, where wchar_t* should be used and only UTF-16 is supported. (A quick search on StackOverflow reveals that the ANSI Windows API functions can not be used with UTF-8.)
The question is, what is the right way to deal with this? We've come up with various ways to do it, but neither of us are Windows experts, so we can't really decide how to do it properly. Our goal is that the users of the library should be able to write cross-platform code that would work on unixes as well as windows.
Under the hood, the library has #ifdefs in place to differentiate between operating systems so that it can use POSIX functions on UNIXes and Win32 APIs on Windows.
So far, we've come up with the following possibilities:
1. Offer a separate Windows-only function that accepts a wchar_t*.
2. Require UTF-16 on Windows and #ifdef the library header in such a way that the function would accept wchar_t* on Windows.
3. Add a flag that would tell the function to cast the given char* to wchar_t* and call the widechar Windows APIs.
4. Create a variant of the function that takes a file descriptor (or file handle on Windows) instead of a file path.
5. Always require UTF-8 (even on Windows), and then inside the function, convert UTF-8 to UTF-16 and call the widechar Windows APIs.
The problem with options 1-4 is that they would require the user to consciously take care of portability themselves. Option 5 sounds good, but I'm not sure if this is the right way to go.
I'm also open to other suggestions or ideas that can solve this. :)
Since portability is an important goal for you, I think it is imperative for your function semantics to be precisely defined. Among other things, that means that the arguments' types and meanings don't vary across platforms. So, if you have a function that accepts regular char based paths then it should accept such paths on all systems, and the encoding expected of those paths should be well-defined (which does not necessarily mean "the same"). That rules out options (2) and (3).
Moreover, portability requires the same functions to be usable across all platforms; that rules out (1). Option (4) could be ok if a stream- and/or file descriptor-based approach were the only one provided by your library, but it yields portability only with respect to those functions, not with respect to the path-based ones. (And note that stream (FILE *) APIs are defined by C, whereas file descriptors are a POSIX concept, not native to C. In principle, therefore, streams are more portable than file descriptors.)
(5) could work, but it places stronger constraints than you actually need. It is not essential for the function to define the encoding expected (though it can); it suffices for it to define how that encoding is determined.
Additionally, you could add wchar_t-based functions that work everywhere (as opposed to Windows-only). Those might be more convenient for Windows users. Similar to alternative (4), however, that provides portability only with respect to those functions. Supposing that you don't want to drop the char-based ones, you would need to pair this alternative with some variation on (5).
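For reference, option (5) from the question could be sketched as below. The helper name lib_fopen_utf8 is hypothetical. On Windows it converts the UTF-8 path with MultiByteToWideChar and calls _wfopen; elsewhere it passes the bytes straight to fopen, assuming the file system treats paths as opaque byte strings.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

/* Hypothetical helper: the path is always UTF-8, on every platform. */
FILE *lib_fopen_utf8(const char *utf8_path, const char *mode)
{
#ifdef _WIN32
    wchar_t wmode[8];
    wchar_t *wpath;
    FILE *fp = NULL;
    size_t i;
    int n;
    /* Mode strings are plain ASCII, so widening char by char is enough. */
    for (i = 0; mode[i] != '\0' && i < 7; ++i)
        wmode[i] = (wchar_t)(unsigned char)mode[i];
    wmode[i] = L'\0';
    /* First call asks for the required length, terminator included. */
    n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            utf8_path, -1, NULL, 0);
    if (n <= 0)
        return NULL;
    wpath = malloc((size_t)n * sizeof *wpath);
    if (wpath == NULL)
        return NULL;
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            utf8_path, -1, wpath, n) > 0)
        fp = _wfopen(wpath, wmode);
    free(wpath);
    return fp;
#else
    return fopen(utf8_path, mode);
#endif
}

/* Illustrative round trip with a non-ASCII file name; 0 means success. */
int utf8_open_demo(void)
{
    /* "caf\xc3\xa9.txt" is the UTF-8 encoding of the name with e-acute. */
    const char *path = "caf\xc3\xa9.txt";
    char buf[8] = {0};
    FILE *fp = lib_fopen_utf8(path, "w");
    if (fp == NULL)
        return -1;
    fputs("ok", fp);
    fclose(fp);
    fp = lib_fopen_utf8(path, "r");
    if (fp == NULL) {
        remove(path);
        return -1;
    }
    if (fgets(buf, sizeof buf, fp) == NULL)
        buf[0] = '\0';
    fclose(fp);
    remove(path);   /* on Windows a wide-char removal would be needed here */
    return strcmp(buf, "ok") == 0 ? 0 : -1;
}
```

The same conversion would have to be applied to every path-taking function in the library, which is the cost of keeping a single char-based signature.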

Alternative to declaring a file pointer as FILE * in C

I am looking for an alternative to declaring the file pointer as "FILE * pointer", that is, a way to work with files without using the standard I/O library functions, maybe as a function of my own.
With the library it's easy, but what could such a function look like without the library?
Can somebody give some advice?
glibc includes a low-level file API for handling files with int file descriptors.
Note that a FILE* is not itself a file descriptor, and casting one to an integer does not yield a usable descriptor. To move between the two, call fileno() on a stream to get its underlying descriptor, or fdopen() on a descriptor to wrap it in a stream.
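A short sketch of the descriptor-based API on a POSIX system is given below, using open-style calls (mkstemp, write, lseek, read, close) with no FILE* involved. The documented bridge between the two worlds is fileno() and fdopen(); the function name lowlevel_demo and the temporary path are illustrative choices only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Writes and reads back three bytes using only int file descriptors. */
int lowlevel_demo(void)
{
    char path[] = "/tmp/lowlevel_XXXXXX";
    char buf[8] = {0};
    ssize_t n = -1;
    int fd;

    /* fileno() reports the descriptor behind an existing stream. */
    if (fileno(stdout) != STDOUT_FILENO)
        return -1;

    fd = mkstemp(path);             /* creates and opens a temporary file */
    if (fd < 0)
        return -1;
    if (write(fd, "abc", 3) == 3 && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, sizeof buf - 1);
    close(fd);
    unlink(path);
    return (n == 3 && strcmp(buf, "abc") == 0) ? 0 : -1;
}
```

Unlike stdio, these calls do no buffering or formatting, so anything like fprintf() would have to be built on top by hand.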
