How exactly does _fsopen() work?

How exactly does _fsopen() work? Does Linux also have a similar way of opening files, one that prepares the file for subsequent shared reading or writing based on shflag?
Referred article here.

How exactly does _fsopen() work?
You've linked to the docs. It does what they say it does. If you're asking how it is implemented then we cannot answer because that information is proprietary.
Does Linux also have a similar way of opening files, one that prepares the file for subsequent shared reading or writing based on shflag?
Linux does not have share modes. That's a Windows quirk. Under Linux or other Unix-like operating systems such as macOS, you don't need special flags or modes to share files between processes.
Overall, _fsopen() is an MS-specific variant of the C standard library's fopen() function. In addition to the share-mode flag, which is not relevant to other operating systems, it performs parameter validation in the manner of various other MS extension functions. On Linux, one takes responsibility for validating one's own arguments and simply uses fopen().
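To illustrate the point about Linux, here is a minimal sketch in which the same file is opened twice with plain fopen() and no share flag of any kind. The helper name open_twice is made up for this example and is not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: opens the same file twice without any share flag.
 * Returns 1 if both fopen() calls succeed, 0 otherwise. */
int open_twice(const char *path)
{
    FILE *first = fopen(path, "w");
    if (first == NULL)
        return 0;

    /* A second, independent open of the same file. There is no share
     * mode on the first open that could forbid this one. */
    FILE *second = fopen(path, "a");
    int ok = (second != NULL);

    if (second != NULL)
        fclose(second);
    fclose(first);
    return ok;
}
```

On a Unix-like system both opens are expected to succeed. Coordinating the actual reads and writes is then up to the programs involved.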

On Windows, files are opened using the CreateFileW function, which uses the NtCreateFile system call.
The dwShareMode argument specifies the file sharing policy as a combination of the flags FILE_SHARE_DELETE, FILE_SHARE_READ and FILE_SHARE_WRITE; the shflag argument of _fsopen is translated into these flags.
If you want to know what a possible implementation of the function could look like, first keep in mind that MSVCRT tries to support some equivalent of the POSIX file descriptor API. Then check the following functions:
_open_osfhandle allows you to convert NT HANDLE to POSIX-like file descriptor
_fdopen allows you to get a FILE * from a file descriptor (equivalent of POSIX fdopen function).
So a possible implementation could look like this (in pseudo code):
FILE *_fsopen(...)
{
    HANDLE hFile = CreateFileW(...);
    int fd = _open_osfhandle(hFile, ...);
    return _fdopen(fd, ...);
}
Linux doesn't provide an equivalent of file sharing policy, so there is no equivalent.
PS: Another related function is _wsopen, which combines CreateFileW and _open_osfhandle.
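For comparison, the same layering (open at the OS level, then wrap the result in a FILE *) can be sketched on a POSIX system with open() and fdopen(). This is only an illustration of the pattern in the pseudo code above, with no share-mode argument; the helper name open_as_stream and the chosen flags are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a file at the descriptor level, then wrap
 * the descriptor in a stdio stream. */
FILE *open_as_stream(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return NULL;

    FILE *fp = fdopen(fd, "w");
    if (fp == NULL)
        close(fd);   /* fdopen failed, so the descriptor is still ours */
    return fp;
}
```

Once fdopen() succeeds, the stream owns the descriptor, so a single fclose() releases both.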

Related

What is relationship between some C and Unix functions

For example, in C we have fopen, and in Unix we have open. There are some subtle differences between them, but they do the same thing.
There are also many other functions that exist in both C and Unix. What is the relationship between them? Which one should I prefer?

open is a system call from Unix systems.
fopen is the standard C function to open a file.
There are some advantages to using fopen rather than open.
It's multi-platform: since it's part of the C standard, you can port your program to any platform with a C compiler.
It supports the use of C standard functions (e.g. fprintf, fscanf).
If you are handling text files, those functions can deal with different newline characters (Unix/Windows).

fopen(3) returns a FILE* on success, but open(2) returns a file descriptor on success, so they are not doing the same thing (since they do not give the same type).
However, on Linux, fopen internally uses the open system call (and some others too...).
<stdio.h> file handles take care of buffering. With system calls like open and read, you had better do your own buffering.
See also this & that and read Advanced Linux Programming & syscalls(2). Be aware that on Linux, from the user-land application point of view, a system call is essentially an atomic elementary operation (e.g. the SYSCALL or SYSENTER machine instruction).
Use strace(1) to find out which system calls are executed (by a given process or command).
On Linux, the libc implements standard functions (like fprintf ....) on top of system calls.
Many system calls don't have any libc counterpart (except their wrapper), e.g. poll(2)
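As a small sketch of how the two layers relate, the function below writes a file through the buffered stdio layer and reads it back with the raw open()/read() system calls. The function name stdio_then_syscalls is made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Writes text through the buffered stdio layer, then reads it back with
 * the raw open()/read() system calls. Returns the number of bytes read,
 * or -1 on error. buf is not NUL-terminated by this function. */
long stdio_then_syscalls(const char *path, const char *text,
                         char *buf, size_t bufsize)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fputs(text, fp);           /* lands in the stdio buffer first */
    if (fclose(fp) != 0)       /* fclose flushes the buffer to the file */
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, bufsize);   /* unbuffered: one system call */
    close(fd);
    return (long)n;
}
```

Running a program that calls this under strace(1) should show the write and read system calls that the stdio functions and the direct calls end up making.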

Is it possible to fake a file stream, such as stdin, in C?

I am working on an embedded system with no filesystem and I need to execute programs that take input data from files specified via command line arguments or directly from stdin.
I know it is possible to bake-in the file data with the binary using the method from this answer: C/C++ with GCC: Statically add resource files to executable/library but currently I would need to rewrite all the programs to access the data in a new way.
Is it possible to bake-in a text file, for example, and access it using a fake file pointer to stdin when running the program?

If your system is an OS-less bare-metal system, then your C library will have "retargeting" stubs or hooks that you need to implement to hook the library into the platform. This will typically include low-level I/O functions such as open(), read(), write(), seek() etc. You can implement these however you wish in order to provide the basic stdin, stdout and stderr streams (in POSIX and most other implementations they will have fixed file descriptors 0, 1 and 2 respectively, and do not need to be explicitly opened), file I/O and, in this case, access to an arbitrary memory block.
open(), for example, will be passed a file or device name (the string may be interpreted any way you wish), and will return a file descriptor. You might perhaps recognise "cfgdata:" as a device name to access your "memory file", and you would return a unique descriptor that is then passed into read(). You use the descriptor to reference the data for managing the stream; probably little more than an index that is incremented by the number of characters read. The same index may be set directly by the seek() implementation.
Once you have implemented these functions, the higher level stdio functions or even C++ iostreams will work normally for the devices or filesystems you have supported in your low level implementation.
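The bookkeeping described above can be sketched as follows. This is not any particular C library's retargeting interface: the struct and the names mem_file, mem_read and mem_seek are hypothetical, and a real port would call something like them from whichever read/seek stubs its library expects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state for one "memory file": a block of bytes baked into
 * the binary plus the current read position. */
struct mem_file {
    const unsigned char *data;
    size_t size;
    size_t pos;
};

/* What a retargeted read() could do for this device: copy up to len
 * bytes from the current position and advance the index. Returns the
 * number of bytes copied; 0 means end of data. */
size_t mem_read(struct mem_file *f, void *buf, size_t len)
{
    size_t left = f->size - f->pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return len;
}

/* What a retargeted seek() could do: set the index directly, clamped to
 * the end of the block. Returns the new position. */
size_t mem_seek(struct mem_file *f, size_t offset)
{
    f->pos = (offset > f->size) ? f->size : offset;
    return f->pos;
}
```

A descriptor returned by the retargeted open() would then simply select which struct mem_file the later calls operate on.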

As commented, you could use the POSIX fmemopen function. You'll need a libc providing it, e.g. musl-libc or possibly glibc. BTW for benchmarking purposes you might install some tiny Linux-like OS on your hardware, e.g. uclinux
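A minimal sketch of the fmemopen approach, assuming a libc that provides it: the text is wrapped in a read-only FILE * and then consumed with ordinary stdio calls. The helper name first_line_from_memory is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Reads the first line of an in-memory text block through an ordinary
 * FILE *, as if it came from a file. Returns 0 on success, -1 on error. */
int first_line_from_memory(const char *text, char *line, int linesize)
{
    /* fmemopen takes a non-const buffer; mode "r" means it is only read. */
    FILE *fp = fmemopen((void *)text, strlen(text), "r");
    if (fp == NULL)
        return -1;

    int ok = (fgets(line, linesize, fp) != NULL);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Existing code that already accepts a FILE * parameter could be handed such a stream without being rewritten.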

What is the meaning of low-level I/O? How do I implement it in this program?

I need to write a C program that accepts three command line arguments:
input file one
input file two
name of output file
The program needs to read the data in from files 1 and 2 and concatenate the first file followed by the second file, resulting in the third file.
This seems like it should be pretty easy, but one of the stipulations of the assignment is to only use low-level I/O.
What exactly does that mean (low-level I/O)?

To answer the only question (what is low-level I/O) it probably means operating system native input/output functions.
In POSIX this would be e.g. open(), close(), read() and write().
On Windows e.g. CreateFile(), CloseHandle(), ReadFile() and WriteFile().
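Using the POSIX calls named above, the concatenation from the question could be sketched roughly like this. The function names copy_fd and concat_files, the 4096-byte buffer and the 0644 permissions are choices made for this illustration, and error handling is kept deliberately simple.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copies everything from in_fd to out_fd using only read() and write().
 * Returns 0 on success, -1 on error. */
static int copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {               /* write() may be partial */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
    }
    return (n < 0) ? -1 : 0;
}

/* Concatenates file1 followed by file2 into outfile.
 * Returns 0 on success, -1 on error. */
int concat_files(const char *file1, const char *file2, const char *outfile)
{
    int out = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0)
        return -1;

    const char *inputs[2] = { file1, file2 };
    int status = 0;
    for (int i = 0; i < 2 && status == 0; i++) {
        int in = open(inputs[i], O_RDONLY);
        if (in < 0) {
            status = -1;
            break;
        }
        status = copy_fd(in, out);
        close(in);
    }
    if (close(out) != 0)
        status = -1;
    return status;
}
```

A main() for the assignment would pass argv[1], argv[2] and argv[3] to concat_files after checking that argc is 4.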

Low level basically stands for OS level. This can be done by using System calls.
Application developers often do not have direct access to the system calls, but can access them through an application programming interface (API). The functions that are included in the API invoke the actual system calls. By using the API, certain benefits can be gained:
Portability: as long as a system supports an API, any program using that API can compile and run.
Ease of Use: using the API can be significantly easier than using the actual system call.
For more information on system calls have a look here ,here and here.
For your program have a look here.

Using `read` system call on a directory

I was looking at an example in K&R 2 (8.6 Example - Listing Directories). It is a stripped-down version of the Linux command ls or Windows' dir. The example shows an implementation of functions like opendir and readdir. I've typed in the code word for word but it still doesn't work. All it does is print a dot (for the current directory) and exit.
One interesting thing I found in the code (in the implementation of readdir) was that it was calling the system calls like open and read on directory. Something like -
int fd, n;
char buf[1000], *bufp;
bufp = buf;
fd = open("dirname", O_RDONLY, 0);
n = read(fd, bufp, 1000);
write(fd, bufp, n);
When I run this code I get no output even when the folder name "dirname" has some files in it.
Also, the book says, that the implementation is for Version 7 and System V UNIX systems. Is that the reason why it is not working on Linux?
Here is the code- http://ideone.com/tw8ouX.
So does Linux not allow read system calls on directories? Or something else is causing this?

In Version 7 UNIX, there was only one unix filesystem, and its directories had a simple on-disk format: array of struct direct. Reading it and interpreting the result was trivial. A syscall would have been redundant.
In modern times there are many kinds of filesystems that can be mounted by Linux and other unix-like systems (ext4, ZFS, NTFS!), some of which have complex directory formats. You can't do anything sensible with the raw bytes of an arbitrary directory. So the kernel has taken on the responsibility of providing a generic interface to directories as abstract objects. readdir is the central piece of that interface.
Some modern unices still allow read() on a directory, because it's part of their history. Linux history began in the 90's, when it was already obvious that read() on a directory was never going to be useful, so Linux has never allowed it.
Linux does provide a readdir syscall, but it's not used very much anymore, because something better has come along: getdents. readdir only returns one directory entry at a time, so if you use the readdir syscall in a loop to get a list of files in a directory, you enter the kernel on every loop iteration. getdents returns multiple entries into a buffer.
readdir is, however, the standard interface, so glibc provides a readdir function that calls the getdents syscall instead of the readdir syscall. In an ordinary program you'll see readdir in the source code, but getdents in the strace. The C library is helping performance by buffering, just like it does in stdio for regular files when you call getchar() and it does a read() of a few kilobytes at a time instead of a bunch of single-byte read()s.
You'll never use the original unbuffered readdir syscall on a modern Linux system unless you run an executable that was compiled a long time ago, or go out of your way to bypass the C library.
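For reference, a minimal sketch of listing a directory through the standard opendir/readdir library interface instead of read(). The function name list_directory is made up for this example.

```c
#include <dirent.h>
#include <stdio.h>

/* Prints every entry of a directory using the portable opendir/readdir
 * interface and returns how many entries were seen, or -1 on error. */
int list_directory(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        puts(entry->d_name);   /* includes "." and ".." where reported */
        count++;
    }
    closedir(dir);
    return count;
}
```

Running a program built around this under strace on Linux should show getdents-style system calls rather than one kernel entry per directory entry.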

In fact, Linux doesn't allow read on directories. See the man page and search for the errno value EISDIR. You will find:
The read() and pread() functions shall fail if ...
The fildes argument refers to a directory and the implementation does not allow the directory to be read using read() or pread(). The readdir() function should be used instead.
Other UNIXes allow it nevertheless.
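The behaviour can be checked with a small sketch like the one below, which opens a directory read-only and reports what read() does with it. The function name errno_from_reading_directory is made up for this example; on Linux the expected result is EISDIR, while other systems may differ.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to read() from a directory. Returns 0 if the read does not
 * fail, otherwise the errno value reported by the failing call. */
int errno_from_reading_directory(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);   /* opening a directory is allowed */
    if (fd < 0)
        return errno;

    char buf[64];
    int result = 0;
    if (read(fd, buf, sizeof buf) < 0)
        result = errno;                 /* EISDIR on Linux */
    close(fd);
    return result;
}
```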

What is a file handle and where is it useful for a programmer?

I am learning assembly language along with C. The new chapter I have started talks about 'file handles': file handles for screen display, file handles for keyboard input, etc. What is a file handle? I am referring to IBM PC ASSEMBLY LANGUAGE PROGRAMMING by Peter Abel.

There is a generic concept generally called a "handle" in the context of computer software APIs. In the comments you have probably found a link to the Wikipedia article on that subject.
You are dealing with a specific implementation of a handle data type -- the IBM PC/DOS file handles returned from the int 0x21 interface. If you would like to learn more about these specific file handles, you might want to consult the book Undocumented DOS, which details the in-memory data structures which allow you to investigate these handles further.
Another specific type of handle is the file descriptor returned from the POSIX-standard interface named open(). This function is implemented in the C run-time library on platforms such as Linux, Windows NT, Mac OS, and many other systems. A valid descriptor returned from open() is never negative; a return value of -1 indicates an error.
Unless you are running under DOS, your file handles are probably provided by the Windows NT Operating System. These file handles are returned from CreateFile() (which is used to open as well as create files), and the only illegal value for a handle returned from this function is INVALID_HANDLE_VALUE. I.e., the Windows NT API may return what would be considered (via casting) a "negative" integer, although it has opened the file.
In all of these cases, the file handle is used to refer to some data structure that keeps track of how the file is open. One important thing which is tracked is the current file position. In POSIX the position or pointer is set by the lseek() function, and the current position can be read with lseek(fd, 0, SEEK_CUR). Any read() or write() takes place at the position of the current file pointer.
Your program can open the same file under two different handles. In this case, the file pointer for each handle is distinct. Updating the file pointer of one handle using lseek() will not affect the file pointer of the other handle to the same file.
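The independence of the two file pointers can be seen in a short sketch. It opens one file twice, moves only the first descriptor, and then queries both positions with lseek(fd, 0, SEEK_CUR), which returns the current offset without changing it. The function name compare_offsets is made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens the same file twice, moves the position of the first descriptor,
 * and stores both positions. Returns 0 on success, -1 on error. */
int compare_offsets(const char *path, long *first_pos, long *second_pos)
{
    int fd1 = open(path, O_RDONLY);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    lseek(fd1, 3, SEEK_SET);                      /* move only fd1 */
    *first_pos  = (long)lseek(fd1, 0, SEEK_CUR);  /* query without moving */
    *second_pos = (long)lseek(fd2, 0, SEEK_CUR);

    close(fd1);
    close(fd2);
    return 0;
}
```

After the call, the first position is 3 while the second is still 0, because each open() created its own file position.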

A file handle is an integer value which is used to address an open file. Such handles are highly operating system specific, but on systems that support the open() call, you create a handle like this:
int handle = open( "foo.txt", OTHER_STUFF_HERE );
You can then use the handle with read/write calls. The non-portability of handles means that most people avoid them and instead use the stream library functions in C, such as fopen, fread, fwrite etc.

A handle is something the kernel uses internally to access some resource. Only the kernel really knows what it means; the user process is only told what value to use when it wants to access this resource. Handles have another advantage in that file handles can be shared among processes, whereas you can't do this with pointers.
Windows uses handles all over the place... files, bitmaps, device contexts, fonts, etc.