I need to concurrently read from a file in different offsets using C.
dup unfortunately creates a file descriptor that shares offset and flags with the original.
Is there a function like dup that does not share the offset and flags?
EDIT: I only have access to the file pointer FILE* fp; I do not have the file path.
EDIT: This program is compiled for Windows in addition to Mac and many flavors of Linux.
SOLUTION
We can use pread on POSIX systems, and I wrote a pread function for Windows which solves this problem:
https://github.com/Storj/libstorj/blob/master/src/utils.c#L227
On Linux, you can recover the filename from /proc/self/fd/N, where N is the integral value of the file descriptor:
sprintf( linkname, "/proc/self/fd/%d", fd );
Then use readlink() on the resulting link name.
If the file has been renamed or deleted, you may be out of luck.
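A minimal sketch of the /proc lookup described above, assuming Linux with /proc mounted. The helper name fd_to_path is hypothetical and not part of any library; note that readlink() does not NUL-terminate its output, so the sketch does that itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the path behind fd into buf (NUL-terminated).
 * Returns the path length, or -1 on error or possible truncation.
 * Linux-only, because it relies on /proc/self/fd. */
ssize_t fd_to_path(int fd, char *buf, size_t bufsize)
{
    char linkname[64];
    ssize_t len;

    if (bufsize < 2)
        return -1;
    snprintf(linkname, sizeof linkname, "/proc/self/fd/%d", fd);

    /* readlink() does not NUL-terminate, so reserve one byte for that. */
    len = readlink(linkname, buf, bufsize - 1);
    if (len < 0 || (size_t)len >= bufsize - 1)
        return -1;   /* error, or the path may have been truncated */
    buf[len] = '\0';
    return len;
}
```

With only a FILE*, the call would be fd_to_path(fileno(fp), path, sizeof path). For a file that has since been deleted, Linux appends " (deleted)" to the link target, so the result is not always a usable path.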
But why do you need another file descriptor? You can use pread() and/or pwrite() on the original file descriptor to read/write from/to the file without affecting the current offset. (caveat: on Linux, pwrite() to a file opened in append mode is buggy - POSIX states that pwrite() to a file opened in append mode will write to the offset specified in the pwrite() call, but the Linux pwrite() implementation is broken and will ignore the offset and append the data to the end of the file - see the BUGS section of the Linux man page)
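As a rough illustration of the pread() approach, here is a sketch that reads at an absolute offset through an existing FILE*. The helper name read_at is hypothetical. It assumes any data written through fp has already been flushed, since pread() bypasses the stdio buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to count bytes at an absolute offset from
 * the file behind fp, without moving the descriptor's file offset.
 * Returns the number of bytes read, or -1 on error (errno set). */
ssize_t read_at(FILE *fp, void *buf, size_t count, off_t offset)
{
    int fd = fileno(fp);

    if (fd < 0)
        return -1;
    /* pread() takes the offset as an argument and ignores the shared
     * file offset, so concurrent callers do not disturb each other. */
    return pread(fd, buf, count, offset);
}
```

Each thread can then call read_at() with its own offset on the same fp, with no dup() or second open() needed.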
No, neither C nor POSIX (since you mention dup()) has a function for opening a new, independent file handle based on an existing file handle. As you observed, you can dup() a file descriptor, but the result refers to the same underlying open file description.
To get an independent handle, you need to open() or fopen() the same path (which is possible only if the FILE refers to an object accessible through the file system). If you don't know what path that is, or if there isn't any in the first place, then you'll need a different approach.
Some alternatives to consider:
buffer some or all of the file contents in memory, and read as needed from the buffer to serve your needs for independent file offsets;
build an internal equivalent of the tee command; this will probably require a second thread, and you'll probably not be able to read one file too far ahead of the other, or to seek in either one;
copy the file contents to a temp file with a known name, and open that as many times as you want;
if the FILE corresponds to a regular file, map it into memory and access its contents there. The POSIX function fmemopen() could be useful in this case to adapt the memory mapping to your existing stream-based usage.
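The last alternative above (memory mapping plus fmemopen()) could look roughly like this on a POSIX system. The helper name open_independent_view is hypothetical; the sketch assumes fp refers to a non-empty regular file whose contents have already been flushed to the descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: map the regular file behind fp read-only and return
 * an independent read-only stream over the mapping. The mapping address and
 * length are returned through map_out and len_out so the caller can
 * munmap() them after fclose()-ing the returned stream. */
FILE *open_independent_view(FILE *fp, void **map_out, size_t *len_out)
{
    struct stat st;
    int fd = fileno(fp);

    if (fd < 0 || fstat(fd, &st) < 0)
        return NULL;
    if (!S_ISREG(st.st_mode) || st.st_size == 0)
        return NULL;   /* only non-empty regular files can be mapped here */

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        return NULL;

    /* Each fmemopen() stream keeps its own position over the buffer. */
    FILE *view = fmemopen(map, (size_t)st.st_size, "r");
    if (view == NULL) {
        munmap(map, (size_t)st.st_size);
        return NULL;
    }
    *map_out = map;
    *len_out = (size_t)st.st_size;
    return view;
}
```

Calling the helper twice yields two streams that can be positioned with fseek() independently of each other and of the original fp.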
On Windows (assuming Visual Studio), you can get access to the OS file handle from the stdio FILE handle.
From there, reopen it and convert back to a new FILE handle.
This is Windows-only, but I think Andrew's answer will work for Linux and probably the Mac as well. Unfortunately, there is no portable way to have it work on all systems.
#include <Windows.h>
#include <fcntl.h>
#include <io.h>
#include <stdio.h>
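FILE *jreopen(FILE* f)
{
    int n = _fileno(f);
    HANDLE h = (HANDLE)_get_osfhandle(n);
    if (h == INVALID_HANDLE_VALUE) return NULL;
    /* ReOpenFile returns a new handle with its own file pointer */
    HANDLE h2 = ReOpenFile(h, GENERIC_READ, FILE_SHARE_READ, 0);
    if (h2 == INVALID_HANDLE_VALUE) return NULL;
    int n2 = _open_osfhandle((intptr_t)h2, _O_RDONLY);
    if (n2 == -1) { CloseHandle(h2); return NULL; }
    FILE* g = _fdopen(n2, "r");
    if (g == NULL) _close(n2); /* _close also closes h2 */
    return g;
}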
I was able to use pread and pwrite on POSIX systems, and I wrapped ReadFile/WriteFile on Windows Systems into pread and pwrite functions
#ifdef _WIN32
ssize_t pread(int fd, void *buf, size_t count, uint64_t offset)
{
long unsigned int read_bytes = 0;
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(OVERLAPPED));
overlapped.OffsetHigh = (uint32_t)((offset & 0xFFFFFFFF00000000LL) >> 32);
overlapped.Offset = (uint32_t)(offset & 0xFFFFFFFFLL);
HANDLE file = (HANDLE)_get_osfhandle(fd);
SetLastError(0);
bool RF = ReadFile(file, buf, count, &read_bytes, &overlapped);
// With an OVERLAPPED offset, ReadFile reports end of file as ERROR_HANDLE_EOF, so that case is not treated as an error
if ((RF == 0) && GetLastError() != ERROR_HANDLE_EOF) {
errno = GetLastError();
// printf ("Error reading file : %d\n", GetLastError());
return -1;
}
return read_bytes;
}
ssize_t pwrite(int fd, const void *buf, size_t count, uint64_t offset)
{
long unsigned int written_bytes = 0;
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(OVERLAPPED));
overlapped.OffsetHigh = (uint32_t)((offset & 0xFFFFFFFF00000000LL) >> 32);
overlapped.Offset = (uint32_t)(offset & 0xFFFFFFFFLL);
HANDLE file = (HANDLE)_get_osfhandle(fd);
SetLastError(0);
bool RF = WriteFile(file, buf, count, &written_bytes, &overlapped);
if ((RF == 0)) {
errno = GetLastError();
// printf ("Error writing file :%d\n", GetLastError());
return -1;
}
return written_bytes;
}
#endif
Related
If I want to use a physical file along with other types of streams such as a socket, I can simply convert a file handle into a file descriptor:
#include <stdlib.h>
#include <stdio.h>
int main(void) {
FILE *f = fopen("uniquefilename.ext", "w");
int fd = fileno(f);
printf("%d\n", fd);
fclose(f);
return 0;
}
Does the GNU Standard Library provide a way to obtain a physical file's descriptor directly? Something to the effect of:
int fd = some_call("file_name.ext", "mode");
It seems I need to note that I am completely aware that a descriptor is not implicitly bound to any specific file. I was misleading when I wrote "obtain a physical file's descriptor"; what I should have written is something like "create a descriptor enabling access to a specific physical file".
It does not.
However, you can use the open function directly! This is part of Linux itself, not the C standard library (technically the C standard library provides a small wrapper to allow you to call it as a C function).
Example usage:
int fd = open("file_name.ext", O_RDWR); // not fopen
// do stuff with fd
close(fd); // not fclose
Note: The man page recommends including <sys/types.h>, <sys/stat.h>, and <fcntl.h>, and for close you need <unistd.h>. That's quite a few headers, and I don't know if they're all necessary.
I am trying to open a file in C using open() and I need to check that the file is a regular file (it can't be a directory or a block file). Every time I run open() my returned file descriptor is 3, even when I don't enter a valid filename!
Here's what I have
/*
* Checks to see if the given filename is
* a valid file
*/
int isValidFile(char *filename) {
// We assume argv[1] is a filename to open
int fd;
fd = open(filename,O_RDWR|O_CREAT,0644);
printf("fd = %d\n", fd);
/* fopen returns 0, the NULL pointer, on failure */
}
Can anyone tell me how to validate input files?
Thanks!
Try this:
int file_isreg(const char *path) {
struct stat st;
if (stat(path, &st) < 0)
return -1;
return S_ISREG(st.st_mode);
}
This code will return 1 if regular, 0 if not, -1 on error (with errno set).
If you want to check the file via its file descriptor returned by open(2), then try:
int fd_isreg(int fd) {
struct stat st;
if (fstat(fd, &st) < 0)
return -1;
return S_ISREG(st.st_mode);
}
You can find more examples here, (specifically in the path.c file).
You should also include the following headers in your code (as stated on stat(2) manual page):
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
For future reference, here is an excerpt of the stat(2) manpage regarding the POSIX macros available for st_mode field validations:
S_ISREG(m) is it a regular file?
S_ISDIR(m) directory?
S_ISCHR(m) character device?
S_ISBLK(m) block device?
S_ISFIFO(m) FIFO (named pipe)?
S_ISLNK(m) symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m) socket? (Not in POSIX.1-1996.)
int isValidFile(char *filename) {
// We assume argv[1] is a filename to open
int fd;
fd = open(filename,O_RDWR|O_CREAT,0644); /* <-- note O_CREAT */
printf("fd = %d\n", fd);
/* fopen returns 0, the NULL pointer, on failure */
}
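You are using O_CREAT, which tells open() to create the file if it doesn't exist, so the call succeeds even for a filename that did not exist before. The descriptor is 3 because that is the lowest free entry in the descriptor table (0, 1, and 2 being standard input, standard output, and standard error).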
Wrong: check if the file is OK, then if it is, go open it and use it.
Right: go open it. If you can't, report the problem and bail out. Otherwise, use it (checking and reporting errors after each operation).
Why: you have just checked that a file is OK. That's fine, but you cannot assume it will be OK in 0.000000017 seconds from now. Perhaps the disk will overheat and break down. Perhaps some other process will mass-delete your entire file collection. Perhaps your cat will trip over the network cable. So let's just check if it's OK again, and then go open it. Wow, what a great idea! No wait...
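The "open first, then check" order can be sketched as follows. The helper name open_regular is hypothetical, and using EINVAL to mean "not a regular file" is an arbitrary choice made for this example. Because fstat() inspects the descriptor that was actually opened, the check cannot be invalidated by the path changing afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only and return the descriptor only
 * if it refers to a regular file. Returns -1 with errno set otherwise. */
int open_regular(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);   /* no O_CREAT: a missing file fails */

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        int saved = errno;           /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    if (!S_ISREG(st.st_mode)) {
        close(fd);
        errno = EINVAL;   /* assumption: EINVAL stands for "not regular" */
        return -1;
    }
    return fd;
}
```

Opening a FIFO this way can block until a writer appears; adding O_NONBLOCK to the flags avoids that if untrusted paths are possible.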
I'm working on a C application that evaluates data from a USB laser scanner, which acts as a serial device. For testing purpose, I'm also allowing test data to be read from a file, because it is not convenient to always have the scanner connected.
I open the file/device like this:
FILE *fp = fopen(argv[1], "a+b");
And depending on whether I want to read from a file or the device, I pass a file path or something like /dev/cu.usbmodemfd121 (I'm on a Mac).
This works fine as long as I've previously initialized the laser scanner, but I'd rather have my application do that. In order to do that, though, I must first figure out if I'm reading from a file or the device. How can I do that, given the FILE * returned by fopen?
I've tried to use fseek(fp, 1, SEEK_END), which I expected to fail for the scanner since its stream doesn't have an "end", but for some reason fseek does not fail.
You could get the file descriptor using fileno and then do an fstat on it. The struct stat it populates contains things like st_mode, which shows the type of the fd. I am guessing that for your non-file device S_ISCHR will be true, or at least S_ISREG will be false.
If you have control over it, don't do fopen at all. Use open directly to get the file descriptor and then use fdopen if you really want C streams.
@cnicutar's solution worked just fine. Here's what I ended up with, in case it helps somebody (error checking removed for clarity):
#include <fcntl.h> /* open syscall */
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
int fd = -1;
int status;
FILE *fp = NULL;
struct stat fd_stat;
bool serial_device = false;
fd = open("/foo/bar/baz", O_RDONLY);
fp = fdopen(fd, "rb");
status = fstat(fd, &fd_stat);
printf("S_ISCHR: %d\n", S_ISCHR(fd_stat.st_mode));
printf("S_ISREG %d\n", S_ISREG(fd_stat.st_mode));
if(!S_ISREG(fd_stat.st_mode)) {
serial_device = true;
}
In Unix, if you have a file descriptor (e.g. from a socket, pipe, or inherited from your parent process), you can open a buffered I/O FILE* stream on it with fdopen(3).
Is there an equivalent on Windows for HANDLEs? If you have a HANDLE that was inherited from your parent process (different from stdin, stdout, or stderr) or a pipe from CreatePipe, is it possible to get a buffered FILE* stream from it? MSDN does document _fdopen, but that works with integer file descriptors returned by _open, not generic HANDLEs.
Unfortunately, HANDLEs are completely different beasts from FILE*s and file descriptors. The CRT ultimately handles files in terms of HANDLEs and associates each HANDLE with a file descriptor. Those file descriptors in turn back the structure pointed to by FILE*.
Fortunately, there is a section on this MSDN page that describes functions that "provide a way to change the representation of the file between a FILE structure, a file descriptor, and a Win32 file handle":
_fdopen, _wfdopen: Associates a stream with a file that was
previously opened for low-level I/O and returns a pointer to the open
stream.
_fileno: Gets the file descriptor associated with a stream.
_get_osfhandle: Return operating-system file handle associated
with existing C run-time file descriptor
_open_osfhandle: Associates C run-time file descriptor with an
existing operating-system file handle.
Looks like what you need is _open_osfhandle followed by _fdopen to obtain a FILE* from a HANDLE.
Here's an example involving HANDLEs obtained from CreateFile(). When I tested it, it shows the first 255 characters of the file "test.txt" and appends " --- Hello World! --- " at the end of the file:
#include <windows.h>
#include <io.h>
#include <fcntl.h>
#include <cstdio>
#include <cstring> // memset, strlen
int main()
{
HANDLE h = CreateFile("test.txt", GENERIC_READ | GENERIC_WRITE, 0, 0,
OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
if(h != INVALID_HANDLE_VALUE)
{
int fd = _open_osfhandle((intptr_t)h, _O_APPEND | _O_RDONLY);
if(fd != -1)
{
FILE* f = _fdopen(fd, "a+");
if(f != 0)
{
char rbuffer[256];
memset(rbuffer, 0, 256);
fread(rbuffer, 1, 255, f);
printf("read: %s\n", rbuffer);
fseek(f, 0, SEEK_CUR); // Switch from read to write
const char* wbuffer = " --- Hello World! --- \n";
fwrite(wbuffer, 1, strlen(wbuffer), f);
fclose(f); // Also calls _close()
}
else
{
_close(fd); // Also calls CloseHandle()
}
}
else
{
CloseHandle(h);
}
}
}
This should work for pipes as well.
Here is a more elegant way of doing this instead of CreateFile: specify "N" in fopen(). It's a Microsoft-specific extension to fopen, but since this code is platform-specific anyway, it's ok. When called with "N", fopen adds _O_NOINHERIT flag when calling _open internally.
Based on this:
Windows C Run-Time _close(fd) not closing file
I'd like to read only what is already in the buffer of a FILE object, so that afterwards the buffer is empty (and I can use things like sendfile, which operates on file descriptors). I came up with this function, which seems to work on my 64-bit Linux installation:
int readbuf(FILE *stream, char buf[], size_t *size) {
off_t pos = ftello(stream);
if (pos < 0) return -1;
off_t realpos = lseek(fileno(stream), 0, SEEK_CUR);
if (realpos < 0) return -1;
if (pos > realpos) {
errno = EIO;
return -1;
}
size_t bufsize = realpos - pos;
if (bufsize > *size) {
*size = bufsize;
errno = ERANGE;
return -1;
}
*size = bufsize;
if (fread(buf, bufsize, 1, stream) < 1) {
return -1;
}
return 0;
}
Now I wonder, can I assume this to work on other POSIX compliant operating systems? (On systems that provide all the involved functions.)
If the underlying file descriptor is seekable (either a regular file or a block device, unless you have other weird seekable objects on your system...) then there's no point in what you're trying to do. Just use ftello to get the logical position in the FILE, then discard the FILE and use sendfile. Using the already-buffered data in userspace is actually slower than sendfile anyway.
If the underlying file descriptor is not seekable, your whole approach does not work, because both lseek and ftello will always fail and return -1. A potential solution in this case:
Use dup to make a new file descriptor referring to the same open file description.
Open /dev/null write-only, and dup2 it on top of the old file descriptor number used by the FILE.
Reading from the FILE will succeed until the buffer is exhausted, then give read errors, since the file descriptor now refers to a non-readable file.
At this point, you're free to read directly from the duplicated fd made in the first step. You're also free to fclose the FILE.
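The four steps above can be sketched as follows. The helper name drain_stream is hypothetical, and the sketch assumes a POSIX system with /dev/null and a stdio implementation that serves buffered bytes before attempting another read() on the descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper following the steps above. Copies whatever is still
 * buffered in stream into buf (at most size bytes) and hands back a
 * duplicate descriptor for direct reads through fd_out. Returns the number
 * of bytes drained, or -1 on error. Afterwards the stream is only good
 * for fclose(). */
long drain_stream(FILE *stream, char *buf, size_t size, int *fd_out)
{
    int fd = fileno(stream);
    int copy, devnull, c;
    size_t n = 0;

    if (fd < 0 || (copy = dup(fd)) < 0)        /* step 1: keep the real file */
        return -1;
    devnull = open("/dev/null", O_WRONLY);     /* step 2: replace the old fd */
    if (devnull < 0 || dup2(devnull, fd) < 0) {
        if (devnull >= 0)
            close(devnull);
        close(copy);
        return -1;
    }
    close(devnull);

    /* Step 3: buffered bytes come back normally; the first refill attempt
     * fails because fd now refers to a write-only file. */
    while (n < size && (c = getc(stream)) != EOF)
        buf[n++] = (char)c;

    *fd_out = copy;                            /* step 4: read from the copy */
    return (long)n;
}
```

If buf is smaller than the buffered data, the remainder stays in the stream, so the caller would need to call getc() again until it returns EOF before relying on the duplicated descriptor alone.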
For seekable files on Unix platforms you're supposed to be able to use fflush() to coordinate fd-based use with FILE*-based use, including for reading. The full details are given in http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05_01 and http://pubs.opengroup.org/onlinepubs/9699919799/functions/fflush.html.
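A small sketch of that fflush()-based handoff, assuming a seekable input stream and a stdio implementation that follows the POSIX rules linked above. The helper name handoff_to_fd is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: hand a seekable input stream over to fd-based I/O.
 * After a successful call, the descriptor's offset should match the
 * stream's logical position, so read() or sendfile() can continue where
 * stdio left off. Returns the descriptor, or -1 on error. */
int handoff_to_fd(FILE *stream)
{
    int fd = fileno(stream);

    if (fd < 0)
        return -1;
    /* On a seekable input stream, POSIX fflush() sets the underlying
     * file offset to the stream's position and discards read-ahead. */
    if (fflush(stream) != 0)
        return -1;
    return fd;
}
```

Plain ISO C leaves fflush() on an input stream undefined, so this relies on the POSIX extension and is worth verifying on each target platform.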
This is an extension over what standard C gives you (unsurprisingly).
I do not believe the stdio API guarantees that this would work on any system. For instance, it might perform readahead if it notices the buffer is empty.
Your "solution" would be at most a specific implementation hack.