pread for very large files - c

I am reading a large file using pread as follows:
ssize_t s = pread(fd, buff, count, offset);
if (s != (ssize_t) count)
fprintf(stderr, "s = %ld != count = %ld\n", s, count);
assert(s == (ssize_t ) count);
The above code has been working fine for small files (up to 1.5GB). However, for larger files, the number of bytes returned differs from the expected count.
In particular, for 2.4GB file size, my count is set to 2520133890 and the assertion fails with the fprintf saying:
s = 2147479552 != count = 2520133890
What makes this puzzling is that I am working on a 64-bit system and hence, sizeof(ssize_t) = 8.
What is the cause of this failure and how do I resolve this so that I can read the whole file in one go?

It looks like you are on Linux, and the magic number returned by pread is 2147479552 = 0x7ffff000, so the answer is in man 2 read:
On Linux, read() (and similar system calls) will transfer at
most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes
actually transferred. (This is true on both 32-bit and 64-bit
systems.)
So you need to call pread at least twice to get your data.
This restriction is not related to _FILE_OFFSET_BITS=64, O_LARGEFILE, sizeof(off_t) and the like;
it is imposed by rw_verify_area in the Linux kernel:
/*
* rw_verify_area doesn't like huge counts. We limit
* them to something that fits in "int" so that others
* won't have to do range checks all the time.
*/
int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count)
...
return count > MAX_RW_COUNT ? MAX_RW_COUNT : count;
#define MAX_RW_COUNT (INT_MAX & PAGE_CACHE_MASK)
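The cap quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes a 4 KiB page size (typical on x86-64, but not universal) and uses a hypothetical helper name, max_rw_count(); it only mirrors the kernel macro and does not query the kernel.

```c
#include <limits.h>

/* Hypothetical helper: mirrors MAX_RW_COUNT for a given page size.
 * page_size must be a power of two; 4096 is an assumption, not a given. */
long max_rw_count(long page_size)
{
    /* Clearing the low bits rounds INT_MAX down to a page boundary. */
    return (long)INT_MAX & ~(page_size - 1);
}
```

With page_size = 4096 this yields 0x7ffff000, the value reported in the question.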

From your description it sounds like you're doing a 32-bit build, and you haven't enabled the Large File Support (LFS). In order to do this, you need to set the macro _FILE_OFFSET_BITS to the value 64.
So, please double-check that you're really doing a 64-bit build like you say. EDIT: OK, I believe you are indeed using a 64-bit system.
I think the correct cause of your problem, as pointed out in the answer https://stackoverflow.com/a/36568630/75652 , is explained in the read(2) man page: http://man7.org/linux/man-pages/man2/read.2.html . In order to handle this, you need code like
bytes_left = count;
while (bytes_left > 0)
{
    trans = pread (fd, buff, bytes_left, offset);
    if (trans == -1)
    {
        if (errno == EINTR)
            continue;
        else
            return trans;
    }
    if (trans == 0)
        break; /* end of file: stop instead of looping forever */
    buff += trans;
    bytes_left -= trans;
    offset += trans;
}
return count - bytes_left;
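For reference, here is a self-contained sketch of the same loop with its declarations filled in. The names pread_full() and read_from_path() are hypothetical, and the choice to return a short count at end of file (rather than treat it as an error) is an assumption about what the caller wants.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling pread() until count bytes are read,
 * end of file is reached, or a real error occurs.
 * Returns the number of bytes read, or -1 on error (errno is set). */
ssize_t pread_full(int fd, void *buf, size_t count, off_t offset)
{
    char *p = buf;
    size_t left = count;

    while (left > 0) {
        ssize_t n = pread(fd, p, left, offset);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same request */
            return -1;
        }
        if (n == 0)
            break;              /* end of file: return the short count */
        p += n;
        left -= (size_t)n;
        offset += n;
    }
    return (ssize_t)(count - left);
}

/* Hypothetical usage: open a file, read count bytes at offset, close it. */
ssize_t read_from_path(const char *path, void *buf, size_t count, off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t got = pread_full(fd, buf, count, offset);
    close(fd);
    return got;
}
```

A caller that needs the whole range should compare the return value with count, since a short count here means the file ended early.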

Related

Read/write exactly N bytes from/to file descriptor with C on Unix

I know that read/write C functions from <unistd.h> are not guaranteed to read/write exactly N bytes as requested by size_t nbyte argument (especially for sockets).
How to read/write full buffer from/to a file(or socket) descriptor?
That read() and write() do not guarantee to transfer the full number of bytes requested is a feature, not a shortcoming. If that feature gets in your way in a particular application then it is probably better to use the existing facilities of the standard library to deal with it than to roll your own (though I certainly have rolled my own from time to time).
Specifically, if you have a file descriptor on which you want to always transfer exact numbers of bytes then you should consider using fdopen() to wrap it in a stream and then performing I/O with fread() and fwrite(). You might also use setvbuf() to avoid having an intermediary buffer. As a possible bonus, you can then also use other stream functions with that, such as fgets() and fprintf().
Example:
int my_fd = open_some_resource();
// if (my_fd < 0) ...
FILE *my_file = fdopen(my_fd, "r+b");
// if (my_file == NULL) ...
int rval = setvbuf(my_file, NULL, _IONBF, 0);
// if (rval != 0) ...
Note that it is probably best to thereafter use only the stream, not the underlying file descriptor, and that is the main drawback of this approach. On the other hand, you can probably allow the FD to be lost, because closing the stream will also close the underlying FD.
Nothing particularly special is required to make fread() and fwrite() transfer full-buffer units (or fail):
char buffer[BUF_SIZE];
size_t blocks = fread(buffer, BUF_SIZE, 1, my_file);
// if (blocks != 1) ...
// ...
blocks = fwrite(buffer, BUF_SIZE, 1, my_file);
// if (blocks != 1) ...
Do note that you must get the order of the second and third arguments right, however. The second is the transfer unit size, and the third is the number of units to transfer. Partial units will not be transferred unless an error or end-of-file occurs. Specifying the transfer unit as the full number of bytes you want to transfer and asking (therefore) for exactly one unit is what achieves the semantics you ask about.
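To make the argument-order point concrete, here is a small sketch with two hypothetical wrappers, read_exact() and read_up_to(). They issue the same request with the second and third arguments swapped, so the return value means "complete units" in one and "bytes" in the other.

```c
#include <stdio.h>

/* Hypothetical helper: read exactly n bytes from a stream, or report failure.
 * Returns 1 if all n bytes were read, 0 otherwise. */
int read_exact(FILE *in, void *buf, size_t n)
{
    if (n == 0)
        return 1;   /* nothing to read; fread() would return 0 here */
    /* One unit of n bytes: fread() returns 1 only for a complete unit. */
    return fread(buf, n, 1, in) == 1;
}

/* Same request with the arguments swapped: n units of 1 byte each.
 * The return value is then a byte count, so short reads are visible. */
size_t read_up_to(FILE *in, void *buf, size_t n)
{
    return fread(buf, 1, n, in);
}
```

When read_exact() returns 0, ferror() and feof() on the stream tell an error apart from end-of-file.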
You use a loop.
For example, with proper error checking:
/** Read a specific number of bytes from a file or socket descriptor
 * @param fd Descriptor
 * @param dst Buffer to read data into
 * @param minbytes Minimum number of bytes to read
 * @param maxbytes Maximum number of bytes to read
 * @return Exact number of bytes read.
 * errno is always set by this call.
 * It will be set to zero if an acceptable number of bytes was read,
 * and to nonzero otherwise.
 * If there was not enough data to read, errno == ENODATA.
 */
size_t read_range(const int fd, void *const dst, const size_t minbytes, const size_t maxbytes)
{
if (fd == -1) {
errno = EBADF;
return 0;
} else
if (!dst || minbytes > maxbytes) {
errno = EINVAL;
return 0;
}
char *buf = (char *)dst;
char *const end = (char *)dst + minbytes;
char *const lim = (char *)dst + maxbytes;
while (buf < end) {
ssize_t n = read(fd, buf, (size_t)(lim - buf));
if (n > 0) {
buf += n;
} else
if (n == 0) {
/* Premature end of input */
errno = ENODATA; /* Example only; use what you deem best */
return (size_t)(buf - (char *)dst);
} else
if (n != -1) {
/* C library or kernel bug */
errno = EIO;
return (size_t)(buf - (char *)dst);
} else {
/* Error, interrupted by signal delivery, or nonblocking I/O would block. */
return (size_t)(buf - (char *)dst);
}
}
/* At least minbytes, up to maxbytes received. */
errno = 0;
return (size_t)(buf - (char *)dst);
}
Some do find it odd that it clears errno to zero on successful calls, but it is perfectly acceptable in both standard and POSIX C.
Here, it means that typical use cases are simple and robust. For example,
struct message msgs[MAX_MSGS];
size_t bytes = read_range(fd, msgs, sizeof msgs[0], sizeof msgs);
if (errno) {
/* Oops, things did not go as we expected. Deal with it.
If bytes > 0, we do have that many bytes in msgs[].
*/
} else {
/* We have bytes bytes in msgs.
bytes >= sizeof msgs[0] and bytes <= sizeof msgs.
*/
}
If you have a pattern where you have fixed or variable sized messages, and a function that consumes them one by one, do not assume that the best option is to try and read exactly one message at a time, because it is not.
This is also why the above example has minbytes and maxbytes instead of a single exactly_this_many_bytes parameter.
A much better pattern is to have a larger buffer, where you memmove() the data only when you have to (because you're running out of room, or because the next message is not sufficiently aligned).
For example, let's say you have a stream socket or file descriptor, where each incoming message consists of a three byte header: the first byte identifies the message type, and the next two bytes (say, less significant byte first) identify the number of data payload bytes associated with the message. This means that the maximum total length of a message is 1+2+65535 = 65538 bytes.
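The header layout just described can be decoded in a few lines. This is a minimal sketch; message_length() and HEADER_LEN are hypothetical names, and the little-endian length field follows the "less significant byte first" assumption stated above.

```c
#include <stddef.h>

#define HEADER_LEN 3   /* 1 type byte + 2 length bytes, as described above */

/* Hypothetical helper: total message length (header + payload) from a
 * three-byte header whose length field is little-endian. */
size_t message_length(const unsigned char *hdr)
{
    /* The shift is parenthesized because + and | bind tighter than <<. */
    size_t payload = (size_t)hdr[1] | ((size_t)hdr[2] << 8);
    return HEADER_LEN + payload;
}
```

The largest value this can return is 3 + 65535 = 65538, which is why the receive buffer must be at least that large.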
For efficiently receiving the messages, you'll use a dynamically allocated buffer. The buffer size is a software engineering question, and other than that it has to be at least 65538 bytes, its size – and even whether it should grow and shrink dynamically – depends on the situation. So, we'll just assume that we have unsigned char *data; pointing to a buffer of size size_t size; already allocated.
The loop itself could look something like the following:
size_t head = 0; /* Offset to current message */
size_t tail = 0; /* Offset to first unused byte in buffer */
size_t mlen = 0; /* Total length of the current message; 0 is "unknown"*/
while (1) {
/* Message processing loop. */
while (head + 3 <= tail) {
/* Verify we know the total length of the message
that starts at offset head. */
if (!mlen)
mlen = 3 + (size_t)(data[head + 1])
+ ((size_t)(data[head + 2]) << 8); /* parenthesized: + binds tighter than << */
/* If that message is not yet complete, we cannot process it. */
if (head + mlen > tail)
break;
/* type datalen, pointer to data */
handle_message(data[head], mlen - 3, data + head + 3);
/* Skip message in buffer. */
head += mlen;
/* Since we do not know the length of the next message,
or rather, the current message starting at head,
we do need to reset mlen to "unknown", 0. */
mlen = 0;
}
/* At this point, the buffer contains less than one full message.
Whether it is better to always move a partial leftover message
to the beginning of the buffer, or only do so if the buffer
is full, depends on the workload and buffer size.
The following one may look complex, but it is actually simple.
If the current start of the buffer is past the halfway mark,
or there is no more room at the end of the buffer, we do the move.
Only if the current message starts in the initial half, and
when there is room at the end of the buffer, we leave it be.
But first: If we have no data in the buffer, it is always best
to start filling it from the beginning.
*/
if (head >= tail) {
head = 0;
tail = 0;
} else
if (head >= size/2 || tail >= size) {
memmove(data, data + head, tail - head);
tail -= head;
head = 0;
}
/* We do not have a complete message, but there
is room in the buffer (assuming size >= 65538),
we need to now read more data into the buffer. */
ssize_t n = read(sourcefd, data + tail, size - tail);
if (n > 0) {
tail += n;
/* Check if it completed one or more messages. */
continue;
} else
if (n == 0) {
/* End of input. If buffer is empty, that's okay. */
if (head >= tail)
break;
/* Ouch: We have partial message in the buffer,
but there will be no more incoming data! */
ISSUE_WARNING("Discarding %zu byte partial message due to end of input.\n", tail - head);
break;
} else
if (n != -1) {
/* This should not happen. If it does, it is a C library
or kernel bug. We treat it as fatal. */
ISSUE_ERROR("read() returned %zd; dropping connection.\n", n);
break;
} else
if (errno != EINTR) {
/* Everything except EINTR indicates an error to us; we do
assume that sourcefd is blocking (not nonblocking). */
ISSUE_ERROR("read() failed with errno %d (%s); dropping connection.\n", errno, strerror(errno));
break;
}
/* The case n == -1, errno == EINTR usually occurs when a signal
was delivered to a handler using this thread, and that handler
was installed without SA_RESTART. Depending on what kind of
a device or socket sourcefd is, there could be additional cases;
but in general, it just means "something unrelated happened,
but you were to be notified about it, so EINTR you get".
Simply put, EINTR is not really an error, just like
EWOULDBLOCK/EAGAIN is not an error for nonblocking descriptors,
they're just easiest to treat as an "error-like situation" in C.
*/
}
/* close(sourcefd); */
Note how the loop does not actually try to read any specific amount of data? It just reads as much as it can, and processes it as it goes.
Could one read such messages precisely, by first reading exactly the three-byte header, then exactly the data payload? Sure, but that means you make an awful amount of syscalls; at minimum two per message. If the messages are common, you probably do not want to do that because of the syscall overhead.
Could one use the available buffer more carefully, and remove the type and data payload length from the next message in the buffer as soon as possible? Well, that is the sort of question one should discuss with colleagues or developers who have written such code before. There are positives (mainly, you save three bytes), and negatives (added code complexity, which always makes code harder to maintain long term, and risks introducing bugs). On a microcontroller with just 128 bytes of buffer for incoming command messages, I probably would do that; but not on a desktop or server that prefers a few hundred kilobytes to a couple of megabytes of buffer for such code (since the memory "waste" is often covered by the smaller number of syscalls, especially when processing lots of messages). No quick answers! :)
Both read and write on success return ssize_t containing amount of bytes read/written. You can use it to construct a loop:
A reliable read():
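ssize_t readall(int fd, void *buff, size_t nbyte) {
    size_t nread = 0;
    ssize_t res = 0;                  /* signed, so -1 is representable */
    while (nread < nbyte) {
        res = read(fd, (char *)buff + nread, nbyte - nread);
        if (res == 0) break;          /* end of input */
        if (res == -1) return -1;     /* error; errno is set by read() */
        nread += (size_t)res;
    }
    return (ssize_t)nread;
}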
A reliable write() (almost same):
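ssize_t writeall(int fd, const void *buff, size_t nbyte) {
    size_t nwrote = 0;
    ssize_t res = 0;                  /* signed, so -1 is representable */
    while (nwrote < nbyte) {
        res = write(fd, (const char *)buff + nwrote, nbyte - nwrote);
        if (res == 0) break;          /* no progress; return what was written */
        if (res == -1) return -1;     /* error; errno is set by write() */
        nwrote += (size_t)res;
    }
    return (ssize_t)nwrote;
}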
Basically, each function keeps reading/writing until the total number of bytes transferred equals nbyte, or until read/write returns 0 or fails.
Please note, this answer uses only <unistd.h> functions, assuming there is a reason to use them. If you can use <stdio.h> too, see the answer by John Bollinger, which uses fdopen and setvbuf, and then fread/fwrite. Also, take a look at the answer by Blabbo is Verbose for a read_range function with a lot of features.

How to fix a segmentation fault for Ansi C Tcp client?

I'm trying to expand an example of a Tcp client developed using Ansi C, following the book "TCP/IP Sockets in C". The client connects to a Tcp Server providing strings of different lengths depending on the request provided by the client (I developed my own simple protocol). When the returned strings are short in length, everything works fine. When they're over a certain length (it happens for example with 4KB), the client crashes with a Segmentation Fault error.
The socket is handled using a wrapper to stream the i/o:
FILE *str = fdopen(sock, "r+"); // Wrap for stream I/O
And the transmission and reception are handled using fwrite() and fread().
This is the call that generates the error in my project (the caller):
uint8_t inbuf[MAX_WIRE_SIZE];
size_t respSize = GetNextMsg(str, inbuf, MAX_WIRE_SIZE); // Get the message
And this is the implementation of the GetNextMsg() function, which is used to receive the data and unframe it:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <netinet/in.h>
#include "Practical.h"
/* Read 4-byte length and place in big-endian order.
* Then read the indicated number of bytes.
* If the input buffer is too small for the data, truncate to fit and
* return the negation of the *indicated* length. Thus a negative return
* other than -1 indicates that the message was truncated.
* (Ambiguity is possible only if the caller passes an empty buffer.)
* Input stream is always left empty.
*/
uint32_t GetNextMsg(FILE *in, uint8_t *buf, size_t bufSize)
{
uint32_t mSize = 0;
uint32_t extra = 0;
if (fread(&mSize, sizeof(uint32_t), 1, in) != 1)
return -1;
mSize = ntohl(mSize);
if (mSize > bufSize)
{
extra = mSize - bufSize;
mSize = bufSize; // Truncate
}
if (fread(buf, sizeof(uint8_t), mSize, in) != mSize)
{
fprintf(stderr, "Framing error: expected %d, read less\n", mSize);
return -1;
}
if (extra > 0)
{ // Message was truncated
uint32_t waste[BUFSIZE];
fread(waste, sizeof(uint8_t), extra, in); // Try to flush the channel
return -(mSize + extra); // Negation of indicated size
}
else
return mSize;
}
I suspect that this could be related to the fact that with TCP, sender and receiver handle data as a stream, so it is not guaranteed that the receiver gets all of the data at once, as the simple example from which I started probably assumed. In fact, with short strings everything works. With longer strings, it doesn't.
I've done some simple debugging by inserting a printf as the first thing inside the function, but when the crash occurs this doesn't even get printed.
It seems like an issue with the FILE *str passed as an argument to the function, when a message longer than usual is received via the socket.
The buffers are sized far bigger than the length of the message causing the issue (1MB vs 4KB).
I've even tried to increase the size of the socket buffer via the setsockopt:
int rcvBufferSize;
// Retrieve and print the default buffer size
int sockOptSize = sizeof(rcvBufferSize);
if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvBufferSize, (socklen_t*)&sockOptSize) < 0)
DieWithSystemMessage("getsockopt() failed");
printf("Initial Receive Buffer Size: %d\n", rcvBufferSize);
// Increase the buffer size tenfold
rcvBufferSize *= 10;
if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvBufferSize,
sizeof(rcvBufferSize)) < 0)
DieWithSystemMessage("setsockopt() failed");
but this didn't help.
Any ideas about the reason and how could I fix it?
This code:
{ // Message was truncated
uint32_t waste[BUFSIZE];
fread(waste, sizeof(uint8_t), extra, in); // Try to flush the channel
reads extra bytes into a buffer of size 4*BUFSIZE bytes (4 because you intended to make the buffer uint8_t, but accidentally made it uint32_t instead).
If extra is larger than 4*BUFSIZE, then you will have a local buffer overflow and stack corruption, possibly resulting in a crash.
To do this correctly, something like this is needed:
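size_t remaining = extra;
while (remaining > 0) {
    char waste[BUFSIZE];
    /* Never ask fread() for more than the scratch buffer can hold. */
    size_t to_read = remaining < (size_t)BUFSIZE ? remaining : (size_t)BUFSIZE;
    size_t got = fread(waste, 1, to_read, in);
    if (got == 0) break; /* EOF or error: stop draining */
    remaining -= got;
}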

FTDI. Would need to know more details about the FT_Write() function

A question about the FT_Write() function.
FT_STATUS FT_Write (FT_HANDLE ftHandle, LPVOID lpBuffer, DWORD dwBytesToWrite, LPDWORD lpdwBytesWritten)
I wonder about the lpdwBytesWritten.
When will it ever return *lpdwBytesWritten < dwBytesToWrite? Would FT_Write return other than FT_OK in that case?
And how much data can I send in one call to FT_Write? What limit has the dwBytesToWrite parameter?
I have not been able to find the answers to these questions. Have read in the FTDI Knowledgebase and also in the D2XX Programmer's Guide.
FT_Write returns a status other than FT_OK on "critical" errors.
This function (as well as FT_Read) might return FT_OK and store in *lpdwBytesWritten any number from 0 (timeout) to dwBytesToWrite (transmission finished).
Intermediate values mean that not all data has been transferred yet, but the transmission can be continued.
A transmission loop might look like this:
BYTE *buf = pointer_to_data;
DWORD len = length_of_data;
FT_STATUS status;
DWORD written;
for (;;) {
status = FT_Write(handle, buf, len, &written);
if (status != FT_OK)
return status;
if (written == 0)
return -1; // or FT_OTHER_ERROR if no special timeout handling required
if (written == len)
return FT_OK;
len -= written;
buf += written;
}
See also Example 3, FT2232C Test Application (file t_titan.cpp, function DoRxTest at line 289)
dwBytesToWrite is a double word, that is 32 bits. A rather large number of bytes to write. My guess is that the function can return FT_OK even if the number of bytes written is < the number of bytes to write. It is useful to know how many bytes have been written, so that the next time you call the function, you know exactly where to set your transmit pointer in the buffer.

Making a Device Driver in Minix

I'm trying to create a character device driver on Minix. I would like it to be able to accept read() and write() calls. My understanding is that I would need to use sys_safecopyfrom() for the function which runs the read() function and sys_safecopyto() for the function which runs the write() function. The issue is that I keep getting a similar error (although not exactly the same, but I think that the differences are memory locations) when I run it like this. The error is:
verify_grant: grant verify failed: access invalid: want 0x..., have 0x...
grant 2 verify to copy ... -> ... by ... failed err -1
read: Operation not permitted
The "..." are memory locations and the error is similar for write except for the memory locations and it says "write" instead of "read" on the last line.
I think that the relevant code is the following:
#include <minix/drivers.h>
#include <minix/chardriver.h>
#include <stdio.h>
#include <stdlib.h>
#include <minix/ds.h>
...
static struct chardriver hello_tab =
{
.cdr_open = hello_open,
.cdr_close = hello_close,
.cdr_read = hello_read,
.cdr_write = hello_write,
};
...
static ssize_t hello_read(devminor_t UNUSED(minor), u64_t position,
endpoint_t endpt, cp_grant_id_t grant, size_t size, int UNUSED(flags),
cdev_id_t UNUSED(id))
{
u64_t dev_size;
char *ptr;
int ret;
char *buf = HELLO_MESSAGE;
printf("hello_read()\n");
/* This is the total size of our device. */
dev_size = (u64_t) strlen(buf);
/* Check for EOF, and possibly limit the read size. */
if (position >= dev_size) return 0; /* EOF */
if (position + size > dev_size)
size = (size_t)(dev_size - position); /* limit size */
/* Copy the requested part to the caller. */
ptr = buf + (size_t)position;
if ((ret = sys_safecopyfrom(endpt, grant, 0, (vir_bytes) ptr, size)) != OK)
return ret;
/* Return the number of bytes read. */
printf("Message is :%s", ptr);
return size;
}
static ssize_t hello_write(devminor_t UNUSED(minor), u64_t position,
endpoint_t endpt, cp_grant_id_t grant, size_t size, int UNUSED(flags),
cdev_id_t UNUSED(id))
{
u64_t dev_size;
char *ptr;
int ret;
char *buf = HELLO_MESSAGE;
printf("hello_write()\n");
/* This is the total size of our device. */
dev_size = (u64_t) strlen(buf);
/* Check for EOF, and possibly limit the read size. */
if (position >= dev_size) return 0; /* EOF */
if (position + size > dev_size)
size = (size_t)(dev_size - position); /* limit size */
/* Copy the requested part to the caller. */
ptr = buf + (size_t)position;
if ((ret = sys_safecopyto(endpt, grant, 0, (vir_bytes) ptr, size)) != OK)
return ret;
/* Return the number of bytes read. */
return size;
}
The hello_read function is based off of the hello_write functions but I think that it should still work and should read the information into ptr.
Also, I'm a bit hazy on how I would go about getting the second argument in the write() function (the buffer) in my hello_write() function. Is it contained in one of hello_read()'s arguments?
Thanks for your help!
So, I know it's been a long time and there's no activity here but I thought I would answer the question.
I am going to start by saying that the error occurs when the wrong arguments are passed to sys_safecopyto/from.
Now, to really debug this I would want to see the rest of the code you had. But for anyone else who comes across this problem, here are some tips:
Look at how many bytes you are passing to the sys_safecopy functions.
Make sure you are using the correct offset into the buffer when writing. In the case I used it in, that was (buffer_ptr + current_size).
If you are using an earlier version of Minix, make sure you are passing the correct number of parameters to the sys_safecopy functions (could be 5 args or 6 args; the last one on older versions of Minix for the hello driver would just be "D" ;) ).

linux virtual file as device driver

I write a linux char device driver to simulate a file. The data is stored in an array and I want to implement a "read-file"-handler...
static ssize_t data_read(struct file *f, char __user *buf, size_t count, loff_t *f_pos){
char *msg_pointer;
int bytes_read = 0;
if(vault.storage==NULL)
return -EFAULT;
msg_pointer = vault.storage + *f_pos;
while (count && (*f_pos < vault.size) ) {
put_user(*(msg_pointer++), buf++);
count--;
bytes_read++;
++*f_pos;
}
return bytes_read;
}
vault.storage is a pointer to a kmalloc-creation. If I test the code by copying with dd it works as expected, but when I want to open the file with C
if((fp_data = open("/dev/vault0", O_RDWR)) < 0){
perror("could not open file.\n");
}
err = write(fp_data, "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890", 36);
if (err < 0){
perror("failed to write to sv \n");
}
read(fp_data, buffer, 36);
read(fp_data, buffer, 36);
The first read call returns 4 and the second returns 0. How is this possible?
A write performed on a file is not guaranteed to write all the requested bytes atomically; that guarantee is reserved for a pipe or FIFO when the requested amount is no larger than PIPE_BUF. For instance, write can be interrupted by a signal after writing some bytes, and there are other cases where write will not output the full number of requested bytes before returning. Therefore you should check the number of bytes written before reading anything back into a buffer, to make sure you attempt to read back the same number of bytes that were written.
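A user-space sketch of that check follows. write_all(), write_then_read_back() and roundtrip_ok() are hypothetical helpers. The lseek() calls are an assumption: they only make sense if the descriptor is seekable and reads and writes share one file offset, in which case a read issued right after a write starts where the write ended.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying after short writes.
 * Returns 0 on success, -1 on error (errno is set). */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte: retry */
            return -1;
        }
        if (n == 0) {           /* no progress: give up rather than spin */
            errno = EIO;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write len bytes, seek back to where the write started, read them back.
 * Returns the number of bytes read back, or -1 on error. */
ssize_t write_then_read_back(int fd, const void *data, size_t len, void *out)
{
    off_t start = lseek(fd, 0, SEEK_CUR);   /* assumes a seekable descriptor */
    if (start == (off_t)-1)
        return -1;
    if (write_all(fd, data, len) == -1)
        return -1;
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, out, len);   /* may still be short; check the result */
}

/* Usage mirroring the question: open, write, read back, compare lengths. */
int roundtrip_ok(const char *path, const void *data, size_t len, void *out)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return 0;
    ssize_t got = write_then_read_back(fd, data, len, out);
    close(fd);
    return got == (ssize_t)len;
}
```

Whether a particular character device honours lseek() depends on its driver, so treat the rewind step as something to verify rather than rely on.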
Put a printk in the data_read call and print count and what is returned to the user (check the value of bytes_read). bytes_read is returned to the read() call in user space. Make sure you are returning the correct value. You can also print f_pos and check what is happening.
Here I assume that your driver's read and write functions are called properly, that is, the major and minor numbers of your device file belong to your driver.

Resources