Reading a struct through low level I/O - c

For a lab we are required to read in from binary files using low level io (open/lseek/close not fopen/fseek/fclose) and manipulate the data. My question is how do I read or write structs using these methods.
The struct is as follows
typedef struct Entry {
char title[33];
char artist[17];
int val;
int cost;
} Entry_T;
I originally planned on creating a buffer of sizeof(Entry_T) and read the struct simply, but I don't think that's possible using low level I/O. Am I supposed to create 4 buffers and fill them sequentially, use one buffer and reallocate it for the right sizes, or is it something else entirely. An example of writing would be helpful as well, but I think I may be able to figure it out after I see a read example.

Because your structures contain no pointers and all elements are fixed size, you can simply write and read the structures. Error checking omitted for brevity:
const char *filename = "...";
int fd = open(filename, O_RDWR|O_CREAT, 0644);
Entry_T ent1 = { "Spanish Train", "Chris De Burgh", 100, 30 };
ssize_t w_bytes = write(fd, &ent1, sizeof(ent1));
lseek(fd, 0L, SEEK_SET);
Entry_T ent2;
ssize_t r_bytes = read(fd, &ent2, sizeof(ent2));
assert(w_bytes == r_bytes);
assert(w_bytes == sizeof(ent1));
assert(strcmp(ent1.title, ent2.title) == 0);
assert(strcmp(ent1.artist, ent2.artist) == 0);
assert(ent1.val == ent2.val && ent1.cost == ent2.cost);
close(fd);
If your structures contain pointers or variable length members (flexible array members), you have to work harder.
Data written like this is not portable unless all the data is in strings. If you migrate the data between a big-endian and a little-endian machine, one side will misinterpret what the other thinks it wrote. Similarly, there can be problems when moving data between 32-bit and 64-bit builds on a single machine architecture (if you have long data, for example, then the 32-bit system probably uses sizeof(long) == 4 but the 64-bit system probably uses sizeof(long) == 8, unless you're on Windows). Compilers may also insert padding between members, so sizeof(Entry_T) can be larger than the sum of the member sizes; here the int members are typically aligned to a 4-byte boundary after the 50 bytes of strings.
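One way to sidestep the endianness and padding issues described above is to serialize the struct field by field into a fixed byte layout. The following is only a sketch under stated assumptions: the 58-byte layout, the big-endian integer encoding, and the names ENTRY_WIRE_SIZE, entry_pack, and entry_unpack are hypothetical and not part of the original question. It also assumes int is 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int val;
    int cost;
} Entry_T;

/* Assumed on-disk layout: 33 + 17 + 4 + 4 bytes, with no padding. */
enum { ENTRY_WIRE_SIZE = 33 + 17 + 4 + 4 };

/* Store a 32-bit value most-significant byte first. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Load a 32-bit value stored most-significant byte first. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: encode one entry into ENTRY_WIRE_SIZE bytes. */
void entry_pack(const Entry_T *e, unsigned char out[ENTRY_WIRE_SIZE])
{
    memcpy(out, e->title, 33);
    memcpy(out + 33, e->artist, 17);
    put_be32(out + 50, (uint32_t)e->val);
    put_be32(out + 54, (uint32_t)e->cost);
}

/* Hypothetical helper: decode one entry (assumes 32-bit int). */
void entry_unpack(const unsigned char in[ENTRY_WIRE_SIZE], Entry_T *e)
{
    memcpy(e->title, in, 33);
    memcpy(e->artist, in + 33, 17);
    e->val = (int)(int32_t)get_be32(in + 50);
    e->cost = (int)(int32_t)get_be32(in + 54);
}
```

The packed buffer, rather than the struct itself, is then what gets passed to write() and read().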

The low-level functions might be OS specific. However they are generally these:
fopen() -> open()
fread() -> read()
fwrite() -> write()
fclose() -> close()
Note that while the 'fopen()' set of functions use a 'FILE *' as a token to represent the file, the 'open()' set of functions use an integer (int).
Using read() and write(), you may write entire structures. So, for:
typedef struct Entry {
char title[33];
char artist[17];
int val;
int cost;
} Entry_T;
You may elect to read, or write as follows:
{
int fd = (-1);
Entry_T entry;
...
fd=open(...);
...
read(fd, &entry, sizeof(entry));
...
write(fd, &entry, sizeof(entry));
...
if((-1) != fd)
close(fd);
}

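Here you go:
Entry_T t;
int fd = open("file_name", O_RDONLY); // with MSVC's _open() the flag is spelled _O_RDONLY
if (fd == -1)
{
// error: report it and do not read
}
else
{
if (read(fd, &t, sizeof(t)) != (ssize_t)sizeof(t))
{
// error or short read: t is not completely filled in
}
close(fd);
}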
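Since the question also mentions lseek, here is a minimal sketch of random access to fixed-size records: record number i starts at byte offset i * sizeof(Entry_T). The helper names read_entry_at and write_entry_at are hypothetical, and the sketch assumes the file was written by the same program on the same machine, so that the struct layout matches.

```c
#include <fcntl.h>      /* open() flags, used by the caller */
#include <sys/types.h>
#include <unistd.h>

typedef struct Entry {
    char title[33];
    char artist[17];
    int val;
    int cost;
} Entry_T;

/* Hypothetical helper: write record number `index` (0-based).
 * Returns 0 on success, -1 on error or short write. */
int write_entry_at(int fd, off_t index, const Entry_T *in)
{
    if (lseek(fd, index * (off_t)sizeof(*in), SEEK_SET) == (off_t)-1)
        return -1;
    ssize_t n = write(fd, in, sizeof(*in));
    return (n == (ssize_t)sizeof(*in)) ? 0 : -1;
}

/* Hypothetical helper: read record number `index` (0-based).
 * Returns 0 on success, -1 on error, end of file, or short read. */
int read_entry_at(int fd, off_t index, Entry_T *out)
{
    if (lseek(fd, index * (off_t)sizeof(*out), SEEK_SET) == (off_t)-1)
        return -1;
    ssize_t n = read(fd, out, sizeof(*out));
    return (n == (ssize_t)sizeof(*out)) ? 0 : -1;
}
```

A caller would open the file with open(path, O_RDWR) and then call, for example, read_entry_at(fd, 3, &entry) to fetch the fourth record.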

Related

C - write data to a file, either write all or write nothing

In my case, writing partial data is nonsense, so I figure out this:
ssize_t write_all(int fd, unsigned char* data, size_t size) {
ssize_t w;
size_t written = 0;
unsigned char* buf = data;
do {
w = write(fd, buf, size-written);
if (w > 0) {
written += w;
if (written == size) {
return written;
} else {
buf += w;
}
} else {
lseek(fd, -(off_t)written, SEEK_CUR); /* lseek takes (fd, offset, whence); cast before negating the unsigned count */
return 0;
}
} while (1);
}
Is this correct? Or are there any better practices?
Here is what I would do:
create a temporary file somewhere convenient (like /tmp)
write out that file until you reach the point where you can declare the operation a success
unlink the original and move the new file to the same location/name.
Once you call write() you are at the mercy of the kernel as far as when the data will actually be flushed to the disk.
You can use O_SYNC to add a level of assurance that the data has been written: O_SYNC causes write() to block until the data and all associated file metadata have been written to disk. O_DSYNC is the weaker variant: it blocks only until the data, plus whatever metadata is needed to retrieve that data, has been written.
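The temporary-file approach can be sketched as follows. This is a variant under stated assumptions, not a definitive implementation: the names replace_file and write_fully are hypothetical, the temporary file is created next to the target rather than in /tmp (rename() cannot move a file across filesystems), and rename() is used on its own because it replaces an existing target atomically. fsync() is called before the rename so the data is on disk before the new name becomes visible.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>      /* rename, snprintf */
#include <stdlib.h>     /* mkstemp */
#include <sys/types.h>
#include <unistd.h>     /* write, fsync, close, unlink */

/* Hypothetical helper: write all `size` bytes, retrying on short
 * writes and EINTR. Returns 0 on success, -1 on error. */
static int write_fully(int fd, const unsigned char *data, size_t size)
{
    while (size > 0) {
        ssize_t w = write(fd, data, size);
        if (w < 0 && errno == EINTR)
            continue;
        if (w <= 0)
            return -1;
        data += w;
        size -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: replace `path` with the new contents, or leave
 * the old file untouched on failure. Returns 0 on success, -1 on error. */
int replace_file(const char *path, const void *data, size_t size)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = mkstemp(tmp);   /* same directory as path; created with mode 0600 */
    if (fd == -1)
        return -1;

    if (write_fully(fd, data, size) == -1 || fsync(fd) == -1) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) == -1 || rename(tmp, path) == -1) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Readers of path see either the complete old contents or the complete new contents, never a partially written file.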

How do I create a file descriptor backed by a simple char array?

I have a function that receives a descriptor (you know, one of those things that open() and socket() spit), reads from it and does something with the data:
int do_something(int fd);
I want to test this function. Preferably, the input data should sit right next to the test assert for the sake of easy debugging. (Therefore, actual file reading should be avoided.) In my mind, the ideal would be something like
unsigned char test_input[] = { 1, 2, 3 };
int fd = char_array_to_fd(test_input);
ck_assert(do_something(fd) == 1234);
(ck_assert is from the Check framework. It's just a typical unit test assert.)
Is there a way to implement char_array_to_fd()? I don't mind if I need to NULL-terminate the array or send the length in.
I imagine that I can open a socket to myself and write on one end so the test function receives the data on the other end. I just don't want to write something awkward and find out that Unix had something less contrived all along. The solution should be any-Unix friendly.
(Basically, I'm asking for a C equivalent of ByteArrayInputStream.)
Alternatively: Should I be thinking in some other way to solve this problem?
On Linux, you can use memfd_create() to create a memory-backed temporary file:
unsigned char test_input[] = ...;
int fd = memfd_create( "test_input", 0 );
// write test data to the "file"
write( fd, test_input, sizeof( test_input ) );
// reset file descriptor to the start of the "file"
lseek( fd, 0, SEEK_SET );
Note that this completely lacks error checking.
You can use mkstemp to make a temporary file and write to it or read from it:
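char name[] = "tempXXXXXX"; // mkstemp() overwrites the template, so it must be a writable array, not a string literal
int fd = mkstemp(name);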
If you want something more dynamic, you can use socketpair to create a pair of connected sockets.
int pair[2];
socketpair(AF_UNIX, SOCK_STREAM, 0, pair);
You can then fork a process or start a thread to interact with your program under test.
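As a rough illustration of the mkstemp route, the char_array_to_fd() function from the question could be sketched like this. The template path under /tmp is an assumption, and the file is unlinked immediately so that it disappears once the descriptor is closed. Unlike a pipe or socket, the resulting descriptor is seekable, which may or may not matter to the function under test.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>     /* mkstemp */
#include <sys/types.h>
#include <unistd.h>     /* write, lseek, unlink, close */

/* Sketch: returns a descriptor positioned at offset 0 whose contents
 * are the given bytes, or -1 on failure. The caller must close() it. */
int char_array_to_fd(const unsigned char *data, size_t size)
{
    char tmpl[] = "/tmp/test_inputXXXXXX";   /* assumed location */
    int fd = mkstemp(tmpl);
    if (fd == -1)
        return -1;
    unlink(tmpl);   /* the file is removed once fd is closed */

    size_t done = 0;
    while (done < size) {
        ssize_t w = write(fd, data + done, size - done);
        if (w <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)w;
    }
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With this in place, the test from the question becomes int fd = char_array_to_fd(test_input, sizeof(test_input)); followed by the assert on do_something(fd).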
@ChrisDodd's answer is already accepted, but I wanted to add the pipes solution I had developed (thanks to @Someprogrammerdude's comment) for completeness' sake:
struct writer_args {
int fd;
unsigned char *buffer;
size_t size;
};
/* Pipe writer thread function */
void *write_buffer(void *_args)
{
struct writer_args *args = _args;
/*
* error handling omitted.
* Should probably also be thrown into a loop, in case it writes less
* than args->size.
*/
write(args->fd, args->buffer, args->size);
close(args->fd);
return NULL;
}
/*
* Wrapper for quick & easy testing of the do_something() function.
* Replaces the fd parameter for a char array and its size.
*/
static int __do_something(unsigned char *input, size_t size)
{
pthread_t writer_thread;
struct writer_args args;
int fd[2];
int result;
pipe(fd); /* error handling omitted */
/* fd[0] is for reading, fd[1] is for writing */
/*
* We want one thread for reading, another one for writing.
* This is because pipes have a nonstandardized maximum buffer capacity.
* If we write too much without reading, it will block forever.
*/
args.fd = fd[1];
args.buffer = input;
args.size = size;
/* error handling omitted */
pthread_create(&writer_thread, NULL, write_buffer, &args);
result = do_something(fd[0]);
close(fd[0]);
pthread_join(writer_thread, NULL); /* error handling omitted */
return result;
}
Then, I can keep testing do_something as much as I want:
ret = __do_something(input1, input1_size);
if (ret != 1234)
fprintf(stderr, "Fail. Expected:1234 Actual:%d\n", ret);
ret = __do_something(input2, input2_size);
if (ret != 0)
fprintf(stderr, "Fail. Expected:0 Actual:%d\n", ret);
ret = __do_something(input3, input3_size);
if (ret != 555)
fprintf(stderr, "Fail. Expected:555 Actual:%d\n", ret);
...

Write and read on a socket returns different byte count

I'm writing a client-server application that uses AF_UNIX sockets.
The client generates a string and then sends it on the socket after it has sent an header. The server then reads the header, allocates space for the string and then reads the string.
The header is defined as:
typedef struct {
unsigned long key;
op_t op; // op_t is an enum
} header_t;
And the string is stored with its length:
typedef struct {
unsigned int len;
char* buf;
} data_t;
There's also another struct that groups these two things into one (not a choice of mine, I have to use these things as they are).
typedef struct {
header_t hdr;
data_t data;
} message_t;
I'm using writev() system call to send the data over the socket, this way:
int sendRequest(long fd, message_t *msg) {
struct iovec to_send[3];
/* Header */
to_send[0].iov_base = &(msg->hdr);
to_send[0].iov_len = sizeof(header_t);
/* Data */
to_send[1].iov_base = &(msg->data.len);
to_send[1].iov_len = sizeof(msg->data.len);
to_send[2].iov_base = msg->data.buf;
to_send[2].iov_len = msg->data.len;
int c;
if((c = writev(fd, to_send, (msg->data.len > 0) ? 3 : 2)) < 0) {
return -1;
}
printf("#### %i BYTES WRITTEN (header: %i) ####\n",c, to_send[0].iov_len);
return 0;
}
To read, I use two distinct functions, one to read the header, and one to read the data:
int readHeader(long fd, header_t *hdr) {
struct iovec to_read[1];
to_read[0].iov_base = hdr;
to_read[0].iov_len = sizeof(header_t);
errno = 0;
int c;
if((c = readv(fd, to_read, 1)) <= 0) {
return -1;
}
printf("[H] %i BYTES READ \n",c);
return 0;
}
int readData(long fd, data_t *data) {
struct iovec to_read[2];
/* First, read how long is the buffer */
to_read[0].iov_base = &(data->len);
to_read[0].iov_len = sizeof(data->len);
int c;
if((errno = 0, c = readv(fd, to_read, 1)) <= 0)
return -1;
if(data->len > 0) {
data->buf = calloc(data->len, sizeof(char));
if(data->buf == NULL)
return -1;
/* Read the string */
to_read[1].iov_base = data->buf;
to_read[1].iov_len = data->len;
if((errno = 0, c += readv(fd, &to_read[1], 1)) <= 0) {
free(data->buf);
return -1;
}
}
else {
data->buf = NULL;
}
printf("[D] %i BYTES READ (%i + %i)",c, to_read[0].iov_len, to_read[1].iov_len);
return 0;
}
And here comes the problem.
If I send a string 8193 bytes long, everything works fine on the client, that outputs
8213 bytes written (header: 16) which is correct, because 16 bytes are from the header, 4 bytes are from the len field and 8193 are from the string.
But the server prints this:
[H] 16 bytes read (okay) and then [D] 8176 bytes read (wrong!). So, there are 21 bytes to read left. Why? If I try to send a string that has a length of 8192 or less, everything works fine. And assuming that there's a limit on the bytes that can be read by readv(), what is the correct way to read everything that was written?
what is the correct way to read everything that was written?
There is no guarantee that a readv will return all the data at once. If the first read does not return all the requested bytes, then you need to call read/readv again to get the rest.
But the server prints this: [H] 16 bytes read (okay) and then [D] 8176 bytes read (wrong!).
Surely it would be safer to say "unexpected" than "wrong". It is safe to assume that readv() returns the number of bytes actually read, as it is specified to do. You appear to have fallen into one of the classic traps of these POSIX low-level I/O functions: in any given call, they are not guaranteed to transfer the full number of bytes you request. This is one reason why, on success, they return the number of bytes transferred.
what is the correct way to read everything that was written?
I more often see people run into this with the read() and write() functions, but readv() and writev() work the same way except as necessary to spread input across multiple buffers or to gather output from multiple buffers, respectively. With either set of functions, if you want to transfer a specific number of bytes then you must be prepared to perform multiple reads or writes in a loop, at each iteration picking up where the previous iteration left off.
It may at this point occur to you that transferring the full contents of your multiple, differently-sized buffers across multiple calls could get complicated fairly quickly (and if it hadn't before then it should do now). The readv() and writev() functions are not really intended for the kind of use to which you are trying to put them. They are barely any higher-level than read() and write() themselves, and they are best suited for treating multiple fixed-size buffers as a single, larger buffer.
For a case such as yours, I think it will be easier to use read() and write(). You could consider writing helper functions that wrap those to perform the needed looping to fully read or write a specified number of bytes; this will save you repeating such code for each separate data item you want to transfer. Having written such functions, the main code will actually be simpler than what you have now, because you will not need to set up the iovectors.
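Such helper functions might look roughly like the following. The names read_full and write_full are hypothetical, and retrying on EINTR is one reasonable policy rather than the only one.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling read() until n bytes have arrived, end of file is
 * reached, or an error occurs. Returns the number of bytes read
 * (less than n only at end of file) or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    size_t left = n;
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r == 0)
            break;                      /* end of file / peer closed */
        if (r < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            return -1;
        }
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Keep calling write() until all n bytes have been accepted.
 * Returns n on success or -1 on error. */
ssize_t write_full(int fd, const void *buf, size_t n)
{
    const unsigned char *p = buf;
    size_t left = n;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

With these, readData() reduces to a read_full() of sizeof(data->len) bytes into &data->len, followed by a read_full() of data->len bytes into the freshly allocated data->buf, each checked against the expected count.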

pread for very large files

I am reading a large file using pread as follows:
ssize_t s = pread(fd, buff, count, offset);
if (s != (ssize_t) count)
fprintf(stderr, "s = %ld != count = %ld\n", s, count);
assert(s == (ssize_t ) count);
The above code has been working fine for small files (up to 1.5GB). However, for larger files, the returned number of bytes differs from the expected count.
In particular, for 2.4GB file size, my count is set to 2520133890 and the assertion fails with the fprintf saying:
s = 2147479552 != count = 2520133890
What makes this puzzling is that I am working on a 64-bit system and hence, sizeof(ssize_t) = 8.
What is the cause of this failure and how do I resolve this so that I can read the whole file in one go?
It looks like you are on Linux, and the magic number returned by pread is 2147479552 = 0x7ffff000, so the answer is in man 2 read:
On Linux, read() (and similar system calls) will transfer at
most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes
actually transferred. (This is true on both 32-bit and 64-bit
systems.)
So you need to call pread at least twice to get your data. This restriction is not related to _FILE_OFFSET_BITS=64, O_LARGEFILE, sizeof(off_t), and so on; it is imposed by rw_verify_area in the Linux kernel:
/*
* rw_verify_area doesn't like huge counts. We limit
* them to something that fits in "int" so that others
* won't have to do range checks all the time.
*/
int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count)
...
return count > MAX_RW_COUNT ? MAX_RW_COUNT : count;
#define MAX_RW_COUNT (INT_MAX & PAGE_CACHE_MASK)
From your description it sounds like you're doing a 32-bit build, and you haven't enabled the Large File Support (LFS). In order to do this, you need to set the macro _FILE_OFFSET_BITS to the value 64.
So, please double-check that you're really doing a 64-bit build like you say.. EDIT: Ok I believe you are indeed using a 64-bit system.
I think the correct cause of your problem, as pointed out in the answer https://stackoverflow.com/a/36568630/75652 , is explained in the read(2) man page: http://man7.org/linux/man-pages/man2/read.2.html . In order to handle this, you need code like
bytes_left = count;
while (bytes_left > 0)
{
trans = pread (fd, buff, bytes_left, offset);
if (trans == -1)
{
if (errno == EINTR)
continue;
else
return trans;
}
if (trans == 0) /* end of file: stop instead of looping forever */
break;
buff += trans;
bytes_left -= trans;
offset += trans;
}
return count - bytes_left;

Read/Write struct to fifo in C

I'm trying to pass structs between processes using named pipes. I got stuck at trying to open the pipe non-blocking mode. Here's my code for writing to the fifo:
void writeUpdate() {
// Create fifo for writing updates:
strcpy(fifo_write, routing_table->routerName);
// Check if fifo exists:
if(access(fifo_write, F_OK) == -1 )
fd_write = mkfifo(fifo_write, 0777);
else if(access(fifo_write, F_OK) == 0) {
printf("writeUpdate: FIFO %s already exists\n", fifo_write);
//fd_write = open(fifo_write, O_WRONLY|O_NONBLOCK);
}
fd_write = open(fifo_write, O_WRONLY|O_NONBLOCK);
if(fd_write < 0)
perror("Create fifo error");
else {
int num_bytes = write(fd_write, routing_table, sizeof(routing_table));
if(num_bytes == 0)
printf("Nothing was written to FIFO %s\n", fifo_write);
printf("Wrote %d bytes. Sizeof struct: %d\n", num_bytes,sizeof(routing_table)+1);
}
close(fd_write);
}
routing_table is a pointer to my struct, it's allocated, so there's no prob with the name of the fifo or smth like that.
If I open the fifo without the O_NONBLOCK option, it writes smth for the first time, but then it blocks because I'm having troubles reading the struct too. And after the first time, the initial fifo is created, but other fifo's appear, named '.', '..'.
With O_NONBLOCK option set, it creates the fifo but always throws an error: 'No such device or address'. Any idea why this happens? Thanks.
EDIT: Ok, so I'm clear now about opening the fifo, but I have another problem, in fact reading/writing the struct to the fifo was my issue to start with. My code to read the struct:
void readUpdate() {
struct rttable *updateData;
allocate();
strcpy(fifo_read, routing_table->table[0].router);
// Check if fifo exists:
if(access(fifo_read, F_OK) == -1 )
fd_read = mkfifo(fifo_read, 777);
else if(access(fifo_read, F_OK) == 0) {
printf("ReadUpdate: FIFO %s already exists\n Reading from %s\n", fifo_read, fifo_read);
}
fd_read = open(fifo_read, O_RDONLY|O_NONBLOCK);
int num_bytes = read(fd_read, updateData, sizeof(updateData));
close(fd_read);
if(num_bytes > 0) {
if(updateData == NULL)
printf("Read data is null: yes");
else
printf("Read from fifo: %s %d\n", updateData->routerName, num_bytes);
int result = unlink(fifo_read);
if(result < 0)
perror("Unlink fifo error\n");
else {
printf("Unlinking successful for fifo %s\n", fifo_read);
printf("Updating table..\n");
//update(updateData);
print_table_update(updateData);
}
} else
printf("Nothing was read from FIFO %s\n", fifo_read);
}
It opens the fifo and tries to read, but it seems like nothing is in the fifo, although in writeUpdate the first time it says it wrote 4 bytes (this seems wrong too). At reading, first time around it prints 'a' and then num_bytes is always <=0.
I've looked around and only found this example, with simple write/read, is there smth more needed when writing a struct?
My struct looks like this:
typedef struct distance_table {
char dest[20]; //destination network
char router[20]; // via router..
int distance;
} distance_table;
typedef struct rttable {
char routerName[10];
char networkName[20];
struct distance_table table[50];
int nrRouters;
} rttable;
struct rttable *routing_table;
"No such device or address" is the ENXIO error message. If you look at the open man page, you'll see that this error is reported in particular if:
O_NONBLOCK | O_WRONLY is set, the named file is a FIFO and no process
has the file open for reading. (...)
which is exactly your situation. So the behavior you are seeing is normal: you can't write (without blocking) to a pipe that has no readers. The kernel won't buffer your messages if nothing is connected to the pipe for reading.
So make sure you start the "consumer(s)" before your "producer", or remove the non-blocking option on the producer.
BTW: using access is, in most circumstances, opening yourself up to time-of-check-to-time-of-use issues. Don't use it. Try the mkfifo: if it works, you're good. If it fails with EEXIST, you're good too. If it fails otherwise, clean up and bail out.
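The mkfifo-first pattern described above can be sketched like this. The name ensure_fifo and the 0666 mode are assumptions for illustration. Because EEXIST only says that something exists at that path, the sketch also checks that it is actually a FIFO.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>      /* open() flags, used by the caller */
#include <sys/stat.h>   /* mkfifo, stat, S_ISFIFO */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make sure a FIFO exists at `path`.
 * Returns 0 if a FIFO is there afterwards, -1 otherwise. */
int ensure_fifo(const char *path)
{
    if (mkfifo(path, 0666) == 0)
        return 0;               /* created it just now */
    if (errno != EEXIST)
        return -1;              /* real failure: bail out */

    /* Something already exists at path; accept it only if it is a FIFO. */
    struct stat st;
    if (stat(path, &st) == -1 || !S_ISFIFO(st.st_mode))
        return -1;
    return 0;
}
```

After ensure_fifo() succeeds, opening the path with O_WRONLY | O_NONBLOCK still fails with ENXIO until some process has the FIFO open for reading, as explained above.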
For the second part of your question, it really depends completely on how exactly the data you are trying to send is structured. Serializing a random struct in C is not easy at all, especially if it contains variable data (like char *s for example).
If your struct contains only primitive types (and no pointers), and both sides are on the same machine (and compiled with the same compiler), then a raw write on one side and read on the other of the whole struct should work.
You can look at C - Serialization techniques for more complex data types for example.
Concerning your specific example: you're getting mixed up between pointers to your structs and plain structs.
On the write side you have:
int num_bytes = write(fd_write, routing_table, sizeof(routing_table));
This is incorrect since routing_table is a pointer. You need:
int num_bytes = write(fd_write, routing_table, sizeof(*routing_table));
// or sizeof(struct rttable)
Same thing on the read side. On the receiving side you're also not allocating updateData as far as I can tell. You need to do that too (with malloc, and remember to free it).
struct rttable *updateData = malloc(sizeof(struct rttable));
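Putting the two fixes together, a round trip of the whole struct might look like the sketch below. The helper names send_table and recv_table are hypothetical, and both ends are assumed to be built with the same compiler on the same machine. Note the use of sizeof(*t), which is the size of the struct rather than of the pointer, and the loops, since a FIFO may deliver the struct in several pieces.

```c
#include <stdlib.h>     /* malloc, free */
#include <sys/types.h>
#include <unistd.h>     /* read, write */

typedef struct distance_table {
    char dest[20];      /* destination network */
    char router[20];    /* via router */
    int distance;
} distance_table;

typedef struct rttable {
    char routerName[10];
    char networkName[20];
    struct distance_table table[50];
    int nrRouters;
} rttable;

/* Hypothetical helper: write the whole struct, looping on short
 * writes. Returns 0 on success, -1 on error. */
int send_table(int fd, const rttable *t)
{
    const unsigned char *p = (const unsigned char *)t;
    size_t left = sizeof(*t);           /* size of the struct, not the pointer */
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0)
            return -1;
        p += w;
        left -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: allocate a struct and fill it from fd.
 * Returns NULL on error or premature end of file; caller frees. */
rttable *recv_table(int fd)
{
    rttable *t = malloc(sizeof(*t));
    if (t == NULL)
        return NULL;
    unsigned char *p = (unsigned char *)t;
    size_t left = sizeof(*t);
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r <= 0) {
            free(t);
            return NULL;
        }
        p += r;
        left -= (size_t)r;
    }
    return t;
}
```

These helpers assume blocking descriptors; with O_NONBLOCK set, read() and write() can fail with EAGAIN, which this sketch treats as an error.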

Resources