Write atomically to a file using Write() with snprintf() - c

I want to be able to write atomically to a file, I am trying to use the write() function since it seems to grant atomic writes in most linux/unix systems.
Since I have variable string lengths and multiple printf's, I was told to use snprintf() and pass it as an argument to the write function in order to be able to do this properly, upon reading the documentation of this function I did a test implementation as below:
int file = open("file.txt", O_CREAT | O_WRONLY);
if(file < 0)
perror("Error:");
char buf[200] = "";
int numbytes = snprintf(buf, sizeof(buf), "Example string %s" stringvariable);
write(file, buf, numbytes);
From my tests it seems to have worked but my question is if this is the most correct way to implement it since I am creating a rather large buffer (something I am 100% sure will fit all my printfs) to store it before passing to write.

No, write() is not atomic, not even when it writes all of the data supplied in a single call.
Use advisory record locking (fcntl(fd, F_SETLKW, &lock)) in all readers and writers to achieve atomic file updates.
fcntl()-based record locks work over NFS on both Linux and BSDs; flock()-based file locks may not, depending on system and kernel version. (If NFS locking is disabled like it is on some web hosting services, no locking will be reliable.) Just initialize the struct flock with .l_whence = SEEK_SET, .l_start = 0, .l_len = 0 to refer to the entire file.
Use asprintf() to print to a dynamically allocated buffer:
char *buffer = NULL;
int length;
length = asprintf(&buffer, ...);
if (length == -1) {
/* Out of memory */
}
/* ... Have buffer and length ... */
free(buffer);
After adding the locking, do wrap your write() in a loop:
{
const char *p = (const char *)buffer;
const char *const q = (const char *)buffer + length;
ssize_t n;
while (p < q) {
n = write(fd, p, (size_t)(q - p));
if (n > 0)
p += n;
else
if (n != -1) {
/* Write error / kernel bug! */
} else
if (errno != EINTR) {
/* Error! Details in errno */
}
}
}
Although there are some local filesystems that guarantee write() does not return a short count unless you run out of storage space, not all do; especially not the networked ones. Using a loop like above lets your program work even on such filesystems. It's not too much code to add for reliable and robust operation, in my opinion.
In Linux, you can take a write lease on a file to exclude any other process opening that file for a while.
Essentially, you cannot block a file open, but you can delay it for up to /proc/sys/fs/lease-break-time seconds, typically 45 seconds. The lease is granted only when no other process has the file open, and if any other process tries to open the file, the lease owner gets a signal. (If the lease owner does not release the lease, for example by closing the file, the kernel will automagically break the lease after the lease-break-time is up.)
Unfortunately, these only work in Linux, and only on local files, so they are of limited use.
If readers do not keep the file open, but open, read, and close it every time they read it, you can write a full replacement file (must be on the same filesystem; I recommend using a lock-subdirectory for this), and hard-link it over the old file.
All readers will see either the old file or the new file, but those that keep their file open, will never see any changes.

Related

Unable to write the complete script onto a device on the serial port

The script file has over 6000 bytes which is copied into a buffer.The contents of the buffer are then written to the device connected to the serial port.However the write function only returns 4608 bytes whereas the buffer contains 6117 bytes.I'm unable to understand why this happens.
{
FILE *ptr;
long numbytes;
int i;
ptr=fopen("compass_script(1).4th","r");//Opening the script file
if(ptr==NULL)
return 1;
fseek(ptr,0,SEEK_END);
numbytes = ftell(ptr);//Number of bytes in the script
printf("number of bytes in the calibration script %ld\n",numbytes);
//Number of bytes in the script is 6117.
fseek(ptr,0,SEEK_SET);
char writebuffer[numbytes];//Creating a buffer to copy the file
if(writebuffer == NULL)
return 1;
int s=fread(writebuffer,sizeof(char),numbytes,ptr);
//Transferring contents into the buffer
perror("fread");
fclose(ptr);
fd = open("/dev/ttyUSB3",O_RDWR | O_NOCTTY | O_NONBLOCK);
//Opening serial port
speed_t baud=B115200;
struct termios serialset;//Setting a baud rate for communication
tcgetattr(fd,&serialset);
cfsetispeed(&serialset,baud);
cfsetospeed(&serialset,baud);
tcsetattr(fd,TCSANOW,&serialset);
long bytesw=0;
tcflush(fd,TCIFLUSH);
printf("\nnumbytes %ld",numbytes);
bytesw=write(fd,writebuffer,numbytes);
//Writing the script into the device connected to the serial port
printf("bytes written%ld\n",bytesw);//Only 4608 bytes are written
close (fd);
return 0;
}
Well, that's the specification. When you write to a file, your process normally is blocked until the whole data is written. And this means your process will run again only when all the data has been written to the disk buffers. This is not true for devices, as the device driver is the responsible of determining how much data is to be written in one pass. This means that, depending on the device driver, you'll get all data driven, only part of it, or even none at all. That simply depends on the device, and how the driver implements its control.
On the floor, device drivers normally have a limited amount of memory to fill buffers and are capable of a limited amount of data to be accepted. There are two policies here, the driver can block the process until more buffer space is available to process it, or it can return with a partial write only.
It's your program resposibility to accept a partial read and continue writing the rest of the buffer, or to pass back the problem to the client module and return only a partial write again. This approach is the most flexible one, and is the one implemented everywhere. Now you have a reason for your partial write, but the ball is on your roof, you have to decide what to do next.
Also, be careful, as you use long for the ftell() function call return value and int for the fwrite() function call... Although your amount of data is not huge and it's not probable that this values cannot be converted to long and int respectively, the return type of both calls is size_t and ssize_t resp. (like the speed_t type you use for the baudrate values) long can be 32bit and size_t a 64bit type.
The best thing you can do is to ensure the whole buffer is written by some code snippet like the next one:
char *p = buffer;
while (numbytes > 0) {
ssize_t n = write(fd, p, numbytes);
if (n < 0) {
perror("write");
/* driver signals some error */
return 1;
}
/* writing 0 bytes is weird, but possible, consider putting
* some code here to cope for that possibility. */
/* n >= 0 */
/* update pointer and numbytes */
p += n;
numbytes -= n;
}
/* if we get here, we have written all numbytes */

Is it possible to write directly from file to the socket?

I'm interested in the basic principles of Web-servers, like Apache or Nginx, so now I'm developing my own server.
When my server gets a request, it's searching for a file (e.g index.html), if it exists - read all the content to the buffer (content) and write it to the socket after. Here's a simplified code:
int return_file(char* content, char* fullPath) {
file = open(fullPath, O_RDONLY);
if (file > 0) { // File was found, OK
while ((nread = read(file, content, 2048)) > 0) {}
close(file);
return 200;
}
}
The question is pretty simple: is it possible to avoid using buffer and write file content directly to the socket?
Thanks for any tips :)
There is no standardized system call which can write directly from a file to a socket.
However, some operating systems do provide such a call. For example, both FreeBSD and Linux implement a system call called sendfile, but the precise details differ between the two systems. (In both cases, you need the underlying file descriptor for the file, not the FILE* pointer, although on both these platforms you can use fileno() to extract the fd from the FILE*.)
For more information:
FreeBSD sendfile()
Linux sendfile()
What you can do is write the "chunk" you read immediately to the client.
In order to write the content, you MUST read it, so you can't avoid that, but you can use a smaller buffer, and write the contents as you read them eliminating the need to read the whole file into memory.
For instance, you could
unsigned char byte;
// FIXME: store the return value to allow
// choosing the right action on error.
//
// Note that `0' is not really an error.
while (read(file, &byte, 1) > 0) {
if (write(client, &byte, 1) <= 0) {
// Handle error.
}
}
but then, unsigned char byte; could be unsigned char byte[A_REASONABLE_BUFFER_SIZE]; which would be better, and you don't need to store ALL the content in memory.
}
No, it is not. There must be an intermediate storage that you use for reading/writing the data.
There is one edge case: when you use memory mapped files, the mapped file's region can be used for writing into socket. But internally, the system would anyway perform a read into memory buffer operation.

how to use 'cat' in following simple device read program

static ssize_t device_read (struct file* filp, char *bufStoreData, size_t bufCount, loff_t* curOffset)
{
printk(KERN_INFO"reading from the device");
ret = copy_to_user(bufStoreData,virtual_device.data,bufCount);
return ret;
}
does copy_to_user returns number of bytes remaining to read or number of bytes read?
whats the use of bufcount if i am using cat
if all the data is not read in single call how it can read the remaining data?Is this responsibility of application to issue system call again or the driver works automatically?
I need to understand this basic concept.
copy_to_user() returns the number of bytes that couldn't be copied to user space. If the complete buffer could be copied, it returns 0.
Normally, if !=0, means that there was some sort of memory problem (writting past a legal memory address), so these situations should be detected and reported to the user.
static ssize_t device_read (struct file* filp, char *bufStoreData,
size_t bufCount, loff_t* curOffset)
{
size_t bytes_to_copy;
printk(KERN_INFO"reading from the device");
/* do stuff to get device data into virtual_device.data . Also
update virtual_device.datasize */
bytes_to_copy = (virtual_device.datasize <= bufCount)?
virtual_device.datasize : bufCount;
/* note that I'm not using bufCount, but an hypothetical field in
virtual_device that gives me how much data the device has ready
for the user. I choose the lower of both */
/* Also recall that if the number of bytes requested by the user is
less than the number of bytes the device has generated, then the
next read should return the remainder of the device data, so the
driver should carry the count of how many bytes have been copied
to the user and how many are left. This is not covered in this
example. */
ret = copy_to_user(bufStoreData,virtual_device.data, bytes_to_copy);
if (ret != 0)
return -EPERM; /* if copy was not successful, report it */
return bytes_to_copy;
}
When the user issues ret = read (fd, buffer, sizebuff); it expects one of these things and should react accordingly:
ret is equal to sizebuff. That means that read could return all the data the user requested. Nothing else to do here.
ret is positive, but less than sizebuff. That means that the read gave the user some data, but not as much as he requested. The user process must re-issue the read syscall to retrieve the remaining data, if needed. Something like: ret = read (fd, buffer+ret, sizebuff-ret);
ret is 0. This means that the device has no more data to send. It's the EOF condition. User process should close the device.
ret is < 0. This is an error condition. User process must check errno and take appropiate measures.
Your device driver will have to return an appropiate value in device_read according to what happened to the device when it was read.
On the other hand, a process like cat expects to read as much as 4096 bytes per read call. If the device sends less than that, it will print the received data and will ask for more. cat will only stop if it receives a signal (Ctrl-C for example), or if a read call returns an unrecoverable error (such as ENODEVICE, which should be generated by your driver if such condition arises), or if reads 0 bytes (EOF condition).
A rather silly device that returns "Hello, world" to the user process. It employs some global data that must be reset in device_open function. Note that if several processes are going to use your device at the same time, these global data must be turned into instance data (using file->private_data). This device_read example shows how to deal with device buffers and user buffers, and how to keep track of bytes sent to the user, so the device never sends more data than it has, never sends more data than the user requests, and when the device runs out of data, it returns 0 to the user.
int curindx = 0; /* should be reset upon calling device_open */
static ssize_t device_read (struct file* filp, char *bufStoreData,
size_t bufCount, loff_t* curOffset)
{
size_t bytes_to_copy;
char device_data[]="Hello, world!\n";
size_t remaindersize;
remaindersize = strlen(device_data) - curindx;
bytes_to_copy = (remaindersize <= bufCount)?
remaindersize : bufCount;
ret = copy_to_user(bufStoreData,device_data+curindx, bytes_to_copy);
if (ret != 0)
return -EPERM; /* if copy was not successful, report it */
curindx += bytes_to_copy;
return bytes_to_copy;
}
1) does copy_to_user returns number of bytes remaining to read or number of bytes read?
copy_to_user returns a number of bytes that could not be copied.
2) whats the use of bufcount if i am using cat
bufCount is a number of bytes user can read. In other words, it's a buffer size of user space application. I guess cat uses multiple of PAGE_SIZE for buffer size, actually you can check it yourself by adding printk to your device_read() function:
print(KERN_INFO "bufCount=%ld\n", bufCount);
3) if all the data is not read in single call how it can read the remaining data? Is this responsibility of application to issue system call again or the driver works automatically?
User space programs use read() system call to read data from files (including block and character devices) which returns 0 only if the end of file is reached. That's how they know when to stop. So, yes, it's responsibility of user-space program to read remaining data (if it needs to).
ssize_t ret;
...
while ((ret = read(fd, buf, bufsize)) > 0) {...};
if (ret < 0)
error();
On the other hand, the responsibility of device driver is to correctly maintain offsets inside its internal structures and return values that make sense.
P/S:
I'd recommend you to read a book "Linux device drivers" which is freely available in internet (http://lwn.net/Kernel/LDD3/) and touches these topics in details.

how is select() alerted to an fd becoming "ready"?

I don't know why I'm having a hard time finding this, but I'm looking at some linux code where we're using select() waiting on a file descriptor to report it's ready. From the man page of select:
select() and pselect() allow a program to monitor multiple file descriptors,
waiting until one or more of the file descriptors become "ready" for some
class of I/O operation
So, that's great... I call select on some descriptor, give it some time out value and start to wait for the indication to go. How does the file descriptor (or owner of the descriptor) report that it's "ready" such that the select() statement returns?
It reports that it's ready by returning.
select waits for events that are typically outside your program's control. In essence, by calling select, your program says "I have nothing to do until ..., please suspend my process".
The condition you specify is a set of events, any of which will wake you up.
For example, if you are downloading something, your loop would have to wait on new data to arrive, a timeout to occur if the transfer is stuck, or the user to interrupt, which is precisely what select does.
When you have multiple downloads, data arriving on any of the connections triggers activity in your program (you need to write the data to disk), so you'd give a list of all download connections to select in the list of file descriptors to watch for "read".
When you upload data to somewhere at the same time, you again use select to see whether the connection currently accepts data. If the other side is on dialup, it will acknowledge data only slowly, so your local send buffer is always full, and any attempt to write more data would block until buffer space is available, or fail. By passing the file descriptor we are sending to to select as a "write" descriptor, we get notified as soon as buffer space is available for sending.
The general idea is that your program becomes event-driven, i.e. it reacts to external events from a common message loop rather than performing sequential operations. You tell the kernel "this is the set of events for which I want to do something", and the kernel gives you a set of events that have occured. It is fairly common for two events occuring simultaneously; for example, a TCP acknowledge was included in a data packet, this can make the same fd both readable (data is available) and writeable (acknowledged data has been removed from send buffer), so you should be prepared to handle all of the events before calling select again.
One of the finer points is that select basically gives you a promise that one invocation of read or write will not block, without making any guarantee about the call itself. For example, if one byte of buffer space is available, you can attempt to write 10 bytes, and the kernel will come back and say "I have written 1 byte", so you should be prepared to handle this case as well. A typical approach is to have a buffer "data to be written to this fd", and as long as it is non-empty, the fd is added to the write set, and the "writeable" event is handled by attempting to write all the data currently in the buffer. If the buffer is empty afterwards, fine, if not, just wait on "writeable" again.
The "exceptional" set is seldom used -- it is used for protocols that have out-of-band data where it is possible for the data transfer to block, while other data needs to go through. If your program cannot currently accept data from a "readable" file descriptor (for example, you are downloading, and the disk is full), you do not want to include the descriptor in the "readable" set, because you cannot handle the event and select would immediately return if invoked again. If the receiver includes the fd in the "exceptional" set, and the sender asks its IP stack to send a packet with "urgent" data, the receiver is then woken up, and can decide to discard the unhandled data and resynchronize with the sender. The telnet protocol uses this, for example, for Ctrl-C handling. Unless you are designing a protocol that requires such a feature, you can easily leave this out with no harm.
Obligatory code example:
#include <sys/types.h>
#include <sys/select.h>
#include <unistd.h>
#include <stdbool.h>
static inline int max(int lhs, int rhs) {
if(lhs > rhs)
return lhs;
else
return rhs;
}
void copy(int from, int to) {
char buffer[10];
int readp = 0;
int writep = 0;
bool eof = false;
for(;;) {
fd_set readfds, writefds;
FD_ZERO(&readfds);
FD_ZERO(&writefds);
int ravail, wavail;
if(readp < writep) {
ravail = writep - readp - 1;
wavail = sizeof buffer - writep;
}
else {
ravail = sizeof buffer - readp;
wavail = readp - writep;
}
if(!eof && ravail)
FD_SET(from, &readfds);
if(wavail)
FD_SET(to, &writefds);
else if(eof)
break;
int rc = select(max(from,to)+1, &readfds, &writefds, NULL, NULL);
if(rc == -1)
break;
if(FD_ISSET(from, &readfds))
{
ssize_t nread = read(from, &buffer[readp], ravail);
if(nread < 1)
eof = true;
readp = readp + nread;
}
if(FD_ISSET(to, &writefds))
{
ssize_t nwritten = write(to, &buffer[writep], wavail);
if(nwritten < 1)
break;
writep = writep + nwritten;
}
if(readp == sizeof buffer && writep != 0)
readp = 0;
if(writep == sizeof buffer)
writep = 0;
}
}
We attempt to read if we have buffer space available and there was no end-of-file or error on the read side, and we attempt to write if we have data in the buffer; if end-of-file is reached and the buffer is empty, then we are done.
This code will behave clearly suboptimal (it's example code), but you should be able to see that it is acceptable for the kernel to do less than we asked for both on reads and writes, in which case we just go back and say "whenever you're ready", and that we never read or write without asking whether it will block.
From the same man page:
On exit, the sets are modified in place to indicate which file descriptors actually changed status.
So use FD_ISSET() on the sets passed to select to determine which FDs have become ready.

locking of copy_[to/from]_user() in linux kernel

as stated in: http://www.kernel.org/doc/htmldocs/kernel-hacking.html#routines-copy this functions "can" sleep.
So, do I always have to do a lock (e.g. with mutexes) when using this functions or are there exceptions?
I'm currently working on a module and saw some Kernel Oops at my system, but cannot reproduce them. I have a feeling they are fired because I'm currently do no locking around copy_[to/from]_user(). Maybe I'm wrong, but it smells like it has something to do with it.
I have something like:
static unsigned char user_buffer[BUFFER_SIZE];
static ssize_t mcom_write (struct file *file, const char *buf, size_t length, loff_t *offset) {
ssize_t retval;
size_t writeCount = (length < BUFFER_SIZE) ? length : BUFFER_SIZE;
memset((void*)&user_buffer, 0x00, sizeof user_buffer);
if (copy_from_user((void*)&user_buffer, buf, writeCount)) {
retval = -EFAULT;
return retval;
}
*offset += writeCount;
retval = writeCount;
cleanupNewline(user_buffer);
dispatch(user_buffer);
return retval;
}
Is this save to do so or do I need locking it from other accesses, while copy_from_user is running?
It's a char device I read and write from, and if a special packet in the network is received, there can be concurrent access to this buffer.
You need to do locking iff the kernel side data structure that you are copying to or from might go away otherwise - but it is that data structure you should be taking a lock on.
I am guessing your function mcom_write is a procfs write function (or similar) right? In that case, you most likely are writing to the procfs file, your program being blocked until mcom_write returns, so even if copy_[to/from]_user sleeps, your program wouldn't change the buffer.
You haven't stated how your program works so it is hard to say anything. If your program is multithreaded and one thread writes while another can change its data, then yes, you need locking, but between the threads of the user-space program not your kernel module.
If you have one thread writing, then your write to the procfs file would be blocked until mcom_write finishes so no locking is needed and your problem is somewhere else (unless there is something else that is wrong with this function, but it's not with copy_from_user)

Resources