close() system call takes a long time to finish - C

I am testing performance using sendfile() to copy big files under Linux 6.4.
My code follows the logic below and is compiled with gcc:
int read_fd, write_fd;
struct stat stat_buf;
off_t offset = 0, left_to_write;
ssize_t written;

read_fd = open(argv[1], O_RDONLY);
fstat(read_fd, &stat_buf);
write_fd = open(argv[2], O_WRONLY | O_CREAT, stat_buf.st_mode);
left_to_write = stat_buf.st_size;
while (left_to_write > 0) {
    written = sendfile(write_fd, read_fd, &offset, stat_buf.st_size);
    if (written == -1)
        return -1;
    left_to_write -= written;
    printf("%zd bytes written, %ld bytes left to write\n",
           written, (long) left_to_write);
}
close(read_fd);
close(write_fd); /* this takes minutes */
The sendfile() call is very fast; I can see it writing chunks of 2 GB about every 5 seconds.
When the while loop is over, the program sits in close(write_fd) for minutes before finishing successfully.
Why does close(write_fd) take so long to run? Is it flushing a buffer? How can I make it faster?

It's due to writeback. close(write_fd) must report any write errors that have not yet been reported; for example, it must ensure that all the blocks required for the data have been allocated, or else report ENOSPC.
The data are not necessarily written to disk when close() returns; if you want that, you'll need to call fsync() (and endure an even longer wait).
See also the NOTES section of the manual page.
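If you do want the data on stable storage before the file is closed, a minimal sketch (reusing write_fd from the question) would be:
/* Flush the file's data to disk so that write-back errors are
   reported here instead of being deferred to close(). */
if (fsync(write_fd) == -1)
    perror("fsync");
if (close(write_fd) == -1)
    perror("close");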

Related

Write atomically to a file using write() with snprintf()

I want to be able to write atomically to a file. I am trying to use the write() function, since it seems to grant atomic writes on most Linux/Unix systems.
Since I have variable string lengths and multiple printf()s, I was told to use snprintf() and pass its result to the write() function in order to do this properly. Upon reading the documentation of this function, I did a test implementation as below:
int file = open("file.txt", O_CREAT | O_WRONLY, 0644); /* O_CREAT requires a mode argument */
if (file < 0)
    perror("Error");
char buf[200] = "";
int numbytes = snprintf(buf, sizeof(buf), "Example string %s", stringvariable);
write(file, buf, numbytes);
From my tests it seems to work, but my question is whether this is the correct way to implement it, since I am creating a rather large buffer (one I am 100% sure will fit all my printfs) to store the string before passing it to write().
No, write() is not atomic, not even when it writes all of the data supplied in a single call.
Use advisory record locking (fcntl(fd, F_SETLKW, &lock)) in all readers and writers to achieve atomic file updates.
fcntl()-based record locks work over NFS on both Linux and BSDs; flock()-based file locks may not, depending on the system and kernel version. (If NFS locking is disabled, as it is on some web hosting services, no locking will be reliable.) Just initialize the struct flock with .l_whence = SEEK_SET, .l_start = 0, .l_len = 0 to refer to the entire file.
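As a minimal sketch (assuming fd refers to the file being updated), taking and releasing a whole-file lock looks like this:
struct flock lock = {
    .l_type   = F_WRLCK,  /* use F_RDLCK in readers */
    .l_whence = SEEK_SET,
    .l_start  = 0,
    .l_len    = 0,        /* zero length covers the entire file */
};
if (fcntl(fd, F_SETLKW, &lock) == -1) {
    /* EINTR (interrupted by a signal) or EDEADLK; handle or retry */
}
/* ... read and/or update the file ... */
lock.l_type = F_UNLCK;
fcntl(fd, F_SETLK, &lock);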
Use asprintf() to print to a dynamically allocated buffer:
char *buffer = NULL;
int length;

/* Note: asprintf() is a GNU/BSD extension (define _GNU_SOURCE on glibc). */
length = asprintf(&buffer, ...);
if (length == -1) {
    /* Out of memory */
}
/* ... Have buffer and length ... */
free(buffer);
After adding the locking, do wrap your write() in a loop:
{
    const char *p = (const char *)buffer;
    const char *const q = (const char *)buffer + length;
    ssize_t n;

    while (p < q) {
        n = write(fd, p, (size_t)(q - p));
        if (n > 0)
            p += n;
        else if (n != -1) {
            /* Write error / kernel bug! */
        } else if (errno != EINTR) {
            /* Error! Details in errno */
        }
    }
}
Although some local filesystems guarantee that write() does not return a short count unless you run out of storage space, not all do; networked filesystems in particular do not. Using a loop like the one above lets your program work even on such filesystems. It's not too much code to add for reliable and robust operation, in my opinion.
In Linux, you can take a write lease on a file to exclude any other process opening that file for a while.
Essentially, you cannot block a file open, but you can delay it for up to /proc/sys/fs/lease-break-time seconds, typically 45 seconds. The lease is granted only when no other process has the file open, and if any other process tries to open the file, the lease owner gets a signal. (If the lease owner does not release the lease, for example by closing the file, the kernel will automagically break the lease after the lease-break-time is up.)
Unfortunately, these only work in Linux, and only on local files, so they are of limited use.
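A minimal sketch of taking and releasing a write lease (Linux-specific, and assuming fd is open on a local file you own):
if (fcntl(fd, F_SETLEASE, F_WRLCK) == -1) {
    /* Some other process has the file open, or insufficient privileges */
}
/* ... update the file; the kernel signals us (SIGIO by default)
   if another process tries to open it ... */
fcntl(fd, F_SETLEASE, F_UNLCK); /* release the lease */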
If readers do not keep the file open, but open, read, and close it every time they read it, you can write a full replacement file (it must be on the same filesystem; I recommend using a lock subdirectory for this), and rename(2) it over the old file.
All readers will see either the old file or the new file; but those that keep their file open will never see any changes.
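A minimal sketch of that replacement pattern (the file names are illustrative):
/* Write the complete replacement in the same directory, then
   atomically swap it into place with rename(). */
int tmp = open("data.txt.new", O_CREAT | O_EXCL | O_WRONLY, 0644);
/* ... write the full new contents to tmp ... */
if (fsync(tmp) == -1 || close(tmp) == -1) {
    /* handle the write-back error */
}
if (rename("data.txt.new", "data.txt") == -1) {
    /* handle the error; the old file is still intact */
}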

Disk write does not work with malloc in C

I'm writing to a disk device using C code.
First I tried with malloc(), and found that write() did not work (it returned -1):
fd = open("/dev/sdb", O_DIRECT | O_SYNC | O_RDWR);
void *buff = malloc(512);
lseek(fd, 0, SEEK_SET);
write(fd, buff, 512);
Then I changed the second line to the following, and it worked:
void *buff;
posix_memalign(&buff, 512, 512);
However, when I changed the lseek() offset to 1 (lseek(fd, 1, SEEK_SET);), write() failed again.
First, why didn't malloc() work?
Second, I know that in my case posix_memalign() guarantees that the start address of the buffer is a multiple of 512. But shouldn't memory alignment and write() be separate concerns? Why can't I write to any offset I want?
From the Linux man page for open(2):
The O_DIRECT flag may impose alignment restrictions on the length
and address of user-space buffers and the file offset of I/Os.
And:
Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Under Linux 2.6, alignment to 512-byte boundaries suffices.
The meaning of O_DIRECT is to "try to minimize cache effects of the I/O to and from this file". If I understand it correctly, the kernel then copies directly from the user-space buffer, which may require stricter alignment of the data.
Maybe the documentation doesn't spell it out, but it's quite possible that writes and reads to/from a block device are required to be aligned and to cover entire blocks in order to succeed (this would explain why your first and last cases fail but not the second). If you use Linux, the documentation of open(2) basically says this:
The O_DIRECT flag may impose alignment restrictions on the length and address of user-space buffers and the file offset of I/Os. In Linux alignment restrictions vary by file system and kernel version and might be absent entirely. However there is currently no file system-independent interface for an application to discover these restrictions for a given file or file system. Some file systems provide their own interfaces for doing so, for example the XFS_IOC_DIOINFO operation in xfsctl(3).
Your code lacks error handling. Every line contains functions that may fail, and open(), lseek(), and write() also report the cause of the error in errno. So with some error handling, it would be:
fd = open("/dev/sdb", O_DIRECT | O_SYNC | O_RDWR);
if (fd == -1) {
    perror("open failed");
    return;
}
void *buff = malloc(512);
if (!buff) {
    printf("malloc failed");
    return;
}
if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
    perror("lseek failed");
    free(buff);
    return;
}
if (write(fd, buff, 512) == -1) {
    perror("write failed");
    free(buff);
    return;
}
In that case you would at least get a more detailed explanation of what goes wrong. Here I suspect you would get EIO (Input/output error) from the write() call.
Note that the above still isn't complete error handling, as perror() and printf() can themselves fail (and you might want to do something about that possibility).
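Putting the alignment rules together, a minimal sketch of an O_DIRECT write that satisfies all three constraints at once (buffer address, file offset, and transfer length each a multiple of the 512-byte logical block size) would be:
void *buff;
if (posix_memalign(&buff, 512, 512) != 0)  /* address: multiple of 512 */
    return;
memset(buff, 0, 512);
if (lseek(fd, 512, SEEK_SET) == (off_t)-1) /* offset: multiple of 512 */
    perror("lseek failed");
else if (write(fd, buff, 512) == -1)       /* length: multiple of 512 */
    perror("write failed");
free(buff);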

Fork Process / Read Write through pipe SLOW

ANSWER
https://stackoverflow.com/a/12507520/962890
It was so trivial... argh! But lots of good information received. Thanks to everyone.
EDIT
link to github: https://github.com/MarkusPfundstein/stream_lame_testing
ORIGINAL POST
I have some questions regarding IPC through pipes. My goal is to receive MP3 data over a TCP/IP stream, pipe it through LAME to decode it to WAV, do some math, and store it on disk (as a WAV). I am using non-blocking I/O for the whole thing.
What irritates me a bit is that the TCP/IP read is much faster than the pipe through LAME. When I send a ~3 MB MP3, the file gets read on the client side in a couple of seconds. In the beginning I can also write to the stdin of the LAME process; then it stops accepting writes, reads the rest of the MP3, and once that's finished I can write to LAME again. 4096 bytes take approximately 1 second (to write to and read back from LAME). This is pretty slow, because I need to decode at a minimum of 128 kb/s.
The OS is Debian with a 2.6 kernel, on this microcomputer:
https://www.olimex.com/dev/imx233-olinuxino-maxi.html
65 MB RAM
400 MHz
ulimit -a | grep pipe returns 512 bytes x 8, meaning 4096, which is OK. It's a 32-bit system.
The weird thing is that
my_process | lame --decode --mp3input - output.wav
goes very fast.
Here is my fork_lame() code (which essentially connects the stdout of my process to the stdin of lame, and vice versa):
static char * const k_lame_args[] = {
    "--decode",
    "--mp3input",
    "-",
    "-",
    NULL
};

static int
fork_lame()
{
    int outfd[2];
    int infd[2];
    int npid;

    pipe(outfd); /* Where the parent is going to write to */
    pipe(infd);  /* From where parent is going to read */

    npid = fork();
    if (npid == 0) {
        close(STDOUT_FILENO);
        close(STDIN_FILENO);
        dup2(outfd[0], STDIN_FILENO);
        dup2(infd[1], STDOUT_FILENO);
        close(outfd[0]); /* Not required for the child */
        close(outfd[1]);
        close(infd[0]);
        close(infd[1]);
        if (execv("/usr/local/bin/lame", k_lame_args) == -1) {
            perror("execv");
            return 1;
        }
    } else {
        s_lame_pid = npid;
        close(outfd[0]); /* These are being used by the child */
        close(infd[1]);
        s_lame_fds[WRITE] = outfd[1];
        s_lame_fds[READ] = infd[0];
    }
    return 0;
}
These are the read and write functions. Please note that in write_lame_in(), when I write to stderr instead of s_lame_fds[WRITE], the output appears nearly immediately, so it's definitely the pipe through lame. But why?
static int
read_lame_out()
{
    char buffer[READ_SIZE];
    memset(buffer, 0, sizeof(buffer));
    int br = read(s_lame_fds[READ], buffer, sizeof(buffer) - 1);
    fprintf(stderr, "read %d bytes from lame out\n", br);
    return br;
}

static int
write_lame_in()
{
    int bytes_written;
    //bytes_written = write(2, s_data_buf, s_data_len);
    bytes_written = write(s_lame_fds[WRITE], s_data_buf, s_data_len);
    if (bytes_written > 0) {
        //fprintf(stderr, "%d bytes written\n", bytes_written);
        s_data_len -= bytes_written;
        fprintf(stderr, "data_len write: %d\n", s_data_len);
        memmove(s_data_buf, s_data_buf + bytes_written, s_data_len);
        if (s_data_len == 0) {
            fprintf(stderr, "finished\n");
        }
    }
    return bytes_written;
}

static int
read_tcp_socket(struct connection_s *connection)
{
    char buffer[READ_SIZE];
    int bytes_read;

    bytes_read = connection_read(connection, buffer, sizeof(buffer) - 1);
    if (bytes_read > 0) {
        //fprintf(stderr, "read %d bytes\n", bytes_read);
        if (s_data_len + bytes_read > sizeof(s_data_buf)) {
            fprintf(stderr, "BUFFER OVERFLOW\n");
            return -1;
        } else {
            memcpy(s_data_buf + s_data_len, buffer, bytes_read);
            s_data_len += bytes_read;
        }
        fprintf(stderr, "data_len: %d\n", s_data_len);
    }
    return bytes_read;
}
The select stuff is pretty basic select() logic. All descriptors are non-blocking, of course.
Anyone have any idea? I'd really appreciate any help ;-)
Oops! Did you check your LAME output?
Looking at your code, in particular
static char * const k_lame_args[] = {
    "--decode",
    "--mp3input",
    "-",
    "-",
    NULL
};
and
if (execv("/usr/local/bin/lame", k_lame_args) == -1) {
it seems you are accidentally omitting the --decode flag, as it will be argv[0] for LAME instead of the first argument (argv[1]). You should use
static char * const k_lame_args[] = {
    /* argv[0] */ "lame",
    /* argv[1] */ "--decode",
    /* argv[2] */ "--mp3input",
    /* argv[3] */ "-",
    /* argv[4] */ "-",
    NULL
};
instead.
I think you are seeing a slowdown because you're accidentally recompressing the MP3 audio. (I noticed this just a minute ago, so haven't checked if LAME does that if you omit the --decode flag, but I believe it does.)
It is possible there is some sort of a blocking issue wrt. nonblocking pipes (not really being nonblocking), causing your end to block until LAME consumes the data.
Could you try an alternative approach? Use normal, blocking pipes, and a separate thread (using pthreads), which has the singular purpose of writing data from a circular buffer to LAME. Your main thread then keeps filling the circular buffer from your TCP/IP connection, and can easily also track and report buffer levels -- very useful during development and debugging. I've had much better success with blocking pipes and threads than nonblocking pipes, in general.
In Linux, threads really do not have that much overhead, so you should be comfortable using them even on embedded architectures. The only trick you must master is specifying a sensible stack size for the worker thread; in this case 16384 bytes is quite likely enough, because only the initial stack given to the process grows automatically, and thread stacks are fixed and by default quite large.
Do you need example code?
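Something like this minimal sketch of the worker-thread idea (tcp_fd and lame_stdin_fd are illustrative names, not from your code):
#include <pthread.h>
#include <unistd.h>

struct copy_args { int from; int to; };

/* Worker: drain one (blocking) descriptor into another. */
static void *copy_worker(void *arg)
{
    struct copy_args *a = arg;
    char buf[4096];
    ssize_t n;

    while ((n = read(a->from, buf, sizeof buf)) > 0)
        if (write(a->to, buf, (size_t)n) != n)
            break; /* short write or error: give up */
    return NULL;
}

/* Usage: spawn the worker with a small, fixed stack. */
pthread_attr_t attr;
pthread_t tid;
struct copy_args args = { tcp_fd, lame_stdin_fd };

pthread_attr_init(&attr);
pthread_attr_setstacksize(&attr, 16384); /* must be >= PTHREAD_STACK_MIN */
pthread_create(&tid, &attr, copy_worker, &args);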
Edited to add:
Your program probably receives data from the TCP/IP connection at a steady rate. However, LAME consumes the data in largish chunks. In other words, the situation is like a car being towed, with the tow car jerking and stopping and the towed car jerking into it every time: both your process and LAME spend most of the time waiting for the other to receive/send more data.
First, those two close() calls are not required (actually, you shouldn't do that), because the two dup2() calls which follow will do it automatically:
close(STDOUT_FILENO);
close(STDIN_FILENO);
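In other words, the child-side plumbing from fork_lame() can simply be:
dup2(outfd[0], STDIN_FILENO);  /* implicitly closes the old fd 0 */
dup2(infd[1], STDOUT_FILENO);  /* implicitly closes the old fd 1 */
close(outfd[0]); /* Not required for the child */
close(outfd[1]);
close(infd[0]);
close(infd[1]);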

open syscall on fifo not blocking?

I'm creating a quite big project as homework, where I need to create a server program that listens on two FIFOs, to which clients will write.
Everything works, but there is something that is making me angry: whenever I do an operation, which is composed of some writes/reads between client and server, when I close the FIFOs on the client side, it looks like the server "thinks" someone is still keeping those FIFOs open.
Because of this, the server tries to read 64 bytes after each operation, obviously failing (reading 0 bytes). This happens only once per operation; it doesn't keep trying to read 64 bytes.
It doesn't cause any problems for clients, but it's really strange and I hate this type of bug.
I think it's a problem connected to open/close and to the fact that the clients use a lock.
Note: the flags used in the open operations are specified in the pseudocode below.
Server behaviour:
Open Fifo(1) for READING (O_RDONLY)
Open Fifo(2) for WRITING (O_WRONLY)
Do some operations
Close Fifo(1)
Close Fifo(2)
Client behaviour:
Set a lock on Fifo(1) (waiting if there is already one)
Set a lock on Fifo(2) (same as before)
Open Fifo(1) for WRITING (O_WRONLY)
Open Fifo(2) for READING (O_RDONLY)
Do some operations
Close Fifo(1)
Close Fifo(2)
Get lock from Fifo(1)
Get lock from Fifo(2)
I can't post the code directly, except for the functions used for networking, because the project is quite big and I don't use the syscalls directly. Here you are:
int Network_Open(const char *path, int oflag)
{
    return open(path, oflag);
}

ssize_t Network_IO(int fifo, NetworkOpCodes opcode, void *data, size_t dataSize)
{
    ssize_t retsize = 0;
    errno = 0;
    if (dataSize == 0) return 0;
    while ((retsize = (opcode == NetworkOpCode_Write ?
                       write(fifo, data, dataSize) :
                       read(fifo, data, dataSize))) < 0)
    {
        if (errno != EINTR) break;
    }
    return retsize;
}

Boolean Network_Send(int fifo, const void *data, size_t dataSize)
{
    return ((ssize_t)dataSize) == Network_IO(fifo, NetworkOpCode_Write, (void *)data, dataSize);
}

Boolean Network_Receive(int fifo, void *data, size_t dataSize)
{
    return ((ssize_t)dataSize) == Network_IO(fifo, NetworkOpCode_Read, data, dataSize);
}

Boolean Network_Close(int fifo)
{
    if (fifo >= 0)
        return close(fifo) == 0;
    return False; /* invalid descriptor */
}
Any help will be appreciated, thanks.
EDIT 1:
Client output: http://pastie.org/2523854
Server output (strace): http://pastie.org/2523858
Zero bytes returned from a (blocking) read() indicates end of file, i.e., the other end has closed the FIFO. Read the man page for read(2).
The zero bytes result from read() means that the other process has finished. Now your server must close the original file descriptor and reopen the FIFO to serve the next client. The blocking operations will resume once you start working with the new file descriptor.
That's the way it is supposed to work.
AFAIK, after you get the zero bytes, further attempts to read on the same file descriptor will also return 0 bytes, in perpetuity (or until you close the file descriptor). Even if another process opens the FIFO, the original file descriptor will continue to indicate EOF (and the other client process will hang waiting for a server process to open the FIFO for reading).
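A minimal sketch of the serve-and-reopen loop described above (FIFO_PATH and handle() are illustrative names):
for (;;) {
    int fd = open(FIFO_PATH, O_RDONLY); /* blocks until a client opens for writing */
    char buf[64];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        handle(buf, (size_t)n); /* illustrative request handler */
    close(fd); /* n == 0: the client closed its end; reopen for the next client */
}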

Does Linux's splice(2) work when splicing from a TCP socket?

I've been writing a little program for fun that transfers files over TCP in C on Linux. The program reads a file from a socket and writes it to a file (or vice versa). I originally used read/write and the program worked correctly, but then I learned about splice and wanted to give it a try.
The code I wrote with splice works perfectly when reading from stdin (a redirected file) and writing to the TCP socket, but fails immediately, with splice setting errno to EINVAL, when reading from the socket and writing to stdout. The man page states that EINVAL is set when neither descriptor is a pipe (not the case here), when an offset is passed for a stream that can't seek (no offsets are passed), or when the filesystem doesn't support splicing, which leads me to my question: does this mean that TCP can splice from a pipe, but not to one?
I'm including the code below (minus error handling code) in the hopes that I've just done something wrong. It's based heavily on the Wikipedia example for splice.
static void splice_all(int from, int to, long long bytes)
{
    long long bytes_remaining;
    long result;

    bytes_remaining = bytes;
    while (bytes_remaining > 0) {
        result = splice(
            from, NULL,
            to, NULL,
            bytes_remaining,
            SPLICE_F_MOVE | SPLICE_F_MORE
        );
        if (result == -1)
            die("splice_all: splice");
        bytes_remaining -= result;
    }
}

static void transfer(int from, int to, long long bytes)
{
    int result;
    int pipes[2];

    result = pipe(pipes);
    if (result == -1)
        die("transfer: pipe");
    splice_all(from, pipes[1], bytes);
    splice_all(pipes[0], to, bytes);
    close(from);
    close(pipes[1]);
    close(pipes[0]);
    close(to);
}
On a side note, I think the above will block on the first splice_all() when the file is large enough (due to the pipe filling up?), so I also have a version of the code that forks to read and write from the pipe at the same time, but it has the same error as this version and is harder to read.
EDIT: My kernel version is 2.6.22.18-co-0.7.3 (running coLinux on XP.)
What kernel version is this? Linux has had support for splicing from a TCP socket since 2.6.25 (commit 9c55e01c0), so if you're using an earlier version, you're out of luck.
You need to splice_all from pipes[0] to the destination (to) every time you do a single splice from the source (from) to pipes[1], with the splice_all covering the number of bytes just read by that single splice. Reason: a pipe represents a finite kernel memory buffer. If bytes is more than that, you'll block forever in your splice_all(from, pipes[1], bytes).
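A minimal sketch of that fix, reusing the splice_all() and die() helpers from the question:
static void transfer(int from, int to, long long bytes)
{
    int pipes[2];
    long long left = bytes;

    if (pipe(pipes) == -1)
        die("transfer: pipe");
    while (left > 0) {
        /* One splice into the pipe, bounded by the pipe's capacity... */
        long result = splice(from, NULL, pipes[1], NULL,
                             left, SPLICE_F_MOVE | SPLICE_F_MORE);
        if (result == -1)
            die("transfer: splice");
        /* ...then drain exactly that many bytes out to the destination. */
        splice_all(pipes[0], to, result);
        left -= result;
    }
    close(pipes[1]);
    close(pipes[0]);
}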
