I want to mmap stdin. But I can't mmap stdin.
So I have to call read and realloc in a loop, and I want to optimize it by choosing a good buffer size.
fstat of the file descriptor 0 gives a struct stat with a member named st_size, which seems to represent the number of bytes that are already in the pipe buffer corresponding to stdin.
Depending on how I call my program, st_size varies between 0 and the full pipe buffer size of stdin, which is about 65520. For example, in this program:
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat buf;
    int i;

    for (i = 0; i < 20; i++) {
        fstat(0, &buf);
        printf("%lld\n", (long long)buf.st_size);
        usleep(10);
    }
}
We can observe the buffer is still being filled:
$ cc fstat.c
$ for (( i=0; i<10000; i++ )) do echo "0123456789" ; done | ./a.out
9196
9306
9350
9394
9427
9471
9515
9559
...
And the output of this program changes every time I re-run it.
I'd like to use st_size for the initial buffer size so that I have to make fewer calls to realloc.
I have three questions, most important one first:
Can I 'flush' stdin, i.e. wait until the buffer is not being filled anymore?
Could I do better? Maybe there is another way to optimize this, to make the realloc-in-a-loop feel less dirty.
Can you confirm I can't use mmap with stdin?
A few details:
I wanted to use mmap because I already wrote a program, for educational purposes, that does what nm does, using mmap. I'm wondering how it would handle stdin.
I know that st_size can be 0, in which case I will have no choice but to define a buffer size.
I know stdin can give more bytes than the pipe buffer size.
I want to optimize for educational purposes. There is no imperative need for it.
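For reference, here is roughly the read/realloc loop I have in mind (a rough sketch; the 65536-byte starting size is the arbitrary placeholder I would like to replace with something smarter):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    size_t cap = 65536;                 /* placeholder starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    ssize_t n;

    if (buf == NULL)
        return 1;

    while ((n = read(0, buf + len, cap - len)) > 0) {
        len += (size_t)n;
        if (len == cap) {               /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return 1;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    printf("read %zu bytes\n", len);
    free(buf);
    return 0;
}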
Thank you
Related
I am trying to speed up my C program so that it spits out data faster.
Currently I am using printf() to give some data to the outside world. It is a continuous stream of data, therefore I cannot simply return the data.
How can I use write() or fwrite() to send the data to the console instead of a file?
Overall, my setup consists of a program written in C whose output goes to a Python script, where the data is processed further. I form a pipe:
./program_in_c | script_in_python
This gives an additional benefit on the Raspberry Pi by using more of the processor's cores.
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
write() writes up to count bytes from the buffer starting at buf to
the file referred to by the file descriptor fd.
The standard output file descriptor is 1, on Linux at least.
Consider flushing the stdout buffer as well, before calling the write system call, to ensure that all previously buffered output has been written:
fflush(stdout); // Will now print everything in the stdout buffer
write(1, buf, count);
Using fwrite():
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb items of data, each size bytes
long, to the stream pointed to by stream, obtaining them from the
location given by ptr.
fflush(stdout);
int buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};   /* example data */
/* the third argument is the number of elements, not the size in bytes */
fwrite(buf, sizeof(int), sizeof(buf) / sizeof(int), stdout);
Please refer to the man pages for further reading, in the links below:
fwrite
write
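For a fuller picture, here is a minimal sketch of a producer that streams lines to standard output with write(2); the payload and the line count are placeholders:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char line[64];
    int i;

    /* flush anything already buffered by stdio before switching to write(2) */
    fflush(stdout);

    for (i = 0; i < 10; i++) {
        int len = snprintf(line, sizeof line, "sample %d\n", i);
        if (write(1, line, (size_t)len) < 0)
            return 1;
    }
    return 0;
}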
Well, there's little or no win in trying to overcome the buffering already done by the stdio.h package. If you try to use fwrite() with larger buffers, you probably won't save any more time, and you'll use more memory than necessary, as stdio.h selects the buffer size best suited to the filesystem the data is written to.
A simple program like the following will show that speed is of no concern, as stdio is already buffering output.
#include <stdio.h>

int
main()
{
    int c;

    while ((c = getchar()) >= 0)
        putchar(c);
}
If you compare the program above with the one below:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    char buffer[512];
    int n;

    while ((n = read(0, buffer, sizeof buffer)) > 0)
        write(1, buffer, n);
    if (n < 0) {
        perror("read");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
You will see that there's no significant difference; if anything, the first program may even be faster, despite doing I/O one character at a time. (Brian Kernighan and Dennis Ritchie wrote it that way in the first edition of "The C Programming Language".) Most probably the first program will win.
The calls to read() and write() each involve a system call, with a buffer size decided by you. The individual getchar() and putchar() calls don't: they just store the characters in a memory buffer, whose size has been decided by the stdio.h library implementation based on the filesystem, and flush that buffer once it is full of data. If you grow the buffer size in the second program, you'll see better results up to a point, but beyond that there is no further gain in speed. The number of calls made to the library is insignificant compared with the time spent doing the actual I/O, and selecting a very large buffer will eat a lot of memory from your system (and a Raspberry Pi's memory is limited in this sense, to 1 GB of RAM). If you end up swapping because of an overly large buffer, you'll lose the battle completely.
Most filesystems have a preferred buffer size, because the kernel does read-ahead (on sequential reads it reads more than you asked for, anticipating that you'll keep reading after you have consumed the data), and this affects the optimum buffer size. For that reason, the stat(2) system call tells you the optimum buffer size, and stdio uses it when it selects the actual buffer size.
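For example, here is a small sketch that queries that preferred size with fstat(2) (the st_blksize field of struct stat) and uses it as the copy buffer size; the 4096-byte fallback is an arbitrary assumption:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int
main()
{
    struct stat st;
    size_t bufsize = 4096;              /* arbitrary fallback if fstat fails */

    /* st_blksize is the filesystem's preferred I/O block size */
    if (fstat(1, &st) == 0 && st.st_blksize > 0)
        bufsize = (size_t)st.st_blksize;

    char *buffer = malloc(bufsize);
    if (buffer == NULL)
        return EXIT_FAILURE;

    ssize_t n;
    while ((n = read(0, buffer, bufsize)) > 0)
        write(1, buffer, (size_t)n);

    free(buffer);
    return EXIT_SUCCESS;
}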
Don't expect to do better (or much better) than the first program listed above, even if you use large enough buffers.
What is not correct (or valid) is to mix calls that do buffering (like everything in the stdio package) with the basic system calls (like read(2) or write(2)) that do not buffer the data; I've seen people recommend using fflush(3) together with write(2), which is totally incoherent. There is nothing to gain, and you'll probably get your output incorrectly ordered if you make part of the calls with printf(3) and part with write(2). This happens more often in pipelines like the one you plan to build, because there the stdio buffers are not line oriented (another characteristic of buffered output in stdio).
Finally, I recommend that you read "The Unix Programming Environment" by Brian Kernighan and Rob Pike. It will teach you a lot about Unix, and one very good thing is that it will teach you to use the stdio package and the Unix filesystem calls for reading and writing properly. With a little luck you'll find it as a PDF on the internet.
The next program shows you the effect of buffering:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";

    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
One would assume you are going to see (on the terminal) the program writing the numbers 0 to 9, separated by ", " and paced at one-second intervals.
But due to the buffering, what you observe is quite different: the program waits for 10 seconds without writing anything at all to the terminal, and at the end, when it terminates, it writes everything in one shot, including the final line end, and the shell shows you the prompt again.
If you change the program to this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";

    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        fflush(stdout);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
You'll see the expected output, because you have told stdio to flush the buffer on each pass through the loop. In both programs you make 10 calls to printf(3), but in the first version there is only one write(2), at the end, writing the full buffer. In the second version you force stdio to do one such write(2) after each printf, so the data shows up as the program goes through the loop.
Be careful, because another characteristic of stdio can confuse you: when you print to a terminal device, printf(3) flushes the output at each \n, but when the program runs through a pipe, it flushes only when the buffer fills up completely. This saves system calls. (In FreeBSD, for example, the buffer size selected by stdio is around 32 KB, which is large enough to be optimal: you won't do better by going above that size.)
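If you do want line-at-a-time behaviour even when the output goes through a pipe, one option (just a sketch, not the only way) is to ask stdio for line buffering explicitly with setvbuf(3):

#include <stdio.h>
#include <unistd.h>

int
main()
{
    /* request line buffering even when stdout is a pipe;
       must be called before any output is written */
    setvbuf(stdout, NULL, _IOLBF, BUFSIZ);

    int i;
    for (i = 0; i < 10; i++) {
        printf("%d\n", i);   /* each completed line is flushed immediately */
        sleep(1);
    }
    return 0;
}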
The console output in C works almost the same way as a file. Once you have included stdio.h, you can write on the console output, named stdout (for "standard output"). In the end, the following statement:
printf("hello world!\n");
is the same as:
char str[] = "hello world!\n";
fwrite(str, sizeof(char), sizeof(str) - 1, stdout);
fflush(stdout);
I have a sample program that takes input from the terminal and executes it in a cloned child in a subshell.
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/wait.h>
#include <sched.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>

int clone_function(void *arg) {
    execl("/bin/sh", "sh", "-c", (char *)arg, (char *)NULL);
}

int main() {
    while (1) {
        char data[512] = {'\0'};
        int n = read(0, data, sizeof(data));
        // fgets(data, 512, stdin);
        // int n = strlen(data);

        if ((strcmp(data, "exit\n") != 0) && n > 1) {
            char *line;
            char *lines = strdup(data);

            while ((line = strsep(&lines, "\n")) != NULL && strcmp(line, "") != 0) {
                void *clone_process_stack = malloc(8192);
                void *stack_top = clone_process_stack + 8192;
                int clone_flags = CLONE_VFORK | CLONE_FS;

                clone(clone_function, stack_top, clone_flags | SIGCHLD, (void *)line);

                int status;
                wait(&status);
                free(clone_process_stack);
            }
        } else {
            exit(0);
        }
    }
    return 0;
}
The above code works on an older Linux system (with minimal RAM) but not on a newer one. By "not working" I mean that if I type a simple command like "ls" I don't see the output on the console, whereas on the older system I do.
Also, if I run the same code under gdb, then I see the output printed to the console on the newer system as well.
In addition, if I use fgets() instead of read(), it works as expected on both systems without an issue.
I have been trying to understand the behavior and I couldn't figure it out. I tried running strace. The difference I see is that in the cases where it works, the output of the ls program appears around the wait() return, and in the cases where it doesn't, nothing appears.
The only thing I can think of is that read(), since it's not a library function, might behave differently across systems. But I can't see how that would affect the output.
Can someone point out why I might be observing this behavior?
EDIT
The code is compiled as:
gcc test.c -o test
strace when it's not working as expected is shown below
strace when it's working as expected (only difference is I added a printf("%d\n", n); following the call for read())
Thank you
Shabir
There are multiple problems in your code:
A successful read system call can return any number of bytes between 1 and the buffer size, depending on the type of handle and on the available input. It does not stop at newlines the way fgets() does, so you might get line fragments, multiple lines, or multiple lines plus a line fragment.
Furthermore, if read fills the whole buffer, as it might when reading from a regular file, there is no trailing null terminator, so passing the buffer to string functions has undefined behavior.
The test if ((strcmp(data, "exit\n") != 0) && n > 1) { is performed in the wrong order: first test whether read was successful, and only then test the buffer contents.
You do not set the null terminator after the last byte read by read, relying instead on buffer initialization, which is wasteful and insufficient if read fills the whole buffer. Instead you should make data one byte longer than the read size argument, and set data[n] = '\0'; if n > 0.
Here are ways to fix the code:
Using fgets(), you can remove the line-splitting code: just remove initial and trailing white space, ignore empty and comment lines, then clone and execute the commands.
Using read(), you could read one byte at a time, collecting the bytes into the buffer until you have a complete line, then null-terminate the buffer and use the same rudimentary parser as above. This approach mimics fgets(), bypassing the buffering performed by the standard streams: it is quite inefficient, but it avoids reading from handle 0 past the end of the line, leaving pending input available for the child process to read. A sketch of this approach follows.
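Here is a sketch of that byte-at-a-time approach (the function name and the buffer size are just placeholders):

#include <stdio.h>
#include <unistd.h>

/* Read one byte at a time from fd 0 until a newline or EOF.
   Returns the number of bytes stored, 0 at end of input. */
static ssize_t read_line(char *buf, size_t size)
{
    size_t len = 0;
    char c;

    while (len + 1 < size && read(0, &c, 1) == 1) {
        buf[len++] = c;
        if (c == '\n')
            break;
    }
    buf[len] = '\0';
    return (ssize_t)len;
}

int main() {
    char line[512];
    ssize_t n;

    while ((n = read_line(line, sizeof line)) > 0)
        printf("got %zd bytes: %s", n, line);   /* parse/clone here instead */
    return 0;
}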
It looks like 8192 is simply too small a value for stack size on a modern system. execl needs more than that, so you are hitting a stack overflow. Increase the value to 32768 or so and everything should start working again.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main()
{
    int fd;
    char bf[4096];
    int buf_size = 4096;

    fd = open("/proc/18022/cmdline", O_RDONLY);
    int bytes = read(fd, bf, buf_size - 1);
    if (bytes >= 0)
        bf[bytes] = '\0';   /* terminate before printing the buffer as a string */
    printf("%d %s\n\n", bytes, bf);
    close(fd);
}
The above code always reads only 3072 bytes, even though cmdline contains more than 3072 characters.
If I copy the contents of cmdline into gedit and then run the above code on this newly created file, it reads all the bytes of the file.
I googled it and found that read() reads up to SSIZE_MAX bytes, but my question is why it reads all the bytes in the second case.
You should not rely on reading a whole file in a single call, even if you know you've allocated enough space for the read. Instead you should read in chunks and process the bytes chunk by chunk:
char bf[4096];
ssize_t cnt;

while ((cnt = read(fd, bf, sizeof bf - 1)) > 0) {
    // process the bytes just read, or append them to
    // a larger buffer
}
Quoting from the man page for read():
It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal.
For /proc files, we can see here that:
The most distinctive thing about files in this directory is the fact that all of them have a file size of 0, with the exception of kcore, mtrr and self.
and
You might wonder how you can see details of a process that has a file size of 0. It makes more sense if you think of it as a window into the kernel. The file doesn't actually contain any data; it just acts as a pointer to where the actual process information resides.
which means that the contents of those pseudo-files are sent by the kernel, in batches as large as the kernel wants. This looks very similar to a pipe, where a producer writes data and a consumer reads it, each operating at a different speed.
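So the safe pattern is to keep calling read() in a loop until it returns 0. A minimal sketch (it reads /proc/self/cmdline so it is self-contained; substitute the pid you are inspecting, and append each chunk to a larger buffer if you need the contents):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(){
    char chunk[4096];
    ssize_t n;
    size_t total = 0;

    int fd = open("/proc/self/cmdline", O_RDONLY);
    if (fd == -1)
        return 1;

    /* a single read() may return less than the whole file,
       so loop until it reports end of file (returns 0) */
    while ((n = read(fd, chunk, sizeof chunk)) > 0)
        total += (size_t)n;

    printf("%zu bytes in total\n", total);
    close(fd);
    return 0;
}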
I wrote this code in C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>   /* gettimeofday() */
#include <unistd.h>     /* getpid() */

void random_seed(){
    struct timeval tim;
    gettimeofday(&tim, NULL);
    double t1 = tim.tv_sec + (tim.tv_usec / 1000000.0);
    srand(t1);
}

int main(){
    FILE *f;
    int i;
    int size = 100;
    char *buf = malloc(size);

    f = fopen("output.txt", "a");
    setvbuf(f, buf, _IOFBF, size);
    random_seed();
    for (i = 0; i < 200; i++) {
        fprintf(f, "[ xx - %d - 012345678901234567890123456789 - %d]\n", rand() % 10, getpid());
        fflush(f);
    }
    fclose(f);
    free(buf);
}
This code opens a file in append mode and appends a string 200 times.
I set a buffer of size 100, which can contain the full string.
Then I ran multiple processes running this code by using this bash script:
#!/bin/bash
gcc source.c
rm output.txt
for i in `seq 1 100`;
do
./a.out &
done
I expected that the strings in the output would never be mixed up, since I read that when a file is opened with the O_APPEND flag the file offset is set to the end of the file prior to each write, and I'm using a fully buffered stream. But the first line of each process is mixed up, like this:
[ xx - [ xx - 7 - 012345678901234567890123456789 - 22545]
and some lines later
2 - 012345678901234567890123456789 - 22589]
It looks as if the write is interrupted by the call to the rand function.
So... why do these lines appear?
Is the only way to prevent this to use file locks... even though I'm only using append mode?
Thanks in advance!
You will need to implement some form of concurrency control yourself, POSIX makes no guarantees with respect to concurrent writes from multiple processes. You get some guarantees for pipes, but not for regular files written to from different processes.
Quoting POSIX write():
This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control.
(At the end of the Rationale section.)
You opened the file in fully buffered mode. That means that every line of output first goes into the buffer, and when the buffer overflows it gets flushed to the file regardless of whether it contains incomplete lines. That causes chunks of output from different processes writing to the same file concurrently to be interleaved.
An easy fix would be to open the file in line-buffered mode (_IOLBF), so that the buffer gets flushed on each complete line. Just make sure that the buffer size is at least as big as your longest line, otherwise it will end up writing incomplete lines. The buffer is normally flushed with a single write() system call, so lines from different processes won't interleave with each other.
There is no guarantee that the write() system call is atomic on every filesystem, though, but it normally works as expected, because write() normally locks the open file in the kernel before proceeding.
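For illustration, here is a sketch of the line-buffered variant of the program above (the time-based seeding is omitted for brevity, and the 128-byte buffer is an arbitrary size larger than the longest line):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(){
    int i;
    int size = 128;                     /* must exceed the longest line */
    char *buf = malloc(size);
    FILE *f = fopen("output.txt", "a");

    if (f == NULL || buf == NULL)
        return 1;

    setvbuf(f, buf, _IOLBF, size);      /* flush on every complete line */

    for (i = 0; i < 200; i++)
        fprintf(f, "[ xx - %d - 012345678901234567890123456789 - %d]\n",
                rand() % 10, getpid());

    fclose(f);
    free(buf);
    return 0;
}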
I've written code that should ideally take in data from one document, encrypt it and save it in another document.
But when I execute the code, it does not put the encrypted data in the new file; it just leaves it blank. Can someone please spot what's missing in the code? I tried but couldn't figure it out.
I think there is something wrong with the read/write function, or maybe I'm implementing the do-while loop incorrectly.
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    int fdin, fdout, n, i, fd;
    char* buf;
    struct stat fs;

    if (argc < 3)
        printf("USAGE: %s source-file target-file.\n", argv[0]);

    fdin = open(argv[1], O_RDONLY);
    if (fdin == -1)
        printf("ERROR: Cannot open %s.\n", argv[1]);

    fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fdout == -1)
        printf("ERROR: %s already exists.\n", argv[2]);

    fstat(fd, &fs);
    n = fs.st_size;
    buf = malloc(n);

    do
    {
        n = read(fd, buf, 10);
        for (i = 0; i < n; i++)
            buf[i] ^= '#';
        write(fd, buf, n);
    } while (n == 10);

    close(fdin);
    close(fdout);
}
You are using fd instead of fdin/fdout in the fstat, read and write system calls. fd is an uninitialized variable.
// Here...
fstat(fd, &fs);

// And here...
n = read(fd, buf, 10);
for (i = 0; i < n; i++)
    buf[i] ^= '#';
write(fd, buf, n);
You're reading and writing to fd instead of fdin and fdout. Make sure you enable all warnings your compiler will emit (e.g. use gcc -Wall -Wextra -pedantic). It will warn you about the use of an uninitialized variable if you let it.
Also, if you checked the return codes of fstat(), read(), or write(), you'd likely have gotten errors from using an invalid file descriptor. They are most likely failing with EBADF (bad file descriptor) errors.
fstat(fd, &fs);
n = fs.st_size;
buf = malloc(n);
And since we're here: allocating enough memory to hold the entire file is unnecessary. You're only reading 10 bytes at a time in your loop, so you really only need a 10-byte buffer. You could skip the fstat() entirely.
// Just allocate 10 bytes.
buf = malloc(10);
// Or heck, skip the malloc() too! Change "char *buf" to:
char buf[10];
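Putting those points together, here is a minimal sketch of the corrected program (error handling kept rudimentary; the 10-byte buffer matches the original loop):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char buf[10];
    ssize_t n;
    int i;

    if (argc < 3) {
        fprintf(stderr, "USAGE: %s source-file target-file.\n", argv[0]);
        return 1;
    }

    int fdin = open(argv[1], O_RDONLY);
    int fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fdin == -1 || fdout == -1) {
        perror("open");
        return 1;
    }

    /* read from fdin, encrypt in place, write to fdout */
    while ((n = read(fdin, buf, sizeof buf)) > 0) {
        for (i = 0; i < n; i++)
            buf[i] ^= '#';
        write(fdout, buf, (size_t)n);
    }

    close(fdin);
    close(fdout);
    return 0;
}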
All that has been said is true; one more tip.
You should use a larger buffer that matches the system's disk block size, usually 8192 bytes.
This will increase your program's speed significantly, as you will access the disk far less often (by a factor of roughly 800). As you know, accessing the disk is very expensive in terms of time.
Another option is to use the stdio functions fread, fwrite, etc., which already take care of buffering; you'll still have the function call overhead.
Roni