I am interested in bringing a system down (for, say, 15 minutes) by allocating a lot of file descriptors and causing an out-of-file-descriptors failure. (Don't worry, I am not trying to hack into anything. This is for testing a service I am writing, to see how it behaves when other programs misbehave.) Are there any best practices for that? Should I just keep calling fopen() in an infinite for loop and kill the process after 15 minutes? Does anybody have experience with this?
Update: I am running Linux and the program I am writing will have super user privileges.

Did you consider lowering the file descriptor limit with setrlimit(2) and RLIMIT_NOFILE before running your program?
This can be done simply with the bash ulimit -n builtin, in the same shell where you test your application, e.g.:
ulimit -n 32
That will not disturb the other services already running. Lowering the limit makes your application (run from the same shell) hit it quickly, which is what you want for testing.
On the entire system level you might also write into /proc/sys/fs/file-max e.g. with
echo 1024 > /proc/sys/fs/file-max
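The per-process variant can also be done from inside a test program. Below is a minimal sketch, assuming Linux and that /dev/null exists; the function name exhaust_fds is made up for illustration. It lowers the soft RLIMIT_NOFILE, opens descriptors until open() fails with EMFILE, then closes them and restores the old limit.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: lower the soft RLIMIT_NOFILE of this process to
 * `limit`, open /dev/null until the limit is hit, then clean up.
 * Returns how many descriptors could be opened, or -1 on error. */
int exhaust_fds(rlim_t limit)
{
    struct rlimit old, lowered;

    if (getrlimit(RLIMIT_NOFILE, &old) != 0)
        return -1;
    if (limit == 0 || limit > old.rlim_max)
        return -1;

    /* A descriptor is always smaller than the soft limit, so at most
     * `limit` descriptors can be open at once. */
    int *fds = malloc(limit * sizeof *fds);
    if (fds == NULL)
        return -1;

    lowered = old;
    lowered.rlim_cur = limit;
    if (setrlimit(RLIMIT_NOFILE, &lowered) != 0) {
        free(fds);
        return -1;
    }

    int count = 0;
    int saved_errno = 0;
    while ((rlim_t)count < limit) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd < 0) {
            saved_errno = errno;   /* EMFILE once the limit is reached */
            break;
        }
        fds[count++] = fd;
    }

    /* This is where a real test would poke the service under test. */

    for (int i = 0; i < count; i++)
        close(fds[i]);
    free(fds);
    setrlimit(RLIMIT_NOFILE, &old);   /* raising the soft limit back is allowed */

    return (saved_errno == 0 || saved_errno == EMFILE) ? count : -1;
}
```

Calling exhaust_fds(32) from a process that only has stdin, stdout and stderr open should report 29 usable descriptors. This only starves the calling process; starving the whole system is what the file-max setting above is for.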

This depends on the OS implementation, but calling fopen on the same file from the same process may not allocate a new file description; it may just increment a reference counter.
I would recommend reading up on stress testing.
Here is some usable software (you didn't tag any OS platform):

I had this happen once in normal use. I believe you run out of inodes in Linux. I don't know a faster way than just opening files. Just be careful: we locked our system up. It was a while ago, so I don't remember what was trying to open a file, but things generally assume they can get a file handle and don't behave as well as they should when they can't. ~Ben

My 2 cents:
1. Write a program that creates a lot of file descriptors. You can achieve it by one of the following methods:
(a) Opening a lot of different files in your code
(b) Opening a lot of socket descriptors
(c) Creating a lot of threads
2. Now, keep spawning multiple instances of the program created in step 1 (i.e. create multiple processes) using a shell script or something similar.
In Linux, as in most other operating systems, there is a limit on the number of file descriptors per process (in Linux it is 1024 by default, I guess; you can check it using ulimit -a). So your process will just fail when you do this. I am really not sure that you can make the system go down just by increasing file descriptor usage.

You can use mkstemp to get file descriptors of temporary files.


stat() system call is being blocked

The stat() system call takes a long time when I call it on a corrupted file. The magic number is corrupted.
I have a print statement after this call in my source code, and it is printed only after some delay.
I am not sure whether stat() retries the call. If any documentation is available, please share it. It would be a great help.
It returned an input/output error (errno 5, EIO), so I am not sure whether the file or the filesystem is corrupted.
This can be caused by bad blocks on an aging or damaged spinning disk. There are two other symptoms that will likely occur concurrently:
Copious explicit I/O errors reported by the kernel in the system logs.
A sudden spike in load average. This happens because processes which are stuck waiting on I/O are in uninterruptible sleep while the kernel busy loops in an attempt to interact with the hardware, causing the system to become sluggish temporarily. You cannot stop this from happening, or kill processes in uninterruptible sleep. It's a sort of OS Achilles' heel.
If this is the case, unmount the filesystems involved and run e2fsck -c -y on them. If it is the root filesystem, you will need to, e.g., boot the system with a live CD and do it from there. From man e2fsck:
This option causes e2fsck to use badblocks(8) program to do a read-only scan of the device in
order to find any bad blocks. If any bad blocks are found, they are added to the bad block
inode to prevent them from being allocated to a file or directory. If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test.
Note that -cc takes a long time; -c should be sufficient. -y answers yes automatically to all questions, which you might as well do since there may be a lot of those.
You will probably lose some data (have a look in /lost+found afterward); hopefully the system still boots. At the very least, the filesystems are now safe to mount. The disk itself may or may not last a while longer. I've done this and had them remain fine for months more, but don't count on it.
If this is a SMART drive, there are apparently some other tools you can use to diagnose and deal with the same problem, although what I've outlined here is probably good enough.

Which is the better way to edit the RLIMIT_NPROC value?

My application creates one thread per connection. The application runs under a non-zero user ID, and sometimes the number of threads surpasses the default limit of 1024. I want to raise this limit, and I have a few options:
Run as root [a very bad idea that also compromises security, so I am dropping it].
Run as an unprivileged user, use setcap to give the program the CAP_SYS_RESOURCE capability, and then add this code to my program:
struct rlimit rlp; /* will initialize this later with the value of nprocs (maximum number of desired threads) */
setrlimit(RLIMIT_NPROC, &rlp);
/*
 * The maximum number of processes (or, more precisely on Linux, threads) that can be
 * created for the real user ID of the calling process. Upon encountering this limit,
 * fork(2) fails with the error EAGAIN.
 */
The other option is editing /etc/security/limits.conf, where I can simply add entries for the development user, e.g.
#devuser hard nproc 20000
#devuser soft nproc 10000
where 10k is enough. Since I am a little reluctant to change the source code, should I go with the last option? I am also curious to know which approach is more robust and standard.
I am seeking your opinions, and thank you in advance :)
PS: What happens if a single process runs more than 1k threads? Of course, I also have 32 GB of RAM.
First, I believe you are wrong in having nearly a thousand threads. Threads are quite costly, and it is usually not reasonable to have so many of them. I would suggest having a few dozen threads at most (unless you run on a very costly supercomputer).
You could have some event loop around a multiplexing syscall like poll(2). Then a single thread can deal with many thousands of connections. Read about the C10K problem and epoll. Consider using some event libraries like libevent or libev etc...
You could start your application as root (perhaps by using setuid techniques), set-up the required resources (in particular, opening privileged TCP/IP ports), and change the user with setreuid(2)
Read Advanced Linux Programming...
You could also wrap your application in a tiny setuid C program which increases the limits using setrlimit(2), changes the user with setreuid, and finally calls execve(2) on your real program.
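For the setrlimit(2) part, here is a minimal sketch, assuming Linux; the function name raise_nproc_to_hard_limit is made up for illustration. It raises only the soft RLIMIT_NPROC up to the existing hard limit, which needs no privilege. Raising the hard limit itself is the step that needs root or CAP_SYS_RESOURCE, as discussed in the question.

```c
#define _GNU_SOURCE   /* makes sure RLIMIT_NPROC is visible */
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_NPROC of the calling
 * process to its current hard limit.
 * Returns 0 on success, -1 on failure (errno is set). */
int raise_nproc_to_hard_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;

    /* The soft limit may be raised up to the hard limit without privilege. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NPROC, &rl);
}
```

A setuid wrapper along the lines described above would call something like this (or set both fields of the struct) before dropping privileges and calling execve(2), since resource limits are preserved across execve.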

How to avoid caching effects in read benchmarks

I have a read benchmark and between consecutive runs, I have to make sure that the data does not reside in memory to avoid effects seen due to caching. So far what I used to do is: run a program that writes a large file between consecutive runs of the read benchmark. Something like
./write --size 64G --path /tmp/test.out
The write program simply writes an array of size 1G 64 times to file. Since the size of the main memory is 64G, I write a file that is approx. the same size. The problem is that writing takes a long time and I was wondering if there are better ways to do this, i.e. avoid effects seen when data is cached.
Also, what happens if I write data to /dev/null?
./write --size 64G --path /dev/null
This way, the write program exits very fast, no I/O is actually performed, but I am not sure if it overwrites 64G of main memory, which is what I ultimately want.
Your input is greatly appreciated.
You can drop all caches using a special file in /proc like this:
echo 3 > /proc/sys/vm/drop_caches
That should make sure cache does not affect the benchmark.
You can just unmount the filesystem and mount it back. Unmounting flushes and drops the cache for the filesystem.
Use echo 3 > /proc/sys/vm/drop_caches to flush the pagecache, directory entries cache and inodes cache.
You can use the fadvise calls with FADV_DONTNEED to tell the kernel to keep certain files from being cached. You can also use mincore() to verify that the file is not cached. While the drop_caches solution is clearly simpler, this might be better than wiping out the entire cache, as that affects all processes on the box. I don't think you need elevated privileges to use fadvise, while I bet you do for writing to /proc. Here is a good example of how to use fadvise calls for this purpose:
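A minimal sketch of the fadvise approach, assuming Linux; the function name drop_file_cache is made up for illustration. It flushes the file first, because dirty pages cannot be dropped, and then asks the kernel to discard the cached pages of that one file. The call is advisory, so mincore() is still worth using if you need to verify the result.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to drop the cached pages of `path`.
 * Returns 0 on success, or an errno-style error number. */
int drop_file_cache(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    int rc = 0;
    /* Dirty pages are not discarded, so write them back first.
     * Linux accepts fsync() on a read-only descriptor. */
    if (fsync(fd) != 0)
        rc = errno;

    /* offset 0 and length 0 mean "the whole file".
     * posix_fadvise returns the error number directly, not via errno. */
    if (rc == 0)
        rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return rc;
}
```

You would call this on each benchmark input file between runs instead of rewriting 64G of data.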
One (crude) way that almost never fails is to simply occupy all that excess memory with another program.
Make a trivial program that allocates nearly all the free memory (while leaving enough for your benchmark app). Then memset() the memory to something to ensure that the OS will commit it to physical memory. Finally, do a scanf() to halt the program without terminating it.
By "hogging" all the excess memory, the OS won't be able to use it as cache. And this works in both Linux and Windows. Now you can proceed to do your I/O benchmark.
(Though this might not go well if you're sharing the machine with other users...)

How to know if a file is being copied?

I am currently trying to check whether the copy of a file from one directory to another is done.
I would like to know if the target file is still being copied.
So I would like to get the number of file descriptors opened on this file.
I use the C language and can't really find a way to solve this problem.
If you have control of it, I would recommend using the copy-move idiom on the program doing the copying:
cp file1 otherdir/.file1.tmp
mv otherdir/.file1.tmp otherdir/file1
The mv just changes some filesystem entries and is atomic and very fast compared to the copy.
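If the copying program is itself written in C, the same idiom can be sketched with rename(2). This is a minimal sketch, assuming the temporary name and the final name are on the same filesystem (otherwise rename fails); the function name copy_then_rename is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy `src` to `tmp`, then rename `tmp` to `dst`.
 * Readers of `dst` see either no file or the complete file.
 * Returns 0 on success, -1 on failure. */
int copy_then_rename(const char *src, const char *tmp, const char *dst)
{
    assert(src != NULL && tmp != NULL && dst != NULL);

    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    int ok = 1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)      /* flush errors show up here */
        ok = 0;

    /* The rename is the only step that makes `dst` visible. */
    if (ok && rename(tmp, dst) != 0)
        ok = 0;
    if (!ok)
        remove(tmp);
    return ok ? 0 : -1;
}
```

The watching program then never needs to guess: if otherdir/file1 exists, the copy is complete.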
If you're able to open the file for writing, there's a good chance that the OS has finished the copy and has released its lock on it. Different operating systems may behave differently for this, however.
Another approach is to open both the source and destination files for reading and compare their sizes. If they're of identical size, the copy has very likely finished. You can use fseek() and ftell() to determine the size of a file in C:
fseek(fp, 0L, SEEK_END);
sz = ftell(fp);
On Linux, try the lsof command, which lists all of the open files on your system.
edit 1: The only C language feature that comes to mind is the fstat function. You might be able to use that with the struct's st_mtime (last modification time) field - once that value stops changing (for, say, a period of 10 seconds), then you could assume that file copy operation has stopped.
edit 2: also, on linux, you could traverse /proc/[pid]/fd to see which files are open. The files in there are symlinks, but C's readlink() function could tell you its path, so you could see whether it is still open. Using getpid(), you would know the process ID of your program (if you are doing a file copy from within your program) to know where to look in /proc.
I think your basic mistake is trying to synchronize a C program with a shell tool/external program that's not intended for synchronization. If you have some degree of control over the program/script doing the copying, you should modify it to perform advisory locking of some sort (preferably fcntl-based) on the target file. Then your other program can simply block on acquiring the lock.
If you don't have any control over the program performing the copy, the only solutions depend on non-portable hacks like lsof or the Linux inotify API.
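A minimal sketch of fcntl-based advisory locking, assuming both programs cooperate; the function name lock_whole_file is made up for illustration. The copying program would take a write lock (F_WRLCK) on the target before writing and release it (F_UNLCK) when done, and the waiting program would request a read lock (F_RDLCK), which blocks until the writer is finished.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: apply a lock operation to the whole file behind `fd`.
 * `type` is F_RDLCK, F_WRLCK or F_UNLCK. Blocks until the lock is granted.
 * Returns 0 on success, -1 on failure (errno is set). */
int lock_whole_file(int fd, short type)
{
    assert(fd >= 0);

    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;          /* 0 means "up to end of file, however large" */

    int rc;
    do {
        rc = fcntl(fd, F_SETLKW, &fl);   /* W = wait for conflicting locks */
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```

Note that these locks are advisory: a plain cp that knows nothing about them will ignore them, which is why this only works when you control the copying side.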
(This answer makes the big, big assumption that this will be running on Linux.)
The C source code of lsof, a tool that tells which programs currently have an open file descriptor to a specific file, is freely available. However, just to warn you, I couldn't make any sense out of it. There are references to reading kernel memory, so to me it's either voodoo or black magic.
That said, nothing prevents you from running lsof through your own program. Running third-party programs from your own program is normally something you try to avoid for several reasons, like security (if a rogue user changes lsof for a malicious program, it will run with your program's privileges, with potentially catastrophic consequences) but inspecting the lsof source code, I came to the conclusion that there's no public API to determine which program has which file open. If you're not afraid of people changing programs in /usr/sbin, you might consider this.
#define _GNU_SOURCE  /* for asprintf */
#include <stdio.h>
#include <stdlib.h>

int isOpen(const char* file)
{
    char* command;
    // WARNING: the file name is pasted into a shell command unescaped;
    // you should either try to fix it yourself, or use a function of the `exec`
    // family that won't trigger shell expansion.
    // It would be an EXTREMELY BAD idea to call `lsof` without an absolute path
    // since it could result in another program being run. If this is not where
    // `lsof` resides on your system, change it to the appropriate absolute path.
    if (asprintf(&command, "/usr/sbin/lsof \"%s\"", file) < 0)
        return -1;
    int result = system(command);
    free(command);
    return result;
}
If you also need to know which program has your file open (presumably cp?), you can use popen to read the output of lsof in a similar fashion. popen descriptors behave like fopen descriptors, so all you need to do is fread them and see if you can find your program's name. On my machine, lsof output looks like this:
$ lsof document.pdf
SomeApp 873 felix txt REG 14,3 303260 5165763 document.pdf
As poundifdef mentioned, the fstat() function can give you the current modification time. But fstat also gives you the size of the file.
Back in the dim dark ages of C when I was monitoring files being copied by various programs I had no control over I always:
Waited until the target file size was >= the source size, and
Waited until the target modification time was at least N seconds older than the current time, N being a number such as 5, set larger if experience showed that was necessary. Yes, 5 seconds seems extreme, but it is safe.
If you don't know what the source file is, then the only real choice you have is #2, but use a larger N to allow for worst-case network and local CPU delays, with a healthy safety factor.
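The two checks above can be sketched in C with stat(). This is a minimal sketch, not a guarantee that the copy is done; the function name copy_looks_finished and its parameters are made up for illustration.

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper implementing the two waits described above.
 * Returns 1 if `dst` is at least as large as `src` and has not been
 * modified for at least `quiet_seconds`, 0 if not yet, -1 if either
 * file cannot be examined. */
int copy_looks_finished(const char *src, const char *dst, time_t quiet_seconds)
{
    assert(src != NULL && dst != NULL);

    struct stat s, d;
    if (stat(src, &s) != 0 || stat(dst, &d) != 0)
        return -1;

    /* Check 1: target size >= source size. */
    if (d.st_size < s.st_size)
        return 0;

    /* Check 2: last modification is at least N seconds in the past. */
    time_t now = time(NULL);
    return (now - d.st_mtime >= quiet_seconds) ? 1 : 0;
}
```

The caller would poll this in a loop with a sleep between attempts, for example with quiet_seconds set to 5 as suggested above.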
Using the Boost libraries will solve the issue:
boost::filesystem::fstream fileStream(filePath, std::ios_base::in | std::ios_base::binary);
if (fileStream.is_open()) {
    // not getting copied
} else {
    // wait, the file is getting copied
}

popen performance in C

I'm designing a program I plan to implement in C and I have a question about the best way (in terms of performance) to call external programs. The user is going to provide my program with a filename, and then my program is going to run another program with that file as input. My program is then going to process the output of the other program.
My typical approach would be to redirect the other program's output to a file and then have my program read that file when it's done. However, I understand I/O operations are quite expensive and I would like to make this program as efficient as possible.
I did a little bit of looking and I found the popen command for running system commands and grabbing the output. How does the performance of this approach compare to the performance of the approach I just described? Does popen simply write the external program's output to a temporary file, or does it keep the program output in memory?
Alternatively, is there another way to do this that will give better performance?
On Unix systems, popen will pass data through an in-memory pipe. Assuming the data isn't swapped out, it won't hit disk. This should give you just about as good performance as you can get without modifying the program being invoked.
popen does pretty much what you are asking for: it does the pipe-fork-exec idiom and gives you a file pointer that you can either read from or write to, depending on the mode you pass.
However, there is a limit on the size of the pipe buffer (~4K iirc, although modern Linux kernels default to 64K), and if you aren't reading quickly enough, the other process could block.
Do you have access to shared memory as a mount point? [on linux systems there is a /dev/shm mountpoint]
1) popen keeps the program output in memory. It actually uses pipes to transfer data between the processes.
2) popen looks IMHO like the best option for performance.
It also has an advantage over files in reducing latency, i.e. your program will be able to get the other program's output on the fly, while it is produced. If this output is large, you don't have to wait until the other program has finished to start processing its output.
The problem with having your subcommand redirect to a file is that it's potentially insecure while popen communication can't be intercepted by another process. Plus you need to make sure the filename is unique if you're running several instances of your master program (and thus of your subcommand). The popen solution doesn't suffer from this.
The performance of popen is just fine as long as you don't read/write in one-byte chunks. Always read/write multiples of 512 (like 4096). But that applies to file operations as well. popen connects your process and the child process through pipes, so if you don't read, the pipe fills up and the child can't write, and vice versa. So all the exchanged data is in memory, but only small amounts at a time.
(Assuming Unix or Linux)
Writing to the temp file may be slow if the file is on a slow disk. It also means the entire output will have to fit on the disk.
popen connects to the other program using a pipe, which means that output will be sent to your program incrementally. As it is generated, it is copied to your program chunk-by-chunk.
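A minimal sketch of reading a command's output through popen in 4096-byte chunks, assuming a POSIX system; the function name count_command_output is made up for illustration, and it only counts bytes where a real program would process each chunk.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen and pclose */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: run `command` via the shell, read its standard
 * output chunk by chunk, and return the total number of bytes read,
 * or -1 on failure. */
long count_command_output(const char *command)
{
    assert(command != NULL);

    FILE *pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;
    /* Each fread returns as data arrives; the child blocks only while
     * the pipe is full and we are not reading. */
    while ((n = fread(buf, 1, sizeof buf, pipe)) > 0) {
        /* process buf[0 .. n-1] here */
        total += (long)n;
    }

    /* pclose waits for the child and returns its exit status, or -1. */
    if (pclose(pipe) == -1)
        return -1;
    return total;
}
```

Because the command string goes through the shell, a user-supplied file name should be validated or quoted before it is pasted into it.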
