Write one billion files in one Folder BUT "(No space left on device)" error - c

I'm trying to write 1 billion of files in one folder using multi thread but next my program wrote 20 million files I got "No space left on device". I did not close my program because It still writing same files.
I don't have any problems with "inode", I used only 7%.
No problem with /tmp, /var/tmp, there are empty.
I increased fs.inotify.max_user_watches to 1048576.
I use debian and EXT4 as filesystem.
Is there same one meet this problem and thank you so much for help.
Running tune2fs -l /path/to/drive gives
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 260276224
Block count: 195197952
Reserved block count: 9759897
Free blocks: 178861356
Free inodes: 260276213
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1024
Blocks per group: 24576
Fragments per group: 24576
Inodes per group: 32768
Inode blocks per group: 2048
Flex block group size: 16
Filesystem created: ---
Last mount time: ---
Last write time: ---
Mount count: 2
Maximum mount count: -1
Last checked: ---
Check interval: 0 ()
Lifetime writes: 62 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: ---
Directory Hash Seed: ---
Journal backup: inode blocks

check this question
How to store one billion files on ext4?
you have fewer blocks than inodes which is not going to work, though I think that is the least of your problems. If you really want to do this (would a database be better?) you may need to look into filesystems other an ext4 zfs springs to mind as an option that allows 2^48 entries per directory and should do what you want.
If this question https://serverfault.com/questions/506465/is-there-a-hard-limit-to-the-number-of-files-a-directory-can-have is anything to go by, there is a limit on the number of files per directory using ext4 which you are likely hitting

Related

Mounting ext4 mounts as ext2 in linux 4.4.0 [RHEL]

commands I executed
mkfs .ext4 -F /dev/xxx0
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
524288 inodes, 2097152 blocks
104857 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2147483648
64 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
mount -o dax /dev/xxx0 /mnt/ext4-xxx0
mount output
/dev/xxx0 on /mnt/ext4-xxx0 type ext2 (rw,relatime,seclabel,block_validity,barrier,dax,user_xattr,acl)
dmesg output
[ 2917.044866] EXT4-fs (xxx0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[ 2917.044869] EXT4-fs (xxx0): mounting ext2 file system using the ext4 subsystem
...
...
[ 2917.045459] EXT4-fs (xxx0): mounted filesystem without journal. Opts: dax
which subsequently results in failure of fallocate, as the ext2 is not supported fos ext2.
any suggestions/solution would be appreciated.
Provide the type as ext4 while mounting. Or else it will take default type (or) type mentioned in /etc/fstab.
# mount -o dax -t ext4 /dev/xxx0 /mnt/ext4-xxx0

How do I revover a btrfs filesystem that will not mount (but mount returns without error), checks ok, errors out on restore?

SYNOPSIS
mount -o degraded,ro /dev/disk/by-uuid/ec3 /mnt/ec3/ && echo noerror
noerror
DESCRIPTION
mount -t btrfs fails but returns with noerror as above and only since the last reboot.
btrfs check seems clean to me (I am simple user).
btrfs restore errors out with "We have looped trying to restore files in"...
I had a lingering artifact btrfs filesystem show giving " *** Some devices missing " on the volume. This meant it would not automount on boot and I have been manually mounting (+ searching for a resolution to that)
I have previously used rdfind to deduplicate with hard links (as many as 10 per file)
I had just backed up using btrfs send/recieve but have to check if I have everything - this was the main Raid1 server
DETAILS
btrfs-find-root /dev/disk/by-uuid/ec3
Superblock thinks the generation is 103093
Superblock thinks the level is 1
Found tree root at 8049335181312 gen 103093 level 1
btrfs restore -Ds /dev/disk/by-uuid/ec3 restore_ec3
We have looped trying to restore files in
df -h /mnt/ec3/
Filesystem Size Used Avail Use% Mounted on
/dev/dm-0 16G 16G 483M 97% /
mount -o degraded,ro /dev/disk/by-uuid/ec3 /mnt/ec3/ && echo noerror
noerror
df /mnt/ec3/
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/dm-0 16775168 15858996 493956 97% /
btrfs filesystem show /dev/disk/by-uuid/ec3
Label: none uuid: ec3
Total devices 3 FS bytes used 1.94TiB
devid 6 size 2.46TiB used 1.98TiB path /dev/mapper/26d2e367-65ea-47ad-b298-d5c495a33efe
devid 7 size 2.46TiB used 1.98TiB path /dev/mapper/3c193018-5956-4637-9ec2-dd5e49a4a412
*** Some devices missing #### comment, this is an old artifact unchanged since before unable to mount
btrfs check /dev/disk/by-uuid/ec3
Checking filesystem on /dev/disk/by-uuid/ec3
UUID: ec3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 2132966506496 bytes used err is 0
total csum bytes: 2077127248
total tree bytes: 5988204544
total fs tree bytes: 3492638720
total extent tree bytes: 242151424
btree space waste bytes: 984865976
file data blocks allocated: 3685012271104
referenced 3658835013632
btrfs-progs v4.1.2
Update:
After a reboot (had to wait for a slot to go down) the system mounts manually but not completely clean.
Now asking question on irc #btrfs:
! http://pastebin.com/359EtZQX
Hi, I'm scratching my head and searched in vain to remove *** Some devices missing. Can anyone help give me a clue to clean this up?
- is there a good way to 'fix' the artifacts I am seeing? trying: scrub, balance. Try: resize, defragment.
- would I be advised to move to a new clean volume set?
- would a fix via a btrfs send/recieve be safe from propogating errors?
- or (more painfully) rsync to a clean volume? http://pastebin.com/359EtZQX (My first ever day using irc)

Is there a way to calculate I/O and memory of current process in C?

If I use
/usr/bin/time -f"%e,%P,%M,%I,%O"
I get (for the last three placeholders) the memory the process used, and if there was some input and output during it.
Obviously, it's easy to get %e or something like it using sys/time.h, but is there a way to get %M, %I and %O programmatically?
You could read and parse the files in the /proc filesystem. /proc/self refers to the process accessing the /proc filesystem.
/proc/self/statm contains information about memory usage, measured in pages. Sample output:
% cat /proc/self/statm
1115 82 63 12 0 79 0
Fields are size resident share text lib data dt; see the proc manual page for some additional details.
/proc/self/io contains the I/O for the current process. Sample output:
% cat /proc/self/io
rchar: 2012
wchar: 0
syscr: 6
syscw: 0
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
Unfortunately, io isn't documented in the proc manual page (at least on my Debian system). I had too check the iotop source code to see how it obtained the per process I/O information.

How to find which process is leaking file handles in Linux?

The problem incident:
Our production system started denying services with an error message "Too many open files in system". Most of the services were affected, including inability to start a new ssh session, or even log in into virtual console from the physical terminal. Luckily, one root ssh session was open, so we could interact with the system (morale: keep one root session always open!). As a side effect, some services (named, dbus-daemon, rsyslogd, avahi-daemon) saturated the CPU (100% load). The system also serves a large directory via NFS to a very busy client which was backing up 50000 small files at the moment. Restarting all kinds of services and programs normalized their CPU behavior, but did not solve the "Too many open files in system" problem.
The suspected cause
Most likely, some program is leaking file handles. Probably the culprit is my tcl program, which also saturated the CPU (not normal). However, killing it did not help, but, most disturbingly, lsof would not reveal large amounts of open files.
Some evidence
We had to reboot, so whatever information was collected is all we have.
root#xeon:~# cat /proc/sys/fs/file-max
205900
root#xeon:~# lsof
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init 1 root cwd DIR 8,6 4096 2 /
init 1 root rtd DIR 8,6 4096 2 /
init 1 root txt REG 8,6 124704 7979050 /sbin/init
init 1 root mem REG 8,6 42580 5357606 /lib/i386-linux-gnu/libnss_files-2.13.so
init 1 root mem REG 8,6 243400 5357572 /lib/i386-linux-gnu/libdbus-1.so.3.5.4
...
A pretty normal list, definitely not 200K files, more like two hundred.
This is probably, where the problem started:
less /var/log/syslog
Mar 27 06:54:01 xeon CRON[16084]: (CRON) error (grandchild #16090 failed with exit status 1)
Mar 27 06:54:21 xeon kernel: [8848865.426732] VFS: file-max limit 205900 reached
Mar 27 06:54:29 xeon postfix/master[1435]: warning: master_wakeup_timer_event: service pickup(public/pickup): Too many open files in system
Mar 27 06:54:29 xeon kernel: [8848873.611491] VFS: file-max limit 205900 reached
Mar 27 06:54:32 xeon kernel: [8848876.293525] VFS: file-max limit 205900 reached
netstat did not show noticeable anomalies either.
The man pages for ps and top do not indicate an ability to show open file count. Probably the problem will repeat itself after a few months (that was our uptime).
Any ideas on what else can be done to identify the open files?
UPDATE
This question has changed the meaning, after qehgt identified the likely cause.
Apart from the bug in NFS v4 code, I suspect there is a design limitation in Linux and kernel-leaked file handles can NOT be identified. Consequently, the original question transforms into:
"Who is responsible for file handles in the Linux kernel?" and "Where do I post that question?". The 1st answer was helpful, but I am willing to accept a better answer.
Probably the root cause is a bug in NFSv4 implementation: https://stackoverflow.com/a/5205459/280758
They have similar symptoms.

How do I create a sparse file programmatically, in C, on Mac OS X?

I'd like to create a sparse file such that all-zero blocks don't take up actual disk space until I write data to them. Is it possible?
There seems to be some confusion as to whether the default Mac OS X filesystem (HFS+) supports holes in files. The following program demonstrates that this is not the case.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
void create_file_with_hole(void)
{
int fd = open("file.hole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
lseek(fd, 99988, SEEK_CUR); // Make a hole
write(fd, "Goodbye", 7);
close(fd);
}
void create_file_without_hole(void)
{
int fd = open("file.nohole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
char buf[99988];
memset(buf, 'a', 99988);
write(fd, buf, 99988); // Write lots of bytes
write(fd, "Goodbye", 7);
close(fd);
}
int main()
{
create_file_with_hole();
create_file_without_hole();
return 0;
}
The program creates two files, each 100,000 bytes in length, one of which has a hole of 99,988 bytes.
On Mac OS X 10.5 on an HFS+ partition, both files take up the same number of disk blocks (200):
$ ls -ls
total 400
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.hole
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.nohole
Whereas on CentOS 5, the file without holes consumes 88 more disk blocks than the other:
$ ls -ls
total 136
24 -rw------- 1 user nobody 100000 Oct 10 13:46 file.hole
112 -rw------- 1 user nobody 100000 Oct 10 13:46 file.nohole
As in other Unixes, it's a feature of the filesystem. Either the filesystem supports it for ALL files or it doesn't. Unlike Win32, you don't have to do anything special to make it happen. Also unlike Win32, there is no performance penalty for using a sparse file.
On MacOS, the default filesystem is HFS+ which does not support sparse files.
Update: MacOS used to support UFS volumes with sparse file support, but that has been removed. None of the currently supported filesystems feature sparse file support.
This thread becomes a comprehensive source of info about the sparse files. Here is the missing part for Win32:
Decent article with examples
Tool that estimates if it makes sense to make file as sparse
Regards
hdiutil can handle sparse images and files but unfortunately the framework it links against is private.
You could try defining external symbols as defined by the DiskImages framework below but this is most likely not acceptable for production code, plus since the framework is private you'd have to reverse engineer its use cases.
cristi:~ diciu$ otool -L /usr/bin/hdiutil
/usr/bin/hdiutil:
/System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages (compatibility version 1.0.8, current version 194.0.0)
[..]
cristi:~ diciu$ nm /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages | awk -F' ' '{print $3}' | c++filt | grep -i sparse
[..]
CSparseFile::sector2Band(long long)
CSparseFile::addIndexNode()
CSparseFile::readIndexNode(long long, SparseFileIndexNode*)
CSparseFile::readHeaderNode(CBackingStore*, SparseFileHeaderNode*, unsigned long)
[... cut for brevity]
Later Edit
You could use hdiutil as an external process and have it create an sparse disk image for you. From the C process you would then create a file in the (mounted) sparse disk image.
If you seek (fseek, ftruncate, ...) to past the end, the file size will be increased without allocating blocks until you write to the holes. But there's no way to create a magic file that automatically converts blocks of zeroes to holes. You have to do it yourself.
This may be helpful to look at (the OpenBSD cp command inserts holes instead of writing zeroes).
patch
If you want portability, the last resort is to write your own access function so that you manage an index and a set of blocks.
In essence you manage a single file as the OS manages the disk keeping the chain of the blocks that are part of the file, the bitmap of allocated/free blocks etc.
Of course this will lead to a non optimized and slower access, I would reccomend this apprach only if the requirement to save space is absolutely critical and you have enough time to write a robust set of access functions.
And even in that case, I would first investigate if your problem is in need of a different solution. Probably you should store your data differently?
It looks like OS X supports sparse files on UDF volumes. I tried titaniumdecoy's test program on OS X 10.9 and it did generate a sparse file on a UDF disk image. Also, not that UFS is no longer supported in OS X, so if you need sparse files, UDF is the only natively supported file system that supports them.
I also tried the program on SMB shares. When the server is Ubuntu (ext4 filesystem) the program creates a sparse file, but 'ls -ls' through SMB doesn't show that. If you do 'ls -ls' on the Ubuntu host itself it does show the file is sparse. When the server is Windows XP (NTFS filesystem) the program does not generate a sparse file.

Resources