I need to create a very large number of small files (around 4 KB or 8 KB each).
This is not possible on my machine because doing so uses up 100% of the inodes, after which I cannot create any more files:
$ df -i /dev/sda5
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda5 54362112 36381206 17980906 67% /scratch
(I have started deleting files, which is why it is now at 67%.)
The inode size on my filesystem (ext4) is 256 bytes:
$ sudo tune2fs -l /dev/sda5 | grep Inode
Inode count: 54362112
Inodes per group: 8192
Inode blocks per group: 512
Inode size: 256
I wonder whether it is possible to set this value very low, even below 128, when reformatting. If so, what value should I use?
Thanks.
The default bytes per inode is usually 16384, which is the default inode_ratio in /etc/mke2fs.conf (it's read prior to filesystem creation). If you're running out of inodes, you might try for example:
mkfs.ext4 -i 8192 /dev/mapper/main-var2
Another option that affects this is -T; for example, -T news further reduces the ratio to 4096.
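For a rough sense of scale (using a hypothetical 100 GiB filesystem): since the inode count is approximately the filesystem size divided by the bytes-per-inode ratio, -i 16384 yields roughly 6.5 million inodes, -i 8192 roughly 13 million, and -i 4096 roughly 26 million.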
Also, you cannot change the number of inodes in an ext3 or ext4 filesystem without re-creating it (or hex-editing it). ReiserFS allocates inodes dynamically, so you will never run into this issue with it.
You can find the approximate bytes-per-inode ratio of an existing filesystem by dividing its size by its total number of inodes. For example:
$ sudo tune2fs -l /dev/sda1 | awk -F: '
    /^Block count:/ { blocks = $2 }
    /^Inode count:/ { inodes = $2 }
    /^Block size:/  { block_size = $2 }
    END {
        blocks_per_inode = blocks / inodes
        print "blocks per inode:\t", blocks_per_inode, "\nbytes per inode:\t", blocks_per_inode * block_size
    }'
blocks per inode: 3.99759
bytes per inode: 16374.1
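Note that 16374 bytes per inode is essentially the default inode_ratio of 16384 mentioned above, so this particular filesystem appears to have been created with the stock settings.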
I found the solution to my problem on the mke2fs man page:
-I inode-size
Specify the size of each inode in bytes. mke2fs creates 256-byte inodes by default. In kernels after 2.6.10 and some earlier vendor kernels it is possible to utilize inodes larger than 128 bytes to store extended attributes for improved performance. The inode-size value must be a power of 2 larger or equal to 128. The larger the inode-size the more space the inode table will consume, and this reduces the usable space in the filesystem and can also negatively impact performance. Extended attributes stored in large inodes are not visible with older kernels, and such filesystems will not be mountable with 2.4 kernels at all. It is not possible to change this value after the filesystem is created.
The maximum value you will be able to set is given by your block size:
sudo tune2fs -l /dev/sda5 | grep "Block size"
Block size: 4096
Hope this helps.
For those reaching this page: unfortunately I could not recover the data. After various attempts and after reproducing the problem, it became too costly to keep trying, so we just used an older backup to recreate the information we needed.
A human error broke a 150 GB UFS filesystem (Solaris).
While trying to back up the filesystem (c0t0d0s3), ufsdump(1M) was not used correctly.
I will explain the background that led to this.
The admin used:
root@ats-br000432 # ufsdump 0f /dev/dsk/c0t0d0s3 > output_1
Usage: ufsdump [0123456789fustdWwnNDCcbavloS [argument]] filesystem
This is incorrect usage, so it just created a 0-byte file called output_1:
# ls -la output_1
-rw-r--r-- 1 root root 0 abr 12 14:12 output_1
Then, the syntax used was:
# ufsdump 0f /dev/rdsk/c0t0d0s3 output_1
This wrote that 0-byte file output_1 to /dev/rdsk/c0t0d0s3 - which was the raw partition slice itself.
Interestingly, since output_1 was a 0-byte file, we thought this would cause no harm to the filesystem, but it did.
When trying to ls in the mount point, the partition reported an I/O error. After unmounting and mounting again, the filesystem showed no contents, but the disk space was still reported as used, just as before.
I assume that at some point the filesystem 'header' was affected, right? Or was it the slice information?
A quick fsck attempt brings up this:
** /dev/rdsk/c0t0d0s3
** Last Mounted on /database
** Phase 1 - Check Blocks and Sizes
INCORRECT DISK BLOCK COUNT I=11 (400 should be 208)
CORRECT?
Disk block count / I=11
This seems to indicate that the command broke the filesystem's information about its own contents, right?
When we tried fsck -y -F ufs /dev/dsk.., various files were recovered, but not the dbf files we are after (which are gigabytes in size).
What can be done now? Should I try every backup superblock reported by newfs -N?
EDIT: new information regarding the partition
newfs output showing the superblock information:
# newfs -N /dev/rdsk/c0t0d0s3
Warning: 2826 sector(s) in last cylinder unallocated
/dev/rdsk/c0t0d0s3: 265104630 sectors in 43149 cylinders of 48 tracks, 128 sectors
129445,6MB in 2697 cyl groups (16 c/g, 48,00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
.....................................................
super-block backups for last 10 cylinder groups at:
264150944, 264241184, 264339616, 264438048, 264536480, 264634912, 264733344,
264831776, 264930208, 265028640
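In other words, I suppose that would mean trying something like fsck -F ufs -o b=98464 /dev/rdsk/c0t0d0s3 (the syntax hinted at in the newfs -N output above) with each listed backup superblock location in turn.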
My POSIX C program needs to grow a file to X bytes, typically 128 MB or 256 MB.
The current approach is to initialise a 16 MB memory buffer and repeatedly write it into the open file using fwrite().
Is there a more efficient approach?
You can use the ftruncate system call.
You can quickly fill a file with zeros by seeking to an offset and writing a byte there. The contents of the file before that offset will read as zeros if the file was not already that large.
On Linux this creates a sparse file. The file will appear to be 256 MB in size, but it will actually use very little space on disk.
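As a rough illustration of this approach, here is a minimal C sketch (the file name and target size are only examples, and error handling is kept to a minimum):
// grow.c - a sketch of the seek-and-write approach described above
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const off_t size = 256 * 1024 * 1024;  /* 256 MB target size (example) */
    int fd = open("bigfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Seek to one byte before the target size and write a single byte;
       everything before it reads back as zeros (a sparse file on Linux). */
    if (lseek(fd, size - 1, SEEK_SET) == (off_t)-1 || write(fd, "", 1) != 1) {
        perror("lseek/write");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}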
Use ftruncate; despite its name, it can also be used to extend files. It may be slightly less portable than your current method, since POSIX does not require ftruncate to be able to extend a file (but XSI does).
A note about sparse files, which all of the answers above create for you.
Code:
// ft.c
#include <stdio.h>   /* fopen, fileno, fclose */
#include <unistd.h>  /* ftruncate */

int main(void)
{
    FILE *fp = fopen("sparse", "w");
    if (fp == NULL)
        return 1;
    ftruncate(fileno(fp), 2147483647UL);  /* extend to ~2 GB; the hole reads back as zeros */
    fclose(fp);
    return 0;
}
Show the disk usage before and after running ./ft:
me@mycomputer ~> gcc ft.c -o ft
me@mycomputer ~> du -sh .
368M .
me@mycomputer ~> df -h .
Filesystem Size Used Available Capacity Mounted on
/ 98G 8.3G 89G 9% /
me@mycomputer ~> ./ft
me@mycomputer ~> df -h .
Filesystem Size Used Available Capacity Mounted on
/ 98G 8.3G 89G 9% /
me@mycomputer ~> du -sh .
368M .
me@mycomputer ~> ls -l sparse
-rw-r--r-- 1 me other 2147483647 Dec 10 12:55 sparse
Now we back up sparse:
me@mycomputer ~> tar cvf /tmp/my.tar.tar ./sparse
a ./sparse 2097152K
Next we restore sparse from the backup:
me@mycomputer ~> tar xf /tmp/my.tar.tar .
me@mycomputer ~> du -sh .
2.4G .
me@mycomputer ~> df -h .
Filesystem Size Used Available Capacity Mounted on
/ 98G 10G 87G 11% /
Voila - we gained 2 GB of disk usage. Sparse files are a great way to set up a sort of future negative payback: a possible disk-full error. The example above is on the system disk, and once the system disk is 100% full the system has all kinds of nasty problems.
This is because restoring files (maybe months later) inflates sparse files on disk to their "real" size, and by then you have forgotten about them.
Seriously consider not creating these things in the first place; there is usually no requirement to do so.
Tomcat has been running on my workstation for several days and now it is unresponsive. The lsof command shows lots of connections in the CLOSE_WAIT state. The Tomcat PID is 25422, yet ulimit shows that "open files" is 1024. How can this happen?
[root@localhost home]# lsof -p 25422 | wc -l
10309
[root@localhost home]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 399360
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 399360
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
For open files, Linux has both a soft and a hard limit per process. The soft limit is what is enforced; a process can raise its own soft limit, but only up to the hard limit.
To check the hard limit, you can simply run:
# ulimit -Hn
Here is also an article that may help you understand more:
Guide to limits.conf / ulimit / open file descriptors under Linux
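If you want to check or adjust this from within a program, here is a minimal C sketch using getrlimit(2)/setrlimit(2) that shows the soft/hard mechanism; the file name is only illustrative:
// rlimit_demo.c - query the soft/hard open-file limits and raise the soft
// limit up to the hard limit (a sketch; error handling kept minimal)
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    /* A process may raise its own soft limit, but only up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");

    return 0;
}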
Not every type of entry in lsof counts towards the ulimit. Run the following command to list only the file descriptors that do count towards it:
lsof -p <pid> -d '^cwd,^err,^ltx,^mem,^mmap,^pd,^rtd,^txt' -a
The lsof Linux man page describes the command as follows:
In the absence of any options, lsof lists all open files belonging to all active processes.
ulimit simply limits resources at the user level and is applied per process. That is why the number in the lsof report can be far greater than the ulimit value.
If you want to see the kernel-level restriction, use the following command:
cat /proc/sys/fs/file-max
You'll get a very big number.
In summary, the lsof count may be greater than the ulimit value, but it will definitely be less than file-max.
What are the limits of ext4? All I could find was for ext3; other links give only suppositions and no real numbers.
Can you please tell me the maximum number of files per directory and the maximum size?
A follow-up on @Curt's answer: the creation parameters can determine the number of inodes, and that is what can limit you in the end. df's -i switch gives you inode info.
(env)somesone@somewhere:/$ df -iT
Filesystem Type Inodes IUsed IFree IUse% Mounted on
/dev/root ext4 25149440 612277 24537163 3% /
devtmpfs devtmpfs 3085602 1418 3084184 1% /dev
none tmpfs 3086068 2 3086066 1% /sys/fs/cgroup
none tmpfs 3086068 858 3085210 1% /run
none tmpfs 3086068 1 3086067 1% /run/lock
none tmpfs 3086068 1 3086067 1% /run/shm
none tmpfs 3086068 4 3086064 1% /run/user
This is a Linode box BTW, so it is a virtualized environment. The number I look at is 24537163: that is how many free inodes the root filesystem has. Note that more than 10K files in a directory can cause difficulties for many tools, and 100K can be really hard on utilities.
See also: https://serverfault.com/questions/104986/what-is-the-maximum-number-of-files-a-file-system-can-contain
It depends on the mkfs parameters used when the filesystem was created. Different Linux flavors have different defaults, so it is really impossible to answer your question definitively.
I have a partition structure like this:
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda6 51606140 16939248 34142692 34% /
/dev/sda5 495844 72969 397275 16% /boot
/dev/sda7 113022648 57515608 49765728 50% /home
/dev/sda8 113022648 57515608 49765728 4% /mnt
While parsing directory contents using readdir(), how can I find out which device a given file resides on?
readdir() is invoked from the root directory; it parses each file name and prints its size.
For example, for the device /dev/sda6 it should list the filenames under that partition, and when it reads contents from /home it should report that it is reading from /dev/sda7 and list those filenames.
Please let me know if you need more details.
You can just do:
df <file_name>
That will give you the device and partition for the particular file.
There is an st_dev member in struct stat; it should uniquely identify one partition.
Example in bash:
stat ~/.vimrc
File: `/home2//leonard/.vimrc' -> `local-priv/vimrc'
Size: 16 Blocks: 0 IO Block: 4096 symbolic link
Device: 802h/2050d Inode: 6818899 Links: 1
Access: (0777/lrwxrwxrwx) Uid: ( 1024/ leonard) Gid: ( 1024/ leonard)
Access: 2012-06-22 16:36:45.341371003 +0300
Modify: 2012-06-22 16:36:45.341371003 +0300
Change: 2012-06-22 16:36:45.341371003 +0300
The stat utility does no additional magic. Here is strace -vvv output:
lstat64("/home2//leonard/.vimrc", {st_dev=makedev(8, 2), st_ino=6818899, st_mode=S_IFLNK|0777, st_nlink=1, st_uid=1024, st_gid=1024, st_blksize=4096, st_blocks=0, st_size=16, st_atime=2012/06/22-16:36:45, st_mtime=2012/06/22-16:36:45, st_ctime=2012/06/22-16:36:45}) = 0
0x0802 is major 8 (sd), minor 2, so /dev/sda2.
To map this to actual partitions, you can iterate over /proc/mounts and stat all the devices (the first column). The contents of /proc/mounts are just like the output of mount(1), except that they come directly from the kernel. Some distros symlink /etc/mtab to /proc/mounts.
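A minimal C sketch of that idea (the file name whichdev.c is only illustrative, and the error handling is deliberately thin): stat() the file of interest, then walk /proc/mounts and compare the st_rdev of each block device with the file's st_dev.
// whichdev.c - map a file to the block device it lives on by comparing
// its st_dev with the st_rdev of each device listed in /proc/mounts
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
    struct stat target, dev;
    char device[256], mountpoint[256];
    FILE *mounts;

    if (argc != 2 || stat(argv[1], &target) != 0) {
        fprintf(stderr, "usage: whichdev <path>\n");
        return 1;
    }

    mounts = fopen("/proc/mounts", "r");
    if (mounts == NULL) {
        perror("/proc/mounts");
        return 1;
    }

    /* The first two columns of /proc/mounts are the device and the mount point. */
    while (fscanf(mounts, "%255s %255s %*[^\n]", device, mountpoint) == 2) {
        /* Pseudo-filesystems ("none", "proc", ...) fail the stat or are not
           block devices, so they are skipped automatically. */
        if (stat(device, &dev) == 0 && S_ISBLK(dev.st_mode) &&
            dev.st_rdev == target.st_dev) {
            printf("%s is on %s (mounted on %s)\n", argv[1], device, mountpoint);
            break;
        }
    }

    fclose(mounts);
    return 0;
}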
Or you can parse /proc/partitions:
$ cat /proc/partitions
major minor #blocks name
8 0 976762584 sda
8 1 3998720 sda1
8 2 972762112 sda2
Of course /dev/sda might not actually exist; the device could be using a long udev name like /dev/disk/by-uuid/c4181217-a753-4cf3-b61d-190ee3981a3f. Major/minor numbers should be a reliable unique identifier for a partition.