I got an alert that IUse% on my XFS filesystem had suddenly jumped from 3% to 96%.
An hour or so later, it went back to 3%.
During the problem:
# df -i /data
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/VolGroup01-LogVol01 57082000 54388657 2693343 96% /data
After resolution:
# df -i /data
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/VolGroup01-LogVol01 2621197920 54375585 2566822335 3% /data
Note that IUsed (column 3) stays almost exactly the same -- in both cases there are ~54 million inodes used.
But during the problem, the total number of inodes (column 2) changes drastically, dropping from about 2.6 billion (2,621 million) down to 57 million.
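A quick sanity check of the arithmetic: IUse% is simply IUsed/Inodes, so the same ~54 million used inodes give wildly different percentages depending on the reported total (df rounds the percentage up):

awk 'BEGIN { printf "%.1f%%\n", 54388657/57082000*100 }'      # 95.3% -> df reports 96%
awk 'BEGIN { printf "%.1f%%\n", 54375585/2621197920*100 }'    # 2.1%  -> df reports 3%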
What could cause this?
root@mongo_node_1:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 42G 0 42G 0% /dev
tmpfs 8.3G 1.3M 8.3G 1% /run
/dev/sda2 2.9T 2.9T 0 100% /
tmpfs 42G 0 42G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 42G 0 42G 0% /sys/fs/cgroup
/dev/loop0 87M 87M 0 100% /snap/core/4917
/dev/loop1 90M 90M 0 100% /snap/core/8268
tmpfs 8.3G 0 8.3G 0% /run/user/0
I have deleted the 20 GB mongod log file, but the disk is still full, so the only way left to free space is to drop some databases or collections.
However, mongod cannot be started now. Can I delete a database or collection without starting mongod?
By the way, there are three database nodes. Only the shard1 server disk is full.
If your shard has more than one replica-set member, you can delete the entire data folder content and the member will initial-sync its content from the other members. If it is only one member running under the default WiredTiger storage engine, it is best not to delete files from the data folder, since you could easily corrupt the content. It is best if you shut down the member, extend the partition offline, and start the member again...
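A minimal sketch of that resync path, assuming a systemd-managed mongod and the default dbPath of /var/lib/mongodb (both are assumptions; check your mongod.conf, and only do this while the other replica-set members are healthy):

systemctl stop mongod        # stop the member cleanly (it may already be down)
rm -rf /var/lib/mongodb/*    # wipe the dbPath; this member now holds no data
systemctl start mongod       # on startup it rejoins the set and initial-syncs from the others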
For those reaching here: unfortunately I could not recover the data. After various tries and reproducing the problem, it was too costly to keep trying, so we just used a past backup to recreate the information we needed.
A human error broke a 150 GB UFS filesystem (Solaris).
When trying to back up the filesystem (c0t0d0s3), ufsdump(1M) wasn't used correctly.
I will explain the background that led to this ...
The admin used:
root@ats-br000432 # ufsdump 0f /dev/dsk/c0t0d0s3 > output_1
Usage: ufsdump [0123456789fustdWwnNDCcbavloS [argument]] filesystem
This is invalid usage, so ufsdump only printed its usage message, and the shell redirection created a file called output_1 with 0 bytes:
# ls -la output_1
-rw-r--r-- 1 root root 0 Apr 12 14:12 output_1
Then, the syntax used was:
# ufsdump 0f /dev/rdsk/c0t0d0s3 output_1
With the arguments in this order, ufsdump treated /dev/rdsk/c0t0d0s3 as the dump output file (the argument to f) and output_1 as the filesystem to dump, so it wrote its dump stream directly over the raw partition slice.
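For reference, the intended invocation would have put a real dump file (or tape device) after f and the raw device last; the output path here is illustrative:

# ufsdump 0f /backup/c0t0d0s3.dump /dev/rdsk/c0t0d0s3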
Now, interestingly, because output_1 was a 0-byte file, we thought this would cause no harm to the filesystem, but it did.
When trying to ls in the mountpoint, the partition claimed there was an I/O error. After unmounting and mounting again, the filesystem showed no contents, but the disk space was still reported as used, just as before.
I assume, at some point, the filesystem 'header' was affected, right? Or was it the slice information?
A quick fsck attempt brings up this:
** /dev/rdsk/c0t0d0s3
** Last Mounted on /database
** Phase 1 - Check Blocks and Sizes
INCORRECT DISK BLOCK COUNT I=11 (400 should be 208)
CORRECT?
This seems to show that the command broke the filesystem's accounting of its own contents, right?
When we tried fsck -y -F ufs /dev/dsk.., various files were recovered, but not the dbf files we are after (which are GB-sized).
What can be done now? Should I try every backup superblock reported by newfs -N?
EDIT: new information regarding partition
newfs output showing superblock information
# newfs -N /dev/rdsk/c0t0d0s3
Warning: 2826 sector(s) in last cylinder unallocated
/dev/rdsk/c0t0d0s3: 265104630 sectors in 43149 cylinders of 48 tracks, 128 sectors
129445,6MB in 2697 cyl groups (16 c/g, 48,00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
.....................................................
super-block backups for last 10 cylinder groups at:
264150944, 264241184, 264339616, 264438048, 264536480, 264634912, 264733344,
264831776, 264930208, 265028640
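If the primary superblock is what got overwritten, those backups can be tried one at a time with fsck's -o b= option (a sketch; the offsets come from the newfs -N listing above, and the filesystem must stay unmounted; starting with -n keeps the trial read-only):

for sb in 98464 196896 295328 393760; do
    fsck -n -F ufs -o b=$sb /dev/rdsk/c0t0d0s3 && break   # read-only check against backup superblock $sb
done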
What are the limits of ext4? What I found covers only ext3, and other links give only suppositions rather than a real number.
Can you please provide the maximum number of files (per directory and in total) and the maximum size?
Follow-up on @Curt's answer. The creation parameters can determine the number of inodes, and that's what can limit you in the end. df's -i switch gives you inode info.
(env)someone@somewhere:/$ df -iT
Filesystem Type Inodes IUsed IFree IUse% Mounted on
/dev/root ext4 25149440 612277 24537163 3% /
devtmpfs devtmpfs 3085602 1418 3084184 1% /dev
none tmpfs 3086068 2 3086066 1% /sys/fs/cgroup
none tmpfs 3086068 858 3085210 1% /run
none tmpfs 3086068 1 3086067 1% /run/lock
none tmpfs 3086068 1 3086067 1% /run/shm
none tmpfs 3086068 4 3086064 1% /run/user
This is a Linode box BTW, so it's a virtualized environment. The number I look at is 24537163: that's how many free inodes the root fs has. Note that more than 10K files in a directory can cause difficulties for many tools, and 100K can be really hard on utilities.
See also: https://serverfault.com/questions/104986/what-is-the-maximum-number-of-files-a-file-system-can-contain
It depends upon the mkfs parameters used during filesystem creation. Different Linux flavors have different defaults, so it's really impossible to answer your question definitively.
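For instance, on systems using e2fsprogs you can see the defaults mkfs.ext4 would pick up (assuming the usual /etc/mke2fs.conf location):

$ grep -E 'inode_ratio|inode_size' /etc/mke2fs.conf   # bytes-per-inode ratio and inode size defaults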
For example, a sample df command output is
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 512.00 322.96 37% 4842 7% /
/dev/hd2 4096.00 717.96 83% 68173 29% /usr
/dev/hd9var 1024.00 670.96 35% 6385 4% /var
/dev/hd3 5120.00 0.39 100% 158 10% /tmp
Now if I specify something like /tmp/dummy.txt I should be able to get /dev/hd3 or just hd3.
EDIT: Thanks torek for the answer. But probing /proc would become very tedious. Can anyone suggest some system calls which can do the same internally?
df `pwd`
...Super simple, works, and also tells you how much space is there...
[stackuser@rhel62 ~]$ pwd
/home/stackuser
[stackuser@rhel62 ~]$ df `pwd`
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda7 250056240 196130640 41223408 83% /
[stackuser@rhel62 ~]$ cd isos
[stackuser@rhel62 isos]$ pwd
/home/stackuser/isos
[stackuser@rhel62 isos]$ df `pwd`
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 103216920 90417960 11750704 89% /mnt/sda5
[stackuser@rhel62 isos]$ df $(pwd)
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 103216920 90417960 11750704 89% /mnt/sda5
...which is the likely cause of the mount point query in the first place.
Note those are backticks; the alternate (modern) method, which provides further control over quoting and expansion, is df $(pwd). Tested and traverses symlinks correctly on bash, dash, busybox, zsh. Note that tcsh won't like the $(...), so stick to the older backtick style in csh variants.
There are also extra switches in pwd and df for further enjoyment.
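For example, GNU coreutils df can restrict the output to just the columns of interest (--output is a GNU extension, so it won't exist on every platform; output shown for the isos example above):

$ df --output=source,target "$PWD"
Filesystem     Mounted on
/dev/sda5      /mnt/sda5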
On Linux, use /proc/<pid>/mounts to access a list of mount points for a given pid, or /proc/self/mounts (with the literal word self) to refer to yourself. (cat the /proc/self/mount* files to see what they look like.)
Then, for each file system, you can do a statfs() call and compare the f_fsid field to the result of an earlier statfs() on the path in question. Once the fsids match, you have found the appropriate mounted file system and can use the other data from /proc/self/mounts. (However, see statfs(2) for restrictions on doing anything useful with f_fsid.)
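A rough shell rendering of that procedure (a sketch, not the syscall-level version; it assumes GNU coreutils stat, whose file-system mode exposes the fsid as %i):

target=/tmp/dummy.txt                       # hypothetical file to locate
want=$(stat -f -c %i "$target")             # fsid of the filesystem holding it
while read -r dev mnt _; do
    # compare each mounted filesystem's fsid against the target's
    [ "$(stat -f -c %i "$mnt" 2>/dev/null)" = "$want" ] && echo "$dev mounted on $mnt"
done < /proc/self/mounts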
I need to create a very large number of files, none of them very big (around 4 KB or 8 KB each).
It's not possible on my computer, because doing so drives inode usage up to 100% and I cannot create more files:
$ df -i /dev/sda5
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda5 54362112 36381206 17980906 67% /scratch
(I started deleting files, which is why it's now at 67%.)
The inode size is 256 bytes on my filesystem (ext4):
$ sudo tune2fs -l /dev/sda5 | grep Inode
Inode count: 54362112
Inodes per group: 8192
Inode blocks per group: 512
Inode size: 256
I wonder if it's possible to set this value very low, even below 128 (when reformatting). If yes, what value should I use?
Thx
The default bytes per inode is usually 16384, which is the default inode_ratio in /etc/mke2fs.conf (it's read prior to filesystem creation). If you're running out of inodes, you might try for example:
mkfs.ext4 -i 8192 /dev/mapper/main-var2
Another option that affects this is -T, typically -T news which further reduces it to 4096.
Also, you cannot change the number of inodes in an ext3 or ext4 filesystem without re-creating or hex-editing it. ReiserFS allocates inodes dynamically, so you'll never have this issue with it.
You can find out the approximate inode ratio by dividing the size of available space by the number of available inodes. For example:
$ sudo tune2fs -l /dev/sda1 | awk -F: ' \
/^Block count:/ { blocks = $2 } \
/^Inode count:/ { inodes = $2 } \
/^Block size:/ { block_size = $2 } \
END { blocks_per_inode = blocks/inodes; \
print "blocks per inode:\t", blocks_per_inode, \
"\nbytes per inode:\t", blocks_per_inode * block_size }'
blocks per inode: 3.99759
bytes per inode: 16374.1
I have found the solution to my problem in the mke2fs man page:
-I inode-size
Specify the size of each inode in bytes. mke2fs creates 256-byte inodes by default. In kernels after 2.6.10 and some earlier vendor kernels it is possible to utilize
inodes larger than 128 bytes to store extended attributes for improved performance. The inode-size value must be a power of 2 larger or equal to 128. The larger the
inode-size the more space the inode table will consume, and this reduces the usable space in the filesystem and can also negatively impact performance. Extended
attributes stored in large inodes are not visible with older kernels, and such filesystems will not be mountable with 2.4 kernels at all. It is not possible to change
this value after the filesystem is created.
The maximum inode size you will be able to set is given by your block size, since the inode size cannot exceed the filesystem block size:
sudo tune2fs -l /dev/sda5 | grep "Block size"
Block size: 4096
Hope this can help....
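So for lots of 4 KB and 8 KB files, the lever that matters is -i (bytes of space per inode), not -I (the size of each inode structure). A sketch of reformatting for maximum inode density; the device name comes from the question above, and this of course destroys the existing filesystem:

# smallest inodes, one inode per 4 KB of space (-i cannot go below the 4 KB block size)
mkfs.ext4 -I 128 -i 4096 /dev/sda5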