How to fix btrfs root inode errors [closed] - btrfs

Running btrfsck, or, to use its official name, btrfs check --repair, gives the output below stating that there are root inode errors. The repair command does not fix the issue, and reruns display the same output. The filesystem is fully mountable and operational, but I cannot perform advanced operations on the partition (e.g. resizing).
sudo btrfs check --repair /dev/sda9
enabling repair mode
Checking filesystem on /dev/sda9
UUID: 82fca3c2-703b-4fae-aec2-6b7df1be71c1
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 257 inode 452001 errors 400, nbytes wrong
root 257 inode 452004 errors 400, nbytes wrong
root 257 inode 452005 errors 400, nbytes wrong
root 257 inode 452006 errors 400, nbytes wrong
root 257 inode 452010 errors 400, nbytes wrong
root 257 inode 452011 errors 400, nbytes wrong
root 257 inode 452012 errors 400, nbytes wrong
root 257 inode 1666032 errors 400, nbytes wrong
checking csums
checking root refs
found 33957216263 bytes used err is 0
total csum bytes: 32206988
total tree bytes: 968933376
total fs tree bytes: 886636544
total extent tree bytes: 35323904
btree space waste bytes: 199109273
file data blocks allocated: 41090113536
referenced 32584159232
btrfs-progs v4.0.1

Provided that the broken inodes are the only problem present, the solution is simply to remove the affected files. There may be a quicker way to do this, but here is what worked for me. I found that you can use the find command to search for a file by inode number, like so:
find / -inum XXXXXX -print
of course giving it each inode number reported by the btrfsck output. It will show you the offending file, which you can then delete. Once all of them have been removed, btrfsck will come back clean and the filesystem will behave normally.
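For reference, a rough C equivalent of that find invocation (a sketch only, not needed for the fix): walk the mount point with nftw(3) and print any path whose inode number matches.

#define _XOPEN_SOURCE 500   /* for nftw() */
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Inode number reported by btrfsck, taken from the command line below. */
static ino_t target_inode;

static int check_inode(const char *fpath, const struct stat *sb,
                       int typeflag, struct FTW *ftwbuf)
{
    (void)typeflag; (void)ftwbuf;
    if (sb->st_ino == target_inode)
        printf("%s\n", fpath);
    return 0;                       /* keep walking */
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <mountpoint> <inode>\n", argv[0]);
        return 1;
    }
    target_inode = (ino_t)strtoull(argv[2], NULL, 10);

    /* FTW_PHYS: do not follow symlinks; FTW_MOUNT: stay on one filesystem,
       because inode numbers are only unique within a single filesystem. */
    return nftw(argv[1], check_inode, 20, FTW_PHYS | FTW_MOUNT) == -1;
}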

Related

S_ISDIR returning different value for directory [closed]

I'm currently trying to do a kind of recursive find where I need to distinguish regular files from directories.
I made a loop that uses S_ISDIR to check whether a given file is a directory, but when applied to /home I don't get the expected result.
I only have a pome directory in /home, so the expected result would be:
/home/.
/home/..
/home/pome
but it doesn't detect pome as a directory: S_ISDIR() returns 0 for pome and 1 for . and ..
Code:
DIR *dir = opendir("/home");
if (dir == NULL) {
    puts("Unknown directory");
    return 1;
}

char path[SIZE_PATH];
memset(path, '\0', sizeof(path));
strcpy(path, "/home");

struct dirent *trucdir;
char filename[SIZE_PATH];
memset(filename, '\0', sizeof(filename));
struct stat *filestat = malloc(sizeof(struct stat));

while ((trucdir = readdir(dir)) != NULL) {
    memset(filename, '\0', sizeof(filename));
    strcpy(filename, trucdir->d_name);
    stat(filename, filestat);
    if (S_ISDIR(filestat->st_mode) != 0) {
        puts(filename);
    }
}
Isn't S_ISDIR supposed to return a non-zero value if the file is a directory?
You have to check that your stat() call actually succeeds by looking at its return value. In your case, the call to stat is:
stat("pome", filestat);
but your current working directory (cwd) is not /home!
Hence, the call to stat fails with -1 and errno set to ENOENT, and the result of the S_ISDIR macro is meaningless. The stat calls for . and .. of course succeed, since those entries are present in every directory (although other information such as the inode number doesn't match then).
You have to make sure that you either provide the full path in filename (i.e. /home/pome) or set the cwd to /home beforehand (with chdir("/home");); this should solve your issue!
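A minimal corrected sketch of the loop (keeping the question's SIZE_PATH buffer idea, defined locally here): it builds the full path before calling stat() and checks the return value.

#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

#define SIZE_PATH 4096

int main(void)
{
    DIR *dir = opendir("/home");
    if (dir == NULL) {
        puts("Unknown directory");
        return 1;
    }

    struct dirent *trucdir;
    struct stat filestat;
    char fullpath[SIZE_PATH];

    while ((trucdir = readdir(dir)) != NULL) {
        /* Build "/home/<name>" so stat() does not depend on the cwd. */
        snprintf(fullpath, sizeof(fullpath), "/home/%s", trucdir->d_name);

        /* If stat() fails, st_mode is not meaningful, so skip the entry. */
        if (stat(fullpath, &filestat) == -1) {
            perror(fullpath);
            continue;
        }
        if (S_ISDIR(filestat.st_mode)) {
            puts(fullpath);
        }
    }
    closedir(dir);
    return 0;
}

Alternatively, calling chdir("/home") once before the loop makes the relative d_name values valid arguments for stat().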

Why doesn't ls -al show message queue created by mq_open [closed]

I use mq_open to create the message queue /temp.1234,
but neither ls -al nor ipcs -q in directory / shows any information about the message queue.
I use mq_send to send a message. Also, in another program, mq_open("/temp.1234", O_WRONLY) returns 3 (a message queue descriptor) successfully, but the subsequent mq_receive fails with EBADF. The OS is Ubuntu.
Does this only work on Solaris rather than Ubuntu?
Added:
It's from Unix Network Programming, Volume 2.
Here is the output under Solaris 2.6:
solaris % mqcreate1 /temp.1234
solaris % ls -l /tmp/.*1234
-rw-rw-rw-   1 rstevens  other1    132632 Oct 23 17:08 /tmp/.MQDtemp.1234
-rw-rw-rw-   1 rstevens  other1         0 Oct 23 17:08 /tmp/.MQLtemp.1234
-rw-r--r--   1 rstevens  other1         0 Oct 23 17:08 /tmp/.MQPtemp.1234
The first argument is not a filename; it is only an identifier. It will not show up as a file in /.
Attempting to receive from a queue opened write-only is an error, hence the EBADF.
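A minimal receive-side sketch (assuming the /temp.1234 queue from the question already exists): open it read-only, size the buffer from mq_getattr, then call mq_receive. On older glibc, link with -lrt.

#include <fcntl.h>      /* O_* constants */
#include <mqueue.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(void)
{
    /* Open the existing queue for reading; a descriptor opened with
       O_WRONLY makes mq_receive() fail with EBADF. */
    mqd_t mqd = mq_open("/temp.1234", O_RDONLY);
    if (mqd == (mqd_t)-1) {
        perror("mq_open");
        return 1;
    }

    /* The receive buffer must be at least mq_msgsize bytes long. */
    struct mq_attr attr;
    if (mq_getattr(mqd, &attr) == -1) {
        perror("mq_getattr");
        return 1;
    }
    char *buf = malloc(attr.mq_msgsize);

    ssize_t n = mq_receive(mqd, buf, attr.mq_msgsize, NULL);
    if (n == -1)
        perror("mq_receive");
    else
        printf("received %zd bytes\n", n);

    free(buf);
    mq_close(mqd);
    return 0;
}

On Linux, queue names can also be made visible by mounting the mqueue virtual filesystem (mount -t mqueue none /dev/mqueue); they appear there, not in /.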

How do I recover a btrfs filesystem that will not mount (but mount returns without error), checks OK, and errors out on restore?

SYNOPSIS
mount -o degraded,ro /dev/disk/by-uuid/ec3 /mnt/ec3/ && echo noerror
noerror
DESCRIPTION
mount -t btrfs fails to actually mount the filesystem, yet returns successfully (noerror above), and only since the last reboot.
btrfs check seems clean to me (I am a simple user).
btrfs restore errors out with "We have looped trying to restore files in"...
I had a lingering artifact: btrfs filesystem show reports "*** Some devices missing" on the volume. This meant it would not automount on boot, and I have been mounting it manually (and searching for a resolution to that).
I have previously used rdfind to deduplicate with hard links (as many as 10 per file).
I had just backed up using btrfs send/receive, but have to check whether I have everything; this was the main RAID1 server.
DETAILS
btrfs-find-root /dev/disk/by-uuid/ec3
Superblock thinks the generation is 103093
Superblock thinks the level is 1
Found tree root at 8049335181312 gen 103093 level 1
btrfs restore -Ds /dev/disk/by-uuid/ec3 restore_ec3
We have looped trying to restore files in
df -h /mnt/ec3/
Filesystem Size Used Avail Use% Mounted on
/dev/dm-0 16G 16G 483M 97% /
mount -o degraded,ro /dev/disk/by-uuid/ec3 /mnt/ec3/ && echo noerror
noerror
df /mnt/ec3/
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/dm-0 16775168 15858996 493956 97% /
btrfs filesystem show /dev/disk/by-uuid/ec3
Label: none uuid: ec3
Total devices 3 FS bytes used 1.94TiB
devid 6 size 2.46TiB used 1.98TiB path /dev/mapper/26d2e367-65ea-47ad-b298-d5c495a33efe
devid 7 size 2.46TiB used 1.98TiB path /dev/mapper/3c193018-5956-4637-9ec2-dd5e49a4a412
*** Some devices missing #### comment, this is an old artifact unchanged since before unable to mount
btrfs check /dev/disk/by-uuid/ec3
Checking filesystem on /dev/disk/by-uuid/ec3
UUID: ec3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 2132966506496 bytes used err is 0
total csum bytes: 2077127248
total tree bytes: 5988204544
total fs tree bytes: 3492638720
total extent tree bytes: 242151424
btree space waste bytes: 984865976
file data blocks allocated: 3685012271104
referenced 3658835013632
btrfs-progs v4.1.2
Update:
After a reboot (I had to wait for a slot to take it down), the system mounts manually, but not completely cleanly.
Now asking question on irc #btrfs:
! http://pastebin.com/359EtZQX
Hi, I'm scratching my head and have searched in vain for a way to remove "*** Some devices missing". Can anyone give me a clue to clean this up?
- is there a good way to 'fix' the artifacts I am seeing? Trying: scrub, balance. To try: resize, defragment.
- would I be advised to move to a new clean volume set?
- would a fix via btrfs send/receive be safe from propagating errors?
- or (more painfully) rsync to a clean volume? http://pastebin.com/359EtZQX (my first ever day using IRC)

Write one billion files in one Folder BUT "(No space left on device)" error

I'm trying to write 1 billion files in one folder using multiple threads, but after my program had written 20 million files I got "No space left on device". I did not close my program, because it is still writing the same files.
I don't have any problem with inodes; only 7% are used.
No problem with /tmp or /var/tmp either; they are empty.
I increased fs.inotify.max_user_watches to 1048576.
I use Debian with ext4 as the filesystem.
Has anyone else met this problem? Thank you so much for any help.
Running tune2fs -l /path/to/drive gives
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 260276224
Block count: 195197952
Reserved block count: 9759897
Free blocks: 178861356
Free inodes: 260276213
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1024
Blocks per group: 24576
Fragments per group: 24576
Inodes per group: 32768
Inode blocks per group: 2048
Flex block group size: 16
Filesystem created: ---
Last mount time: ---
Last write time: ---
Mount count: 2
Maximum mount count: -1
Last checked: ---
Check interval: 0 ()
Lifetime writes: 62 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: ---
Directory Hash Seed: ---
Journal backup: inode blocks
Check this question: How to store one billion files on ext4?
You have fewer blocks than inodes, which is not going to work, though I think that is the least of your problems. If you really want to do this (would a database be better?), you may need to look into filesystems other than ext4; ZFS springs to mind as an option that allows 2^48 entries per directory and should do what you want.
If this question https://serverfault.com/questions/506465/is-there-a-hard-limit-to-the-number-of-files-a-directory-can-have is anything to go by, there is a limit on the number of files per directory on ext4, which you are likely hitting.
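To confirm which resource actually runs out when the error fires, a small statvfs() check (a sketch; the path is a placeholder) reports free blocks and free inodes side by side:

#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    /* Placeholder path: point this at the filesystem receiving the files. */
    const char *path = "/data";

    struct statvfs vfs;
    if (statvfs(path, &vfs) == -1) {
        perror("statvfs");
        return 1;
    }

    /* f_bavail: blocks available to unprivileged users; f_favail: inodes. */
    printf("free blocks: %llu of %llu\n",
           (unsigned long long)vfs.f_bavail,
           (unsigned long long)vfs.f_blocks);
    printf("free inodes: %llu of %llu\n",
           (unsigned long long)vfs.f_favail,
           (unsigned long long)vfs.f_files);
    return 0;
}

If both counts look healthy when ENOSPC appears, the limit being hit is more likely the per-directory index mentioned above.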

How to find which process is leaking file handles in Linux?

The problem incident:
Our production system started denying services with the error message "Too many open files in system". Most services were affected, including the inability to start a new ssh session or even to log in at a virtual console from the physical terminal. Luckily, one root ssh session was open, so we could interact with the system (moral: always keep one root session open!). As a side effect, some services (named, dbus-daemon, rsyslogd, avahi-daemon) saturated the CPU (100% load). The system also serves a large directory via NFS to a very busy client, which was backing up 50000 small files at the time. Restarting all kinds of services and programs normalized their CPU behavior but did not solve the "Too many open files in system" problem.
The suspected cause
Most likely, some program is leaking file handles. The culprit is probably my Tcl program, which also saturated the CPU (not normal). However, killing it did not help, and, most disturbingly, lsof would not reveal large numbers of open files.
Some evidence
We had to reboot, so whatever information was collected is all we have.
root@xeon:~# cat /proc/sys/fs/file-max
205900
root@xeon:~# lsof
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init 1 root cwd DIR 8,6 4096 2 /
init 1 root rtd DIR 8,6 4096 2 /
init 1 root txt REG 8,6 124704 7979050 /sbin/init
init 1 root mem REG 8,6 42580 5357606 /lib/i386-linux-gnu/libnss_files-2.13.so
init 1 root mem REG 8,6 243400 5357572 /lib/i386-linux-gnu/libdbus-1.so.3.5.4
...
A pretty normal list, definitely not 200K files, more like two hundred.
This is probably where the problem started:
less /var/log/syslog
Mar 27 06:54:01 xeon CRON[16084]: (CRON) error (grandchild #16090 failed with exit status 1)
Mar 27 06:54:21 xeon kernel: [8848865.426732] VFS: file-max limit 205900 reached
Mar 27 06:54:29 xeon postfix/master[1435]: warning: master_wakeup_timer_event: service pickup(public/pickup): Too many open files in system
Mar 27 06:54:29 xeon kernel: [8848873.611491] VFS: file-max limit 205900 reached
Mar 27 06:54:32 xeon kernel: [8848876.293525] VFS: file-max limit 205900 reached
netstat did not show noticeable anomalies either.
The man pages for ps and top do not indicate an ability to show open file counts. The problem will probably repeat itself after a few months (that was our uptime).
Any ideas on what else can be done to identify the open files?
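For the record, a quick way to survey per-process descriptor counts (a sketch; it only sees descriptors held by user-space processes, so a handle leaked inside the kernel will still be invisible) is to count the entries under each /proc/<pid>/fd directory:

#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* Print "<pid> <open fd count>" for every process on the system. */
int main(void)
{
    DIR *proc = opendir("/proc");
    if (proc == NULL) {
        perror("/proc");
        return 1;
    }

    struct dirent *ent;
    while ((ent = readdir(proc)) != NULL) {
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue;                         /* not a PID directory */

        char path[64];
        snprintf(path, sizeof(path), "/proc/%s/fd", ent->d_name);

        DIR *fds = opendir(path);
        if (fds == NULL)
            continue;                         /* process gone or no permission */

        int count = 0;
        struct dirent *fd;
        while ((fd = readdir(fds)) != NULL)
            if (fd->d_name[0] != '.')
                count++;
        closedir(fds);

        printf("%s %d\n", ent->d_name, count);
    }
    closedir(proc);
    return 0;
}

Piping the output through sort -k2 -n quickly shows the biggest consumers.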
UPDATE
This question has changed its meaning after qehgt identified the likely cause.
Apart from the bug in the NFSv4 code, I suspect there is a design limitation in Linux: kernel-leaked file handles cannot be identified. Consequently, the original question turns into:
"Who is responsible for file handles in the Linux kernel?" and "Where do I post that question?". The first answer was helpful, but I am willing to accept a better one.
Probably the root cause is a bug in the NFSv4 implementation: https://stackoverflow.com/a/5205459/280758
They have similar symptoms.
