I wrote a simple program and ran it on ext4 and XFS.
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
int
main(int argc, char *argv[])
{
int fd;
char *file_name = argv[1];
struct stat buf;
fd = open (file_name, O_RDWR|O_CREAT, 0644); /* O_CREAT requires a mode argument */
if (fd == -1) {
printf ("Error: %s\n", strerror(errno));
return -1;
}
write (fd, "hello", sizeof ("hello"));
fstat (fd, &buf);
printf ("st_blocks: %lu\n", buf.st_blocks);
stat (file_name, &buf);
printf ("st_blocks: %lu\n", buf.st_blocks);
close (fd);
stat (file_name, &buf);
printf ("st_blocks: %lu\n", buf.st_blocks);
return 0;
}
output on ext4:
st_blocks: 8
st_blocks: 8
st_blocks: 8
output on xfs:
st_blocks: 128
st_blocks: 128
st_blocks: 8
Then I read up on XFS and found an option for changing the extent size when running mkfs.xfs.
example: mkfs.xfs -r extsize=4096 /dev/sda1
But I still get the same output on XFS. Can anyone provide more insight into how to change st_blocks? Thanks in advance.
I found the answer and am posting it here so that others facing the same problem can refer to it.
mount -t xfs -o allocsize=4096 device mount-point
The allocsize option sets the end-of-file preallocation size that XFS uses for buffered I/O.
What you are seeing is XFS speculative preallocation, a heuristic used to avoid fragmentation of files as they grow.
For more info, see this FAQ entry.
You are correct that the "-o allocsize=XXX" option disables that heuristic. Your attempt at using "-r extsize=XXX" failed because that option is only for the realtime subvolume, which you are almost certainly not using.
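To relate the numbers above to bytes: on Linux, st_blocks is counted in 512-byte units, so 128 blocks is 64 KiB of allocated space and 8 blocks is 4 KiB. Here is a minimal sketch of that conversion. The helper names allocated_bytes and allocated_bytes_of are made up for illustration, and the 512-byte unit is a Linux assumption (POSIX does not fix it).

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: bytes of disk space implied by st_blocks.
 * Assumes Linux, where st_blocks is counted in 512-byte units. */
long long allocated_bytes(blkcnt_t st_blocks)
{
    assert(st_blocks >= 0);
    return (long long)st_blocks * 512;
}

/* Hypothetical helper: allocated bytes for a path, or -1 if stat() fails. */
long long allocated_bytes_of(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return allocated_bytes(sb.st_blocks);
}
```

Read this way, the XFS output in the question shows 64 KiB allocated while the file is open and 4 KiB once it has been closed.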
I recently learned (initially from here) how to use mmap to quickly read a file in C, as in this example code:
// main.c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#define INPUT_FILE "test.txt"
int main(int argc, char* argv[]) {
struct stat ss;
if (stat(INPUT_FILE, &ss)) {
fprintf(stderr, "stat err: %d (%s)\n", errno, strerror(errno));
return -1;
}
{
int fd = open(INPUT_FILE, O_RDONLY);
char* mapped = mmap(NULL, ss.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
fprintf(stdout, "%s\n", mapped);
munmap(mapped, ss.st_size);
}
return 0;
}
My understanding is that this use of mmap returns a pointer to length heap-allocated bytes.
I've tested this on plain text files, that are not explicitly null-terminated, e.g. a file with the 13-byte ascii string "hello, world!":
$ cat ./test.txt
hello, world!$
$ stat ./test.txt
File: ./test.txt
Size: 13 Blocks: 8 IO Block: 4096 regular file
Device: 810h/2064d Inode: 52441 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ user) Gid: ( 1000/ user)
Access: 2022-10-25 20:30:52.563772200 -0700
Modify: 2022-10-25 20:30:45.623772200 -0700
Change: 2022-10-25 20:30:45.623772200 -0700
Birth: -
When I run my compiled code, it never segfaults or spews garbage -- the classic symptoms of printing an unterminated C-string.
When I run my executable through gdb, mapped[13] is always '\0'.
Is this undefined behavior?
I can't see how it's possible that the bytes that are memory-mapped from the input file are reliably NUL-terminated.
For a 13-byte string, the "equivalent" that I would have normally done with malloc and read would be to allocate a 14-byte array, read from file to memory, then explicitly set byte 13 (0-based) to '\0'.
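mmap doesn't go through malloc; it returns a pointer to whole pages mapped by the kernel. Pages are usually 4096 bytes each, and when the file size is not a multiple of the page size, the kernel fills the rest of the last page with zeroes, not with garbage. That is why mapped[13] is '\0' here, but you can't rely on it in general: a file whose size is an exact multiple of the page size has no zero padding after its last byte.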
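Since the trailing zero is a side effect of page granularity, a safer approach is to pass the length explicitly instead of treating the mapping as a C string. Here is a minimal sketch of that idea. The function name print_mapped_file is hypothetical, and the error handling is deliberately simple.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write the contents of `path` to `out` through a
 * read-only mapping, using the file size as the length instead of relying
 * on a terminating '\0'. Returns bytes written, or -1 on error. */
long print_mapped_file(const char *path, FILE *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    if (sb.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)sb.st_size;
    char *mapped = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (mapped == MAP_FAILED)
        return -1;

    size_t written = fwrite(mapped, 1, len, out);
    munmap(mapped, len);
    return written == len ? (long)written : -1;
}
```

Calling print_mapped_file("test.txt", stdout) prints exactly st_size bytes, whether or not a zero byte happens to follow them.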
I am trying to understand direct I/O. To that end I have written this little toy code, which is merely supposed to open a file and write a text string to it:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char **argv) {
char thefile[64];
int fd;
char message[64]="jsfreowivanlsaskajght";
sprintf(thefile, "diotestfile.dat");
if ((fd = open(thefile,O_DIRECT | O_RDWR | O_CREAT, S_IRWXU)) == -1) {
printf("error opening file\n");
exit(1);
}
write(fd, message, 64);
close(fd);
}
My compile command for Cray and GNU is
cc -D'_GNU_SOURCE' diotest.c
and for Intel it is
cc -D'_GNU_SOURCE' -xAVX diotest.c
Under all three compilers, the file diotestfile.dat is created with correct permissions, but no data is ever written to it. When the executable finishes, the output file is blank. The O_DIRECT is the culprit (or, more precisely I guess, my mishandling of O_DIRECT). If I take it out, the code works just fine. I have seen this same problem in a much more complex code that I am trying to work with. What is it that I need to do differently?
Going on Ian Abbot's comment, I discovered that the problem can be solved by adding an alignment attribute to the "message" array:
#define BLOCK_SIZE 4096
int bytes_to_write, block_size=BLOCK_SIZE;
bytes_to_write = ((MSG_SIZE + block_size - 1)/block_size)*block_size;
char message[bytes_to_write] __attribute__ ((aligned(BLOCK_SIZE)));
(System I/O block size is 4096.)
So that solved it. Still can't claim to understand everything that is happening. Feel free to enlighten me if you want. Thanks to everyone for the comments.
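A possible explanation, offered as a sketch and not a definitive diagnosis: on Linux, O_DIRECT transfers generally need the buffer address, the transfer length and the file offset to be multiples of the device's logical block size. The original 64-byte write from an unaligned stack array would then fail with EINVAL, and the failure went unnoticed because the return value of write() was not checked. The helper below builds a suitably aligned, zero-padded buffer with posix_memalign(). The names round_up_to_block and make_direct_buffer are hypothetical, and the block size is passed in by the caller (4096 in the post above).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of block (block must be non-zero). */
size_t round_up_to_block(size_t n, size_t block)
{
    assert(block != 0);
    return ((n + block - 1) / block) * block;
}

/* Hypothetical helper: copy msg into a freshly allocated, zero-padded buffer
 * whose address and length are both multiples of block. block must be a
 * power of two and a multiple of sizeof(void *), as posix_memalign() requires.
 * Returns NULL on failure; the caller releases the buffer with free(). */
char *make_direct_buffer(const char *msg, size_t msg_len, size_t block,
                         size_t *out_len)
{
    void *buf = NULL;
    size_t len = round_up_to_block(msg_len, block);

    if (len == 0)
        len = block;                /* always hand back at least one block */
    if (posix_memalign(&buf, block, len) != 0)
        return NULL;
    memset(buf, 0, len);            /* pad the tail with zero bytes */
    memcpy(buf, msg, msg_len);
    *out_len = len;
    return buf;
}
```

The result can be passed to write(fd, buf, len) on the O_DIRECT descriptor. Comparing the return value of write() with len shows immediately whether the transfer was accepted. Note that the file then contains the whole padded block, not just the message.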
Well, you need to rethink the question, because your program runs perfectly on my system, and I cannot guess from its listing where the error might be.
Have you tested it before posting?
If the program doesn't write to the file, it is probably a good idea to check the return value of write(2). Have you done this? I cannot check, because on my system (Intel 64-bit/FreeBSD) the program runs as you expected.
Your program runs without printing anything, and a file named diotestfile.dat appears in the current directory with the contents jsfreowivanlsaskajght.
lcu@europa:~$ ll diotestfile.dat
-rwx------ 1 lcu lcu 64 1 feb. 18:14 diotestfile.dat*
lcu@europa:~$ cat diotestfile.dat
jsfreowivanlsaskajghtlcu@europa:~$ _
My test program calls stat(2) to obtain the device that a file resides on.
stat.c (built with cc stat.c -o stat)
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/sysmacros.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
int main()
{
char *path = "/home/smoku/test.txt";
unsigned int maj, min;
struct stat sb;
if (stat(path, &sb) < 0) {
fprintf(stderr, "Error getting stat for '%s': %d %s\n", path, errno, strerror(errno));
return 1;
}
maj = major(sb.st_dev);
min = minor(sb.st_dev);
fprintf(stderr, "Found '%s' => %u:%u\n", path, maj, min);
return 0;
}
Got 0:44
$ ls -l /home/smoku/test.txt
-rw-r--r-- 1 smoku smoku 306 08-30 09:33 /home/smoku/test.txt
$ ./stat
Found '/home/smoku/test.txt' => 0:44
$ /usr/bin/stat -c "%d" /home/smoku/test.txt
44
But... there is no such device in my system and /home is 0:35
$ grep /home /proc/self/mountinfo
75 59 0:35 /home /home rw,relatime shared:30 - btrfs /dev/bcache0 rw,ssd,space_cache,subvolid=258,subvol=/home
Why do I get a device ID that does not exist in my system?
stat(2) in fs/stat.c uses inode->i_sb->s_dev to fill stat.st_dev
/proc/self/mountinfo in fs/proc_namespace.c uses mnt->mnt_sb->s_dev
Apparently the superblock in struct inode.i_sb may differ from the one in struct vfsmount.mnt_sb when a btrfs subvolume is mounted.
This is an issue inherent to the btrfs implementation, which "requires non-trivial changes in the VFS layer" to fix: https://mail-archive.com/linux-btrfs@vger.kernel.org/msg57667.html
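To compare the two views programmatically, one option is to pull the major:minor pair out of a /proc/self/mountinfo line and set it against major(sb.st_dev) and minor(sb.st_dev). Here is a minimal sketch. The helper name parse_mountinfo_dev is hypothetical, and it assumes the usual mountinfo layout in which the third column is major:minor.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extract the major:minor field (third column) from one
 * line of /proc/self/mountinfo. Returns 0 on success, -1 on a parse error. */
int parse_mountinfo_dev(const char *line, unsigned int *maj, unsigned int *min)
{
    assert(line != NULL && maj != NULL && min != NULL);
    /* Columns: mount ID, parent ID, major:minor, root, mount point, ... */
    return sscanf(line, "%*d %*d %u:%u", maj, min) == 2 ? 0 : -1;
}
```

For the /home line quoted in the question this yields 0:35, while stat(2) reports 0:44 for a file in the same subvolume, which is the mismatch described above.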
The end goal here is that I'd like to be able to extend the size of a shared memory segment and notify processes to remap the segment after the extension. However it seems that calling ftruncate a second time on a shared memory fd fails with EINVAL. The only other question I could find about this has no answer: ftruncate failed at the second time
The man pages for ftruncate and shm_open make no mention of disallowing the expansion of shared memory segments after creation; in fact, they seem to indicate that segments can be resized via ftruncate, but so far my testing has shown otherwise. The only solution I can think of would be to destroy the shared memory segment and recreate it at a larger size. However, this would require all processes that have mmap'd the segment to unmap it before the object is destroyed and available for recreation.
Any thoughts? Thanks!
EDIT: As requested, a simple example:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/types.h>
int main(int argc, char *argv[]){
const char * name = "testfile";
size_t sz = 4096; // page size on my sys
int fd;
if((fd = shm_open(name, O_CREAT | O_RDWR, 0666)) == -1){
perror("shm_open");
exit(1);
}
ftruncate(fd, sz);
perror("First truncate");
ftruncate(fd, 2*sz);
perror("second truncate");
shm_unlink(name);
return 0;
}
Output:
First truncate: Undefined error: 0
second truncate: Invalid argument
EDIT - Answer: It appears that this is an issue with the OS X implementation of the POSIX standard. The snippet above works on a 3.13.0-53-generic GNU/Linux kernel, and I'd guess on others as well.
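A side note on the sample: perror() runs after every call, including the successful one, which is why the first output line reads "Undefined error: 0". Here is a minimal sketch that reports only real failures. The helper name resize_fd is hypothetical. It does not change the OS X behaviour described above; it only makes the failing call easier to identify.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resize the object behind fd and report failures.
 * Returns 0 on success, or the errno value from ftruncate() on failure. */
int resize_fd(int fd, off_t new_size)
{
    assert(new_size >= 0);
    if (ftruncate(fd, new_size) == -1) {
        int err = errno;            /* save errno before calling anything else */
        fprintf(stderr, "ftruncate to %lld bytes: %s\n",
                (long long)new_size, strerror(err));
        return err;
    }
    return 0;
}
```

With this in place, the two calls become resize_fd(fd, sz) and resize_fd(fd, 2 * sz), and a message is printed only for the one that actually fails.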
With respect to your end goal, here's an open source library I wrote that seems to be a match: rszshm - resizable pointer-safe shared memory.
I am trying to get the size of a file after creating it and writing data to it. The values I get don't seem to correspond to the actual file size. Here is my program. Please show me how I can display the file size in bits, bytes, kilobytes, and megabytes. By my calculation the file size should be 288 bits, 36 bytes, 0.03515625 kilobytes, and 0.000034332 megabytes.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#define PERMS 0777
int main(int argc, char *argv[])
{
int createDescriptor;
int openDescriptor;
char fileName[15]="Filename1.txt";
umask(0000);
if ((openDescriptor = creat(fileName, PERMS )) == -1)
{
printf("Error creating %s", fileName);
exit(EXIT_FAILURE);
}
if(write(openDescriptor,"This will be output to testfile.txt\n",36 ) != 36)
{
write(2,"There was an error writing to testfile.txt\n",43);
return 1;
}
if((close(openDescriptor))==-1)
{
write(2, "Error closing file.\n", 19);
}
struct stat buf;
fstat(openDescriptor, &buf);
int size=buf.st_size;
printf("%d\n",size);
printf("%u\n",size);
return 0;
}
The fstat() function returns a status code; check it. (The check below also needs <errno.h> and <string.h> for errno and strerror().)
int r = fstat(openDescriptor, &buf);
if (r) {
fprintf(stderr, "error: fstat: %s\n", strerror(errno));
exit(1);
}
This will print:
error: fstat: Bad file descriptor
Yep... you closed the file descriptor, it's not a file descriptor any more. You have to fstat() before calling close().
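For the unit conversion the question asks about, here is a minimal sketch. The helper names fd_size_bytes and print_size_units are hypothetical, and it assumes 1 kilobyte = 1024 bytes and 1 megabyte = 1024 * 1024 bytes, which matches the figures in the question. fd_size_bytes() has to be called while the descriptor is still open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: size in bytes of the file behind an open descriptor,
 * or -1 if fstat() fails (for example because fd was already closed). */
long long fd_size_bytes(int fd)
{
    struct stat sb;

    if (fstat(fd, &sb) == -1)
        return -1;
    return (long long)sb.st_size;
}

/* Print a byte count in the four units the question asks for,
 * taking 1 kilobyte = 1024 bytes and 1 megabyte = 1024 * 1024 bytes. */
void print_size_units(long long bytes, FILE *out)
{
    assert(bytes >= 0 && out != NULL);
    fprintf(out, "%lld bits\n", bytes * 8);
    fprintf(out, "%lld bytes\n", bytes);
    fprintf(out, "%.8f kilobytes\n", bytes / 1024.0);
    fprintf(out, "%.9f megabytes\n", bytes / (1024.0 * 1024.0));
}
```

In the program above, that means calling fd_size_bytes(openDescriptor) before close() and passing the result to print_size_units(size, stdout). For the 36-byte file this gives 288 bits and 36 / 1024 = 0.03515625 kilobytes.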
The code worries me.
This is extremely fragile, and cannot be recommended under any circumstances:
if (write(openDescriptor,"This will be output to testfile.txt\n",36 ) != 36)
You can do this:
const char *str = "This will be output to testfile.txt\n";
if (write(fd, str, strlen(str)) != strlen(str))
With optimization enabled it will compile to the same machine code, and it's obviously correct (as opposed to the original code, where you have to count the number of characters in a string to figure out if it's correct or not).
Even better, when you are using stderr, just use the standard <stdio.h> functions:
fprintf(stderr, "There was an error writing to %s: %s\n",
fileName, strerror(errno));
The same error appears when defining fileName...
// You should never have to know how to count higher than 4 to figure
// out if code is correct...
char fileName[15]="Filename1.txt";
// Do this instead...
static const char fileName[] = "Filename1.txt";
You actually miscounted this time, [15] should have been [14], but better to leave it to the compiler. There's no benefit to making the compiler's job easier, since the compiler presumably doesn't have better things to do.
About the machine code:
$ cat teststr.c
#include <unistd.h>
void func(int openDescriptor) {
write(openDescriptor,"This will be output to testfile.txt\n",36 );
}
$ cat teststr2.c
#include <string.h>
#include <unistd.h>
void func(int openDescriptor) {
const char *str = "This will be output to testfile.txt\n";
write(openDescriptor, str, strlen(str));
}
$ cc -S -O2 teststr.c
$ cc -S -O2 teststr2.c
$ diff teststr.s teststr2.s
1c1
< .file "teststr.c"
---
> .file "teststr2.c"
Yep. As demonstrated, the call to strlen() does not actually result in different machine code.
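Taking the strlen() idea one step further, the length computation can live in a small helper so that no call site counts characters at all. A minimal sketch follows. The name write_str is hypothetical, and the retry loop for short writes and EINTR is an addition of this sketch, not something the code above does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a whole NUL-terminated string to fd, retrying
 * after short writes and EINTR. Returns 0 on success, -1 on error. */
int write_str(int fd, const char *str)
{
    assert(str != NULL);
    size_t left = strlen(str);

    while (left > 0) {
        ssize_t n = write(fd, str, left);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        str += n;                   /* skip the part already written */
        left -= (size_t)n;
    }
    return 0;
}
```

The original call then becomes write_str(openDescriptor, "This will be output to testfile.txt\n"), with no length to get wrong.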