What are the semantics of structure padding/packing in the Linux kernel?

I am interested in the semantics of structure padding and packing, specifically in relation to the structures returned from the Linux kernel.
For example, if a program+stdlib is compiled so that structure padding doesn't take place, and a kernel is compiled so that structure padding does take place (which IIRC is the default for GCC anyway), surely the program cannot run, because the structures returned from the kernel are garbage from its point of view.
What if the compiler in question changed its padding semantics over time? Surely the same problem is likely to crop up. The structures defined in /usr/include/linux/* and /usr/include/asm-generic/* do not appear to be packed, so they depend on the compiler used and the alignment semantics of said compiler, right?
But I can take a binary compiled years ago on a different computer with different memory alignment requirements and presumably different padding semantics, and run it on my modern computer and it appears to work fine.
How does it not see garbage? Is this just pure luck? Do compiler authors (like say, TCC and the like) take care to copy GCC's structure padding semantics? How is this potential problem dealt with in the real world?

The structures defined in /usr/include/linux/* and
/usr/include/asm-generic/* do not appear to be packed, so they
depend on the compiler used and the alignment semantics of said
compiler, right?
That's not true, generally. Here is an example from GCC on 64-bit Ubuntu (/usr/include/x86_64-linux-gnu/asm/stat.h):
struct stat {
__kernel_ulong_t st_dev;
__kernel_ulong_t st_ino;
__kernel_ulong_t st_nlink;
unsigned int st_mode;
unsigned int st_uid;
unsigned int st_gid;
unsigned int __pad0;
__kernel_ulong_t st_rdev;
__kernel_long_t st_size;
__kernel_long_t st_blksize;
__kernel_long_t st_blocks; /* Number 512-byte blocks allocated. */
__kernel_ulong_t st_atime;
__kernel_ulong_t st_atime_nsec;
__kernel_ulong_t st_mtime;
__kernel_ulong_t st_mtime_nsec;
__kernel_ulong_t st_ctime;
__kernel_ulong_t st_ctime_nsec;
__kernel_long_t __unused[3];
};
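See __pad0? int is generally 4 bytes, but st_rdev is an unsigned long, which is 8 bytes on this platform, so it must be 8-byte aligned. The three 8-byte fields at the start end at offset 24 and the three ints bring that to 36, which is not a multiple of 8, so a 4-byte __pad0 is added explicitly to reach offset 40.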
Essentially, the implementation of stdlib takes care to hard-code its ABI.
BUT that isn't true for all APIs. Here is struct flock (from the same machine, /usr/include/asm-generic/fcntl.h) used by the fcntl() call:
struct flock {
short l_type;
short l_whence;
__kernel_off_t l_start;
__kernel_off_t l_len;
__kernel_pid_t l_pid;
__ARCH_FLOCK_PAD
};
As you can see, there is no explicit padding member between l_whence and l_start; the gap needed to align l_start is left to the compiler. And indeed, for the following C program, saved as abi.c:
#include <fcntl.h>
#include <string.h>
int main(int argc, char **argv)
{
    struct flock fl;
    int fd;
    fd = open("y", O_RDWR);
    memset(&fl, 0xff, sizeof(fl));
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 200;
    fl.l_len = 1;
    fcntl(fd, F_SETLK, &fl);
}
We get:
$ cc -g -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=200, l_len=1}) = 0
+++ exited with 0 +++
$ cc -g -fpack-struct -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=4294967296, l_len=-4294967296}) = 0
+++ exited with 0 +++
As you can see, the fields following l_whence are indeed garbage.
Moreover, C has no ABI, and so this fragile compatibility relies on implementations playing nice. struct stat above assumes that the compiler won't insert extra padding of its own.
ANSI C says:
There may also be unnamed padding at the end of a structure or union, as necessary to achieve the appropriate alignment were the structure or union to be a member of an array.
There's no wording on how padding may be inserted in the middle of a struct for reasons other than alignment; however, there's also:
Implementation-defined behavior
Each implementation shall document its behavior in each of the areas listed in this section. The following are implementation-defined:
...
The padding and alignment of members of structures. This should present no problem unless binary data written by one implementation are read by another.
On my Ubuntu machine, both the compiler (GCC) and the standard library (glibc) come from the GNU project, so they interoperate smoothly. Clang wants to grow, so it's compatible with GNU libc. Everyone is just playing nice, most of the time.

Related

SUNPATHLEN on Linux. Where is it defined?

Recently I began to port some of my TCP code from FreeBSD to Linux. I already had a bunch of questions ;) So, here is another one.
The C struct sockaddr_un on Linux has a somewhat different definition from the one on FreeBSD. But, to the question. I have this code in my project:
}else if(AF_UNIX == domain){
if(SUNPATHLEN == strnlen(a, SUNPATHLEN)){
return -ENAMETOOLONG;
}
The above tests that the path is no longer than the SUNPATHLEN constant. SUNPATHLEN is defined on FreeBSD, but apparently not on Linux.
Looking through the output of gcc -E source.c | grep -n4 sockaddr_un, the struct definition is as follows:
1724:struct sockaddr_un
1725- {
1726- sa_family_t sun_family;
1727- char sun_path[108];
1728- };
Here the length of the buffer is explicitly set to 108.
What is a general rule to check for buffer being trimmed/overflowed in Linux, for the case?
Looking through the output of gcc -E source.c | grep -n4 sockaddr_un, the struct definition is as follows:
You don't have to (and shouldn't) trawl the source for this kind of information. You should be looking at user-facing documentation to determine the interface characteristics on which you can rely. In this case, you're looking for unix(7):
A UNIX domain socket address is represented in the following
structure:
struct sockaddr_un {
sa_family_t sun_family; /* AF_UNIX */
char sun_path[108]; /* Pathname */
};
The sun_family field always contains AF_UNIX. On Linux, sun_path is
108 bytes in size
(emphasis added).
What is a general rule to check for buffer being trimmed/overflowed in
Linux, for the case?
No macro is defined for it, but the capacity of the path buffer is explicitly documented as 108 bytes. You can (and probably should) define your own macro for this if you're going to perform tests related to it.
You could possibly do some variation on this to remove system dependencies:
static struct sockaddr_un dummy_sockaddr_un_;
#define MY_SUN_PATH_SIZE (sizeof(dummy_sockaddr_un_.sun_path))

How to not write more bytes than are in the buffer in C?

I'm trying to write a simple copy program. It reads test_data.txt in chunks of 100 bytes and copies those bytes to test_dest.txt. I find that the destination file is at least one chunk larger than the source file. How could I adjust it so that just the right number of bytes are copied? Would I need a copy buffer of size 1?
Please note that the point is to solve it using low-level I/O system calls.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
int main() {
    int fh = open("test_data.txt", O_RDONLY);
    int trg = open("test_dest.txt", O_CREAT | O_WRONLY);
    int BUF_SIZE = 100;
    char inp[BUF_SIZE];
    int read_bytes = read(fh, inp, BUF_SIZE);
    while (read_bytes > 0) {
        write(trg, inp, BUF_SIZE);
        read_bytes = read(fh, inp, BUF_SIZE);
    }
    close(trg);
    close(fh);
    return 0;
}
The read function just told you how many bytes it read. You should write that amount of bytes:
write(trg, inp, read_bytes);
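On another note, you really should check for failures from the write call as well. And definitely from the open calls; note also that open with O_CREAT requires a third argument giving the file mode, which your code omits.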
On yet another note, you only really need one call to read:
ssize_t read_bytes; // The read function is specified by POSIX to return a ssize_t
while ((read_bytes = read(fh, inp, sizeof inp)) > 0)
{
    write(trg, inp, read_bytes);
}
Your code is not standard C11: open, read, and write come from POSIX, not from ISO C. You can check this against the standard (n1570), and it is worth reading the Modern C book first.
Your code is more or less POSIX, and certainly compiles on most Linux distributions, e.g. Debian or Ubuntu (you want to install their build-essential metapackage).
Please read the documentation of open(2), read(2), write(2), of every system call listed in syscalls(2) that you are using, and of errno(3).
Notice that each of the functions you are calling can fail, and your code should test for the failure case. Also be aware that a write (or a read) could be partial in some cases, and this is documented.
Recommendation:
With a recent GCC (the usual C compiler on most Linux distributions), compile with all warnings and debug info: gcc -Wall -Wextra -g.
Read Advanced Linux Programming and How to debug small programs
Learn to use the GDB debugger.
Read about build automation tools, such as GNU make (a very common tool on most Linux systems) or ninja.
Be aware of strace(1). You might use it on cp(1), or study the source code of GNU coreutils (providing cp).
Remember that most Linux distributions are mostly made of open source software.
You are allowed to study their source code.
I even believe that you should study their source code, at least for inspiration !
I'm trying to write a simple copy program. It reads test_data.txt in chunks of 100 bytes and copies those bytes to test_dest.txt
If performance matters, the chunk size of 100 bytes is really too small in practice. I would recommend a power of two bigger than 4Kbytes (the usual page size on x86-64).

Struct layout in apcs-gnu ABI

For this code:
struct S { unsigned char ch[2]; };
int main(void)
{
_Static_assert( sizeof(struct S) == 2, "size was not 2");
}
using GCC (various versions) for ARM with the apcs-gnu ABI (a.k.a. OABI, or EABI version 0), the assertion fails. It turns out that the size of the struct is 4.
I can work around this by using __attribute__((packed)); but my questions are:
What is the rationale for making this struct size 4?
Is there any documentation specifying the layout of structs in this ABI?
On the ARM website I found documentation for aapcs (EABI version 5) which does specify this struct as having a size of 2; but I could not find anything about apcs-gnu.
This is a GCC-specific decision to trade-off size for performance. It can be overridden with -mstructure-size-boundary=8.
An excerpt from source code:
/* Setting STRUCTURE_SIZE_BOUNDARY to 32 produces more efficient code, but the
value set in previous versions of this toolchain was 8, which produces more
compact structures. The command line option -mstructure_size_boundary=<n>
can be used to change this value. For compatibility with the ARM SDK
however the value should be left at 32. ARM SDT Reference Manual (ARM DUI
0020D) page 2-20 says "Structures are aligned on word boundaries".
The AAPCS specifies a value of 8. */
#define STRUCTURE_SIZE_BOUNDARY arm_structure_size_boundary

poll() returns EINVAL for more than 256 descriptors on macOS

Here is the example code that crashes:
#include <stdio.h>
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <limits.h>
#define POLL_SIZE 1024
int main(int argc, const char * argv[]) {
    printf("%d\n", OPEN_MAX);
    struct pollfd *poll_ = calloc(POLL_SIZE, sizeof(struct pollfd));
    if (poll(poll_, POLL_SIZE, -1) < 0)
        if (errno == EINVAL)
            perror("poll error");
    return 0;
}
If you set POLL_SIZE to 256 or less, the code works just fine. What's interesting is that if you run this code in Xcode, it executes normally, but if you run the binary yourself you get a crash.
The output is like this:
10240
poll error: Invalid argument
According to poll(2):
[EINVAL] The nfds argument is greater than OPEN_MAX or the
timeout argument is less than -1.
As you can see, POLL_SIZE is a lot smaller than the limit, and the timeout is exactly -1, yet it crashed.
My clang version that I use for manual building:
Configured with: prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/c++/4.2.1
Apple LLVM version 8.1.0 (clang-802.0.41)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
On Unix systems, processes have limits on resources. See e.g. getrlimit. You might change them (using setrlimit) and your sysadmin can also change them (e.g. configure these limits at startup or login time). There is a limit RLIMIT_NOFILE related to file descriptors. Read also about the ulimit bash builtin. See also sysconf with _SC_OPEN_MAX.
The poll system call is normally given a fairly small array, and it is bad taste (possible, but inefficient) to repeat a file descriptor in it. So in practice you'll usually call it with a small array of distinct, valid file descriptors. The second argument to poll is the number of entries actually in use (in practice, all different), not the allocated size of the array.
You may deal with many file descriptors. Read about the C10K problem.
BTW, your code is not crashing. poll is failing as documented (but not crashing).
You should read some POSIX programming book. Advanced Linux Programming is freely available, and most of it is on POSIX (not Linux specific) things.

Why does this struct not align properly?

I was reading this answer: C struct memory layout? and was curious to know why:
struct ST
{
long long ll;
char ch2;
char ch1;
short s;
int i;
};
still is the size of 24 bytes instead of 16. I was expecting 2*char + short + int to fit into 8 bytes. Why is it so?
EDIT:
Sorry for the confusion, I am running on a 64-bit system (Debian), gcc (Debian 4.4.5-8) 4.4.5. I already know it's due to padding. My question was why? One of the answers suggests:
char = 1 byte
char = 1 byte
short = 1 byte (why is this 1 and not 2?)
* padding of 5 bytes
My question is, why is this padding here... why not just put an int straight after the short, it will still fit within 8 bytes.
The simple answer is: it isn't 24 bytes. Or you're running on the 64-bit s390 port of Linux, for which I haven't been able to find the ABI documentation. On every other 64-bit hardware platform that Debian can run on, this struct is 16 bytes.
I have dug up the ABI documentation for a bunch of different CPU ABIs and they all have more or less this wording (it seems they have all been copying from each other):
Structures and unions assume the alignment of their most strictly aligned component. Each member is assigned to the lowest available offset with the appropriate alignment. The size of any object is always a multiple of the object's alignment.
And all architecture ABI documents I found (mips64, ppc64, amd64, ia64, sparc64, arm64) have the alignment for char 1, short 2, int 4 and long long 8.
Even though operating systems are allowed to make their own ABI, almost every unix-like system and especially Linux follow the System V ABI and their supplemental CPU documentation that specifies this behavior very well. And Debian will definitely not change this behavior to be different from all other Linuxes.
Here's a quick verification (all on amd64/x86_64 which is what you're most likely running):
$ cat > foo.c
#include <stdio.h>
int
main(int argc, char **argv)
{
    struct {
        long long ll;
        char ch2;
        char ch1;
        short s;
        int i;
    } foo;
    printf("%d\n", (int)sizeof(foo));
    return 0;
}
MacOS:
$ cc -o foo foo.c && ./foo && uname -ms
16
Darwin x86_64
Ubuntu:
$ cc -o foo foo.c && ./foo && uname -ms
16
Linux x86_64
CentOS:
$ cc -o foo foo.c && ./foo && uname -ms
16
Linux x86_64
OpenBSD:
$ cc -o foo foo.c && ./foo && uname -ms
16
OpenBSD amd64
There's something else wrong with your compilation. Or that's not the struct you're testing or you're running on a very strange hardware architecture and specifying it as "64 bit" is equivalent to saying "I'm driving this government issued vehicle and it has very strange acceleration and the engine cuts out after 5 minutes" and not mentioning that you're talking about the space shuttle.
It's all about padding. In Visual Studio (and some other compilers) you can use #pragma pack(push, n) and #pragma pack(pop) to control the alignment as you wish.
#pragma pack(push, 1)
struct ST
{
/*0x00*/ long long ll;
/*0x08*/ char ch2;
/*0x09*/ char ch1;
/*0x0a*/ short s;
/*0x0c*/ int i;
/*0x10*/
};
#pragma pack(pop)
Since you said it was coming up as size 24, I'm going to guess that the compiler is aligning you at 4 bytes by doing something like this:
struct ST
{
    /*0x00*/ long long ll;
    /*0x08*/ char ch2;
    /*0x09*/ char padding1[0x3];
    /*0x0c*/ char ch1;
    /*0x0d*/ char padding2[0x3];
    /*0x10*/ short s;
    /*0x12*/ char padding3[0x2];
    /*0x14*/ int i;
    /*0x18*/
};
(Sorry, I think in hex when doing this sort of thing. 0x10 is 16 in decimal and 0x18 is 24 in decimal.)
To minimize padding, it's normal practice to order the members from biggest to smallest. Could you try to reorder them and see what comes out of this?
struct ST
{
long long ll; //8 bytes
int i; //4bytes
short s; //2 bytes
char ch2; //1 byte
char ch1; //1 byte
};
total is 16 bytes
