SUNPATHLEN on Linux. Where is it defined? - c

Recently I began porting some of my TCP code from FreeBSD to Linux. I've already had a bunch of questions ;) So, here is another one.
The C struct sockaddr_un on Linux has a somewhat different definition from the one on FreeBSD. But, to the question. I have this code in my project:
}else if(AF_UNIX == domain){
if(SUNPATHLEN == strnlen(a, SUNPATHLEN)){
return -ENAMETOOLONG;
}
The above checks that the path is shorter than the SUNPATHLEN constant. SUNPATHLEN is defined on FreeBSD, but apparently not on Linux.
Looking through the output of gcc -E source.c | grep -n4 sockaddr_un, the struct definition is as follows:
1724:struct sockaddr_un
1725- {
1726- sa_family_t sun_family;
1727- char sun_path[108];
1728- };
Here the length of the buffer is explicitly set to 108.
What is the general rule on Linux for checking whether this buffer would be truncated or overflowed?

Looking through the output of gcc -E source.c | grep -n4 sockaddr_un, the struct definition is as follows:
You don't have to (and shouldn't) trawl the source for this kind of information. You should be looking at user-facing documentation to determine the interface characteristics on which you can rely. In this case, you're looking for unix(7):
A UNIX domain socket address is represented in the following
structure:
struct sockaddr_un {
sa_family_t sun_family; /* AF_UNIX */
char sun_path[108]; /* Pathname */
};
The sun_family field always contains AF_UNIX. On Linux, sun_path is
108 bytes in size
(emphasis added).
What is a general rule to check for buffer being trimmed/overflowed in
Linux, for the case?
No macro is defined for it, but the capacity of the path buffer is explicitly documented as 108 bytes. You can (and probably should) define your own macro for this if you're going to perform tests related to it.
You could possibly do some variation on this to remove system dependencies:
static struct sockaddr_un dummy_sockaddr_un_;
#define MY_SUN_PATH_SIZE (sizeof(dummy_sockaddr_un_.sun_path))
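As a rough sketch of how such a macro could replace the SUNPATHLEN test from the question: the version below takes the size from the struct type itself, so no dummy variable is needed. The helper name fill_unix_addr is hypothetical and not part of any system API, and the check assumes the path should always be stored with its terminating NUL.

```c
#define _POSIX_C_SOURCE 200809L /* for strnlen() */
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Capacity of sun_path on this platform, read from the struct type.
 * sizeof does not evaluate its operand, so the null pointer is never used. */
#define MY_SUN_PATH_SIZE (sizeof(((struct sockaddr_un *)0)->sun_path))

/* Hypothetical helper: fill *addr from path. Returns 0 on success, or
 * -ENAMETOOLONG if path plus its terminating NUL does not fit. */
static int fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strnlen(path, MY_SUN_PATH_SIZE);

    if (len == MY_SUN_PATH_SIZE)
        return -ENAMETOOLONG;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1); /* len + 1 copies the NUL too */
    return 0;
}
```

Because the limit comes from sizeof, the same source should pick up whatever capacity each platform's header declares, rather than a hard-coded 108.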

Related

What are the semantics of structure padding/packing in the Linux kernel?

I am interested in the semantics of structure padding and packing, specifically in relation to the structures returned from the Linux kernel.
For example, if a program+stdlib is compiled so that structure padding doesn't take place, and a kernel is compiled so that structure padding does take place (which IIRC is the default for GCC anyway), surely the program cannot run, because the structures returned from the kernel are garbage from its point of view.
What if the compiler in question changed its padding semantics over time? Surely the same problem is likely to crop up. The structures defined in /usr/include/linux/* and /usr/include/asm-generic/* do not appear to be packed, so they depend on the compiler used and the alignment semantics of said compiler, right?
But I can take a binary compiled years ago on a different computer with different memory alignment requirements and presumably different padding semantics, and run it on my modern computer and it appears to work fine.
How does it not see garbage? Is this just pure luck? Do compiler authors (like say, TCC and the like) take care to copy GCC's structure padding semantics? How is this potential problem dealt with in the real world?
The structures defined in /usr/include/linux/* and
/usr/include/asm-generic/* do not appear to be packed, so they
depend on the compiler used and the alignment semantics of said
compiler, right?
That's not true, generally. Here is an example from GCC on 64-bit Ubuntu (/usr/include/x86_64-linux-gnu/asm/stat.h):
struct stat {
__kernel_ulong_t st_dev;
__kernel_ulong_t st_ino;
__kernel_ulong_t st_nlink;
unsigned int st_mode;
unsigned int st_uid;
unsigned int st_gid;
unsigned int __pad0;
__kernel_ulong_t st_rdev;
__kernel_long_t st_size;
__kernel_long_t st_blksize;
__kernel_long_t st_blocks; /* Number 512-byte blocks allocated. */
__kernel_ulong_t st_atime;
__kernel_ulong_t st_atime_nsec;
__kernel_ulong_t st_mtime;
__kernel_ulong_t st_mtime_nsec;
__kernel_ulong_t st_ctime;
__kernel_ulong_t st_ctime_nsec;
__kernel_long_t __unused[3];
};
See __pad0? int is generally 4 bytes, but st_rdev is a long, which is 8 bytes and so must be 8-byte aligned. It is preceded by three 8-byte fields and three ints (24 + 12 = 36 bytes), so an explicit 4-byte __pad0 is added to bring the offset up to 40.
Essentially, the implementation of stdlib takes care to hard-code its ABI.
BUT that isn't true for all APIs. Here is struct flock (from the same machine, /usr/include/asm-generic/fcntl.h) used by the fcntl() call:
struct flock {
short l_type;
short l_whence;
__kernel_off_t l_start;
__kernel_off_t l_len;
__kernel_pid_t l_pid;
__ARCH_FLOCK_PAD
};
As you can see, there is no explicit padding member between l_whence and l_start, so the layout relies on the compiler inserting that padding itself. And indeed, for the following C program, saved as abi.c:
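#include <fcntl.h>
#include <stdio.h>   /* SEEK_SET */
#include <string.h>
int main(void)
{
    struct flock fl;
    int fd;
    /* The file "y" must already exist in the current directory. */
    fd = open("y", O_RDWR);
    if (fd < 0)
        return 1;
    /* Fill with 0xff so bytes that are never assigned stand out. */
    memset(&fl, 0xff, sizeof(fl));
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 200;
    fl.l_len = 1;
    fcntl(fd, F_SETLK, &fl);
    return 0;
}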
We get:
$ cc -g -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=200, l_len=1}) = 0
+++ exited with 0 +++
$ cc -g -fpack-struct -o abi abi.c && strace -e fcntl ./abi
fcntl(3, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=4294967296, l_len=-4294967296}) = 0
+++ exited with 0 +++
As you can see, the fields following l_whence are indeed garbage.
Moreover, the C standard defines no ABI, so this fragile compatibility relies on implementations playing nice. struct stat above assumes that the compiler won't insert extra padding of its own.
ANSI C says:
There may also be unnamed padding at the end of a structure or union, as necessary to achieve the appropriate alignment were the structure or union to be a member of an array.
There's no wording on how padding may be inserted in the middle of a struct for reasons other than alignment, however there's also:
Implementation-defined behavior
Each implementation shall document its behavior in each of the areas listed in this section. The following are implementation-defined:
...
The padding and alignment of members of structures. This should present no problem unless binary data written by one implementation are read by another.
On my Ubuntu machine, both the compiler (GCC) and the standard library (glibc) come from the GNU project, so they interoperate smoothly. Clang wants to grow, so it's compatible with GNU libc. Everyone is just playing nice, most of the time.
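One way to make such layout assumptions visible is to check them at compile time. The sketch below uses two hypothetical structs shaped like the start of struct flock (they are not the kernel's definitions) and the GCC/Clang packed attribute, which is a compiler extension rather than standard C. In the packed variant the 64-bit member sits at offset 4 instead of its aligned offset, which is consistent with the shifted l_start and l_len values in the strace output above.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof, size_t */

/* Hypothetical struct laid out by the compiler's normal alignment rules. */
struct natural_layout {
    short type;
    short whence;
    long long start;
};

/* Same members, but packed (GCC/Clang extension): no padding is inserted. */
struct __attribute__((packed)) packed_layout {
    short type;
    short whence;
    long long start;
};

/* Packed: start follows the two shorts directly. */
static_assert(offsetof(struct packed_layout, start) == 2 * sizeof(short),
              "packed layout must not contain padding");

/* Natural: start sits on an alignment boundary for long long. */
static_assert(offsetof(struct natural_layout, start) % _Alignof(long long) == 0,
              "natural layout must keep start aligned");

/* Number of padding bytes the compiler inserted before start. */
static size_t natural_padding(void)
{
    return offsetof(struct natural_layout, start) - 2 * sizeof(short);
}
```

If either assumption is wrong for a given compiler or set of flags, the build fails instead of the program silently passing a mismatched struct to the kernel.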

Determine `OSTYPE` during runtime in C program

In a C program, I need to find the OSTYPE during runtime, on the basis of which I will do some operations.
Here is the code
#include <stdlib.h>
#include <string.h>
int main () {
const char * ostype = getenv("OSTYPE");
if (strcasecmp(ostype, /* insert os name */) == 0) ...
return 0;
}
But getenv returns NULL (and there is a segmentation fault). When I run echo $OSTYPE in the terminal it prints darwin15. But when I run env | grep OSTYPE nothing gets printed, which means it is not in the list of environment variables. To make it work on my local machine I can go to .bash_profile and export OSTYPE manually, but that doesn't solve the problem if I want to run a generated executable on a new machine.
Why is OSTYPE available in the terminal, but apparently not in the list of environment variables? How do I get around this?
For the crash, you should check if the return was NULL or not before using it in strcmp or any function. From man 3 getenv:
The getenv() function returns a pointer to the value in the
environment, or NULL if there is no match.
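A minimal sketch of that NULL check, assuming a three-way result is acceptable to the caller. The function name ostype_matches is hypothetical; strcasecmp is declared in <strings.h> on POSIX systems.

```c
#include <stdlib.h>  /* getenv */
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical helper.
 * Returns  1 if OSTYPE is set and equals expected (case-insensitive),
 *          0 if OSTYPE is set but different,
 *         -1 if OSTYPE is not in the environment at all. */
static int ostype_matches(const char *expected)
{
    const char *ostype = getenv("OSTYPE");

    if (ostype == NULL)
        return -1; /* never pass NULL to strcasecmp */

    return strcasecmp(ostype, expected) == 0;
}
```

This only avoids the crash; it still reports "not set" whenever the shell has not exported OSTYPE.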
If you're on a POSIX system (most Unixes and practically every Linux), I agree with Paul's comment about uname.
But you can actually check the OS type at compile time with the preprocessor (with #ifdefs); here's a similar question on SO: Determine OS during runtime
Edit: uname
Good point, Jonathan. man 2 uname on my Linux machine shows how to use it (and, being POSIX, macOS has the same header too):
SYNOPSIS
#include <sys/utsname.h>
int uname(struct utsname *buf);
DESCRIPTION
uname() returns system information in the structure pointed to by buf. The utsname struct is
defined in <sys/utsname.h>:
struct utsname {
char sysname[]; /* Operating system name (e.g., "Linux") */
char nodename[]; /* Name within "some implementation-defined
network" */
char release[]; /* Operating system release (e.g., "2.6.28") */
char version[]; /* Operating system version */
char machine[]; /* Hardware identifier */
#ifdef _GNU_SOURCE
char domainname[]; /* NIS or YP domain name */
#endif
};
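Putting the man page to use, a small sketch of querying the OS name at run time might look like this. The helper name get_os_name is hypothetical, and the exact strings returned in sysname (for example "Linux" or "Darwin") depend on the system.

```c
#include <stdio.h>       /* snprintf */
#include <sys/utsname.h> /* uname, struct utsname */

/* Hypothetical helper: copy the operating system name into buf.
 * Returns 0 on success, -1 if uname() fails or buf has no room. */
static int get_os_name(char *buf, size_t size)
{
    struct utsname u;

    if (size == 0 || uname(&u) < 0)
        return -1;

    /* snprintf truncates if needed and always NUL-terminates. */
    snprintf(buf, size, "%s", u.sysname);
    return 0;
}
```

Unlike OSTYPE, this does not depend on what the invoking shell chose to export.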

fstat: st_atime and st_mtime not a member?

I am doing an fstat on my file descriptor and dumping that into a struct stat. I read the documentation for fstat (link below) and it claims there are members st_atime and st_mtime.
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html
GCC lets me compile, but when stepping through with GDB, I cannot print those members (I can print every other member). GDB claims they don't exist.
In fact, when I print out the struct stat, st_atime is spelt st_atim (and likewise st_mtime is st_mtim). It looks like a tuple or something, because it holds two values, tv_sec and tv_nsec.
Does anyone know why GDB is claiming they don't exist?
Also, does anyone know how to pass it to memcpy? I am using C90.
This is the line of code it complains about saying I can't pass a time_t in here. How would I cast it to make this line work?
memcpy(&temp.otar_adate, file_statistics.st_atime, OTAR_DATE_SIZE);
MY OS: CentOS
On Linux, at least on certain versions, st_atime and the other time fields in struct stat are stored inside a struct timespec, which holds proper timestamps with full nanosecond precision. On those systems st_atime is a macro that expands to something else. On my CentOS machine it is defined as st_atim.tv_sec.
Throw your code into the preprocessor to see what it is on your system:
$ cat foo.c
#include <sys/stat.h>
void
foo(void)
{
struct stat st;
(void)st.st_atime;
}
$ cc -E foo.c | tail -7
void
foo(void)
{
struct stat st;
(void)st.st_atim.tv_sec;
}
GDB doesn't know about preprocessor defines (unless the program was built with macro debug information, e.g. gcc -g3), so it can't know how your code got preprocessed. It only knows about the real definition of the struct.
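As for the memcpy line in the question: memcpy needs the address of an object, not a time_t value. A C90-compatible sketch, assuming the destination field is meant to hold a raw time_t (the question does not show how otar_adate or OTAR_DATE_SIZE are defined), could look like this. The helper name copy_atime is hypothetical.

```c
#include <string.h>    /* memcpy */
#include <sys/types.h>
#include <sys/stat.h>  /* fstat, struct stat */
#include <time.h>      /* time_t */

/* Hypothetical helper: copy the access time of fd into dst as a raw time_t.
 * Returns 0 on success, -1 if dst is too small or fstat() fails. */
static int copy_atime(int fd, void *dst, size_t dst_size)
{
    struct stat st;
    time_t atime;

    if (dst_size < sizeof(atime))
        return -1;
    if (fstat(fd, &st) != 0)
        return -1;

    atime = st.st_atime; /* may expand to st.st_atim.tv_sec */
    memcpy(dst, &atime, sizeof(atime)); /* pass the address, not the value */
    return 0;
}
```

Going through a local time_t works whether st_atime is a real member or a macro, since either way the expression yields a value that can be assigned.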

readdir() 32/64 compatibility issues

I'm trying to get some old legacy code working on new 64-bit systems, and I'm currently stuck. Below is a small C file I'm using to test functionality that exists in the actual program that is currently breaking.
#define _POSIX_SOURCE
#include <dirent.h>
#include <sys/types.h>
#undef _POSIX_SOURCE
#include <stdio.h>
main(){
DIR *dirp;
struct dirent *dp;
char *const_dir;
const_dir = "/any/path/goes/here";
if(!(dirp = opendir(const_dir)))
perror("opendir() error");
else{
puts("contents of path:");
while(dp = readdir(dirp))
printf(" %s\n", dp->d_name);
closedir(dirp);
}
}
The Problem:
The OS is Red Hat 7.0 Maipo x86_64.
The legacy code is 32-bit, and must be kept that way.
I've gotten the compile for the program working fine using the -m32 flag with g++. The problem that arises is during runtime, readdir() gets a 64-bit inode and then throws an EOVERFLOW errno and of course nothing gets printed out.
I've tried using readdir64() in place of readdir() to some success. I no longer get the errno EOVERFLOW, and the lines come out on the terminal, but the files themselves don't get printed. I'm assuming this is due to the buffer not being what dirent expects.
I've attempted to use dirent64 to try to alleviate this problem but whenever I attempt this I get:
test.c:19:22 error: dereferencing pointer to incomplete type
printf(" %s\n", dp->d_name);
I'm wondering if there's a way to manually shift the dp->d_name buffer for dirent to be used with readdir(). I've noticed in GDB that using readdir() and dirent results in dp->d_name having directories listed at dp->d_name[1], whereas readdir64() with dirent gives the first directory at dp->d_name[8].
That or somehow get dirent64 to work, or maybe I'm just on the wrong path completely.
Lastly, it's worth noting that the program functions perfectly without the -m32 flag included, so I'm assuming it has to be a 32/64 compatibility error somewhere. Any help is appreciated.
Thanks to @Martin in the comments above I was led to try defining the dirent64 struct in my code. This works. There's probably a #define that can be used to circumvent pasting libc .h code into my own code, but this works for now.
The code I needed was found in <bits/dirent.h>
I guess I should also note that this makes it work using both readdir64() and dirent64
In order to get a 64-bit ino_t with GCC and Glibc, you need to define the features _XOPEN_SOURCE and _FILE_OFFSET_BITS=64.
$ echo '#include <dirent.h>' | gcc -m32 -E -D_XOPEN_SOURCE -D_FILE_OFFSET_BITS=64 - | grep ino
__extension__ typedef unsigned long int __ino_t;
__extension__ typedef __u_quad_t __ino64_t;
typedef __ino64_t ino_t;
__ino64_t d_ino;
I say this from documentation reading and checking the preprocessor, not from deep experience or testing with a filesystem with inode numbers above 2^32, so I don't guarantee that you won't run into other problems down the line.
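A sketch of the question's test program with those feature macros set in the source instead of on the command line. The value 700 for _XOPEN_SOURCE and the helper name list_dir are my own choices, and like the answer above this has not been tested against a filesystem with very large inode numbers.

```c
/* Feature macros must come before the first #include to take effect. */
#define _XOPEN_SOURCE 700
#define _FILE_OFFSET_BITS 64

#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print every entry in path.
 * Returns the number of entries printed, or -1 if opendir() fails. */
static int list_dir(const char *path)
{
    DIR *dirp = opendir(path);
    struct dirent *dp;
    int count = 0;

    if (dirp == NULL) {
        perror("opendir() error");
        return -1;
    }
    while ((dp = readdir(dirp)) != NULL) {
        printf(" %s\n", dp->d_name);
        count++;
    }
    closedir(dirp);
    return count;
}
```

With the macros in place, the plain readdir() and struct dirent names can stay as they are, so no copy of the dirent64 definition is needed in the program.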

uses undefined struct compile error - C

The compiler doesn't know where stat.h is?
Error:
c:\Projects\ADC_HCI\mongoose.c(745) : error C2079: 'st' uses undefined struct '_stat64'
#include <sys/types.h>
#include <sys/stat.h>
static int
mg_stat(const char *path, struct mgstat *stp)
{
struct _stat64 st; //<-- ERROR
int ok;
wchar_t wbuf[FILENAME_MAX];
to_unicode(path, wbuf, ARRAY_SIZE(wbuf));
if (_wstat64(wbuf, &st) == 0) {
ok = 0;
stp->size = st.st_size;
stp->mtime = st.st_mtime;
stp->is_directory = S_ISDIR(st.st_mode);
} else {
ok = -1;
}
return (ok);
}
...downloaded the files straight from the source.
See MSDN: _wstat64 takes a parameter of struct __stat64 (with two underscores). Redeclare your variable st to be of type struct __stat64.
Note that neither _stat64 nor __stat64 is 'standard' in the sense of documented by any standard, such as POSIX. You would normally use struct stat; if you are worried about whether that will work with big files (over 2 GiB), then check what compilation options are required on your platform to obtain 'large file support'. For 64-bit machines and 64-bit compilations (not necessarily Windows 64), you usually don't need to worry. You can often obtain large file support using:
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
These are at least semi-standardized. Systems such as autoconf detect these things automatically (if you ask them to do so).
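For comparison, a POSIX-only sketch of the same lookup using plain struct stat with large file support enabled. It does not apply to the Windows build in the question; struct file_info is a hypothetical stand-in for mongoose's struct mgstat, whose definition is not shown here.

```c
/* Must come before the first #include to take effect. */
#define _FILE_OFFSET_BITS 64

#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the fields the question copies out. */
struct file_info {
    long long size;
    long long mtime;
    int is_directory;
};

/* Returns 0 and fills *out on success, -1 if stat() fails. */
static int get_file_info(const char *path, struct file_info *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    out->size = (long long)st.st_size;
    out->mtime = (long long)st.st_mtime;
    out->is_directory = S_ISDIR(st.st_mode) ? 1 : 0;
    return 0;
}
```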
Change _stat64 to stat64. At least on my Linux machines that's the name of the structure. I don't know if it is different on Windows.
I suggest you sync to SVN trunk.
If you don't have SVN client, simply download two files:
http://mongoose.googlecode.com/svn/trunk/mongoose.h (and .c file too)
The reason is that recently the code was refactored, and CRT _stat function was substituted
with WinAPI one, GetFileAttributesExW().

Resources