Useful GCC flags to improve security of your programs?

By pure chance I stumbled across an article mentioning that you can "enable" ASLR with -pie -fPIE (or, rather, make your application ASLR-aware). -fstack-protector is also commonly recommended (though I rarely see explanations of how it works and which kinds of attacks it protects against).
Is there a list of useful options, with explanations of how they increase security?
...
And how useful are such measures anyway, when your application uses about 30 libraries that use none of those? ;)

Hardened Gentoo uses these flags:
CFLAGS="-fPIE -fstack-protector-all -D_FORTIFY_SOURCE=2"
LDFLAGS="-Wl,-z,now -Wl,-z,relro"
I saw about a 5-10% performance drop compared to an optimized Gentoo Linux (including PaX/SELinux and other measures, not just CFLAGS) in the default Phoronix benchmark suite.
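For reference, applying the same hardening flags to a single compile might look like this (a sketch; note that -D_FORTIFY_SOURCE=2 only takes effect with optimization enabled, hence the -O2):
gcc -O2 -fPIE -pie -fstack-protector-all -D_FORTIFY_SOURCE=2 \
    -Wl,-z,relro -Wl,-z,now -o app app.c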

As for your final question:
And how useful are such measures anyway, when your application uses about 30 libraries that use none of those? ;)
PIE is only necessary for the main program to be able to be loaded at a random address. ASLR always works for shared libraries, so the benefit of PIE is the same whether you're using one shared library or 100.
Stack protector will only benefit the code that's compiled with stack protector, so using it just in your main program will not help if your libraries are full of vulnerabilities.
In any case, I would encourage you not to consider these options part of your application, but instead part of the whole system integration. If you're using 30+ libraries (probably most of which are junk when it comes to code quality and security) in a program that will be interfacing with untrusted, potentially-malicious data, it would be a good idea to build your whole system with stack protector and other security hardening options.
Do keep in mind, however, that the highest levels of _FORTIFY_SOURCE, and perhaps some other new security options, break valid things that legitimate, correct programs may need to do, so you may need to analyze whether it's safe to use them. One known-dangerous thing that one of the options does (I forget which) is making the %n specifier to printf not work, at least in certain cases. If an application uses %n to get an offset into a generated string, needs that offset to write into the string later, and the value isn't filled in, that's a potential vulnerability in itself...
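A minimal sketch of the pattern being described: with -D_FORTIFY_SOURCE=2 (and optimization on), glibc aborts when %n appears in a format string stored in writable memory, so code like the following stops working even though it is well-defined C:
#include <stdio.h>
#include <string.h>

int main(void) {
    char fmt[32];
    char buf[64];
    int offset = 0;
    strcpy(fmt, "id=%d%n: ");   /* format string now lives in writable memory */
    snprintf(buf, sizeof buf, fmt, 1234, &offset);
    /* later code would write into buf starting at buf + offset */
    return 0;
}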

The Hardening page on the Debian wiki explains at least the most common ones usable on Linux. Missing from your list are at least -D_FORTIFY_SOURCE=2, -Wformat, -Wformat-security, and, for the dynamic loader, the relro and now features.

Related

GCC preprocessor directives for Arch Linux

Does GCC (or alternatively Clang) define any macro when compiling for the Arch Linux OS?
I need to check that my software restricts itself from compiling under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
Does anyone know how to guarantee through GCC preprocessor directives that my binaries are only compilable under Arch Linux?
Of course I can always
#ifdef __linux__
...
#endif
But this is not precise enough.
Edit: This must be done through C source code and not by any building systems, so, for example, doing this through CMake is completely discarded.
Edit 2: Users faking this behaviour is not a problem since the software is distributed to selected clients and thus, actively trying to "misuse" our source code is "their decision".
Does GCC (or alternatively Clang) define any macro when compiling for the Arch Linux OS?
No, because there's nothing inherently specific to Arch Linux at the binary level. For what it's worth, when compiling, the only things you (and the compiler) have to care about are the target architecture (i.e. what kind of CPU the code is going to run on), data type sizes and alignments, and function calling conventions.
Then later on, when it's time to link the compiled translation unit objects into the final binary executable, the runtime libraries also become a concern. Without taking special precautions you're essentially locking yourself into the specific brand of runtime libraries (glibc vs. e.g. musl; libstdc++ vs. libc++) pulled in by the linker.
One can easily sidestep the latter problem by linking statically, but that limits the range of system and mid-level APIs available to the program. For example, on Linux a naively statically linked program can't use graphics acceleration APIs like OpenGL 3.x or Vulkan, since those rely on loading components of the GPU drivers into the process. It can however still use X11 and indirect GLX OpenGL, since those work via wire protocols going over sockets, which are implemented with direct syscalls to the kernel.
And these kernel syscalls are exactly the same at the binary level for each and every Linux kernel of every distribution out there. Although inside the kernel there's a lot of leeway when it comes to redefining interfaces, the interfaces toward userland (i.e. regular programs) are covered by the holy, dogmatic, ironclad rule that YOU NEVER BREAK USERLAND! Kernel developers who break this rule, intentionally or not, are chewed out publicly by Linus Torvalds in his infamous rants.
The bottom line is that there is no such thing as a "Linux distribution specific identifier at the binary level". At the end of the day, a Linux distribution is just that: a distribution of stuff. Someone (or several someones) decided on a set of files that make up a working Linux system, wrapped them up somehow and slapped a name on them. That's it. There's nothing inherently specific to "Arch" Linux other than that it's called "Arch" and (for the time being) relies on the pacman package manager. Everything else about "Arch", or any other Linux distribution, is a matter of happenstance.
If you really want to sort different Linux distributions into certain bins regarding binary compatibility, then you'd have to pigeonhole the combinations of
Minimum required set of supported syscalls. This translates into a minimum required kernel version.
Which libc variant is being used, and potentially which version; although it's perfectly possible to link against a minimal set of functions that has been around almost "forever" (a small probe for these first two bins is sketched after this list).
Which variant of the C++ standard library the distribution decided upon. This also affects programs that might appear to be pure C, because certain system-level libraries (*cough* Mesa *cough*) will internally pull in a lot of C++ infrastructure (even compilers), also triggering other "fun" problems¹
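For illustration, those first two bins can be probed at runtime with a handful of calls; a minimal sketch, assuming glibc (gnu_get_libc_version() is a glibc-specific extension):
#include <stdio.h>
#include <sys/utsname.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

int main(void) {
    struct utsname u;
    if (uname(&u) == 0)
        printf("kernel: %s %s\n", u.sysname, u.release);   /* kernel version bin */
#ifdef __GLIBC__
    printf("libc:   glibc %s\n", gnu_get_libc_version());  /* libc variant bin */
#else
    printf("libc:   not glibc (musl, etc.)\n");
#endif
    return 0;
}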
I need to check that my software restricts itself from running under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
You couldn't find resources on the Internet because there's nothing specific at the binary level that makes "Arch" Arch. For what it's worth, right now, this instant, I could create a fork of Arch, change out its choice of default XDG skeleton (so that by default user directories are populated with subdirs called leech, flicks, beats, pics) and call it "l33tz" Linux. For all intents and purposes it's no longer Arch. It does behave significantly differently from the default Arch behavior, which would also be of concern to you if you relied on any specific thing, however minute.
Your employer doesn't seem to understand what Linux is or what distinguishes distributions from each other.
Hint: it's not binary compatibility. As a matter of fact, as long as you stay within the boring old realm of boring old glibc + libstdc++, Linux distributions are shockingly compatible with each other. There might be slight differences in where they put libraries other than libc.so, libdl.so and ld-linux[-${arch}].so, but those can usually be found under /lib. And once ld-linux[-${arch}].so and libdl.so take over (that means pulling in all libraries loaded at runtime), all the specifics of where shared objects and libraries are to be found are abstracted away by the dynamic linker.
1: like becoming multithreaded only after global constructors were executed, with libstdc++ deciding it wants to be single-threaded because libpthread wasn't linked into a program that didn't create a single thread on its own. That was a really weird bug I unearthed, but yshui finally understood it: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3199
You can list the predefined preprocessor macros with
gcc -dM -E - < /dev/null
clang -dM -E - < /dev/null
None of those indicate what operating system the compiler is running under. So not only can you not tell whether the program is compiled under Arch Linux, you can't even tell whether the program is compiled under Linux. The macros __linux__ and friends indicate that the program is being compiled for Linux. They are defined when cross-compiling from another system to Linux, and not defined when cross-compiling from Linux to another system.
You can artificially make your program more difficult to compile by specifying absolute paths for system headers and relying on non-portable headers (e.g. /usr/include/bits/foo.h). That can make cross-compilation or compilation for anything other than Linux practically impossible without modifying the source code. However, most Linux distributions install headers in the same location, so you're unlikely to pinpoint a specific distribution.
You're very likely asking the wrong question. Instead of asking how to restrict compilation to Arch Linux, start from why you want to restrict compilation to Arch Linux. If the answer is “because the resulting program wouldn't be what I want under another distribution”, then start from there and make sure that the difference results in a compilation error rather than incorrect execution. If the answer to “why” is something else, then you're probably looking for a technical solution to a social problem, and that rarely ends well.
No, it doesn't. And even if it did, it wouldn't stop anyone from compiling the code on an Arch Linux distro and then running it on a different Linux.
If you need to prevent your software "from running under anything but Arch Linux", you'll need to insert a run-time check. Although, to be honest, I have no idea what that check might consist of, since Linux distros are not monolithic products. The actual check would probably have to do with your reasons for imposing the restriction.
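For what it's worth, one crude run-time check (trivially spoofable, and only as meaningful as the file it reads) is to look for ID=arch in /etc/os-release; a minimal sketch, assuming that systemd-style file is present:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if /etc/os-release reports ID=arch, 0 otherwise. */
static int running_on_arch(void) {
    char line[256];
    int found = 0;
    FILE *f = fopen("/etc/os-release", "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        if (strcmp(line, "ID=arch\n") == 0) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}

int main(void) {
    if (!running_on_arch()) {
        fprintf(stderr, "this program only supports Arch Linux\n");
        return EXIT_FAILURE;
    }
    puts("Arch detected");
    return 0;
}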

Will it be feasible to use gcc's function multi-versioning without code changes?

According to most benchmarks, Intel's Clear Linux is way faster than other distributions, mostly thanks to a GCC feature called Function Multi-Versioning (FMV). Right now the method they use is to compile the code, analyze which functions contain vectorized loops, then patch the code with FMV attributes and compile it again.
How feasible will it be for GCC to do it automatically? For example, by passing -mmultiarch=sandybridge,skylake (or a similar -m option listing CPU extensions like AVX and AVX2).
Right now I'm interested in two usage scenarios:
Use this option for our large math-heavy program for delivering releases to our customers. I don't want to pollute the code with non-standard attributes and I don't want to modify the third-party libraries we use.
The other Linux distributions will be able to do this easily, without patching the code as Intel does. This should give all Linux users massive performance gains.
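For context, the attribute-based FMV that Clear Linux patches in looks roughly like this; a sketch using GCC's target_clones extension (GCC 6+), with saxpy as a made-up example function:
/* The compiler emits one clone of the function per listed target and
 * dispatches to the best one at load time via an ifunc resolver. */
__attribute__((target_clones("avx2", "avx", "default")))
void saxpy(float a, const float *x, float *y, int n)
{
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}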
No, but it doesn't matter. There's very, very little code that will actually benefit from this; for the most part, doing it globally will just (without special effort to sort matching versions into pages together) make your system much more memory-constrained and slower due to the huge increase in code size. Most actual workloads aren't even CPU-bound; they're syscall-overhead-bound, GPU-bound, IO-bound, etc. And many of the modern ones that are CPU-bound aren't running precompiled code but JIT'd code (i.e. everything running in a browser, whether that's your real browser or the outdated and unpatched fork of Chrome in every Electron app).

Disable __thread support

I'm implementing a pthread replacement library which is extremely lightweight. There are several reasons why I want to disable __thread completely.
It's a waste of memory. If I'm creating a thousand threads that have nothing to do with the context that declares a variable with __thread, the program will still have allocated 1000 times the size of that data in bytes and never use it. It's simply not memory-compatible with the mass-concurrency model. If we need extremely lightweight fibers with only 8K of stack, a TLS block of just 4K would be an overhead of 50% of the memory used by each thread. In some scenarios the TLS overhead would be enormous.
TLS is a complex standard and I simply don't have the time/resources to support it. It's too expensive. Personally I think the standard is poorly designed. It should have defined standard functions to be provided by the linker so that thread libraries can take control over where TLS allocation takes place and insert the relevant offsets and addresses they require. Also, the standard ELF implementation has been infected with pthread, expecting pthread-sized structs to calculate offsets, making it really hard to adapt to something else.
It's just a bad pattern. It encourages using globals and creating functions with static state and side effects. This is not a territory we want to be in if we're creating correct programs that are easy to analyze.
If we really need "thread context" for some magic that tracks thread state behind the scenes (like for allocation or cancellation tracking), why not just expose the magic that TLS uses to find that context in the first place? Personally I'm just going to use the %fs register directly. This would not be possible in libraries for obvious reasons, but why should they be thread-aware to begin with? Why not just design them correctly so they get the context-related data they need right in the argument list?
My question is simply: What is the simplest way to disable __thread support and make clang emit errors if you accidentally used it? How can I get errors if I load a dynamic library which happens to require TLS?
I believe the simplest way is adding something like this unconditionally to your CFLAGS (perhaps from clang's equivalent of the gcc specfile if you want it to be system-global):
-D__thread='^-^'
where the righthand side can be anything that's syntactically invalid (a constraint violation) at any point in a C program.
As for preventing loading of libraries with TLS, you'd have to patch the linker and/or dynamic linker to reject them. If you're just talking about dlopen, your program could first read the file and parse the ELF headers for TLS relocations, then reject the library (without passing it to dlopen) if it has any. This might even be possible with an LD_PRELOAD wrapper.
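A minimal sketch of that pre-dlopen check: a shared object that uses TLS carries a PT_TLS program header, which is easy to find without any ELF library (assuming a 64-bit ELF and skipping error hardening):
#include <elf.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the ELF file at path has a TLS segment, 0 if not, -1 on error. */
int has_tls_segment(const char *path) {
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int found = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (read(fd, &eh, sizeof eh) != sizeof eh) {
        close(fd);
        return -1;
    }
    for (Elf64_Half i = 0; i < eh.e_phnum; i++) {
        if (pread(fd, &ph, sizeof ph,
                  eh.e_phoff + (off_t)i * eh.e_phentsize) != sizeof ph)
            break;
        if (ph.p_type == PT_TLS) {   /* TLS template segment present */
            found = 1;
            break;
        }
    }
    close(fd);
    return found;
}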
I agree with you that, especially in its current implementation, TLS is something whose use should generally be avoided, but may I ask if you've measured the costs? I think stamping it out completely will be fairly difficult on a system that's designed to use it, and there's much lower-hanging fruit for cutting bloat. Which libc are you using? If it's glibc, I'm pretty sure it has lots of TLS it uses internally these days... Of course if you're writing your own threads implementation, that's going to entail a lot of interaction with the rest of the standard library, so perhaps you're already patching it out...?
By the way (shameless plug follows), we have an extremely light-weight threads implementation in musl libc which presently does not have TLS. I don't think it would be easy to integrate with another libc (and I'm sure if you're writing your own you will find it difficult to integrate with glibc, especially glibc's dynamic linker, which expects TLS to be supported) but if you can use the whole library as-is, it may meet your needs for specific projects, or have useful code you can borrow (license is MIT).

Detecting CPU capability without assembly

I've been looking at ways to determine the CPU and its capabilities (e.g. SSE, SSE2, etc.).
However, all the ways I've found involve assembly code using the cpuid instruction. Given the differing ways of doing assembly in C/C++ between compilers and even targets (no inline assembly for 64-bit targets under VC), I'd rather avoid that.
Is there some simple library around, or OS functions (for both Windows and Linux), for getting this information?
Currently I'm only interested in platforms using x86 and x86-64 CPUs, and I definitely need to support at least AMD and Intel.
EDIT: At first, I googled for "libcpuid", remembering a conversation with peer programmers, and the Google search pointed me to this SourceForge page, which looks quite outdated. Then I noticed it was the project page of "libcpu".
Actually, there really is a libcpuid that might suit your needs.
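For what it's worth, on GCC and Clang you can also skip both assembly and third-party libraries entirely: the compilers expose CPUID through builtins. A sketch (x86/x86-64 only; __builtin_cpu_supports needs GCC 4.8+ or a reasonably recent Clang):
#include <stdio.h>

int main(void) {
    __builtin_cpu_init();   /* required before the queries on older GCC */
    printf("SSE2: %d\n", __builtin_cpu_supports("sse2") != 0);
    printf("AVX:  %d\n", __builtin_cpu_supports("avx") != 0);
    printf("AVX2: %d\n", __builtin_cpu_supports("avx2") != 0);
    return 0;
}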
Last time I looked for such a library, I found crisscross. It's C++ though.
There is also geekinfo.
But I don't know a cross-platform library that just does cpuid.
Under Linux, peek in /proc/cpuinfo.
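For example (the flags line lists the capability keywords, which may vary slightly between kernels):
grep -m1 '^flags' /proc/cpuinfo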
This is so hardware-dependent (an aspect the OS usually shields you from) that requiring it to be cross-platform is a tad hard.
It would be useful to have, but I fear you may just have to enumerate the cross-product of CPUs and features you would like to support, times the number of OSes (here two), and write tests for all. I do not think it has been done, and someone will surely correct me if I overlooked something.
That would be a useful library to have, so good luck with the endeavor!

Minimum amount of information necessary to do (dynamic) linking?

This is a question I've come across repeatedly, usually concerning plug-ins, but recently I came across it trying to hammer out some build system issues. My concern is primarily with *nix-based systems, but I suppose it applies to Windows as well.
The question is: what is the minimum amount of information necessary to do dynamic linking? I know Linux distributions like Debian simply say 'i686', which is apparently enough. However, I suppose there is some implicit information here, and I probably won't be able to dynamically link against just any shared object merely because it was compiled with -march=i686, will I?
So what must match for me to be able to load a shared object successfully? I know that for C++ even the compiler (and sometimes the version) must match due to name mangling, but I was kind of hoping this wasn't the case for C.
Any thoughts appreciated.
Edit:
Neil's answer made me realize I'm not really talking about dynamic linking, or rather, the question is two-fold,
what's needed for static linking, and
what's needed for dynamic linking
I have higher hopes for the first I guess.
Well, at minimum, the code must have been compiled for the same processor family, and you need to know the names of the library and the function. On top of that, you need the same ABI. You should be aware that, despite what people think, the C standard does not specify an ABI, and it is entirely possible for two C compilers (or versions of the same compiler) to adhere to the standard, run on the same platform, and yet have different ABIs.
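To make the "names of the library and the function" point concrete, run-time dynamic loading on *nix needs exactly those two strings plus a function type both sides agree on (that agreement being the ABI). A sketch with a hypothetical libfoo.so exporting int foo(int); link with -ldl on older glibc:
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *h = dlopen("libfoo.so", RTLD_NOW);   /* library name */
    if (!h) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    int (*foo)(int) = (int (*)(int))dlsym(h, "foo");   /* function name */
    if (foo)
        printf("foo(21) = %d\n", foo(21));   /* agreed-on prototype = ABI */
    dlclose(h);
    return 0;
}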
As for exactly specifying architecture details - I must admit I've never done it. Are you planning on distributing binary libraries on different Linux variants?
