How do I build newlib for size optimization? - arm

I'm building an arm-eabi-gcc toolchain with Newlib 2.5.0 as the target C library.
The target embedded system would prefer smaller code size over execution speed. How do I configure newlib to favour smaller code size?
The default build does things like produce a version of strstr that is over 1KB in code size.

There is fat in Newlib that can be addressed with Newlib-nano, which is already part of GCC ARM Embedded, as discussed here (Note the article is from 2014, so the information may be out-dated, but there appears to be Newlib-nano support in the current v6-2017 too).
It removes some features added after C89 that are rarely used in MCU based embedded systems, simplifies complex functions such as formatted I/O, and removes wide character support from non-wide character specific functions. Critically in respect to this question the default build is already size optimised (-Os).

Configure newlib like this:
CFLAGS_FOR_TARGET="-DPREFER_SIZE_OVER_SPEED=1 -Os" \
../newlib-2.5.0/configure
(where I've omitted the rest of the arguments I used for configure, they don't change based on this issue).
There isn't a configure flag, but the configure script reads certain variables from the environment. CFLAGS_FOR_TARGET means flags used when building for the target system.
Not to be confused with CFLAGS_FOR_BUILD , which are flags that would be used if the build system needed to make any auxiliary executables to execute on the build system to help with the build process.
I couldn't find any official documentation on this, but searching the source code, it contained many instances of testing for PREFER_SIZE_OVER_SPEED or __OPTIMIZE_SIZE__. Based on a quick grep, these two flags are almost identical. The only difference was a case in the printf family that if a null pointer is passed for %s, then the former will translate it to (null) but the latter bulls on ahead , probably causing a crash.

Related

How is the type sf_count_t in sndfile.h defined in libsndfile?

I am trying to work with Nyquist (a music programming platform, see: https://www.cs.cmu.edu/~music/nyquist/ or https://www.audacityteam.org/about/nyquist/) as a standalone program and it utilizes libsndfile (a library for reading and writing sound, see: http://www.mega-nerd.com/libsndfile/). I am doing this on an i686 GNU/Linux machine (Gentoo).
After successful set up and launching the program without errors, I tried to generate sound via one of the examples, "(play (osc 60))", and was met with this error:
*** Fatal error : sizeof (off_t) != sizeof (sf_count_t)
*** This means that libsndfile was not configured correctly.
Investigating this further (and emailing the author) has proved somewhat helpful, but the solution is still far from my grasp. The author recommended looking at /usr/include/sndfile.h to see how sf_count_t is defined, and (this portion of) my file is identical to his:
/* The following typedef is system specific and is defined when libsndfile is
** compiled. sf_count_t will be a 64 bit value when the underlying OS allows
** 64 bit file offsets.
** On windows, we need to allow the same header file to be compiler by both GCC
** and the Microsoft compiler.
*/
#if (defined (_MSCVER) || defined (_MSC_VER))
typedef __int64 sf_count_t ;
#define SF_COUNT_MAX 0x7fffffffffffffffi64
#else
typedef int64_t sf_count_t ;
#define SF_COUNT_MAX 0x7FFFFFFFFFFFFFFFLL
#endif
In the above the author notes there is no option for a "32 bit offset". I'm not sure how I would proceed. Here is the particular file the author of Nyquist recommend I investigate: https://github.com/erikd/libsndfile/blob/master/src/sndfile.h.in , and here is the entire source tree: https://github.com/erikd/libsndfile
Here are some relevant snippets from the authors email reply:
"I'm guessing sf_count_t must be showing up as 32-bit and you want
libsndfile to use 64-bit file offsets. I use nyquist/nylsf which is a
local copy of libsndfile sources -- it's more work keeping them up to
date (and so they probably aren't) but it's a lot easier to build and
test when you have a consistent library."
"I use CMake and nyquist/CMakeLists.txt to build nyquist."
"It may be that one 32-bit machines, the default sf_count_t is 32
bits, but I don't think Nyquist supports this option."
And here is the source code for Nyquist: http://svn.code.sf.net/p/nyquist/code/trunk/nyquist/
This problem is difficult for me to solve because it's composed of an niche use case of relatively obscure software. This also makes the support outlook for the problem a bit worrisome. I know a little C++, but I am far from confident in my ability to solve this. Thanks for reading and happy holidays to all. If you have any suggestions, even in terms of formatting or editing, please do not hesitate!
If you look at the sources for the bundled libsndfile in nyquist, i.e. nylsf, then you see that sndfile.h is provided directly. It defines sf_count_t as a 64-bit integer.
The libsndfile sources however do not have this file, rather they have a sndfile.h.in. This is an input file for autoconf, which is a tool that will generate the proper header file from this template. It has currently the following definition for sf_count_t for linux systems (and had it since a while):
typedef #TYPEOF_SF_COUNT_T# sf_count_t ;
The #TYPEOF_SF_COUNT_T# would be replaced by autoconf to generate a header with a working type for sf_count_t for the system that is going to be build for. The header file provided by nyquist is therefore already configured (presumably for the system of the author).
off_t is a type specified by the POSIX standard and defined in the system's libc. Its size on a system using the GNU C library is 32bit if the system is 32bit.
This causes the sanity check in question to fail, because the sizes of sf_count_t and off_t don't match. The error message is also correct, as we are using an unfittingly configured sndfile.h for the build.
As I see it you have the following options:
Ask the nyquist author to provide the unconfigured sndfile.h.in and to use autoconf to configure this file at build time.
Do not use the bundled libsndfile and link against the system's one. (This requires some knowledge and work to change the build scripts and header files, maybe additional unexpected issues)
If you are using the GNU C library (glibc): The preprocessor macro _FILE_OFFSET_BITS can be set to 64 to force the size of off_t and the rest of the file interface to use the 64bit versions even on 32bit systems.
This may or may not work depending on whether your system supports it and it is not a clean solution as there may be additional misconfiguration of libsndfile going unnoticed. This flag could also introduce other interface changes that the code relies on, causing further build or runtime errors/vulnerabilities.
Nonetheless, I think the syntax for cmake would be to add:
add_compile_definitions(_FILE_OFFSET_BITS=64)
or depending on cmake version:
add_definitions(-D_FILE_OFFSET_BITS=64)
in the appropriate CMakeLists.txt.
Actually the README in nyquist/nylsf explains how the files for it were generated. You may try to obtain the source code of the same libsndfile version it is based on and repeat the steps given to produce an nylsf configured to your system. It may cause less further problems than 2. and 3. because there wouldn't be any version/interface changes introduced.

What does __section( ) mean in linux kernel source

I see the following code in some OS kernel. But I don't understand the way __section is used, and don't know what does this code mean.
#define KEEP_PAGER(sym) \
extern const unsigned long ____keep_pager_##sym; \
const unsigned long ____keep_pager_##sym \
__section("__keep_meta_vars_pager") = (unsigned long)&sym
It's specific linux kernel C macro definition wrapped around a GCC extension, specifying an atttribute to use for an object. It's a shorter way of writing the section attribute definition
Historically the linux kernel has been written specifically for building with the GCC compiler, and makes extensive use of low level extensions to do specific hardware operations and optimisations.
The section attribute specifically is used to determine the storage location of the object tagged with it. ELF binary format arranges the object file into named sections, and using the attribute like this allows the programmer to more precisely specify where the information for the tagged object will be placed in the target object
Over the years, there's been work put in to increasing the compatibility of these compiler extensions between different compilers, as well as making linux compilable with alternative compilers (if you look at the linux header file where the macro is defined you'll see that it is full of conditional directives for various compiler features). Macros like this can be a useful way to have a portable internal API for low level features across different compiler implementations.
Kernel and kernel driver C code is atypically concerned with direct specifics of physical hardware implementation, and needs to be explicit about the compiler binary output in a way that application level C code rarely will.
One example of why the linux kernel uses named sections is in the init handling - functions and data that are only used during bootup are grouped into one section of memory that can be easily released once startup is complete - you may be familiar with the boot message along the lines of 'freeing unused kernel memory:...' towards the end of the linux boot sequence
It is hard to tell what that __section is exactly without its definition, but it might be a variable "section" attribute. It is used to make compiler place variable into a section different from "bss" or "data". See GCC documentation for details.

Dynamically change the running code by writing into the __FILE__?

I got to know of a way to print the source code of a running code in C using the __FILE__ macro. As such I can seek the location and use putchar() to alter the contents of the file.
Is it possible to dynamically change the running code using this method?
Is it possible to dynamically change the running code using this method ?
No, because once a program is compiled it no longer depends on the source file.
If you want learn how to alter the behavior of an process that is already running from within the process itself, you need to learn about assembly for the architecture you're using, the executable file format on your system, and the process API on your system, at the very least.
As most other answers are explaining, in practical terms, most C implementations are compilers. So the executable that is running has only an indirect (and delayed) relation with the source code, because the source code had to be processed by the compiler to produce that executable.
Remember that a programming language is (not a software but...) a specification, written in some report. Read n1570, draft specification of C11. Most implementations of C are command-line compilers (e.g. GCC & Clang/LLVM in the free software realm), even if you might find interpreters.
However, with some operating systems (notably POSIX ones, such as MacOSX and Linux), you could dynamically load some plugin. Or you could create, in some other way (such as JIT compilation libraries like libgccjit or LLVM or libjit or GNU lightning), a fresh function and dynamically get a pointer to it (and that is not stricto sensu conforming to the C standard, where a function pointer should point to some existing function of your program).
On Linux, you might generate (at runtime of your own program, linked with -rdynamic to have its names usable from plugins, and with -ldl library to get the dynamic loader) some C code in some temporary source file e.g. /tmp/gencode.c, run a compilation (using e.g. system(3) or popen(3)) of that emitted code as a /tmp/gencode.so plugin thru a command like e.g. gcc -O1 -g -Wall -fPIC -shared /tmp/gencode.c -o /tmp/gencode.so, then dynamically load that plugin using dlopen(3), find function pointers (from some conventional name) in that loaded plugin with dlsym(3), and call indirectly that function pointer. My manydl.c program shows that is possible for many hundred thousands of generated C files and loaded plugins. I'm using similar tricks in my GCC MELT. See also this and that. Notice that you don't really "self-modify" C code, you more broadly generate additional C code, compile it (as some plugin, etc...), and then load it -as an extension or plugin- then use it.
(for pragmatical reasons including ease of debugging, I don't recommend overwriting some existing C file, but just emitting new C code in some fresh temporary .c file -from some internal AST-like representation- that you would later feed to the compiler)
Is it possible to dynamically change the running code?
In general (at least on Linux and most POSIX systems), the machine code sits in a read-only code segment of the virtual address space so you cannot change or overwrite it; but you can use indirection thru function pointers (in your C code) to call newly loaded code (e.g. from dlopen-ed plugins).
However, you might also read about homoiconic languages, metaprogramming, multi-staged programming, and try to use Common Lisp (e.g. using its SBCL implementation, which compile to machine code at every REPL interaction and at every eval). I also recommend reading SICP (an excellent and freely available introduction to programming, with some chapters related to metaprogramming approaches)
PS. Dynamic loading of plugins is also possible in Windows -which I don't know- with LoadLibrary, but with a very different (and incompatible) model. Read Levine's linkers and loaders.
A computer doesn't understand the code as we do. It compiles or interprets it and loads into memory. Our modification of code is just changing the file. One needs to compile it and link it with other libraries and load it into memory.
ptrace() is a syscall used to inject code into a running program. You can probably look into that and achieve whatever you are trying to do.
Inject hello world in a running program. I have tried and tested this sometime before.

Combining source code into a single file for optimization

I was aiming at reducing the size of the executable for my C project and I have tried all compiler/linker options, which have helped to some extent. My code consists of a lot of separate files. My question was whether combining all source code into a single file will help with optimization that I desire? I read somewhere that a compiler will optimize better if it finds all code in a single file in place of separate multiple files. Is that true?
A compiler can indeed optimize better when it finds needed code in the same compilable (*.c) file. If your program is longer than 1000 lines or so, you'll probably regret putting all the code in one file, because doing so will make your program hard to maintain, but if shorter than 500 lines, you might try the one file, and see if it does not help.
The crucial consideration is how often code in one compilable file calls or otherwise uses objects (including functions) defined in another. If there are few transfers of control across this boundary, then erasing the boundary will not help performance appreciably. Therefore, when coding for performance, the key is to put tightly related code in the same file.
I like your question a great deal. It is the right kind of question to ask, in my view; and, though the complete answer is not simple enough to treat fully in a Stackexchange answer, your pursuit of the answer will teach you much. Though you may not yet realize it, your question really regards linking, a subject every advancing programmer eventually has to learn. Your question regards symbol tables, inlining, the in-place construction of return values and several, other, subtle factors.
At any rate, if your program is shorter than 500 lines or so, then you have little to lose by trying the single-file approach. If longer than 1000 lines, then a single file is not recommended.
It depends on the compiler. The Intel C++ Composer XE for example can automatically optimize over multiple files (when building using icc -fast *.c *.cpp or icl /fast *.c *.cpp, for linux/windows respectively).
When you use Microsoft Visual Studio, or a derived product (like Atmel Studio for microcontrollers), every single source file is compiled on its own (i. e. one cl, icl, or gcc command is issued for every c and cpp file in the project). This means no optimization.
For microcontroller projects I sometimes have to put everything in a single file in order make it even fit in the limited flash memory on the controller. If your compiler/IDE does it like visual studio, you can use a trick: Select all the source files and make them not participate in the build process (but leave them in the project), then create a file (I always use whole_program.c, and #include every single source (i.e. non-header) file in it (note that including c files is frowned upon by many high level programmers, but sometimes, you have to do it the dirty way, and with microcontrollers, that's actually more often than not).
My experience has been that with gnu/gcc the optimization is within the single file plus includes to create a single object. With clang/llvm it is quite easy and I recommend, DO NOT optimize the clang step, use clang to get from C to bytecode, the use llvm-link to link all of your bytecode modules into one bytecode module, then you can optimize the whole project, all source files optimized together, the llc adds more optimization as it heads for the target. Your best results are to tell clang using the something triple command line option what your ultimate target is. For the gnu path to do the same thing either use includes to make one big file compiled to one object, or if there is a machine code level optimizer other than a few things the linker does, then that is where it would have to happen. maybe gnu has an exposed ir file format, optimizer, and ir to target tool, but I think I would have seen that by now.
http://github.com/dwelch67 a number of my projects, although very simple programs, have llvm and gnu builds for the same source files, you can see where the llvm builds I make a binary from unoptimized bytecode and also optimized bytecode (llvm's optimizer has problems with small while loops and sometimes generates non-working code, a very quick check to see if it is you or them is to try the non-optimized llvm binary and the gnu binary to see if they all behave the same (you) or if only the optimized llvm doesnt work (them)).

Delphi dcu to obj

Is there a way to convert a Delphi .dcu file to an .obj file so that it can be linked using a compiler like GCC? I've not used Delphi for a couple of years but would like to use if for a project again if this is possible.
Delphi can output .obj files, but they are in a 32-bit variant of Intel OMF. GCC, on the other hand, works with ELF (Linux, most Unixes), COFF (on Windows) or Mach-O (Mac).
But that alone is not enough. It's hard to write much code without using the runtime library, and the implementation of the runtime library will be dependent on low-level details of the compiler and linker architecture, for things like correct order of initialization.
Moreover, there's more to compatibility than just the object file format; code on Linux, in particular, needs to be position-independent, which means it can't use absolute values to reference global symbols, but rather must index all its global data from a register or relative to the instruction pointer, so that the code can be relocated in memory without rewriting references.
DCU files are a serialization of the Delphi symbol tables and code generated for each proc, and are thus highly dependent on the implementation details of the compiler, which changes from one version to the next.
All this is to say that it's unlikely that you'd be able to get much Delphi (dcc32) code linking into a GNU environment, unless you restricted yourself to the absolute minimum of non-managed data types (no strings, no interfaces) and procedural code (no classes, no initialization section, no data that needs initialization, etc.)
(answer to various FPC remarks, but I need more room)
For a good understanding, you have to know that a delphi .dcu translates to two differernt FPC files, .ppu file with the mentioned symtable stuff, which includes non linkable code like inline functions and generic definitions and a .o which is mingw compatible (COFF) on Windows. Cygwin is mingw compatible too on linking level (but runtime is different and scary). Anyway, mingw32/64 is our reference gcc on Windows.
The PPU has a similar version problem as Delphi's DCU, probably for the same reasons. The ppu format is different nearly every major release. (so 2.0, 2.2, 2.4), and changes typically 2-3 times an year in the trunk
So while FPC on Windows uses own assemblers and linkers, the .o's it generates are still compatible with mingw32 In general FPC's output is very gcc compatible, and it is often possible to link in gcc static libs directly, allowing e.g. mysql and postgres linklibs to be linked into apps with a suitable license. (like e.g. GPL) On 64-bit they should be compatible too, but this is probably less tested than win32.
The textmode IDE even links in the entire GDB debugger in library form. GDB is one of the main reasons for gcc compatibility on Windows.
While Barry's points about the runtime in general hold for FPC too, it might be slightly easier to work around this. It might only require calling certain functions to initialize the FPC rtl from your startup code, and similarly for the finalize. Compile a minimal FPC program with -al and see the resulting assembler (in the .s file, most notably initializeunits and finalizeunits) Moreover the RTL is more flexible and probably more easily cut down to a minimum.
Of course as soon as you also require exceptions to work across gcc<->fpc bounderies you are out of luck. FPC does not use SEH, or any scheme compatible with anything else ATM. (contrary to Delphi, which uses SEH, which at least in theory should give you an advantage there, Barry?) OTOH, gcc might use its own libunwind instead of SEH.
Note that the default calling convention of FPC on x86 is Delphi compatible register, so you might need to insert proper cdecl (which should be gcc compatible) modifiers, or even can set it for entire units at a time using {$calling cdecl}
On *nix this is bog standard (e.g. apache modules), I don't know many people that do this on win32 though.
About compatibility; FPC can compile packages like Indy, Teechart, Zeos, ICS, Synapse, VST
and reams more with little or no mods. The dialect levels of released versions are a mix of D7 and up, with the focus on D7. The dialect level is slowly creeping to D2006 level in trunk versions. (with for in, class abstract etc)
Yes. Have a look at the Project Options dialog box:
(High-Res)
As far as I am aware, Delphi only supports the OMF object file format. You may want to try an object format converter such as Agner Fog's.
Since the DCU format is proprietary and has a tendency of changing from one version of Delphi to the next, there's probably no reliable way to convert a DCU to an OBJ. Your best bet is to build them in OBJ format in the first place, as per Andreas's answer.

Resources