I've tried passing -ffunction-sections -fdata-sections for the compiler, but that doesn't seem to have the desired effect. As far as I understand, I also have to pass -Wl,--gc-sections to the linker, but I'm not linking the files at this point. I just want to have a .a library file as small as possible, with minimal redundant code/data.
The compiler performs optimization based on the knowledge it has of the program. Optimization levels -O2 and above, in particular, enable unit-at-a-time mode, which allows the compiler to consider information gained from later functions in the file when compiling a function. Compiling multiple files at once to a single output file in unit-at-a-time mode allows the compiler to use information gained from all of the files when compiling each of them.
Not all optimizations are controlled directly by a flag.
-ffunction-sections
-fdata-sections
Place each function or data item into its own section in the output file if the target supports arbitrary sections. The name of the function or the name of the data item determines the section's name in the output file.
Use these options on systems where the linker can perform optimizations to improve locality of reference in the instruction space. Most systems using the ELF object format and SPARC processors running Solaris 2 have linkers with such optimizations. AIX may have these optimizations in the future.
Only use these options when there are significant benefits from doing so. When you specify these options, the assembler and linker will create larger object and executable files and will also be slower. You will not be able to use gprof on all systems if you specify this option and you may have problems with debugging if you specify both this option and -g.
U can use the following link for more details..:-)
http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Optimize-Options.html
The following will reduce the size of your compiled objects (and thus the static library)
-Os -g0 -fvisibility=hidden -fomit-frame-pointer -mno-accumulate-outgoing-args -finline-small-functions -fno-unwind-tables -fno-asynchronous-unwind-tables -s
The following will increase the size of objects (though they may make the binary smaller)
-ffunction-sections -fdata-sections -flto -g -g1 -g2 -g3
The only way to really make a static library smaller is to remove the unneeded code by compiling the static library with -ffunction-sections -fdata-sections and linking the final product with -Wl,--gc-sections,--print-gc-sections to find out what parts are cruft. Then go back and remove those functions/variables (this also works for making smaller shared libraries - see http://libraryopt.sf.net)
Compare the size of the library if you -g in the compiler flags, vs not having it. I can easily imagine you double the size of a static library if it includes debug information in. If that was what you saw, chances are you already stripped the library of debug symbols, and hence cannot significantly shrink the library file more. This could explain your memory of cutting the size in half at some point.
Related
I am working on a task for the university, there is a webiste that checks my memory usage and it compiles the .c files with:
/usr/bin/gcc -DEVAL -std=c11 -O2 -pipe -static -s -o program programname.c -lm
and it says my program exceeds the memory limits of 4 Mib which is a lot i think. I was told this command makes it use more memory that the standard compilation I use on my pc, like this:
gcc myprog.c -o myprog
I launched the executable created by this one compilation with:
/usr/bin/time -v ./myprog
and under "maximum resident set size" it says 1708 kilobytes, which should be 1,6 Mibs. So how can it be that for the university checker my program goes over 4 Mibs? I have eliminated all the possible mallocs i have, I just left the essential ones but it still says it goes over the limit, what else should I improve? I'm almost thinking the wesite has an error or something...
From GNU GCC Manual, Page 197:
-static On systems that support dynamic linking, this overrides ‘-pie’ and prevents linking with the shared libraries. On other systems, this
option has no effect.
If you don't know about the pie flag quoted here, have a look at this section:
-pie Produce a dynamically linked position independent executable on targets that support it. For predictable results, you must also
specify the same set of options used for compilation (‘-fpie’,
‘-fPIE’, or model suboptions) when you specify this linker option.
To answer your question: yes is it possible this overhead generated by the static flag, because in that case, the compiler can not do the basic optimization by merging stdlib's code with the one you've produced.
As it was suggested in the comments you shall compile your code with the same flag of the website to have an idea of the real overhead of your program (be sure that your gcc version is the same of the website) and also you shall do some common manual optimization such constant folding, function inlining etc. A good reference to these optimizations could be this one
There is a vendor whose software I'd like to work with. They have a code base which they can only compile using IAR Embedded Workbench (as far as I know, their code does not compile with GCC). Unfortunately their hardware only works with their software stack, so I don't really have a choice about whether or not I'd like to use it. They distribute this code as a .a static library file (and accompanying headers) compiled for the ARM Cortex-M4 CPU. (They don't want to distribute sources.) For the sake of this discussion, let's call it evil_sw_stack.a.
I'd like to use this piece of code but I don't have an IAR license and have zero expertise with IAR. I'd like to use GCC.
Is there a way to make IAR produce such a static library that GCC can link to? What kind of compiler option would the vendor need to use to produce such a binary?
(I would guess that the ABI of the resulting binary can be somehow specified and set to a setting which statisfies GCC. )
Example usage of GCC
Their default software stack is very GCC-friendly, this specific one is the only one in their offering which isn't. Generally, I can compile a simple piece of example code if I have the following:
startup_(devicename).S: GCC-specific assembly file
system_(devicename).c
(devicename).ld: linker script
Some header files for the specific device
For example, I can compile a simple piece of example like this:
$ arm-none-eabi-gcc helloworld.c startup_(devicename).S system_(devicename).c -T (devicename).ld -o helloworld -D(devicename) -I. -fno-builtin -ffunction-sections -fdata-sections -mfpu=fpv4-sp-d16 -mfloat-abi=softfp -mcpu=cortex-m4 -mthumb -mno-sched-prolog -Wl,--start-group -lgcc -lc -lnosys -Wl,--end-group
So far, so good. No warnings, no errors.
How I try to use the static library
For the sake of this discussion, let's call it evil_sw_stack.a.
This is how I attempted to use it:
$ arm-none-eabi-gcc evil_sw_stack.a helloworld.c startup_(devicename).S system_(devicename).c -T (devicename).ld -o helloworld -D(devicename) -I. -fno-builtin -ffunction-sections -fdata-sections -mfpu=fpv4-sp-d16 -mfloat-abi=softfp -mcpu=cortex-m4 -mthumb -mno-sched-prolog -Wl,--start-group -lgcc -lc -lnosys -Wl,--end-group
Unfortunately this complains about multiple definitions of a bunch of functions that are defined in system_(devicename).c. Maybe they accidentally compiled that into this library? Or maybe IAR just compiled it this way? Now, if I try to remove system_(devicename).c from the GCC command line and simply link to the .a file, I get these errors:
/usr/lib/gcc/arm-none-eabi/5.2.0/../../../../arm-none-eabi/bin/ld: warning: thelibrary.a(startup_chipname.o) uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
undefined reference to `__iar_program_start'
undefined reference to `CSTACK$$Limit'
undefined reference to `__iar_program_start'
Poking the file with readelf gets me nowhere:
$ readelf -h evil_sw_stack.a
readelf: Error: evil_sw_stack.a: did not find a valid archive header
Interestingly though, this seems to be getting somewhere:
$ arm-none-eabi-ar x evil_sw_stack.a
Now I've got a bunch of object files which do have ELF headers according to readelf, and yup, they did compile a startup file (of another of their devices) into the library... I'm wondering why, but I think this is a mistake.
This also works:
$ arm-none-eabi-objdump -t evil_sw_stack_objfile.o
So now the question is, is it safe to try to compile these object files into my own application using GCC? According to this other SO question, the object file formats are not compatible.
I assume that the startup code is mistakenly compiled into the library. I can delete it:
$ arm-none-eabi-ar d evil_sw_stack.a startup_(otherdevicename).o
$ arm-none-eabi-ar d evil_sw_stack.a system_(otherdevicename).o
Now I get an evil_sw_stack.a which gcc can accept as an input without complaining.
However, there is one thing that still worries me. When I use the object files instead of the static library, I get these warnings:
/usr/lib/gcc/arm-none-eabi/5.2.0/../../../../arm-none-eabi/bin/ld: warning: evil_objfile.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
/usr/lib/gcc/arm-none-eabi/5.2.0/../../../../arm-none-eabi/bin/ld: warning: evil_objfile.o uses 32-bit enums yet the output is to use variable-size enums; use of enum values across objects may fail
So it seems that evil_sw_stack.a was compiled with (the IAR equivalents of) -fno-short-enums and -fshort-wchar. GCC doesn't complain about this when I use evil_sw_stack.a at its command line but it does complain when I try to use any object file that I extracted from the library. Should I worry about this?
I don't use wchar_t in my code so I believe that one doesn't matter, but I would like to pass enums between my code and the library.
Update
Even though the linker doesn't complain, it doesn't work when I actually call some functions from the static library. In that case, make sure to put the libraries in the correct order when you call the linker. According to the accepted answer to this question, they need to be in reverse order of dependency. After doing this, it still misses some IAR crap:
undefined reference to `__aeabi_memclr4'
undefined reference to `__aeabi_memclr'
undefined reference to `__aeabi_memmove'
undefined reference to `__aeabi_memset4'
undefined reference to `__aeabi_memset'
undefined reference to `__iar_vla_alloc2'
undefined reference to `__iar_vla_dealloc2'
undefined reference to `__aeabi_memclr4'
I've found out that the __aeabi functions are defined in libgcc but even though I link to libgcc too, the definition in libgcc doesn't seem to be good enough for the function inside evil_sw_stack.a.
EDIT: after some googling around, it seems that arm-none-eabi-gcc doesn't support these specific __aeabi functions. Take a look at this issue.
Anyway, after taking a look at ARM's runtime ABI docs, the missing __aeabi functions can be trivially implemented using their standard C library equivalents. But I'm not quite sure how __iar_vla_alloc2 and __iar_vla_dealloc2 should work and couldn't find any documentation on them online. The only thing I found out is that VLA means "variable length array".
So, it seems that this will never work unless the chip vendor can compile their static library in such a way that it doesn't use these symbols. Is that right?
Disclaimer
I'd prefer not to disclose who the vendor is and not to disclose which product I work with. They are not proud that this thing doesn't work properly and asked me not to. I'm asking this question to help and not to discredit them.
Consider something like the following scenario:
main.c contains something like this:
#include "sub.h"
main(){
int i = 0;
while(i < 1000000000){
f();
i++;
}
}
while sub.h contains:
void f();
and sub.c contains something like this:
void f(){
int a = 1;
}
Now, if this were all in one source file, the compiler (gcc in my case) would notice that f() doesn't actually do anything and optimize the loop away. But since compiling happens before linking, that optimization can't happen in this case.
This can be avoided for local include files by including the raw .c files rather tan the headers, but when including headers from other libraries this becomes impossible. Is there any way around this?
If I'm understanding correctly, you would like to only link library functions that are being used by your program. Using the GCC tool chain this is possible with the optimization flags:
-O2 -fdata-sections -ffunction-sections
The first flag should optimize away loops that do nothing. The other two flags place each function or data item into its own section in the compiled output file. This allows the linker to perform optimizations. Note: it will take longer to compile and you won't be able to use gprof.
You will also need to then pass the linker the -gc-sections flag so that it won't include unused function and data sections.
All in all, you would execute:
gcc -O2 -fdata-sections -ffunction-sections main.c sub.c -Wl,-gc-sections
If you were to instead call GCC to produce assembly files you could inspect them to find that _main does not execute a loop or call the function f():
$ gcc -O2 -S -fdata-sections -ffunction-sections main.c sub.c -Wl,-gc-sections
$ cat main.s
Sources:
How to remove unused C/C++ symbols with GCC and ld?
http://linux.die.net/man/1/ld
http://linux.die.net/man/1/gcc
The compiler cannot guess and cannot make assumptions on what's outside the individual translation unit that it compiles. Some toolchains (end-to-end compiler+linker+supporting-utilities) may detect some such cases within a project being built from sources, depending on their sophistication of optimizations. It would not be common, and not guaranteed. It would most certainly not, and could not, apply to opaque 3rd party libraries being linked in.
In practice however, would you really use a 3rd party library that exported some no-op function, in hope that someone (your toolchain) would notice and safely optimize it away?
In windows, the vs system has whole program optimization
Sqlite uses a script to build a single C file to compile
I am aiming at removing as much dead code as possible from my executable to reduce code-size. I followed a number of tips from some other questions on Stack Overflow and am using -fdata-sections -ffunction-sections... + ...--gc-sections in my makefile. However I have one question.
I am first compiling all my .o's using -fdata-sections -ffunction-sections as flags. Then I am archiving these into a static library (lib.a). Then I am building the executable linking this lib.a and using the --gc-sections flag in this linker. However this doesnt seem to do any improvement. Is this because I am archiving first? Does archiving remove the effect of splitting into sections? (Also, any form of "strip" I am using doesn't seem to affect the executable at all!)
We're facing an interesting topic. Lets say we have a special-functions.c file, basically a library.
We need to optimize the code as getting rid of all unused/unreferenced functions during the build process on-the-fly.
I'm not searching for generally unused (dead) code: some parts will be "dead" in case of compiling to one of the architectures, but it's going to be used in an other architecture build.
Does anybody knows of flags, tools, methods, tricks to do this?
The compiler is standard gcc with ansi 99 C code.
EDIT
I know, this is mainly the part of the linker, but using gcc, the process is not really split into two parts.
From http://embeddedfreak.wordpress.com/2009/02/10/removing-unused-functionsdead-codes-with-gccgnu-ld/ :
Compile with -fdata-sections to keep the data in separate data
sections and -ffunction-sections to keep functions in separate
sections, so they (data and functions) can be discarded if unused.
Link with --gc-sections to remove unused sections.
For example:
gcc -Os -fdata-sections -ffunction-sections test.c -o test -Wl,--gc-sections
I think that a recent GCC (i.e. 4.6) should do that if you compile and link with the -flto flag (link time optimization). I would imagine that having hidden or internal visibility should be relevant (at least for non-static functions).
To my knowledge, the GNU binary utils (ld, in this case) already remove unusesd references on static link