Faster code with another compiler - c

I'm using the standard gcc compiler for math software development in C. I don't know that much about compilers or compiler options, and I was just wondering: is it possible to make faster executables by using another compiler or choosing better options? The default Makefile sets the options -ffast-math and -O3, and I think both of them have some impact on the overall calculation time. My software uses memory quite extensively, so I imagine some options related to memory management might do the trick?
Any ideas?

Before experimenting with different compilers or arbitrary micro-optimisations, you really need to get a decent profiler and profile your code to find out exactly where the performance bottlenecks are. The actual picture may be very different from what you imagine it to be. Once you have a profile, you can start to consider which optimisations might be useful. For example, changing compiler won't help you if you are limited by memory bandwidth.

Here are some tips about gcc performance:
Do benchmarks with -Os, -O2 and -O3. Sometimes -O2 will be faster because it produces shorter code. Since you said that you use a lot of memory, try -Os too and take measurements.
Also check out the -march=native option (it is considered safe to use if you are making executables for computers with processors similar to the one you compile on). Sometimes it can have a considerable impact on performance. If you need the list of options gcc enables with native, here's how to get it:
Create an empty C file called test.c, then:
$ touch test.c
$ gcc -march=native -fverbose-asm -S test.c
$ cat test.s
Credit for this goes to Gentoo forum users.
It should print out a list of all the optimizations gcc used. Please note that if you're using an i7, gcc 4.5 will detect it as Atom, so you'll need to set -march and -mtune manually.
Also read this document; it will help you (still, in my experience on Gentoo, -march=native works better): http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html
You could also try the new options in late 4.4 and early 4.5 versions, such as -flto and -fwhole-program. These should help with performance, but my system was unstable when I experimented with them. In any case, read this document too; it will help you understand some of GCC's optimization options: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

If you are running Linux on x86, then typically the Intel or PGI compilers will give you significantly faster executables.
The downsides are that there are more knobs to tune and that they come with a hefty price tag!

If you have specific hardware you can target your code for, the (hardware) company often releases paid-for compilers optimized for that hardware.
For example:
xlc for AIX
CC for Solaris
These compilers will generally produce better code optimization-wise.

As you say your program is memory-heavy, you could test a different malloc implementation than the one in your platform's standard library.
For example, you could try jemalloc (http://www.canonware.com/jemalloc/).

Keep in mind that most improvements to be had by changing compilers or settings will only get you proportional speedups, whereas by adjusting algorithms you can sometimes get improvements in the O() of your program. Be sure to exhaust that before you put too much work into tweaking settings.

Related

Does debug information get stripped from library on optimized build?

I am using GCC's C compiler for ARM. I've compiled Newlib using the C compiler. I went into the makefile for Newlib and saw that the Newlib library gets compiled using -g -O2.
When compiling my code and linking against Newlib's standard C library does this debug information get stripped?
You can use -g and -O2 together. The compiler will optimize the code and keep the debugging information. Of course, in some places you will not get information for a symbol that has been removed by optimization and is no longer present.
From the GCC options summary:
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
There are multiple flags and options that will make debugging impossible or difficult. e.g.
-fomit-frame-pointer .... It also makes debugging impossible on some machines.
-fsplit-wide-types.... This normally generates better code for those types, but may make debugging more difficult.
-fweb - ... It can, however, make debugging impossible, since variables no longer stay in a “home register”.
The first two are enabled for -O2.
If you want debugging information to be preserved, the following option can be used.
-Og
Optimize debugging experience. -Og enables optimizations that do not interfere with debugging. It should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience.

Which compilation flag should I use: -Os or -O2?

I'm currently working on an embedded device application (in C). What optimization flag should I use for compiling that application keeping in mind that it only has 96 MB of RAM.
Also, please note that in this application, I'm basically pre-processing the JPEG image. So which optimization flag should I use ?
Also, will stripping this app have any effect on efficiency and speed?
The OS on which I'm running this app is Linux 2.6.37.
Generally, optimization increases the binary size. Besides, its effect on speed is not predictable at all; it depends on the data set that you have. The only way to know is to benchmark the application with different sets of flags, not just -O2 or -O3 but other flags which you think may improve performance, since you have information on what the program does and how it might behave for different inputs.
The performance is dependent on the nature of the application, hence I don't think anyone can give you a convincing answer as to which flags can give you better performance.
Look at the GCC optimization flags and analyze your algorithm so as to find suitable flags and then decide which ones to use.
-Os is preferable. Not only is RAM limited, but the CPU's cache size is limited as well, so -Os code can execute faster despite using fewer optimization techniques.

C performance measure

I'm looking for a performance measurement tool for C (I'm using the MinGW Windows toolchain) that gives me results like:
Memory occupied by a variable;
Cycles to run the program/a function;
Time spent in a function.
Thanks
Google Perftools is multi-platform: http://code.google.com/p/google-perftools/
GCC has profiling as well: How to use profile guided optimizations in g++?
You can use gprof, which is shipped with GCC. Here are some examples.
You'll find more about that in the GCC documentation. Just remember that you must use the -pg option for both compilation and link.
However, I only got that working on small software. On the bigger project I work on, I only got empty timings, and couldn't find the cause. But maybe you won't have the same problem...
Usually, when gprof does not give you results, it is because it is a multithreaded application; gprof does not support that kind of app.

Which are the most commonly used gcc optimization options?

I found a lot of Optimization Options here
While going through them I found that some have side effects (like making debugging impossible). In my experience, I have found -O1 to -O3 and -Os the most commonly used. But what other options are commonly used in your projects?
-ffast-math can have a significant performance impact on floating point intensive software.
Also, compiling specifically for the target processor using the appropriate -march= option may have a slight performance impact, though strictly speaking this is not an optimization option.
-march=native with recent versions of gcc removes all the headache of determining the platform on which you are compiling.

Which 4.x version of gcc should one use?

The product-group I work for is currently using gcc 3.4.6 (we know it is ancient) for a large low-level c-code base, and want to upgrade to a later version. We have seen performance benefits testing different versions of gcc 4.x on all hardware platforms we tested it on. We are however very scared of c-compiler bugs (for a good reason historically), and wonder if anyone has insight to which version we should upgrade to.
Are people using 4.3.2 for large code-bases and feel that it works fine?
The best quality control for gcc is the Linux kernel. GCC is the compiler of choice for basically all major open-source C/C++ programs. A released GCC, especially one like 4.3.x which is in major Linux distros, should be pretty good.
GCC 4.3 also has better support for optimizations on newer cpus.
When I migrated a project from GCC 3 to GCC 4, I ran several tests to ensure that behavior was the same before and after. Can you just run a set of (hopefully automated) tests to confirm the correct behavior? After all, you want the "correct" behavior, not necessarily the GCC 3 behavior.
I don't have a specific version for you, but why not have a 4.X and 3.4.6 installed? Then you could try and keep the code compiling on both versions, and if you run across a show-stopping bug in 4, you have an exit policy.
Use the latest one, but hunt down and understand each and every warning -Wall gives. For extra fun, there are more warning flags to frob. You do have an extensive suite of regression (and other) tests, run them all and check them.
GCC (particularly for C++, but also C) has changed quite a bit. It does much better code analysis and optimization, and handles code that turns out to invoke undefined behaviour differently. So code that "worked fine" but really relied on some particular interpretation of invalid constructs will probably break. Hopefully that will make the compiler emit a warning or error, but there is no guarantee of such luck.
If you are interested in OpenMP then you will need to move to gcc 4.2 or greater. We are using 4.2.2 on a code base of around 5M lines and are not having any problems with it.
I can't say anything about 4.3.2, but my laptop is a Gentoo Linux system built with GCC 4.3.{0,1} (depending on when each package was built), and I haven't seen any problems. This is mostly just standard desktop use, though. If you have any weird code, your mileage may vary.
