compiling a C file with gcc to get x86 assembly code - c

I have a C file heapsort.c which Im trying to compile on a 64 bit linux machine to output the corresponding assembly code. Im using the following command:
gcc -02 -S heapsort.c
when I type this Im getting this error message
gcc:error: unrecognized option '-02'
I tried googling this error but nothing helpful came up. Any suggestions on how to navigate this error and get the x86 output?

The flag is -O2, not -02. That's a letter O for "optimization", not a number 0. You might want to look into using a font that makes the difference more obvious.

I suggest even (as every pointed out is is the letter O not the digit 0)
gcc -O2 -fverbose-asm -S heapsort.c
The -fverbose-asm will give you more generated comments in the assembly file heapsort.s
BTW, passing -Wall to GCC is always a good habit.
if you are really curious and want to understand a bit more the internal representations inside GCC try even
gcc -fdump-tree-all -O2 -S heapsort.c
but be prepared to get a lot of files.
You'll get hundreds of them, matching heapsort.c.*!
If you want some crude GUI interface to query the Gimple internal representation at some arbitrary source code position, consider using MELT. MELT is mostly a high-level domain specific language (with a Lisp-like syntax, powerful pattern matching, object oriented, functional, dynamically typed, ....) to extend GCC, but you can also use its (crude) probe to query interactively some of the GCC internal representations.

It should be -O2 with 'O' not "zero"

Try -O2 instead of -02. It's a letter 'O' and shorthand for "optimization level 2".

Related

gcc-11 compiler optimisation levels not working on Mac m1

I have to complete a project for university where I need to be able to optimise using compiler optimisation levels.
I am using OpenMP and as a result I have got the gcc-11 compiler from brew.
I have watched this video https://www.youtube.com/watch?v=U161zVjv1rs and tried the same thing but I am getting an error:
gcc-11 -fopenmp jacobi2d1.c -o1 out/jacobi2d1
But I am getting the following error:
How do I do this?
Any advice would be appreciated
Optimization levels are specified with -O1 etc, using capital letter O, not lower-case letter o.
Lower-case -o1 specifies that the output file should be 1, and then out/jacobi2d1 is an input file to be linked, but it is an existing executable and you can't link one executable into another — hence the error from the linker.

what gcc compiler options can I use for gfortran

I studied Option Summary for gfortran but found no compiler option to detect integer overflow. Then I found the GCC (GNU Compiler Collection) flag option -fsanitize=signed-integer-overflow here and used it when invoking gfortran. It works--integer overflow can be detected at run time!
So what does -fsanitize=signed-integer-overflow do here? Just adding to the machine code generated by gfortran some machine-level pieces that check integer overflow?
What is the relation between GCC (GNU Compiler Collection) flag options and gfortran compiler options ? What gcc compiler options can I use for gfortran, g++ etc ?
There is the GCC - GNU Compiler Collection. It shares the common backend and middleend and has frontends for different languages. For example frontends for C, C++ and Fortran which are usually invoked by commands gcc, g++ and gfortran.
It is actually more complicated, you can call gcc on a Fortran source and gfortran on a C source and it will work almost the same with the exceptions of libraries being linked (there are some other fine points). The appropriate frontend will be called based on the file extension or the language requested.
You can look almost all GCC (not just gcc) flags for all of the mentioned frontends. There are certain flags which are language specific. Normally you will get a warning like
gfortran -fcheck=all source.c
cc1: warning: command line option ‘-fcheck=all’ is valid for Fortran but not for C
but the file will compile fine, the option is just ignored and you will get a warning about that. Notice it is a C file and it is compiled by the gfortran command just fine.
The sanitization options are AFAIK not that language specific and work for multiple languages implemented in GCC, maybe with some exceptions for some obviously language specific checks. Especially -fsanitize=signed-integer-overflow which you ask about works perfectly fine for both C and C++. Signed integer overwlow is undefined behaviour in C and C++ and it is not allowed by the Fortran standard (which effectively means the same, Fortran just uses different words).
This isn't a terribly precise answer to your question, but an aha! moment, when learning about compilers, is learning that gcc (the GNU Compiler Collection), like llvm, is an example of a three-stage compiler.
The ‘front end’ parses the syntax of whichever language you're interested, and spits out an Abstract Syntax Tree (AST), which represents your program in a language-independent way.
Then the ‘middle end’ (terrible name, but ‘the clever bit’) reorganises that AST into another AST which is semantically equivalent but easier to turn into machine code.
Then the ‘back end’ turns that reorganised AST into assembler for one-or-other processor, possibly doing platform-specific micro-optimisations along the way.
That's why the (huge number of) gcc/llvm options are unexpectedly common to (apparently wildly) different languages. A few of the options are specific to C, or Fortran, or Objective-C, or whatever, but the majority of them (probably) are concerned with the middle and last bits, and so are common to all of the languages that gcc/llvm supports.
Thus the various options are specific to stage 1, 2 or 3, but may not be conveniently labelled as such; with this in mind, however, you might reasonably intuit what is and isn't relevant to the particular language you're interested in.
(It's for this sort of reason that I will dogmatically claim that CC++FortranJavaPerlPython is essentially a single language, with only trivial syntactical and library minutiae to distinguish between dialects).

Side by side C, x86 programs

Is there anywhere I can find side-by-side examples of dead simple C and x86 programs? The examples I've found so far on the Internet seem to jump straight from "here's Hello World in x86" to "write your own operating system!" I'm having trouble internalizing what has to happen when you do things like call a function.
I would recommend a look at GCC's intermediate assembly output, for example call
gcc -S a.c
then look at a.s
Most of the time, smaller and easier to understand assembly is generated by optimizing, so you would rather use
gcc -O -S a.c
If you mean x86 assembly language, use objdump --disassemble myprog (on any GNU system) to show the assembly language generated by your C program. If your system doesn't have objdump, you can use ndisasm.
Assuming you mean x86 assembler then with gcc you can use gcc -S yourhelloworldprogram.c to get assembler output. For Visual Studio you can get assembler output by following this: How do I get the assembler output from a C file in VS2005
I reccommend ddd. You can have the both C sources (if you built with debug symbols) and the machine code showing. You can also step over the code interactively probing register and memory values. A great learning tool.
On gcc you can use the -save-temps -fverbose-asm options which is better than the -S option because it still generates the object file and you get also the preprocessor file. The verbose-asm is also important because it adds comments to the assembly output that make the link between the function and variable names of your program and the generated assembly code. Especially when generating with optimization it often is difficult to make the link between the source C and the assembly.

Performance difference between C program executables created by gcc and g++ compilers

Lets say I have written a program in C and compiled it with both gcc (as C) and g++ (as C++), which compiled executable will run faster: the one created by gcc or by g++? I think using the g++ compiler will make the executable slow, but I'm not sure about it.
Let me clarify my question again because of confusion about gcc:
Let's say I compile program a.c like this in the terminal:
gcc a.c
g++ a.c
Which a.out executable will run faster?
Firstly: the question (and some of the other answers) seem to be based on the faulty premise that C is a strict subset of C++, which is not in fact the case. Compiling C as C++ is not the same as compiling it as C: it can change the meaning of your program!
C will mostly compile as C++, and will mostly give the same results, but there are some things that are explicitly defined to give different behaviour.
Here's a simple example - if this is your a.c:
#include <stdio.h>
int main(void)
{
printf("%d\n", sizeof('x'));
return 0;
}
then compiling as C will give one result:
$ gcc a.c
$ ./a.out
4
and compiling as C++ will give a different result (unless you're using an unusual platform where int and char are the same size):
$ g++ a.c
$ ./a.out
1
because the C specification defines a character literal to have type int, and the C++ specification defines it to have type char.
Secondly: gcc and g++ are not "the same compiler". The same back end code is used, but the C and C++ front ends are different pieces of code (gcc/c-*.c and gcc/cp/*.c in the gcc source).
Even if you stick to the parts of the language that are defined to do the same thing, there is no guarantee that the C++ front end will parse the code in exactly the same way as the C front end (i.e. giving exactly the same input to the back end), and hence no guarantee that the generated code will be identical. So it is certainly possible that one might happen to generate faster code than the other in some cases - although I would imagine that you'd need complex code to have any chance of finding a difference, as most of the optimisation and code generation magic happens in the common back end of the compiler; and the difference could be either way round.
I think they they will both produce the same machine code, and therefore the same speed on your computer.
If you want to find out, you could compile the assembly for both and compare the two, but I'm betting that they create the same assembly, and therefore the same machine code.
Profile it and try it out. I'm certain it will depend on the actual code, even if it would require potentially a really weird case to get any different bytecode. Though if you don't have extern C {} around your C code, and or works fine in C, I'm not sure how "compiling it as though it were C++" could provide any speed, unless the particular compiler optimizations in g++ just happen to be a bit better for your particular situation...
The machine code generated should be identical. The g++ version of a.out will probably link in a couple of extra support libraries. This will make the startup time of a.out be slower by a few system calls.
There is not really any practical difference though. The Linux linker will not become noticeably slower until you reach 20-40 linked libraries and thousands of symbols to resolve.
The gcc and g++ executables are just frontends, they are not the actual compilers. They both run the actual C or C++ compilers (and ld, ar, whatever is needed to produce the output you asked for) based on the file extensions. So you'll get the exact same result. G++ is commonly used for C++ because it links with the standard C++ library (iostreams etc.).
If you want to compile C code as C++, either change the file extension, or do something like this:
gcc test.c -otest -x c++
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/G_002b_002b-and-GCC.html
GCC is a compiler collection. It is mainly used for compilation of C,C++,Ada,Java and many more programming languages.
G++ is a part of gnu compiler collection(gcc).
I mean gcc includes g++ as well. When we use gcc for compilation of C++ it uses g++. The output files will be different because the G++ compiler uses its own run time library.
Edit: Okay, to clarify things, because we have a bit of confusion in naming here. GCC is the GNU Compiler Collection. It can compile Ada, C++, C, and a billion and a half other languages. It is a "backend" to the various languages "front end" compilers like GNAT. Go read the link i made at the top of the page from GCC.GNU.Org.
GCC can also refer to the GNU C Compiler. This will compile C++ code if given the -lstdc++ command, but normally will choke and die because it's not pulling in the C++ libraries.
G++, the GNU C++ Compiler, like the GNU C Compiler is a front end to the GNU Compiler Collection. It's difference between the C Compiler is that it automatically includes those libraries and makes a few other small tweaks, because it's assuming it's going to be fed C++ code to compile.
This is where the confusion comes from. Does this clarify things a bit?

Easy way to convert c code to x86 assembly?

Is there an easy way (like a free program) that can covert c/c++ code to x86 assembly?
I know that any c compiler does something very similar and that I can just compile the c code and then disassemble the complied executable, but that's kind of an overkill, all I want is to convert a few lines of code.
Does anyone know of some program that can do that?
EDIT: I know that GCC compiler does that but it's AT&T syntax and I'm looking for the Intel syntax (not sure if it's called intel syntax or not). The AT&T syntax looks a bit like gibberish to me and some commands use operands in reverse order and not how I'm used to and it can get really confusing.
GCC can output Intel syntax assembly using the following command line:
gcc -S input.c -o output.asm -masm=intel
Gcc can do it with the -S switch, but it will be disgustingly ugly at&t syntax.
gcc will generate assembly if you pass it the -S option on the command line.
Microsoft Visual C++ will do the same with the /FAs option.
The lcc compiler is a multiplatform cross-compiler. You can get it to produce Intel syntax assembly code by
lcc -S -Wf-target=x86/win32 foo.c
I find assembly code from lcc significantly easier to read than what gcc spits out nowawadays.
Your compiler is already doing that as you've stated, and most likely will have an option to stop before assembling.
For GCC, add the -S flag.
gcc -S x.c
cat x.s
Edit: If your program is pretty short, you could use the online service at https://gcc.godbolt.org/.
if you are using gcc as a compiler, you can compile with the -S option to produce assembly code. see http://www.delorie.com/djgpp/v2faq/faq8_20.html
As many people point out most compilers will do that. If you don't like the syntax,a bit of work with awk or sed should be able to translate. Or I'd be surprised if there wasn't a program that did that bit for you already written somewhere.
In VC++ the following command can be used to list the assembly code.
cl /FAs a.c
notice
every architecture has its own unique names and even build differently
now when you know the stacks involved using asm volatile
would b the perfect solution

Resources