I'm using an open source library which seems to have lots of preprocessing directives to support many languages other than C. So that I can study what the library is doing I'd like to see the C code that I'm compiling after preprocessing, more like what I'd write.
Can gcc (or any other tool commonly available in Linux) read this library but output C code that has the preprocessing converted to whatever and is also readable by a human?
Yes. Pass gcc the -E option. This will output preprocessed source code.
cpp is the preprocessor.
Run cpp filename.c to output the preprocessed code, or better, redirect it to a file with
cpp filename.c > filename.preprocessed.
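As a minimal sketch of what -E does to this kind of library code, assume a hypothetical file sketch.c. The names API_BEGIN, API_END, SQUARE and area are made up for illustration and do not come from any particular library. It wraps its declarations in macros that depend on the language being compiled:

```c
/* sketch.c: hypothetical file that mimics a multi-language header. */
#ifdef __cplusplus
#define API_BEGIN extern "C" {
#define API_END }
#else
/* When compiled as C, the wrappers expand to nothing. */
#define API_BEGIN
#define API_END
#endif

#define SQUARE(x) ((x) * (x))

API_BEGIN
int area(int side);
API_END

/* After preprocessing, the body reads: return ((side) * (side)); */
int area(int side)
{
    return SQUARE(side);
}
```

Running gcc -E -P sketch.c on this should print roughly just the declaration and the definition of area, with SQUARE(side) replaced by ((side) * (side)) and the API_BEGIN / API_END lines reduced to nothing. The -P option only suppresses the # linemarker lines in the output.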
-save-temps
This is another good option to have in mind:
gcc -save-temps -c -o main.o main.c
main.c
#define INC 1
int myfunc(int i) {
return i + INC;
}
and now, besides the normal output main.o, the current working directory also contains the following files:
main.i is the desired preprocessed file containing:
# 1 "main.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "main.c"
int myfunc(int i) {
return i + 1;
}
main.s is a bonus :-) and contains the generated assembly:
.file "main.c"
.text
.globl myfunc
.type myfunc, @function
myfunc:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $1, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size myfunc, .-myfunc
.ident "GCC: (Ubuntu 8.3.0-6ubuntu1) 8.3.0"
.section .note.GNU-stack,"",@progbits
If you want to do it for a large number of files, consider using instead:
-save-temps=obj
which saves the intermediate files to the same directory as the -o object output instead of the current working directory, thus avoiding potential basename conflicts.
The advantage of this option over -E is that it is easy to add it to any build script, without interfering much in the build itself.
Another cool thing about this option is if you add -v:
gcc -save-temps -c -o main.o -v main.c
it actually shows the explicit files being used instead of ugly temporaries under /tmp, so it is easy to know exactly what is going on, which includes the preprocessing / compilation / assembly steps:
/usr/lib/gcc/x86_64-linux-gnu/8/cc1 -E -quiet -v -imultiarch x86_64-linux-gnu main.c -mtune=generic -march=x86-64 -fpch-preprocess -fstack-protector-strong -Wformat -Wformat-security -o main.i
/usr/lib/gcc/x86_64-linux-gnu/8/cc1 -fpreprocessed main.i -quiet -dumpbase main.c -mtune=generic -march=x86-64 -auxbase-strip main.o -version -fstack-protector-strong -Wformat -Wformat-security -o main.s
as -v --64 -o main.o main.s
Tested in Ubuntu 19.04 amd64, GCC 8.3.0.
CMake predefined targets
CMake automatically provides targets for the preprocessed files:
make help
shows us that we can do:
make main.i
and that target runs:
Preprocessing C source to CMakeFiles/main.dir/main.c.i
/usr/bin/cc -E /home/ciro/bak/hello/main.c > CMakeFiles/main.dir/main.c.i
so the file can be seen at CMakeFiles/main.dir/main.c.i
Tested on cmake 3.16.1.
I'm using gcc as a preprocessor (for HTML files). It does just what you want. It expands "#--" directives, then outputs a readable file. (None of the other C/HTML preprocessors I've tried do this: they concatenate lines, choke on special characters, etc.) Assuming you have gcc installed, the command line is:
gcc -E -x c -P -C -traditional-cpp code_before.cpp > code_after.cpp
(Doesn't have to be 'cpp'.) There's an excellent description of this usage at http://www.cs.tut.fi/~jkorpela/html/cpre.html.
The "-traditional-cpp" preserves whitespace & tabs.
Run:
gcc -E <file>.c
or
g++ -E <file>.cpp
Suppose we have a file as Message.cpp or a .c file
Step 1: Preprocessing (argument -E)
g++ -E .\Message.cpp > P1
The generated P1 file has the macros expanded and the header file contents included, and the comments are stripped.
Step 2: Translate Preprocessed file to assembly (Argument -S). This task is done by compiler
g++ -S .\Message.cpp
An assembly file (Message.s) is generated. It contains all the assembly code.
Step 3: Translate assembly code to Object code. Note: Message.s was generated in Step2.
g++ -c .\Message.s
An Object file with the name Message.o is generated. It is the binary form.
Step 4: Linking the object file. This task is done by linker
g++ .\Message.o -o MessageApp
An exe file MessageApp.exe is generated here.
#include <iostream>
using namespace std;
//This a sample program
int main()
{
cout << "Hello" << endl;
cout << PQR(P,K) ;
getchar();
return 0;
}
I'm currently trying to get into the basics regarding C-compilation without the use of an IDE.
As I only learned C- and embedded-programming with an IDE I thought it would be a good idea to learn and give me a better understanding of how the whole build process is working behind the scenes.
I mainly want to learn how to implement a complete IDEless toolchain for an STM32 controller.
So my idea was to start simple and try to understand the C-only build toolchain and its possible configurations. For this purpose I searched for tutorials and found this and this one.
I tried to follow along the first tutorial on my windows system but encountered some problems quite early that I have trouble understanding.
I created the following hello.c testfile:
#include <stdio.h>
#include <stdint.h>
int main ( void )
{
printf("Hello World!\n");
return 0;
}
First I tried the simple full compilation using gcc -o hello.exe hello.c (1.6 from the tutorial)
Everything works fine, so I decided to test the compilation steps one after the other (1.7 from the tutorial)
I called all commands in the following order:
cpp hello.c > hello.i (preprocessing) -> gcc -S hello.i (Compilation) -> as -o hello.o hello.s (Assembly) -> ld -o hello.exe hello.o (Linking)
Every step until the linking seems to work but the linker gives me the following errors:
ld: hello.o:hello.c:(.text+0xa): undefined reference to `__main'
ld: hello.o:hello.c:(.text+0x47): undefined reference to `puts'
ld: hello.o:hello.c:(.text+0x5c): undefined reference to `printf'
Did I do something wrong here? And is there a reason the ">" operator is used for preprocessing and assembling but not if I just compile using gcc -o hello.exe hello.c
Does one even use these steps separately that often?
I read that instead of cpp hello.c > hello.i I could also use gcc -E main.c > main.i so why use the cpp command, are there any advantages?
Next I set this problem aside and tried to add includes.
For this purpose I created the following 2 files:
myFunc.c:
#include "myFunc.h"
uint8_t myFunc( uint8_t param )
{
uint8_t retVal = 0;
retVal = param + 1;
return retVal;
}
myFunc.h
#include <stdint.h>
uint8_t myFunc( uint8_t param );
And changed the hello.c to:
#include <stdio.h>
#include <stdint.h>
#include "myFunc.h"
int main ( void )
{
uint8_t testVal = 0;
testVal = myFunc(testVal);
printf("Hello World!\n");
printf("Test Value is %d \n", testVal);
return 0;
}
I first tried the gcc -o hello.exe hello.c but get the error:
undefined reference to `myFunc'
collect2.exe: error: ld returned 1 exit status
So I figured I should add the include path (even if it is the same directory).
After a short search and the help of the second site I tried gcc -Wall -v -IC:\Users\User\Desktop\C-Only_Toolchain hello.c -o hello.exe
But get the same error...
Is there something wrong with the way my include paths are added? (obviously yes)
Lastly I tried to test the GNU make command from the tutorial.
I opened the editor and inserted all contents shown in the tutorial.
As the editor saves the file with a .txt extension, I tried simply deleting the file extension.
The makefile looks like this:
all: hello.exe
hello.exe: hello.o
gcc -o hello.exe hello.o
hello.o: hello.c
gcc -c hello.c
clean:
rm hello.o hello.exe
But if I enter make in my console I get the error that the command "make" is written incorrectly or could not be found.
I used tab for the indentation just as the tutorial suggests but it will not even recognize that there is a makefile.
Is this because it was originally a .txt file before I deleted the extension?
I would be happy if someone could help me with my confusion regarding these rather simple issues...
Furthermore I would be very thankful if you have some good suggestions on how to get into this topic more efficiently or have some good sources to share.
Thank you in advance and stay healthy :)
Best Regards
Evox402
So, these are a lot of questions.
(In the following I use linux, so some outputs are just similar, not identical, like paths and the assembly output, but because of your usage of gcc, it's quite transferable to windows).
I called all commands in the following order: cpp hello.c > hello.i (preprocessing) -> gcc -S hello.i (Compilation) -> as -o hello.o hello.s (Assembly) -> ld -o hello.exe hello.o (Linking)
To recap: what are you doing here?
cpp hello.c > hello.i
You run the preprocessor over the C file. It just does a text-replace of macros/ #defines and includes files.
This looks like this. (A bit shortened as it has around 800 lines)
...Snip....
struct _IO_FILE;
typedef struct _IO_FILE FILE;
struct _IO_FILE
{
int _flags;
char *_IO_read_ptr;
char *_IO_read_end;
char *_IO_read_base;
char *_IO_write_base;
char *_IO_write_ptr;
char *_IO_write_end;
char *_IO_buf_base;
char *_IO_buf_end;
char *_IO_save_base;
char *_IO_backup_base;
char *_IO_save_end;
struct _IO_marker *_markers;
struct _IO_FILE *_chain;
int _fileno;
int _flags2;
__off_t _old_offset;
unsigned short _cur_column;
signed char _vtable_offset;
char _shortbuf[1];
_IO_lock_t *_lock;
__off64_t _offset;
struct _IO_codecvt *_codecvt;
struct _IO_wide_data *_wide_data;
struct _IO_FILE *_freeres_list;
void *_freeres_buf;
size_t __pad5;
int _mode;
char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
};
extern FILE *stdin;
extern FILE *stdout;
extern FILE *stderr;
...Snip...
extern int printf (const char *__restrict __format, ...);
...Snip...
int main ( void )
{
printf("Hello World!\n");
return 0;
}
Now all the necessary declarations are included, so the C compiler can run.
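To illustrate that the included header mainly contributes declarations, here is a small sketch that declares printf by hand instead of including <stdio.h>. The function name greet is made up for this example. It compiles because a declaration is all the compiler needs; the actual code for printf still has to come from the C library at link time.

```c
/* Hand-written declaration standing in for #include <stdio.h>.
   It tells the compiler the signature only; it contains no code. */
extern int printf(const char *format, ...);

/* Hypothetical helper: prints the greeting and returns printf's result,
   which is the number of characters written. */
int greet(void)
{
    return printf("Hello World!\n");
}
```

Compiling this file on its own with gcc -c works, which shows that the compiler never needs the body of printf, only its declaration.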
gcc -S hello.i.
It just converts your C code to assembly. (It will look a bit different on windows)
.file "hello.c"
.text
.section .rodata
.LC0:
.string "Hello World!"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 10.2.0-17) 10.2.0"
.section .note.GNU-stack,"",@progbits
Now you have to convert the assembly code to machine code:
as -o hello.o hello.s
This command just generates a so-called object file containing your code and important metadata that the linker will need.
ld -o hello.exe hello.o
Now you invoke the linker with your object file as argument and hello.exe as the output file. It will look for the entry point (_start on Linux-like systems, WinMain for example on Windows, or sometimes _main).
But the functions from the C standard library are also missing.
Why? You didn't tell the linker that you want to include it. If you invoke the linker ld explicitly, as you did, you have to pass every library you want to link against.
For example, you have to add -lc to include the C standard library, and so on.
Did I do something wrong here?
You just forgot to add the C library to the libraries the linker should link with your object-file.
And is there a reason the ">" operator is used for preprocessing
> is not from cpp. It is from the shell. Try running without > hello.i. The preprocessor will just output it on the console. The > redirects to the specified file (Here hello.i).
I could also use gcc -E main.c > main.i so why use the cpp command, are there any advantages?
For this purpose there is practically no difference. gcc -E runs the preprocessor internally.
Does one even use these steps separately that often?
These steps are sometimes used separately in makefiles, though rarely split as finely as you did. Usually only compiling and linking are kept as two separate steps, to reduce build time.
first tried the gcc -o hello.exe hello.c but get the error:
It compiles: thanks to the header, the C compiler knows there is at least a declaration of myFunc, so it emits valid assembly code.
But when the linker resolves the references to functions, it cannot find a definition of myFunc and emits the error.
You have to add the myFunc.c to your commandline:
gcc -o hello.exe hello.c myFunc.c
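The difference between what the compiler needs and what the linker needs can be sketched in a single file. This is only an illustration: the caller name callMyFunc is hypothetical, and in the real project the three parts live in myFunc.h, hello.c and myFunc.c respectively.

```c
#include <stdint.h>

/* Declaration (the role of myFunc.h): enough for the compiler to
   type-check calls, but it contains no code. */
uint8_t myFunc(uint8_t param);

/* A caller (the role of hello.c): compiles with only the declaration
   in scope. */
uint8_t callMyFunc(uint8_t value)
{
    return myFunc(value);
}

/* Definition (the role of myFunc.c): the linker needs this object code
   to resolve the call above. Without it, linking fails with
   "undefined reference to `myFunc'". */
uint8_t myFunc(uint8_t param)
{
    return (uint8_t)(param + 1);
}
```

The -I option only tells the compiler where to look for header files, so it cannot supply the missing definition. Note also that a file like myFunc.c needs uint8_t to be declared, for example by including myFunc.h or <stdint.h>.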
But if I enter make in my console I get the error that the command "make" is written incorrectly or could not be found. I used tab for the indentation just as the tutorial suggests but it will not even recognize that there is a makefile. Is this because it was originally a .txt file before I deleted the extension?
You have to add the directory of make.exe to the path.
Suppose it has the path:
C:\Foo\bar\baz\make.exe
Then you add it to the path (Execute it in the commandline):
set PATH=%PATH%;C:\Foo\bar\baz
This will only work until you close the commandline, or you can set it permanently as outlined here for example.
I want to call a print function from my C program.
assembler prog:
#test.s
.text
.global _start
.global print
.type print, @function
_start:
call print
# and exit.
movl $0,%ebx # first argument: exit code.
movl $1,%eax # system call number (sys_exit).
int $0x80 # call kernel.
print:
# write our string to stdout.
movl $len,%edx # third argument: message length.
movl $msg,%ecx # second argument: pointer to message to write.
movl $1,%ebx # first argument: file handle (stdout).
movl $4,%eax # system call number (sys_write).
int $0x80 # call kernel.
mov $0, %eax
ret
.data
msg:
.ascii "Hello, world!\n" # the string to print.
len = . - msg # length of the string.
I can assemble and link it using:
$as test.s -o test.o
$ld test.o -o test
And I can execute it as a program, and it outputs "Hello, world!"
But when I tried to call a print from C code like this:
#include <stdio.h>
extern int print();
int main(){
int g;
g = print();
printf("Hello from c!, %d\n", g);
}
It was compiled using:
$gcc -c main.c test
It just prints "Hello from c, 13", which means that the function was called and returned a number of characters, but the message itself is not printed!
What am I doing wrong?
P.S.
When I try to compile the program like this:
$as test.s -o test.o
$gcc -c main.c -o main.o
$gcc main.c test.o
I get an error:
/usr/bin/ld: test.o: in function `_start':
(.text+0x0): multiple definition of `_start'; /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../lib/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: test.o: relocation R_X86_64_32 against `.data' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status
OK, done! Thanks, clearlight.
I can build everything using:
$as test.s -o test.o
$gcc -c main.c -o main.o
$gcc -no-pie main.c test.o
And everything works fine!
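For comparison, and not as the fix used above, the job of the assembly print routine can be sketched in plain C by going through the C library's write wrapper, which avoids hand-coding the system call. This assumes a POSIX system. Unlike the assembly routine, which ends with mov $0, %eax, this sketch returns the result of write, that is the number of bytes written.

```c
#include <unistd.h>   /* write, STDOUT_FILENO (POSIX) */

static const char msg[] = "Hello, world!\n";

/* Write the message to standard output and return the byte count
   reported by write (or -1 on error). */
int print(void)
{
    /* sizeof msg includes the terminating '\0', which is not written. */
    return (int)write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

A main like the one in the question can call this print() unchanged, since the declaration extern int print(); is compatible with it.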
I am including external asm into c, when I try to compile I am getting error.
I am compiling c file like this - g++ testing.c
Error:
cc0FHCkn.o:testing.c:(.text+0xe): undefined reference to helloWorld
collect2.exe: error: ld returned 1 exit status
C code:
#include<stdio.h>
extern "C" int helloWorld();
int main() {
printf("Its - ",helloWorld());
}
ASM code:
.code
helloWorld proc
mov rax, 123
ret
helloWorld endp
end
Note: I am posting this as an answer so that I can say more than a comment allows, and I am using gcc.
First, with just g++ testing.c, g++ cannot link against the assembly file because it is not specified on the command line, so of course helloWorld is missing.
If I have the file hw.c :
int helloWorld()
{
return 123;
}
I ask gcc to produce the assembly source through the option -S (I also use -O to reduce its size), so I do not have to write the assembly file by hand and I can be sure it is compatible with gcc:
/tmp % gcc -O -S hw.c
That produced the file hw.s :
/tmp % cat hw.s
.file "hw.c"
.text
.globl helloWorld
.type helloWorld, @function
helloWorld:
.LFB0:
.cfi_startproc
movl $123, %eax
ret
.cfi_endproc
.LFE0:
.size helloWorld, .-helloWorld
.ident "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-16)"
.section .note.GNU-stack,"",@progbits
/tmp %
Also having the file m.c :
#include <stdio.h>
extern int helloWorld();
int main()
{
printf("%d\n", helloWorld());
return 0;
}
I can do :
/tmp % gcc m.c hw.s
/tmp % ./a.out
123
I suggest you do the same: write helloWorld in C, then generate the assembly with the -S option. That way you are sure to follow gcc's requirements in the function definition.
1.) Create an ELF object file from the assembly file
nasm -f elf64 -o assembly.o assembly.asm
2.) Create an ELF object file of testing.c file
gcc -c testing.c -o testing.o
3.) Link the ELF object files together to create the final executable file.
gcc -o testing assembly.o testing.o
4.) Run final executable file
./testing
Use extern int helloWorld();
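A common way to keep such a declaration valid both when the file is compiled as C (gcc) and as C++ (g++) is to guard the linkage specifier with __cplusplus. The sketch below also contains a stand-in C definition of helloWorld so that it links on its own; in the real setup the definition would come from the assembled object file instead.

```c
/* extern "C" is only valid in C++, so it is hidden from a C compiler. */
#ifdef __cplusplus
extern "C" {
#endif

int helloWorld(void);

#ifdef __cplusplus
}
#endif

/* Stand-in definition for illustration only: the real one would be the
   assembly routine that returns 123. */
int helloWorld(void)
{
    return 123;
}
```

With this guard, the same source file can be built with either gcc or g++ without changing the declaration.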
I am trying to make my own operating system from scratch and am making my own boot loader. I have a function to print a string onto the screen.
Here is some code that I have:
ORG 0x7C00
BITS 16
mov si, msg
call Print
cli
hlt
Print:
lodsb
cmp al, 0
je Done
mov ah, 0Eh
mov bh, 0
int 10h
jmp Print
Done:
ret
msg db 'Hello World!', 0
times 510-($-$$) db 0
dw 0xAA55
This is then compiled with the following command:
nasm -f bin bootloader.asm -o bootloader.bin
The question is, how would I be able to access the print function within C? I know I have to use the extern keyword, but how would I compile this into a binary format file?
Basically you have to compile with gcc using -ffreestanding and -c (compile only, don't link), and then link using ld with the flags -static and -nostdlib.
Creating a bootloader in C is not exactly a good idea. I'd recommend that you get a copy of GRUB and work on top of it. The OSDev wiki explains this incredibly well.
To sum things up, if you do try to create a bootloader in C, use these commands to build it:
$ gcc -m16 -c -g -Os -march=i686 -ffreestanding -Wall -Werror -I. -o bootloader.o bootloader.c
$ ld -static -T linker.ld -nostdlib --nmagic -o bootloader.elf bootloader.o
$ objcopy -O binary bootloader.elf bootloader.bin
Second, you can't simply use extern! You didn't set up a stack, so C code will probably bail out pretty quickly. The C compiler also doesn't know how parameters are passed to your function, because it doesn't follow any of the usual calling conventions. A possible linker script:
ENTRY(main);
SECTIONS
{
. = 0x7C00;
.text : AT(0x7C00)
{
_text = .;
*(.text);
_text_end = .;
}
.data :
{
_data = .;
*(.bss);
*(.bss*);
*(.data);
*(.rodata*);
*(COMMON)
_data_end = .;
}
.sig : AT(0x7DFE)
{
SHORT(0xaa55);
}
/DISCARD/ :
{
*(.note*);
*(.iplt*);
*(.igot*);
*(.rel*);
*(.comment);
}
}
Also, GCC emits 32-bit code by default, so you need to force it to generate 16-bit code using __asm__(".code16gcc\n") or, as suggested in the comments, by passing the -m16 parameter on the compiler's command line.
You can rewrite your function in C (so that it complies with a standard calling convention) like so:
void print(const unsigned char * s){
while(*s){
__asm__ __volatile__ ("int $0x10" : : "a"(0x0E00 | *s), "b"(7));
s++;
}
}
And of course, right after .code16gcc, you'd have to jump directly to your bootloader start: __asm__ ("jmpl $0, $main\n");
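The register values handed to int $0x10 in the print function above can be sketched as plain C helpers, which makes the bit packing easy to check on a host machine. The helper names teletype_ax and teletype_bx are hypothetical. The values mirror the "a" and "b" operands of the inline assembly: AH = 0x0E selects BIOS teletype output, AL holds the character, BH = 0 is the display page and BL = 7 the attribute.

```c
#include <stdint.h>

/* AX for BIOS teletype output: AH = 0x0E (function), AL = character. */
uint16_t teletype_ax(unsigned char c)
{
    return (uint16_t)(0x0E00u | c);
}

/* BX as used above: BH = 0 (page), BL = 7 (attribute). */
uint16_t teletype_bx(void)
{
    return 0x0007u;
}
```

For example, printing 'A' (0x41) corresponds to AX = 0x0E41, which is the same AH/AL split that the original assembly sets up with mov ah, 0Eh after lodsb.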
I've started working on a home-brew OS for learning purposes. So it works like this :
Once the kernel is loaded I create a stack and call my kmain()
In kmain I try calling function foo() defined in header.h
//Header.h
#ifndef INCLUDE_HEADER_H
#define INCLUDE_HEADER_H
int foo(char* buf);
int bar();
#endif
Using nm on my kernel I can clearly see that foo() is in the binary but when I disassemble kmain with gdb I see that foo isn't called, instead bar is.
This problem is recurrent on all headers containing multiple functions.
I compile on windows 10 in a Cygwin environment. I use the following arguments passed to nasm/gcc/ld in my makefile
CC = gcc
CFLAGS = -m32 -nostdlib -nostdinc \
-nostartfiles -fno-leading-underscore -nodefaultlibs\
-Wall -Wextra -Wno-unused-variable -Wno-unused-function\
-c
LD = i686-elf-ld
LDFLAGS = -Tlink.ld -melf_i386
AS = nasm
ASFLAGS = -f elf
Any ideas why ?
EDIT :
//screen.h
#ifndef SCREEN_H
#define SCREEN_H
int test();
void print(char c);
#endif
And
//kmain.c
#include "screen.h"
int kmain(){
int b = test();
print('A');
return 0xcafebabe;
}
nm kernel.elf
$ nm kernel.elf
e4524ffe a CHECKSUM
00000000 a FLAGS
0010011c b kernel_stack
00004000 a KERNEL_STACK_SIZE
00100000 T kmain
001000c8 T loader
001000dd t loader.loop
1badb002 a MAGIC_NUMBER
001000b0 T outb
00100072 T print
0010002c T strlen
00100068 T test
0010005c T testFunc
gdb disassembly of kmain:
(gdb) disassemble kmain
Dump of assembler code for function kmain:
0x00100000 <kmain+0>: push %ebp
0x00100001 <kmain+1>: mov %esp,%ebp
0x00100003 <kmain+3>: sub $0x28,%esp
0x00100006 <kmain+6>: call 0x10006b <print+1> ;should call test but calls print instead
0x0010000b <kmain+11>: mov %eax,-0xc(%ebp)
0x0010000e <kmain+14>: movl $0x41,(%esp) ;pushes 'A'
0x00100015 <kmain+21>: call 0x100084 <print+26> ;calls print('A')
0x0010001a <kmain+26>: mov $0xcafebabe,%eax
0x0010001f <kmain+31>: leave
0x00100020 <kmain+32>: ret
0x00100021 <kmain+33>: nop
0x00100022 <kmain+34>: nop
0x00100023 <kmain+35>: nop
End of assembler dump.
0x00100006 <kmain+6>: call 0x10006b <print+1> ;should call test but calls print instead
<print+1> is just the label gdb chose. This instruction does call the test function, as can be seen from the address 0x10006b:
00100068 T test
00100072 T print
It'll be clearer if you look at the disassembly of the compiled "screen.c".
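The address comparison can be sketched as a small lookup over the text symbols from the nm output in the question. This is only an illustration of the reasoning, not how gdb resolves symbols; the type sym and the function nearest_symbol are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

struct sym {
    uint32_t addr;
    const char *name;
};

/* Text symbols (T/t) copied from the nm output, sorted by address. */
static const struct sym text_syms[] = {
    { 0x00100000u, "kmain" },
    { 0x0010002cu, "strlen" },
    { 0x0010005cu, "testFunc" },
    { 0x00100068u, "test" },
    { 0x00100072u, "print" },
    { 0x001000b0u, "outb" },
    { 0x001000c8u, "loader" },
    { 0x001000ddu, "loader.loop" },
};

/* Return the last symbol whose address is <= addr, or NULL if none. */
const struct sym *nearest_symbol(uint32_t addr)
{
    const struct sym *best = NULL;
    size_t n = sizeof text_syms / sizeof text_syms[0];

    for (size_t i = 0; i < n; i++) {
        if (text_syms[i].addr <= addr)
            best = &text_syms[i];
    }
    return best;
}
```

With this table, nearest_symbol(0x10006b) yields test (at offset 3) and nearest_symbol(0x100084) yields print, which matches the reading that the first call goes into test and the second into print.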
I found that the problem was in the compiler tool-chain I was using. It's what created the weird linking problem.
Here are the instructions I followed to compile a clean new Binutils + Gcc and it's working now !