Compiling without main function (MacOS) - c

So I am trying to compile, link and run a program without the main function. This is the code:
#include <stdio.h>
#include <stdlib.h>
int my_main()
{
    printf("Hello world!\n");
    return 0;
}
void _start()
{
    exit(my_main());
}
I tried to compile with the command: gcc -nostartfiles nomain.c. While it does compile and produce the a.out file on a Debian VM, I am unable to compile it on macOS Catalina v10.15.2. I am using the latest version of gcc. The message I am receiving when trying to compile is:
Undefined symbols for architecture x86_64:
  "_main", referenced from:
     implicit entry/start for main executable
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
So far I have tried changing _start to start, but I am still getting the same result. As I understand it, the compilation process differs depending on the OS.
Note: There is no problem I am trying to solve here, just curiosity.
Thank you in advance

On macOS 10.14.6 with Xcode 11.3, the code in the question compiles and links with the command:
clang -Wl,-e, -Wl,__start <Name of Your Source File>
The resulting executable appears to work. However, since it bypasses the startup code for the C environment, you should not expect that using routines from the C library or other C features will work properly.
Note that two underscores are needed before start in the above command because the source code contains one and another is added by the C compiler. If the code is changed to use start instead of _start, then the command would use one underscore:
clang -Wl,-e, -Wl,_start <Name of Your Source File>
The switches -Wl,-e, -Wl,_start pass -e _start to the linker, which tells it to use _start as the address of the initial code to execute. It is not clear to me why this bypasses the default loading of the C-run-time-startup object module, which also defines _start. I would have preferred to use a linker switch that tells it not to load that module, but I did not find one in the man page for ld. Experimentation suggests that, by default, ld loads the default object module, and it refers to main, which results in a link error, but, when -e _start is used, the linker sets the program’s _start symbol as the startup address and does not load the default object module.

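I'm pretty sure you can compile any C source without main() into an object file (for example with gcc -c).
The problem is the link step: when the linker creates an executable, it by default expects a main() for the startup code to call, so linking fails without one.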

Related

Why is clang removing an underscore from a function declared as 'extern "C"'?

I'm watching a video in an attempt to better understand object files. The presenter uses the following as an example of a program that produces a very simple object file:
extern "C" void _start() {
    asm("mov $60, %eax\n"
        "mov $24567837, %edi\n"
        "syscall\n");
}
The program is compiled via
clang++ -c step0.cpp -O1 -o step0.o
and linked via
ld -static step0.o -o step0
I get this error message when trying to link:
Undefined symbols for architecture x86_64:
"start", referenced from:
-u command line option
(maybe you meant: __start)
ld: symbol(s) not found for inferred architecture x86_64
I don't pass the -u command line option, so I'm not sure why I'm getting that error message.
clang isn't removing an underscore, it's adding an underscore. Your program is actually exporting a __start symbol, but ld expects you to have a start symbol for your entry point, i.e. ld runs with -u start by default for your architecture.
You could disable this check in ld with -U start (which suppresses the error from the start symbol being undefined) or via -undefined suppress (which suppresses all undefined symbol errors). However, you will end up with an executable that does not have an entry point for your architecture, so the program won't actually work.
Instead of suppressing the error, I suggest controlling the symbol that clang chooses directly. You can tell clang what symbol to generate by using a standalone asm declaration:
void _start() asm ("start");
Make sure this standalone declaration is separate from the function definition.
You can read more about controlling the symbols generated by gcc here: https://stackoverflow.com/a/1035937/12928775
Also, as was pointed out in a comment to a similar answer, you will most likely want to use __attribute__((naked)) on the function definition to prevent clang from generating a stack frame on entry. See: https://stackoverflow.com/a/60311490/12928775

embedding Julia in C/C++ on OSX

I am trying to compile a very simple C/C++ program to call Julia functions. Following the instructions that you find on the Julia documentation page, I set up my link path to /Users/william.calhoun/Desktop/romeo/lib/julia looking for libjulia.so and I set up my include path to /Users/william.calhoun/Desktop/romeo/include/julia looking for julia.h
I have a C file called test.c which runs the following code:
#include <stdio.h>
#include "skeleton.h"
#include <julia.h>
int main(int argc, const char * argv[]) {
jl_init(NULL);
/* run julia commands */
jl_eval_string("print(sqrt(2.0))");
/* strongly recommended: notify julia that the
program is about to terminate. this allows
julia time to cleanup pending write requests
and run all finalizers
*/
jl_atexit_hook();
return 0;
}
However this yields the following error:
Undefined symbols for architecture x86_64:
"_jl_atexit_hook", referenced from:
_main in test.o
"_jl_eval_string", referenced from:
_main in test.o
"_jl_init", referenced from:
_main in test.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I am not doing anything other than calling functions defined properly (hopefully) within the Julia source code. What am I doing wrong? This seems like the simplest example and I can't figure it out.
Any help would be much appreciated!
Linking to libjulia (libjulia.dylib on OS X)
This error is a result of not linking to libjulia, as all of the symbols (_jl_atexit_hook, _jl_eval_string, _jl_init) are located in that library. Broadly, the approach is similar on all three platforms (Windows, OS X, Linux), though the location of the libjulia library is different on Windows than on the other two; this stackoverflow question is applicable. Also, to be completely accurate, on OS X dynamic libraries have the extension .dylib, not .so as they do on Linux.
The link step
For simplicity, assuming you've compiled to object code (there is a file called embed.o), here's the link step.
cc -o embed embed.o -L/Users/william.calhoun/Desktop/romeo/lib/julia -Wl,-rpath,/Users/william.calhoun/Desktop/romeo/lib/julia -ljulia
There are 2 important things to note here.
Linking using -ljulia will allow the linker to resolve all of the above symbols.
Since this is a dynamic library located in a non-standard location (e.g. not in /usr/lib), the dynamic linker will not be able to find it at run time unless you give it special instructions on how to find it. The -rpath directive causes the linker to insert the path /Users/william.calhoun/Desktop/romeo/lib/julia into the list of paths to search.

How does this C program compile and run with two main functions?

Today, while working with one custom library, I found a strange behavior.
The static library's code contained a debug main() function. It wasn't guarded by a preprocessor flag, so it is present in the library as well. The library is linked into another program which contains the real main().
When both of them are linked together, the linker didn't throw a multiple definition error for main(). I was wondering how this could happen.
To make it simple, I have created a sample program which simulated the same behavior:
$ cat prog.c
#include <stdio.h>
int main()
{
printf("Main in prog.c\n");
}
$ cat static.c
#include <stdio.h>
int main()
{
printf("Main in static.c\n");
}
$ gcc -c static.c
$ ar rcs libstatic.a static.o
$ gcc prog.c -L. -lstatic -o 2main
$ gcc -L. -lstatic -o 1main
$ ./2main
Main in prog.c
$ ./1main
Main in static.c
How does the "2main" binary find which main to execute?
But compiling both of them together as plain object files gives a multiple definition error:
$ gcc prog.c static.o
static.o: In function `main':
static.c:(.text+0x0): multiple definition of `main'
/tmp/ccrFqgkh.o:prog.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
Can anyone please explain this behavior?
Quoting ld(1):
The linker will search an archive only once, at the location where it is specified on the command line. If the archive defines a symbol which was undefined in some object which appeared before the archive on the command line, the linker will include the appropriate file(s) from the archive.
When linking 2main, main symbol gets resolved before ld reaches -lstatic, because ld picks it up from prog.o.
When linking 1main, you do have undefined main by the time it gets to -lstatic, so it searches the archive for main.
This logic only applies to archives (static libraries), not regular objects.
When you link prog.o and static.o, all symbols from both objects are included unconditionally, so you get a duplicate definition error.
When you link a static library (.a), the linker only pulls in archive members that define symbols which are still undefined at that point. Otherwise, it takes nothing from the archive. So in your 2main case, it never uses the main from the archive, because main is already defined by the time the archive is searched.
If you include a simple function in static.c:
#include <stdio.h>
void fun()
{
printf("This is fun\n");
}
int main()
{
printf("Main in static.c\n");
}
and call it from prog.c, then the linker will be forced to pull static.o out of the archive to find the symbol fun, and you'll get the same multiple definition error for main, because the linker now sees the duplicate symbol.
When you link object files directly (as in gcc a.o b.o), there is no such selective lookup: all the symbols from every object are included to make a single binary, so duplicate symbols collide.
The bottom line is that linker looks at the archive only if there are missing symbols. Otherwise, it's as good as not linking with any libraries.
After the linker loads any object files, it searches libraries for undefined symbols. If there are none, then no libraries need to be read. Since main has been defined, even if it finds a main in every library, there is no reason to load a second one.
Linkers have dramatically different behaviors, however. For example, if your library included an object file with both main () and foo () in it, and foo was undefined, you would very likely get an error for a multiply defined symbol main ().
Modern (tautological) linkers will omit global symbols from objects that are unreachable - e.g. AIX. Old style linkers like those found on Solaris, and Linux systems still behave like the unix linkers from the 1970s, loading all of the symbols from an object module, reachable or not. This can be a source of horrible bloat as well as excessive link times.
Also characteristic of *nix linkers is that they effectively search a library only once for each time it is listed. This puts a demand on the programmer to order the libraries on the command line to a linker or in a make file, in addition to writing a program. Not requiring an ordered listing of libraries is not modern. Older operating systems often had linkers that would search all libraries repeatedly until a pass failed to resolve a symbol.

Creating a dylib which gets linked at runtime

I am trying to create a dynamic library which is meant to be linked and loaded into a host environment at runtime (e.g. similar to how class loading works in Java). As such, I want the dynamic library to be left with a few "dangling" references, which I expect it to pick up from its host environment when it is loaded into that environment.
My problem is that I cannot figure out how to create the dynamic library without explicitly linking it to existing symbols. I am hoping to produce a dynamic library that does not depend on a specific host executable (or host library), rather one that is able to be loaded (e.g. by dlopen) in any host as long as the host makes a couple symbols available for use.
Right now, any linking command I've tried results in a complaint of missing symbols. I'd like it to allow symbols to be missing (ideally, just particularly specified symbols).
For example, here's a transcript with the error on OS X:
$ cat frotz.c
void blort(void);
void run(void) {
blort();
}
$ cc -c -o frotz.o frotz.c
$ cc -dynamiclib -o libfrotz.dylib frotz.o
Undefined symbols for architecture x86_64:
"_blort", referenced from:
_run in frotz.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
If I do the same thing using a GNU toolchain (on Linux), it helpfully tells me:
$ gcc -shared -o libfrotz.so frotz.o
/usr/bin/ld: frotz.o: relocation R_X86_64_PC32 against undefined symbol `blort'
can not be used when making a shared object; recompile with -fPIC
and indeed, adding -fPIC to the C compile command seems to fix the problem in that environment. However, it doesn't seem to have any effect in OS X.
All the other dynamic-linking questions I could find on SO seem to be about the more usual arrangement of libraries, where a library is being built to be linked into an executable before that executable runs, rather than the other way around. The closest related question I found was this:
Can an executable be linked to a dynamic library after its built?
which unfortunately has very little info, none of it relevant to the question I'm asking here.
UPDATE: I distilled the info from the answer along with everything else I'd figured out, and put together this example:
https://github.com/danfuzz/dl-example
As far as my knowledge goes, you want to use weak linkage:
// mark function as weakly-linked
extern void foo() __attribute__((weak));
// inform the linker about that too
clang -dynamiclib -o bar.dylib bar.o -flat_namespace -undefined dynamic_lookup
If a weak function can be resolved at runtime, it will then be resolved. If it can't, it will be NULL, instead of generating a runtime (or, obviously, link-time) error.

Weird C Compiler, getting an error "ld: duplicate symbol _main"

I just started learning C, and wrote my hello world program:
#include <stdio.h>
main()
{
printf("Hello World");
return 0;
}
When I run the code, I get a really long error:
Apple Mach-O Linker (id) Error
Ld /Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Products/Debug/CProj normal x86_64
cd /Users/Solomon/Desktop/C/CProj
setenv MACOSX_DEPLOYMENT_TARGET 10.7
/Developer/usr/bin/clang -arch x86_64 -isysroot /Developer/SDKs/MacOSX10.7.sdk -L/Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Products/Debug -F/Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Products/Debug -filelist /Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Intermediates/CProj.build/Debug/CProj.build/Objects-normal/x86_64/CProj.LinkFileList -mmacosx-version-min=10.7 -o /Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Products/Debug/CProj
ld: duplicate symbol _main in /Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Intermediates/CProj.build/Debug/CProj.build/Objects-normal/x86_64/helloworld.o and /Users/Solomon/Library/Developer/Xcode/DerivedData/CProj-cwosspupvengheeaapmkrhxbxjvk/Build/Intermediates/CProj.build/Debug/CProj.build/Objects-normal/x86_64/main.o for architecture x86_64
Command /Developer/usr/bin/clang failed with exit code 1
I am running Xcode.
Should I reinstall DevTools?
If you read the error messages (specifically the line starting ld: duplicate symbol _main in ...), you'll notice that it's complaining about two main functions, one in:
......blah blah blah/helloworld.o
and the other in:
......yada yada yada/main.o
That means your project is screwed up somehow. Either you have two separate source files containing main or Xcode is supplying one automagically.
You just need to fix that.
Here's how to interpret that message:
Apple Mach-O Linker (id) Error
An error occurred
Ld /Users/ …
cd …
setenv …
/Developer/…
This is the command that Xcode executed to perform the linking step. You can almost always ignore it and skip past the next blank line.
ld: duplicate symbol _main in /Users/…/helloworld.o and /Users/…/main.o for architecture x86_64
This is the actual error message. It tells you that you have duplicate _main symbols, one in the file helloworld.o and one in main.o. This means you have two functions which are both called main, which isn't allowed. One of them is in helloworld.c and the other is in main.c. If you delete one of these functions or files, the error will go away.
Command /Developer/usr/bin/clang failed with exit code 1
This tells you the exit code of the command Xcode performed. It is less helpful than the error message, and I have never seen anything other than 1 for linking errors.
I ran into this problem as well. In "Target Membership", tick only the file you want to run, and untick it for the other files you don't want to run. Then try again.
It is also important to remember that you could have received this error message if you had a #include "...filename..." that created a duplicate copy of your function definitions. However, in your case, that is not likely.
Remember that #include essentially just copies and pastes the contents of the named file at the point where the #include appears.

Resources