OCaml: Issues linking C and OCaml

I am able to wrap C code and access it from the OCaml interpreter, but cannot build a binary! I'm 98% sure it is some linking problem, but can't find the tools to explore the linkage.
Getting even to this point was a chore (endless quantities of Error: The external function is not available messages), so I'll document everything I did.
A 'system' file stuff.c
#include <stdio.h>

int fun(int z) // Emulate a "real" subroutine
{
    printf("duuude whoa z=%d\n", z);
    return 42;
}
Compile above as
cc -fPIC -c stuff.c
ld -shared -o libstuff.so stuff.o
An OCaml wrapper around above, in ocmstuff.c:
#include <caml/mlvalues.h>

extern int fun(int); /* defined in stuff.c; declared here so the call compiles cleanly */

CAMLprim value yofun(value z) {
    return Val_long(fun(Long_val(z)));
}
Build above as
cc -fPIC -c ocmstuff.c
ld -shared -o dllostuff.so ocmstuff.o -L . -lstuff -lc -rpath .
Yes, the rpath really is needed, else the next steps suffer. (Edit: If you don't use rpath, you'll need to use LD_LIBRARY_PATH=. instead. For the final 'production' version, you'd change the rpath to the actual library path, do ld.so.conf trickery, install into 'standard' locations, or tell your users about LD_LIBRARY_PATH. This is just like what you'd do for any other system. The rpath approach seems to be the most stable and reliable.)
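(A sketch of a related alternative: link through the compiler driver and use an $ORIGIN-relative rpath, so the library is found wherever the build tree is copied:
cc -shared -o dllostuff.so ocmstuff.o -L. -lstuff -Wl,-rpath,'$ORIGIN'
The single quotes keep the shell from expanding $ORIGIN; the dynamic loader substitutes the directory containing the object itself.)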
Next, a module declaration, stored in fapi.mli
module Fapi : sig
external ofun : int -> int = "yofun" ;;
end
Build above as:
ocamlc -a -o fapi.cma -intf fapi.mli -dllib -lostuff
Does it work? Yes it does:
$ rlwrap ocaml fapi.cma
OCaml version 4.11.1
# open Fapi ;;
# Fapi.ofun 33 ;;
duuude whoa z=33
- : int = 42
#
So the wrapper works fine. Now let's compile with it. Here's myprog.ml:
open Fapi ;;
Fapi.ofun 33 ;;
Compile it:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
The very last command spews:
File "_none_", line 1:
Error: Required module `Fapi' is unavailable
I am 98% sure the above error is due to some silly linking error, but I cannot track it down. Why do I think this? Well, here's a related problem that provides a hint.
$ rlwrap ocaml
# open Fapi ;;
# Fapi.ofun 33 ;;
Error: The external function `yofun' is not available
#
Well, that's odd. It clearly must have found fapi.cma, because that is the only way it can know about yofun. But somehow it doesn't know it needs to dig into dllostuff.so for that. Or possibly dllostuff.so is failing to correctly link/load libstuff.so? Or maybe libc.so, to get printf? I'm pretty sure it's one of these last few, but I just can't get it to work, and don't have the tools to debug it. (nm and ldd -r look healthy. Are there similar tools for the assorted cma, cmo, cmi, cmx files?)

Interfacing with C is much easier if you use dune. You don't need to know the low-level details; it is all handled for you.
Now, back to your example. This is definitely not how OCaml users interface with C, but if you really want to learn about it, here are a few notes.
The reason why you have the error is that:
you specified the modules in an incorrect order; it should be topological order, not reverse topological order, i.e., the dependency comes before the dependent
you do not have the .ml file (the -intf option means something very different)
The reason why the last snippet doesn't work is that you're not loading the library. The ocaml binary obviously doesn't have any fapi units linked into it, so you have to load the library explicitly, either with the #load directive or by passing it on the command line.
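For instance, once the fixed fapi.cma from below is built, an explicit #load session looks roughly like this (a sketch; depending on your setup you may also need LD_LIBRARY_PATH=. and CAML_LD_LIBRARY_PATH=. so the loaders can find libstuff.so and dllfapi.so):
$ ocaml
# #load "fapi.cma";;
# Fapi.ofun 33;;
duuude whoa z=33
- : int = 42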
Also the following line is not necessary,
ld -shared -o dllostuff.so ocmstuff.o -L . -lstuff -lc -rpath .
First of all, there is no need to link a stub file into a shared library. It is counterproductive and doesn't really bring you anything. Second, passing -rpath . will render the end executable unusable, unless the shared objects are stored in the same folder as the executable. Just remove this.
Just to complete your exercise, here is how it could be built and run. First, let's fix the fapi files. We need the ml file, and we also need to remove the extra module definition,
$ cat fapi.{ml,mli}
external ofun : int -> int = "yofun" ;;
external ofun : int -> int = "yofun" ;;
Yes, they are the same. The mli file is not really needed here, but let's keep it for the sake of completeness.
The way you build the pure C part is fine; as long as you get a relocatable .so file, it works.
Now, to build ocmstuff.c (which we conventionally call the stubs), you just need to do,
ocamlc -c ocmstuff.c
Don't turn it into a shared library, don't do anything else with it. Now let's build the fapi library,
ocamlc -c fapi.mli
ocamlc -c fapi.ml
Now let's build the library that contains both OCaml and C code,
ocamlmklib -o fapi fapi.cmo ocmstuff.o -lstuff -L.
Now we can finally build the executable,
ocamlc -c myprog.ml
LD_LIBRARY_PATH=. ocamlc -o myprog fapi.cma myprog.cmo
and run it,
LD_LIBRARY_PATH=. ./myprog
duuude whoa z=33
Notice that we have to use LD_LIBRARY_PATH to tell the system dynamic loader where to look for the external dependency libstuff.so. You can, of course, use rpath to specify its location (pass it to ocamlmklib via -ccopt), but in general it is assumed that the external library is installed in a location the system loader knows about.
Again, unless you're developing your own build system, please use dune or oasis for building OCaml programs. These systems will handle all low-level details in the best possible way.
P.S. It is also worth mentioning that you're not building a binary, but a bytecode executable. For binaries, you will have to use the ocamlopt compiler. And this would be a completely different story. Again, dune is the solution.
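For completeness, a hedged sketch of what the native build might look like (untested; ocamlmklib produces fapi.cmxa and libfapi.a from the same inputs, and libstuff.so remains a runtime dependency, so the LD_LIBRARY_PATH caveat still applies):
ocamlopt -c fapi.mli fapi.ml
ocamlopt -c myprog.ml
ocamlmklib -o fapi fapi.cmx ocmstuff.o -lstuff -L.
ocamlopt -o myprog fapi.cmxa myprog.cmx
LD_LIBRARY_PATH=. ./myprog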

There is a lot to take in here, but these lines are suspicious:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
OCaml expects modules in topologically sorted order, with a module appearing on the command line before the modules that refer to it.
So it would seem the last line should be this:
ocamlc -o myprog fapi.cma myprog.cmo
I hope this helps, it's just a quick response.

The answer provided by ivg works. It also provides enough hints to retrofit the original question to get the correct behavior. The changes to the original recipe are:
Create fapi.mli and fapi.ml which both have the same content: external ofun : int -> int = "yofun" ;;
Compile both of the above with ocamlc -c. The mli must be compiled first: it yields an interface file (cmi) which is needed before the ml file can be compiled into its object file (cmo).
The name dllostuff.so was wrong: it must be dllfapi.so to maintain naming consistency.
Build the cma archive/library as ocamlc -a -o fapi.cma fapi.cmo -dllib -lfapi
That's it! Other than these, the original instructions work. The answer from ivg suggests using
ocamlmklib -o fapi fapi.cmo ocmstuff.o -L. -lstuff
instead of
ld -shared -o dllfapi.so ocmstuff.o -L. -lstuff
Either of these works. The primary difference is that ocamlmklib also creates a statically-linked library libfapi.a. Other than that, it creates dllfapi.so as before. (That version also contains a motley assortment of typical gcc symbols, for handling exceptions, library ctors, etc. It's not clear why these are needed here, since they'll show up sooner or later anyway.)
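(As for the question's closing aside about tools for the assorted cma, cmo, cmi, cmx files: the compiler distribution ships ocamlobjinfo, a rough analogue of nm for OCaml object files:
ocamlobjinfo fapi.cma
It prints the unit names in the archive along with the extra C libraries and dynamically-loaded DLLs recorded in it, which is exactly the information needed to debug the 'external function is not available' error.)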

Related

Running c program from command line in one step?

I just got started writing some C programs.
To start with I was just running them through VS code. Nice and easy, I just had to press a button and bam, there it was.
But now I need to pass files as arguments to my program, which creates the need to run it from the command line.
The way I do it now is using this two-step process (which I think is just the basic way of doing it):
ask@Garsy:~/Notes/ethHack/crpytifiles$ gcc test.c -o test
and then running the file:
ask@Garsy:~/Notes/ethHack/crpytifiles$ ./test
This is a bit tedious in the long run. Is there any way I could do this process in one step?
And perhaps also without creating the executable?
It would be really cool if I could just run it as you normally would with a python or java file, one command, and the thing runs.
You could do that with a makefile. More about GNU Make here.
all:
	gcc test.c -o test
	./test
The file should be called Makefile or makefile (it can have different names, just keeping it simple), and you can run it by executing:
make
Assuming you have GNU Make installed and test.c is located in the same directory as the makefile.
This is a bit tedious in the long run. Is there any way I could do this process in one step?
Yes. You could create a shell function (or an alias if your shell supports alias arguments, which bash does not), e.g.:
ccr() { gcc "$1" -o x.$$ && ./x.$$; rm -f x.$$; }
$ ccr hello.c
Hello, world!
$
which will compile the file, run it if compilation succeeded, then remove the compiled binary.
And perhaps also without creating the executable?
No (well, not easily). Executing binaries is offloaded to the exec*() function family, and the operations performed are complex and, I suspect, incompatible with stdin operations. So you cannot send the executable to a pipe and execute it from the pipe.
What you can do is use a C interpreter, though it is not exactly the same thing.
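For example, tcc can compile and run a source file in one step, passing program arguments straight through (assuming tcc is installed):
$ tcc -run test.c file1.txt file2.txt
You can even put a line like #!/usr/bin/tcc -run at the top of the source and execute it like a script. And while you cannot execute from a pipe, gcc can at least read the source from one (the binary still lands on disk):
$ echo 'int main(void){return 0;}' | gcc -x c - -o test && ./test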
I am surprised that nobody has drawn the general comparison between an IDE and the shell. So yes, an IDE may give you some comfort. But you will be happy if you learn the fundamentals of linking & co. from scratch; otherwise the configuration of the IDE can get pretty challenging when you start something that does not work out of the box.
The rest are helpful tips to increase your efficiency in the shell, like make or other build automation. Shell editors provide additional tools and plugins to improve your workflow; e.g., with vim as a shell editor (and some plugins) you come pretty close to an IDE. This includes syntax highlighting, code checking, compiling and running the program, etc. Just my 2 cents.
As @alex01011 correctly stated, what you need is a Makefile, and his solution should work. What I want to suggest here is a better Makefile.
First, make already knows how to build test from test.c in the simple case. It will add parameters for the preprocessor, compilation and linker steps from Makefile variables, so it is better to use the built-in rules for better flexibility.
# Tell make that `all` and `run` are technically not files that will be built
.PHONY: all run
# These flags are passed to the compiler; we always want to compile with
# warnings when developing
CFLAGS = -Wall
# `all` is the first rule, so that is the one that will be built when not
# specifying anything on the command line
# `all` also depends on `test`, so that will be built from `test.c` when
# calling `make` or `make all`
all: test
# `make run` will run your command. `run` depends on `all` to make sure the
# program exists before calling `./test`
# Note that the indent must be made with a tab and not spaces
run: all
	./test
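With the built-in rules doing the compilation, a session might look roughly like this (the exact command echoed depends on your make's built-in defaults):
$ make run
cc -Wall test.c -o test
./test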
If your program is composed of more files, things get a little more complicated, but still easily manageable:
# Example of a Makefile for a project that is composed of the files
# test.c, foo.c, bar.c, foo.h and bar.h
# The main function is in test.c, and the generated program will be
# called `test`
#
.PHONY: all run
CFLAGS = -Wall
all: test
# foo.c includes foo.h, therefore foo.o depends on foo.h in addition to foo.c
foo.o: foo.h
# bar.c includes bar.h, therefore bar.o depends on bar.h in addition to bar.c
bar.o: bar.h
# test.c includes both foo.h and bar.h
test.o: foo.h bar.h
# test should be linked with foo.o and bar.o in addition to test.o
test: foo.o bar.o
run: all
	./test
Now typing make run will automatically build and link test, if needed, and then run ./test if there were no errors.
Other variables you may set in addition to CFLAGS are CC, CPPFLAGS, LDFLAGS, LOADLIBES and LDLIBS.
Often you also want a clean target in your Makefile, so that typing make clean removes generated files. See info make for more details.
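For the multi-file example above, a minimal clean target might look like this:
.PHONY: clean
clean:
	rm -f test test.o foo.o bar.o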

How to compile a C program without knowing the include files

I have some example C code that I'm looking to adapt to suit my needs. Before that, I'm trying to compile the example as it is. The C code contains a #include reference, and I can find the .h file in an 'inc' directory. There is also a corresponding 'lib' directory. I am struggling to find the command line I need to compile the code.
So far I've managed to get to the following:
gcc -o amqsinqa -I/opt/mqm/inc amqsinqa.c -L/opt/mqm/lib -lcmqc
But it fails with 'cannot find -lcmqc'. I've looked in lib, and quite correctly there is no cmqc. How do I determine what -l option I need here?
The code looks fairly simple; there is the include reference:
#include <cmqc.h>
And the call itself:
MQCONN(QMgrName,&Hcon,&CompCode,&CReason);
If I omit the -l option from the command line I get:
undefined reference to 'MQCONN'
Which isn't a surprise. MQCONN is present in cmqc.h though.
To try to help others, this reference is useful:
64-bit apps: https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.1.0/com.ibm.mq.dev.doc/q028490_.htm
32-bit apps: https://www.ibm.com/support/knowledgecenter/SSFKSJ_9.1.0/com.ibm.mq.dev.doc/q028480_.htm
In summary:
-I is for the product includes, which (for Linux) are usually in /opt/mqm/inc
-L is the path to the libraries, which (for Linux) are usually in /opt/mqm/lib (for 32-bit applications) and /opt/mqm/lib64 (for 64-bit applications)
-l (lower case L) is for the required library/libraries,
and the actual library you need is either:
mqm - server-bound C applications (i.e. -lmqm, which links with libmqm.so)
mqic - client-bound C applications (i.e. -lmqic, which links with libmqic.so)
.. with a suffix of _r if you are building a threaded application (i.e. you are linking with -lpthread as well), giving -lmqm_r or -lmqic_r, which in effect link with libmqm_r.so or libmqic_r.so
cmqc.h is the name of the main header file, and there are other cmq*.h headers you can optionally include as well.
If you are using the (stabilized) C++ libraries, there are other libraries to include on the command line, but that's outside the scope of this answer; see the referenced links.
Thanks to all the above for the guidance. Looks like I was missing a few things. This is what I did:
Use nm to identify which .so file contained what I wanted. This returned libmqm.so.
Move that into the -l option, which gave me:
gcc -o amqsinqa -I/opt/mqm/inc amqsinqa.c -L/opt/mqm/lib -lmqm
But it left me with a 'skipping incompatible' warning followed by a 'cannot find' error.
The most common Google answer to this issue was a 32/64-bit mismatch, so I searched for a 64-bit version of the same library, which ended up being in lib64. So the final compile command is:
gcc -o amqsinqa -I/opt/mqm/inc amqsinqa.c -L/opt/mqm/lib64 -lmqm
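For reference, the nm search from step 1 can be scripted; a sketch using the paths from this question:
for lib in /opt/mqm/lib64/*.so; do nm -D "$lib" 2>/dev/null | grep -q ' T MQCONN' && echo "$lib"; done
nm -D dumps each shared object's dynamic symbol table; a 'T' entry means the library defines the symbol rather than merely referencing it.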
You should review the gcc options, in particular the '-m' option.
If you want to build a 32-bit MQ application then you do:
gcc -m32 -o amqsinqa -I/opt/mqm/inc amqsinqa.c -L/opt/mqm/lib -lmqm
If you want to build a 64-bit MQ application then you do:
gcc -m64 -o amqsinqa -I/opt/mqm/inc amqsinqa.c -L/opt/mqm/lib64 -lmqm

a linker issue when learning static library [duplicate]

When I try to build the following program:
#include <stdio.h>
int main(void)
{
printf("hello world\n");
return 0;
}
On OS X 10.6.4, with the following flags:
gcc -static -o blah blah.c
It returns this:
ld: library not found for -lcrt0.o
collect2: ld returned 1 exit status
Has anyone else encountered this, or is it something that no one else has been affected by yet? Any fixes?
Thanks
This won’t work. From the man page for gcc:
This option will not work on Mac OS X unless all libraries (including libgcc.a) have also been compiled with -static. Since neither a static version of libSystem.dylib nor crt0.o are provided, this option is not useful to most people.
Per Nate's answer, a completely static application is apparently not possible - see also man ld:
-static Produces a mach-o file that does not use the dyld. Only used building the kernel.
The problem in linking with static libraries is that, if both a static and a dynamic version of a library are found in the same directory, the dynamic version will be taken in preference. Three ways of avoiding this are:
Do not attempt to find them via the -L and -l options; instead, specify the full paths to the libraries you want to use on the compiler or linker command line.
$ g++ -Wall -Werror -o hi /usr/local/lib/libboost_unit_test_framework.a hi.cpp
Create a separate directory, containing symbolic links to the static libraries, use the -L option to have this directory searched first, and use the -l option to specify the libraries you want to use.
$ g++ -Wall -Werror -L ./staticBoostLib -l boost_unit_test_framework -o hi hi.cpp
Instead of creating a link of the same name in a different directory, create a link of a different name in the same directory, and specify that name in a -l argument.
$ g++ -Wall -Werror -l boost_unit_test_framework_static -o hi hi.cpp
You may also try the LLVM LLD linker; I made prebuilt versions for my two major OSes: https://github.com/VerKnowSys/Sofin-llds
This one allows me to link, for example, Qemu properly, which is impossible with the ld preinstalled by Apple.
And the last option is to build GCC yourself with libstdc++ (don't).

Trying to understand the main function with GCC and Windows

They say that main() is a function like any other function, but "marked" as an entry point inside the binary, an entry point that the operating system can find (I don't know how) and start the program from. So, I'm trying to find out more about this function. What have I done? I created a simple .c file with this code inside:
int main(int argc, char **argv) {
return (0);
}
I saved the file, installed the GCC compiler (on Windows, in a MinGW environment) and created a batch file like this:
gcc -c test.c -nostartfiles -nodefaultlibs -nostdlib -nostdinc -o test.o
gcc -o test.exe -nostartfiles -nodefaultlibs -nostdlib -nostdinc -s -O2 test.o
#%comspec%
I did this to obtain a very simplistic compiler and linker: no libraries, no headers, just the compiler. The compiling goes well, but the linking stops with this error:
test.c:(.text+0xa): undefined reference to '___main'
collect2.exe: error: ld returned 1 exit status
I thought that the main function is exported by the linker, and I believed that you didn't need any library with additional information about it. But it looks like you do. In my case I supposed that it must be the standard GCC library, so I downloaded its source code and opened this file: libgcc2.c
Now, I don't know if that is the file where the main function is constructed to be linked by GCC. In fact, I don't understand how the main function is used by GCC. Why does the linker need the GCC standard libraries? To know what about main? I hope this has made my question quite specific and clear. Thanks!
When gcc puts together all object files (test.o) and libraries to form a binary, it also prepends a small object (usually crt0.o or crt1.o), which is responsible for calling your main(). You can see what gcc is doing when you add -v on the command line:
$ gcc -v -o test.exe test.o
crt0/crt1 does some setup and then calls into main. But the linker is finally responsible for building the executable according to the OS. With -v you can also see an option for the target system. In my case it's for Linux 64-bit: -m elf_x86_64. For your system this will be something like -m windows or -m mingw.
The error happens because you use these two options: -nodefaultlibs -nostdlib
These tell GCC that it should not link your code against libc.a/c.lib, which contains the code that really calls main(). In a nutshell, every OS is slightly different and most of them don't care about C and main(). Each has its own special way to start a process, and most of them are not compatible with the C API.
So the solution of the C developers was to put "glue code" into the C standard library libc.a, which contains the interface the OS expects, creates the standard C environment (setting up the memory allocation structures so malloc() maps to the OS's memory management functions, setting up stdio, etc.) and eventually calls main().
For C developers, this means they get a libc.a for their OS (along with the compiler binaries) and they don't need to care about how the setup works.
Another source of confusion is the name of the referenced symbol. On most systems, the symbolic name of main() is _main (i.e. one underscore), while __main is the name of an internal function called by the setup code which eventually calls the real main().
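To see concretely what the CRT normally provides, here is a hedged sketch of a CRT-less MinGW program that resolves the ___main error by supplying the missing pieces itself (untested; symbol decoration and the default entry-point name vary between 32-bit and 64-bit toolchains):
/* tiny.c - build sketch: gcc -nostartfiles -nostdlib -o tiny.exe tiny.c -lkernel32 */
__declspec(dllimport) void __stdcall ExitProcess(unsigned int code);

void __main(void) {} /* stub for the constructor hook gcc injects into main() */

int main(void) { return 0; }

/* the PE entry point the MinGW linker expects when no CRT is linked */
void mainCRTStartup(void) { ExitProcess((unsigned int)main()); }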

How to replace LD variable in a Makefile to link C objects

I'm writing a Makefile for C. I want to be able to specify different programs for compilation and linking via environment variables. However, I want it to work without any additional variables too. I was trying to link with ld; however, the default doesn't link with the standard C library.
The question:
How to link a C program with ld or $LD?
Is it possible to get the appropriate flags from cc?
I cannot use $(CC) in place of $(LD). LD ?= cc doesn't work either.
I want something like this to be true:
Environment variable CC set to tcc.
Environment variable LD unset.
My Makefile compiles using tcc and links using the system default linker for C.
Unfortunately, some C compilers are unable to link some libraries. I have this problem with tcc and glfw.
P.S.
Linux user
The conditional assignment LD ?= cc cannot work, since LD is predefined.
If you want to start make without predefined variables, use the option -R:
> make -p | grep LD
...
LD = ld
...
> make -p -R | grep LD
>
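Building on that, a sketch of a way to keep the ?= spirit without -R: test where LD came from with $(origin ...) and only replace make's built-in default:
# default the linker to the compiler driver, but respect a user-supplied LD
CC ?= cc
ifeq ($(origin LD),default)
LD = $(CC)
endif

main: main.o
	$(LD) -o $@ $^
$(origin LD) reports 'default' for the built-in LD = ld, but 'environment' or 'command line' when the user supplied one, so an explicit LD=... still wins.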
Instead of using ld as the linker, use gcc or g++. They add the appropriate command line options for getting libraries and startup code, etc. In other words:
ld -o main main.o
is equivalent to:
gcc -o main main.o
except that gcc adds all the command line parameters when it calls ld.
In other words: LD=gcc.
One of the main features of tcc is that:
tcc is a compiler and a linker [...] it can compile and execute C source directly. No linking or assembly necessary
There's more detail in the tcc documentation; it says about linking that:
Dynamic ELF libraries can be output but the C compiler does not generate position independent code (PIC). It means that the dynamic library code generated by TCC cannot be factorized among processes yet.
It means that, if you want to use tcc, you'll need to link with tcc.
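That is, the whole build stays inside tcc, e.g. (a sketch, with -lm standing in for a library tcc is known to handle, unlike the glfw case from the question):
tcc -c main.c
tcc -o main main.o -lm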
As I see it, you can define target-specific values for variables:
build_with_tcc: CC = tcc
build_with_tcc: compile_tcc
compile_tcc:
	## Commands to do a full build with tcc.
build_with_gcc: CC = gcc
build_with_gcc: LD = g++
build_with_gcc: link_gcc
compile_gcc:
	## Commands to compile with gcc.
link_gcc: compile_gcc
	## Commands to link with g++.
And build it by calling the appropriate rule.
If, on the other hand, you wish to be able to pass an arbitrary compiler toolchain, you will have to accept some restrictions anyway.
The rule:
build_with_arbitrary: compile_arbitrary link_arbitrary
This implies that your build must be done in two steps, and the respective rules (compile_arbitrary and link_arbitrary) must follow the same command-line conventions.
So you can invoke make with custom CC and LD variables:
CC=any_cc LD=any_ld make build_with_arbitrary
Lastly, you can add a dirty check in the linker step for LD being empty, and only perform the link if it is not (note the inverted test, so the rule doesn't fail when LD is empty):
link_arbitrary:
	[ -z "$(LD)" ] || do_linker_stuff
So you could use build_with_arbitrary even for a compiler that does everything in a single step, just by passing:
CC=any_cc LD= make build_with_arbitrary
I hope I have correctly understood your question. Sorry if I misunderstood, and please tell me where I am wrong.
