How to write compilation rules for a bootstrapping compiler - shake-build-system

I want to write build rules for a self-hosted compiler. Taking the example of GHC, the GHC compiler is written in Haskell, and compiles Haskell. I want to first compile the source using an existing copy of the GHC compiler (phase1), then compile the compiler using the phase1 compiler (phase2), then compile the compiler using the phase2 compiler (phase3). How can I encode that in Shake?

This problem is similar to writing fixed-point build rules. Some assumptions:
I assume each source file is compiled to one object file with no additional dependencies (the complexities of include/import files are orthogonal)
I assume the objects and results from phase1 end up in the directory phase1 etc.
You can define:
want ["phase3/ghc" <.> exe]
-- "phase2/Main.o" -> 2
let getPhase x = read $ drop (length "phase") $ takeDirectory1 x :: Int
"//*.o" *> \out -> do
    let src = dropDirectory1 out -<.> "hs"
    let phase = getPhase out
    -- phase 1 uses the existing ghc, later phases use the previous phase's compiler
    let compiler = if phase == 1 then "ghc" else "phase" ++ show (phase-1) </> "ghc" <.> exe
    need $ src : [compiler | phase /= 1]
    cmd [compiler] "-c" [src] "-o" [out]
("//ghc" <.> exe) *> \out -> do
    let os = map (takeDirectory1 out </>) ["Main.o","Module2.o",...]
    need os
    cmd "link -o" [out] os
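The path arithmetic in the rules above can be checked on its own. The following is a minimal Python sketch of the same mapping, not Shake code: the helper names phase_of, compiler_for and source_for are hypothetical, and it assumes forward-slash paths of the form phaseN/....

```python
from pathlib import PurePosixPath

def phase_of(out: str) -> int:
    """Extract the phase number from a path such as 'phase2/Main.o'."""
    top = PurePosixPath(out).parts[0]          # e.g. 'phase2'
    return int(top[len("phase"):])

def compiler_for(out: str, exe: str = "") -> str:
    """Phase 1 uses the existing 'ghc'; phase N uses the compiler built in phase N-1."""
    phase = phase_of(out)
    if phase == 1:
        return "ghc"
    return f"phase{phase - 1}/ghc{exe}"

def source_for(out: str) -> str:
    """Drop the leading phase directory and swap the extension for .hs."""
    rest = PurePosixPath(*PurePosixPath(out).parts[1:])
    return str(rest.with_suffix(".hs"))

if __name__ == "__main__":
    for obj in ["phase1/Main.o", "phase2/Main.o", "phase3/Main.o"]:
        print(obj, "<-", source_for(obj), "via", compiler_for(obj))
```

Every phase compiles the same source file; only the compiler changes, which is why the object rule depends on the previous phase's ghc for any phase other than 1.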

Related

Why does CMake set -fno-fat-lto-objects when I enable LTO/IPO?

I'm enabling IPO (inter-procedural optimization) for a C compilation of mine, using CMake:
set_property(TARGET foo PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
As expected, this causes an -flto compiler flag to be added. However, it also adds -fno-fat-lto-objects: That means that the resulting object file will only have intermediate code, rather than both properly-compiled and intermediate code; and that means that the linker must support my system compiler's intermediate representation and be IPO/LTO-aware.
I didn't ask for -fno-fat-lto-objects, nor did I want it. Can I get CMake to not add this option?
IMNSHO this is a CMake bug, which I have filed as:
https://gitlab.kitware.com/cmake/cmake/-/issues/23136
The developers have simply made the incorrect assumption that this is what people want.
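# Override the IPO flags for GCC. CMAKE_C_COMPILER_ID is "GNU" for GCC,
# whereas CMAKE_C_COMPILER holds the path to the compiler.
if(CMAKE_C_COMPILER_ID MATCHES "GNU")
  set(CMAKE_C_COMPILE_OPTIONS_IPO "-flto")
endif()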
How to find it:
Navigate to your CMake installation directory and to Modules, most of the stuff is there.
It's /usr/share/cmake/Modules on my Linux system
Find the string or similar string that you are interested in
on my system, I do:
$ grep fno-fat-lto-objects -r .
./Compiler/GNU.cmake: list(APPEND __lto_flags -fno-fat-lto-objects)
Navigate and inspect the resulting files, the context where the string is used:
# '-flto' introduced since GCC 4.5:
# * https://gcc.gnu.org/onlinedocs/gcc-4.4.7/gcc/Option-Summary.html (no)
# * https://gcc.gnu.org/onlinedocs/gcc-4.5.4/gcc/Option-Summary.html (yes)
if(NOT CMAKE_${lang}_COMPILER_VERSION VERSION_LESS 4.5)
  set(_CMAKE_${lang}_IPO_MAY_BE_SUPPORTED_BY_COMPILER YES)
  set(__lto_flags -flto)
  if(NOT CMAKE_${lang}_COMPILER_VERSION VERSION_LESS 4.7)
    # '-ffat-lto-objects' introduced since GCC 4.7:
    # * https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Option-Summary.html (no)
    # * https://gcc.gnu.org/onlinedocs/gcc-4.7.4/gcc/Option-Summary.html (yes)
    list(APPEND __lto_flags -fno-fat-lto-objects)
  endif()
  set(CMAKE_${lang}_COMPILE_OPTIONS_IPO ${__lto_flags})
Come up with a workaround that overrides the behavior of that code.
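The search step can also be scripted. This is a small Python sketch of the same recursive grep, assuming the modules live in files ending in .cmake. The function name find_flag is hypothetical, and the directory is the one mentioned above for one particular Linux system, so it may differ on yours.

```python
from pathlib import Path

def find_flag(modules_dir, needle):
    """Return (relative path, line number, stripped line) for each match under modules_dir."""
    root = Path(modules_dir)
    hits = []
    if not root.is_dir():
        return hits
    for path in sorted(root.rglob("*.cmake")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits

if __name__ == "__main__":
    # Directory taken from the answer above; adjust for your installation.
    for rel, lineno, line in find_flag("/usr/share/cmake/Modules", "fno-fat-lto-objects"):
        print(f"{rel}:{lineno}: {line}")
```

The line numbers make it easier to jump straight to the surrounding if() block and see which compiler versions trigger the flag.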

How to integrate gnatmake/gnatbind/gnatlink in CMake files for C/Ada code?

I wrote a program in several languages (C, C++, Fortran77, Fortran90) and I can compile it without any problem using CMake.
Now I would like to call an Ada function from main(), which is written in C, and build everything with CMake. I am not able to link my Ada function to the main program using CMake; I get
main.c:(.text.startup+0x16a): undefined reference to adainit
main.c:(.text.startup+0x179): undefined reference to adafunction
main.c:(.text.startup+0x190): undefined reference to adafinal
I did another simplified test by using the main function (written in C) calling the only Ada function, which I coded, and I compiled it by using
gcc -c main.c
gnatmake -c lib_ada.ali
gnatbind -n lib_ada.ali
gnatlink lib_ada.ali main.o -o exe
and it works. Do you know how I can integrate this approach in a CMakeLists.txt?
Note: I think (I may be mistaken) that I cannot use gnatlink alone, because I need to link all the other functions I already have.
Here is a minimal reproducible example.
--- main.c ---
#include <stdio.h>
/* adainit/adafinal are generated by gnatbind and return nothing */
extern void adainit(void);
extern void adafinal(void);
extern int Add(int, int);
int main(void)
{
    adainit();
    printf("Sum of 3 and 4 is: %d\n", Add(3, 4));
    adafinal();
    return 0;
}
--- lib_test.adb ---
package body Lib_Test is
   function Ada_Add (A, B : Integer) return Integer is
   begin
      return A + B;
   end Ada_Add;
end Lib_Test;
--- lib_test.ads ---
package Lib_Test is
   function Ada_Add (A, B : Integer) return Integer;
   pragma Export (C, Ada_Add, "Add");
end Lib_Test;
Test 1: if you compile using the following commands:
gcc -c main.c
gnatmake -c lib_test.adb
gnatbind -n lib_test.ali
gnatlink lib_test.ali main.o -o exe
and run ./exe you get Sum of 3 and 4 is: 7.
Test 2: I tried to use the following CMake file (CMakeLists.txt), linking the *.a
cmake_minimum_required(VERSION 2.6)
project(Ada2C)
enable_language(C)
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake")
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set(CMAKE_VERBOSE_MAKEFILE ON)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O3 -m64")
find_library(TEST_lib lib_test.a PATHS ${CMAKE_CURRENT_SOURCE_DIR})
message(STATUS "Finding library: ${TEST_lib}")
add_executable(TEST_release ${CMAKE_CURRENT_SOURCE_DIR}/main.c)
target_link_libraries(TEST_release ${TEST_lib})
I generate library lib_test.a for the Ada function
gnatmake lib_test.adb
ar rc lib_test.a
I run the cmake and make and I get
main.c:(.text.startup+0x16a): undefined reference to adainit
main.c:(.text.startup+0x179): undefined reference to adafunction
main.c:(.text.startup+0x190): undefined reference to adafinal
More of a comment than an answer, but too long for a comment, so here goes:
Compiling Ada code into your binary means that your binary needs access to the GNAT runtime. This is one thing gnatlink does when you use it to link the final executable. The other thing is the b~<something>.ad{s,b} source gnatbind generates which you need to compile and link against as others mentioned.
The cleanest way to embed Ada in C I've seen so far is to create an encapsulated library. This probably does not make sense if your actual problem is with only one Ada function, but it does with larger chunks of Ada. The encapsulated library will be a shared library that has GNAT's runtime baked in. Being a shared library enables it to implicitly handle initialization during library loading so you don't need adainit() / adafinal() anymore.
The easiest way to create an encapsulated library is to use an ada_code.gpr file:
project ada_code is
   for Library_Name use "mylib";
   for Library_Dir use "lib";
   for Library_Kind use "relocatable";
   for Library_Standalone use "encapsulated";
   for Library_Auto_Init use "true";
   for Library_Interface use ("All", "Packages", "In.Your", "Ada.Code");
   for Source_Dirs use ("adasrc");
end ada_code;
In CMake, you can then do:
# tell CMake how to call `gprbuild` on the `.gpr` file.
# you may need to replace `gprbuild` with the absolute path to it
# or write code that finds it on your system.
add_custom_target(compile_mylib
  COMMAND gprbuild -P ada_code.gpr)
# copy the library file generated by gprbuild to CMake's build tree
# (you may skip this and just link against the file in the source tree).
# With Library_Name "mylib", gprbuild names the file libmylib.so on Linux.
add_custom_command(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/libmylib.so
  DEPENDS compile_mylib
  COMMAND ${CMAKE_COMMAND} -E copy
    ${CMAKE_SOURCE_DIR}/lib/libmylib.so
    ${CMAKE_CURRENT_BINARY_DIR}/libmylib.so)
# ... snip ...
# link to the copied library
# I am not 100% sure this adds the correct dependency to the custom command.
# You may need to experiment a bit yourself
target_link_libraries(TEST_release ${CMAKE_CURRENT_BINARY_DIR}/libmylib.so)
In your C file, you can then delete everything related to adainit() and adafinal().

Portable Makevars for R package using C, GSL and OpenMP with help of Rcpp

I am constructing an R package where I have a few third-party C functions which use GSL and OpenMP, and I then call these from R using wrappers made with Rcpp (which just modify some arguments and call the appropriate C function). Everything works fine on my Windows machine, but I am not sure how to define the Makevars.win and Makevars files in a portable way. My Makevars.win looks like this:
## This assumes that the LIB_GSL variable points to working GSL libraries
PKG_CPPFLAGS=-I$(LIB_GSL)/include -I../inst/include
PKG_LIBS=-L$(LIB_GSL)/lib/x64 -lgsl -lgslcblas $(SHLIB_OPENMP_CFLAGS)
PKG_CFLAGS=$(SHLIB_OPENMP_CFLAGS)
It is basically copied from various sources. As I said, this compiles on my computer (using the toolchain from Rtools), and if I remove PKG_CFLAGS=$(SHLIB_OPENMP_CFLAGS) I can also compile without OpenMP (for some reason I don't understand, I get an error if I remove the OpenMP flag from PKG_LIBS).
My Makevars file looks identical, but I don't have access to Unix platforms so I am not sure how to deal with that side. My guess is that I need to replace LIB_GSL with something?
EDIT:
Okay, I think I finally understand how configure etc. works and was able to get everything working.
My Makevars.win:
## This assumes that the LIB_GSL variable points to working GSL libraries
PKG_CPPFLAGS=-I$(LIB_GSL)/include -I../inst/include
PKG_LIBS="-L$(LIB_GSL)/lib/$(R_ARCH)" -lgsl -lgslcblas $(SHLIB_OPENMP_CFLAGS)
PKG_CFLAGS=$(SHLIB_OPENMP_CFLAGS)
My Makevars.in:
GSL_CFLAGS = @GSL_CFLAGS@
GSL_LIBS = @GSL_LIBS@
PKG_LIBS=$(GSL_LIBS) $(SHLIB_OPENMP_CFLAGS)
PKG_CFLAGS=$(GSL_CFLAGS) $(SHLIB_OPENMP_CFLAGS)
And my configure.ac:
AC_INIT([testpackage], 0.0.1)
## Use gsl-config to find arguments for compiler and linker flags
##
## Check for non-standard programs: gsl-config(1)
AC_PATH_PROG([GSL_CONFIG], [gsl-config])
## If gsl-config was found, let's use it
if test "${GSL_CONFIG}" != ""; then
    # Use gsl-config for header and linker arguments
    GSL_CFLAGS=`${GSL_CONFIG} --cflags`
    GSL_LIBS=`${GSL_CONFIG} --libs`
else
    AC_MSG_ERROR([gsl-config not found, is GSL installed?])
fi
# Now substitute these variables in src/Makevars.in to create src/Makevars
AC_SUBST(GSL_CFLAGS)
AC_SUBST(GSL_LIBS)
AC_OUTPUT(src/Makevars)
I then run autoconf in the testpackage directory to generate the configure script, which in turn converts Makevars.in to Makevars when running R CMD INSTALL.
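To make the substitution step concrete: AC_SUBST tells the generated configure script to replace each @NAME@ placeholder in the .in file with the value of the shell variable NAME. Here is a rough Python sketch of that replacement only, not of autoconf itself. The function name substitute and the flag values are hypothetical stand-ins for whatever gsl-config reports on a given system.

```python
import re

def substitute(template: str, values: dict) -> str:
    """Replace each @NAME@ placeholder that has a value; leave unknown ones untouched."""
    def repl(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", repl, template)

if __name__ == "__main__":
    makevars_in = (
        "GSL_CFLAGS = @GSL_CFLAGS@\n"
        "GSL_LIBS = @GSL_LIBS@\n"
        "PKG_LIBS=$(GSL_LIBS) $(SHLIB_OPENMP_CFLAGS)\n"
    )
    # Hypothetical values of the kind gsl-config might print.
    values = {
        "GSL_CFLAGS": "-I/usr/include",
        "GSL_LIBS": "-L/usr/lib -lgsl -lgslcblas -lm",
    }
    print(substitute(makevars_in, values))
```

Make-style references such as $(GSL_LIBS) are left alone; they are expanded later by make when R builds the package.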
There are a few CRAN packages using the GSL, and/or our RcppGSL bindings. Here is what I do in one of these (my RcppZiggurat package):
PKG_CPPFLAGS = -I. -I../inst/include
## Use the R_HOME indirection to support installations of multiple R version
PKG_LIBS = `$(R_HOME)/bin/Rscript -e "RcppGSL:::LdFlags()"`
That is the entire /src/Makevars. You can trivially add the same OpenMP variable from R. This does of course create a dependency on RcppGSL (as I already use it in RcppZiggurat). Else you can look into the R/init.R of RcppGSL and see how it tries to talk to gsl-config and store those values. You can do the same in src/Makevars -- it is just basic Make usage and nothing Rcpp specific.

Using R CMD SHLIB with OpenMP not for package building

The (R) program I'm writing is at one point able to write C source code files containing OpenMP instructions in order to speed up the resulting program (these files mainly contain a set of differential equations whose results are written to an array - as these steps can be executed independently, I thought it to be a good idea to parallelize them using omp sections). As the files generated this way are supposed to be used in another part of my program I also use R to compile them using system(R CMD SHLIB...) at runtime, as this approach seemed to have the advantage that, using R CMD SHLIB, no specific compiler would need to be imposed on the user.
The problem I'm now facing is that I can't pass the -fopenmp (or -openmp) compiler flag to R CMD SHLIB, and it is not possible to use a Makevars file providing additional compiler flags (or ideally $SHLIB_OPENMP_CFLAGS) when not building an R package, which I'm not doing in this case. So R CMD SHLIB compiles the file I give it, but without OpenMP parallelization, as I see no way to pass the corresponding flags to SHLIB in this situation.
Is there any possibility to use R CMD SHLIB for this task anyway or will I have to sacrifice portability by internally specifying a compiler for OpenMP compilation?
You can also do it in R with:
system("R CMD COMPILE filename.c CFLAGS=-fopenmp")
system("R CMD SHLIB filename.o")
If you must use R CMD SHLIB as opposed to a Makefile or package, I think you want to modify an environment variable such as PKG_CPPFLAGS or PKG_CXXFLAGS, which you can do from inside R via Sys.setenv().
R itself now uses OpenMP and the compiler option you desire is available on recent R systems:
edd@max:~$ grep OPENMP /etc/R/Makeconf
SHLIB_OPENMP_CFLAGS = -fopenmp
SHLIB_OPENMP_CXXFLAGS = -fopenmp
SHLIB_OPENMP_FCFLAGS = -fopenmp
SHLIB_OPENMP_FFLAGS = -fopenmp
edd@max:~$
That's from a standard R 2.15.1 on a Debian / Ubuntu system.

GCC installed. Mathematica still won't compile to C

I'm running Mathematica 8 on Mac OS X, trying to compile even the simplest program to C. Anything having to do with C simply doesn't work in Mathematica. I have GCC 4.2 installed; I've even reinstalled it multiple times with Xcode. Here's what I'm doing and the errors I'm getting:
First, I always evaluate the command
Needs["CCompilerDriver`"]
If I set the compilation target to C,
c = Compile[ {{x}}, x^2 + Sin[x^2], CompilationTarget -> "C"];
I get an error that reads: Compile::nogen : A library could not be created from the compiled function.
If I try to create a library,
demoFile = FileNameJoin[{$CCompilerDirectory,"SystemFiles","CSource","createDLL_demo.c"}];
lib = CreateLibrary[{demoFile},"testLibrary"]
I get the message $Failed. Wolfram says that this is because I don't have a C compiler installed. I find that hard to believe because when I run
CCompilers[]
It tells me that I've got GCC installed:
{{"Name" -> "GCC",
  "Compiler" -> CCompilerDriver`GCCCompiler`GCCCompiler,
  "CompilerInstallation" -> "/usr/bin", "CompilerName" -> Automatic}}
What's more, terminal says I have GCC installed too!! Any help would be appreciated. I'd really like to compile Mathematica to C.
In this answer I'll collect some debugging steps for similar problems, for future reference. Feel free to edit/improve them.
If compiling to C code does not work from Mathematica 8,
Check that you have a supported C compiler installed and it works (the obvious).
Note that the compiler does not necessarily have to be in the PATH, at least on Windows/Visual Studio it doesn't.
Check that Mathematica recognizes the compiler
<< CCompilerDriver`
CCompilers[]
will list the compilers known to Mathematica.
Check what commands Mathematica executes to compile the generated C code:
Compiler`$CCompilerOptions = {"ShellCommandFunction" -> Print};
Compile[{{x}}, x^2, CompilationTarget -> "C"];
Note that with "ShellCommandFunction" -> Print the commands will not be executed, so you'll need to re-set Compiler`$CCompilerOptions to {} after this step is complete to allow command execution again.
Check the output/errors from the compiler:
Compiler`$CCompilerOptions = {"ShellOutputFunction" -> Print};
Compile[{{x}}, x^2, CompilationTarget -> "C"];
These last two steps will hopefully give you enough clues to proceed. With this information you can check if the correct library / include paths are passed to the compiler (in the case of gcc/icc, look at the -L option which specifies library paths and the -I option which specifies include paths). Then check if the required include and library files are present at those paths.
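Checking the paths by eye gets tedious when the printed command is long. As a rough aid, here is a Python sketch that pulls the -I and -L directories out of a gcc-style command line and reports the ones that do not exist. The function names search_paths and missing_paths, and the example command, are hypothetical; paste in the command that Mathematica actually prints.

```python
import os
import shlex

def search_paths(command: str) -> dict:
    """Collect the -I and -L directories from a gcc-style command line."""
    paths = {"-I": [], "-L": []}
    args = iter(shlex.split(command))
    for arg in args:
        flag, value = arg[:2], arg[2:]
        if flag in paths:
            if not value:                 # separated form: -I dir
                value = next(args, "")
            if value:
                paths[flag].append(value)
    return paths

def missing_paths(command: str) -> list:
    """Return the -I/-L directories that do not exist on this machine."""
    found = search_paths(command)
    return [p for dirs in found.values() for p in dirs if not os.path.isdir(p)]

if __name__ == "__main__":
    # Hypothetical command; replace with the one printed by ShellCommandFunction.
    cmd = 'gcc -shared -o out.so -I"/some/include dir" -L/some/lib src.c -lm'
    print(search_paths(cmd))
    print("missing:", missing_paths(cmd))
```

Any directory reported as missing is a likely cause of a failed compile or link step.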
If you get Compile::nogen, you can see the compiler output by setting ShellOutputFunction->Print right in the Compile expression:
c = Compile[ {{x}}, x^2 + Sin[x^2],
CompilationTarget -> {"C", "ShellOutputFunction"->Print}];
In general, this is how you can pass options to the underlying CreateLibrary call, by changing CompilationTarget->"C" to CompilationTarget->{"C", options}. Setting Compiler`$CCompilerOptions works too, but this technique has the advantage of not setting a global variable.
It is a shame that the only error you are seeing is $Failed; that's not terribly helpful. I wonder if perhaps there are some file or directory permission problems?
I'm running on Linux, not Mac, so I am not sure if my setup is "close enough" or not. On my machine your Compile command succeeds and generates a file .Mathematica/ApplicationData/CCompilerDriver/BuildFolder/blackie-desktop-5077/compiledFunction1.so in my home directory. Is there any way you can find a .Mathematica directory associated with your userid, and see if it exists and is writable by Mathematica?
Also, you could check to see if "gcc" is or is not being accessed by checking the file access time of /usr/bin/gcc before and after your call to Compile. From an operating system shell you can do ls -lu /usr/bin/gcc or from Mathematica perhaps Import["!ls -lu /usr/bin/gcc", "Text"]
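The access-time check can be scripted as well. Below is a small Python sketch: the helper names access_time and was_accessed are hypothetical, and the result is only a hint, because many filesystems are mounted with options such as noatime or relatime that suppress or delay access-time updates.

```python
import os
import time

def access_time(path):
    """Last access time of path, in seconds since the epoch."""
    return os.stat(path).st_atime

def was_accessed(path, action):
    """Run action() and report whether path's access time moved forward."""
    before = access_time(path)
    action()
    return access_time(path) > before

if __name__ == "__main__":
    gcc = "/usr/bin/gcc"
    if os.path.exists(gcc):
        print(gcc, "last accessed at", time.ctime(access_time(gcc)))
```

Record the time, run the Compile call in Mathematica, then run the script again and compare the two values.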

Resources