Currently I'm playing around with LLVM and am implementing my own toy compiler and programming language. Are there any good tutorials or examples on how I can call external library functions (e.g. from libc or whatever) from the IR decomposition of my own language?
Cheers
You'll need to declare the functions you want to call in the LLVM IR. If you don't provide a body for a function, it works just like a declaration in C. You're probably aware of this, but the linker only checks the function name, not the type. Make sure you match the types up in the declaration or you'll get some strange results and no warnings.
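As a rough illustration of the same idea in C terms, the sketch below declares two libc functions by hand instead of including their headers. In LLVM IR the corresponding lines would look something like `declare i32 @atoi(i8*)` and `declare i32 @abs(i32)` (with `ptr` instead of `i8*` in newer LLVM versions). The helper name parse_and_abs is made up for this example.

```c
/* Hand-written declarations of two libc functions: no body, no header.
 * The linker resolves them by name only, so the types written here
 * must match the real libc definitions exactly. */
int atoi(const char *text);
int abs(int value);

/* Hypothetical helper, only here to exercise the declarations above. */
int parse_and_abs(const char *text) {
    return abs(atoi(text));
}
```

If one of those declarations had the wrong parameter or return type, the program would still link, which is exactly the "strange results and no warnings" case described above.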
Not sure if this is even possible.
I need to pass a function from C to C++.
It cannot be a function pointer.
C++ Function that I need to call:
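template<class Lam>
void parfor(int N, Lam lam) {
    // Call lam once per index (sequential stand-in for the parallel loop).
    for (int i = 0; i < N; ++i)
        lam(i);
}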
C Function that I want to give to parfor (ptr can be a global pointer here):
void calc(int num) {
    ptr[0] = num;
}
C Main to look like this:
#include <ParFor.hpp>
parfor(1, calc);
I could put my function definitions inside a header. On the C++ side I have a function (from an outside library) that takes a C++ lambda or a C++ functor. It's templated on the lambda type.
My current thinking is to put my C functions inside a file, compile LLVM IR for them and somehow force-inline the IR generated by clang into my C++ function.
C calls mycppfunc(mycfunc). mycppfunc has the LLVM IR for mycfunc and is able to generate proper code.
I tried this, but the compiler crashes at the link stage due to what seem to be incompatible IRs.
From the code snippets and the comments I understand that you attempt to launch a C kernel using a SYCL parallel_for.
From an official support perspective, since the SYCL device compiler is by definition a C++ compiler (as SYCL is a programming model based on C++), the kernel code must be parsed as C++ code. I don't think there's any official way to achieve any more interoperability with C code than to just try to compile it as C++ by the device compiler. However, I have the impression that you are more interested in experimenting with interoperability beyond this as a research project.
For this, injecting IR might be a path worthy of investigation, but I don't think that it will be straightforward. Depending on which SYCL features you use, the device compiler might need to perform additional IR transformations to achieve correct semantics. So, you might have to replicate these transformations in order to inject your IR. You might end up reinventing a SYCL compiler for C...
Since any more in-depth discussion about this requires extensive knowledge of the internal implementation details of your SYCL implementation, I would suggest that you contact the developers of your implementation directly for clarification.
The premise: I'm writing a plug-in DLL which conforms to an industry standard interface / function signature. This will be used in at least two different software packages used internally at my company, both of which have some example skeleton code or empty shells of this particular interface. One vendor authors their example in C/C++, the other in Fortran.
Ideally I'd like to just have to write and maintain this library code in one language and not duplicate it (especially as I'm only just now getting some comfort level in various flavors of C, but haven't touched Fortran).
I've emailed both our vendors to see if there's anything specific their solvers need when they import this DLL, but this has made me curious at a more fundamental level. If I compile a DLL with an exposed method void foo(int bar) in both C and Fortran... by the time it's down to x86 machine instructions - does it make any difference in how that method is called by program "X"? I've gathered so far that if I were to do C++ I'd need the extern "C" bit to avoid "mangling" - is there anything else I should be aware of?
It matters. The exported function must use a specific calling convention, and there are several incompatible ones in common use in 32-bit code. The calling convention dictates where the function arguments are stored, in what order they are passed, and how they are removed again, as well as how the function return value is passed back.
The name of the function matters too: exported function names are often decorated with extra characters. That is what extern "C" is all about. It suppresses the name mangling that a C++ compiler uses to prevent overloaded functions from having the same exported name, so the name is one that the linker for a C compiler can recognize.
The way a C compiler makes function calls is pretty much the standard if you interop with code written in other languages. Any modern Fortran compiler will support declarations to make them compatible with a C program. And surely this is something that's already used by whatever software vendor you are working with that provides an add-on that was written in Fortran. And the other way around, as long as you provide functions that can be used by a C compiler then the Fortran programmer has a good chance at being able to call it.
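A minimal sketch of what that looks like on the C side, using the foo(int bar) signature from the question. The accessor last_bar and the variable stored_bar are hypothetical, added only so the effect of the call can be observed.

```c
/* The guard makes the same declarations usable from both C and C++:
 * a C++ compiler sees extern "C" and skips name mangling, while a
 * C compiler never sees the guard at all. */
#ifdef __cplusplus
extern "C" {
#endif

void foo(int bar);      /* signature taken from the question */
int last_bar(void);     /* hypothetical accessor, for illustration only */

#ifdef __cplusplus
}
#endif

static int stored_bar = 0;

void foo(int bar) {
    stored_bar = bar;
}

int last_bar(void) {
    return stored_bar;
}
```

For an actual Windows DLL you would typically also need an export annotation and an explicit calling convention that matches what the host program expects; those details depend on the compiler and are left out here.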
Yes it has been discussed here many many times. Study answers and questions in this tag https://stackoverflow.com/questions/tagged/fortran-iso-c-binding .
The equivalent of extern "C" in Fortran is bind(C). The equivalence of the data types is handled by the intrinsic module iso_c_binding.
Also be sure to use the same calling conventions. If you do not specify anything manually, the default is usually the same for both. On Linux this is a non-issue.
extern "C" is used in C++ code. So if your DLL is written in C++, you mustn't pass any C++ objects (classes) across the exported interface.
If you stick with C types, you need to make sure both sides pass parameters the same way, e.g. by using C's default of __cdecl. Not sure what Fortran uses.
I am looking to generate LLVM IR from C code and was wondering how good the IR generation is for functions in
stdio.h, string.h, stdlib.h and, more generally, the standard memory functions such as malloc and calloc, since I have not been able to find most of the common functions in
http://llvm.org/docs/LangRef.html and was wondering about the limitations of this representation and whether I might be required to add my own intrinsics just to deal with the standard/most popular C functions.
I am looking to change the code at runtime, so I was wondering which approach will give me the most flexibility, e.g. manipulating the code at the AST level instead.
Thanks
Emitting LLVM IR from C is exactly what the industrial-strength compiler Clang does. I suggest running Clang on small snippets of C code with -emit-llvm (details in this document: http://clang.llvm.org/get_started.html) and observing the resulting IR.
You can even do this in your browser: http://ellcc.org/demo/index.cgi
That will allow you to see how builtins like memcpy are handled and any other similar doubts.
Note that neither LLVM nor Clang carry a full C library with them, but they can be used to compile an existing one. newlib is a popular portable C library designed specifically for being built on various new platforms. PNaCl, for example, uses it to build C/C++ code into portable executables - it compiles newlib with the user's code together into a single LLVM IR module.
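As a small snippet to try this on, here is a sketch that uses both malloc and memcpy. The function name copy_bytes and the file name copy.c are just placeholders. Running something like `clang -S -emit-llvm copy.c -o copy.ll` on it should show malloc as an ordinary external declaration, while memcpy will typically appear as a call to an llvm.memcpy intrinsic; the exact output depends on the Clang version and optimization level.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `len` bytes from `src`,
 * or NULL if the allocation fails. The caller frees the result. */
char *copy_bytes(const char *src, size_t len) {
    char *dst = malloc(len);
    if (dst != NULL) {
        memcpy(dst, src, len);
    }
    return dst;
}
```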
I'd like to know how C Header Files and ABIs relate. The sizes of various types are architecture and even compiler-dependent. Then how can one reliably link to a C library?
For a more specific problem: When using Haskell's FFI, one even only uses Haskell types like CDouble to define (duplicate the definition of) the C library interface. I don't know where the binary type size information is coming from. What is the trick for making the linking work?
Please see this link https://code.google.com/p/tabi
It may help you to avoid difficulties with possible ABI differences between Haskell and C.
The library type information comes from magic macros that are run to insert information grabbed from the C compiler by autoconf.
For example, see the definition of CDouble here: https://hackage.haskell.org/package/base-4.8.2.0/docs/src/Foreign.C.Types.html#CDouble
and then see where the HTYPE_DOUBLE size comes from in this autoconf input here: https://hackage.haskell.org/package/base-4.8.2.0/src/include/HsBaseConfig.h.in
Since GHC compiles against the compiler/arch it is compiled with (except in the special cross-compiler modes, which are new and different in ways I'm not fully cognizant of), this makes everything tie out with the ABI properly.
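To give a feel for the kind of fact a configure-time probe can grab from the C compiler, here is a minimal sketch. The struct and function names are invented for this example and are not what GHC's build actually uses.

```c
#include <stddef.h>

/* Hypothetical record of facts a configure script might probe. */
struct type_sizes {
    size_t size_of_double;  /* sizeof(double) for this compiler and target */
    size_t size_of_long;    /* sizeof(long), which differs between ABIs */
};

/* Asks the C compiler itself, so the answers always match the ABI that
 * this compiler targets. */
struct type_sizes probe_sizes(void) {
    struct type_sizes sizes;
    sizes.size_of_double = sizeof(double);
    sizes.size_of_long = sizeof(long);
    return sizes;
}
```

Because the numbers come from the same compiler that builds the C library, whatever is generated from them on the Haskell side lines up with the library's binary layout.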
I've been learning C for 2 months. I experimented with different IDEs and my experiments resulted in confusion. For example, in NetBeans I can use the abs function without including stdlib.h, but when I tried to do the same thing in Visual Studio 2012 it gave an error. Another odd thing: in NetBeans I can use functions from math.h without including it. Why is this happening? Can someone help? NetBeans uses the Cygwin compilers.
In older C (C89/C90) you don't strictly need to include the headers in order to call the functions. C99 removed that rule, but many compilers still accept such code, and older ones don't always warn about it. Also, different compilers might provide those functions in different ways; on some, they're not functions but macros. With macros, you need to include the headers.
It's good practice to always include the headers that provide the functions you need, so that you get the function prototypes. That's the only way the compiler can check for errors (correct types of passed function arguments, for example.) If you call a function for which you have no prototype, you get an implicit declaration of that function. That means the compiler just takes a guess and hopes you're using the function correctly, but has no way to check. That's why this won't work with macros, since a macro can't have a function declaration (implicit or not.)
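A classic case where the missing prototype bites is atof, which returns a double. The sketch below includes the header so the call is checked; the helper name parse_half is made up for this example.

```c
#include <stdlib.h>  /* provides the prototype: double atof(const char *) */

/* Hypothetical helper. With the header included, the compiler knows that
 * atof returns double. Under the old implicit-declaration rule it would
 * have assumed an int return value, and the result would be garbage. */
double parse_half(const char *text) {
    return atof(text) / 2.0;
}
```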
The reason Visual Studio gives an error is most likely that it's compiling your code as C++, not C. C++ is a bit different from C. One of the differences is that C++ does not allow implicit function declarations. If you don't declare the functions you use (by including their header file in this case), then that's considered an error. C++ is mostly compatible with C, but that happens to be one of the few differences.
Btw, they're not libraries. They're header files. There's a difference. You have several standard headers you can include, but you only have one library; the C library. On most systems, you also have a math library, which only contains math functions. The point though is that several header files can be (and usually are) part of the same library.
My experience with C has been the same. Different compilers have different libraries and sometimes they don't stick to the standards.
Some compiler vendors try to lock you in (XXXXX$XXX) :)