Loading and storing a file at compile time with a type provider - ffi

I'd like to load a (binary) file at compile time and store it in a toplevel variable of type Bytes:
module FileProvider
import Data.Bits
import Data.Bytes
import Data.Buffer
%default total
export
loadImage : String -> IO (Provider Bytes)
loadImage fileName = do
Right file <- openFile fileName Read
| Left err => pure (Error $ show err)
Just buf <- newBuffer size
| Nothing => pure (Error "allocation failed")
readBufferFromFile file buf size
Provide . dropPrefix 2 . pack <$> bufferData buf
where
size : Int
size = 256 * 256 + 2
It seems to work correctly at runtime:
module Main
import FileProvider
import Data.Bits
import Data.Bytes
main : IO ()
main = do
Provide mem <- loadImage "ram.mem"
| Error err => putStrLn err
printLn $ length mem
However, if I try to run it at compile time, it fails with a mysterious message mentioning FFI:
module Main
import FileProvider
import Data.Bits
import Data.Bytes
%language TypeProviders
%provide (mem : Bytes) with loadImage "ram.mem"
main : IO ()
main = do
printLn $ length mem
$ idris -p bytes -o SOMain SOMain.idr
Symbol "idris_newBuffer" not found
idris: user error (Could not call foreign function "idris_newBuffer" with args [65538])
What is going on here and how can I load a file's contents at compile time?

idris_newBuffer is a C function used by Data.Buffer. From the docs on type providers:
If we want to call our foreign functions from interpreted code (such as the REPL or a type provider), we need to dynamically link a library containing the symbols we need.
So every function that uses FFI needs to have a dynamic library linked. That'd be Data.Buffer and Data.ByteArray. Let's focus on the first and see what problems arise:
So Data.Buffer needs %dynamic "idris_buffer.so" (and not only the %include C "idris_buffer.h" it currently has). You can copy idris/rts/idris_buffer.(h|c) and remove the function that needs the other rts-stuff. To compile a shared library run:
cc -o idris_buffer.so -fPIC -shared idris_buffer.c
With a modified Data.Buffer to use it, you'd still get an error:
Could not use <<handle {handle: ram.mem}>> as FFI arg.
The FFI call is in Data.Buffer.readBufferFromFile, and the File argument causes the trouble. Idris sees that openFile (another C function) is used and transforms it into a Haskell call. On the one hand that's nice, because during compilation we're interpreting Idris code, and that makes subsequent functions like readLine C/JS/Node/… agnostic. But in this case it's unfortunate, because the Haskell backend doesn't support the returned file handle in FFI calls. So we can write another fopen : String -> String -> IO Ptr function that does the same thing but has another name, so the Ptr will stay a Ptr, as those can be used in FFI calls.
With this done, there is another error thanks to built-ins:
Could not use prim__registerPtr (<<ptr 0x00000000009e2bf0>>) (65546) as FFI arg.
Data.Buffer uses ManagedPtr in its backend. And yep, that's unsupported in FFI calls as well. So you'd need to change these to Ptr. Both of these could get supported in the compiler, I guess.
Finally everything should work for a valid %provide (mem : Buffer). But, nope, because:
Can't convert pointers back to TT after execution.
Even though Idris can now read a file at compile time, it can't make a Buffer or anything else containing a Ptr accessible at run time, and that's quite reasonable: a pointer that was valid while the program was being compiled is just a random value at run time. So you either need to transform the data into the final result inside the provider or use an intermediate format like Provider (List Bits8).
I made a short example to have the List Bits8 accessible in main. Buffer is basically Data.Buffer with _openFile included and Ptr instead of ManagedPtr. I hope this helps you somehow, but maybe some of the compiler people can give more background.

Related

How to read the absolute load address of the beginning of shared library data section on runtime?

Let's consider this example:
The glob.c source code is linked into a shared library named glob.so. From main.c, which links against glob.so, I want to read the value of the 'global_offset' variable at runtime (I don't think it's possible to do at compile time). My compiler is gcc 4.8.5 MinGW.
glob.c:
int glob_shared_var = 69;
main.c:
size_t global_offset = // read shared library load offset
size_t relative_glob_shared_var_offset = // read offset value from e.g. nm glob.a symbols table
printf("glob_shared_var value: %d \n", *(int *)(global_offset + relative_glob_shared_var_offset));
console output:
glob_shared_var value: 69
OK, so I read a little bit more about GNU LD linker scripts and learned about the __data_start__ and __data_end__ symbols, which are added to the beginning and end of each consolidated binary by default. What I guess could work is a custom linker script rule that creates __data_start_glob__ and __data_end_glob__ symbols next to them, so that each shared library is uniquely identified when it is produced.
After creating the library I would produce a text dump from it, grep the offset address of each symbol in the library, and put them into a simple flat text file which would then be read at runtime.
Reading the address of the shared library's data section at runtime would look like this:
size_t glob_offset = (size_t)&__data_start_glob__;
size_t glob_shared_var_offset; // read from flat file
int val = *(int *)(glob_offset + glob_shared_var_offset);
I know that use-case for this will be very limited, but maybe someone will have similar crazy idea in the future.
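The runtime half of this idea can be sketched in C. This is only a sketch under assumptions: the flat file format ("name hex_offset", one pair per line) and the helper names read_symbol_offset and read_int_at are hypothetical, and the base address would come from a linker-provided symbol such as the __data_start_glob__ proposed above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Look up `symbol` in a flat text file made of "name hex_offset" lines
 * (a hypothetical format, e.g. produced by post-processing a symbol dump).
 * Returns 0 and stores the offset on success, -1 if the file cannot be
 * opened or the symbol is not listed. */
int read_symbol_offset(const char *path, const char *symbol, size_t *offset)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char name[128];
    unsigned long value;
    int result = -1;
    while (fscanf(f, "%127s %lx", name, &value) == 2) {
        if (strcmp(name, symbol) == 0) {
            *offset = (size_t)value;
            result = 0;
            break;
        }
    }
    fclose(f);
    return result;
}

/* Read an int stored `offset` bytes past `base`; memcpy avoids
 * alignment and aliasing problems with a raw pointer cast. */
int read_int_at(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

With the symbols from the question, the call would look roughly like read_int_at(&__data_start_glob__, offset) once the offset has been read from the file.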

How to import a haskell module that uses FFI without referring to the c object?

I'm trying to write a haskell module that wraps a bunch of c functions.
I want to be able to import this module like any other haskell module without referring to the c object files.
I can't find any examples about how to do this.
This is what I've tried. I have a c file "dumbCfunctions.c":
double addThree(double x) {
return x+3;
}
and a haskell file with a module defined in it "Callfunctions.hs"
module Callfunctions (
addThree
) where
import Foreign.C
foreign import ccall "addThree" addThree :: Double -> Double
main = print $ addThree 4
I can make an executable doing:
ghc --make -o cf_ex Callfunctions.hs dumbCfunctions.o
Which correctly gives me 7.
I can also import it into ghci by calling ghci with
shane> ghci dumbCfunctions.o
Prelude> :l Callfunctions.hs
[1 of 1] Compiling Callfunctions ( Callfunctions.hs, interpreted )
Ok, modules loaded: Callfunctions.
*Callfunctions> addThree 3
6.0
But I want to be able to treat it like any other module without referring to "dumbCfunctions.o":
shane> ghci
Prelude> :l Callfunctions.hs
[1 of 1] Compiling Callfunctions ( Callfunctions.hs, interpreted )
Ok, modules loaded: Callfunctions.
*Callfunctions> addThree 3
But now I get the error
ByteCodeLink: can't find label
During interactive linking, GHCi couldn't find the following symbol:
addThree
This may be due to you not asking GHCi to load extra object files,
archives or DLLs needed by your current session. Restart GHCi, specifying
the missing library using the -L/path/to/object/dir and -lmissinglibname
flags, or simply by naming the relevant files on the GHCi command line.
Alternatively, this link failure might indicate a bug in GHCi.
If you suspect the latter, please send a bug report to:
glasgow-haskell-bugs@haskell.org
This makes sense because I haven't referred to the object anywhere. So I must be able to do something better by first compiling the module, but I couldn't find out how to do this. I must be looking in the wrong places.
You can create a library through Cabal, and cabal install it.
This would link the C code inside your Haskell library. Later on, when you load the module, you will not need to manually load the C parts.

Idris FFI "symbol not found"

I've been messing around with Idris lately and decided to try playing around with its Network.Socket library. I fired up the REPL, imported the module, and created a socket using the socket command. Upon attempting to execute the IO operation, I was met with the following error:
failed to construct ffun from (Builtins.MkPair (FFI_C.C_Types (Int)) (Int) (FFI_C.C_IntT (Int) (FFI_C.C_IntNative)) (2),Builtins.MkPair (FFI_C.C_Types (Int)) (Int) (FFI_C.C_IntT (Int) (FFI_C.C_IntNative)) (1),[])
Symbol "socket" not found
user error (Could not call foreign function "socket" with args [2,1,0])
To see whether the issue was Network.Socket specific, or just FFI in general, I made a dummy function.
printf : String -> IO ()
printf = foreign FFI_C "printf" (String -> IO ())
Executing :x printf "Hello World" yields a similar error:
Symbol "printf" not found
user error (Could not call foreign function "printf" with args ["hello world"])
Despite all this, putStr works fine.
I am running Idris 9.20, installed through cabal with -f FFI set at compile. I am using libffi version 3.4 installed through MacPorts.
I believe that this has to do with the fact that the Idris FFI operates differently depending on whether code is being compiled or interpreted. When code is being compiled, the FFI requires that at the stage of C codegen, the named C function be in scope, and that when linking the C executable, the correct name is linked in. Since Idris's RTS links against libc, this makes a lot of names from libc work without any extra effort (certain names might require a %include to make sure that the correct C header file is included to put them in scope). When code is being interpreted, the interpreter looks up FFI calls in a list of libraries that have been loaded dynamically, which requires a different directive: %dynamic in the file, or :dynamic in the interpreter. By default, no dynamic libraries are loaded, so even standard names in libc are not in scope. This can be remedied by including %dynamic "libc" in the file, or using the :dynamic "libc" at the REPL commandline for one session.
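As an analogy in C (not a description of how the Idris interpreter is implemented), looking a name up in dynamically loaded libraries can be sketched with the POSIX dlopen/dlsym calls. The helper name lookup_symbol is hypothetical. A name is only found if some loaded library exports it, which is the kind of failure behind a "Symbol ... not found" message.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Return the address of `symbol`, or NULL when it cannot be found.
 * Passing NULL as `library` searches the running program together with
 * the shared libraries it was started with. The handle is deliberately
 * not closed, so the returned address stays valid. */
void *lookup_symbol(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_LAZY);
    if (handle == NULL)
        return NULL; /* the library itself could not be loaded */
    return dlsym(handle, symbol);
}
```

On older glibc versions the program may need to be linked with -ldl.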

C - LuaJit Assign custom module name to a compiled string

I have a small C program that has a string which must represent a Lua module and it looks like this:
const char *lua_str = " local mymodule = {} \
function mymodule.foo() \
print(\"Hello World!\") \
end \
return mymodule";
Or maybe using the old way (if required):
const char *lua_str = "module(\"mymodule\", package.seeall) \
function foo() \
print(\"Hello World!\") \
end";
And let's assume that this is my small host application:
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
int main(int argc, char** argv)
{
lua_State *L = lua_open();
luaL_openlibs(L);
luaL_dostring(L, lua_str);
luaL_dofile(L, "test.lua");
return 0;
}
Now in test.lua to be able to use that module with a static name that isn't decided by the file name:
local mymodule = require "mymodule"
mymodule.foo()
Basically, I need to execute that string and give it a custom name which represents the actual module name. Currently the name is decided by the file name and I don't want that.
If you look at the documentation for require:
Loads the given module. The function starts by looking into the
package.loaded table to determine whether modname is already loaded.
If it is, then require returns the value stored at
package.loaded[modname]. Otherwise, it tries to find a loader for the
module.
To find a loader, require is guided by the package.loaders array. By
changing this array, we can change how require looks for a module. The
following explanation is based on the default configuration for
package.loaders.
First require queries package.preload[modname]. If it has a value,
this value (which should be a function) is the loader. Otherwise
require searches for a Lua loader using the path stored in
package.path. If that also fails, it searches for a C loader using the
path stored in package.cpath. If that also fails, it tries an
all-in-one loader (see package.loaders).
Once a loader is found, require calls the loader with a single
argument, modname. If the loader returns any value, require assigns
the returned value to package.loaded[modname]. If the loader returns
no value and has not assigned any value to package.loaded[modname],
then require assigns true to this entry. In any case, require returns
the final value of package.loaded[modname].
If there is any error loading or running the module, or if it cannot
find any loader for the module, then require signals an error.
You will see that it explains, in some detail, what methods require uses to find the code for the given module name. Implicit in that explanation is an indication as to how you can assign arbitrary chunks of loaded (or loadable) code to any given name you would like.
Specifically, if you set a value in package.loaded[modname] that value will be returned immediately. Failing that, package.preload[modname] is used as a loader (which is a function that takes the module name).

C Loading Code dynamically in the same way as the Java Compiler Api 7

I have the following use case which I had previously solved in Java, but am now required to port the program to C.
I had a method A which called a method do_work() belonging to an abstract class Engine. Each concrete implementation of the class was constructed as follows:
Users would submit the definition of the do_work() method. If this definition was correct, the programmer would construct a concrete implementation of the Engine class using the Java Compiler API (code for this is included for reference below).
How can I do something similar in C:
I now have a structure Engine, with a function pointer to the do_work() method. I want users to be able to submit this method at run time (note: this only occurs once, on startup, once the Engine structure has been constructed, I do not want to change it) via command line.
How could I go about this? I've read around suggestions stating that I would have to use assembly to do this, others stating that this was not possible, but none of them giving a good explanation or references. Any help would be appreciated.
The solution doesn't need to be compatible with 32/64 bits machines, as the program this is written for is only for 64 bits machines.
For reference, the Java Code:
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
StandardJavaFileManager stdFileManager = compiler
.getStandardFileManager(null, Locale.getDefault(), null);
Iterable<? extends JavaFileObject> compilationUnits = null;
String[] compileOptions = new String[] { "-d", "bin" };
Iterable<String> compilationOptions = Arrays.asList(compileOptions);
SimpleJavaFileObject fileObject = new DynamicJavaSourceCodeObject(
"package.adress",getCode());
JavaFileObject javaFileObjects[] = new JavaFileObject[] { fileObject };
compilationUnits = Arrays.asList(javaFileObjects);
}
DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<JavaFileObject>();
CompilationTask compilerTask = compiler.getTask(null, stdFileManager,
diagnostics, compilationOptions, null, compilationUnits);
boolean status = compilerTask.call();
if (!status) {// If compilation error occurs
/* Iterate through each compilation problem and print it */
String result = "";
for (Diagnostic diagnostic : diagnostics.getDiagnostics()) {
result = String.format("Error on line %d in %s",
diagnostic.getLineNumber(), diagnostic);
}
Exception e = new Exception(result);
throw e;
}
stdFileManager.close();// Close the file manager
/*
* Assuming that the Policy has been successfully compiled, create a new
* instance
*/
Class newEngine = Class
.forName("package.name");
Constructor[] constructor = newEngine.getConstructors();
constructor[0].setAccessible(true);
etc.
}
In C, all code must be compiled to native code before it can be used, so the only way is to invoke a command-line compiler on the code your users submit. That could be the GNU C++ compiler, for example, or the Visual C++ compiler (though for Visual C++ I don't know whether the license permits this).
So, first of all, select your compiler, probably the GNU one.
Next, you can compile the code either as an executable program or as a DLL (assuming your software is for Windows). If you compile it to a DLL, use the Win32 function LoadLibrary to load the newly built DLL into your process; after that, use GetProcAddress to get the function's address and call it dynamically from C++ (you must implement a wrapper function and export it from the DLL).
If you compile it as an EXE file, use the CreateProcess function to run the code, pass parameters on the command line, and receive data through a pipe (see the CreatePipe function), a temporary file, or any other interprocess communication mechanism available in Windows.
I think in your situation it is better to compile to an EXE file, because with a DLL buggy user code may crash your main program.
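The separate-process approach can be sketched in C. Note the swap: this sketch uses POSIX popen/pclose instead of the Win32 CreateProcess and CreatePipe calls named above, so it illustrates the idea rather than the Windows API. The helper name run_and_capture is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Run `command` in a child process and capture up to size-1 bytes of its
 * standard output into `out` (always NUL-terminated).
 * Returns the status reported by pclose, or -1 on failure. */
int run_and_capture(const char *command, char *out, size_t size)
{
    if (size == 0)
        return -1;

    FILE *pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;

    size_t used = fread(out, 1, size - 1, pipe);
    out[used] = '\0';
    return pclose(pipe);
}
```

The command string would be the path of the freshly compiled user program plus its arguments. A crash in that program shows up as a nonzero status here and does not take down the host process.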

Resources