I've started playing with LLVM, making a pet language. I'm using the C-API. I have a parser and basic AST, but I am at a bit of a roadblock with LLVM.
The following is a minified version of my code to illustrate my current issue:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "llvm-c/Core.h"
#include "llvm-c/ExecutionEngine.h"
#include "llvm-c/Target.h"
#include "llvm-c/Analysis.h"
#include "llvm-c/BitWriter.h"
static LLVMModuleRef mod;
static LLVMBuilderRef builder;
static LLVMExecutionEngineRef engine;
typedef struct oper_t {
const char * name;
LLVMTypeRef args[2];
LLVMTypeRef ret;
LLVMValueRef val;
} oper_t;
#define NUM_OPER 2
static oper_t oper[NUM_OPER] = {
{ .name = "function1" },
{ .name = "function2" },
};
void codegen_init(const char * filename)
{
char *error;
mod = LLVMModuleCreateWithName(filename);
builder = LLVMCreateBuilder();
error = NULL;
LLVMVerifyModule(mod, LLVMAbortProcessAction, &error);
if(error) printf("LLVM init Verify message \"%s\"\n", error);
LLVMDisposeMessage(error);
error = NULL;
LLVMLinkInMCJIT();
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
if (LLVMCreateExecutionEngineForModule(&engine, mod, &error) != 0)
{
fprintf(stderr, "LLVM failed to create execution engine\n");
abort();
}
if(error)
{
printf("LLVM Execution Engine message %s\n", error);
LLVMDisposeMessage(error);
exit(EXIT_FAILURE);
}
}
int runOper(oper_t * o, long a, long b)
{
LLVMValueRef v, l, r;
o->args[0] = LLVMInt32Type();
o->args[1] = LLVMInt32Type();
o->ret = LLVMFunctionType(LLVMInt32Type(), o->args, 2, 0);
o->val = LLVMAddFunction(mod, o->name, o->ret);
LLVMBasicBlockRef entry = LLVMAppendBasicBlock(o->val, "entry");
LLVMPositionBuilderAtEnd(builder, entry);
l = LLVMConstInt(LLVMInt32Type(), a, 0);
r = LLVMConstInt(LLVMInt32Type(), b, 0);
v = LLVMBuildAdd(builder, l, r, "add");
LLVMBuildRet(builder, v);
char *error = NULL;
LLVMVerifyModule(mod, LLVMAbortProcessAction, &error);
if(error) printf("LLVM func Verify message \"%s\"\n", error);
LLVMDisposeMessage(error);
LLVMGenericValueRef g = LLVMRunFunction(engine, o->val, 0, NULL);
printf("LLVM func executed without crash\n");
LLVMDeleteFunction(o->val);
return (long)LLVMGenericValueToInt(g, 1);
}
int main(int argc, char const *argv[])
{
long val;
codegen_init("test");
val = runOper(&oper[0], 3, 4);
printf("3 + 4 is %ld\n", val);
val = runOper(&oper[1], 6, 7);
printf("6 + 7 is %ld\n", val);
}
I can compile this using the command:
gcc test.c `llvm-config --cflags --cppflags --ldflags --libs core executionengine mcjit interpreter analysis native bitwriter --system-libs` -o test.exe
Or alternatively I've also tried:
gcc `llvm-config --cflags --cppflags` -c test.c
g++ test.o `llvm-config --cxxflags --ldflags --libs core executionengine mcjit interpreter analysis native bitwriter --system-libs` -o test.exe
Either way I get this result:
$ ./test.exe
LLVM init Verify message ""
LLVM func Verify message ""
LLVM func executed without crash
3 + 4 is 7
LLVM func Verify message ""
Segmentation fault
I've also tried using clang just for good measure.
Clearly I am misusing the LLVM C-API. I'm struggling mostly to get some understanding of when the API functions are safe to call, and also when I can safely free/delete the memory referenced by LLVM. Take the LLVMTypeRef args[2] parameter, for instance: I see in the LLVM C-API source code for LLVMFunctionType that it creates an ArrayRef to the args parameter. This means I must hang onto the args parameter until LLVM is done with it. I can't really tell when that is exactly. (I plan to allocate this memory on the heap.)
Stated simply, I'd like it if someone could not just explain what I am doing wrong in this example, but more fundamentally explain how I should figure out what I am doing wrong.
The LLVM C-API docs give a great breakdown of the functions available in the API, but I haven't found them to give much description of how API functions should be called, i.e. what order is safe/expected.
I have also found this documentation to be helpful, as it can be easily searched for individual function prototypes. But again it gives no context or examples of how to use the C-API.
Finally I have to reference Paul Smith's Blog, it's a bit outdated now, but is definitely the reason I got this far.
P.S. I don't expect everything to be spelled out for me, I just want advice on how to self-learn LLVM.
The basic design is most easily understood in C++: If you pass a pointer to an object y as a constructor argument, i.e. x=new Foo(…, y, …), then y has to live longer than x. This also applies to wrappers such as CallInst::Create() and ConstantInt::get(), both of which take pointers to objects and return constructed objects.
But there's more. Some objects assume ownership of the constructed objects, so that you aren't permitted to delete the constructed object at all. You are for example not allowed to delete what ConstantInt::get() returns. As a general rule, anything that's called create… in the C++ API returns something you may delete and anything called get… returns something that's owned by another LLVM object. I'm sure there are exceptions.
You may find it helpful to build a debug version of LLVM, unless you're much smarter than I. The extra assertions are great.
I'm writing an application using libao for audio output. The portion
of my program that calls into libao lives in a shared object:
// playao.c
// compile with: gcc -shared -o libplayao.so playao.c -lao -lm
#include <ao/ao.h>
#include <stdio.h>
#include <math.h>
void playao(void) {
int i;
unsigned char samps[8000];
ao_initialize();
ao_sample_format sf;
sf.bits = 8;
sf.rate = 8000;
sf.channels = 1;
sf.byte_format = AO_FMT_NATIVE;
sf.matrix = "M";
ao_device *device = ao_open_live(ao_default_driver_id(), &sf, NULL);
if(!device) {
puts("ao_open_live error");
ao_shutdown();
return;
}
for(i = 0; i < 8000; ++i) {
float time = (float)i / 8000;
float freq = 440;
float angle = time * freq * M_PI * 2;
float value = sinf(angle);
samps[i] = (unsigned char)(value * 127 + 127);
}
if(!ao_play(device, (char *)samps, 8000)) {
puts("ao_play error");
}
ao_close(device);
ao_shutdown();
}
If I link against this shared object in a program, it works fine:
// directlink.c
// compile with: gcc -o directlink directlink.c libplayao.so -Wl,-rpath,'$ORIGIN'
void playao(void);
int main(int argc, char **argv) {
playao();
return 0;
}
However, if I use dlopen/dlsym to invoke it, there are no errors, but the
program does not cause any sound to be emitted:
// usedl.c
// compile with: gcc -o usedl usedl.c -ldl
#include <dlfcn.h>
#include <stdio.h>
int main(int argc, char **argv) {
void *handle = dlopen("./libplayao.so", RTLD_LAZY);
if(!handle) {
puts("dlopen failed");
return 1;
}
void *playao = dlsym(handle, "playao");
if(!playao) {
puts("dlsym failed");
dlclose(handle);
return 1;
}
((void (*)(void))playao)();
dlclose(handle);
return 0;
}
However, running usedl with LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libao.so.4
does work. So there's something about libao that wants to be loaded when the
program starts up, and doesn't like being loaded any later.
Why is this? Is there any way to work around this, so that libao works
correctly even if loaded later in the program's execution?
I'm running Debian 10 "buster" if it matters.
I asked about this on the #xiph channel on Freenode and xiphmont suggested turning on verbose mode. Once I did that, the failing case started getting the message:
ERROR: Failed to load plugin /usr/lib/x86_64-linux-gnu/ao/plugins-4/libalsa.so => dlopen() failed
So libao itself is trying to dlopen something, and it's failing. It's not showing me any more details, so I ran the program under GDB and set a breakpoint on dlopen. After hitting the dlopen breakpoint for libalsa and running finish, I tried finding what the error was by using print (const char *)dlerror(). And with this, I get a more detailed error:
/usr/lib/x86_64-linux-gnu/ao/plugins-4/libalsa.so: undefined symbol: ao_is_big_endian
So ao's libalsa plugin is trying to reference symbols back in libao, but it's not finding them. Why could this be? Referencing the dlopen documentation, I see:
Zero or more of the following values may also be ORed in flags:
RTLD_GLOBAL: The symbols defined by this shared object will be made available for symbol resolution of subsequently loaded shared objects.
RTLD_LOCAL: This is the converse of RTLD_GLOBAL, and the default if neither flag is specified. Symbols defined in this shared object are not made available to resolve references in subsequently loaded shared objects.
Because my dlopen call only used RTLD_LAZY and didn't include RTLD_GLOBAL or RTLD_LOCAL, it defaulted to RTLD_LOCAL, which does not expose the symbols in the shared object (like ao_is_big_endian) to subsequently loaded shared objects (like libalsa.so).
So, I tried changing the code from:
void *handle = dlopen("./libplayao.so", RTLD_LAZY);
To:
void *handle = dlopen("./libplayao.so", RTLD_LAZY | RTLD_GLOBAL);
And lo and behold, it works!
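As a rough sketch of the fix above, the dlopen call can be wrapped so that a failure reports the dlerror() text immediately instead of needing a debugger. The helper name open_global is hypothetical and not part of libao or libdl. It assumes path is a non-NULL string.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open a shared object with RTLD_GLOBAL so that its
   symbols (and those of its dependencies, such as libao) are visible to
   shared objects loaded later, e.g. libao's own plugins.
   On failure, print the reason reported by dlerror() and return NULL. */
static void *open_global(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (!handle) {
        const char *msg = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path, msg ? msg : "unknown error");
    }
    return handle;
}
```

In usedl.c this would replace the bare dlopen call: void *handle = open_global("./libplayao.so");. On older glibc versions the program still needs -ldl at link time.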
I'm trying to implement a small example using the C API. I get an error message that the function context doesn't match the module context, which I can't figure out.
Here's my code:
#include <stdio.h>
#include <llvm-c/Analysis.h>
#include <llvm-c/Core.h>
#include <llvm-c/Target.h>
#include <llvm-c/TargetMachine.h>
int
main() {
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
char* triple = LLVMGetDefaultTargetTriple();
char* error;
LLVMTargetRef target_ref;
if (LLVMGetTargetFromTriple(triple, &target_ref, &error)) {
printf("Error: %s\n", error);
return 1;
}
LLVMTargetMachineRef tm_ref = LLVMCreateTargetMachine(
target_ref,
triple,
"",
"",
LLVMCodeGenLevelDefault,
LLVMRelocStatic,
LLVMCodeModelJITDefault);
LLVMDisposeMessage(triple);
LLVMContextRef context = LLVMContextCreate();
LLVMModuleRef module = LLVMModuleCreateWithNameInContext("module_name", context);
// LLVMModuleRef module = LLVMModuleCreateWithName("module_name");
LLVMTypeRef param_types[] = {LLVMIntType(32), LLVMIntType(32)};
LLVMTypeRef func_type = LLVMFunctionType(LLVMIntType(32), param_types, 2, 0);
LLVMValueRef func = LLVMAddFunction(module, "function_name", func_type);
LLVMBasicBlockRef entry = LLVMAppendBasicBlock(func, "entry");
LLVMBuilderRef builder = LLVMCreateBuilderInContext(context);
// LLVMBuilderRef builder = LLVMCreateBuilder();
LLVMPositionBuilderAtEnd(builder, entry);
LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(func, 0), LLVMGetParam(func, 1), "add");
LLVMBuildRet(builder, tmp);
LLVMVerifyModule(module, LLVMAbortProcessAction, &error);
LLVMDisposeMessage(error);
}
And then my execution:
$ llvm-config --version
8.0.0
$ clang++ trash.cpp `llvm-config --cflags --ldflags` `llvm-config --libs` `llvm-config --system-libs`
$ ./a.out
Function context does not match Module context!
i32 (i32, i32)* #function_name
LLVM ERROR: Broken module found, compilation aborted!
This is not an API that lends itself to very small examples; consequently, there's a decent chunk of code here.
If I use the currently commented out code which doesn't reference context, everything works fine. It's not clear to me why, when I call LLVMAddFunction, it doesn't just take its context from the module I passed in.
Well, I found the answer. Rather than LLVMIntType, I should be using LLVMIntTypeInContext. And rather than LLVMAppendBasicBlock, I should be using LLVMAppendBasicBlockInContext. I didn't realize previously such functions existed.
I built a shell that uses rl_bind_key() to make the tab (\t) key do something custom. It works on Ubuntu, Fedora, and CentOS, but not on macOS Sierra. Here's the MCVE:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <readline/readline.h>
static int cmd_complete(int count, int key)
{
printf("\nCustom tab action goes here...\n");
rl_forced_update_display();
return 0;
}
char *interactive_input()
{
char *buffer = readline(" > ");
return buffer;
}
int main(int argc, char **argv)
{
rl_bind_key('\t', cmd_complete); // this doesn't seem to work in macOS
char *buffer = 0;
while (!buffer || strncmp(buffer, "exit", 4)) {
if (buffer) { free(buffer); buffer=0; }
// get command
buffer = interactive_input();
printf("awesome command: %s\n", buffer);
}
free(buffer);
return 0;
}
I compile using Clang like this:
$ cc -lreadline cli.c -o cli
What is the cause of this behavior and how do I fix it?
I was using the flag -lreadline; however, unbeknownst to me, on macOS that links against libedit (I've also seen it called editline) rather than GNU Readline. In libedit, for some reason (which merits another question), rl_bind_key appears not to work with anything except rl_insert.
So one solution that I found is to use Homebrew to install GNU Readline (brew install readline), and then to ensure I use that version, I compile thusly:
$ cc -lreadline cli.c -o cli -L/usr/local/opt/readline/lib -I/usr/local/opt/readline/include
In fact, when you install readline, it will tell you this at the end of the installation or if you do brew info readline:
gns-mac1:~ gns$ brew info readline
readline: stable 7.0.3 (bottled) [keg-only]
Library for command-line editing
https://tiswww.case.edu/php/chet/readline/rltop.html
/usr/local/Cellar/readline/7.0.3_1 (46 files, 1.5MB)
Poured from bottle on 2017-10-24 at 12:21:35
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/readline.rb
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides the BSD libedit library, which shadows libreadline.
In order to prevent conflicts when programs look for libreadline we are
defaulting this GNU Readline installation to keg-only..
For compilers to find this software you may need to set:
LDFLAGS: -L/usr/local/opt/readline/lib
CPPFLAGS: -I/usr/local/opt/readline/include
Source of libedit rl_bind_key
So this is why it doesn't work in libedit. I downloaded the source and this is how the rl_bind_key function is defined:
/*
* bind key c to readline-type function func
*/
int
rl_bind_key(int c, rl_command_func_t *func)
{
int retval = -1;
if (h == NULL || e == NULL)
rl_initialize();
if (func == rl_insert) {
/* XXX notice there is no range checking of ``c'' */
e->el_map.key[c] = ED_INSERT;
retval = 0;
}
return retval;
}
So it seems designed to not work with anything except rl_insert. That seems like a bug, not a feature. I wish I knew how to become a contributor to libedit.
Language: C
Operating System: Red Hat EL
Starting with a "for instance":
Assume I have two libraries: libJUMP.so and libSIT.so.
JUMP contains the function jump() and similarly SIT contains the function sit()
I have an application that I want to provide to different people; they can either get the jump() feature, the sit() feature, or both. However, I would like to NOT use #ifdef if at all possible.
Header for libJUMP.so:
#ifndef JUMP_H_
#define JUMP_H_
#define JUMP_ENABLED
void jump();
#endif /* JUMP_H_ */
Header for libSIT.so:
#ifndef SIT_H_
#define SIT_H_
#define SIT_ENABLED
void sit();
#endif /* SIT_H_ */
I have an application:
#include "jump.h"
#include "sit.h"
int main()
{
// #ifdef JUMP_ENABLED
jump();
// #endif /* JUMP_ENABLED */
// #ifdef SIT_ENABLED
sit();
// #endif /* SIT_ENABLED */
}
So:
Is there a way to do this without using #ifdef? Is there a better way at all?
I have heard we could do this by compiling with both SO libraries, and if one is missing when I run the application on the target system, it could just exclude the feature automatically (using some combination of dlopen() and dlsym()?) Any easy examples, if this is indeed correct? An example with my code from above, if possible :D?
If this is a stupid question, or just not possible, please feel free to tell me so. If there is a similar question that this would be considered a duplicate of, let me know and I will delete this post.
Consider these three files. First, jump.c:
#include <stdio.h>
int jump(const double height)
{
fflush(stdout);
fprintf(stderr, "Jumping %.3g meters.\n", height);
fflush(stderr);
return 0;
}
Second, sit.c:
#include <stdio.h>
int sit(void)
{
fflush(stdout);
fprintf(stderr, "Sitting down.\n");
fflush(stderr);
return 0;
}
Third, example.c to use one or both of the above, depending on whether they (as libjump.so or libsit.so, respectively) exist in the current working directory:
#include <stdio.h>
#include <dlfcn.h>
static const char *jump_lib_path = "./libjump.so";
static int (*jump)(const double) = NULL;
static const char *sit_lib_path = "./libsit.so";
static int (*sit)(void) = NULL;
static void load_dynamic_libraries(void)
{
void *handle;
handle = dlopen(jump_lib_path, RTLD_NOW | RTLD_LOCAL);
if (handle) {
jump = dlsym(handle, "jump");
/* If no jump symbol, we don't need the library at all. */
if (!jump)
dlclose(handle);
}
handle = dlopen(sit_lib_path, RTLD_NOW | RTLD_LOCAL);
if (handle) {
sit = dlsym(handle, "sit");
/* If no sit symbol, the library is useless. */
if (!sit)
dlclose(handle);
}
}
int main(void)
{
int retval;
load_dynamic_libraries();
if (jump) {
printf("Calling 'jump(2.0)':\n");
retval = jump(2.0);
printf("Returned %d.\n\n", retval);
} else
printf("'jump()' is not available.\n\n");
if (sit) {
printf("Calling 'sit()':\n");
retval = sit();
printf("Returned %d.\n\n", retval);
} else
printf("'sit()' is not available.\n\n");
return 0;
}
Let's first compile and run the example program:
gcc -Wall -O2 example.c -ldl -o example
./example
The program outputs that neither jump() nor sit() is available. Let's compile jump.c into a dynamic library, libjump.so, and then run the example again:
gcc -Wall -O2 -fPIC -shared jump.c -Wl,-soname,libjump.so -o libjump.so
./example
Now, the jump() function works. Let's compile sit.c, too, and run the example a final time:
gcc -Wall -O2 -fPIC -shared sit.c -Wl,-soname,libsit.so -o libsit.so
./example
Here, both functions get called, and everything just works.
In example.c, jump and sit are function pointers. We initialize them to NULL, so that we can use if (jump) to check if jump points to a valid function.
The load_dynamic_libraries() function uses dlopen() and dlsym() to obtain the function pointers. Note that if the dynamic library is opened successfully, and the necessary symbol is found, we do not dlclose() it because we want to keep the dynamic library in memory. (We only dlclose() it if it looks like it is not the kind of library we want.)
If you want to avoid the if (jump) and if (sit) clauses, you can use stubs like the following (ENOTSUP is defined in <errno.h>, so include it):
int unsupported_jump(const double height)
{
return ENOTSUP;
}
int unsupported_sit(void)
{
return ENOTSUP;
}
and at the end of load_dynamic_libraries(), divert the functions to the stubs instead of NULL pointers, i.e.
if (!jump)
jump = unsupported_jump;
if (!sit)
sit = unsupported_sit;
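Putting the stub idea together for a single function, a minimal sketch might look like this. The names sit_fn and load_sit are made up for illustration. The assignment through *(void **) is the conversion idiom shown in the POSIX dlsym documentation.

```c
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>

typedef int (*sit_fn)(void);

/* Stub used when the real sit() cannot be loaded. */
static int unsupported_sit(void)
{
    return ENOTSUP;
}

/* Hypothetical loader: returns the "sit" function exported by lib_path,
   or the stub if the library or the symbol is missing. The result is
   never NULL, so callers can invoke it unconditionally. */
static sit_fn load_sit(const char *lib_path)
{
    sit_fn fn = NULL;
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);

    if (!handle)
        return unsupported_sit;

    /* Store the void * result of dlsym() into the function pointer. */
    *(void **)(&fn) = dlsym(handle, "sit");
    if (!fn) {
        dlclose(handle);   /* not the library we want */
        return unsupported_sit;
    }
    return fn;             /* keep the library loaded */
}
```

A caller would then write sit = load_sit("./libsit.so"); and check the return value of sit() for ENOTSUP rather than testing the pointer.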
Note that function-like interfaces are easiest to use, because the function pointer acts as the effective prototype. If you need objects, I recommend using getter functions. Objects do work just fine, as long as you remember that dlsym() returns a pointer to the object; using a getter function, that is explicit in the getter function pointer type.
Plug-in interfaces commonly have a single function (say, int properties(struct plugin *const props, const int version)), which is used to populate a structure of function and object pointers. The application supplies the version of the structure it uses, and the plug-in function returns either success or failure, depending on whether it can populate the structure to accommodate that version.
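To make that concrete, here is one possible shape for such an interface. The structure layout, the PLUGIN_API_VERSION constant, and the demo_sit function are all assumptions made up for this sketch. Only the properties() prototype comes from the paragraph above.

```c
#include <stddef.h>

#define PLUGIN_API_VERSION 1      /* hypothetical interface version */

/* Hypothetical structure the application asks each plug-in to fill in. */
struct plugin {
    const char *name;
    int (*jump)(double height);   /* NULL if the plug-in cannot jump */
    int (*sit)(void);             /* NULL if the plug-in cannot sit */
};

static int demo_sit(void)
{
    return 0;
}

/* The single function a plug-in exports. It returns 0 on success, or -1
   if it cannot populate the structure for the requested version. */
int properties(struct plugin *const props, const int version)
{
    if (!props || version != PLUGIN_API_VERSION)
        return -1;

    props->name = "demo";
    props->jump = NULL;           /* this plug-in only provides sit() */
    props->sit  = demo_sit;
    return 0;
}
```

The application would obtain properties via dlsym(), call it with the version it was compiled against, and skip any plug-in that returns nonzero.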
As plug-ins are typically stored in a single directory (/usr/lib/yourapp/plugins/ is very common), you can trivially load all plugins by using opendir() and readdir() to scan the file names in the plug-in directory one by one, dlopen()ing each one, obtaining the properties() function pointer, and calling it to see what kinds of services the plugin provides; typically creating an array or a linked list of the plugin structures.
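A directory scan along those lines could be sketched as follows. The function names are hypothetical, the fixed 4096-byte path buffer is a simplification, and a real application would store each handle and call properties() instead of only counting.

```c
#include <dirent.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero if name ends in ".so" and has something before it. */
static int has_so_suffix(const char *name)
{
    size_t n = strlen(name);
    return n > 3 && strcmp(name + n - 3, ".so") == 0;
}

/* Hypothetical scanner: dlopen() every *.so file in dir and count those
   that export a "properties" symbol. Returns -1 if dir cannot be opened. */
static int load_plugins(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    int loaded = 0;

    if (!d)
        return -1;

    while ((ent = readdir(d)) != NULL) {
        char path[4096];
        void *handle;
        int len;

        if (!has_so_suffix(ent->d_name))
            continue;

        len = snprintf(path, sizeof path, "%s/%s", dir, ent->d_name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;              /* path too long, skip it */

        handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (!handle)
            continue;

        if (dlsym(handle, "properties"))
            loaded++;              /* keep the plug-in in memory */
        else
            dlclose(handle);       /* not a plug-in of ours */
    }

    closedir(d);
    return loaded;
}
```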
All of this is very, very simple and straightforward in Linux, as you can see. If you want a specific plug-in functionality example, I recommend you pose that as a separate question, with more details on what kind of functionality the interface should expose -- the exact data structures and function prototypes do depend very much on what kind of application we have at hand.
Questions? Comments?
What is the simplest possible C function for starting the R interpreter, passing in a small expression (eg, 2+2), and getting out the result? I'm trying to compile with MingW on Windows.
You want to call R from C?
Look at section 8.1 in the Writing R Extensions manual. You should also look into the "tests" directory (download the source package, extract it, and you'll have the tests directory). A similar question was previously asked on R-Help and here was the example:
#include <Rinternals.h>
#include <Rembedded.h>
SEXP hello() {
return mkString("Hello, world!\n");
}
int main(int argc, char **argv) {
SEXP x;
Rf_initEmbeddedR(argc, argv);
x = hello();
return x == NULL; /* i.e. 0 on success */
}
The simple example from the R manual is like so:
#include <Rembedded.h>
int main(int argc, char **argv)
{
/* do some setup */
Rf_initEmbeddedR(argc, argv);
/* do some more setup */
/* submit some code to R, which is done interactively via
run_Rmainloop();
A possible substitute for a pseudo-console is
R_ReplDLLinit();
while(R_ReplDLLdo1() > 0) {
add user actions here if desired
}
*/
Rf_endEmbeddedR(0);
/* final tidying up after R is shutdown */
return 0;
}
Incidentally, you might want to consider using RInside instead: Dirk provides a nice "hello world" example on the project homepage.
If you're interested in calling C from R, here's my original answer:
This isn't exactly "hello world", but here are some good resources:
Jay Emerson recently gave a talk on R package development at the New York useR group, and he provided some very nice examples of using C from within R. Have a look at the paper from this discussion on his website, starting on page 9. All the related source code is here: http://www.stat.yale.edu/~jay/Rmeetup/MyToolkitWithC/.
The course taught at Harvard by Gopi Goswami in 2005: C-C++-R (in Statistics). This includes extensive examples and source code.
Here you go. It's the main function, but you should be able to adapt it to a more general purpose function. This example builds an R expression from C calls and also from a C string. You're on your own for the compiling on windows, but I've provided compile steps on linux:
/* simple.c */
#include <Rinternals.h>
#include <Rembedded.h>
#include <R_ext/Parse.h>
int
main(int argc, char *argv[])
{
char *localArgs[] = {"R", "--no-save","--silent"};
SEXP e, tmp, ret;
ParseStatus status;
int i;
Rf_initEmbeddedR(3, localArgs);
/* EXAMPLE #1 */
/* Create the R expressions "rnorm(10)" with the R API.*/
PROTECT(e = allocVector(LANGSXP, 2));
tmp = findFun(install("rnorm"), R_GlobalEnv);
SETCAR(e, tmp);
SETCADR(e, ScalarInteger(10));
/* Call it, and store the result in ret */
PROTECT(ret = R_tryEval(e, R_GlobalEnv, NULL));
/* Print out ret */
printf("EXAMPLE #1 Output: ");
for (i=0; i<length(ret); i++){
printf("%f ",REAL(ret)[i]);
}
printf("\n");
UNPROTECT(2);
/* EXAMPLE 2*/
/* Parse and eval the R expression "rnorm(10)" from a string */
PROTECT(tmp = mkString("rnorm(10)"));
PROTECT(e = R_ParseVector(tmp, -1, &status, R_NilValue));
PROTECT(ret = R_tryEval(VECTOR_ELT(e,0), R_GlobalEnv, NULL));
/* And print. */
printf("EXAMPLE #2 Output: ");
for (i=0; i<length(ret); i++){
printf("%f ",REAL(ret)[i]);
}
printf("\n");
UNPROTECT(3);
Rf_endEmbeddedR(0);
return(0);
}
Compile steps:
$ gcc -I/usr/share/R/include/ -c -ggdb simple.c
$ gcc -o simple simple.o -L/usr/lib/R/lib -lR
$ LD_LIBRARY_PATH=/usr/lib/R/lib R_HOME=/usr/lib/R ./simple
EXAMPLE #1 Output: 0.164351 -0.052308 -1.102335 -0.924609 -0.649887 0.605908 0.130604 0.243198 -2.489826 1.353731
EXAMPLE #2 Output: -1.532387 -1.126142 -0.330926 0.672688 -1.150783 -0.848974 1.617413 -0.086969 -1.334659 -0.313699
I don't think any of the above has answered the question - which was to evaluate 2 + 2 ;). To use a string expression would be something like:
#include <Rinternals.h>
#include <R_ext/Parse.h>
#include <Rembedded.h>
int main(int argc, char **argv) {
SEXP x;
ParseStatus status;
const char* expr = "2 + 2";
Rf_initEmbeddedR(argc, argv);
x = R_ParseVector(mkString(expr), 1, &status, R_NilValue);
if (TYPEOF(x) == EXPRSXP) { /* parse returns an expr vector, you want the first */
x = eval(VECTOR_ELT(x, 0), R_GlobalEnv);
PrintValue(x);
}
Rf_endEmbeddedR(0);
return 0;
}
This lacks error checking, obviously, but works:
Z:\>gcc -o e.exe e.c -IC:/PROGRA~1/R/R-213~1.0/include -LC:/PROGRA~1/R/R-213~1.0/bin/i386 -lR
Z:\>R CMD e.exe
[1] 4
(To get the proper commands for your R use R CMD SHLIB e.c which gives you the relevant compiler flags)
You can also construct the expression by hand if it's simple enough - e.g., for rnorm(10) you would use
SEXP rnorm = install("rnorm");
SEXP x = eval(lang2(rnorm, ScalarInteger(10)), R_GlobalEnv);
I think you can't do much better than the inline package (which supports C, C++ and Fortran):
library(inline)
fun <- cfunction(signature(x="ANY"),
body='printf("Hello, world\\n"); return R_NilValue;')
res <- fun(NULL)
which will print 'Hello, world' for you. And you don't even know where / how / when the compiler and linker are invoked. [ The R_NilValue is R's NULL version of a SEXP and the .Call() signature used here requires that you return a SEXP -- see the 'Writing R Extensions' manual which you can't really avoid here. ]
You will then take such code and wrap it in a package. We had great success with using
inline for the
Rcpp unit tests (over 200 and counting now) and some of the examples.
Oh, and this inline example will work on any OS. Even Windoze provided you have the R package building tool chain installed, in the PATH etc pp.
Edit: I misread the question. What you want is essentially what the littler front-end does (using pure C) and what the RInside classes factor out for C++.
Jeff and I never bothered with porting littler to Windoze, but RInside did work there in the most recent release. So you should be able to poke around the build recipes and create a C-only variant of RInside so that you can feed expressions to an embedded R process. I suspect that you still want something like Rcpp for the glue, as it gets tedious otherwise.
Edit 2: And as Shane mentions, there are indeed a few examples in the R sources in tests/Embedding/ along with a Makefile.win. Maybe that is the simplest start if you're willing to learn about R internals.