How do I access the Ruby AST from C level code?

I understand that the Ruby 1.8 AST is traversed at runtime using a big switch statement, and many things like calling a method in a class or parent module involve the interpreter looking up and down the tree as it goes. Is there a straightforward way of accessing this AST in a Ruby C extension? Does it involve the Ruby extension API, or necessitate hacking the internal data structures directly?

A good starting point is probably to read the source of the ParseTree library, which lets you get at and mess with the AST from Ruby.

Thanks for the tip. You're right - ParseTree seems to be the only code out there with any manipulation of the AST going on, except that it's actually written in RubyInline.
So, it's a strange mixture between Ruby and C code. Very interesting reading, though.
The other reference of course is eval.c from Ruby itself.
It's going to take a fair bit of reading of both, to get my head around it.
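For orientation, the "big switch statement" style of evaluation mentioned in the question can be sketched in a few lines. This is a toy in Python, not Ruby's eval.c: the node names (LIT, ADD, MUL, IF) and the tuple layout are invented for the sketch and do not correspond to Ruby's real node types.

```python
# Toy AST: each node is a tuple whose first element names the node type.
# The node names below are invented for this sketch; they are NOT Ruby's
# actual node constants.

def evaluate(node):
    """Recursively evaluate a node by branching on its type tag."""
    kind = node[0]
    if kind == "LIT":
        # Literal: the value is stored directly in the node
        return node[1]
    elif kind == "ADD":
        return evaluate(node[1]) + evaluate(node[2])
    elif kind == "MUL":
        return evaluate(node[1]) * evaluate(node[2])
    elif kind == "IF":
        # Only the selected branch is evaluated
        _, cond, then_branch, else_branch = node
        if evaluate(cond):
            return evaluate(then_branch)
        return evaluate(else_branch)
    raise ValueError(f"unknown node type: {kind!r}")


# if 1 then 2 + (3 * 4) else 0
tree = ("IF",
        ("LIT", 1),
        ("ADD", ("LIT", 2), ("MUL", ("LIT", 3), ("LIT", 4))),
        ("LIT", 0))
result = evaluate(tree)
print(result)
```

The point is only the shape: one recursive function, one branch per node type. Reading eval.c with that shape in mind may make it easier to follow.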

Related

Call Esper from my C program

Can some kind soul please show me a cookbook way that I can call Esper from my C program? Ideally (I think that) I'd like to call an Esper function/method with a line of EPL and get a value returned.
EDIT: I ask this question because I have 12,000 lines of working C code that I want to keep. Esper offers some really nice event evaluation that's crucial to my C code. JNI seems to be oriented toward calling C code from Java, maybe because C is faster for some things; I want to go the other way: to call Java code from C to take advantage of the power in the Java package, which is called Esper.
Thanks!
Try the Socket Adapter in EsperIO: doc link. Seems to be aimed more at getting events into Esper though; is that what you want? Otherwise, take the concept: sockets are one proven 'cookbook' way of implementing IPC and they save all that complicated messing with JNI.
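The socket idea is not tied to any language. As a rough sketch (in Python for brevity, not C or Java), a line-based request/reply over TCP could look like the following. The `serve_once` and `send_statement` names and the "OK <length>" reply are made up for illustration; none of this is Esper's or EsperIO's API. In a real setup the server side would live in the Java process hosting Esper and the client side would be the C program.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection, read one line, reply with a stand-in result."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        line = reader.readline().strip()
        # Stand-in for the real engine: just report the statement length.
        conn.sendall(f"OK {len(line)}\n".encode("utf-8"))


def send_statement(port, statement):
    """Client side: send one line of text and return the one-line reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall((statement + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()
reply = send_statement(port, "select * from Tick")
worker.join()
server.close()
print(reply)
```

A newline-terminated text protocol like this is easy to speak from C with plain `connect`/`send`/`recv`, which is the main attraction over JNI.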

Abstract-syntax tree for subset of C

For teaching purposes, we are building a step-by-step JavaScript interpreter for (a subset of) C code.
Basically we have: int, float..., arrays, functions, for, while... no pointers.
The JavaScript interpreter is done; it lets us explain how a boolean expression is evaluated, shows the variable stack...
For now, we are manually converting our C examples to some JavaScript that will run and build a stack of actions (assignment, function call...) that can later be used to do the step-by-step stuff. Since we are limiting ourselves to a subset of C, it's quite easy to do.
Now we would like to compile the C code to our JavaScript representation. All we need is an abstract syntax tree of the C code, and the JavaScript generation is straightforward.
Do you know a good C parser that could generate such a tree? It doesn't need to be in JavaScript (though that would be perfect); any language is fine, as this can be done offline.
I've looked at Emscripten ( https://github.com/kripken/emscripten ) but it's more a C=>javascript compiler and that's not what we want.
I've recently used Eli Bendersky's pycparser to mess with ASTs of C code. I think it'd work well for your purposes.
I think that ANTLR has a full C parser.
To do your translation task, I suspect you will need full symbol table support; you have to know what the symbols mean. Here most "parsers" will fail you; they don't build a full symbol table. I think ANTLR does not, but I could be wrong.
Our DMS Software Reengineering Toolkit with its C Front End provides a full C parser, and builds complete symbol tables. (You may not need it for your application, but it includes a full C preprocessor, too.) It also provides control flow, data flow, points-to analysis and call graph construction, all of which can be useful in translating C to whatever your target virtual machine is.

How to interface between C and gprolog?

I am in the somewhat unfortunate position of interfacing C and Prolog code. We have some data collection code in C, and some analysis code in GNU Prolog. So what is the best method to interface C and gprolog? I am currently trying to use the C library included in the gprolog package to call Prolog from C.
Note: I am working on Ubuntu machines.
One of the problems I was facing was how to iterate over a list. I finally realized that though you could make a list out of n elements, you had to iterate over it in Prolog fashion - get the head and get the tail and recurse.
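To make the head/tail idea concrete, here is a language-neutral sketch in Python. It models a list as nested (head, tail) pairs and walks it by recursion. The `cons`, `from_items` and `collect` helpers are invented for the illustration and are not part of the gprolog C API.

```python
EMPTY = None  # stands in for Prolog's empty list []


def cons(head, tail):
    """Build one list cell, like [Head|Tail] in Prolog."""
    return (head, tail)


def from_items(items):
    """Build a cons-style list from n elements, last element first."""
    lst = EMPTY
    for item in reversed(items):
        lst = cons(item, lst)
    return lst


def collect(lst):
    """Walk the list Prolog-style: take the head, then recurse on the tail."""
    if lst is EMPTY:
        return []
    head, tail = lst
    return [head] + collect(tail)


cells = from_items([10, 20, 30])
print(cells)           # nested pairs, not a flat array
print(collect(cells))
```

There is no index-based access in this model, which is why a plain counting loop over the list does not carry over.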
There's an entire chapter called Interfacing Prolog and C in the GNU-Prolog manual. I expect that you've seen this since you mention the manual in your comment, but since you seem to be asking for more information than what's given there, perhaps you could be more specific about where you're having trouble?

Is there a way to translate C code to Ruby?

I'm sitting with massive amounts of legacy C code that needs to be converted to Ruby for a project.
I've seen Ruby to C translators online, but not the other way around. Would there be a simple way to approach this particular problem?
You'll either have to write a C to Ruby translator, which is possible but the effort might not be justifiable, or you could split the C code up into smaller modules that you can create Ruby wrappers for as a first step. Once they're all wrapped in Ruby and the main control flow is done in Ruby, you can write a test harness (both to verify correctness of your replacement code and to aid reverse engineering) and start replacing the C modules with Ruby modules.
The divide & conquer approach should work with regular Ruby if you use the modules as native extensions but obviously this will cause further problems if you're targeting something like JRuby as your runtime environment. If you want to do something similar in JRuby as per your comment, you're looking at wrapping the C modules in JNI and calling through from the JVM that way. Either way will allow your C code to interact with the Ruby code, but the two approaches are not interchangeable.
Neither approach is going to be quick and both are going to be a lot of work.
You could:
Write a C interpreter in Ruby (very hard).
Wrap the compiled C code with something like SWIG (much easier).
Programming in C and programming in Ruby follow completely different paradigms. So while the old saying that you can write Fortran (or C in this case) code in any language is true, the Ruby code that you would eventually get by machine translation wouldn't be Ruby at all, except syntactically.
So, IMHO, any way other than manual (and done by proficient Rubyists, I might add) would be either impossible, or at least not useful at all.
Maybe you can try this https://github.com/jackieju/CPP2Ruby/
The C++ support is new, but the C-to-Ruby conversion has existed for longer.
You can try Abap2Ruby. It can also convert C++ to Ruby.

Is it possible to write code to write code?

I've heard that there are some things one cannot do as a computer programmer, but I don't know what they are. One thing that occurred to me recently was: wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself. Is it possible for code to write code?
If you want to learn about the limits of computability, read about the halting problem:
"In computability theory, the halting problem is a decision problem which can be stated as follows: given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input.
Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist."
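The core of the argument can be sketched in code. This is an informal illustration in Python, not Turing's proof: given any candidate decider `halts`, one can build a program that does the opposite of whatever the decider predicts for it. The names `make_paradox` and `pessimist` are made up here, and only the case that terminates is actually run.

```python
def make_paradox(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def paradox():
        if halts(paradox):
            while True:      # predicted to halt, so loop forever instead
                pass
        return "halted"      # predicted to loop, so halt immediately
    return paradox


def pessimist(program):
    """A candidate decider that claims no program ever halts."""
    return False


paradox = make_paradox(pessimist)
prediction = pessimist(paradox)
outcome = paradox()          # terminates, so the prediction was wrong
print(prediction, outcome)
```

A decider that answered True would be wrong as well, because `paradox` would then loop forever. Whatever `halts` answers about its own paradox program, the program contradicts it.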
Start by looking at quines, then at Macro-Assemblers and then lex & yacc, and flex & bison. Then consider self-modifying code.
Here's a quine (formatted, use the output as the new input):
#include<stdio.h>
main()
{
    char *a = "main(){char *a = %c%s%c; int b = '%c'; printf(a,b,a,b,b);}";
    int b = '"';
    printf(a,b,a,b,b);
}
Now, if you're just looking for things programmers can't do, look not at NP-complete problems, which are solvable if possibly slow, but at undecidable ones such as the halting problem.
Sure it is. That's how a lot of viruses work!
Get your head around this: computability theory.
Yes, that's what most Lisp macros do (for just one example).
Yes, it certainly is, though maybe not in the context you are referring to. Check out this post on T4.
If you look at functional programming, there are many opportunities to write code that generates further code. The way a language like Lisp doesn't differentiate between code and data is a significant part of its power.
Rails generates the various default model and controller classes from the database schema when it's creating a new application. It's quite standard to do this kind of thing with dynamic languages. I have a few bits of PHP around that generate PHP files, just because it was the simplest solution to the problem I was dealing with at the time.
So it is possible. As for the question you are asking, though, it is perhaps a little vague. What environment and language are you using? What do you expect the code to do, and why does it need to be added to? A concrete example may bring more directly relevant responses.
Yes it is possible to create code generators.
Most of the time they take user input and produce valid code. But there are other possibilities.
Self-modifying programs are also possible, but they were more common in the DOS era.
Of course you can! In fact, if you use a dynamic language, the class can change itself (or another class) while the program is still running. It can even create new classes that didn't exist before. This is called metaprogramming, and it lets your code become very flexible.
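As a small illustration of that (a sketch in Python; the `Greeter` and `Farewell` classes are made up for the example), a method can be attached to an existing class at runtime, and a brand-new class can be created with the built-in `type(name, bases, namespace)`:

```python
class Greeter:
    def __init__(self, name):
        self.name = name


greeter = Greeter("Ada")  # created BEFORE the class is modified


def shout(self):
    return f"HELLO, {self.name.upper()}!"


# Add a method to an existing class while the program is running.
# Instances that already exist pick it up too.
setattr(Greeter, "shout", shout)

# Create a class that appears nowhere in the source as a `class` statement.
Farewell = type(
    "Farewell",
    (Greeter,),
    {"bye": lambda self: f"Goodbye, {self.name}."},
)

farewell = Farewell("Ada")
print(greeter.shout())
print(farewell.bye())
```

Ruby offers the same kind of thing through open classes and `define_method`; the Python version is only one way to show the idea.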
You are confusing/conflating two meanings of the word "write". One meaning is the physical writing of bytes to a medium, and the other is designing software. Of course you can have the program do the former, if it was designed to do so.
The only way for a program to do something that the programmer did not explicitly intend it to do, is to behave like a living creature: mutate (incorporate in itself bits of environment), and replicate different mutants at different rates (to avoid complete extinction, if a mutation is terminal).
Sure it is. I wrote an effect for Paint.NET* that gives you an editor and allows you to write a graphical effect "on the fly". When you pause typing it compiles it to a dll, loads it and executes it. Now, in the editor, you only need to write the actual render function, everything else necessary to create a dll is written by the editor and sent to the C# compiler.
You can download it free here: http://www.boltbait.com/pdn/codelab/
In fact, there is even an option to see all the code that was written for you before it is sent to the compiler. The help file (linked above) talks all about it.
The source code is available to download from that page as well.
*Paint.NET is a free image editor that you can download here: http://getpaint.net
In relation to artificial intelligence, take a look at Evolutionary algorithms.
"make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself"
You can also generate code, build it into a library instead of an executable, and then dynamically load the library without even exiting the program that is currently running.
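A rough analogue of that in Python, where a module plays the role of the library: generate source text, write it to a file, and import it into the running process. The helper name `load_generated_module` and the generated `times_N` functions are invented for the sketch. With a compiled language, the equivalent steps would be invoking the compiler and then loading the shared library (for example via `dlopen`).

```python
import importlib.util
import os
import tempfile


def load_generated_module(name, source):
    """Write source to a temporary .py file and import it into this process."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name + ".py")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the generated code
    return module


# Generate the code: one small function per multiplier.
source = "\n".join(
    f"def times_{n}(x):\n    return x * {n}\n" for n in (2, 3)
)

plugin = load_generated_module("generated_plugin", source)
print(plugin.times_2(5), plugin.times_3(5))
```

The program that generated the code keeps running and simply calls into the newly loaded module.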
Dynamic languages usually don't work quite as you suggest, in that they don't have a completely separate compilation step. It isn't necessary for a program to modify its own source code, recompile, and start from scratch. Typically the new functionality is compiled and linked in on the fly.
Common Lisp is a very good language to practice this in, but there are others where you can create code and run it then and there. Typically, this will be through a function called "eval" or something similar. Perl has an "eval" function, and it's generally common for scripting languages to have the ability.
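Python is one of those scripting languages, so here is a minimal sketch of "create code and run it then and there" using its built-in `exec` and `eval`. The `make_polynomial` helper is made up for the example; it builds the source of a function as a string and compiles it on the fly.

```python
def make_polynomial(coefficients):
    """Generate the source of a polynomial function and compile it with exec."""
    terms = " + ".join(f"{c} * x ** {i}" for i, c in enumerate(coefficients))
    source = f"def poly(x):\n    return {terms}\n"
    namespace = {}
    exec(source, namespace)  # defines poly inside namespace
    return namespace["poly"], source


# 1 + 0*x + 2*x^2
poly, source = make_polynomial([1, 0, 2])
print(source)

# eval handles single expressions; here it calls the generated function.
value = eval("poly(3) + 1", {"poly": poly})
print(value)
```

As with any eval-style facility, this should only ever be fed strings the program itself produced or otherwise trusts.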
There are a lot of programs that write other programs, such as yacc or bison, but they don't have the same dynamic quality you seem to be looking for.
Take a look at Langton's loops. They are a simple example of a self-reproducing "program".
There is a whole class of such things called "code generators". (A compiler also fits the description as you set it.) Those describe the two areas of these beasts.
Most code generators take some form of user input (most take a database schema) and produce source code, which is then compiled.
More advanced ones can output executable code. With .NET, there's a whole namespace (System.CodeDom) dedicated to the creation of executable code. With these objects, you can take C# (or another language) code, compile it, and link it into your currently running program.
I do this in PHP.
To persist settings for a class, I keep a local variable called $data. $data is just a dictionary/hashtable/assoc-array (depending on where you come from).
When you load the class, it includes a PHP file which basically defines $data. When I save the class, it writes the PHP out for each value of $data. It's a slow write process (and there are currently some concurrency issues), but it's lightning fast to read. So much faster (and lighter) than using a database.
Something like this wouldn't work for all languages. It works for me in PHP because PHP is very much on-the-fly.
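The same trick carries over to other on-the-fly languages. A sketch of the idea in Python rather than PHP, with invented helper names `save_settings` and `load_settings`: saving writes out source code that rebuilds the dictionary, and loading just executes that file.

```python
import os
import pprint
import tempfile


def save_settings(path, data):
    """Write data out as Python source that rebuilds the dict when executed."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("data = " + pprint.pformat(data) + "\n")


def load_settings(path):
    """Execute the generated file and return the dict it defines."""
    namespace = {}
    with open(path, encoding="utf-8") as fh:
        exec(compile(fh.read(), path, "exec"), namespace)
    return namespace["data"]


settings = {"theme": "dark", "recent": ["a.txt", "b.txt"], "autosave": True}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings_data.py")
    save_settings(path, settings)
    restored = load_settings(path)

print(restored)
```

This only works for values whose printed form is valid source (strings, numbers, lists, dicts and so on), and the file must be trusted, since loading it runs whatever it contains.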
It has always been possible to write code generators. With XML technology, code generators can be an essential tool. Suppose you work for a company that has to deal with XML files from other companies. It is relatively straightforward to write a program that uses the XML parser to parse the new XML file and write another program that has all the callback functions set up to read XML files of that format. You would still have to edit the new program to make it specific to your needs, but the development time when a new XML file (new structure, new names) arrives is cut down a lot by using this type of code generator. In my opinion, this is part of the strength of XML technology.
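A minimal sketch of such a generator, in Python with the standard `xml.etree.ElementTree` parser. It reads a sample document and emits the source of a handler class with one empty callback per element name. The sample XML, the `on_<tag>` naming scheme and the `GeneratedHandler` class name are all assumptions made for the illustration, and it assumes tag names are valid identifiers (no namespaces or hyphens).

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<order id="7"><customer>Ann</customer>'
    '<item sku="x1">Pen</item><item sku="x2">Ink</item></order>'
)


def generate_handler_source(xml_text, class_name="GeneratedHandler"):
    """Emit Python source for a class with one stub callback per element name."""
    root = ET.fromstring(xml_text)
    tags = []
    for element in root.iter():        # document order, root first
        if element.tag not in tags:    # one callback per distinct tag
            tags.append(element.tag)
    lines = [f"class {class_name}:"]
    for tag in tags:
        lines.append(f"    def on_{tag}(self, element):")
        lines.append(f"        # TODO: handle <{tag}> elements")
        lines.append("        pass")
        lines.append("")
    return "\n".join(lines)


source = generate_handler_source(SAMPLE)
print(source)

# Check that the generated text is valid code by compiling it on the spot.
namespace = {}
exec(source, namespace)
handler_cls = namespace["GeneratedHandler"]
```

The generated skeleton still has to be filled in by hand, which matches the workflow described above: the generator saves the boilerplate, not the thinking.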
Lisp lisp lisp lisp :p
Joking aside, if you want code that generates code to run, and you have time to lose learning it and breaking your mind on recursive stuff that generates more code, try to learn Lisp :)
(eval '(or true false))
"wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself"
There are almost no cases where that would solve a problem that cannot be solved "better" using non-self-modifying code.
That said, there are some very common (useful) cases of code writing other code. The most obvious is any server-side web application, which generates HTML/JavaScript (well, HTML is markup, but it's identical in theory). Also, any script that alters a terminal's environment usually outputs a shell script that is eval'd by the parent shell. wxGlade generates code that creates bare-bones wx-based GUIs.
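The shell case is easy to show. A child process cannot change its parent shell's environment directly, so it prints shell code and the parent evaluates it. A minimal sketch in Python, with a made-up `shell_exports` helper and made-up variable names:

```python
import shlex


def shell_exports(variables):
    """Render a dict as shell `export` lines, quoting values safely."""
    return "\n".join(
        f"export {name}={shlex.quote(value)}"
        for name, value in variables.items()
    )


script = shell_exports({"EDITOR": "vim", "GREETING": "hello world"})
print(script)
```

If this were saved as a script (say a hypothetical gen_env.py), the parent shell would run it as `eval "$(python3 gen_env.py)"` to pick up the variables.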
See our DMS Software Reengineering Toolkit. This is general purpose machinery to read and modify programs, or generate programs by assembling fragments.
This is one of the fundamental questions of Artificial Intelligence. Personally I hope it is not possible - otherwise soon I'll be out of a job!!! :)
It is called meta-programming and is both a nice way of writing useful programs, and an interesting research topic. Jacques Pitrat's Artificial Beings: the conscience of a conscious machine book should interest you a lot. It is mostly related to meta-knowledge based computer programs.
Another related term is multi-staged programming (because there are several stages of programs, each generating the next one).
