Plugin architecture in C using libdl

I've been toying around, writing a small IRC framework in C that I'm now going to expand with some core functionality - but beyond that, I'd like it to be extensible with plugins!
Up until now, whenever I wrote something IRC related (and I wrote a lot, in about 6 different languages now... I'm on fire!) and actually went ahead to implement a plugin architecture, it was inside an interpreted language that had facilities for doing (read: abusing) so, like jamming a whole script file through eval in Ruby (bad!).
Now I want to abuse something in C!
Basically, there are three things I could do:
define a simple script language inside of my program
use an existing one, embedding an interpreter
use libdl to load *.so modules on runtime
I'm fond of the third one and would rather avoid the other two options if possible. Maybe I'm a masochist of some sort, but I think it could be both fun and useful for learning purposes.
Logically thinking, the obvious "pain-chain" would be (lowest to highest) 2 -> 1 -> 3, for the simple reason that libdl is dealing with raw code that can (and will) explode in my face more often than not.
So this question goes out to you, fellow users of Stack Overflow: do you think libdl is up to this task, or is it even a realistic thought?

libdl is very well suited to plug-in architectures - within certain boundaries :-). It is used for exactly this purpose in a great deal of software. It works well in situations where there is a well-defined API/interface between the main program and the plug-in, and a number of different plug-ins implement the same API/interface. For instance, your IRC client might have plug-ins that implement gateways to different IM protocols (Jabber, MSN, Sametime etc.) - all of these are very similar, so you could define an API with functions such as "send message", "check for reply" etc. - and write a bunch of plug-ins, each implementing a different protocol.
The situation where it works less well is where you want the plug-ins to make arbitrary changes to the behaviour of the main program - in the way that, for instance, Firefox plug-ins can change the behaviour of browser tabs, their appearance, add/remove buttons, and so on. This sort of thing is much easier to achieve in a dynamic language (hence why much of Firefox is implemented in JavaScript), and if this is the sort of customisation you want, you may be better off with your option (2), writing a lot of your UI in the scripting language...
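To make the first case concrete, here is a minimal sketch of what such a gateway API might look like as a C header. All names here are hypothetical, invented for illustration:

/* gateway.h - hypothetical plug-in interface for IM protocol gateways */
typedef struct gateway_ops {
    const char *name;                               /* e.g. "jabber", "msn" */
    int  (*connect)(const char *server);
    int  (*send_message)(const char *to, const char *text);
    int  (*check_for_reply)(char *buf, int buflen);
    void (*disconnect)(void);
} gateway_ops;

/* Each plug-in exports one well-known symbol that returns its ops table,
 * which the main program looks up with dlsym() after dlopen(). */
const gateway_ops *gateway_get_ops(void);

The main program only ever talks to the gateway_ops table, so every protocol plug-in is interchangeable behind the same interface.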

dlopen() / dlsym() are probably the easiest way to go. Some silly pseudo code:
#include <dlfcn.h>

int run_module(const char *path, char **args)
{
    void *module;
    int (*initfunc)(char **args);
    int rc = 0;

    module = dlopen(path, RTLD_NOW);
    if (module == NULL)
        err_out("Could not open module %s: %s", path, dlerror());

    initfunc = (int (*)(char **)) dlsym(module, "module_init");
    if (initfunc == NULL) {
        dlclose(module);
        err_out("Could not find symbol module_init in %s", path);
    }

    rc = initfunc(args);
    dlclose(module);
    return rc;
}
You would, of course, want much more in the way of error checking, as well as code that actually did something useful :) It is, however, extremely easy and convenient to write a plug-in architecture around this pair and to publish a simple spec for others to write plug-ins against.
You'd probably want something more along the lines of load_module(); the above just loads the shared object, finds an entry point, and blocks until that entry point returns.
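To illustrate, a matching plug-in could be as small as this (a sketch; only the symbol name module_init is dictated by the loader above, everything else is made up):

/* myplugin.c - hypothetical plug-in for the loader sketched above.
 * Build with something like: cc -shared -fPIC -o myplugin.so myplugin.c */
#include <stdio.h>

int module_init(char **args)
{
    puts("myplugin loaded");
    return 0; /* status handed back through run_module()'s rc */
}

Calling run_module("./myplugin.so", args) would then load it, run module_init(), and unload it again.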
That's not to say that writing your own scripting language is a bad idea. People could write complex filters, responders, etc. without having to go through a lot of trouble. Perhaps both would be a good idea. I don't know if you'd want a full-fledged Lua interpreter... maybe you could come up with something that makes taking actions based on regular expressions simple, as in the sketch below.
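For instance, a rough sketch of that idea using POSIX <regex.h> (the trigger table and all names are invented for illustration):

#include <regex.h>
#include <stdio.h>

/* Hypothetical trigger table: run a callback when an incoming IRC line
 * matches a pattern. Real code would compile each pattern only once. */
struct trigger {
    const char *pattern;
    void (*action)(const char *line);
};

static void greet(const char *line) { printf("greeting: %s\n", line); }

static struct trigger triggers[] = {
    { "^!hello", greet },
};

void dispatch(const char *line)
{
    for (size_t i = 0; i < sizeof triggers / sizeof triggers[0]; i++) {
        regex_t re;
        if (regcomp(&re, triggers[i].pattern, REG_EXTENDED) != 0)
            continue;   /* bad pattern, skip it */
        if (regexec(&re, line, 0, NULL, 0) == 0)
            triggers[i].action(line);
        regfree(&re);
    }
}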
Still, plug-in modules will not only make your life simpler, they'll also help you grow a community of people developing stuff around whatever you make.

There are plenty of existing C programs out there that use dlopen() / dlsym() to implement a plugin architecture (including more than one IRC-related one); so yes, it is definitely up to the task.

Related

Advantages of dynamic linking with either ld utility vs. dlfcn API?

I am doing some research into platform-independent code and found mention of the dlfcn API. It was the first time I had come across it, so I did further research into it. Now, hopefully my lack of experience/understanding of platform-independent code as well as compiling/linking isn't going to show in this post, but to me the dlfcn API just lets us do programmatically the same dynamic linking that the ld utility does. If I have misconceptions please correct me, as I would like to know. Regarding what I think I know about the ld utility and the dlfcn API, I have some questions.
What are the advantages of using either the ld utility vs. dlfcn API to dynamically link?
My first thought was that the dlfcn API seems like a waste of my time, since I need to request pointers to the functions, versus having ld examine the symbol table for undefined symbols and link them for me. Similarly, ld does everything for me, while I have to do everything by hand with the dlfcn API (i.e. open/load the library, get a function pointer, close the library, etc.). But on second glance I thought that there may be some advantages. One being that we can unload a library from memory after we are done using it.
In this way memory could be saved if we knew we didn't need a library the whole time. I am unsure if there is any "memory/library" management for libraries dynamically linked by ld. Similarly, I am unsure in what scenarios/environments we would be interested in using the dlfcn API to save said memory, as it seems this wouldn't be a problem on modern systems. I presume one would be the usage of the library on a system with very, very limited resources (maybe some embedded system?).
What other advantages or disadvantages may there be?
What "coding pattern" is used for platform independent code in regards to dynamic linking?
If I was making platform independent code that depended on system calls I could see myself achieving platform independent code by coding in one of three styles:
Logical branching directly in my libraries' code via macros. Something like:
void myAwesomeFunction()
{
    ...
#if defined(_MSC_VER)
    // Call some Windows system call
#elif defined(__GNUC__)
    // Call some Unix system call
#endif
    ...
}
Create generic system-call wrapper functions and use those in my libraries' code. Something like:
OS_Calls.h
void OS_openFile(string myFile)
{
    ...
#if defined(_MSC_VER)
    // Call Windows system call to open file
#elif defined(__GNUC__)
    // Call Unix system call to open file
#endif
    ...
}
MyAwesomeFunctions.cpp
#include "OS_Calls.h"
void myAwesomeFunction()
{
...
OS_openFile("my awesome file");
...
}
Similar to option one, but adding a layer of abstraction by using the dlfcn API:
MyLibraryLoader.h
void* GetLibraryFunction(void* lib, char* funcName)
{
    ...
    return dlsym(lib, funcName);
}
MyAwesomeFunctions.cpp
#include "MyLibraryLoader.h"
void myAwesomeFunction()
{
Result result = GetLibraryFunction(someLib, someFunc)(arguments...);
}
Which ones are typically used, and why? And if there are any others that aren't listed here and are preferable to mine, please let me know.
Thanks for reading this post. I will keep it updated so that it may serve as a future informative reference.
dlfcn and ld do not solve the same problem: in fact you can use both in your project.
The dlfcn API is meant to support plugin architectures, in which you define an interface which modules should implement. An application can then load different implementations of that interface, for various reasons (extensibility, customization, etc.).
ld, well, links the libraries your application requests, but it does that at link time, not at run time. It doesn't support plugin architectures in any way, since ld links only the objects specified on the command line.
Of course you could use only the dlfcn API, but it is not meant to be used in that way and, of course, using it in that way would be a huge pain in your rectum.
For your second question, I think the best pattern is the second one.
Branching "directly in the code" can be confusing, because it's not immediately obvious what the two branches accomplish, something which is well-defined if you define a proper abstraction and you implement it using multiple branches for each supported architecture.
Using the dlfcn API here is pretty pointless, because you don't have a uniform interface to call (that's exactly the argument that supports the second pattern), so it just adds bloat to your code.
HTH
I don't think dynamic linkage helps you much with platform independence.
Your second option seems like a reasonable way to achieve platform independence. Most of the code just calls your platform-independent wrappers, while a small part of it is "dirty" with ifdefs.
I don't see how dynamic loading helps here.
Some pros and cons for dynamic loading:
1. Cons:
a. Not the "straightforward" way, requires more work.
b. Prevents standard tools (e.g. ldd) from analyzing dependencies (and thus from helping you understand what you need in order to run successfully).
2. Pros:
a. Allows loading only what you need (e.g. depending on command line arguments), or unloading what you don't. This can save memory.
b. Lets you generate library names in more complicated ways (e.g. read a list of plugins from a configuration file).
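As a sketch of point 2b (the config-file format and function name are invented for illustration), the application could read one shared-object path per line and dlopen() each:

#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Load every plugin listed, one path per line, in a config file. */
int load_plugins(const char *conf_path)
{
    char line[512];
    int loaded = 0;
    FILE *f = fopen(conf_path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';       /* strip trailing newline */
        if (dlopen(line, RTLD_NOW) != NULL)     /* handle stays open for the process */
            loaded++;
        else
            fprintf(stderr, "skipping %s: %s\n", line, dlerror());
    }
    fclose(f);
    return loaded;
}

Nothing comparable is possible with plain ld linking, since the set of libraries would have to be fixed at link time.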

Spaghetti code visualisation software?

A smoking pile of spaghetti just landed on my desk, and my task is to understand it (so I can refactor/reimplement it).
The code is C, and a mess of global variables, structure types and function calls.
I would like to plot graphs of the code with the information:
- Call graph
- Which struct types are used in which functions
- Which global variable is used in what function
Hopefully this would make it easier to identify connected components, and extract them to separate modules.
I have tried the following software for similar purposes:
- ncc
- ctags
- codeviz / gengraph
- doxygen
- egypt
- cflow
EDIT2:
- frama-c
- snavigator
- Understand
The shortcomings of these are either
a) requires me to be able to compile the code. My code does not compile, since portions of the source code are missing.
b) issues with preprocessor macros (like cflow, which wants to follow both branches of #if statements). Running the code through cpp first would mess up the line numbers.
c) I for some reason do not manage to get the software to do what I want (like doxygen; the documentation for call-graph generation is not easy to find, and since it does not seem to plot variables/data types anyway, it is probably not worth spending more time learning about doxygen's config options). EDIT: I did follow these Doxygen instructions, but it only plotted header file dependencies.
I am on Linux, so it is a huge plus if the software runs on Linux and is free software. Not sure my boss understands the need to buy a visualizer :-(
For example: a command-line tool that lists the functions in which a symbol (function, variable, type) is referenced would be of great help (like addr2line, but for types/variable names/functions and source code).
//T
My vote goes to GNU GLOBAL. It has all the features of ctags/cscope combined, as well as the ability to generate fully indexed HTML which allows you to browse the code in your favorite browser. Serve it through Apache and you have a web service that anyone can access, including full search capabilities.
It integrates nicely into emacs/vim/even the bash-shell, and you can use it directly from the shell-prompt.
To see it in action on the linux kernel, visit this
Combine that with a cyclomatic-complexity plugin for Eclipse, which calculates the complexity of your code. Besides cyclomatic complexity it can handle:
McCabe's Cyclomatic Complexity
Efferent Couplings
Lack of Cohesion in Methods
Lines Of Code in Method
Number Of Fields
Number Of Levels
Number Of Locals In Scope
Number Of Parameters
Number Of Statements
Weighted Methods Per Class
...and you should have everything you need.
If you like the command line ;) maybe you could try cscope. It does static analysis of code and can tell you where symbols/variables/functions are referenced. Not the Holy Grail, but it can be pretty useful for browsing unknown source code.
There are also some GUIs that can handle cscope results (Vi, Emacs, JEdit...).
On the other hand, Eclipse with the CDT plugin can also help you to navigate into the spaghetti code you have to maintain.
It's not free and, AFAIK, not available for Linux, but CppDepend might be worth evaluating - at least until someone comes up with a more suitable suggestion :)
http://www.cppdepend.com/ [Demo video here]
If you'd like to know in which functions a symbol is declared or referenced you can try LXR. It's not console based, but is quite usable.

Clean, self-contained VM implemented in C and under 100-200K compiled code size?

I'm looking for a VM with the following features:
Small compiled code footprint (under 200K).
No external dependencies.
Unicode (or raw) string support.
Clean code/well organized.
C(99) code, NOT C++.
C/Java-like syntax.
Operators/bitwise: AND/OR, etc.
Threading support.
Generic/portable bytecode. Bytecode should work on different machines even if it was compiled on a different architecture with different endianness etc.
Barebones, nothing fancy necessary. Only the basic language support.
Lexer/parser and compiler separate from VM. I will be embedding the VM in a program and then compile the bytecode independently.
So far I have reviewed Lua, Squirrel, Neko, Pawn, Io, AngelScript... and the only one which comes somewhat close to the spec is Lua, but the syntax is horrible, it does not have bitwise support, and the code style generally sucks. Squirrel and Io are, mostly, huge. Pawn is problematic: it is small, but its bytecode is not cross-platform and the implementation has some serious issues (e.g. bytecode is not validated at all, not even the headers, AFAIK).
I would love to find a suitable option out there.
Thanks!
Update: JavaScript interpreters are... interpreters. This is a question about a bytecode-based VM, hence the compiler/bytecode-VM separation requirement. JS is interpreted, and very seldom compiled by a JIT; I don't necessarily want a JIT. Also, all current ECMAScript parsers are anything but small.
You say you've reviewed NekoVM, but don't mention why it's not suitable for you.
It's written in C, not C++, the VM is under 10kLOC with a compiled size of roughly 100kB, and the compiler is a separate executable producing portable bytecode. The language itself has C-like syntax, bitwise operators, and it's not thread-hostile.
Finally, after all this time, none of the answers really did it. I ended up forking Lua. As of today no self-contained VM with the above requirements exists... it's a pity ;(
Nonetheless, Pawn is fairly nice, if only the code weren't kind of problematic.
JerryScript:
requires less than 64 KB of RAM
~160 KB binary size
written in C99
VM based
has bytecode precompilation
IoT.js glues JerryScript with libuv (Node.js style); it may be easier to play with.
Threading is probably not there in the state you want. There are recent additions to ECMAScript around background workers on separate threads and shared, cross-thread buffers. I'm not sure what the story is with them in JerryScript - probably not there yet, but who knows; they have a blueprint for how to do it, so it may not be far off.
Try EmbedVM.
http://www.clifford.at/embedvm/
http://svn.clifford.at/embedvm/trunk/
Here's an example of some code, a guessing game. The compiler is built in C with lex+yacc:
global points;

function main()
{
    local num, guess;
    points = 0;
    while (1)
    {
        // report points
        $uf4();
        // get next random number
        num = $uf0();
        do {
            // read next guess
            guess = $uf1();
            if (guess < num) {
                // hint to user: try larger numbers
                $uf2(+1);
                points = points - 1;
            }
            if (guess > num) {
                // hint to user: try smaller numbers
                $uf2(-1);
                points = points - 1;
            }
        } while (guess != num);
        // level up!
        points = points + 10;
        $uf3();
    }
}
There isn't any threading support. But there's no global state in the VM, so it's easy to run multiple copies in the same process.
The API is simple. VM RAM is accessed via callbacks. Your main loop calls embedvm_exec(vmdata) repeatedly; each call executes a single operation and returns.
The VM has a tiny footprint and has been used on 8-bit microcontrollers.
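A sketch of that, interleaving two VM instances in one loop (the context type and setup are assumptions based on the description above; check embedvm.h for the real API):

#include "embedvm.h"   /* assumed header name */

/* Two independent VM contexts in one process. RAM access and user-function
 * calls go through callbacks registered in each context - elided here. */
struct embedvm_s vm1, vm2;   /* assumed context type */

void run_both(void)
{
    for (;;) {
        embedvm_exec(&vm1);  /* executes a single VM operation, then returns */
        embedvm_exec(&vm2);  /* cooperative interleaving, no threads needed */
    }
}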
For something very "barebones":
http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Register_VM_in_C
More of a short introduction to the topic than anything else, granted.
Yet, it probably meets at least these few of the desired criteria:
Small compiled code footprint (under 200K) ... check, obviously;
No external dependencies ... check;
Clean code/well organized ... check;
C(99) code, NOT C++ ... check;
C/Java-like syntax ... check.
One option is to use something minimal and extend it. mini-vm is under 200 lines of code including comments, it has a liberal license (MIT), and it's written in C. Out of the box it supports zero operations, but it is very easy to extend. The included example compiler is only a simple calculator, but one could easily imagine adding comparisons, branches, memory access, and supervisor calls to take it where you want to go. A VM that is easy to extend is especially useful for developing domain-specific languages, and having multiple languages target your flavor of mini-vm would be straightforward, other than having to implement multiple compilers (or port them; the QuakeC compiler is just lcc, and very easy to retarget).
Threading support would have to be an extension, and the core VM would not play nicely in a multiprocessor pthread scenario (heavyweight threading). Weirdly mini-vm can have a pc (program counter) per heavyweight thread, but would share registers among all threads using the same context. Running separate contexts would be thread-safe though.
I'm skipping the requirements on language, because the question starts off asking for a barebones VM yet at the same time demands C/Java-like syntax; I'm not sure how to resolve that conflict other than by pointing it out.
Try embedding a JavaScript interpreter in your code.
http://www.mozilla.org/js/spidermonkey/

How should I structure complex projects in C? [closed]

I have little more than beginner-level C skills and would like to know if there are any de facto "standards" to structure a somewhat complex application in C. Even GUI based ones.
I have always been using the OO paradigm in Java and PHP, and now that I want to learn C I'm afraid that I might structure my applications in the wrong way. I'm at a loss on which guidelines to follow to have modularity, decoupling and DRYness with a procedural language.
Do you have any readings to suggest? I couldn't find any application framework for C; even if I don't use frameworks, I've always found nice ideas by browsing their code.
The key is modularity. This is easier to design, implement, compile and maintain.
Identify modules in your app, like classes in an OO app.
Separate interface and implementation for each module, put in interface only what is needed by other modules. Remember that there is no namespace in C, so you have to make everything in your interfaces unique (e.g., with a prefix).
Hide global variables in implementation and use accessor functions for read/write.
Don't think in terms of inheritance, but in terms of composition. As a general rule, don't try to mimic C++ in C, this would be very difficult to read and maintain.
If you have time for learning, take a look at how an Ada app is structured, with its mandatory package (module interface) and package body (module implementation).
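As a small sketch of the C module points above (the module name and functions are hypothetical):

/* log.h - the module interface: only what other modules need, uniquely prefixed */
#ifndef LOG_H
#define LOG_H
void log_open(const char *path);
void log_write(const char *msg);
int  log_line_count(void);     /* accessor instead of an exposed global */
#endif

/* log.c - the implementation: state hidden behind the accessors */
#include <stdio.h>
#include "log.h"

static FILE *log_file;   /* "globals" invisible outside this file */
static int   log_lines;

void log_open(const char *path) { log_file = fopen(path, "a"); }
void log_write(const char *msg) { fprintf(log_file, "%s\n", msg); log_lines++; }
int  log_line_count(void)       { return log_lines; }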
This is for coding.
For maintaining (remember that you code once, but you maintain several times) I suggest documenting your code; Doxygen is a nice choice for me. I also suggest building a strong regression test suite, which allows you to refactor.
It's a common misconception that OO techniques can't be applied in C. Most can -- it's just that they are slightly more unwieldy than in languages with syntax dedicated to the job.
One of the foundations of robust system design is the encapsulation of an implementation behind an interface. FILE* and the functions that work with it (fopen(), fread() etc.) is a good example of how encapsulation can be applied in C to establish interfaces. (Of course, since C lacks access specifiers you can't enforce that no-one peeks inside a struct FILE, but only a masochist would do so.)
If necessary, polymorphic behaviour can be had in C using tables of function pointers. Yes, the syntax is ugly but the effect is the same as virtual functions:
struct IAnimal {
    int (*eat)(int food);
    int (*sleep)(int secs);
};

/* "Subclass"/"implement" IAnimal, relying on C's guarantee that a pointer
 * to a struct also points to its first member */
struct Cat {
    struct IAnimal _base;
    int (*meow)(void);
};

int cat_eat(int food)   { /* ... */ return 0; }
int cat_sleep(int secs) { /* ... */ return 0; }
int cat_meow(void)      { /* ... */ return 0; }

/* "Constructor" */
struct Cat* CreateACat(void) {
    struct Cat* x = (struct Cat*) malloc(sizeof (struct Cat));
    x->_base.eat = cat_eat;
    x->_base.sleep = cat_sleep;
    x->meow = cat_meow;
    return x;
}

struct IAnimal* pa = (struct IAnimal*) CreateACat();
pa->eat(42);                /* Calls cat_eat() */
((struct Cat*) pa)->meow(); /* "Downcast" */
All good answers.
I would only add "minimize data structure". This might even be easier in C, because if C++ is "C with classes", OOP is trying to encourage you to take every noun / verb in your head and turn it into a class / method. That can be very wasteful.
For example, suppose you have an array of temperature readings at points in time, and you want to display them as a line chart in Windows. Windows sends a WM_PAINT message, and when you receive it, you can loop through the array calling LineTo(), scaling the data as you go to convert it to pixel coordinates.
What I have seen entirely too many times is, since the chart consists of points and lines, people will build up a data structure consisting of point objects and line objects, each capable of DrawMyself, and then make that persistent, on the theory that that is somehow "more efficient", or that they might, just maybe, have to be able to mouse over parts of the chart and display the data numerically, so they build methods into the objects to deal with that, and that, of course, involves creating and deleting even more objects.
So you end up with a huge amount of code that is oh-so-readable and merely spends 90% of its time managing objects.
All of this gets done in the name of "good programming practice" and "efficiency".
At least in C the simple, efficient way will be more obvious, and the temptation to build pyramids less strong.
The GNU coding standards have evolved over a couple of decades. It'd be a good idea to read them, even if you don't follow them to the letter. Thinking about the points raised in them gives you a firmer basis on how to structure your own code.
If you know how to structure your code in Java or C++, then you can follow the same principles with C code. The only difference is that you don't have the compiler at your side and you need to do everything extra carefully manually.
Since there are no packages and classes, you need to start by carefully designing your modules. The most common approach is to create a separate source folder for each module. You need to rely on naming conventions for differentiating code between different modules. For example prefix all functions with the name of the module.
You can't have classes in C, but you can easily implement "abstract data types". You create a .c and .h file for every abstract data type. If you prefer, you can have two header files, one public and one private. The idea is that all structures, constants and functions that need to be exported go into the public header file, as sketched below.
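For example, a sketch of such an abstract data type (names invented for illustration), where the public header exposes only an opaque pointer so callers cannot touch the representation:

/* counter.h - public header: the struct layout stays hidden */
typedef struct counter counter;            /* opaque type */
counter *counter_new(void);
void     counter_inc(counter *c);
int      counter_get(const counter *c);
void     counter_free(counter *c);

/* counter.c - the only file that knows the representation */
#include <stdlib.h>
#include "counter.h"

struct counter { int value; };

counter *counter_new(void)             { return calloc(1, sizeof(counter)); }
void     counter_inc(counter *c)       { c->value++; }
int      counter_get(const counter *c) { return c->value; }
void     counter_free(counter *c)      { free(c); }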
Your tools are also very important. A useful tool for C is lint, which can help you find bad smells in your code. Another tool you can use is Doxygen, which can help you generate documentation.
Encapsulation is always key to a successful development, regardless of the development language.
A trick I've used to help encapsulate "private" functions in C is to not include their prototypes in the ".h" file (and to declare them static in the ".c" file).
I'd suggest you check out the code of any popular open-source C project, like... hmm... the Linux kernel, or Git, and see how they organize it.
The number one rule for a complex application: it should be easy to read.
To make a complex application simpler, I employ divide and conquer.
I would suggest reading a C/C++ textbook as a first step. For example, C Primer Plus is a good reference. Looking through the examples would give you an idea of how to map your Java OO onto a more procedural language like C.

Is it possible to write code to write code?

I've heard that there are some things one cannot do as a computer programmer, but I don't know what they are. One thing that occurred to me recently was: wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself. Is it possible for code to write code?
If you want to learn about the limits of computability, read about the halting problem
In computability theory, the halting problem is a decision problem which can be stated as follows: given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input.
Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.
Start by looking at quines, then at macro assemblers, then at lex & yacc and flex & bison. Then consider self-modifying code.
Here's a quine (formatted, use the output as the new input):
#include<stdio.h>
main()
{
char *a = "main(){char *a = %c%s%c; int b = '%c'; printf(a,b,a,b,b);}";
int b = '"';
printf(a,b,a,b,b);
}
Now if you're just looking for things programmers can't do, look into undecidable problems - the opposite end of the spectrum from NP-complete problems, which are merely believed to be intractable, not impossible.
Sure it is. That's how a lot of viruses work!
Get your head around this: computability theory.
Yes, that's what most Lisp macros do (for just one example).
Yes, it certainly is, though maybe not in the context you are referring to; check out this post on T4.
If you look at functional programming, it has many opportunities for writing code that generates further code; the way that a language like Lisp doesn't differentiate between code and data is a significant part of its power.
Rails generates the various default model and controller classes from the database schema when it's creating a new application. It's quite standard to do this kind of thing with dynamic languages - I have a few bits of PHP around that generate PHP files, just because it was the simplest solution to the problem I was dealing with at the time.
So it is possible. As for the question you are asking, though - it is perhaps a little vague - what environment and language are you using? What do you expect the code to do, and why does it need to be added to? A concrete example may bring more directly relevant responses.
Yes, it is possible to create code generators.
Most of the time they take user input and produce valid code. But there are other possibilities.
Self-modifying programs are also possible, but they were more common in the DOS era.
Of course you can! In fact, if you use a dynamic language, the class can change itself (or another class) while the program is still running. It can even create new classes that didn't exist before. This is called metaprogramming, and it lets your code become very flexible.
You are confusing/conflating two meanings of the word "write". One meaning is the physical writing of bytes to a medium, and the other is designing software. Of course you can have the program do the former, if it was designed to do so.
The only way for a program to do something that the programmer did not explicitly intend it to do, is to behave like a living creature: mutate (incorporate in itself bits of environment), and replicate different mutants at different rates (to avoid complete extinction, if a mutation is terminal).
Sure it is. I wrote an effect for Paint.NET* that gives you an editor and allows you to write a graphical effect "on the fly". When you pause typing, it compiles your code to a DLL, loads it and executes it. Now, in the editor, you only need to write the actual render function; everything else necessary to create a DLL is written by the editor and sent to the C# compiler.
You can download it free here: http://www.boltbait.com/pdn/codelab/
In fact, there is even an option to see all the code that was written for you before it is sent to the compiler. The help file (linked above) talks all about it.
The source code is available to download from that page as well.
*Paint.NET is a free image editor that you can download here: http://getpaint.net
In relation to artificial intelligence, take a look at Evolutionary algorithms.
make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself
You can also generate code, build it into a library instead of an executable, and then dynamically load the library without even exiting the program that is currently running.
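A sketch of that round trip in C, assuming a POSIX system with a cc compiler on the PATH (file and symbol names are made up; link the host program with -ldl on glibc):

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a function, compile it into a shared library, load it, call it -
 * all without the running program ever exiting. */
int main(void)
{
    FILE *f = fopen("gen.c", "w");
    if (f == NULL)
        return 1;
    fprintf(f, "int answer(void) { return 42; }\n");
    fclose(f);

    if (system("cc -shared -fPIC -o gen.so gen.c") != 0)
        return 1;

    void *lib = dlopen("./gen.so", RTLD_NOW);
    if (lib == NULL)
        return 1;

    int (*answer)(void) = (int (*)(void)) dlsym(lib, "answer");
    printf("%d\n", answer());   /* prints 42 */

    dlclose(lib);
    return 0;
}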
Dynamic languages usually don't work quite as you suggest, in that they don't have a completely separate compilation step. It isn't necessary for a program to modify its own source code, recompile, and start from scratch. Typically the new functionality is compiled and linked in on the fly.
Common Lisp is a very good language to practice this in, but there are others where you can create code and run it then and there. Typically, this will be through a function called "eval" or something similar. Perl has an "eval" function, and it's generally common for scripting languages to have the ability.
There are a lot of programs that write other programs, such as yacc or bison, but they don't have the same dynamic quality you seem to be looking for.
Take a look at Langton's loops. This is the simplest example of a self-reproducing "program".
There is a whole class of such things called "code generators". (Although a compiler also fits the description as you set it.) Those two describe the two flavors of these beasts.
Most code generators take some form of user input (most take a database schema) and produce source code which is then compiled.
More advanced ones can output executable code. With .NET, there's a whole namespace (System.CodeDom) dedicated to the creation of executable code. With these objects, you can take C# (or another language) code, compile it, and link it into your currently running program.
I do this in PHP.
To persist settings for a class, I keep a local variable called $data. $data is just a dictionary/hashtable/assoc-array (depending on where you come from).
When you load the class, it includes a PHP file which basically defines $data. When I save the class, it writes the PHP out for each value in $data. It's a slow write process (and there are currently some concurrency issues) but it's faster than light to read. So much faster (and lighter) than using a database.
Something like this wouldn't work for all languages. It works for me in PHP because PHP is very much on-the-fly.
It has always been possible to write code generators. With XML technology, code generators can be an essential tool. Suppose you work for a company that has to deal with XML files from other companies. It is relatively straightforward to write a program that uses an XML parser to parse a new XML file, and writes another program that has all the callback functions set up to read XML files of that format. You would still have to edit the new program to make it specific to your needs, but the development time when a new XML file (new structure, new names) arrives is cut down a lot by using this type of code generator. In my opinion, this is part of the strength of XML technology.
Lisp lisp lisp lisp :p
Joking aside, if you want code that generates code to run, and you have time to lose learning it and breaking your mind with recursive stuff generating more code, try to learn Lisp :)
(eval '(or true false))
wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself
There are almost no cases where that would solve a problem that cannot be solved "better" using non-self-modifying code.
That said, there are some very common (useful) cases of code writing other code. The most obvious is any server-side web application, which generates HTML/JavaScript (well, HTML is markup, but it's identical in theory). Also, any script that alters a terminal's environment usually outputs a shell script that is eval'd by the parent shell. wxGlade generates code that creates bare-bones wx-based GUIs.
See our DMS Software Reengineering Toolkit. This is general purpose machinery to read and modify programs, or generate programs by assembling fragments.
This is one of the fundamental questions of Artificial Intelligence. Personally I hope it is not possible - otherwise soon I'll be out of a job!!! :)
It is called metaprogramming and is both a nice way of writing useful programs and an interesting research topic. Jacques Pitrat's book Artificial Beings: The Conscience of a Conscious Machine should interest you a lot. It is mostly related to meta-knowledge-based computer programs.
Another related term is multi-staged programming (because there are several stages of programs, each generating the next one).
