Reverse engineering from c code to sequence chart - c

I have some C code, and I want to extract the patterns of code that contain communication instructions and so on, and build a sequence chart from them.
Is there any way I can do that?
Thanks

I strongly recommend using Doxygen with the following options:
EXTRACT_ALL = YES
CALL_GRAPH = YES
CALLER_GRAPH = YES
GRAPHICAL_HIERARCHY = YES
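You will get very nice call and caller graphs of the functions, which help a great deal with understanding the code. (The graph options only take effect when HAVE_DOT = YES is also set and Graphviz is installed.) A call graph is more common and more useful for C code than a sequence chart.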
call graph http://pedro.larroy.com/files/example.png

You can use pic2plot, which is part of GNU plotutils. You need to log/trace what talks to what to a file, and then you can render it with pic2plot. I've done this for a Python program, but there is no reason why you can't do it with a C program.
(source: umlgraph.org)
see http://www.umlgraph.org/

Related

Solving a Variable Equation defined by the User

Answers in C, Python, C++ or Javascript would be very much appreciated.
I've read a few books, done all the examples. Now I'd like to write a simple program.
But, I already ran into the following roadblock:
My intention is to take an equation from the user and save it in a variable,
For example:
-3*X+4 or pow(2,(sin(cos(x))/5)) > [In valid C Math syntax]
And then calculate the given expression for a certain X-Value.
Something like this:
printf("%g", UserFunction(3.2)) // Input 3.2 for X in User's Function and Print Result
Any ideas? For the life of me, I can't figure this out. Adding to my frustration, the solution is likely a very simple one. Thank you in advance.
There isn't a simple way to do this in C, but I think muParser may be useful to you. It is written in C++ but has a C binding. ExprTk is also an option, but it looks like it is C++ only; on the plus side, it looks much easier to get interesting results with.
Another option may be the Expression Evaluation which is part of Libav. It is in C and the eval.h header has some good descriptions of the interface.
In compiled languages like C, C++, or Java there is no easy way to do this--you basically have to rewrite a whole compiler (or use an external library with an interpreter). This is only trivial in "scripting" languages like Python and Javascript, which have a function (often called "eval()") that evaluates expressions at runtime. This function is often dangerous, because it can also do things like call functions with side effects.
Ffmpeg/libav has a nice simple function evaluator you could use.

Call Esper from my C program

Can some kind soul please show me a cookbook way that I can call Esper from my C program? Ideally (I think that) I'd like to call an Esper function/method with a line of EPL and get a value returned.
EDIT: I ask this question because I have 12,000 lines of working C code that I want to keep. Esper offers some really nice event evaluation that's crucial to my C code. JNI seems to be oriented toward calling C code from Java, maybe because C is faster for some things; I want to go the other way: to call Java code from C to take advantage of the power in the Java package, which is called Esper.
Thanks!
Try the Socket Adapter in EsperIO: doc link. Seems to be aimed more at getting events into Esper though; is that what you want? Otherwise, take the concept: sockets are one proven 'cookbook' way of implementing IPC and they save all that complicated messing with JNI.

Abstract-syntax tree for subset of C

For teaching purposes we are building a JavaScript step-by-step interpreter for (a subset of) C code.
Basically we have: int, float..., arrays, functions, for, while... no pointers.
The JavaScript interpreter is done and allows us to explain how a boolean expression is evaluated, show the variable stack...
For now, we are manually converting our C examples to some JavaScript that will run and build a stack of actions (assignment, function call...) that can later be used to do the step-by-step stuff. Since we are limiting ourselves to a subset of C, it's quite easy to do.
Now we would like to compile the C code to our JavaScript representation. All we need is an abstract syntax tree of the C code, and the JavaScript generation is straightforward.
Do you know a good C parser that could generate such a tree? It doesn't need to be in JavaScript (though that would be perfect); any language is all right, as this can be done offline.
I've looked at Emscripten ( https://github.com/kripken/emscripten ) but it's more a C=>javascript compiler and that's not what we want.
I've recently used Eli Bendersky's pycparser to mess with ASTs of C code. I think it'd work well for your purposes.
I think that ANTLR has a full C parser.
To do your translation task, I suspect you will need full symbol table support; you have to know what the symbols mean. Here most "parsers" will fail you; they don't build a full symbol table. I think ANTLR does not, but I could be wrong.
Our DMS Software Reengineering Toolkit with its C Front End provides a full C parser and builds complete symbol tables. (You may not need it for your application, but it includes a full C preprocessor, too.) It also provides control flow, data flow, points-to analysis and call graph construction, all of which can be useful in translating C to whatever your target virtual machine is.

How to write own Configformat

I've developed my own file format for configuration files (plain text and line based -> EOL = one configuration) for an application. This format is nothing very special, and the only reason I'm doing this is to learn something! The reader and writer functions will be implemented in C (with GLib, because it should be a UTF-8 encoded file).
So now I'm thinking about how to implement this format in C code, and which steps I have to take to get error messages that are as good as possible. I've heard something about lexers, parsers, ... but have never gone too deep into them. I've only a very abstract idea of them. So which steps do I need to take to get a clean reader written in C for the format, one that is also maintainable for future changes? What are the topics to learn/think about?
And yes, I know: C is a pain, there are a lot of different "sexy" formats for this purpose, and so on. I want to learn something!
Cheers,
Gregor
Additional information
The reader/writer/parser (or whatever it's called) should depend as little as possible on third-party programs/components. The application around this config part already uses GLib, so that's why GLib is also used for UTF-8.
One cool way of creating a config format is to embed a scripting language.
This gives you the parser for free and gives you the possibility to generate data on the fly or define variables that are being reused:
Consider these examples of xml vs an ugly pseudo scripting language:
<InputPoints>
<Point>
<x>1.0</x>
<y>1.0</y>
</Point>
<Point>
<x>1.0</x>
<y>2.0</y>
</Point>
<Point>
<x>1.0</x>
<y>3.0</y>
</Point>
<Point>
<x>1.0</x>
<y>4.0</y>
</Point>
</InputPoints>
vs:
for(i = 1; i <= 4; ++i) {
InputPoint(1, i);
}
or perhaps
<Username>allanballan</Username>
<Accountname>allanballan</Accountname>
<HomeDirectory>/home/allanballan</HomeDirectory>
vs
user = "allanballan";
Username = user;
Accountname = user;
HomeDirectory = "/home/"+user;
The first example compresses a list of points to a few statements; the second example shows how to remove lots of redundant data using a temporary variable.
A popular language for this kind of situation is Lua. Exactly how to map a scripting language to configuration is up to the integrator, but it's really powerful and it comes with parsing and type checking for free.
You might want to look at the libconfig source code. It has a lightweight parser you could use as a starting point and that will probably help you in figuring out what a parser for your own format would have to look like.
Though, if you really want to learn about parsers and lexers, it would probably be better to implement a simple compiler. There's an MIT course you could follow.
Depending on how deep you'd like to dive into learning the matter, you should think about not writing your parser manually. You can do so of course, but it will be a great deal more complicated and adding new features to your language will burden you with the problems of always adapting lexer and parser code.
The good thing is, there are lots of tools out there that enable you to generate this stuff from a high-level description of your input and its structure. Standard *nix tools to do so are Lex and Yacc (or their descendants Flex and Bison), but I'd like to point you to ANTLR (http://www.antlr.org) instead. One of its nice features is that it provides backends for many different languages (C/C++ as well as Java, Python, Ruby, C#, ...), so learning how to work with it will also help you if you want to switch languages at a later point.

Is it possible to write code to write code?

I've heard that there are some things one cannot do as a computer programmer, but I don't know what they are. One thing that occurred to me recently was: wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself. Is it possible for code to write code?
If you want to learn about the limits of computability, read about the halting problem:
In computability theory, the halting problem is a decision problem which can be stated as follows: given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input. Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.
Start by looking at quines, then at Macro-Assemblers and then lex & yacc, and flex & bison. Then consider self-modifying code.
Here's a quine (formatted, use the output as the new input):
#include<stdio.h>
main()
{
char *a = "main(){char *a = %c%s%c; int b = '%c'; printf(a,b,a,b,b);}";
int b = '"';
printf(a,b,a,b,b);
}
Now, if you're just looking for things programmers can't do, look for the opposite of NP-complete.
Sure it is. That's how a lot of viruses work!
Get your head around this: computability theory.
Yes, that's what most Lisp macros do (for just one example).
Yes, it certainly is, though maybe not in the context you are referring to. Check out this post on T4.
Functional programming offers many opportunities to write code that generates further code. The way a language like Lisp doesn't differentiate between code and data is a significant part of its power.
Rails generates the various default model and controller classes from the database schema when it's creating a new application. It's quite standard to do this kind of thing with dynamic languages- I have a few bits of PHP around that generate php files, just because it was the simplest solution to the problem I was dealing with at the time.
So it is possible. As for the question you are asking, though- that is perhaps a little vague- what environment and language are you using? What do you expect the code to do and why does it need to be added to? A concrete example may bring more directly relevant responses.
Yes it is possible to create code generators.
Most of the time they take user input and produce valid code. But there are other possibilities.
Self-modifying programs are also possible, but they were more common in the DOS era.
Of course you can! In fact, if you use a dynamic language, the class can change itself (or another class) while the program is still running. It can even create new classes that didn't exist before. This is called metaprogramming, and it lets your code become very flexible.
You are confusing/conflating two meanings of the word "write". One meaning is the physical writing of bytes to a medium, and the other is designing software. Of course you can have the program do the former, if it was designed to do so.
The only way for a program to do something that the programmer did not explicitly intend it to do, is to behave like a living creature: mutate (incorporate in itself bits of environment), and replicate different mutants at different rates (to avoid complete extinction, if a mutation is terminal).
Sure it is. I wrote an effect for Paint.NET* that gives you an editor and allows you to write a graphical effect "on the fly". When you pause typing it compiles it to a dll, loads it and executes it. Now, in the editor, you only need to write the actual render function, everything else necessary to create a dll is written by the editor and sent to the C# compiler.
You can download it free here: http://www.boltbait.com/pdn/codelab/
In fact, there is even an option to see all the code that was written for you before it is sent to the compiler. The help file (linked above) talks all about it.
The source code is available to download from that page as well.
*Paint.NET is a free image editor that you can download here: http://getpaint.net
In relation to artificial intelligence, take a look at Evolutionary algorithms.
make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself
You can also generate code, build it into a library instead of an executable, and then dynamically load the library without even exiting the program that is currently running.
Dynamic languages usually don't work quite as you suggest, in that they don't have a completely separate compilation step. It isn't necessary for a program to modify its own source code, recompile, and start from scratch. Typically the new functionality is compiled and linked in on the fly.
Common Lisp is a very good language to practice this in, but there are others where you can create code and run it then and there. Typically, this will be through a function called "eval" or something similar. Perl has an "eval" function, and it's generally common for scripting languages to have the ability.
There are a lot of programs that write other programs, such as yacc or bison, but they don't have the same dynamic quality you seem to be looking for.
Take a look at Langton's loops. This is the simplest example of a self-reproducing "program".
There is a whole class of such things called "Code Generators". (Although, a compiler also fits the description as you set it). And those describe the two areas of these beasts.
Most code generators take some form of user input (most take a database schema) and produce source code, which is then compiled.
More advanced ones can output executable code. With .NET, there's a whole namespace (System.CodeDom) dedicated to the creation of executable code. With these objects, you can take C# (or another language) code, compile it, and link it into your currently running program.
I do this in PHP.
To persist settings for a class, I keep a local variable called $data. $data is just a dictionary/hashtable/assoc-array (depending on where you come from).
When you load the class, it includes a php file which basically defines data. When I save the class, it writes the PHP out for each value of data. It's a slow write process (and there are currently some concurrency issues) but it's faster than light to read. So much faster (and lighter) than using a database.
Something like this wouldn't work for all languages. It works for me in PHP because PHP is very much on-the-fly.
It has always been possible to write code generators. With XML technology, code generators can be an essential tool. Suppose you work for a company that has to deal with XML files from other companies. It is relatively straightforward to write a program that uses the XML parser to parse the new XML file and write another program that has all the callback functions set up to read XML files of that format. You would still have to edit the new program to make it specific to your needs, but the development time when a new XML file (new structure, new names) arrives is cut down a lot by using this type of code generator. In my opinion, this is part of the strength of XML technology.
Lisp lisp lisp lisp :p
Joking aside, if you want code that generates code to run, and you have time to lose learning it and breaking your mind with recursive stuff generating more code, try to learn Lisp :)
(eval '(or true false))
wouldn't it be nice to have a class that could make a copy of the source of the program it runs, modify that program and add a method to the class that it is, and then run the copy of the program and terminate itself
There are almost no cases where that would solve a problem that cannot be solved "better" using non-self-modifying code.
That said, there are some very common (useful) cases of code writing other code. The most obvious is any server-side web application, which generates HTML/JavaScript (well, HTML is markup, but it's identical in theory). Also, any script that alters a terminal's environment usually outputs a shell script that is eval'd by the parent shell. wxGlade generates code that creates bare-bones wx-based GUIs.
See our DMS Software Reengineering Toolkit. This is general purpose machinery to read and modify programs, or generate programs by assembling fragments.
This is one of the fundamental questions of Artificial Intelligence. Personally I hope it is not possible - otherwise soon I'll be out of a job!!! :)
It is called meta-programming and is both a nice way of writing useful programs, and an interesting research topic. Jacques Pitrat's Artificial Beings: the conscience of a conscious machine book should interest you a lot. It is mostly related to meta-knowledge based computer programs.
Another related term is multi-staged programming (because there are several stages of programs, each generating the next one).

Resources