I've designed a parser in C that generates an AST, but when I began to implement simplifications, things got really messy. I've successfully implemented rules for summation, like the ones below:
x + 0 -> x
x + x -> 2 * x
etc.
But it took a huge amount of effort and code to do it. What I did was search the entire tree trying to find a pattern I could use (lots of recursion); if there was a cascade of PLUS nodes, I added them to a list, then worked on that list (summing numbers, combining variables, etc.), then created another tree from that list and merged it into the existing one. I used this paper to implement it. In short, given the expression 2*x+1+1+x+0, I got 3*x+2. And it was just summation that got me into so much trouble; I can't even imagine the advanced stuff. So I realized I was doing something wrong.
I've read this thread, but I'm really confused about term rewriting systems (what they really are, and how to implement one in C).
Is there a more general and effective way to do simplification on an AST? Or, how would one write a term rewriting system in C?
Term rewriting is (in simple words) what your two examples show: how to convert x + 0 to x in an AST. It is about pattern matching on ASTs and, once there is a match, conversion to an equivalent expression. Such a pattern/conversion pair is called a term rewriting rule.
Note that a single term rewriting rule is not, by itself, a general solution to algebraic simplification. The general solution involves having many rewriting rules (you showed two of them) and applying them to a given AST repeatedly until none of them succeeds.
The general solution therefore also involves coordinating the application of the rewriting rules, for example to avoid re-applying a rule that has previously failed.
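Since you are working in C, here is a minimal sketch of what a single rule can look like on an AST (the node layout, names, and memory handling are illustrative, not your parser's): a rule is a function that tests a pattern and, on a match, returns the replacement subtree.

    /* Illustrative AST node; adapt to your own parser's representation. */
    typedef enum { NUM, VAR, ADD, MUL } Kind;

    typedef struct Node {
        Kind kind;
        double value;               /* NUM */
        const char *name;           /* VAR */
        struct Node *left, *right;  /* ADD, MUL */
    } Node;

    /* Rule "x + 0 -> x": returns the replacement on a match, NULL otherwise.
       (A real implementation would also free or recycle the detached nodes.) */
    Node *rule_add_zero(Node *n) {
        if (n->kind != ADD) return NULL;
        if (n->right->kind == NUM && n->right->value == 0.0) return n->left;
        if (n->left->kind  == NUM && n->left->value  == 0.0) return n->right;
        return NULL;
    }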
There is no unique way to do it; there are several systems. For proprietary systems the details are not known, because they are kept secret, but there are open source systems too; for example, Mathomatic is written in C.
I recommend you check out the open-source system Fōrmulæ. In it, the process of coordinating rewriting rules (which is called "the reduction engine") is relatively simple. It is written in Java. The advantage of this system is that rewriting rules are not hardwired/hardcoded into the system or the reduction engine (they are hot-pluggable). Coding a rewriting rule involves the pattern matching and the conversion, but not when or how it will be called (it follows the Hollywood principle).
In the specific case of Fōrmulæ:
The reduction engine is based (in general terms) on the post-order tree traversal algorithm, so when a node is "visited", its sub-nodes have already been visited and (possibly) transformed; it is, however, possible to alter that flow (e.g. to avoid the unwanted evaluation of a variable in an assignment such as x <- 5). Note that it is not just a tree traversal: the AST is actually being changed in the process.
In order to efficiently manage the (possibly hundreds or thousands of) rewriting rules, every rule is tagged with the type of expression it applies to, and when a single node is "visited", only the associated rules are checked for a match. For example, your two rules can only be applied to "addition" nodes of an AST.
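Continuing the earlier sketch, the reduction engine itself can stay small: rules are grouped by the node kind they apply to, children are reduced first (post-order), and a node is rewritten until none of its rules fires (again, purely illustrative):

    typedef Node *(*Rule)(Node *);   /* a rule: returns the rewritten node or NULL */

    /* One NULL-terminated rule table per node kind (contents are illustrative). */
    static Rule add_rules[] = { rule_add_zero, /* rule_add_same, ... */ NULL };
    static Rule mul_rules[] = { NULL };
    static Rule *rules_for[] = { NULL /* NUM */, NULL /* VAR */, add_rules, mul_rules };

    Node *reduce(Node *n) {
        if (n->left)  n->left  = reduce(n->left);    /* post-order: children first */
        if (n->right) n->right = reduce(n->right);
        Rule *rules = rules_for[n->kind];
        for (int i = 0; rules && rules[i]; ++i) {
            Node *r = rules[i](n);
            if (r) return reduce(r);  /* something changed: re-reduce the result */
        }
        return n;                     /* no rule fired: this node is in normal form */
    }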
Rewriting rules are not limited to algebraic simplification; they can be used in many other fields, such as programming (Fōrmulæ is also its own programming language; see examples of Fōrmulæ programs) or automatic or assisted theorem proving.
I wrote a program that has all the rules of Sudoku written into it (one occurrence of a digit per column, row, and square). The code takes an input (an unfilled Sudoku grid) and returns a solution by translating the logical clauses into DIMACS format and using a SAT solver.
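For concreteness, the encoding of the "each cell holds exactly one digit" constraint can look roughly like this (a sketch rather than my exact code; the variable numbering is just one arbitrary choice):

    #include <cstdio>

    // Map (row, col, digit) to a DIMACS variable number; rows/cols are 0..8, digits 1..9.
    int var(int r, int c, int d) { return 81 * r + 9 * c + d; }

    // Emit the clauses saying that cell (r, c) holds exactly one digit.
    // (The "p cnf <variables> <clauses>" header goes at the top of the file.)
    void cell_clauses(int r, int c) {
        // At least one digit: a single clause of 9 literals, terminated by 0.
        for (int d = 1; d <= 9; ++d) std::printf("%d ", var(r, c, d));
        std::printf("0\n");
        // At most one digit: for every pair of digits, forbid both being true.
        for (int d1 = 1; d1 <= 9; ++d1)
            for (int d2 = d1 + 1; d2 <= 9; ++d2)
                std::printf("-%d -%d 0\n", var(r, c, d1), var(r, c, d2));
    }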
Given that the algorithm respects rules, takes in data, and uses that data to form conclusions based on implications (e.g. if there is a 1 in the first cell, there cannot be a 1 in the second cell), is this code considered an "expert system"? Thank you.
Whether a program is an expert system is subjective, but I'd say unless your program is encoding non-trivial knowledge acquired from a domain expert, it's not an expert system. If you can't teach another person to practically do what your program is doing, it's not an expert system.
By that definition, what you've done is probably not an expert system since it would be too time consuming for a person to use the same technique. I've written a sudoku solver using a production system (https://sourceforge.net/p/clipsrules/code/HEAD/tree/branches/63x/examples/sudoku/) that I would consider to be an expert system. The encoded knowledge was acquired from websites with advanced techniques for humans to use for solving sudoku puzzles. All of the encoded techniques can be practically used by humans for solving puzzles (although some of the more complex techniques push that boundary).
Although my sudoku solver can solve much more complicated puzzles than I could, calling it an expert system is not an indication of its sophistication. There are better approaches for solving extremely complex sudoku puzzles than emulating approaches humans might take.
In the '80s, I wrote a clone of the EMYCIN expert system engine. One important characteristic was the ability for the user to ask WHY the expert system reached some conclusion. The system could reply (in almost natural language) that it had applied such and such rules to get to the conclusion.
With this kind of system, the knowledge is modeled and implemented (by a cognitician, i.e. a knowledge engineer) as an explicit set of rules. These rules are objects known to the engine. The engine can trigger the rules (forward or backward, or maybe using metarules...) and can log the triggered rules and thus explain its conclusions.
(This is my sense of what an expert system is.)
I am trying to solve the problem of finding the roots of a function using the Newton-Raphson (NR) method in the C language. The functions whose roots I would like to find are mostly polynomials but may also contain trigonometric and logarithmic terms.
The NR method requires the derivative of the function. There are three ways to implement differentiation:
Symbolic
Numerical
Automatic (with sub-types being forward mode and reverse mode; for this particular question, I would like to focus on forward mode)
I have thousands of these functions, all requiring their roots to be found in the quickest time possible.
From the little that I do know, automatic differentiation is in general quicker than symbolic differentiation because it handles the problem of "expression swell" a lot more efficiently.
My question, therefore, is: all other things being equal, which method of differentiation is more computationally efficient, automatic differentiation (more specifically, forward mode) or numeric differentiation?
If your functions are truly all polynomials, then symbolic derivatives are dirt simple. Letting the coefficients of the polynomial be stored in an array with entries p[k] = a_k, where index k corresponds to the coefficient of x^k, the derivative is represented by the array with entries dp[k] = (k+1) p[k+1]. For multivariable polynomials, this extends straightforwardly to multidimensional arrays. If your polynomials are not in standard form, e.g. if they include terms like (x-a)^2 or ((x-a)^2-b)^3 or whatever, a little bit of work is needed to transform them into standard form, but this is something you should probably be doing anyway.
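A short sketch of that coefficient-shifting rule and the resulting Newton step (names are illustrative):

    #include <vector>

    // Coefficients p[k] of sum_k p[k] * x^k  ->  coefficients of the derivative p'(x).
    std::vector<double> derivative(const std::vector<double>& p) {
        std::vector<double> dp(p.size() > 1 ? p.size() - 1 : 1, 0.0);
        for (std::size_t k = 0; k + 1 < p.size(); ++k)
            dp[k] = (k + 1) * p[k + 1];
        return dp;
    }

    // Evaluate a polynomial with Horner's rule.
    double eval(const std::vector<double>& p, double x) {
        double y = 0.0;
        for (std::size_t k = p.size(); k-- > 0; ) y = y * x + p[k];
        return y;
    }

    // One Newton-Raphson step: x - f(x) / f'(x).
    double newton_step(const std::vector<double>& p, const std::vector<double>& dp, double x) {
        return x - eval(p, x) / eval(dp, x);
    }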
If the derivative is not available, you should consider using the secant or regula falsi methods. They have very decent convergence speed (order φ instead of quadratic). An additional benefit of regula falsi is that the iterates remain confined to the initial interval, which allows reliable root separation (something Newton's method does not guarantee).
Also note that in the case of numerical evaluation of the derivative, you will need several evaluations of the function per iteration, at best two of them. The effective convergence order per function evaluation then drops to √2, which is outperformed by the derivative-free methods.
Also note that the symbolic expression of the derivative is often more costly to evaluate than the function itself. So one iteration of Newton costs at least the equivalent of two function evaluations, eroding the benefit of the higher convergence rate.
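For comparison with numeric differencing, forward-mode automatic differentiation computes f(x) and f'(x) together in a single pass by carrying a value/derivative pair (a dual number) through the expression. A minimal sketch (the operator set and names are illustrative):

    #include <cmath>

    // A dual number: a value and its derivative, propagated together.
    struct Dual { double val, der; };

    Dual operator+(Dual a, Dual b) { return { a.val + b.val, a.der + b.der }; }
    Dual operator*(Dual a, Dual b) { return { a.val * b.val, a.der * b.val + a.val * b.der }; }
    Dual sin(Dual a) { return { std::sin(a.val), std::cos(a.val) * a.der }; }

    // Example: f(x) = x*x + sin(x).
    Dual f(Dual x) { return x * x + sin(x); }

    // Usage: Dual x = { 2.0, 1.0 };  // seed dx/dx = 1
    //        Dual y = f(x);          // y.val = f(2), y.der = f'(2), in one pass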
I am seeking advice on how to incorporate C or C++ code into my R code to speed up an MCMC program using a Metropolis-Hastings algorithm. I am using an MCMC approach to model the likelihood, given various covariates, that an individual will be assigned a particular rank in a social status hierarchy by a third party (the judge): each judge (approx. 80, across 4 villages) was asked to rank a group of individuals (approx. 80, across 4 villages) based on their assessment of each individual's social status. Therefore, for each judge I have a vector of ranks corresponding to their judgement of each individual's position in the hierarchy.
To model this I assume that, when assigning ranks, judges are basing their decisions on the relative value of some latent measure of an individual's utility, u. Given this, it can then be assumed that a vector of ranks, r, produced by a given judge is a function of an unobserved vector, u, describing the utility of the individuals being ranked, where the individual with the kth highest value of u will be assigned the kth rank. I model u, using the covariates of interest, as a multivariate normally distributed variable and then determine the likelihood of the observed ranks, given the distribution of u generated by the model.
In addition to estimating the effect of, at most, 5 covariates, I also estimate hyperparameters describing the variance between judges and between items. Therefore, for every iteration of the chain I evaluate a multivariate normal density approximately 8-10 times. As a result, 5000 iterations can take up to 14 hours. Obviously, I need to run it for many more than 5000 iterations, so I need a means of dramatically speeding up the process. Given this, my questions are as follows:
(i) Am I right to assume that the best speed gains will be had by running some, if not all, of my chain in C or C++?
(ii) Assuming the answer to question (i) is yes, how do I go about this? For example, is there a way for me to retain all my R functions but simply do the looping in C or C++; that is, can I call my R functions from C and do the looping there?
(iii) I guess what I really want to know is how best to approach incorporating C or C++ code into my program.
First make sure your slow R version is correct. Debugging R code might be easier than debugging C code. Done that? Great. You now have correct code you can compare against.
Next, find out where the time is going. Use Rprof to profile your code and see what is taking the time. I did this for some code I inherited once, and discovered it was spending 90% of the time in the t() function. This was because the programmer had a matrix, A, and was doing t(A) in a zillion places. I did tA = t(A) once at the start and replaced every t(A) with tA. Massive speedup for no effort. Profile your code first.
Now, you've found your bottleneck. Is it code you can speed up in R? Is it a loop that you can vectorise? Do that. Check your results against your gold-standard correct code. Always. Yes, I know it's hard to compare algorithms that rely on random numbers, so set the seeds the same and try again.
Still not fast enough? Okay, now maybe you need to rewrite parts (the lowest-level parts, generally, and those that were taking the most time in the profiling) in C, C++, or Fortran, or, if you are really going for it, in GPU code.
Again, really check the code is giving the same answers as the correct R code. Really check it. If at this stage you find any bugs anywhere in the general method, fix them in what you thought was the correct R code and in your latest version, and rerun all your tests. Build lots of automatic tests. Run them often.
Read up about code refactoring. It's called refactoring because if you tell your boss you are rewriting your code, he or she will say 'why didn't you write it correctly first time?'. If you say you are refactoring your code, they'll say "hmmm... good". THIS ACTUALLY HAPPENS.
As others have said, Rcpp is made of win.
A complete example using R, C++ and Rcpp is provided by this blog post, which was inspired by a post on Darren Wilkinson's blog (and he has more follow-ups). The example is also included with recent releases of Rcpp in the RcppGibbs directory and should get you going.
I have a blog post which discusses exactly this topic, and I suggest you take a look at it:
http://darrenjw.wordpress.com/2011/07/31/faster-gibbs-sampling-mcmc-from-within-r/
(this post is more relevant than the post of mine that Dirk refers to).
I think the best method currently to integrate C or C++ is the Rcpp package by Dirk Eddelbuettel. You can find a lot of information on his website. There is also a talk at Google, available through YouTube, that might be interesting.
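A minimal Rcpp sketch, just to show the shape of the workflow (the target here is a standard normal purely for illustration; your multivariate-normal likelihood would take its place):

    #include <Rcpp.h>
    #include <cmath>

    // [[Rcpp::export]]
    Rcpp::NumericVector rw_metropolis(int n_iter, double step_sd) {
        // Random-walk Metropolis-Hastings for a standard normal target (illustration only).
        Rcpp::NumericVector chain(n_iter);
        double x = 0.0;
        for (int i = 0; i < n_iter; ++i) {
            double proposal  = x + R::rnorm(0.0, step_sd);
            double log_alpha = R::dnorm(proposal, 0.0, 1.0, 1) - R::dnorm(x, 0.0, 1.0, 1);
            if (std::log(R::runif(0.0, 1.0)) < log_alpha) x = proposal;  // accept
            chain[i] = x;
        }
        return chain;
    }

From R, Rcpp::sourceCpp() compiles this file and exposes rw_metropolis() as an ordinary R function, so the rest of the model setup can stay in R.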
Check out this project:
https://github.com/armstrtw/rcppbugs
Also, here is a link to the R/Fin 2012 talk:
https://github.com/downloads/armstrtw/rcppbugs/rcppbugs.pdf
I would suggest benchmarking each step of the MCMC sampler and identifying the bottleneck. If you put each full conditional or M-H step into a function, you can use the R compiler package, which might give you a 5%-10% speed gain. The next step is to use Rcpp.
I think it would be really nice to have a general-purpose Rcpp function which generates a single draw using the M-H algorithm, given a likelihood function.
However, with Rcpp some things become difficult if you only know the R language: non-standard random distributions (especially truncated ones) and using arrays. You have to think more like a C programmer there.
The multivariate normal is actually a big issue in R. dmvnorm is very inefficient and slow; dmnorm is faster, but in some models it gave me NaNs sooner than dmvnorm did.
Neither takes an array of covariance matrices, so it is impossible to vectorise the code in many instances. As long as you have a common covariance matrix and common means, however, you can vectorise, which is the R-ish strategy for speeding things up (and which is the opposite of what you would do in C).
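If the multivariate normal density itself is the bottleneck, it is also one of the easier pieces to push into C++. A sketch with RcppArmadillo, evaluating the log-density of many rows against a common mean and covariance via a Cholesky factor (the function name is mine; treat it as illustrative):

    // [[Rcpp::depends(RcppArmadillo)]]
    #include <RcppArmadillo.h>

    // [[Rcpp::export]]
    arma::vec dmvnorm_log(const arma::mat& x, const arma::rowvec& mean, const arma::mat& sigma) {
        const double log2pi = std::log(2.0 * arma::datum::pi);
        arma::mat rooti = arma::inv(arma::trimatu(arma::chol(sigma))).t();  // R^{-T}, sigma = R'R
        double rootisum = arma::sum(arma::log(rooti.diag()));               // = -0.5 * log|sigma|
        double constant = -0.5 * x.n_cols * log2pi;
        arma::vec out(x.n_rows);
        for (arma::uword i = 0; i < x.n_rows; ++i) {
            arma::vec z = rooti * arma::trans(x.row(i) - mean);              // whitened residual
            out(i) = constant + rootisum - 0.5 * arma::dot(z, z);
        }
        return out;
    }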
I created a special-purpose "programming language" that deliberately (by design) cannot evaluate the same piece of code twice (i.e. it cannot loop). It is essentially made to describe a flowchart-like process where each element in the flowchart is a conditional that performs a different test on the same set of data (without being able to modify it). Branches can split and merge, but never in a circular fashion, i.e. the flowchart cannot loop back onto itself. When arriving at the end of a branch, the current state is returned and the program exits.
When written down, a typical program superficially resembles a program in a purely functional language, except that no form of recursion is allowed and functions can never return anything; the only way to exit a function is to call another function, or to invoke a general exit statement that returns the current state. A similar effect could also be achieved by taking a structured programming language and removing all loop statements, or by taking an "unstructured" programming language and forbidding any goto or jmp statement that goes backwards in the code.
Now my question is: is there a concise and accurate way to describe such a language? I don't have any formal CS background, and it is difficult for me to understand articles about automata theory and formal language theory, so I'm a bit at a loss. I know my language is not Turing complete, and through great pain I managed to convince myself that my language can probably be classified as describing a "regular language" (i.e. one that can be evaluated by a read-only Turing machine), but is there a more specific term?
Bonus points if the term is intuitively understandable to an audience that is well-versed in general programming concepts but doesn't have a formal CS background. Also bonus points if there is a specific kind of machine or automaton that evaluates such a language. Oh yeah, keep in mind that we're not evaluating a stream of data - every element has (read-only) access to the full set of input data. :)
I believe that your language is sufficiently powerful to encode precisely the star-free languages. This is a subset of the regular languages in which no expression contains a Kleene star. In other words, it's the class of languages built from the empty string, the null set, and individual characters, closed under concatenation and disjunction. This is equivalent to the set of languages accepted by DFAs that don't have any directed cycles in them.
I can attempt a proof of this here given your description of your language, though I'm not sure it will work precisely correctly because I don't have full access to your language. The assumptions I'm making are as follows:
No functions ever return. Once a function is called, it will never return control flow to the caller.
All calls are resolved statically (that is, you can look at the source code and construct a graph of each function and the set of functions it calls). In other words, there aren't any function pointers.
The call graph is acyclic; for any functions A and B, exactly one of the following holds: A transitively calls B, B transitively calls A, or neither transitively calls the other.
More generally, the control flow graph is acyclic. Once an expression evaluates, it never evaluates again. This allows us to generalize the above so that instead of thinking of functions calling other functions, we can think of the program as a series of statements that all call one another as a DAG.
Your input is a string where each letter is scanned once and only once, and in the order in which it's given (which seems reasonable given the fact that you're trying to model flowcharts).
Given these assumptions, here's a proof that your programs accept a language iff that language is star-free.
To prove that for every star-free language there is a program in your language accepting it, begin by constructing the minimum-state DFA for that language. Star-free languages are loop-free and scan the input exactly once, and so it should be easy to build a program in your language from the DFA. In particular, given a state s with a set of transitions to other states based on the next symbol of input, you can write a function that looks at the next character of input and then calls the function encoding the state being transitioned to. Since the DFA has no directed cycles, the function calls have no directed cycles, and so each statement will be executed at most once. We now have that (∀ languages R: R is star-free → ∃ a program in your language that accepts R).
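To make the construction concrete, here is a toy version in ordinary code (not your language): each DFA state becomes a function that consumes one character and calls the function for the next state, and because the DFA is acyclic no function is ever reached twice. This sketch accepts exactly the string "ab".

    #include <string>

    // Each function plays the role of one DFA state; the DFA is acyclic, so no
    // function is ever entered twice. (In your language these "returns" would
    // instead be the general exit statement.)
    bool state_after_ab(const std::string& s, std::size_t i) { return i == s.size(); } // accepting
    bool state_after_a(const std::string& s, std::size_t i) {
        if (i < s.size() && s[i] == 'b') return state_after_ab(s, i + 1);
        return false;  // dead state
    }
    bool state_start(const std::string& s, std::size_t i) {
        if (i < s.size() && s[i] == 'a') return state_after_a(s, i + 1);
        return false;
    }
    // state_start(s, 0) accepts exactly the string "ab".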
To prove the reverse direction of the implication, we essentially reverse this construction and create an ε-NFA with no cycles that corresponds to your program. Doing a subset construction on this NFA to reduce it to a DFA will not introduce any cycles, and so you'll have a star-free language. The construction is as follows: for each statement s_i in your program, create a state q_i with a transition to each of the states corresponding to the other statements in your program that are one hop away from that statement. The transitions to those states are labeled with the symbols of input consumed in making each of the decisions, or ε if the transition occurs without consuming any input. This shows that (∀ programs P in your language, ∃ a star-free language R that accepts exactly the strings accepted by P).
Taken together, this shows that your programs have exactly the power of the star-free languages.
Of course, the assumptions I made on what your programs can do might be too limited. You might have random-access to the input sequence, which I think can be handled with a modification of the above construction. If you can potentially have cycles in execution, then this whole construction breaks. But, even if I'm wrong, I still had a lot of fun thinking about this, and thank you for an enjoyable evening. :-)
Hope this helps!
I know this question is somewhat old, but for posterity, the phrase you are looking for is "decision tree". See http://en.wikipedia.org/wiki/Decision_tree_model for details. I believe this captures exactly what you have done and has a pretty descriptive name to boot!
The halting problem cannot be solved for Turing-complete languages, and it can be solved trivially for some non-TC languages, like regexes, where evaluation always halts.
I was wondering if there are any languages that have both the ability to halt and the ability not to halt, but that admit an algorithm which can determine whether a given program halts.
The halting problem does not act on languages. Rather, it acts on machines (i.e., programs): it asks whether a given program halts on a given input. Perhaps you meant to ask whether it can be solved for other models of computation (like regular expressions, which you mention, but also like push-down automata).
Halting can, in general, be detected in models with finite resources (like regular expressions or, equivalently, finite automata, which have a fixed number of states and no external storage). This is easily accomplished by enumerating all possible configurations and checking whether the machine enters the same configuration twice (indicating an infinite loop); with finite resources, we can put an upper bound on the amount of time before we must see a repeated configuration if the machine does not halt.
Usually, models with infinite resources (unbounded TMs and PDAs, for instance) cannot be halt-checked, but it would be best to investigate the models and their open problems individually.
(Sorry for all the Wikipedia links, but it actually is a very good resource for this kind of question.)
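To make the finite-resources argument concrete, here is a sketch of such a halting check for a toy machine whose entire configuration is a single integer state: simulate, remember every configuration seen, and report an infinite loop as soon as one repeats.

    #include <functional>
    #include <unordered_set>

    // step(s) returns the next state; a negative state means "halted".
    // Because the state space is finite, the machine either halts or repeats a state.
    bool halts(int start, const std::function<int(int)>& step) {
        std::unordered_set<int> seen;
        for (int s = start; s >= 0; s = step(s)) {
            if (!seen.insert(s).second)   // configuration seen before: infinite loop
                return false;
        }
        return true;                      // reached a halting configuration
    }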
Yes. One important class of this kind is the primitive recursive functions. This class includes all of the basic things you expect to be able to do with numbers (addition, multiplication, etc.), as well as some complex classes like the ones #adrian has mentioned (regular expressions/finite automata, context-free grammars/pushdown automata). There do, however, exist computable functions that are not primitive recursive, such as the Ackermann function.
It's actually pretty easy to understand primitive recursive functions. They're the functions you could compute in a programming language that has no true recursion (so a function f cannot call itself, whether directly or by calling another function g that then calls f, etc.) and no while-loops, only bounded for-loops. A bounded for-loop is one like "for i from 1 to r", where r is a variable that has already been computed earlier in the program; also, i cannot be modified within the loop body. The point of such a programming language is that every program halts.
Most programs we write are actually primitive recursive (I mean, they can be translated into such a language).
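For example, factorial fits that style: the loop bound is fixed before the loop starts and the loop counter is never modified inside the body, so the program obviously halts (a sketch in ordinary code, imitating the bounded-loop restriction by convention):

    // Primitive-recursive style: only loops of the form "for i from 1 to r", where
    // r is computed before the loop and i is never modified inside the body.
    unsigned long long factorial(unsigned int n) {
        unsigned long long result = 1;
        for (unsigned int i = 1; i <= n; ++i)   // bound n is fixed before the loop
            result = result * i;
        return result;
    }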
The short answer is yes, and such languages can even be extremely useful.
There was a discussion about it a few months ago on LtU:
http://lambda-the-ultimate.org/node/2846