How do I execute, as part of an ML{*...*} command, a string that contains ML source?

From the Isabelle user list Makarius Wenzel says this:
It is also possible to pass around ML sources as strings or tokens in Isabelle/ML, and invoke the compiler on it. That is a normal benefit of incremental compilation.
I have an ML statement as a string, like this:
ML{* val source_string = "val x = 3 + 4"; *}
I want to use "val x = 3 + 4" as an ML statement in an ML{*...*} command. I can do it by calling Poly/ML externally like this:
ML{*
fun polyp cmd = Isabelle_System.bash
("perl -e 'print q(" ^ cmd ^ ");' | $ML_HOME/poly");
*}
ML{* polyp source_string; *}
That takes about 200ms. I figure it would be about 0ms if I could do it internally.
Update 140411
Makarius Wenzel may have another way to do things, but what I have below with ML_Context.eval_text is almost what I want. It's in line with what I've been experimenting with. The problem is that used_only_by_src1 is global. I can't put it in the let.
I suppose if I was using src1 in two different ML{*...*} commands, there's some short period of time in which used_only_by_src1 could be changed by the one before it was used by the other. But, I guess this is all part of learning about stateless programming.
ML{*
val src1 = "x + y" (*I would actually have a global list of sources.*)
val used_only_by_src1 = Unsynchronized.ref "";
fun two_int_arg_fun s1 s2 = let
val s = "val x = " ^ s1 ^ "val y = " ^ s2
^ "used_only_by_src1 := (Int.toString(" ^ src1 ^ "))";
val _ = ML_Context.eval_text true Position.none s;
in !used_only_by_src1 end;
*}
ML{*
two_int_arg_fun;
two_int_arg_fun "44;" "778;"
(*3ms*)
*}
Note 140412: Actually, I don't need to execute ML strings. I can program in ML like normal, since any ML{*...*} can access global ML functions, where I can write what I need.
The main solution I got from this is how to pass arguments to a string of Perl code, which I got here from trying to do it for ML, so thanks to davidg. Plus, ML_Context.exec and ML_Context.eval_text might come in useful somewhere, and learning enough to be able to use them is totally useful.
There is the problem of needing a local or global Unsynchronized.ref that's guaranteed to not be changed by some other code (or a non-mutable type), but surely there's a solution to that.
I didn't pursue ML_Context.exec because isar_syn.ML wasn't making any sense to me, but I've gotten as far as below, so now I'm asking, "What functions do I need that involve Context.generic -> Context.generic or Toplevel.transition -> Toplevel.transition, and what do those do for me as far as being able to get the return value of 3+4 in a ML{*...*}?"
I'm using grep with Isabelle_System.bash, and I've gotten as far as what's below, in looking for the right signatures in Isabelle/Pure. I throw in for free a grep I use to look for useful or needed functions in the Poly/ML Basis.
ML{*
Isabelle_System.bash ("grep -nr 'get_generic' $ISABELLE_HOME/src/Pure");
Isabelle_System.bash ("grep -nr 'hash' $ML_HOME/../src/Basis/*");
Config.get_generic;
(*From use at line 265 of isar_syn.ML.*)
ML_Context.exec;
ML_Context.exec (fn () => ML_Context.eval_text true Position.none "3 + 4");
(*OUT: val it = fn: Context.generic -> Context.generic*)
(fn (txt, pos) =>
Toplevel.generic_theory
(ML_Context.exec (fn () => ML_Context.eval_text true pos txt) #>
Local_Theory.propagate_ml_env)) ("3 + 4", Position.none);
(*OUT: val it = fn: Toplevel.transition -> Toplevel.transition*)
*}

The structure ML_Context might be a good place to start looking. For instance, your expression can be executed like so:
ML {*
ML_Context.eval_text true Position.none "val x = 3 + 4"
*}
This will internally evaluate the expression 3 + 4, and throw away the results.
Functions like ML_Context.exec will allow you to capture the results of expressions and put them into your local context; you might want to look at the implementation of the ML Isar command in src/Pure/Isar/isar_syn.ML to see how such functions are used in practice.

Related

Reading from file to database prolog

Hello everyone
I have a problem with part of a project for my studies.
My task is to write a program in Prolog that tells you what illness you have, based on input from the user. The database must be read from a file whose format is up to me.
Construction:
I decided to have 2 dynamic predicates:
:- dynamic (illness/2).
:- dynamic (symptoms/4).
where:
illness(name_of_illness, symptoms(symptom1, symptom2, symptom3, symptom4))
File: example.txt:
flu,cough,fever,head_ache, runny_nose.
measles, rash, fever, sore_throat, inflamed_eyes.
Problem:
My major problem is formatting this data for use with the asserta predicate. I tried many ways but none of them worked.
Thank you
So, per your other question, I think you can parse these strings with split_string/4. Your problem is that the result is a list of strings, not atoms, and you then need to build the structure properly. I think this is the piece you're missing:
?- split_string("this,that,the,other,thing", ",", " ", X),
maplist(atom_string, [Condition,Sym1,Sym2,Sym3,Sym4], X),
Result = illness(Condition, symptoms(Sym1,Sym2,Sym3,Sym4)).
X = ["this", "that", "the", "other", "thing"],
Condition = this,
Sym1 = that,
Sym2 = the,
Sym3 = other,
Sym4 = thing,
Result = illness(this, symptoms(that, the, other, thing)).
If you then simply call asserta(Result), you have added the right thing to the database.
If you have a variable number of symptoms, you should keep them in a list instead. That would simplify your processing greatly (and probably your downstream code, as doing anything four times in a row is a bit repetitive):
?- split_string("this,that,the,other,thing", ",", " ", X),
maplist(atom_string, [Condition|Symptoms], X),
Result = illness(Condition, symptoms(Symptoms)).
X = ["this", "that", "the", "other", "thing"],
Condition = this,
Symptoms = [that, the, other, thing],
Result = illness(this, symptoms([that, the, other, thing])).

R apply-like function for updating 2 arrays simultaneously?

I am new to R, and looking for an apply type function to work with 2 arrays at once (simultaneous update).
For example, let's say I have some variables X and P:
X = array(rep(0, 10), dim=c(10, 1))
P = array(rep(1, 10), dim=c(10, 1))
which are governed by the system of equations:
X[k,] = 2*X[k-1]
P[k,] = 3*X[k] + X[k-1] + 3
Obviously, this can easily be accomplished with a for-loop, however, I have read/confirmed myself that for loops work horrendously for large inputs, and I wanted to start getting into good R coding practice, so I am wondering, what is the best way to do this in an apply-type logic? I am looking for something like,
sapply(2:dim(X)[1], function(k) {
X[k,] = 2*X[k-1]
P[k,] = 3*X[k] + X[k-1] + 3
})
But this obviously won't work, as it doesn't actually update X and P internally. Any tips/tricks for how to make my for-loops faster, and get in better R coding practice? Thanks in advance!
You could do the following. The <<- operator assigns to X and P in the enclosing environment instead of creating local copies inside the function:
sapply(2:dim(X)[1], function(k) {
X[k,] <<- 2*X[k-1]
P[k,] <<- 3*X[k] + X[k-1] + 3
})
As pointed out by thelatemail in the comments, using <<- can be problematic because of the side effects it can have. See the links below, in particular the one comparing for loops (and other loops) to the apply family of functions.
Here is a link to documentation on assignment operators in R.
Here is a StackOverflow link on for loop vs. apply functions that talks about performance.

Idiomatic exceptions for exiting loops in OCaml

In OCaml, imperative-style loops can be exited early by raising exceptions.
While the use of imperative loops is not idiomatic per se in OCaml, I'd like to know what are the most idiomatic ways to emulate imperative loops with early exits (taking into account aspects such as performance, if possible).
For instance, an old OCaml FAQ mentions exception Exit:
Exit: used to jump out of loops or functions.
Is it still current? The standard library simply mentions it as a general-purpose exception:
The Exit exception is not raised by any library function. It is provided for use in your programs.
Relatedly, this answer to another question mentions using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?
Also, sometimes one wants to exit from the loop with a specific value, such as raise (Leave 42). Is there an idiomatic exception or naming convention to do this? Should I use references in this case (e.g. let res = ref (-1) in ... <loop body> ... res := 42; raise Exit)?
Finally, the use of Exit in nested loops prevents some cases where one would like to exit several loops, like break <label> in Java. This would require defining exceptions with different names, or at least using an integer to indicate how many scopes should be exited (e.g. Leave 2 to indicate that 2 levels should be exited). Again, is there an approach/exception naming that is idiomatic here?
As originally posted in comments, the idiomatic way to do early exit in OCaml is using continuations. At the point where you want the early return to go to, you create a continuation, and pass it to the code that might return early. This is more general than labels for loops, since you can exit from just about anything that has access to the continuation.
Also, as posted in comments, note the usage of raise_notrace for exceptions whose trace you never want the runtime to generate.
A "naive" first attempt:
module Continuation :
  sig
    (* This is the flaw with this approach: there is no good choice for
       the result type. *)
    type 'a cont = 'a -> unit
    (* with_early_exit f passes a function "k" to f. If f calls k,
       execution resumes as if with_early_exit completed
       immediately. *)
    val with_early_exit : ('a cont -> 'a) -> 'a
  end =
  struct
    type 'a cont = 'a -> unit
    (* Early return is implemented by throwing an exception. The ref
       cell is used to store the value with which the continuation is
       called - this is a way to avoid having to generate an exception
       type that can store 'a for each 'a this module is used with. The
       integer is supposed to be a unique identifier for distinguishing
       returns to different nested contexts. *)
    type 'a context = 'a option ref * int64
    exception Unwind of int64
    let make_cont ((cell, id) : 'a context) =
      fun result -> cell := Some result; raise_notrace (Unwind id)
    let generate_id =
      let last_id = ref 0L in
      fun () -> last_id := Int64.add !last_id 1L; !last_id
    let with_early_exit f =
      let id = generate_id () in
      let cell = ref None in
      let cont : 'a cont = make_cont (cell, id) in
      try
        f cont
      with Unwind i when i = id ->
        match !cell with
        | Some result -> result
        (* This should never happen... *)
        | None -> failwith "with_early_exit"
  end
let _ =
  let nested_function i k = k 15; i in
  Continuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline
As you can see, the above implements early exit by hiding an exception. The continuation is actually a partially applied function that knows the unique id of the context for which it was created, and has a reference cell to store the result value while the exception is being thrown to that context. The code above prints 15. You can pass the continuation k as deep as you want. You can also define the function f immediately at the point where it is passed to with_early_exit, giving an effect similar to having a label on a loop. I use this very often.
The problem with the above is the result type of 'a cont, which I arbitrarily set to unit. Actually, a function of type 'a cont never returns, so we want it to behave like raise – be usable where any type is expected. However, this doesn't immediately work. If you do something like type ('a, 'b) cont = 'a -> 'b, and pass that down to your nested function, the type checker will infer a type for 'b in one context, and then force you to call continuations only in contexts with the same type, i.e. you won't be able to do things like
(if ... then 3 else k 15)
...
(if ... then "s" else k 16)
because the first expression forces 'b to be int, but the second requires 'b to be string.
To solve this, we need to provide a function analogous to raise for early return, i.e.
(if ... then 3 else throw k 15)
...
(if ... then "s" else throw k 16)
This means stepping away from pure continuations. We have to un-partially-apply make_cont above (and I renamed it to throw), and pass the naked context around instead:
module BetterContinuation :
  sig
    type 'a context
    val throw : 'a context -> 'a -> _
    val with_early_exit : ('a context -> 'a) -> 'a
  end =
  struct
    type 'a context = 'a option ref * int64
    exception Unwind of int64
    let throw ((cell, id) : 'a context) =
      fun result -> cell := Some result; raise_notrace (Unwind id)
    (* Same as in the Continuation module above. *)
    let generate_id =
      let last_id = ref 0L in
      fun () -> last_id := Int64.add !last_id 1L; !last_id
    let with_early_exit f =
      let id = generate_id () in
      let cell = ref None in
      let context = (cell, id) in
      try
        f context
      with Unwind i when i = id ->
        match !cell with
        | Some result -> result
        | None -> failwith "with_early_exit"
  end
let _ =
  let nested_function i k = ignore (BetterContinuation.throw k 15); i in
  BetterContinuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline
The expression throw k v can be used in contexts where different types are required.
I use this approach pervasively in some big applications I work on. I prefer it even to regular exceptions. I have a more elaborate variant, where with_early_exit has a signature roughly like this:
val with_early_exit : ('a context -> 'b) -> ('a -> 'b) -> 'b
where the first function represents an attempt to do something, and the second represents the handler for errors of type 'a that may result. Together with variants and polymorphic variants, this gives a more explicitly-typed take on exception handling. It is especially powerful with polymorphic variants, as the set of error variants can be inferred by the compiler.
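A minimal sketch of what such a handler-taking variant might look like, assuming the same ref-cell-plus-id technique as BetterContinuation above. The module name Early_exit, the helper next_id, and the example function parse_positive are hypothetical names chosen for illustration; this is not the author's actual implementation. It assumes OCaml 4.05 or later (for int_of_string_opt).

```ocaml
(* Hypothetical sketch: Early_exit and its internals are assumptions. *)
module Early_exit : sig
  type 'a context
  val throw : 'a context -> 'a -> 'b
  val with_early_exit : ('a context -> 'b) -> ('a -> 'b) -> 'b
end = struct
  type 'a context = 'a option ref * int64

  exception Unwind of int64

  (* Produces a fresh id per call so nested contexts can be told apart. *)
  let next_id =
    let last = ref 0L in
    fun () -> last := Int64.succ !last; !last

  let throw (cell, id) err =
    cell := Some err;
    raise_notrace (Unwind id)

  let with_early_exit attempt handler =
    let id = next_id () in
    let cell = ref None in
    try attempt (cell, id)
    with Unwind i when i = id ->
      (match !cell with
       | Some err -> handler err
       | None -> failwith "with_early_exit")
end

(* The error type is a polymorphic variant inferred by the compiler. *)
let parse_positive s =
  Early_exit.with_early_exit
    (fun ctx ->
       let n =
         match int_of_string_opt s with
         | Some n -> n
         | None -> Early_exit.throw ctx (`Not_a_number s)
       in
       if n <= 0 then Early_exit.throw ctx (`Not_positive n) else Ok n)
    (fun err -> Error err)
```

Here throw is used once where an int is expected and once where a result is expected, which is the flexibility described above.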
The Jane Street approach effectively does the same as what is described here, and in fact I previously had an implementation that generated exception types with first-class modules. I am not sure anymore why I eventually chose this one – there may be subtle differences :)
Just to answer a specific part of my question which was not mentioned in other answers:
... using a precomputed let exit = Exit exception to avoid allocations inside the loop. Is it still required?
I did some micro-benchmarks using Core_bench on 4.02.1+fp and the results indicate no significant difference: when comparing two identical loops, one containing a local exit declared before the loop and another one without it, the time difference is minimal.
The difference between raise Exit and raise_notrace Exit in this example was also minimal, about 2% in some runs, up to 7% in others, but it could well be within the error margins of such a short experiment.
Overall, I couldn't measure any noticeable difference, so unless someone would have examples where Exit/exit significantly affect performance, I would prefer the former since it is clearer and avoids creating a mostly useless variable.
Finally, I also compared the difference between two idioms: using a reference to a value before exiting the loop, or creating a specific exception type containing the return value.
With reference to result value + Exit:
let res = ref 0 in
let r =
  try
    for i = 0 to n-1 do
      if a.(i) = v then
        (res := v; raise_notrace Exit)
    done;
    assert false
  with Exit -> !res
in ...
With specific exception type:
exception Res of int
let r =
  try
    for i = 0 to n-1 do
      if a.(i) = v then
        raise_notrace (Res v)
    done;
    assert false
  with Res v -> v
in ...
Once again, the differences were minimal and varied a lot between runs. Overall, the first version (reference + Exit) seemed to have a slight advantage (0% to 10% faster), but the difference was not significant enough to recommend one version over the other.
Since the former requires defining an initial value (which may not exist) or using an option type to initialize the reference, and the latter requires defining a new exception per type of value returned from the loop, there is no ideal solution here.
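As a sketch of the option-type variant mentioned above, the reference can start as None so that no artificial initial value is needed, and the "not found" case is reported as None instead of assert false. The name find_first is hypothetical and used only for illustration.

```ocaml
(* find_first is a hypothetical helper name. It returns the first element
   satisfying p, or None if the loop runs to completion. *)
let find_first (p : 'a -> bool) (a : 'a array) : 'a option =
  let res = ref None in
  (try
     for i = 0 to Array.length a - 1 do
       if p a.(i) then begin
         res := Some a.(i);
         raise_notrace Exit
       end
     done
   with Exit -> ());
  !res
```

The cost is one extra allocation for Some and a match at the call site.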
Exit is OK (I'm not sure whether I can call it idiomatic). But make sure that you're using raise_notrace if your compiler is recent enough (4.02 or later).
An even better solution is to use with_return from the OCaml Core library. It will not have any problems with scope, because it creates a fresh exception type for each nesting.
Of course, you can achieve the same results yourself, or just take the source code of Core's implementation.
Even more idiomatic is to avoid exceptions for short-circuiting your iteration, and to use an existing algorithm (find, find_map, exists, etc.) or just write a recursive function if no algorithm suits you.
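For readers who do not want to depend on Core, here is a minimal stand-in sketch of the idea, assuming OCaml 4.04 or later for local exceptions (let exception). It is not Core's source code; the record type and the example function find_pair are illustrative names.

```ocaml
(* Hypothetical stand-in for the with_return idea; not Core's code. *)
type 'a return = { return : 'b. 'a -> 'b }

let with_return (type a) (f : a return -> a) : a =
  (* A local exception gets a fresh identity each time with_return runs,
     so nested uses cannot catch each other's exits. *)
  let exception Return of a in
  let return v = raise_notrace (Return v) in
  try f { return } with Return v -> v

(* Exits two nested iterations at once, similar to a labelled break. *)
let find_pair (m : int array array) (target : int) : (int * int) option =
  with_return (fun r ->
    Array.iteri
      (fun i row ->
         Array.iteri
           (fun j x -> if x = target then r.return (Some (i, j)))
           row)
      m;
    None)
```

Because return has type 'a -> 'b for any 'b, it can be called where a unit, an int, or any other type is expected.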
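A short sketch of the exception-free alternatives, using only the standard library (List.find_opt needs OCaml 4.05 or later, Array.exists needs 4.03 or later). The function names are illustrative.

```ocaml
(* A recursive function replaces "for loop + raise Exit": the recursion
   simply stops as soon as the element is found. *)
let index_of (v : int) (a : int array) : int option =
  let n = Array.length a in
  let rec go i =
    if i >= n then None
    else if a.(i) = v then Some i
    else go (i + 1)
  in
  go 0

(* Existing short-circuiting functions from the standard library. *)
let first_even (xs : int list) : int option =
  List.find_opt (fun x -> x mod 2 = 0) xs

let has_negative (a : int array) : bool =
  Array.exists (fun x -> x < 0) a
```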
Regarding the point
using a precomputed let exit = Exit exception to avoid allocations
inside the loop. Is it still required?
the answer is no with sufficiently recent versions of OCaml. Here is the relevant excerpt from the Changelog of OCaml 4.02.0.
PR#6203: Constant exception constructors no longer allocate (Alain Frisch)
Here is PR6203: http://caml.inria.fr/mantis/view.php?id=6203

OCaml - Read csv file into array

I'm trying to import a CSV file in OCaml into an array. I do realise it's not the best fit for the language and I'm not actually sure an array is the best structure, but anyway...
It's working fine, but I'm really uneasy about the way I did it.
let import file_name separator =
  let reg_separator = Str.regexp separator in
  let value_array = Array.make_matrix 1600 12 0. in
  let i = ref 0 in
  try
    let ic = open_in file_name in
    (* Skip the first line, columns headers *)
    let _ = input_line ic in
    try
      while true; do
        (* Create a list of values from a line *)
        let line_list = Str.split reg_separator (input_line ic) in
        for j = 0 to ((List.length line_list) - 1) do
          value_array.(!i).(j) <- float_of_string (List.nth line_list j)
        done;
        i := !i + 1
      done;
      value_array
    with
    | End_of_file -> close_in ic; value_array
  with
  | e -> raise e;;
Basically, I read the file line by line, and I split each line along the separator. The problem is that this returns a list, and List.nth walks the list from the head on every call, so the complexity of the following line is really dreadful (quadratic in the number of columns).
value_array.(!i).(j) <- float_of_string (List.nth line_list j)
Is there any way to do it in a better way short of recoding the whole split thing by myself?
PS : I haven't coded in Ocaml in a long time, so I'm quite unsure about the try things and the way I return the array.
Cheers.
On OCaml >=4.00.0, you can use the List.iteri function.
List.iteri
(fun j elem -> value_array.(!i).(j) <- float_of_string elem)
line_list
You can replace your for-loop with this code and it should work nicely (of course, you need to keep the ;).
On older versions of OCaml, you can use List.iter with a reference you manually increment or, in a cleaner way, declare your own iteri.
Note that your code is not very safe, notably with respect to your file's size (in terms of number of lines and columns, for example). Maybe you should put your dimension parameters as function arguments for a bit of flexibility.
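One way to avoid hard-coded dimensions altogether is to let the file determine them. The sketch below uses only the standard library and assumes OCaml 4.08 or later (for Fun.protect), a header line, a single-character separator, and no quoted fields. The name import_csv is hypothetical.

```ocaml
(* import_csv is a hypothetical name. Rows may have different lengths;
   the result has one float array per data line. *)
let import_csv ?(separator = ',') (file_name : string) : float array array =
  let ic = open_in file_name in
  (* The channel is closed whether reading succeeds or raises. *)
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () ->
    (* Skip the first line (column headers). *)
    let _header = input_line ic in
    let rec read_rows acc =
      match input_line ic with
      | line ->
        let row =
          String.split_on_char separator line
          |> List.map (fun s -> float_of_string (String.trim s))
          |> Array.of_list
        in
        read_rows (row :: acc)
      | exception End_of_file -> List.rev acc
    in
    Array.of_list (read_rows []))
```

A completely empty file raises End_of_file from the header read, and a non-numeric field raises Failure from float_of_string; both propagate to the caller after the channel is closed.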
EDIT: for future readers, you can use the very simple ocaml-csv (through OPAM: opam install csv)

C: Convert A ? B : C into if (A) B else C

I am looking for a tool that can convert C expressions of the form:
a = (A) ? B : C;
into the 'default' syntax with if/else statements:
if (A)
a = B;
else
a = C;
Does someone know a tool that's capable of doing such a transformation?
I work with GCC 4.4.2 and create a preprocessed file with -E but do not want such structures in it.
Edit:
Following code should be transformed, too:
a = ((A) ? B : C)->b;
Coccinelle can do this quite easily.
Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. Coccinelle was initially targeted towards performing collateral evolutions in Linux. Such evolutions comprise the changes that are needed in client code in response to evolutions in library APIs, and may include modifications such as renaming a function, adding a function argument whose value is somehow context-dependent, and reorganizing a data structure. Beyond collateral evolutions, Coccinelle is successfully used (by us and others) for finding and fixing bugs in systems code.
EDIT:
An example of semantic patch:
@@ expression E; constant C; @@
(
!E & !C
|
- !E & C
+ !(E & C)
)
From the documentation:
The pattern !x&y. An expression of this form is almost always meaningless, because it combines a boolean operator with a bit operator. In particular, if the rightmost bit of y is 0, the result will always be 0. This semantic patch focuses on the case where y is a constant.
You have a good set of examples here.
The mailing list is really active and helpful.
The following semantic patch for Coccinelle will do the transformation.
@@
expression E1, E2, E3, E4;
@@
- E1 = E2 ? E3 : E4;
+ if (E2)
+   E1 = E3;
+ else
+   E1 = E4;

@@
type T;
identifier E5;
T *E3;
T *E4;
expression E1, E2;
@@
- E1 = ((E2) ? (E3) : (E4))->E5;
+ if (E2)
+   E1 = E3->E5;
+ else
+   E1 = E4->E5;

@@
type T;
identifier E5;
T E3;
T E4;
expression E1, E2;
@@
- E1 = ((E2) ? (E3) : (E4)).E5;
+ if (E2)
+   E1 = (E3).E5;
+ else
+   E1 = (E4).E5;
The DMS Software Reengineering Toolkit can do this, by applying program transformations.
A specific DMS transformation to match your specific example:
domain C.
rule ifthenelseize_conditional_expression(a:lvalue,A:condition,B:term,C:term):
stmt -> stmt
= " \a = \A ? \B : \C; "
-> " if (\A) \a = \B; else \a=\C ; ".
You'd need another rule to handle your other case, but it is equally easy to express.
The transformations operate on source code structures rather than text, so layout and comments won't affect recognition or application. The quotation marks in the rule are not traditional string quotes, but rather metalinguistic quotes that separate the rule syntax language from the pattern language used to specify the concrete syntax to be changed.
There are some issues with preprocessing directives if you intend to retain them. Since you apparently are willing to work with preprocessor-expanded code, you can ask DMS to do the preprocessing as part of the transformation step; it has full GCC4 and GCC4-compatible preprocessors built right in.
As others have observed, this is a rather easy case because you specified it work at the level of a full statement. If you want to rid the code of any assignment that looks similar to this statement, with such assignments embedded in various contexts (initializers, etc.) you may need a larger set of transforms to handle the various set of special cases, and you may need to manufacture other code structures (e.g., temp variables of appropriate type). The good thing about a tool like DMS is that it can explicitly compute a symbolic type for an arbitrary expression (thus the type declaration of any needed temps) and that you can write such a larger set rather straightforwardly and apply all of them.
All that said, I'm not sure of the real value of doing your ternary-conditional-expression elimination operation. Once the compiler gets hold of the result, you may get similar object code as if you had not done the transformations at all. After all, the compiler can apply equivalence-preserving transformations, too.
There is obviously value in making regular changes in general, though.
(DMS can apply source-to-source program transformations to many languages, including C, C++, Java, C# and PHP).
I am not aware of such a tool, as the ternary operator is built into the language specification as a shortcut for the if logic. The only way I can think of is to look for those lines manually and rewrite them into the form where if is used. In general, the ternary operator works like this:
expr_is_true ? exec_if_expr_is_TRUE : exec_if_expr_is_FALSE;
If the expression is evaluated to be true, execute the part between ? and :, otherwise execute the last part between : and ;. It would be the reverse if the expression is evaluated to be false
expr_is_false ? exec_if_expr_is_FALSE : exec_if_expr_is_TRUE;
If the statements are very regular like this, why not run your files through a little Perl script? The core logic to do the find-and-transform is simple for your example line. Here's a bare-bones approach:
use strict;
use warnings;
while (<>) {
    my $line = $_;
    chomp($line);
    # Match lines of the form:  a = (A) ? B : C;
    if ( $line =~ m/(\S+)\s*=\s*\((\s*\S+\s*)\)\s*\?\s*(\S+)\s*:\s*(\S+)\s*;/ ) {
        # Emit each branch as a complete statement, including its semicolon.
        print "if(" . $2 . ")\n\t" . $1 . " = " . $3 . ";\nelse\n\t" . $1 . " = " . $4 . ";\n";
    } else {
        print $line . "\n";
    }
}
exit(0);
You'd run it like so:
perl transformer.pl < foo.c > foo.c.new
Of course, it gets harder and harder if the text pattern isn't as regular as the one you posted. But it is free, quick, and easy to try.
