Heap's algorithm in Clojure (can it be implemented efficiently?)

Heap's algorithm enumerates the permutations of an array. Wikipedia's article on the algorithm says that Robert Sedgewick concluded the algorithm was "at that time the most effective algorithm for generating permutations by computer," so naturally it seemed fun to try to implement it.
The algorithm is all about making a succession of swaps within a mutable array, so I was looking at implementing this in Clojure, whose sequences are immutable. I put the following together, avoiding mutability completely:
(defn swap [a i j]
  (assoc a j (a i) i (a j)))
(defn generate-permutations [v n]
  (if (zero? n)
    ();(println (apply str a));Comment out to time just the code, not the print
    (loop [i 0 a v]
      (if (<= i n)
        (do
          (generate-permutations a (dec n))
          (recur (inc i) (swap a (if (even? n) i 0) n)))))))
(if (not= (count *command-line-args*) 1)
  (do (println "Exactly one argument is required") (System/exit 1))
  (let [word (-> *command-line-args* first vec)]
    (time (generate-permutations word (dec (count word))))))
For an 11-character input string, the algorithm runs (on my computer) in 7.3 seconds (averaged over 10 runs).
The equivalent Java program, using character arrays, runs in 0.24 seconds.
So I would like to make the Clojure code faster. I used a Java array with type hinting. This is what I tried:
(defn generate-permutations [^chars a n]
  (if (zero? n)
    ();(println (apply str a))
    (doseq [i (range 0 (inc n))]
      (generate-permutations a (dec n))
      (let [j (if (even? n) i 0) oldn (aget a n) oldj (aget a j)]
        (aset-char a n oldj) (aset-char a j oldn)))))
(if (not= (count *command-line-args*) 1)
  (do
    (println "Exactly one argument is required")
    (System/exit 1))
  (let [word (-> *command-line-args* first vec char-array)]
    (time (generate-permutations word (dec (count word))))))
Well, it's slower. Now it averages 9.1 seconds for the 11-character array (even with the type hint).
I understand mutable arrays are not the Clojure way, but is there any way to approach the performance of Java for this algorithm?

It's not so much that Clojure is entirely about avoiding mutable state. It's that Clojure has very strong opinions on when it should be used.
In this case, I'd highly recommend finding a way to rewrite your algorithm using transients. They're specifically designed to save time by avoiding the reallocation of memory: a collection can be mutated locally, so long as the reference to it never leaves the function in which it was created. I recently managed to cut the running time of a heavily memory-intensive operation by nearly 10x by using them.
This article explains transients fairly well!
http://hypirion.com/musings/understanding-clojure-transients
Also, you may want to look into rewriting your loop structure in a way that allows you to use recur to recursively call generate-permutations rather than using the whole name. You'll likely get a performance boost, and it would tax the stack a lot less.
I hope this helps.

Related

What is the closest equivalent to a for-loop in Racket-sdp?

Is recursion the only way to write something like a for-loop in the Racket dialect sdp ("Schreibe dein Programm!"), in which "(for)" isn't a thing, or is there a more "efficient" or simpler way to do so?
What would the closest equivalent to the C++ loop for (i = 0; i < 100; i++) look like in Racket-sdp code?
How I did this up until now was:
(: my-loop (natural -> %a))
(define my-loop
  (lambda (i)
    (cond
      [(< i 100) (my-loop (+ i 1))] ; <-- count up by one till 99 is reached
      [else "done"] ; <-- end
      )))
(my-loop 0)
EDIT:
It's more of a general question. Suppose I were to write, let's say, a Racket library containing a general function that might be used like this:
(for 0 (< i 100) (+ i 1) (func i))
in my programs, that is, a for-loop that runs with a given function as its "body". Would there be a way to implement this properly?

[Professor of the mentioned course here.]
Recursion indeed is the only way to express iterated computation in the Racket dialect we are pursuing. (Yes, that's by design.)
Still, higher-order functions (and recursion) provide all you need to create your own "loop-like control structures". Take the following HOF, for example, which models a repeat-until loop:
(: until ((%a -> boolean) (%a -> %a) %a -> %a))
(define until
  (lambda (done? f x)
    (if (done? x)
        x
        (until done? f (f x)))))
Note that the until function is tail-recursive. You can expect it to indeed behave like a loop at runtime — a clever compiler will even translate such a function using plain jump instructions. (We'll discuss the above in the upcoming Chapter 12.)

You can make a higher-order for-loop.
Here is a simple example:
(define (for start end f)
  (define (loop i)
    (when (< i end)
      (f i)
      (loop (+ i 1))))
  (loop start))
(for 0 10 (λ (i) (displayln i)))
You can make this more general if you use a next function instead of (+ i 1) and use a while-predicate? function instead of (< i end).

Lisp - Flag (bandera) doesn't function

I'm trying to write a function to determine whether a word is a palindrome or not. I wrote this, but it always returns "Is not a palindrome". I don't know what is happening.
(defun palindromo (X)
(setq i 0)
(setq j (- (length X) 1))
(setq bandera 0)
(loop while (< j i)
do
(when (char= (char X i) (char X j))
(+ i 1)
(- j 1)
(setq bandera 1))
(unless (char/= (char X i) (char X j))
(setq bandera 0)
)
)
(cond
((equal 0 bandera) (write "Is not a palindrome"))
((equal 1 bandera) (write "Is a palindrome"))
)
)
How can I fix this?

 Loop problem
Your loop termination test is while (< j i), but you previously set i and j to respectively the index of the first and last character. That means that (<= i j). You never execute the body of the loop, and bandera is never modified from its initial value, 0.
Infinite loop problem
But suppose you fix your test so that it becomes (< i j); then your loop becomes an infinite loop, because you never mutate either i or j in the body of your loop. The two expressions (+ i 1) and (- j 1) only compute the next indices, but do not change the existing bindings. You would have to use setq, just as you did above.
Invalid use of SETQ
By the way, you cannot introduce variables with setq: it is undefined what happens when trying to set a variable that is not defined. You can introduce global variables with defvar, defparameter, and local variables with, among others, let, let* and the loop keyword with.
I assume your Common Lisp implementation implicitly defined global variables when you executed or compiled (setq i 0) and other assignments. But this is far from ideal since now your function depends on the global state and is not reentrant. If you called palindromo from different threads, all global variables would be modified concurrently, which would give incorrect results. Better use local variables.
Boolean logic
Do not use 0 and 1 for your flag; Lisp uses nil as false and everything else as true for its boolean operators.
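The points so far (local bindings instead of implicit globals, actually updating the indices, and a nil/non-nil flag) can be combined into a minimally repaired version of the original function. This is only a sketch under those assumptions: the name palindromo-fixed is made up for illustration, and it deliberately keeps the explicit while loop of the original.

```lisp
;; Hypothetical name; a minimally repaired variant of the original function.
(defun palindromo-fixed (x)
  (let ((i 0)                     ; local bindings, no global state
        (j (1- (length x)))
        (bandera t))              ; non-nil means "still looks like a palindrome"
    (loop while (and bandera (< i j))
          do (if (char= (char x i) (char x j))
                 ;; incf/decf really modify the bindings
                 (progn (incf i) (decf j))
                 ;; mismatch: clear the flag, which also ends the loop
                 (setq bandera nil)))
    bandera))                     ; return a boolean instead of printing
```

The function returns T or NIL and leaves printing to the caller, for example (palindromo-fixed "abba") or (palindromo-fixed "abca").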
Confusing tests
In the loop body, you first write:
(when (char= (char X i) (char X j)) ...)
Then you write:
(unless (char/= (char X i) (char X j)) ...)
Both test the same thing, and the second one involves a double-negation (unless not equal), which is hard to read.
Style
You generally do not want to print things from utility functions.
You should probably only return a boolean result.
The name X is a little bit unclear; I would have used string.
Try to use the conventional way of formatting your Lisp code. It helps to use an editor which auto-indents your code (e.g. Emacs). Also, do not leave dangling parentheses on their own lines.
Rewrite
(defun palindromep (string)
  (loop with max = (1- (length string))
        for i from 0
        for j downfrom max
        while (< i j)
        always (char= (char string i)
                      (char string j))))
I added a p to palindrome by convention, because it is a predicate.
The with max = ... in the loop defines a loop variable which holds the index of the last character (or -1 if string is empty).
i is a loop variable which increments, starting from 0.
j is a loop variable which decrements, starting from max.
while is a termination test.
always evaluates a form at each iteration of the loop and checks whether it is always true (non-nil).

Actually, no externally defined loop is needed for finding out, whether a string is palindromic or not. [ Remark: well, I thought that in the beginning. But as #coredump and #jkiiski pointed out, the reverse function slows down the procedure, since it copies the entire string once. ]
Use:
(defun palindromep (s)
  (string= s (reverse s)))
[ This function will be way more efficient than your code
and it returns T if s is palindromic, else NIL.] (Not true, it only saves you writing effort, but it is less efficient than the procedure using loop.)
A verbose version would be:
(defun palindromep (s)
  (let ((result (string= s (reverse s))))
    (write (if result
               "Is a palindrome"
               "Is not a palindrome"))
    result))
Writes the answer you wish but returns T or NIL.
The naming convention for a test function returning T or NIL is to end the name with p for 'predicate'.
The reverse function is less performant than the while loop suggested by #coredump
This was my beginner attempt to test the speed [not recommendable]:
;; Improved loop version by #coredump:
(defun palindromep-loop (string)
  (loop with max = (1- (length string))
        for i from 0
        for j downfrom max
        while (< i j)
        always (char= (char string i)
                      (char string j))))
;; the solution with reverse
(defun palindromep (s)
  (string= s (reverse s)))
;; the test functions test over and over the same string "abcdefggfedcba"
;; 10000 times over and over again
;; I did the repeats so that the measuring comes at least to the milliseconds
;; range ... (but it was too few repeats still. See below.)
(defun test-palindrome-loop ()
  (loop repeat 10000
        do (palindromep-loop "abcdefggfedcba")))
(time (test-palindrome-loop))
(defun test-palindrome-p ()
  (loop repeat 10000
        do (palindromep "abcdefggfedcba")))
(time (test-palindrome-p))
;; run on SBCL
[55]> (time (test-palindrome-loop))
Real time: 0.152438 sec.
Run time: 0.152 sec.
Space: 0 Bytes
NIL
[56]> (time (test-palindrome-p))
Real time: 0.019284 sec.
Run time: 0.02 sec.
Space: 240000 Bytes
NIL
;; note: this is the worst case that the string is a palindrome
;; for `palindrome-p` it would break much earlier when a string is
;; not a palindrome!
And this is #coredump's attempt to test the speed of the functions:
(lisp-implementation-type)
"SBCL"
(lisp-implementation-version)
"1.4.0.75.release.1710-6a36da1"
(machine-type)
"X86-64"
(defun palindromep-loop (string)
  (loop with max = (1- (length string))
        for i from 0
        for j downfrom max
        while (< i j)
        always (char= (char string i)
                      (char string j))))
(defun palindromep (s)
  (string= s (reverse s)))
(defun test-palindrome-loop (s)
  (sb-ext:gc :full t)
  (time
   (loop repeat 10000000
         do (palindromep-loop s))))
(defun test-palindrome-p (s)
  (sb-ext:gc :full t)
  (time
   (loop repeat 10000000
         do (palindromep s))))
(defun rand-char ()
  (code-char
   (+ #.(char-code #\a)
      (random #.(- (char-code #\z) (char-code #\a))))))
(defun generate-palindrome (n &optional oddp)
  (let ((left (coerce (loop repeat n collect (rand-char)) 'string)))
    (concatenate 'string
                 left
                 (and oddp (list (rand-char)))
                 (reverse left))))
(let ((s (generate-palindrome 20)))
  (test-palindrome-p s)
  (test-palindrome-loop s))
Evaluation took:
4.093 seconds of real time
4.100000 seconds of total run time (4.068000 user, 0.032000 system)
[ Run times consist of 0.124 seconds GC time, and 3.976 seconds non-GC time. ]
100.17% CPU
9,800,692,770 processor cycles
1,919,983,328 bytes consed
Evaluation took:
2.353 seconds of real time
2.352000 seconds of total run time (2.352000 user, 0.000000 system)
99.96% CPU
5,633,385,408 processor cycles
0 bytes consed
What I have learned from that:
- Test more rigorously, repeat as often as necessary (range of seconds)
- do random generation and then test in parallel
Thank you very much for the nice example #coredump! And for the remark #jkiiski!

How to return a specific value in a loop

I am a complete novice to Lisp. I have the book Practical Common Lisp by Peter Seibel, but I couldn't find an answer to my question.
So basically, how do I get this to return the value of the last ":do"?
(defun averages (numbers)
  (loop :for i :in numbers :sum i :into x :do (/ x (length numbers))))
Please bear in mind that I haven't been doing this very long.
Nor am I very aware of the unwritten dos and don'ts of Stack Overflow.

Use finally:
(defun averages (numbers)
  (loop :for i :in numbers :sum i :into x
        :finally (return (/ x (length numbers)))))
To avoid traversing the list twice, you can do (as suggested by #mark-reed and #joshua-taylor)
(defun averages (numbers)
  (loop :for n :in numbers :sum n :into x :count t :into len
        :finally (return (/ x len))))
but it will probably not make much difference performance-wise.
PS. You might want to consider CLOCC/CLLIB/math.lisp for your basic statistical needs.

If you don't make use of advanced features like multiple variables inside the LOOP, it's easy to simplify it:
(defun averages (numbers)
  (loop for i in numbers sum i into x
        finally (return (/ x (length numbers)))))
Just take advantage of the fact that the LOOP form returns the sum:
(defun averages (numbers)
  (/ (loop for i in numbers sum i)
     (length numbers)))
It might also be useful to check for an empty list of numbers first.
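The empty-list check mentioned above could look like the following sketch. Without it, an empty list leads to (/ 0 0), which signals a division-by-zero error. Returning NIL for an empty list is an assumption made here (signalling a custom error would be just as reasonable), and the name average-or-nil is hypothetical.

```lisp
;; Hypothetical name; returns NIL for an empty list instead of dividing by zero.
(defun average-or-nil (numbers)
  (when numbers                          ; NIL (the empty list) is false
    (/ (loop for n in numbers sum n)     ; LOOP returns the sum directly
       (length numbers))))
```

With integer input the result is an exact rational, so (average-or-nil '(1 2 3 4)) gives 5/2, and (average-or-nil '()) gives NIL.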

Conditionals in Elisp's cl-loop facility

I'm trying to wrap my head around Elisp's cl-loop facility but can't seem to find a way to skip elements. Here's an artificial example to illustrate the problem: I'd like to loop over a list of integers and get a new list in which all odd integers from the original list are squared. The even integers should be omitted.
According to the documentation of cl-loop, I should be able to do this:
(loop for i in '(1 2 3)
      if (evenp i)
        append (list)
      else
        for x = (* x x)
        and append (list x))
The desired output is '(1 9), but instead I get an error:
cl--parse-loop-clause: Expected a `for' preposition, found (list x)
Apparently the and doesn't work as expected but I don't understand why. (I'm aware that I could simplify the else block to consist of only one clause such that the and isn't needed anymore. However, I'm interested in situations where you really have to connect several clauses with and.)
Second part of the question: Ideally, I would be able to write this:
(loop for i in '(1 2 3)
      if (evenp i)
        continue
      for x = (* x x)
      append (list x))
Continue is a very common way to skip iterations in other languages. Why doesn't cl-loop have a continue operator? Is there a simple way to skip elements that I overlooked (simpler than what I tried in the first example)?

In Common Lisp it is not possible to write such a LOOP. See the LOOP Syntax.
There is a set of variable clauses on the top. But you can't use one like FOR later in the main clause. So in an IF clause you can't use FOR. If you want to introduce a local variable, then you need to introduce it at the top as a WITH clause and set it later in the body.
(loop for i in '(1 2 3)
      with x
      if (evenp i)
        append (list)
      else
        do (setf x (* i i))
        and append (list x))
LOOP in Common Lisp also has no continue feature. One would use a conditional clause.
Note that Common Lisp has a more advanced iteration construct available as a library, ITERATE. It does not exist for Emacs Lisp, though.

You could do:
(loop for i in '(1 2 3)
      if (oddp i) collect (* i i))
That would solve your sample problem.

And here's another without loop (yes, I know you asked for loop):
(let ((ns ()))
  (dolist (n '(1 2 3))
    (when (oddp n) (push (* n n) ns)))
  (nreverse ns))
And without even cl-lib (which defines oddp):
(let ((ns ()))
  (dolist (n '(1 2 3))
    (unless (zerop (mod n 2)) (push (* n n) ns)))
  (nreverse ns))
Everything about such definitions is clear -- just Lisp. Same with #abo-abo's examples.
loop is a separate language. Its purpose is to express common iteration scenarios, and for that it can do a good job. But Lisp it is not. ;-) It is a domain-specific language for expressing iteration. And it lets you make use of Lisp sexps, fortunately.
(Think of the Unix find command -- similar. It's very handy, but it's another language unto itself.)
[No flames, please. Yes, I know that dolist and all the rest are essentially no different from loop -- neither more nor less Lisp. But they are lispier than loop. Almost anything is lispier than loop.]

Here's a loop solution:
(loop for i in '(1 2 3)
      when (oddp i) collect (* i i))
Here's a functional solution:
(delq nil
      (mapcar (lambda(x) (and (oddp x) (* x x)))
              '(1 2 3)))
Here's a slightly different solution (be careful with mapcan - it's destructive):
(mapcan (lambda(x) (and (oddp x) (list (* x x))))
        '(1 2 3))

Clojure performance - why does the "ugly" "array swap trick" improve lcs performance?

This is a follow-up to #cgrand's answer to the question "Clojure Performance For Expensive Algorithms." I have been studying it and trying to apply some of his techniques to my own experimental Clojure perf tuning.
One thing I am wondering about is the "ugly" "arrays swap trick"
(set! curr prev)
(set! prev bak)
How and why does this improve performance over the original approach? I suspect that Clojure arrays are sometimes not true Java primitive arrays? If necessary, please cite Clojure core source in your answer.

As Chas mentions, loops with primitive hints are problematic. Clojure attempts to keep ints unboxed when you provide a hint, but it will (for the most part) silently fail when it can't honor the hints. Therefore he is forcing it to happen by creating a deftype with mutable fields and setting those inside the loop. It's an ugly hack, but it gets around a few limitations in the compiler.

In fact, it has to do with object allocation. Here's the original algorithm, with annotations:
(defn my-lcs [^objects a1 ^objects a2]
  (first
    (let [n (inc (alength a1))]
      (areduce a1 i
        ;; destructuring of the initial value
        [max-len ^ints prev ^ints curr]
        ;; initial value - a vector of [long int[] int[]]
        [0 (int-array n) (int-array n)]
        ;; The return value: a vector with the prev and curr swapped positions.
        [(areduce a2 j max-len (unchecked-long max-len) ;;
           (let [match-len
                 (if (.equals (aget a1 i) (aget a2 j))
                   (unchecked-inc (aget prev j))
                   0)]
             (aset curr (unchecked-inc j) match-len)
             (if (> match-len max-len)
               match-len
               max-len)))
         curr prev])))) ;; <= swaps prev and curr for the next iteration
As per the Java version, prev and curr are "reused" - a dynamic programming approach similar to what's described here. However, doing so requires allocating a new vector on every iteration, which is passed to the next reduction.
By placing prev and curr outside the areduce and making them ^:unsynchronized-mutable members of the enclosing IFn object, he avoids allocating a persistent vector on each iteration, and instead just pays the cost of boxing a long (possibly not even that).
So the "ugly" trick was not in a prior iteration of his Clojure code, but rather the Java version.