Clojure: building collections using `for` bindings - loops

I'm still fairly new to clojure, but a pattern that I find myself using frequently in it goes something like this: I have some collections and I want to build a new collection, usually a hash-map, out of them with some filters or conditions. There are always a few ways to do this: using loop or using reduce combined with map/filter for example, but I would like to implement something more like the for macro, which has great syntax for controlling what gets evaluated in the loop. I'd like to produce a macro with syntax that goes like this:
(defmacro build
"(build sym init-val [bindings...] expr) evaluates the given expression expr
over the given bindings (treated identically to the bindings in a for macro);
the first time expr is evaluated the given symbol sym is bound to the init-val
and every subsequent time to the previous expr. The return value is the result
of the final expr. In essence, the build macro is to the reduce function
as the for macro is to the map function.
Example:
(build m {} [x (range 4), y (range 4) :when (not= x y)]
(assoc m x (conj (get m x #{}) y)))
;; ==> {0 #{1 3 2}, 1 #{0 3 2}, 2 #{0 1 3}, 3 #{0 1 2}}"
[sym init-val [& bindings] expr]
`(...))
Looking at the for code in clojure.core, it's pretty clear that I don't want to re-implement its syntax myself (even ignoring the ordinary perils of duplicating code), but coming up with for-like behavior in the above macro is a lot trickier than I initially expected. I eventually came up with the following, but I feel that (a) this probably isn't terribly performant and (b) there ought to be a better, still clojure-y, way to do this:
(defmacro build
  [sym init-val bindings expr]
  `(loop [result# ~init-val, s# (seq (for ~bindings (fn [~sym] ~expr)))]
     (if s#
       (recur ((first s#) result#) (next s#))
       result#)))
;; or `(reduce #(%2 %1) ~init-val (for ~bindings (fn [~sym] ~expr)))
My specific questions:
Is there a built-in clojure method or library that solves this already, perhaps more elegantly?
Can someone who is more familiar with clojure performance give me an idea of whether this implementation is problematic and whether/how much I should be worried about performance, assuming that I may use this macro very frequently for relatively large collections?
Is there any good reason that I should use the loop over the reduce version of the macro above, or vice versa?
Can anyone see a better implementation of the macro?

Your reduce version was also my first approach based on the problem statement. I think it's nice and straightforward and I'd expect it to work very well, particularly since for will produce a chunked seq that reduce will be able to iterate over very quickly.
for generates functions to do output generation anyway and I wouldn't expect the extra layer introduced by the build expansion to be particularly problematic. It may still be worthwhile to benchmark this version based on volatile! as well:
(defmacro build [sym init-val bindings expr]
  `(let [box# (volatile! ~init-val)] ; AtomicReference would also work
     (doseq ~bindings
       (vreset! box# (let [~sym @box#] ~expr)))
     @box#))
Criterium is great for benchmarking and will eliminate any performance-related guesswork.
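For reference, the reduce-based variant mentioned in the question can be written out in full as below. This is only a sketch: the explicit `(fn [acc# f#] ...)` stands in for the `#(%2 %1)` shorthand purely for readability, and `neighbors` is a name made up for the usage example.

```clojure
;; Sketch of the reduce-based build macro: `for` yields one closure per
;; binding combination, and `reduce` threads the accumulator through them.
(defmacro build
  [sym init-val bindings expr]
  `(reduce (fn [acc# f#] (f# acc#))
           ~init-val
           (for ~bindings (fn [~sym] ~expr))))

;; Hypothetical usage, mirroring the docstring example from the question.
(def neighbors
  (build m {} [x (range 4), y (range 4) :when (not= x y)]
    (update m x (fnil conj #{}) y)))
```

Benchmarking this against the volatile! version with the same bindings should show whether the extra closure per step matters for your data sizes.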

I'd rather not reuse the example from your docstring directly, since it isn't idiomatic Clojure. But starting from plumbing.core's for-map, you can write a similar for-map-update:
(defn update!
  "Like update but for transients."
  ([m k f] (assoc! m k (f (get m k))))
  ([m k f x1] (assoc! m k (f (get m k) x1)))
  ([m k f x1 x2] (assoc! m k (f (get m k) x1 x2)))
  ([m k f x1 x2 & xs] (assoc! m k (apply f (get m k) x1 x2 xs))))
(defmacro for-map-update
  "Like 'for-map' for building maps but accepts a function as the value to build map values."
  ([seq-exprs key-expr val-expr]
   `(for-map-update ~(gensym "m") ~seq-exprs ~key-expr ~val-expr))
  ([m-sym seq-exprs key-expr val-expr]
   `(let [m-atom# (atom (transient {}))]
      (doseq ~seq-exprs
        (let [~m-sym @m-atom#]
          (reset! m-atom# (update! ~m-sym ~key-expr ~val-expr))))
      (persistent! @m-atom#))))
(for-map-update
  [x (range 4)
   y (range 4)
   :when (not= x y)]
  x (fnil #(conj % y) #{}))
;; => {0 #{1 3 2}, 1 #{0 3 2}, 2 #{0 1 3}, 3 #{0 1 2}}
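If no macro is wanted at all, the same map can be built with only core functions: let `for` emit the `[x y]` pairs and fold them with `reduce`. This is a minimal sketch, not part of plumbing; the names `pairs` and `grouped` are invented for illustration.

```clojure
;; `for` handles the bindings and the :when filter; it only emits pairs.
(def pairs
  (for [x (range 4), y (range 4) :when (not= x y)]
    [x y]))

;; `reduce` does the accumulation; fnil supplies #{} for a missing key.
(def grouped
  (reduce (fn [m [x y]] (update m x (fnil conj #{}) y))
          {}
          pairs))
```

The trade-off is one intermediate vector per pair, in exchange for keeping the filtering and the accumulation visibly separate.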

Related

Printing each number in new line in Scheme

I need a help to convert this Pascal code to Scheme code:
program reverseorder;
type
arraytype = array [1..5] of integer;
var
arr:arraytype;
i:integer;
begin
for i:=1 to 5 do
arr[i]:=i;
for i:=5 downto 1 do
writeln(arr[i]);
end.
I want to access a specific atom, and it seems there isn't an iteration method in Scheme.
There are many ways to tackle this problem. With the online interpreter you're using you'll be limited to vanilla Scheme, and the solution will be more verbose than needed, using recursion:
(define lst '(1 2 3 4 5))
(let loop ((rev (reverse lst)))
(when (not (null? rev))
(display (car rev))
(newline)
(loop (cdr rev))))
With Racket, which is aimed at beginners, you can write a much simpler (albeit non-standard) solution:
(define lst (range 1 6))
(for ([e (reverse lst)])
(displayln e))
Either way, notice that the procedure for reversing a list is already built in the language, and you don't need to reimplement it - naturally, it's called reverse. And if it wasn't obvious already, in Scheme we prefer to use lists, not arrays to represent a sequence of elements - it's recommended to stop thinking in terms of indexes, array lengths, etc. because that's not how things are done in Scheme.
If you don't care about returned value (it's #<undef>) and just want to produce output, you can use for-each:
(for-each print (reverse (list 1 2 3 4 5)))
Output:
5
4
3
2
1
Not idiomatic Scheme, but a literal translation of the Pascal code would be:
(let ([arr (make-vector 5)])
  ;; fill with 1..5, as the Pascal code does (vector indices are 0-based)
  (do ([i 0 (+ i 1)]) ((= i 5)) (vector-set! arr i (+ i 1)))
  (do ([i 4 (- i 1)]) ((negative? i))
    (display (vector-ref arr i))
    (newline)))

Clojure - Loop Variables - Immutability

I am trying to learn functional programming in Clojure. Many functional programming tutorials begin with the benefits of immutability, and one common example is the loop variable in imperative-style languages. In that respect, how does Clojure's loop-recur differ from them? For example:
(defn factorial [n]
(loop [curr-n n curr-f 1]
(if (= curr-n 1)
curr-f
(recur (dec curr-n) (* curr-f curr-n)))))
Aren't curr-n and curr-f mutable values, similar to loop variables in imperative-style languages?
As Thumbnail points out, using loop-recur in clojure has the same form and effect as a classic recursive function call. The only reason it exists is that it is much more efficient than pure recursion.
Since the recur can only occur in the tail position, you are guaranteed that the loop "variables" will never be needed again. Thus, you don't need to preserve them on the stack, so no stack is used (unlike nested function calls, recursive or not). The end result is that it looks & acts very similarly to an imperative loop in other languages.
The improvement compared to a Java-style for loop is that all "variables" are limited to "changing" only when initialized in the loop expression and when updated in the recur expression. No changes to the vars can occur in the body of the loop, nor anywhere else (such as embedded function calls which could mutate the loop vars in Java).
Because of these restrictions on where the "loop vars" can be mutated/updated, there are reduced opportunities for a bug to change them unintentionally. The cost of the restrictions is that the loop is not as flexible as a traditional Java-style loop.
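One way to see that the loop "variables" are fresh bindings rather than mutated cells is to capture them in closures. In this sketch (the function name `capture-each-iteration` is made up), each closure keeps the value of `i` from its own iteration; nothing is overwritten underneath it.

```clojure
(defn capture-each-iteration
  "Collects one zero-argument function per iteration, then calls them all."
  [n]
  (loop [i 0, thunks []]
    (if (< i n)
      ;; the closure sees this iteration's binding of i, which never changes
      (recur (inc i) (conj thunks (fn [] i)))
      (mapv (fn [thunk] (thunk)) thunks))))

(capture-each-iteration 3) ;; => [0 1 2]
```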
As with anything, it is up to you to decide when this cost-benefit tradeoff is a better choice than the other cost-benefit tradeoffs available. If you want a pure Java-style loop, it is easy to use a clojure atom to simulate a Java variable:
; Let clojure figure out the list of numbers & accumulate the result
(defn fact-range [n]
  (apply * (range 1 (inc n))))
(spyx (fact-range 4))
; Classical recursion uses the stack to figure out the list of
; numbers & accumulate the intermediate results
(defn fact-recur [n]
  (if (< 1 n)
    (* n (fact-recur (dec n)))
    1))
(spyx (fact-recur 4))
; Let clojure figure out the list of numbers; we accumulate the result
(defn fact-doseq [n]
  (let [result (atom 1)]
    (doseq [i (range 1 (inc n))]
      (swap! result * i))
    @result))
(spyx (fact-doseq 4))
; We figure out the list of numbers & accumulate the result
(defn fact-mutable [n]
  (let [result (atom 1)
        cnt (atom 1)]
    (while (<= @cnt n)
      (swap! result * @cnt)
      (swap! cnt inc))
    @result))
(spyx (fact-mutable 4))
(fact-range 4) => 24
(fact-recur 4) => 24
(fact-doseq 4) => 24
(fact-mutable 4) => 24
Even in the last case where we use atoms to emulate mutable variables in Java, at least each place we mutate something it is visibly marked with the swap! function, which makes it harder to miss "accidental" mutation.
P.S. If you wish to use spyx it is in the Tupelo library
Aren't curr-n and curr-f mutable values, similar to loop variables in imperative-style languages?
No. You can always rewrite a loop-recur as a recursive function call. For example, your factorial function can be rewritten ...
(defn factorial [n]
((fn whatever [curr-n curr-f]
(if (= curr-n 1)
curr-f
(whatever (dec curr-n) (* curr-f curr-n))))
n 1))
This is slower and subject to stack-overflow on big numbers.
When the call is actually made, recur overwrites the one-and-only stack frame instead of allocating a new one. This works only if the caller's stack frame is never referred to afterwards, which is what we call tail position.
loop is syntactic sugar: in clojure.core it is a macro that expands to the loop* special form. As in a let, earlier bindings are available to later ones.
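Whether loop is a macro, and whether its bindings behave like those of let, can be checked directly at the REPL. The sketch below only assumes a stock Clojure; `expansion` and `sequential` are names invented here.

```clojure
;; macroexpand-1 returns a different form only when the head is a macro.
;; With plain symbol bindings, the head of the expansion is loop*.
(def expansion (macroexpand-1 '(loop [x 1] x)))

;; Later bindings may refer to earlier ones, as in let.
(def sequential (loop [a 1, b (inc a)] [a b]))
```

Here `(first expansion)` is the symbol `loop*`, and `sequential` is `[1 2]`.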

Clojure loop inside let (global v local variable)

I was writing code that does the same thing as the reduce function in Clojure,
e.g. (reduce + [1 2 3 4]) = (+ (+ (+ 1 2) 3) 4).
(defn new-reduce [fn coll]
(def answer (get coll 0))
(loop [i 1]
(when (< i (count coll))
(def answer (fn answer (get coll i)))
(recur (inc i))))
answer)
In my code I used a global variable, because it was easier for me to understand that way. Apparently, people say it is better to replace the global variable with a local one introduced by let. So I tried:
(defn new-reduce [fn coll]
(let [answer (get coll 0)]
(loop [i 1]
(when (< i (count coll))
(fn answer (get coll i))
(recur (inc i))))))
To be honest, I am not really familiar with let, and even though I tried really simple code, it did not work. Can somebody help me fix this code and help me understand how let (local variables) really works? Thank you. (P.S. A really simple example with a loop inside a let would be great too.)
Let does not create local "variables", it gives names to values, and does not let you change them after giving them the name. So introducing a let is more like defining a local constant.
First I'll just add another item to the loop expression to store the value so far. Each time through the loop we will update this to incorporate the new information. This pattern is very common. I also needed to add a new argument to the function to hold the initial state (reduce as a concept needs this).
user> (defn new-reduce [function initial-value coll]
(loop [i 0
answer-so-far initial-value]
(if (< i (count coll))
(recur (inc i) (function answer-so-far (get coll i)))
answer-so-far)))
user> (new-reduce + 0 [1 2 3])
6
This moves the "global variable" into a name that is local to the loop expression and can be updated once per iteration, at the moment you jump back up to the top. Once the loop reaches the end of the collection, it returns the answer so far as the return value of the function rather than recurring again. Building your own reduce function is a great way to build an understanding of how to use reduce effectively.
There is a function that introduces true local variables, though it is very nearly never used in Clojure code. It's only really used in the runtime bootstrap code. If you are really curious, read up on binding very carefully.
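If you prefer to keep the original two-argument shape (no initial value, first element as the starting answer), the same loop pattern works on the sequence itself instead of an index. This is a sketch under that assumption; it also works on lists and lazy seqs, where get with an index would not.

```clojure
(defn new-reduce
  "Folds f over coll, starting from the first element of coll."
  [f coll]
  (loop [answer (first coll)
         remaining (rest coll)]
    (if (seq remaining)
      ;; both loop names get their new values at the recur, nowhere else
      (recur (f answer (first remaining)) (rest remaining))
      answer)))

(new-reduce + [1 2 3 4]) ;; => 10
```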
Here's a simple, functional solution that replicates the behavior of the standard reduce:
(defn reduce
  ([f [head & tail :as coll]]
   (if (empty? coll)
     (f)
     (reduce f head tail)))
  ([f init [head & tail :as coll]]
   (cond
     (reduced? init) @init
     (empty? coll) init
     :else (recur f (f init head) tail))))
There is no loop here, because the function itself serves as the recursion point. I personally find it easier to think about this recursively, but since we're using tail recursion with recur, you can think about it imperatively/iteratively as well:
1. If init is a signal to return early then return its value, otherwise go to step 2
2. If coll is empty then return init, otherwise go to step 3
3. Set init to the result of calling f with init and the first item of coll as arguments
4. Set coll to a sequence of all items in coll except the first one
5. Go to step 1
Actually, under the hood (with tail-call optimization and such), that's essentially what's really going on. I'd encourage you to compare these two expressions of the same solution to get a better idea of how to go about solving these sorts of problems in Clojure.
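The "signal to return early" in step 1 is what `reduced` produces. As a small illustration with the built-in reduce (the function name `sum-until-over` is hypothetical), the fold below stops as soon as the running total exceeds a limit, which is why it terminates even on an infinite range.

```clojure
(defn sum-until-over
  "Adds items of coll until the running total exceeds limit, then stops."
  [limit coll]
  (reduce (fn [acc x]
            (let [total (+ acc x)]
              (if (> total limit)
                (reduced total)   ; wrap the value to signal early return
                total)))
          0
          coll))

(sum-until-over 10 (range)) ;; => 15
```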

How do I translate the loop part of Common Lisp code into Clojure? ... functional orientation

How do I translate the loop part of this working Common Lisp (SBCL v.1.2.3) code into Clojure (v.1.6)? I am a bit frustrated after working on it for some hours/days without results. Somewhere I don't get this functional orientation I suppose ...
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Unconditional Entropy
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Probabilities
(setq list_occur_prob '()) ;; init
;; set probabilities for which we want to calculate the entropy
(setq list_occur_prob '(1/2 1/3 1/6)) ;;
;; Function to calculate the unconditional
;; entropy H = -sigma i=0,n (pi*log2(pi))
;; bits per symbol.
(setq entropy 0) ;; init
(setq entropy (loop for i in list_occur_prob
for y = (* (log_base2 i) i)
collect y
))
(setq entropy (* -1 (apply '+ entropy))) ;; change the sign
;; Print the unconditional entropy in bits per symbol.
(print entropy) ;; BTW, here the entropy is 1.4591479 bits per symbol.
Before we dive into the Clojure equivalent of the code, you should take some time to clean up the Common Lisp code. Using setq the way you're doing it is considered bad style at best and can lead to undefined consequences at worst: setq is intended to assign values to variables, but your variables list_occur_prob and entropy aren't defined (via defvar). In addition, this piece of code looks like you're assigning global variables (cf. defvar again), which are dynamic variables, which by convention should be marked with earmuffs, e.g. *entropy*.
However, for this small piece of code, you could just as well use local, non-dynamic variables, introduced via let like this (warning, I don't have any CL or Clojure environment handy):
(let ((list_occur_prob '(1/2 1/3 1/6)))
(loop for i in list_occur_prob
for y = (* (log_base 2 i) i)
collect y into acc
finally (return (* -1 (apply '+ acc)))))
There are ways to optimize the apply clause away into the loop:
(let ((list-occur-prob '(1/2 1/3 1/6)))
(- (loop for i in list-occur-prob
sum (* (log i 2) i))))
Daniel Neal has already shown you a map/reduce-based solution; here is one that is closer to the original looping construct, using a recursive approach:
(defn ent-helper [probs acc]
  (if (seq probs)
    (recur (rest probs)
           (conj acc (* (log_base 2 (first probs)) (first probs))))
    acc))
(let [probs [1/2 1/3 1/6]
      acc (ent-helper probs [])]
  (* -1 (apply + acc)))
We're using conj instead of collect to gather the results into the accumulator. The call to ent-helper, which is essentially triggered for all values of probs via the recur recursion call, takes an (initially empty) second parameter in which the values built up so far are collected. Once we've exhausted all probabilities, we simply return the collected values.
Again, summing up the values so far could be optimized into the loop, instead of mapping over the values.
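A sketch of that optimization: carry the running sum in the loop instead of collecting terms into a vector first. The helper `log-base` is an assumption here (the answer above leaves its log function undefined), as is the name `entropy`.

```clojure
;; Assumed helper: logarithm of x in the given base.
(defn log-base [base x]
  (/ (Math/log (double x)) (Math/log (double base))))

(defn entropy
  "Unconditional entropy in bits per symbol, summed inside the loop."
  [probs]
  (loop [ps probs, acc 0.0]
    (if (seq ps)
      (let [p (first ps)]
        (recur (rest ps) (+ acc (* p (log-base 2 p)))))
      (- acc))))   ; change the sign once, at the end

(entropy [1/2 1/3 1/6]) ;; approximately 1.459
```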
The key operation you need is map, which transforms a sequence using a function.
In the entropy example you gave, the following should work:
(def probabilities [1/2 1/3 1/6])
(defn log [base x]
(/ (Math/log x) (Math/log base)))
(defn entropy [probabilities]
(->> probabilities
(map #(* (log 2 %) %)) ; note - #(f %) is shorthand for (fn [x] (f x))
(reduce +)
(-)))
(entropy probabilities) ; => 1.459
When working with collections, the pipeline (thread-last) macro ->> is often used to show a sequence of operations clearly. I personally find it much easier to read than the nested bracket syntax, especially if there are lots of operations.
Here, we first map the pi * log2(pi) function over the sequence and then sum the results using (reduce +).
I would start with more functional Common Lisp code:
(- (reduce #'+
'(1/2 1/3 1/6)
:key (lambda (i)
(* (log i 2) i))))
You can write imperative code in Lisp, with lots of operations setting variable values, but it is not the best style.
Even a tight LOOP can look okay:
(- (loop for i in '(1/2 1/3 1/6)
sum (* (log i 2) i)))
I endorse the general flavor of schaueho's answer, but if you prefer you can get something closer to the "feel" of the looping approach with Clojure's for macro:
(apply - 0
(for [prob [1/2 1/3 1/6]]
(* (log prob 2) prob)))
I find this much easier to read than schaueho's version with manual recursion, and it also performs much better, in that it doesn't traverse the list twice, doesn't accumulate results into a temporary vector, and so on.
Note that (- (apply + xs)) is the same as (apply - 0 xs), although which one you find clearer is probably a matter of taste. Also, I'm assuming you already have a suitable log function defined elsewhere.
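To make the snippet above runnable, a log function taking the value first and the base second (the Common Lisp argument order) is needed; the one below is an assumption, not something the answer defines. The sketch also checks the stated equivalence of the two negation styles on the entropy terms.

```clojure
;; Assumed helper with Common Lisp argument order: (log x base).
(defn log [x base]
  (/ (Math/log (double x)) (Math/log (double base))))

(def terms
  (for [prob [1/2 1/3 1/6]]
    (* (log prob 2) prob)))

(def negate-sum (- (apply + terms)))     ; sum first, then negate
(def subtract-all (apply - 0 terms))     ; start at 0, subtract each term
```

Both values come out at roughly 1.459 bits per symbol.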

Clojure: How do I have a for loop stop when the value of the current variable matches that of my input?

Preface
Firstly, I'm new to Clojure and programming, so I thought I'd try to create a function that solves a non-trivial equation using my natural instincts. What resulted is the desire to find a square root.
The question
What's the most efficient way to stop my square-n-map-maker function from iterating past a certain point? I'd like to fix square-n-map-maker so that I can comment out the square-maker function which provides me with the results and format I currently want to see but not the ability to recall the square-root answer (insofar as I know).
I.e. I want it to stop when it is greater than or equal to my input value
My initial thought was that instead of a keyword list, I would want a map. But I'm having a very difficult time getting my function to give me a map. The whole reason I wanted a map, where one member of each pair is n and the other is n^2, is so that I could extract the actual square root from it and give it back to the user as the answer.
Any ideas on the best way to accomplish this? (below is the function I want to fix)
;; attempting to make a map so that I can comb over the
;; map later and recall a value that meets
;; my criteria to terminate and return result if (<= temp-var input)
(defn square-n-map-maker [input] (for [temp-var {remainder-culler input}]
(map list(temp-var) (* temp-var temp-var))
)
)
(square-n-map-maker 100) => clojure.lang.ArityException: Wrong number of args (0) passed to: MapEntry
AFn.java:437 clojure.lang.AFn.throwArity
AFn.java:35 clojure.lang.AFn.invoke
/Users/dbennett/Dropbox/Clojure Files/SquareRoot.clj:40 sqrt-range-high-end/square-n-map-maker[fn]
The following is the rest of my code
;; My idea on the best way to find a square root is simple.
;; If I want to find the square root of n, divide n in half
;; Then find all numbers in 0...n that return only a remainder of 0.
;; Then find the number that can divide by itself with a result of 1.
;; First I'll develop a function that works with evens and then odds
(defn sqrt-range-high-end [input] (/ input 2))
(sqrt-range-high-end 100) => 50
(defn make-sqrt-range [input] (range (sqrt-range-high-end (+ 1 input))))
(make-sqrt-range 100) =>(0 1 2 3 4 5 6 ... 50)
(defn zero-culler [input] (remove zero? (make-sqrt-range input)))
(zero-culler 100) =>(1 2 3 4 5 6 ... 50)
(defn odd-culler [input] (remove odd? (zero-culler input)))
(odd-culler 100) => (2 4 6 8 10...50)
(defn even-culler [input] (remove even? (zero-culler input)))
(even-culler 100) => (1 3 5 7...49)
(defn remainder-culler [input] (filter #(zero? (rem input %)) (odd-culler input)))
(remainder-culler 100) => (2 4 10 20 50)
(defn square-maker [input]
  (for [temp-var (remainder-culler input)]
    (list (keyword (str temp-var " " (* temp-var temp-var))))))
(square-maker 100) => ((:2 4) (:4 16) (:10 100) (:20 400) (:50 2500))
Read the Error Messages!
You're getting a little ahead of yourself! Your bug has nothing to do with getting for to stop "looping."
(defn square-n-map-maker [input] (for [temp-var {remainder-culler input}]
(map list(temp-var) (* temp-var temp-var))))
(square-n-map-maker 100) => clojure.lang.ArityException: Wrong number of args (0) passed to: MapEntry
AFn.java:437 clojure.lang.AFn.throwArity
AFn.java:35 clojure.lang.AFn.invoke
Pay attention to error messages. They are your friend. In this case, it's telling you that you are passing the wrong number of arguments to MapEntry (search for IPersistentMap). What is that?
{} creates a map literal. {:key :value :key2 :value2} is a map. Maps can be used as if they were functions:
> ({:key :value} :key)
:value
That accesses the entry in the map associated with key. Now, you created a map in your first line: {remainder-culler input}. You just mapped the function remainder-culler to the input. If you grab an item out of the map, it's a MapEntry. Every MapEntry can be used as a function, accepting an index as an argument, just like a Vector:
> ([:a :b :c :d] 2)
:c
Your for is iterating over all MapEntries in {remainder-culler input}, but there's only one: [remainder-culler input]. This MapEntry gets assigned to temp-var.
Then in the next line, you wrapped this MapEntry in parentheses: (temp-var). This forms an S-expression, and expressions are evaluated assuming that the first item in the expression is a function/procedure. So it expects an index (valid indices here would be 0 and 1). But you pass no arguments to temp-var. Therefore: clojure.lang.ArityException: Wrong number of args.
Also, note that map is not a constructor for a Map.
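The pieces described above can be tried in isolation. This sketch uses throwaway names (`m`, `entry`, `built`) and shows a map called as a function, a MapEntry called with an index, and `into` as one way to actually construct a map from pairs.

```clojure
(def m {:key :value})

(def via-map (m :key))        ; map used as a function of a key
(def entry (first m))         ; iterating a map yields MapEntry values
(def via-entry-0 (entry 0))   ; MapEntry used as a function of an index
(def via-entry-1 (entry 1))

;; `map` transforms sequences; `into {}` builds a map from [k v] pairs.
(def built (into {} [[2 4] [4 16]]))
```

Calling `(entry)` with no arguments reproduces the ArityException from the question.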
Constructing a map
Now, on to your problem. Your square-maker is returning a list nicely formatted for a map, but it's made up of nested lists.
Try this:
(apply hash-map (flatten (square-maker 100)))
Read this page and this page to see how it works.
If you don't mind switching the order of the keys and values, you can use the group-by that I mentioned before:
(defn square-maker [input]
(group-by #(* % %) (remainder-culler input)))
(square-maker 100) => {4 [2], 16 [4], 100 [10], 400 [20], 2500 [50]}
Then you can snag the value you need like so: (first ((square-maker 100) 100)). This uses the map-as-function feature I mentioned above.
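If you want the map in the original n to n^2 direction, `for` can emit `[n n^2]` pairs and `into` can collect them. This is a sketch with hypothetical names (`squares-map`, `root-from-map`); the candidate list is passed in explicitly so the example does not depend on the culler functions.

```clojure
(defn squares-map
  "Maps each candidate n to its square."
  [candidates]
  (into {} (for [n candidates] [n (* n n)])))

(defn root-from-map
  "Returns the candidate whose square equals input, or nil if there is none."
  [input candidates]
  (some (fn [[n sq]] (when (= sq input) n))
        (squares-map candidates)))

(root-from-map 100 [2 4 10 20 50]) ;; => 10
```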
Loops
If you really want to stick with the intuitive looping concept, I would use loop, not for. for is lazy, which means that there is neither means nor reason (if you use it correctly) to "stop" it -- it doesn't actually do any work unless you ask for a value from it, and it only does the work it must to give you the value you asked for.
(defn square-root [input]
(let [candidates (remainder-culler input)]
(loop [i 0]
(if (= input (#(* % %) (nth candidates i)))
(nth candidates i)
(recur (inc i))))))
The embedded if determines when the looping will cease.
But notice that loop only returns its final value (acquaint yourself with loop's documentation if that sentence doesn't make sense to you). If you want to build up a hash-map for later analysis, you'd have to do something like (loop [i 0, mymap {}] .... But why analyze later if it can be done right away? :-)
Now, that's a pretty fragile square-root function, and it isn't hard to break: feed it 101 and no candidate ever matches, so nth runs past the end of candidates and throws instead of returning an answer. I leave it as an exercise to you to fix it (this is all an academic exercise anyway, right?).
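One possible shape for a sturdier version, offered only as a sketch: walk the candidate sequence itself rather than indexing into it, and give up once the squares overshoot the input. It deliberately uses a plain `(range 1 (inc input))` instead of remainder-culler so that it is self-contained.

```clojure
(defn square-root
  "Returns the integer square root of input, or nil if input is not a perfect square."
  [input]
  (loop [candidates (range 1 (inc input))]
    ;; when-let ends the loop with nil once candidates is exhausted
    (when-let [[n & more] (seq candidates)]
      (let [sq (* n n)]
        (cond
          (= sq input) n          ; found it
          (> sq input) nil        ; overshot, no integer root exists
          :else (recur more))))))

(square-root 100) ;; => 10
(square-root 101) ;; => nil
```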
I hope that helps you along your way, once again. I think this is a great problem for learning a new language. I should say, for the record, though, that once you are feeling comfortable with your solution, you should search for other Clojure solutions to the problem and see if you can understand how they work -- this one may be "intuitive," but it is not well-suited to Clojure's tools and capabilities. Looking at other solutions will help you grasp Clojure's world a bit better.
For more reading:
Imperative looping with side-effects.
How to position recur with loop
The handy into
Finally, this "not constructive" list of common Clojure mistakes
for is not a loop, and it's not iterating. It lazily creates a list comprehension, and it only realizes values when required (in this case, when the repl tries to print the result of the evaluation). There are two usual ways to do what you want: one is to wrap square-maker in
(first (filter some-predicate (square-maker number))) to obtain the first element in the sequence that complies with some-predicate. E.g.
(first (filter #(and (odd? %) (< 50 %)) (range)))
=> 51
The above won't realize the infinite range, obviously.
The other one is not to use a list comprehension and do it in a more imperative way: run an actual loop with a termination condition (see loop and recur).
Example:
(loop [x 0]
(if (and (odd? x) (> x 50))
x
(recur (inc x))))
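A third option that stays with `for` is to let the comprehension be infinite and cut it off from the outside with take-while. In this sketch (the name `squares-up-to` is invented), the pairs are produced lazily, so only as many as needed are computed.

```clojure
(defn squares-up-to
  "Lazy seq of [n n^2] pairs for n = 1, 2, ... while n^2 <= input."
  [input]
  (take-while (fn [[_ sq]] (<= sq input))
              (for [n (iterate inc 1)]
                [n (* n n)])))

(squares-up-to 30) ;; => ([1 1] [2 4] [3 9] [4 16] [5 25])
```

The last pair tells you whether the input is a perfect square: its second element equals the input exactly when an integer root exists.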

Resources