Efficiency of assign-and-compare in the same statement in Smalltalk - benchmarking

A previous SO question raised the issue of which idiom is better in terms of execution-time efficiency:
[ (var := exp) > 0 ] whileTrue: [ ... ]
versus
[ var := exp.
var > 0 ] whileTrue: [ ... ]
Intuitively, it seems the first form could be more efficient during execution, because it avoids fetching one additional statement (as in the second form). Is this true in most Smalltalks?
Trying with two naive benchmarks:
| var acc |
var := 10000.
acc := 0.
[ [ (var := var / 2) < 0 ] whileTrue: [ acc := acc + 1 ] ] bench.
| var acc |
var := 10000.
acc := 0.
[ [ var := var / 2. var < 0 ] whileTrue: [ acc := acc + 1 ] ] bench
This reveals no major difference between the two versions.
Any other opinions?

So the question is: What should I use to achieve a better execution time?
temp := <expression>.
temp > 0
or
(temp := <expression>) > 0
In cases like this one, the best way to arrive at a conclusion is to go down one step in the level of abstraction. In other words, we need a better understanding of what's happening behind the scenes.
The executable part of a CompiledMethod is represented by its bytecodes. When we save a method, what we are doing is compiling it into a series of low level instructions for the VM to be able to execute the method every time it is invoked. So, let's take a look at the bytecodes of each one of the cases above.
Since <expression> is the same in both cases, let's reduce it drastically to eliminate noise. Also, let's put our code in a method so as to have a CompiledMethod to play with:
Object >> m
| temp |
temp := 1.
temp > 0
Now, let's look through CompiledMethod and its superclasses for some message that would show us the bytecodes of Object >> #m. The selector should contain the subword bytecodes, right?
...
Here it is #symbolicBytecodes! Now let's evaluate (Object >> #m) symbolicBytecodes to get:
pushConstant: 1
popIntoTemp: 0
pushTemp: 0
pushConstant: 0
send: >
pop
returnSelf
Note, by the way, how our temp variable has been renamed to Temp: 0 in the bytecode language.
Now let's repeat with the other version, (temp := 1) > 0, and get:
pushConstant: 1
storeIntoTemp: 0
pushConstant: 0
send: >
pop
returnSelf
The difference is
popIntoTemp: 0
pushTemp: 0
versus
storeIntoTemp: 0
What this reveals is that the two cases handle temp differently with respect to the stack. In the first case, the result of our <expression> is popped from the execution stack into temp, and then temp is pushed again to restore the stack: a pop followed by a push of the same thing. In the second case, instead, storeIntoTemp: writes the value into temp while leaving it on the stack, so no pop or push happens at all.
So the conclusion is that the first case generates two cancelling instructions: a pop followed by a push.
This also explains why the difference is so hard to measure: push and pop instructions have direct translations into machine code and the CPU will execute them really fast.
Note, however, that nothing prevents the compiler from automatically optimizing the code by recognizing that pop + push is equivalent to storeInto. With such an optimization, both Smalltalk snippets would compile to exactly the same bytecodes.
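The cancelling pop/push can be illustrated with a toy stack machine. This is a Python sketch of the idea only, not real Smalltalk VM code; the opcode names are borrowed from the listings above.

```python
# Toy stack machine illustrating why popIntoTemp: followed by pushTemp:
# has the same net effect as a single storeIntoTemp:.

def run(bytecodes):
    stack, temps = [], {}
    for op, arg in bytecodes:
        if op == "pushConstant":
            stack.append(arg)
        elif op == "popIntoTemp":     # pop top of stack into a temp
            temps[arg] = stack.pop()
        elif op == "pushTemp":        # push a temp back onto the stack
            stack.append(temps[arg])
        elif op == "storeIntoTemp":   # write the temp, leave the stack alone
            temps[arg] = stack[-1]
    return stack, temps

# First form: pop followed by push of the same value
s1, t1 = run([("pushConstant", 1), ("popIntoTemp", 0), ("pushTemp", 0)])
# Second form: a single store
s2, t2 = run([("pushConstant", 1), ("storeIntoTemp", 0)])
assert (s1, t1) == (s2, t2)  # identical stack and temps afterwards
```

Either way, the machine ends with 1 on the stack and 1 in temp 0, which is why the measurable difference is so small.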
Now you should be able to decide which form you prefer. In my opinion, such a decision should only take into account the programming style you like better. Execution time is irrelevant here because the difference is minimal, and it could easily be reduced to zero by implementing the optimization we just discussed. By the way, that would be an excellent exercise for anyone willing to explore the low-level realms of the unparalleled Smalltalk language.

Related

Loops in Lean programming language

I'm starting to learn about Lean programming language https://leanprover.github.io
I've found out that there are functions, structures, if/else, and other common programming commands.
However, I haven't found anything to deal with loops. Is there a way of iterating or repeating a block of code in Lean, something similar to for or while in other languages? If so, please show the syntax or an example.
Thank you in advance.
Like other functional programming languages, loops in Lean are primarily done via recursion. For example:
-- lean 3
def numbers : ℕ → list ℕ
| 0 := []
| (n+1) := n :: numbers n
This is a bit of a change of mindset if you aren't used to it. See: Haskell from C: Where are the for Loops?
To complicate matters, Lean makes a distinction between general recursion and structural recursion. The above example uses structural recursion on ℕ, so we know that it always halts. Non-halting programs can lead to inconsistency in a DTT-based theorem prover like Lean, so general recursion has to be strictly controlled. You can opt in to it using the meta keyword:
-- lean 3
meta def foo : ℕ → ℕ
| n := if n = 100 then 0 else foo (n + 1) + 1
In Lean 4, the do-block syntax includes a for command, which can be used to write for loops in a more imperative style. Note that they are still represented as tail-recursive functions under the hood, and there are some typeclasses powering the desugaring. (Also, you need the partial keyword instead of meta for general recursion in Lean 4.)
-- lean 4
partial def foo (n : Nat) : Nat :=
  if n = 100 then 0 else foo (n + 1) + 1

def mySum (n : Nat) : Nat := Id.run do
  let mut acc := 0
  for i in [0:n] do
    acc := acc + i
  pure acc
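As a quick sanity check of the definitions above, you can run them in the editor with #eval (the expected values below are mine, assuming [0:n] ranges over 0 through n-1):

```lean
-- lean 4
#eval mySum 5  -- 0+1+2+3+4 = 10
#eval foo 42   -- counts up to 100, giving 58
```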

Summing array elements in Ruby

An exercise in Coderbyte is supposed to determine if some subset of integers in an array sum to the largest number in the array.
The following code seems to work on my computer, but when I submit it online, it seems to cause an endless loop. (There's never any output, regardless of the argument passed).
def arr_add?(arr)
  a = arr.sort
  lgst = a.pop
  size = a.size
  result = false
  while size > 1
    a.combination(size) {|c| result |= (c.inject {|r, a| r + a} == lgst)}
    size -= 1
  end
  result.to_s
end
arr_add?([1, 2, 3, 4, 10, 14])
arr_add?([1, 2, 3, 4, 10, 14])
Any ideas why this might be the case?
I suspect that you are actually not running into an endless loop; rather, the program just takes a really long time, because of the inefficiency of your algorithm.
def ArrayAdditionI(arr)
  arr_size = arr.size
  ary = arr.sort
  largest = ary.pop
  ary_size = arr_size - 1
  combination_size = ary_size
  result = false
  while combination_size > 1
    ary.combination(combination_size) {|combination|
      result |= (combination.inject(:+) == largest)
    }
    combination_size -= 1
  end
  result.to_s
end
I introduced a new variable and renamed some others, so that it becomes easier to talk about the algorithm. I also reformatted it, to make the three nested "loops" more obvious.
Let's take a look at the algorithm.
The outer while loop is executed ary_size - 1 == arr_size - 2 times, with combination_size ranging from 2 to ary_size == arr_size - 1.
The combination "loop" is executed ary_size choose combination_size times, that's … well, a very quickly growing number.
The innermost "loop" (the operation performed by combination.inject) is executed combination_size - 1 times.
This gives a total execution count for the innermost operation of: the sum, for combination_size from 2 to arr_size - 1, of (arr_size - 1 choose combination_size) × (combination_size - 1).
In Wolfram Language, that's Sum[Binomial[a-1, c]*(c-1), {c, 2, a-1}], which Wolfram Alpha tells us is 2^(a-2) (a-3)+1, which is in O(2^n).
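We can check that closed form numerically with a quick Ruby sketch (binomial, inject_count, and closed_form are hypothetical helper names of mine, not from the original code):

```ruby
# Count executions of the innermost inject operation:
# for each combination_size c from 2 to a-1 there are C(a-1, c)
# combinations, each performing c-1 additions.
def binomial(n, k)
  (0...k).reduce(1) { |acc, i| acc * (n - i) / (i + 1) }
end

def inject_count(a)
  (2..a - 1).sum { |c| binomial(a - 1, c) * (c - 1) }
end

def closed_form(a)
  2**(a - 2) * (a - 3) + 1
end

[10, 15, 20, 28].each do |a|
  raise "mismatch at #{a}" unless inject_count(a) == closed_form(a)
end
puts inject_count(10)  # => 1793
puts inject_count(28)  # => 1677721601
```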
Playing around with the numbers a bit:
for 10 items, we have 1793 executions of the inject operation
for 15 items, we already have 98 305
for 20 items, we have 4 456 449
at 28 items, we cross the threshold to a billion operations: 1 677 721 601
for 1000 items, which I suspect is a somewhat reasonable input size CoderBytes might use, we have 2 670 735 203 411 771 297 463 949 434 782 054 512 824 301 493 176 042 516 553 547 843 013 099 994 928 903 285 314 296 959 198 121 926 383 029 722 247 001 218 461 778 959 624 588 092 753 669 155 960 493 619 769 880 691 017 874 939 573 116 202 845 311 796 007 113 080 079 901 646 833 889 657 798 860 899 142 814 122 011 828 559 707 931 456 870 722 063 370 635 289 362 135 539 416 628 419 173 512 766 291 969 operations. Oops.
Try your algorithm with arrays of length 5, 10, 15 (all instantaneous), 20 (a noticeable pause), and then 23, 24, 25 to get a feel for just how quickly the runtime grows.
Assume that you could build a CPU which executes the inner loop in a single instruction. Further assume that a single instruction takes only one Planck time unit (i.e. the CPU has a frequency of roughly 20 000 000 000 000 000 000 000 000 000 000 THz). Further assume that every single particle in the observable universe is such a CPU. It would still take more than the current age of the universe to execute your algorithm for an array of not even 500 items.
Note that with most of these programming puzzles, they are not actually programming puzzles, they are mathematics puzzles. They usually require a mathematical insight, in order to be able to solve them efficiently. Or, in this case, recognizing that it is the Subset sum problem, which is known to be NP-complete.
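Since the values here are small non-negative integers, the standard pseudo-polynomial dynamic programming solution for subset sum applies. A sketch (arr_addition is my own hypothetical helper name; it assumes non-negative integers, which is what these puzzles typically provide):

```ruby
# Subset-sum via dynamic programming: O(n * target) work instead of O(2^n).
# Assumes the integers are non-negative.
def arr_addition(arr)
  rest = arr.dup
  target = rest.delete_at(rest.index(rest.max))
  # reachable[s] is true iff some subset of `rest` sums to s
  reachable = Array.new(target + 1, false)
  reachable[0] = true
  rest.each do |x|
    next if x > target
    # iterate downwards so each element is used at most once
    target.downto(x) { |s| reachable[s] ||= reachable[s - x] }
  end
  reachable[target].to_s
end

puts arr_addition([1, 2, 3, 4, 10, 14])  # => "true" (4 + 10 == 14)
```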
By the way, as a matter of style, here is (a slight variation of) your algorithm written in idiomatic Ruby style. As you can see, in idiomatic Ruby, it almost becomes a 1:1 translation of the English problem statement into code.
While it is asymptotically just as inefficient as your algorithm, it breaks early, as soon as the answer is true (unlike yours, which will just keep running even after it has already found a solution). (any? will do that for you automatically.)
def ArrayAdditionI(arr)
  largest = arr.delete_at(arr.index(arr.max))
  1.upto(arr.size).any? {|combination_size|
    arr.combination(combination_size).any? {|combination|
      combination.inject(:+) == largest
    }
  }.to_s
end
This is an alternative interpretation of the (unclear) problem statement:
def ArrayAdditionI(arr)
  2.upto(arr.size).any? {|combination_size|
    arr.combination(combination_size).any? {|combination|
      combination.inject(:+) == arr.max
    }
  }.to_s
end
The code above is valid Ruby code, and the result is "true".
It is perhaps a bit unusual in that explicit while loops are somewhat rare in Ruby, but since it is valid Ruby code, it should work on that remote site too.
Contact whoever runs the online Ruby interpreter at Coderbyte - their version appears to be incompatible with MRI Ruby.
Your code counts down; perhaps have a look at 10.downto(1) - replace the bounds with variables as appropriate.

Find conditional evaluation in for loop using libclang

I'm using clang (via libclang via the Python bindings) to put together a code-review bot. I've been making the assumption that all FOR_STMT cursors will have 4 children: INIT, EVAL, INC, and BODY:
for( INIT; EVAL; INC )
BODY;
which would imply that I could check the contents of the evaluation expression with something in python like:
forLoopComponents = [ c for c in forCursor.get_children() ]
assert( len( forLoopComponents ) == 4 )
initExpressionCursor = forLoopComponents[ 0 ]
evalExpressionCursor = forLoopComponents[ 1 ]
incExpressionCursor = forLoopComponents[ 2 ]
bodyExpressionCursor = forLoopComponents[ 3 ]
errorIfContainsAssignment( evalExpressionCursor ) # example code style rule
This approach seems... less than great to begin with, but I just accepted it as a result of libclang (and the Python bindings especially) being rather sparse. However, I've recently noticed that a loop like:
for( ; a < 4; a-- )
;
will only have 3 children -- and the evaluation will now be the first one rather than the second. I had always assumed that libclang would just return the NULL_STMT for any unused parts of the FOR_STMT...clearly, I was wrong.
What is the proper approach for parsing the FOR_STMT? I can't find anything useful for this in libclang.
UPDATE: Poking through the libclang source, it looks like these 4 components are dumbly added from the clang::ForStmt class using a visitor object. The ForStmt object should be returning null statement objects, but some layer somewhere seems to be stripping these out of the visited nodes vector...?
Same here; as a workaround I replaced the first empty statement with a dummy int foo=0 statement.
I can imagine a solution which uses Cursor's get_tokens to match the parts of the statement.
The function get_tokens can help in situations where clang's cursor API is not enough.
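One way to implement the get_tokens idea is to split the for-header's token spellings on top-level semicolons, so an empty init simply comes back as an empty part. This is a hypothetical sketch; the clang-specific piece is stubbed out with a plain list of strings, and with real libclang you would obtain the tokens from the cursor (e.g. `[t.spelling for t in for_cursor.get_tokens()]`).

```python
# Sketch: recover the init / condition / increment parts of a C for-header
# by splitting its tokens on semicolons at parenthesis depth 1.

def split_for_header(tokens):
    # header shape: for ( init ; cond ; inc ) body
    start = tokens.index("(")
    depth, parts, current = 0, [], []
    for tok in tokens[start:]:
        if tok == "(":
            depth += 1
            if depth == 1:
                continue          # skip the header's opening paren
        elif tok == ")":
            depth -= 1
            if depth == 0:
                break             # header's closing paren: stop
        if tok == ";" and depth == 1:
            parts.append(current) # top-level semicolon separates parts
            current = []
        else:
            current.append(tok)
    parts.append(current)
    return parts  # [init_tokens, cond_tokens, inc_tokens]

# `for( ; a < 4; a-- ) ;` -- the empty init comes back as an empty list
tokens = ["for", "(", ";", "a", "<", "4", ";", "a", "--", ")", ";"]
init, cond, inc = split_for_header(tokens)
assert init == []
assert cond == ["a", "<", "4"]
assert inc == ["a", "--"]
```

Because it tracks parenthesis depth, calls inside the condition (e.g. `i < f(n)`) do not confuse the split.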

Modeling memory access on Z3

I'm modelling a program's memory accesses using Z3 and I have a question about performance that I'd like to share.
I wanted to model in a compact way something like:
memset(dst, 0, 1000);
My first try was to use the array theory, but that meant to either create a thousand terms like (assert (and (= (select mem 0) 0) (= (select mem 1) 0) ... or a thousand similar stores or a quantified formula:
(forall (x Int) (implies (and (>= x 0) (< x 1000)) (= (select mem x) 0))
But I was told to avoid quantifiers while using arrays.
Next idea was to define a UF:
(define-fun newmemfun ((idx Int)) Int (
ite (and (>= idx 0) (< idx 1000)) 0 (prevmemfun idx)
))
But that means I need to define a new function for each memory write (even for individual store operations, not just bulk stores like memset or memcpy), which ends up creating a deeply nested ITE structure that even keeps "old" values for the same index. i.e.:
mem[0] = 1;
mem[0] = 2;
would be:
(ite (= idx 0) 2 (ite (= idx 0) 1 ...
This is functionally correct, but the size of the expression (and, I guess, the generated AST for it) tends to grow very fast, and I'm not sure whether Z3 is optimized to detect and handle this case.
So the question is: what would be the most performant way to encode memory operations that can cope both with large bulk stores like the example above and with individual stores?
Thanks,
pablo.
PS: non-closed and non-matching parenthesis intended :P.
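One compact alternative worth noting (a sketch of mine, not something raised in the thread): Z3 supports constant arrays via (as const ...), which expresses "every index maps to 0" as a single term, with individual writes as ordinary store terms on top of it. In SMT-LIB, using bitvector addresses and data:

```smt
; whole-memory zero-initialization as one constant-array term
(declare-const mem0 (Array (_ BitVec 32) (_ BitVec 8)))
(assert (= mem0 ((as const (Array (_ BitVec 32) (_ BitVec 8))) #x00)))

; mem[0] = 1 followed by mem[0] = 2: the later store shadows the earlier one
(define-const mem1 (Array (_ BitVec 32) (_ BitVec 8)) (store mem0 #x00000000 #x01))
(define-const mem2 (Array (_ BitVec 32) (_ BitVec 8)) (store mem1 #x00000000 #x02))

(assert (= (select mem2 #x00000000) #x02))
(check-sat)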
Without knowing a bit more about your end goal, aside from modeling memory accesses (e.g., are you going to be doing verification, test case generation, etc.?), it's somewhat hard to answer, as you have many options. However, you may have the most flexibility to control performance issues if you rely on one of the APIs. For example, you can define your own memory accesses as follows (link to z3py script: http://rise4fun.com/Z3Py/gO6i ):
address_bits = 7
data_bits = 8
s = Solver()
# mem is a list of length program step, of a list of length 2^address_bits of bitvectors of size 2^data_bits
mem =[]
# modify a single address addr to value at program step step
def modifyAddr(addr, value, step):
    mem.append([]) # add new step
    for i in range(0,2**address_bits):
        mem[step+1].append( BitVec('m' + str(step + 1) + '_' + str(i), data_bits) )
        if i != addr:
            s.add(mem[step+1][i] == mem[step][i])
        else:
            s.add(mem[step+1][i] == value)
# set all memory addresses to a specified value at program step step
def memSet(value, step):
    mem.append([])
    for i in range(0,2**address_bits):
        mem[step+1].append( BitVec('m' + str(step + 1) + '_' + str(i), data_bits) )
        s.add(mem[step+1][i] == value)
modaddr = 23 # example address
step = -1
# initialize all addresses to 0
memSet(0, step)
step += 1
print s.check()
for i in range(0,step+1): print s.model()[mem[i][modaddr]] # print all step values for modaddr
modifyAddr(modaddr,3,step)
step += 1
print s.check()
for i in range(0,step+1): print s.model()[mem[i][modaddr]]
modifyAddr(modaddr,4,step)
step += 1
print s.check()
for i in range(0,step+1): print s.model()[mem[i][modaddr]]
modifyAddr(modaddr,2**6,step)
step += 1
print s.check()
for i in range(0,step+1): print s.model()[mem[i][modaddr]]
memSet(1,step)
step += 1
print s.check()
for i in range(0,step+1): print s.model()[mem[i][modaddr]]
for a in range(0,2**address_bits): # set all address values to their address number
    modifyAddr(a,a,step)
    step += 1
print s.check()
print "values for modaddr at all steps"
for i in range(0,step+1): print s.model()[mem[i][modaddr]] # print all values at each step for modaddr
print "values at final step"
for i in range(0,2**address_bits): print s.model()[mem[step][i]] # print all memory addresses at final step
This naive implementation allows you to either (a) set all memory addresses to some value (like your memset), or (b) modify a single memory address, constraining all other addresses to have the same value. For me, it took a few seconds to run and encoded about 128 steps of 128 addresses, so it had around 20000 bitvector expressions of 8 bits each.
Now, depending on what you are doing (e.g., do you allow atomic writes to several addresses like this memset, or do you want to model them all as individual writes?), you could add further functions, like modifying a subset of addresses to some values in a program step. This will allow you some flexibility to trade off modeling accuracy for performance (e.g., atomic writes to blocks of memory versus modifying single addresses at a time, which will run into performance problems). Also, nothing about this implementation requires the APIs; you could encode this as an SMT-LIB file as well, but you will probably have more flexibility (e.g., let's say you want to interact with models to constrain future sat checks) if you use one of the APIs.

Why does Lua have no "continue" statement?

I have been dealing a lot with Lua in the past few months, and I really like most of the features, but I'm still missing something among them:
Why is there no continue?
What workarounds are there for it?
In Lua 5.2 the best workaround is to use goto:
-- prints odd numbers in [|1,10|]
for i=1,10 do
  if i % 2 == 0 then goto continue end
  print(i)
  ::continue::
end
This is supported in LuaJIT since version 2.0.1
The way that the language manages lexical scope creates issues with including both goto and continue. For example,
local a=0
repeat
  if f() then
    a=1 --change outer a
  end
  local a=f() -- inner a
until a==0 -- test inner a
The declaration of local a inside the loop body masks the outer variable named a, and the scope of that local extends across the condition of the until statement so the condition is testing the innermost a.
If continue existed, it would have to be restricted semantically to be only valid after all of the variables used in the condition have come into scope. This is a difficult condition to document to the user and enforce in the compiler. Various proposals around this issue have been discussed, including the simple answer of disallowing continue with the repeat ... until style of loop. So far, none have had a sufficiently compelling use case to get them included in the language.
The work around is generally to invert the condition that would cause a continue to be executed, and collect the rest of the loop body under that condition. So, the following loop
-- not valid Lua 5.1 (or 5.2)
for k,v in pairs(t) do
  if isstring(k) then continue end
  -- do something to t[k] when k is not a string
end
could be written
-- valid Lua 5.1 (or 5.2)
for k,v in pairs(t) do
  if not isstring(k) then
    -- do something to t[k] when k is not a string
  end
end
It is clear enough, and usually not a burden unless you have a series of elaborate culls that control the loop operation.
You can wrap the loop body in an additional repeat until true and then use do break end inside it for the effect of continue. Naturally, you'll need to set up additional flags if you also intend to really break out of the loop as well.
This will loop 5 times, printing 1, 2, and 3 each time.
for idx = 1, 5 do
  repeat
    print(1)
    print(2)
    print(3)
    do break end -- goes to next iteration of for
    print(4)
    print(5)
  until true
end
This construction even translates to a literal single JMP opcode in Lua bytecode!
$ luac -l continue.lua
main <continue.lua:0,0> (22 instructions, 88 bytes at 0x23c9530)
0+ params, 6 slots, 0 upvalues, 4 locals, 6 constants, 0 functions
1 [1] LOADK 0 -1 ; 1
2 [1] LOADK 1 -2 ; 3
3 [1] LOADK 2 -1 ; 1
4 [1] FORPREP 0 16 ; to 21
5 [3] GETGLOBAL 4 -3 ; print
6 [3] LOADK 5 -1 ; 1
7 [3] CALL 4 2 1
8 [4] GETGLOBAL 4 -3 ; print
9 [4] LOADK 5 -4 ; 2
10 [4] CALL 4 2 1
11 [5] GETGLOBAL 4 -3 ; print
12 [5] LOADK 5 -2 ; 3
13 [5] CALL 4 2 1
14 [6] JMP 6 ; to 21 -- Here it is! If you remove do break end from code, result will only differ by this single line.
15 [7] GETGLOBAL 4 -3 ; print
16 [7] LOADK 5 -5 ; 4
17 [7] CALL 4 2 1
18 [8] GETGLOBAL 4 -3 ; print
19 [8] LOADK 5 -6 ; 5
20 [8] CALL 4 2 1
21 [1] FORLOOP 0 -17 ; to 5
22 [10] RETURN 0 1
Straight from the designer of Lua himself:
Our main concern with "continue" is that there are several other control structures that (in our view) are more or less as important as "continue" and may even replace it. (E.g., break with labels [as in Java] or even a more generic goto.) "continue" does not seem more special than other control-structure mechanisms, except that it is present in more languages. (Perl actually has two "continue" statements, "next" and "redo". Both are useful.)
The first part is answered in the FAQ as slain pointed out.
As for a workaround, you can wrap the body of the loop in a function and return early from that, e.g.
-- Print the odd numbers from 1 to 99
for a = 1, 99 do
(function()
if a % 2 == 0 then
return
end
print(a)
end)()
end
Or if you want both break and continue functionality, have the local function perform the test, e.g.
local a = 1
while (function()
if a > 99 then
return false; -- break
end
if a % 2 == 0 then
return true; -- continue
end
print(a)
return true; -- continue
end)() do
a = a + 1
end
I've never used Lua before, but I Googled it and came up with this:
http://www.luafaq.org/
Check question 1.26.
This is a common complaint. The Lua authors felt that continue was only one of a number of possible new control flow mechanisms (the fact that it cannot work with the scope rules of repeat/until was a secondary factor.)
In Lua 5.2, there is a goto statement which can be easily used to do the same job.
Lua is a lightweight scripting language that aims to stay as small as possible. For example, many unary operations such as pre/post increment are not available.
Instead of continue, you can use goto like
arr = {1,2,3,45,6,7,8}
for key,val in ipairs(arr) do
  if val > 6 then
    goto skip_to_next
  end
  -- perform some calculation
  ::skip_to_next::
end
We can achieve it as below; this will skip even numbers:
local len = 5
for i = 1, len do
  repeat
    if i%2 == 0 then break end
    print(" i = "..i)
    break
  until true
end
Output:
i = 1
i = 3
i = 5
We have encountered this scenario many times, and we simply use a flag to simulate continue. We try to avoid the use of goto statements as well.
Example: the code intends to print the statements from i=1 to i=10, except i=3. In addition, it also prints "loop start", "loop end", "if start", and "if end" to simulate other nested statements that exist in your code.
size = 10
for i=1, size do
  print("loop start")
  if whatever then
    print("if start")
    if (i == 3) then
      print("i is 3")
      --continue
    end
    print(i)
    print("if end")
  end
  print("loop end")
end
This is achieved by enclosing all remaining statements, up to the end of the loop's scope, behind tests of the flag:
size = 10
for i=1, size do
  print("loop start")
  local continue = false; -- initialize flag at the start of the loop
  if whatever then
    print("if start")
    if (i == 3) then
      print("i is 3")
      continue = true
    end
    if continue==false then -- test flag
      print(i)
      print("if end")
    end
  end
  if (continue==false) then -- test flag
    print("loop end")
  end
end
I'm not saying that this is the best approach, but it works perfectly for us.
Again with the inverting, you could simply use the following code:
for k,v in pairs(t) do
  if not isstring(k) then
    -- do something to t[k] when k is not a string
  end
end
Why is there no continue?
Because it's unnecessary¹. There are very few situations where a dev would need it.
A) When you have a very simple loop, say a 1- or 2-liner, then you can just turn the loop condition around and it's still plenty readable.
B) When you're writing simple procedural code (aka. how we wrote code in the last century), you should also be applying structured programming (aka. how we wrote better code in the last century)
C) If you're writing object-oriented code, your loop body should consist of no more than one or two method calls unless it can be expressed in a one- or two-liner (in which case, see A)
D) If you're writing functional code, just return a plain tail-call for the next iteration.
The only case when you'd want to use a continue keyword is if you want to code Lua like it's python, which it just isn't.²
What workarounds are there for it?
Unless A) applies, in which case there's no need for any workarounds, you should be doing Structured, Object-Oriented or Functional programming. Those are the paradigms that Lua was built for, so you'd be fighting against the language if you go out of your way to avoid their patterns.³
Some clarification:
¹ Lua is a very minimalistic language. It tries to have as few features as it can get away with, and a continue statement isn't an essential feature in that sense.
I think this philosophy of minimalism is captured well by Roberto Ierusalimschy in this 2019 interview:
add that and that and that, put that out, and in the end we understand the final conclusion will not satisfy most people and we will not put all the options everybody wants, so we don’t put anything. In the end, strict mode is a reasonable compromise.
² There seems to be a large number of programmers coming to Lua from other languages, because whatever program they're trying to script happens to use it, and many of them don't seem to want to write anything other than their language of choice, which leads to many questions like "Why doesn't Lua have X feature?"
Matz described a similar situation with Ruby in a recent interview:
The most popular question is: "I’m from the language X community; can’t you introduce a feature from the language X to Ruby?", or something like that. And my usual answer to these requests is… "no, I wouldn’t do that", because we have different language design and different language development policies.
³ There are a few ways to hack your way around this; some users have suggested using goto, which is a good enough approximation in most cases, but gets very ugly very quickly and breaks completely with nested loops. Using gotos also puts you in danger of having a copy of SICP thrown at you whenever you show your code to anybody else.
