local scopes in Julia - arrays

I understand that for loops now introduce a local scope in Julia, but there is something I do not understand. Please consider the following two examples.
Example 1
a = 0
for i = 1:3
    a += 1
end
Example 2
a = [0]
for i = 1:3
    a[1] += 1
end
Example 1 throws an error message, which I understand. But Example 2 works as expected. How should I understand this? Are arrays not variables? Can somebody explain this to me?

This question is essentially a duplicate of Julia scoping: why does this function modify a global variable?, where it is discussed in detail that the difference is that a = ... is an assignment (it rebinds the variable a), while a[1] = ... is a setindex! operation (it mutates the value stored in a collection). Also see Creating copies in Julia with = operator.
I am not marking it as a duplicate only because in your case the first example fails in REPL under Julia 1.4.2 but will work under Julia 1.5 once it is released, see https://github.com/JuliaLang/julia/blob/v1.5.0-rc1/NEWS.md:
The interactive REPL now uses "soft scope" for top-level expressions: an assignment inside a scope block such as a for loop automatically assigns to a global variable if one has been defined already. This matches the behavior of Julia versions 0.6 and prior, as well as IJulia. Note that this only affects expressions interactively typed or pasted directly into the default REPL (#28789, #33864).
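In the meantime, a minimal sketch of how to make Example 1 work under Julia 1.4 (these are the standard options, nothing specific to the linked answers): either declare the assignment as global, or wrap the loop in a function so that a is a local variable.
a = 0
for i = 1:3
    global a += 1   # explicitly assign to the global a
end
a  # 3

# or avoid the global entirely by working in a local scope
function count_up()
    a = 0
    for i = 1:3
        a += 1      # a is local to the function, so no global keyword is needed
    end
    return a
end
count_up()  # 3
Example 2 needs neither, because a[1] += 1 lowers to setindex!(a, getindex(a, 1) + 1, 1): it mutates the array that a is bound to without ever rebinding the variable a.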

How to idiomatically support closing values with pairs()?

I'm new to Lua (language version 5.4 if it matters, there doesn't seem to be a tag for that version on SO yet) and I'm trying to find the most idiomatic way to implement iteration (for loop) over a userdata object.
The Lua 5.4 Reference Manual says regarding a loop statement for var_1, ···, var_n in explist do body end:
The loop starts by evaluating explist to produce four values: an iterator function, a state, an initial value for the control variable, and a closing value.
The idiomatic way to loop seems to be using the pairs(t) function. This also works for userdata via the __pairs metamethod. However:
If t has a metamethod __pairs, calls it with t as argument and returns the first three results from the call.
Why only three instead of four? If I have a complex userdata object that needs to allocate some resource for a loop, I'll need that closing value so I know when to deallocate that resource in case the loop ends early, right? Does that mean I cannot use pairs in such a case or am I missing something?
I could of course provide a new function, say pairs4, but that doesn't seem to be very idiomatic.
Because that's how it has always worked since at least Lua 5.0: pairs has always returned three values, because the generic for previously only took three.
"to-be-closed variables" are a new feature of Lua 5.4, as is the fourth value for generic for. Why pairs wasn't updated to match is unknown. It is possible that pairs returns all of the values from the __pairs metamethod, but I haven't looked at the implementation to verify this.
In this case, I would suggest writing a pairs_close that returns all four values from the __pairs metamethod.
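For what it's worth, a minimal sketch of such a helper, assuming your userdata's __pairs metamethod already returns the closing value as its fourth result (pairs_close is a made-up name, not a standard library function):
local function pairs_close(t)
  local mt = getmetatable(t)
  if mt and mt.__pairs then
    -- forward every result of __pairs, including the fourth (closing) value,
    -- instead of truncating to three like the built-in pairs does
    return mt.__pairs(t)
  end
  return pairs(t)
end

-- usage: in Lua 5.4 the generic for marks the fourth value as to-be-closed,
-- so the resource is released even if the loop exits early
-- for k, v in pairs_close(obj) do ... end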

Julia, use of map to run a function multiple times,

I have some code that runs fine and does what I want, although there may be a simpler, more elegant solution. This works:
round(Int16, floor(rand(TruncatedNormal(150,20,50,250))))
However when I try to execute it multiple times, using map, it throws an error saying it doesn't like the Int16 specification, so this:
map(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
throws this error
ERROR: MethodError: objects of type Int16 are not callable
I just want to run it twice (in this case) and sum the results. Why is it unhappy? Thx. J
The first argument to map must be a function, which map then applies to each element of the collection. With your code, Julia evaluates round(Int16, ...) once, gets a number, and then tries to call that number on each element of 1:2:
round(Int16, floor(rand(TruncatedNormal(150,20,50,250))))(1)
But the output of round(Int16, ...) isn't a function, it's a number, so it cannot be called. That's why the error says "objects of type Int16 are not callable." You can fix this by using an anonymous function that takes (and ignores) one argument:
map(_ -> round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
But the "Julian" way to do this is to use a comprehension:
[round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))) for _ in 1:2]
EDIT:
If you are going to sum the results, then you can use something that looks like a comprehension but is called a generator expression. This is basically the comprehension above without the [ ] around the expression. A generator expression can be passed directly to functions like sum or mean, etc.
sum(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))) for _ in 1:2)
The advantage to generator expressions is that they don't allocate the memory for the full array. So, if you did this 100 times and used the sum approach above, you wouldn't need to allocate space for 100 numbers.
This goes beyond the original question, but the OP wanted to use the sum expression where the 2 in 1:2 is a 1-element vector. Of course, if the input is always a 1-element vector, then I recommend first(x) as mentioned in the comments. But this is a nice opportunity to show the importance of frequently breaking things down into functions in Julia. For example, you could take the entire sum expression and define a function
generatenumbers(n::Integer) = sum(... for _ in 1:n)
where n is a scalar. Then if you have some odd array expression for n (1-element vector, many such ns in a multi-dim array, etc.), you can just do:
generatenumbers.(ns)
# will apply to each element and return same shape as ns
If the de-sugaring logic is more complex than applying element-wise, you can even define:
generatenumbers(ns::AbstractArray) = # ... something more complex
The point is to define an "atomic" function that expresses the statement or task you want clearly, then use dispatch to apply it to more complicated data-structures that appear in practical code. This is a common design pattern in Julia (not the only option, but an effective one).
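As a concrete (and purely illustrative) sketch of that pattern, reusing the sum expression from the question; the names are hypothetical:
using Distributions

# the "atomic" scalar method
generatenumbers(n::Integer) =
    sum(round(Int16, floor(rand(TruncatedNormal(150, 20, 50, 250)))) for _ in 1:n)

# broadcast the atomic function over whatever container holds the counts
ns = [2]                # e.g. the 1-element vector from the question
generatenumbers.(ns)    # applied element-wise, returns a container of the same shape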
Adding on to the answer from @darsnack:
If you want to run it multiple times in order to keep the results (it wasn't clear from the question), you could also ask rand to produce a vector directly, making the type conversion through the floor call.
Moving from:
map(round(Int16, floor(rand(TruncatedNormal(150,20,50,250)))), 1:2)
to:
floor.(Int16, rand(TruncatedNormal(150,20,50,250), 2))
The documentation is here.

Racket: Macro outputs something weird instead of an array

My aim is to populate an array in the compile phase (i.e. in a macro) and use it in the execution phase. For some reason, though, the object returned by the macro is not recognized by Racket as an array. To illustrate the problem, here is the shortest code showing this behaviour:
(require (for-syntax math/array))
(require math/array)
(define-syntax (A stx)
  (datum->syntax stx `(define a ,(array #[#[1 2] #[3 4]]))))
(A)
After execution of this macro, 'a' is something, but I don't know what it is. It is not an array ((array? a) -> #f) nor a string; array-ref is not working on it, obviously, but it prints as: (array #[#[1 2] #[3 4]]). "class-of" from the "swindle" module claims it is "primitive-class:unknown-primitive", for what it's worth.
I have tried outputting a vector instead of an array, but it works as expected, i.e. resulting value is a vector in execution phase.
I have tried using a Common Lisp-style defmacro from the "compatibility" module, thinking that this may have something to do with the datum->syntax transformation, but this changed nothing.
I have tested this on Win7 with Racket 6.5 and 6.7, as well as on Linux with Racket 6.7 - problem persists.
Any ideas?
update
Thanks to great answers and suggestions, I came up with following solution:
(require (for-syntax math/array))
(require math/array)
(define-syntax (my-array stx)
  (syntax-case stx ()
    [(_ id)
     (let ([arr (build-array
                 #(20 20)
                 (lambda (ind)
                   (let ([x (vector-ref ind 1)]
                         [y (vector-ref ind 0)])
                     (list 'some-symbol x y (* x y)))))])
       (with-syntax ([syn-arr (eval (read (open-input-string
                                           (string-append "#'" (format "~v" arr)))))])
         #'(define id syn-arr)))]))
(my-array A)
I'm not sure if this is proper Racket (I welcome all suggestions on code improvement), but here is how it works:
The array is built and stored in the "arr" variable. It is then printed to a string, prepended with #' (so that the string now represents a syntax object) and evaluated as code. This effectively converts the array to a syntax object that can be embedded in the macro output.
The advantage of this approach is that every object that can be written out and then read back by Racket can be output by a macro. The disadvantage is that some objects can't (I'm looking at you, custom structs!), so an additional string-creating function may be required in some cases.
First of all, don’t use datum->syntax like that. You’re throwing away all hygiene information there, so if someone was using a different language where define was called something else (like def, for example), that would not work. For a principled introduction to Racket macros, consider reading Fear of Macros.
Second of all, the issue here is that you are creating what is sometimes known as “3D syntax”. 3D syntax should probably be an error in this context, but the gist is that there is only a small set of things that you can safely put inside of a syntax object:
a symbol
a number
a boolean
a character
a string
the empty list
a pair of two pieces of valid syntax
a vector of valid syntax
a box of valid syntax
a hash table of valid syntax keys and values
a prefab struct containing exclusively valid syntax
Anything else is “3D syntax”, which is illegal as the output of a macro. Notably, arrays from math/array are not permitted.
This seems like a rather extreme limitation, but the point is that the above list is simply the list of things that can end up in compiled code. Racket does not know how to serialize arbitrary things to bytecode, which is reasonable: it wouldn’t make much sense to embed a closure in compiled code, for example. However, it’s perfectly reasonable to produce an expression that creates an array, which is what you should do here.
Writing your macro more properly, you would get something like this:
#lang racket
(require math/array)
(define-syntax (define-simple-array stx)
  (syntax-case stx ()
    [(_ id)
     #'(define id (array #(#(1 2) #(3 4))))]))
(define-simple-array x)
Now, x is (array #[#[1 2] #[3 4]]). Note that you can remove the for-syntax import of math/array, since you are no longer using it at compile time, which makes sense: macros just manipulate bits of code. You only need math/array at runtime to create the actual value you end up with.

Is there a reason that Swift array assignment is inconsistent (neither a reference nor a deep copy)?

I'm reading the documentation and I am constantly shaking my head at some of the design decisions of the language. But the thing that really got me puzzled is how arrays are handled.
I rushed to the playground and tried these out. You can try them too. So the first example:
var a = [1, 2, 3]
var b = a
a[1] = 42
a
b
Here a and b are both [1, 42, 3], which I can accept. Arrays are referenced - OK!
Now see this example:
var c = [1, 2, 3]
var d = c
c.append(42)
c
d
c is [1, 2, 3, 42] BUT d is [1, 2, 3]. That is, the second variable saw the change in the previous example but doesn't see it in this one. The documentation says that's because the length changed.
Now, how about this one:
var e = [1, 2, 3]
var f = e
e[0..2] = [4, 5]
e
f
e is [4, 5, 3], which is cool. It's nice to have a multi-index replacement, but f STILL doesn't see the change even though the length has not changed.
So to sum it up, common references to an array see changes if you change 1 element, but if you change multiple elements or append items, a copy is made.
This seems like a very poor design to me. Am I right in thinking this? Is there a reason I don't see why arrays should act like this?
EDIT: Arrays have changed and now have value semantics. Much more sane!
Note that array semantics and syntax were changed in Xcode 6 beta 3 (blog post), so the question no longer applies. The following answer applied to beta 2:
It's for performance reasons. Basically, they try to avoid copying arrays as long as they can (and claim "C-like performance"). To quote the language book:
For arrays, copying only takes place when you perform an action that has the potential to modify the length of the array. This includes appending, inserting, or removing items, or using a ranged subscript to replace a range of items in the array.
I agree that this is a bit confusing, but at least there is a clear and simple description of how it works.
That section also includes information on how to make sure an array is uniquely referenced, how to force-copy arrays, and how to check whether two arrays share storage.
From the official documentation of the Swift language:
Note that the array is not copied when you set a new value with subscript syntax, because setting a single value with subscript syntax does not have the potential to change the array's length. However, if you append a new item to the array, you do modify the array's length. This prompts Swift to create a new copy of the array at the point that you append the new value. Henceforth, a is a separate, independent copy of the array…
Read the whole section Assignment and Copy Behavior for Arrays in the documentation. You will find that when you replace a range of items in the array, the array makes a copy of all of its items.
The behavior has changed with Xcode 6 beta 3. Arrays are no longer reference types and now have a copy-on-write mechanism, meaning that as soon as you change an array's contents through one variable or the other, the array is copied and only the one copy is changed.
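Under the current copy-on-write model, all three of the question's cases behave the same way: the two variables share storage until either one is mutated, and then only the mutated one changes. For example, the ranged replacement from the question now looks like this (modern 0..<2 syntax instead of the old 0..2):
var e = [1, 2, 3]
var f = e                               // f shares e's storage until a mutation occurs
e.replaceSubrange(0..<2, with: [4, 5])  // any mutation triggers the copy
e  // [4, 5, 3]
f  // [1, 2, 3]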
Old answer:
As others have pointed out, Swift tries to avoid copying arrays if possible, including when changing values for single indexes at a time.
If you want to be sure that an array variable (!) is unique, i.e. not shared with another variable, you can call the unshare method. This copies the array unless it already only has one reference. Of course you can also call the copy method, which will always make a copy, but unshare is preferred to make sure no other variable holds on to the same array.
var a = [1, 2, 3]
var b = a
b.unshare()
a[1] = 42
a // [1, 42, 3]
b // [1, 2, 3]
The behavior is extremely similar to the Array.Resize method in .NET. To understand what's going on, it may be helpful to look at the history of the . token in C, C++, Java, C#, and Swift.
In C, a structure is nothing more than an aggregation of variables. Applying the . to a variable of structure type will access a variable stored within the structure. Pointers to objects do not hold aggregations of variables, but identify them. If one has a pointer which identifies a structure, the -> operator may be used to access a variable stored within the structure identified by the pointer.
In C++, structures and classes not only aggregate variables, but can also attach code to them. Using . to invoke a method on a variable asks that method to act upon the contents of the variable itself; using -> on a variable which identifies an object asks that method to act upon the object identified by the variable.
In Java, all custom variable types simply identify objects, and invoking a method upon a variable will tell the method what object is identified by the variable. Variables cannot hold any kind of composite data type directly, nor is there any means by which a method can access a variable upon which it is invoked. These restrictions, although semantically limiting, greatly simplify the runtime, and facilitate bytecode validation; such simplifications reduced the resource overhead of Java at a time when the market was sensitive to such issues, and thus helped it gain traction in the marketplace. They also meant that there was no need for a token equivalent to the . used in C or C++. Although Java could have used -> in the same way as C and C++, the creators opted to use single-character . since it was not needed for any other purpose.
In C# and other .NET languages, variables can either identify objects or hold composite data types directly. When used on a variable of a composite data type, . acts upon the contents of the variable; when used on a variable of reference type, . acts upon the object identified by it. For some kinds of operations, the semantic distinction isn't particularly important, but for others it is. The most problematic situations are those in which a method of a composite data type that would modify the variable upon which it is invoked is invoked on a read-only variable. If an attempt is made to invoke a method on a read-only value or variable, compilers will generally copy the variable, let the method act upon that copy, and then discard it. This is generally safe with methods that only read the variable, but not safe with methods that write to it. Unfortunately, .NET does not as yet have any means of indicating which methods can safely be used with such substitution and which can't.
In Swift, methods on aggregates can expressly indicate whether they will modify the variable upon which they are invoked, and the compiler will forbid the use of mutating methods upon read-only variables (rather than having them mutate temporary copies of the variable which will then get discarded). Because of this distinction, using the . token to call methods that modify the variables upon which they are invoked is much safer in Swift than in .NET. Unfortunately, the fact that the same . token is used for that purpose as to act upon an external object identified by a variable means the possibility for confusion remains.
If one had a time machine and went back to the creation of C# and/or Swift, one could retroactively avoid much of the confusion surrounding such issues by having both languages use the . and -> tokens in a fashion much closer to the C++ usage. Methods of both aggregates and reference types could use . to act upon the variable upon which they were invoked, and -> to act upon a value (for composites) or the thing identified thereby (for reference types). Neither language is designed that way, however.
In C#, the normal practice for a method to modify the variable upon which it is invoked is to pass the variable as a ref parameter to a method. Thus calling Array.Resize(ref someArray, 23); when someArray identifies an array of 20 elements will cause someArray to identify a new array of 23 elements, without affecting the original array. The use of ref makes clear that the method should be expected to modify the variable upon which it is invoked. In many cases, it's advantageous to be able to modify variables without having to use static methods; Swift addresses that by allowing such methods to be invoked with . syntax. The disadvantage is that it loses clarity as to which methods act upon variables and which methods act upon values.
To me this makes more sense if you first replace your constants with variables:
a[i] = 42 // (1)
e[i..j] = [4, 5] // (2)
The first line never needs to change the size of a. In particular, it never needs to do any memory allocation. Regardless of the value of i, this is a lightweight operation. If you imagine that under the hood a is a pointer, it can be a constant pointer.
The second line may be much more complicated. Depending on the values of i and j, you may need to do memory management. If you imagine that e is a pointer that points to the contents of the array, you can no longer assume that it is a constant pointer; you may need to allocate a new block of memory, copy data from the old memory block to the new memory block, and change the pointer.
It seems that the language designers have tried to keep (1) as lightweight as possible. As (2) may involve copying anyway, they have resorted to the solution that it always acts as if you did a copy.
This is complicated, but I am happy that they did not make it even more complicated with e.g. special cases such as "if in (2) i and j are compile-time constants and the compiler can infer that the size of e is not going to change, then we do not copy".
Finally, based on my understanding of the design principles of the Swift language, I think the general rules are these:
Use constants (let) always everywhere by default, and there won't be any major surprises.
Use variables (var) only if it is absolutely necessary, and be very careful in those cases, as there will be surprises [here: strange implicit copies of arrays in some but not all situations].
What I've found is: the array becomes a mutable copy of the referenced one if and only if the operation has the potential to change the array's length. In your last example, indexing e[0..2] with a range has the potential to change the length (the replacement need not have the same number of elements as the range it replaces), so a copy is made.
var e = [1, 2, 3]
var f = e
e[0..2] = [4, 5]
e // 4,5,3
f // 1,2,3
var e1 = [1, 2, 3]
var f1 = e1
e1[0] = 4
e1[1] = 5
e1 // - 4,5,3
f1 // - 4,5,3
Delphi's strings and arrays had the exact same "feature". When you looked at the implementation, it made sense.
Each variable is a pointer to dynamic memory. That memory contains a reference count followed by the data in the array. So you can easily change a value in the array without copying the whole array or changing any pointers. If you want to resize the array, you have to allocate more memory. In that case the current variable will point to the newly allocated memory. But you can't easily track down all of the other variables that pointed to the original array, so you leave them alone.
Of course, it wouldn't be hard to make a more consistent implementation. If you wanted all variables to see a resize, do this:
Each variable is a pointer to a container stored in dynamic memory. The container holds exactly two things, a reference count and pointer to the actual array data. The array data is stored in a separate block of dynamic memory. Now there is only one pointer to the array data, so you can easily resize that, and all variables will see the change.
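A rough sketch of that second design in Swift terms, purely illustrative (this is not how Swift's Array is actually implemented, and ARC stands in for the explicit reference count): every variable refers to one shared box, so even a resize is visible through all of them.
final class Box {
    var data: [Int]                  // the actual array storage lives behind the box
    init(_ data: [Int]) { self.data = data }
}

struct SharedArray {
    private let box: Box             // each SharedArray value holds a reference to the same box
    init(_ elements: [Int]) { box = Box(elements) }
    func append(_ x: Int) { box.data.append(x) }    // resizing mutates the shared storage
    subscript(i: Int) -> Int {
        get { box.data[i] }
        nonmutating set { box.data[i] = newValue }  // writes are also visible through every copy
    }
}

let p = SharedArray([1, 2, 3])
let q = p
p.append(42)
q[3]  // 42: q sees the resize because both variables point at the same box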
A lot of Swift early adopters have complained about these error-prone array semantics, and Chris Lattner has written that the array semantics have been revised to provide full value semantics (Apple Developer link for those who have an account). We will have to wait at least for the next beta to see what this exactly means.
I use .copy() for this.
var a = [1, 2, 3]
var b = a.copy()
a[1] = 42
Did anything change in array behavior in later Swift versions? I just ran your example:
var a = [1, 2, 3]
var b = a
a[1] = 42
a
b
And my results are [1, 42, 3] and [1, 2, 3]

Visual Foxpro Array [] or ( )

In a Visual FoxPro application, one of the users gets an error (the rest don't), and I believe it's because arrays are used in the form arr(number) instead of arr[number]. Does anyone know what causes this strange behavior for a single user?
Thanks!
Foxpro does not differentiate between the two. This is actually documented in both the DIMENSION and DECLARE commands' remarks.
In fact, the documentation doesn't strictly follow one way or the other. The DIMENSION and DECLARE commands define the syntax with parentheses ().
DIMENSION ArrayName1(nRows1 [, nColumns1]) [AS cType]
[, ArrayName2(nRows2 [, nColumns2])] ...
But the example provided in the Arrays section of the documentation uses brackets [].
DIMENSION ArrayName[5,2]
ArrayName[1,2] = 966789
Either way of referencing an array is valid as long as it's properly balanced as () or []. The problem is probably upstream, where the array is getting declared or prepared. I've had to debug strange instances like this before, where one user was going about a process in a totally different way than the others because of the business workflow... Anyhow, because of some "bypassed" process, the array wasn't getting created and thus forced a failure.
Does it always crash at the same location in the process?
I would strongly encourage some error trapping in the process for this "one" user. Worst comes to worst, I would put something like the following in that area of the code...
if atc( "PersonsLoginName", sys(0)) > 0
TurnOnMyCustomDebugging() && for this special scenario trapping
endif
Additionally, I don't know what you have for error trapping routines, but I'd get a dump of memory at the time of the error and the full call stack that got the user to that point. If you need help on that, let me know too.
I didn't understand why this question was "bumped" from 2010. Maybe because it is kind of "VFP basics" and needs more detail?
The existing answers are already good: either [] or () can be used; it is primarily a matter of preference.
VFP actually doesn't even care whether the name denotes an array. It might be a function accepting one or two integer parameters (1..N). However, if there is an array in scope, then it takes precedence.
Example:
Dimension Dummy[10]
? Dummy[5] && prints .F. - array members are not initialized
Dummy[2] = 6 && sets array member
? Dummy[2] && prints 6
Release Dummy && array variable released
? Dummy[5] && prints 10 - procedure is called
* Dummy[2] = 6 && error - variable does not exists
? Dummy[2] && prints 4 - procedure is called
Procedure Dummy(tnDim1)
Return m.tnDim1 * 2
endproc
It wouldn't matter whether you used [] or () for an array or a function (or a procedure; in VFP there is no difference between a procedure and a function either, both accept parameters and return a result).
As per the OP's question, a single user wouldn't get a different result just because [] or () was used.
