Confusion with "..." operator in golang - arrays

What is the difference between the following two syntaxes in go?
x := [...]int{ 1:1, 2:2 }
x := []int{ 1:1, 2:2 }
Go's documentation says "The notation ... specifies an array length equal to the maximum element index plus one". But both of the above syntaxes give the same length (3).
Is there a name for this operator "..."?
I couldn't find a way to search for this operator on Google.

The first line creates an array using an array literal, its length computed automatically by the compiler. It is detailed in the Composite literals section of the Language Specification.
The notation ... specifies an array length equal to the maximum element index plus one.
Note: this is not to be confused with the ... used to specify variadic parameters or to pass slices as their values. It is detailed in the Function types section of the spec.
The second line uses a slice literal and will result in a slice. Note that under the hood an array will also be created, but that is opaque.
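A minimal sketch of the difference, using the two literals from the question: both have length 3, but their types differ. The helper name literalInfo is made up for this illustration.

```go
package main

import "fmt"

// literalInfo reports the static type and length of the two literals
// from the question. The function name is hypothetical.
func literalInfo() (arrType string, arrLen int, sliceType string, sliceLen int) {
	a := [...]int{1: 1, 2: 2} // array literal: the compiler infers the length
	s := []int{1: 1, 2: 2}    // slice literal: a slice backed by a hidden array
	return fmt.Sprintf("%T", a), len(a), fmt.Sprintf("%T", s), len(s)
}

func main() {
	at, al, st, sl := literalInfo()
	fmt.Println(at, al) // [3]int 3
	fmt.Println(st, sl) // []int 3
}
```

So the length is the same, but the first value is an array of type [3]int while the second is a slice of type []int.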


Why have arrays in Go?

I understand the difference between arrays and slices in Go. But what I don't understand is why it is helpful to have arrays at all. Why is it helpful that an array type definition specifies a length and an element type? Why can't every "array" that we use be a slice?
There is more to arrays than just the fixed length: they are comparable, and they are values (not reference or pointer types).
There are countless advantages of arrays over slices in certain situations, all of which together more than justify the existence of arrays (along with slices). Let's see them. (I'm not even counting arrays being the building blocks of slices.)
1. Being comparable means you can use arrays as keys in maps, but not slices. You could ask why slices aren't simply made comparable too, so that this alone wouldn't justify having both. The answer is that equality is not well defined on slices. FAQ: Why don't maps allow slices as keys?
They don't implement equality because equality is not well defined on such types; there are multiple considerations involving shallow vs. deep comparison, pointer vs. value comparison, how to deal with recursive types, and so on.
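As a small sketch of this point, here is an array type used as a map key. The type name point and the helper lookup are hypothetical, chosen only for this example.

```go
package main

import "fmt"

// point is a hypothetical key type used only for this illustration.
type point [2]int

// lookup stores a label under an array key and reads it back
// through a separate but equal key value.
func lookup() (string, bool) {
	labels := map[point]string{{1, 2}: "start"}
	v, ok := labels[point{1, 2}] // an equal array value finds the entry
	return v, ok
}

func main() {
	fmt.Println(lookup())                   // start true
	fmt.Println(point{1, 2} == point{1, 2}) // true: arrays support ==
	// var m map[[]int]string // would not compile: invalid map key type []int
}
```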
2. Arrays can also give you higher compile-time safety, as the index bounds can be checked at compile time (array length must evaluate to a non-negative constant representable by a value of type int):
s := make([]int, 3)
s[3] = 3 // "Only" a runtime panic: runtime error: index out of range
a := [3]int{}
a[3] = 3 // Compile-time error: invalid array index 3 (out of bounds for 3-element array)
3. Also passing around or assigning array values will implicitly make a copy of the entire array, so it will be "detached" from the original value. If you pass a slice, it will still make a copy but just of the slice header, but the slice value (the header) will point to the same backing array. This may or may not be what you want. If you want to "detach" a slice from the "original" one, you have to explicitly copy the content e.g. with the builtin copy() function to a new slice.
a := [2]int{1, 2}
b := a
b[0] = 10 // This only affects b, a will remain {1, 2}
sa := []int{1, 2}
sb := sa
sb[0] = 10 // Affects both sb and sa
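To "detach" a slice as described above, the contents have to be copied explicitly. A minimal sketch with the builtin copy(); the helper name detach is an assumption made for this example.

```go
package main

import "fmt"

// detach returns a copy of s that has its own backing array.
// The function name is hypothetical.
func detach(s []int) []int {
	c := make([]int, len(s))
	copy(c, s)
	return c
}

func main() {
	sa := []int{1, 2}
	sb := detach(sa)
	sb[0] = 10          // only affects sb
	fmt.Println(sa, sb) // [1 2] [10 2]
}
```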
4. Also since the array length is part of the array type, arrays with different length are distinct types. On one hand this may be a "pain in the ass" (e.g. you write a function which takes a parameter of type [4]int, you can't use that function to take and process an array of type [5]int), but this may also be an advantage: this may be used to explicitly specify the length of the array that is expected. E.g. you want to write a function which takes an IPv4 address, it can be modeled with the type [4]byte. Now you have a compile-time guarantee that the value passed to your function will have exactly 4 bytes, no more and no less (which would be an invalid IPv4 address anyway).
5. Related to the previous point, the array length may also serve a documentation purpose. A type [4]byte properly documents that IPv4 has 4 bytes. An rgb variable of type [3]byte tells you there is 1 byte for each color component. In some cases the length is even factored out into a named constant and documented separately; for example in the crypto/md5 package, md5.Sum() returns a value of type [Size]byte where md5.Size is a constant equal to 16: the length of an MD5 checksum.
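Points 4 and 5 can be sketched roughly as follows. The type IPv4 and the helpers format and checksumLen are hypothetical names for this illustration; md5.Sum and md5.Size are the real standard library identifiers mentioned above.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// IPv4 is a hypothetical type for this sketch; the length is part of the type.
type IPv4 [4]byte

// format renders the address in dotted-decimal form.
func format(ip IPv4) string {
	return fmt.Sprintf("%d.%d.%d.%d", ip[0], ip[1], ip[2], ip[3])
}

// checksumLen returns the length of an MD5 digest as reported by len().
func checksumLen() int {
	sum := md5.Sum([]byte("hello")) // sum has type [md5.Size]byte
	return len(sum)
}

func main() {
	fmt.Println(format(IPv4{127, 0, 0, 1})) // 127.0.0.1
	// format([5]byte{}) // would not compile: [5]byte is a different type

	fmt.Println(checksumLen() == md5.Size) // true
}
```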
6. They are also very useful when planning memory layout of struct types, see JimB's answer here, and this answer in greater detail and real-life example.
7. Also since slices are headers and they are (almost) always passed around as-is (without pointers), the language spec is more restrictive regarding pointers to slices than pointers to arrays. For example the spec provides multiple shorthands for operating with pointers to arrays, while the same gives compile-time error in case of slices (because it's rare to use pointers to slices, if you still want / have to do it, you have to be explicit about handling it; read more in this answer).
Such examples are:
Slicing a p pointer to array: p[low:high] is a shorthand for (*p)[low:high]. If p is a pointer to slice, this is compile-time error (spec: Slice expressions).
Indexing a p pointer to array: p[i] is a shorthand for (*p)[i]. If p is pointer to a slice, this is a compile time error (spec: Index expressions).
Example:
pa := &[2]int{1, 2}
fmt.Println(pa[1:1]) // OK
fmt.Println(pa[1]) // OK
ps := &[]int{3, 4}
println(ps[1:1]) // Error: cannot slice ps (type *[]int)
println(ps[1]) // Error: invalid operation: ps[1] (type *[]int does not support indexing)
8. Accessing (single) array elements is more efficient than accessing slice elements; as in case of slices the runtime has to go through an implicit pointer dereference. Also "the expressions len(s) and cap(s) are constants if the type of s is an array or pointer to an array".
It may be surprising, but you can even write:
type IP [4]byte
const x = len(IP{}) // x will be 4
It's valid, and it is evaluated at compile time even though IP{} is not a constant expression, so e.g. const i = IP{} would be a compile-time error! After this, it's not even surprising that the following also works:
const x2 = len((*IP)(nil)) // x2 will also be 4
Note: When ranging over a complete array vs a complete slice, there may be no performance difference at all as obviously it may be optimized so that the pointer in the slice header is only dereferenced once. For details / example, see Array vs Slice: accessing speed.
See related questions where an array can be used / makes more sense than a slice:
Why use arrays instead of slices?
Why can't Go slice be used as keys in Go maps pretty much the same way arrays can be used as keys?
Hash with key as an array type
How do I check the equality of three values elegantly?
Slicing a slice pointer passed as argument
And this is just for curiosity: a slice can contain itself while an array can't. (Actually this property makes comparison easier as you don't have to deal with recursive data structures).
Must-read blogs:
Go Slices: usage and internals
Arrays, slices (and strings): The mechanics of 'append'
Arrays are values, and it is often useful to have a value instead of a pointer.
Values can be compared, hence you can use arrays as map keys.
Values are always initialized, so you don't need to initialize them or make them like you do with a slice.
Arrays give you better control of memory layout: whereas you can't allocate space directly in a struct with a slice, you can with an array:
type Foo struct {
buf [64]byte
}
Here, a Foo value will contain a 64-byte array, rather than a slice header which needs to be separately initialized. Arrays are also used to pad structs to match alignment when interoperating with C code and to prevent false sharing for better cache performance.
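A rough sketch of the layout point: the array lives inside the struct, so the zero value is usable immediately, while a slice field starts out nil. Foo is the type from the snippet above; Bar and the helper functions are hypothetical additions for contrast.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Foo embeds its 64 bytes directly in the struct.
type Foo struct {
	buf [64]byte
}

// Bar is a hypothetical counterpart that holds only a slice header.
type Bar struct {
	buf []byte
}

// fooSize reports how many bytes a Foo value occupies.
func fooSize() uintptr {
	return unsafe.Sizeof(Foo{})
}

// barNeedsInit reports whether a zero Bar still lacks a backing array.
func barNeedsInit() bool {
	var b Bar
	return b.buf == nil
}

func main() {
	var f Foo // zero value is ready to use, no make required
	f.buf[0] = 0xff
	fmt.Println(fooSize(), len(f.buf)) // 64 64
	fmt.Println(barNeedsInit())        // true
}
```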
Another aspect for improved performance is that you can better define memory layout than with slices, because data locality can have a very big impact on memory intensive calculations. Dereferencing a pointer can take considerable time compared to the operations being performed on the data, and copying values smaller than a cache line incurs very little cost, so performance critical code often uses arrays for that reason alone.
Arrays are more efficient in saving space. If you never update the size of the slice (i.e. start with a predefined size and never go past it) there really is not much of a performance difference. But there is extra overhead in space, as a slice is simply a wrapper containing the array at its core. Contextually, it also improves clarity as it makes the intended use of the variable more apparent.
Every array could be a slice but not every slice could be an array. If you have a fixed collection size you can get a minor performance improvement from using an array. At the very least you'll save the space occupied by the slice header.

Apply slicing to array valued function in Fortran [duplicate]

How does one access an element of an array that is returned from a function? For example, shape() returns an array of integers. How does one compare an element of that array to an integer? The following does not compile:
integer :: a
integer, dimension(5) :: b
a = 5
if (a .eq. shape(b)) then
print *, 'equal'
end if
The error is:
if (a .eq. shape(b)) then
1
Error: IF clause at (1) requires a scalar LOGICAL expression
I understand that this is because shape(b) returns an array. However, accessing an element of the array directly, as in shape(b)(1), does not appear to be possible.
Now if I add these two lines:
integer, dimension(1) :: c
c = shape(b)
...and change the if clause to this:
if (a .eq. c(1)) then
... then it works. But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?
Further to the answers that deal with SHAPE and logical expressions etc, the general answer to your question "How does one access an element of an array that is returned from a function?" is
you assign the expression that has the function reference to an array variable, and then index that array variable.
you use the expression that has the function reference as an actual argument to a procedure that takes a dummy array argument, and does the indexing for you.
Consequently, the general answer to your last question, "But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?", is "Yes, you do need to declare another array variable" and hence "No, there is no other way".
(Note that reasonable optimising compilers will avoid the need for any additional memory operations/allocations etc once they have the result of the array function, it's really just a syntax issue.)
The rationale for this particular aspect of language design is sometimes ascribed to a need to avoid syntax ambiguity and confusion for array function results that are of character type (they could potentially be indexed and/or substringed - how do you tell what was intended?). Others think it was done this way just to annoy C programmers.
Instead of using shape(array), I would use size(array).
Note that this will return an integer indicating how many elements there are in ALL dimensions, unless you specify the DIM attribute, in which case it will return only the number of elements in the specified dimension.
Take a look at the gfortran documentation:
http://gcc.gnu.org/onlinedocs/gfortran/SIZE.html.
Also, look up lbound and ubound.
Note that the expression
a == shape(b)
returns a rank-1 array of logicals and the if statement requires that the condition reduce to a scalar logical expression. You could reduce the rank-1 array to a scalar like this:
if (all(a == shape(b)))
This is certainly not a general replacement for the syntactically-invalid application of array indexing to an array-returning function such as shape(b)(1).
It is possible even without the intermediate variable using ASSOCIATE:
real c(3,3)
associate (x=>shape(c))
print *,x(1),x(2)
end associate
end

Maxima: what does Maxima call an "array"?

I am a bit confused; I noticed that if I do:
a[sqrt(2)] : 1;
arrays;
I would get:
[a]
So a is an array for Maxima… yet sqrt(2) is an irrational number.
I used to think of an array as a collection of items sorted by indices, where those indices are integers. I acknowledge that my definition of "array" has been strongly influenced by other, "non-symbolic" programming languages. In those languages, arrays "map" to a certain contiguous region of a computer's memory. It is therefore natural to use integers as indices, since integers are countable. However, real numbers are not countable.
Obviously, Maxima seems to have a different definition of the term "array": what is it exactly?
(the documentation does not define it, at least there is no introductory paragraph in the documentation section dedicated to arrays)
Maxima's concept of arrays, lists, and matrices is pretty confused, since various ideas have accreted in the many years of the project.
Maxima's "subscripted variable" = symbol with subscript (with arbitrary index) and no assigned value. E.g. a[sqrt(2)] with no value assigned.
Maxima's "undeclared array" = hash table with arbitrary keys, associated with array symbol as a symbol property, not a value. Your a[sqrt(2)] : 1 is an example of an undeclared array. Maxima creates the array a the first time a value is assigned.
Maxima's "declared array" = contiguous storage, associated with array symbol as a symbol property, not a value.
Maxima's "Lisp array" = contiguous storage, associated with array symbol as symbol value.
Maxima's "fast array" = hash table, associated with array symbol as a symbol value.
Yes, this is a mess. Sorry about that. These are all interesting ideas, but there is no unifying framework. I haven't even mentioned lists and matrices. Hope this helps all the same.

Smalltalk Array Types

When looking at Smalltalk syntax definitions I noticed a few different notations for arrays:
#[] "ByteArray"
#() "Literal Array"
{} "Array"
Why are there different array types? In other programming languages I know there's only one kind of array independent of the stored type.
When to choose which kind?
Why do literal array and array have a different notation but same class?
There's a bit of terminological confusion in Michael's answer, #() is a literal array whereas {} is not. A literal array is the one created by the compiler and can contain any other literal value (including other literal arrays) so the following is a valid literal array:
#(1 #blah nil ('hello' 3.14 true) $c [1 2 3])
On the other hand {} is merely a syntactic sugar for runtime array creation, so { 1+2. #a. anObject} is equivalent to:
(Array new: 3) at: 1 put: 1 + 2; at: 2 put: #a; at: 3 put: anObject; yourself
Here's a little walkthrough:
Firstly, we can find out the types resp. classes of the resulting objects:
#[] class results in ByteArray
#() class results in Array
{} class also results in Array
So apparently the latter two produce Arrays while the first produces a ByteArray. ByteArrays are what you would expect -- fixed sized arrays of bytes.
Now we'll have to figure out the difference between #() and {}. Try evaluating #(a b c), it results in #(#a #b #c); however when you try to evaluate {a b c}, it doesn't work (because a is not defined). The working version would be {#a. #b. #c}, which also results in #(#a #b #c).
The difference between #() and {} is that the first takes a list of Symbol names separated by spaces. You're also allowed to omit the # signs. Using this notation you can only create Arrays that contain Symbols. The second version is the generic Array literal. It takes any expressions, separated by . (dots). You can even write things like {1+2. anyObject complexOperation}.
This could lead you to always using the {} notation. However, there are some things to keep in mind. The moment of object creation differs: while #() Arrays are created during compilation, {} Arrays are created during execution. Thus when you run code with a #() expression, it will always return the same Array, while {} only returns equal Arrays (as long as you are using equal contents). Also, AFAIK the {} is not necessarily portable because it's not part of the ST-80 standard.

What does a.{X} mean in OCaml?

I'm currently trying to port some OCaml to F#. I'm "in at the deep end" with OCaml and my F# is a bit rusty.
Anyway, the OCaml code builds fine in the OCaml compiler, but (not surprisingly) gives a load of errors in the F# compiler even with ML compatibility switched on. Some of the errors look to be reserved words, but the bulk of the errors are complaining about the .{ in lines such as:
m.(a).(b) <- w.{a + b * c};
a,b,c are integers.
I've done a lot of searching through OCaml websites, Stackoverflow, the English translation of the French O'Reilly book, etc. and cannot find anything like this. Of course it doesn't help that most search facilities have problems with punctuation characters! Yes I've found references to . being used to refer to record members, and { } being used to define records, but both together? From the usage, I assume it is some kind of associative or sparse array?
What does this syntax mean? What is the closest F# equivalent?
There is a PDF of the OCaml documentation/manual available here:
http://caml.inria.fr/distrib/ocaml-3.12/ocaml-3.12-refman.pdf
On page 496 (toward the bottom of the page), it says of generic arrays and their get method:
val get : ('a, 'b, 'c) t -> int array -> 'a
Read an element of a generic big array. Genarray.get a [|i1; ...; iN|] returns
the element of a whose coordinates are i1 in the first dimension, i2 in the second
dimension, . . ., iN in the N-th dimension.
If a has C layout, the coordinates must be greater or equal than 0 and strictly less than
the corresponding dimensions of a. If a has Fortran layout, the coordinates must be
greater or equal than 1 and less or equal than the corresponding dimensions of a. Raise
Invalid_argument if the array a does not have exactly N dimensions, or if the
coordinates are outside the array bounds.
If N > 3, alternate syntax is provided: you can write a.{i1, i2, ..., iN} instead of
Genarray.get a [|i1; ...; iN|]. (The syntax a.{...} with one, two or three
coordinates is reserved for accessing one-, two- and three-dimensional arrays as
described below.)
Further, it says (specifically about one dimensional arrays):
val get : ('a, 'b, 'c) t -> int -> 'a
Array1.get a x, or alternatively a.{x}, returns the element of a at index x. x must
be greater or equal than 0 and strictly less than Array1.dim a if a has C layout. If a
has Fortran layout, x must be greater or equal than 1 and less or equal than
Array1.dim a. Otherwise, Invalid_argument is raised.
In F#, you can access array elements using the Array.get function as well. But a closer syntax would be w.[a + b * c]. In short, in F#, use [] instead of {}.
