Which are differences between Variant arrays and dynamic arrays? - arrays

Which are differences between using a Variant array (Like shown here)
var
VarArray : Variant;
begin
VarArray := VarArrayCreate([0, 1], varInteger);
VarArray[0] := 123;
<...>
end;
instead of a common dynamic array?
var
DynArray : array of Integer;
begin
SetLength(DynArray, 1);
DynArray[0] := 123;
<...>
end;

Variants are a type that gets special handling from the compiler and the runtime. Under the hood, they are records of the type TVarRec. They can contain many different kinds of types internally, and can even be used to convert between some of these types. But they can also contain arrays of values, even arrays of other Variants, single- and multi-dimensional. That are Variant arrays. The System.Variants unit contains functions to define and handle such arrays.
Some more info on the Delphi Basics site.
Variants are typically used by Windows COM. Note that they can be pretty slow, especially Variant arrays with multiple dimensions. The number of types they can contain is limited.
Dynamic arrays are built-in types. They are normal arrays that can contain elements of any conceivable type, built-in or user defined. The difference with normal (static) arrays is that they can be instantiated, enlarged or shrunk dynamically (e.g. using SetLength), and their variables are pointers to the real array (which is allocated on the heap). Their lifetime is managed by the runtime.
Dynamic arrays are proper built-in types and far more general than Variants (and Variant arrays).
Delphi Basics also has more info on them.
Update
As Remy Lebeau commented, I should mention that a Variant array (and also an OleVariant array) is based on COM's SAFEARRAY structure, and thus can only be created with COM/OLE-compatible data types, even though Delphi's Variant can hold non-COM/OLE types.

Related

Data structure - Array

Here it says:
Arrays are useful mostly because the element indices can be computed
at run time. Among other things, this feature allows a single
iterative statement to process arbitrarily many elements of an array.
For that reason, the elements of an array data structure are required
to have the same size and should use the same data representation.
Is this still true for modern languages?
For example, Java, you can have an array of Objects or Strings, right? Each object or string can have different length. Do I misunderstand the above quote, or languages like Java implements Array differently? How?
In java all types except primitives are referenced types meaning they are a pointer to some memory location manipulated by JVM.
But there are mainly two types of programming languages, fixed-typed like Java and C++ and dynamically-typed like python and PHP. In fixed-typed languages your array should consist of the same types whether String, Object or ...
but in dynamically-typed ones there's a bit more abstraction and you can have different data types in array (I don't know the actual implementation though).
An array is a regular arrangement of data in memory. Think of an array of soldiers, all in a line, with exactly equal spacing between each man.
So they can be indexed by lookup from a base address. But all items have to be the same size. So if they are not, you store pointers or references to make them the same size. All languages use that underlying structure, except for what are sometimes called "associative arrays", indexed by key (strings usually), where you have what is called a hash table. Essentially the hash function converts the key into an array index, with a fix-up to resolve collisions.

Why do strings need to be initialized with an initial value?

I have a string lx : String that I want to set the value for later on in my code, but I'm getting error unconstrained subtype not allowed (need initialization) provide initial value or explicit array bounds
I also have a string array L_array : array (1 .. user_size) of String;, which is throwing error unconstrained element type in array declaration. I can't initialize this to begin with, as the values are read in from a text file. What should I do about this if I want to set these values later?
There are two questions here really, but with the same underlying cause : a String must have its size fixed (i.e. constrained) when it is created.
If you know its size, and (in the case of the array of String) all Strings are the same size, constraining them is easy and needs no further comment.
Think about what Strings of unknown length implies : unknown storage requirements. One approach is to use pointers or access types, allocate storage to hold the string, and remember to free it later. That way, as in other languages, has the potential for bugs, memory leaks etc. So does the alternative approach of guessing an upper limit for the size, which opens the potential for buffer overflows. You can do it in Ada, as in other languages, but ... not best practice.
Ada provides abstractions over both these patterns, in the form of Unbounded_String and Bounded_String respectively which aim to minimise the problems. But they are still less convenient in use than String.
There's a somewhat intense discussion of these abstractions on the comp.lang.ada newsgroup (my apologies for using the Google Groups gateway to it)
So I'll suggest ways you can do these two tasks just with String.
For the case of a single string lx : String where you set the value later on, the answer is simple : just declare the String later on, initialised with that value. Instead of
lx : String;
...
lx := Read(My_File);
Process_String(lx);
use a declare block (typically in a loop body):
...
declare
lx : String := Read(My_File);
begin
Process_String(lx);
end;
At end the string lx goes out of scope, and it is created anew (with the correct size, from the initialisation, next time you reach the declare block.
An Array of String is more difficult if each member has a different size, and Bounded_String or Unbounded_String are useful candidates.
But an alternative approach (since Ada-2005) would be to use the Ada.Containers package instead of an Array. These come in Definite and Indefinite flavours, you want an Indefinite container to store members of different sizes. Specifically Ada.Containers.Indefinite_Vectors as a Vector can be indexed similar to an Array.
This approach has similarities to using std_vector in C++, in fact the Standard Template Library was originally for Ada, and was later adapted to C++.
Prior to Ada-2005, Ada.Containers was not part of the language but you'd use an equivalent from an external library such as (I think) the Booch Components (Grady Booch).
A starter :
with Ada.Containers.Indefinite_Vectors;
with Ada.Text_IO;
procedure String_Vector is
User_Size : constant natural := 10;
subtype Index is natural range 1 .. User_Size;
-- Indefinite_Vectors is a generic package.
-- You can't use it directly, instantiate it with index and content types
package String_Holder is new Ada.Containers.Indefinite_Vectors(Index,String);
-- make String_Holder operations visible
use String_Holder;
LV : String_Holder.Vector; -- initially empty
L_Vector : String_Holder.Vector := -- initialise to size with empty elements
To_Vector(Ada.Containers.Count_Type(User_Size));
begin
L_Vector.Replace_Element(1,"hello");
LV.Append("world");
Ada.Text_IO.Put_Line(L_Vector(1) & " " & LV(1));
end String_Vector;
Ada's String type is defined as
type String is array(Positive range <>) of Character. Which needs an initial range to declare an variable, by either given an initial string or given a range constraint, otherwise compiler won't be able to know how large will the object be.
Take a look at the sample at Rosettacode for how to read from a file.

Why have arrays in Go?

I understand the difference between arrays and slices in Go. But what I don't understand is why it is helpful to have arrays at all. Why is it helpful that an array type definition specifies a length and an element type? Why can't every "array" that we use be a slice?
There is more to arrays than just the fixed length: they are comparable, and they are values (not reference or pointer types).
There are countless advantages of arrays over slices in certain situations, all of which together more than justify the existence of arrays (along with slices). Let's see them. (I'm not even counting arrays being the building blocks of slices.)
1. Being comparable means you can use arrays as keys in maps, but not slices. Yes, you could say now that why not make slices comparable then, so that this alone wouldn't justify the existence of both. Equality is not well defined on slices. FAQ: Why don't maps allow slices as keys?
They don't implement equality because equality is not well defined on such types; there are multiple considerations involving shallow vs. deep comparison, pointer vs. value comparison, how to deal with recursive types, and so on.
2. Arrays can also give you higher compile-time safety, as the index bounds can be checked at compile time (array length must evaluate to a non-negative constant representable by a value of type int):
s := make([]int, 3)
s[3] = 3 // "Only" a runtime panic: runtime error: index out of range
a := [3]int{}
a[3] = 3 // Compile-time error: invalid array index 3 (out of bounds for 3-element array)
3. Also passing around or assigning array values will implicitly make a copy of the entire array, so it will be "detached" from the original value. If you pass a slice, it will still make a copy but just of the slice header, but the slice value (the header) will point to the same backing array. This may or may not be what you want. If you want to "detach" a slice from the "original" one, you have to explicitly copy the content e.g. with the builtin copy() function to a new slice.
a := [2]int{1, 2}
b := a
b[0] = 10 // This only affects b, a will remain {1, 2}
sa := []int{1, 2}
sb := sa
sb[0] = 10 // Affects both sb and sa
4. Also since the array length is part of the array type, arrays with different length are distinct types. On one hand this may be a "pain in the ass" (e.g. you write a function which takes a parameter of type [4]int, you can't use that function to take and process an array of type [5]int), but this may also be an advantage: this may be used to explicitly specify the length of the array that is expected. E.g. you want to write a function which takes an IPv4 address, it can be modeled with the type [4]byte. Now you have a compile-time guarantee that the value passed to your function will have exactly 4 bytes, no more and no less (which would be an invalid IPv4 address anyway).
5. Related to the previous, the array length may also serve a documentation purpose. A type [4]byte properly documents that IPv4 has 4 bytes. An rgb variable of type [3]byte tells there are 1 byte for each color components. In some cases it is even taken out and is available, documented separately; for example in the crypto/md5 package: md5.Sum() returns a value of type [Size]byte where md5.Size is a constant being 16: the length of an MD5 checksum.
6. They are also very useful when planning memory layout of struct types, see JimB's answer here, and this answer in greater detail and real-life example.
7. Also since slices are headers and they are (almost) always passed around as-is (without pointers), the language spec is more restrictive regarding pointers to slices than pointers to arrays. For example the spec provides multiple shorthands for operating with pointers to arrays, while the same gives compile-time error in case of slices (because it's rare to use pointers to slices, if you still want / have to do it, you have to be explicit about handling it; read more in this answer).
Such examples are:
Slicing a p pointer to array: p[low:high] is a shorthand for (*p)[low:high]. If p is a pointer to slice, this is compile-time error (spec: Slice expressions).
Indexing a p pointer to array: p[i] is a shorthand for (*p)[i]. If p is pointer to a slice, this is a compile time error (spec: Index expressions).
Example:
pa := &[2]int{1, 2}
fmt.Println(pa[1:1]) // OK
fmt.Println(pa[1]) // OK
ps := &[]int{3, 4}
println(ps[1:1]) // Error: cannot slice ps (type *[]int)
println(ps[1]) // Error: invalid operation: ps[1] (type *[]int does not support indexing)
8. Accessing (single) array elements is more efficient than accessing slice elements; as in case of slices the runtime has to go through an implicit pointer dereference. Also "the expressions len(s) and cap(s) are constants if the type of s is an array or pointer to an array".
May be suprising, but you can even write:
type IP [4]byte
const x = len(IP{}) // x will be 4
It's valid, and is evaluated and compile-time even though IP{} is not a constant expression so e.g. const i = IP{} would be a compile-time error! After this, it's not even surprising that the following also works:
const x2 = len((*IP)(nil)) // x2 will also be 4
Note: When ranging over a complete array vs a complete slice, there may be no performance difference at all as obviously it may be optimized so that the pointer in the slice header is only dereferenced once. For details / example, see Array vs Slice: accessing speed.
See related questions where an array can be used / makes more sense than a slice:
Why use arrays instead of slices?
Why can't Go slice be used as keys in Go maps pretty much the same way arrays can be used as keys?
Hash with key as an array type
How do I check the equality of three values elegantly?
Slicing a slice pointer passed as argument
And this is just for curiosity: a slice can contain itself while an array can't. (Actually this property makes comparison easier as you don't have to deal with recursive data structures).
Must-read blogs:
Go Slices: usage and internals
Arrays, slices (and strings): The mechanics of 'append'
Arrays are values, and it is often useful to have a value instead of a pointer.
Values can be compared, hence you can use arrays as map keys.
Values are always initialized, so there's you don't need to initialize, or make them like you do with a slice.
Arrays give you better control of memory layout, where as you can't allocate space directly in a struct with a slice, you can with an array:
type Foo struct {
buf [64]byte
}
Here, a Foo value will contains a 64 byte value, rather than a slice header which needs to be separately initialized. Arrays are also used to pad structs to match alignment when interoperating with C code and to prevent false sharing for better cache performance.
Another aspect for improved performance is that you can better define memory layout than with slices, because data locality can have a very big impact on memory intensive calculations. Dereferencing a pointer can take considerable time compared to the operations being performed on the data, and copying values smaller than a cache line incurs very little cost, so performance critical code often uses arrays for that reason alone.
Arrays are more efficient in saving space. If you never update the size of the slice (i.e. start with a predefined size and never go past it) there really is not much of a performance difference. But there is extra overhead in space, as a slice is simply a wrapper containing the array at its core. Contextually, it also improves clarity as it makes the intended use of the variable more apparent.
Every array could be a slice but not every slice could be an array. If you have a fixed collection size you can get a minor performance improvement from using an array. At the very least you'll save the space occupied by the slice header.

Pascal: how to declare number of dimensions of a dynamic array during runtime?

I know that setlength(array, a, b...) is for declaring the length of dynamic arrays and dimensions and it requires you to know the number of dimensions, so how do you declare n dimensions (n is a variable)?
This is not possible to vary the nesting level of normal (dynamic) arrays runtime.
But have a look at COM arrays, the vararray functions in unit variants though, afaik it is possible with those, but that is a purely library construct.
An example is at http:///www.stack.nl/~marcov/phpser.zip implementing a simple php array deserializer that decodes a nested structure.

Is it safe to replace array of XXX with TArray<XXX>

I have quite a few variables declared as
var
Something: array of XXX;
begin
SetLength(Something, 10);
try
...
finally
SetLength(Something, 0);
end;
end;
To what extend is safe to have them replaced:
var
Something: TArray<XXX>;
begin
SetLength(Something, 10);
try
...
finally
SetLength(Something, 0);
end;
end;
As already answered, TArray<XXX> is exactly like any other custom type defined as array of XXX. In fact, TArray<XXX> is a custom type defined as array of XXX.
That said, a custom type defined as array of XXX is not equivalent to array of XXX in the context of a procedure or function parameter. In procedure Foo(x: array of Integer), x is an open array parameter, which can accept any type of integer array. In contrast, procedure Foo(x: TArray<Integer>) takes an actual TArray<Integer> type only. You can see the difference when attempting to pass a fixed-size array, but also when attempting to pass a TDynIntegerArray (a different type, also defined as array of Integer).
So, for variables, sure, if you have array of XXX, change it to TArray<XXX> all you want. Just make sure you don't do a global search and replace.
It is perfectly safe to do this. The compiler will produce identical output.
I personally would, all other things being equal, recommend doing so. The use of the generic array TArray<T> gives you much more flexibility and better type compatibility.
Of course those benefits can only be seen with more realistic code that does some work. You most typically see benefits when using generic containers. But you might also see benefits when trying to build code using multiple different libraries.
The use of generic arrays allows easy type compatibility. Before generic arrays you would define an array type like this:
TIntArray = array of Integer;
If two libraries do this then you have incompatible types. If the libraries agree to use generic arrays then there will be compatibility.
To see this more clearly, consider this fragment:
type
TIntArray1 = array of Integer;
TIntArray2 = array of Integer;
....
var
arr1: TIntArray1;
arr2: TIntArray2;
....
arr1 := arr2;
This assignment is not valid and fails with a type mis-match compiler error. This is entirely to be expected within the Pascal language. It is after all strongly typed and these are distinct types. Even if they are implemented identically.
On the other hand:
var
arr1: TArray<Integer>;
arr2: TArray<Integer>;
....
arr1 := arr2;
is valid and does compile. The documentation for generic type compatibility says:
Two instantiated generics are considered assignment compatible if the base types are identical (or are aliases to a common type) and the type arguments are identical.

Resources