Any efficient way to convert TArray<string> to TStringDynArray? - arrays

Quite a big portion of my code (delphi-dutil, etc.) uses TStringDynArray. Now I want to convert all keys a TDictionary<string, string> to a TStringDynArray. Unfortunately I only found TDictionary.Keys.ToArray, which is TArray<T>.
I know I can write a simple copy function to do a raw content copy, but my TStringDynArray is usually very big (roughly 10000 entries), so I am looking for an efficient way to convert TArray<string> to TStringDynArray.
function ConvertToStringDynArray(const A: TArray<string>): TStringDynArray;
var
I: Integer;
begin
assert(A <> nil);
SetLength(Result, Length(A));
for I := 0 to Length(A) - 1 do
Result[I] := A[I];
end;

The best solution is to convert all dynamic arrays in your codebase to be TArray<T>. This is best because generic types have much less stringent type compatibility rules. So I suggest that you consider taking this step across your codebase.
However, that's only viable in the code that you control. When interfacing with code and libraries that you cannot modify you have a problem.
At the root of all this, all these types are dynamic arrays and share a single, common, implementation. This means that the blockage is at the language level rather than implementation. So a cast can be used, safely, to make the compiler accept you assignments.
var
DynArr: TStringDynArray;
GenericArr: TArray<string>;
....
DynArr := TStringDynArray(GenericArr);
GenericArr := TArray<string>(DynArr);
You can use these casts in function call arguments.
Because you are suppressing type checking, you really don't want to scatter these casts all over your code. Centralise them into helper functions. For instance:
function StringDynArray(const arr: TArray<string>): TStringDynArray;
begin
Result := TStringDynArray(arr);
end;
One final comment: this related question may be of interest: Is it safe to type-cast TArray<X> to array of X?

Related

Access Types for Stack Allocated Array in Ada

I'm trying to do something with array passing and access types. I've run into a situation where the stack size on an embedded system makes it difficult to pass around a large array through the typical parameter passing mechanism.
To save stack size, I've started using access types but I don't want to do dynamic allocation.
What I have is something like this:
type My_Array_Type is array (Natural range<>) of Integer;
type My_Array_Type_Ptr is access all My_Array_Type;
procedure Do_Stuff(Things : My_Array_Type_Ptr) is
begin
-- things
end Do_Stuff;
procedure Do_Stuff(Num_Things : Integer) is
Things : My_Array_Type_Ptr := new My_Array_Type(1..Num_Things);
begin
Do_Stuff(Things);
-- more things
end Do_Stuff;
HOWEVER, what I'd like to do is something like this:
type My_Array_Type is array (Natural range<>) of Integer;
procedure Do_Stuff(Things : access all My_Array_Type) is
begin
-- things
end Do_Stuff;
procedure Do_Stuff(Num_Things : Integer) is
Things : aliased My_Array_Type(1..Num_Things);
begin
Do_Stuff(Things'Access);
-- more things
end Do_Stuff;
But obviously it doesn't work.
Essentially, I want to pass a reference to the stack allocated array and change it in the other subprogram, but I don't want dynamically allocated memory. How can I do this?
Also, as a side note: I keep seeing conflicting information about if something is dynamically allocated if it has to be released or not -- I read most implementations don't have garbage collectors but don't need them. Can anyone clarify -- In the example I showed first, do I need to explicitly deallocate?
EDIT:
After trying the solutions mentioned below, I settled on a combination of two things:
Using the in out parameter passing mechanism.
Reducing the storage requirements of my data type.
So rather than Integer I'm using:
type UInt8 is new Interfaces.Unsigned_8;
Which is:
type UInt8 is mod 2**8
with Size => 8;
This works perfectly, since my values aren't actually integers, they are in fact unsigned bytes.
In Ada, you really don’t need to use access types for this situation.
Remarks below for native Ada code; for imported (& I suppose exported) subprograms the generated code obviously needs to obey the foreign language conventions.
The mode of a parameter (whether you’re allowed to write to it, whether (if you’re allowed to write to it) it has some initial value) is distinct from the parameter passing mechanism.
If you have a parameter of size larger than a register and the parameter passing mechanism is not by-reference (i.e. by passing the address of the actual object), complain to your compiler vendor!
The Ada way would be like this:
with Ada.Text_IO; use Ada.Text_IO;
with System.Storage_Elements;
procedure Jsinglet is
type My_Array_Type is array (Natural range<>) of Integer;
procedure Do_Stuff_1 (Things : in out My_Array_Type) is
begin
Put_Line (System.Storage_Elements.To_Integer (Things'Address)'Img);
end Do_Stuff_1;
procedure Do_Stuff (Num_Things : Integer) is
Things : My_Array_Type (1 .. Num_Things);
begin
Put_Line (System.Storage_Elements.To_Integer (Things'Address)'Img);
Do_Stuff_1 (Things);
end Do_Stuff;
begin
Do_Stuff (42);
end Jsinglet;
Running the program results here in
$ ./jsinglet
140732831549024
140732831549024
showing that the address has been passed, not the value.
The in out mode on Do_Stuff_1’s parameter means that Do_Stuff_1 can read the contents of the array passed to it before writing to it.
out would mean that Do_Stuff_1 shouldn’t read the contents until it has itself written them (it can, but - depending on the parameter’s type - it may read uninitialized or default-initialized data)
in would mean that the contents couldn’t be written to.
As of Ada2012 you can mark parameters as aliased and they will be passed by reference, but your source object must also be either tagged or aliased.
EDIT: It cannot be an array either it appears, so look below. Aliased parameters do work for types like integers, enumerations, records, etc. though.
You can also wrap the array in a tagged or limited record to force the compiler to use by reference passing as those are "by reference" types
type My_Array_Type is array (Natural range<>) of Integer;
type By_Reference(Length : Natural) is tagged record -- or limited
Elements : My_Array_Type(1..Length);
end record;
procedure Do_Stuff(Things : in out By_Reference) is
begin
-- things
end Do_Stuff;
procedure Do_Stuff(Num_Things : Integer) is
Things : By_Reference(Num_Things);
begin
Do_Stuff(Things);
-- more things
end Do_Stuff;
For your deallocation question, your example must explicitly deallocate the memory. There are ways to get automatic deallocation:
Compiler with Garbage Collection (I know of none)
Custom 3rd party library that provides it
Use a holder or container from Ada.Containers
Use a local (non library level) named access type with a specified storage size (your access type is library level and has not storage size specified). When the access type goes out of scope it will deallocate.

Why do strings need to be initialized with an initial value?

I have a string lx : String that I want to set the value for later on in my code, but I'm getting error unconstrained subtype not allowed (need initialization) provide initial value or explicit array bounds
I also have a string array L_array : array (1 .. user_size) of String;, which is throwing error unconstrained element type in array declaration. I can't initialize this to begin with, as the values are read in from a text file. What should I do about this if I want to set these values later?
There are two questions here really, but with the same underlying cause : a String must have its size fixed (i.e. constrained) when it is created.
If you know its size, and (in the case of the array of String) all Strings are the same size, constraining them is easy and needs no further comment.
Think about what Strings of unknown length implies : unknown storage requirements. One approach is to use pointers or access types, allocate storage to hold the string, and remember to free it later. That way, as in other languages, has the potential for bugs, memory leaks etc. So does the alternative approach of guessing an upper limit for the size, which opens the potential for buffer overflows. You can do it in Ada, as in other languages, but ... not best practice.
Ada provides abstractions over both these patterns, in the form of Unbounded_String and Bounded_String respectively which aim to minimise the problems. But they are still less convenient in use than String.
There's a somewhat intense discussion of these abstractions on the comp.lang.ada newsgroup (my apologies for using the Google Groups gateway to it)
So I'll suggest ways you can do these two tasks just with String.
For the case of a single string lx : String where you set the value later on, the answer is simple : just declare the String later on, initialised with that value. Instead of
lx : String;
...
lx := Read(My_File);
Process_String(lx);
use a declare block (typically in a loop body):
...
declare
lx : String := Read(My_File);
begin
Process_String(lx);
end;
At end the string lx goes out of scope, and it is created anew (with the correct size, from the initialisation, next time you reach the declare block.
An Array of String is more difficult if each member has a different size, and Bounded_String or Unbounded_String are useful candidates.
But an alternative approach (since Ada-2005) would be to use the Ada.Containers package instead of an Array. These come in Definite and Indefinite flavours, you want an Indefinite container to store members of different sizes. Specifically Ada.Containers.Indefinite_Vectors as a Vector can be indexed similar to an Array.
This approach has similarities to using std_vector in C++, in fact the Standard Template Library was originally for Ada, and was later adapted to C++.
Prior to Ada-2005, Ada.Containers was not part of the language but you'd use an equivalent from an external library such as (I think) the Booch Components (Grady Booch).
A starter :
with Ada.Containers.Indefinite_Vectors;
with Ada.Text_IO;
procedure String_Vector is
User_Size : constant natural := 10;
subtype Index is natural range 1 .. User_Size;
-- Indefinite_Vectors is a generic package.
-- You can't use it directly, instantiate it with index and content types
package String_Holder is new Ada.Containers.Indefinite_Vectors(Index,String);
-- make String_Holder operations visible
use String_Holder;
LV : String_Holder.Vector; -- initially empty
L_Vector : String_Holder.Vector := -- initialise to size with empty elements
To_Vector(Ada.Containers.Count_Type(User_Size));
begin
L_Vector.Replace_Element(1,"hello");
LV.Append("world");
Ada.Text_IO.Put_Line(L_Vector(1) & " " & LV(1));
end String_Vector;
Ada's String type is defined as
type String is array(Positive range <>) of Character. Which needs an initial range to declare an variable, by either given an initial string or given a range constraint, otherwise compiler won't be able to know how large will the object be.
Take a look at the sample at Rosettacode for how to read from a file.

Which are differences between Variant arrays and dynamic arrays?

Which are differences between using a Variant array (Like shown here)
var
VarArray : Variant;
begin
VarArray := VarArrayCreate([0, 1], varInteger);
VarArray[0] := 123;
<...>
end;
instead of a common dynamic array?
var
DynArray : array of Integer;
begin
SetLength(DynArray, 1);
DynArray[0] := 123;
<...>
end;
Variants are a type that gets special handling from the compiler and the runtime. Under the hood, they are records of the type TVarRec. They can contain many different kinds of types internally, and can even be used to convert between some of these types. But they can also contain arrays of values, even arrays of other Variants, single- and multi-dimensional. That are Variant arrays. The System.Variants unit contains functions to define and handle such arrays.
Some more info on the Delphi Basics site.
Variants are typically used by Windows COM. Note that they can be pretty slow, especially Variant arrays with multiple dimensions. The number of types they can contain is limited.
Dynamic arrays are built-in types. They are normal arrays that can contain elements of any conceivable type, built-in or user defined. The difference with normal (static) arrays is that they can be instantiated, enlarged or shrunk dynamically (e.g. using SetLength), and their variables are pointers to the real array (which is allocated on the heap). Their lifetime is managed by the runtime.
Dynamic arrays are proper built-in types and far more general than Variants (and Variant arrays).
Delphi Basics also has more info on them.
Update
As Remy Lebeau commented, I should mention that a Variant array (and also an OleVariant array) is based on COM's SAFEARRAY structure, and thus can only be created with COM/OLE-compatible data types, even though Delphi's Variant can hold non-COM/OLE types.

Is it safe to replace array of XXX with TArray<XXX>

I have quite a few variables declared as
var
Something: array of XXX;
begin
SetLength(Something, 10);
try
...
finally
SetLength(Something, 0);
end;
end;
To what extend is safe to have them replaced:
var
Something: TArray<XXX>;
begin
SetLength(Something, 10);
try
...
finally
SetLength(Something, 0);
end;
end;
As already answered, TArray<XXX> is exactly like any other custom type defined as array of XXX. In fact, TArray<XXX> is a custom type defined as array of XXX.
That said, a custom type defined as array of XXX is not equivalent to array of XXX in the context of a procedure or function parameter. In procedure Foo(x: array of Integer), x is an open array parameter, which can accept any type of integer array. In contrast, procedure Foo(x: TArray<Integer>) takes an actual TArray<Integer> type only. You can see the difference when attempting to pass a fixed-size array, but also when attempting to pass a TDynIntegerArray (a different type, also defined as array of Integer).
So, for variables, sure, if you have array of XXX, change it to TArray<XXX> all you want. Just make sure you don't do a global search and replace.
It is perfectly safe to do this. The compiler will produce identical output.
I personally would, all other things being equal, recommend doing so. The use of the generic array TArray<T> gives you much more flexibility and better type compatibility.
Of course those benefits can only be seen with more realistic code that does some work. You most typically see benefits when using generic containers. But you might also see benefits when trying to build code using multiple different libraries.
The use of generic arrays allows easy type compatibility. Before generic arrays you would define an array type like this:
TIntArray = array of Integer;
If two libraries do this then you have incompatible types. If the libraries agree to use generic arrays then there will be compatibility.
To see this more clearly, consider this fragment:
type
TIntArray1 = array of Integer;
TIntArray2 = array of Integer;
....
var
arr1: TIntArray1;
arr2: TIntArray2;
....
arr1 := arr2;
This assignment is not valid and fails with a type mis-match compiler error. This is entirely to be expected within the Pascal language. It is after all strongly typed and these are distinct types. Even if they are implemented identically.
On the other hand:
var
arr1: TArray<Integer>;
arr2: TArray<Integer>;
....
arr1 := arr2;
is valid and does compile. The documentation for generic type compatibility says:
Two instantiated generics are considered assignment compatible if the base types are identical (or are aliases to a common type) and the type arguments are identical.

Array manipulation in OCaml

I am manipulating 2-dimensional arrays in OCaml. I have some questions:
How to declare an array whose length is of type int64, instead of int? For instance, Array.make : int -> 'a -> 'a array, what if I need a bigger array whose index is of type int64?
May I write something like the following:
let array = Array.make_matrix 10 10 0 in
array.(1).(2) <- 5; array.(3).(4) <- 20; (* where I modify a part of values in array)
f array ...
...
The code above seems to me unnatural, because we modify the value of array inside the let, do I have to this, or is there a more natural way to do this?
Could anyone help? thank you very much!
On 64-bit systems, the size of OCaml arrays from the Array module is limited to 2^54 - 1 and on 32-bit systems the limit is 4,194,303. For arrays of float, the limit is 2 times smaller. In both cases the index is easily represented as an int, so there's no advantage in using int64 as an index.
The value for 32-bit systems is way too small for some problems, so there is another module named Bigarray that can represent larger arrays. It supports much larger arrays, but the indices are still int. If you really need to have large indices, you are possibly on a 64-bit system where this isn't such a limitation. If not, you're going to run out of address space anyway, I would think. Maybe what you really want is a hash table?
I'm not sure what you're saying about "let". The purpose of let is to give something a name. It's not unreasonable to give the array a name before you start storing values into it. If you want to define the values at the time you create the array you can use Array.init and write an arbitrary function for setting the array values.
Array code in OCaml is inherently imperative, so you will usually end up with code that has that look to it. I often use begin and end and just embrace the Algolic quality of it.

Resources