What expressions are allowed as the array length N in [_; N]?

Please consider the following minimal example in Rust:
const FOOBAR: usize = 3;
trait Foo {
    const BAR: usize;
}
struct Fubar();
impl Foo for Fubar {
    const BAR: usize = 3;
}
struct Baz<T>(T);
trait Qux {
    fn print_bar();
}
impl<T: Foo> Qux for Baz<T> {
    fn print_bar() {
        println!("bar: {}", T::BAR); // works
        println!("{:?}", [T::BAR; 3]); // works
        println!("{:?}", [1; FOOBAR]); // works
        println!("{:?}", [1; T::BAR]); // this gives an error
    }
}
fn main() {
    Baz::<Fubar>::print_bar();
}
The compiler gives the following error:
error[E0599]: no associated item named `BAR` found for type `T` in the current scope
  --> src/main.rs:24:30
   |
24 |         println!("{:?}", [1; T::BAR]); // this gives an error
   |                              ^^^^^^ associated item not found in `T`
   |
   = help: items from traits can only be used if the trait is implemented and in scope
   = note: the following trait defines an item `BAR`, perhaps you need to implement it:
           candidate #1: `Foo`
Whatever the answer to my question, this is not a particularly good error message, because it suggests that T does not implement Foo even though Foo is a trait bound on T. Only after burning a lot of time did it occur to me that T::BAR is in fact a perfectly valid expression in other contexts, just not as a length parameter to an array.
What are the rules that govern what kind of expressions can go there? Because arrays are Sized, I completely understand that the length has to be known at compile time. Coming from C++ myself, I would expect some restriction akin to constexpr, but I have not come across that in the documentation, which just says:
A fixed-size array, denoted [T; N], for the element type, T, and the non-negative compile-time constant size, N.

As of Rust 1.24.1, the array length basically needs to either be a numeric literal or a "regular" constant that is a usize. There's a small amount of constant evaluation that exists today, but it's more-or-less limited to basic math.
Regarding "a perfectly valid expression in other contexts, just not as a length parameter to an array":
Array lengths don't support generic parameters. (#43408)
Regarding "this is not a particularly good error message":
Error message should be improved for associated consts in array lengths (#44168)
Regarding "I would expect some restriction akin to constexpr":
This is essentially the restriction; the problem is that what is allowed in a const is highly restricted at the moment. Notably, these aren't allowed:
functions (except to construct enums or structs)
loops
multiple statements / blocks
Work on good constant / compile-time evaluation is still ongoing. There are a large number of RFCs, issues, and PRs improving this. A sample:
Const fn tracking issue (RFC 911)
Allow locals and destructuring in const fn (RFC 2341)
Allow if and match in constants (RFC 2342)
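As a rough illustration of the forms discussed above, the sketch below shows array lengths that a recent stable compiler is expected to accept: a literal, a plain constant, basic arithmetic on constants, and an associated constant reached through a concrete type rather than a generic parameter. FOOBAR, Foo and Fubar follow the question; DOUBLE and lengths are names made up for this example. The generic form is left commented out because, as far as I know, it is still rejected on stable.

```rust
const FOOBAR: usize = 3;
const DOUBLE: usize = FOOBAR * 2; // basic arithmetic on constants (hypothetical name)

trait Foo {
    const BAR: usize;
}

struct Fubar;

impl Foo for Fubar {
    const BAR: usize = 3;
}

// Hypothetical helper: builds arrays with different kinds of length
// expressions and returns their lengths.
fn lengths() -> (usize, usize, usize, usize) {
    let a = [1u8; 3]; // numeric literal
    let b = [1u8; FOOBAR]; // "regular" usize constant
    let c = [1u8; DOUBLE + 1]; // constant arithmetic
    let d = [1u8; <Fubar as Foo>::BAR]; // associated const of a concrete type
    (a.len(), b.len(), c.len(), d.len())
}

// Assumed to still fail on stable, because the length depends on a generic parameter:
// fn generic<T: Foo>() { let _ = [1u8; T::BAR]; }

fn main() {
    println!("{:?}", lengths());
}
```

The distinction that matters here is between a length the compiler can evaluate without knowing any type parameter and one that depends on T.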

Related

Does ccall really convert arguments passed by pointer?

Considering a dynamic library with this native function that returns the sum of all even (32-bit unsigned) numbers in an array:
uint32_t sum_of_even(const uint32_t *numbers, size_t length);
The implementation of the function above was written in Rust as below, and packaged into a C dynamic library.
use libc::size_t;
use std::slice;
#[no_mangle]
pub extern "C" fn sum_of_even(n: *const u32, len: size_t) -> u32 {
    let numbers = unsafe {
        assert!(!n.is_null());
        slice::from_raw_parts(n, len as usize)
    };
    numbers
        .iter()
        .filter(|&v| v % 2 == 0)
        .sum()
}
I wrote the following Julia (v1.0.1) wrapper function:
lib = Libdl.dlopen(libname)
sumofeven_sym = Libdl.dlsym(lib, :sum_of_even)
sumofeven(a) = ccall(
sumofeven_sym,
UInt32,
(Ptr{UInt32}, Csize_t),
a, length(a)
)
The documentation states multiple times that arguments in ccall are converted to become compatible with the C function prototype (emphasis mine):
Each argvalue to the ccall will be converted to the corresponding argtype, by automatic insertion of calls to unsafe_convert(argtype, cconvert(argtype, argvalue)). (See also the documentation for unsafe_convert and cconvert for further details.) In most cases, this simply results in a call to convert(argtype, argvalue).
And moreover, that when passing an Array{T} by Ptr{U} to a C function, the call is invalidated if the two types T and U are different, since no reinterpret cast is added (section Bits Types):
When an array is passed to C as a Ptr{T} argument, it is not reinterpret-cast: Julia requires that the element type of the array matches T, and the address of the first element is passed.
Therefore, if an Array contains data in the wrong format, it will have to be explicitly converted using a call such as trunc(Int32, a).
However, this is seemingly not the case. If I deliberately pass an array with another type element:
println(sumofeven(Float32[1, 2, 3, 4, 5, 6]))
The program calls the C function with the array passed directly, without converting the values or complaining about the different element types, resulting in either nonsensical output or a segmentation fault.
If I redefine the function to accept a Ref{UInt32} instead of a Ptr{UInt32}, I am prevented from calling it with the array of floats:
ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Float32,1} to an object of type UInt32
Closest candidates are:
convert(::Type{T<:Number}, !Matched::T<:Number) where T<:Number at number.jl:6
convert(::Type{T<:Number}, !Matched::Number) where T<:Number at number.jl:7
convert(::Type{T<:Integer}, !Matched::Ptr) where T<:Integer at pointer.jl:23
...
However, Ref was not designed for arrays.
Making the example work with Ptr{UInt32} requires me to either specify Array{UInt32} as the type of input a (static enforcement), or convert the array first for a more flexible function.
sumofeven(a:: Array{UInt32}) = ccall( # ← either this
sumofeven_sym,
UInt32,
(Ptr{UInt32}, Csize_t),
convert(Array{UInt32}, a), # ← or this
length(a))
With that, I still feel that there is a gap in my reasoning. What is the documentation really suggesting when it says that an array passed to C as a Ptr{T} is not reinterpret-cast? Why is Julia letting me pass an array of different element types without any explicit conversion?
This turned out to be either a bug in the core library or very misleading documentation, depending on the perspective (issue #29850). The behavior of the function unsafe_convert changed from version 0.4 to 0.5 in a way that makes it more flexible than the documentation currently suggests.
According to this commit, unsafe_convert changed from this:
unsafe_convert(::Type{Ptr{Void}}, a::Array) = ccall(:jl_array_ptr, Ptr{Void}, (Any,), a)
To this:
unsafe_convert{S,T}(::Type{Ptr{S}}, a::AbstractArray{T}) = convert(Ptr{S}, unsafe_convert(Ptr{T}, a))
For arrays, this relaxed implementation will enable a transformation from an array of T to a pointer of another type S. In practice, unsafe_convert(cconvert(array)) will reinterpret-cast the array's base pointer, as in the C++ nomenclature. We are left with a dangerously reinterpreted array across the FFI boundary.
The key takeaway is that one needs to take extra care when passing arrays to C functions, as the element type of an array in a C-call function parameter is not statically enforced. Use type signatures and/or explicit conversions where applicable.
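To make the difference between converting and reinterpreting concrete, here is a small sketch on the Rust side. It is not the library code from the question: sum_of_even is rewritten to take a safe slice and to use wrapping addition (an assumption, so that overflow cannot panic), and reinterpret and convert are hypothetical helpers that mimic what the callee sees in each case.

```rust
// Same filtering logic as the exported function, but on a safe slice and
// with wrapping addition (an assumption made for this sketch).
fn sum_of_even(numbers: &[u32]) -> u32 {
    numbers
        .iter()
        .filter(|&&v| v % 2 == 0)
        .fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// What the callee effectively sees when a Float32 buffer is passed through
// a pointer to UInt32: the raw bit patterns of the floats.
fn reinterpret(floats: &[f32]) -> Vec<u32> {
    floats.iter().map(|f| f.to_bits()).collect()
}

// What an explicit element-wise conversion would hand over instead.
fn convert(floats: &[f32]) -> Vec<u32> {
    floats.iter().map(|&f| f as u32).collect()
}

fn main() {
    let input = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    println!("converted:     {}", sum_of_even(&convert(&input)));
    println!("reinterpreted: {}", sum_of_even(&reinterpret(&input)));
}
```

The converted call sums 2, 4 and 6, while the reinterpreted call sums bit patterns such as 0x3F800000 for 1.0, which is the kind of nonsensical result described in the question.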

D straight array indexed by an enum

In my D program, I have a read-only array of fixed length and I wish to index the array by an enumerated type.
If I do something like
static const my_struct_t aray[ my_enum_t ] = ... whatever ...;
my_enum_t index;
result = aray[ index ];
then the code produced by GDC is huge, full of calls to the runtime when the array is indexed. So it looks as if the array is being treated either as variable-length or as an associative array (hash table) or something, anyway far from a lightweight C-style array of fixed length with straightforward indexing. Since enums have a fixed cardinality and can't grow, and I have a modest sparse range of values (I'm not misusing the keyword enum just to define a load of random constants), I don't know why this happened.
I fixed the issue by changing the line to
static const my_struct_t aray[ my_enum_t.max + 1 ]
and as I understand it, that means the value in the square brackets is just a known constant of integral type. Since the index is now not an enum at all, I now have an array indexed by an integer, so I have lost type checking: I could index it with any random integer-typed variable rather than ensuring that only the correct (strong) type is used.
What should I be doing?
In the more general case, (silly example)
static const value_t aray[ bool ] = blah
for example, where I have an index type that is perfectly sensible semantically, but not just a typeless size_t/int/uint, I presume I would get the same problem.
I wouldn't want to say that this is a compiler design problem. It's certainly a case of sub-optimal behaviour. But to be fair to the compiler, what exactly is telling it whether the array is fixed-length or variable, and sparse or dense? I want two things: type checking of the index and non-variable length. Actually, in this particular case the array is const (I could have put immutable just as well), so it clearly can't be variable-length anyway. But with an array that has modifiable content but is of fixed length, you need to be able to declare that it is fixed-length.
V[K] name is the syntax for an associative array which does indeed do runtime calls and such, even when the type is limited to a small number of values like bool or an enum. The compiler probably could optimize that, making it act to the program like an AA while implementing it as a simple fixed-length array, but it doesn't; it treats all key types the same.
I would suggest going with what you started: T[enum.max + 1], but then doing a wrapper if you want to force type safety. You can make the index overloads static if you only want one instance of it:
enum Foo {
    one,
    two
}
struct struct_t {}
struct array {
    static private struct_t[Foo.max + 1] content;
    static struct_t opIndex(Foo idx) { return content[cast(int) idx]; }
}
void main() {
    struct_t a = array[Foo.one];
}
Then, you can just genericize that if you want simpler reuse.
struct enum_array(Key, Value) {
    static private Value[Key.max + 1] content;
    static Value opIndex(Key idx) { return content[cast(int) idx]; }
}
alias array = enum_array!(Foo, struct_t);
Or, of course, you don't need to make it static, you could do a regular instance too, and initialize the contents inside and such.
In D, both static and dynamic arrays are indexed by size_t, just like they would be in C and C++. And you can't change the type of the index in D any more than you can in C or C++. So, in D, if you put a type between the brackets in the array declaration, you're defining an associative array and not a static array. If you want a static array, you must provide an integer literal or compile-time constant, and there is no way to require that a naked, static array be indexed by an enum type that has a base type of size_t or a type that implicitly converts to size_t.
If you want to require that your static array be indexed by a type other than size_t, then you need to wrap it in a struct or class and control access to the static array via member functions. You could overload opIndex to take your enum type and treat your struct type as if it were a static array. The effect is then essentially what you were trying to achieve by putting the enum type in the static array declaration, except that it is the member function that takes the enum value and indexes the static array with it, rather than anything being done to the static array itself.

What is the purpose of Rust's function parameter syntax over C's?

Weird title, anyway, in C function parameters are as follows:
void func_name(int a, int b) {
}
However in Rust:
fn func_name(a: int, b: int) {
}
Is this just a preference in syntax that appealed to the creators of Rust, or is this for a specific purpose that I don't know about? For example, Go has "optional semi-colons", but they are actually to show when an expression ends. Please bear in mind that I'm a complete novice at Rust, so if you try to provide some fancy examples in Rust I probably won't understand :(
The declaration of a function argument is just a special case of variable declarations in Rust, therefore the answer to your question lies in variable declaration in general.
Let us start with C:
a b = 1;
a = 2;
From a grammar point of view, C is not quite regular:
in a b = 1;, a is the type and b is the name of a new variable being declared (and initialized)
in a = 2;, a is the name of a variable that was declared previously and is now either initialized or assigned a new value (overwriting the previous one).
Therefore, in C, knowing whether a is a type or a variable name requires looking ahead (i.e., if it is followed by another identifier then it's a type, otherwise it's a variable).
Now, in Rust:
let a = 1;
a = 2;
The syntax for introducing a new variable requires using the let keyword, there is no ambiguity and no need to look ahead to disambiguate. This is all the more important because of shadowing in Rust (let a = ...; let a = a.foo;).
The question was about types though, so let's extend the example:
let a: b = 1;
a = 2;
In this case, again, there is no need to look ahead. Immediately after let comes the variable name, and only after parsing a : comes the variable type.
Therefore, the syntax of Rust is simply meant to avoid lookahead (Rust aims at having an LL(1) syntax), and the syntax of function arguments simply follows the regular syntax.
Oh, and by the way, not all arguments have a type:
impl Foo {
    fn doit(&self) {}
}
In normal Rust code a variable is declared this way:
let x : i32 = 0;
The C style is not possible because the type is optional, so the former is equivalent to this one:
let x = 0i32;
You need the let keyword to declare the intention of declaring a name.
In a function declaration the type is mandatory, the initialization is not allowed and the let keyword makes no sense. Other than that the syntax is the same:
fn foo(x : i32)
It would be weird to have a different syntax for declaring local variables and function arguments, don't you think?
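The points made in both answers can be put side by side in one small sketch. Everything in it (add, Counter, demo) is a made-up name used only for illustration; it simply shows that let bindings and function parameters share the same name-then-type order, that the type is optional only for let, and that &self is written without an explicit type.

```rust
// Function parameters use the same `name: Type` order as `let`,
// but here the type is mandatory.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

struct Counter {
    n: i32,
}

impl Counter {
    // `&self` is the one parameter written without an explicit type.
    fn get(&self) -> i32 {
        self.n
    }
}

fn demo() -> i32 {
    let x: i32 = 1; // `let`, then the name, then an optional type
    let y = 2; // type omitted and inferred
    let y = y * 10; // shadowing: a second `let` introduces a new `y`
    let mut z = add(x, y);
    z = z + Counter { n: 1 }.get(); // plain assignment, no `let`
    z
}

fn main() {
    println!("{}", demo());
}
```

In every line the first token (let, fn, or an existing name) already tells the parser what kind of statement follows, which is the lookahead argument made above.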

Declaring array using a constant expression for its size

I have a newtype wrapper around an array. I assumed that I could use size_of instead of manually passing the size of the array around, but the compiler thinks I'm wrong.
use std::mem::{size_of, size_of_val};
#[repr(C, packed)]
struct BluetoothAddress([u8, ..6]);
fn main() {
    const SIZE: uint = size_of::<BluetoothAddress>();
    let bytes = [0u8, ..SIZE];
    println!("{} bytes", size_of_val(&bytes));
}
I'm using the nightly: rustc 0.13.0-nightly (7e43f419c 2014-11-15 13:22:24 +0000)
This code fails with the following error:
broken.rs:9:25: 9:29 error: expected constant integer for repeat count, found variable
broken.rs:9 let bytes = [0u8, ..SIZE];
^~~~
error: aborting due to previous error
The Rust Reference on Array Expressions makes me think that this should work:
In the [expr ',' ".." expr] form, the expression after the ".." must be a constant expression that can be evaluated at compile time, such as a literal or a static item.
Your SIZE definition is not legal; it’s just that the errors in it occur later than the error on the array construction. If you change [0u8, ..SIZE] to [0u8, ..6] just so that that part works, you find the problems with the SIZE declaration:
<anon>:7:24: 7:53 error: function calls in constants are limited to struct and enum constructors [E0015]
<anon>:7 const SIZE: uint = size_of::<BluetoothAddress>();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<anon>:7:24: 7:51 error: paths in constants may only refer to items without type parameters [E0013]
<anon>:7 const SIZE: uint = size_of::<BluetoothAddress>();
^~~~~~~~~~~~~~~~~~~~~~~~~~~
You simply can’t call size_of like that at present.
An alternative is to invert things so that SIZE is the canonical definition and the other places use it:
use std::mem::{size_of, size_of_val};
const SIZE: uint = 6;
#[repr(C, packed)]
struct BluetoothAddress([u8, ..SIZE]);
fn main() {
    let bytes = [0u8, ..SIZE];
    println!("{} bytes", size_of_val(&bytes));
}
Update: With Rust 1.0, this question has been effectively obsoleted, and the compiler error messages have been improved so that they are even more clear.
Furthermore, with #42859 recently landed, rustc nightly will allow using size_of in a constant context, provided the crate has #![feature(const_fn)] (and when #43017 lands, that won’t be needed any more either, and then it will filter through to stable).
In other words, improvements to the language have made this no longer an issue.
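For comparison, here is a sketch of how the original program might look on a current stable toolchain, using today's [T; N] syntax and usize in place of the pre-1.0 forms. It assumes that size_of is usable in constant context without any feature flag, which is what the update above anticipates; buffer_len is a hypothetical helper added so the result can be checked.

```rust
use std::mem::{size_of, size_of_val};

#[allow(dead_code)] // the wrapped bytes are never read in this sketch
#[repr(C, packed)]
struct BluetoothAddress([u8; 6]);

// `size_of` is evaluated at compile time here.
const SIZE: usize = size_of::<BluetoothAddress>();

// Hypothetical helper: builds a buffer whose length comes from SIZE
// and reports how many bytes it occupies.
fn buffer_len() -> usize {
    let bytes = [0u8; SIZE];
    size_of_val(&bytes)
}

fn main() {
    println!("{} bytes", buffer_len());
}
```

With this, the newtype stays the canonical definition and SIZE is derived from it, which is what the question originally set out to do.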

Declare a constant array

I have tried:
const ascii = "abcdefghijklmnopqrstuvwxyz"
const letter_goodness []float32 = { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
const letter_goodness = { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
const letter_goodness = []float32 { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
The first declaration and initialization works fine, but the second, third and fourth don't work.
How can I declare and initialize a const array of floats?
An array isn't immutable by nature; you can't make it constant.
The nearest you can get is:
var letter_goodness = [...]float32 {.0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697, .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599, .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007 }
Note the [...] instead of []: it ensures you get a (fixed size) array instead of a slice. So the values aren't fixed but the size is.
As pointed out by @jimt, the [...]T syntax is sugar for [123]T. It creates a fixed size array, but lets the compiler figure out how many elements are in it.
From Effective Go:
Constants in Go are just that—constant. They are created at compile time, even when defined as locals in functions, and can only be numbers, characters (runes), strings or booleans. Because of the compile-time restriction, the expressions that define them must be constant expressions, evaluatable by the compiler. For instance, 1<<3 is a constant expression, while math.Sin(math.Pi/4) is not because the function call to math.Sin needs to happen at run time.
Slices and arrays are always evaluated during runtime:
var TestSlice = []float32 {.03, .02}
var TestArray = [2]float32 {.03, .02}
var TestArray2 = [...]float32 {.03, .02}
[...] tells the compiler to figure out the length of the array itself. Slices wrap arrays and are easier to work with in most cases. Instead of using constants, just make the variables inaccessible to other packages by using a lowercase first letter:
var ThisIsPublic = [2]float32 {.03, .02}
var thisIsPrivate = [2]float32 {.03, .02}
thisIsPrivate is available only in the package in which it is defined. If you need read access from outside, you can write a simple getter function (see Getters in golang).
There is no such thing as array constant in Go.
Quoting from the Go Language Specification: Constants:
There are boolean constants, rune constants, integer constants, floating-point constants, complex constants, and string constants. Rune, integer, floating-point, and complex constants are collectively called numeric constants.
A constant expression (which is used to initialize a constant) may contain only constant operands and is evaluated at compile time.
The specification lists the different types of constants. Note that you can create and initialize constants with constant expressions of types having one of the allowed types as the underlying type. For example this is valid:
package main
import "fmt"
func main() {
    type Myint int
    const i1 Myint = 1
    const i2 = Myint(2)
    fmt.Printf("%T %v\n", i1, i1)
    fmt.Printf("%T %v\n", i2, i2)
}
Output (try it on the Go Playground):
main.Myint 1
main.Myint 2
If you need an array, it can only be a variable, but not a constant.
I recommend this great blog article about constants: Constants
As others have mentioned, there is no official Go construct for this. The closest I can imagine would be a function that returns a slice. In this way, you can guarantee that no one will manipulate the elements of the original slice (as it is "hard-coded" into the array).
I have shortened your slice to make it...shorter...:
func GetLetterGoodness() []float32 {
return []float32 { .0817,.0149,.0278,.0425,.1270,.0223 }
}
In addition to @Paul's answer above, you can also do the following if you only need access to individual elements of the array (i.e. if you don't need to iterate on the array, get its length, or create slices out of it).
Instead of
var myArray = [...]string{ /* ... */ }
you can do
func myConstArray(n int) string {
return [...]string{ /* ... */ }[n]
}
and then instead of extracting elements as
str := myArray[i]
you extract them as
str := myConstArray(i)
Link on Godbolt: https://godbolt.org/z/8hz7E45eW (note how in the assembly of main no copy of the array is done, and how the compiler is able to even extract the corresponding element if n is known at compile time - something that is not possible with normal non-const arrays).
If, instead, you need to iterate on the array or create slices out of it, @Paul's answer is still the way to go¹ (even though it will likely have a significant runtime impact, as a copy of the array needs to be created every time the function is called).
This is unfortunately the closest thing to const arrays we can get until https://github.com/golang/go/issues/6386 is solved.
¹ Technically speaking you can also do it with the const array as described in my answer, but it's quite ugly and definitely not very efficient at runtime: https://go.dev/play/p/rQEWQhufGyK
