Does ccall really convert arguments passed by pointer?

Considering a dynamic library with this native function that returns the sum of all even (32-bit unsigned) numbers in an array:
uint32_t sum_of_even(const uint32_t *numbers, size_t length);
The implementation of the function above was written in Rust as below, and packaged into a C dynamic library.
use libc::size_t;
use std::slice;
#[no_mangle]
pub extern "C" fn sum_of_even(n: *const u32, len: size_t) -> u32 {
let numbers = unsafe {
assert!(!n.is_null());
slice::from_raw_parts(n, len as usize)
};
numbers
.iter()
.filter(|&v| v % 2 == 0)
.sum()
}
I wrote the following Julia (v1.0.1) wrapper function:
lib = Libdl.dlopen(libname)
sumofeven_sym = Libdl.dlsym(lib, :sum_of_even)
sumofeven(a) = ccall(
sumofeven_sym,
UInt32,
(Ptr{UInt32}, Csize_t),
a, length(a)
)
The documentation states multiple times that arguments in ccall are converted to become compatible with the C function prototype (emphasis mine):
Each argvalue to the ccall will be converted to the corresponding argtype, by automatic insertion of calls to unsafe_convert(argtype, cconvert(argtype, argvalue)). (See also the documentation for unsafe_convert and cconvert for further details.) In most cases, this simply results in a call to convert(argtype, argvalue).
And moreover, that when passing an Array{T} by Ptr{U} to a C function, the call is invalidated if the two types T and U are different, since no reinterpret cast is added (section Bits Types):
When an array is passed to C as a Ptr{T} argument, it is not reinterpret-cast: Julia requires that the element type of the array matches T, and the address of the first element is passed.
Therefore, if an Array contains data in the wrong format, it will have to be explicitly converted using a call such as trunc(Int32, a).
However, this is seemingly not the case. If I deliberately pass an array with another type element:
println(sumofeven(Float32[1, 2, 3, 4, 5, 6]))
The program calls the C function with the array passed directly, neither converting the values nor complaining about the different element types, resulting in either nonsensical output or a segmentation fault.
If I redefine the function to accept a Ref{UInt32} instead of a Ptr{UInt32}, I am prevented from calling it with the array of floats:
ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Float32,1} to an object of type UInt32
Closest candidates are:
convert(::Type{T<:Number}, !Matched::T<:Number) where T<:Number at number.jl:6
convert(::Type{T<:Number}, !Matched::Number) where T<:Number at number.jl:7
convert(::Type{T<:Integer}, !Matched::Ptr) where T<:Integer at pointer.jl:23
...
However, Ref was not designed for arrays.
Making the example work with Ptr{UInt32} requires me to either specify Array{UInt32} as the type of input a (static enforcement), or convert the array first for a more flexible function.
sumofeven(a:: Array{UInt32}) = ccall( # ← either this
sumofeven_sym,
UInt32,
(Ptr{UInt32}, Csize_t),
convert(Array{UInt32}, a), # ← or this
length(a))
With that, I still feel that there is a gap in my reasoning. What is the documentation really suggesting when it says that an array passed to C as a Ptr{T} is not reinterpret-cast? Why is Julia letting me pass an array of different element types without any explicit conversion?

This turned out to be either a bug in the core library or very misleading documentation, depending on the perspective (issue #29850). The behavior of the function unsafe_convert changed from version 0.4 to 0.5, in a way that makes it more flexible than what the documentation currently suggests.
According to this commit, unsafe_convert changed from this:
unsafe_convert(::Type{Ptr{Void}}, a::Array) = ccall(:jl_array_ptr, Ptr{Void}, (Any,), a)
To this:
unsafe_convert{S,T}(::Type{Ptr{S}}, a::AbstractArray{T}) = convert(Ptr{S}, unsafe_convert(Ptr{T}, a))
For arrays, this relaxed implementation will enable a transformation from an array of T to a pointer of another type S. In practice, unsafe_convert(cconvert(array)) will reinterpret-cast the array's base pointer, as in the C++ nomenclature. We are left with a dangerously reinterpreted array across the FFI boundary.
The key takeaway is that one needs to take extra care when passing arrays to C functions, as the element type of an array in a C-call function parameter is not statically enforced. Use type signatures and/or explicit conversions where applicable.
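To make the danger concrete, here is a minimal C sketch of what the native side sees. It assumes IEEE 754 single-precision floats and uses a C stand-in for sum_of_even (the original is written in Rust); reinterpret_as_u32 is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* C stand-in for the Rust function: sums the even values in the array. */
uint32_t sum_of_even(const uint32_t *numbers, size_t length) {
    assert(numbers != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; i < length; i++) {
        if (numbers[i] % 2 == 0) {
            sum += numbers[i]; /* unsigned arithmetic wraps modulo 2^32 */
        }
    }
    return sum;
}

/* Hypothetical helper: copies the raw bits of a float into a uint32_t,
   which is what the callee effectively reads when handed a Float32 array. */
uint32_t reinterpret_as_u32(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

With these definitions, sum_of_even over the integers 1 through 6 gives 12, whereas reinterpret_as_u32(1.0f) is 0x3F800000 on IEEE 754 platforms, so an array of floats is summed as six large, unrelated integers.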

Related

Can I pass char* array instead of several arguments to a function?

I am trying to write a program that will run a function from a dynamic library with its name, arguments and their types inputted by the user. I can get a void* pointer to this function using dlsym(func_name), however I do not know the types of arguments and their amount beforehand, so I can't dereference this pointer the normal way.
If I copy the byte representations of all the arguments into a char * array, can I call a function with this array as its arguments? For example when calling printf in assembly I can push the arguments and the format string to the stack and call printf. Maybe I can do something similar in C, or use assembly insertions?
For example, let's say the function is void func(int, double); and I am given arguments int a and double b. Then I would create an array:
char buf[sizeof(int) + sizeof(double)];
memcpy(buf, &a, sizeof(int));
memcpy(buf + sizeof(int), &b, sizeof(double));
And use it as the arguments pushed onto a stack.
The user inputs the name of the function, a string describing its type, and the arguments themselves. For example:
func_name
viid
5
6
2.34
func_name is the name of the function which will be used in dlsym. viid means that the function has void func_name(int, int, double) type (first char is the return type, the rest are arguments. v = void, i = int, d = double). The rest are arguments.
No, except possibly in ABIs that pass arguments purely on the stack. Any computing environment has a specification of how arguments are passed. (This specification is part of an Application Binary Interface [ABI] specification for the platform.) The argument types and sizes affect whether they are passed in general registers, floating-point registers, or other locations, as well as how they are aligned. Few, if any, such specifications pass arguments as a raw dump of bytes to the stack.
Code can be written to decode a description of the arguments and construct the necessary state to call the function. Such code would need to be written at least partially in assembly or various language extensions. (I would expect some people may have already written such code and made it available but do not have references for such.) This is likely an excessively costly solution to implement for most situations; you should seek alternative ways to achieve your underlying goal.
If an ABI does specify that all arguments are put on the stack, then it might be possible to pass arbitrary arguments in a structure containing an array of unsigned char:
A maximum size must be determined, such as struct { unsigned char bytes[128]; }, because standard C does not support structures of variable size.
The structure must be aligned as required for whatever argument has the strictest alignment.
The bytes of all arguments must be properly aligned within the structure.
The function would be called by passing the structure itself as an argument. Take care that the ABI must say that even large structures are passed directly on the stack. Some might say that a pointer to (a copy of) the structure is passed.
Note that ABIs might include other details affecting function calls, such as saying that a certain register must contain the number of arguments or be set a certain way, even if it does not itself contain an argument.
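A portable middle ground, when the set of possible signatures is small and known in advance, is to decode the type string and dispatch through correctly typed function pointers. The sketch below assumes the question's encoding (first character is the return type; v = void, i = int, d = double). The names arg_t, generic_fn, call_by_signature, and sample_viid are hypothetical, and only two signatures are handled.

```c
#include <assert.h>
#include <string.h>

/* One decoded argument; the signature string says which member is valid. */
typedef union {
    int i;
    double d;
} arg_t;

/* Generic function pointer type; it is cast back to the real type before the call. */
typedef void (*generic_fn)(void);

/* Hypothetical target standing in for a symbol obtained from dlsym. */
static double last_total = 0.0;
void sample_viid(int a, int b, double c) {
    last_total = a + b + c;
}

/* Calls fn according to sig. Returns 0 on success, -1 for an unsupported signature. */
int call_by_signature(generic_fn fn, const char *sig, const arg_t *args) {
    assert(fn != NULL && sig != NULL && args != NULL);
    if (strcmp(sig, "vid") == 0) {
        ((void (*)(int, double))fn)(args[0].i, args[1].d);
        return 0;
    }
    if (strcmp(sig, "viid") == 0) {
        ((void (*)(int, int, double))fn)(args[0].i, args[1].i, args[2].d);
        return 0;
    }
    return -1; /* not in the table: would need assembly or an FFI library */
}
```

Each cast restores the function's real type before the call, so the compiler emits the register and stack setup that the ABI requires for that signature. The cost is that every supported signature must be listed by hand.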

No compiler warning by inconsistent function parameter use

During a C code review of software that would not run on the target HW, the following inconsistency was discovered in the use of a function:
The SW component 1 implements the function:
void foo(uint8 par_var[2])
The function foo() also writes both elements of the par_var array.
The SW component 2 gets an external declaration of foo() as follows
extern void foo(uint8 *par_var)
and uses it as following :
uint8 par_var;
foo(&par_var); // passes a pointer to a scalar
// instead of a pointer to an array of 2 elements
Obviously, this can lead, and in fact does lead, to program failure.
The question is whether the compiler or linker could catch this inconsistency, for example by issuing a warning.
I have looked through and tried some of the gcc (Cygwin) compiler options beyond the standard ones (-Wall, -pedantic), see https://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Warning-Options.html,
but could not find one that issues a corresponding warning.
There is a C feature that could aid a compiler in diagnosing this, but I am not aware of a compiler that takes advantage of it and warns. If foo were declared as void foo(uint8 par_var[static 2]);, then a caller is required to pass a pointer to at least two elements, per C 2018 6.7.6.3 7:
If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.
So, a compiler seeing uint8 par_var; foo(&par_var); could recognize the failure to pass two elements and provide a warning. (While I am not aware of compilers that check the declared size, some compilers will warn when a null pointer is passed for such a parameter.)
As is well known, in the declaration void foo(uint8 par_var[2]), par_var is automatically adjusted to uint8 *par_var. As an alternative, instead of passing a pointer to uint8, you could pass a pointer to an array of uint8 by declaring foo as void foo(uint8 (*par_var)[2]);.
Then you would have to pass it an array, such as:
uint8 A[2];
foo(&A);
If it were called with a pointer to uint8, the compiler should issue a warning. Unfortunately, this also constrains the routine; you must pass it a pointer to an array of two uint8 and cannot pass it a pointer to a larger array or a pointer to a uint8 in a larger array. So it has limited use. Nonetheless, it could serve in certain situations.

ccall with Array type in signature calling a struct in C from Julia

I am having trouble with calling a C function from Julia. This may be generally useful question, but I'll describe it here in the concrete setting I am struggling with. I am trying to create a bson object:
BSONObject("{a:1}")
so this object's constructor is called:
BSONObject(jsonString::String) = begin
jsonCStr = bytestring(jsonString)
bsonError = BSONError()
_wrap_ = ccall(
(:bson_new_from_json, libbson),
Ptr{Void}, (Ptr{Uint8}, Csize_t, Ptr{Uint8}),
jsonCStr,
length(jsonCStr),
bsonError._wrap_
)
_wrap_ != C_NULL || error(bsonError)
bsonObject = new(_wrap_, None)
finalizer(bsonObject, destroy)
return bsonObject
end
in the https://github.com/pzion/LibBSON.jl/blob/master/src/BSONObject.jl LibBSON package needed to handle MongoDB queries, but the setting is not particularly important. What is important is the ccall, which passes a string, jsonCStr, this string's length, and bsonError._wrap_. This last object comes from https://github.com/pzion/LibBSON.jl/blob/master/src/BSONError.jl and is an array:
type BSONError
_wrap_::Vector{Uint8}
function BSONError()
return new(Array(Uint8, 512))
end
end
created in the above constructor of BSONError object, an array of 512 Uint8's. This Julia bsonError._wrap_ refers to the following struct in C:
typedef struct
{
uint32_t domain;
uint32_t code;
char message[504];
} bson_error_t;
see on http://api.mongodb.org/libbson/current/bson_error_t.html, and this struct is of length 4 + 4 + 504 = 512, so it looks OK.
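The size arithmetic can be checked at compile time. The sketch below mirrors the struct under the hypothetical name bson_error_layout_t (so that it does not clash with the real libbson header) and assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same member layout as the bson_error_t shown above. */
typedef struct {
    uint32_t domain;
    uint32_t code;
    char message[504];
} bson_error_layout_t;

/* 4 + 4 + 504 = 512, with no padding expected between these members. */
static_assert(sizeof(bson_error_layout_t) == 512, "buffer must be 512 bytes");
/* message starts at byte offset 8 (0-based), i.e. index 9 in a 1-based Julia array. */
static_assert(offsetof(bson_error_layout_t, message) == 8, "message follows two uint32_t");
```

If either assertion failed on some platform, a 512-byte Julia buffer would no longer line up with the C struct.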
Now going back to the ccall, its type signature is OK: Ptr{Uint8} points to the string, Csize_t is the type of its size, and Ptr{Uint8} points to the struct. The latter, however, produces an error message:
LoadError: MethodError: `convert` has no method matching convert(::Type{Ptr{UInt8}}, ::Array{UInt8,1})
This may have arisen from a call to the constructor Ptr{UInt8}(...),
since type constructors fall back to convert methods.
Closest candidates are:
call{T}(::Type{T}, ::Any)
convert{T<:Union{Int8,UInt8}}(::Type{Ptr{T<:Union{Int8,UInt8}}}, !Matched::Cstring)
convert{T}(::Type{Ptr{T}}, !Matched::UInt64)
...
while loading In[2], in expression starting on line 1
in convert at /Users/szalmaf/.julia/v0.4/LibBSON/src/BSONError.jl:21
in call at /Users/szalmaf/.julia/v0.4/LibBSON/src/BSONObject.jl:33
apparently, trying to convert the Array to a type Ptr{UInt8}.
The Julia manual http://julia.readthedocs.org/en/latest/manual/calling-c-and-fortran-code/#mapping-c-types-to-julia says in the 'Mapping C Types to Julia' section's 'Bits Types' subsection that a Julia Array{T,N} should be passed as Ptr{T}, where T is UInt8 in this case. So the Julia ccall looks ok, but there is still that error message. It is a pretty burning problem since it prevents more complex queries in the database. Any suggestions as to how to remedy this ccall problem?
P.S. Note if you install the Mongo package it comes with the LibBSON package and the libbson C library.
The problem is unrelated to the ccall. It is caused by this line, as indicated in the stack trace.
In 0.4, there is no longer a convert{T}(::Type{Ptr{T}}, ::Array{T}) method.
ccall converts its arguments using the (non-exported) unsafe_convert function (which is why the above code doesn't cause an error). If you want a Ptr object in user code, the easiest way is to use the pointer function.
It turns out that BSONError.jl had a bug. The line
return bytestring(convert(Ptr{Uint8}, bsonError._wrap_[9:end]))
should be
return bytestring(bsonError._wrap_[9:end])
in the convert function of BSONError.jl. That is, the error could not be printed because the error-printing code itself was buggy.
After fixing that, the error message received from C shows that the correct bson format is
BSONObject("{\"a\":1}")
to create a bson object.

Declare a constant array

I have tried:
const ascii = "abcdefghijklmnopqrstuvwxyz"
const letter_goodness []float32 = { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
const letter_goodness = { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
const letter_goodness = []float32 { .0817,.0149,.0278,.0425,.1270,.0223,.0202, .0609,.0697,.0015,.0077,.0402,.0241,.0675, .0751,.0193,.0009,.0599,.0633,.0906,.0276, .0098,.0236,.0015,.0197,.0007 }
The first declaration and initialization works fine, but the second, third and fourth don't work.
How can I declare and initialize a const array of floats?
An array isn't immutable by nature; you can't make it constant.
The nearest you can get is:
var letter_goodness = [...]float32 {.0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697, .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599, .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007 }
Note the [...] instead of []: it ensures you get a (fixed size) array instead of a slice. So the values aren't fixed but the size is.
As pointed out by @jimt, the [...]T syntax is sugar for [123]T. It creates a fixed size array, but lets the compiler figure out how many elements are in it.
From Effective Go:
Constants in Go are just that—constant. They are created at compile time, even when defined as locals in functions, and can only be numbers, characters (runes), strings or booleans. Because of the compile-time restriction, the expressions that define them must be constant expressions, evaluatable by the compiler. For instance, 1<<3 is a constant expression, while math.Sin(math.Pi/4) is not because the function call to math.Sin needs to happen at run time.
Slices and arrays are always evaluated during runtime:
var TestSlice = []float32 {.03, .02}
var TestArray = [2]float32 {.03, .02}
var TestArray2 = [...]float32 {.03, .02}
[...] tells the compiler to figure out the length of the array itself. Slices wrap arrays and are easier to work with in most cases. Instead of using constants, just make the variables inaccessible to other packages by using a lower case first letter:
var ThisIsPublic = [2]float32 {.03, .02}
var thisIsPrivate = [2]float32 {.03, .02}
thisIsPrivate is available only in the package it is defined. If you need read access from outside, you can write a simple getter function (see Getters in golang).
There is no such thing as array constant in Go.
Quoting from the Go Language Specification: Constants:
There are boolean constants, rune constants, integer constants, floating-point constants, complex constants, and string constants. Rune, integer, floating-point, and complex constants are collectively called numeric constants.
A constant expression (which is used to initialize a constant) may contain only constant operands and is evaluated at compile time.
The specification lists the different types of constants. Note that you can create and initialize constants with constant expressions of types having one of the allowed types as the underlying type. For example this is valid:
func main() {
type Myint int
const i1 Myint = 1
const i2 = Myint(2)
fmt.Printf("%T %v\n", i1, i1)
fmt.Printf("%T %v\n", i2, i2)
}
Output (try it on the Go Playground):
main.Myint 1
main.Myint 2
If you need an array, it can only be a variable, but not a constant.
I recommend this great blog article about constants: Constants
As others have mentioned, there is no official Go construct for this. The closest I can imagine would be a function that returns a slice. In this way, you can guarantee that no one will manipulate the elements of the original slice (as it is "hard-coded" into the array).
I have shortened your slice to make it...shorter...:
func GetLetterGoodness() []float32 {
return []float32 { .0817,.0149,.0278,.0425,.1270,.0223 }
}
In addition to @Paul's answer above, you can also do the following if you only need access to individual elements of the array (i.e. if you don't need to iterate on the array, get its length, or create slices out of it).
Instead of
var myArray = [...]string{ /* ... */ }
you can do
func myConstArray(n int) string {
return [...]string{ /* ... */ }[n]
}
and then instead of extracting elements as
str := myArray[i]
you extract them as
str := myConstArray(i)
Link on Godbolt: https://godbolt.org/z/8hz7E45eW (note how in the assembly of main no copy of the array is done, and how the compiler is able to even extract the corresponding element if n is known at compile time - something that is not possible with normal non-const arrays).
If instead, you need to iterate on the array or create slices out of it, @Paul's answer is still the way to go¹ (even though it will likely have a significant runtime impact, as a copy of the array needs to be created every time the function is called).
This is unfortunately the closest thing to const arrays we can get until https://github.com/golang/go/issues/6386 is solved.
¹ Technically speaking you can also do it with the const array as described in my answer, but it's quite ugly and definitely not very efficient at runtime: https://go.dev/play/p/rQEWQhufGyK

How to convert GMP C parameter convention into something more natural?

For example, I would like to do something like this:
#include <gmp.h>
typedef mpz_t Integer;
//
Integer F(Integer a,Integer b,Integer c,Integer d) {
Integer ret = times(plus(a,b),plus(c,d));
return ret;
}
But, GMP doesn't let me do this, apparently mpz_t is an array, so I get the error:
error: ‘F’ declared as function returning an array
So instead I would have to do something like this:
void F(Integer ret,Integer a,Integer b,Integer c,Integer d) {
Integer tmp1,tmp2;
plus(tmp1,a,b);
plus(tmp2,c,d);
times(ret,tmp1,tmp2);
}
This is unnatural, and not following the logical way that C (or in general mathematical) expressions can be composed. In fact, you can't compose anything in a math-like way because apparently you can't return GMP numbers! If I wanted to write - for example - a simple yacc/bison style parser that converted a simple syntax using +, -, /, * etc. into C code implementing the given expressions using GMP it seems it would be much more difficult as I would have to keep track of all the intermediate values.
So, how can I force GMP to bend to my will here and accept a more reasonable syntax? Can I safely "cheat" and cast mpz_t to a void * and then reconstitute it at the other end back into mpz_t? I'm assuming from reading the documentation that it is not really passing around an array, but merely a reference, so why can't it return a reference as well? Is there some good sound programming basis for doing it this way that I should consider in writing my own program?
From gmp.h:
typedef __mpz_struct mpz_t[1];
This makes a lot of sense, and is pretty natural. Think about it: having an
array of size 1 allows you to deal with an obscured pointer (known as opaque
reference) and all its advantages:
mpz_t number;
DoubleIt(number); /* DoubleIt() operates on `number' (modifies it) as
it will be passed as a pointer to the real data */
Were it not an array, you'd have to do something like:
mpz_t number;
DoubleIt(&number);
And then comes all the confusion. The intention behind the opaque type is
to hide these details, so you don't have to worry about them. And one of the main
concerns should be clear: size (which leads to performance). Of course you
can't return by value a struct whose data is limited only by the available memory. What
about this one (consider mpz_t here as a "first-class" type):
mpz_t number = ...;
number = DoubleIt(number);
You (the program) would have to copy all the data in number and push it as a
parameter to your function. Then it needs to leave appropriate space for
returning another number even bigger.
Conclusion: as you have to deal with data indirectly (with pointers) it's
better to use an opaque type. You'll be passing only a reference to your
functions, but you can operate on them as if the whole concept was
pass-by-reference (C itself passes arguments by value; an array argument is passed as a pointer to its first element).

Resources