Proper way of passing array parameters to D functions - arrays

1st Question:
Are D array function parameters always passed by reference, or by value?
Also, does the language implement copy-on-write for arrays?
E.g.:
void foo(int[] arr)
{
// is arr a local copy or a ref to an external array?
arr[0] = 42; // How about now?
}
2nd Question:
Suppose I have a large array that will be passed to function foo as a read-only parameter. Since the array is assumed to be very large, copying it should be avoided as much as possible. Which of the following declarations (if any) would be the best for function foo:
void foo(const int[] bigArray)
void foo(in int[] bigArray)
void foo(const ref int[] bigArray)

Technically, a dynamic array like int[] is just a pointer and a length. Only the pointer and length get copied onto the stack, not the array contents. An arr[0] = 42; does modify the original array.
On the other hand, a static array like int[30] is a plain old data type consisting of 30 consecutive ints in memory. So, a function like void foo(int[30] arr) would copy 120 bytes onto the stack for a start. In such a case, arr[0] = 42; modifies the local copy of the array.
Given the above, each of the declarations you listed avoids copying the array contents. So, whether you need the parameter to be const, in, const ref or otherwise depends on what you are trying to achieve besides avoiding the array copy. For example, if you pass a ref int[] arr parameter, not only can you modify its contents, but you can also modify the pointer and length (for example, create a wholly new array and assign it to arr so that it is visible from outside the function).
For further information, please refer to the corresponding articles on the DLang site covering arrays and array slices.
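As a rough illustration of the pointer-and-length description above, here is a sketch in C rather than D. The IntSlice and IntArray30 types and both functions are hypothetical stand-ins invented for this example, not D's actual runtime layout: the first models a dynamic array as a pointer plus a length, the second models int[30] as a plain block of 30 ints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a D dynamic array: a pointer and a length. */
typedef struct {
    int *ptr;
    size_t length;
} IntSlice;

/* Hypothetical model of a D static array int[30]: the elements themselves. */
typedef struct {
    int items[30];
} IntArray30;

/* Only ptr and length are copied; the elements are shared with the caller. */
void foo_slice(IntSlice arr)
{
    assert(arr.length > 0);
    arr.ptr[0] = 42;   /* visible to the caller */
    arr.length = 0;    /* changes the local copy only */
}

/* All 30 elements are copied; the caller's array is untouched. */
void foo_static(IntArray30 arr)
{
    arr.items[0] = 42; /* changes the local copy only */
}
```

Calling foo_slice with a slice over {1, 2, 3} leaves the caller's first element at 42 and its length at 3, while calling foo_static leaves the caller's array unchanged. Passing a pointer to the IntSlice instead would roughly correspond to a ref parameter, letting the function change the caller's pointer and length too.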

Related

Modifying swift [CChar] arrays in C functions without returning value

I started learning C and wanted to try some of the Swift-C interoperability.
I have a small C function which reads me a file and concatenates some useful letters into a char* variable. After some testing, I cannot find a way to pass my obtained char* data back to swift. I have written a small dummy code to illustrate what I am trying to achieve.
var letters: [CChar] = []
functionWithArray(&letters)
print("Back in swift: \(letters)")
And the C function is:
void functionWithArray(char* letters) {
int arrayLenght = 5;
int testLenght = 10; // Expand array to this value (testing)
int currentArrayPosition = 0; //Keep track of the assigned values
letters = malloc(sizeof(char)*arrayLenght);
while (currentArrayPosition < testLenght) {
if (currentArrayPosition == arrayLenght) {
arrayLenght++;
letters = realloc(letters, sizeof(char)*arrayLenght);
}
letters[currentArrayPosition] = *"A";
++currentArrayPosition;
}
printf("End of C function: %s\n", letters);
}
I get this as an output:
End of C function: AAAAAAAAAA
Back in swift: []
Program ended with exit code: 0
As you can see, inside the C function I've got the desired result, but back in swift I could not find a way to obtain the modified array. I do not return letters directly with the function because I need to return more values from that function. I'm new to C so please be kind.
There are two main issues with your approach here — one in C and one in Swift:
In C, function parameters are passed by value, and are effectively mutable local variables. That means that when functionWithArray receives char *letters, letters is a local variable containing a pointer value to the buffer of letters in memory. Importantly, that means that letters is assignable, but not in the way that you think:
letters = malloc(sizeof(char)*arrayLenght);
allocates an entirely new buffer through malloc, and assigns the newly-created pointer value to your local letters variable. Before the assignment, letters is a pointer to the buffer you were getting from Swift; after, to an unrelated buffer in memory. These two buffers are completely unrelated to one another, and because letters is just a local variable, this assignment is not propagated in any way outside of the function.
Note that this is just a rule of C: as you learn more C, you'll likely discover that in order to assign a variable from inside of a function to outside of a function, you need to wrap the variable in another layer of pointers and write through that pointer (e.g., you would need to receive char **letters and assign *letters = malloc(...) to have any effect on a variable being passed in — and the variable couldn't be passed in directly, but rather, its address would need to be passed in).
However, you can't generally make use of this fact because,
The implicit conversion of an Array<T> to an UnsafeMutablePointer<T> (e.g. [CChar] → UnsafeMutablePointer<CChar> in Swift == char * in C) does not allow you to assign an entirely new buffer to the array instance. You can write into the contents of the buffer by writing to pointer values, but you cannot allocate a new block of memory and reassign the contents of the array to that new block
Instead, you'll need to either:
1. Have functionWithArray return an entirely new array and length from C. You mention this isn't possible for functionWithArray specifically because of the other values it needs to return, but theoretically you can also create a C struct which wraps up all of the return values together and return one of those instead.
2. Rewrite functionWithArray to receive an array and a length, and pre-reserve enough space in the array up-front to fill it appropriately:
var letters: [CChar] = []
letters.reserveCapacity(/* however much you need */)
functionWithArray(&letters, letters.capacity)
In functionWithArray, don't reassign letters, but instead fill it up to the capacity given to you with results. Of course, this will only work if you know in Swift ahead of time how much space functionWithArray will need, which you might not
Alternatively, you can also use Array.init(unsafeUninitializedCapacity:initializingWith:) to combine these operations by having Array preallocate some space, and you can pass in the inout UnsafeMutableBufferPointer<CChar> to C where you can allocate memory if you need to and assign to the buffer pointer, then write out to the inout Int how many array elements you allocated and initialized. This does also require a capacity, though, and is a more complicated solution
Of these two approaches, if functionWithArray really does need to dynamically reallocate memory and grow the buffer, then (1) is likely going to be easier.
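For the C side only, the two shapes discussed above might look roughly like the following sketch. The names make_letters and fill_letters, their signatures, and the choice of filling with 'A' are assumptions made for illustration; they are not the asker's actual function, and the Swift call site is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Variant A: the function allocates. The caller passes the ADDRESS of its
   pointer, so the assignment through *out_letters is visible outside.
   Returns 0 on success, -1 on allocation failure. The caller must free(). */
int make_letters(size_t count, char **out_letters, size_t *out_length)
{
    assert(out_letters != NULL && out_length != NULL);
    char *buffer = malloc(count + 1);   /* +1 for the terminating NUL */
    if (buffer == NULL) {
        return -1;
    }
    memset(buffer, 'A', count);
    buffer[count] = '\0';
    *out_letters = buffer;              /* write through the extra pointer level */
    *out_length = count;
    return 0;
}

/* Variant B: the caller owns the buffer and passes its capacity.
   The function never reassigns letters; it only writes into it.
   Returns the number of letters written, not counting the NUL. */
size_t fill_letters(char *letters, size_t capacity)
{
    assert(letters != NULL);
    if (capacity == 0) {
        return 0;
    }
    size_t count = capacity - 1;        /* leave room for the NUL */
    memset(letters, 'A', count);
    letters[count] = '\0';
    return count;
}
```

Variant A hands ownership of the malloc'd buffer to the caller, so something on the calling side has to free it eventually. Variant B avoids that question but only works when the caller can pick a capacity up front.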

Does D distinguish between return/argument types of dynamic vs. static arrays?

Suppose I define a function, mutate, which replaces a random index's contents of an int array, a, with some function applied
import std.random : uniform;
int[] mutate(int[] a) {
// Pick a random valid index in [0, a.length); assumes a is non-empty.
auto randomIndex = uniform(0, a.length);
a[randomIndex] = a[randomIndex] + 1;
return a;
}
Does this function specify input and return values of dynamic int array, static int array, or both? That is, is this function limited to accepting and returning either subtype of array? Is there a way to distinguish between dynamic and static arrays as arguments to a function?
Do either of the following throw an error?
void main() {
int[] dyn;
dyn = [1, 2, 3];
writeln(mutate(dyn));
int[3] stat = [1,2,3];
writeln(mutate(stat));
}
int[] mutate(int[] a)
That takes a slice and returns a slice. A slice does not necessarily refer to a dynamic array; it might refer to a static array, though then you need to pass it as stat[] instead of just stat.
A slice is like a ptr and length combo in C: a pointer to data (which may reside anywhere, a dynamic array, a malloc array, a static array, some block of memory, whatever) and a count of the length.
When you return one like that, you do need to be kinda careful not to store it. The slice doesn't know where it is stored and you might easily lose track of who owns it and end up using a bad pointer! So be sure the scope is safe when doing something like this.
Read this for more info:
http://dlang.org/d-array-article.html

passing multidimensional array as argument in C

C newbie here, I need some help: Can anyone explain to me (and offer a workaround for) why this works:
int n=1024;
int32_t data[n];
void synthesize_signal(int32_t *data) {
...//do something with data}
which let me alter data in the function; but this does not?
int n=1024;
int number=1024*16;
int32_t data[n][2][number];
void synthesize_signal(int32_t *data) {
...//do something with data}
The compiler error message is something like it expected int32_t * but got int32_t (*)[2][(sizetype)(number)] instead.
First, arrays in C are effectively passed by reference: when you pass an array to a function, it decays to a pointer to its first element, so the function can modify the data in the array. You don't have to worry about passing a pointer to the array explicitly. In fact, inside the called function there is no real difference between a pointer that happens to point to the beginning of an array and the array itself.
In your first version, you are making a one-dimensional array data[n] and passing it to your function. In the function, you'll use it by saying something like data[i]. This translates directly to *(data + i), where the compiler scales i by sizeof(int32_t) to get the byte offset. It uses the size of the elements in the array to find the memory location that is i positions past the beginning of your array.
int n=1024;
int number=1024*16;
int32_t data[n][2][number];
void synthesize_signal(int32_t *data)
In the second case, you're setting up a multi-dimensional array (3D in your case). You set it up correctly. The problem is that when you pass it to the function, the only thing that gets passed is the address of the beginning of the array. When it gets used inside the function, you'll do something like
data[i][1][x] = 5;
Internally, C calculates how far from the beginning of the array this location is. To do that, it needs to know the dimensions of the array. (Unlike some newer languages, C does not store any extra data about array lengths or sizes.) You just need to change the function signature so it knows the shape of the array to expect. Because of the way it calculates array positions, it doesn't need the first dimension.
In this case, change your function signature to look like this:
void synthesize_signal(int32_t data[][2][number]) { ...
Set up the array the same way you do in the second example above, and just call it as you'd expect:
synthesize_signal(data);
This should fix everything for you.
The comments mention some useful points about using more descriptive variable names and global vs. local variables. All are valid comments to keep in mind. I just addressed the code problem you're having with multi-dimensional arrays.
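To make the offset calculation described above concrete, here is a small sketch. The dimensions CHANNELS and SAMPLES, the extra n parameter, the fill pattern, and the flat_index helper are all assumptions chosen for illustration (compile-time constants are used so the example does not depend on variable-length arrays).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately small inner dimensions. */
#define CHANNELS 2u
#define SAMPLES  4u

/* The inner dimensions are part of the parameter type; the first is not,
   so the number of outer rows is passed separately as n. */
void synthesize_signal(int32_t data[][CHANNELS][SAMPLES], size_t n)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; ++i)
        for (size_t c = 0; c < CHANNELS; ++c)
            for (size_t s = 0; s < SAMPLES; ++s)
                data[i][c][s] = (int32_t)(i * 100 + c * 10 + s);
}

/* The element offset the compiler derives for data[i][c][s]:
   only the inner dimensions appear in the formula. */
size_t flat_index(size_t i, size_t c, size_t s)
{
    return (i * CHANNELS + c) * SAMPLES + s;
}
```

With these dimensions, data[2][1][3] sits (2 * 2 + 1) * 4 + 3 = 23 elements past the start of the array, which is why the first dimension can be left out of the signature.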
Try
void synthesize_signal(int32_t *data)
{
}
Your function also needs to know that data is multi-dimensional, since with a plain int32_t * it has to compute the flat index itself (an int32_t ** would not match the array's memory layout). You should also consider renaming your data array. I suspect that it is a global variable, and using the same name in the function can lead to problems.
When you call the function, do it like this:
synthesize_signal(&data[0][0][0]);

Is passing a static array to a function efficient?

#define BUFF_SIZE 100000
unsigned char buffer[BUFF_SIZE];
void myfunc(unsigned char[],int,int);
void myfuncinfunc(unsigned char[],int,int);
int main()
{
int a = 10, b = 10;
myfunc(buffer,a,b);
}
void myfunc(unsigned char array[],int a,int b)
{
int m,n;
//blah blah
myfuncinfunc(array,m,n);
}
void myfuncinfunc(unsigned char array[],int a, int b)
{
//blah blah
}
I wish to know the following:
1. I have created a static array as seen above the 'main' function. Is this efficient? Would it be better if I used a pointer and malloc instead?
2. I know it doesn't use the stack, so when I pass the array into inner functions, would it create a copy of the whole array or just send the location of the first entry?
3. When working on 'array' in the function 'myfunc', am I working directly with the static defined array or some local copy?
4. Inside the function 'myfunc', when we pass the array into the function 'myfuncinfunc', would it again, send only the first location or a complete copy of the array into the stack?
Thanks for reading the question and would greatly appreciate any help! I'm new to C and trying to learn it off the internet.
1. I don't see how it would be more or less efficient than an array on the heap.
2. It decays into a pointer to the first entry.
3. Therefore it's not a local copy, it's the array itself.
4. Ditto.
By the way, if a and b are indexes within the array, consider using the size_t type for them (it's an unsigned integer type guaranteed to be big enough for indexing arrays).
I have created a static array as seen above the 'main' function. Is this efficient? Would it be better if I used a pointer and malloc instead?
Define "efficient". Statically allocated arrays avoid the runtime overhead of allocation and deallocation, so in that respect they are faster than dynamically allocated ones.
In this case, you allocate a huge amount of 100k bytes, which might be very memory-inefficient.
In addition, your process might not have that much static memory available, depending on OS. On desktop systems, it is therefore considered best practice to allocate on the heap whenever you are using large amounts of data.
I know it doesn't use the stack, so when I pass the array into inner functions, would it create a copy of the whole array or just send the location of the first entry?
You can't pass arrays by value in C. So a pointer to the first element of the array will be passed to the function (on the stack or in a register, depending on the calling convention).
When working on 'array' in the function 'myfunc', am I working directly with the static defined array or some local copy?
Directly on the static array. Again, you can't pass arrays by value.
Inside the function 'myfunc', when we pass the array into the function 'myfuncinfunc', would it again, send only the first location or a complete copy of the array into the stack?
A pointer to the first element.
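The answers above can be checked with a small sketch adapted from the question's code. The signatures here are changed for illustration (size_t lengths, a fill value, and myfunc returning the address it received), so treat them as assumptions rather than the asker's real functions.

```c
#include <assert.h>
#include <stddef.h>

#define BUFF_SIZE 100000
unsigned char buffer[BUFF_SIZE];

/* In a parameter list, "unsigned char array[]" means "unsigned char *array":
   only the address of the first element arrives here. */
void myfuncinfunc(unsigned char array[], size_t len, unsigned char value)
{
    for (size_t i = 0; i < len; ++i) {
        array[i] = value;   /* writes land in the caller's storage */
    }
}

/* Forwards the same pointer to the inner function and returns the address
   it received, so a caller can compare it with the original array. */
unsigned char *myfunc(unsigned char array[], size_t len)
{
    assert(array != NULL);
    myfuncinfunc(array, len, 0x7F);
    return array;
}
```

If myfunc(buffer, BUFF_SIZE) returns a pointer equal to buffer and the elements of buffer read back as 0x7F afterwards, then neither call made a copy of the 100000 bytes.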

How many asterisks should I use when declaring a pointer to an array of C-strings?

I am having a VB application request a list of users from a C DLL:
VB will ask the DLL how many users there are, and then initialize an array to the appropriate size.
VB will then pass its array by reference to a DLL function, which will fill it with usernames.
I started writing the C function like this: foo(char **bar); which would be treated as an array of strings. But then I realized, I'm going to make each item in the array point to a different C-string (the char *username in the struct userlist linked list) rather than modify the data already being pointed to. The array of arrays is being passed by value: a copy of a list of addresses, so the addresses point to the original data, but modifying the addresses in that copy won't change the list of addresses of the caller (I think, anyways). So, should I be declaring it foo(char ***bar);? This would be a pointer to the array of strings, so that if I change the strings that array is pointing to, it will modify the array of strings the caller (VB) is using....right?
This is my usage so far (haven't tested it yet... I'm still just coding the DLL as of yet, there's no VB front-end to call it thus far)
EXPORT void __stdcall update_userlist(char ***ulist){
int i = 0;
userlist *cur_user = userlist_head; //pointer to first item in linked list
for(; i < usercount_; ++i){
*ulist[i] = cur_user->username;
cur_user = cur_user->next;
}
}
In general it's not simple to do what you're asking, because VB just doesn't understand C-style ASCIIZ strings and arrays.
If your DLL is not expecting a VB SafeArray of BSTR, you're going to have some difficulty populating it.
It would be simple to have VB pass in an array of Long (C int) by reference to the first element, and you could fill that with the pointers to individual strings. The VB side could copy them to VB strings. But in that case, who disposes of the C strings, and when?
If you create the VB array and fill it with pre-sized strings, you'll still have to deal with a SafeArray on the C side, because you can't pass a single VB string array element by reference and expect to find the remaining strings contiguous to it in memory.
The best, safest method is to have your DLL create a SafeArray of so-called 'Ansi BSTR', and declare the function in VB as returning an array of strings. Then you don't need two calls, because the array bounds will tell the whole story.
===== edit =====
When VB passes a string array to a Declared function it does some voodoo behind the scenes. It first converts all the strings from Unicode to a bastard form commonly known as 'Ansi BSTR'. To C, these look like and can be treated as ASCIIZ or LPSTR except that you can't create or lengthen them in the normal C way, you can only fill them in. On the C side, the passed array looks like ppSA (SAFEARRAY**). The Ansi BSTR are a series of pointers referenced by the pData member of the SafeArray.
You absolutely cannot pass a single string from the array (as char*) and expect to find the rest of the strings contiguous to it in memory. You have to pass the array itself and manipulate it using the SafeArray API (or knowledge of the SA structure).
That's why the best option overall is to do all of this directly in the DLL. Create the array using SafeArrayCreate, then create Ansi BSTRs using SysAllocStringByteLen and place those strings (which are BSTR, so a 4-byte pointer) into the array slots. On return, VB does its voodoo and converts the strings to Unicode for you.
In VB your function would be Declared as returning a String().
Two asterisks is the way to go.
char* // is a pointer to a char
char** // is a pointer to a char pointer
char*** // is a pointer to a pointer to a char pointer - e.g. multi-dimensional array (err...)
I've confused myself :)
So let me get this straight. Your function fills in an array of strings from data contained in a linked list ?
If you know the size of the list beforehand, you can just pass a char **, but if you do not know the size and need to be able to grow the list, you will need a char ***.
From looking at your code, you seem to already know the length, so you just need to allocate an array of the correct length before you call the function. Here is an example:
void update_userlist(char **ulist)
{
int i = 0;
userlist *cur_user = userlist_head;
for(; i < usercount_; ++i)
{
ulist[i] = cur_user->username; // I am assuming that username is a char *
cur_user = cur_user->next;
}
}
// This sets up the array and calls the function.
char **mylist = malloc(sizeof(char*) * usercount_);
update_userlist(mylist);
Update: Here is the difference between the various levels of pointers:
void func1(char *data)
This passes a copy of a pointer to a C string. If you change the pointer to point to a different string, the calling function will still point to the original string.
void func2(char **data)
This passes a copy of a pointer to an array of pointers to C strings. You can replace the pointer to any string in the array and the calling function's array will be changed because it has not made a copy of the array, it only points to the caller's array.
void func3(char ***data)
This passes a pointer to a pointer to an array of pointers to C strings. With this, you can completely replace the entire array. You would only need this level of indirection if you need to grow the array since C arrays cannot be re-sized.
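The difference between the char ** and char *** levels can be sketched as follows. The userlist layout, the function names fill_userlist and grow_userlist, and the explicit capacity and head parameters are assumptions made for this example (the question's code uses globals instead).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout of the linked list node from the question. */
typedef struct userlist {
    char *username;
    struct userlist *next;
} userlist;

/* char **: the caller's array already exists; only its slots are filled.
   Returns the number of slots written. */
size_t fill_userlist(char **ulist, size_t capacity, const userlist *head)
{
    assert(ulist != NULL);
    size_t count = 0;
    for (const userlist *cur = head; cur != NULL && count < capacity; cur = cur->next) {
        ulist[count++] = cur->username;
    }
    return count;
}

/* char ***: the function may replace the caller's array pointer itself,
   here by growing it with realloc. Returns 0 on success, -1 on failure. */
int grow_userlist(char ***ulist, size_t new_capacity)
{
    assert(ulist != NULL);
    char **grown = realloc(*ulist, new_capacity * sizeof **ulist);
    if (grown == NULL) {
        return -1;          /* the caller's original array is still valid */
    }
    *ulist = grown;         /* visible to the caller through the extra level */
    return 0;
}
```

When the extra level is used, note that *ulist[i] parses as *(ulist[i]); indexing the caller's array through a char *** needs parentheses, as in (*ulist)[i].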
