How does Swift implement Array's copy-on-write behavior? - arrays

I watched Building Better Apps with Value Types in Swift. In the Photoshop example they made, they said that
the only thing that gets copied in the two instances of that diagram are the tiles that contain the person's shirt. So even though I have two distinct documents, the old state and the new state, the only new data that I have had to consume as a result of that is the tiles contained in this person's shirt.
So I began to wonder what these two arrays would look like in memory, and I did a little experiment.
struct Test {
var i: Int
var j: Int
}
var valueArray = [Test(i: 1, j: 9), Test(i: 2, j: 7)]
var valueArray2 = valueArray
When I print the addresses of valueArray and valueArray2, they are not the same.
"Maybe they implement this by storing pointers in the array?"
But when I print the memory contents using lldb, they are actually just 4 Ints (1, 9, 2, 7).
So I am confused: I haven't even changed the array yet, and they seem to have made a copy of the entire array. What did I misunderstand?
To print a struct's address I used the function provided by @nschum in this question.
func address(o: UnsafePointer<Void>) {
let addr = unsafeBitCast(o, Int.self)
print(NSString(format: "%p", addr))
}
This is not a duplicate of this question. I am asking about a language feature, and the other one is about programming technique.

Okay, I did many experiments and finally figured it out.
We can't use & to get an array's address, because once we do that Swift copies the array to better interact with C. Instead, use & to get the address of an object adjacent to the array and do the math, or use the lldb command frame variable -L.
The whole Array is copied once any of its elements is changed.
The actual elements of an Array are allocated on the heap.
Swift also does a lot of optimization for Arrays whose elements are classes.
Swift is awesome.
I actually wrote my first blog post about this.

After your comments and getting a better understanding of this, I loaded it up in a playground and it seems to be working as expected.
Original answer, for reference:
The thing to remember with this is that structs are basically chunks of data in memory. When you create the valueArray, a chunk of memory is set to the value it's assigned.
When you create valueArray2, you're creating a new instance of the struct, which means it will have a brand new chunk of memory, and you're then setting the value of that new chunk of memory to the same value of the chunk of memory from the valueArray. This results in a copy of the data in two different memory locations.
This is in contrast to an object, in which case valueArray would be a pointer to a chunk of memory, and when you create valueArray2 it would be creating a new pointer to the same chunk of memory.

Related

When to use a slice instead of an array in Go

I am learning Go. According to the documentation, slices are richer than arrays.
However, I am failing to grasp hypothetical use cases for slices.
What would be use case where one would use a slice instead of array?
Thanks!
This is really pretty elementary and probably should already have been covered in whatever documentation you're reading (unless it's just the language spec), but: A Go array always has a fixed size. If you always need 10 things of type T, [10]T is fine. But what if you need a variable number of things n, where n is determined at runtime?
A Go slice—which consists of two parts, a slice header and an underlying backing array—is pretty ideal for holding information needed to access a variable-sized array. Note that just declaring a slice-header variable:
var x []T
doesn't actually allocate any array of T yet: the slice header will be initialized to hold nil (converted to the right type) as the (missing) backing array, 0 as the current size, and 0 as the capacity of this array. As a result of this, the test x == nil will say that yes, x is nil. To get an actual array, you will need either:
an actual array, or
a call to make, or
use of the built-in append or similar (e.g., copy, append hidden behind some function, etc).
Since the call to make happens at runtime, it can make an array of whatever size is needed at this point. A series of calls to append can build up an array. Note that each call to append may have to allocate a new backing array, or may be able to extend the existing array in-place, depending on what's in the capacity. That's why you need x = append(x, elem) or x = append(x, elems...) and not just append(x, elem) or append(x, elems...).
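As a rough illustration of the points above (not taken from the Go blog), here is a small sketch. The helper name countGrowth is made up for this example, and the number of reallocations it reports depends on the growth strategy of the Go version in use, so no specific count is assumed.

```go
package main

import "fmt"

// countGrowth (hypothetical helper) appends n ints to a nil slice and counts
// how often the capacity changes, i.e. how often append had to move the
// data to a new, larger backing array.
func countGrowth(n int) ([]int, int) {
	var x []int // nil slice: no backing array, len 0, cap 0
	reallocations := 0
	lastCap := cap(x)
	for i := 0; i < n; i++ {
		x = append(x, i) // reassign: append may return a different slice header
		if cap(x) != lastCap {
			reallocations++
			lastCap = cap(x)
		}
	}
	return x, reallocations
}

func main() {
	var x []int
	fmt.Println(x == nil, len(x), cap(x)) // true 0 0

	// make allocates a backing array whose size is chosen at runtime.
	y := make([]int, 3, 10)
	fmt.Println(len(y), cap(y)) // 3 10

	// Slicing an actual array gives a slice that shares the array's storage.
	arr := [4]int{1, 2, 3, 4}
	z := arr[:]
	z[0] = 99
	fmt.Println(arr[0]) // 99

	s, grew := countGrowth(100)
	fmt.Println(len(s), grew > 0) // 100 true
}
```

So a fixed-size array suits "always exactly N things", and a slice suits everything whose length is only known, or changes, at runtime.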
The Go blog entry on slices has a lot more to say on this. I like this page more than the sequence of pages in the Go Tour starting here, but opinions vary.

How to avoid freeing objects that are stored in containers with the same reference count

I have been working on some features of a custom programming language written in C. Currently I'm working on a system that does reference counting for objects in the language, which in C are represented as structs containing, among other things, a reference count.
There also is a feature which can free all currently allocated objects (say before the exit of the program to clean up all memory). This is exactly where the problem lies.
I have been thinking about how best to do it, but I'm running into some problems. Let me sketch out the situation a bit:
2 new integers are allocated. Both have a reference count of 1.
1 new list is allocated, also with a reference count of 1.
Now both integers go in the list, which gives them a reference count of 2.
After these actions both integers go out of scope for some reason, so their reference count drops to 1, as they are still in the list.
Now I'm done with these objects, so I run the function to delete all tracked objects. However, as you might have noticed, both the list and the objects in the list have the same reference count (1). This means there is no way to decide which object to free first.
If I free the integers before the list, the list will try to decrement the reference count on integers that have already been freed, which will segfault.
If the list is freed before the integers, it decrements the reference count of the integers to 0, which automatically frees them too, and no further steps are needed to free the integers. They aren't tracked anymore.
Currently I have a system that works most of the time, but not for the example above: I free the objects in order of their reference count, highest count last. This obviously only works as long as the integers have a higher reference count than the list, which, as the example shows, is not always the case. (It only works if the integers haven't dropped out of scope, so that they still have a higher reference count than the list.)
Note: I have already found one way, which I really don't like: adding a flag to every object indicating that it is in a container and so can't be freed. I don't like this because it adds some memory overhead to every allocated object, and when there is a circular dependency no object would be freed. Of course a cycle detector could fix this, but preferably I'd like to do this with reference counting only.
Let me give a concrete example of the described steps above:
//this initializes and sets a garbage collector object.
//Basically it's a datastructure which records every allocated object,
//and is able to free them all or in the future
//run some cycle detection on all objects.
//It has to be set before allocating objects
garbagecollector *gc = init_garbagecollector();
set_garbagecollector(gc);
//initialize a tracked object from the C integer value 10
myobject * a = myinteger_from_cint(10);
myobject * b = myinteger_from_cint(10);
myobject * somelist = mylist_init();
mylist_append(somelist,a);
mylist_append(somelist,b);
// Simulate the going out of scope of the integers.
// There are no functions yet so I can't actually do it but this
// is a situation which can happen and has happened a couple of times
DECREF(a);
DECREF(b);
//now the program is done. all objects have a refcount of 1
//delete the garbagecollector and with that all tracked objects
//there is no way to prevent the integers being freed before the list
delete_garbagecollector(gc);
What of course should happen is that, 100% of the time, the list is freed before the integers are.
What would be a smarter way of freeing all existing objects, in a way such that objects stored in containers aren't freed before the containers they're in?
It depends on your intention with:
There also is a feature which can free all currently allocated objects (say before the exit of the program to clean up all memory).
If the goal is to forcibly deallocate every single object regardless of its ref count, then I would have a separate chunk of code that walks the object graph and frees each object without touching its ref count. The ref count itself is going to end up freed too, so there's little point in updating it.
If the goal is to just tell the system "We don't need the objects anymore" then another option is to simply walk the roots and decrement their ref counts. If there are no other references to them, they'll hit zero. They will then decrement the ref counts of everything they refer to before being deallocated. That in turn percolates through the object graph. If the roots are the only thing holding onto references at the point that you call this, it will effectively free everything.
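To make the second option concrete, here is a minimal model, written in Go rather than the asker's C. The object type and the helpers incref, decref, listAppend and releaseRoots are hypothetical stand-ins for the question's myobject API, and "freeing" is modeled by setting a flag instead of calling a real free().

```go
package main

import "fmt"

// object is a hypothetical stand-in for the question's myobject struct.
type object struct {
	refs     int
	children []*object // objects this one holds references to
	freed    bool      // models "memory has been released"
}

func incref(o *object) { o.refs++ }

// decref drops one reference. When the count reaches zero, the object first
// releases everything it refers to and only then is marked as freed.
func decref(o *object) {
	o.refs--
	if o.refs > 0 {
		return
	}
	for _, c := range o.children {
		decref(c)
	}
	o.children = nil
	o.freed = true
}

// listAppend stores item in list; the list takes its own reference.
func listAppend(list, item *object) {
	incref(item)
	list.children = append(list.children, item)
}

// releaseRoots decrements only the roots; contained objects are reached
// through their containers, so a container always goes before its contents.
func releaseRoots(roots []*object) {
	for _, r := range roots {
		decref(r)
	}
}

func main() {
	a := &object{refs: 1}
	b := &object{refs: 1}
	list := &object{refs: 1}
	listAppend(list, a)
	listAppend(list, b)

	// The integers "go out of scope": only the list still references them.
	decref(a)
	decref(b)

	// At shutdown the list is the only root left.
	releaseRoots([]*object{list})
	fmt.Println(a.freed, b.freed, list.freed) // true true true
}
```

The point of the sketch is that the shutdown routine never compares reference counts at all. It only needs to know which objects are roots, which in the example is just the list.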
You should not free anything until the reference count for somelist is zero.

How does Go guarantee element pointers are valid for arrays and slices?

When you index an array or slice, you get back a variable, so you can take its address. I wonder how that is possible, because the array/slice could be declared in a more deeply nested scope than the pointer variable:
// ptr declaration here
{
// array declaration here
ptr = &array[0];
}
In the array case I see the problem that the data is on the stack. With a slice, allocating it on the heap does not automatically solve the problem, because the GC could remove the entire slice unless taking the address of an element links to the slice itself (thus preventing the memory from being freed).
Example: what happens when there is no guarantee on the validity of the pointers. Let's say my array is a collection of colors. I pick one element, take its address, the entire array is deleted (because it went out of scope), I check the value of the element and it is 3.14. Or "hello world". Or maybe green. Since there is no guarantee, it could be anything that is located at the given address.
The Go compiler and the Go garbage collector guarantee that memory is not freed until it is no longer used.
To learn the basics of garbage collection, the Go team recommends The Garbage Collection Handbook: The Art of Automatic Memory Management.
See The Go Blog: Getting to Go: The Journey of Go's Garbage Collector for some history.

Reinitialising an array after creation

Out of pure interest, why do most programming languages not allow the programmer to reinitialise an array after its creation?
Example
int apples[4][4]
apples[0][1] = "blue"
apples = apples[8][8] // Reinitialise the array with a new size of 8x8
apples[7][4] = "purple"
Explanation of what I mean
As you can see above, I create an array that is 4x4, then I assign a value, then I reinitialise that same array with a new size of 8x8, then I assign another value. In theory, I'd prefer that it destroy the contents of the old array (so my new 8x8 array doesn't have that value at 0x1).
However, I've searched high and low, yet I've not managed to find anything that explains why programming languages enforce this restriction. In my eyes it seems greatly beneficial to allow this and I can't see any immediate issues. But clearly there is an issue otherwise this would be allowed.
Question
So my question is: What's the reason that programming languages do not allow programmers to reinitialise an array after its creation?
Initialization happens once, when the array is created. After that the program is in its running phase, so assigning to the same array a second time is no longer initialization.
In languages with fixed-size arrays, the size also needs to be a compile-time constant, because other code may already depend on it:
int i = arr.length;
if (i < 5)
{
// do something
}
You can resize a dynamic array with realloc() in C, or use a collection such as ArrayList, but that is not reinitialization.
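For comparison, here is a sketch in Go, where the distinction is visible in the type system: an array's length is part of its type and cannot change, while a slice variable can simply be assigned a freshly allocated grid. The helper newGrid is a made-up name for this example.

```go
package main

import "fmt"

// newGrid (hypothetical helper) allocates a rows x cols grid of empty strings.
func newGrid(rows, cols int) [][]string {
	g := make([][]string, rows)
	for i := range g {
		g[i] = make([]string, cols)
	}
	return g
}

func main() {
	apples := newGrid(4, 4)
	apples[0][1] = "blue"

	// "Reinitialise" by pointing the variable at a brand-new 8x8 grid.
	// The old 4x4 grid is no longer referenced and will be collected.
	apples = newGrid(8, 8)
	apples[7][4] = "purple"

	// The old value at [0][1] is gone, as the question wanted.
	fmt.Println(apples[0][1] == "", apples[7][4]) // true purple
}
```

This is a new allocation bound to the same name, not the old storage being resized in place.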

C# -- Create Managed Array from Pointer

I'm trying to create a managed array of doubles from an array of bytes. I have the program working currently, but I wanted to optimize it. Here's some code that I would like to work:
private unsafe static double[] _Get_Doubles(byte[] _raw_data)
{
double[] ret;
fixed (byte* _pd = _raw_data)
{
double* _pret = (double*)_pd;
ret = (double[])*_pret; //FAILURE
}
}
Please let me know how to cope with these problems.
-Aaron
One of the key things to notice about the code you have posted is that there is no way to know how many items are pointed to by the return value, and a managed array needs to know how big it is. You can return a double* or create a new double[XXX] and copy the values or even (if the count is constant) create a struct with a public fixed double _data[2]; member and cast the raw data to that type.
Just now, I thought that stackalloc would be the right way, but it fails. Most importantly, I now know that it was doomed to fail. There is no way to do what I want to do.
This can be seen by restating the question:
How can I create a managed array around an 'unsafe' array?
Since a managed array has header information (because it's a class around a chunk of memory), it requires more space in memory than the array itself. So, the answer is:
Allocate space before (and/or after? depending on the way managed arrays are stored in memory) the array itself and put the managed information (length, (et cetera)) around the 'unsafe' array.
This is not easily possible because to guarantee that there is data enough around the array is shaky at best. In my particular example there may be enough space for it because a managed byte[] is passed in meaning that there is data around the array, but to assert that the same data is appropriate for managed double[] is dubious at best, but most likely erroneous, and to change the data to make it appropriate for managed double[] is nefarious.
[EDIT]
It looks like Marshal.Copy is the way to go here. Create a new array and let Marshal copy them (hoping that he will be quicker than me, or that perhaps at some later date, he will be quicker):
var ret = new double[_raw_data.Length / sizeof(double)];
System.Runtime.InteropServices.Marshal.Copy(new System.IntPtr(_pret), ret, 0, ret.Length);

Resources