OpenCL - copy Tree to device memory


I have implemented a binary search tree in C. Each of my tree nodes looks like this:
typedef struct treeNode {
int key;
struct treeNode *right;
struct treeNode *left;
} treeNode_t;
The tree is constructed by the host and queried by the device.
Now, let's assume that I have already finished building my tree in host memory.
I want to copy the root of my tree to the memory of my device.
Copying the root of the tree itself isn't enough, because its right and left children aren't located in device memory. This is a problem.
So, my question is: what is the easiest way to copy my whole tree to device memory?

The easiest (and likely also best) way is to change your structure to use node indexes instead of pointers. The issue with pointers is that the device has different pointers and even if you copy all nodes separately, it would still not work as the pointers also need to be updated to device pointers. And unfortunately OpenCL 1.2 does not even guarantee that device pointers stay valid longer than a single kernel invocation. For this reason you have to use indexes instead of pointers at least on the device.
Modify your structure like this:
typedef struct treeNode {
int key;
int left;
int right;
} treeNode_t;
Before you build the tree you allocate one big array of tree nodes, large enough to hold all nodes.
treeNode_t nodes[MAX_NODES]; // or dynamic allocation
int first_free_node=0;
Every time you would normally allocate a new node, you now use nodes[first_free_node] to store the data and increment the first_free_node counter. When you are done building your tree, you can use a single clEnqueueWriteBuffer call to copy all nodes to the device (clEnqueueCopyBuffer only copies between two device buffers, so it is not the right call for host memory). You only need to copy first_free_node*sizeof(treeNode_t) bytes from the start of the nodes array to the device. If you cannot change your host tree-building code, you can use a simple recursive depth-first traversal of the tree to count the number of nodes and convert the nodes from the pointer-based format to the index-based format.
On some devices you might get higher performance if you convert the structure of your tree from array of structures to structure of arrays. Padding the structure to 16 bytes per node could also help.
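The depth-first conversion described above can be sketched as follows. This is only an illustration under some assumptions: the names ptrNode_t, idxNode_t, flatten and find are made up for this example, and -1 is chosen here as the "no child" marker (the answer does not specify one). After the conversion, the first used*sizeof(idxNode_t) bytes of the array are what you would upload to the device.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer-based node, as built on the host (hypothetical name). */
typedef struct ptrNode {
    int key;
    struct ptrNode *left;
    struct ptrNode *right;
} ptrNode_t;

/* Index-based node that is safe to copy to the device.
 * Assumption: -1 marks a missing child. */
typedef struct idxNode {
    int key;
    int left;
    int right;
} idxNode_t;

/* Depth-first conversion. Writes nodes into out[] starting at *next_free
 * and returns the index of the converted subtree root, or -1 for NULL. */
static int flatten(const ptrNode_t *n, idxNode_t *out, int capacity, int *next_free)
{
    if (n == NULL)
        return -1;
    assert(*next_free < capacity);
    int me = (*next_free)++;
    out[me].key = n->key;
    out[me].left = flatten(n->left, out, capacity, next_free);
    out[me].right = flatten(n->right, out, capacity, next_free);
    return me;
}

/* Iterative lookup using indexes only, the same logic a kernel could run.
 * Returns the index of the matching node, or -1 if the key is absent. */
static int find(const idxNode_t *nodes, int root, int key)
{
    int i = root;
    while (i != -1 && nodes[i].key != key)
        i = key < nodes[i].key ? nodes[i].left : nodes[i].right;
    return i;
}
```

Because the lookup never dereferences a pointer, the same array works unchanged on the host and on the device.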

If your device supports OpenCL 2.0 then you can use Shared Virtual Memory. The pointers created on the host will be valid on the device too. Here is the description and the binary search tree example: opencl-2-shared-virtual-memory.

Related

memory usage of a particular function in c program

I am doing a data structures project in C in which I use functions such as insertion into a tree, deletion from a tree, and finding a particular value in a tree. I have to calculate the memory usage of every individual function, for example the memory used by the insertion function, the deletion function, etc. Kindly guide me if there is any library or built-in function to calculate memory usage. I have found material on measuring the memory usage of a whole program, but I am concerned with the memory usage of a particular function.

Your question is vague, but I will try to answer it as I understand it.
The memory allocated for a tree with, say, 20 nodes is 20 times the node size. Let's say our node looks like this:
typedef struct node
{
int val;
struct node * next;
} node_t;
Let's assume you have a 64-bit system: an int typically takes 4 bytes and a pointer 8 bytes. That is 12 bytes of data, but the compiler usually pads the struct so that the pointer stays 8-byte aligned, so sizeof(node_t) is typically 16 bytes.
For our example, a tree with 20 nodes will consume about 320 bytes of memory, not counting the allocator's own bookkeeping for each malloc call.
When you add nodes to the tree you are allocating more memory, so the memory usage grows.
If you need to keep track of how much memory each function uses, calculate each function's usage separately (remember that each function stores its variables locally, not globally) and add up the memory allocated or freed in the process.
Let's say a function that adds a node looks like this:
void addNode(node_t *leaf, int a)
{
    /* allocate one node; the size is sizeof(node_t) */
    node_t *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return; /* allocation failed, nothing was added */
    new_node->val = a;
    new_node->next = NULL;
    leaf->next = new_node; /* link it after the given node */
}
This function adds sizeof(node_t) bytes to your overall memory usage.
You can keep a counter in your main function to track the memory in use, pass it to each function by pointer, and add sizeof(node_t) to it after each insertion or subtract it after each deletion.
If I understand your question, you don't need to keep track of local memory, because it no longer matters after the function returns. You just need to keep track of memory added or removed after the initial allocation for the tree you receive.
To my knowledge there is no standard C function that can do this for you, so you will have to add it yourself.
My suggestion is to break your program into many small functions and, at the end of each, record what it has done to the overall memory in use.
You can also keep track of local variables, but they are released every time a function returns (unless they point to allocated memory).
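The counter idea above could look roughly like this. It is a minimal sketch, not a library facility: tracked_malloc, tracked_free, bytes_in_use and add_node are hypothetical names introduced for this example, and the counter only sees requested sizes, not the allocator's internal overhead.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int val;
    struct node *next;
} node_t;

/* Running total of bytes requested through the wrappers below. */
static size_t bytes_in_use = 0;

/* malloc wrapper that records the requested size on success. */
static void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        bytes_in_use += size;
    return p;
}

/* free wrapper; the caller passes the size it originally requested. */
static void tracked_free(void *p, size_t size)
{
    if (p != NULL) {
        assert(bytes_in_use >= size);
        bytes_in_use -= size;
        free(p);
    }
}

/* Links a new node after leaf; returns the node, or NULL on failure. */
static node_t *add_node(node_t *leaf, int a)
{
    node_t *n = tracked_malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->val = a;
    n->next = NULL;
    leaf->next = n;
    return n;
}
```

Reading bytes_in_use before and after a call gives the net heap memory that one function added or released.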

How to represent a buffer as linked list in C

I have preallocated buffer (array of chars) to which strings are being written, it looks like this:
"This\0buffer\0contains\0strings\00000..."
I want to be able to remove the last string, so I came up with the idea to represent each string in the buffer as a linked list node with pointers to its start and end, so I would just find the last node and fill the region it holds with zeros.
struct node
{
char *str_start;
char *str_end;
struct node *next;
};
It looks like a very common problem, however, I failed to find such implementations anywhere.
The question is: Is there some data structure for my usecase I overlooked, or is there a better solution to do this?
Note: This is to be used in the kernel module, so maybe it's already implemented in kernel

You could use a two-dimensional array (given that the buffer is preallocated and you know the maximum size of every string).
That would make enforcing string sizes easier, and hence your code more secure.
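A minimal sketch of the two-dimensional array idea is shown below. The names string_table, table_push and table_pop, and the limits MAX_STRINGS and MAX_LEN, are assumptions made for illustration; this is plain user-space C, so a kernel module would need the equivalent kernel helpers.

```c
#include <assert.h>
#include <string.h>

#define MAX_STRINGS 8   /* assumed capacity */
#define MAX_LEN     32  /* assumed per-string limit, including the '\0' */

typedef struct {
    char slots[MAX_STRINGS][MAX_LEN];
    int count;          /* number of slots currently in use */
} string_table;

/* Copies s into the next free slot. Returns 0 on success,
 * -1 if the table is full or the string does not fit. */
static int table_push(string_table *t, const char *s)
{
    assert(t != NULL && s != NULL);
    size_t len = strlen(s);
    if (t->count >= MAX_STRINGS || len >= MAX_LEN)
        return -1;
    memcpy(t->slots[t->count], s, len + 1);
    t->count++;
    return 0;
}

/* Removes the last string and zero-fills its slot.
 * Returns 0 on success, -1 if the table is empty. */
static int table_pop(string_table *t)
{
    assert(t != NULL);
    if (t->count == 0)
        return -1;
    t->count--;
    memset(t->slots[t->count], 0, MAX_LEN);
    return 0;
}
```

Removing the last string becomes a constant-time operation, at the cost of reserving MAX_LEN bytes for every string regardless of its length.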

When implementing a linked list in C who is responsible for freeing the value?

I am implementing a linked list in C and I am running into the issue where C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer. There is no concept of whether the value might be needed later on in the program.
The typical implementation I find online for a linked list basically deallocs the deleted node but does not dealloc the node's value.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list ? The linked list's or the normal flow of the program ?
example:
// allocate 10 bytes
char *text = malloc(sizeof(char) * 10);
// create the linked list
LinkedList *list = list_create();
// add the text pointer to the linked list
list_append(list, text);
// remove the pointer from the linked list
list_remove_last(list);
In this case text would end up not getting deallocated as list_remove_last just frees the memory that the new node takes up. What would be the proper way to release the memory taken up by text ?

That is a very common way of implementing containers in C.
Basically, you dynamically allocate the contents of the list and pass the pointer to the container; from then on the container is responsible for freeing it.
You can also pass a function pointer to list_create() so the list knows how to do list_remove_last() properly. This is especially useful for a generic container that does not know what type of elements it will contain (it will just hold void * pointers).
Think of the case where the data itself is a struct that contains other pointers. In this case list_remove() cannot do a simple free() on its data field; instead it should use the function pointer that was passed in to free the data.
Your approach has a small problem:
If list_create() returns a list*, then you will have to call free(list) in your main function. Alternatively, you can have list_create() return a list by value instead of a list*. This is a logical choice because the bulk of a list's data is dynamically allocated and reached through a pointer anyway.
In the second case you would need a function list_destroy(list) that destroys every element your list holds.
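The function-pointer approach described in this answer might be sketched like this. The function names list_create, list_append and list_remove_last come from the question, but their bodies, the free_fn type and list_destroy as written here are assumptions for illustration, not the asker's actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Callback the list uses to release a stored value (hypothetical type). */
typedef void (*free_fn)(void *);

typedef struct ListNode {
    void *value;
    struct ListNode *next;
} ListNode;

typedef struct LinkedList {
    ListNode *head;
    free_fn free_value;  /* NULL means the list does not own its values */
} LinkedList;

static LinkedList *list_create(free_fn free_value)
{
    LinkedList *list = malloc(sizeof *list);
    if (list != NULL) {
        list->head = NULL;
        list->free_value = free_value;
    }
    return list;
}

/* Appends value at the tail. Returns 0 on success, -1 on failure. */
static int list_append(LinkedList *list, void *value)
{
    assert(list != NULL);
    ListNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->value = value;
    node->next = NULL;
    ListNode **link = &list->head;
    while (*link != NULL)
        link = &(*link)->next;
    *link = node;
    return 0;
}

/* Removes the tail node, releasing its value through the callback. */
static void list_remove_last(LinkedList *list)
{
    assert(list != NULL);
    ListNode **link = &list->head;
    if (*link == NULL)
        return;
    while ((*link)->next != NULL)
        link = &(*link)->next;
    if (list->free_value != NULL)
        list->free_value((*link)->value);
    free(*link);
    *link = NULL;
}

/* Releases every node (and owned value), then the list itself. */
static void list_destroy(LinkedList *list)
{
    if (list == NULL)
        return;
    while (list->head != NULL)
        list_remove_last(list);
    free(list);
}
```

Passing free as the callback makes the list own malloc'd values such as the text buffer in the question; passing NULL leaves ownership with the caller.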

C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer
Yes, C lacks any kind of automatic memory management, so you have to be careful to deallocate any memory blocks that you instantiate.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list? The linked list's or the normal flow of the program?
It's your responsibility. You can do it however you like. You can write a general purpose linked list where the caller has to be responsible for allocating and deallocating space for each value in the list because the list management functions don't know how much space each value might require, or whether the values might be needed beyond the lifetime of the node. Or, you can write a list implementation that manages every aspect of the node, including space for the value stored in the node. In some cases, a list node includes the value in the node definition, like:
struct Node {
struct Node *next;
int value;
};
and other times the node has a pointer to some other block that has the actual value:
struct Node {
struct Node *next;
void *value;
};
Another approach is to define a structure with just the part needed for the list operation (i.e. the next pointer), and then piggyback data onto that structure:
struct Node {
struct Node *next;
};
struct MyNode {
struct Node node;
int price;
int quantity;
};
So, there are lots of ways to do it, and none of them are wrong. You should choose the style that makes sense for your needs. Do you have big, complex values that you don't want to duplicate, that you want to store in a linked list, but which you want to continue to use even after they're removed from the list? Go with the first style above. Do you want to manage everything related to the linked list in one place? Then go with the second style.
The point is: C dictates a lot less than other languages do, and while that means that you have to think harder about program correctness, you also get the freedom to do things very directly and in a style of your choosing. Embrace that.
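The piggyback layout from this answer can be exercised as in the sketch below. push_front and total_value are hypothetical helper names; the cast from struct Node * back to struct MyNode * relies on node being the first member of MyNode, which C guarantees to sit at the same address as the enclosing struct.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    struct Node *next;
};

struct MyNode {
    struct Node node;   /* must stay the first member for the cast below */
    int price;
    int quantity;
};

/* Generic list code: knows only about struct Node, not the payload. */
static void push_front(struct Node **head, struct Node *n)
{
    assert(head != NULL && n != NULL);
    n->next = *head;
    *head = n;
}

/* Payload-aware code: converts each link back to its enclosing MyNode. */
static int total_value(const struct Node *head)
{
    int total = 0;
    for (const struct Node *n = head; n != NULL; n = n->next) {
        const struct MyNode *item = (const struct MyNode *)n;
        total += item->price * item->quantity;
    }
    return total;
}
```

Here the list functions never allocate or free anything, so ownership of each MyNode stays entirely with the caller.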

My guideline is: the one who allocates memory is also responsible for deallocating it.
If you implement a linked list that allocates the memory for the values, the implementation should also take care of freeing this memory when the entries are removed from the list. For strings this could be done by copying the strings to a newly allocated buffer of adequate size.
If your implementation of a linked list only stores plain values (e.g. pointers) without allocating extra memory for the values, it should also avoid freeing memory it did not allocate, because it doesn't know what the allocator planned for this memory in the future.

The proper way would be to have a function such as list_remove_node() that frees not only the list node itself, but also the value that was allocated for that specific node. Also, you shouldn't need to search for a specific node by its text, as you should be able to just call free(node->text) (which can be done even in the current list_remove_last() function).
The main C logic is that you are supposed to free() anything that you allocated yourself. Certain libraries will allocate memory for their own work, which most often you are supposed to clean up as well (as you were the one who asked for it).

How to implement structure of linked list or binary tree in MPI?

In C we define a structure for a linked list or binary tree like this:
struct list {
int val;
struct list *next;
};
OR
struct tree_node {
int val;
struct tree_node *left, *right;
};
In serial programming we can easily store a pointer to the next memory location. My question is: how do I handle pointers in MPI, where each processor has its own local memory? How do I keep track of them? How do I implement a linked list or binary tree in MPI? I know about MPI_Graph, but it is not useful in my scenario.
I appreciate your answer. Thanks in advance.

I'll discuss a linked list, but all of this applies to a binary tree just as easily with a little extra work.
Implementing a linked list in the classical sense isn't exactly possible in MPI because, as you said, each process has its own local memory which won't be consistent on other processes. So that essentially limits using something simple like point to point messaging unless you want to do a lot of work that wouldn't really make sense.
However, it is possible to do something using one-sided communication, or RMA. In fact, there's some example code here. The basic idea of RMA is that each rank exposes a region of memory to the other processes. Then, with the appropriate accessors and synchronization calls, each process can get data from and put data into the other processes' memory.
The example uses a dynamic window to allow the application to allocate memory as needed, but it's also possible to statically allocate all your memory up front and point each process to it at the beginning of the application, which might make it a little easier to understand.
Whether or not all of this is efficient or the right thing to do is a different argument. For sufficiently large lists, this can be powerful because you can store more data than you would be able to in a single node's memory. However, for small data structures, the cost of traversing the list becomes rather high, so it's pretty inefficient to distribute the list and it might be more practical to replicate it on each node.

What benefits does using a struct pointer give over just a struct?

I am a bit confused about this because I was looking at some code for a dispatcher, and they defined a struct PCB (process control block) that basically contains a bunch of information about a running process, and a struct queue. The queue basically just manages the order in which the processes are executed, but also occasionally moves processes across queues (e.g. moving a PCB from queue1 to queue2). The queue struct is essentially defined as
struct Queue {
pcbptr front;
pcbptr back;
};
where pcbptr is defined as
typedef pcb *pcbptr;
I am a bit confused about why you would use a pcbptr in this case and not simply define the queue to use pcb.
Thanks for any help

The reason is simply because of time. It is way faster to have a pointer to a different struct than to have a copy of the struct by value.
Also, the people using the Queue may wish to modify the PCBs, which they would not be able to do if the PCBs were passed by value.
Also, if you were asking about why they didn't simply use PCB * and not typedef a PCB * to be pcbptr, that is all just down to naming convention. They are exactly the same functionally.

Just think about a queue and what it means at different lengths.
If it is 0, you need to tell that somehow (we do this normally by just setting the pointers for front and back to NULL).
If it is 1, both the front and the back need to point to the same thing.
If it is 2, they point at different things, but have a link going from one to the other.
If it is 3, front points at the first one, the first one points at the middle one, the middle one points at the last one, and back points at the last one.
So no matter what you do, with a variable size like this you need to use pointers for the middle ones. Why bother making the front and back anything but a pointer? (Especially since it's faster to change a pointer than to move a struct around, and the code for moving structs would be far more complicated.)
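The cases listed above (empty, one element, several elements) can be sketched with a pointer-based queue as follows. The pcb fields pid and next, and the functions enqueue and dequeue, are assumptions for illustration; the original dispatcher code is not shown in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a process control block (fields are assumed). */
typedef struct pcb {
    int pid;
    struct pcb *next;
} pcb;
typedef pcb *pcbptr;

struct Queue {
    pcbptr front;   /* NULL when the queue is empty */
    pcbptr back;    /* NULL when the queue is empty */
};

/* Appends p at the back of q. No PCB data is copied, only pointers change. */
static void enqueue(struct Queue *q, pcbptr p)
{
    assert(q != NULL && p != NULL);
    p->next = NULL;
    if (q->back == NULL)
        q->front = p;           /* empty queue: front and back both point at p */
    else
        q->back->next = p;
    q->back = p;
}

/* Detaches and returns the front PCB, or NULL if the queue is empty. */
static pcbptr dequeue(struct Queue *q)
{
    assert(q != NULL);
    pcbptr p = q->front;
    if (p == NULL)
        return NULL;
    q->front = p->next;
    if (q->front == NULL)
        q->back = NULL;         /* queue became empty */
    p->next = NULL;
    return p;
}
```

Moving a process from one queue to another is then just enqueue(&queue2, dequeue(&queue1)) after checking the result is not NULL: the PCB stays where it is in memory and only a few pointers are rewritten.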
