How to set a vertex's Index myself in Titan Graph Database - database

The thing is that whenever I add a vertex with "addVertex()" command, the index of the vertex is chosen randomly like V[0] and the second time is V[2] and so on. I want to set it myself...How can I do it?

So that is not the index of your vertex. It is the id of your vertex, and if you asking how can you set that, then the answer is that you can't. Titan sets the Ids internally and they are immutable.
What you can do however, is create your own index so you can do fast lookups. I would recommend starting with simple composite index.
You can create a composite index as follows:
graph = TitanFactory.open('conf.properties');
mgmt = graph.openManagement();
myId = mgmt.makePropertyKey("MY-ID").dataType(String.class).make();
mgmt.buildIndex('byMyID', Vertex.class).addKey(myId).buildCompositeIndex();
mgmt.commit();
The above will create a property called MY-ID and index it. Which means that any vertex with that property can be looked up quickly.
Side Note: Make sure you are initialising a Titan Graph not a Tinker Graph. Tinker Graphs do not support indexing.

Related

What does 'x' here mean?

CHAINED-HASH-INSERT(T,x)
Insert x at head of list T[h(x.key)]
CHAINED-HASH-SEARCH(T,k)
Search for an element with key k in list T[h(k)]
CHAINED-HASH-DELETE(T,x)
Delete x from the list T[h(x.key)]
I am studying hash tables right now and I am not able to understand what exactly x means in the attached pseudocode with this question. The three hash table dictionary functions are being implemented here to insert, search and delete an element from the table.
Now I understand what x.key means but the problem is, but what exactly is x in the data that has to be inserted into the table. Please can you help me with an example. I am currently trying to implement it using C.
Also h() is the hash function.
In practice it will be a struct for key/value pairs. x.key is the key that is used for the hashing. The value might be named x.value, or x.data or something like that. There must be a definition for the type of x somewhere, but it's name is not important to understand the algorithm and to implement it. You might use a void pointer to represent the value so that data of any kind could be stored in the hash table.
So x represents the data structure that is stored in the hash table's collision resolution list, which is where the key and value are stored.
Update
I have found this familiar looking reference which explains it all: Introduction to Algorithms

Difference between Array, Set and Dictionary in Swift

I am new to Swift Lang, have seen lots of tutorials, but it's not clear – my question is what's the main difference between the Array, Set and Dictionary collection type?
Here are the practical differences between the different types:
Arrays are effectively ordered lists and are used to store lists of information in cases where order is important.
For example, posts in a social network app being displayed in a tableView may be stored in an array.
Sets are different in the sense that order does not matter and these will be used in cases where order does not matter.
Sets are especially useful when you need to ensure that an item only appears once in the set.
Dictionaries are used to store key, value pairs and are used when you want to easily find a value using a key, just like in a dictionary.
For example, you could store a list of items and links to more information about these items in a dictionary.
Hope this helps :)
(For more information and to find Apple's own definitions, check out Apple's guides at https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/CollectionTypes.html)
Detailed documentation can be found here on Apple's guide. Below are some quick definations extracted from there:
Array
An array stores values of the same type in an ordered list. The same value can appear in an array multiple times at different positions.
Set
A set stores distinct values of the same type in a collection with no defined ordering. You can use a set instead of an array when the order of items is not important, or when you need to ensure that an item only appears once.
Dictionary
A dictionary stores associations between keys of the same type and values of the same type in a collection with no defined ordering. Each value is associated with a unique key, which acts as an identifier for that value within the dictionary. Unlike items in an array, items in a dictionary do not have a specified order. You use a dictionary when you need to look up values based on their identifier, in much the same way that a real-world dictionary is used to look up the definition for a particular word.
Old thread yet worth to talk about performance.
With given N element inside an array or a dictionary it worth to consider the performance when you try to access elements or to add or to remove objects.
Arrays
To access a random element will cost you the same as accessing the first or last, as elements follow sequentially each other so they are accessed directly. They will cost you 1 cycle.
Inserting an element is costly. If you add to the beginning it will cost you 1 cycle. Inserting to the middle, the remainder needs to be shifted. It can cost you as much as N cycle in worst case (average N/2 cycles). If you append to the end and you have enough room in the array it will cost you 1 cycle. Otherwise the whole array will be copied which will cost you N cycle. This is why it is important to assign enough space to the array at the beginning of the operation.
Deleting from the beginning or the end it will cost you 1. From the middle shift operation is required. In average it is N/2.
Finding element with a given property will cost you N/2 cycle.
So be very cautious with huge arrays.
Dictionaries
While Dictionaries are disordered they can bring you some benefits here. As keys are hashed and stored in a hash table any given operation will cost you 1 cycle. Only exception can be finding an element with a given property. It can cost you N/2 cycle in the worst case. With clever design however you can assign property values as dictionary keys so the lookup will cost you 1 cycle only no matter how many elements are inside.
Swift Collections - Array, Dictionary, Set
Every collection is dynamic that is why it has some extra steps for expanding and collapsing. Array should allocate more memory and copy an old date into new one, Dictionary additionally should recalculate basket indexes for every object inside
Big O (O) notation describes a performance of some function
Array - ArrayList - a dynamic array of objects. It is based on usual array. It is used for task where you very often should have an access by index
get by index - O(1)
find element - O(n) - you try to find the latest element
insert/delete - O(n) - every time a tail of array is copied/pasted
Dictionary - HashTable, HashMap - saving key/value pairs. It contains a buckets/baskets(array structure, access by index) where each of them contains another structure(array list, linked list, tree). Collisions are solved by Separate chaining. The main idea is:
calculate key's hash code[About] (Hashable) and based on this hash code the index of bucket is calculated(for example by using modulo(mod)).
Since Hashable function returns Int it can not guarantees that two different objects will have different hash codes. More over count of basket is not equals Int.max. When we have two different objects with the same hash codes, or situation when two objects which have different hash codes are located into the same basket - it is a collision. Than is why when we know the index of basket we should check if anybody there is the same as our key, and Equatable is to the rescue. If two objects are equal the key/value object will be replaces, otherwise - new key/value object will be added inside
find element - O(1) to O(n)
insert/delete - O(1) to O(n)
O(n) - in case when hash code for every object is the same, that is why we have only one bucket. So hash function should evenly distributes the elements
As you see HashMap doesn't support access by index but in other cases it has better performance
Set - hash Set. Is based on HashTable without value
*Also you are able to implement a kind of Java TreeMap/TreeSet which is sorted structure but with O(log(n)) complexity to access an element
[Java Thread safe Collections]

Finding k different keys using binary search in an array of n elements

Say, I have a sorted array of n elements. I want to find 2 different keys k1 and k2 in this array using Binary search.
A basic solution would be to apply Binary search on them separately, like two calls for 2 keys which would maintain the time complexity to 2(logn).
Can we solve this problem using any other approach(es) for different k keys, k < n ?
Each search you complete can be used to subdivide the input to make it more efficient. For example suppose the element corresponding to k1 is at index i1. If k2 > k1 you can restrict the second search to i1..n, otherwise restrict it to 0..i1.
Best case is when your search keys are sorted also, so every new search can begin where the last one was found.
You can reduce the real complexity (although it will still be the same big O) by walking the shared search path once. That is, start the binary search until the element you're at is between the two items you are looking for. At that point, spawn a thread to continue the binary search for one element in the range past the pivot element you're at and spawn a thread to continue the binary search for the other element in the range before the pivot element you're at. Return both results. :-)
EDIT:
As Oli Charlesworth had mentioned in his comment, you did ask for an arbitrary amount of elements. This same logic can be extended to an arbitrary amount of search keys though. Here is an example:
You have an array of search keys like so:
searchKeys = ['findme1', 'findme2', ...]
You have key-value datastructure that maps a search key to the value found:
keyToValue = {'findme1': 'foundme1', 'findme2': 'foundme2', 'findme3': 'NOT_FOUND_VALUE'}
Now, following the same logic as before this EDIT, you can pass a "pruned" searchKeys array on each thread spawn where the keys diverge at the pivot. Each time you find a value for the given key, you update the keyToValue map. When there are no more ranges to search but still values in the searchKeys array, you can assume those keys are not to be found and you can update the mapping to signify that in some way (some null-like value perhaps?). When all threads have been joined (or by use of a counter), you return the mapping. The big win here is that you did not have to repeat the initial search logic that any two keys may share.
Second EDIT:
As Mark has added in his answer, sorting the search keys allows you to only have to look at the first item in the key range.
You can find academic articles calculating the complexity of different schemes for the general case, which is merging two sorted sequences of possibly very different lengths using the minimum number of comparisons. The paper at http://www.math.cmu.edu/~af1p/Texfiles/HL.pdf analyses one of the best known schemes, by Hwang and Lin, and has references to other schemes, and to the original paper by Hwang and Lin.
It looks a lot like a merge which steps through each item of the smaller list, skipping along the larger list with a stepsize that is the ratio of the sizes of the two lists. If it finds out that it has stepped too far along the large list it can use binary search to find a match amongst the values it has stepped over. If it has not stepped far enough, it takes another step.

OpenGL 2.1 Draw object with different buffers index

I'm dealing with OBJ file with OpenGL 2.1. This is OBJ spec: http://www.martinreddy.net/gfx/3d/OBJ.spec
According to that, a face is indicated as vertex/texture/normal where vertex, texture and normal are different, this mean index of vertex, texture coordinate and normal index in each buffers are different.
Question: How can i draw an object that supply different indicate buffers for each vertex, texture and normal arrays?
Sad answer, this is not possible in OpenGL. In OpenGL (and many other such systems, especially Direct3D), a vertex is conceptually the union of all the vertex attributes, like position, normal or texture coordinates (it's unfortunate that the term vertex is often used only for the position attribute). So two vertices with the same position but different normal vectors are conceptually two different vertices.
That's the reason, why you have to use the same vertex index for all attributes. So to draw an .OBJ file with OpenGL or similar libraries, you won't get around processing the data after loading.
The easiest way is to completely drop indices and just store all vertex data for each triangle corner one after the other, using the indices read from the face specification, thus having 3*F vertices and no indices. This, however is rather inefficient, since you probably still have many duplicate vertices, even if considering all attributes at a whole.
Another option is to insert the index triples (comprised of vertex, texcoord and normal index) into a hash table and map it to a combined index. Whenever the index triple already exists, you replace it by the one index in the hash table and when it doesn't exist, you add a new vertex (comprised of position, normal and texCoords) to your final rendered vertex arrays, indexing the file's vertex arrays using the index triple (and insert this new vertex array size as unique index into the hash table, of course).

Big problem with Dijkstra algorithm in a linked list graph implementation

I have my graph implemented with linked lists, for both vertices and edges and that is becoming an issue for the Dijkstra algorithm. As I said on a previous question, I'm converting this code that uses an adjacency matrix to work with my graph implementation.
The problem is that when I find the minimum value I get an array index. This index would have match the vertex index if the graph vertexes were stored in an array instead. And the access to the vertex would be constant.
I don't have time to change my graph implementation, but I do have an hash table, indexed by a unique number (but one that does not start at 0, it's like 100090000) which is the problem I'm having. Whenever I need, I use the modulo operator to get a number between 0 and the total number of vertices.
This works fine for when I need an array index from the number, but when I need the number from the array index (to access the calculated minimum distance vertex in constant time), not so much.
I tried to search for how to inverse the modulo operation, like, 100090000 mod 18000 = 10000 and, 10000 invmod 18000 = 100090000 but couldn't find a way to do it.
My next alternative is to build some sort of reference array where, in the example above, arr[10000] = 100090000. That would fix the problem, but would require to loop the whole graph one more time.
Do I have any better/easier solution with my current graph implementation?
In your array, instead of just storing the count (or whatever you're storing there), store a structure which contains the count as well as the vertex index number.

Resources