Delete and re-index dictionary in Swift - arrays

Hi, I have a dictionary of this form:
1:[[12.342,34.234],[....,...],....]
2:[[......],[....]]....
Is there a function to delete a specific key and its corresponding value, and a function to re-index the dictionary afterwards? For example, if I delete the value for key 2, key 3 should become key 2, and so on.

I think you need an Array, not a Dictionary: removing an element shifts the later ones down, which is the re-indexing you describe.
var elms: [[[Double]]] = [
[[0.1],[0.2, 0.3]],
[[0.4], [0.5]],
[[0.6]],
]
elms.remove(at: 1) // remove the second element
print(elms)
[
[[0.10000000000000001], [0.20000000000000001, 0.29999999999999999]],
[[0.59999999999999998]]
]
Yes, the printed values differ slightly from the originals. That is binary floating-point representation showing through in the output, not a change to the data.

To delete a key and value, just do dict[key] = nil.
As for re-indexing, dictionary keys are not in any particular order, and shifting all the values over to different keys isn't how a dictionary is designed to work. If this is important for you, maybe you should use something like a pair of arrays instead.
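A minimal sketch of the deletion part, using a made-up dictionary named `dict` (the values are placeholders, not the asker's data). It shows that removing a key leaves the remaining keys untouched:

```swift
// Hypothetical data: integer keys mapping to lists of coordinate pairs.
var dict: [Int: [[Double]]] = [
    1: [[12.342, 34.234]],
    2: [[1.0, 2.0]],
    3: [[3.0, 4.0]],
]

// Assigning nil removes the key and its value.
dict[2] = nil

// removeValue(forKey:) does the same and returns the removed value, if any.
let removed = dict.removeValue(forKey: 3)

// Keys are not renumbered: only key 1 is left.
print(dict.keys.sorted())   // [1]
print(removed ?? [])        // [[3.0, 4.0]]
```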

Dictionaries are not ordered. This means that "key 3" becoming "key 2" is not a supported scenario. If keys have been in the same order that you've inserted them, you've been lucky so far, as this is absolutely not guaranteed.
If you want ordering and your list of key/value pairs is small (a hundred or so is small), you should consider using an array of tuples: [(Key, Value)]. This has guaranteed ordering. If you need something bigger than that or faster key lookup, you should find a way to define an ordering relationship between keys (such that you can say that one should always be after some other key), and use a sorted collection like this one.
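A rough sketch of the array-of-tuples idea, assuming string keys and a small list. The name `pairs` and its contents are invented for illustration:

```swift
// Hypothetical ordered list of key/value pairs.
var pairs: [(key: String, value: [Double])] = [
    (key: "a", value: [0.1]),
    (key: "b", value: [0.2, 0.3]),
    (key: "c", value: [0.4]),
]

// Removing an element shifts the later ones down, so positions stay contiguous.
pairs.remove(at: 1)

// Lookup by key is a linear scan, which is fine for small lists.
let c = pairs.first(where: { $0.key == "c" })?.value

print(pairs.map { $0.key })  // ["a", "c"]
print(c ?? [])               // [0.4]
```

The trade-off is the one described above: ordering is guaranteed, but finding a key costs a scan instead of a hashed lookup.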

Related

What's the difference between an array and a hash whose keys are integers? [closed]

What is the difference between an array, an ordered collection of integer-indexed values, and a hash, a collection of key-value pairs, where every key is an index that starts from zero?
arr = [1,2,3,4]
hash = { 0 => 1, 1 => 2, 2 => 3, 3 => 4}
I know that it would be stupid to implement a hash like that, but what is the difference under the hood? Are they stored in different ways in memory? Which data structure makes your program faster when retrieving data from it?
Speed
Arrays are better for iterating over a collection.
Hashes have average constant lookup time by key, O(1), while searching an Array for a value takes linear time, O(n). (Access by integer index in an Array is also O(1).)
This means that you should prefer Hashes when you need to use #include?.
Hashes work great as dictionaries.
There is a third underutilised data structure in Ruby which behaves like a unique Array but uses a Hash underneath. This is Set. It works great when you don't care about the order of elements, need to check if something belongs to a group or need a unique collection of elements.
Implementation
Yes, they are two completely different data structures.
In this particular example you could access these elements in a similar manner by calling arr[0] and hash[0].
The line arr[0] is in fact syntactic sugar for arr.[](0). Same goes for arr[0] = 5 which is the same as arr.[]=(0, 5). Both Array and Hash have methods called [] and []=, but under the hood completely different things would happen, because these are two separate methods with the same name.
Arrays are always indexed with integers starting with 0 and the order of items is important.
Array is a simple list of elements backed by a dynamic C array, while a Hash is an implementation of a dictionary/hashmap, a much more sophisticated data structure.
Hashes store both keys and values, and possess the ability to retrieve a value based on the received key. Every single Ruby object that implements #eql? and #hash can become a Hash key.
Since Ruby 1.9 a Hash keeps its elements in insertion order, but elements are still found by key, not by position.
Every Hash is in fact an array of slots called buckets. These buckets hold individual key-value pairs. To determine which bucket a key belongs to, Ruby uses a hashing function, provided by the method called #hash on the object passed as the key. If two objects return the same value from their #hash method, they land in the same bucket; they are considered the same key only if they are also equal according to #eql?.
This goes both ways, Ruby uses #hash to save the key-value pair in the right bucket, and to find the right value for the passed key.
For example, "string".hash returns the same value every time within one Ruby process, even though each "string" literal creates a different object.
This value is calculated based on the content of the String. This is why Strings with the same content (even though they are different objects) are considered the same key in a Hash.
Sometimes two different objects (of separate classes) end up having the same #hash. Ruby handles these collisions by implementing individual buckets as linked lists of key-value pairs. So a few key-value pairs may be stored in the same bucket.
To decide which value should be returned for a given key, when there are a few key-value pairs in the same bucket, Ruby utilises the eql? method.
It compares each key stored in the bucket with the passed key like so passed_key.eql?(key_in_bucket). When they are equal Ruby considers them a match.
Here's a great article on Hashes.
Almost every object in Ruby has a hash method. This method calculates some number which is unique-ish. That number is used to locate a key in a Hash object. If you want to get the value for 2 then the hash value for 2 is calculated (for an integer this is really fast) and looked up in the Hash object. So it looks like a hash is a way to over-complicate things here, but if you want to be sure, benchmarking your use case is the way to go.

What happens if hash is unique but hash % size is same in hash table?

Recently I've been studying hash tables, and I understand the basics to be:
create an array, for example
hashtable ht[4];
hash the key
int hash = hash_key(key);
get the index
int index = hash % 4
set to hashtable
ht[index] = insert_or_update(value)
And I know there is the hash collision problem: if key1 and key2 have the same hash, they go to the same ht[index], and separate chaining can solve this.
Keys with the same hash go to the same bucket, and these keys are stored in a linked list.
My question is, what happens if the hashes are different, but the modulus is the same?
For example,
hash(key1): 3
hash(key2): 7
hash(key3): 11
hash(key4): 15
so the index is 3 for all of them: keys with different hashes and different keys go to the same bucket.
I searched Google for some hash table implementations, and it seems they don't deal with this situation. Am I overthinking this? Is anything wrong?
For example, these implementations:
https://gist.github.com/tonious/1377667#file-hash-c-L139
http://www.cs.yale.edu/homes/aspnes/pinewiki/C(2f)HashTables.html?highlight=%28CategoryAlgorithmNotes%29#CA-552d62422da2c22f8793edef9212910aa5fe0701_156
redis:
https://github.com/antirez/redis/blob/unstable/src/dict.c#L488
nginx:
https://github.com/nginx/nginx/blob/master/src/core/ngx_hash.c#L34
They just compare whether the keys are equal.
If two objects' keys hash to the same bucket, it doesn't really matter if it's because they have the same hash, or because they have different hashes but they both map (via modulo) to the same bucket. As you note, a collision that occurs because of either of these situations is commonly dealt with by placing both objects in a bucket-specific list.
When we look for an object in a hashtable, we are looking for an object that shares the same key. The hashing / modulo operation is just used to tell us in which bucket we should look to see if the object is present. Once we've found the proper bucket, we still need to compare the keys of any found objects (i.e., the objects in the bucket-specific list) directly to be sure we've found a match.
So the situation of two objects with different hashes but that map to the same bucket works for the same reason that two objects with the same hashes works: we only use the bucket to find candidate matches, and rely on the key itself to determine a true match.
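The question's example can be sketched as a toy table, written here in Swift rather than C. `ToyTable` and its identity hash are hypothetical and only meant to show that lookup compares keys inside a bucket; this is not how any of the linked implementations is written:

```swift
// A toy separate-chaining table with a fixed number of buckets.
// `ToyTable` and its identity hash are hypothetical, for illustration only.
struct ToyTable {
    private var buckets: [[(key: Int, value: String)]]

    init(bucketCount: Int) {
        buckets = Array(repeating: [], count: bucketCount)
    }

    // Identity hash: the key is its own hash, as in the question's example.
    // Assumes non-negative keys so the remainder is a valid index.
    func bucketIndex(for key: Int) -> Int {
        return key % buckets.count
    }

    mutating func set(_ value: String, for key: Int) {
        let i = bucketIndex(for: key)
        // Compare keys, not hashes, to decide between update and insert.
        if let j = buckets[i].firstIndex(where: { $0.key == key }) {
            buckets[i][j].value = value
        } else {
            buckets[i].append((key: key, value: value))
        }
    }

    func get(_ key: Int) -> String? {
        // The bucket only narrows the search; the key comparison finds the match.
        return buckets[bucketIndex(for: key)].first(where: { $0.key == key })?.value
    }
}

var table = ToyTable(bucketCount: 4)
for key in [3, 7, 11, 15] {
    table.set("value\(key)", for: key)   // all four keys land in bucket 3
}
```

With four buckets, keys 3, 7, 11 and 15 all share bucket 3, yet `table.get(11)` still finds its own value because the final step is a key comparison.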

Difference between Array, Set and Dictionary in Swift

I am new to Swift and have seen lots of tutorials, but it's still not clear to me: what's the main difference between the Array, Set and Dictionary collection types?
Here are the practical differences between the different types:
Arrays are effectively ordered lists and are used to store lists of information in cases where order is important.
For example, posts in a social network app being displayed in a tableView may be stored in an array.
Sets are different in that order does not matter; use them in cases where the order of items is irrelevant.
Sets are especially useful when you need to ensure that an item only appears once in the set.
Dictionaries are used to store key, value pairs and are used when you want to easily find a value using a key, just like in a dictionary.
For example, you could store a list of items and links to more information about these items in a dictionary.
Hope this helps :)
(For more information and to find Apple's own definitions, check out Apple's guides at https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/CollectionTypes.html)
Detailed documentation can be found in Apple's guide. Below are some quick definitions extracted from there:
Array
An array stores values of the same type in an ordered list. The same value can appear in an array multiple times at different positions.
Set
A set stores distinct values of the same type in a collection with no defined ordering. You can use a set instead of an array when the order of items is not important, or when you need to ensure that an item only appears once.
Dictionary
A dictionary stores associations between keys of the same type and values of the same type in a collection with no defined ordering. Each value is associated with a unique key, which acts as an identifier for that value within the dictionary. Unlike items in an array, items in a dictionary do not have a specified order. You use a dictionary when you need to look up values based on their identifier, in much the same way that a real-world dictionary is used to look up the definition for a particular word.
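A short sketch of the three definitions side by side. The variable names and values are invented for illustration:

```swift
// Array: ordered, duplicates allowed, accessed by position.
let posts = ["first", "second", "first"]
print(posts[1])                  // second

// Set: unordered, each value stored once.
let tags = Set(["swift", "ios", "swift"])
print(tags.count)                // 2
print(tags.contains("ios"))      // true

// Dictionary: unordered key/value pairs, accessed by key.
let links = ["radio": "https://example.com/radio"]
print(links["radio"] ?? "none")  // https://example.com/radio
print(links["tv"] ?? "none")     // none
```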
Old thread, yet it's worth talking about performance.
With N elements inside an array or a dictionary, it is worth considering the cost of accessing, adding or removing elements.
Arrays
Accessing a random element costs the same as accessing the first or the last: elements are stored contiguously, so any index is reached directly. That is constant time, 1 step.
Inserting an element is costly. Inserting at the beginning or in the middle means the elements after that position must be shifted, which can cost as much as N steps in the worst case (N/2 on average for a random position). Appending to the end costs 1 step if the array has spare capacity; otherwise the whole array is copied, which costs N steps. This is why it helps to reserve enough capacity at the beginning of the operation.
Deleting from the end costs 1 step. Deleting from the beginning or the middle requires the same shift, N/2 steps on average.
Finding an element with a given property costs N/2 steps on average, N in the worst case.
So be very cautious with huge arrays.
Dictionaries
While Dictionaries are unordered, they bring some benefits here. As keys are hashed and stored in a hash table, lookup, insertion and removal by key cost 1 step on average. The exception is finding an element with a given property: that means scanning the values, N/2 steps on average and N in the worst case. With clever design, however, you can use the property values as dictionary keys, so the lookup costs about 1 step no matter how many elements are inside.
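A minimal sketch of the "property value as dictionary key" idea, assuming a hypothetical `User` type with a unique `id`:

```swift
// Hypothetical record type; `id` is the property we want to look up by.
struct User {
    let id: Int
    let name: String
}

let users = [User(id: 7, name: "Ann"), User(id: 42, name: "Bob")]

// Array: scan until the property matches (linear in the number of elements).
let scanned = users.first(where: { $0.id == 42 })

// Dictionary keyed by that property: a single hashed lookup on average.
// uniqueKeysWithValues traps on duplicate keys, so ids must be unique.
let usersByID = Dictionary(uniqueKeysWithValues: users.map { ($0.id, $0) })
let looked = usersByID[42]

print(scanned?.name ?? "none")  // Bob
print(looked?.name ?? "none")   // Bob
```

Building the dictionary costs one pass over the array, so this pays off when the same collection is queried many times.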
Swift Collections - Array, Dictionary, Set
Every collection is dynamic, which is why it has some extra steps for growing and shrinking. An Array has to allocate more memory and copy the old data into the new storage; a Dictionary additionally has to recalculate the bucket index of every object inside.
Big O (O) notation describes how the cost of an operation grows with the number of elements.
Array - ArrayList - a dynamic array of objects. It is based on a plain array. It is used for tasks where you very often need access by index
get by index - O(1)
find element - O(n) - in the worst case the element you are looking for is the last one
insert/delete - O(n) - every time, the tail of the array is shifted (appending or removing at the end is O(1) amortized)
Dictionary - HashTable, HashMap - stores key/value pairs. It contains buckets (an array structure, accessed by index), each of which contains another structure (array list, linked list, tree). In this generic description collisions are resolved by separate chaining (Swift's own Dictionary uses open addressing instead, but the hash-then-compare idea is the same). The main idea is:
calculate the key's hash code (Hashable), and from this hash code calculate the index of the bucket (for example by using modulo).
Since the hash function returns an Int, it cannot guarantee that two different objects will have different hash codes. Moreover, the number of buckets is not equal to Int.max. When two different objects have the same hash code, or when two objects with different hash codes land in the same bucket, it is a collision. That is why, once we know the index of the bucket, we have to check whether any key already in it is equal to our key, and Equatable comes to the rescue. If the two keys are equal, the key/value pair is replaced; otherwise a new key/value pair is added.
find element - O(1) to O(n)
insert/delete - O(1) to O(n)
O(n) - the case where every object has the same hash code, so everything ends up in one bucket. This is why a hash function should distribute the elements evenly
As you see HashMap doesn't support access by index but in other cases it has better performance
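The Hashable/Equatable interplay can be sketched with a deliberately weak hash. The `Key` type below is hypothetical; its hash ignores `name`, so keys in the same `group` always collide and only `==` tells them apart:

```swift
// Hypothetical key type whose hash deliberately ignores `name`,
// so keys with the same `group` always collide.
struct Key: Hashable {
    let group: Int
    let name: String

    static func == (lhs: Key, rhs: Key) -> Bool {
        return lhs.group == rhs.group && lhs.name == rhs.name
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(group)   // weak on purpose: `name` is left out
    }
}

var store: [Key: String] = [:]
store[Key(group: 1, name: "a")] = "first"
store[Key(group: 1, name: "b")] = "second"      // same hash, different key: added
store[Key(group: 1, name: "a")] = "replaced"    // same hash, equal key: replaced

print(store.count)                              // 2
print(store[Key(group: 1, name: "a")] ?? "")    // replaced
```

The dictionary stays correct with a poor hash; it only gets slower, which is the O(n) case mentioned above.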
Set - hash set. It is based on a hash table that stores keys without values
*You are also able to implement something like Java's TreeMap/TreeSet, which is a sorted structure with O(log(n)) complexity to access an element

Hiding vars in strings VS using objects with properties?

So, I've got a word analyzing program in Excel with which I hope to be able to import over 30 million words.
At first, I created a separate object for each of these words so that each word has a...
.value '(string), the actual word itself
.bool1 '(boolean)
.bool2 '(boolean)
.bool3 '(boolean)
.isUsed '(boolean)
.cancel '(boolean)
When I found out I may have 30 million of these objects (all stored in a single collection), I thought that this could be a monster to compile. And so I decided that all my words would be strings, and that I would stick them into an array.
So my array idea is to prefix each of the 30 million strings with 5 spaces (for my 5 bools), with each space representing a false bool val, e.g.
If Mid$(arr(n), 3, 1) = " " Then
'my 3rd bool val is false.
ElseIf Mid$(arr(n), 3, 1) = "*" Then '(I'll insert a '*' to denote true)
'my third bool val is true.
End If
Anyway, what do you guys think? Which way (collection or array) should I go about this (for optimization specifically)?
(I wanted to make this a comment but it became too long)
An answer would depend on how you want to access and process the words, once stored.
There are significant benefits and distinct advantages for 3 candidates:
Arrays are very efficient to populate and retrieve all items at once (e.g. range to array and array back to range), but much slower at re-sizing and inserting items in the middle. Each ReDim allocates a new memory block, and if Preserve is used, all existing values are copied over as well. This may translate to perceived slowness for every operation (in a potential application)
More details (arrays vs collections) here (VB specific but it applies to VBA as well)
Collections are linked lists with hash-tables - quite slow to populate but after that you get instant access to any element in the collection, and just as fast at reordering (sorting) and re-sizing. This can translate into a slow opening file, but all other operations are instant. Other aspects:
Retrieve keys as well as the items associated with those keys
Handle case-sensitive keys
Items can be other collections, arrays, objects
While keys must be unique, they are also optional
An item can be returned in reference to its key, or in reference to its index value
Keys are always strings, and always case insensitive
Items are accessible and retrievable, but its keys are not
Cannot remove all items at once (either remove them one by one, or destroy and then recreate the Collection)
Enumerating with For...Each...Next, lists all items
More info here and here
Dictionaries: same as collections but with the extra benefit of the .Exists() method which, in some scenarios, makes them much faster than collections. Other aspects:
Keys are mandatory and always unique to that Dictionary
An item can only be returned in reference to its key
The key can take any data type; for string keys, by default a Dictionary is case sensitive
Exists() method to test for the existence of a particular key (and item)
Collections have no similar test; instead, you must attempt to retrieve a value from the Collection, and handle the resulting error if the key is not found
Items AND keys are always accessible and retrievable to the developer
Item property is read/write, so it allows changing the item associated with a particular key
Allows you to remove all items in a single step without destroying the Dictionary itself
Using For...Each...Next dictionaries will enumerate the keys
A Dictionary supports implicit adding of an item using the Item property.
In Collections, items must be added explicitly
More details here
Other links: optimizing loops and optimizing strings (same site)

Sort Dictionary by Key Based on Values in Array

How can a dictionary be sorted, by its keys, in the order listed in an array? See example:
Dictionary to sort:
var item = [
"itemName":"radio",
"description":"battery operated",
"qtyInStock":"12",
"countOfChildren":"5",
"isSerialized":"0"
]
Order in which the keys should be sorted:
let sortOrder = [
"itemName",
"qtyInStock",
"countOfChildren",
"description",
"isSerialized"
]
I have tried the following in Playground:
var sortedItem = Dictionary<String, String>()
for i in sortOrder {
sortedItem[i] = item[i]
}
While viewing the value history in Playground displays everything in the correct order, the resulting dictionary is in a seemingly random order.
As mentioned by Tiago, a Dictionary by definition doesn't have an order. It is essentially a mapping of keys to values. I would recommend one of two approaches.
If the ordering you wish to achieve is manual, as your question makes it seem, I would create an array to hold the ordered keys. Then, at any point you need to, you can iterate over the array (which is ordered) and print out the values found in the dictionary. They will be printed in the order of the keys in the array.
If the ordering is something that can be done programmatically, you could grab all of the keys with Array(myDictionary.keys) (myDictionary.keys.array in very early Swift versions). You could then sort the array and, once again, iterate through it and grab the values from the dictionary.
Example:
// myDictionary is assumed to be an existing [String: String]
let myKeys = Array(myDictionary.keys).sorted() // sort the keys
for key in myKeys {
    if let value = myDictionary[key] {
        print(value)
    }
}
Hope that helps.
A dictionary, by definition, doesn't have an order.
However, you can keep an array of (key, value) tuples in sorted order, or implement your own sorted dictionary.
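For the question's data, one possible sketch is to walk `sortOrder` and pull each value out of the dictionary, producing an ordered array of tuples. The name `ordered` is made up here, and the key spelling is assumed to be "countOfChildren" in both places:

```swift
let item = [
    "itemName": "radio",
    "description": "battery operated",
    "qtyInStock": "12",
    "countOfChildren": "5",
    "isSerialized": "0",
]

let sortOrder = ["itemName", "qtyInStock", "countOfChildren", "description", "isSerialized"]

// Walk the ordered keys and pick each value out of the dictionary.
// Keys missing from the dictionary are skipped.
let ordered: [(key: String, value: String)] = sortOrder.compactMap { key -> (key: String, value: String)? in
    guard let value = item[key] else { return nil }
    return (key: key, value: value)
}

for pair in ordered {
    print("\(pair.key): \(pair.value)")
}
```

The dictionary itself stays unordered; the order lives in the array, which is the point made in the answers above.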

Resources