Associative array versus multi-dimensional array, VBA - arrays

I feel like I'm missing some basic understanding... hopefully this isn't considered too broad or subjective, as I'm not positive which Stack Exchange site to post to.
VBA's associative array is the Dictionary. My extremely rough understanding is that a Dictionary is just a multi-dimensional array: to find a value in the matrix, you'd still have to iterate and find a matching value in the first row of the matrix, which would then be used to output values in the nth row of the same column within the matrix.
If the above is in any way correct, then how is Dictionary more efficient than a standard multi-dimensional array?

to find a value in the matrix, you'd still have to iterate and find a matching value in the first row of the matrix, which would then be used to output values in the nth row of the same column within the matrix.
That's not how dictionaries work.
Dictionary lookups are hash lookups (keys must be unique), making them roughly O(1), whereas iterating the first row of the matrix as you describe is O(n). The more items you're looking at, the more advantageous a dictionary is versus an array, assuming you're retrieving items by key rather than iterating the keys.
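To make the asymptotic difference concrete, here is a minimal sketch (in Python rather than VBA, purely for brevity) of the same lookup done both ways:

```python
# The "matrix" approach: keys live in one column, values in another,
# and finding a key means scanning -- O(n) per lookup.
data = [("apple", 3), ("banana", 7), ("cherry", 1)]

def lookup_scan(pairs, key):
    for k, v in pairs:
        if k == key:
            return v
    return None

# The dictionary approach: the key is hashed straight to its bucket,
# so the lookup is roughly O(1) regardless of how many items there are.
lookup = dict(data)

print(lookup_scan(data, "banana"))  # 7, after walking the list
print(lookup["banana"])             # 7, in (roughly) constant time
```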

Use bounds, like LBound(array) To UBound(array); it will go through each cell row by row.

Related

In MATLAB, comparing subsets of bagOfWords and converting bagOfWords to a cell array

I have one big bagOfWords array and two smaller bagOfWords arrays. The vocabularies of the two smaller arrays are elements of the big array's vocabulary (meaning they are subsets of the big one).
Now I want to create an array in which the first column is the big array's vocabulary, the second column is one of the two arrays' counts value for that row's vocabulary word, and the third column is the other array's counts value for that row's vocabulary word. If an array doesn't have a counts value for that row's vocabulary word, I want to set that entry to 1.
How can I do this? I didn't find any function that converts a bagOfWords to a cell array, and I'm not sure how to compare those arrays.
I can send you the variables, but it seems I cannot upload them here.
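The merge being asked for is just a per-word lookup with a default of 1. Here is a minimal sketch of that logic, in Python rather than MATLAB for brevity, with hypothetical word-to-count dictionaries standing in for the three bagOfWords objects:

```python
# Hypothetical word->count maps extracted from the three bagOfWords objects.
big_vocab_counts = {"cat": 5, "dog": 3, "fish": 2, "bird": 4}
bag_a = {"cat": 2, "fish": 1}
bag_b = {"dog": 3, "bird": 1}

# One row per word of the big vocabulary: [word, count in A, count in B],
# defaulting to 1 when a bag lacks the word, as the question specifies.
rows = [[word, bag_a.get(word, 1), bag_b.get(word, 1)]
        for word in big_vocab_counts]

for row in rows:
    print(row)  # e.g. ['cat', 2, 1], ['dog', 1, 3], ...
```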

Getting 10 nsmallest arrays from a set of arrays

First of all, I apologize for the confusing title; the task I'm trying to accomplish is itself still confusing to me, which is why I'm finding it hard to do. I'll try to be as clear as I can from now on.
I have 100 arrays of size 500x500, with values ranging from 0 to 1. What I would like to do is write code that gives me 10 arrays, each a sort of composite of the minimum values across all of them.
The first array is made of the absolute minimum values, the second array of the 2nd-smallest values... and so on. So the 10 arrays will be composites of ascending sorted values.
I managed to get the absolute minimum with np.minimum(), but I have no clue how to proceed to the next ones.
To reiterate, I don't want to sort the 100 arrays, but loop through them and create new arrays with the lowest values found in each position.
Sorting is the most efficient way.
np.sort([array0, array1, ...], 0)
Will yield an array where the first element is a 500x500 array of the smallest element-wise entries of all your arrays, the second the second-smallest, etc.
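A runnable sketch of that answer, using random data in the question's shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
arrays = [rng.random((500, 500)) for _ in range(100)]  # the 100 input arrays

# Sort element-wise along the stacking axis, then keep the first 10 layers:
# composites[0] holds the absolute minima at each position, composites[1]
# the second-smallest values, and so on.
composites = np.sort(np.stack(arrays), axis=0)[:10]
print(composites.shape)  # (10, 500, 500)
```

If only the 10 smallest layers are needed, np.partition(np.stack(arrays), 9, axis=0)[:10] gathers them without fully sorting all 100 layers, after which a sort of just that slice puts them in order.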

formula to generate array with variables such as =B1/B2, etc

How does one generate an array whose elements are cell values or math operations on cell values? For example, how does one generate an array like {0,1,A1,B1} or {8,3,A1/B1,A1*B1} or {0,1,A1/B1-2,A1*B1+3}?
I have not found anything on the net regarding this topic.

how to remove duplicate numbers from unsorted array

I was given the following question in a technical interview:
How do I remove duplicates from an unsorted array?
One option I was thinking of:
Create a hash map with the frequency of each number in the array
Go through the array and do an O(1) lookup in the hash map. If the number's count is greater than 1, remove it from the array and decrement its count (so one copy survives).
Is there a more efficient way?
Another option
Sort the array in O(n log n) using quick sort or merge sort
Then iterate through the array and remove adjacent duplicates
Why is option 1 better than option 2?
I cannot use any functions that already do the work like array_unique.
Instead of removing the object from the array when the hash map says there is a duplicate, why don't you build a new array, adding each item to it only if it hasn't been seen before? The idea is to save the extra step of having two arrays with equal overhead at the start. PHP sucks at garbage collection, so if you start with a massive array, even though you unset its values, the memory might still be hanging around.
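A minimal sketch of that build-a-new-array idea (in Python rather than PHP; a set of seen values plays the role of the hash map):

```python
def dedupe(items):
    seen = set()              # hash-based membership test, O(1) on average
    result = []
    for item in items:        # one O(n) pass
        if item not in seen:
            seen.add(item)
            result.append(item)   # keep only the first occurrence
    return result

print(dedupe([3, 1, 3, 2, 1]))  # [3, 1, 2] -- first-appearance order preserved
```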
For the first option, the time complexity is O(n): creating the hash map is O(n) and iterating through the array is O(n), so O(n) in total.
For the second option, the time complexity is O(n log n): the sort is O(n log n) and the iteration is O(n), so the sort dominates.
So the first option is asymptotically better. Hope this helps :)
If you have no constraints on creating another data structure to track state but must mutate the array in-place and only remove duplicates without sorting, then a variant of your first option may be best.
I propose you build a hash map as you iterate the array, using the array values as keys and any garbage (a boolean set to TRUE, perhaps) as the values. As you encounter each item in the array (an O(n) pass), check the map: if the key exists, delete the item from the array; if not, add the key-value pair. There is no need to track counts; you only need to track what has been encountered.
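A sketch of that in-place variant, again in Python. Since deleting from the middle of an array is itself O(n), this version realizes "delete the item" by compacting survivors leftward with a write index and truncating once at the end:

```python
def dedupe_in_place(items):
    seen = {}                     # value -> TRUE once encountered, as above
    write = 0
    for item in items:            # single O(n) pass
        if item not in seen:
            seen[item] = True
            items[write] = item   # keep the first occurrence, shifted left
            write += 1
    del items[write:]             # one truncation drops the leftover tail
    return items

nums = [3, 1, 3, 2, 1]
print(dedupe_in_place(nums))  # [3, 1, 2], mutated in place
```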
Many languages have a built-in set abstract data type that basically performs this operation during construction or an add-all operation. If you can provide a separate data structure with duplicates removed, just create a new set from the array's items and let that data structure remove the duplicates.

Way to judge if two Arrays are identical?

By identical I mean the two arrays contain the same elements; the order of the elements does not matter here.
The solution I came up with is like this (which turns out to be a wrong approach, as pointed out in the comments):
If the sizes of the two arrays are equal,
if true, find all elements of Array A in Array B;
if all are found, find all elements of Array B in Array A;
if all are found, then I conclude the two arrays are identical.
However, is there a better algorithm in terms of time complexity?
Let's say you have a User[] array 1 and a User[] array 2. You can loop through array 1 and add its elements to a Dictionary<User, int> where the key is the user and the value is a count. Then you loop through the second array, and for each user in array 2 you decrement the count in the dictionary (if the count is greater than 1) or remove the entry (if the count is 1). If the user isn't in the dictionary at all, then you can stop: the arrays don't match.
If you get to the end and had previously checked that the lengths of the arrays are the same, then the arrays match. If you hadn't checked the lengths earlier (which of course you still should have), then you can instead verify that the dictionary is empty after completely looping through array 2.
I don't know exactly what the performance of this is, but it will be faster than sorting both lists and comparing them element by element. It takes more memory, but if the arrays are not super large then memory usage shouldn't be an issue.
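For reference, the same counting scheme is a few lines in Python, where collections.Counter builds the value-to-count dictionary in one pass:

```python
from collections import Counter

def same_elements(a, b):
    # Equal as multisets: same values with the same multiplicities, order ignored.
    return len(a) == len(b) and Counter(a) == Counter(b)

print(same_elements([1, 2, 2, 3], [3, 2, 1, 2]))  # True
print(same_elements([0, 1, 1], [0, 0, 1]))        # False -- counts differ
```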
First, check the size of the two arrays. If they aren't equal then they don't contain the same elements.
After that, sort both arrays in O(n lg(n)). Now just compare the arrays element-by-element in O(n). Since they are sorted, if they are equal then they will match at every position.
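As a sketch, the sort-based check is only a couple of lines in Python:

```python
def same_elements_sorted(a, b):
    # O(n log n) overall: sort both, then rely on positional equality.
    return len(a) == len(b) and sorted(a) == sorted(b)

print(same_elements_sorted([1, 2, 2, 3], [3, 2, 1, 2]))  # True
```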
Your approach doesn't work, as it would treat [0, 1, 1] as being equal to [0, 0, 1]. Every item in A is in B and vice versa. You'd need to count the number of occurrences of each item in A and B. (You then don't need to do both, of course, if you've already checked the lengths.)
If the contents are sortable, you can sort both and then compare element-by-element, of course. That only works if you can provide a total ordering to elements though.
Sort both arrays according to a strict ordering and compare them term-by-term.
Update: Summarizing some of the points that have been raised, here is the efficiency you can generally expect:
strict ordering available: O(N log N) for sorting, plus an O(N) term-by-term comparison
equality and hash function available: compare hash counts term-by-term, plus actual object comparisons in the event of hash collisions.
only equality, no hashing available: must count each element or copy one container and remove (efficiency depends on the container).
The complexity of comparison term-by-term is linear in the position of the first mismatch.
My idea is to loop through the first array and look for items in the second array. The only issue of course is that you can't use an item in the second array twice. So, make a third array of booleans. This array indicates which items in array 2 'have been used'.
Loop through the first array. Inside that loop, go through each element of the second array to see if you can 'find' the current element, also checking the third array to verify that the position in the second array hasn't already been used. If you find a match, update that position in the third array and move on.
You should only need to do this once. If you finish having found a match for every item in array 1, then (since the lengths are equal) no unmatched items remain in array 2 either. You don't need to loop through array 2 and check that array 1 contains each of its items.
Of course before you start all that check that the lengths are the same.
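A Python sketch of that marking scheme; the nested scan makes it O(n^2), which is the price of needing nothing beyond equality comparisons:

```python
def same_elements_marking(a, b):
    if len(a) != len(b):
        return False
    used = [False] * len(b)           # which positions of b are already matched
    for x in a:
        for j, y in enumerate(b):
            if not used[j] and x == y:
                used[j] = True        # consume this occurrence of y
                break
        else:
            return False              # no unused match for x anywhere in b
    return True                       # equal lengths + everything matched

print(same_elements_marking([0, 1, 1], [1, 0, 1]))  # True
print(same_elements_marking([0, 1, 1], [0, 0, 1]))  # False
```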
If you don't mind the extra space, you can use something like a HashMap to store the (element, count) pairs of the first array and then check that the second array matches up; this would be linear in N (the size of the bigger array).
If the array sizes are identical and all of the elements in Array A are in Array B, then there is no need to verify that all of the elements in array B are in Array A. So at the very least you can omit that step.
EDIT: This depends on the definition of the problem. This shortcut works if and only if his original solution would work, which it wouldn't if the arrays can contain duplicate items and you aren't counting them or marking them as "used."
