I need to work with a list of 100,000 items, and sometimes I need to remove an item near the middle or the end.
Structure Proxy
    Dim ID As Integer
    Dim Server As String
    Dim Port As Integer
End Structure
' (100,000) would be parsed as two dimensions; 99999 is the upper bound for 100,000 elements
Dim oProxy(99999) As Proxy
What is the best way to add and delete items at any position in an array of structures?
As you know, looping to remove an item from the middle or the end can be a pain. Should I use List<> instead?
EDIT
I would like to remove an item by ID
How do you find the item you want to remove? Do you have its index available,
or do you have to search for it by server and port?
In the latter case I would use a hashed collection, such as Dictionary in .NET: http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
In that case, finding the item you want to remove is your biggest concern.
Otherwise a List<> would be fine. Don't use arrays with 100,000 items ;D
I recently posted a question about how to "group" several open workbooks (not all workbooks). A couple of techniques were mentioned, all of which I think I could make work, but one stood out as being exactly what I needed: creating a collection and putting the workbooks in it. Then I can refer to this collection throughout my program as needed. So I began reading up on collections. However, several articles described collections as being similar to arrays, which had me second-guessing which one I should use. I am having trouble understanding whether an array can even store "objects". For example, can an array store several "workbooks"?
To store workbooks in an array, we can create an array of workbooks:
Sub kjkj()
    Dim wkbks(1 To 4) As Workbook
    Set wkbks(1) = Workbooks.Open(....)
    Set wkbks(2) = Workbooks.Open(....)
    Set wkbks(3) = Workbooks.Open(....)
    Set wkbks(4) = Workbooks.Open(....)
    Dim i As Long
    For i = 1 To 4
        With wkbks(i)
            'do something
        End With
    Next i
End Sub
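For comparison with the array version above, here is a minimal sketch of the Collection approach the question mentions. The `Report*` name filter and the procedure name are assumptions made only for illustration; the point is that a Collection holds object references just as the array does, and can grow without being resized.

```vba
Sub GroupWorkbooksInCollection()
    ' Hypothetical filter: group every open workbook whose name starts with "Report"
    Dim grouped As Collection
    Dim wb As Workbook

    Set grouped = New Collection
    For Each wb In Application.Workbooks
        If wb.Name Like "Report*" Then
            grouped.Add wb, wb.Name    ' item, then an optional string key
        End If
    Next wb

    ' The collection can now be passed around and iterated like any other
    For Each wb In grouped
        Debug.Print wb.Name
    Next wb
End Sub
```

A member can later be fetched by key, for example `grouped("Report1.xlsx")` (a hypothetical file name), which a plain array does not offer.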
Before I begin, I know all of this can be accomplished with a SQL query. Just take my word that building an active directory is currently in the works, but right now this is what I have:
An excel spreadsheet of about 30k rows with about 25 columns with a list of items, using the item # as the level 1 match. Let's call it the "master" sheet.
The item #'s, which are unique identifiers, may appear multiple times, i.e.:
Item # 10000 can appear multiple times in this sheet.
So I created a dynamic array, and inserted the entire master sheet into the array using
Sub Items()
    Dim Items() As Variant
    Sheets("Master").Activate
    Items = Range("A3", "AL" & Range("a1").End(xlDown).Row)
End Sub
The user will be on another sheet within the workbook ("itemlist"), and will enter some items they need like this: Item List
I need all occurrences (whether it be a single or duplicate value) of each of the items in the "Master" to output to individual rows on another sheet.
I'm stuck how to achieve this in VBA. I'm finding it difficult to find examples of this.
Would I want to sort the array first to make finding the duplicates faster? Should I turn the user-created list into a one-dimensional array and try to find intersecting points with the 2D array? I'm not sure where to start after the creation of the "master" array.
The reason I'm using arrays instead of a bunch of index matching or iterative looping in the "Master" sheet is because the processing power/physical memory available will be inconsistent due to computing environment, so arrays seem to be the most efficient method to avoid some users taking several minutes for return values if it can process at all.
What do you mean by 'may appear multiple times'? It sounds like you need to group and sum, as you would in a database (Access or SQL Server), or do some ranking and pick the highest/lowest element in the array. Or are you saying you want to copy/paste certain elements of your array to a new sheet?
Let's say you want to copy items in column B, which have a value of 'x', and paste them into another sheet, well, just run the script below.
Sub CopyData10()
    Dim Rng As Range, cell As Range
    Dim rw As Long
    Set Rng = Worksheets("Sheet1").Range("B1:B10")
    rw = 1
    For Each cell In Rng
        If LCase(cell.Value) = "x" Then
            Worksheets("Sheet2").Cells(rw, "A") = cell.Offset(0, -1)
            rw = rw + 1
        End If
    Next
End Sub
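If the goal is instead what the question describes (every master row for each requested item number, including duplicates), one possible approach is to index the master array once with a Scripting.Dictionary and then look up each requested item. This is only a sketch under assumptions that are not in the original post: the requested items sit in column A of "itemlist" starting at row 2, the item # is the first column of the master range, and the results go to an existing sheet named "Output" (a hypothetical name).

```vba
Sub OutputMatchingRows()
    Dim master As Variant, wanted As Variant, result() As Variant
    Dim rowsByItem As Object       ' Scripting.Dictionary: item # -> Collection of row numbers
    Dim lastRow As Long, r As Long, c As Long
    Dim total As Long, outRow As Long
    Dim key As String, rowNo As Variant

    ' Load the master data into memory in one read
    With Worksheets("Master")
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        master = .Range("A3", "AL" & lastRow).Value
    End With

    ' One pass over the array: remember every row in which each item # occurs
    Set rowsByItem = CreateObject("Scripting.Dictionary")
    For r = 1 To UBound(master, 1)
        key = CStr(master(r, 1))
        If Not rowsByItem.Exists(key) Then rowsByItem.Add key, New Collection
        rowsByItem(key).Add r
    Next r

    ' Read the requested items (assumed layout: header in A1, items from A2 down).
    ' Reading one extra row guarantees that .Value returns a 2-D array.
    With Worksheets("itemlist")
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        wanted = .Range("A1").Resize(lastRow + 1, 1).Value
    End With

    ' First pass: count the matching rows so the result array can be sized once
    For r = 2 To lastRow
        key = CStr(wanted(r, 1))
        If Len(key) > 0 Then
            If rowsByItem.Exists(key) Then total = total + rowsByItem(key).Count
        End If
    Next r
    If total = 0 Then Exit Sub

    ' Second pass: copy every matching master row into the result array
    ReDim result(1 To total, 1 To UBound(master, 2))
    For r = 2 To lastRow
        key = CStr(wanted(r, 1))
        If Len(key) > 0 Then
            If rowsByItem.Exists(key) Then
                For Each rowNo In rowsByItem(key)
                    outRow = outRow + 1
                    For c = 1 To UBound(master, 2)
                        result(outRow, c) = master(rowNo, c)
                    Next c
                Next rowNo
            End If
        End If
    Next r

    ' Write everything back in a single operation
    Worksheets("Output").Range("A1").Resize(total, UBound(master, 2)).Value = result
End Sub
```

Sorting the array first is not needed with this design, because the dictionary lookup replaces the search.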
I am creating a Word userform using VBA. I store several configuration values in an array in the program code, such as the following:
Public arrConfiguration(2, 3) As Integer
where the first index (0 to 2) represents the type and the second index (0 to 3) represents the properties of each type.
However, I plan to modify the program to handle a larger amount of data (for example, 100 different types with 50 properties each).
My question is,
should I keep storing the data in an array in the program, so that it will be
Public arrConfiguration(99, 49) As Integer
or store it in an Excel file, and make the program open the Excel file and access the cells repeatedly? Which one is better?
Thank you.
I would recommend Excel. A sample data image is appended below.
To create a two-dimensional dynamic array in Excel, follow the steps below:
◾Declare the two dimensional Array
◾Resize the array
◾Store values in array
◾Retrieve values from array
Sub FnTwoDimentionDynamic()
    Dim arrTwoD() As Variant
    Dim intRows As Long, intCols As Long
    Dim i As Long, j As Long
    ' Size the array to match the used range of the sheet
    intRows = Sheet1.UsedRange.Rows.Count
    intCols = Sheet1.UsedRange.Columns.Count
    ReDim arrTwoD(1 To intRows, 1 To intCols)
    ' Copy each cell value into the array
    For i = 1 To UBound(arrTwoD, 1)
        For j = 1 To UBound(arrTwoD, 2)
            arrTwoD(i, j) = Sheet1.Cells(i, j).Value
        Next j
    Next i
    MsgBox "The value in B5 is " & arrTwoD(5, 2)
End Sub
In the Message Box you will get the following output.
Further, to visualize a two-dimensional array we could picture a row of CD racks. To make things easier, we can imagine that each CD rack could be for a different artist. Like the CDs, the racks would be identifiable by number. Below we'll define a two-dimensional array representing a row of CD racks. The strings inside the array will represent album titles.
For multidimensional arrays it should be noted that only the last dimension can be resized when the existing contents are kept with ReDim Preserve. That means that given our example above, once we created the array with two CD racks, we would not be able to add more racks; we would only be able to change the number of CDs each rack held.
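The CD-rack restriction can be sketched as follows. The array name, sizes and album title are made up for illustration; what matters is which `ReDim` statements are accepted.

```vba
Sub ResizeLastDimensionOnly()
    Dim racks() As String

    ReDim racks(1 To 2, 1 To 3)            ' 2 racks, 3 CDs each
    racks(1, 1) = "Album A"                ' hypothetical album title

    ' Allowed: only the last dimension changes, contents are kept
    ReDim Preserve racks(1 To 2, 1 To 5)
    Debug.Print racks(1, 1)                ' still "Album A"

    ' Not allowed: changing the first dimension together with Preserve.
    ' The next line, if uncommented, raises run-time error 9 (Subscript out of range).
    ' ReDim Preserve racks(1 To 3, 1 To 5)

    ' Allowed without Preserve, but the existing contents are discarded
    ReDim racks(1 To 3, 1 To 5)
End Sub
```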
You can simplify @skkakkar's code:
Dim x As Variant
x = Range("A1").CurrentRegion.Value
No ReDim, no loops.
Depending on how you see things evolving, you might want to consider accessing your Excel data via ADO, rather than OLE Automation. That way, if you decide to change your storage system to Access, SQL Server or something else, you will have less work to do.
How To Use ADO with Excel Data from Visual Basic or VBA (Microsoft)
https://support.microsoft.com/en-gb/kb/257819
Read and Write Excel Documents Using OLEDB (Codeproject)
http://www.codeproject.com/Tips/705470/Read-and-Write-Excel-Documents-Using-OLEDB
I'm a long-time user of arrays in VBA, but I recently learned a bit about hashing and I was wondering if I could use that to build more efficient searches in my arrays. To be specific, what I did was turn a two-dimensional array into a dictionary of rows where the key is a string (which of course is unique) found in a 'cell' and converted into a Double via Asc.
I guess the code below explains what I mean:
Private pHook As Object
Sub test()
    Set pHook = CreateObject("Scripting.Dictionary")
    key = StoAsc("SomeStringOneWantstoFind")
    If Not pHook.Exists(key) Then pHook.Add key, "TEST"
    d = pHook(key)
End Sub
Public Function StoAsc(stg As String) As Double
    Dim key As String
    key = ""
    For ii = 1 To Len(stg)
        S = Asc(Mid(stg, ii, 1))
        key = key & S
    Next ii
    StoAsc = CDbl(key)
End Function
It looks like it works, and it did the job of avoiding the loop when I just want to find something in the data.
But I can't get out of my mind the idea that there should be an easier and more logical path than building the hashing myself. Am I on a good path? Are there easier ways to 'hash an array' so I don't have to loop around every time I need something?
Dictionaries allow strings (or any data type except arrays) to be used as keys (and as item values). So, as you suspected, you have no need to do any hashing yourself; all you need to do is store "SomeStringOneWantstoFind" as both the key and the value.
The Dictionary object has an Exists method that tells you whether a given key is present, which you can use for the lookup.
Collections also accept a string key for each item, so you could use a Collection instead of a Dictionary, but Collections do not have an Exists method.
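As a minimal sketch of that suggestion, the string itself can serve as the key, with the row number as the item, so no conversion function is needed. The sample data and the names `data` and `rowOf` are assumptions made for illustration.

```vba
Sub FindRowByString()
    Dim data(1 To 3, 1 To 2) As Variant    ' hypothetical sample data
    Dim rowOf As Object                    ' Scripting.Dictionary, late bound
    Dim r As Long

    data(1, 1) = "alpha": data(1, 2) = 10
    data(2, 1) = "beta":  data(2, 2) = 20
    data(3, 1) = "gamma": data(3, 2) = 30

    ' Build the lookup once: string key -> row number in the array
    Set rowOf = CreateObject("Scripting.Dictionary")
    For r = 1 To UBound(data, 1)
        rowOf(data(r, 1)) = r
    Next r

    ' Later lookups need no loop over the array
    If rowOf.Exists("beta") Then
        Debug.Print data(rowOf("beta"), 2)    ' prints 20
    End If
End Sub
```

By default the dictionary compares keys case-sensitively; setting `rowOf.CompareMode = vbTextCompare` before adding any keys makes the lookup case-insensitive.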
I'm quite new to collections/dictionaries and arrays, so I created a useful crib sheet which I have shared here
I'd welcome your input, as I still feel I don't quite get it, and I'm sure you have moved on since you wrote this question.
Here's my understanding of your question and what you are doing.
In your code you convert "SomeStringOneWantstoFind" to a unique number (using Asc) and store this as the key and "TEST" as the value. I suspect in reality you would store "SomeStringOneWantstoFind" as the value.
So why are you doing this is the question!
You mention hashing. So you want to look up a text value to see if it is in the dictionary. ie find out whether "MyTextToFind" exists.
So I assume you are converting "MyTextToFind" using Asc in a similar way then using the dictionary exists to see if it is there.
This is all a bit unnecessary - I think.
Be aware that Dictionaries always need a key and an item (i.e. a value).
Which is faster? someCondition has the same probability of being true as it has of being false.
Insertion:
arrayList = Array("apple", "pear","grape")
if someCondition then
' insert "banana" element
end if
Deletion:
arrayList = Array("apple","banana","pear","grape")
if not someCondition then
' remove "banana" element
end if
It looks like it depends purely on the implementation of Insert and Remove. So which, in general, is faster? I'm leaning toward insertion because I've read that one can use CopyMemory to insert without looping. Is this the same for deletion? Does anyone have an example?
Edit:
This is VB6, not VB.NET.
For display reasons, I have to use insert rather than append.
For a delete, every item after the removed item must be shifted down.
For an insert, space must be found for the new item. If there is empty space after the array that it can annex, then this takes no time, and the only time spent is moving each item after the new item up, to make room in the middle.
If there is no available space locally, a whole new array must be allocated and every item copied.
So, when adding or deleting at the same array position, inserting could be as fast as deleting, but it may be much slower. Inserting won't be faster.
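The shifting described above can be sketched with plain loops. These helper names are hypothetical, and the routines assume a dynamic String array and a valid index; they are meant to show the work involved, not to be the fastest implementation.

```vba
' Remove the element at idx by shifting every later item down one slot.
Sub RemoveAt(arr() As String, ByVal idx As Long)
    Dim i As Long
    For i = idx To UBound(arr) - 1
        arr(i) = arr(i + 1)
    Next i
    If UBound(arr) > LBound(arr) Then
        ReDim Preserve arr(LBound(arr) To UBound(arr) - 1)
    Else
        Erase arr                          ' the last remaining element was removed
    End If
End Sub

' Insert newItem at idx: grow the array (which may reallocate and copy it),
' then shift every later item up one slot.
Sub InsertAt(arr() As String, ByVal idx As Long, ByVal newItem As String)
    Dim i As Long
    ReDim Preserve arr(LBound(arr) To UBound(arr) + 1)
    For i = UBound(arr) To idx + 1 Step -1
        arr(i) = arr(i - 1)
    Next i
    arr(idx) = newItem
End Sub

Sub Demo()
    Dim fruit() As String
    ReDim fruit(0 To 2)
    fruit(0) = "apple": fruit(1) = "pear": fruit(2) = "grape"

    InsertAt fruit, 1, "banana"            ' apple, banana, pear, grape
    RemoveAt fruit, 1                      ' apple, pear, grape
    Debug.Print Join(fruit, ", ")
End Sub
```

Both routines shift the same number of items for a given position; the insert additionally pays for growing the array.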
Both have about the same performance because both require creating a new Array. Arrays are fixed-size contiguous structures.
In order to maintain this on an insert a new Array must be created with an additional element. All of the existing values are copied into the array in their new position and then the inserted element is added.
In order to maintain this for a delete a new Array must be created with one less element. Then all of the existing entries except for the delete must be copied into the new array.
Both of these operations have essentially the same operations over nearly identical sizes. Performance won't be significantly different.
I've found an example showing that one can delete without looping as well. It looks simpler than the code to insert.
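Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

Public Sub RemoveArrayElement_Str(AryVar() As String, ByVal _
    RemoveWhich As Long)
    '// The size of the array elements.
    '// In the case of string arrays, they are
    '// simply 32-bit (4-byte) pointers to BSTRs.
    Const byteLen As Long = 4
    Dim nullPtr As Long    '// stays 0; used to clear a pointer slot

    '// The CopyMemory operation is not necessary unless
    '// we are working with an array element that is not
    '// at the end of the array
    If RemoveWhich < UBound(AryVar) Then
        '// Release the removed string first so that its
        '// memory is not leaked when its pointer is overwritten
        AryVar(RemoveWhich) = vbNullString
        '// Copy the block of string pointers starting at
        '// the position after the removed item back one spot.
        CopyMemory ByVal VarPtr(AryVar(RemoveWhich)), ByVal _
            VarPtr(AryVar(RemoveWhich + 1)), byteLen * _
            (UBound(AryVar) - RemoveWhich)
        '// The last slot now duplicates the pointer before it.
        '// Clear it so ReDim does not free a string still in use.
        CopyMemory ByVal VarPtr(AryVar(UBound(AryVar))), nullPtr, byteLen
    End If
    '// If we are removing the only remaining array element
    '// just deinitialize the array,
    '// otherwise chop the array down by one.
    If UBound(AryVar) = LBound(AryVar) Then
        Erase AryVar
    Else
        ReDim Preserve AryVar(UBound(AryVar) - 1)
    End If
End Sub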
http://www.vb-helper.com/howto_delete_from_array.html
I'd have to guess insert, because it can always just append whereas with the delete you have to worry about holes.
But what version of vb? If you're in .Net and doing deletes or inserts, you shouldn't be using an array for this at all.
On topic but not quite an answer:
Inserting and deleting are not operations that arrays are suited to. Doing them on an array goes beyond "optimization" and into bad programming.
If this gets hidden at the bottom of a call structure and someone ends up calling it repeatedly, you could take a severe performance hit. In one case I changed an array insertion sort to simply use a linked list, and it changed the runtime from 10+ hours (locking the machine) to seconds or minutes.
It was populating a listbox with ip addresses. As designed and tested on a class-c address space it worked fine, but we had requirements to work on a class-b address space without failing (Could take a while, but not hours). We were tasked with the minimum possible refactor to get it to not fail.
Don't assume you know how your hack is going to be used.