This is just a question out of curiosity.
Let's say I have an application where some objects are created dynamically at runtime.
And let's also say that in most cases the number of objects won't surpass a certain threshold, e.g. 20. As a last precondition, let's say that it is totally important to optimize the performance.
What would be the better-performing alternative?
1. First create an array[20]. When adding an object, check whether the array is already fully occupied; if so, create a new array with newArray[array.length * f], where f is a float greater than 1.0f, copy the existing elements over, replace the old array with the new one, and then add the item.
2. Simply use an ArrayList from the beginning.
???
Remember, this is totally about performance optimization.
Edit: I don't know the exact implementation in Java, so it may well be that 1.) and 2.) are quite similar. But after reading this: https://stackoverflow.com/a/10747397/1075211, I think it might make a difference in C# or some other languages?
Your first option is more or less equivalent to creating an ArrayList like new ArrayList(20).
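For illustration, here is a minimal Java sketch of both options; the class and member names are made up for the example, and the growth factor 1.5 is just a plausible choice for f:

import java.util.ArrayList;
import java.util.List;

public class GrowDemo {
    // Option 1: hand-rolled grow-on-demand array.
    static Object[] items = new Object[20];
    static int size = 0;

    static void add(Object item) {
        if (size == items.length) {
            // Full: allocate a bigger array, copy the contents over, then swap.
            Object[] bigger = new Object[(int) (items.length * 1.5f)];
            System.arraycopy(items, 0, bigger, 0, size);
            items = bigger;
        }
        items[size++] = item;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) add(i);

        // Option 2: ArrayList does essentially the same internally;
        // the constructor argument is just the initial capacity.
        List<Integer> list = new ArrayList<>(20);
        for (int i = 0; i < 100; i++) list.add(i);
    }
}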
I started coding in Smalltalk and got stuck here. I have this 2D array:
testArr := Array new: 1.
testArr at: 1
put: ((Array new: 3)
at: 1 put: '1A';
at: 2 put: '1B';
at: 3 put: '1C';
yourself).
But if I want to access, let's say, the first element of the first array, what should I write to make that happen?
Thanks!
I'm tempted to take advantage of your question & answer, and elaborate a little bit on knowledge that belongs in the Smalltalk folklore (which you may already be aware of).
As we progress in the use of Smalltalk, we will likely notice that the Array class starts to play a decreasing role in our models. Why is that? Because it takes time to find out which objects our models will produce; the balance between too many and too few is a delicate matter, and we are mostly clueless at the beginning.
Arrays and their composition are handy data structures. However, they solve the problem of organizing data at the expense of dealing with it. If the client of such a structure needs to know how data is stored as a prerequisite to act on it, then the message paradigm becomes semantically idle.
Let's imagine a matrix object. There are several ways to keep their entries: a one-dimensional array, an array of rows, an array of columns, a dictionary of (sparse) non-null entries, a triangular structure if the matrices are known to be symmetric/anti-symmetric/Hermitian, and a lot more for special cases. Of course, this variety makes no sense for the problem at hand and, in any case, it would be a bad idea to spend time considering the most general approach: In Smalltalk, generality is attained at the message, not at the storage.
Regardless of the internal organization of data, our objects should always offer protocols that are independent of the underlying structure. Back to the matrix example: even if our initial organization is an array of rows, the matrix object should work the same whether rows are arrays or more sophisticated vector objects that are also used for other ends. This means that when coding the internal access to entry (i,j) we should pretend we don't know the class of row i, only the message to access its jth element. Something along the lines of
atRow: i column: j
| row |
row := self row: i.
^row at: j
Here we are not assuming that row is an Array; we are only assuming that it understands the at: message, which is the least we can assume when talking to a row object, whatever its actual nature. Of course, this code is only good under the assumption that rows are not recreated on the fly, as they would be if our class kept collections of columns instead. But this is OK; otherwise we would only need to add another class and override this and a few other low-level messages.
In any case, the idea is to defer as much as possible any explicit knowledge about the internal organization so that it stays confined to a few private messages. One way to test whether we are applying this good practice is to make sure no low-level code is repeated in two or more methods. For instance, the use of the row: message above moves low-level code away from atRow:column:, deferring it to another message that makes sense for the (ideal) matrix protocol.
This example illustrates an important point: be suspicious of any code that needs to compose two at: messages. And, why not, enjoy the beauty of not having to declare types.
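To make the deferral concrete, here is a minimal sketch of the private row: accessor referred to above, assuming the matrix keeps its rows in an instance variable named rows (that variable name is an assumption for the example):

row: i
	"Private - answer the receiver's i-th row. This should be the only
	method that knows rows happen to be stored in an Array."
	^rows at: i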
So, the problem was in the parentheses.
^(testArr at: 1) at: 1
returns
1A
as I needed.
I think it's probably a simple answer but I thought I'd quickly check...
Let's say I'm adding Ints to an array at various points in my code, and then later I want to find out whether the array contains a certain Int...
var array = [Int]()
array.append(2)
array.append(4)
array.append(5)
array.append(7)
if array.contains(7) { print("There's a 7 alright") }
Is this heavier, performance-wise, than if I created a dictionary?
var dictionary = [Int:Int]()
dictionary[7] = 7
if dictionary[7] != nil { print("There's a value for key 7")}
Obviously there are reasons like wanting to eliminate the possibility of duplicate entries of the same number... but I could also do that with a Set. I'm mainly just wondering about the performance of dictionary[key] vs array.contains(value).
Thanks for your time
Generally speaking, Dictionaries provide constant-time, i.e. O(1), access, which means checking whether a value exists and updating it are faster than with an Array, where searching for a value is O(n) (it may have to scan every element). If those are things that you need to optimize for, then a Dictionary is a good choice. However, since dictionaries enforce uniqueness of keys, you cannot insert multiple values under the same key.
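As a rough sketch of the trade-off (a Set is included since the question mentions it as the usual middle ground):

let array = [2, 4, 5, 7]
let set: Set<Int> = [2, 4, 5, 7]
let dictionary = [2: 2, 4: 4, 5: 5, 7: 7]

print(array.contains(7))      // O(n): scans elements until a match is found
print(set.contains(7))        // O(1) on average: hash lookup, no values attached
print(dictionary[7] != nil)   // O(1) on average: hash lookup on the key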
Based on the question, I would recommend for you to read Ray Wenderlich's Collection Data Structures to get a more holistic understanding of data structures than I can provide here.
I did some sampling!
I edited your code so that the print statements are empty.
I ran the code 1,000,000 times. Each time I measured how long it takes to access the dictionary and the array separately. Then I subtracted dictTime from arrTime (arrTime - dictTime) and saved this number each time.
Once it finished, I took the average of the results.
The result is 23150, meaning that over 1,000,000 tries the array was faster to access by 23150 ns on average.
The max difference was 2426737 and the min was -5711121.
Here are the results on a graph (image not reproduced here):
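For reference, a measurement of this shape can be reproduced with a sketch like the following; the container contents and bookkeeping names are assumptions, not the original benchmark code:

import Dispatch

let array = [2, 4, 5, 7]
let dictionary = [7: 7]
var diffs: [Int64] = []

for _ in 0..<1_000_000 {
    let arrStart = DispatchTime.now().uptimeNanoseconds
    _ = array.contains(7)
    let arrTime = Int64(DispatchTime.now().uptimeNanoseconds - arrStart)

    let dictStart = DispatchTime.now().uptimeNanoseconds
    _ = dictionary[7] != nil
    let dictTime = Int64(DispatchTime.now().uptimeNanoseconds - dictStart)

    // A positive difference means the array access took longer on that iteration.
    diffs.append(arrTime - dictTime)
}

print(diffs.reduce(0, +) / Int64(diffs.count))   // average difference in nanoseconds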
In declaring an array in VB, would you ever leave the zero element empty and adjust the code to make it more user friendly?
This is for Visual Basic 2008
No, I wouldn't do that. It seems like it might help maintainability, but that's a very short-sighted view.
Think about it this way. It only takes each programmer who has to understand and maintain the code a short amount of time to get comfortable with zero-indexed arrays. But if you're using one-based arrays, which are unlike those found in almost all other VB.NET code, and in fact almost every other common programming language, it will take everyone on the team much longer. They'll be constantly making mistakes, tripping up because their natural assumptions aren't accurate in this one special case.
I know how it feels. When I worked in VB 6, I loved one-based arrays. They were very natural for the type of data I was storing, and I used them all over the place. They were perfectly documentable there, because VB 6 gave you explicit syntax to specify the upper and lower bounds of an array. That's not the case in VB.NET (a newer, but incompatible version of the Visual Basic language), where all arrays have to be zero-indexed. I had a hard time switching to VB.NET's zero-based arrays for the first couple of days. After that initial period of adjustment, I can honestly say I've never looked back.
Some might argue that leaving the first element of every array empty would consume extra memory needlessly. While that's obviously true, I think it's a secondary reason behind the one I presented above. Good developers write code for others to read, so I commend you for considering how to make your code logical and understandable. You're on the right path by asking this question. But in the long run, I don't think this decision is a good one.
There might be a handful of exceptions in very specific cases, depending on the type of data that you're storing in the array. But again, failing to do this across the board seems like it would hurt readability in the aggregate, rather than helping it. It's not particularly counter-intuitive to simply write the following, once you've learned how arrays are indexed:
For i As Integer = 0 To (myArray.Length - 1)
'Do work
Next
And remember that in VB.NET, you can also use the For Each statement to iterate through your array elements, which many people find more readable. For example:
For Each i As Integer In myArray
'Do work
Next
First, it is about being programmer-friendly, not user-friendly. The user will never know whether the code is 0-based or 1-based.
Second, 0-based is the default and will be used more and more.
Third, 0-based is more natural to the computer. At the lowest level a bit has two states, 0 and 1, not 1 and 2.
I have upgraded a couple of VB6 projects to VB.NET. Converting to 0-based arrays at the beginning is better than debugging the code later.
Most of my VB.Net arrays are 0-based and every element is used. That's usual in VB.Net and code mustn't surprise the reader. Readability is vital.
Any exceptions? Maybe if I had a program ported from VB6, so it used 0-based arrays with unused initial elements, and it needed a small change, I might match the pattern of the existing code. Least surprise again.
99 times out of 100 the question shouldn't arise because you should be using List(Of T) rather than an array!
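A minimal sketch of that List(Of T) alternative (the contents are invented for the example):

Imports System
Imports System.Collections.Generic

Module Demo
    Sub Main()
        ' List(Of T) grows on demand, so the "empty zero element" question never arises.
        Dim names As New List(Of String)()
        names.Add("Alice")
        names.Add("Bob")
        For Each name As String In names
            Console.WriteLine(name)
        Next
    End Sub
End Module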
Who are the "users" that are going to see the array indexes? Any good developer will be able to handle a zero-indexed array and no real user should ever see them. If the user has to interact with the array, then make an actually user-friendly system for doing so (text or a 1-based virtual index or whatever is called for).
In classic Visual Basic (VB6) it is possible to declare an array starting from 1, if you find a 0-based array inconvenient:
Dim array(1 To 10) As Integer
Note that VB.NET (including Visual Basic 2008) no longer allows a non-zero lower bound; all arrays are 0-based. It is just a matter of taste. I use 1-based arrays in classic Visual Basic but 0-based arrays in C ;)
Edited...
Thanks to everyone who tried to help me!!!
I am trying to do a Finite Element Analysis in Mathematica. We can obtain all the local stiffness matrices, each of dimension 8x8. There are 2000 of them; they are similar but not the same. Every local stiffness matrix is represented as a function named KK; for example, KK[1] is the local stiffness matrix of the first element.
I am trying to assemble all the local matrices into the global stiffness matrix. To make it easy:
Do[K[e][i][j] = KK[[e]][[i]][[j]], {e, 2000}, {i, 8}, {j, 8}]
Here is my question: can this assignment affect the analysis time? If yes, what can I do to improve it?
In MATLAB this is called a 3D array, but I don't know what it is called in Mathematica.
What are the advantages and disadvantages of this kind of representation in Mathematica? Is it faster, or is it just the easy way?
Thanks for your help...
It is difficult to understand what your question is, so you might want to reformulate it.
As others have mentioned, there is no advantage to be expected from a switch from a 3D array to DownValues or SubValues. In fact, you would then move from accessing data structures to pattern matching, which is powerful and the real strength of Mathematica, but not very efficient for what you plan to do, so I would strongly suggest staying in the realm of ordinary arrays.
There is another thing that might not be clear for someone more familiar with MATLAB than with Mathematica: in Mathematica, the "default" arrays behave a lot like cell arrays in MATLAB. Each entry can contain arbitrary content, and they don't need to be rectangular (as High Performance Mark has mentioned, they are just expressions with head List and can roughly be compared to MATLAB cell arrays).
But if such a nested list is a rectangular array and every element of it is of the same type, it can be converted to a so-called PackedArray. PackedArrays are much more memory efficient and will also speed up many calculations; they behave in many respects like regular ("non-cell") arrays in MATLAB. This conversion is often done implicitly by functions like Table, which will often return a packed array automatically. But if you are interested in efficiency, it is a good idea to check with Developer`PackedArrayQ and convert explicitly with Developer`ToPackedArray if necessary.
If you are working with PackedArrays, speed and memory efficiency of many operations are much better, and usually comparable to vectorized operations on normal MATLAB arrays. Unfortunately, packed arrays can get "unpacked" by some operations, so if calculations become slow it is usually a good idea to check whether that has happened.
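A minimal sketch of that check (the sample array is made up):

kk = Table[N[e + i + j], {e, 2000}, {i, 8}, {j, 8}];
Developer`PackedArrayQ[kk]          (* True: Table packs this array of machine reals automatically *)
kk = Developer`ToPackedArray[kk];   (* explicit conversion; a no-op when already packed *)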
Neither "normal" arrays nor PackedArrays are restricted in the rank (called Depth in Mathematica) they can have, so you can of course create and use "3D arrays" just as you can in matlab. I have never experienced or would know of any efficiency penalties when doing so.
It probably is of interest that newer versions of Mathematica (>= 10) include the finite element method as one of the solver methods for NDSolve, so if you are not doing this as an exercise you might want to have a look at what is available already; there is quite extensive documentation about it.
A final remark: instead of kk[[e]][[i]][[j]] you can use the much more readable form kk[[e, i, j]], which is also easier and less error-prone to type...
Extended comment, I guess, but
KK[e][[i]][[j]]
is not the (e,i,j) element of a "3D array". Note the single brackets on the e. When you use the single brackets you are not denoting an array or list element but a DownValue, which is quite different from a list element.
If you do for example,
f[1]=0
f[2]=2
...
the resulting f appears similar to an array, but is actually more akin to an overloaded function in some other language. It is convenient because the indices need not be contiguous or even integers, but there is a significant performance drawback if you ever want to operate on the structure as a list.
Your 'do' loop example would almost certainly be better written as:
kk = Table[KK[e][[i]][[j]], {e, 2000}, {i, 8}, {j, 8}]
(Your loop won't even work as-is unless you previously "initialized" each of the KK[e] as an 8x8 array.)
Note that now the list elements are all double-bracketed, i.e. kk[[e]][[i]][[j]] or kk[[e,i,j]].
Is there a way to create an array/slice in Go without a hard-coded array size? Why is List ignored?
In all the languages I've worked with extensively (Delphi, C#, C++, Python), lists are very important because they can be dynamically resized, as opposed to arrays.
In Golang there is indeed a list.List struct, but I see very little documentation about it: whether in Go By Example or the three Go books that I have (Summerfield, Chisnall and Balbaert), they all spend a lot of time on arrays and slices and then skip to maps. In source code examples I also find little or no use of list.List.
It also appears that, unlike in Python, range is not supported for List, a big drawback IMO. Am I missing something?
Slices are lovely, but they still need to be based on an array with a hard-coded size. That's where List comes in.
Just about always, when you are thinking of a list, use a slice instead in Go. Slices are dynamically resized; underlying them is a contiguous block of memory which can change size.
They are very flexible, as you'll see if you read the SliceTricks wiki page.
Here is an excerpt:
Copy
b = make([]T, len(a))
copy(b, a) // or b = append([]T(nil), a...)
Cut
a = append(a[:i], a[j:]...)
Delete
a = append(a[:i], a[i+1:]...) // or a = a[:i+copy(a[i:], a[i+1:])]
Delete without preserving order
a[i], a = a[len(a)-1], a[:len(a)-1]
Pop
x, a = a[len(a)-1], a[:len(a)-1]
Push
a = append(a, x)
Update: Here is a link to a blog post all about slices from the go team itself, which does a good job of explaining the relationship between slices and arrays and slice internals.
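A quick sketch of that dynamic resizing in action (the exact capacity values depend on the Go version's growth policy):

package main

import "fmt"

func main() {
    var a []int
    for i := 0; i < 10; i++ {
        a = append(a, i)
        // cap grows in jumps: append over-allocates so that most
        // appends do not need to copy the backing array.
        fmt.Println(len(a), cap(a))
    }
}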
I asked this question a few months ago, when I first started investigating Go. Since then, every day I have been reading about Go, and coding in Go.
Because I did not receive a clear-cut answer to this question (although I had accepted one answer) I'm now going to answer it myself, based on what I have learned, since I asked it:
Is there a way to create an array/slice in Go without a hard-coded array size?
Yes. Slices do not require a hard-coded array to slice from:
var sl []int = make([]int, length, capacity)
This code allocates slice sl of size length with a capacity of capacity; length and capacity are variables that can be assigned at runtime.
Why is list.List ignored?
It appears the main reasons list.List seems to get little attention in Go are:
As has been explained in @Nick Craig-Wood's answer, there is virtually nothing that can be done with lists that cannot be done with slices, often more efficiently and with a cleaner, more elegant syntax. For example, the range construct:
for i := range sl {
sl[i] = i
}
cannot be used with list; a C-style for loop is required. And in many cases, C++ collection-style syntax must be used with lists: push_back etc.
Perhaps more importantly, list.List is not strongly typed; it is very similar to Python's lists and dictionaries, which allow mixing various types together in the collection. This seems to run contrary to the Go approach to things. Go is a very strongly typed language: for example, implicit type conversions are never allowed in Go; even an upcast from int to int64 must be explicit. But all the methods for list.List take empty interfaces, so anything goes.
One of the reasons I abandoned Python and moved to Go is this sort of weakness in Python's type system, although Python claims to be "strongly typed" (IMO it isn't). Go's list.List seems to be a sort of "mongrel", born of C++'s vector<T> and Python's list, and is perhaps a bit out of place in Go itself.
It would not surprise me if at some point in the not too distant future, we find list.List deprecated in Go, although perhaps it will remain, to accommodate those rare situations where, even using good design practices, a problem can best be solved with a collection that holds various types. Or perhaps it's there to provide a "bridge" for C family developers to get comfortable with Go before they learn the nuances of slices, which are unique to Go, AFAIK. (In some respects slices seem similar to stream classes in C++ or Delphi, but not entirely.)
Coming from a Delphi/C++/Python background, in my initial exposure to Go I found list.List more familiar than Go's slices. But as I have become more comfortable with Go, I have gone back and changed all my lists to slices. I haven't yet found anything that slice and/or map do not allow me to do, such that I need to use list.List.
I think that's because there's not much to say about them, as the container/list package is rather self-explanatory once you have absorbed the chief Go idiom for working with generic data.
In Delphi (without generics) or in C you would store pointers or TObjects in the list, and then cast them back to their real types when obtaining from the list. In C++ STL lists are templates and hence parameterized by type, and in C# (these days) lists are generic.
In Go, container/list stores values of type interface{}, a special type capable of representing values of any other (real) type by storing a pair of pointers: one to the type info of the contained value, and one to the value itself (or the value directly, if its size is no greater than the size of a pointer). So when you want to add an element to the list, you just do that, since function parameters of type interface{} accept values of any type. But when you extract values from the list and want to work with their real types, you have to either type-assert them or do a type switch on them; both approaches are just different ways to do essentially the same thing.
Here is an example taken from here:
package main

import (
    "container/list"
    "fmt"
)

func main() {
    var x list.List
    x.PushBack(1)
    x.PushBack(2)
    x.PushBack(3)

    for e := x.Front(); e != nil; e = e.Next() {
        fmt.Println(e.Value.(int))
    }
}
Here we obtain an element's value through its Value field and then type-assert it to int, the type of the originally inserted value.
You can read up on type assertions and type switches in "Effective Go" or any other introductory book. The container/list package's documentation summarizes all the methods lists support.
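A small sketch of the type-switch alternative, for when the list holds mixed types (the mixed values are made up for the example):

package main

import (
    "container/list"
    "fmt"
)

func main() {
    var x list.List
    x.PushBack(1)
    x.PushBack("two")

    for e := x.Front(); e != nil; e = e.Next() {
        // A type switch branches on the dynamic type behind interface{}.
        switch v := e.Value.(type) {
        case int:
            fmt.Println("int:", v)
        case string:
            fmt.Println("string:", v)
        }
    }
}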
Note that Go slices can be expanded via the append() builtin function. While this will sometimes require making a copy of the backing array, it won't happen every time, since Go will over-size the new array giving it a larger capacity than the reported length. This means that a subsequent append operation can be completed without another data copy.
While you do end up with more data copies than with equivalent code implemented with linked lists, you remove the need to allocate elements in the list individually and the need to update the Next pointers. For many uses the array based implementation provides better or good enough performance, so that is what is emphasised in the language. Interestingly, Python's standard list type is also array backed and has similar performance characteristics when appending values.
That said, there are cases where linked lists are a better choice (e.g. when you need to insert or remove elements from the start/middle of a long list), and that is why a standard library implementation is provided. I guess they didn't add any special language features to work with them because these cases are less common than those where slices are used.
From: https://groups.google.com/forum/#!msg/golang-nuts/mPKCoYNwsoU/tLefhE7tQjMJ
It depends a lot on the number of elements in your lists: whether a true list or a slice will be more efficient when you need to do many deletions in the 'middle' of the list.
#1 The more elements, the less attractive a slice becomes.
#2 When the ordering of the elements isn't important, it is most efficient to use a slice, deleting an element by replacing it with the last element in the slice and reslicing the slice to shrink the len by 1 (as explained in the SliceTricks wiki).
So, use a slice:
1. If the order of elements in the list is not important and you need to delete, just swap the element to delete with the last element and re-slice to (length - 1).
2. When there are many elements (whatever "many" means).
There are ways to mitigate the deletion problem, e.g. the swap trick you mentioned, or just marking the elements as logically deleted. But it's impossible to mitigate the slowness of walking linked lists.
So, use a slice:
1. If you need speed in traversal.
Unless the slice is updated way too often (deleting or adding elements at random locations), the memory contiguity of slices will offer an excellent cache hit ratio compared to linked lists.
Scott Meyers' talk on the importance of caches:
https://www.youtube.com/watch?v=WDIkqP4JbkE
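For completeness, a sketch of that delete-by-swap trick from the SliceTricks page (the sample values are made up):

package main

import "fmt"

func main() {
    a := []int{10, 20, 30, 40, 50}
    i := 1 // delete a[1] without preserving order

    // Overwrite the victim with the last element, then shrink the slice by one.
    a[i] = a[len(a)-1]
    a = a[:len(a)-1]

    fmt.Println(a) // [10 50 30 40]
}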
list.List is implemented as a doubly linked list. Array-based lists (vectors in C++, or slices in Go) are a better choice than linked lists in most situations if you don't frequently insert into the middle of the list. The amortized time complexity for append is O(1) for both the array list and the linked list, even though the array list has to extend its capacity and copy over the existing values. Array lists have faster random access, a smaller memory footprint, and, more importantly, are friendly to the garbage collector because there are no pointers inside the data structure.