Fundamental limitations of cell arrays, arrays of structs, and scalar structs? - arrays

I've been using Matlab on and off for decades. I thought I had a good grip on arrays, structs, cell arrays, tables, an array of structs, and a struct in which each field is an array. For the latter two, I assumed that each field needed to be of uniform type. I'm finding that no such limitation exists:
Perhaps Matlab is becoming more flexible with the years (I'm using 2015b), but it does undermine my confidence in choosing the best type of variable for a task if I find that understanding of the limitations of each type is wrong. For the purpose of this question, I can't really articulate the needs of the task because the manner in which I break down a large to-do into tasks depends on my understanding of the data types at my disposal, and their advantages/limitations.
I can (and have) read online documentation ad nauseam, and while it will walk you through code to illustrate what the data types are able to do, I haven't yet come across a succinct description of the comparative limitations between cell arrays, arrays of structs, and structs whose fields are themselves arrays -- to the point that I can use that knowledge to choose the best structure in a given situation. Basic stuff, I do find, e.g., the same field names will occur in each struct of a struct array (but as the above example shows, each field of each struct can contain highly heterogeneous data types and/or array sizes).
THE QUESTION
Can anyone point to such a comparison of limitations between cell arrays, arrays of structs, and scalar structs whose fields are themselves arrays? I'm looking for a treatment at a level that informs a coder in deciding on the best trade-off between (i) speed, (ii) memory, and (iii) readability, maintainability, and evolvability.
I've deliberately left out tables because, although I'm enamoured of their convenient access to, and subsetting of, data sets (and presentation thereof), they have proved rather slow for manipulation of data. They have their uses, and I use them liberally, but I'm not interested in them for the purpose of this comparison, which is under-the-hood algorithm coding.

I think your question eventually narrows down to these three "types" of data structures:
comparative limitations between cell arrays, arrays of structs, and structs whose fields are themselves arrays
[Note that "structs whose fields are themselves arrays" I translate as "scalar structs" here. An array of structs can also contain arbitrary arrays. My thinking becomes clear below, I hope.]
To me, these are not very different. All three are containers for heterogeneous data. (Heterogeneous data is non-uniform data, each data element is potentially of a different type and size.) Each of these statements can return an array of any type, unrelated to the type of any other array in the container:
cell array: array{i,j}
struct array: array(i,j).value
scalar struct: array.value
So it all depends on how you want to index:
array(i,j).value
       ^     ^
       A     B
If you want to index using A only, use a cell array (though you then need curly braces, of course). If you want to index using B only, use a scalar struct. If you want both A and B, use a struct array.
There is no difference in cost that I'm aware of. Each of the arrays contained in these containers takes up some space. The spatial overhead of the various containers is similar, and I have never noted a time overhead difference.
However, there is a huge difference between these two:
array(i).value % s1
array.value(i) % s2
I think that the question deals with this difference also. s1 has a lot more spatial overhead than s2:
>> s1=struct('value',num2cell(1:100))
s1 =
1×100 struct array with fields:
value
>> s2=struct('value',1:100)
s2 =
struct with fields:
value: [1×100 double]
>> whos
  Name      Size      Bytes  Class     Attributes
  s1        1x100     12064  struct
  s2        1x1         976  struct
The data needs 800 bytes, so s2 has 176 bytes of overhead, whereas s1 has 11264 (1408%)!
The reason is not the container, but the fact that we're storing one array with 100 elements in one, and 100 arrays with one element in the other. Each array has a header of a certain size that MATLAB uses to know what type of array it is, what sizes it has, to manage its storage and the delayed copy mechanism. The fewer arrays one has, the less memory one uses.
So, don't use a heterogeneous container to store scalars! These things only make sense when you need to store larger arrays, or arrays of different type or size.
The heterogeneous container that is not explicitly asked about (and after the edit explicitly not asked about) is the table. A table is similar to a scalar struct in that each column of the table is a single array, and different columns can have different types. Note that it is possible to use a cell array as a column, allowing for heterogeneous elements to be stored in a column, but tables make most sense if this is not the case.
One difference with a scalar struct is that each column must have the same number of rows. Another difference is that indexing can look like that of a cell array, a scalar struct, or a struct array.
Thus, the table imposes some constraints on the contained data, which is very beneficial in some circumstances.
However, and as the OP noted, working with tables is slower than working with structs. This is because table is a custom class, not a native type like structs and cell arrays. If you type edit table in MATLAB, you'll see the source code, how it's implemented. It's a classdef file, just like something any of us could write. Consequently, it has the same speed limitations: the JIT is not optimized for it, indexing into a table implies running a function written as an M-file, etc.
One more thing: Don't create cell arrays of structs, or scalar structs with cell arrays. This increases the levels of containers, which increases overhead (both in space and time), and makes the contents more difficult to use. I have seen questions here on SO related to difficulty accessing data, caused by this type of construct:
data{i,j}.value % A cell array with structs. Don't do this!
data.value{i,j} % A struct with cell arrays. Don't do this!
The first example is equal to a struct array (with a lot more overhead), except there is no control over the struct fields within each cell. That is, it is possible for one of the cells to not have a .value field.
The second example makes sense only if value is a different size than a second struct field. If all struct fields are (supposed to be) cell arrays of the same size like this, then use a struct array. Again, less overhead and more uniformity.

Related

Why should you use a 2D array of structs?

I'm curious whether there are cases where you would use a 2D array of structures, for example:
typedef struct
{
//member variables
}myStruct;
myStruct array2D[x][y];
//Use cases of array2D[x][y]? Do we need them?
Why should you use a 2D array of structs?
A defined structure is basically a type like any other, with which you can declare objects (though of course every type differs from the others, user-defined and standard data types are not the same, and objects of different types differ in how their memory is allocated).
You could also ask: Why should you use a two-dimensional array of char, int, or double?
The question is not specific to structures.
It depends on the context, and it's worth having a clear "structure" in one's code. Coding this way can make your code more readable and clear if you need a large number of objects of a certain type, in this case a structure.
Maybe you even want to group some objects and/or want to treat them differently. In this case, multiple array dimensions are beneficial because you can address objects of each dimension explicitly and separately.
As #buysbee mentioned in the comments:
One obvious example where it is beneficial to store structure objects in a two-dimensional array is storing the pixels of a picture. It is better to store the "pixel" structure objects in a two-dimensional array because this mirrors how a picture is naturally laid out.
We can add more questions: why should we use 3D, 4D, 5D, ... 10000000D arrays of something?
Data structures used in a project depend on the problem to be solved and the algorithm chosen.

Are there Erlang arrays "with a defined representation"?

Context:
Erlang programs running on heterogeneous nodes, retrieving and storing data
from Mnesia databases. These database entries are meant to be used for a long
time (e.g. across multiple Erlang version releases) and remain in the form of
Erlang objects (i.e. no serialization). Among the information stored, there are
currently two uses for arrays:
Large (up to 16384 elements) arrays. Fast access to an element
using its index was the basis for choosing this type of collection.
Once the array has been created, the elements are never modified.
Small (up to 64 elements) arrays. Accesses are mostly done using indices, but there are also some iterations (foldl/foldr). Both reading and replacement of the elements is done frequently. The size of the collection remains constant.
Problem:
Erlang's documentation on arrays states that "The representation is not
documented and is subject to change without notice." Clearly, arrays should not be used in my context: database entries containing arrays may be
interpreted differently depending on the node executing the program and
unannounced changes to how arrays are implemented would make them unusable.
I have noticed that Erlang features "ordsets"/"orddict" to address a similar
issue with "sets"/"dict", and am thus looking for the "array" equivalent. Do you know of any? If none exists, my strategy is likely going to be using lists of lists to replace my large arrays, and orddict (with the index as key) to replace the smaller ones. Is there a better solution?
An array is a tuple of nested tuples and integers, with each tuple having a fixed size of 10 and representing a segment of cells. Where a segment is not currently used, an integer (10) acts as a placeholder. Without the abstraction, this is, I suppose, the closest equivalent. You could indeed copy the array module from OTP into your own app, and it would then be a stable representation.
As to what you should use instead of array, that depends on the data and what you will do with it. If the data that would be in your array is fixed, then a tuple makes sense; it has constant access time for reads/lookups. Otherwise a list sounds like a winner, be it a list of lists, list of tuples, etc. However, once again, that's a shot in the dark, because I don't know your data or how you use it.
See the implementation here: https://github.com/erlang/otp/blob/master/lib/stdlib/src/array.erl
Also see Robert Virding's answer on the implementation of array here: Arrays implementation in erlang
And what Fred Hebert says about the array in A Short Visit to Common Data Structures
An example showing the structure of an array:
1> A1 = array:new(30).
{array,30,0,undefined,100}
2> A2 = array:set(0, true, A1).
{array,30,0,undefined,
{{true,undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined},
10,10,10,10,10,10,10,10,10,10}}
3> A3 = array:set(19, true, A2).
{array,30,0,undefined,
{{true,undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined},
{undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined,true},
10,10,10,10,10,10,10,10,10}}
4>

When is the best time to use a Structure or an array

I am a little new to C programming. I was writing a C program which has 3 integers to handle. I had all of them inside an array and suddenly I had a thought of why should I not use a structure.
My question here is when is the best time to use a structure and when to use an array. And is there any memory usage difference between the two in this particular case.
Any help regarding this is appreciated. Thanks!
An array is best when you want to loop through the values (which, essentially, means they're strongly related). Otherwise a structure allows you to give them meaningful names and avoids the need to document that array, e.g. myVar[1] is the name of the company and myVar[0] is its phone number, etc. as opposed to companyName, companyPhone.
The difference is about semantic information. If you want to store your information as a list where there is no semantic distinction between different members of that list, then use an array. Perhaps each member of the list represents a different value for the same thing.
If each of those integers represents something special or different, use a struct. Note the implications of using a struct, such as the fact that people expect the members to be closely related semantically.
struct has other advantages over array which can make it more powerful. For example, its ability to encapsulate multiple data types.
If you are passing this information between many functions, a structure is likely more practical (because there is no need to pass the size). It would be bad to pass an array (which decays to a pointer) and expect the callee to know how many items are in the array. Using a struct implicitly makes this part of the function contract.
In terms of size, there is no difference. A 4 byte int would typically be 4-byte aligned.
You can think of structure like an object in OOP languages, a structure ties related data into a single type and allows you to access each member of the structure using the member's name instead of array indices. If you can think of a singular name that could unify the related data then you should be using a structure.
An array can be thought of as a list of items. If the name you thought of above contains the word list or collection, or is a plural, then you should be using arrays or other collection types. The primary use of an array is to loop over it and apply the same operation to every item, or to a range of items, in the array. If you used an array but never looped over it, that is an indication that an array may not be the best data type.
I would suggest to use an array if the different things you store are logically the same data, but different instance of this. (like a list of telephone numbers or ages). And use a struct when they mean different things (like age and size) bound together because they are related to the same thing (a person).
The size is equal, since both store 3 integers without anything else. You could actually cast the struct to an array and use it like that (although you shouldn't, because it is ugly).
You could test that with this simple program:
#include <stdio.h>

struct three_numbers {
    int x;
    int y;
    int z;
};

int main(void) {
    int test[3];
    /* In C the type is "struct three_numbers"; sizeof yields a size_t, hence %zu. */
    printf("struct: %zu, array: %zu\n", sizeof(struct three_numbers), sizeof(test));
    return 0;
}
prints on my system:
struct: 12, array: 12
In my opinion, you should think first from the perspective of the design to decide which one to use. In your question you have mentioned that "I have three integers to handle". The point here is: how did you arrive at three integers?
As many others have noted, let's say you need to store details of a person. First you need to think of the person as an object, then decide what information relevant to that person you will need, and then decide what data type to use for each of those details. What you are doing instead is deciding on the data types first and then trying to work your way up.
To just put in simple words about the difference between structure and array. Structure is a Composite Data Type (or a User defined data type) whereas array is just a collection of similar data.
Use structures to group information about a single object. Use arrays to group information about multiple objects.

Are multidimensional arrays (like in C/C++) special cases of ragged arrays? [closed]

I had a discussion with a buddy about whether C++ and C multi-dimensional arrays are special cases of ragged arrays. One point of view was
A multi-dimensional array is not a ragged array, because each element of the multi-dimensional array has the same size. In a ragged array, at least one element has a different size than another element of the same array. ("If it doesn't have the possibility to be ragged, it's not a ragged array.").
The other point of view was
A multi-dimensional array is a special case of a ragged array, where each element has the same size. A ragged array may have rows of different sizes, but doesn't have to. ("A circle is an ellipse.").
I'm interested in getting a definite answer as to what the common definition of a "ragged array" is in computer science and whether C and C++ multidimensional arrays are ragged arrays or not.
I don't know what the "exact definition" of a ragged array should be but I believe C/C++ multidimensional arrays are definitely not ragged. The reasons for this are the following:
"Ragged array" is a term referring to a certain way of storing arrays in memory such that there is at least one pair of rows/cells with different sizes.
Arrays in C/C++ are pretty straight-forward. The arrays are just a "contiguous block" of memory that is reserved for the structure (array).
Other high-level languages might have different implementations to save memory etc. but C/C++ arrays don't.
So I believe we cannot call C/C++ arrays ragged.
(Opinion).
EDIT:
And this also heavily depends on the "definition" of ragged. So this is not a well-defined term and so it will be difficult to reach a conclusion. (Should avoid unproductive debates).
A C multidimensional array, if declared as a multidimensional array, cannot be ragged. A 2D array, for example, is an "array of arrays" and each row is the same length, even if you don't use every entry in the array.
int a1[2][3]; // two rows, three columns
int a2[5][8]; // five rows, eight columns
The thing about C, though, is that you can use a pointer-to-a-pointer as if it were a 2D array:
int **a3 = malloc(4 * sizeof *a3);          // four row pointers
for (int i = 0; i < 4; i++)
    a3[i] = malloc((i + 1) * sizeof **a3);  // row i holds i + 1 ints
Now a3 can be used in a lot of cases like a 2D array, but is definitely ragged.
IMHO, real arrays cannot be called ragged, but you can certainly fake it out if you have to... the terminology you use doesn't really seem that important from that standpoint.
I'd say the difference is a conceptual one. A multidimensional array
T x[d_1][d_2]...[d_N];
denotes a contiguous area of memory of size $\prod_i d_i$, if you pardon the TeX, and it's addressed in strides: x[i_1]...[i_N] is the element at position $i_N + i_{N-1} d_N + i_{N-2} d_{N-1} d_N + \cdots + i_1 d_2 \cdots d_N$. Intermediate indexes can be taken as pointers to the respective subarrays.
A ragged array on the other hand decouples the inner "dimension" from the outer one in memory:
T * r[M];
for (size_t i = 0; i != M; ++i)
    r[i] = new T[get_size_at_row(i)];
Whether the sizes actually vary or not is immaterial here, but the conceptual difference is that a ragged array is an array of arrays, whereas a multidimensional array is a far more rigid and coherent object.
When discussing mathematical objects, I think that "ragged" is probably used as a modifier to "array" specifically to mean one that has mismatched secondary dimensions. So that's the first meaning rather than the second. Consider where the word is taken from - we don't say that a brand new handkerchief "is ragged, because it has the potential to fray around the edges, but it hasn't frayed yet". It's not ragged at all. So if we were to call a specific array "ragged", I would expect that to mean "not straight".
However, there will be some contexts in which it's worth defining "ragged array" to mean a "potentially-ragged array" rather than one that actually does have mismatches. For example, if you were going to write a "RaggedArray" class, you would not design in a class invariant that there is guaranteed to be a mismatched size somewhere, and be sure to throw an exception if someone tries to create one with all sizes equal. That would be absurd, despite the fact that you're going to call instances of this class "ragged arrays". So in that context, an array with equal sizes in all elements is a special case of a "ragged array". That's the second meaning rather than the first.
Of course, a C or C++ multi-dimensional array still would not be an instance of this class, but it might satisfy at least the read-only part of some generic interface referred to as "RaggedArray". It's basically a shortcut, that even though we know "ragged" means "having a size mismatch", for most purposes you simply can't be bothered to call that class or generic interface "PotentiallyRaggedArray" just to make clear that you won't enforce the constraint that there must be one.
There's a difference between whether a particular instance of a type has a specific property, and whether the type allows instances of it to have that property, and we frequently ignore that difference when we say that an instance of type X "is an X". Instances of type X potentially have the property, this instance doesn't have it, so this instance in fact does not potentially have the property either. Your two meanings of "ragged array" can be seen as an example of that difference. See the E-Prime crowd, and also the philosophy of Wittgenstein, for the kinds of confusion we create when we say that one thing "is" another, different thing. An instance "is not" a type, and a concrete example does not have the same potential properties as whatever it's an example of.
To specifically answer your question, I doubt that there is a universally-accepted preference for one meaning over the other in the CS literature. It's one of those terms that you just have to define for your own purposes when you introduce it to a given work (an academic paper, the documentation of a particular library, etc). If I could find two papers, one using each, then I'd have proved it, but I can't be bothered with that ;-)
My position would be that a ragged array is distinguishable from a multi-dimensional array because it has (must have!) an index that tells you where each of the sub-arrays starts. (A ragged array also needs some mechanism for keeping track of the size of each sub-array; knowing that the sub-arrays are of uniform size will do, but it is not very general.)
You could in principle build an index that connects to the sub-arrays of a standard multi-dimensional array:
int arr[6][10]; // <=== Multi-dimensional array
int **ragged = calloc(6, sizeof(int*)); // <=== Ragged array (initially empty)
for (int i = 0; i < 6; ++i) {
    ragged[i] = arr[i]; // <=== make the ragged array alias arr
}
Now I have a two-dimensional array and a two-dimensional ragged array using the same data.
So no, a language multi-dimensional array is not a special case of a ragged array.

What is the actual definition of an array? [duplicate]

Possible Duplicate:
Arrays, What’s the point?
I tried to ask this question before in What is the difference between an array and a list? but my question was closed before reaching a conclusive answer (more about that).
I'm trying to understand what is really meant by the word "array" in computer science. I am trying to reach an answer, not have a discussion, as per the spirit of this website. What I'm asking is language agnostic but you may draw on your knowledge of what arrays are/do in various languages that you've used.
Ways of thinking about this question:
Imagine you're designing a new programming language and you decide to implement arrays in it; what does that mean they do? What will the properties and capabilities of those things be. If it depends on the type of language, how so?
What makes an array an array?
When is an array not an array? When it is, for example, a list, vector, table, map, or collection?
It's possible there isn't one precise definition of what an array is. If that is the case, are there any standard or near-standard assumptions about what an array is? Are there any common areas at least? Maybe there are several definitions; if so, I'm looking for the most precision in each of them.
Language examples:
(Correct me if I'm wrong on any of these).
C arrays are contiguous blocks of memory of a single type that can be traversed using pointer arithmetic or accessed at a specific offset point. They have a fixed size.
Arrays in JavaScript, Ruby, and PHP have a variable size and can store an object/scalar of any type; they can also grow or have elements removed from them.
PHP arrays come in two types: numeric and associative. Associative arrays have elements that are stored and retrieved with string keys. Numeric arrays have elements that are stored and retrieved with integers. Interestingly if you have: $eg = array('a', 'b', 'c') and you unset($eg[1]) you still retrieve 'c' with $eg[2], only now $eg[1] is undefined. (You can call array_values() to re-index the array). You can also mix string and integer keys.
At this stage I sort of suspect that C arrays are the only true arrays here, and that strictly speaking, for an array to be an array, it has to have all the characteristics I mention in that first bullet point. If that's the case then (again, these are suspicions that I'm looking to have confirmed or rejected) arrays in JS and Ruby are actually vectors, and PHP arrays are probably tables of some kind.
Final note: I've made this community wiki so if answers need to be edited a few times in lieu of comments, go ahead and do that. Consensus is in order here.
It is, or should be, all about abstraction
There is actually a good question hidden in there, a really good one, and it brings up a language pet peeve I have had for a long time.
And it's getting worse, not better.
OK: there is something that the lowly and widely disrespected Fortran got right and that my favorite languages like Ruby still get wrong: they use different syntax for function calls, arrays, and attributes. Exactly how abstract is that? In Fortran, function(1) has the same syntax as array(1), so you can change one to the other without altering the program. (I know, not for assignments, and in the case of Fortran it was probably an accident of goofy punch card character sets and not anything deliberate.)
The point is, I'm really not sure that x.y, x[y], and x(y) should have different syntax. What is the benefit of attaching a particular abstraction to a specific syntax? To make more jobs for IDE programmers working on refactoring transformations?
Having said all that, it's easy to define array. In its first normal form, it's a contiguous sequence of elements in memory accessed via a numeric offset and using a language-specific syntax. In higher normal forms it is an attribute of an object that responds to a typically-numeric message.
array |əˈrā|
noun
1 an impressive display or range of a particular type of thing : there is a vast array of literature on the topic | a bewildering array of choices.
2 an ordered arrangement, in particular
an arrangement of troops.
Mathematics: an arrangement of quantities or symbols in rows and columns; a matrix.
Computing: an ordered set of related elements.
Law: a list of jurors empaneled.
3 poetic/literary elaborate or beautiful clothing : he was clothed in fine array.
verb
[ trans. ] (usu. be arrayed) display or arrange (things) in a particular way : arrayed across the table was a buffet | the forces arrayed against him.
[ trans. ] (usu. be arrayed in) dress someone in (the clothes specified) : they were arrayed in Hungarian national dress.
[ trans. ] Law empanel (a jury).
ORIGIN Middle English (in the senses [preparedness] and [place in readiness] ): from Old French arei (noun), areer (verb), based on Latin ad- ‘toward’ + a Germanic base meaning ‘prepare.’
From FOLDOC:
array
1. <programming> A collection of identically typed data items
distinguished by their indices (or "subscripts"). The number
of dimensions an array can have depends on the language but is
usually unlimited.
An array is a kind of aggregate data type. A single
ordinary variable (a "scalar") could be considered as a
zero-dimensional array. A one-dimensional array is also known
as a "vector".
A reference to an array element is written something like
A[i,j,k] where A is the array name and i, j and k are the
indices. The C language is peculiar in that each index is
written in separate brackets, e.g. A[i][j][k]. This expresses
the fact that, in C, an N-dimensional array is actually a
vector, each of whose elements is an N-1 dimensional array.
Elements of an array are usually stored contiguously.
Languages differ as to whether the leftmost or rightmost index
varies most rapidly, i.e. whether each row is stored
contiguously or each column (for a 2D array).
Arrays are appropriate for storing data which must be accessed
in an unpredictable order, in contrast to lists which are
best when accessed sequentially. Array indices are
integers, usually natural numbers, whereas the elements of
an associative array are identified by strings.
2. <architecture> A processor array, not to be confused with
an array processor.
Also note that in some languages, when they say "array" they actually mean "associative array":
associative array
<programming> (Or "hash", "map", "dictionary") An array
where the indices are not just integers but may be
arbitrary strings.
awk and its descendants (e.g. Perl) have associative
arrays which are implemented using hash coding for faster
look-up.
If you ignore how programming languages model arrays and lists, and ignore the implementation details (and consequent performance characteristics) of the abstractions, then the concepts of array and list are indistinguishable.
If you introduce implementation details (still independent of programming language) you can compare data structures like linked lists, array lists, regular arrays, sparse arrays and so on. But then you are no longer comparing arrays and lists per se.
The way I see it, you can only talk about a distinction between arrays and lists in the context of a programming language. And of course you are then talking about arrays and lists as supported by that language. You cannot generalize to any other language.
In short, I think this question is based on a false premise, and has no useful answer.
EDIT: in response to Ollie's comments:
I'm not saying that it is not useful to use the words "array" and "list". What I'm saying is the words do not and cannot have precise and distinct definitions ... except in the context of a specific programming language. While you would like the two words to have distinct meaning, it is a fact that they don't. Just take a look at the way the words are actually used. Furthermore, trying to impose a new set of definitions on the world is doomed to fail.
My point about implementation is that when we compare and contrast the different implementations of arrays and lists, we are doing just that. I'm not saying that it is not a useful thing to do. What I am saying is that when we compare and contrast the various implementations we should not get all hung up about whether we call them arrays or lists or whatever. Rather we should use terms that we can agree on ... or not use terms at all.
To me, "array" means "ordered collection of things that is probably efficiently indexable" and "list" means "ordered collection of things that may be efficiently indexable". But there are examples of both arrays and lists that go against the trend; e.g. PHP arrays on the one hand, and Java ArrayLists on the other hand. So if I want to be precise ... in a language-agnostic context, I have to talk about "C-like arrays" or "linked lists" or some other terminology that makes it clear what data structure I really mean. The terms "array" and "list" are of no use if I want to be clear.
An array is an ordered collection of data items indexed by integer. It is not possible to be certain of anything more. Vote for this answer if you believe this is the only reasonable outcome of this question.
An array:
(1) is a finite collection of elements
(2) has ordered elements, and this order is their only structure
(3) has elements of the same type
(4) supports efficient random access
(5) has no expectation of efficient insertions
(6) may or may not support append
(1) differentiates arrays from things like iterators or generators. (2) differentiates arrays from sets. (3) differentiates arrays from things like tuples where you get an int and a string. (4) differentiates arrays from other types of lists. Maybe it's not always true, but a programmer's expectation is that random access is constant time. (5) and (6) are just there to deny additional requirements.
I would argue that a real array stores values in contiguous memory. Anything else is only called an array because it can be used like an array, but it isn't really one ("arrays" in PHP are definitely not actual arrays (non-associative)). Vectors and such are extensions of arrays, adding additional functionality.
An array is a container whose objects have no relationship to one another except their order. Abstractly, the objects are stored in a continuous space (at the high level; the low-level storage may be continuous too), so you can access them with slot[x,y,z...].
For example, given the array [2,3,5,7,1], you get 5 using slot[2] (slot[3] in some languages).
A list is a container too, but each object (or, more exactly, each object holder, such as a slot or node) carries indicators that "point" to other objects, and this is the main relationship. In general the space is not continuous at either the high or the low level, though it may be, so accessing by slot[x,y,z...] is not recommended.
For example, given |-2-3-5-7-1-|, you need to traverse from the first object to the third to get 5.
