Going between 2D and 1D array - arrays

Ok, I have an array A[4][4] and another array A[16], which are just different representations of the same data. I am given an element's position in the 2D array, but I have to access it through the 1D array. For example, if I'm told to access the element A[2][0], how would I find it in my 1D array?

In this simple example, A[2][0] maps to A[8] since you are requesting the first element in the third group of four. Similarly, A[0][2] maps to A[2] since you are requesting the third element of the first group of four. In general (A[i][j]) you are requesting the (i*4+j)-th element of A.
In the even more general case, you are requesting the (i*width+j)-th element.
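As a minimal sketch of the mapping just described, written in Python purely for illustration (the names flat_index, grid, and flat are hypothetical and not from the question):

```python
def flat_index(i, j, width):
    """Map the 2D position (i, j) to its position in the flattened 1D array."""
    return i * width + j

# A 4x4 grid whose values encode their own (row, column) position,
# and the same data laid out as one flat list, one row after another.
grid = [[10 * r + c for c in range(4)] for r in range(4)]
flat = [value for row in grid for value in row]

# A[2][0] and the flat element at index 2*4+0 are the same value.
same_element = flat[flat_index(2, 0, 4)] == grid[2][0]
```

Here width is the number of elements per row, so for a 4x4 array it is 4.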

This depends on your programming language and choice of array type. Depending on your language, arrays are either stored in row-major order or column-major order:
Edit: Java does not have multidimensional arrays, as per the documentation: In Java, a multi-dimensional array is structured as an array of arrays, i.e. an array whose elements are references to array objects. This means that each row can be of a different length and therefore the storage format is neither row-major nor column-major.
Row-major order is used in C/C++, PL/I, Python, Speakeasy and others. Column-major order is used in Fortran, MATLAB, GNU Octave, R, Julia, Rasdaman, and Scilab.
In some languages, you can also choose the order (e.g. MATLAB)
For row-major order, A[2][0] would be at A[2*4+0] (where 4 is the size of one row):
offset = row*NUMCOLS + column
For column-major order, A[2][0] would be at A[0*4+2] (where 4 is the size of one column):
offset = row + column*NUMROWS
It really depends on your programming language!
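The two offset formulas can be compared side by side. This is only a sketch in plain Python, with hypothetical helper names, that flattens a small matrix in both orders and checks that each formula finds the same element in its own layout:

```python
def row_major_offset(row, col, num_cols):
    """Offset of (row, col) when rows are stored one after another."""
    return row * num_cols + col

def col_major_offset(row, col, num_rows):
    """Offset of (row, col) when columns are stored one after another."""
    return row + col * num_rows

# A 2x3 matrix (2 rows, 3 columns) used as the reference data.
matrix = [[0, 1, 2],
          [3, 4, 5]]
num_rows, num_cols = 2, 3

# The same data flattened in each order.
row_major = [matrix[r][c] for r in range(num_rows) for c in range(num_cols)]
col_major = [matrix[r][c] for c in range(num_cols) for r in range(num_rows)]
```

With these definitions, row_major is [0, 1, 2, 3, 4, 5] and col_major is [0, 3, 1, 4, 2, 5]; the element at (1, 2) sits at offset 5 in both, but most other elements land at different offsets depending on the order.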

Related

When are arrays C-contiguous and F-contiguous simultaneously?

Under what conditions can arrays be C-contiguous and F-contiguous simultaneously?
I can think of the following:
The 1D case that is trivially C- and F-contiguous.
Similarly, multi-dimensional arrays where all dimensions are singleton dimensions except for one.
Are there any others?
You got it. An array is both C and Fortran contiguous (i.e. is both row major and column major) when it has at most 1 dimension longer than 1. Basically, vectors and scalars, plus degenerate arrays with additional "unnecessary" dimensions.
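One way to convince yourself of this rule is to compute the element strides for both layouts and compare them. The following is a rough sketch in plain Python (no NumPy), assuming every dimension has at least one element; the function names are made up for illustration:

```python
def c_strides(shape):
    """Element strides for row-major (C) order: the last index varies fastest."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def f_strides(shape):
    """Element strides for column-major (Fortran) order: the first index varies fastest."""
    strides, step = [], 1
    for dim in shape:
        strides.append(step)
        step *= dim
    return tuple(strides)

def same_layout(shape):
    """True when both orders place every element at the same offset.

    The stride of a length-1 dimension never affects an offset,
    so only dimensions longer than 1 are compared.
    """
    pairs = zip(shape, c_strides(shape), f_strides(shape))
    return all(c == f for dim, c, f in pairs if dim > 1)
```

For example, same_layout((1, 5, 1)) holds because only one dimension is longer than 1, while same_layout((2, 3)) does not.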

How are Swift arrays different than c arrays

I'm working to convert some of my ObjC code that uses primitive c arrays to Swift arrays. However, using a playground, I've found some strange behaviors.
For instance, the following is perfectly valid in Swift
var largearray : [[Float]] = []
largearray.append([0,1,2]) //3 elements
largearray.append([3,4,5]) //3 elements
largearray.append([6,7,8,9]) //-4- elements
largearray.append([10,11,12]) //3 elements
//pull those back out
largearray[1][0] //gives 3
largearray[1][2] //gives 5
//largearray[1][3] //error
largearray[2][0] //gives 6
largearray[2][2] //gives 8
largearray[2][3] //gives 9
largearray[3][0] //gives 10
I don't understand how it's possible to have mixed row lengths in Swift. Can someone explain what's going on here, because the documentation doesn't go into that kind of detail. I'm curious if it is even storing a contiguous Float array behind the scenes or not.
Then another question I have is about accessing rows or columns. In Swift I see that I can access an entire row: largearray[0] gives [0,1,2], just as largearray[2] gives [6,7,8,9]. That isn't how C arrays are indexed. (If I just specified one index for a 2D C array, it would act as a sequential index, row by column.) So, is there some way to access an entire column in Swift? In C and Swift, largearray[][2] is invalid. But I'm curious if there is some technique not mentioned in the docs, since it seems obvious that Swift is keeping track of extra information.
I should add that I will be making use of the Accelerate framework. So if any of the above "strange" ways of using a Swift array will cause performance issues on massive arrays, let me know.
How are Swift arrays different than c arrays
In C, an array is always a contiguous list of elements. In Swift, an array is a much more abstract data structure. With a C array you can make assumptions about how data is organized in memory, and you can even calculate the address of an element given the base address, element size, and an index. In Swift, not so much. Think of Swift's Array type the same way you think of NSArray in Objective-C. It's an ordered sequence of elements that provides array-like operations, but you shouldn't worry about how it stores the actual data.
I don't understand how it's possible to have a mixed row lengths is Swift.
Well, for one thing, you're really looking at an array of arrays. If an array is an object, then an array of arrays is probably implemented as a list of object pointers rather than a contiguous series of same-sized lists. You can do the same thing with NSArray, for example, because each item in an NSArray can be an object of any type.
So, is there some way to access an entire column in swift?
You'd need to iterate over the items in the array, which are themselves arrays, and examine the element at the "column" position you're interested in. I don't think there's a faster way to do it than that.
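The iteration described above might look like the following. This is a sketch in Python rather than Swift, mirroring the data from the question; the helper name column and the choice to skip rows that are too short are assumptions made for illustration:

```python
def column(rows, index):
    """Collect element `index` from every row that is long enough to have one."""
    return [row[index] for row in rows if index < len(row)]

# Same ragged data as the Swift example in the question.
large_array = [[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8, 9],
               [10, 11, 12]]

third_column = column(large_array, 2)   # one value per row
fourth_column = column(large_array, 3)  # only the 4-element row contributes
```

Because the rows are separate arrays, this is a loop over rows with one lookup each; nothing about the storage makes a column cheaper to reach than that.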

Are multidimensional arrays (like in C/C++) special cases of ragged arrays? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I had a discussion with a buddy about whether C++ and C multi-dimensional arrays are special cases of ragged arrays. One point of view was
A multi-dimensional array is not a ragged array, because each element of the multi-dimensional array has the same size. In a ragged array, at least one element has a different size than another element of the same array. ("If it doesn't have the possibility to be ragged, it's not a ragged array.").
The other point of view was
A multi-dimensional array is a special case of a ragged array, where each element has the same size. A ragged array may have rows of different sizes, but doesn't have to. ("A circle is an ellipse.").
I'm interested in getting a definite answer as to what the common definition of a "ragged array" is in computer science and whether C and C++ multidimensional arrays are ragged arrays or not.
I don't know what the "exact definition" of a ragged array should be but I believe C/C++ multidimensional arrays are definitely not ragged. The reasons for this are the following:
A ragged array is a term referring to a certain way of "storage in memory" of the arrays such that there is at least one pair of rows/cells with different sizes.
Arrays in C/C++ are pretty straight-forward. The arrays are just a "contiguous block" of memory that is reserved for the structure (array).
Other high-level languages might have different implementations to save memory etc. but C/C++ arrays don't.
So I believe we cannot call C/C++ arrays ragged.
(Opinion).
EDIT:
And this also heavily depends on the "definition" of ragged. So this is not a well-defined term and so it will be difficult to reach a conclusion. (Should avoid unproductive debates).
A C multidimensional array, if declared as a multidimensional array, cannot be ragged. A 2D array, for example, is an "array of arrays" and each row is the same length, even if you don't use every entry in the array.
int a1[2][3]; // two rows, three columns
int a2[5][8]; // five rows, eight columns
The thing about C, though, is that you can use a pointer-to-a-pointer as if it were a 2D array:
int **a3 = malloc(4 * sizeof *a3);         /* room for four row pointers */
for (int i = 0; i < 4; i++)
    a3[i] = malloc((i + 1) * sizeof **a3); /* row i holds i + 1 ints */
Now a3 can be used in a lot of cases like a 2D array, but is definitely ragged.
IMHO, real arrays cannot be called ragged, but you can certainly fake it out if you have to... the terminology you use doesn't really seem that important from that standpoint.
I'd say the difference is a conceptual one. A multidimensional array
T x[d_1][d_2]...[d_N];
denotes a contiguous area of memory of size $\prod_i d_i$, if you pardon the TeX, and it's addressed in strides: x[i_1]...[i_N] is the element at position $i_N + i_{N-1} d_N + i_{N-2} d_{N-1} d_N + \cdots + i_1 d_2 \cdots d_N$. Intermediate indexes can be taken as pointers to the respective subarrays.
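The position formula can be evaluated without forming the products explicitly by folding the indices in from left to right (Horner's scheme). A small Python sketch, with a hypothetical function name and zero-based indices assumed:

```python
def element_position(indices, dims):
    """Position of x[i_1]...[i_N] inside the contiguous block, row-major order.

    Folding left to right gives ((i_1 * d_2 + i_2) * d_3 + i_3) ... which
    expands to i_N + i_{N-1} d_N + ... + i_1 d_2 ... d_N.
    """
    if len(indices) != len(dims):
        raise ValueError("need exactly one index per dimension")
    position = 0
    for index, dim in zip(indices, dims):
        if not 0 <= index < dim:
            raise IndexError("index out of range")
        position = position * dim + index
    return position
```

For dims (2, 3, 4), the element at (1, 2, 3) is at position 1*12 + 2*4 + 3 = 23, the last slot of the 24-element block.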
A ragged array on the other hand decouples the inner "dimension" from the outer one in memory:
T * r[M];
for (size_t i = 0; i != M; ++i)
    r[i] = new T[get_size_at_row(i)];
Whether the sizes actually vary or not is immaterial here, but the conceptual difference is that a ragged array is an array of arrays, whereas a multidimensional array is a far more rigid and coherent object.
When discussing mathematical objects, I think that "ragged" is probably used as a modifier to "array" specifically to mean one that has mismatched secondary dimensions. So that's the first meaning rather than the second. Consider where the word is taken from - we don't say that a brand new handkerchief "is ragged, because it has the potential to fray around the edges, but it hasn't frayed yet". It's not ragged at all. So if we were to call a specific array "ragged", I would expect that to mean "not straight".
However, there will be some contexts in which it's worth defining "ragged array" to mean a "potentially-ragged array" rather than one that actually does have mismatches. For example, if you were going to write a "RaggedArray" class, you would not design in a class invariant that there is guaranteed to be a mismatched size somewhere, and be sure to throw an exception if someone tries to create one with all sizes equal. That would be absurd, despite that fact that you're going to call instances of this class "ragged arrays". So in that context, an array with equal sizes in all elements is a special case of a "ragged array". That's the second meaning rather than the first.
Of course, a C or C++ multi-dimensional array still would not be an instance of this class, but it might satisfy at least the read-only part of some generic interface referred to as "RaggedArray". It's basically a shortcut, that even though we know "ragged" means "having a size mismatch", for most purposes you simply can't be bothered to call that class or generic interface "PotentiallyRaggedArray" just to make clear that you won't enforce the constraint that there must be one.
There's a difference between whether a particular instance of a type has a specific property, and whether the type allows instances of it to have that property, and we frequently ignore that difference when we say that an instance of type X "is an X". Instances of type X potentially have the property, this instance doesn't have it, so this instance in fact does not potentially have the property either. Your two meanings of "ragged array" can be seen as an example of that difference. See the E-Prime crowd, and also the philosophy of Wittgenstein, for the kinds of confusion we create when we say that one thing "is" another, different thing. An instance "is not" a type, and a concrete example does not have the same potential properties as whatever it's an example of.
To specifically answer your question, I doubt that there is a universally-accepted preference for one meaning over the other in the CS literature. It's one of those terms that you just have to define for your own purposes when you introduce it to a given work (an academic paper, the documentation of a particular library, etc). If I could find two papers, one using each, then I'd have proved it, but I can't be bothered with that ;-)
My position would be that a ragged array is distinguishable from a multi-dimensional array because it has (must have!) an index that tells you where each of the sub-arrays starts. (A ragged array also needs some mechanism for keeping track of the size of each sub-array, and while knowing that the sub-arrays are of uniform size will do, it is not very general.)
You could in principle build an index that connects to the sub-arrays of a standard multi-dimensional array
int arr[6][10]; // <=== Multi-dimensional array
int **ragged = calloc(6,sizeof(int*)); // <=== Ragged array (initially empty)
for (int i = 0; i < 6; ++i) {
    ragged[i] = arr[i]; // <=== make the ragged array alias arr
}
Now I have a two-dimensional array and a two-dimensional ragged array using the same data.
So no, a language multi-dimensional array is not a special case of a ragged array.
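The idea of an index that records where each sub-array starts can be sketched over a single flat buffer. This Python version is only an illustration of that idea, not the C code above; build_index and ragged_get are hypothetical names, and the extra final entry in the index is an assumption that lets each row's length be derived from two neighbouring starts:

```python
def build_index(row_lengths):
    """Return the start offset of every row, plus one final end offset."""
    starts = [0]
    for length in row_lengths:
        starts.append(starts[-1] + length)
    return starts

def ragged_get(data, starts, row, col):
    """Look up (row, col) through the index instead of a fixed row width."""
    length = starts[row + 1] - starts[row]
    if not 0 <= col < length:
        raise IndexError("column out of range for this row")
    return data[starts[row] + col]

# Rows of length 3, 3, 4, 3 packed into one flat list of 13 values.
data = list(range(13))
starts = build_index([3, 3, 4, 3])
```

With uniform row lengths the same index still works, which matches the point that the index, not the variation in sizes, is what sets the ragged representation apart.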

What does a.{X} mean in OCaml?

I'm currently trying to port some OCaml to F#. I'm "in at the deep end" with OCaml and my F# is a bit rusty.
Anyway, the OCaml code builds fine in the OCaml compiler, but (not surprisingly) gives a load of errors in the F# compiler even with ML compatibility switched on. Some of the errors look to be reserved words, but the bulk of the errors are complaining about the .{ in lines such as:
m.(a).(b) <- w.{a + b * c};
a,b,c are integers.
I've done a lot of searching through OCaml websites, Stackoverflow, the English translation of the French O'Reilly book, etc. and cannot find anything like this. Of course it doesn't help that most search facilities have problems with punctuation characters! Yes I've found references to . being used to refer to record members, and { } being used to define records, but both together? From the usage, I assume it is some kind of associative or sparse array?
What does this syntax mean? What is the closest F# equivalent?
There is a PDF of the OCaml documentation/manual available here:
http://caml.inria.fr/distrib/ocaml-3.12/ocaml-3.12-refman.pdf
On page 496 (toward the bottom of the page), it says of generic arrays and their get method:
val get : ('a, 'b, 'c) t -> int array -> 'a
Read an element of a generic big array. Genarray.get a [|i1; ...; iN|] returns the element of a whose coordinates are i1 in the first dimension, i2 in the second dimension, ..., iN in the N-th dimension.
If a has C layout, the coordinates must be greater or equal than 0 and strictly less than the corresponding dimensions of a. If a has Fortran layout, the coordinates must be greater or equal than 1 and less or equal than the corresponding dimensions of a. Raise Invalid_argument if the array a does not have exactly N dimensions, or if the coordinates are outside the array bounds.
If N > 3, alternate syntax is provided: you can write a.{i1, i2, ..., iN} instead of Genarray.get a [|i1; ...; iN|]. (The syntax a.{...} with one, two or three coordinates is reserved for accessing one-, two- and three-dimensional arrays as described below.)
Further, it says (specifically about one dimensional arrays):
val get : ('a, 'b, 'c) t -> int -> 'a
Array1.get a x, or alternatively a.{x}, returns the element of a at index x. x must be greater or equal than 0 and strictly less than Array1.dim a if a has C layout. If a has Fortran layout, x must be greater or equal than 1 and less or equal than Array1.dim a. Otherwise, Invalid_argument is raised.
In F#, you can access array elements using the Array.get method as well. But, a closer syntax would be w.[a + b * c]. In short, in F#, use [] instead of {}.

Representing a 2D array as a 1D array [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Implementing a matrix, which is more efficient - using an Array of Arrays (2D) or a 1D array?
Performance of 2-dimensional array vs 1-dimensional array
I was looking at one of my buddy's molecular dynamics code bases the other day and he had represented some 2D data as a 1D array. So rather than having to use two indexes he only has to keep track of one but a little math is done to figure out what position it would be in if it were 2D. So in the case of this 2D array:
two_D = [[0, 1, 2],
[3, 4, 5]]
It would be represented as:
one_D = [0, 1, 2, 3, 4, 5]
If he needed to know what was in position (1,1) of the 2D array he would do some simple algebra and get 4.
Is there any performance boost gained by using a 1D array rather than a 2D array. The data in the arrays can be called millions of times during the computation.
I hope the explanation of the data structure is clear...if not let me know and I'll try to explain it better.
Thank you :)
EDIT
The language is C
For a 2-d Array of width W and height H you can represent it as a 1-d Array of length W*H where each index
(x,y)
where x is the column and y is the row of the 2-d array, is mapped to the index
i=y*W + x
in the 1-d array. Similarly, you can use the inverse mapping:
y = i / W
x = i % W
If you make W a power of 2 (W = 2^m), you can use the hack
y = i >> m;
x = (i & (W-1))
This optimization only works when W is a power of 2. A compiler can generally apply it on its own only when W is a constant known at compile time; if W is only known at run time, you have to implement it yourself.
Modulus is a slow operator in C/C++, so making it disappear is advantageous.
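The forward mapping, the inverse mapping, and the power-of-two shortcut can be checked against each other. The sketch below uses Python for brevity even though the question is about C; the function names are hypothetical, and non-negative indices are assumed:

```python
def to_index(x, y, width):
    """Map column x, row y to the 1-d index."""
    return y * width + x

def to_xy(i, width):
    """Inverse mapping using division and remainder."""
    y, x = divmod(i, width)
    return x, y

def to_xy_pow2(i, m):
    """Inverse mapping for width = 2**m using a shift and a mask."""
    width = 1 << m
    return i & (width - 1), i >> m
```

For the 2x3 example in the question (width 3), to_index(1, 1, 3) is 4, and to_xy(4, 3) gives back (1, 1). The shift-and-mask version agrees with divmod only because the width is an exact power of two.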
Also, with large 2-d arrays keep in mind that the computer stores them in memory as a 1-d array and basically figures out the indexes using the mappings I listed above.
Far more important than the way that you determine these mappings is how the array is accessed. There are two ways to do it, column major and row major. The way that you traverse is more important than any other factor because it determines if you are using caching to your advantage. Please read http://en.wikipedia.org/wiki/Row-major_order .
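To see why traversal order matters, it helps to list the 1-d offsets each traversal touches in a row-major array. This is a sketch with made-up function names; it shows only the access pattern, since Python lists will not reproduce the cache behaviour of a C array:

```python
def offsets_row_by_row(width, height):
    """Offsets visited when the inner loop runs along a row."""
    return [y * width + x for y in range(height) for x in range(width)]

def offsets_column_by_column(width, height):
    """Offsets visited when the inner loop runs down a column."""
    return [y * width + x for x in range(width) for y in range(height)]
```

For a 3-wide, 2-high array, the row-by-row walk visits 0, 1, 2, 3, 4, 5 in order, while the column-by-column walk visits 0, 3, 1, 4, 2, 5, jumping by the row width on most steps. On large arrays those jumps are what defeat the cache.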
Take a look at Performance of 2-dimensional array vs 1-dimensional array
Often 2D arrays are implemented as 1D arrays. Sometimes 2D arrays are implemented by a 1D array of pointers to 1D arrays. The first case obviously has no performance penalty compared to a 1D array, because it is identical to a 1D array. The second case might have a slight performance penalty due to the extra indirection (and additional subtle effects like decreased cache locality).
It's different for each system what kind is used, so without information about what you're using there's really no way to be sure. I'd advise to just test the performance if it's really important to you. And if the performance isn't that important, then don't worry about it.
For C, 2D arrays are 1D arrays with syntactic sugar, so the performance is identical.
You didn't mention which language this is regarding or how the 2D array would be implemented. In C, 2D arrays are actually implemented as 1D arrays where C automatically performs the arithmetic on the indices to access the right element. So it would do what your friend does anyway behind the scenes.
In other languages a 2d array might be an array of pointers to the inner arrays, in which case accessing an element would be array lookup + pointer dereference + array lookup, which is probably slower than the index arithmetic, though it would not be worth optimizing unless you know that this is a bottleneck.
oneD_index = 3 * y + x;
Here x is the position within the row (the column) and y is the row number. Instead of 3, use the width of your rows, i.e. the number of columns. This way you can convert your 2D coordinates to a 1D coordinate.

Resources