Efficient method of locating adjacent elements in a matrix - arrays

I'm trying to write some code that will locate the elements that are orthogonal to a given entry in a matrix. The output needs to include what the elements are themselves, their indices, and the algorithm needs to work for the edges as well. For example, consider
A = [ 1 2 5 6 7
2 3 1 6 9
3 6 7 8 1 ]
Then if I want the elements adjacent to entry (2,2), the code would return:
[2,2,1,6] %-> these elements are orthogonal to entry (2,2)
[w,x,y,z] %-> where w,x,y,z correspond to the index of the orthogonal entries
%found (they can be linear indices or otherwise).
So I implemented my own function to do this and, well, I realized it's pretty bad. It doesn't seem to work consistently for the edges (although I could try padding the matrix and see if that fixes it; I haven't had a chance yet) and, importantly, my code loops over the entire freaking matrix, so it's very inefficient. I was wondering if someone had a quick, efficient way of doing what I've outlined above? MATLAB doesn't seem to have a function for doing this; I've checked.
I'd appreciate the help.

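const int dr[4] = {-1, 1, 0, 0};  // row offsets: up, down, left, right
const int dc[4] = {0, 0, -1, 1};  // column offsets: up, down, left, right
for (int d = 0; d < 4; ++d) {
    int i = row + dr[d], j = col + dc[d];
    // skip neighbours that fall outside the matrix
    if (i >= 0 && j >= 0 && i < MATRIX_SIZE && j < MATRIX_SIZE)
        std::cout << mat[i][j] << ' ';
}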
This is an example in C++. The output will not be very clear, but it is just an example. In C++ the rows and columns of a matrix are numbered from 0 (not 1), so the solution you gave in your example corresponds to the input (1,1) and not (2,2). The run time is O(1), of course.
row = given row argument (for example 1)
col = given column argument (for example also 1)
MATRIX_SIZE = the size of the matrix: if the matrix is nxn then MATRIX_SIZE = n, and the last index in each row/col of the matrix is n-1.

If your 2D matrix contains Wdt columns and Hgt rows, then the indexes of the neighbours of the k-th element (k is a 1-based, row-major linear index) are
top = k - Wdt // if k > Wdt
bottom = k + Wdt // if k <= Wdt * Hgt - Wdt
right = k + 1 // if (k - 1) mod Wdt < Wdt - 1
left = k - 1 // if (k - 1) mod Wdt > 0
The if-expressions are intended to exclude off-edge elements.
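The index arithmetic above can be sketched in Python as follows. This is only an illustration: the function name `neighbours` and the dictionary return format are assumptions, and the index is taken to be 1-based and row-major (MATLAB's own linear indices are column-major, so they would need converting first).

```python
def neighbours(k, wdt, hgt):
    """Return the 1-based row-major linear indices of the orthogonal
    neighbours of element k in a matrix with wdt columns and hgt rows."""
    result = {}
    if k > wdt:                      # not in the first row
        result["top"] = k - wdt
    if k <= wdt * hgt - wdt:         # not in the last row
        result["bottom"] = k + wdt
    if (k - 1) % wdt < wdt - 1:      # not in the last column
        result["right"] = k + 1
    if (k - 1) % wdt > 0:            # not in the first column
        result["left"] = k - 1
    return result


# The 3x5 matrix from the question, flattened row by row.
A = [1, 2, 5, 6, 7,
     2, 3, 1, 6, 9,
     3, 6, 7, 8, 1]
k = (2 - 1) * 5 + 2                  # entry (2,2) in 1-based (row, col) terms
idx = neighbours(k, 5, 3)
values = {name: A[i - 1] for name, i in idx.items()}
```

For entry (2,2) this yields the neighbour values 2, 6, 1 and 2 (top, bottom, right, left), and corner elements simply get fewer entries.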

Related

Obtaining a submatrix from a squared matrix in C

I want to find a way to obtain a submatrix from a bigger square matrix in C, more specifically the bottom-right submatrix. Then I want the for loop to give me all the submatrices that I can obtain from the original square matrix.
I found some code online :
int initial_matrix[3][3];
int submatrix[2][2];
for( int i = 0; i < R - r + 1; i++){
    for(int j = 0; j < C - c + 1; j++){
        submatrix[i][j] = initial_matrix[i][j];
    }
}
where :
R is the number of rows of the initial matrix (so in this case R=3)
C is the number of columns of the initial matrix (so in this case C=3)
r is the number of rows of the submatrix that i want to obtain (so in this case r=2)
c is the number of columns of the submatrix that i want to obtain (so in this case c=2)
But this loop only gives me the upper-left submatrix, while I want the bottom-right one, and then I want to extend it so that it gives me all the possible submatrices of the initial matrix.
First of all, the indices in your loops are not correct! You want to fill your target matrix for row indices 0 to r (exclusive) and column indices 0 to c (exclusive), so your loops need to look like:
for(size_t i = 0; i < r; ++i)
{
for(size_t j = 0; j < c; ++j)
{
// ...
}
}
From here on you now can simply assign your target matrix at indices i and j:
submatrix[i][j] = ...;
Problem is that these are not the same indices as in your source matrix (unless you want to use the top-left corner submatrix), so you need to translate the indices to the appropriate positions within the source matrix. Luckily, this is not difficult; if the top-left corner of the submatrix within the source matrix is at row r0 and column c0 then you simply need to add these to the corresponding loop indices, thus you get:
... = initial_matrix[r0 + i][c0 + j];
In your case this would mean e.g. [1 + i][1 + j] to get the bottom-right submatrix, with both i and j counting up from 0 to 2 (exclusive), i.e. taking the values 0 and 1.
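To make the offset idea concrete, here is a minimal Python sketch (the question is in C, so treat this as an illustration of the indexing only). The helper names `submatrix` and `all_submatrices` are hypothetical; `r0` and `c0` are the row and column of the top-left corner, as in the text above.

```python
def submatrix(matrix, r0, c0, r, c):
    """Copy the r x c block whose top-left corner is at (r0, c0)."""
    return [[matrix[r0 + i][c0 + j] for j in range(c)] for i in range(r)]


def all_submatrices(matrix, r, c):
    """Enumerate every r x c block together with its top-left corner."""
    rows, cols = len(matrix), len(matrix[0])
    return [((r0, c0), submatrix(matrix, r0, c0, r, c))
            for r0 in range(rows - r + 1)
            for c0 in range(cols - c + 1)]


initial_matrix = [[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]]
bottom_right = submatrix(initial_matrix, 1, 1, 2, 2)
blocks = all_submatrices(initial_matrix, 2, 2)
```

Note that the bounds R - r + 1 and C - c + 1 from the question belong on the corner offsets (r0, c0), not on the loops that fill the submatrix.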

Algorithm to find k smallest numbers in an array in same order using O(1) auxiliary space

For example if the array is arr[] = {4, 2, 6, 1, 5},
and k = 3, then the output should be 4 2 1.
It can be done in O(nk) steps and O(1) space.
Firstly, find the kth smallest number in kn steps: find the minimum; store it in a local variable min; then find the second smallest number, i.e. the smallest number that is greater than min; store it in min; and so on... repeat the process from i = 1 to k (each time it's a linear search through the array).
Having this value, browse through the array and print all elements that are smaller or equal to min. This final step is linear.
Care has to be taken if there are duplicate values in the array. In such a case we have to increment i several times if duplicate min values are found in one pass. Additionally, besides min variable we have to have a count variable, which is reset to zero with each iteration of the main loop, and is incremented each time a duplicate min number is found.
In the final scan through the array, we print all values smaller than min, and up to count values exactly min.
The algorithm in C would look like this:
int min = MIN_VALUE, local_min;
int count;
int i, j;
i = 0;
while (i < k) {
local_min = MAX_VALUE;
count = 0;
for (j = 0; j < n; j++) {
if ((arr[j] > min || min == MIN_VALUE) && arr[j] < local_min) {
local_min = arr[j];
count = 1;
}
else if ((arr[j] > min || min == MIN_VALUE) && arr[j] == local_min) {
count++;
}
}
min = local_min;
i += count;
}
if (i > k) {
count = count - (i - k);
}
/* final scan: print everything below min, plus at most count copies of min */
for (i = 0, j = 0; i < n; i++) {
    if (arr[i] < min) {
        printf("%d ", arr[i]);
    }
    else if (arr[i] == min && j < count) {
        printf("%d ", arr[i]);
        j++;
    }
}
where MIN_VALUE and MAX_VALUE can be some arbitrary values such as -infinity and +infinity, or MIN_VALUE = arr[0] and MAX_VALUE is set to be maximal value in arr (the max can be found in an additional initial loop).
Single pass solution - O(k) space (for O(1) space see below).
The order of the items is preserved (i.e. stable).
// Pseudo code
if ( arr.size <= k )
handle special case
array results[k]
int i = 0;
// init
for ( ; i < k; i++) { // or use memcpy()
results[i] = arr[i]
}
int max_val = max of results
for( ; i < arr.size; i++) {
if( arr[i] < max_val ) {
remove largest in results // move the remaining up / memmove()
add arr[i] at end of results // i.e. results[k-1] = arr[i]
max_val = new max of results
}
}
// for larger k you'd want some optimization to get the new max
// and maybe keep track of the position of max_val in the results array
Example:
4 6 2 3 1 5
4 6 2 // init
4 2 3 // remove 6, add 3 at end
2 3 1 // remove 4, add 1 at end
// or the original:
4 2 6 1 5
4 2 6 // init
4 2 1 // remove 6, add 1 -- if max is last, just replace
Optimization:
If a few extra bytes are allowed, you can optimize for larger k:
create an array size k of objects {value, position_in_list}
keep the items sorted on value:
new value: drop last element, insert the new at the right location
new max is the last element
sort the end result on position_in_list
for really large k use binary search to locate the insertion point
O(1) space:
If we're allowed to overwrite the data, the same algorithm can be used, but instead of using a separate array[k], use the first k elements of the list (and you can skip the init).
If the data has to be preserved, see my second answer with good performance for large k and O(1) space.
First find the Kth smallest number in the array.
Look at https://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array-set-2-expected-linear-time/
The link above shows how you can use randomized quickselect to find the k-th smallest element in O(n) time on average.
Once you have the k-th smallest element, loop through the array and print all elements that are less than or equal to it. (If the k-th smallest value occurs more than once, this can print more than k elements, so duplicates have to be counted as described above.)
int small={Kth smallest number in the array}
for(int i=0;i<array.length;i++){
if(array[i]<=small){
System.out.println(array[i]+ " ");
}
}
A baseline (complexity at most 3n-2 for k=3):
find the min M1 from the end of the list and its position P1 (store it in out[2])
redo it from P1 to find M2 at P2 (store it in out[1])
redo it from P2 to find M3 (store it in out[0])
It can undoubtedly be improved.
Solution with O(1) space and large k (for example 100,000) with only a few passes through the list.
In my first answer I presented a single pass solution using O(k) space with an option for single pass O(1) space if we are allowed to overwrite the data.
For data that cannot be overwritten, ciamej provided a O(1) solution requiring up to k passes through the data, which works great.
However, for large lists (n) and large k we may want a faster solution. For example, with n=100,000,000 (distinct values) and k=100,000 we would have to check 10 trillion items with a branch on each item + an extra pass to get those items.
To reduce the passes over n we can create a small histogram of ranges. This requires a small storage space for the histogram, but since O(1) means constant space (i.e. not depending on n or k) I think we're allowed to do that. That space could be as small as an array of 2 * uint32. Histogram size should be a power of two, which allows us to use bit masking.
To keep the following example small and simple, we'll use a list containing 16-bit positive integers and a histogram of uint32[256] - but it will work with uint32[2] as well.
First, find the k-th smallest number - only 2 passes required:
uint32 hist[256];
First pass: group (count) by multiples of 256 - no branching besides the loop
loop:
hist[(arr[i] & 0xff00) >> 8]++;
Now we have a count for each range and can calculate which bucket our k is in.
Save the total count up to that bucket and reset the histogram.
Second pass: fill the histogram again,
now masking the lower 8 bits and only for the numbers belonging in that range.
The range check can also be done with a mask
After this last pass, all values represented in the histogram are unique
and we can easily calculate where our k-th number is.
If the count in that slot (which represents our max value after restoring
with the previous mask) is higher than one, we'll have to remember that
when printing out the numbers.
This is explained in ciamej's post, so I won't repeat it here.
---
With hist[4] (2 bits per pass) and a list of 32-bit integers we would need 16 passes; hist[16] (4 bits per pass) would need 8.
The algorithm can easily be adjusted for signed integers.
Example:
k = 7
uint32_t hist[256]; // can be as small as hist[2]
uint16_t arr[]:
88
258
4
524
620
45
440
112
380
580
88
178
Fill histogram with:
hist[(arr[i] & 0xff00) >> 8]++;
hist count
0 (0-255) 6
1 (256-511) 3 -> k
2 (512-767) 3
...
k is in hist[1] -> (256-511)
Clear histogram and fill with range (256-511):
Fill histogram with:
if ((arr[i] & 0xff00) == 0x0100)
    hist[arr[i] & 0xff]++;
Numbers in this range are:
258 & 0xff = 2
440 & 0xff = 184
380 & 0xff = 124
hist count
0 0
1 0
2 1 -> k
... 0
124 1
... 0
184 1
... 0
k - 6 (first pass) = 1
k is in hist[2], which is 2 + 256 = 258
Loop through arr[] to display the numbers <= 258 in preserved order.
Take care of possible duplicate highest numbers (hist[2] > 1 in this case).
we can easily calculate how many we have to print of those.
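The two histogram passes and the final scan can be sketched in Python as below. This is a rough illustration under the same assumptions as the example (non-negative 16-bit values, a 256-slot histogram, 1 <= k <= len(arr)); the function names are hypothetical, and the `out` list is only there to return the result, where a real O(1)-space version would print directly.

```python
def kth_smallest_u16(arr, k):
    """Return (value, copies): the k-th smallest value and how many
    copies of it belong to the k smallest elements."""
    # Pass 1: histogram over the high byte.
    hist = [0] * 256
    for v in arr:
        hist[(v & 0xFF00) >> 8] += 1
    below = 0                         # elements in buckets before ours
    for high in range(256):
        if below + hist[high] >= k:
            break
        below += hist[high]
    # Pass 2: histogram over the low byte, restricted to that bucket.
    hist = [0] * 256
    for v in arr:
        if (v >> 8) == high:
            hist[v & 0xFF] += 1
    for low in range(256):
        if below + hist[low] >= k:
            break
        below += hist[low]
    return (high << 8) | low, k - below


def k_smallest_in_order(arr, k):
    kth, copies = kth_smallest_u16(arr, k)
    out = []
    for v in arr:                     # final pass keeps the original order
        if v < kth:
            out.append(v)
        elif v == kth and copies > 0:
            out.append(v)
            copies -= 1
    return out


data = [88, 258, 4, 524, 620, 45, 440, 112, 380, 580, 88, 178]
result = k_smallest_in_order(data, 7)
```

With the example data and k = 7 the k-th value is 258, matching the walkthrough above, and the result lists the seven smallest values in their original order.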
Further optimization:
If we can expect k to be in the lower ranges, we can even optimize this further by using the log2 values instead of fixed ranges:
There is a single CPU instruction to count the leading zero bits (or one bits)
so we don't have to call a standard log() function
but can call an intrinsic function instead.
This would require hist[65] for a list with 64-bit (positive) integers.
We would then have something like:
hist[ 64 - n_leading_zero_bits ]++;
This way the ranges we have to use in the following passes would be smaller.

The greatest sum in part of matrix array

I am stuck on a simple problem and am looking for a better solution than mine.
I have an integer matrix (tab[N][M]) and an integer (k), and I have to find the smallest rectangle (submatrix) whose elements sum to more than k.
So, my current attempt at a solution is:
Make an additional matrix (sum[N][M]) and an integer solution = infinity
For each 1 <= i <= N and 1 <= j <= M (treating sum[ 0 ][ j ] and sum[ i ][ 0 ] as 0)
sum[ i ][ j ] = sum[ i - 1 ][ j ] + sum[ i ][ j - 1 ] + tab[ i ][ j ] - sum[ i - 1 ][ j - 1 ]
Then look at each rectangle, e.g. the rectangle that starts at (x, y) and ends at (a, b)
Rectangle_(x,y)_(a,b) = sum[ a ][ b ] - sum[ x - 1 ][ b ] - sum[ a ][ y - 1 ] + sum[ x - 1 ][ y - 1 ]
and if Rectangle_(x,y)_(a,b) > k then solution = minimum of current_solution and (a - x + 1) * (b - y + 1)
But this solution is quite slow (quartic time). Is there any way to make it faster? I am looking for iterated logarithmic time (or worse/better). I managed to reduce my time, but not substantially.
If the matrix only contains values >= 0, then there is a linear time solution in the 1D case that can be extended to a cubic time solution in the 2D case.
For the 1D case, you do a single pass from left to right, sliding a window across the array, stretching or shrinking it as you go so that the numbers contained in the interval always sum to at least k (or breaking out of the loop if this is not possible).
Initially, set the left index bound of the interval to the first element, and the right index bound to -1, then in a loop:
Increment the right bound by 1, and then keep incrementing it until either the values inside the interval sum to > k, or end of the array is reached.
Increment the left bound to shrink the interval as small as possible without letting the values sum to less than or equal to k.
If the result is a valid interval (meaning the first step did not reach the end of the array without finding a valid interval) then compare it to the smallest so far and update if necessary.
This doesn't work if negative values are allowed, because in the second step you need to be able to assume that shrinking the interval always leads to a smaller sum, so when the sum dips below k you know that's the smallest possible for a given interval endpoint.
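A minimal Python sketch of the 1D sliding window described above, assuming all values are non-negative. The function name and the (start, end) return convention are illustrative choices, not part of the original answer.

```python
def shortest_interval_exceeding(values, k):
    """Return (start, end), inclusive, of the shortest contiguous interval
    whose sum is > k, or None if no such interval exists.
    Assumes every value is >= 0."""
    best = None
    start = 0
    total = 0                               # sum of values[start..end]
    for end, v in enumerate(values):
        total += v                          # stretch the window to the right
        if total <= k:
            continue                        # not a valid interval yet
        # shrink from the left while the sum stays above k
        while start < end and total - values[start] > k:
            total -= values[start]
            start += 1
        if best is None or end - start < best[1] - best[0]:
            best = (start, end)
    return best
```

Because start only ever moves right, each element is added and removed at most once, which is where the linear running time comes from.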
For the 2D case, you can iterate over all possible sub-matrix heights, and over each possible starting row for a given height, and perform this horizontal sweep for each row.
In pseudo-code:
Assume you have a function rectangle_sum(x, y, a, b) that returns the sum of the values from (x, y) to (a, b) inclusive and runs in O(1) time using a summed area table. Here M is the number of rows and N the number of columns.
for(height = 1; height <= M; height++) // iterate over submatrix heights
{
    for(row = 0; row <= (M-height); row++) // iterate over all starting rows
    {
        start = 0; end = -1; // initialize interval
        while(end < N) // iterate across the row
        {
            valid_interval = false;
            // increment end until the interval sums to > k:
            while(end < (N-1))
            {
                end = end + 1;
                if(rectangle_sum(start, row, end, row + height - 1) > k)
                {
                    valid_interval = true;
                    break;
                }
            }
            if(!valid_interval)
                break;
            // shrink interval by incrementing start:
            while((start < end) &&
                  (rectangle_sum(start+1, row, end, row + height - 1) > k))
                start = start + 1;
            compare (start, row), (end, row + height - 1) with current smallest
            submatrix and make it the new current if it is smaller
        }
    }
}
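For reference, a runnable Python sketch of the same sweep, including one possible summed area table. It assumes non-negative entries and a non-empty rectangular matrix; the names `build_sat`, `rect_sum` and `smallest_rectangle` are made up for this example, and "smallest" is interpreted as smallest area.

```python
def build_sat(tab):
    """Summed area table with an extra zero row and column."""
    rows, cols = len(tab), len(tab[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            sat[i + 1][j + 1] = (tab[i][j] + sat[i][j + 1]
                                 + sat[i + 1][j] - sat[i][j])
    return sat


def rect_sum(sat, left, top, right, bottom):
    """Sum of the rectangle with inclusive corners, in O(1)."""
    return (sat[bottom + 1][right + 1] - sat[top][right + 1]
            - sat[bottom + 1][left] + sat[top][left])


def smallest_rectangle(tab, k):
    """Return (area, left, top, right, bottom) of a smallest-area rectangle
    with sum > k, or None. Assumes all entries are >= 0."""
    rows, cols = len(tab), len(tab[0])
    sat = build_sat(tab)
    best = None
    for height in range(1, rows + 1):
        for top in range(rows - height + 1):
            bottom = top + height - 1
            start = 0
            for end in range(cols):          # horizontal sweep
                if rect_sum(sat, start, top, end, bottom) <= k:
                    continue
                while (start < end and
                       rect_sum(sat, start + 1, top, end, bottom) > k):
                    start += 1
                area = height * (end - start + 1)
                if best is None or area < best[0]:
                    best = (area, start, top, end, bottom)
    return best


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
```

The two outer loops run O(rows^2) times and each sweep is linear in the number of columns, giving the cubic bound mentioned above.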
I have seen a number of answers to matrix rectangle problems here which worked by solving a similar 1-dimensional problem and then applying this to every row of the matrix, every row formed by taking the sum of two adjacent rows, every sum from three adjacent rows, and so on. So here's an attempt at finding the smallest interval in a line which has at least a given sum. (Clearly, if your matrix is tall and thin instead of short and fat you would work with columns instead of rows)
Work from left to right, maintaining the sums of all prefixes of the values seen so far, up to the current position. The value of an interval ending in a position is the sum up to and including that position, minus the sum of a prefix which ends just before the interval starts. So if you keep a list of the prefix sums up to just before the current position you can find, at each point, the shortest interval ending at that point which passes your threshold. I'll explain how to search for this efficiently in the next paragraph.
In fact, you probably don't need a list of all prefix sums. Smaller prefix sums are more valuable, and prefix sums which end further along are more valuable. So any prefix sum which ends before another prefix sum and is also larger than that other prefix sum is pointless. So the prefix sums you want can be arranged into a list which retains the order in which they were calculated but also has the property that each prefix sum is smaller than the prefix sum to the right of it. This means that when you want to find the closest prefix sum which is at most a given value you can do this by binary search. It also means that when you calculate a new prefix sum you can put it into its place in the list by just discarding all prefix sums at the right hand end of the list which are larger than it, or equal to it.
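One way to sketch the pruned prefix-sum list in Python, using the standard `bisect` module for the binary search. The function name and the (start, end) return format are assumptions for illustration; the threshold test is "at least", as in the description above, and negative values are allowed.

```python
from bisect import bisect_right


def shortest_interval_at_least(values, threshold):
    """Return (start, end), inclusive, of the shortest contiguous interval
    with sum >= threshold, or None. Negative values are allowed."""
    prefix_vals = [0]    # strictly increasing prefix sums kept so far
    prefix_lens = [0]    # number of elements covered by each kept prefix
    best = None
    total = 0
    for pos, v in enumerate(values):
        total += v
        # Need a prefix p with total - p >= threshold; the rightmost such
        # entry in the list is the latest one, so the interval is shortest.
        idx = bisect_right(prefix_vals, total - threshold) - 1
        if idx >= 0:
            start = prefix_lens[idx]
            if best is None or pos - start < best[1] - best[0]:
                best = (start, pos)
        # Discard prefixes that are not smaller than the new one: they end
        # earlier and are no smaller, so they can never be the better choice.
        while prefix_vals and prefix_vals[-1] >= total:
            prefix_vals.pop()
            prefix_lens.pop()
        prefix_vals.append(total)
        prefix_lens.append(pos + 1)
    return best
```

Each prefix sum is pushed and popped at most once and each position does one binary search, so a single row costs O(n log n).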

Minimum difference between heights of Towers?

I was going through some interview questions, I saw this one
You are given the height of n towers and value k. You have to either increase or decrease the height of every tower by k. You need to minimize the difference between the height of the longest and the shortest tower and output this difference.
I think the answer will be (maxheight - k) - (minheight + k).
I have tried it on some test cases and it runs fine.
But I am not sure; I think I am missing something. Am I?
m7thon's answer explains the problem with your solution, so I'll just explain how you can actually solve this . . .
The big thing to observe is that for any given tower, if you choose to increase its height from h_i to h_i + k, then you might as well increase the height of all shorter towers: that won't affect the maximum (because if h_j < h_i, then h_j + k < h_i + k), and may help by increasing the minimum. Conversely, if you choose to decrease the height of a tower from h_i to h_i − k, then you might as well decrease the heights of all taller towers.
So while there are 2^n possible ways to choose which towers should be increased vs. decreased, we can actually ignore most of these. Some tower will be the tallest tower that we increase the height of; for all shorter towers, we will increase their height as well, and for all taller towers, we will decrease their height. So there are only n interesting ways to choose which towers should be increased vs. decreased: one for each tower's chance to be the tallest tower that we increase the height of.
[Pedantic note #1: You may notice that it's also valid to decrease the heights of all towers, in which case there's no such tower. But that's equivalent to increasing the heights of all towers: whether we add k to every height or subtract k from every height, either way we're not actually changing the max-minus-min.]
[Pedantic note #2: I've only mentioned "shorter towers" and "taller towers", but it's also possible that multiple towers have the same initial height. But this case isn't really important, because we might as well increase them all or decrease them all; there's no point increasing some and decreasing others. So the approach described here still works fine.]
So, let's start by sorting the original heights and numbering them in increasing order, so that h_1 is the original height of the originally-shortest tower and h_n is the original height of the originally-tallest tower.
For each i, try the possibility that the i-th-shortest tower is the tallest tower that we increase the height of; that is, try the possibility that we increase h_1 through h_i and decrease h_{i+1} through h_n. There are two groups of cases:
If i < n, then the final height of the finally-shortest tower is min(h_1 + k, h_{i+1} − k), and the final height of the finally-tallest tower is max(h_i + k, h_n − k). The final difference in this case is the latter minus the former.
If i = n, then we've increased the heights of all towers equally, so the final difference is just h_n − h_1.
We then take the least difference from all n of these possibilities.
Here's a Java method that implements this (assuming int-valued heights; note that h_i is arr[i-1] and h_{i+1} is arr[i]):
private static int doIt(final int[] arr, final int k) {
java.util.Arrays.sort(arr);
final int n = arr.length;
int result = arr[n - 1] - arr[0];
for (int i = 1; i < n; ++i) {
final int min = Math.min(arr[0] + k, arr[i] - k);
final int max = Math.max(arr[n - 1] - k, arr[i - 1] + k);
result = Math.min(result, max - min);
}
return result;
}
Note that I've pulled the i = n case before the loop, for convenience.
Let's say you have three towers of heights 1, 4 and 7, and k = 3. According to your reasoning the optimal minimum difference is (7 - 3) - (1 + 3) = 0. But what do you do with the tower of height 4? You need to either increase or decrease it, so the minimum difference you can achieve is in fact 3 in this example.
Even if you are allowed to keep a tower at its height, then the example 1, 5, 7 will disprove your hypothesis.
I know this does not solve the actual minimization problem, but it does show that it is not as simple as you thought. I hope this answers your question "Am I missing something?".
I assume this came from gfg.
The answer by @ruakh is perhaps the best I've found online. It'll work for most cases, but for the practice problem on gfg there are a few cases which can cause the minimum to go below 0, and the question doesn't allow any height to be < 0.
So for that, you'll need an additional check, and the rest of it is pretty much entirely inspired from ruakh's answer
class Solution {
int getMinDiff(int[] arr, int n, int k) {
Arrays.sort(arr);
int ans = arr[n-1] - arr[0];
int smallest = arr[0] + k, largest = arr[n-1]-k;
for(int i = 0; i < n-1; i++){
int min = Math.min(smallest, arr[i+1]-k);
int max = Math.max(largest, arr[i]+k);
if(min < 0) continue;
ans = Math.min(ans, max-min);
}
return ans;
}
}
I also went in for 0-based indexing for the heights to make it more obvious, but maybe that's subjective.
Edit: One case where the < 0 check is important, is when the array is
8 1 5 4 7 5 7 9 4 6 and k is 5. The expected answer for this is 8, without the < 0 check, you'd get 7.
A bit late here. Already these guys have explained you the problem and given you the solution. However, I have prepared this code myself. The code I prepared is not the best code that you should follow but gives a clear understanding of what can be done to achieve this using brute-force.
set = list(map(int, input().split()))
k = int(input())
min = 999999999
for index in range(2**len(set)):
    binary = []  # probably should have used an integer-to-binary function here
    while index != 0:
        remainder = index % 2
        index //= 2
        binary.append(remainder)
    while len(binary) != len(set):
        binary.append(0)
    binary.reverse()
    separateset = []
    flag = 0
    for i in range(len(binary)):
        if binary[i] == 0:
            separateset.append(set[i]+k)
        elif binary[i] == 1 and set[i]-k >= 0:
            separateset.append(set[i]-k)
        else:
            flag = 1
            break
    if flag == 0:
        separateset.sort()
        if min > separateset[-1] - separateset[0]:
            min = separateset[-1] - separateset[0]
print(min)
This works by enumerating all possible subsets of the set variable, encoded as binary digits. If the digit for position i is 0, k is added to set[i]; if the digit is 1 and set[i]-k >= 0, k is subtracted from set[i]. (You could swap the roles of 0 and 1; it doesn't matter as long as you cover all possible combinations of +k and -k.) The condition set[i]-k >= 0 is enforced because a negative height wouldn't make sense; if it is violated, flag becomes 1 and the inner loop breaks. If flag is still 0, all the heights are non-negative, so separateset is sorted and the difference between the tallest and the shortest tower is compared with min. At the end, min holds the minimum of all these differences.
Step 1 :
Decrease all the heights by 'k' and sort them in non-decreasing order.
Step 2 :
We need to increase some subset of heights by '2 * k' (as they were decreased by
'k' in step1, so, to effectively increase their heights by 'k' we need
to add '2*k') .
Step 3 :
Clearly if we increase the 'i'th height without increasing the
'i-1'th then, it will not be useful as the minimum is still the same
and maximum may also increase !
Step 4 :
Consider all prefixes with '2*k' added to each element of the prefix .
Then calculate and update the (max - min) value.
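Steps 1 to 4 can be sketched in Python like this. It is a rough illustration only: the function name is hypothetical, a non-empty list is assumed, and the sketch does not enforce the "no negative heights" constraint discussed in another answer.

```python
def min_height_difference(heights, k):
    """Every tower changes by exactly +k or -k; return the smallest
    achievable (max - min). Negative heights are NOT ruled out here."""
    h = sorted(x - k for x in heights)      # step 1: lower everything by k
    n = len(h)
    best = h[-1] - h[0]                     # empty prefix: nothing raised
    for i in range(1, n):                   # step 4: raise h[0..i-1] by 2k
        low = min(h[0] + 2 * k, h[i])
        high = max(h[i - 1] + 2 * k, h[-1])
        best = min(best, high - low)
    return best
```

Raising the full prefix (all n towers) gives the same difference as raising none, so the loop stops at n - 1.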
Let me know which scenario I am missing here:
class Solution:
    def getMinDiff(self, arr, n, k):
        for i in range(n):
            if arr[i] < k: arr[i] += k
            else: arr[i] -= k
        result = max(arr) - min(arr)
        print('NewArr: {}\nresult: {}'.format(arr, result))
        return result
C++ code :
int getMinDiff(int arr[], int n, int k) {
// code here
sort(arr , arr+n);
int result = arr[n - 1] - arr[0];
for (int i = 1; i < n; ++i) {
int min_ = min(arr[0] + k, arr[i] - k);
int max_ = max(arr[n - 1] - k, arr[i - 1] + k);
result = min(result, max_ - min_);
}
return result;
}
First, you need to find the average height of the towers.
Let's say the heights are 3, 7, 17, 25, 45 and k = 5.
The average is (3 + 7 + 17 + 25 + 45) / 5 = 97 / 5 = 19.4.
Now we try to bring every tower closer to the average height.
For the tower of height 3 we add 5 three times, giving 3 + (3*5) = 18 (18 is closer to the average than 23).
For the tower of height 7 we add 5 twice: 7 + (2*5) = 17 (17 is closer than 22).
Similarly, 25 becomes 25 - 5 = 20
and 45 becomes 45 - (5*5) = 20.
The heights become 18, 17, 17, 20, 20.
This approach works on GfG practice, Problem link: https://practice.geeksforgeeks.org/problems/minimize-the-heights/0/
Approach :
Find max, min element from the array. O(n). Take the average, avg = (max_element + min_element)/2.
Iterate over the array again, and for each element, check if it is less than avg or greater.
If the current element arr[i] is less than avg, then add "k" to a[i], i.e a[i] = a[i] + k.
If the current element arr[i] is greater than or equal to avg, then subtract k from a[i], i.e a[i] = a[i] - k;
Find out the minimum and maximum element again from the modified array.
Return the min(max1-min1, max2-min2), where (max1, min1) = max and min elements initially before the array was modified, and (max2, min2) are the max and min elements after doing modification.
Entire Code can be found here: https://ide.geeksforgeeks.org/56qHOM0EOA

Transpose a 1 dimensional array, that does not represent a square, in place

This question is similar to this, but instead of an array that represents a square, I need to transpose a rectangular array.
So, given a width: x and a height: y, my array has x*y elements.
If width is 4 and height is 3, and I have:
{0,1,2,3,4,5,6,7,8,9,10,11}
which represents the matrix:
0 1 2 3
4 5 6 7
8 9 10 11
I would like:
{0,4,8,1,5,9,2,6,10,3,7,11}
I know how to do it by making a new array, but I'd like to know how to do it in place like the solution for the previously mentioned question.
A simple way to transpose in place is to rotate each element into place starting from the back of the matrix. You only need to rotate a single element into place at a time, so for the example, starting with [0,1,2,3,4,5,6,7,8,9,a,b], you get:
0,1,2,3,4,5,6,7,8,9,a,b,    // step 0
                      b,    // step 1
              8,9,a,7,      // step 2
      4,5,6,8,9,a,3,        // step 3
                a,          // step 4
          8,9,6,            // step 5
    4,5,8,9,2,              // step 6
          9,                // step 7
      8,5,                  // step 8
  4,8,1,                    // step 9
    8,                      // step 10
  4,                        // step 11
0,                          // step 12
(This just shows the elements rotated into their final position on each step.)
If you write out how many elements to rotate for each element (from back to front), it forms a nice progression. For the example (width= 4, height= 3):
1,4,7,1,3,5,1,2,3,1,1,1
Or, in a slightly better structured way:
1,4,7,
1,3,5,
1,2,3,
1,1,1
Rotations of 1 element are effectively no-ops, but the progression leads to a very simple algorithm (in C++):
void transpose(int *matrix, int width, int height)
{
int count= width*height;
for (int x= 0; x<width; ++x)
{
int count_adjustment= width - x - 1;
for (int y= 0, step= 1; y<height; ++y, step+= count_adjustment)
{
int last= count - (y+x*height);
int first= last - step;
std::rotate(matrix + first, matrix + first + 1, matrix + last);
}
}
}
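For comparison, the same rotation scheme can be sketched in Python. The function name `transpose_in_place` is an assumption, and the inner loop spells out what `std::rotate(first, first + 1, last)` does: a left rotation by one element.

```python
def transpose_in_place(matrix, width, height):
    """Transpose a width x height matrix stored row by row in a flat list,
    using only single-element left rotations of sub-ranges."""
    count = width * height
    for x in range(width):
        count_adjustment = width - x - 1
        step = 1
        for y in range(height):
            last = count - (y + x * height)
            first = last - step
            # rotate matrix[first:last] left by one position
            moved = matrix[first]
            for i in range(first, last - 1):
                matrix[i] = matrix[i + 1]
            matrix[last - 1] = moved
            step += count_adjustment


m = list(range(12))          # the 4-wide, 3-high example from the question
transpose_in_place(m, 4, 3)
```

After the call, m holds 0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11, the order requested in the question. The rotations make this quadratic in the worst case, which is the price of using no extra storage.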
One way to do this, is to move each existing element of the original matrix to its new position, taking care to pick up the value at the destination index first, so that it can also be moved to its new position. For an arbitrary NxM matrix, the destination index of an element at index X can be calculated as:
X_new = ((N*X) // (M*N)) + ((N*X) % (M*N))
where the "//" operator represents integer division (the quotient) and the "%" is the modulo operator (the remainder); I'm using Python 3 syntax here.
The trouble is that you're not guaranteed to traverse all the elements in your matrix if you start at an arbitrary spot. The easiest way to work around this is to maintain a bitmap of elements that have been moved to their correct positions.
Here's some Python code that achieves this:
M = 4   # width
N = 3   # height
MN = M*N
X = list(range(0, MN))
bitmap = (1<<0) + (1<<(MN-1))  # the first and last elements never move
i = 0
while bitmap != ( (1<<MN) - 1):
    if (bitmap & (1<<i)):
        # position i is already done: pick up the next element instead
        i += 1
        xin = X[i]
        i = ((N*i)//MN) + ((N*i) % MN)
    else:
        # save the value at the destination, then drop xin into place
        xout = X[i]
        X[i] = xin
        bitmap += (1<<i)
        i = ((N*i)//MN) + ((N*i) % MN)
        xin = xout
print(X)
I've sacrificed some optimisation for clarity here. It is possible to use more complicated algorithms to avoid the bitmap -- have a look at the references in the related Wikipedia article if you're really serious about saving memory at the cost of computation.

Resources