Common Lisp: Why does lparallel have problems with assigning array elements? - arrays

I've written a function that copies many elements from one array to another. I wanted to speed it up using the (pdotimes) function from lparallel. The code looks like this:
(pdotimes (i (size output))
  (setf (row-major-aref output i)
        (row-major-aref input (dostuff i))))
The (dostuff) function does arithmetic on the row-major output index i to convert it to the row-major input index. When I run this function, the results tend to look like this:
#2A((9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 0 9 9 9 9 9 9 9 9 5 5 0 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 0 0 9 9 5 5 5 5 5 5 5 5 5 5)
(9 9 9 9 9 9 9 9 9 9 5 5 5 5 5 5 5 5 5 5))
The function is supposed to catenate a matrix of 9s on the left and a matrix of 5s on the right. But notice that there are a few 0s in there too. Zeroes are the initial value for the output matrix, so that means that those elements didn't get assigned.
The non-assignment of elements is seemingly random; run the function many times and zeroes will appear in different places. For some reason, those elements are being missed.
I've tried wrapping the function in a future, like this:
(let ((f (future (pdotimes ...))))
  (force f))
But that doesn't work either. One thing I've noticed is that the larger the number of threads and the smaller the size of the array, the more elements get missed. It suggests that the array element assignments are clobbering each other somehow.
I've also tried using (pmap-into) to map the function's results into a vector that's displaced to the output, but that fails in a different way: instead of 0s showing up where elements weren't assigned, elements get assigned in the wrong places. If the array contains repeating "1 2 3 4" sub-vectors, sometimes a "1 2 2" sequence will appear, for example.
AFAIK it should be possible for threads to concurrently assign different elements in the same array, but does Common Lisp have problems with this? Do I need to implement a lock so assignments are guaranteed to happen synchronously? If simultaneous assignments were a problem, I'd expect to see more unassigned elements. Any help appreciated.
Edit: I seem to have found how to prevent this, but not the root cause. Try running this in SBCL:
;; assumes lparallel and alexandria are loaded and lparallel:*kernel* is set
(let ((output (make-array '(20 20) :initial-element 0 :element-type '(unsigned-byte 7))))
  (check-type output simple-array)
  (pdotimes (i (array-total-size output) output)
    (setf (row-major-aref output i)
          (random-elt '(1 2 3 4 5 6)))))
No zeroes will appear in the output. Now try this in SBCL:
(let ((output (make-array '(20 20) :initial-element 0 :element-type '(unsigned-byte 4))))
  (check-type output simple-array)
  (pdotimes (i (array-total-size output) output)
    (setf (row-major-aref output i)
          (random-elt '(1 2 3 4 5 6)))))
And see zeroes aplenty. I just tested this with CCL and the output was fine. I'm going to try some other CLs but it seems like this is an SBCL problem so far. For some reason, SBCL has problems doing concurrent assignments to arrays with elements smaller than 7 bits. Character arrays are fine, as are floats and t-type arrays.

This is a slightly speculative answer, but I'm reasonably sure it's correct.
If an implementation supports arrays whose element size (in bits) is smaller than the smallest object the machine can read from and write to memory, and if it stores those arrays without wasted space (which is, really, the only purpose of having them), then the only approach to updating an array element is:
read smallest object containing element from memory;
update object with element;
write back.
Since writes to different array elements can result in reading and writing the same smallest object from memory, this is not safe in the presence of multiple threads without interlocking which would generally have catastrophic performance effects.
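The lost update can be reproduced deterministically, without real threads, by interleaving the read and write steps by hand. The sketch below is in Python rather than Lisp and assumes two 4-bit elements per byte with the even index in the low nibble. That layout and the helper names (`get_nibble`, `merge_nibble`, `racy_store_pair`) are illustrative assumptions, not SBCL's actual representation.

```python
def get_nibble(buf, i):
    """Return the 4-bit element at index i of a packed buffer."""
    byte = buf[i // 2]
    return byte & 0x0F if i % 2 == 0 else byte >> 4


def merge_nibble(byte, i, value):
    """Return `byte` with the nibble belonging to element i replaced by `value`."""
    if i % 2 == 0:
        return (byte & 0xF0) | (value & 0x0F)
    return (byte & 0x0F) | ((value & 0x0F) << 4)


def racy_store_pair(buf, i, vi, j, vj):
    """Simulate two writers whose read-modify-write steps interleave."""
    seen_by_a = buf[i // 2]  # writer A reads the byte holding element i
    seen_by_b = buf[j // 2]  # writer B reads the byte holding element j
    buf[i // 2] = merge_nibble(seen_by_a, i, vi)  # A writes back
    buf[j // 2] = merge_nibble(seen_by_b, j, vj)  # B writes back its stale copy


buf = bytearray(1)  # elements 0 and 1 share this single byte, both start at 0
racy_store_pair(buf, 0, 9, 1, 5)
print(get_nibble(buf, 0), get_nibble(buf, 1))  # element 0 has reverted to 0
```

When the two indices live in different bytes (for example 0 and 2) the same interleaving loses nothing, which fits the observation that only some elements are missed.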
Probably all CL implementations on modern machines, which can't write single bits to memory, have such arrays in the form of bit arrays. SBCL also has arrays with 2-bit and 4-bit element types which, assuming machines can read and write no object smaller than 8 bits, fall into the same category. It's also possible that arrays with very large element types could suffer from the same problem, if multiple reads and writes are required to load and store an object.
It should be possible to look at the disassembly of code that uses such arrays to see the behaviour. It's probably also the case that such arrays have lower performance than ones with larger element types (experimentally this is true for SBCL on x64: code which initialises an (unsigned-byte 4) array is 2.5 times slower than that which initialises an (unsigned-byte 8) array).
As a note, I suspect strongly the right approach to getting good performance out of array-bashing code is to partition the arrays amongst the cores in a fairly smart way.
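One possible reading of "partition in a smart way" is to give each worker a contiguous index range whose start falls on a storage-unit boundary, so no two workers ever touch the same unit. The sketch below is Python, not Lisp; `aligned_chunks` and its `align` parameter (elements per atomically written unit, e.g. 2 nibbles per byte) are hypothetical, and the right value of `align` for a given implementation is an assumption to be checked.

```python
def aligned_chunks(total, workers, align):
    """Split range(total) into contiguous (start, end) chunks.

    Every chunk start is a multiple of `align`, so elements that share a
    storage unit always land in the same chunk.
    """
    units = -(-total // align)          # ceiling: number of storage units
    per_worker = -(-units // workers)   # ceiling: units handed to each worker
    step = per_worker * align
    chunks = []
    start = 0
    while start < total:
        end = min(total, start + step)
        chunks.append((start, end))
        start = end
    return chunks


for start, end in aligned_chunks(400, 4, 2):
    print(start, end)
```

Each worker would then loop sequentially over its own `(start, end)` range instead of receiving individual indices.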
That being said, here's a way to initialize an array of nibbles ((unsigned-byte 4)s) which I think should be safe on the assumption that the smallest object that can be written atomically is a byte. The trick is to write pairs of even-odd addresses at once:
(defun initialize-nibble-array (a)
  ;; the idea is to put some pattern in it I can see if it has holes
  (declare (type (array (unsigned-byte 4) *) a))
  (let ((s (array-total-size a)))
    (pdotimes (i (truncate s 2))
      (let ((rmi (* i 2)))
        (setf (row-major-aref a rmi) (mod rmi 8)
              (row-major-aref a (1+ rmi)) (mod (1+ rmi) 8))))
    (when (oddp s)
      ;; if the array has an odd number of elements we've missed one
      ;; at the end
      (setf (row-major-aref a (- s 1)) (mod (- s 1) 8)))
    a))

I wrote a minimal example as follows (uses lparallel and alexandria):
(let ((output (make-array '(20 20) :initial-element '_)))
  (check-type output simple-array)
  (pdotimes (i (array-total-size output) output)
    (setf (row-major-aref output i)
          (random-elt '(a b c d e f g h)))))
And it consistently fills the output grid as follows, each time:
#2A((B G C D H A F E D C F D F G D F A C G G)
(C E D D F A H A F D G E G A C C F G E G)
(H C A E C F E H E D F G D B H B B A H D)
(D H G H H A E B G D E G D E G C E A B B)
(B E H G E E C D A H F A E C F D D A H H)
(C B D D G D H H D G H C A A H G B G C C)
(H H D D C F D B H B H G B C F G H F D E)
(F B C C A H D H G H C D G G D F E G A B)
(A E G C C H F C F C E F H H D E C H H D)
(H G H C D F G E D E C E A H C E A H H H)
(E C B E E C A D B G A F C B G A D G F D)
(H D D H A E A A G D H B H D A G A G C F)
(C D F H D G A D E C F C C D F A F F C H)
(H H D E C B C B E B B G G H H B A A E H)
(G F C C B F C D D D H F A B C F F C A B)
(D A H B B F H B B B F F H B G B H C F E)
(A G H C D H A H C H B F D D A G A E B G)
(G H A D H G B E A A B F C E G G G D E D)
(C E G F H F A A A H D D F B F C H B G B)
(H E H D D F F H E G G A A E D G C H H B))
However, section 3.6 of the HyperSpec, "Traversal Rules and Side Effects", says that the consequences are undefined if you modify a fill pointer (impossible for non-vectors) or adjust the array (?). Your example does not look like it adjusts the array, though.
Sorry for the question but does it work with dotimes? Does my example work on your machine?

Related

Replace corresponding parts of one array with another array in R

I have an array/named vector that looks like this:
d f g
1 2 3
I want to fill up the empty slots, meaning I want this:
a b c d e f g
0 0 0 1 0 2 3
Is there an elegant way of doing this, without having to write loops and conditionals? In my actual problem, instead of abcd as my array names, it's numbers. Not sure if that makes a difference. Figured alphabet is easier to understand for a reproducible example.
1) Create a vector of the final names, nms, then create a named vector of zeros from it using sapply, and replace the elements corresponding to the input names with the input values.
v <- c(d = 1, f = 2, g = 3) # input
nms <- letters[letters <= max(names(v))] # names on output vector, i.e. letters[1:7]
replace(sapply(nms, function(x) 0), names(v), v) ##
giving:
a b c d e f g
0 0 0 1 0 2 3
If in your actual vector the names are not letters then just set nms yourself. For example, nms <- c("dogs", "cats", "d", "elephants", "f", "g") would work with the same line marked ## above.
2) An alternative is to replace the line marked ## above with:
unlist(modifyList(as.list(setNames(numeric(length(nms)), nms)), as.list(v)))
Data
x <- c(d=1L,f=2L,g=3L);
x;
## d f g
## 1 2 3
Solution 1: First match new names into x and extract values, then replace NAs with zero.
x <- setNames(x[match(letters[1:7],names(x))],letters[1:7]);
x[is.na(x)] <- 0L;
x;
## a b c d e f g
## 0 0 0 1 0 2 3
Solution 2: One-liner, using nomatch argument of match().
setNames(c(x,0L)[match(letters[1:7],names(x),nomatch=length(x)+1L)],letters[1:7]);
## a b c d e f g
## 0 0 0 1 0 2 3
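For comparison outside R, the same "look up each target name, fall back to zero" idea behind the match()-based solutions can be sketched in Python. The function name `fill_missing` is hypothetical, and a dict stands in for the named vector.

```python
def fill_missing(values, names, default=0):
    """Return a dict with one entry per name, using `default` where absent."""
    return {name: values.get(name, default) for name in names}


v = {"d": 1, "f": 2, "g": 3}
filled = fill_missing(v, "abcdefg")
print(filled)
```

As in the R answers, the target names do not have to be letters; any sequence of keys works.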

naming array from an array in GAWK

I have a file with repeating elements. I would like to assign records to an array until the file repeats, at which point I want to create a new array to assign the records to. I would like to do this an arbitrary amount of times.
for example.
$ cat repeat.txt
a
b
c
d
e
f
g
a
b
c
d
e
f
g
a
b
c
d
e
f
g
I want the output to be something like this
0 a a a
1 b b b
2 c c c
3 d d d
4 e e e
5 f f f
6 g g g
right now I am doing this with this hideous code.
awk 'BEGIN{n=0;z=0}
$1~"a" {n=0;z++}
z==1{a[n]=$0}
z==2{b[n]=$0}
z==3{c[n]=$0}
z==4{d[n]=$0}
z==5{e[n]=$0}
z==6{f[n]=$0}
{n++}
END{for (i in a)
print i,a[i],b[i],c[i],d[i],e[i],f[i],g[i],h[i],k[i],j[i]}' repeat.txt
I would like the assignment of new arrays to be automatic.
I attempted this by the following
echo "abcdefghijklmopqrstuvwxyz" > alphabet.txt
awk 'BEGIN{N=0}
NR==FNR{FS=""}
NR==FNR{for (zz=0;zz<=NF;zz++) a[zz]=$zz; next}
NR!=FNR{FS="\t"}
NR!=FNR{if ($0~a) N++; (a[N])[N]=$0}
END{for (I in (a[N])) print I,(a[N])[I]}' alphabet.txt repeat.txt
but this didn't work because you can't do multidimensional arrays like this in gawk. I can't think of another way to do this.
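The grouping being asked for can be sketched as follows. This is Python rather than gawk, the function names are hypothetical, and it assumes a new group starts whenever the first record of the file reappears (gawk 4.0 and later also accept true arrays of arrays, written a[i][j], which would allow a similar structure).

```python
def split_on_repeat(records):
    """Start a new group each time the first record shows up again."""
    groups = []
    for rec in records:
        if not groups or rec == records[0]:
            groups.append([])
        groups[-1].append(rec)
    return groups


def as_rows(groups):
    """Turn the groups into rows: (index, value from group 1, group 2, ...)."""
    # zip stops at the shortest group, so a truncated final group drops rows
    return [(i, *vals) for i, vals in enumerate(zip(*groups))]


records = list("abcdefg") * 3  # same content as repeat.txt
for row in as_rows(split_on_repeat(records)):
    print(*row)
```

The number of groups is discovered from the data, so nothing has to be hard-coded per repetition.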

Rowmax as new column in data table

I have rank scores of countries for different variables.
I would like to create a column with the maximum rank that occurs per row.
Say the data look something like:
A B C D E F G H I ....
V1 1 4 5 3 12 . 6 9 83
V2 . . 4 6 1 4 7 6 32
So A - X are countries. In rows V1 up you have various variables and in the cells you have the rank score relating to the variable.
Problem is that some countries for whatever reasons don't score in relation to certain variables, perhaps because V1 is not relevant to country C or whatever.
So in the end I'd like something like
A B C D E F G H I .... newv
V1 1 4 5 3 12 . 6 9 83 83
V2 . . 4 6 1 4 7 6 5 6
I think egen newvar=rowmax(A B C D E F G H I…) does what you need. Have a look at the egen help file for more information. (I presume you need value 7 in the second row, not 6?)
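To make the intended behaviour concrete, here is a small Python sketch of a row maximum that skips missing cells. It is not Stata; `row_max` is a hypothetical helper and `None` stands in for the `.` missing value.

```python
def row_max(row):
    """Largest non-missing value in a row, or None if every cell is missing."""
    present = [v for v in row if v is not None]
    return max(present) if present else None


v1 = [1, 4, 5, 3, 12, None, 6, 9, 83]
v2 = [None, None, 4, 6, 1, 4, 7, 6, 5]  # second row as shown in the desired output
print(row_max(v1), row_max(v2))
```

With the second row as written in the desired table, the maximum is 7, which is the value the answer suspects was intended.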

Matlab reshape horizontal cat

Hi I want to reshape a matrix but the reshape command doesn't order the elements the way I want it.
I have matrix with elements:
A B
C D
E F
G H
I K
L M
and want to reshape it to:
A B E F I K
C D G H L M
So I know how many rows I want to have (in this case 2) and all "groups" of 2 rows should get appended horizontally. Can this be done without a for loop?
You can do it with two reshape and one permute. Let n denote the number of rows per group:
y = reshape(permute(reshape(x.',size(x,2),n,[]),[2 1 3]),n,[]);
Example with 3 columns, n=2:
>> x = [1 2 3; 4 5 6; 7 8 9; 10 11 12]
x =
1 2 3
4 5 6
7 8 9
10 11 12
>> y = reshape(permute(reshape(x.',size(x,2),n,[]),[2 1 3]),n,[])
y =
1 2 3 7 8 9
4 5 6 10 11 12
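The effect of the reshape/permute/reshape one-liner can be checked against a plain restatement of the goal: cut the rows into groups of n and lay the groups side by side. The sketch below is Python with nested lists, not MATLAB, and `hcat_row_groups` is a hypothetical name. It assumes the number of rows is a multiple of n.

```python
def hcat_row_groups(rows, n):
    """Concatenate consecutive groups of n rows horizontally."""
    if len(rows) % n != 0:
        raise ValueError("number of rows must be a multiple of n")
    groups = [rows[i:i + n] for i in range(0, len(rows), n)]
    # row r of the result is row r of every group, joined left to right
    return [[x for g in groups for x in g[r]] for r in range(n)]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
for row in hcat_row_groups(x, 2):
    print(row)
```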
Cell array approach -
mat1 = rand(6,2) %// Input matrix
nrows = 3; %// Number of rows in the output
[m,n] = size(mat1);
%// Create a cell array each cell of which is a (nrows x n) block from the input
cell_array1 = mat2cell(mat1,nrows.*ones(1,m/nrows),n);
%// Horizontally concatenate the double arrays obtained from each cell
out = horzcat(cell_array1{:})
Output on code run -
mat1 =
0.5133 0.2916
0.6188 0.6829
0.5651 0.2413
0.2083 0.7860
0.8576 0.3032
0.1489 0.4494
out =
0.5133 0.2916 0.5651 0.2413 0.8576 0.3032
0.6188 0.6829 0.2083 0.7860 0.1489 0.4494

Rectangle intersection algorithm [closed]

Hi, I tried a problem that asks whether two rectangles intersect or not.
I have written code for the case where the rectangles are parallel to the x-axis:
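struct point
{
    int x;
    int y;
};
struct rect
{
    struct point left;   /* top-left corner */
    struct point right;  /* bottom-right corner */
};
/* Returns 1 if the rectangles intersect, 0 otherwise.
   Assumes y increases upward, so left.y >= right.y. */
int rectintersectioncheck(struct rect r1, struct rect r2)
{
    /* true when the rectangles are separated along x */
    int x_check = (r1.left.x > r2.right.x || r2.left.x > r1.right.x);
    /* true when the rectangles are separated along y */
    int y_check = (r1.right.y > r2.left.y || r2.right.y > r1.left.y);
    /* a gap along either axis is enough to rule out an intersection */
    if (x_check || y_check)
    {
        return 0;
    }
    return 1;
}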
It's working fine for this case, but I am confused about the algorithm for rectangles that are not parallel to the x-axis,
as only the top-left and bottom-right points are given.
Please help?
A clarification first. If p1 and p2 are the top-left and bottom-right points of a rectangle, then the rectangle must be parallel to x axis (and y axis). So there is only exactly one rectangle satisfying these conditions. If the rectangle is not parallel to x axis, then the bottom cannot become right point simultaneously.
Since we are talking about rectangles that are not exactly parallel to x axis, let us drop that definition. Let us talk about rectangles whose two opposing vertices are p1 and p2 (not necessarily top-left and bottom-right).
Let p1 and p2 define the first rectangle, and p3 and p4 define the second rectangle.
If you take the union of all rectangles whose opposite corners are p1 and p2, you get a circle (with (p1+p2)/2 as center and |p1−p2| as diameter).
There are three cases:
If the p1–p2 line segment intersects the p3–p4 line segment, then the rectangles always intersect.
If the circle corresponding to p1,p2 intersects the circle corresponding to p3,p4, then those rectangles sometimes intersect.
Otherwise those rectangles never intersect.
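The three cases above can be sketched as a small classifier. This is Python rather than C, all function names are hypothetical, and the "circle" test is implemented as a disc-overlap test (distance between centers at most the sum of the radii), which is an assumption about what the answer intends.

```python
import math


def orient(a, b, c):
    """Cross product sign: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def within_box(a, b, c):
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, p3, p4):
    d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
    d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # the segments properly cross
    # remaining cases: an endpoint touches, or the segments overlap collinearly
    return ((d1 == 0 and within_box(p3, p4, p1)) or
            (d2 == 0 and within_box(p3, p4, p2)) or
            (d3 == 0 and within_box(p1, p2, p3)) or
            (d4 == 0 and within_box(p1, p2, p4)))


def discs_overlap(p1, p2, p3, p4):
    """Discs whose diameters are the segments p1-p2 and p3-p4."""
    c1 = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
    c2 = ((p3[0] + p4[0]) / 2, (p3[1] + p4[1]) / 2)
    r1 = math.hypot(p2[0] - p1[0], p2[1] - p1[1]) / 2
    r2 = math.hypot(p4[0] - p3[0], p4[1] - p3[1]) / 2
    return math.hypot(c2[0] - c1[0], c2[1] - c1[1]) <= r1 + r2


def classify(p1, p2, p3, p4):
    """Return 'always', 'sometimes' or 'never' for the two diagonals."""
    if segments_intersect(p1, p2, p3, p4):
        return "always"
    if discs_overlap(p1, p2, p3, p4):
        return "sometimes"
    return "never"


print(classify((0, 0), (2, 2), (0, 2), (2, 0)))
print(classify((1, 3), (6, 4), (4, 0), (5, 1)))
```

A "sometimes" result only says the diagonals alone cannot decide the question; the actual corner points are still needed to settle it.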
@ELKamina: the circle approach you discussed is quite fine, but it can be really hard to distinguish the cases where the circles intersect but the rectangles don't.
I have got this idea in mind, glad to share it.
Why don't we construct our rectangles in arrays for the specific cases to find out whether they intersect or not?
e.g. rect 1 points: (1,3)(3,1)(6,4)(4,6); rect 2 points: (4,0)(5,0)(5,1)(4,1)
array representation array representation
6 [F F F F # F F F] 6 [F F F F F F F F]
5 [F F F # # # F F] 5 [F F F F F F F F]
4 [F F # # # # # F] 4 [F F F F F F F F]
3 [F # # # # # F F] 3 [F F F F F F F F]
2 [F F # # # F F F] 2 [F F F F F F F F]
1 [F F F # F F F F] 1 [F F F F # # F F]
0 [F F F F F F F F] 0 [F F F F # # F F]
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
in the above case circles intersect but rectangles don't.
e.g. rect 1 points: (1,3)(3,1)(6,4)(4,6); rect 2 points: (3,0)(4,0)(3,1)(4,1)
array representation array representation
6 [F F F F # F F F] 6 [F F F F F F F F]
5 [F F F # # # F F] 5 [F F F F F F F F]
4 [F F # # # # # F] 4 [F F F F F F F F]
3 [F # # # # # F F] 3 [F F F F F F F F]
2 [F F # # # F F F] 2 [F F F F F F F F]
1 [F F F # F F F F] 1 [F F F # # F F F]
0 [F F F F F F F F] 0 [F F F # # F F F]
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
in the above case (3,1) has same value so it can be found that they intersect.
Similar representations can be used to check whether triangles intersect or not.
I will sketch out the answer.
Rotate both the rectangles so that one is in line with the x axis
You can then work out the formula (y=mx+c) for the edges
You will also know the formula for the sides of the other rectangle
See if any intersect.
The rotation can be performed using the link previously posted.
EDIT
Forgot translation - shift one rectangle to have 0,0 as one coordinate.
