Variable length array extension using SIMD operation - arrays

I would like to do the following array extension using SIMD intrinsics.
I have two arrays:
cluster value (v_i): 10, 20, 30, 40
cluster length (l_i): 3, 2, 1, 2
I would like to create a result array in which each v_i is repeated l_i times, i.e.:
result: 10, 10, 10, 20, 20, 30, 40, 40.
How can I compute this using SIMD intrinsics?

This may be optimized by SIMD if input array size is up to 8, output array size up to 16, and bytes as array values. At least SSSE3 is required. Extending this approach to larger arrays/elements is possible but efficiency will quickly degrade.
Compute the prefix sums of the array lengths. This can be done quickly by reinterpreting the byte array of lengths as a single 64-bit (or 32-bit) word, multiplying it by 0x0101010101010100 (0x01010100 in the 32-bit case), and storing the result in a SIMD register.
Fill array of indexes (in single SIMD register) with average index (half-size of the array of prefix sums).
Perform binary search for proper index for each byte of index register (in parallel). This may be done by extracting appropriate byte of prefix sum register with PSHUFB instruction, comparing extracted prefix value with byte number using PCMPGTB (and optionally with PCMPEQB), then adding/subtracting half of index range.
(Optionally) fill all unused bytes of index register with 0xff.
Use PSHUFB to fill some register with values from cluster value array indexed by the index register.
The main loop of the algorithm (the binary search) contains PSHUFB, PCMPGTB, and a few arithmetic and logical operations. It is executed log2(input_array_size) times, most likely 2 or 3 times.
Here is an example:
cluster value: 10 20 30 40
cluster length: 3 2 1 2
prefix sum: 0 3 5 6 8
indexes: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
prefix value: 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
byte number: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
mask: ff ff ff ff ff 0 0 0 0 0 0 0 0 0 0 0
indexes: 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3
prefix value: 3 3 3 3 3 6 6 6 6 6 6 6 6 6 6 6
byte number: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
mask: ff ff ff 0 0 ff 0 0 0 0 0 0 0 0 0 0
indexes: 0 0 0 1 1 2 3 3 3 3 3 3 3 3 3 3
length constrained: 0 0 0 1 1 2 3 3 ff ff ff ff ff ff ff ff
cluster value: 10 10 10 20 20 30 40 40 0 0 0 0 0 0 0 0
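The steps above can be modeled lane by lane in plain Python to check the logic before writing intrinsics. This is only a scalar sketch, not SIMD code: `pshufb` and `expand_clusters` are hypothetical helper names, and the search used here starts at index 0 and conditionally adds a halving step, which is a slightly different but equivalent form of the add/subtract search traced above.

```python
LANES = 16  # bytes in one SSE register

def pshufb(table, idx):
    """Model PSHUFB: pick table[i & 15] per lane, or 0 when bit 7 of i is set."""
    return [0 if i & 0x80 else table[i & 0x0F] for i in idx]

def expand_clusters(values, lengths):
    """Repeat values[i] lengths[i] times; at most 8 clusters, 16 output bytes."""
    values, lengths = list(values), list(lengths)
    assert len(values) == len(lengths) <= 8 and sum(lengths) <= LANES
    pad = [0] * (8 - len(lengths))
    # Exclusive prefix sums with one 64-bit multiply (little-endian bytes).
    word = int.from_bytes(bytes(lengths + pad), "little")
    sums = (word * 0x0101010101010100) & 0xFFFFFFFFFFFFFFFF
    prefix = list(sums.to_bytes(8, "little")) + [0] * 8  # widen to 16 lanes
    total = sum(lengths)
    lane = list(range(LANES))  # the "byte number" row of the trace
    idx = [0] * LANES
    for step in (4, 2, 1):  # log2(8) = 3 rounds
        cand = [i + step for i in idx]
        start = pshufb(prefix, cand)  # where the candidate cluster begins
        # Keep the candidate if its cluster starts at or before this lane.
        idx = [c if s <= b else i
               for c, s, b, i in zip(cand, start, lane, idx)]
    # Mark lanes past the end with 0xff so the final shuffle writes 0 there.
    idx = [0xFF if b >= total else i for b, i in zip(lane, idx)]
    table = values + pad + [0] * 8
    return pshufb(table, idx)[:total]

print(expand_clusters([10, 20, 30, 40], [3, 2, 1, 2]))
```

Each list comprehension stands in for one register-wide instruction, so the loop body maps onto one PSHUFB, one compare and one blend per round.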

Sum row vectors IF two or more rows in given column match (MATLAB)

I have a 48x202 matrix, where the first column of the matrix is an ID and the remaining columns hold the vector associated with the row ID in the first column.
The ID column is sorted in ascending order, and multiple rows can have the same ID.
I want to combine all rows with equal IDs, meaning that I want to sum the rows of the matrix that have an identical ID in the first column.
The resulting matrix should be 32x202, since there are only 32 IDs.
Any ideas?
I'd totally approach this with accumarray as well as unique. Like the previous answer, let A be your matrix. You would obtain your answer thusly:
[vals,~,id] = unique(A(:,1),'stable');
B = accumarray(id, (1:numel(id)).', [], @(x) {sum(A(x,2:end),1)});
out = [vals cell2mat(B)];
The first line of code produces vals, a list of all unique IDs seen in the first column of A, and id, which assigns each row an integer label without gaps, from 1 up to the number of unique IDs in the first column of A. These labels are needed for the next line of code.
accumarray takes a set of keys and a set of values associated with each key. It groups all values that belong to the same key and applies a function to each group. The keys in our case are the IDs given in the first column of A, and the values are the row locations of A, from 1 up to the number of rows of A. The default behaviour is to sum all of the values that belong to the same key, but we're going to do something a bit different. For each unique ID seen in the first column of A, there will be a bunch of row locations that map to that ID. We use these row locations to access the matrix A and sum each column, from the second column to the end, over those rows. That's what the anonymous function in the fourth argument of accumarray is doing. accumarray normally expects a single value per key, but we get around this by outputting a single cell per key, where each cell holds the row vector of column sums for that ID.
Each element of B gives you the row sum for each corresponding unique value in vals and so the last line of code pieces these together - the unique value in vals with the corresponding row sum. I had to use cell2mat because this was a matrix of cells and I had to convert all of these into a numerical matrix to complete the task.
Here's an example seeing this in action. I'm going to do this for a smaller set of data:
>> rng(123);
>> A = [[1;1;1;2;2;2;2;3;3;4;4;5;6;7] randi(10, 14, 10)];
>> A
A =
1 7 4 3 4 5 1 10 3 2 3
1 3 8 7 5 7 9 9 4 9 6
1 3 2 1 9 9 7 4 6 4 9
2 6 2 5 3 6 8 1 7 6 4
2 8 6 5 5 7 1 4 2 6 8
2 5 6 5 10 6 6 4 2 6 2
2 10 7 5 6 7 6 8 4 1 7
3 7 9 4 7 7 2 10 7 10 9
3 5 8 5 2 9 2 4 9 10 10
4 4 7 9 9 1 7 8 6 3 1
4 4 8 10 7 8 4 6 9 3 5
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
The first column is our IDs, and the next columns are the data. Running the above code I just wrote, we get:
>> out
out =
1 13 14 11 18 21 17 23 13 15 18
2 29 21 20 24 26 21 17 15 19 21
3 12 17 9 9 16 4 14 16 20 19
4 8 15 19 16 9 11 14 15 6 6
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
If you double check each row of the output, it equals the sum of all input rows that share that ID. For example, the first three rows map to the same ID, so we should sum these rows to get the first output row. The second column is equal to 7+3+3=13, the third column is equal to 4+8+2=14, etc.
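For readers without MATLAB at hand, the grouping idea behind unique plus accumarray can be sketched in plain Python. This is an illustration only, not a translation of accumarray: `sum_rows_by_id` is a hypothetical name, and a dictionary keyed by ID plays the role of the integer labels.

```python
def sum_rows_by_id(rows):
    """Sum rows sharing the same first element, keeping first-seen ID order."""
    totals = {}  # ID -> running column sums (dicts keep insertion order)
    for row in rows:
        key, data = row[0], row[1:]
        if key in totals:
            # Add this row's data columns onto the running sums for its ID.
            totals[key] = [a + b for a, b in zip(totals[key], data)]
        else:
            totals[key] = list(data)
    return [[key] + sums for key, sums in totals.items()]

sample = [[1, 7, 4], [1, 3, 8], [1, 3, 2], [2, 6, 2]]
print(sum_rows_by_id(sample))
```

The sample uses the first two data columns of the first four rows of A, so the sums for ID 1 can be compared with the 13 and 14 worked out above.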
Another approach is to apply unique and then use bsxfun to build a matrix that multiplied by the non-ID part of the input matrix will give the result.
Let the input matrix be denoted as A. Then:
[u, ~, v] = unique(A(:,1));
result = [ u bsxfun(@eq, u, u(v).') * A(:,2:end) ];
Example: borrowing from @rayryeng's answer, let
A = [ 1 7 4 3 4 5 1 10 3 2 3
1 3 8 7 5 7 9 9 4 9 6
1 3 2 1 9 9 7 4 6 4 9
2 6 2 5 3 6 8 1 7 6 4
2 8 6 5 5 7 1 4 2 6 8
2 5 6 5 10 6 6 4 2 6 2
2 10 7 5 6 7 6 8 4 1 7
3 7 9 4 7 7 2 10 7 10 9
3 5 8 5 2 9 2 4 9 10 10
4 4 7 9 9 1 7 8 6 3 1
4 4 8 10 7 8 4 6 9 3 5
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6 ];
Then the result is
result =
1 13 14 11 18 21 17 23 13 15 18
2 29 21 20 24 26 21 17 15 19 21
3 12 17 9 9 16 4 14 16 20 19
4 8 15 19 16 9 11 14 15 6 6
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
and the intermediate matrix created with bsxfun is
>> bsxfun(@eq, u, u(v).')
ans =
1 1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1
Pre-multiplying the data columns A(:,2:end) by this matrix means that the first three rows of A are added to give the first row of the result; then the following four rows of A are added to give the second row of the result, etc.
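The indicator-matrix idea can also be written out with plain Python lists to see what the multiplication does. This is a rough sketch under the assumption that the input is a list of rows; `indicator_sum` is a hypothetical name, and the nested comprehension is an explicit matrix product rather than an optimized one.

```python
def indicator_sum(rows):
    """Group-sum rows by first column using a 0/1 indicator matrix product."""
    ids = sorted(set(r[0] for r in rows))  # like unique(): sorted IDs
    # indicator[g][r] is 1 when input row r belongs to group g.
    indicator = [[1 if r[0] == u else 0 for r in rows] for u in ids]
    data = [r[1:] for r in rows]
    ncols = len(data[0])
    # Matrix product: indicator (groups x rows) times data (rows x columns).
    product = [[sum(m * d[c] for m, d in zip(mask, data)) for c in range(ncols)]
               for mask in indicator]
    return [[u] + p for u, p in zip(ids, product)]

print(indicator_sum([[1, 7, 4], [1, 3, 8], [1, 3, 2], [2, 6, 2]]))
```

Each row of `indicator` corresponds to one row of the bsxfun output shown above.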
You can find the unique row IDs with unique and then loop over all of those, summing the other columns: Let A be your matrix, then
rID = unique(A(:, 1));
B = zeros(numel(rID), size(A, 2));
for ii = 1:numel(rID)
B(ii, 1) = rID(ii);
B(ii, 2:end) = sum(A(A(:, 1) == rID(ii), 2:end), 1);
end
B contains your output.

VTK Structured Point file

I am trying to parse a VTK file in C by extracting its point data and storing each point in a 3D array. However, the file I am working with has 9 shorts per point and I am having difficulty understanding what each number means.
I believe I understand most of the header information (please correct me if I have misunderstood):
ASCII: Type of file (ASCII or Binary)
DATASET: Type of dataset
DIMENSIONS: dims of voxels (x,y,z)
SPACING: Volume of each voxel (w,h,d)
ORIGIN: Unsure
POINT DATA: Total number of points/voxels (dimx.dimy.dimz)
I have looked at the documentation, but I still do not understand how to interpret the data. Could someone please help me understand it or point me to some helpful resources?
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 256 256 130
SPACING 1 1 1.3
ORIGIN 86.6449 -133.929 116.786
POINT_DATA 8519680
SCALARS scalars short
LOOKUP_TABLE default
0 0 0 0 0 0 0 0 0
0 0 7 2 4 5 3 3 4
4 5 5 1 7 7 1 1 2
1 6 4 3 3 1 0 4 2
2 3 2 4 2 2 0 2 6
...
thanks.
You are correct regarding the meaning of fields in the header.
ORIGIN corresponds to the coordinates of the 0-0-0 corner of the grid.
An example of a DATASET STRUCTURED_POINTS can be found in the documentation.
Starting from this, here is a small file with 6 values per point (declared as char in this example). Each line represents a point.
# vtk DataFile Version 2.0
Volume example
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 3 4 2
ASPECT_RATIO 1 1 1
ORIGIN 0 0 0
POINT_DATA 24
SCALARS volume_scalars char 6
LOOKUP_TABLE default
0 1 2 3 4 5
1 1 2 3 4 5
2 1 2 3 4 5
0 2 2 3 4 5
1 2 2 3 4 5
2 2 2 3 4 5
0 3 2 8 9 10
1 3 2 8 9 10
2 3 2 8 9 10
0 4 2 8 9 10
1 4 2 8 9 10
2 4 2 8 9 10
0 1 3 18 19 20
1 1 3 18 19 20
2 1 3 18 19 20
0 2 3 18 19 20
1 2 3 18 19 20
2 2 3 18 19 20
0 3 3 24 25 26
1 3 3 24 25 26
2 3 3 24 25 26
0 4 3 24 25 26
1 4 3 24 25 26
2 4 3 24 25 26
The first three fields of each point are there to show the data layout: in the file, x changes faster than y, which changes faster than z.
If you wish to store the data in an array a[2][4][3][6], read it in a nested loop:
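/* a must be declared as int a[2][4][3][6] to match the %d conversion */
for(k=0;k<2;k++){             // z loop: slowest index
    for(j=0;j<4;j++){         // y loop: y changes faster than z
        for(i=0;i<3;i++){     // x loop: x changes faster than y
            for(l=0;l<6;l++){ // components of one point
                fscanf(file,"%d",&a[k][j][i][l]);
            }
        }
    }
}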
To read the header, fscanf() may be used as well:
int sizex,sizey,sizez;
char headerpart[100];
fscanf(file,"%99s",headerpart);   /* width limit keeps the read inside the buffer */
if(strcmp(headerpart,"DIMENSIONS")==0){
    fscanf(file,"%d%d%d",&sizex,&sizey,&sizez);
}
Note that fscanf() needs a pointer to the data (&sizex, not sizex). Because an array of char decays to a pointer to its first element, passing headerpart for a string conversion works fine; it can be replaced by &headerpart[0]. The function strcmp() compares two strings and returns 0 if they are identical.
As your grid seems large, smaller files can be obtained using the BINARY kind instead of ASCII, but watch for endianness as specified here.
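The same reading order can be sketched in Python for quick experiments. This is a minimal sketch, not a full legacy-VTK reader: `parse_structured_points` is a hypothetical name, only the header keywords shown in this thread are handled, and it assumes that a SCALARS line without a component count means one value per point. Values are read as whitespace-separated tokens, so line breaks in the data section carry no meaning here.

```python
def parse_structured_points(text):
    """Parse an ASCII STRUCTURED_POINTS file into (header, grid[z][y][x])."""
    lines = text.strip().splitlines()
    header, ncomp, body_start = {}, 1, len(lines)
    for n, line in enumerate(lines):
        parts = line.split()
        if not parts:
            continue
        key = parts[0]
        if key == "DIMENSIONS":
            header["dims"] = tuple(int(p) for p in parts[1:4])
        elif key in ("SPACING", "ASPECT_RATIO"):
            header["spacing"] = tuple(float(p) for p in parts[1:4])
        elif key == "ORIGIN":
            header["origin"] = tuple(float(p) for p in parts[1:4])
        elif key == "SCALARS":
            # Assumed: a missing component count means 1 value per point.
            ncomp = int(parts[3]) if len(parts) > 3 else 1
        elif key == "LOOKUP_TABLE":
            body_start = n + 1  # data values start on the next line
            break
    values = [int(t) for line in lines[body_start:] for t in line.split()]
    nx, ny, nz = header["dims"]
    assert len(values) == nx * ny * nz * ncomp
    it = iter(values)
    # x varies fastest, then y, then z, matching the order in the file.
    grid = [[[[next(it) for _ in range(ncomp)] for _ in range(nx)]
             for _ in range(ny)] for _ in range(nz)]
    return header, grid

demo = """# vtk DataFile Version 3.0
demo
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 2 2 1
SPACING 1 1 1
ORIGIN 0 0 0
POINT_DATA 4
SCALARS scalars short
LOOKUP_TABLE default
5 6 7
8
"""
header, grid = parse_structured_points(demo)
print(header["dims"], grid[0][1][0])
```

In the demo the four values are split unevenly over two lines on purpose, to show that the point at x=0, y=1 is simply the third token.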

Delete determinate rows in a matrix

I have an array like this but with more rows:
104,206 99,557 96,667 1 33 1 120,993 0
104,708 99,189 96,641 6 14 1 123,989 65536
107,099 102,732 98,641 0 46 1 118,899 131072
104,985 101,174 98,251 5 30 2 118,445 196608
108,86 103,355 103,494 0 21 1 118,423 262144
I need a loop that deletes every row that has a 0 in the 4th column.
I need to do this for all the rows, and the result should be as follows:
104,206 99,557 96,667 1 33 1 120,993 0
104,708 99,189 96,641 6 14 1 123,989 65536
104,985 101,174 98,251 5 30 2 118,445 196608
In a single line (using logical indexing):
data(data(:,4)==0,:) = [];
Example:
>> data = [5 8 6 0 9
1 3 3 5 2
4 5 6 0 8
2 2 7 3 5];
>> data(data(:,4)==0,:) = []
data =
1 3 3 5 2
2 2 7 3 5
data = randi(10,1000,10) -1; % random data
marks = find(data(:,4)); % find only returns non-zero elements
clean_data = data(marks,:); % return all data on row /marks/
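For comparison, the same row filter can be sketched in Python with a list comprehension. This is only an illustration: `drop_rows_with_zero` is a hypothetical name, and the column index is zero-based, so the 4th column is index 3.

```python
def drop_rows_with_zero(rows, col=3):
    """Keep only the rows whose entry in column `col` (0-based) is non-zero."""
    return [row for row in rows if row[col] != 0]

data = [[5, 8, 6, 0, 9],
        [1, 3, 3, 5, 2],
        [4, 5, 6, 0, 8],
        [2, 2, 7, 3, 5]]
print(drop_rows_with_zero(data))
```

Unlike the in-place MATLAB deletion, this builds a new list and leaves the input untouched, which mirrors the second snippet above.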

From matrix to array [J]

I'm working on J.
How can I convert this matrix:
(i.10)*/(i.10)
0 0 0 0 0 0 0 0 0 0
0 1 2 3 4 5 6 7 8 9
0 2 4 6 8 10 12 14 16 18
0 3 6 9 12 15 18 21 24 27
0 4 8 12 16 20 24 28 32 36
0 5 10 15 20 25 30 35 40 45
0 6 12 18 24 30 36 42 48 54
0 7 14 21 28 35 42 49 56 63
0 8 16 24 32 40 48 56 64 72
0 9 18 27 36 45 54 63 72 81
into an array?
0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 8 9 . . .
I tried
(i.10)*/(i.10)"0
and then I've added
~.(i.10)*/(i.10)"0
to eliminate duplicates, but it doesn't work.
If you want to turn a 2-dimensional table (matrix) into a 1-dimensional list (vector or "array", though in the J world "array" usually means "rectangle with any number [N] of dimensions"), you can use ravel (,):
matrix =: (i.10)*/(i.10)
list =: , matrix
list
0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 ...
Now using nub (~.) to remove duplicates should work:
~. list
0 1 2 3 4 5 6 7 8 9 10 12 ...
Note that, in J, the shape of an array usually carries important information, so flattening a matrix like this would be fairly unusual. Still, nothing stopping you.
BTW, you can save yourself some keystrokes by using the adverb ~, which will copy the left argument of a dyad to the right side as well, so you could just say:
matrix =: */~ i. 10
and get the same result as (i.10) */ (i.10).
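For readers who do not know J, the two steps can be mimicked in Python. This is a loose analogy rather than an equivalent: the helper names `ravel` and `nub` are borrowed from the J vocabulary for illustration, and the table is built as a list of lists.

```python
def ravel(matrix):
    """Flatten a list of rows into one list, row by row (like J's , )."""
    return [x for row in matrix for x in row]

def nub(items):
    """Drop duplicates, keeping the first occurrence of each (like J's ~. )."""
    return list(dict.fromkeys(items))

# Multiplication table of 0..9 by 0..9, as in (i.10)*/(i.10)
table = [[i * j for j in range(10)] for i in range(10)]
flat = ravel(table)
print(flat[:12])
print(nub(flat)[:12])
```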

how to add a factor to a sequence?

I'm analysing a dataset with some data-mining tools. The response variable has ten levels and I'm trying to create a classifier.
Here comes the problem. When using the nnet and bagging functions, the result is not that good and the 5th level does not even appear in the predictions.
I want to use a confusion matrix to analyse the classifier, but as the 5th level is not shown in the predictions I can't get a well-formed matrix. So how can I get a well-formed matrix, i.e. a 10*10 matrix?
The confusion matrix:
library("mda")#This is where **confusion** comes from
> confusion(pre.bag$class,CLASS)#here confusion acts like table
true
predicted 1 2 3 4 6 7 8 9 10 5
1 338 9 6 0 5 12 10 1 15 46
2 9 549 1 59 18 0 3 0 0 6
3 18 1 44 0 0 0 2 0 0 4
4 0 1 0 21 0 0 0 0 0 0
6 2 13 0 1 299 2 9 0 0 0
7 5 2 1 0 10 231 6 0 1 0
8 0 0 0 0 0 5 76 0 0 0
9 5 1 0 0 0 0 0 62 0 0
10 7 3 1 0 0 2 1 6 181 16
attr(,"error")
[1] 0.1231743
attr(,"mismatch")
[1] 0.03386642
Try this:
pred <- factor(pre.bag$class, levels=levels(CLASS))
confusion(pred, CLASS)
(Tested with an fda-object.)
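The reason fixing the levels helps can be shown with a small Python sketch: when the table is sized from an explicit list of levels, a level that is never predicted still gets its own all-zero row. `confusion_table` is a hypothetical helper for illustration and is not part of mda or any R package.

```python
def confusion_table(predicted, actual, levels):
    """Count (predicted, actual) pairs in a len(levels) x len(levels) table."""
    index = {lv: i for i, lv in enumerate(levels)}
    table = [[0] * len(levels) for _ in levels]
    for p, a in zip(predicted, actual):
        table[index[p]][index[a]] += 1  # rows: predicted, columns: true
    return table

# Level 2 is never predicted, yet its row is still present (all zeros).
print(confusion_table([1, 1, 3], [1, 2, 3], levels=[1, 2, 3]))
```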
