I am trying to parse a VTK file in C by extracting its point data and storing each point in a 3D array. However, the file I am working with has 9 shorts per point and I am having difficulty understanding what each number means.
I believe I understand most of the header information (please correct me if I have misunderstood):
ASCII: Type of file (ASCII or Binary)
DATASET: Type of dataset
DIMENSIONS: dims of voxels (x,y,z)
SPACING: Volume of each voxel (w,h,d)
ORIGIN: Unsure
POINT DATA: Total number of points/voxels (dimx.dimy.dimz)
I have looked at the documentation, but I still do not understand how to interpret the data. Could someone please help me understand it or point me to some helpful resources?
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 256 256 130
SPACING 1 1 1.3
ORIGIN 86.6449 -133.929 116.786
POINT_DATA 8519680
SCALARS scalars short
LOOKUP_TABLE default
0 0 0 0 0 0 0 0 0
0 0 7 2 4 5 3 3 4
4 5 5 1 7 7 1 1 2
1 6 4 3 3 1 0 4 2
2 3 2 4 2 2 0 2 6
...
thanks.
You are correct regarding the meaning of fields in the header.
ORIGIN corresponds to the coordinates of the 0-0-0 corner of the grid.
An example of a DATASET STRUCTURED_POINTS can be found in the documentation.
Starting from this, here is a small file with 6 values per point (the trailing 6 on the SCALARS line is the number of components per point). Each line represents a point.
# vtk DataFile Version 2.0
Volume example
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 3 4 2
ASPECT_RATIO 1 1 1
ORIGIN 0 0 0
POINT_DATA 24
SCALARS volume_scalars char 6
LOOKUP_TABLE default
0 1 2 3 4 5
1 1 2 3 4 5
2 1 2 3 4 5
0 2 2 3 4 5
1 2 2 3 4 5
2 2 2 3 4 5
0 3 2 8 9 10
1 3 2 8 9 10
2 3 2 8 9 10
0 4 2 8 9 10
1 4 2 8 9 10
2 4 2 8 9 10
0 1 3 18 19 20
1 1 3 18 19 20
2 1 3 18 19 20
0 2 3 18 19 20
1 2 3 18 19 20
2 2 3 18 19 20
0 3 3 24 25 26
1 3 3 24 25 26
2 3 3 24 25 26
0 4 3 24 25 26
1 4 3 24 25 26
2 4 3 24 25 26
The first three fields of each point are there to make the data layout visible: in the file, x changes faster than y, which changes faster than z.
If you wish to store the data in an array a[2][4][3][6], read it in a nested loop:
int a[2][4][3][6]; //"%d" needs int storage
int i,j,k,l;
for(k=0;k<2;k++){ //z loop
    for(j=0;j<4;j++){ //y loop : y changes faster than z
        for(i=0;i<3;i++){ //x loop : x changes faster than y
            for(l=0;l<6;l++){ //components of one point
                fscanf(file,"%d",&a[k][j][i][l]);
            }
        }
    }
}
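The layout rule above (x fastest, then y, then z) can also be written as a flat-index formula. The following is a minimal Python sketch, not part of the original answer: `parse_structured_points` and `flat_index` are hypothetical helper names, and the sketch assumes an ASCII file with a single SCALARS array followed by a LOOKUP_TABLE line.

```python
def flat_index(i, j, k, nx, ny):
    """Position of point (i, j, k) in the file when x varies fastest."""
    return i + nx * (j + ny * k)


def parse_structured_points(text):
    """Parse a small ASCII STRUCTURED_POINTS file held in a string.

    Returns (dims, ncomp, values), where values is a flat list of ints.
    Assumption for this sketch: one SCALARS array, then LOOKUP_TABLE, then data.
    """
    lines = text.strip().splitlines()
    dims = None
    ncomp = 1  # SCALARS has one component when no count is given
    data_start = len(lines)
    for n, line in enumerate(lines):
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "DIMENSIONS":
            dims = tuple(int(p) for p in parts[1:4])
        elif parts[0] == "SCALARS" and len(parts) > 3:
            ncomp = int(parts[3])
        elif parts[0] == "LOOKUP_TABLE":
            data_start = n + 1
            break
    # Line breaks in the data section carry no meaning: read token by token.
    values = [int(tok) for line in lines[data_start:] for tok in line.split()]
    return dims, ncomp, values


sample = """# vtk DataFile Version 2.0
tiny example
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 3 2 2
SPACING 1 1 1
ORIGIN 0 0 0
POINT_DATA 12
SCALARS s short
LOOKUP_TABLE default
0 1 2 3 4 5 6 7 8
9 10 11
"""

dims, ncomp, values = parse_structured_points(sample)
nx, ny, nz = dims
print(values[flat_index(1, 1, 1, nx, ny) * ncomp])  # 10
```

Because the data is read token by token, the number of values printed on each line does not matter. In the file from the question, POINT_DATA is 8519680 = 256 x 256 x 130 and the SCALARS line gives no component count, so it appears that each point holds a single short and the nine values per line are only formatting.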
To read the header, fscanf() may be used as well:
int sizex,sizey,sizez;
char headerpart[100];
fscanf(file,"%99s",headerpart); //at most 99 chars plus the terminating \0
if(strcmp(headerpart,"DIMENSIONS")==0){
    fscanf(file,"%d%d%d",&sizex,&sizey,&sizez);
}
Note that fscanf() needs a pointer to the data (&sizex, not sizex). Since headerpart is an array of char, it decays to a pointer to its first element, so "%99s",headerpart works fine; it can be replaced by "%99s",&headerpart[0]. The width 99 keeps fscanf() from writing past the end of the 100-byte buffer. The function strcmp() (declared in <string.h>) compares two strings and returns 0 if they are identical.
As your grid seems large, smaller files can be obtained by using the BINARY format instead of ASCII, but watch out for endianness, as described in the VTK file format documentation.
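To see why byte order matters for binary data, here is a small Python sketch (an illustration only, not taken from the original answer). It decodes the same four bytes as two 16-bit shorts in both byte orders; which order a given file uses is something to confirm in the VTK documentation.

```python
import struct

raw = bytes([0x01, 0x02, 0x00, 0x07])  # four bytes = two 16-bit values

big = struct.unpack(">2h", raw)     # interpret as big-endian signed shorts
little = struct.unpack("<2h", raw)  # interpret as little-endian signed shorts

print(big)     # (258, 7)
print(little)  # (513, 1792)
```

Reading with the wrong byte order does not fail; it silently produces different numbers, which is why it is worth checking a few known values after loading.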
I am writing code in Forth that should create a 12x12 array of random numbers from 1 to 8.
create big_array 144 allocate drop
: reset_array big_array 144 0 fill ;
reset_array
variable rnd here rnd !
: random rnd @ 31421 * 6927 + dup rnd ! ;
: choose random um* nip ;
: random_fill 144 1 do 8 choose big_array i + c! loop ;
random_fill
: Array_@ 12 * + big_array swap + c@ ;
: show_small_array cr 12 0 do 12 0 do i j Array_@ 5 u.r loop cr loop ;
show_small_array
However, I notice that elements 128 to 131 of my array are always much larger than expected:
0 4 0 4 2 6 0 5 2 5 7 3
6 3 7 3 7 7 3 1 5 0 6 1
0 3 3 0 3 1 0 7 2 0 4 5
3 7 6 6 2 1 0 2 3 4 2 7
4 7 1 5 3 5 7 2 3 5 3 6
3 0 6 4 1 3 3 2 5 4 4 7
3 2 1 4 3 4 3 7 2 6 5 5
2 4 4 3 4 5 4 4 6 5 6 0
2 5 2 7 3 1 5 0 1 4 6 7
2 0 3 3 0 7 3 6 4 1 3 6
0 1 1 6 0 3 0 2 169 112 41 70
7 2 3 1 2 2 7 6 0 5 1 2
Moreover, when I try to change the value of these elements individually, this causes the other three elements to change value. For example, if I code:
9 choose big_array 128 + c!
then the array will become:
0 4 0 4 2 6 0 5 2 5 7 3
6 3 7 3 7 7 3 1 5 0 6 1
0 3 3 0 3 1 0 7 2 0 4 5
3 7 6 6 2 1 0 2 3 4 2 7
4 7 1 5 3 5 7 2 3 5 3 6
3 0 6 4 1 3 3 2 5 4 4 7
3 2 1 4 3 4 3 7 2 6 5 5
2 4 4 3 4 5 4 4 6 5 6 0
2 5 2 7 3 1 5 0 1 4 6 7
2 0 3 3 0 7 3 6 4 1 3 6
0 1 1 6 0 3 0 2 2 12 194 69
7 2 3 1 2 2 7 6 0 5 1 2
Do you have any idea why these specific elements are always impacted and if there is a way to prevent this?
Better readability and less error-prone: 144 allocate ⇨ 144 chars allocate
A mistake: create big_array 144 allocate drop ⇨ create big_array 144 chars allot
A mistake: random um* nip ⇨ random swap mod
A mistake: 144 1 do ⇨ 144 0 do
An excessive operation: big_array swap + ⇨ big_array +
And please add stack comments, especially when you ask for help.
Do you have any idea why these specific elements are always impacted and if there is a way to prevent this?
Because you are using memory in the dictionary space without reserving it. The Forth system uses that same memory for its own data.
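The corrections listed above (reserve the buffer, start the loop at 0, reduce the random number with mod) can be modelled outside Forth. This Python sketch is only an analogy under stated assumptions: `make_random`, `random_fill` and `element` are hypothetical names, and the 16-bit mask is an assumption about cell width that a real Forth system may not share.

```python
CELL_MASK = 0xFFFF  # assumption: model a 16-bit cell; real cells may be wider


def make_random(seed):
    """Return a generator function using the same multiplier and increment."""
    state = seed & CELL_MASK

    def random():
        nonlocal state
        state = (state * 31421 + 6927) & CELL_MASK
        return state

    return random


def random_fill(size, n, random):
    """Fill a freshly reserved buffer of `size` bytes with values 0..n-1."""
    buf = bytearray(size)      # plays the role of `144 chars allot`
    for i in range(size):      # starts at 0, like `144 0 do`
        buf[i] = random() % n  # like `random swap mod`
    return buf


def element(buf, row, col, ncols=12):
    """Row-major lookup, like `12 * + big_array + c@`."""
    return buf[row * ncols + col]


big_array = random_fill(144, 8, make_random(1))
for row in range(12):
    print(" ".join(str(element(big_array, row, col)) for col in range(12)))
```

The point of the model is that the buffer belongs to the array alone, so nothing else can write into it, and every index from 0 to 143 is filled.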
Let's say we have an array
0 1 2 3 4 5 8 7 8 9
There are two indexes that have value 8:
(i.10) ([#~8={) 0 1 2 3 4 5 8 7 8 9
6 8
Is there a shorter way to get this result? Maybe some built-in verb?
More importantly, what about higher dimensions?
Let's say we have a 5x4 matrix
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
I want to find the coordinates of the cells with value 6.
I want to get a result such as this (there are three coordinates):
4 1
3 2
2 3
It's a pretty basic task, so I think a simple solution should exist.
The same in three dimensions?
Thank you
Using Sparse array functionality ($.) provides a very fast and lean solution that also works for multiple dimensions.
]a=: 5 ]\ 1 + i. 8
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
6 = a
0 0 0 0 0
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
4 $. $. 6 = a
1 4
2 3
3 2
Tacitly:
getCoords=: 4 $. $.
getCoords 6 = a ,: a
0 1 4
0 2 3
0 3 2
1 1 4
1 2 3
1 3 2
Verb indices I. almost does the job.
When you have a simple list, I.'s use is straightforward:
I. 8 = 0 1 2 3 4 5 8 7 8 9
6 8
For higher-rank arrays you can pair it with antibase #: to convert the flat indices into coordinates, using the shape of the array ($a) as the base. E.g.:
]a =: 4 5 $ 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
I. 6 = ,a
9 13 17
($a) #: 9 13 17
1 4
2 3
3 2
Similarly, for any number of dimensions: flatten (,), compare (=), get indices (I.) and convert coordinates (($a)&#:):
]coords =: ($a) #: I. 5 = , a =: ? 5 6 7 $ 10
0 0 2
0 2 1
0 2 3
...
(<"1 coords) { a
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
By the way, you can write I. x = y as x (I.@:=) y for extra performance. It is special code for "indices where x f y".
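For readers who do not know J, the four steps named above (flatten, compare, get indices, convert to coordinates) can be sketched in plain Python. This is an illustration and not a translation of J's internals; `shape_of`, `flatten`, `unravel` and `coords_where` are hypothetical helper names, and the sketch assumes a rectangular nested list.

```python
def shape_of(nested):
    """Shape of a rectangular nested list, e.g. [4, 5]."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


def flatten(nested):
    """All items in row-major order (the role of , in J)."""
    if not isinstance(nested, list):
        return [nested]
    return [x for item in nested for x in flatten(item)]


def unravel(index, shape):
    """Convert a flat index to coordinates (the role of ($a) #: in J)."""
    coords = []
    for extent in reversed(shape):
        index, remainder = divmod(index, extent)
        coords.append(remainder)
    return tuple(reversed(coords))


def coords_where(nested, target):
    shape = shape_of(nested)
    flat = flatten(nested)
    return [unravel(i, shape) for i, v in enumerate(flat) if v == target]


a = [[1, 2, 3, 4, 5],
     [2, 3, 4, 5, 6],
     [3, 4, 5, 6, 7],
     [4, 5, 6, 7, 8]]

print(coords_where(a, 6))  # [(1, 4), (2, 3), (3, 2)]
```

The same function works for three dimensions: `coords_where([a, a], 6)` returns the six coordinates shown in the sparse-array answer.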
I have a 48x202 matrix, where the first column in the matrix is an ID, and the rest of the columns hold the vector related to the row ID in the first column.
The ID column is sorted in ascending order, and multiple rows can have the same ID.
I want to combine all rows with equal IDs, meaning that I want to sum the rows in the matrix that have identical IDs in the first column.
The resulting matrix should be 32x202, since there are only 32 IDs.
Any ideas?
I'd totally approach this with accumarray as well as unique. Like the previous answer, let A be your matrix. You would obtain your answer thusly:
[vals,~,id] = unique(A(:,1),'stable');
B = accumarray(id, (1:numel(id)).', [], @(x) {sum(A(x,2:end),1)});
out = [vals cell2mat(B)];
The first line of code produces vals, which is a list of all unique IDs seen in the first column of A, and id, which assigns to each row an integer label, without any gaps, from 1 up to the number of unique IDs in the first column of A. You want this for the next line of code.
accumarray works like this: you provide a set of keys and a set of values associated with each key. accumarray groups all values that belong to the same key and applies some operation to each group. The keys in our case are the IDs given in the first column of A, and the values are the row locations of the matrix A, from 1 up to the number of rows of A. The default behaviour is to sum all of the values that belong to the same key, but we're going to do something a bit different. For each unique ID seen in the first column of A, there will be a bunch of row locations that map to that ID. We use these row locations to access the matrix A and sum those rows over all of the columns from the second column to the end. That's what the anonymous function in the fourth argument of accumarray is doing. accumarray traditionally should output a single value per key, but we get around this by outputting a single cell per key, where each cell holds a row vector with the column-wise sums of the rows mapped to that key.
Each element of B gives you the summed row for the corresponding unique value in vals, so the last line of code pieces these together: the unique value in vals with its summed row. I had to use cell2mat because B is a cell array, and it has to be converted into a numerical matrix to complete the task.
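The grouping idea described above (collect the rows that share a key, then sum them column by column) can be sketched without MATLAB. This Python version is only an illustration of the technique, not of how accumarray is implemented; `sum_rows_by_id` is a hypothetical name, and the data is a cut-down version of the example.

```python
def sum_rows_by_id(rows):
    """rows: list of [id, v1, v2, ...].

    Returns one summed row per ID, in first-seen order of the IDs.
    """
    totals = {}  # dicts keep insertion order, like unique(..., 'stable')
    for row in rows:
        key, values = row[0], row[1:]
        if key not in totals:
            totals[key] = list(values)
        else:
            totals[key] = [t + v for t, v in zip(totals[key], values)]
    return [[key] + sums for key, sums in totals.items()]


A = [[1, 7, 4],
     [1, 3, 8],
     [1, 3, 2],
     [2, 6, 2],
     [2, 8, 6]]

print(sum_rows_by_id(A))  # [[1, 13, 14], [2, 14, 8]]
```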
Here's an example seeing this in action. I'm going to do this for a smaller set of data:
>> rng(123);
>> A = [[1;1;1;2;2;2;2;3;3;4;4;5;6;7] randi(10, 14, 10)];
>> A
A =
1 7 4 3 4 5 1 10 3 2 3
1 3 8 7 5 7 9 9 4 9 6
1 3 2 1 9 9 7 4 6 4 9
2 6 2 5 3 6 8 1 7 6 4
2 8 6 5 5 7 1 4 2 6 8
2 5 6 5 10 6 6 4 2 6 2
2 10 7 5 6 7 6 8 4 1 7
3 7 9 4 7 7 2 10 7 10 9
3 5 8 5 2 9 2 4 9 10 10
4 4 7 9 9 1 7 8 6 3 1
4 4 8 10 7 8 4 6 9 3 5
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
The first column is our IDs, and the next columns are the data. Running the above code I just wrote, we get:
>> out
out =
1 13 14 11 18 21 17 23 13 15 18
2 29 21 20 24 26 21 17 15 19 21
3 12 17 9 9 16 4 14 16 20 19
4 8 15 19 16 9 11 14 15 6 6
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
If you double-check each row of the output, it is the sum of all of the rows of A that share the corresponding ID. For example, the first three rows map to the same ID, so we sum these rows to get the first row of the output: the second column is equal to 7+3+3=13, the third column is equal to 4+8+2=14, etc.
Another approach is to apply unique and then use bsxfun to build a matrix that multiplied by the non-ID part of the input matrix will give the result.
Let the input matrix be denoted as A. Then:
[u, ~, v] = unique(A(:,1));
result = [ u bsxfun(@eq, u, u(v).') * A(:,2:end) ];
Example: borrowing from @rayryeng's answer, let
A = [ 1 7 4 3 4 5 1 10 3 2 3
1 3 8 7 5 7 9 9 4 9 6
1 3 2 1 9 9 7 4 6 4 9
2 6 2 5 3 6 8 1 7 6 4
2 8 6 5 5 7 1 4 2 6 8
2 5 6 5 10 6 6 4 2 6 2
2 10 7 5 6 7 6 8 4 1 7
3 7 9 4 7 7 2 10 7 10 9
3 5 8 5 2 9 2 4 9 10 10
4 4 7 9 9 1 7 8 6 3 1
4 4 8 10 7 8 4 6 9 3 5
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6 ];
Then the result is
result =
1 13 14 11 18 21 17 23 13 15 18
2 29 21 20 24 26 21 17 15 19 21
3 12 17 9 9 16 4 14 16 20 19
4 8 15 19 16 9 11 14 15 6 6
5 8 4 6 6 3 7 7 4 6 3
6 5 4 7 4 2 6 2 4 10 5
7 1 3 2 4 6 4 4 4 10 6
and the intermediate matrix created with bsxfun is
>> bsxfun(@eq, u, u(v).')
ans =
1 1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1
Pre-multiplying A by this matrix means that the first three rows of A are added to give the first row of the result; then the following four rows of A are added to give the second row of the result, etc.
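The indicator-matrix trick can be checked by hand on a small case. The following Python sketch is an illustration under the assumption of plain nested lists (no MATLAB or array library); `indicator_matrix` and `matmul` are hypothetical helper names.

```python
def indicator_matrix(ids):
    """One row per unique ID; entry is 1 where the input row has that ID."""
    unique_ids = sorted(set(ids))
    matrix = [[1 if i == u else 0 for i in ids] for u in unique_ids]
    return unique_ids, matrix


def matmul(M, X):
    """Plain matrix product of two nested lists."""
    return [[sum(m * x for m, x in zip(row, col)) for col in zip(*X)]
            for row in M]


ids = [1, 1, 1, 2, 2]
data = [[7, 4],
        [3, 8],
        [3, 2],
        [6, 2],
        [8, 6]]

u, M = indicator_matrix(ids)
sums = matmul(M, data)
result = [[key] + row for key, row in zip(u, sums)]

print(M)       # [[1, 1, 1, 0, 0], [0, 0, 0, 1, 1]]
print(result)  # [[1, 13, 14], [2, 14, 8]]
```

Each row of the indicator matrix selects the input rows for one ID, so the product adds exactly those rows together.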
You can find the unique row IDs with unique and then loop over all of those, summing the other columns: Let A be your matrix, then
rID = unique(A(:, 1));
B = zeros(numel(rID), size(A, 2));
for ii = 1:numel(rID)
B(ii, 1) = rID(ii);
B(ii, 2:end) = sum(A(A(:, 1) == rID(ii), 2:end), 1);
end
B contains your output.
I have a 1 x 15 array of values:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
I need to rearrange them into a 3 x 5 matrix using a for loop:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
How would I do that?
I'm going to show you three methods. One where you need to have a for loop, and two others when you don't:
Method #1 - for loop
First, create a matrix that is 3 x 5, then keep track of an index that will go through your array. Then create a double for loop that populates the matrix.
index = 1;
array = 1 : 15; %// Array we wish to access
matrix = zeros(3,5); %// Initialize
for m = 1 : 3
for n = 1 : 5
matrix(m,n) = array(index);
index = index + 1;
end
end
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
Method #2 - Without a for loop
Simply put, use reshape:
matrix = reshape(1:15, 5, 3).';
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
reshape will take a vector and restructure it into a matrix so that you populate the matrix by columns first. As such, we want to put 1 to 5 in the first column, 6 to 10 in the second and 11 to 15 in the third column. Therefore, our output matrix is in fact 5 x 3. When you see this, this is actually the transposed version of the matrix we want, which is why you do .' to transpose the matrix back.
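The column-first filling followed by a transpose can be traced step by step. This Python sketch imitates that behaviour with nested lists; it is an illustration only, and `reshape_column_major` and `transpose` are hypothetical helper names.

```python
def reshape_column_major(values, nrows, ncols):
    """Fill an nrows x ncols matrix down the columns first."""
    if len(values) != nrows * ncols:
        raise ValueError("size mismatch")
    return [[values[r + nrows * c] for c in range(ncols)]
            for r in range(nrows)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


tall = reshape_column_major(list(range(1, 16)), 5, 3)
print(tall)    # [[1, 6, 11], [2, 7, 12], [3, 8, 13], [4, 9, 14], [5, 10, 15]]

matrix = transpose(tall)
print(matrix)  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
```

The intermediate 5 x 3 result holds 1 to 5 in its first column, which is why the dimensions are given as 5 and 3 rather than 3 and 5.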
Method #3 - Another method without a for loop (tip of the hat goes to Luis Mendo)
You can use vec2mat, and specify that you need to have 5 columns worth for your matrix:
matrix = vec2mat(1:15, 5);
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
vec2mat takes a vector and reshapes it into a matrix of as many columns as you specify in the second parameter. In this case, we need 5 columns.
For the sake of (bsx)fun, here is another option...
bsxfun(@plus,1:5,[0:5:10]')
ans =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
It is less readable and maybe faster, but who cares for such a small array...
A = [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ] ;
A = reshape( A , 5 , 3 ) ;
A' = 1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
I have this dataframe:
df <- data.frame(subject = c(rep("one", 20), c(rep("two", 20))),
score1 = sample(1:3, 40, replace=T),
score2 = sample(1:6, 40, replace=T),
score3 = sample(1:3, 40, replace=T),
score4 = sample(1:4, 40, replace=T))
subject score1 score2 score3 score4
1 one 2 4 2 2
2 one 3 3 1 2
3 one 1 2 1 3
4 one 3 4 1 2
5 one 1 2 2 3
6 one 1 5 2 4
7 one 2 5 3 2
8 one 1 5 1 3
9 one 3 5 2 2
10 one 2 3 3 4
11 one 3 2 1 3
12 one 2 5 2 1
13 one 2 4 1 4
14 one 2 2 1 3
15 one 1 3 1 4
16 one 1 6 1 3
17 one 3 4 2 2
18 one 3 2 1 3
19 one 2 5 3 1
20 one 3 6 2 1
21 two 1 6 3 4
22 two 1 2 1 2
23 two 3 2 1 2
24 two 1 2 2 1
25 two 2 3 1 3
26 two 1 5 3 3
27 two 2 4 1 4
28 two 2 6 2 4
29 two 1 6 2 2
30 two 1 5 1 4
31 two 2 1 2 4
32 two 3 6 1 1
33 two 1 1 3 1
34 two 2 4 2 3
35 two 2 1 3 2
36 two 2 3 1 3
37 two 1 2 3 4
38 two 3 5 2 2
39 two 2 1 3 4
40 two 2 1 1 3
Note that the scores have different ranges of values: score 1 ranges from 1-3, score 2 from 1-6, score 3 from 1-3, and score 4 from 1-4.
I'm trying to reshape data like this:
library(reshape2)
dfMelt <- melt(df, id.vars="subject")
acast(dfMelt, subject ~ value ~ variable)
Aggregation function missing: defaulting to length
, , score1
1 2 3 4 5 6
one 6 7 7 0 0 0
two 8 9 3 0 0 0
, , score2
1 2 3 4 5 6
one 0 5 3 4 6 2
two 5 4 2 2 3 4
, , score3
1 2 3 4 5 6
one 10 7 3 0 0 0
two 8 6 6 0 0 0
, , score4
1 2 3 4 5 6
one 3 6 7 4 0 0
two 3 5 5 7 0 0
Note that the output array includes scores as "0" if they are missing. Is there any way to stop these missing scores being outputted by acast?
In this case, you might do better sticking to base R's table feature. I'm not sure that you can have an irregular array like you are looking for.
For example:
> lapply(df[-1], function(x) table(df[[1]], x))
$score1
x
1 2 3
one 9 6 5
two 11 4 5
$score2
x
1 2 3 4 5 6
one 2 5 4 3 3 3
two 4 2 2 3 4 5
$score3
x
1 2 3
one 9 5 6
two 4 11 5
$score4
x
1 2 3 4
one 4 4 8 4
two 2 6 5 7
Or, using your "long" data:
with(dfMelt, by(dfMelt, variable,
FUN = function(x) table(x[["subject"]], x[["value"]])))
Since each "score" subset is going to have a different shape, you will not be able to preserve the array structure. One option is to use lists of two-dimensional arrays or data.frames, e.g.:
# your original acast call
res <- acast(dfMelt, subject ~ value ~ variable)
# remove any columns that are all zero
apply(res, 3, function(x) x[, apply(x, 2, sum)!=0] )
Which gives:
$score1
1 2 3
one 7 8 5
two 6 8 6
$score2
1 2 3 4 5 6
one 4 2 6 4 1 3
two 2 5 3 4 3 3
$score3
1 2 3
one 5 10 5
two 5 11 4
$score4
1 2 3 4
one 5 4 4 7
two 4 6 6 4
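The idea of keeping one table per score, each with only the values that score actually takes, can be sketched outside R as well. This Python version is an illustration of the data structure, not of reshape2 or table; `tables_by_score` is a hypothetical name and the four records are made-up sample data.

```python
from collections import Counter


def tables_by_score(records, subject_key="subject"):
    """records: list of dicts with a subject and several score fields.

    Returns {score_name: {subject: {value: count}}}, where each score's
    table only has columns for values observed in that score.
    """
    score_names = [k for k in records[0] if k != subject_key]
    subjects = sorted({r[subject_key] for r in records})
    tables = {}
    for name in score_names:
        counts = Counter((r[subject_key], r[name]) for r in records)
        observed = sorted({r[name] for r in records})
        tables[name] = {s: {v: counts[(s, v)] for v in observed}
                        for s in subjects}
    return tables


# Made-up sample data for illustration
records = [
    {"subject": "one", "score1": 1, "score2": 5},
    {"subject": "one", "score1": 3, "score2": 2},
    {"subject": "two", "score1": 1, "score2": 6},
    {"subject": "two", "score1": 1, "score2": 2},
]

tables = tables_by_score(records)
print(tables["score1"])  # {'one': {1: 1, 3: 1}, 'two': {1: 2, 3: 0}}
print(tables["score2"])
```

A zero still appears when one subject never has a value that another subject does, but values that no subject has for that score are left out entirely, which is the irregular shape a single array cannot hold.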