MATLAB to C-code - c

I am following the MathWorks guide to converting MATLAB code to C code.
The first step is to enter
%#codegen
after every function that I want converted to C-code, however doing so has given me the following prompt on the code below.
function lanes=find_lanes(B,h, stats)
% Find the regions that look like lanes
%#codegen
lanes = {};
l=0;
for k = 1:length(B)
metric = stats(k).MajorAxisLength/stats(k).MinorAxisLength;
%testlane(k);
%end
%function testlane(k)
coder.inline('never');
if metric > 5 & all(B{k}(:,1)>100)
l=l+1;
lanes(l,:)=B(k);
else
delete(h(k))
end
end
end
around the curly braces:
code generation only supports cell operations for "varargin" and
"varargout"
Another prompt says
Code generation does not support variable "lanes" size growth through indexing
where lanes is mentioned for the second time.
The input Arguments for the function are:
B - Is the output of the bwboundaries Image Processing toolbox function. It is a P-by-1 cell array, where P is the number of objects and holes. Each cell in the cell array contains a Q-by-2 matrix. Each row in the matrix contains the row and column coordinates of a boundary pixel. Q is the number of boundary pixels for the corresponding region.
h - a matrix of size 1 x length(B) holding the handles of the boundary plots, which draw each object's boundary with a green outline, like so:
h(K)=plot(boundary(:,2), boundary(:,1), 'g', 'LineWidth', 2); % boundary(:,1) - Y coordinate, boundary(:,2) - X coordinate.
stats - 19x1 struct array acquired using the regionprops function from the Image Processing toolbox with fields:
MajorAxisLength and
MinorAxisLength (of the object)
I would really appreciate any input you can give in helping me clear this error. Thanks in Advance!

A few points about your code generation:
Only a subset of the functions in MATLAB and Image Processing Toolbox supports code generation (see "Image Processing Toolbox support for code generation").
Code generation does not support cell arrays yet (see "Cell array support").
In your code, the variable lanes grows inside the loop, i.e. its initial size cannot hold everything your workflow assigns to it. You should follow the documentation on code generation for variable-sized inputs.
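One way to sidestep both the cell array and the growing variable is to return a fixed-size logical mask instead of a list of lanes. The following is only a sketch under stated assumptions: the function name lane_mask and the inputs majorLen, minorLen and minRow are hypothetical, and the caller is assumed to copy the axis lengths out of stats and to compute minRow(k) = min(B{k}(:,1)) in ordinary MATLAB before calling the generated code.

```matlab
function isLane = lane_mask(majorLen, minorLen, minRow) %#codegen
% Hypothetical helper: flags the regions that look like lanes.
% majorLen, minorLen : P-by-1 axis lengths taken from stats
% minRow             : P-by-1, smallest row coordinate of each boundary
n = numel(majorLen);
isLane = false(n, 1);              % preallocated, the size never changes
for k = 1:n
    metric = majorLen(k) / minorLen(k);
    % min(B{k}(:,1)) > 100 is the same test as all(B{k}(:,1) > 100)
    isLane(k) = metric > 5 && minRow(k) > 100;
end
end
```

Back in MATLAB, the mask can then be applied outside the generated code, for example lanes = B(isLane) and delete(h(~isLane)).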

I had a similar error, i.e. code generation does not support variable size growth through indexing. Inside my for loop I had the following statement, which produced the same error:
y(i) = k;
I introduced a temporary storage variable u and modified my code to:
u = y;
u(i) = k;
y = u;
I suggest you do the same for your variable lanes.

Related

removing second layer for loop when defining array

Working in MATLAB R2017a. I'm trying to optimise a piece of code I'm working on. It uses arrays to store field values on a grid.
In order to create a specific function in a field array I originally used the straightforward method of two for loops iterating over all the array elements. But I know for loops are slow, so since then I came back and tried my best to remove them. However, I could only manage to remove one of the loops, leaving me with this:
for n = 1:1:K
%%% define initial pertubation
t=n*dt;
% create array for source Ez field.
xtemps = (1:Ng)*dX;
for k = 1:Ng
ztemp = k*dX;
Ez0(k,:) = THzamp * (1/(1+exp(-(t-stepuppos)))) * exp(-((xtemps-...
THzstartx).^2)./(bx^2)) .* (t-((ztemp-THzstartz)/vg))*exp(-((t-((ztemp-...
THzstartz)/vg))^2)/(bt^2));
end
The important bit here is the last 5 lines, but I figured the stuff before might be important for context. I've removed the for loop looping over the x coordinates. I want to vectorize the z/k for loop but I can't figure out how to distinguish between the dimensions with the array operators.
Edit: THzamp, stepuppos, bx, bt, THzstartz, THzstartx are all just scalars, they control the function (Ez0) I'm trying to create. dX and t are also just scalars. Ez0 is a square array of size Ng.
What I want to achieve is to remove the for loop that loops over k, so that the values of ztemp are defined in a vector (like xtemps already is), rather than individually in the loop. However, I don't know how I'd write the definition of Ez0 in that case.
First time posting here, if I'm doing it wrong let me know. If you need more info just ask.
It isn't clear whether n is used in the other headers and, as stated in the comments, your sizes aren't properly defined, so you'll have to make sure the sizes are correct.
However, you can give this vectorized code a try.
n = 1:K
%%% define initial pertubation
t=n*dt;
% create array for source Ez field.
xtemps = (1:Ng)*dX;
for k = 1:Ng
ztemp = k*dX;
Ez0(k,:) = THzamp .* (1./(1+exp(-(t-stepuppos)))) .* exp(-((xtemps-...
THzstartx).^2)./(bx^2)) .* (t-((ztemp-THzstartz)/vg)).*exp(-((t-((ztemp-...
THzstartz)/vg)).^2)/(bt.^2));
end
Now that t has length K, you'll need to make sure stepuppos and (ztemp-THzstartz)/vg have a size compatible with K. Also take a look at the MATLAB documentation on array vs. matrix operators.
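For the loop over k that the question actually asks about, note that for a fixed scalar t each entry Ez0(k,j) is a factor depending only on z times a factor depending only on x, so the whole array is an outer product of a column vector and a row vector. A minimal sketch, assuming all parameters are scalars as stated in the question; the numeric values below are hypothetical and chosen only so the code runs:

```matlab
% Hypothetical parameter values (not from the original post).
Ng = 4; dX = 0.5; t = 2.0; vg = 1.5;
THzamp = 1; stepuppos = 1; bx = 2; bt = 3; THzstartx = 1; THzstartz = 0.5;

xtemps = (1:Ng) * dX;          % 1-by-Ng row vector of x positions
ztemps = (1:Ng).' * dX;        % Ng-by-1 column vector of z positions

tau = t - (ztemps - THzstartz) / vg;             % Ng-by-1, one value per row k
fz  = tau .* exp(-(tau.^2) / bt^2);              % z-dependent factor
gx  = exp(-((xtemps - THzstartx).^2) / bx^2);    % x-dependent factor, 1-by-Ng

% Column times row gives the full Ng-by-Ng array without a loop over k.
Ez0 = THzamp * (1 / (1 + exp(-(t - stepuppos)))) * (fz * gx);
```

Row k of Ez0 corresponds to ztemp = k*dX, as in the original loop; the outer loop over n (time) is left unchanged.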

Network Formation and Large Array's in Matlab Optimization

I am getting an error using repmat. My Matlab version is 2017a. "Requested 3711450x2726 (75.4GB) array exceeds maximum array size..." First, some context.
I have an adjacency matrix of social network data call it D. D is 2725x2725 with 1s denoting a link between agents i and j and 0s otherwise. I have been provided a function and sub-functions for a network formation model. There are K regressors (x variables). The model requires forming a dyad-specific regressor matrix W that is W = 0.5N(N-1) x K. In my data, this is 3711450 x K. For a start, I select only one x variable so K=1.
In the main function, there are two steps. The first step calculates the joint MLE from a logit. I have a problem in the second step computation of the variance covariance matrix with array size. Inside this step, there is a calculation that creates a 3711450 x n (2725) matrix using repmat.
INFO = ((repmat((exp_Xbeta ./ (1+exp_Xbeta).^2),1,K) .* X)'*X);
exp_Xbeta is 3711450 x K and X is a sparse 3711450 x 2725 matrix with Bytes = 178171416 of class double. The error occurs at INFO.
I've tried converting X to a tall matrix but thus far no joy. I've tried adding sparse to the INFO line but again no joy. Anyone have any ideas short of going to a cluster or getting more ram? Could I somehow convert X from a sparse matrix to a full matrix inside a datastore and then call the datastore using tall? I have not been able to figure out how to do that if it is possible.
Once INFO is constructed as an array it will be used later in one of the sub-functions. So, it needs to be callable. In case you're curious, INFO is the second derivative matrix.
I have found that producing the INFO matrix all at once was too much for my memory constraints. I split up the steps, but still, repmat and subsequent steps were a problem. Now, I've turned to building up the INFO matrix one step at a time, while never holding more than exp_Xbeta, X, and two vectors in memory. Replacing the construction of INFO with
for i = 1:d
s1_i = step1(:,1).*X(:,i);
s1_i = s1_i';
for j = 1:d;
INFO(i,j) = s1_i*X(:,j);
end
clear s1_i;
end
has dropped the memory requirement, though it's slow, and things seem to be working. For anyone interested, below is a little example illustrating the point.
clear all
N = 20
n = 0.5*N*(N-1)
exp_Xbeta = rand(n,1);
X = rand(n,N);
step1 = (exp_Xbeta ./ (1+exp_Xbeta).^2);
[c,d] = size(X);
INFO = zeros(d,d);
for i = 1:d
s1_i = step1(:,1).*X(:,i)
s1_i = s1_i'
for j = 1:d
INFO(i,j) = s1_i*X(:,j)
end
clear s1_i
end
K = 1
INFO2 = ((repmat((exp_Xbeta ./ (1+exp_Xbeta).^2),1,K) .* X)'*X);
% Methods produce equivalent matrices
INFO
INFO2
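For K = 1 the same matrix can also be written as X' * W * X with W a diagonal weight matrix, which avoids both repmat and the double loop. The sketch below uses a sparse diagonal built with spdiags so that intermediates stay sparse when X is sparse; the small sizes and the random sparse X are hypothetical stand-ins, and whether this fits in memory for the real 3711450 x 2725 data has not been checked here.

```matlab
% Small illustrative sizes (hypothetical, not the real data).
N = 20;
n = 0.5 * N * (N - 1);
exp_Xbeta = rand(n, 1);
X = sprand(n, N, 0.2);                       % sparse stand-in for the real X

step1 = exp_Xbeta ./ (1 + exp_Xbeta).^2;     % n-by-1 weights
W = spdiags(step1, 0, n, n);                 % sparse diagonal, n nonzeros
INFO = X' * (W * X);                         % N-by-N result, stored as sparse
```

If a full matrix is needed by the later sub-functions, full(INFO) converts the small N-by-N result.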

Smart and Fast Indexing of multi-dimensional array with R

This is another step of my battle with multi-dimensional arrays in R, previous question is here :)
I have a big R array with the following dimensions:
> data = array(..., dim = c(x, y, N, value))
I'd like to perform a sort of bootstrap comparing the mean (see here for a discussion about it) obtained with:
> vmean = apply(data, c(1,2,3), mean)
With the mean obtained sampling the N values randomly with replacement, to explain better if data[1,1,,1] is equals to [v1 v2 v3 ... vN] I'd like to replace it with something like [v_k1 v_k2 v_k3 ... v_kN] with k values sampled with sample(N, N, replace = T).
Of course I want to AVOID a for loop. I've read this but I don't know how to perform an efficient indexing of this array avoiding a loop through x and y.
Any ideas?
UPDATE: the important thing here is that I want a different sample for each sample in the fourth (value) dimension, otherwise it would be simple to do something like:
> dataSample = data[,,sample(N, N, replace = T), ]
Also there's the compiler package which speeds up for loops by using a Just In Time compiler.
Adding these lines at the top of your code enables the compiler for all code.
require("compiler")
compilePKGS(enable=T)
enableJIT(3)
setCompilerOptions(suppressAll=T)

Creating a cell array of different randomized matrices

I'm trying to create a cell array of size N,
where every cell is a randomized Matrix of size M,
I've tried using deal or simple assignments, but the end result is always N identical Matrices of size M
for example:
N=20;
M=10;
CellArray=cell(1,N);
CellArray(1:20)={rand(M)};
This yields identical matrices in each cell. I've tried writing the assignment like so:
CellArray{1:20}={rand(M)};
but this yields the following error:
The right hand side of this assignment has too few values to satisfy the left hand side.
The end result should be a set of transition probability matrices to be used for a model I'm constructing.
There is a currently working version of the model, but it uses loops to create the matrices and runs rather slowly.
I'd be thankful for any help.
If you don't want to use loops because you are interested in a low execution time, get rid of the cells.
RandomArray=rand(M,M,N)
You can access each slice, which is your intended MxM matrix, using RandomArray(:,:,index)
Use cellfun:
N = 20;
M = 10;
CellArray = cellfun(@(x) rand(M), cell(1,N), 'uni',0)
For every cell it newly calls rand(M) - unlike before, you were assigning the same rand(M) to every cell, which was just computed once.
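Since the question mentions transition probability matrices, the two suggestions above can be combined: generate all matrices as one 3-D array, normalize the rows, and only convert to a cell array if the model needs one. This is a sketch that assumes row-stochastic matrices are wanted (each row summing to 1), which the question does not state explicitly; the variable name P is hypothetical.

```matlab
N = 20;
M = 10;

P = rand(M, M, N);                       % N independent random M-by-M slices
P = bsxfun(@rdivide, P, sum(P, 2));      % assumption: make each row sum to 1

% Optional: repackage the slices as a 1-by-N cell array of M-by-M matrices.
CellArray = reshape(num2cell(P, [1 2]), 1, N);
```

Each CellArray{k} equals P(:,:,k), so code that already expects cells can keep working while the generation step stays loop-free.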

Cuda Fortran 4D array

My code is being slowed down by accesses to my 4D array in global memory.
I am using PGI compiler 2010.
The 4D array I am accessing is read only from the device and the size is known at run time.
I wanted to allocate to the texture memory and found that my PGI version does not support texture. As the size is known only at run time, it is not possible to use constant memory too.
Only One dimension is known at compile time like this MyFourD(100, x,y,z) where x,y,z are user input.
My first idea was to use pointers, but I am not familiar with pointers in Fortran.
If you have experience dealing with such a situation, I would appreciate your help, because this alone makes my code 5 times slower than expected.
Following is a sample code of what I am trying to do
integer i,j,k
i = (blockIdx%x-1) * blockDim%x + threadIdx%x-1
j = (blockIdx%y-1) * blockDim%y + threadIdx%y-1
do k = 0, 100
regvalue1 = somevalue1
regvalue2 = somevalue2
regvalue3 = somevalue3
d_value(i,j,k)=d_value(i,j,k)
& +myFourdArray(10,i,j,k)*regvalue1
& +myFourdArray(32,i,j,k)*regvalue2
& +myFourdArray(45,i,j,k)*regvalue3
end do
Best regards,
I believe the answer from @Alexander Vogt is on the right track - I would think about re-ordering the array storage. But I would try it like this:
integer i,j,k
i = (blockIdx%x-1) * blockDim%x + threadIdx%x-1
j = (blockIdx%y-1) * blockDim%y + threadIdx%y-1
do k = 0, 100
regvalue1 = somevalue1
regvalue2 = somevalue2
regvalue3 = somevalue3
d_value(i,j,k)=d_value(i,j,k)
& +myFourdArray(i,j,k,10)*regvalue1
& +myFourdArray(i,j,k,32)*regvalue2
& +myFourdArray(i,j,k,45)*regvalue3
end do
Note that the only change is to myFourdArray, there is no need for a change in data ordering in the d_value array.
The crux of this change is that we are allowing adjacent threads to access adjacent elements in myFourdArray and so we are allowing for coalesced access. Your original formulation forced adjacent threads to access elements that were separated by the length of the first dimension, and so did not allow for useful coalescing.
Whether in CUDA C or CUDA Fortran, threads are grouped in X first, then Y and then Z dimensions. So the rapidly varying thread subscript is X first. Therefore, in matrix access, we want this rapidly varying subscript to show up in the index that is also rapidly varying.
In Fortran this index is the first of a multiple-subscripted array.
In C, this index is the last of a multiple-subscripted array.
Your original code followed this convention for d_value by placing the X thread index (i) in the first array subscript position. But it broke this convention for myFourdArray by putting a constant in the first array subscript position. Thus your accesses to myFourdArray are noticeably slower.
When there is a loop in the code, we also don't want to place the loop variable first (for Fortran, or last for C) (i.e. k, in this case, as Alexander Vogt did) because doing that will also break coalescing. For each iteration of the loop, we have multiple threads executing in lockstep, and those threads should all access adjacent elements. This is facilitated by having the X thread indexed subscript (e.g. i) first (for Fortran, or last for C).
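The effect of the reordering on memory addresses can be checked with a small index calculation. MATLAB stores arrays in column-major order like Fortran, so sub2ind gives the same linear offsets a Fortran array would have. The sizes nx, ny, nz and the thread indices below are hypothetical; the sketch only compares the distance in memory between the elements read by two threads with adjacent i.

```matlab
nx = 64; ny = 8; nz = 4;     % hypothetical run-time sizes
i = 5; j = 2; k = 3;         % one thread's indices; its neighbour uses i+1

% Original layout myFourdArray(100, x, y, z): the constant index comes first.
sizeOrig   = [100 nx ny nz];
strideOrig = sub2ind(sizeOrig, 10, i+1, j, k) - sub2ind(sizeOrig, 10, i, j, k);

% Reordered layout myFourdArray(x, y, z, 100): the X thread index comes first.
sizeNew   = [nx ny nz 100];
strideNew = sub2ind(sizeNew, i+1, j, k, 10) - sub2ind(sizeNew, i, j, k, 10);
```

Here strideOrig is 100 elements and strideNew is 1 element, which is why only the reordered layout lets adjacent threads read adjacent memory locations.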
You could invert the indexing, i.e. let the first dimension change the fastest. Fortran is column-major!
do k = 0, 100
regvalue1 = somevalue1
regvalue2 = somevalue2
regvalue3 = somevalue3
d_value(k,i,j)=d_value(k,i,j) + &
myFourdArray(k,i,j,10)*regvalue1 + &
myFourdArray(k,i,j,32)*regvalue2 + &
myFourdArray(k,i,j,45)*regvalue3
end do
If the last (in the original case second) dimension is always fixed (and not too large), consider individual arrays instead.
In my experience, pointers do not change much in terms of speed-up when applied to large arrays. What you could try is strip-mining to optimize your loops in terms of cache access, but I do not know the compile option to enable this with the PGI compiler.
Ah, ok it is a simple directive:
!$acc do vector
do k=...
enddo

Resources