How do I generate a string array in MatLab 2016b? [duplicate] - arrays

When trying to run my code, for example
for ii= 1:10
output(ii)=rand(3);
end
I get the error
In an assignment A(:) = B, the number of elements in A and B must be the same
or
In an assignment A(I) = B, the number of elements in B and I must be the same.
What does this error mean? What is the approach to get rid of it?

This error comes because you are trying to fill a variable chunk with more (or less) values than its size. In other words, you have a statement A(:)=B on where size(A(:)) is different to size(B).
In the example in the question, rand(3) returns a 3x3 matrix, however, output(ii) is just a single value (even if output may be bigger, output(ii) is just a single value of output), thus the value returned by rand(3) does not fit inside output.
In order to solve this problem, you need to change the the size of the output variable, so you have space to fit all the result.
There are 2 ways of doing this. One of them is by creating a Matrix that fits the return, e.g. output=zeros(3,3,10).
Then we can change the code to
for ii= 1:10
output(:,:,ii)=rand(3);
end
Alternatively, you can fill the output as a cell array. This is particularly useful when the return of the function changes sizes each time, e.g. rand(ii);
In that case, the following would work
for ii= 1:10
output{ii}=rand(ii);
end
It is probable that unlike in the example in the question, in the real case you do not know the size of what the output returns, thus you do not know which of the two options to use to fix your code.
On possible way of learning that, is activating debugging help when the code errors, by typing dbstop if error in your command line. This will trigger a debugging stop when MATLAB throws an error, and you can type size(rand(ii)) and size(output(ii)) to see the sizes of both.
Often, reading the documentation of the function being used also helps, to see if different sizes are possible.
That said, the second option, cell arrays, will always ensure everything will fit. However matrices are generally a faster and easier to use in MATLAB, thus you should aim for the matrix based solution if you can.

Related

In an assignment A(:) = B, the number of elements in A and B must be the same

When trying to run my code, for example
for ii= 1:10
output(ii)=rand(3);
end
I get the error
In an assignment A(:) = B, the number of elements in A and B must be the same
or
In an assignment A(I) = B, the number of elements in B and I must be the same.
What does this error mean? What is the approach to get rid of it?
This error comes because you are trying to fill a variable chunk with more (or less) values than its size. In other words, you have a statement A(:)=B on where size(A(:)) is different to size(B).
In the example in the question, rand(3) returns a 3x3 matrix, however, output(ii) is just a single value (even if output may be bigger, output(ii) is just a single value of output), thus the value returned by rand(3) does not fit inside output.
In order to solve this problem, you need to change the the size of the output variable, so you have space to fit all the result.
There are 2 ways of doing this. One of them is by creating a Matrix that fits the return, e.g. output=zeros(3,3,10).
Then we can change the code to
for ii= 1:10
output(:,:,ii)=rand(3);
end
Alternatively, you can fill the output as a cell array. This is particularly useful when the return of the function changes sizes each time, e.g. rand(ii);
In that case, the following would work
for ii= 1:10
output{ii}=rand(ii);
end
It is probable that unlike in the example in the question, in the real case you do not know the size of what the output returns, thus you do not know which of the two options to use to fix your code.
On possible way of learning that, is activating debugging help when the code errors, by typing dbstop if error in your command line. This will trigger a debugging stop when MATLAB throws an error, and you can type size(rand(ii)) and size(output(ii)) to see the sizes of both.
Often, reading the documentation of the function being used also helps, to see if different sizes are possible.
That said, the second option, cell arrays, will always ensure everything will fit. However matrices are generally a faster and easier to use in MATLAB, thus you should aim for the matrix based solution if you can.

How to identify places in MATLAB where data is stored outside the bounds of an array?

I am trying to use MATLAB Coder to convert code from Matlab to a MEX file. If I have a code snippet of the following form:
x = zeros(a,1)
x(a+1) = 1
then in Matlab this will resize the array to accommodate the new element, while in the MEX file this will give an "Index exceeds matrix dimensions" error. I expect there are lots of places in the code where this is happening.
What I want to do is run the MATLAB version of the code (without using the coder) but have MATLAB generate an error or warning whenever it resizes an array because I assign to something outside the bounds. (I could just use the MEX file and see where the errors pop up, but this requires rebuilding the whole MEX file using MATLAB Coder every time I change the code at all, which takes a while.)
Is there a way to do this? Is there any kind of setting in MATLAB that will turn off the "automatically resize if you assign to an out-of-bounds index", or give a warning if this happens?
EDIT: As of Matlab 2015b, Coder now has runtime error checking as an option (from the Matlab release notes):
In R2015b, generated standalone libraries and executables can detect and report run-time errors such as out-of-bounds array indexing. In
previous releases, only generated MEX detected and reported run-time
errors.
By default, run-time error detection is enabled for MEX. By
default, run-time error detection is disabled for standalone libraries
and executables.
To enable run-time error detection for standalone
libraries and executables:
At the command line, use the code
configuration property RuntimeChecks.
cfg = coder.config('lib'); % or
'dll' or 'exe'
cfg.RuntimeChecks = true;
codegen -config cfg myfunction
Using the MATLAB Coder app, in the project build settings,
on the Debugging tab, select the Generate run-time error checks check
box.
The generated libraries and executables use fprintf to write
error messages to stderr and abort to terminate the application. If
fprintf and abort are not available, you must provide them. Error
messages are in English.
See Run-Time Error Detection and Reporting in Standalone C/C++ Code and Generate Standalone Code That Detects and Reports Run-Time Errors.
Original answer:
The answer in the comments regarding declaring a class subclassed from double wherein the subsref method is overloaded to disallow growing would be one good way to do it.
Another easy way to do it is to sprinkle assert commands throughout the code (in each loop iteration or at the bottom of a function) to assert that the size has not increased over the allocated size.
For instance, if you had the code in a format of:
x = zeros(a,1)
x(a+1) = 1
... lots of other operations
if coder.target('MATLAB')
assert(isequal(size(x), [a,1]), 'x has been indexed out of bounds')
end
this would let you have the assertion fail if any value were assigned that extended the array.
To make this a bit tidier, you might even make a function that bounds checked all the variables you care about, again wrapping the coder.target if statement around it. Then you could sprinkle this throughout your code.
It's not as elegant as overloading the double class, but on the other hand it doesn't add any overhead to the compiled code at all. It also won't give you errors exactly when the overrun happens, but it will give you confidence that the code is working well in a variety of situations.
Another way that you can be more confident of assignments is to do your own bounds checking on the assignment in situations where this may be appropriate. A common problem I've seen in assignment goes something like this. We have an allocated array and are copying in data from another array with a vector assignment. For instance, consider the following situation:
t = char(zeros(5,7)); % Allocate a 5 by 7 char array
tempstring = 'hello to anyone'; % Here's a string we want to put into it.
t(1, 1:numel(tempstring)) = tempstring; % A valid assignment in MATLAB
>> size(t)
ans =
5 15
Uh oh, exactly what you're concerned about in the question has happened: the t array has been automatically resized during the assignment, which works in MATLAB, but in Coder created code will cause a segfault or MEX error. An alternative is to use the power of the end function to keep the assignment tidy (but truncated.) If we change the assignment to:
t(1,1:min(end,numel(tempstring))) = tempstring(1:size(t, 2));
The size of t will remain unchanged, but the assignment will be truncated. The use of end allows bounds checking during the assignment. For some situations, this can be a nice way of handling the issue and will give you confidence that the bounds will never be exceeded, but obviously in some situations this is very undesirable (and won't give you error messages in MATLAB.)
Another helpful tool MATLAB provides is in the editor itself. If you use the %#codegen tag in your code, it will signal the editor's syntax checker to highlight various code generation problems, including places where you are obviously increasing the size of an array by indexing. This can't catch every situation, but it's a good help.
One last note. As mentioned in the question, a MEX file generated by Coder will give you an "Index exceeds matrix dimensions" error right at the time of the assignment, and will gracefully exit and even tell you the original line of code where the error occurred. A C library generated out of Coder has no such nice behavior or bounds checking and will segfault out completely with no diagnostics. The intermediate answer is to do exactly what you're doing, which is to run the code as a MEX. That's not very helpful to your question (as you say, rebuilding the MEX can take time), but for those of us who code for the cold, cruel world of external C code, the intermediate test of being able to run the MEX to find these errors is a lifesaver.
The bottom line is this is a divergence in behavior between MATLAB and Coder generated C code, and it can be a source of significant problems. In my own code, I am very careful with array access and growth for exactly this reason. It's an area where I'd like to see improvement in the Coder tool itself, but there are ways to be very careful when writing MATLAB code targeted for Coder.

R external interface

I would like to implement some R package, written in C code.
The C code must:
take an array (of any type) as input.
produce array as output (of unpredictable size).
What is the best practice of implementing array passing?
At the moment C code is called with .C(). It accesses array directly from R, through pointer. Unfortunately same can't be done for output, as output dimensions need to be known in advance which is not true in my case.
Would it make sense to pass array from C to R through a file? For example, in ramfs, if using linux?
Update 1:
Here the exact same problem was discussed.
There, possible option of returning array with unknown dimensions was mentioned:
Splitting external function into two, at the point before calculating the array but after dimensions are known. First part would return dimensions, then empty array would be prepared, then second part would run and populate the array in R.
In my case full dimensions are known only once whole code is executed, so this method would mean running C code twice. Taking a guess on maximal array size isn't optional either.
Update 2: It seems only way to do this is to use .Call() instead, as power suggested. Here are few good examples: http://www.sfu.ca/~sblay/R-C-interface.ppt.
Thanks.
What is the best practice of implementing array passing?
Is the package already written in ANSI C? .C() would then be quick and easy.
If you are writing from scratch, I suggest .Call() and Rcpp. In this way, you can pass R objects to your C/C++ code.
Would it make sense to pass array through a file?
No
Read "Writing R Extensions".

Allocating arrays of the same size

I'd like to allocate an array B to be of the same shape and have the same lower and upper bounds as another array A. For example, I could use
allocate(B(lbound(A,1):ubound(A,1), lbound(A,2):ubound(A,2), lbound(A,3):ubound(A,3)))
But not only is this inelegant, it also gets very annoying for arrays of (even) higher dimensions.
I was hoping for something more like
allocate(B(shape(A)))
which doesn't work, and even if this did work, each dimension would start at 1, which is not what I want.
Does anyone know how I can easily allocate an array to have the same size and bounds as another array easily for arbitrary array dimensions?
As of Fortran 2008, there is now the MOLD optional argument:
ALLOCATE(B, MOLD=A)
The MOLD= specifier works almost in the same way as SOURCE=. If you specify MOLD= and source_expr is a variable, its value need not be defined. In addition, MOLD= does not copy the value of source_expr to the variable to be allocated.
Source: IBM Fortran Ref
You can either define it in a preprocessor directive, but that will be with a fixed dimensionality:
#define DIMS3D(my_array) lbound(my_array,1):ubound(my_array,1),lbound(my_array,2):ubound(my_array,2),lbound(my_array,3):ubound(my_array,3)
allocate(B(DIMS3D(A)))
don't forget to compile with e.g. the -cpp option (gfortran)
If using Fortran 2003 or above, you can use the source argument:
allocate(B, source=A)
but this will also copy the elements of A to B.
If you are doing this a lot and think it too ugly, you could write your own subroutine to take care of it, copy_dims (template, new_array), encapsulating the source code line you show. You could even set up a generic interface so that it could handle arrays of several ranks -- see how to write wrapper for 'allocate' for an example of that concept.

Passing c arrays into fortran as a variable sized matrix

So, i've been commissioned to translate some fortran subroutines into C. These subroutines are being called as part of the control flow of a large porgram based primarily in C.
I am translating the functions one at a time, starting with the functions that are found at the top of call stacks.
The problem I am facing is the hand-off of array data from C to fortran.
Suppose we have declared an array in c as
int* someCArray = (int*)malloc( 50 * 4 * sizeof(int) );
Now, this array needs to be passed down into a fortran subroutine to be filled with data
someFortranFunc( someCArray, someOtherParams );
when the array arrives in fortran land, it is declared as a variable sized matrix as such:
subroutine somefortranfunc(somecarray,someotherparams)
integer somefarray(50,*)
The problem is that fortran doesn't seem to size the array correctly, becuase the program seg-faults. When I debug the program, I find that indexing to
somefarray(1,2)
reports that this is an invalid index. Any references to any items in the first column work fine, but there is only one available column in the array when it arrives in fortran.
I can't really change the fact that this is a variable sized array in fortran. Can anyone explain what is happening here, and is there a way that I can mitigate the problem from the C side of things?
[edit]
By the way, the fortran subroutine is being called from the replaced fortran code as
integer somedatastorage(plentybignumber)
integer someindex
...
call somefarray(somedatastorage(someindex))
where the data storage is a large 1d array. There isn't a problem with overrunning the size of the data storage. Somehow, though, the difference between passing the C array and the fortran (sub)array is causing a difference in the fortran subroutine.
Thanks!
Have you considered the Fortran ISO C Binding? I've had very good results with it to interface Fortran and C in both directions. My preference is to avoid rewriting existing, tested code. There are a few types that can't be transferred with the current version of the ISO C Binding, so a translation might be necessary.
What it shouldn't be that others suggested:
1. Size of int vs. size of Integer. Since the first column has the right values.
2. Row vs. column ordering. Would just get values in wrong order not segmentation faulted.
3. Reference vs value passing. Since the first column has the right values. Unless the compiler is doing something evil behind your back.
Are you sure you don't do this in some secret way?:
someCArray++
print out the value of the someCArray pointer right after you make it and right before you pass it. You also should print it out using the debugger in the fortran code just to verify that the compiler is not generating some temporary copies to help you.

Resources