Max of an Array (SAS) - arrays

I have an array of auc values, cv_auc0-cv_auc39, numbered 0-39. The maximum auc value is .7778, and it appears in several places in the array (33, 35, 38, 39). When I create the variable
auc_max = max(of cv_auc0-cv_auc39);
It seems to identify place 39 as the maximum, even though this maximum appears elsewhere in the array.
These numbers 0-39 reflect the number of covariates in a model, and I want to keep this number as low as possible while maintaining max auc, thus I would like for the auc_max variable to identify place 33 instead of 39. How to do this?
I extract this covariate number, p, in the following code:
array a (*) cv_auc0-cv_auc&maxp;
do k = &maxp to 0 by -1;
if (a(k+1) = auc_max) then p = k;
end;
cross_val_auc = a(p+1);
keep p cross_val_auc;
And the p it returns is 39 instead of 33.

Why not just use the WHICHN() function? You might want to subtract one since your variable name suffixes start from zero instead of one.
auc_max = max(of cv_auc0-cv_auc&maxp);
p = whichn(auc_max,of cv_auc0-cv_auc&maxp)-1;

I don't see anything incorrect in the code. My best guess is that the maximum value differs slightly between the places. If the value in place 39 is, say, 1e-6 greater than the value in place 33, then place 39 is returned.
Here is how I would do it. I would iterate up from the bottom and use the leave; statement to stop the loop.
data test;
array a[10] (1 2 3 4 4 3 2 4 1 4);
m = max(of a1-a10);
do p=1 to 10 ;
if a[p] = m then leave;
end;
put m= p=;
run;
returns:
m=4 p=4
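If the tied values really do differ by floating-point noise, as guessed above, an exact comparison will skip the earlier places. Here is a minimal sketch of a tolerant "first place near the maximum" search, written in Python rather than SAS just to show the logic. The function name first_near_max and the tolerance tol are assumptions for illustration, not anything from the original post.

```python
def first_near_max(values, tol=1e-9):
    """Return the lowest index whose value is within tol of the maximum."""
    best = max(values)
    for i, v in enumerate(values):
        if best - v <= tol:
            return i


# Made-up data: index 3 is larger than index 1 only by rounding noise.
auc = [0.70, 0.7778, 0.75, 0.7778 + 1e-12]
exact_place = auc.index(max(auc))   # exact comparison picks the later place
tolerant_place = first_near_max(auc)  # tolerant comparison picks the earlier one
```

The same idea in the data step would be to compare abs(a[p] - m) against a small tolerance instead of testing a[p] = m.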

Related

index negative number in array in lua

Problem: I have an array called x, indexed from 1 to 20, which I initialize to 0.
I read a number from hardware into the local variable z and want to add 1 to the array element at that index. The number can be positive or negative.
However, the array only has positive indices (i.e. x[1] ... x[20]).
When the number is negative, Lua gives the error message "attempt to perform arithmetic on field '?' (a nil value)".
What should I do?
local x= {}
local number
local z
for i = 1, 20 do
x [i] = 0; -- array initialization
end
for y = 1, 5 do
z = number -- I am reading a "number" from hardware & it is a negative integer number
x[z] = x[z]+1 ;
end
It's because you are reading x[z], not just assigning it.
A simple fix would be to give it a default value.
Example:
x[z] = (x[z] or 0) + 1
This will make the code assume x[z] is equal to zero by default.
Your initialization code is only setting the positive values of the array X to 0. Simply modify the loop to initialize all the possible values of Z.
for i = -20, 20 do
x [i] = 0; -- array initialization
end
This assumes Z ranges from -20 to 20.
Negative index values work, but can sometimes lead to unexpected behavior when using the ipairs() function. The ipairs() function will only iterate through the values of the table starting at index 1 until the first uninitialized (nil) entry.
t = {}
for i = -2, 2 do t[i] = i end
print("all values")
for i = -2, 2 do
print(i, t[i])
end
print("positive integer values")
for k, v in ipairs(t) do
print(k, v)
end
Gives the following results:
all values
-2 -2
-1 -1
0 0
1 1
2 2
positive integer values
1 1
2 2
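Negative numbers are valid table keys in Lua, but they fall outside the sequence part of the table that ipairs() and the # operator work with. If you want to avoid negative keys, you have a few possibilities: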
Type cast the index to string and index with this string. It will make your table associative (you can index with any string):
z = tostring(number)
This is good if you do not have any bounds on your indices.
If you have a limited interval of numbers (-20 to 20), you can shift them to positive indices when accessing the values:
MIN_VALUE = -20
z = number - MIN_VALUE + 1 -- +1 because Lua arrays start at index 1
This approach should give better performance, since the array is addressed directly.
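The offset idea can be sketched as follows, in Python rather than Lua, purely as an illustration. The bounds MIN_VALUE and MAX_VALUE, the function name count_readings, and the sample readings are all assumptions. Python lists start at 0, so the shift is number - MIN_VALUE; a 1-based Lua array needs one more.

```python
MIN_VALUE = -20  # assumed lowest possible reading
MAX_VALUE = 20   # assumed highest possible reading


def count_readings(readings):
    """Count how often each reading occurs, using a shifted list index."""
    counts = [0] * (MAX_VALUE - MIN_VALUE + 1)
    for number in readings:
        # Shift so that MIN_VALUE lands in slot 0 and MAX_VALUE in the last slot.
        counts[number - MIN_VALUE] += 1
    return counts


counts = count_readings([-20, -3, -3, 20])
```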
You never set "number" to anything, therefore it gives you the error
attempt to perform arithmetic on field '?' (a nil value)
Notice the words "a nil value", because this is why this error occurred. So next time set "number" to something, or add an "or ___" when trying to do something like calculations, or setting variables.

matlab: how to speed up the count of consecutive values in a cell array

I have the 137x19 cell array Location(1,4).loc and I want to find the number of times that horizontal consecutive values are present in Location(1,4).loc. I have used this code:
x=Location(1,4).loc;
y={x(:,1),x(:,2)};
for ii=1:137
cnt(ii,1)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3)};
for ii=1:137
cnt(ii,2)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4)};
for ii=1:137
cnt(ii,3)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4),x(:,5)};
for ii=1:137
cnt(ii,4)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1})&strcmp(x(:,5),y{1,5}{ii,1}));
end
... and so on for all the columns. This code runs and gives me the correct result, but it's not automated and it's slow. Can you give me ideas to automate and speed up the code?
I think I will write an answer to this since I've not done so for a while.
First convert your cell array to a matrix; this will make the following steps much easier. Then diff is the way to go:
A = randi(5,[137,19]);
DiffA = diff(A')'; %// Diff along each row gives a 137 by 18 matrix, where each value has the previous value in its row subtracted from it.
So a 0 in DiffA would represent 2 consecutive numbers in A are equal, 2 consecutive 0s would mean 3 consecutive numbers in A are equal.
idx = DiffA==0;
cnt(:,1) = sum(idx,2);
To do 3 consecutive number counts, you could do something like:
idx2 = abs(DiffA(:,1:end-1))+abs(DiffA(:,2:end)) == 0;
cnt(:,2) = sum(idx2,2);
Or use another diff. The abs is needed because a negative number plus a positive number can also sum to 0; with abs, only 0 + 0 gives 0. You can continue this pattern by doing:
idx3 = abs(DiffA(:,1:end-2))+abs(DiffA(:,2:end-1))+abs(DiffA(:,3:end)) == 0;
cnt(:,3) = sum(idx3,2);
In loop format:
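absDiffA = abs(DiffA);
W = size(absDiffA,2); %// largest number of adjacent differences to test (assumed; lower it if you need fewer)
for ii = 1:W
idx = (absDiffA == 0); %// a zero here means ii+1 consecutive equal values
cnt(:,ii) = sum(idx,2);
absDiffA = absDiffA(:,1:end-1) + absDiffA(:,2:end); %// widen the window by one column
end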
NOTE: this method counts [0,0,0] twice when evaluating 2 consecutives, and once when evaluating 3 consecutives.
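To make the counting rule in the note concrete, here is a small pure-Python sketch of what is being counted per row: every window of adjacent equal values, with overlapping windows counted separately. The function name count_runs and the sample row are made up for illustration. It is a brute-force check of the logic, not a replacement for the vectorized diff approach.

```python
def count_runs(row, length):
    """Count windows of `length` adjacent equal values in row (overlaps included)."""
    return sum(
        1
        for start in range(len(row) - length + 1)
        if all(row[start] == row[start + k] for k in range(1, length))
    )


row = [3, 3, 3, 1, 2, 2]
pairs = count_runs(row, 2)    # the run 3,3,3 contributes two pairs, 2,2 one more
triples = count_runs(row, 3)  # the run 3,3,3 contributes one triple
```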

Filling an array where one portion is linearly increasing and the rest is truncated

I'm trying to fill an array of size 1 x 200 with values. I want the array to be filled with values ranging from 0 to 216 in steps of 6 and then keep the value constant (216) for the remaining part of the array.
How can I do that?
One way is to initially create an array from 0 to 216 in steps of 6, then concatenate the array of 216s until you reach 200 values.
Something like:
out = 0:6:216;
N = 200;
out(end+1:end+N-numel(out)) = 216;
Another way is to create 200 values of 216, then replace the first 216/6 + 1 = 37 values (the + 1 is because we're including 0) with the desired array:
N = 200; stop = (216/6) + 1;
out = 216*ones(1,N);
out(1:stop) = 0:6:216;
Finally, another way is to create an array from 0 up to 199, truncate all values that are greater than 36 to be 36, then multiply the result by 6:
N = 200;
out = 0:N-1;
out(out > 36) = 36;
out = 6*out;
... and for completeness, you can do this with min [1]:
out = min(0:199,36)*6;
The two-argument min call returns the element-wise minimum of two arrays of compatible sizes. If one of the inputs is a scalar, it is compared against every element of the other array. Here we generate an array from 0 to 199; values less than 36 are kept, and any greater values are clamped to 36. We then multiply by 6 to obtain the result.
[1] Credit for this answer goes to user Stewie Griffin before he deleted his answer. I decided to put this in for completeness.
arr = min(0:6:(6*199),216);
should work
or:
arr = min((0:199)*6,216);
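For readers who want to check the numbers outside MATLAB, the same clamp-with-min idea looks like this in Python. The constant names N, STEP, and CAP are chosen here for illustration.

```python
N = 200     # total number of values
STEP = 6    # increment of the ramp
CAP = 216   # value to hold once the ramp reaches it

# Ramp up in steps of 6, then stay at 216 for the rest of the list.
out = [min(STEP * i, CAP) for i in range(N)]
```

The ramp occupies the first 37 entries (0 through 216) and the remaining 163 entries stay at 216.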

Substitute a vector value with two values in MATLAB

I have to create a function that takes as input a vector v and three scalars a, b and c. The function replaces every element of v that is equal to a with a two element array [b,c].
For example, given v = [1,2,3,4] and a = 2, b = 5, c = 5, the output would be:
out = [1,5,5,3,4]
My first attempt was to try this:
v = [1,2,3,4];
v(2) = [5,5];
However, I get an error, so I do not understand how to put two values in the place of one in a vector, i.e. shift all the following values one position to the right so that the two new values fit in the vector, which increases the size of the vector by one. In addition, if there are several values of a in v, I'm not sure how to replace them all at once.
How can I do this in MATLAB?
Here's a solution using cell arrays:
% remember the indices where a occurs
ind = (v == a);
% split array such that each element of a cell array contains one element
v = mat2cell(v, 1, ones(1, numel(v)));
% replace appropriate cells with two-element array
v(ind) = {[b c]};
% concatenate
v = cell2mat(v);
Like rayryeng's solution, it can replace multiple occurrences of a.
The problem mentioned by siliconwafer, that the array changes size, is solved here by temporarily keeping the partial arrays in the cells of a cell array. Converting back to an array concatenates these parts.
Something I would do is to first find the locations in v that are equal to a, which we will call ind. Then, create a new output vector of size numel(v) + numel(ind), since each value of a in v is replaced by two values and so adds one element. Finally, use indexing to place our new values in.
Assuming that you have created a row vector v, do the following:
%// Find all locations that are equal to a
ind = find(v == a);
%// Allocate output vector
out = zeros(1, numel(v) + numel(ind));
%// Determine locations in output vector that we need to
%// modify to place the value b in
indx = ind + (0:numel(ind)-1);
%// Determine locations in output vector that we need to
%// modify to place the value c in
indy = indx + 1;
%// Place values of b and c into the output
out(indx) = b;
out(indy) = c;
%// Get the rest of the values in v that are not equal to a
%// and place them in their corresponding spots.
rest = true(1,numel(out));
rest([indx,indy]) = false;
out(rest) = v(v ~= a);
The indx and indy statements are the tricky part. Each replacement makes the output one element longer, so every occurrence of a is pushed to the right by one position for each occurrence that comes before it. The first occurrence is not shifted at all, the second is shifted right by 1, the third by 2, and so on, which is what adding 0:numel(ind)-1 to ind does.
These shifted indices define where we're going to place b. To place c, we simply take the indices generated for placing b and move them over to the right by 1.
What's left is to populate the output vector with those values that are not equal to a. We simply define a logical mask where the indices used to populate the output array have their locations set to false while the rest are set to true. We use this to index into the output and find those locations that are not equal to a to complete the assignment.
Example:
v = [1,2,3,4,5,4,4,5];
a = 4;
b = 10;
c = 11;
Using the above code, we get:
out =
1 2 3 10 11 5 10 11 10 11 5
This successfully replaces every value that is 4 in v with the tuple of [10,11].
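The index arithmetic can be traced by hand with a short Python sketch. It keeps 1-based positions so the numbers line up with the MATLAB code above. The function name replace_with_pair and the sentinel empty are assumptions made for illustration.

```python
def replace_with_pair(v, a, b, c):
    """Replace every element of v equal to a with the two values b, c."""
    ind = [i + 1 for i, x in enumerate(v) if x == a]  # 1-based positions of a in v
    indx = [p + k for k, p in enumerate(ind)]         # output positions for b
    indy = [p + 1 for p in indx]                      # output positions for c
    empty = object()                                  # marks slots not yet filled
    out = [empty] * (len(v) + len(ind))
    for p in indx:
        out[p - 1] = b
    for p in indy:
        out[p - 1] = c
    rest = iter([x for x in v if x != a])             # values that are kept as-is
    out = [next(rest) if slot is empty else slot for slot in out]
    return indx, out


indx, out = replace_with_pair([1, 2, 3, 4, 5, 4, 4, 5], 4, 10, 11)
```

For the example, ind is 4, 6, 7 and indx comes out as 4, 7, 9, so the three copies of b land in output positions 4, 7, and 9.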
I think that strrep deserves a mention here.
Although it's called string replacement and warns for non-char input, it still works perfectly fine for other numbers as well (including integers, doubles and even complex numbers).
v = [1,2,3,4]
a = 2, b = 5, c = 5
out = strrep(v, a, [b c])
Warning: Inputs must be character arrays or cell arrays of strings.
out =
1 5 5 3 4
You are not attempting to overwrite an existing value in the vector. You're attempting to change the size of the vector (meaning the number of rows or columns in the vector) because you're adding an element. This will always result in the vector being reallocated in memory.
Create a new vector from the part of v before the index, the new values, and the part of v after the index.
Let's say your index is stored in the variable index.
index = 2;
newValues = [5, 5];
x = [ v(1:index-1), newValues, v(index+1:end) ]
x =
1 5 5 3 4

how to get more than one number inside of matrices

Ok. I have a simple question although I'm still fairly new to Matlab (taught myself). So I was wanting a 1x6 matrix to look like this below:
0
0
1
0
321, 12 <--- needs to be in one box in 1x6 matrices
4,30,17,19 <--- needs to be in one box in 1x6 matrices
Is there a possible way to do this or am I going to just have to write them all in separate boxes thus making it a 1x10 matrix?
My code:
event_marker = 0;
event_count = 0;
block_number = 1;
date = [321,12] % (its corresponding variables = 321 and 12)
time = [4,30,17,19] % (its corresponding variable = 4 and 30 and 17 and 19)
So if I understand you correctly, you want an array that contains 6 elements, of which one element equals 1, another element is the array [321,12] and the last element is the array [4,30,17,19].
I'll suggest two things to accomplish this: matrices, and cell-arrays.
Cell arrays
In Matlab, a cell array is a container for arbitrary types of data. You define it using curly braces (as opposed to square brackets for matrices). So, for example,
C = {'test', rand(4), {@cos,@sin}}
is something that contains a string (C{1}), a normal matrix (C{2}), and another cell which contains function handles (C{3}).
For your case, you can do this:
C = {0,0,1,0, [321,12], [4,30,17,19]};
or of course,
C = {0, event_marker, event_count, block_number, date, time};
Matrices
Depending on where you use it, a normal matrix might suffice as well:
M = [0 0 0 0
event_marker 0 0 0
event_count 0 0 0
block_number 0 0 0
321 12 0 0
4 30 17 19];
Note that you'll need some padding (meaning, you'll have to add those zeros in the top-right somehow). There's tonnes of ways to do that, but I'll "leave that as an exercise" :)
Again, it all depends on the context which one will be easier.
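The padding step left as an exercise above can be sketched like this, in Python just to show the logic. The function name pad_rows and the fill value are assumptions. The rows are the example values from the question.

```python
def pad_rows(rows, fill=0):
    """Pad every row on the right with `fill` so all rows have equal length."""
    width = max(len(r) for r in rows)
    return [list(r) + [fill] * (width - len(r)) for r in rows]


rows = [[0], [0], [0], [1], [321, 12], [4, 30, 17, 19]]
M = pad_rows(rows)
```

One caveat with padding is that a padded 0 cannot be told apart from a real 0 in the data, which is one reason the cell array is often the easier choice.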
Consider using cell arrays rather than matrices for your task.
data = cell(6,1); % allocate cell
data{1} = event_marker; % note the curly braces here!
...
data{6} = date; % all elements of date fits into a single cell.
If your date and time variables actually represent a date (days, months, years) and a time (hours, minutes, seconds), they can be packed into one or two numbers.
Look into DATENUM function. If you have a vector, for example, [2013, 4, 10], representing April 10th of 2013 you can convert it into a serial date:
daten = datenum([2013, 4, 10]);
It's also fine if you have the day of the year rather than a month and day: datenum([2013, 0, 300]) will work as well.
The time can be packed together with date or separately:
timen = datenum([0, 0, 0, 4, 30, 17.19]);
or
datetimen = datenum([2013, 4, 10, 4, 30, 17.19]);
Once you have this serial date you can just keep it in one vector with other numbers.
You can convert this number back into either a date vector or a date string with the DATEVEC and DATESTR functions.
