Substitute values in an array with a random number if value > 1

I have an array (1000 x 8) of values generated from a log-normal distribution representing the % die-off of bacteria on a surface after an hour. The problem is that some values are larger than 100% so I'd like to replace them with a random value between 0 and 1.
n = 1000; k = 8; % array size
m = 0.9; % 90% die-off (mean)
v = 0.01; % std from experiment (note: the formulas below treat v as the variance)
mu = log((m^2)/sqrt(v+m^2)); % convert to lognormal parameters
sigma = sqrt(log(v/(m^2)+1));
dieOff = lognrnd(mu,sigma,n,k); % generate values
dieOff(dieOff>1) = rand(); % replace with random
But it looks like rand() only produces 1 value and replaces all the values that are > 1 with that same value which is not what I'd like. How can I fix this in a neat format?
histogram(dieOff)

rand() gives a single number, i.e. you replace all values with that same, random, constant. Instead, use a random number for each occurrence:
dieOff(dieOff>1)=rand(nnz(dieOff>1),1);
rand(n,k) gives you an n-by-k matrix of random numbers between 0 and 1. rand(n) gives you an n-by-n matrix of random numbers (i.e. square), so for n=1 it is a single number. rand() is short for rand(1).
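For comparison, here is a minimal NumPy sketch of the same idea: draw one independent random number per entry that exceeds 1. NumPy, the fixed seed, and the variable names are assumptions made for illustration; the question itself uses MATLAB.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Lognormal parameters from a mean m and a variance v, as in the question
m, v = 0.9, 0.01
mu = np.log(m**2 / np.sqrt(v + m**2))
sigma = np.sqrt(np.log(v / m**2 + 1))

die_off = rng.lognormal(mean=mu, sigma=sigma, size=(1000, 8))

mask = die_off > 1                  # entries above 100%
n_bad = int(mask.sum())             # how many need replacing (nnz in MATLAB)
die_off[mask] = rng.random(n_bad)   # one independent draw per flagged entry
```

The key point is that the right-hand side has as many elements as the mask selects, so each flagged entry gets its own value.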

Related

Define a vector with random steps

I want to create an array with random incremental steps. I've used this simple code:
t_inici=(0:10*rand:100);
The problem is that the random number stays the same for every step. Is there any simple way to draw a new random number for each step?
If you have a set number of points, say nPts, then you could do the following
nPts = 10; % Could use 'randi' here for random number of points
lims = [0, 10]; % Start and end points
x = rand(1, nPts); % Create random numbers
% Sort and scale x to fit your limits and be ordered
% (max(x) - min(x) is used so that the toolbox function minmax is not needed)
x = diff(lims) * ( sort(x) - min(x) ) / ( max(x) - min(x) ) + lims(1);
This approach always includes your end point, which a 0:dx:10 approach would not necessarily.
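The sort-and-scale idea can be sketched in NumPy as follows. This is an illustration under assumptions (NumPy instead of MATLAB, a fixed seed, invented variable names), not the answerer's code.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, for repeatability only

n_pts = 10
lo, hi = 0.0, 10.0              # start and end points

x = np.sort(rng.random(n_pts))  # ordered random numbers in [0, 1)
# Rescale so the smallest value maps to lo and the largest to hi
x = (hi - lo) * (x - x[0]) / (x[-1] - x[0]) + lo
```

Because the smallest and largest samples are mapped onto the limits, both end points are always present, and the number of points is fixed in advance.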
If you had some maximum number of points, say nPtsMax, then you could do the following
nPtsMax = 1000; % Max number of points
lims = [0,10]; % Start and end points
% Could do 10* or any other multiplier as in your example in front of 'rand'
x = lims(1) + [0 cumsum(rand(1, nPtsMax))];
x(x > lims(2)) = []; % remove values above maximum limit
This approach may be slower, but is still fairly quick and better represents the behaviour in your question.
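A rough NumPy equivalent of the cumulative-sum approach is shown below. NumPy, the seed, and the names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, for repeatability only

n_pts_max = 1000                # upper bound on the number of steps
lo, hi = 0.0, 10.0              # start and end points

steps = rng.random(n_pts_max)   # each step is uniform in [0, 1)
x = lo + np.concatenate(([0.0], np.cumsum(steps)))
x = x[x <= hi]                  # drop values above the upper limit
```

Here the number of points is not known in advance, and the last point generally falls short of the upper limit.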
My first approach would be to randomly generate N-2 samples, where N is the desired number of samples, sort them, and add the extrema:
N=50;
endpoint=100;
initpoint=0;
randsamples=sort(rand(1, N-2)*(endpoint-initpoint)+initpoint);
t_inici=[initpoint randsamples endpoint];
However, I'm not sure how "uniformly random" this is, since you are "faking" 2 of the points so that the extrema are included. This will somewhat distort pure randomness (I think). If you are not necessarily interested in including the extrema, just remove the last line and generate N points. That will make sure that they are indeed random (or as random as MATLAB can create them).
Here is an alternative solution with "uniformly random" steps:
[initpoint,endpoint,coef]=deal(0,100,10);
t_inici(1)=initpoint;
while(t_inici(end)<endpoint)
    t_inici(end+1)=t_inici(end)+rand()*coef;
end
t_inici(end)=[];
In my view, this fits your attempt well: the steps have unknown sizes and the vector starts at 0, but it does not necessarily end at 100.
From your code it seems you want a uniformly random step that varies between each two entries. This implies that the number of entries that the vector will have is unknown in advance.
A way to do that is as follows. This is similar to Hunter Jiang's answer but adds entries in batches instead of one by one, in order to reduce the number of loop iterations.
1. Guess a number of required entries, n. Any value will do, but a large value will result in fewer iterations and will probably be more efficient.
2. Initialize result to the first value.
3. Generate n entries and concatenate them to the (temporary) result.
4. See if the current entries are already too many.
5. If they are, cut as needed and output the (final) result. Else go back to step 3.
Code:
lower_value = 0;
upper_value = 100;
step_scale = 10;
n = 5*(upper_value-lower_value)/step_scale*2; % STEP 1. The number 5 here is arbitrary.
% It's probably more efficient to err with too many than with too few
result = lower_value; % STEP 2
done = false;
while ~done
    result = [result result(end)+cumsum(step_scale*rand(1,n))]; % STEP 3. Include
    % n new entries
    ind_final = find(result>upper_value,1); % STEP 4. Index of first entry exceeding
    % upper_value, if any
    if ~isempty(ind_final) % STEP 5. If non-empty, we're done
        result = result(1:ind_final-1); % keep everything before that entry
        done = true;
    end
end
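The batched procedure can also be sketched in Python with NumPy. The function name random_steps, its parameters, and the default batch size are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_steps(lower, upper, step_scale, batch=100, rng=None):
    """Random increasing sequence from `lower`, stopping before `upper` is exceeded."""
    rng = np.random.default_rng() if rng is None else rng
    result = np.array([lower], dtype=float)           # step 2: start value
    while True:
        # Step 3: append a batch of cumulative random steps
        new = result[-1] + np.cumsum(step_scale * rng.random(batch))
        result = np.concatenate((result, new))
        # Step 4: positions of entries that exceed the upper value, if any
        over = np.flatnonzero(result > upper)
        if over.size:                                  # step 5: cut and return
            return result[:over[0]]

t = random_steps(0, 100, 10, rng=np.random.default_rng(3))
```

A larger batch means fewer loop iterations at the cost of generating some values that are discarded at the end.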

How to expand the range of a random integer from 1-10 to 1-200

Given a function that produces random integers uniformly in range 1 to 10, how to write a function that produces random integers uniformly in range 1 to 200?
Let u() = uniform(1,10). You can then write your new random variable as
v() = 10*(u()-1) + u() + 100*I[u()>5]
Note that you need three invocations of the uniform function, although the third one is only used as a boolean. Here I is an indicator function:
I[x] = x ? 1 : 0 // if x is true then 1 else 0.
Instead of u()>5, you can equivalently use u()%2==0 (u is even). You can think of creating 200 distinct values as 10 * 10 * 2, which requires three invocations of the underlying uniform function, even though the third one is only used as a binary value.
Here is an awk implementation and histogram test
awk 'function u() {return int(1+rand()*10)}
BEGIN {srand(); trials=100000;
for(i=1;i<=trials;i++) v[10*(u()-1)+u()+100*(u()%2)]++;
for(k in v) print k, v[k], (v[k]-trials/200)^2}' | sort -k3nr
The last column shows the squared deviation of each count from the ideal count (trials/200); either its sum or its maximum can be used as a fitness value. There are more sophisticated tests (chi-square etc.) for a more thorough analysis.
Call the base function 3 times and scale the results.
int rand1to200() {
    return ((rand1to10() - 1)*100 +
            (rand1to10() - 1)*10 +
            (rand1to10() - 1)*1)%200 + 1;
}
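Both constructions above can be checked exhaustively rather than by sampling: enumerate all 10 * 10 * 10 equally likely outcomes of three base calls and count how often each result occurs. The sketch below does this in Python; the function names indicator_form and base10_form are made up for illustration.

```python
from collections import Counter
from itertools import product

digits = range(1, 11)  # every value the base generator u() can return

def indicator_form(a, b, c):
    # v = 10*(u-1) + u + 100*I[u>5], with a, b, c the three base draws
    return 10 * (a - 1) + b + 100 * (c > 5)

def base10_form(a, b, c):
    # three base-10 digits form a number in 0..999, reduced mod 200
    return ((a - 1) * 100 + (b - 1) * 10 + (c - 1)) % 200 + 1

triples = list(product(digits, repeat=3))  # all 1000 equally likely outcomes
counts_indicator = Counter(indicator_form(*t) for t in triples)
counts_base10 = Counter(base10_form(*t) for t in triples)
```

If every value from 1 to 200 appears the same number of times in a counter, the corresponding construction is exactly uniform, given a uniform base generator.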

Python Random function

Write a program to create an array of random numbers between 1000 and 2000
and count the number of values that are higher than 1500.
I kind of understand how to set the range, but not how to count the results.
What I have is this:
import random
for x in range(20):
    a = random.randint(1000, 2000)
    b = (a > 1500)
    print(b)
    print()
This simply prints True or False for each number. I need the total count of numbers over 1500, not whether each one is or isn't. Thanks
This is how you do it, assuming your original code is correct in other respects:
import random
count = 0
for x in range(20):
    a = random.randint(1000, 2000)
    # b = (a > 1500)  # this only gives a boolean
    # print(b)
    if a > 1500:
        count = count + 1
print(count)
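Since the exercise asks for an array, a variant that stores the numbers first and counts afterwards may be closer to the task. This is a small sketch in Python 3; the fixed seed and the list length of 20 are assumptions carried over from the question's loop.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Build the "array" (a Python list) of 20 random integers in [1000, 2000]
values = [random.randint(1000, 2000) for _ in range(20)]

# True counts as 1 when summed, so this counts the values above 1500
count = sum(v > 1500 for v in values)
print(count)
```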

How do I create random Boolean array with at least one 1 in each row in matlab?

I am trying to create random boolean arrays in MATLAB with at least one 1 in each row.
You can use randi to generate random integers:
A = randi([0 1], 50, 10);
This generates a 50-by-10 array of integer values drawn uniformly from 0 and 1. Note that this alone does not guarantee a 1 in every row. You can convert the matrix to a dataset array with
ds = mat2dataset(A);
To convert a binary row to a number, as in the previous answers:
bin2dec(num2str(A(n,:)));
Suppose you want a random logical (boolean) matrix of size m-by-n with roughly a fraction p=0.25 of the entries set to true in each row, but never fewer than one. Then you can simply do:
P = rand(m,n); %// generate random numbers in [0,1]
th = min( max(P,[],2), 1-p ); %// set threshold
B = bsxfun( @ge, P, th ); %// threshold the probability matrix to get random boolean entries
Note that the threshold is determined by the amount of true values you want per row, but it is also truncated to the max value of each row, ensuring that at least one element (at random) will be set to true.
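The row-wise threshold trick can be sketched in NumPy as follows. The sizes, the seed, and the use of NumPy broadcasting in place of bsxfun are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)  # fixed seed, for repeatability only

m, n, p = 6, 8, 0.25            # rows, columns, target fraction of True

P = rng.random((m, n))          # random numbers in [0, 1)
# Per-row threshold: 1 - p, but never above the row maximum
th = np.minimum(P.max(axis=1), 1 - p)
B = P >= th[:, None]            # broadcast the threshold across each row
```

Capping the threshold at the row maximum means the largest element of every row always passes the comparison, so no row is left without a True.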
Here's one way. Let
M = 5; %// number of rows
N = 4; %// number of columns
p = .5; %// initial probability of 1
You can generate the matrix with the given probability of ones, and then fill in a one at a random position in each row (possibly overwriting a zero) to make sure there's at least a one in each row:
result = rand(M,N)<p; %// generate matrix
result(bsxfun(@plus, floor(N*rand(1,M))*M, 1:M)) = 1; %// at least a one per row
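A NumPy sketch of the same "generate, then force one entry per row" idea is given below. The names n_rows and n_cols and the fixed seed are assumptions; NumPy's paired integer indexing replaces the linear-index arithmetic.

```python
import numpy as np

rng = np.random.default_rng(5)  # fixed seed, for repeatability only

n_rows, n_cols, p = 5, 4, 0.5   # matrix size and initial probability of True

result = rng.random((n_rows, n_cols)) < p          # random boolean matrix
cols = rng.integers(0, n_cols, size=n_rows)        # one random column per row
result[np.arange(n_rows), cols] = True             # force a True in each row
```

Note that forcing an extra entry raises the overall fraction of ones slightly above p.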

Matlab: Help in implementing quantized time series

I am having trouble implementing this code due to the variable s_k being logical 0/1. In what way can I implement this statement?
s_k is a random sequence of 0/1 generated by calling rand() and quantizing its output at its mean, as given below. I don't know how to implement the rest. Please help.
N = 1000;
input = rand(N,1); % uniform samples, so the mean is 0.5
s = (input>=0.5); % converting into logical 0/1
UPDATE
N = 3;
tmax = 5;
y(1) = 0.1;
for i = 1 : tmax+N-1 %// Change here
    y(i+1) = 4*y(i)*(1-y(i)); % nonlinear model for generating the input to the autoregressive model
end
s = (y>=0.5);
ind = bsxfun(@plus, (0:tmax), (0:N-1).');
x = sum(s(ind+1).*(2.^(-ind+N+1))); % The output of this conversion should be real numbers
% Autoregressive model of order 1
z(1) = 0;
for j = 2 : N
    z(j) = 0.195*z(j-1) + x(j);
end
You've generated the random logical sequence, which is great. You also need to know N, which is the total number of points to collect at one time, as well as a list of time values t. Because this is a discrete summation, I'm going to assume the values of t are discrete. What you need to do first is generate a sliding window matrix. Each column of this matrix represents a set of time values for each value of t for the output. This can easily be achieved with bsxfun. Assuming a maximum time of tmax, a starting time of 0 and a neighbourhood size N (like in your equation), we can do:
ind = bsxfun(@plus, (0:tmax), (0:N-1).');
For example, assuming tmax = 5 and N = 3, we get:
ind =
0 1 2 3 4 5
1 2 3 4 5 6
2 3 4 5 6 7
Each column represents a time that we want to calculate the output at and every row in a column shows a list of time values we want to calculate for the desired output.
Finally, to calculate the output x, you simply take your s_k vector, make it a column vector, use ind to access into it, do a point-by-point multiplication with 2^(-k+N+1) by substituting k with what we got from ind, and sum along the rows. So:
s = rand(max(ind(:))+1, 1) >= 0.5;
x = sum(s(ind+1).*(2.^(-ind+N+1)));
The first statement generates a random vector that is as long as the maximum time value that we have. Once we have this, we use ind to index into this random vector so that we can generate a sliding window of logical values. We need to offset this by 1 as MATLAB starts indexing at 1.
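The sliding-window construction translates to NumPy broadcasting fairly directly. The sketch below is an illustration under assumptions (NumPy instead of MATLAB, a fixed seed); because NumPy indexes from 0, the +1 offset is not needed.

```python
import numpy as np

rng = np.random.default_rng(6)  # fixed seed, for repeatability only

N, tmax = 3, 5

# Column j holds the time indices j, j+1, ..., j+N-1 (same layout as ind above)
ind = np.arange(tmax + 1)[None, :] + np.arange(N)[:, None]

# Random logical sequence long enough to cover the largest index
s = rng.random(ind.max() + 1) >= 0.5

# Weight each selected bit by 2^(-k+N+1) and sum over each column's window
x = (s[ind] * 2.0 ** (-ind + N + 1)).sum(axis=0)
```

With N = 3 and tmax = 5, ind is the same 3-by-6 matrix shown above, and x holds one real value per output time.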
