Creating an array with data conditional on another matrix - arrays

I know there must be an apply function or ave for this, but I am not quite sure how to do it:
I have data:
date player market
1: 1-1 1 1
2: 1-1 2 1
3: 1-1 1 2
4: 1-2 2 1
5: 1-2 3 2
6: 1-3 21 1
7: 1-4 1 1
8: 1-4 51 1
9: 1-4 1 1
10: 1-5 1 2
I also have a blank array, which has unique dates on the rows, unique markets on the columns, and unique players for the third dimension.
1
[,,1]
1 2
1-1
1-2
1-3
1-4
1-5
2
[,,2]
1 2
1-1
1-2
1-3
1-4
1-5
etc
I want to fill out the array from the data.
I want each cell to equal 1 if the player has an entry in the data for that date and market combination, and 0 if not. So for example, for players 1 and 2, the array would be filled out as:
1
[,,1]
1 2
1-1 1 1
1-2 0 0
1-3 0 0
1-4 0 1
1-5 0 1
2
[,,2]
1 2
1-1 1 0
1-2 1 0
1-3 0 0
1-4 0 0
1-5 0 0
Looping is out of the question. Thank you for your help.

You can use xtabs for this purpose. In the example below, Temp plays the role of date, Month of market, and Day of player.
data(airquality)
tab<-xtabs(~Temp+Month+Day,airquality)
> dim(tab)
[1] 40 5 31
> str(tab)
xtabs [1:40, 1:5, 1:31] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 3
..$ Temp : chr [1:40] "56" "57" "58" "59" ...
..$ Month: chr [1:5] "5" "6" "7" "8" ...
..$ Day : chr [1:31] "1" "2" "3" "4" ...
- attr(*, "class")= chr [1:2] "xtabs" "table"
- attr(*, "call")= language xtabs(formula = ~Temp + Month + Day, data = airquality)
edit:
converting to data frame.
> head(as.data.frame(tab))
Temp Month Day Freq
1 56 5 1 0
2 57 5 1 0
3 58 5 1 0
4 59 5 1 0
5 61 5 1 0
6 62 5 1 0
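Note that xtabs returns counts, so a duplicated (date, market, player) row yields 2 rather than 1; the 0/1 array asked for would need the counts thresholded against 0. For readers working in Python, here is a minimal NumPy sketch of the same idea without an explicit loop over rows. The function name presence_array and the tuple layout of rows are assumptions made for illustration, and the sample rows are copied from the question.

```python
import numpy as np

# Sample rows copied from the question: (date, player, market).
rows = [
    ("1-1", 1, 1), ("1-1", 2, 1), ("1-1", 1, 2), ("1-2", 2, 1),
    ("1-2", 3, 2), ("1-3", 21, 1), ("1-4", 1, 1), ("1-4", 51, 1),
    ("1-4", 1, 1), ("1-5", 1, 2),
]

def presence_array(rows):
    """Return (arr, dates, markets, players) where arr[d, m, p] is 0 or 1."""
    dates = sorted({r[0] for r in rows})
    players = sorted({r[1] for r in rows})
    markets = sorted({r[2] for r in rows})
    # Map each label to its position along the corresponding axis.
    d_pos = {d: i for i, d in enumerate(dates)}
    p_pos = {p: i for i, p in enumerate(players)}
    m_pos = {m: i for i, m in enumerate(markets)}
    d = [d_pos[r[0]] for r in rows]
    p = [p_pos[r[1]] for r in rows]
    m = [m_pos[r[2]] for r in rows]
    arr = np.zeros((len(dates), len(markets), len(players)), dtype=int)
    # Fancy-index assignment: duplicated rows simply set the same cell to 1 again.
    arr[d, m, p] = 1
    return arr, dates, markets, players

arr, dates, markets, players = presence_array(rows)
print(arr[:, :, players.index(2)])  # dates x markets slice for player 2
```

Rows are dates, columns are markets, and the third axis is the player, matching the layout of the blank array in the question.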

Related

Error: Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

I have the following dataset about the choices of different car brands and their attributes. I would like to create a matrix based on each attribute of the cars.
RespNum Task Concept Make Exterior.Design Interior.design
1 100086500 1 1 3 2 3
2 100086500 1 2 1 3 2
3 100086500 1 3 4 1 1
4 100086500 1 4 0 0 0
5 100086500 2 1 1 3 2
6 100086500 2 2 5 1 3
Driving.performance Driving.attributes Comfort Practibility Safety
1 1 1 1 3 3
2 3 3 3 2 1
3 2 2 2 1 2
4 0 0 0 0 0
5 3 2 1 1 3
6 1 3 3 3 2
Quality Equipment Sustainability Economy Price Response
1 2 1 1 3 1 0
2 1 3 3 1 3 0
3 3 2 2 2 2 1
4 0 0 0 0 0 0
5 3 2 1 1 4 0
6 1 3 3 3 8 0
I am using the function:
Make = attribcoding(6,4,'Other')
The first input (6) is the number of levels, the second (4) is the column position in the dataset, and the last ('Other') is the name of the outside option. However, I get the following error message:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent

Repeat rows based on item count for each row and assign values for repeated rows

I have a df listing items and whether each item is available in different rooms:
Item Room1 Room2 Room3 Room4
Ball 1 1 1 0
Bat 1 1 1 1
Wicket 1 1 1 0
Now I want to repeat each row based on the item's count across the rooms. For example, Ball has three 1s (in Room1, Room2 and Room3), so it needs 3 rows, with a 0 assigned per row only within the Room1, Room2, Room3 columns; Room4 is not considered for Ball and can stay 0 in all Ball rows. There are 300 columns with different room names, for example Room1, room2, room3, room4, BlockArea1, Block2, etc. Below is the expected output:
Item Room1 Room2 Room3 Room4
Ball 1 1 1 0
Ball 1 0 1 0
Ball 1 1 0 0
Bat 1 1 1 1
Bat 1 1 1 0
Bat 1 1 0 1
Bat 1 0 1 1
Wicket 1 1 1 0
Wicket 1 0 1 0
Wicket 1 1 0 0
Any help would be appreciated
To have a more interesting example, with a source row containing 0
somewhere else than in the last column, I created df as:
Item Room1 Room2 Room3 Room4
0 Ball 1 1 1 0
1 Bat 1 1 1 1
2 Wicket 1 1 1 0
3 Xxxx 0 1 1 1
The first step is to define a function to process each row:
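import pandas as pd

def rowProc(row):
    n = 0      # number of output rows generated so far
    res = []
    # Visit only the columns holding a positive value in the source row
    for idx, val in row[row > 0].items():
        outRow = row.copy()
        if n > 0:
            outRow[idx] = 0   # every copy after the first zeroes one "1"
        res.append(outRow)
        n += 1
    return pd.DataFrame(res)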
An important detail is that the source row comes from a slightly
changed DataFrame, in which the Item column has been set as
the index. So the only processed columns are the remaining (Room...)
columns.
For the current row, the function generates a DataFrame in which:
there are as many rows as there are ones in the source row,
the first output row is an exact copy of the source row (as in
your expected result),
each further row has the next consecutive one set to 0.
Then run:
result = pd.concat(df.set_index('Item').apply(rowProc, axis=1).tolist())
result.index.name = 'Item'
result.reset_index(inplace=True)
The result is:
Item Room1 Room2 Room3 Room4
0 Ball 1 1 1 0
1 Ball 1 0 1 0
2 Ball 1 1 0 0
3 Bat 1 1 1 1
4 Bat 1 0 1 1
5 Bat 1 1 0 1
6 Bat 1 1 1 0
7 Wicket 1 1 1 0
8 Wicket 1 0 1 0
9 Wicket 1 1 0 0
10 Xxxx 0 1 1 1
11 Xxxx 0 1 0 1
12 Xxxx 0 1 1 0
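For anyone who wants to reproduce this end to end, here is a self-contained sketch under a few assumptions: the sample df is rebuilt by hand from the table above, the helper is renamed row_proc, and the per-row results are collected with a list comprehension over iterrows() rather than apply(...).tolist(), purely to keep the intermediate type obvious. It is meant as an illustration, not a tuned solution for 300 columns.

```python
import pandas as pd

# Sample data rebuilt from the table in the answer.
df = pd.DataFrame({
    "Item": ["Ball", "Bat", "Wicket", "Xxxx"],
    "Room1": [1, 1, 1, 0],
    "Room2": [1, 1, 1, 1],
    "Room3": [1, 1, 1, 1],
    "Room4": [0, 1, 0, 1],
})

def row_proc(row):
    """Expand one row: first copy unchanged, each later copy zeroes the next 1."""
    res = []
    for n, idx in enumerate(row[row > 0].index):
        out_row = row.copy()
        if n > 0:
            out_row[idx] = 0
        res.append(out_row)
    return pd.DataFrame(res)

# Item becomes the index so that only the room columns are processed.
rooms = df.set_index("Item")
result = pd.concat([row_proc(row) for _, row in rooms.iterrows()])
result = result.rename_axis("Item").reset_index()
print(result)
```

Each expanded frame inherits the item name as its index, which is why rename_axis followed by reset_index restores the Item column.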

Matrix Permutations with Constraint

I have the following matrix:
1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2
I'd like to randomly permute the columns, with the constraint that every four numbers in the second row should contain some form of
0 0 1 2
e.g. Columns 1:4, 5:8, 9:12, 13:16, 17:20, 21:24 in the example below each contain the numbers 0 0 1 2.
0 1 0 2 2 0 1 0 0 0 2 1 1 2 0 0 2 0 1 0 1 0 0 2
Every column in the permuted matrix should have a corresponding one in the first matrix. In other words, nothing should be altered within a column.
I can't seem to think of an intuitive solution to this - Is there another way of coming up with some form of the initial matrix that both satisfies the constraint and retains the integrity of the columns? Each column represents conditions in an experiment, which is why I'd like them to be balanced.
You can compute the permutations directly in the following manner: First, permute all columns with 0 in the second row among themselves, then all 1s among themselves, and finally all 2s among themselves. This ensures that, for example, any two 0 columns are equally likely to be the first two columns in the resulting permutation of A.
The second step is to permute all columns in blocks of 4: permute columns 1-4 randomly, permute columns 5-8 randomly, etc. Once you do this, you have a matrix that maintains the (0 0 1 2) pattern for every block of 4 columns, but each set of (0 0 1 2) is equally likely to be in any given block of 4, and the (0 0 1 2) are equally likely to be in any order.
A = [1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2];
%% Find the indices of the zeros and generate a random permutation with that size
zeroes = find(A(2,:)==0);
perm0 = zeroes(randperm(length(zeroes)));
%% Find the indices of the ones and generate a random permutation with that size
wons = find(A(2,:) == 1);
perm1 = wons(randperm(length(wons)));
%% NOTE: the spelling of `zeroes` and `wons` is to prevent overwriting
%% the MATLAB builtin functions `zeros` and `ones`
%% Find the indices of the twos and generate a random permutation with that size
twos = find(A(2,:) == 2);
perm2 = twos(randperm(length(twos)));
%% permute the zeros among themselves, the ones among themselves and the twos among themselves
A(:,zeroes) = A(:,perm0);
A(:,wons) = A(:,perm1);
A(:,twos) = A(:,perm2);
%% finally, permute each block of 4 columns, so that the (0 0 1 2) pattern is preserved, but each column still has an
%% equi-probable chance of being in any position
for i = 1:size(A,2)/4
    perm = randperm(4) + 4*i-4;
    A(:, 4*i-3:4*i) = A(:,perm);
end
Example result:
A =
Columns 1 through 15
1 1 2 2 2 2 1 1 2 2 1 2 2 1 2
0 0 2 1 0 2 0 1 0 2 1 0 1 2 0
0 1 2 2 2 0 1 1 1 1 2 0 0 2 0
Columns 16 through 24
2 1 1 1 1 1 2 2 1
0 2 0 0 1 0 0 1 2
1 1 2 2 0 0 2 1 0
I was able to generate 100000 constrained permutations of A in about 9.32 seconds running MATLAB 2016a, to give you an idea of how long this code takes. There are certainly ways to optimize the permutation selection so you don't have to make quite so many random draws, but I always prefer the simple, straightforward approach until it proves insufficient.
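The same two-step idea can be sketched in Python with NumPy, for readers who do not use MATLAB. This is an illustrative port, not the code that was timed above: the function name constrained_permutation and its parameters are assumptions, row is 0-based here, and the sketch relies on the input already having the required pattern in every block of 4 columns, as the matrix in the question does.

```python
import numpy as np

A = np.array([
    [1] * 12 + [2] * 12,
    [0, 0, 1, 2] * 6,
    ([0] * 4 + [1] * 4 + [2] * 4) * 2,
])

def constrained_permutation(A, row=1, block=4, rng=None):
    """Shuffle columns while keeping each block's multiset of values in `row`."""
    rng = np.random.default_rng() if rng is None else rng
    out = A.copy()
    # Step 1: shuffle the columns that share a value in the constrained row
    # among themselves, so their positions (and the pattern) are unchanged.
    for value in np.unique(A[row]):
        cols = np.flatnonzero(A[row] == value)
        out[:, cols] = A[:, rng.permutation(cols)]
    # Step 2: shuffle the columns inside each block of `block` columns.
    for start in range(0, out.shape[1], block):
        order = start + rng.permutation(block)
        out[:, start:start + block] = out[:, order]
    return out

B = constrained_permutation(A, rng=np.random.default_rng(0))
print(B)
```

Passing a seeded generator, as in the last lines, makes a run reproducible; leaving rng unset draws a fresh permutation each call.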
You could use a rejection method: keep trying random permutations, chosen equiprobably, until one satisfies the requirement. This guarantees that all valid permutations have the same probability of being picked.
A = [ 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
      0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
      0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 ]; % data matrix
required = [0 0 1 2]; % restriction
row = 2; % row to which the restriction applies
sorted_req = sort(required(:)); % sort required values
done = false; % initialize
while ~done
    result = A(:, randperm(size(A,2))); % random permutation of columns of A
    test = sort(reshape(result(row,:), numel(required), []), 1); % reshape row
    % into blocks, each block in a column; and sort each block
    done = all(all(bsxfun(@eq, test, sorted_req))); % test if valid
end
Here's an example result:
result =
2 1 1 1 1 2 1 2 1 2 2 1 2 2 1 2 2 2 1 1 1 2 1 2
2 0 0 1 2 1 0 0 0 1 0 2 2 0 1 0 1 2 0 0 2 0 1 0
2 1 2 2 1 2 2 0 1 1 1 2 1 1 0 0 0 0 0 0 0 2 1 2
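A rough Python/NumPy equivalent of the rejection method is sketched below. The function name rejection_permutation and its parameters are assumptions for illustration, and row is 0-based. For this particular matrix, about 1 in 840 uniformly random column orders satisfies the constraint (12^6 valid arrangements of the second row out of 24!/(12! 6! 6!) possible ones), so the loop typically runs several hundred to a few thousand times before it returns.

```python
import numpy as np

A = np.array([
    [1] * 12 + [2] * 12,
    [0, 0, 1, 2] * 6,
    ([0] * 4 + [1] * 4 + [2] * 4) * 2,
])

def rejection_permutation(A, required=(0, 0, 1, 2), row=1, rng=None):
    """Draw uniform column permutations until every block matches `required`."""
    rng = np.random.default_rng() if rng is None else rng
    target = np.sort(np.asarray(required))
    k = target.size
    while True:
        candidate = A[:, rng.permutation(A.shape[1])]
        # One block per row after the reshape; sort each block for comparison.
        blocks = np.sort(candidate[row].reshape(-1, k), axis=1)
        if (blocks == target).all():
            return candidate

result = rejection_permutation(A, rng=np.random.default_rng(1))
print(result)
```

Because every candidate is drawn uniformly and only tested, not repaired, each valid permutation is returned with equal probability.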

How to find all combinations of multiple 2D arrays (matrices), rotation allowed

I have 3 2D arrays (matrices) of 0s and 1s.
For each array, I rotate it 4 times clockwise and 4 times anticlockwise, then flip it and repeat those rotations. For each resulting orientation I repeat the same steps for the next array, and so on, trying to combine the arrays into a symmetric shape, a kind of Rubik's cube with 5 elements on each side. To join 2 arrays, a 1 of Array 1 must fit into a 0 of Array 2.
Following kind of structure-
These are my 3 arrays:
0 0 1 0 1
1 1 1 1 1
0 1 1 1 0
1 1 1 1 1
0 1 0 1 1
-------------
0 1 0 1 0
0 1 1 1 0
1 1 1 1 1
0 1 1 1 0
0 0 1 0 0
-------------
1 0 1 0 0
1 1 1 1 1
0 1 1 1 0
1 1 1 1 1
0 1 0 1 0
-------------
This problem evolved from my earlier question, "How to solve 5 * 5 Cube in efficient easy way".
Assume my rotate methods are as follows:
rotateLeft()
rotateRight()
flipSide()
for (firstArray) {
    element = single.rotateLeft();
    for (secondArray) {
        element2 = single.rotateLeft();
        if (element.combine(element2)) {
            for (thirdArray) {
            }
        }
    }
}
Currently I have a fixed set of 3 arrays, but how exactly can I solve this problem efficiently?
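One observation that may shrink the search: four left rotations and four right rotations visit the same four orientations, so together with the flip a square piece has at most 8 distinct orientations, and fewer if it is symmetric. The Python/NumPy sketch below only enumerates those orientations; the function name orientations is an assumption, and the rule for fitting two pieces together is deliberately left out because it depends on details not given here.

```python
import numpy as np

def orientations(piece):
    """Return the distinct rotations and mirror images of a square 0/1 array."""
    seen = []
    for base in (piece, np.fliplr(piece)):
        for quarter_turns in range(4):
            candidate = np.rot90(base, quarter_turns)
            # Skip orientations that coincide with one already collected.
            if not any(np.array_equal(candidate, s) for s in seen):
                seen.append(candidate)
    return seen

# First and second arrays from the question.
first = np.array([
    [0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 1],
])
second = np.array([
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
])

print(len(orientations(first)), len(orientations(second)))
```

Precomputing these lists once per piece means the nested loops iterate over at most 8 candidates per array instead of repeating rotations inside the loops.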

SAS, assigning the same numbers to specific observations

I want to assign the same id number to every four observations. For example, if I have the following data
age marital gender id
45 1 0 1
33 1 1 1
68 0 1 1
27 1 0 1
43 0 0 2
37 0 1 2
19 1 1 2
40 1 1 2
25 1 0 3
38 1 1 3
57 0 0 3
50 1 0 3
51 1 1 4
44 0 1 4
69 1 0 4
39 0 1 4
The last column id is something I want to produce.
Plus, the dataset has 500,000+ observations.
Thanks in advance.
Slightly more compact:
id = ceil(_n_/4);
Use the integer function and the built-in _n_ variable (which increments for each observation):
id = int( (_n_-1)/4 )+1;
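The grouping arithmetic is easy to check outside SAS. In the Python sketch below, n stands in for the 1-based _n_ counter and the function names are made up for illustration. It compares the ceiling form with a truncating form that subtracts 1 before dividing; the subtraction of 1 is what keeps observations 1 to 4 together and starts group 2 at observation 5.

```python
import math

def group_id_ceil(n, size=4):
    """Ceiling form: 1-based observation number n -> 1-based group id."""
    return math.ceil(n / size)

def group_id_trunc(n, size=4):
    """Truncating form: shift to 0-based, integer-divide, shift back."""
    return (n - 1) // size + 1

ids = [group_id_ceil(n) for n in range(1, 17)]
print(ids)
```

Both forms give the same id for every positive n, so the choice between them is a matter of taste.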
