I have the following code, which takes a sorted pandas DataFrame, groups it by one column, and creates two new columns: one built from the previous 4 rows plus the current row in the group, and one built from the next row in the group.
import pandas as pd
data_test = {'nr': [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6],
             'val': [11, 12, 13, 14, 15, 61, 62, 63, 64, 65, 66, 67]}
df_test = pd.DataFrame(data_test, columns=['nr', 'val'])
print(df_test)
This gives the following frame:
nr val
0 1 11
1 1 12
2 1 13
3 1 14
4 1 15
5 6 61
6 6 62
7 6 63
8 6 64
9 6 65
10 6 66
11 6 67
Now I have the following code, which groups by 'nr' and builds one column containing, for each row, the previous 4 values of 'val' in the group followed by the current value. Similarly, it builds one extra column containing, per row, the next value of 'val' in the group.
df_test['past4'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(4).fillna(0))
df_test['past3'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(3).fillna(0))
df_test['past2'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(2).fillna(0))
df_test['past1'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(1).fillna(0))
df_test['future'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(-1).fillna(0))
df_test['amounts'] = df_test[['past4', 'past3','past2','past1','val']].values.tolist()
df_test.drop(columns = ['past4', 'past3', 'past2', 'past1'], inplace = True)
df_test
nr val future amounts
0 1 11 12 [0, 0, 0, 0, 11]
1 1 12 13 [0, 0, 0, 11, 12]
2 1 13 14 [0, 0, 11, 12, 13]
3 1 14 15 [0, 11, 12, 13, 14]
4 1 15 0 [11, 12, 13, 14, 15]
5 6 61 62 [0, 0, 0, 0, 61]
6 6 62 63 [0, 0, 0, 61, 62]
7 6 63 64 [0, 0, 61, 62, 63]
8 6 64 65 [0, 61, 62, 63, 64]
9 6 65 66 [61, 62, 63, 64, 65]
10 6 66 67 [62, 63, 64, 65, 66]
11 6 67 0 [63, 64, 65, 66, 67]
I'm sure there is an easier way to build the single list column 'amounts', probably a one-liner. How can I do this?
Use a custom function to create the nested lists, like:
import numpy as np

def f(x):
    # list comprehension with shift by 4,3,2,1,0
    L = [x['val'].shift(i).fillna(0) for i in range(4, -1, -1)]
    # shifted values go to another column
    x['future'] = x['val'].shift(-1).fillna(0).astype(int)
    # column filled by lists
    x['amounts'] = pd.Series(np.array(L).astype(int).T.tolist(), index=x.index)
    return x

# group_keys=False keeps the original row index on pandas >= 2.0
df_test = df_test.groupby(['nr'], group_keys=False).apply(f)
print (df_test)
nr val future amounts
0 1 11 12 [0, 0, 0, 0, 11]
1 1 12 13 [0, 0, 0, 11, 12]
2 1 13 14 [0, 0, 11, 12, 13]
3 1 14 15 [0, 11, 12, 13, 14]
4 1 15 0 [11, 12, 13, 14, 15]
5 6 61 62 [0, 0, 0, 0, 61]
6 6 62 63 [0, 0, 0, 61, 62]
7 6 63 64 [0, 0, 61, 62, 63]
8 6 64 65 [0, 61, 62, 63, 64]
9 6 65 66 [61, 62, 63, 64, 65]
10 6 66 67 [62, 63, 64, 65, 66]
11 6 67 0 [63, 64, 65, 66, 67]
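A shorter variant of the same idea, as a minimal sketch: GroupBy.shift accepts a fill_value argument, so the shifted columns can be built without apply or fillna. The names g and shifted are illustrative only and do not come from the answer above.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({'nr': [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6],
                        'val': [11, 12, 13, 14, 15, 61, 62, 63, 64, 65, 66, 67]})

g = df_test.groupby('nr')['val']

# next value within each group, 0 where there is none
df_test['future'] = g.shift(-1, fill_value=0)

# shifts 4, 3, 2, 1, 0 within each group, 0 where the window runs off the start
shifted = [g.shift(i, fill_value=0) for i in range(4, -1, -1)]

# put the shifted columns side by side and turn each row into a list
df_test['amounts'] = np.column_stack(shifted).tolist()
```

The list comprehension is the only part that depends on the window length, so a different window only needs a different range.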
Moving your block into a function makes the code more modular and lighter.
In this specific example we pass reversed(range(5)) as shift_values, which yields the shifts 4, 3, 2, 1, 0.
import pandas as pd
data_test = {'nr':[1,1,1,1,1,6,6,6,6,6,6,6],
'val':[11,12,13,14,15,61,62,63,64,65,66,67]}
df_test = pd.DataFrame(data_test, columns = ['nr','val'])
def generate_past(df, shift_values):
    serie = pd.DataFrame([df.groupby('nr')['val'].transform(lambda x: x.shift(shift_value).fillna(0)) for shift_value in shift_values])
    return serie.T.values.tolist()
df_test['future'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(-1).fillna(0))
df_test['amounts'] = generate_past(df_test, reversed(range(5)))
You can try it like this (same idea as jezrael's answer) but without using apply. It is not a great approach, since it builds a new DataFrame.
frames = []
for _, grp in df_test.groupby('nr'):
    grp = grp.reset_index(drop=True)
    grp['future'] = grp['val'].shift(-1).fillna(0).astype(int)
    # row k keeps the last 5 values of the series shifted by len(grp)-1-k
    grp['amounts'] = pd.Series([grp['val'].shift(i).fillna(0).values[-5:] for i in range(len(grp)-1, -1, -1)])
    frames.append(grp)
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_new = pd.concat(frames, ignore_index=True)
df_new:
nr val future amounts
0 1 11 12 [0.0, 0.0, 0.0, 0.0, 11.0]
1 1 12 13 [0.0, 0.0, 0.0, 11.0, 12.0]
2 1 13 14 [0.0, 0.0, 11.0, 12.0, 13.0]
3 1 14 15 [0.0, 11.0, 12.0, 13.0, 14.0]
4 1 15 0 [11, 12, 13, 14, 15]
5 6 61 62 [0.0, 0.0, 0.0, 0.0, 61.0]
6 6 62 63 [0.0, 0.0, 0.0, 61.0, 62.0]
7 6 63 64 [0.0, 0.0, 61.0, 62.0, 63.0]
8 6 64 65 [0.0, 61.0, 62.0, 63.0, 64.0]
9 6 65 66 [61.0, 62.0, 63.0, 64.0, 65.0]
10 6 66 67 [62.0, 63.0, 64.0, 65.0, 66.0]
11 6 67 0 [63, 64, 65, 66, 67]
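The per-group windows can also be cut without any shift calls. The following is a sketch of a different technique, not the one used above: it zero-pads each group's values and takes NumPy sliding windows. The helper name past_windows and its width parameter are hypothetical, and sliding_window_view requires NumPy 1.20 or later.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def past_windows(values, width=5):
    """For each position, return the `width` values ending there, zero-padded at the start."""
    padded = np.concatenate([np.zeros(width - 1, dtype=values.dtype), values])
    return sliding_window_view(padded, width).tolist()

df_test = pd.DataFrame({'nr': [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6],
                        'val': [11, 12, 13, 14, 15, 61, 62, 63, 64, 65, 66, 67]})

# one Series of lists per group, indexed like the group's rows
parts = [pd.Series(past_windows(grp['val'].to_numpy()), index=grp.index)
         for _, grp in df_test.groupby('nr')]

# assignment aligns on the index, so the original row order is preserved
df_test['amounts'] = pd.concat(parts)
```

Because the values stay in an integer array throughout, the lists contain integers rather than the mixed floats shown in the output above.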
Imagine we have the following array of 3 arrays, covering the range 1 to 150:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ... 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60 ... 92, 93, 94, 95, 96, 97, 98, 99, 100, 107]
[71, 73, 84, 101, 102, 103, 104, 105, 106, 108 ... 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]
I want to build an array that records which of these arrays each of the values 1 to 150 is found in. The result must then be:
[1 1 1 ... 1 2 2 2 ... 2 3 2 3 2 ... 3 3 3 ... 3],
where the elements correspond to 1, 2, 3, ..., 150 in order. The resulting array thus gives the array membership of the values 1 to 150. The code must work for any number of arrays (not only 3).
You can use an array comprehension. Here is an example with three vectors containing the range 1:10:
A = [1, 3, 4, 5, 7]
B = [2, 8, 9]
C = [6, 10]
Now we can write a comprehension using in:
julia> [x in A ? 1 : x in B ? 2 : 3 for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
Perhaps also include a fallback error, in case the input is wrong:
julia> [x in A ? 1 : x in B ? 2 : x in C ? 3 : error("not found") for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
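The chained ternary above is written for exactly three vectors, while the question asks for any number of arrays. A minimal Python sketch of the same lookup for an arbitrary list of arrays follows; the function name membership and the choice of ValueError are assumptions made for illustration and are not part of the original answer.

```python
def membership(arrays, values):
    """Return, for each value, the 1-based number of the first array that contains it."""
    owner = {}
    for label, arr in enumerate(arrays, start=1):
        for v in arr:
            # setdefault keeps the first array that claimed v, like the chained ternary
            owner.setdefault(v, label)
    missing = [v for v in values if v not in owner]
    if missing:
        # fallback error in case the input does not cover every value
        raise ValueError(f"not found: {missing}")
    return [owner[v] for v in values]

A = [1, 3, 4, 5, 7]
B = [2, 8, 9]
C = [6, 10]
result = membership([A, B, C], range(1, 11))
```

Building the dictionary once makes each lookup constant-time, instead of scanning every array for every value.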
Trade memory for search in this case: make an array that records which array each value is in.
# example arrays
N = 100; A = rand(1:N, 30);
B = rand(1:N, 40);
C = rand(1:N, 35);
# record the array containing each value:
# A = 1, B = 2, C = 3; not found = 0
arrayin = zeros(Int32, max(maximum(A), maximum(B), maximum(C)));
arrayin[A] .= 1;
arrayin[B] .= 2;
arrayin[C] .= 3;
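The same lookup-table idea can be sketched in Python with NumPy for any number of arrays. The function name lookup_table is hypothetical, and the sketch assumes all values are positive integers no larger than n. As in the code above, a value that appears in several arrays ends up with the label of the last one, and 0 marks values found in none.

```python
import numpy as np

def lookup_table(arrays, n):
    """Return an array whose entry k-1 is the 1-based number of the array containing k."""
    # slot 0 is unused so that value k can index slot k directly; 0 means "not found"
    owner = np.zeros(n + 1, dtype=np.int32)
    for label, arr in enumerate(arrays, start=1):
        # integer array indexing writes the label at every position listed in arr
        owner[np.asarray(arr)] = label
    return owner[1:]

A = [1, 3, 4, 5, 7]
B = [2, 8, 9]
C = [6, 10]
arrayin = lookup_table([A, B, C], 10)
```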
I'm new to programming and have just started learning C. I was hoping to create a simple battleship game, but I ran into a problem while generating random numbers. I want to generate a random number that doesn't fall within a range around a previous number. Let's say I got 40 as my first random number; then I don't want 40, 41, 42, 43, or 44 to appear again. My code is below, and I can't figure out what's wrong with it.
srand(time(NULL));
for(counter = 0; counter < ship_num; counter++){
    checked = 0;
    while (!checked){
        ship_x[counter] = 1 + rand()%56;
        ship_y[counter] = 1 + rand()%20;
        checked = 1;
        for(check = 0; check < counter; check++){
            if(ship_y[counter] == ship_y[check]){
                int check_x = ship_x[check] + 5;
                if(ship_x[counter] >= ship_x[check] && ship_x[counter] <= check_x){
                    checked = 0;
                }
            }
        }
    }
}
All the variables are already defined, so I won't define them again here.
This is basically all the code used to generate the coordinates of the ships. The output looks like this:
0.Coordinates: 5, 47
1.Coordinates: 9, 10
2.Coordinates: 8, 55
3.Coordinates: 12, 51
4.Coordinates: 16, 48
5.Coordinates: 17, 32
6.Coordinates: 7, 24
7.Coordinates: 16, 35
8.Coordinates: 1, 7
9.Coordinates: 7, 36
10.Coordinates: 11, 54
11.Coordinates: 17, 29
12.Coordinates: 6, 24
13.Coordinates: 11, 8
14.Coordinates: 3, 5
15.Coordinates: 2, 5
16.Coordinates: 14, 21
17.Coordinates: 20, 24
18.Coordinates: 4, 18
19.Coordinates: 14, 19
20.Coordinates: 14, 1
21.Coordinates: 13, 48
22.Coordinates: 18, 43
23.Coordinates: 16, 25
24.Coordinates: 4, 30
25.Coordinates: 10, 3
26.Coordinates: 18, 17
27.Coordinates: 4, 56
28.Coordinates: 4, 9
29.Coordinates: 1, 4
30.Coordinates: 14, 29
31.Coordinates: 3, 27
32.Coordinates: 18, 56
33.Coordinates: 9, 9
34.Coordinates: 1, 24
35.Coordinates: 7, 42
36.Coordinates: 5, 3
37.Coordinates: 14, 27
38.Coordinates: 4, 50
39.Coordinates: 15, 8
40.Coordinates: 5, 36
41.Coordinates: 6, 37
42.Coordinates: 14, 44
43.Coordinates: 12, 21
44.Coordinates: 1, 49
45.Coordinates: 17, 41
46.Coordinates: 3, 24
47.Coordinates: 10, 2
48.Coordinates: 12, 13
49.Coordinates: 7, 32
50.Coordinates: 5, 11
51.Coordinates: 5, 10
52.Coordinates: 2, 36
53.Coordinates: 11, 29
54.Coordinates: 1, 45
55.Coordinates: 20, 40
56.Coordinates: 2, 52
57.Coordinates: 19, 28
58.Coordinates: 10, 34
59.Coordinates: 10, 31
60.Coordinates: 13, 18
61.Coordinates: 4, 39
62.Coordinates: 8, 33
63.Coordinates: 13, 26
64.Coordinates: 20, 10
65.Coordinates: 16, 18
66.Coordinates: 18, 35
67.Coordinates: 6, 13
68.Coordinates: 6, 34
69.Coordinates: 6, 30
70.Coordinates: 4, 49
71.Coordinates: 3, 14
72.Coordinates: 6, 8
73.Coordinates: 6, 19
74.Coordinates: 14, 11
75.Coordinates: 6, 55
76.Coordinates: 15, 36
77.Coordinates: 16, 15
78.Coordinates: 7, 31
79.Coordinates: 20, 3
As you can see above, coordinates 68 and 69 are the kind of output I didn't want to see: they are too close together.
Never mind, and thanks for all the comments. I think I just found out what's wrong with my code. I excluded the numbers that are 1 to 4 bigger than the ones already in my array, but I forgot to exclude the ones that are smaller, which is why they kept appearing. My new code solves it; here it is, and I hope it helps someone else who faces the same problem.
srand(time(NULL));
for(counter = 0; counter < ship_num; counter++){
    checked = 0;
    while (checked == 0){
        ship_x[counter] = 1 + rand()%56;
        ship_y[counter] = 1 + rand()%20;
        checked = 1;
        for(check = 0; check < counter; check++){
            if(ship_y[counter] == ship_y[check]){
                if(ship_x[counter] <= ship_x[check] + 5 && ship_x[counter] >= ship_x[check] - 5){
                    checked = 0;
                }
            }
        }
    }
}
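The two-sided range check is equivalent to comparing the absolute difference of the x coordinates against the gap. The sketch below restates the same rejection loop in Python rather than C, purely to show that equivalence; the function name place_ships and its parameters width, height, gap, and seed are hypothetical, with defaults taken from the constants 56, 20, and 5 in the code above.

```python
import random

def place_ships(ship_num, width=56, height=20, gap=5, seed=None):
    """Draw ship positions, rejecting any draw within `gap` columns of a ship on the same row."""
    rng = random.Random(seed)
    ships = []
    while len(ships) < ship_num:
        x = rng.randint(1, width)    # same range as 1 + rand() % 56
        y = rng.randint(1, height)   # same range as 1 + rand() % 20
        # abs() covers both sides: x - gap <= other_x <= x + gap is a clash
        if all(sy != y or abs(sx - x) > gap for sx, sy in ships):
            ships.append((x, y))
    return ships

ships = place_ships(80, seed=1)
```

Like the C loop, this keeps redrawing until a free spot is found, so it only terminates if the board has room for the requested number of ships.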
I want to do something like this, where df.index matches the rows of 2dim_arr exactly:
df['newcol']=2dim_arr[df.index][df.existingcol.values]
I can get at the values I want if I do this:
for i in range(len(df)):
print(2dim_arr[i][df.iloc[i].existingcol])
Thanks in advance for assistance.
You are basically using the values from existingcol as column indices and going through each row of the 2D array to select one element per row from it. Thus, we can use NumPy's integer array indexing to build the desired new column -
col_idx = df.existingcol.values
df['newcol'] = dim2_arr[np.arange(len(dim2_arr)), col_idx]
Sample run -
1) Inputs :
In [311]: df
Out[311]:
existingcol
0 2
1 0
2 0
3 1
4 0
5 2
6 1
7 4
8 3
9 3
In [313]: dim2_arr
Out[313]:
array([[25, 75, 70, 45, 67],
[21, 85, 74, 68, 61],
[79, 33, 22, 77, 25],
[69, 31, 67, 11, 45],
[50, 12, 35, 55, 89],
[62, 59, 86, 55, 58],
[67, 41, 77, 88, 79],
[64, 30, 36, 25, 21],
[24, 73, 68, 84, 79],
[50, 53, 55, 71, 84]])
2) Use the proposed code :
In [314]: col_idx = df.existingcol.values
In [317]: df['newcol'] = dim2_arr[np.arange(len(dim2_arr)), col_idx]
In [318]: df
Out[318]:
existingcol newcol
0 2 70
1 0 21
2 0 79
3 1 31
4 0 50
5 2 86
6 1 41
7 4 21
8 3 84
9 3 71
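As a self-contained check, the sketch below uses the first three rows of the sample data above and compares the vectorized lookup with the explicit loop from the question. The names rows, cols, and looped are illustrative only.

```python
import numpy as np
import pandas as pd

dim2_arr = np.array([[25, 75, 70, 45, 67],
                     [21, 85, 74, 68, 61],
                     [79, 33, 22, 77, 25]])
df = pd.DataFrame({'existingcol': [2, 0, 0]})

rows = np.arange(len(dim2_arr))        # 0, 1, 2: one row index per row
cols = df['existingcol'].to_numpy()    # column to pick in each row

# pairs (rows[i], cols[i]) select exactly one element per row
df['newcol'] = dim2_arr[rows, cols]

# the loop from the question, for comparison
looped = [dim2_arr[i][df.iloc[i].existingcol] for i in range(len(df))]
```

Both approaches pick the same elements; the indexed version does it in a single NumPy operation instead of one Python-level lookup per row.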
In MATLAB, I have two m-by-n matrices, A and B. I want to make a set of n m-by-2 matrices such that, in the ith matrix of the set, the first column is the ith column of A and the second column is the ith column of B.
How do I extract and concatenate the ith columns from both matrices?
How can I store these n matrices? Using loops? (Memory?)
Example:
Input:
A = [ 1, 2, 3; 4, 5 ,6; 7, 8, 9] (3x3 matrix)
B = [ 11, 22, 33; 44, 55 ,66; 77, 88, 99] (3x3 matrix)
Output:
For i=1:3
C1 = [1, 11; 4, 44; 7, 77]
C2 = [2, 22; 5, 55; 8, 88]
C3 = [3, 33; 6, 66; 9, 99]
The first thing I'm going to do is change your variable names. Mainly this is just to make referring to the variables easier, especially as m and n change. Instead of writing
C1(:,:)
C2(:,:)
...
Cn(:,:)
I'm going to write
C(:,:,1)
C(:,:,2)
...
C(:,:,n)
All I've done is moved the index from the variable name to the index of the 3rd dimension.
Now, to create the C array:
A = [ 1, 2, 3; 4, 5 ,6; 7, 8, 9]
B = [ 11, 22, 33; 44, 55 ,66; 77, 88, 99]
[m,n]=size(A)
C = reshape([A',B']', m, 2, n)
The output of this is:
A =
1 2 3
4 5 6
7 8 9
B =
11 22 33
44 55 66
77 88 99
m = 3
n = 3
C =
ans(:,:,1) =
1 11
4 44
7 77
ans(:,:,2) =
2 22
5 55
8 88
ans(:,:,3) =
3 33
6 66
9 99
As you can see, C(:,:,1) is equal to C1 in your example, C(:,:,2) = C2 and so on. And this extends without change as the sizes of A and B change. You never have to come up with new variable names. And all you have to do to know how many m-by-2 matrices you've got is
numVars = size(C,3);
Note: This uses the same technique found in the answer here: matlab - how to merge/interlace 2 matrices?
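For readers working outside MATLAB, the same 3-D layout can be sketched with NumPy in Python. This is an analogue of the idea, not a translation of the reshape call: np.stack with axis=1 places A and B side by side along a new middle axis. Indices are 0-based here, so C[:, :, 0] plays the role of C(:,:,1).

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[11, 22, 33], [44, 55, 66], [77, 88, 99]])

# new axis 1 has length 2: position 0 holds A, position 1 holds B
C = np.stack([A, B], axis=1)

m, two, n = C.shape       # m rows, 2 columns, n slices
first = C[:, :, 0]        # column 0 of A next to column 0 of B
num_vars = C.shape[2]     # number of m-by-2 matrices
```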