Extract items from Column in a Dataframe - arrays

country_name  rank  show_title
Argentina     2     The Queen of Flow
India         1     Cobra Kai
Argentina     1     The Queen of Flow
England       3     Stay Close
Argentina     1     The Queen of Flow
I am trying to get a table that displays the number of times each show title is ranked first, second, or third. The result should look something like this:
Rank  Cobra Kai  The Queen of Flow  Stay Close
1     1          2                  0
2     0          1                  1
3     0          0                  0

You can use pivot_table like this.
df.pivot_table(index=['rank'], columns=['show_title'], aggfunc='count', fill_value=0)
Result:
           country_name
show_title    Cobra Kai Stay Close The Queen of Flow
rank
1                     1          0                 2
2                     0          0                 1
3                     0          1                 0
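As a cross-check, the same counts can probably be obtained with pd.crosstab, which tabulates co-occurrences directly and so avoids the extra country_name column level. This is only a minimal sketch: the DataFrame is rebuilt by hand from the sample rows in the question, and the variable name counts is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country_name": ["Argentina", "India", "Argentina", "England", "Argentina"],
    "rank": [2, 1, 1, 3, 1],
    "show_title": [
        "The Queen of Flow", "Cobra Kai", "The Queen of Flow",
        "Stay Close", "The Queen of Flow",
    ],
})

# Rows are ranks, columns are show titles, cells are occurrence counts.
# Combinations that never occur are filled with 0 automatically.
counts = pd.crosstab(df["rank"], df["show_title"])
print(counts)
```

For this sample, counts.loc[1, "The Queen of Flow"] is 2, matching the pivot_table result.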

Related

Repeat rows based on item count for each row and assign values for repeated rows

I have a DataFrame that lists items and the rooms in which each item is available:
Item Room1 Room2 Room3 Room4
Ball 1 1 1 0
Bat 1 1 1 1
Wicket 1 1 1 0
Now I want to repeat each row based on how many rooms the item appears in. For example, Ball has three 1s (Room1, Room2, Room3), so it needs 3 rows, with a 0 assigned in the repeated rows only within the Room1, Room2, and Room3 columns. Room4 is not considered for Ball and can be 0 in all Ball rows. There are 300 columns with different room names, for example Room1, room2, room3, room4, BlockArea1, Block2, etc. Below is the expected output:
Item Room1 Room2 Room3 Room4
Ball 1 1 1 0
Ball 1 0 1 0
Ball 1 1 0 0
Bat 1 1 1 1
Bat 1 1 1 0
Bat 1 1 0 1
Bat 1 0 1 1
Wicket 1 1 1 0
Wicket 1 0 1 0
Wicket 1 1 0 0
Any help would be appreciated
To have a more interesting example, with a source row containing 0
somewhere else than in the last column, I created df as:
Item Room1 Room2 Room3 Room4
0 Ball 1 1 1 0
1 Bat 1 1 1 1
2 Wicket 1 1 1 0
3 Xxxx 0 1 1 1
The first step is to define a function to process each row:
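import pandas as pd

def rowProc(row):
    n = 0      # number of ones processed so far
    res = []   # output rows generated for this source row
    for idx, val in row[row > 0].items():
        outRow = row.copy()
        if n > 0:
            # every row after the first gets the current one set to 0
            outRow[idx] = 0
        res.append(outRow)
        n += 1
    return pd.DataFrame(res)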
An important detail is that the source row comes from a slightly modified
DataFrame in which the Item column has been set as the index, so the only
columns processed are the remaining (Room...) ones.
For the current row, the function generates a DataFrame in which:
the number of rows equals the number of ones in the source row,
the first output row is an exact copy of the source row (as in
your expected result),
each further row has one of the remaining ones set to 0.
Then run:
result = pd.concat(df.set_index('Item').apply(rowProc, axis=1).tolist())
result.index.name = 'Item'
result.reset_index(inplace=True)
The result is:
Item Room1 Room2 Room3 Room4
0 Ball 1 1 1 0
1 Ball 1 0 1 0
2 Ball 1 1 0 0
3 Bat 1 1 1 1
4 Bat 1 0 1 1
5 Bat 1 1 0 1
6 Bat 1 1 1 0
7 Wicket 1 1 1 0
8 Wicket 1 0 1 0
9 Wicket 1 1 0 0
10 Xxxx 0 1 1 1
11 Xxxx 0 1 0 1
12 Xxxx 0 1 1 0
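For comparison, here is a minimal sketch of the same expansion written as a plain loop over row dictionaries instead of apply. The helper name expand_rows and its key parameter are hypothetical and not part of the answer above; the sketch assumes every column other than Item is a 0/1 room column.

```python
import pandas as pd

def expand_rows(df, key="Item"):
    """Repeat each row once per 1 it contains, zeroing one later 1 per copy."""
    room_cols = [c for c in df.columns if c != key]
    out = []
    for rec in df.to_dict("records"):
        present = [c for c in room_cols if rec[c] > 0]
        out.append(dict(rec))            # first row: unchanged copy
        for col in present[1:]:          # one extra row per remaining 1
            out.append({**rec, col: 0})
    return pd.DataFrame(out, columns=df.columns)

df = pd.DataFrame({
    "Item": ["Ball", "Bat", "Wicket", "Xxxx"],
    "Room1": [1, 1, 1, 0],
    "Room2": [1, 1, 1, 1],
    "Room3": [1, 1, 1, 1],
    "Room4": [0, 1, 0, 1],
})
result = expand_rows(df)
print(result)
```

On the four-row sample this produces the same 13 rows as the table above. With 300 room columns the loop does more Python-level work than a vectorised solution would, so treat it as an illustration of the logic rather than a tuned implementation.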

t-sql - select all combinations of groups of rows in single table

Ok, I have a table like this:
ThingID SubthingID ThingLevel
1 1 0
1 2 0
1 3 0
1 4 0
2 14 1
2 17 1
3 22 1
3 950 1
I need to select groups of subthings such that each group contains one subthing from a level 0 thing together with one subthing from each of the level 1 things, covering all such combinations. Note that each subthing belongs to its thing; they're not interchangeable, so there can't be, say, a combination of thing 2 with subthing 950. Also, things come in two levels, level 0 and level 1. A level 1 thing is always level 1, and a level 0 thing is always level 0. In practice this means that level 1 things can be combined with other level 1 or level 0 things, but level 0 things can only be combined with level 1 things.
So the output would look like:
GroupID ThingID SubthingID ThingLevel
1 1 1 0
1 2 14 1
1 3 22 1
2 1 2 0
2 2 14 1
2 3 22 1
3 1 3 0
3 2 14 1
3 3 22 1
. . . .
. . . .
. . . .
x 1 4 0
x 2 17 1
x 3 950 1
There are multiple level 0 things, each with one to many subthings. There are multiple level 1 things, each with one to many subthings.
Offhand it would seem like nested loops would be the answer:
For each level 0 thing begin
    for each level 1 thing subthing begin
        etc...
But that's obviously not going to handle a variable number of level 1 things.
Is there a way to do this with recursion?
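The grouping described above amounts to a Cartesian product: one subthing of a level 0 thing, crossed with one subthing from every level 1 thing. The sketch below illustrates that logic in Python rather than T-SQL, purely to pin down the expected output. The function name build_groups is hypothetical, and the sketch assumes that each group contains exactly one level 0 thing, which is one reading of the question.

```python
from collections import defaultdict
from itertools import product

# (ThingID, SubthingID, ThingLevel) rows copied from the question.
rows = [
    (1, 1, 0), (1, 2, 0), (1, 3, 0), (1, 4, 0),
    (2, 14, 1), (2, 17, 1),
    (3, 22, 1), (3, 950, 1),
]

def build_groups(rows):
    subthings = defaultdict(list)   # ThingID -> list of SubthingIDs
    level = {}                      # ThingID -> ThingLevel
    for thing_id, subthing_id, thing_level in rows:
        subthings[thing_id].append(subthing_id)
        level[thing_id] = thing_level

    level0 = sorted(t for t in subthings if level[t] == 0)
    level1 = sorted(t for t in subthings if level[t] == 1)

    # One list of (ThingID, SubthingID) choices per level 1 thing;
    # product() handles any number of level 1 things.
    choices = [[(t, s) for s in subthings[t]] for t in level1]

    out, group_id = [], 0
    for combo in product(*choices):
        for t0 in level0:
            for s0 in subthings[t0]:
                group_id += 1
                out.append((group_id, t0, s0, 0))
                out.extend((group_id, t, s, 1) for t, s in combo)
    return out

groups = build_groups(rows)
for row in groups[:6]:
    print(row)
```

For the sample table this yields 4 x 2 x 2 = 16 groups of three rows each. In SQL the same shape would need either dynamic SQL or a recursive query, because the number of level 1 things is not fixed.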

Comparisons across multiple rows in Stata (household dataset)

I'm working on a household dataset and my data looks like this:
input id id_family mother_id male
1 2 12 0
2 2 13 1
3 3 15 1
4 3 17 0
5 3 4 0
end
What I want to do is identify the mother in each family. A mother is a member of the family whose id is equal to one of the mother_id's of another family member. In the example above, for the family with id_family=3, individual 5 has mother_id=4, which makes individual 4 her mother.
I create a family size variable that tells me how many members there are per family. I also create a rank variable for each member within a family. For families of three, I then have the following piece of code that works:
bysort id_family: gen family_size=_N
bysort id_family: gen rank=_n
gen mother=.
bysort id_family: replace mother=1 if male==0 & rank==1 & family_size==3 & (id[_n]==id[_n+1] | id[_n]==id[_n+2])
bysort id_family: replace mother=1 if male==0 & rank==2 & family_size==3 & (id[_n]==id[_n-1] | id[_n]==id[_n+1])
bysort id_family: replace mother=1 if male==0 & rank==3 & family_size==3 & (id[_n]==id[_n-1] | id[_n]==id[_n-2])
What I get is:
id id_family mother_id male family_size rank mother
1 2 12 0 2 1 .
2 2 13 1 2 2 .
3 3 15 1 3 1 .
4 3 17 0 3 2 1
5 3 4 0 3 3 .
However, in my real data set, I have to get the mother for families of size 4 and higher (up to 9), which makes this procedure very inefficient (in the sense that there are too many row elements to compare "manually").
How would you obtain this in a cleaner way? Would you make use of permutations to index the rows? Or would you use a for-loop?
Here's an approach using merge.
// create sample data
clear
input id id_family mother_id male
1 2 12 0
2 2 13 1
3 3 15 1
4 3 17 0
5 3 4 0
end
save families, replace
clear
// do the job
use families
drop id male
rename mother_id id
sort id_family id
duplicates drop
list, clean abbreviate(10)
save mothers, replace
use families, clear
merge 1:1 id_family id using mothers, keep(master match)
generate byte is_mother = _merge==3
list, clean abbreviate(10)
The second list yields
id id_family mother_id male _merge is_mother
1. 1 2 12 0 master only (1) 0
2. 2 2 13 1 master only (1) 0
3. 3 3 15 1 master only (1) 0
4. 4 3 17 0 matched (3) 1
5. 5 3 4 0 master only (1) 0
where I retained _merge only for expositional purposes.
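For readers more familiar with pandas, the same merge idea can be sketched as follows. This is an illustration only, not a translation verified against Stata; the data frame is rebuilt from the sample input and the column name is_mother mirrors the Stata variable above.

```python
import pandas as pd

families = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "id_family": [2, 2, 3, 3, 3],
    "mother_id": [12, 13, 15, 17, 4],
    "male": [0, 1, 1, 0, 0],
})

# Distinct (family, mother_id) pairs, with mother_id renamed to id so that
# it can be matched against each member's own id.
mothers = (
    families[["id_family", "mother_id"]]
    .drop_duplicates()
    .rename(columns={"mother_id": "id"})
)

# A left merge keeps every family member; indicator=True adds a _merge
# column whose value is "both" when the member's id is someone's mother_id.
merged = families.merge(mothers, on=["id_family", "id"], how="left", indicator=True)
merged["is_mother"] = (merged["_merge"] == "both").astype(int)
print(merged)
```

Only individual 4 is flagged here, as in the Stata listing, and the approach does not depend on family size.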

Creating an array with data conditional on another matrix

I know there must be an apply function or ave for this, but I am not quite sure how to do it:
I have data:
date player market
1: 1-1 1 1
2: 1-1 2 1
3: 1-1 1 2
4: 1-2 2 1
5: 1-2 3 2
6: 1-3 21 1
7: 1-4 1 1
8: 1-4 51 1
9: 1-4 1 1
10: 1-5 1 2
I also have a blank array, which has unique dates on the rows, unique markets on the columns, and unique players for the third dimension.
1
[,,1]
1 2
1-1
1-2
1-3
1-4
1-5
2
[,,2]
1 2
1-1
1-2
1-3
1-4
1-5
etc
I want to fill out the array from the data.
I want each cell to be 1 if the player has an entry in the data for that date and market combination, and 0 if not. So for example, for players 1 and 2, the slices would be filled out as:
1
[,,1]
1 2
1-1 1 1
1-2 0 0
1-3 0 0
1-4 0 1
1-5 0 1
2
[,,2]
1 2
1-1 1 0
1-2 1 0
1-3 0 0
1-4 0 0
1-5 0 0
Looping is out of the question. Thank you for your help.
You can use xtabs for this purpose. In the example below, Temp plays the role of the dates, Month the markets, and Day the players.
data(airquality)
tab<-xtabs(~Temp+Month+Day,airquality)
> dim(tab)
[1] 40 5 31
> str(tab)
xtabs [1:40, 1:5, 1:31] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 3
..$ Temp : chr [1:40] "56" "57" "58" "59" ...
..$ Month: chr [1:5] "5" "6" "7" "8" ...
..$ Day : chr [1:31] "1" "2" "3" "4" ...
- attr(*, "class")= chr [1:2] "xtabs" "table"
- attr(*, "call")= language xtabs(formula = ~Temp + Month + Day, data = airquality)
Edit: converting to a data frame.
> head(as.data.frame(tab))
Temp Month Day Freq
1 56 5 1 0
2 57 5 1 0
3 58 5 1 0
4 59 5 1 0
5 61 5 1 0
6 62 5 1 0
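The same presence array can be sketched outside R as well. The following Python version uses pandas.factorize to turn dates, markets, and players into integer positions and then sets the matching cells with a single NumPy indexing assignment, so no explicit loop is needed. The data frame is rebuilt from the question's sample; unlike xtabs, which counts occurrences, this sketch stores only 0 or 1.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "date": ["1-1", "1-1", "1-1", "1-2", "1-2", "1-3", "1-4", "1-4", "1-4", "1-5"],
    "player": [1, 2, 1, 2, 3, 21, 1, 51, 1, 1],
    "market": [1, 1, 2, 1, 2, 1, 1, 1, 1, 2],
})

# Integer codes plus the sorted unique labels for each dimension.
d_idx, dates = pd.factorize(data["date"], sort=True)
m_idx, markets = pd.factorize(data["market"], sort=True)
p_idx, players = pd.factorize(data["player"], sort=True)

# Axes: date x market x player, initially all zeros.
arr = np.zeros((len(dates), len(markets), len(players)), dtype=int)

# Fancy indexing marks every (date, market, player) triple present in the data.
arr[d_idx, m_idx, p_idx] = 1

# Slice for player 2 (position 1 in the sorted player labels).
print(arr[:, :, 1])
```

The slice for player 2 has 1s only for dates 1-1 and 1-2 in market 1, in line with the example in the question.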

Actionscript 3.0 Cube Crash like game

I'm trying to build game like http://games.yahoo.com/game/bricks-breaking in actionscript 3 (flash builder).
I am able to create an array of bricks (that are visible on game start), but I have no idea how to find a group of bricks in the array.
Let's say we have an array like this:
1 2 2 1 3 3 1 1 1 1 1 1 1
1 2 1 1 1 3 1 1 1 1 1 1 1
1 2 1 1 1 3 1 1 1 1 1 1 3
1 1 2 1 1 3 3 3 1 1 1 1 3
1 1 1 2 1 3 1 3 3 1 1 1 3
1 1 1 3 3 3 1 3 3 1 1 1 3
1 1 1 1 1 1 1 3 3 1 1 1 1
When the user clicks a red brick (say red is 3 in the array), the array after removing the connected group of 3s will look like this:
1 2 2 0 0 0 0 0 0 1 1 1 1
1 2 1 1 0 0 1 0 0 1 1 1 1
1 2 1 1 1 0 1 0 0 1 1 1 3
1 1 2 1 1 0 1 0 1 1 1 1 3
1 1 1 1 1 0 1 1 1 1 1 1 3
1 1 1 2 1 0 1 1 1 1 1 1 3
1 1 1 1 1 1 1 1 1 1 1 1 1
Basically, I want to remove all the items that are in a group and are the same color.
Any suggestions on how to do that?
Is there any kind of algorithm that I should use?
Thanks for any advice.
A simple way to remove elements is to use a recursive function. It's not the only way (or even a good one) but it should be enough for this kind of game. Basically something like this:
function breakBricks(x:int, y:int, color:int):void {
    if(bricks[y][x] != color) return;
    bricks[y][x] = 0;
    breakBricks(x + 1, y, color);
    breakBricks(x, y + 1, color);
    breakBricks(x - 1, y, color);
    breakBricks(x, y - 1, color);
}
Begin with the position that the user clicked and the colour of that position. If the colour matches, the function sets that entry to 0; if not, it leaves the element alone. It recursively does this to all neighbouring elements. What is missing in this code are boundary checks, which you need to add.
In the next step you could iterate over each of the array's columns from bottom to top, keep a reference to the position of the first 0 element you find, and move any non-empty values you find after that to the lowest empty row position.
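The two steps described above can be sketched as follows. The sketch is in Python rather than ActionScript and is an illustration under stated assumptions, not the answer's own code: it uses an explicit stack instead of recursion, includes the boundary checks, refuses to start on an empty cell (value 0), and collapses each column by rebuilding it rather than tracking the first empty position. The function names break_bricks and apply_gravity are hypothetical.

```python
def break_bricks(bricks, x, y):
    """Clear the connected group of same-coloured bricks containing (x, y)."""
    color = bricks[y][x]
    if color == 0:                      # clicked an empty cell: nothing to do
        return 0
    rows, cols = len(bricks), len(bricks[0])
    stack = [(x, y)]
    removed = 0
    while stack:
        cx, cy = stack.pop()
        if not (0 <= cx < cols and 0 <= cy < rows):
            continue                    # boundary check
        if bricks[cy][cx] != color:
            continue                    # different colour or already cleared
        bricks[cy][cx] = 0
        removed += 1
        stack.extend([(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)])
    return removed

def apply_gravity(bricks):
    """Let the remaining bricks in each column fall to the bottom."""
    rows, cols = len(bricks), len(bricks[0])
    for x in range(cols):
        column = [bricks[y][x] for y in range(rows) if bricks[y][x] != 0]
        padded = [0] * (rows - len(column)) + column
        for y in range(rows):
            bricks[y][x] = padded[y]

grid = [
    [1, 2, 1],
    [3, 3, 1],
    [1, 3, 2],
]
removed = break_bricks(grid, 0, 1)      # click the 3 at column 0, row 1
apply_gravity(grid)
print(removed, grid)
```

An explicit stack avoids deep recursion on large groups; for a small board the recursive form shown above works just as well once the bounds are checked.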

Resources