I think this is the hardest problem I have had to crack to date, so hard that I had a hard time finding a good headline.
So we have a site where trucks come and buy, say, gravel, sand, or other building materials.
Sometimes they also unload demolition waste first.
I need to find out a couple of things:
how many trucks (and from which companies) came empty
if they came empty, what they bought from us
which companies are sending full trucks and which are sending empty trucks
a top 10 of the materials they will drive to us to buy even when coming empty to our facility
a list of all the order numbers they drove to us to fill, having come with empty trucks (I have distances linked to order numbers, so I can then estimate the value of our products)
The data I have available:
I have a full data set of when what customer buys what and / or pay to deliver.
E.G.:
I can see the parts I need to split the data into. I think it should be something like this:
find all unique licence plates
somehow map whether they bought materials within 30 minutes of offloading demolition waste (most trucks will come between 2 and 10 times per day)
Present all this data (on a normal day we have about 800 trucks = 2000 lines since they weigh in, weigh out, and then some buy something = 2 more weigh lines)
I can easily find unique licence plates per day (either by formula or with the Excel function Data / Remove Duplicates), but after that I have no clue where to start.
I think I need some sheets in between, where I somehow mark whether a material was bought by an "empty truck", and I need a counter for that... somehow...
Any help on how to get started is appreciated.
It seems like the best way to start is with a helper column (in the following examples, I have chosen "Column M") to flag whether the truck arrived empty.
In the helper column, you can use something similar to the following formula.
{=IF(ISBLANK(B2),0,IF(C2="In",0,IF(B2=$B$2:$B$13,IF($C$2:$C$13="In",IF($A$2:$A$13>(A2-TIME(0,30,0)),0,1),1),1)))}
This is an array formula, which means you have to press ctrl+shift+enter after pasting it in the cell. Then you can copy that cell down the column.
Just to explain: after the ISBLANK check for empty rows, the first IF statement knows the row is not an empty arrival if Column C is 'In'. The second IF statement creates an array and tests whether the same truck appears in other rows. The third IF statement checks whether that truck checked 'In' in the matching rows, and the fourth IF statement verifies whether that check-in was less than thirty minutes ago. You can adjust the window by editing the TIME(0,30,0) function. The format is TIME(hours,minutes,seconds). Unless the truck matches all three of the second, third and fourth IF statements, it is marked as coming empty.
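For anyone who finds the nested IFs hard to follow, the same rule can be sketched in Python. This is only an illustration under assumptions: each row is a (time, licence plate, direction) tuple standing in for Columns A, B and C, the rows are already sorted by time, and the function name flag_empty is made up for this example.

```python
from datetime import datetime, timedelta

def flag_empty(rows, window=timedelta(minutes=30)):
    """Return one 0/1 flag per row; 1 means the truck is treated as having arrived empty.

    rows: (time, plate, direction) tuples sorted by time (assumed layout).
    """
    last_in = {}  # plate -> time of that truck's most recent 'In' row
    flags = []
    for time, plate, direction in rows:
        if not plate:
            flags.append(0)          # blank row, nothing to flag
        elif direction == "In":
            last_in[plate] = time    # remember the check-in
            flags.append(0)          # an 'In' row is never an empty arrival
        else:
            seen = last_in.get(plate)
            recent_in = seen is not None and time - seen < window
            flags.append(0 if recent_in else 1)
    return flags

day = datetime(2020, 1, 6)
rows = [
    (day.replace(hour=9, minute=0), "AB123", "In"),
    (day.replace(hour=9, minute=10), "AB123", "Out"),
    (day.replace(hour=9, minute=20), "CD456", "Out"),
    (day.replace(hour=11, minute=0), "AB123", "Out"),
]
flags = flag_empty(rows)
```

Here the second row is not flagged because the same plate checked 'In' ten minutes earlier, while the third and fourth rows are flagged because no 'In' falls inside the 30-minute window.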
Once you have this helper column, just about all of your tasks are quite simple.
1a: How many trucks came empty? Sum Column M
1b: How many trucks from what company? Create a unique list of companies. Then create a COUNTIFS formula based on Column M = 1 and Column K = Company. For example, if C32 had Company B then the formula =COUNTIFS($M$2:$M$13,1,$K$2:$K$13,C32) would return 2
1c: How many times did a truck come empty? Similar to 1b, create a unique list of License Plates, then use a COUNTIFS based on Column M = 1 and Column B = License Plate.
2: Similar to 1b, just use a unique list of products tested against Column F
3: Similar to 1b, just create a second column next to the first that uses =COUNTIFS($M$2:$M$13,0,$K$2:$K$13,C53,$C$2:$C$13,"In"), which tests that Column M reports the truck did not come empty, that the row matches the company in Column K, and that the truck came 'In', so you don't double count the same truck when it goes 'Out'.
4: Just sort list created by number 2. You can highlight the range, right-click and select "Sort" > "Custom Sort", then select the column you want to sort on and largest to smallest.
5: There are a couple of different ways you could do this. The formula
{=TEXTJOIN(", ",TRUE,IF($M$2:$M$13=1,$J$2:$J$13,""))}
(again, entered as an array formula)
would create a comma-separated list of order numbers. An alternative, if you want a column of order numbers (this only works if they are actually numbers), is to paste the formula {=MAX(IF($M$2:$M$13=1,$J$2:$J$13,))} in the first row of the column (in my example, it's O2) and then {=MAX(IF($M$2:$M$13=1,IF($J$2:$J$13<O2,$J$2:$J$13,)))} in the row below (change the reference to O2 if you pasted it in a different spot). Again, note that both of these are array formulas. Then copy and paste the second formula down the column. When the order numbers of trucks that came in empty are exhausted, the formula will report 0.
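If the sheet ever gets exported, the tallies in 1b, 2, 4 and 5 can be sketched in a few lines of Python. This is a rough illustration, not a replacement for the formulas: the dictionary keys 'company', 'product', 'order' and 'empty' are assumed stand-ins for Columns K, F, J and M, and the function name summarize is made up.

```python
from collections import Counter

def summarize(rows):
    """Tally the rows flagged as empty arrivals.

    rows: dicts with the assumed keys 'company', 'product', 'order', 'empty'.
    """
    empty_rows = [r for r in rows if r["empty"] == 1]
    # 1b: empty arrivals per company
    by_company = Counter(r["company"] for r in empty_rows)
    # 2 and 4: products bought by empty trucks, ten most frequent first
    top_products = Counter(r["product"] for r in empty_rows).most_common(10)
    # 5: order numbers belonging to empty arrivals
    orders = [r["order"] for r in empty_rows]
    return by_company, top_products, orders

sample = [
    {"company": "Company A", "product": "Gravel", "order": 101, "empty": 1},
    {"company": "Company B", "product": "Sand", "order": 102, "empty": 0},
    {"company": "Company B", "product": "Gravel", "order": 103, "empty": 1},
    {"company": "Company B", "product": "Sand", "order": 104, "empty": 1},
]
by_company, top_products, orders = summarize(sample)
```

Counter.most_common(10) plays the role of the custom sort in step 4.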
So I play Heroes of Newerth. I want to make a statistical program that shows which team of 5 heroes wins the most against another team of 5 heroes. Given there are 85 heroes and a game is 85 choose 5 vs 80 choose 5, that's a lot of combinations.
Essentially I'm going to take the stats data the game servers allow me to get and put a 1 in an array, keyed by the heroes, when they get a win: [1,2,3,4,5][6,7,8,9,10][W:1][L:0]
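To put a number on "a lot", here is a small sketch of the count, assuming the two teams are unordered (swapping the two sides gives the same matchup), which is why the product is halved. The function name count_matchups is made up for this example.

```python
from math import comb

def count_matchups(num_heroes, team_size=5):
    """Number of distinct team-vs-team pairings with no hero on both sides."""
    first = comb(num_heroes, team_size)               # pick one team
    second = comb(num_heroes - team_size, team_size)  # pick its opponent from the rest
    return first * second // 2                        # A vs B is the same game as B vs A

small = count_matchups(10)
full = count_matchups(85)
```

With 10 heroes this gives 126 matchups; with 85 heroes the result is on the order of 10^14.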
So after I parse and build the array from the historical game data, I can put in what 5 heroes I want to see, and I can get back all the relevant game data telling me which 5 hero lineup has won/lost the most.
What I need help with to get started is a simple algorithm to write out my array. Here's output similar to what I need (I have simplified this to 1-10; in the code I get I can just change 10 to x for however many heroes there are):
[1,2,3,4,5][6,7,8,9,10]
[1,2,3,4,6][5,7,8,9,10]
[1,2,3,4,7][5,6,8,9,10]
[1,2,3,4,8][5,6,7,9,10]
[1,2,3,4,9][5,6,7,8,10]
[1,2,3,4,10][5,6,7,8,9]
[1,2,3,5,6][4,7,8,9,10]
[1,2,3,5,7][4,6,8,9,10]
[1,2,3,5,8][4,6,7,9,10]
[1,2,3,5,9][4,6,7,8,10]
[1,2,3,5,10][4,6,7,8,9]
[1,2,3,6,7][4,5,8,9,10]
[1,2,3,6,8][4,5,7,9,10]
[1,2,3,6,9][4,5,7,8,10]
[1,2,3,6,10][4,5,7,8,9]
[1,2,3,7,8][4,5,6,9,10]
[1,2,3,7,9][4,5,6,8,10]
[1,2,3,7,10][4,5,6,8,9]
[1,2,3,8,9][4,5,6,7,10]
[1,2,3,8,10][4,5,6,7,9]
[1,2,3,9,10][4,5,6,7,8]
[1,2,4,5,6][3,7,8,9,10]
[1,2,4,5,7][3,6,8,9,10]
[1,2,4,5,8][3,6,7,9,10]
[1,2,4,5,9][3,6,7,8,10]
[1,2,4,5,10][3,6,7,8,9]
[1,2,4,6,7][3,5,8,9,10]
[1,2,4,6,8]...
[1,2,4,6,9]
[1,2,4,6,10]
[1,2,4,7,8]
[1,2,4,7,9]
[1,2,4,7,10]
[1,2,4,8,9]
[1,2,4,8,10]
[1,2,4,9,10]
...
You get the idea. No repeats, and order doesn't matter. The list is essentially cut in half because the order of the two arrays doesn't matter either. I just need a list of all the combinations of teams that can be played against each other.
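A minimal sketch of how such a list could be generated in Python with itertools.combinations. The function name matchups is made up; the duplicate check assumes a matchup is kept only when the first team holds the smaller lowest hero id, which is one way to drop the mirrored half.

```python
from itertools import combinations

def matchups(num_heroes, team_size=5):
    """Yield (team_a, team_b) pairs of sorted tuples, each matchup exactly once."""
    heroes = range(1, num_heroes + 1)
    for team_a in combinations(heroes, team_size):
        rest = [h for h in heroes if h not in team_a]
        for team_b in combinations(rest, team_size):
            # combinations() yields sorted tuples, so index 0 is the lowest id.
            # Keeping only team_a[0] < team_b[0] skips the mirrored duplicate.
            if team_a[0] < team_b[0]:
                yield team_a, team_b

pairs = list(matchups(10))
```

For 10 heroes this produces the list above in the same order, starting with [1,2,3,4,5][6,7,8,9,10]. For 85 heroes the generator still works in principle, but the number of pairs is far too large to enumerate or store in practice.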
EDIT: additional thinking...
After quite a bit of thinking, I have come up with some ideas. Instead of writing out the entire array of [85*84*83*82*81][80*79*78*77*76] possible combinations of characters, which would have to grow with every new hero to stay relevant, I could parse the information as I read it from the server and build the array from there.
It would be much simpler to just create an element in the array when one is not found, i.e. when the combination has never been played before. Parsing the data would then be a single pass, building the array as it goes along. Yes, it might take a while, but the values that are created will be worth the wait. It can be done over time too, starting with a small test case of, say, 1000 games and working up to the number of matches that have been played. Another idea would be to start from the current point in time and build the database from there. There is no need to go back to the first games ever played, given the number of changes that have been made to heroes over that time frame; going back 2-3 months would give the data some foundation and reliability, and with each passing day it would only get more accurate.
Example parse and build of the array:
get match(x)
if length < 15/25, x++; //decide what match lengths we want; discard anything shorter than 15 for sure
if players != 10, x++; //skip the match because it didn't finish with 10 players
if map != normal_mm_map, x++; //rule out non-mm games and mid wars
if game is not mm, x++; //rule out custom games
//and so forth
match_psr = match(x).get(average_psr);
match_winner = match(x).get(winner);
//hero ids of winners
Wh1 = match(x).get(winner.player1(hero_id));
Wh2 = match(x).get(winner.player2(hero_id));
Wh3 = match(x).get(winner.player3(hero_id));
Wh4 = match(x).get(winner.player4(hero_id));
Wh5 = match(x).get(winner.player5(hero_id));
//hero ids of losers
Lh1 = match(x).get(loser.player1(hero_id));
Lh2 = match(x).get(loser.player2(hero_id));
Lh3 = match(x).get(loser.player3(hero_id));
Lh4 = match(x).get(loser.player4(hero_id));
Lh5 = match(x).get(loser.player5(hero_id));
//some sort of sorting algorithm to put Wh1-5 in order of hero id from smallest to largest
//some sort of sorting algorithm to put Lh1-5 in order of hero id from smallest to largest
if (array([Wh1, Wh2, Wh3, Wh4, Wh5],[Lh1, Lh2, Lh3, Lh4, Lh5],[],[],[],[],[],[],[],[],[]) != null)
    array([Wh1, Wh2, Wh3, Wh4, Wh5],[Lh1, Lh2, Lh3, Lh4, Lh5],[],[],[],[],[],[],[],[],[]) += array([],[],[1],[],[],[],[](something with psr),[],[],[],[])
else
    array.add_element([Wh1, Wh2, Wh3, Wh4, Wh5],[Lh1, Lh2, Lh3, Lh4, Lh5],[1],[],[],[],[](something with psr),[],[],[],[])
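The parse-and-build idea could be sketched in Python with a dictionary, so an entry only exists once a lineup pair has actually been played. Everything about the match record here is an assumption: the keys 'minutes', 'mode', 'winners' and 'losers' are hypothetical and not the real server format, and the PSR part is left out.

```python
def record_match(table, match, min_minutes=15):
    """Add one match to the win table; return True if it was counted.

    table maps (team_x, team_y) -> [wins for team_x, wins for team_y],
    where both teams are sorted tuples and team_x < team_y.
    """
    if match["minutes"] < min_minutes:
        return False  # too short to be meaningful
    if len(match["winners"]) != 5 or len(match["losers"]) != 5:
        return False  # did not finish with 10 players
    if match["mode"] != "mm":
        return False  # not a matchmaking game
    winners = tuple(sorted(match["winners"]))
    losers = tuple(sorted(match["losers"]))
    # Store each pair of lineups under one canonical key, whoever won.
    key = (winners, losers) if winners < losers else (losers, winners)
    entry = table.setdefault(key, [0, 0])  # created on first sight
    entry[0 if key[0] == winners else 1] += 1
    return True

table = {}
games = [
    {"minutes": 30, "mode": "mm", "winners": [5, 3, 1, 2, 4], "losers": [10, 9, 8, 7, 6]},
    {"minutes": 40, "mode": "mm", "winners": [6, 7, 8, 9, 10], "losers": [1, 2, 3, 4, 5]},
    {"minutes": 10, "mode": "mm", "winners": [1, 2, 3, 4, 5], "losers": [6, 7, 8, 9, 10]},
]
counted = [record_match(table, g) for g in games]
```

Sorting the hero ids before building the key is what makes [5,3,1,2,4] and [1,2,3,4,5] land in the same entry.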
Any thoughts?
Encode each actor in the game using a simple scheme 0 ... 84
You can maintain a 2D matrix of 85*85 actors in the game.
Initialize each entry in this array to zero.
Now use just the upper triangular portion of your matrix.
So, for any two players p1, p2 you have a unique entry in the array, say array[small(p1,p2)][big(p1,p2)].
array[p1][p2] (with p1 < p2) records how much p1 won against p2: positive if p1 won more often, negative if p2 did.
Your event loop can look like this:
For each stat like H=(H1,H2,H3,H4,H5) won against L=(L1,L2,L3,L4,L5) do
For each tuple in H*L (h,l) do
if h<l
increment array[h][l] by one
else
decrement array[l][h] by one
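The loop above could be sketched in Python as follows. The function name update_pairwise and the list-of-lists matrix are illustrative choices, and hero ids are assumed to be already encoded as 0 ... 84.

```python
NUM_HEROES = 85

def update_pairwise(score, winners, losers):
    """Apply one result to the upper-triangular matrix of net wins.

    score[a][b] with a < b is positive when a has beaten b more often
    than b has beaten a, and negative in the opposite case.
    """
    for h in winners:
        for l in losers:
            if h < l:
                score[h][l] += 1  # lower id won
            else:
                score[l][h] -= 1  # lower id lost

score = [[0] * NUM_HEROES for _ in range(NUM_HEROES)]
update_pairwise(score, winners=[0, 3], losers=[1, 2])
```

In this tiny example hero 0 beats heroes 1 and 2, so score[0][1] and score[0][2] go up, while hero 3 beating heroes 1 and 2 pushes score[1][3] and score[2][3] down.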
Now, at the end of this loop, you have aggregate information about how the players perform against each other. The next step is an interesting optimization problem.
Wrong approach: select 5 fields in this matrix such that no two fields share a row or column and the sum of their absolute values is maximal. I think you can find good optimization algorithms for this problem. However, it only yields five tuples (h1,l1), (h2,l2), (h3,l3) ... in which each h's record against its own l is maximized; it never checks whether l1 is good against h2.
The easier and correct option is to use brute force on the set of (85*84)C5 tuples.