I am trying to implement Drools Planner for allocating timetables. At the moment my proficiency in Java and the JavaBean design pattern is low, and I need something simple to practice on.
Is there an AI optimization problem that
- is known to be solved very well with algorithm 'X',
- has a data model that lends itself to being expressed simply with the JavaBean design pattern, and
- uses the fewest extra features (like planning entity difficulty)?
Such a problem would be a good way to cut my teeth on Drools Planner. I am trying the N-Queens problem right now, which seems the simplest of these, so I am looking for something in the same league.
Update: See CloudBalancingHelloWorld.java in optaplanner-examples (Drools Planner is renamed to OptaPlanner).
You could also try implementing the ITC2007 curriculum course scheduling problem yourself and then compare it with the source code of the example in Drools Planner.
If you want to keep it simple but get decent results too, follow this recipe and go for First Fit followed by Tabu Search.
Another good idea is to join the ITC2011 scheduling competition: it's still open until 1-MAY-2012 and very similar to the curriculum course scheduling example.
I am trying 2x2 Sudoku (generating and solving) as something simple. You can model it on the N-Queens code. While 2x2 Sudokus are solved easily, 3x3 Sudokus may get stuck, so you can implement swap moves.
Another interesting problem would be bucket sums: given 10 buckets, each able to hold 5 numbers, and 50 numbers, write a program that allocates the numbers so that the sums of the numbers in the buckets are more or less even.
Bucket Bucket0 3 6 19 16 11 =55
Bucket Bucket1 8 2 5 25 15 =55
...
Bucket Bucket7 3 25 4 16 8 =56
Bucket Bucket8 12 20 12 9 2 =55
Bucket Bucket9 4 9 11 12 20 =56
This has practical implications, such as evenly distributing tasks of varying toughness throughout the week.
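This is not a Planner model, but a quick hedged baseline is easy to write and gives you something to compare a score-based solver against. The sketch below (class and variable names are made up, and random numbers stand in for the real input) uses a simple greedy rule: sort the numbers in descending order and always drop the next one into the non-full bucket with the smallest running sum.

    import java.util.*;

    // Hypothetical greedy baseline (not a Drools Planner model): sort the numbers
    // descending and always put the next number into the non-full bucket whose
    // running sum is currently the smallest.
    public class BucketSumsGreedy {
        public static void main(String[] args) {
            final int BUCKETS = 10, CAPACITY = 5;

            // 50 random numbers stand in for the real input.
            Random rnd = new Random(0);
            List<Integer> numbers = new ArrayList<>();
            for (int i = 0; i < BUCKETS * CAPACITY; i++) {
                numbers.add(1 + rnd.nextInt(25));
            }
            numbers.sort(Collections.reverseOrder());

            int[] sums = new int[BUCKETS];
            List<List<Integer>> buckets = new ArrayList<>();
            for (int b = 0; b < BUCKETS; b++) {
                buckets.add(new ArrayList<>());
            }

            for (int n : numbers) {
                int best = -1;
                for (int b = 0; b < BUCKETS; b++) {
                    if (buckets.get(b).size() < CAPACITY && (best < 0 || sums[b] < sums[best])) {
                        best = b;
                    }
                }
                buckets.get(best).add(n);
                sums[best] += n;
            }

            for (int b = 0; b < BUCKETS; b++) {
                System.out.println("Bucket" + b + " " + buckets.get(b) + " = " + sums[b]);
            }
        }
    }

A Planner model with swap moves should match or beat the evenness this greedy pass achieves.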
A collection of some problems: http://eclipseclp.org/examples/index.html
Related
I'm trying to figure out the best method/program to handle this computation to get the most people happy, i.e. the highest value for each person while still having all values be almost equal.
There are 24 people, 100 days, and 4 people need to be selected for each day. All days must be full, i.e. the 24 people must be spread over the 400 slots, with each person getting about the same number of slots (400 / 24 ≈ 17 each).
How can I create a program/algorithm that lets each person rank all 100 days in order of preference, as well as the top 5 people they would prefer to be selected with? I was thinking that each day and each of the preferred people would get some sort of point value. Then the algorithm would run through the data set and find the combination that makes the most people the happiest while still keeping everyone roughly even.
Is this easily possible using something like Excel?
Thanks
Read up on "The Assignment Problem"; this is a well-studied class of problems. Off the top of my head, the Hungarian algorithm and the stable marriage/stable roommates methods might be relevant.
You can solve this problem as a MILP using Solver, as shown in this video and many others like it, but I am afraid that the built-in Solver may not allow enough binary variables for it to work. Get a feeling for how the problem works on a small scale and then download a better solver.
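Before wiring up a full MILP, it can help to prototype the point-value idea from the question with a plain greedy pass. The sketch below is only a hedged baseline: the rankings are random stand-ins for the real preference data, the "top 5 preferred people" part is ignored, and each day simply gets the four people with the fewest slots so far, breaking ties by how highly they ranked that day.

    import java.util.*;

    // Hypothetical greedy baseline: each person ranks the 100 days; a day ranked
    // r-th (0-based) is worth (DAYS - r) points. Each day gets the 4 people with
    // the fewest slots so far, ties broken by their preference for that day, which
    // keeps the 400 slots spread roughly evenly (16 or 17 per person).
    public class DayAssignmentGreedy {
        static final int PEOPLE = 24, DAYS = 100, PER_DAY = 4;

        public static void main(String[] args) {
            // Random rankings stand in for the real preference data.
            int[][] points = new int[PEOPLE][DAYS];
            Random rnd = new Random(42);
            for (int p = 0; p < PEOPLE; p++) {
                List<Integer> ranking = new ArrayList<>();
                for (int d = 0; d < DAYS; d++) ranking.add(d);
                Collections.shuffle(ranking, rnd);
                for (int r = 0; r < DAYS; r++) points[p][ranking.get(r)] = DAYS - r;
            }

            int[] slots = new int[PEOPLE];          // how many days each person has so far
            for (int d = 0; d < DAYS; d++) {
                final int day = d;
                List<Integer> people = new ArrayList<>();
                for (int p = 0; p < PEOPLE; p++) people.add(p);
                // Fewest slots so far first, then highest preference for this day.
                people.sort(Comparator.<Integer>comparingInt(p -> slots[p])
                                      .thenComparingInt(p -> -points[p][day]));
                List<Integer> chosen = people.subList(0, PER_DAY);
                for (int p : chosen) slots[p]++;
                System.out.println("Day " + (day + 1) + ": people " + chosen);
            }
        }
    }

A MILP or a metaheuristic solver should beat this, but it gives you a baseline score and a feel for the objective function before you commit to a solver.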
Can anyone tell me how k-means clustering works for text mining? I am using cosine similarity as the distance metric.
nim       310910022  320910044  310910043  310910021
access            0          2          3          5
abdi              1          0          0          0
actual            5          0          0          1
arrow             0          1          1          2
This data is in a ListView. How can I do that in VB.NET, to get the clusters and the trending topic terms? Thanks in advance.
First I would separate the problem into two parts:
1. Computing the k-means clustering assignments
2. Getting the data from the GUI (you mention the data is in a ListView)
I assume 2 is straightforward and you just need help on 1.
I would start by rewriting the code to just read a TSV text file of the data as you specified. This will make things a lot easier to debug.
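As a language-neutral illustration of that step (the question uses VB.NET; this sketch is in Java, and the file name terms.tsv is hypothetical), reading a tab-separated dump of the table from the question could look like this:

    import java.io.*;
    import java.util.*;

    // Reads a hypothetical terms.tsv: first column = term, remaining columns = counts
    // per document, with a header row like the "nim ..." row in the question.
    public class TsvReader {
        public static void main(String[] args) throws IOException {
            List<String> terms = new ArrayList<>();
            List<double[]> vectors = new ArrayList<>();
            try (BufferedReader in = new BufferedReader(new FileReader("terms.tsv"))) {
                in.readLine();                                   // skip the header row
                String line;
                while ((line = in.readLine()) != null) {
                    String[] parts = line.split("\t");
                    terms.add(parts[0]);
                    double[] counts = new double[parts.length - 1];
                    for (int i = 1; i < parts.length; i++) {
                        counts[i - 1] = Double.parseDouble(parts[i]);
                    }
                    vectors.add(counts);
                }
            }
            System.out.println(terms.size() + " term vectors read: " + terms);
        }
    }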
Then decide whether you want to implement the k-means algorithm yourself or use a library.
If you want to implement it yourself, here is a link to pseudocode:
http://www.scribd.com/doc/89373376/K-Means-Pseudocode
You can also search for other k-means pseudocode.
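If you go that route, the core loop is short. Below is a hedged sketch (in Java rather than VB.NET, but the structure carries over directly) that clusters the four term vectors from the question's table, using 1 minus cosine similarity as the distance and recomputing each centroid as the mean of its members; the class and method names are just for illustration.

    import java.util.*;

    // Minimal k-means sketch: rows are term vectors, distance is 1 - cosine
    // similarity, centroids are recomputed as the mean of their assigned vectors.
    public class KMeansSketch {

        static double cosine(double[] a, double[] b) {
            double dot = 0, na = 0, nb = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                na += a[i] * a[i];
                nb += b[i] * b[i];
            }
            return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
        }

        static int[] kmeans(double[][] data, int k, int iterations, long seed) {
            Random rnd = new Random(seed);
            double[][] centroids = new double[k][];
            for (int c = 0; c < k; c++) centroids[c] = data[rnd.nextInt(data.length)].clone();
            int[] assign = new int[data.length];
            for (int it = 0; it < iterations; it++) {
                // Assignment step: nearest centroid by cosine distance.
                for (int i = 0; i < data.length; i++) {
                    int best = 0;
                    double bestDist = 1 - cosine(data[i], centroids[0]);
                    for (int c = 1; c < k; c++) {
                        double dist = 1 - cosine(data[i], centroids[c]);
                        if (dist < bestDist) { bestDist = dist; best = c; }
                    }
                    assign[i] = best;
                }
                // Update step: centroid = mean of its members.
                for (int c = 0; c < k; c++) {
                    double[] mean = new double[data[0].length];
                    int count = 0;
                    for (int i = 0; i < data.length; i++) {
                        if (assign[i] != c) continue;
                        count++;
                        for (int j = 0; j < mean.length; j++) mean[j] += data[i][j];
                    }
                    if (count == 0) continue;               // keep old centroid for empty clusters
                    for (int j = 0; j < mean.length; j++) mean[j] /= count;
                    centroids[c] = mean;
                }
            }
            return assign;
        }

        public static void main(String[] args) {
            // The term-document counts from the question (access, abdi, actual, arrow).
            double[][] data = {
                {0, 2, 3, 5},
                {1, 0, 0, 0},
                {5, 0, 0, 1},
                {0, 1, 1, 2}
            };
            // Prints the cluster index of each term; the result depends on the seed.
            System.out.println(Arrays.toString(kmeans(data, 2, 10, 1)));
        }
    }

For real data you would read the vectors from your ListView or a file instead of hard-coding them.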
If you want to use a library to just "run" your data through a k-means implementation, here is one example in Python/SciPy:
http://glowingpython.blogspot.com/2012/04/k-means-clustering-with-scipy.html
Regardless of which approach you use, realize that k-means is non-deterministic and you may get different answers every time you run it. I would recommend computing against a known validation set to see if the results are roughly what you think they should be.
I've been asked to research some programming related to the "Taguchi Method", especially as it relates to multivariate testing. This is one of the first subjects I've tried to research for which I've found zero, nada, zilch code examples, especially considering its mathematical basis.
I've found some books describing the math involved but it looks like I'm going to be doing some math brush up unless I can find some code examples I can relate to.
Is this one of those rare things that once you work out the programming, it's so valuable that no one shares? Or do I just fail at Taguchi + google?
Taguchi designs are the same thing as covering arrays. The basic idea is that if you have F data "fields" and each one can have N different values, it is possible to construct N^F different test cases. A covering array is basically a set of test cases that together cover all possible pairwise combinations of two field values, and the idea is to generate as small a set as possible. E.g. if F=3 and N=3, you have 27 possible test cases, but it is enough to have nine test cases if you aim for pairwise coverage:
Field A | Field B | Field C
---------------------------
   1         1         1
   1         2         2
   1         3         3
   2         1         2
   2         2         3
   2         3         1
   3         1         3
   3         2         1
   3         3         2
In this table, you can choose any two fields and any two values and you can always find a row that contains the chosen values for the chosen fields.
Generating Taguchi designs in general is a difficult combinatorial problem.
You can generate Taguchi designs by various methods:
Branch and bound
Stochastic search (e.g. tabu search or simulated annealing)
Greedy search
Specific mathematical constructions for some specific structures
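As an illustration of the last item, here is a hedged sketch of one such specific construction, valid only when every field has the same prime number N of values and there are at most N + 1 fields: fill the first two fields with all N^2 value pairs (i, j) and set field f to (i + (f - 1) * j) mod N. The nine-row table above is exactly this construction for N = 3, printed 1-based.

    // Pairwise covering array via a specific construction: requires n prime and
    // fields <= n + 1. The sanity check at the end verifies pairwise coverage.
    public class PairwiseCoveringArray {

        static int[][] build(int n, int fields) {
            int[][] rows = new int[n * n][fields];
            int r = 0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    rows[r][0] = i;
                    rows[r][1] = j;
                    for (int f = 2; f < fields; f++) rows[r][f] = (i + (f - 1) * j) % n;
                    r++;
                }
            }
            return rows;
        }

        public static void main(String[] args) {
            int n = 3, fields = 3;
            int[][] array = build(n, fields);
            for (int[] row : array) {
                StringBuilder sb = new StringBuilder();
                for (int v : row) sb.append(v + 1).append(' ');   // print 1-based like the table
                System.out.println(sb.toString().trim());
            }
            // Sanity check: every pair of fields covers all n * n value combinations.
            for (int a = 0; a < fields; a++) {
                for (int b = a + 1; b < fields; b++) {
                    boolean[][] seen = new boolean[n][n];
                    for (int[] row : array) seen[row[a]][row[b]] = true;
                    for (int i = 0; i < n; i++)
                        for (int j = 0; j < n; j++)
                            if (!seen[i][j]) throw new AssertionError("pair not covered");
                }
            }
            System.out.println("All pairwise combinations covered.");
        }
    }

For field counts or value ranges that do not fit this special case, you fall back to one of the search methods listed above.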
I came across an algorithmic problem: find the number of inversion pairs in an array in O(n log n) time. I got the solution to this, but my question is: what are the real-life applications of this problem? I would like to know some applications where we need to know the inversion pairs.
One example is the fifteen puzzle. If you want to randomly shuffle a grid of numbers, can you tell at a glance if
1 14 5 _
7 3 2 12
6 9 13 15
4 10 8 11
can be solved by sliding moves or not? The parity of the permutation will tell you that it is not.
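Here is a hedged sketch tying this back to the O(n log n) inversion count from the question: read the tiles row by row (skipping the blank), count inversions with a merge sort, and apply the usual 4x4 rule that the position is solvable exactly when the inversion count plus the blank's row number, counted from the bottom, is odd.

    import java.util.Arrays;

    // Parity check for the grid above: 38 inversions + blank on row 4 from the
    // bottom = 42 (even), so the position is not solvable.
    public class FifteenPuzzleParity {

        // Counts inversions while merge-sorting the array in place.
        static long countInversions(int[] a) {
            if (a.length < 2) return 0;
            int mid = a.length / 2;
            int[] left = Arrays.copyOfRange(a, 0, mid);
            int[] right = Arrays.copyOfRange(a, mid, a.length);
            long count = countInversions(left) + countInversions(right);
            int i = 0, j = 0, k = 0;
            while (i < left.length && j < right.length) {
                if (left[i] <= right[j]) a[k++] = left[i++];
                else { a[k++] = right[j++]; count += left.length - i; }  // left[i..] all invert with right[j]
            }
            while (i < left.length) a[k++] = left[i++];
            while (j < right.length) a[k++] = right[j++];
            return count;
        }

        public static void main(String[] args) {
            int[][] grid = {            // 0 marks the blank
                {1, 14, 5, 0},
                {7, 3, 2, 12},
                {6, 9, 13, 15},
                {4, 10, 8, 11}
            };
            int[] tiles = new int[15];
            int t = 0, blankRowFromBottom = 0;
            for (int r = 0; r < 4; r++) {
                for (int c = 0; c < 4; c++) {
                    if (grid[r][c] == 0) blankRowFromBottom = 4 - r;
                    else tiles[t++] = grid[r][c];
                }
            }
            long inversions = countInversions(tiles);
            boolean solvable = (inversions + blankRowFromBottom) % 2 == 1;
            System.out.println(inversions + " inversions, solvable = " + solvable);   // 38, false
        }
    }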
Here is a use of inversion counting in real life: suppose you want to know how similar two lists are, based on ranking. On a movie site, two users' wish lists of movies can be compared, and the ones that are similar are shown to users who have the same taste. The same logic applies to shopping lists on a shopping website, for recommending items based on a user's activity.
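A small sketch of that idea (the movie titles are made up): translate one user's ranking into the positions the other user gave the same items, then count inversions of that sequence. Zero inversions means identical taste, and n(n-1)/2 means opposite taste; this is the Kendall tau distance.

    import java.util.*;

    // Compares two rankings of the same items by counting inversions of one
    // ranking expressed in the other ranking's positions.
    public class RankingSimilarity {

        static int inversions(int[] a) {          // simple O(n^2) count, fine for short lists
            int count = 0;
            for (int i = 0; i < a.length; i++)
                for (int j = i + 1; j < a.length; j++)
                    if (a[i] > a[j]) count++;
            return count;
        }

        public static void main(String[] args) {
            List<String> userA = Arrays.asList("Alien", "Brazil", "Casablanca", "Dune");
            List<String> userB = Arrays.asList("Brazil", "Alien", "Casablanca", "Dune");

            // Map user B's ranking into the positions user A gave the same movies.
            int[] positions = new int[userB.size()];
            for (int i = 0; i < userB.size(); i++) positions[i] = userA.indexOf(userB.get(i));

            System.out.println("inversions = " + inversions(positions));   // 1: only Alien/Brazil swapped
        }
    }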
I once wrote a Tetris AI that played Tetris quite well. The algorithm I used (described in this paper) is a two-step process.
In the first step, the programmer decides to track inputs that are "interesting" to the problem. In Tetris we might be interested in tracking how many gaps there are in a row because minimizing gaps could help place future pieces more easily. Another might be the average column height because it may be a bad idea to take risks if you're about to lose.
The second step is determining weights associated with each input. This is the part where I used a genetic algorithm. Any learning algorithm will do here, as long as the weights are adjusted over time based on the results. The idea is to let the computer decide how the input relates to the solution.
Using these inputs and their weights we can determine the value of taking any action. For example, if putting the straight line shape all the way in the right column will eliminate the gaps of 4 different rows, then this action could get a very high score if its weight is high. Likewise, laying it flat on top might actually cause gaps and so that action gets a low score.
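A tiny illustration of that evaluation step (the feature values and weights below are invented, not taken from the paper): each candidate placement is reduced to a feature vector and scored as a weighted sum, and the placement with the best score wins.

    // Hypothetical weighted-feature evaluation of two candidate Tetris placements.
    public class PlacementScore {

        // Dot product of feature values and learned weights.
        static double score(double[] features, double[] weights) {
            double s = 0;
            for (int i = 0; i < features.length; i++) s += features[i] * weights[i];
            return s;
        }

        public static void main(String[] args) {
            // features: { gaps created, average column height, rows completed }
            double[] rightColumnDrop = {0, 5.2, 4};        // clears 4 rows, no new gaps
            double[] layFlatOnTop    = {3, 6.0, 0};        // creates 3 gaps, clears nothing
            double[] weights         = {-1.5, -0.3, 2.0};  // e.g. weights evolved by the GA

            System.out.println("drop in right column: " + score(rightColumnDrop, weights));
            System.out.println("lay flat on top:      " + score(layFlatOnTop, weights));
        }
    }

The genetic algorithm's only job is then to find the weight vector whose greedy choices lead to the best games.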
I've always wondered if there's a way to apply a learning algorithm to the first step, where we find "interesting" potential inputs. It seems possible to write an algorithm where the computer first learns what inputs might be useful, then applies learning to weigh those inputs. Has anything been done like this before? Is it already being used in any AI applications?
In neural networks, you can select 'interesting' potential inputs by finding the ones that have the strongest correlation, positive or negative, with the classifications you're training for. I imagine you can do something similar in other contexts.
I think I might approach the problem you're describing by feeding more primitive data to a learning algorithm. For instance, a Tetris game state may be described by the list of occupied cells, and a string of bits describing this information would be a suitable input to that stage of the learning algorithm. Actually training on that is still challenging, though: how do you know whether the results are useful? I suppose you could roll the whole algorithm into a single blob, where the algorithm is fed the successive states of play and the output is just the block placements, with higher-scoring algorithms selected for future generations.
Another choice might be to use a large corpus of plays from other sources, such as recorded plays from human players or a hand-crafted AI, and select the algorithms whose outputs bear a strong correlation to some interesting fact or other about the future play, such as the score earned over the next 10 moves.
Yes, there is a way.
If you have M candidate features, there are 2^M possible subsets, so there is a lot to look at.
I would do the following:
For each subset S:
    run your code to optimize the weights W
    save S and the corresponding W
Then for each pair S-W, you can run G games and save the score L for each one. Now you have a table like this:
feature1  feature2  feature3  featureM  subset_code  game_number  scoreL
       1         0         1         1           S1            1   10500
       1         0         1         1           S1            2    6230
...
       0         1         1         0           S2        G + 1   30120
       0         1         1         0           S2        G + 2   25900
Now you can run some component selection algorithm (PCA, for example) and decide which features are worth keeping to explain scoreL.
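Here is a hedged sketch of that bookkeeping loop. optimizeWeights() and playGame() are hypothetical stand-ins for your own trainer and Tetris simulator; the real content is the bitmask enumeration of the 2^M subsets and the (subset_code, game_number, scoreL) rows you would later feed into the feature-selection step.

    import java.util.*;

    // Enumerates every non-empty feature subset, "trains" weights for it, plays G
    // games, and prints one table row per game.
    public class FeatureSubsetSearch {

        static final int M = 4;        // number of candidate features
        static final int G = 5;        // games played per subset

        // Hypothetical: tune weights for the features enabled in 'subset'.
        static double[] optimizeWeights(boolean[] subset, Random rnd) {
            double[] w = new double[M];
            for (int i = 0; i < M; i++) w[i] = subset[i] ? rnd.nextDouble() : 0.0;
            return w;
        }

        // Hypothetical: play one game with these weights and return its score.
        static int playGame(double[] weights, long pieceSeed) {
            return new Random(pieceSeed ^ Arrays.hashCode(weights)).nextInt(50000);
        }

        public static void main(String[] args) {
            Random rnd = new Random(0);
            System.out.println("subset_code game_number scoreL");
            for (int code = 1; code < (1 << M); code++) {      // every non-empty subset
                boolean[] subset = new boolean[M];
                for (int i = 0; i < M; i++) subset[i] = ((code >> i) & 1) == 1;
                double[] w = optimizeWeights(subset, rnd);
                for (int game = 1; game <= G; game++) {
                    // The piece seed depends only on the game number, so every
                    // subset is tested on the same piece sequence (see the tip below).
                    int score = playGame(w, game);
                    System.out.println("S" + code + " " + game + " " + score);
                }
            }
        }
    }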
A tip: When running the code to optimize W, seed the random number generator, so that each different 'evolving brain' is tested against the same piece sequence.
I hope this helps!