The question is as such: A 2d grid with obstacles has been given with 'n' robots. The robots can move in the adjacent cell within one time-tick. There are multiple constraints on robots like they cannot be in the same cell at the same time. However, they can move simultaneously. An action plan needs to be found out for each robot that moves the robot from a starting point to destination such that constraints are satisfied and the overall total time taken by the robots is the least.
I cannot seem to grasp whether this problem is an NP-hard problem or not
A quick search turns up a recent paper on this topic:
On the Computational Complexity ofMulti-Agent Pathfinding on Directed Graphs
The problem is NP hard in this special case of directed graphs. There are polynomial time algorithms for finding solutions in many cases, but finding optimal solutions is NP hard even on undirected graphs. (The Sliding Tile Puzzle is NP-Hard, and multi-agent pathfinding is a special case.)
So basically, I have to write a Battleships AI, which follows all of the conventional rules, apart from the fact that you are not notified whether or not your shot has sunk a ship - just whether or not it was a hit or miss
At the moment I have a working AI, but it suffers from the problem that it has to fire shots all the way around each enemy ship, as it doesn't know if the ship is sunk - this video may explain it better (for those who are wondering, there are some custom ship shapes, and the board is an L-shape): https://www.youtube.com/watch?v=SzLspp4JzNE
Any suggestions on how to go about reducing the number of shots required?
Many thanks :)
What information the computer can have? Number of ships, number of blocks for each ship, the shape of each ship. The algorithm has to check for the shapes of all remaining ships and pick next field such that the chance of hit is largest. Maybe there is a better long term strategy than "greatest hit chance", but solve the shape recognition part before delving into strategies.
If shapes of the ships are not known in advance then there's not much the algorithm can do. All it can is to check if the block being hit is the part of the largest remaining ship. The advanced strategy here would be to determine if largest ship cannot be anywhere else. If there are other ships remaining then use turns to search for them, but in such a search pattern that you also eliminate the possibility of the largest ship being somewhere else.
However, if there is a rule of number of attacks per turn being in proportion with number of your remaining ships, then you have to eliminate every ship as soon as possible, and so the "advanced" strategy above should not be used.
The most basic idea is to keep information regarding sunked ships as well as remaining ones (I assume that you know, that there are a_i ships with i pieces), and ones you sunk all the 4-pieces, and hit some connected 3-piece ship you will know, that there is no need to shoot around it anymore. So in general you shoot around hitted ship as long, as amount of hitted pieces is smaller that the number of pieces in the biggest non-empty class of ships. And once you sunk some - you decrease the respective ship class counter.
Some interesting alternative would be to learn some statistics of ships positions and shapes based on played games - if you play against human player it is most probable, that he won't place any ship anywhere - there are some psychological aspects that "bias" your decisions towards placements that you consider as "hard to find" - your program could learn such conditional probabilities and use it to assume its confidency in sunking some ship without shooting around it. Unfortunately this would not work with well designed - uniformly placing ships - algorithms.
First find the minimum frequent patterns from the database.
Then divide them into various data types like interval based , binary ,ordinal variables etc and define various distance measures for all the variables.
Finally apply cluster analysis method.
Is this sequence right or am i missing something?
whether you're right or not depends on what you want to do. The general approach that you describe seems to go into the right direction, but you'll never know if your on target until you answer the following questions:
What is your data?
What are trying to find/Which cluster method do you want to use?
From what you describe it seems to me that you want to do 'preprocessing' steps like feature selection and vectorization. Unfortunately, this by itself can be quite challenging. For example, one of the biggest partical problems is the design of a distance function (there's a tremendous amount of research available).
So, please give us more information on your specific target application.
I remember when I was in college we went over some problem where there was a smart agent that was on a grid of squares and it had to clean the squares. It was awarded points for cleaning. It also was deducted points for moving. It had to refuel every now and then and at the end it got a final score based on how many squares on the grid were dirty or clean.
I'm trying to study that problem since it was very interesting when I saw it in college, however I cannot find anything on wikipedia or anywhere online. Is there a specific name for that problem that you know about? Or maybe it was just something my teacher came up with for the class.
I'm searching for AI cleaning agent and similar things, but I don't find anything. I don't know, I'm thinking maybe it has some other name.
If you know where I can find more information about this problem I would appreciate it. Thanks.
Perhaps a "stigmergy" approach is closely related to your problem. There is a starting point here, and you can find something by searching for "dead ants" and "robots" on google scholar.
Basically: instead of modelling a precise strategy you work toward a probabilistic approach. Ants (probably) collect their deads by piling up according to a simple rule such as "if there is a pile of dead ants there, I bring this corpse hither; otherwise, I'll make a new pile". You can start by simplifying your 'cleaning' situation with that, and see where you go.
Also, I think (another?) suitable approach could be modelled with a Genetic Algorithm using a carefully chosen combination of fitness functions such as:
the end number of 'clean' tiles
the number of steps made by the robot
of course if the robots 'dies' out of starvation it automatically removes itself from the gene pool, a-la darwin awards :)
You could start by modelling a very, very simple genotype that will be 'computed' into a behaviour. Consider using a simple GA such as this one by Inman Harvey, then to each gene assign either a part of the strategy, or a complete behaviour. E.g.: if gene A is turned to 1 then the robot will try to wander randomly; if gene B is also turned to 1, then it will give priority to self-charging unless there are dirty tiles at distance X. Or use floats and model probability. Your mileage may vary but I can assure it will be fun :)
The problem is reminiscent of Shakey, although there's cleaning involved (which is like the Roomba -- a device that can also be programmed to perform these very tasks).
If the "problem space" (or room) is small enough, you can solve for an optimal solution using a simple A*-based search, but likely it won't be, since that won't leave for very interesting problems.
The machine learning approach suggested here using genetic algorithms is an interesting approach. Given the problem domain you would only have one "rule" (a move-to action, since clean could be eliminated by implicitly cleaning any square you move to that is dirty) so your learner would essentially be learning how to move around an environment. The problem there would be to build a learner that would be adaptable to any given floor plan, instead of just becoming proficient at cleaning a very specific space.
Whatever approach you have, I'd also consider doing a further meta-reasoning step if the problem sets are big enough, and use a partition approach to divide the floor up into separate areas and then conquering them one at a time.
Can you use techniques to create data to use "offline"? In that case, I'd even consider creating a "database" of optimal routes to take to clean certain floor spaces (1x1 up to, say, 5x5) that include all possible start and end squares. This is similar to "endgame databases" that game AIs use to effectively "solve" games once they reach a certain depth (c.f. Chinook).
This problem reminds me of this. A similar problem is briefly mentioned in the book Complexity as an example of a genetic algorithm. These versions are simplified though, they don't take into account fuel consumption.
I am running a physics simulation and applying a set of movement instructions to a simulated skeleton. I have a multiple sets of instructions for the skeleton consisting of force application to legs, arms, torso etc. and duration of force applied to their respective bone. Each set of instructions (behavior) is developed by testing its effectiveness performing the desired behavior, and then modifying the behavior with a genetic algorithm with other similar behaviors, and testing it again. The skeleton will have an array behaviors in its set list.
I have fitness functions which test for stability, speed, minimization of entropy and force on joints. The problem is that any given behavior will work for a specific context. One behavior works on flat ground, another works if there is a bump in front of the right foot, another if it's in front of the left, and so on. So the fitness of each behavior varies based on the context. Picking a behavior simply on its previous fitness level won't work because that fitness score doesn't apply to this context.
My question is, how do I program to have the skeleton pick the best behavior for the context? Such as picking the best walking behavior for a randomized bumpy terrain.
In a different answer I've given to this question, I assumed that the "terrain" information you have for your model was very approximate and large-grained, e.g., "smooth and flat", "rough", "rocky", etc. and perhaps only at a grid level. However, if the world model is in fact very detailed, such as from a simulated version of a 3-D laser range scanner, then algorithmic and computational path/motion planning approaches from robotics are likely to be more useful than a machine-learning classifier system.
PATH/MOTION PLANNING METHODS
There are a fairly large number of path and motion planning methods, including some perhaps more suited to walking/locomotion, but a few of the more general ones worth mentioning are:
Visibility graphs
Potential Fields
Sampling-based methods
The general solution approach would be use a path planning method to determine the walking trajectory that your skeleton should follow to avoid obstacles, and then use your GA-based controller to achieve the appropriate motion. This is very much at the core of robotics: sense the world and determine actions and motor control required to achieve some goal(s).
Also, a quick literature search turned up the following papers and a book as a source of ideas and starting points for further investigation. The paper on legged robot motion planning may be especially useful as it discusses several motion planning strategies.
Reading Suggestions
Steven Michael LaValle (2006). Planning Algorithms, Cambridge University Press.
Kris Hauser, Timothy Bretl, Jean-Claude Latombe, Kensuke Harada, Brian Wilcox (2008). "Motion Planning for Legged Robots on Varied Terrain", The International Journal of Robotics Research, Vol. 27, No. 11-12, 1325-1349,
DOI: 10.1177/0278364908098447
Guilherme N. DeSouza and Avinash C. Kak (2002). "Vision for Mobile Robot Navigation: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 2, February, pp 237-267.
Why not test the behaviors against a randomized bumpy terrain? Just set the parameters of the GA so that it's a little forgiving, and won't condemn a behavior for one or two failures.
You have two problems:
Bipedal locomotion without senses is very difficult. I've seen good robotic locomotion over rough terrain without senses, but never with only two legs. So the best solution you can possibly find this way might not be very good.
Running a GA is as much art as science. There are a lot of knobs you can turn, and it's hard to find parameters that will allow novelty to grow without drowning it in noise.
Starting simple (e.g. crawling) will help with both of these.
EDIT:
Wait... you're training it over and over on the same randomized terrain? Well no wonder you're having trouble! It's optimizing for that particular layout of rocks and bumps, which is much easier than generalizing. Depending on how your GA works, you might get some benefit from making the course really long, but a better solution is to randomize the terrain for every pass. When it can no longer exploit specific features of the terrain, it will have an evolutionary incentive to generalize. Since this is a more difficult problem it will not learn as quickly as it did before, and it might not be able to get very good at all with its current parameters; be prepared to tinker.
There are three aspects to my answer: (1) control theory, (2) sensing, and (3) merging sensing and action.
CONTROL THEORY
The answer to your problem depends partially on what kind of control scheme you are using: is it feed-forward or feedback control? If the latter, what simulated real-time sensors do you have other than terrain information?
Simply having terrain information and incorporating it into your control strategy would not mean you are using feedback control. It is possible to use such information to select a feed-forward strategy, which seems closest to the problem that you have described.
SENSING
Whether you are using feed-forward or feedback control, you need to represent the terrain information and any other sensory data as an input space for your control system. Part of training your GA-based motion controller should be moving your skeleton through a broad range of random terrain in order to learn feature detectors. The feature detectors classify the terrain scenarios by segmenting the input space into regions critical to deciding what is the best action policy, i.e., what control behavior to employ.
How to best represent the input space depends on the level of granularity of the terrain information you have for your simulation. If it's just a discrete space of terrain type and/or obstacles in some grid space, you may be able to present it directly to your GA without transformation. If, however, the data is in a continuous space such as terrain type and obstacles at arbitrary range/direction, you may need to transform it into a space from which it may be easier to infer spatial relationships, such as coarse-coded range and direction, e.g., near, mid, far and forward, left-forward, left, etc. Gaussian and fuzzy classifiers can be useful for the latter approach, but discrete-valued coding can also work.
MERGING SENSING AND ACTION
Using one of the input-space-encoding approaches above, you have a few options for how to connect behavior selection search space and motion control search space:
Separate the two spaces into two learning problems and use a separate GA to evolve the parameters of a standard multi-layer perceptron neural network. The latter would have your sensor data (perhaps transformed) as inputs and your set of skeleton behaviors as outputs. Instead of using back-propagation or some other ANN-learning method to learn the network weights, your GA could use some fitness function to evolve the parameters over a series of simulated trials, e.g., fitness = distance traveled in a fixed time period toward point B starting from point A. This should evolve over successive generations from completely random selection of behaviors to something more coordinated and useful.
Merge the two search spaces (behavior selection and skeleton motor control) by linking a multi-layer perceptron network as described in (1) above into the existing GA-based controller framework that you have, using the skeleton behavior set as the linkage. The parameter space that will be evolved will be both the neural network weights and whatever your existing controller parameter space is. Assuming that you are using a multi-objective genetic algorithm, such as the NSGA-II algorithm, (since you have multiple fitness functions), the fitness functions would be stability, speed, minimization of entropy, force on joints, etc, plus some fitness function(s) targeted at learning the behavior-selection policy, e.g., distance moved toward point B starting from point A in a fixed time period.
The difference between this approach and (1) above is that you may be able to learn both better coordination of behaviors and finer-grain motor control since the parameter space is likely to be better explored when the two problems are merged as opposed to being separate. The downside is that it may take much longer to converge on reasonable parameter solutions(s), and not all aspects of motor control may be learned as well as they would if the two learning problems were kept separate.
Given that you already have working evolved solutions for the motor control problem, you are probably better off using approach (1) to learn the behavior-selection model with a separate GA. Also, there are many alternatives to the hybrid GA-ANN scheme I described above for learning the latter model, including not learning a model at all and instead using a path planning algorithm as described in a separate answer from me. I simply offered this approach since you are already familiar with GA-based machine learning.
The action selection problem is a robust area of research in both machine learning and autonomous robotics. It's probably well-worth reading up on this topic in itself to gain better perspective and insight into your current problem, and you may be able to devise a simpler strategy than anything I've suggested so far by viewing your problem through the lens of this paradigm.
You're using a genetic algorithm to modify the behavior, so that must mean you have devised a fitness function for each combination of factors. Is that your question?
If yes, the answer depends on what metrics you use to define best walking behavior:
Maximize stability
Maximize speed
Minimize forces on joints
Minimize energy or entropy production
Or do you just try a bunch of parameters, record the values, and then let the genetic algorithm drive you to the best solution?
If each behavior works well in one context and not another, I'd try quantifying how to sense and interpolate between contexts and blend the strategies to see if that would help.
It sounds like at this point you have just a classification problem. You want to map some knowledge about what you are currently walking on to one of a set of classes. Knowing the class of the terrain allows you to then invoke the proper subroutine. Is this correct?
If so, then there are a wide array of classification engines that you can use including neural networks, Bayesian networks, decision trees, nearest neighbor, etc. In order to pick the best fit, we will need more information about your problem.
First, what kind of input or sensory data do you have available to help you identify the behavior class you should invoke? Second, can you describe the circumstances in which you will be training this classifier and what the circumstances are during runtime when you deploy it, such as any limits on computational resources or requirements of robustness to noise?
EDIT: Since you have a fixed number of classes, and you have some parameterized model for generating all possible terrains, I would consider using k-means clustering. The principle is as follows. You cluster a whole bunch of terrains into k different classes, where each cluster is associated with one of your specialized subroutines that performs best for that cluster of terrains. Then when a new terrain comes in, it will probably fall near one of these clusters. You then invoke the corresponding specialized subroutine to navigate that terrain.
Do this offline: Generate enough random terrains to sufficiently sample the parameter space, map these terrains to your sensory space (but remember which points in sensory space correspond to which terrains), and then run k-means clustering on this sensory space corpus where k is the number of classes you want to learn. Your distance function between a class representative C and a point P in sensory space would be simply the fitness function of letting algorithm C navigate the terrain that generated P. You would then get a partitioning of your sensory space into k clusters, each cluster mapping to the best subroutine that you've got. Each cluster will have a representative point in sensory space.
Now during runtime: You will get some unlabeled point in sensory space. Use a different distance function to find the closest representative point to this new incoming point. That tells you what class the terrain is.
Note that the success of this method depends on the quality of the mapping from the parameter space of terrain generation to sensory space, from sensory space to your fitness functions, and the eventual distance function you use to compare points in sensory space.
Note also that if you had enough memory, instead of only using the k representative sensory points to tell you which class an unlabeled sensory point belongs to, you might go through your training set and label all points with the learned class. Then during runtime you pick the nearest neighbor, and conclude that your unlabeled point in sensory space is in the same class as that neighbor.