Determining which inputs to weigh in an evolutionary algorithm - artificial-intelligence

I once wrote a Tetris AI that played Tetris quite well. The algorithm I used (described in this paper) is a two-step process.
In the first step, the programmer decides to track inputs that are "interesting" to the problem. In Tetris we might be interested in tracking how many gaps there are in a row because minimizing gaps could help place future pieces more easily. Another might be the average column height because it may be a bad idea to take risks if you're about to lose.
The second step is determining weights associated with each input. This is the part where I used a genetic algorithm. Any learning algorithm will do here, as long as the weights are adjusted over time based on the results. The idea is to let the computer decide how the input relates to the solution.
Using these inputs and their weights we can determine the value of taking any action. For example, if putting the straight line shape all the way in the right column will eliminate the gaps of 4 different rows, then this action could get a very high score if its weight is high. Likewise, laying it flat on top might actually cause gaps and so that action gets a low score.
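As a rough illustration of the scoring step described above, here is a minimal sketch. The feature names, the weight values, and the helper names `score_action` and `best_action` are all made up for the example; in practice the weights would come from the learning step.

```python
def score_action(features, weights):
    """Weighted sum of the feature values of the board that results from one action."""
    return sum(weights[name] * value for name, value in features.items())


def best_action(candidates, weights):
    """candidates maps an action label to the feature values of the resulting board."""
    return max(candidates, key=lambda action: score_action(candidates[action], weights))


# Hypothetical weights, standing in for the output of the genetic algorithm.
weights = {"gaps": -4.0, "avg_column_height": -1.0, "rows_cleared": 3.0}

# Hypothetical feature values for the two placements mentioned in the text.
candidates = {
    "line_piece_right_column": {"gaps": 0, "avg_column_height": 2.0, "rows_cleared": 4},
    "line_piece_flat_on_top": {"gaps": 3, "avg_column_height": 5.5, "rows_cleared": 0},
}

chosen = best_action(candidates, weights)
```

With these made-up numbers the right-column placement wins because it clears rows and leaves no gaps.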
I've always wondered if there's a way to apply a learning algorithm to the first step, where we find "interesting" potential inputs. It seems possible to write an algorithm where the computer first learns what inputs might be useful, then applies learning to weigh those inputs. Has anything been done like this before? Is it already being used in any AI applications?

In neural networks, you can select 'interesting' potential inputs by finding the ones that have the strongest correlation, positive or negative, with the classifications you're training for. I imagine you can do similarly in other contexts.
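A minimal sketch of that correlation ranking, assuming each candidate input is a column of numbers and the training target is numeric. The column names and data are invented, and Pearson correlation is just one reasonable choice of measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no information
    return cov / (spread_x * spread_y)


def rank_inputs(columns, labels):
    """Order candidate inputs by absolute correlation with the labels, strongest first."""
    scores = {name: pearson(values, labels) for name, values in columns.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Invented data: "gaps" tracks the label closely, "noise" does not.
columns = {"gaps": [0, 1, 2, 3, 4], "noise": [3, 1, 4, 1, 5]}
labels = [10, 8, 6, 4, 2]
ranking = rank_inputs(columns, labels)
```

Note that a strong negative correlation ranks as high as a strong positive one, which matches the "positive or negative" wording above.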

I think I might approach the problem you're describing by feeding more primitive data to a learning algorithm. For instance, a Tetris game state may be described by the list of occupied cells. A string of bits describing this information would be a suitable input to that stage of the learning algorithm. Actually training on that is still challenging: how do you know whether the results are useful? I suppose you could roll the whole algorithm into a single blob, where the algorithm is fed the successive states of play and outputs just the block placements, with higher-scoring algorithms selected for future generations.
Another choice might be to use a large corpus of plays from other sources, such as recorded plays from human players or a hand-crafted AI, and select the algorithms whose outputs bear a strong correlation to some interesting fact from the future play, such as the score earned over the next 10 moves.

Yes, there is a way.
If you choose M candidate features there are 2^M subsets, so there is a lot to look at.
I would do the following:
For each subset S
run your code to optimize the weights W
save S and the corresponding W
Then for each pair S-W, you can run G games for each pair and save the score L for each one. Now you have a table like this:
feature1  feature2  feature3  featureM  subset_code  game_number  scoreL
1         0         1         1         S1           1            10500
1         0         1         1         S1           2            6230
...
0         1         1         0         S2           G + 1        30120
0         1         1         0         S2           G + 2        25900
Now you can run a component analysis or feature selection algorithm (PCA, for example) and decide which features are worth keeping to explain scoreL.
A tip: When running the code to optimize W, seed the random number generator, so that each different 'evolving brain' is tested against the same piece sequence.
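A minimal sketch of the loop above, including the seeding tip. `optimize_weights` and `play_game` are hypothetical callables standing in for your own optimizer and game simulation; the toy versions below exist only so the example runs.

```python
import itertools
import random


def evaluate_subsets(features, optimize_weights, play_game, games_per_subset, seed=0):
    """Optimize weights for every non-empty feature subset, then record G game scores each."""
    rows = []
    for size in range(1, len(features) + 1):
        for subset in itertools.combinations(features, size):
            weights = optimize_weights(subset, random.Random(seed))
            for game_number in range(1, games_per_subset + 1):
                # Same seed per game number, so every subset faces the same piece sequence.
                rng = random.Random(seed + game_number)
                rows.append({"subset": subset,
                             "game_number": game_number,
                             "score": play_game(subset, weights, rng)})
    return rows


# Toy stand-ins (assumptions, not a real optimizer or a real Tetris game).
def toy_optimize(subset, rng):
    return {name: rng.uniform(-1, 1) for name in subset}


def toy_play(subset, weights, rng):
    return len(subset) * 100 + rng.randint(0, 9)


rows = evaluate_subsets(["gaps", "height", "holes"], toy_optimize, toy_play, games_per_subset=2)
```

Unlike the table above, this sketch restarts the game number for each subset; either numbering works as long as the seed for a given game is shared across subsets.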
I hope this helps!

Related

How can I predict a user generated distribution by learning from previous distributions

I am trying to program a prediction algorithm that predicts the distribution of marbles among 4 cups based on previous user input. But I have no idea where to start or which techniques could be used to solve this.
Example:
There are 4 cups numbered from 0 to 3, and the user receives x marbles to distribute among those cups. Each round the user receives another amount of marbles (or the same amount), and before the user distributes them, the algorithm tries to predict the distribution based on the user's previous inputs. After that the user “corrects” it. The goal is that the user does not have to correct anything, because the algorithm predicts the correct distribution. However, the pattern by which the user distributes the marbles can change, and the algorithm has to adapt.
This is the simplest version of the problem, and it is already not trivial to solve. However, it gets exponentially more complex when the marbles have additional properties that can be used for distribution.
For example, they could have a color and a weight.
So, for example, how does the algorithm learn that the user (most of the time) puts marbles of the same color in one cup, that cup 2 is empty (most of the time), and that the rest are distributed equally?
So in my head the algorithm has to do something like this:
search for a pattern after the user's distribution is done. Those patterns can be the number of marbles per cup, the weight per cup, or anything else.
if a pattern is found, a predefined value (weight) is added to the pattern
if a previous pattern was not found, a predefined value is subtracted from the pattern
when the algorithm has to predict, all patterns with a predefined weight have to be applied.
I am not sure if I am missing something and how I would implement something like this or in which area I have to look for answers.
First of all, bear in mind that human behavior does not always follow a pattern. If the user distributes these marbles randomly, it will be hard to predict the next move!
But if there IS a pattern in the distributions, you might create a prediction using an algorithm such as a neural network or a decision tree.
For example:
// dataset1
// the weights and colors of 10 marbles
let dataset1 = [4,3,1,1,2,3,4,5,3,4,7,6,4,4,2,4,1,6,1,2]
// cups1
// the cups distribution of the above marbles
let labels1 = [2,1,0,2,1,1,3,1,0,1]
Now you can train an algorithm, for example a neural network or a decision tree.
This isn't real code, just an example of how it could work.
let net = new NeuralNet()
net.train(dataset1, labels1)
After training with lots of data (at least hundreds of these datasets), you can give the network a new dataset and it will give you a prediction of the cups distribution
let newMarbleSet = [...]
let prediction = net.predict(newMarbleSet)
It's up to you what you want to do with this prediction.

AI of spaceship's propulsion: land a 3D ship at position=0 and angle=0

This is a very difficult problem about how to maneuver a spaceship that can both translate and rotate in 3D, for a space game.
The spaceship has n jets placed at various positions and in various directions.
Transformation of i-th jet relative to the CM of spaceship is constant = Ti.
Transformation is a tuple of position and orientation (quaternion or matrix 3x3 or, less preferable, Euler angles).
A transformation can also be denoted by a single matrix 4x4.
In other words, all jets are glued to the ship and cannot rotate.
A jet can exert force on the spaceship only in the direction of its axis (green).
Because the jets are glued on, each axis rotates along with the spaceship.
All jets can exert force (vector,Fi) at a certain magnitude (scalar,fi) :
i-th jet can exert force (Fi= axis x fi) only within range min_i<= fi <=max_i.
Both min_i and max_i are constant with known value.
To be clear, unit of min_i,fi,max_i is Newton.
Ex. If the range doesn't cover 0, it means that the jet can't be turned off.
The spaceship's mass = m and inertia tensor = I.
The spaceship's current transformation = Tran0, velocity = V0, angularVelocity = W0.
The spaceship's physics body follows well-known physics rules:
Torque=r x F
F=ma
angularAcceleration = I^-1 x Torque
linearAcceleration = m^-1 x F
I is different for each direction, but for the sake of simplicity, it has the same value for every direction (sphere-like). Thus, I can be thought as a scalar instead of matrix 3x3.
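To make the rules above concrete, here is a minimal sketch that turns a set of jet magnitudes fi into the ship's linear and angular acceleration. It assumes the scalar inertia from the simplification above and, as a further simplification of my own, that the ship is currently at identity orientation, so jet positions and axes need no rotation. The function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def accelerations(jets, magnitudes, mass, inertia):
    """jets: list of (r, axis) with r relative to the CM and axis a unit vector.

    magnitudes: one fi per jet, in Newton.
    Returns (linearAcceleration, angularAcceleration) as tuples.
    """
    total_force = [0.0, 0.0, 0.0]
    total_torque = [0.0, 0.0, 0.0]
    for (r, axis), f in zip(jets, magnitudes):
        force = tuple(component * f for component in axis)  # Fi = axis * fi
        torque = cross(r, force)                            # Torque = r x F
        for k in range(3):
            total_force[k] += force[k]
            total_torque[k] += torque[k]
    linear = tuple(component / mass for component in total_force)
    angular = tuple(component / inertia for component in total_torque)
    return linear, angular


# One jet on the +x side of the ship, pushing along +y.
linear, angular = accelerations([((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))], [2.0],
                                mass=2.0, inertia=4.0)
```

A jet mounted off-center produces both a push and a spin, which is what makes the 3D version hard.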
Question
How to control all jets (all fi) to land the ship with position=0 and angle=0?
Math-like specification: Find function of fi(time) that take minimum time to reach position=(0,0,0), orient=identity with final angularVelocity and velocity = zero.
More specifically, what are names of technique or related algorithms to solve this problem?
My research (1 dimension)
If the universe is 1D (thus, no rotation), the problem will be easy to solve.
( Thank Gavin Lock, https://stackoverflow.com/a/40359322/3577745 )
First, find the value MIN_BURN=sum{min_i}/m and MAX_BURN=sum{max_i}/m.
Second, think in reverse: assume that x=0 (position) and v=0 at t=0,
then create two parabolas with x''=MIN_BURN and x''=MAX_BURN.
(The 2nd derivative is assumed to be constant for a period of time, so each curve is a parabola.)
The only remaining work is to join the two parabolas together.
The red dashed line is where they join.
In the period of time that x''=MAX_BURN, all fi=max_i.
In the period of time that x''=MIN_BURN, all fi=min_i.
It works really well for 1D, but in 3D the problem is far harder.
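For reference, the 1D two-parabola idea can be written as a feedback rule: the parabola that ends at x=0, v=0 acts as a switching curve, and the state's side of that curve decides which burn to use. This is a minimal sketch that assumes MIN_BURN < 0 < MAX_BURN (the range covers 0); the function names are mine.

```python
def switching_position(v, min_burn, max_burn):
    """Position from which braking at full burn stops the ship exactly at x=0."""
    brake = min_burn if v > 0 else max_burn  # braking opposes the current velocity
    return v * v / (2.0 * brake)


def bang_bang_accel(x, v, min_burn, max_burn):
    """Pick x'' for the current state (x, v); assumes min_burn < 0 < max_burn."""
    x_switch = switching_position(v, min_burn, max_burn)
    if x < x_switch:
        return max_burn  # left of the switching curve: push in +x
    if x > x_switch:
        return min_burn  # right of the switching curve: push in -x
    # Exactly on the curve: brake along the parabola into the origin.
    return min_burn if v > 0 else max_burn


far_left_at_rest = bang_bang_accel(-10.0, 0.0, -1.0, 2.0)
far_right_at_rest = bang_bang_accel(10.0, 0.0, -1.0, 2.0)
on_braking_parabola = bang_bang_accel(-2.0, 2.0, -1.0, 2.0)
```

In a discrete-time simulation the state rarely lands exactly on the curve, so expect some chatter near the origin and stop once the state is close enough.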
Note:
Just a rough guide pointing me in the right direction would be really appreciated.
I don't need a perfect AI, e.g. it can take a little more time than the optimum.
I have thought about it for more than a week and still have no clue.
Other attempts / opinions
I don't think machine learning like neural network is appropriate for this case.
Boundary-constrained-least-square-optimisation may be useful but I don't know how to fit my two hyper-parabola to that form of problem.
This may be solved by using many iterations, but how?
I have searched NASA's website, but did not find anything useful.
The feature may exist in the game "Space Engineers".
Commented by Logman: Knowledge in mechanical engineering may help.
Commented by AndyG: It is a motion planning problem with nonholonomic constraints. It could be solved by Rapidly exploring random tree (RRTs), theory around Lyapunov equation, and Linear quadratic regulator.
Commented by John Coleman: This seems more like optimal control than AI.
Edit: "Near-0 assumption" (optional)
In most cases, the AI (to be designed) runs continuously (i.e. it is called every time-step).
Thus, with the AI's tuning, Tran0 is usually near identity, and V0 and W0 are usually not far from 0, e.g. |Seta0| < 30 degrees, |W0| < 5 degrees per time-step.
I think that an AI based on this assumption would work OK in most cases. Although not perfect, it can be considered a correct solution (I started to think that without this assumption, this question might be too hard).
I faintly feel that this assumption may enable some tricks that use a "linear" approximation.
The 2nd Alternative Question - "Tune 12 Variables" (easier)
The above question might also be viewed as follows:
I want to tune all six values and their six first derivatives (values') to 0, using the lowest number of time-steps.
Here is a table showing a possible situation that the AI can face:
The Multiplier table stores inertia^-1 * r and mass^-1 from the original question.
The Multiplier and Range are constant.
Each time-step, the AI will be asked to pick a tuple of values fi, where each fi must be in the range [min_i, max_i] of its jet (index i, i.e. the (i+1)-th jet).
Ex. From the table, AI can pick (f0=1,f1=0.1,f2=-1).
Then, the caller will use fi to multiply with the Multiplier table to get values''.
Px'' = f0*0.2+f1*0.0+f2*0.7
Py'' = f0*0.3-f1*0.9-f2*0.6
Pz'' = ....................
SetaX''= ....................
SetaY''= ....................
SetaZ''= f0*0.0+f1*0.0+f2*5.0
After that, the caller will update all values' with formula values' += values''.
Px' += Px''
.................
SetaZ' += SetaZ''
Finally, the caller will update all values with formula values += values'.
Px += Px'
.................
SetaZ += SetaZ'
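The caller's update can be sketched as a single function. Only the Px and Py rows of the Multiplier table are used below, because those are the rows written out in full above; the [-1, 1] limits are placeholders of mine, since the Range table is not reproduced here.

```python
def step(values, rates, multiplier, forces, limits):
    """One time-step of the caller: values'' = Multiplier * f, values' += values'', values += values'."""
    for f, (low, high) in zip(forces, limits):
        if not low <= f <= high:
            raise ValueError("fi outside [min_i, max_i]")
    second = [sum(m * f for m, f in zip(row, forces)) for row in multiplier]
    new_rates = [rate + acc for rate, acc in zip(rates, second)]
    new_values = [value + rate for value, rate in zip(values, new_rates)]
    return new_values, new_rates


multiplier = [
    [0.2, 0.0, 0.7],    # Px'' row
    [0.3, -0.9, -0.6],  # Py'' row
]
limits = [(-1.0, 1.0)] * 3  # placeholder ranges, not taken from the question
values, rates = step([0.0, 0.0], [1.0, 0.0], multiplier, (1.0, 0.1, -1.0), limits)
```

Because values' is updated before values, the new position already includes this step's acceleration, exactly as in the formulas above.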
AI will be asked only once for each time-step.
The objective of the AI is to return tuples of fi (which can differ between time-steps) that make Px,Py,Pz,SetaX,SetaY,SetaZ,Px',Py',Pz',SetaX',SetaY',SetaZ' = 0 (or very near),
using as few time-steps as possible.
I hope providing another view of the problem will make it easier.
It is not the exact same problem, but I feel that a solution that can solve this version can bring me very close to the answer of the original question.
An answer for this alternate question can be very useful.
The 3rd Alternative Question - "Tune 6 Variables" (easiest)
This is a lossy simplified version of the previous alternative.
The only difference is that the world is now 2D, Fi is also 2D (x,y).
Thus I have to tune only Px,Py,SetaZ,Px',Py',SetaZ'=0, using as few time-steps as possible.
An answer to this easiest alternative question can be considered useful.
I'll try to keep this short and sweet.
One approach that is often used to solve these problems in simulation is a Rapidly-Exploring Random Tree. To give at least a little credibility to my post, I'll admit I studied these, and motion planning was my research lab's area of expertise (probabilistic motion planning).
The canonical paper to read on these is Steven LaValle's Rapidly-exploring random trees: A new tool for path planning, and there have been a million papers published since that all improve on it in some way.
First I'll cover the most basic description of an RRT, and then I'll describe how it changes when you have dynamical constraints. I'll leave fiddling with it afterwards up to you:
Terminology
"Spaces"
The state of your spaceship can be described by its 3-dimensional position (x, y, z) and its 3-dimensional rotation (alpha, beta, gamma) (I use those Greek names because those are the Euler angles).
State space is all possible positions and rotations your spaceship can inhabit. Of course this is infinite.
Collision space is all of the "invalid" states, i.e. realistically impossible positions. These are states where your spaceship is in collision with some obstacle (with other bodies this would also include collision with itself, for example when planning for a length of chain). Abbreviated as C-Space.
Free space is anything that is not collision space.
General Approach (no dynamics constraints)
For a body without dynamical constraints the approach is fairly straightforward:
Sample a state
Find nearest neighbors to that state
Attempt to plan a route between the neighbors and the state
I'll briefly discuss each step
Sampling a state
Sampling a state in the most basic case means choosing at random values for each entry in your state space. If we did this with your space ship, we'd randomly sample for x, y, z, alpha, beta, gamma across all of their possible values (uniform random sampling).
Of course way more of your space is obstacle space than free space typically (because you usually confine your object in question to some "environment" you want to move about inside of). So what is very common to do is to take the bounding cube of your environment and sample positions within it (x, y, z), and now we have a lot higher chance to sample in the free space.
In an RRT, you'll sample randomly most of the time. But with some probability you will actually choose your next sample to be your goal state (play with it, start with 0.05). This is because you need to periodically test to see if a path from start to goal is available.
Finding nearest neighbors to a sampled state
You choose some fixed integer > 0. Let's call that integer k. Your k nearest neighbors are nearby in state space. That means you have some distance metric that can tell you how far away states are from each other. The most basic distance metric is Euclidean distance, which only accounts for physical distance and doesn't care about rotational angles (because in the simplest case you can rotate 360 degrees in a single timestep).
Initially you'll only have your starting position, so it will be the only candidate in the nearest neighbor list.
Planning a route between states
This is called local planning. In a real-world scenario you know where you're going, and along the way you need to dodge other people and moving objects. We won't worry about those things here. In our planning world we assume the universe is static but for us.
What's most common is to assume some linear interpolation between the sampled state and its nearest neighbor. The neighbor (i.e. a node already in the tree) is moved along this linear interpolation bit by bit until it either reaches the sampled configuration, or it travels some maximum distance (recall your distance metric).
What's going on here is that your tree is growing towards the sample. When I say that you step "bit by bit" I mean you define some "delta" (a really small value) and move along the linear interpolation that much each timestep. At each point you check to see if the new state is in collision with some obstacle. If you hit an obstacle, you keep the last valid configuration as part of the tree (don't forget to store the edge somehow!). So what you'll need for a local planner is:
Collision checking
how to "interpolate" between two states (for your problem you don't need to worry about this because we'll do something different).
A physics simulation for timestepping (Euler integration is quite common, but less stable than something like Runge-Kutta). Fortunately you already have a physics model!
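Here is a minimal sketch of the basic (no dynamics) RRT described so far, in 2D with k = 1 nearest neighbor. It is simplified in two ways that are my own choices: the tree extends by a single step of fixed length per sample rather than stepping by "delta" repeatedly, and the collision check is a caller-supplied `is_free` predicate that accepts everything by default. All names and default values are illustrative.

```python
import math
import random


def rrt(start, goal, bounds, step=0.5, goal_bias=0.05, goal_tol=0.25,
        max_samples=20000, is_free=lambda point: True, seed=0):
    """Grow a tree from start; return a list of states ending near goal, or None."""
    rng = random.Random(seed)
    nodes = [tuple(start)]
    parent = [None]  # parent[i] is the index of node i's parent in the tree
    for _ in range(max_samples):
        # Sample: usually uniform in the bounding box, sometimes the goal itself.
        if rng.random() < goal_bias:
            sample = tuple(goal)
        else:
            sample = tuple(rng.uniform(low, high) for low, high in bounds)
        # Nearest neighbor under Euclidean distance.
        nearest = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        gap = math.dist(nodes[nearest], sample)
        if gap == 0.0:
            continue
        # Local planner: move from the neighbor toward the sample, at most `step` far.
        t = min(1.0, step / gap)
        new = tuple(a + t * (b - a) for a, b in zip(nodes[nearest], sample))
        if not is_free(new):
            continue
        nodes.append(new)
        parent.append(nearest)
        if math.dist(new, goal) <= goal_tol:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


path = rrt((1.0, 1.0), (9.0, 9.0), bounds=[(0.0, 10.0), (0.0, 10.0)], max_samples=5000)
```

The linear nearest-neighbor scan is fine for a few thousand nodes; larger trees usually use a spatial index such as a k-d tree.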
Modification for dynamical constraints
Of course if we assume you can linearly interpolate between states, we'll violate the physics you've defined for your spaceship. So we modify the RRT as follows:
Instead of sampling random states, we sample random controls and apply said controls for a fixed time period (or until collision).
Before, when we sampled random states, what we were really doing was choosing a direction (in state space) to move. Now that we have constraints, we randomly sample our controls, which is effectively the same thing, except we're guaranteed not to violate our constraints.
After you apply your control for a fixed time interval (or until collision), you add a node to the tree, with the control stored on the edge. Your tree will grow very fast to explore the space. This control application replaces linear interpolation between tree states and sampled states.
Sampling the controls
You have n jets that individually have some min and max force they can apply. Sample within that min and max force for each jet.
Which node(s) do I apply my controls to?
Well, you can choose at random, or you can bias the selection to choose nodes that are nearest to your goal state (this needs the distance metric). This biasing will try to grow nodes closer to the goal over time.
Now, with this approach, you're unlikely to reach your goal exactly, so you need some definition of "close enough". That is, you will use your distance metric to find the nearest neighbors to your goal state, and then test them for "close enough". This "close enough" metric can be different from your distance metric, or not. If you're using Euclidean distance, but it's very important that your goal configuration is also rotated properly, then you may want to modify the "close enough" metric to look at angle differences.
What is "close enough" is entirely up to you. Also something for you to tune, and there are a million papers that try to get you a lot closer in the first place.
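A minimal sketch of the control-sampling variant, reduced to one dimension so that the state is just (position, velocity) and the single control is an acceleration between a_min and a_max. The node selection mixes "nearest to goal" with "random node"; the mixing probability, the time interval, and the plain Euclidean metric over (x, v) are arbitrary choices of mine, not part of any reference implementation.

```python
import math
import random


def simulate(state, accel, dt, substeps=10):
    """Apply a constant control for dt using small Euler steps."""
    x, v = state
    h = dt / substeps
    for _ in range(substeps):
        v += accel * h
        x += v * h
    return (x, v)


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def kinodynamic_rrt(start, goal, a_min, a_max, dt=0.5, n_samples=2000,
                    goal_bias=0.5, close_enough=0.2, seed=0):
    rng = random.Random(seed)
    nodes = [start]
    edges = {}  # child index -> (parent index, control applied on that edge)
    for _ in range(n_samples):
        # Which node to expand: biased toward the node nearest the goal.
        if rng.random() < goal_bias:
            i = min(range(len(nodes)), key=lambda k: distance(nodes[k], goal))
        else:
            i = rng.randrange(len(nodes))
        control = rng.uniform(a_min, a_max)  # sample a control, not a state
        new = simulate(nodes[i], control, dt)
        nodes.append(new)
        edges[len(nodes) - 1] = (i, control)
        if distance(new, goal) <= close_enough:
            break
    # Report the controls leading to the node closest to the goal.
    best = min(range(len(nodes)), key=lambda k: distance(nodes[k], goal))
    controls, k = [], best
    while k in edges:
        k, control = edges[k]
        controls.append(control)
    controls.reverse()
    return nodes, controls, distance(nodes[best], goal)


start, goal = (-5.0, 0.0), (0.0, 0.0)
nodes, controls, best_distance = kinodynamic_rrt(start, goal, -1.0, 1.0, n_samples=300)
```

Replaying `controls` from `start` through `simulate` reproduces the best node, which is the sequence of jet commands you would actually execute.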
Conclusion
This random sampling may sound ridiculous, but your tree will grow to explore your free space very quickly. See some YouTube videos on RRT for path planning. We can't guarantee something called "probabilistic completeness" with dynamical constraints, but it's usually "good enough". Sometimes it'll be possible that a solution does not exist, so you'll need to put some logic in there to stop growing the tree after a while (20,000 samples, for example).
More Resources:
Start with these, and then start looking into their citations, and then start looking into who is citing them.
Kinodynamic RRT*
RRT-Connect
This is not an answer, but it's too long to place as a comment.
First of all, a real solution will involve both linear programming (for multivariate optimization with constraints that will be used in many of the substeps) as well as techniques used in trajectory optimization and/or control theory. This is a very complex problem and if you can solve it, you could have a job at any company of your choosing. The only thing that could make this problem worse would be friction (drag) effects or external body gravitation effects. A real solution would also ideally use Verlet integration or 4th order Runge Kutta, which offer improvements over the Euler integration you've implemented here.
Secondly, I believe your "2nd Alternative Version" of your question above has omitted the rotational influence on the positional displacement vector you add into the position at each timestep. While the jet axes all remain fixed relative to the frame of reference of the ship, they do not remain fixed relative to the global coordinate system you are using to land the ship (at global coordinate [0, 0, 0]). Therefore the [Px', Py', Pz'] vector (calculated from the ship's frame of reference) must undergo appropriate rotation in all 3 dimensions prior to being applied to the global position coordinates.
Thirdly, there are some implicit assumptions you failed to specify. For example, one dimension should be defined as the "landing depth" dimension and negative coordinate values should be prohibited (unless you accept a fiery crash). I developed a mockup model for this in which I assumed z dimension to be the landing dimension. This problem is very sensitive to initial state and the constraints placed on the jets. All of my attempts using your example initial conditions above failed to land. For example, in my mockup (without the 3d displacement vector rotation noted above), the jet constraints only allow for rotation in one direction on the z-axis. So if aZ becomes negative at any time (which is often the case) the ship is actually forced to complete another full rotation on that axis before it can even try to approach zero degrees again. Also, without the 3d displacement vector rotation, you will find that Px will only go negative using your example initial conditions and constraints, and the ship is forced to either crash or diverge farther and farther onto the negative x-axis as it attempts to maneuver. The only way to solve this is to truly incorporate rotation or allow for sufficient positive and negative jet forces.
However, even when I relaxed your min/max force constraints, I was unable to get my mockup to land successfully, demonstrating how complex planning will probably be required here. Unless it is possible to completely formulate this problem in linear programming space, I believe you will need to incorporate advanced planning or stochastic decision trees that are "smart" enough to continually use rotational methods to reorient the most flexible jets onto the currently most necessary axes.
Lastly, as I noted in the comments section, "On May 14, 2015, the source code for Space Engineers was made freely available on GitHub to the public." If you believe that game already contains this logic, that should be your starting place. However, I suspect you are bound to be disappointed. Most space game landing sequences simply take control of the ship and do not simulate "real" force vectors. Once you take control of a 3-d model, it is very easy to predetermine a 3d spline with rotation that will allow the ship to land softly and with perfect bearing at the predetermined time. Why would any game programmer go through this level of work for a landing sequence? This sort of logic could control ICBM missiles or planetary rover re-entry vehicles and it is simply overkill IMHO for a game (unless the very purpose of the game is to see if you can land a damaged spaceship with arbitrary jets and constraints without crashing).
I can introduce another technique into the mix of (awesome) answers proposed.
It lies more in AI, and provides close-to-optimal solutions. It's called Machine Learning, more specifically Q-Learning. It's surprisingly easy to implement but hard to get right.
The advantage is that the learning can be done offline, so the algorithm can then be super fast when used.
You could do the learning when the ship is built or when something happens to it (thruster destruction, large chunks torn away...).
Optimality
I observed you're looking for near-optimal solutions. Your method with parabolas is good for optimal control. What you did is this:
Observe the state of the system.
For every state (coming in too fast, too slow, heading away, closing in etc.) you devised an action (apply a strategy) that will bring the system into a state closer to the goal.
Repeat
This is pretty much intractable for a human in 3D (too many cases, it will drive you nuts); however, a machine may learn where to split the parabolas in every dimension, and devise an optimal strategy by itself.
Q-learning works very similarly to us:
Observe the (discretized) state of the system
Select an action based on a strategy
If this action brought the system into a desirable state (closer to the goal), mark the action/initial state as more desirable
Repeat
Discretize your system's state.
For each state, have a map initialized quasi-randomly, which maps every state to an action (this is the strategy). Also assign a desirability to each state (initially, zero everywhere and 1000000 for the target state (X=0, V=0)).
Your state would be your 3 positions, 3 angles, 3 translation speeds, and 3 rotation speeds.
Your actions can be any combination of thrusters
Training
Train the AI (offline phase):
Generate many diverse situations
Apply the strategy
Evaluate the new state
Let the algo (see links above) reinforce the selected strategies' desirability value.
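As a rough sketch of the reinforcement step, here is the standard tabular Q-learning update together with a discretization and an epsilon-greedy action choice. Note that textbook Q-learning stores a value per (state, action) pair rather than one desirability per state as worded above; the cell size, learning rate `alpha`, and discount `gamma` are illustrative values, and the 1D (x, v) state is a simplification.

```python
import random


def discretize(x, v, cell=1.0):
    """Map a continuous 1D state (position, velocity) to a grid cell."""
    return (round(x / cell), round(v / cell))


def choose_action(q, state, n_actions, epsilon, rng):
    """Epsilon-greedy: mostly the best known action, sometimes a random one."""
    if rng.random() < epsilon or state not in q:
        return rng.randrange(n_actions)
    row = q[state]
    return max(range(n_actions), key=row.__getitem__)


def q_update(q, state, action, reward, next_state, n_actions, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max over a' of Q(next_state, a')."""
    row = q.setdefault(state, [0.0] * n_actions)
    next_row = q.setdefault(next_state, [0.0] * n_actions)
    target = reward + gamma * max(next_row)
    row[action] += alpha * (target - row[action])


# Two actions, e.g. 0 = all jets at min, 1 = all jets at max.
q = {(1, 0): [1.0, 2.0]}
q_update(q, state=(0, 0), action=1, reward=1.0, next_state=(1, 0), n_actions=2)
greedy = choose_action(q, (1, 0), n_actions=2, epsilon=0.0, rng=random.Random(0))
```

The offline phase is then a loop: generate a situation, pick an action with `choose_action`, simulate one step, and call `q_update` with a reward that favors states closer to the goal.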
Live usage in the game
After some time, a global strategy for navigation emerges. You then store it, and during your game loop you simply sample your strategy and apply it to each situation as they come up.
The strategy may still learn during this phase, but probably more slowly (because it happens real-time). (Btw, I dream of a game where the AI would learn from every user's feedback so we could collectively train it ^^)
Try this in a simple 1D problem, it devises a strategy remarkably quickly (a few seconds).
In 2D I believe excellent results could be obtained in an hour.
For 3D... You're looking at overnight computations. There are a few things to try to accelerate the process:
Try to never 'forget' previous computations, and feed them as an initial 'best guess' strategy. Save it to a file!
You might drop some states (like ship roll maybe?) without losing much navigation optimality but increasing computation speed greatly. Maybe change referentials so the ship is always on the X-axis, this way you'll drop x&y dimensions!
States more frequently encountered will have a reliable and very optimal strategy. Maybe normalize the state to make your ship state always close to a 'standard' state?
Typically, rotation speed intervals may be bounded safely (you don't want a ship tumbling wildly, so the strategy will always be to "un-wind" that speed). Of course rotation angles are additionally bounded.
You can also probably discretize non-linearly the positions because farther away from the objective, precision won't affect the strategy much.
For these kinds of problems there are two techniques available: brute-force search and heuristics. Brute force means treating the problem as a black box with input and output parameters, where the aim is to find the input parameters that win the game. To program such a brute-force search, the game physics runs in a simulation loop (physics simulation) and via tree search (minimax, alpha-beta pruning) every possibility is tried out. The disadvantage of brute-force search is the high CPU consumption.
The other technique utilizes knowledge about the game: knowledge about motion primitives and about evaluation. This knowledge is programmed in normal computer languages like C++ or Java. The disadvantage of this idea is that it is often difficult to capture the knowledge.
The best practice for solving spaceship navigation is to combine both ideas into a hybrid system. For the source code for this concrete problem, I estimate that nearly 2000 lines of code are necessary. These kinds of problems are normally handled within huge projects with many programmers and take about 6 months.

How do I handle uncertainty/missing data in an Artificial Neural Network?

The context:
I'm experimenting with using a feed-forward artificial neural network to create AI for a video game, and I've run into the problem that some of my input features are dependent upon the existence or value of other input features.
The most basic, simplified example I can think of is this:
feature 1 is the number of players (range 2...5)
feature 2 to ? is the score of each player (range >=0)
The number of features needed to inform the ANN of the scores is dependent on the number of players.
The question: How can I represent this dynamic knowledge input to an ANN?
Things I've already considered:
Simply not using such features, or consolidating them into static input.
I.e. using the sum of the players' scores instead. I seriously doubt this is applicable to my problem; it would result in the loss of too much information and the ANN would fail to perform well.
Passing in an error value (e.g. -1) or default value (e.g. 0) for non-existent input
I'm not sure how well this would work. In theory the ANN could easily learn from this input and model the function appropriately. In practice I'm worried about the sheer number of non-existent inputs causing problems for the ANN. For example, if the range of players was 2-10 and there were only 2 players, 80% of the input data would be non-existent and would introduce weird bias into the ANN, resulting in poor performance.
Passing in the mean value over the training set in place of non-existent input
Again, the amount of non-existent input would be a problem, and I'm worried this would introduce weird problems for discrete-valued inputs.
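For concreteness, the "error value or default value" option amounts to padding the scores out to a fixed length. A minimal sketch, where the function name and the -1 fill value are just examples:

```python
def encode_scores(scores, max_players, fill=-1.0):
    """Fixed-length input: player count, then one slot per possible player."""
    if not 2 <= len(scores) <= max_players:
        raise ValueError("number of players out of range")
    padded = list(scores) + [fill] * (max_players - len(scores))
    return [float(len(scores))] + [float(s) for s in padded]


two_player_input = encode_scores([10, 20], max_players=5)
```

Swapping `fill` for the training-set mean gives the third option; the input layer size stays the same either way.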
So, I'm asking this, does anybody have any other solutions I could think about? And is there a standard or commonly used method for handling this problem?
I know it's a rather niche and complicated question for SO, but I was getting bored of the "how do I fix this code?" and "how do I do this in PHP/Javascript?" questions :P, thanks guys.
It sounds like you have multiple data sets (for each number of players) that aren't really compatible with each other. Would lessons learned from a 5-player game really apply to a 2-player game? Try simplifying the problem, such as #1, and see how the program performs. In AI, absurd simplifications can sometimes give you a lot of traction, like bag of words in spam filters.
Try thinking about some model like the following:
Say xi (e.g. x1) is one of the inputs of which a variable number can exist. You can have up to n of these (x1 to xn). Let y be the rest of the inputs.
On your first hidden layer, pass x1 and y to the first c nodes, x1,x2 and y to the next c nodes, x1,x2,x3 and y to the next c nodes, and so on. This assumes x1 and x3 can't both be active without x2. The model will have to change appropriately if this needs to be possible.
The rest of the network is a standard feed-forward network with all nodes connected to all nodes of the next layer, or however you choose.
Whenever you have w active inputs, disable all but the wth set of c nodes (completely exclude them from training for that input set, don't include them when calculating the value for the nodes they output to, don't update the weights for their inputs or outputs). This will allow most of the network to train, but for the first hidden layer, only parts applicable to that number of inputs.
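A minimal sketch of the forward pass through such a first hidden layer, untested as a learning method just like the idea itself. Disabled groups simply output 0 here, which also means no gradient would flow into their weights during training. The tanh activation, the bias term, and the function names are my own choices.

```python
import math
import random


def make_groups(n, c, y_size, rng):
    """groups[w - 1] holds c weight vectors, each over (x1..xw, y, bias)."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(w + y_size + 1)]
             for _ in range(c)]
            for w in range(1, n + 1)]


def first_hidden_layer(xs, y, groups):
    """Only the group matching the number of active inputs len(xs) fires."""
    w = len(xs)
    inputs = list(xs) + list(y) + [1.0]  # trailing 1.0 feeds the bias weight
    outputs = []
    for index, group in enumerate(groups, start=1):
        for weights in group:
            if index == w:
                total = sum(wt * value for wt, value in zip(weights, inputs))
                outputs.append(math.tanh(total))
            else:
                outputs.append(0.0)  # disabled node: excluded from this pass
    return outputs


groups = make_groups(n=3, c=2, y_size=1, rng=random.Random(0))
hidden = first_hidden_layer([0.5, 0.2], [1.0], groups)
```

The output always has c*n entries, so the rest of the network can be an ordinary fully connected stack.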
I suggest choosing c such that c*n (the number of nodes in the first hidden layer) is greater than or equal to the number of nodes in the 2nd hidden layer, with c at least 10 for a moderately sized network (into the 100s is also fine). I also suggest the network have at least 2 other hidden layers (so 3 in total, excluding input and output). This is not from experience, just what my intuition tells me.
Whether this works depends on a certain (possibly undefinable) similarity between the different numbers of inputs, and it might not work well, if at all, if this similarity doesn't exist. It also probably requires quite a bit of training data for each number of inputs.
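To make the grouping concrete, here is a minimal sketch of just the first hidden layer in plain Python. The function names, the tanh activation and the random initial weights are assumptions for illustration, not part of the description above. Disabled nodes simply output 0, so they contribute nothing to the next layer; their incoming weights would still need to be skipped explicitly in the training code.

```python
import math
import random

def make_first_layer(n, c, y_size, seed=0):
    """Create n groups of c nodes; group k (1-based) sees x1..xk plus y."""
    rng = random.Random(seed)
    groups = []
    for k in range(1, n + 1):
        fan_in = k + y_size
        groups.append([[rng.uniform(-0.5, 0.5) for _ in range(fan_in)]
                       for _ in range(c)])
    return groups

def first_layer_forward(groups, xs, y):
    """xs holds the w active inputs; only the w-th group of nodes fires."""
    w = len(xs)
    if not 1 <= w <= len(groups):
        raise ValueError("number of active inputs out of range")
    out = []
    for k, group in enumerate(groups, start=1):
        for weights in group:
            if k == w:
                inputs = list(xs) + list(y)
                total = sum(wt * v for wt, v in zip(weights, inputs))
                out.append(math.tanh(total))
            else:
                out.append(0.0)  # disabled: contributes nothing downstream
    return out

# Example: n = 3 possible x inputs, c = 2 nodes per group, one y input.
groups = make_first_layer(n=3, c=2, y_size=1)
out = first_layer_forward(groups, [0.5, -0.2], [1.0])  # w = 2 active inputs
```

The output always has c*n entries, so the rest of the network can stay a standard feed-forward network regardless of how many inputs are active.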
If you try it, let me / us know if it works.
If you're interested in Artificial Intelligence discussions, I suggest joining a LinkedIn group dedicated to it; some are quite active and have interesting discussions. There doesn't seem to be much happening on Stack Overflow when it comes to Artificial Intelligence, or maybe we should just work to change that, or both.
UPDATE:
Here is a list of the names of a few decent Artificial Intelligence LinkedIn groups (unless they changed their policies recently, it should be easy enough to join):
'Artificial Intelligence Researchers, Faculty + Professionals'
'Artificial Intelligence Applications'
'Artificial Neural Networks'
'AGI — Artificial General Intelligence'
'Applied Artificial Intelligence' (not too much going on at the moment, and still dealing with some spam, but it is getting better)
'Text Analytics' (if you're interested in that)

How to program a neural network for chess?

I want to program a chess engine which learns to make good moves and win against other players. I've already coded a representation of the chess board and a function which outputs all possible moves. So I only need an evaluation function which says how good a given situation of the board is. Therefore, I would like to use an artificial neural network which should then evaluate a given position. The output should be a numerical value. The higher the value is, the better is the position for the white player.
My approach is to build a network of 385 neurons: There are six unique chess pieces and 64 fields on the board. So for every field we take 6 neurons (1 for every piece). If there is a white piece, the input value is 1. If there is a black piece, the value is -1. And if there is no piece of that sort on that field, the value is 0. In addition to that there should be 1 neuron for the player to move. If it is White's turn, the input value is 1 and if it's Black's turn, the value is -1.
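For reference, the 385-value input described above could be built roughly like this. This is only a sketch in Python rather than Delphi; the board representation (a dict from square index to piece letter, uppercase for White) and the function name are assumptions made for the example.

```python
PIECES = ["P", "N", "B", "R", "Q", "K"]  # the six piece types

def encode_position(board, white_to_move):
    """Turn a position into 64 * 6 + 1 = 385 input values.

    board: dict mapping square index 0..63 to a piece letter;
    uppercase = White, lowercase = Black (a convention assumed here).
    """
    inputs = [0] * (64 * 6 + 1)
    for square, piece in board.items():
        slot = square * 6 + PIECES.index(piece.upper())
        inputs[slot] = 1 if piece.isupper() else -1
    inputs[-1] = 1 if white_to_move else -1  # side-to-move neuron
    return inputs

# Two kings and a white pawn, White to move.
v = encode_position({4: "K", 60: "k", 12: "P"}, True)
```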
I think that configuration of the neural network is quite good. But the main part is missing: How can I implement this neural network into a coding language (e.g. Delphi)? I think the weights for each neuron should be the same in the beginning. Depending on the result of a match, the weights should then be adjusted. But how? I think I should let 2 computer players (both using my engine) play against each other. If White wins, Black gets the feedback that its weights aren't good.
So it would be great if you could help me implementing the neural network into a coding language (best would be Delphi, otherwise pseudo-code). Thanks in advance!
In case somebody randomly finds this page: given what we know now, what the OP proposes is almost certainly possible. In fact, we managed to do it for a game with a much larger state space, Go ( https://deepmind.com/research/case-studies/alphago-the-story-so-far ).
I don't see why you can't have a neural net for a static evaluator if you also do some classic mini-max lookahead with alpha-beta pruning. Lots of Chess engines use minimax with a braindead static evaluator that just adds up the pieces or something; it doesn't matter so much if you have enough levels of minimax. I don't know how much of an improvement the net would make but there's little to lose. Training it would be tricky though. I'd suggest using an engine that looks ahead many moves (and takes loads of CPU etc) to train the evaluator for an engine that looks ahead fewer moves. That way you end up with an engine that doesn't take as much CPU (hopefully).
Edit: I wrote the above in 2010, and now in 2020 Stockfish NNUE has done it. "The network is optimized and trained on the [classical Stockfish] evaluations of millions of positions at moderate search depth" and then used as a static evaluator, and in their initial tests they got an 80-elo improvement when using this static evaluator instead of their previous one (or, equivalently, the same elo with a little less CPU time). So yes it does work, and you don't even have to train the network at high search depth as I originally suggested: moderate search depth is enough, but the key is to use many millions of positions.
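A rough sketch of how a static evaluator plugs into minimax with alpha-beta pruning, in Python. The `moves` and `evaluate` callbacks are hypothetical placeholders: in a chess engine `moves` would come from the move generator and `evaluate` would be the neural network (or a piece-counting function). The toy tree at the end just stands in for a real game.

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Return the minimax value of state, searched to `depth` plies.

    moves(state)    -> list of successor states (empty if the game is over)
    evaluate(state) -> static score from White's point of view
    """
    children = moves(state)
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        best = float("-inf")
        for child in children:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, moves, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player will never allow this line
        return best
    best = float("inf")
    for child in children:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, moves, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return best

# Toy game tree: inner lists are positions, integers are leaf evaluations.
tree = [[3, 5], [2, 9], [0, 1]]
toy_moves = lambda s: s if isinstance(s, list) else []
toy_eval = lambda s: s if isinstance(s, int) else 0
value = alphabeta(tree, 2, float("-inf"), float("inf"), True,
                  toy_moves, toy_eval)  # value == 3
```

Swapping in a different evaluator changes nothing in the search itself, which is why there is little to lose by trying a network there.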
Been there, done that. Since there is no continuity in your problem (the value of a position is not closely related to that of another position that differs in only one input value), there is very little chance a NN would work. And it never did in my experiments.
I would rather see a simulated annealing system with an ad-hoc heuristic (of which there are plenty out there) to evaluate the value of the position...
However, if you are set on using a NN, it is relatively easy to represent. A general NN is simply a graph, with each node being a neuron. Each neuron has a current activation value, and a transition formula to compute the next activation value, based on input values, i.e. activation values of all the nodes that have a link to it.
A more classical NN, that is with an input layer, an output layer, identical neurons for each layer, and no time-dependency, can thus be represented by an array of input nodes, an array of output nodes, and a linked graph of nodes connecting those. Each node possesses a current activation value, and a list of nodes it forwards to. Computing the output value is simply setting the activations of the input neurons to the input values, and iterating through each subsequent layer in turn, computing the activation values from the previous layer using the transition formula. When you have reached the last (output) layer, you have your result.
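The layer-by-layer computation described above fits in a few lines. This is only a sketch: the data layout (each neuron stored as a list of weights plus a bias) and the tanh transition formula are choices made for the example, not the only way to do it.

```python
import math

def forward(layers, inputs):
    """Compute the output activations of a layered feed-forward network.

    layers: list of layers; each layer is a list of neurons;
    each neuron is (weights, bias), one weight per node in the previous layer.
    """
    activations = list(inputs)  # activations of the input layer
    for layer in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations  # activations of the last (output) layer

# 2 inputs -> 2 hidden neurons -> 1 output neuron
layers = [
    [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.0)],
    [([1.0, 1.0], 0.0)],
]
result = forward(layers, [0.5, -0.5])
```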
It is possible, but not trivial by any means.
https://erikbern.com/2014/11/29/deep-learning-for-chess/
He used a lot of computing power to train his evaluation function.
To summarize generally, you could go about it as follows. Your evaluation function is a feedforward NN. Let the matrix computations lead to a scalar output valuing how good the move is. The input vector for the network is the board state represented by all the pieces on the board, so say white pawn is 1, white knight is 2... and empty space is 0. An example board state input vector is simply a sequence of values from 0 to 12. This evaluation can be trained using grandmaster games (available in a FICS database, for example) for many games, minimizing loss between what the current parameters say is the highest valuation and what move the grandmasters made (which should have the highest valuation). This of course assumes that the grandmaster moves are correct and optimal.
What you need to train an ANN is either something like backpropagation learning or some form of a genetic algorithm. But chess is such a complex game that it is unlikely that a simple ANN will learn to play it, even more so if the learning process is unsupervised.
Further, your question does not say anything about the number of layers. You want to use 385 input neurons to encode the current situation. But how do you want to decide what to do? One neuron per field? Highest excitation wins? But there is often more than one possible move.
Further, you will need several hidden layers: the functions that can be represented with an input and an output layer without a hidden layer are really limited.
So I do not want to prevent you from trying it, but the chances of a successful implementation and training within, say, one year or so are practically zero.
I tried to build and train an ANN to play Tic-tac-toe when I was 16 years old or so ... and I failed. I would suggest trying such a simple game first.
The main problem I see here is one of training. You say you want your ANN to take the current board position and evaluate how good it is for a player. (I assume you will take every possible move for a player, apply it to the current board state, evaluate via the ANN and then take the one with the highest output - ie: hill climbing)
Your options as I see them are:
Develop some heuristic function to evaluate the board state and train the network off that. But that begs the question of why use an ANN at all, when you could just use your heuristic.
Use some statistical measure such as "How many games were won by white or black from this board configuration?", which would give you a fitness value between white and black. The difficulty with that is the amount of training data required for the size of your problem space.
With the second option you could always feed it board sequences from grandmaster games and hope there is enough coverage for the ANN to develop a solution.
Due to the complexity of the problem, I'd want to throw as large a network (i.e. lots of internal nodes) at it as I could without slowing down the training too much.
Your input algorithm is sound - all positions, all pieces, and both players are accounted for. You may need an input layer for every past state of the gameboard, so that past events are used as input again.
The output layer should (in some form) give the piece to move, and the location to move to.
Write a genetic algorithm using a connectome which contains all neuron weights and synapse strengths, and begin multiple separated gene pools with a large number of connectomes in each.
Make them play one another, keep the best handful, crossover and mutate the best connectomes to repopulate the pool.
Read Blondie24: http://www.amazon.co.uk/Blondie24-Playing-Kaufmann-Artificial-Intelligence/dp/1558607838.
It deals with checkers instead of chess but the principles are the same.
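The genetic-algorithm loop described above (play one another, keep the best handful, crossover and mutate) could look roughly like this in Python. It is a sketch for a single gene pool; the `play` callback is hypothetical and would be a full game between two networks in practice. It assumes at least 2 survivors and weight vectors of length 2 or more.

```python
import random

def evolve(population, play, generations, keep=4, mutation=0.1, rng=None):
    """Evolve a pool of weight vectors ("connectomes").

    play(a, b) -> 1 if a wins, -1 if b wins, 0 for a draw (hypothetical).
    """
    rng = rng or random.Random()
    size = len(population)
    for _ in range(generations):
        # Round-robin tournament: everybody plays everybody, both ways round.
        scores = [0] * size
        for i in range(size):
            for j in range(size):
                if i != j:
                    result = play(population[i], population[j])
                    scores[i] += result
                    scores[j] -= result
        ranked = sorted(range(size), key=lambda i: scores[i], reverse=True)
        best = [population[i] for i in ranked[:keep]]
        # Refill the pool with mutated crossovers of the survivors.
        children = []
        while len(best) + len(children) < size:
            mum, dad = rng.sample(best, 2)
            cut = rng.randrange(1, len(mum))
            child = mum[:cut] + dad[cut:]                        # crossover
            child = [w + rng.gauss(0, mutation) for w in child]  # mutation
            children.append(child)
        population = best + children
    return population

# Toy stand-in for a game: the vector with the larger sum "wins".
toy_play = lambda a, b: (sum(a) > sum(b)) - (sum(a) < sum(b))
rng = random.Random(1)
pool = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
final = evolve(pool, toy_play, generations=5, rng=rng)
```

Running several separated pools, as suggested, would just mean calling this once per pool.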
Came here to say what Silas said. Using a minimax algorithm, you can expect to be able to look ahead N moves. Using alpha-beta pruning, you can expand that to theoretically 2*N moves, but more realistically 4*N/3 moves. Neural networks are really appropriate here.
Perhaps though a genetic algorithm could be used.

What is fuzzy logic?

I'm working with a couple of AI algorithms at school and I find people use the words Fuzzy Logic to explain any situation that they can solve with a couple of cases. When I go back to the books I just read about how instead of a state going from On to Off it's a diagonal line and something can be in both states but in different "levels".
I've read the wikipedia entry and a couple of tutorials and even programmed stuff that "uses fuzzy logic" (an edge detector and a 1-wheel self-controlled robot) and still I find it very confusing going from Theory to Code... for you, in the less complicated definition, what is fuzzy logic?
Fuzzy logic is logic where state membership is, essentially, a float with range 0..1 instead of an int 0 or 1. The mileage you get out of it is that things like, for example, the changes you make in a control system are somewhat naturally more fine-tuned than what you'd get with naive binary logic.
An example might be logic that throttles back system activity based on active TCP connections. Say you define "a little bit too many" TCP connections on your machine as 1000 and "a lot too many" as 2000. At any given time, your system has a "too many TCP connections" state from 0 (<= 1000) to 1 (>= 2000), which you can use as a coefficient in applying whatever throttling mechanisms you have available. This is much more forgiving and responsive to system behavior than naive binary logic that only knows how to determine "too many", and throttle completely, or "not too many", and not throttle at all.
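In code, that coefficient is just a clamped ramp. A small Python sketch follows; the linear ramp between 1000 and 2000 and the `throttled_rate` helper are assumptions for illustration, since the shape between the two thresholds isn't specified above.

```python
def too_many_connections(active, low=1000, high=2000):
    """Degree (0.0..1.0) to which `active` counts as "too many"."""
    degree = (active - low) / (high - low)
    return max(0.0, min(1.0, degree))

def throttled_rate(base_rate, active):
    """Scale a hypothetical request rate down as the membership rises."""
    return base_rate * (1.0 - too_many_connections(active))

half = too_many_connections(1500)   # halfway between the two thresholds
rate = throttled_rate(100, 1500)    # half of the base rate
```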
I'd like to add to the answers (that have been modded up) that a good way to visualize fuzzy logic is as follows:
Traditionally, with binary logic, you would have a graph whose membership function is either true or false (a step between 0 and 1), whereas in a fuzzy logic system the membership function can take any value in between.
1|
 |   /\
 |  /  \
 | /    \
0|/      \
 ------------
   a b  c  d
Assume for a second that the function is "likes peanuts"
a. kinda likes peanuts
b. really likes peanuts
c. kinda likes peanuts
d. doesn't like peanuts
The function itself doesn't have to be triangular and often isn't (it's just easier with ascii art).
A fuzzy system will likely have many of these, some even overlapping (even opposites) like so:
1|   A   B
 |   /\  /\      A = Likes Peanuts
 |  /  \/  \     B = Doesn't Like Peanuts
 | /   /\   \
0|/   /  \   \
 ---------------
   a b c  d
so now c is "kinda likes peanuts, kinda doesn't like peanuts" and d is "really doesn't like peanuts"
And you can program accordingly based on that info.
Hope this helps for the visual learners out there.
The best definition of fuzzy logic is given by its inventor Lotfi Zadeh:
“Fuzzy logic is a means of representing problems to computers in a way akin to the way humans solve them, and the essence of fuzzy logic is that everything is a matter of degree.”
What it means to solve problems on a computer the way humans solve them can be explained with a simple example from a basketball game. If a player wants to guard another player, he first considers how tall that player is and how good his playing skills are. If the player he wants to guard is tall and plays very slowly relative to him, he uses his instinct to decide whether to guard that player, because the situation is uncertain to him. The important point in this example is that the properties are relative to the player, and that the rival player's height and playing skill each come in degrees. Fuzzy logic provides a deterministic way to handle this uncertain situation.
Fuzzy logic is processed in several steps (Figure 1). First comes fuzzification, where crisp inputs are converted to fuzzy inputs. Second, these inputs are processed with fuzzy rules to create a fuzzy output. Last comes defuzzification, which turns the degrees of the outputs into a single result, since in fuzzy logic there can be more than one result, each with a different degree.
Figure 1 – Fuzzy Process Steps (David M. Bourg P.192)
To exemplify the fuzzy process steps, the previous basketball situation can be used. As mentioned in the example, the rival player is 1.87 meters tall, which is quite tall relative to our player, and can dribble at 3 m/s, which is slow relative to our player. In addition to these data, some rules are needed, called fuzzy rules, such as:
If player is short but not fast then guard
If player is fast but not short then don’t guard
If player is tall then don’t guard
If player is average tall and average fast then guard
Figure 2 – how tall
Figure 3- how fast
According to the rules and the input data, the fuzzy system creates an output such as: the degree for guard is 0.7, the degree for sometimes guard is 0.4, and the degree for never guard is 0.2.
Figure 4-output fuzzy sets
The last step, defuzzification, creates a crisp output: a number that may determine the energy we should use to guard the player during the game. The centre of mass is a common method for creating the output. In this phase, the weights used to calculate the mean point depend entirely on the implementation. In this application, a high weight is given to guard and to not guard, and a low weight to sometimes guard. (David M. Bourg, 2004)
Figure 5- fuzzy output (David M. Bourg P.204)
Output = [0.7 * (-10) + 0.4 * 1 + 0.2 * 10] / (0.7 + 0.4 + 0.2) ≈ -3.5
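The same weighted average, written out as a small Python sketch. The function name is made up for the example; the degrees and the positions -10, 1 and 10 are the ones from the formula above.

```python
def defuzzify(degrees, positions):
    """Weighted average of the set positions, weighted by membership degree."""
    total = sum(degrees)
    if total == 0:
        raise ValueError("no rule fired")
    return sum(d * p for d, p in zip(degrees, positions)) / total

# guard = 0.7 at -10, sometimes guard = 0.4 at 1, never guard = 0.2 at 10
output = defuzzify([0.7, 0.4, 0.2], [-10, 1, 10])  # about -3.5
```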
In short, fuzzy logic is used to make a decision under uncertainty and to find the degree of that decision. The problem with fuzzy logic is that the number of rules increases exponentially as the number of inputs increases.
For more information and a possible application in a game, I wrote a little article; check it out.
To build off of chaos' answer, a formal logic is nothing but an inductively defined set that maps sentences to a valuation. At least, that's how a model theorist thinks of logic. In the case of a sentential boolean logic:
(basis clause) For all A, v(A) in {0,1}
(iterative) For the following connectives,
v(!A) = 1 - v(A)
v(A & B) = min{v(A), v(B)}
v(A | B) = max{v(A), v(B)}
(closure) All sentences in a boolean sentential logic are evaluated per above.
A fuzzy logic would be inductively defined with one change:
(basis clause) For all A, v(A) in [0,1]
(iterative) For the following connectives,
v(!A) = 1 - v(A)
v(A & B) = min{v(A), v(B)}
v(A | B) = max{v(A), v(B)}
(closure) All sentences in a fuzzy sentential logic are evaluated per above.
Notice the only difference in the underlying logic is the permission to evaluate a sentence as having, say, the "truth value" of 0.5. An important question for a fuzzy logic model is the threshold that counts for truth satisfaction. That is to ask: for a valuation v(A), for what value D is it the case that v(A) > D means that A is satisfied?
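With my coder hat on, the clauses above translate directly into a few Python functions. This is a sketch; the function names and the example threshold of 0.5 are mine. Feeding in only 0 and 1 gives back ordinary boolean logic.

```python
def f_not(a):
    return 1.0 - a

def f_and(a, b):
    return min(a, b)

def f_or(a, b):
    return max(a, b)

def satisfied(value, threshold):
    """A sentence counts as satisfied when its valuation exceeds threshold D."""
    return value > threshold

# Boolean inputs behave classically ...
classical = (f_and(1, 0), f_or(1, 0), f_not(1))
# ... while values in between are carried through.
mixed = f_or(f_and(0.3, 0.8), f_not(0.9))
ok = satisfied(f_or(0.3, 0.8), 0.5)
```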
If you really want to find out more about non-classical logics like fuzzy logic, I would recommend either An Introduction to Non-Classical Logic: From If to Is or Possibilities and Paradox.
Putting my coder hat back on, I would be careful with the use of fuzzy logic in real world programming, because of the tendency for a fuzzy logic to be undecidable. Maybe it's too much complexity for little gain. For instance a supervaluational logic may do just fine to help a program model vagueness. Or maybe probability would be good enough. In short, I need to be convinced that the domain model dovetails with a fuzzy logic.
Maybe an example clears up what the benefits can be:
Let's say you want to make a thermostat and you want it to be 24 degrees.
This is how you'd implement it using boolean logic:
Rule1: heat up at full power when it's colder than 21 degrees.
Rule2: cool down at full power when it's warmer than 27 degrees.
Such a system will only once in a while be 24 degrees, and it will be very inefficient.
Now, using fuzzy logic, it would be something like this:
Rule1: For each degree that it's colder than 24 degrees, turn up the heater one notch (0 at 24).
Rule2: For each degree that it's warmer than 24 degrees, turn up the cooler one notch (0 at 24).
This system will always be somewhere around 24 degrees, and it will only once in a while make a tiny adjustment. It will also be more energy-efficient.
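Side by side in Python, the two sets of rules might look like this. It is only a sketch: the function names are made up, and the size of one "notch" (10% of full power per degree) is an assumption.

```python
def boolean_thermostat(temp):
    """Return (heater, cooler) power in 0.0..1.0 using the on/off rules."""
    if temp < 21:
        return 1.0, 0.0
    if temp > 27:
        return 0.0, 1.0
    return 0.0, 0.0

def graded_thermostat(temp, target=24.0, notch=0.1):
    """One `notch` of power per degree away from target, capped at full."""
    error = target - temp
    heater = min(1.0, max(0.0, error) * notch)
    cooler = min(1.0, max(0.0, -error) * notch)
    return heater, cooler

at_22_boolean = boolean_thermostat(22)  # does nothing until it drops below 21
at_22_graded = graded_thermostat(22)    # already heating gently
```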
Well, you could read the works of Bart Kosko, one of the 'founding fathers'. 'Fuzzy Thinking: The New Science of Fuzzy Logic' from 1994 is readable (and available quite cheaply secondhand via Amazon). Apparently, he has a newer book 'Noise' from 2006 which is also quite approachable.
Basically though (in my paraphrase - not having read the first of those books for several years now), fuzzy logic is about how to deal with the world where something is perhaps 10% cool, 50% warm, and 10% hot, where different decisions may be made on the degree to which the different states are true (and no, it wasn't entirely an accident that those percentages don't add up to 100% - though I'd accept correction if needed).
A very good explanation, with the help of Fuzzy Logic Washing Machines.
I know what you mean about it being difficult to go from concept to code. I'm writing a scoring system that looks at the values of sysinfo and /proc on Linux systems and comes up with a number between 0 and 10, 10 being the absolute worst. A simple example:
You have 3 load averages (1, 5, 15 minute) with (at least) three possible states, good, getting bad, bad. Expanding that, you could have six possible states per average, adding 'about to' to the three that I just noted. Yet, the result of all 18 possibilities can only deduct 1 from the score. Repeat that with swap consumed, actual VM allocated (committed) memory and other stuff .. and you have one big bowl of conditional spaghetti :)
It's as much a definition as it is an art; how you implement the decision-making process is always more interesting than the paradigm itself, whereas in a boolean world, it's rather cut and dried.
It would be very easy for me to say if load1 < 2 deduct 1, but not very accurate at all.
If you can teach a program to do what you would do when evaluating some set of circumstances and keep the code readable, you have implemented a good example of fuzzy logic.
Fuzzy Logic is a problem-solving methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PC or workstation-based data acquisition and control systems. It can be implemented in hardware, software, or a combination of both. Fuzzy Logic provides a simple way to arrive at a definite conclusion based upon vague, ambiguous, imprecise, noisy, or missing input information. Fuzzy Logic approach to control problems mimics how a person would make decisions, only much faster.
Fuzzy logic has proved to be particularly useful in expert system and other artificial intelligence applications. It is also used in some spell checkers to suggest a list of probable words to replace a misspelled one.
To learn more, just check out: http://en.wikipedia.org/wiki/Fuzzy_logic.
The following is sort of an empirical answer.
A simple (possibly simplistic) answer is that "fuzzy logic" is any logic that returns values other than straight true / false, or 1 / 0. There are a lot of variations on this and they tend to be highly domain specific.
For example, in my previous life I did search engines that used "content similarity searching" as opposed to then common "boolean search". Our similarity system used the Cosine Coefficient of weighted-attribute vectors representing the query and the documents and produced values in the range 0..1. Users would supply "relevance feedback" which was used to shift the query vector in the direction of desirable documents. This is somewhat related to the training done in certain AI systems where the logic gets "rewarded" or "punished" for results of trial runs.
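As a rough illustration of that kind of scoring, here is a Python sketch of the cosine coefficient plus a very simple feedback step. The `shift_query` helper and its `rate` parameter are hypothetical and are not the method the original system used. With non-negative weights the score stays in the range 0..1.

```python
import math

def cosine(u, v):
    """Cosine coefficient of two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def shift_query(query, liked_doc, rate=0.5):
    """Move the query part of the way toward a document the user liked."""
    return [q + rate * (d - q) for q, d in zip(query, liked_doc)]

query = [1.0, 0.0]
doc = [0.0, 1.0]
before = cosine(query, doc)
after = cosine(shift_query(query, doc), doc)  # higher than before
```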
Right now Netflix is running a competition to find a better suggestion algorithm for their company. See http://www.netflixprize.com/. Effectively all of the algorithms could be characterized as "fuzzy logic"
Fuzzy logic is a calculation method based on a human-like way of thinking. It is particularly useful when there is a large number of input variables. One online fuzzy logic calculator for two input variables is given here:
http://www.cirvirlab.com/simulation/fuzzy_logic_calculator.php

Resources