For each of the following regexes, draw a DFA recognizing the corresponding language

I want to draw a DFA for each of the following regular expressions.
The first one is
(0|1)*110*
The second one is
(1|110)*0
I have drawn lots of diagrams, but none of them turns out to be a deterministic finite automaton.
How can I construct these?
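One possible pair of DFAs for these two expressions can be written down as transition tables and checked by simulation. This is a sketch, not taken from the thread: the state names (q0..q3, s0..s4, dead) and the helper accepts are made up for illustration.

```python
# Transition tables: state -> {symbol: next_state}. All state names are hypothetical.

DFA_A = {  # intended to recognise (0|1)*110*
    "start": "q0",
    "accept": {"q2", "q3"},
    "delta": {
        "q0": {"0": "q0", "1": "q1"},  # no useful suffix seen yet
        "q1": {"0": "q0", "1": "q2"},  # input ends in a single 1
        "q2": {"0": "q3", "1": "q2"},  # input ends in 11
        "q3": {"0": "q3", "1": "q1"},  # input ends in 11 followed by one or more 0s
    },
}

DFA_B = {  # intended to recognise (1|110)*0
    "start": "s0",
    "accept": {"s3", "s4"},
    "delta": {
        "s0": {"0": "s4", "1": "s1"},
        "s1": {"0": "s4", "1": "s2"},
        "s2": {"0": "s3", "1": "s2"},
        "s3": {"0": "s4", "1": "s1"},  # this 0 may close a "110" or be the final 0
        "s4": {"0": "dead", "1": "dead"},  # final 0 read, nothing may follow
        "dead": {"0": "dead", "1": "dead"},
    },
}


def accepts(dfa, text):
    """Run the DFA over text (a string of '0' and '1') and report acceptance."""
    state = dfa["start"]
    for ch in text:
        state = dfa["delta"][state][ch]
    return state in dfa["accept"]
```

Every state has exactly one outgoing transition per symbol, which is what makes the tables deterministic. Drawing the diagram is then a matter of one circle per state and one arrow per table entry.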

Related

How to design a DFA that accepts basic arithmetic expressions

For my university task I must design a deterministic finite automaton (DFA) which recognises basic arithmetic. We're basically building a very basic lexical analyzer.
The DFA uses the operators "+", "-", "*" and "/".
The DFA handles only positive numbers, so expressions like "-1+1" and "+1+1" aren't accepted.
It can accept decimals, but only when they start with 0. So "0.3415" is accepted while "1.3415" is not.
Finally it can accept just a "0" by itself.
I'm confused about the best way to approach this. I have a basic foundation of DFAs and NFAs so can someone please just give me some hints as to how I should start?
My current approach is to draw some small DFAs. One for decimal numbers, one for whole numbers, one for operators, and one that's just a 0. Then I want to concatenate them and do the union of the smaller DFAs to create one big NFA and end it by converting back to a DFA.
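As a hint at what the final automaton might look like, here is a sketch of a hand-written DFA for the rules listed above. It rests on assumptions the question does not settle: whole numbers have no leading zeros, a decimal needs at least one digit after the point, and whitespace is not allowed. The state names and the function accepts_expression are invented for this example.

```python
DIGITS = "0123456789"
NONZERO = "123456789"
OPERATORS = "+-*/"
ACCEPTING = {"ZERO", "INT", "FRAC"}  # the input may only end right after a number


def step(state, ch):
    """Return the next state; anything not listed falls into the DEAD state."""
    if state == "NUM_START":  # at the start, or right after an operator
        if ch == "0":
            return "ZERO"
        if ch in NONZERO:
            return "INT"
    elif state == "ZERO":  # a lone 0, possibly the start of 0.xxx
        if ch == ".":
            return "DOT"
        if ch in OPERATORS:
            return "NUM_START"
    elif state == "INT":  # whole number starting with 1-9
        if ch in DIGITS:
            return "INT"
        if ch in OPERATORS:
            return "NUM_START"
    elif state == "DOT":  # "0." seen, a digit must follow
        if ch in DIGITS:
            return "FRAC"
    elif state == "FRAC":  # "0." followed by at least one digit
        if ch in DIGITS:
            return "FRAC"
        if ch in OPERATORS:
            return "NUM_START"
    return "DEAD"


def accepts_expression(text):
    state = "NUM_START"
    for ch in text:
        state = step(state, ch)
    return state in ACCEPTING
```

Under these assumptions, accepts_expression("0.3415") and accepts_expression("12+0.5*3") return True, while "1.3415", "-1+1" and "1+" are rejected.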

NFA to DFA conversion = deterministic?

I am struggling a bit with the meaning of determinism and nondeterminism. I get the difference when it comes to automata, but I can't seem to find an answer for the following: Is an NFA to DFA transformation deterministic?
If multiple DFAs can be constructed for the same regular language, does that mean that the result of an NFA to DFA transformation is not unique? And is it thus a nondeterministic algorithm?
I'm happy with any information you guys might be able to provide.
Thanks in advance!
There are two different concepts at play here. First, you are correct that there can be many different DFAs equivalent to the same NFA, just as there can be many NFAs that are all equivalent to one another.
Independently, there are several algorithms for converting an NFA into a DFA. The standard algorithm taught in most introductory classes on formal languages is the subset construction (also called the powerset construction). That algorithm is deterministic - there's a specific sequence of steps to follow to convert an NFA to a DFA, and accordingly you'll always get back the same DFA whenever you feed in the same NFA. You could conceivably have a nondeterministic algorithm for converting an NFA to a DFA, where the algorithm might produce one of many different DFAs as output, but to the best of my knowledge there aren't any famous algorithms of this sort.
Hope this helps!
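To make the point about determinism concrete, here is a minimal sketch of the subset construction, assuming an NFA without epsilon transitions that is given as a dictionary from (state, symbol) to a set of states. The function names and the data layout are choices made for this example.

```python
from collections import deque


def subset_construction(nfa_delta, start, accepting, alphabet):
    """Build a DFA whose states are frozensets of NFA states.

    nfa_delta maps (state, symbol) -> set of next states; missing keys mean
    "no transition". Epsilon transitions are not handled in this sketch.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    dfa_accepting = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:  # contains at least one accepting NFA state
            dfa_accepting.add(current)
        for symbol in alphabet:
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start_set, dfa_delta, dfa_accepting


def run_dfa(start_set, dfa_delta, dfa_accepting, text):
    state = start_set
    for ch in text:
        state = dfa_delta[(state, ch)]
    return state in dfa_accepting
```

There is no random choice anywhere in the loop: the same NFA always yields the same set-labelled DFA. Other DFAs for the same language exist (for example a minimized one), but this procedure never produces them for this input.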
DFA means deterministic finite automaton, whereas NFA means nondeterministic finite automaton.
In a DFA, every state has exactly one transition for each input symbol. For example, if {a, b} is the input alphabet for the given question, then every state has exactly one transition on a and one on b. Such an automaton is known as a deterministic finite automaton.
In an NFA, a state does not need a transition for every input symbol, and it may have several transitions on the same symbol.
An NFA may also have epsilon transitions, and it does not need an explicit dead state.
An NFA often needs fewer states than an equivalent DFA. Every DFA is also an NFA, but not every NFA is a DFA.

Algorithm for voice comparison

Given two recorded voices in digital format, is there an algorithm to compare the two and return a coefficient of similarity?
I recommend taking a look at the HTK toolkit for speech recognition, http://htk.eng.cam.ac.uk/, especially the part on feature extraction.
Features that I would assume to be good indicators:
Mel-Cepstrum coefficients (general timbre)
LPC (for the harmonics)
Given your clarification I think what you are looking for falls under speech recognition algorithms.
Even though you are only looking for the measure of similarity and not trying to turn speech into text, still the concepts are the same and I would not be surprised if a large part of the algorithms would be quite useful.
However, you will have to define this coefficient of similarity more formally and precisely to get anywhere.
EDIT:
I believe speech recognition algorithms would be useful because they do abstraction of the sound and comparison to some known forms. Conceptually this might not be that different from taking two recordings, abstracting them and comparing them.
From the Wikipedia article on HMMs:
"In speech recognition, the hidden Markov model would output a sequence of n-dimensional real-valued vectors (with n being a small integer, such as 10), outputting one of these every 10 milliseconds. The vectors would consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most significant) coefficients."
So if you run such an algorithm on both recordings you would end up with coefficients that represent the recordings and it might be far easier to measure and establish similarities between the two.
But again now you come to the question of defining the 'similarity coefficient' and introducing dogs and horses did not really help.
(Well it does a bit, but in terms of evaluating algorithms and choosing one over another, you will have to do better).
There are many different algorithms - the general name for this task is Speaker Identification - start with this Wikipedia page and work from there: http://en.wikipedia.org/wiki/Speaker_recognition
I'm not sure this will work for sound files, but I hope it gives you an idea of how to proceed. It is a basic way to find a pattern (image) in another image.
You first have to calculate the FFT of both sound files and then do a correlation. As a formula it would look like this (pseudocode):
fftSoundFile1 = fft(soundFile1);
fftConjSoundFile2 = conj(fft(soundFile2));
result_corr = real(ifft(fftSoundFile1.*fftConjSoundFile2));
Where fft = fast Fourier transform, ifft = inverse FFT, conj = complex conjugate.
The FFT is performed on the sample values of the sound files.
The peaks in the result_corr vector will then give you the positions of high correlation.
Note that both sound files must be the same size in this case; otherwise you have to zero-pad the shorter one to the length of the longer one.
Regards
Edit: .* means (in matlab style) a component wise mult, you must not do a vector mult!
Next Edit: Note that you have to operate with complex numbers - but there are several Complex classes out there so I think you don't have to bother about this.
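The FFT correlation from the answer above can be sketched in plain Python. This is an illustration only: it uses a small recursive radix-2 FFT written for the example rather than a real signal-processing library, it zero-pads both inputs to a power of two, and the function names are made up. Real recordings would need a proper FFT implementation and some normalisation before the peak height means anything as a similarity score.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(x):
    """Inverse FFT via the conjugate trick: ifft(x) = conj(fft(conj(x))) / n."""
    n = len(x)
    y = fft([v.conjugate() for v in x])
    return [v.conjugate() / n for v in y]


def cross_correlate(a, b):
    """Return real(ifft(fft(a) .* conj(fft(b)))) after zero-padding.

    Padding to at least len(a) + len(b) keeps the circular correlation from
    wrapping around. Index k of the result is the score for shifting b
    right by k samples.
    """
    n = 1
    while n < len(a) + len(b):
        n *= 2
    fa = fft(list(a) + [0.0] * (n - len(a)))
    fb = fft(list(b) + [0.0] * (n - len(b)))
    product = [u * v.conjugate() for u, v in zip(fa, fb)]  # component-wise
    return [v.real for v in ifft(product)]
```

For example, correlating [0, 0, 1, 2, 3] with [1, 2, 3] gives its largest value at index 2, the offset at which the short pattern occurs in the longer signal.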

How to program a neural network for chess?

I want to program a chess engine which learns to make good moves and win against other players. I've already coded a representation of the chess board and a function which outputs all possible moves. So I only need an evaluation function which says how good a given situation of the board is. Therefore, I would like to use an artificial neural network which should then evaluate a given position. The output should be a numerical value. The higher the value is, the better is the position for the white player.
My approach is to build a network of 385 neurons: There are six unique chess pieces and 64 fields on the board. So for every field we take 6 neurons (1 for every piece). If there is a white piece, the input value is 1. If there is a black piece, the value is -1. And if there is no piece of that sort on that field, the value is 0. In addition to that there should be 1 neuron for the player to move. If it is White's turn, the input value is 1 and if it's Black's turn, the value is -1.
I think that configuration of the neural network is quite good. But the main part is missing: How can I implement this neural network into a coding language (e.g. Delphi)? I think the weights for each neuron should be the same in the beginning. Depending on the result of a match, the weights should then be adjusted. But how? I think I should let 2 computer players (both using my engine) play against each other. If White wins, Black gets the feedback that its weights aren't good.
So it would be great if you could help me implementing the neural network into a coding language (best would be Delphi, otherwise pseudo-code). Thanks in advance!
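The 385-value input encoding described in the question could look roughly like this. The sketch is in Python rather than Delphi, and the board format (a dictionary from square index 0..63 to a piece letter, uppercase for White and lowercase for Black) is an assumption made for the example, not something the question specifies.

```python
PIECE_TYPES = "PNBRQK"  # pawn, knight, bishop, rook, queen, king
NUM_INPUTS = 64 * len(PIECE_TYPES) + 1  # 384 piece inputs + 1 side-to-move input


def encode_position(board, white_to_move):
    """Turn a position into the list of 385 input values.

    board: dict mapping square index (0..63) to a piece letter, e.g. {4: "K"}.
    Each square owns 6 consecutive inputs, one per piece type:
    +1 for a white piece, -1 for a black piece, 0 otherwise.
    """
    inputs = [0] * NUM_INPUTS
    for square, piece in board.items():
        index = square * len(PIECE_TYPES) + PIECE_TYPES.index(piece.upper())
        inputs[index] = 1 if piece.isupper() else -1
    inputs[-1] = 1 if white_to_move else -1
    return inputs
```

With this layout, a white king on square 4 sets input 4 * 6 + 5 = 29 to 1, and the last input carries the side to move.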
In case somebody randomly finds this page. Given what we know now, what the OP proposes is almost certainly possible. In fact we managed to do it for a game with much larger state space - Go ( https://deepmind.com/research/case-studies/alphago-the-story-so-far ).
I don't see why you can't have a neural net for a static evaluator if you also do some classic mini-max lookahead with alpha-beta pruning. Lots of Chess engines use minimax with a braindead static evaluator that just adds up the pieces or something; it doesn't matter so much if you have enough levels of minimax. I don't know how much of an improvement the net would make but there's little to lose. Training it would be tricky though. I'd suggest using an engine that looks ahead many moves (and takes loads of CPU etc) to train the evaluator for an engine that looks ahead fewer moves. That way you end up with an engine that doesn't take as much CPU (hopefully).
Edit: I wrote the above in 2010, and now in 2020 Stockfish NNUE has done it. "The network is optimized and trained on the [classical Stockfish] evaluations of millions of positions at moderate search depth" and then used as a static evaluator, and in their initial tests they got an 80-elo improvement when using this static evaluator instead of their previous one (or, equivalently, the same elo with a little less CPU time). So yes it does work, and you don't even have to train the network at high search depth as I originally suggested: moderate search depth is enough, but the key is to use many millions of positions.
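The combination described above, minimax with alpha-beta pruning around a pluggable static evaluator, can be sketched generically. The callbacks children and evaluate are placeholders: in a chess engine they would be the move generator and the neural network (or piece-counting) evaluator. The toy tree of nested lists exists only to make the example runnable.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax value of state with alpha-beta pruning.

    children(state) returns the successor states (empty if the game is over);
    evaluate(state) is the static evaluator, higher is better for the maximizer.
    """
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)
    if maximizing:
        best = -math.inf
        for child in successors:
            value = alphabeta(child, depth - 1, alpha, beta, False, children, evaluate)
            best = max(best, value)
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer will never allow this line
                break
        return best
    best = math.inf
    for child in successors:
        value = alphabeta(child, depth - 1, alpha, beta, True, children, evaluate)
        best = min(best, value)
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer already has a better option
            break
    return best


# Toy game tree: inner nodes are lists, leaves are already-evaluated numbers.
def toy_children(node):
    return node if isinstance(node, list) else []


def toy_evaluate(node):
    return node  # only called on leaves when the search depth covers the tree
```

Calling alphabeta([[3, 5], [2, 9]], 2, -math.inf, math.inf, True, toy_children, toy_evaluate) returns 3: the minimizer would answer the first move with 3 and the second with 2, so the maximizer picks the first.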
Been there, done that. Since there is no continuity in your problem (the value of a position is not closely related to that of another position that differs in only one input), there is very little chance a NN would work. And it never did in my experiments.
I would rather see a simulated annealing system with an ad-hoc heuristic (of which there are plenty out there) to evaluate the value of the position...
However, if you are set on using a NN, it is relatively easy to represent. A general NN is simply a graph, with each node being a neuron. Each neuron has a current activation value, and a transition formula to compute the next activation value, based on input values, i.e. activation values of all the nodes that have a link to it.
A more classical NN, that is with an input layer, an output layer, identical neurons for each layer, and no time-dependency, can thus be represented by an array of input nodes, an array of output nodes, and a linked graph of nodes connecting those. Each node possesses a current activation value, and a list of nodes it forwards to. Computing the output value is simply setting the activations of the input neurons to the input values, and iterating through each subsequent layer in turn, computing the activation values from the previous layer using the transition formula. When you have reached the last (output) layer, you have your result.
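The layer-by-layer computation described above can be sketched in a few lines. The representation (each layer as a pair of a weight matrix and a bias list) and the tanh transition formula are choices made for this example; the answer does not prescribe either.

```python
import math


def forward(layers, inputs):
    """Compute the output activations of a layered feedforward network.

    layers: list of (weights, biases) pairs, one per layer after the input
    layer. weights[j][i] is the weight from neuron i of the previous layer
    to neuron j of this layer.
    """
    activations = list(inputs)  # set the input neurons to the input values
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return activations  # activations of the last (output) layer
```

For a position evaluator, the last layer would contain a single neuron, so forward(...)[0] is the score of the position.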
It is possible, but not trivial by any means.
https://erikbern.com/2014/11/29/deep-learning-for-chess/
He used a lot of computing power to train his evaluation function.
To summarize generally, you could go about it as follows. Your evaluation function is a feedforward NN. Let the matrix computations lead to a scalar output valuing how good the move is. The input vector for the network is the board state represented by all the pieces on the board, so say white pawn is 1, white knight is 2... and empty space is 0. An example board state input vector is simply a sequence of values from 0 to 12. This evaluation can be trained using grandmaster games (available from a FICS database, for example) for many games, minimizing the loss between what the current parameters say is the highest valuation and what move the grandmasters made (which should have the highest valuation). This of course assumes that the grandmaster moves are correct and optimal.
What you need to train an ANN is either something like backpropagation learning or some form of a genetic algorithm. But chess is such a complex game that it is unlikely that a simple ANN will learn to play it, even more so if the learning process is unsupervised.
Further, your question does not say anything about the number of layers. You want to use 385 input neurons to encode the current situation. But how do you want to decide what to do? One neuron per field? Highest excitation wins? But there is often more than one possible move.
Further, you will need several hidden layers. The functions that can be represented with an input and an output layer without a hidden layer are really limited.
So I do not want to prevent you from trying it, but the chances of a successful implementation and training within, say, one year or so are practically zero.
I tried to build and train an ANN to play Tic-tac-toe when I was 16 years old or so ... and I failed. I would suggest trying such a simple game first.
The main problem I see here is one of training. You say you want your ANN to take the current board position and evaluate how good it is for a player. (I assume you will take every possible move for a player, apply it to the current board state, evaluate via the ANN and then take the one with the highest output - ie: hill climbing)
Your options as I see them are:
Develop some heuristic function to evaluate the board state and train the network off that. But that begs the question of why use an ANN at all, when you could just use your heuristic.
Use some statistical measure such as "How many games were won by white or black from this board configuration?", which would give you a fitness value between white or black. The difficulty with that is the amount of training data required for the size of your problem space.
With the second option you could always feed it board sequences from grandmaster games and hope there is enough coverage for the ANN to develop a solution.
Due to the complexity of the problem I'd want to throw the largest network (ie: lots of internal nodes) at it as I could without slowing down the training too much.
Your input algorithm is sound - all positions, all pieces, and both players are accounted for. You may need an input layer for every past state of the gameboard, so that past events are used as input again.
The output layer should (in some form) give the piece to move, and the location to move to.
Write a genetic algorithm using a connectome which contains all neuron weights and synapse strengths, and begin multiple separated gene pools with a large number of connectomes in each.
Make them play one another, keep the best handful, crossover and mutate the best connectomes to repopulate the pool.
Read blondie24 : http://www.amazon.co.uk/Blondie24-Playing-Kaufmann-Artificial-Intelligence/dp/1558607838.
It deals with checkers instead of chess but the principles are the same.
Came here to say what Silas said. Using a minimax algorithm, you can expect to be able to look ahead N moves. Using Alpha-beta pruning, you can expand that to theoretically 2*N moves, but more realistically 3*N/4 moves. Neural networks are really appropriate here.
Perhaps though a genetic algorithm could be used.

Similarity between line strings

I have a number of tracks recorded by a GPS, which more formally can be described as a number of line strings.
Now, some of the recorded tracks might be recordings of the same route. Because of inaccuracies in the GPS system, and because the recordings were made on separate occasions and possibly at different travelling speeds, they won't match up perfectly. They still look close enough when viewed on a map for a human to determine that it's actually the same route that has been recorded.
I want to find an algorithm that calculates the similarity between two line strings. I have come up with some home-grown methods to do this, but would like to know if this is a problem that already has good algorithms to solve it.
How would you calculate the similarity, given that similar means represents the same path on a map?
Edit: For those unsure of what I'm talking about, please look at this link for a definition of what a line string is: http://msdn.microsoft.com/en-us/library/bb895372.aspx - I'm not asking about character strings.
Compute the Fréchet distance on each pair of tracks. The distance can be used to gauge the similarity of your tracks.
Math alert: Fréchet was a pioneer in the field of metric space which is relevant to your problem.
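As a starting point, the discrete Fréchet distance (an approximation of the continuous one that only considers the recorded vertices) can be computed with a small dynamic program. This sketch assumes the tracks are lists of (x, y) points in a planar coordinate system; raw latitude/longitude values would need to be projected first. The function name is made up.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines given as point lists."""
    n, m = len(p), len(q)
    # ca[i][j] = discrete Fréchet distance between p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # advance along p, along q, or along both, whichever is cheapest
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]
```

Two parallel tracks one unit apart, such as [(0, 0), (1, 0), (2, 0)] and [(0, 1), (1, 1), (2, 1)], get a distance of 1.0. Because only vertices are compared, tracks sampled at very different densities may need resampling first.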
I would add a buffer around the first line based on the estimated probable error, and then determine if the second line fits entirely within the buffer.
To determine "same route," create the minimal set of normalized path vectors, calculate the total power differences and compare the total to a quality measure.
Normalize the GPS waypoints on total path length,
walk the vectors of the paths together, creating a new set of path vectors for each path based upon the shortest vector at each waypoint,
calculate the total power differences between endpoints of each vector in the normalized paths weighting for vector length, and
compare against a quality measure.
Tune the power of the differences (start with, say, squared differences) and the quality measure (say as a percent of the total power differences) visually. This algorithm produces a continuous quality measure of the path match as well as a binary result (Are the paths the same?)
Paul Tomblin said: I would add a buffer around the first line based on the estimated probable error, and then determine if the second line fits entirely within the buffer.
You could modify the algorithm as the normalized vector endpoints are compared. You could determine if any endpoint difference was above a certain size (implementing Paul's buffer idea) or perhaps, if the endpoints were outside the "buffer," use that fact to ignore that endpoint difference, allowing a comparison ignoring side trips.
You could walk along each point (Pa) of LineString A and measure the distance from Pa to the nearest line-segment of LineString B, averaging each of these distances.
This is not a fast or perfect method, but it should give you a useful number and is pretty quick to implement.
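A minimal sketch of that averaging idea, assuming both line strings are lists of (x, y) points in planar coordinates and that LineString B has at least two points. The function names are invented for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def mean_distance(line_a, line_b):
    """Average distance from each point of line_a to the nearest segment of line_b."""
    total = 0.0
    for p in line_a:
        total += min(
            point_segment_distance(p, line_b[i], line_b[i + 1])
            for i in range(len(line_b) - 1)
        )
    return total / len(line_a)
```

The measure is not symmetric: a short track that follows part of a long one scores well in one direction only. Averaging mean_distance(a, b) and mean_distance(b, a) is one way to even that out.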
Do the line strings start and finish at similar points, or are they of very different extents?
If you consider a single line string to be a sequence of [x,y] points (or [x,y,z] points), then you could compute the similarity between each pair of line strings using the Needleman-Wunsch algorithm. As described in the referenced Wikipedia article, the Needleman-Wunsch algorithm requires a "similarity matrix" which defines the distance between a pair of points. However, it would be easy to use a function instead of a matrix. In your case you could simply use the 2D Euclidean distance function (or a 3D Euclidean function if your points have elevation) to provide the distance between each pair of points.
I actually side with the person (Aaron F) who said that you might be interested in the Levenshtein distance problem (and cited this). His answer seems to me to be the best so far.
More specifically, Levenshtein distance (also called edit distance), does not measure strictly the character-by-character distance, but also allows you to perform insertions and deletions. The best algorithm for this distance measure can be computed in quadratic time (pretty slow if your strings are long), but the computational biologists have pretty good heuristics for this, that might be of interest to you on their own. Check out BLAST and FASTA.
In your problem, it seems that you are dealing with differences between strings of numbers, and you care about the numbers. If you give more information, I might be able to direct you to the right variant of BLAST/FASTA/etc for your purposes. In any case, you might consider adapting BLAST and FASTA for your needs. They're quite simple.
1: http://en.wikipedia.org/wiki/Levenshtein_distance, http://www.nist.gov/dads/HTML/Levenshtein.html

Resources