How do I calculate no of states in "Vacuum Cleaner World" - artificial-intelligence

How do I calculate the number of states in "Vacuum Cleaner World", given the number of positions of the vacuum cleaner? For example, in this picture, it's 2.

There are three independent factors (none of them affects the existence or available choices of the others):
Vacuum is [left | right]
There [is | is not] dirt on left
There [is | is not] dirt on right
To get the count of combinations for independent events, you multiply:
2 choices x 2 choices x 2 choices => 8 combinations.
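The multiplication can be checked by enumerating the states directly. The following is a minimal sketch, not part of the original answer: the function name `vacuum_world_states` is made up for illustration, and it assumes every position can independently be dirty or clean, which gives n * 2^n states for n positions.

```python
from itertools import product

def vacuum_world_states(num_positions):
    """List every (vacuum position, dirt pattern) combination.

    Assumption: each position is independently dirty (True) or clean (False).
    """
    positions = range(num_positions)
    dirt_patterns = product([False, True], repeat=num_positions)
    return [(pos, dirt) for dirt in dirt_patterns for pos in positions]

states = vacuum_world_states(2)
print(len(states))  # 2 positions x 2 x 2 dirt patterns
```

With two positions this reproduces the 8 combinations counted above; with three positions the same reasoning gives 3 x 2 x 2 x 2 = 24.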

Related

Draw an FSA that recognizes: (A∗ | AB+). (The bar outscopes the other operators, so it's equal to: (A∗) | (AB+).) Use as few states as possible

I've attached what I have. My problem is that I don't know if it's correct, or whether I've used the fewest states possible to answer this question. I'd really appreciate any help with what I did wrong. This is what I have currently.
I would start by creating two FSAs, one for each of the branches.
For A* you only need one state.
For AB+ you need three states.
Then you merge the two. Assuming it does not have to be deterministic, the total FSA ends up with three states as well, two of which are final states.
As you tagged your question dfa — a deterministic FSA would need 4 states in total:
Start state: 1; Final States: 1,2,3,4
Transitions:
1 - a -> 2
2 - a -> 4
2 - b -> 3
3 - b -> 3
4 - a -> 4
That is a DFA that recognises (a*|ab+); any transition not listed above rejects the input.
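One way to check the transition table is to simulate it. This is a small sketch under the assumption that a missing transition means rejection; the names `TRANSITIONS`, `START`, `FINAL` and `accepts` are made up for illustration.

```python
# Transition table copied from the answer above; a missing entry is
# treated as an implicit dead state (assumption).
TRANSITIONS = {
    (1, "a"): 2,
    (2, "a"): 4,
    (2, "b"): 3,
    (3, "b"): 3,
    (4, "a"): 4,
}
START = 1
FINAL = {1, 2, 3, 4}

def accepts(word):
    """Run the DFA on word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject
            return False
    return state in FINAL

for word in ["", "a", "aaa", "ab", "abbb", "aab", "b", "aba"]:
    print(repr(word), accepts(word))
```

Comparing `accepts` against a regular-expression match for `a*|ab+` over all short strings is a quick way to gain confidence in the table.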

Representing a matrix as unique scalar numbers in certain range

I am trying to find a way to represent a 36*36 matrix as 4 different numbers in a certain range (-2.0 to 2.0), but I'm struggling to find the best way to achieve this.
The goal is to "generate" 4 unique floating-point "coordinates" based on one input matrix. In other words, some sort of a hashing algorithm.
The numbers in the matrix won't necessarily be unique numbers and the order (position) of a number matters, i.e.
| 1, 2, 3 | | 1, 3, 2 | | 5, 2, 3 |
| 5, 5, 6 | | 5, 5, 6 | | 5, 1, 6 |
should yield distinctly different results.
The matrix itself is stored as an array of unsigned integers (no more than 2 digits, 0 included). It would be preferable if the method kept the outcomes away from the range limits, and if the outcomes from different matrices showed a noticeable amount of variation (i.e., if one matrix yields 1.0000001 and another 1.0000003, that's not great).
Lastly, the method doesn't have to be scientifically/mathematically valid, but it HAS to be consistent and repeatable, meaning, one particular matrix would always yield the same result.
The language I'm working with is C and I would appreciate immensely any help and advice you guys could offer.
Thanks!
Edit:
This has nothing to do with security/cryptography/rocket science. I don't expect every possible permutation to be mapped to a unique outcome, I don't require 0 collisions, exceptional speed, or anything of that sort, nor do I worry if the method is somewhat magic. I simply want a way to boil a matrix down to 4 numbers, scattered around enough, so that if I do the same for 10 matrices, the results will be somewhat different.
For example, I can represent a matrix as a scalar using a Frobenius norm, Adler-32, or any other similar method. What might be a decent approach to generate 4 numbers in the desired range, based on that norm/checksum/hash?

Algorithm for evenly spacing list items (playlist songs) along several categories (id3 tags)

I am having trouble designing an algorithm to assist in the creation of an mp3 playlist, although an algorithm for the more general case of evenly spacing items in a list could be adapted for my use.
The general case is that I would like to reorder items in a list to maximize their diversity along several axes.
My specific use-case is that I want to dump a bunch of songs into a collection, then run my algorithm over the collection to generate an ordered playlist. I would like the order to follow this set of criteria:
maximize the distance between instances of the same artist
maximize the distance between instances of the same genre
maximize the distance between instances of category X
etc for N categories
Obviously we could not guarantee to optimize ALL categories equally, so the first category would be weighted most important, second weighted less, etc. I definitely want to satisfy the first two criteria, but making the algorithm extensible to satisfy N would be fantastic. Maximizing randomness (shuffle) is not a priority. I just want to diversify the listening experience no matter where I come in on the playlist.
This seems close to the problem described and solved here, but I can't wrap my head around how to apply this when all of the items are in the same list with multiple dimensions, rather than separate lists of differing sizes.
This seems like a problem that would have been solved many times by now but I am not able to find any examples of it.
This should be much faster than brute-force:
1. Order all the songs randomly.
2. Compute the weight for each song slot (i.e. how close it is to another song with the same artist/genre/etc.). It will be a number from 1 to N indicating how many songs away it is from a match. Lower is worse.
3. Take the song with the lowest weight, and swap that song with a random other song.
4. Re-compute the weights of the swapped songs. If either got worse, reverse the swap and go back to 3.
5. For debugging, print the "lowest weight" and the overall average weight.
6. Go to 2.
You won't find the optimum this way, but it should give mediocre results pretty fast, and eventually improve.
Step 2 can be made fast this way: (pseudo code in Ruby)
# Find the distance to the closest match for the song in slot_number.
# Note: MAX can be less than N. Maybe nobody cares about songs more than 20 steps away.
def closest_match(slot_number)
  (1..MAX).each do |step|
    return step if matches?(slot_number + step, slot_number) || matches?(slot_number - step, slot_number)
  end
  MAX
end
# Given 2 slots, do the songs there match?
# Out-of-bounds slots never match.
def matches?(x, y)
  return false if y > N || y < 1
  return false if x > N || x < 1
  s1 = song_at(x)
  s2 = song_at(y)
  s1.artist == s2.artist || s1.genre == s2.genre
end
You also don't have to re-compute the whole array: if you cache the weights, you only need to recompute songs that have weight >= X if they are X steps away from a swapped song. Example:
| Song1    | Song2    | Song3    | Song4    | Song5    |
| Weight=3 | Weight=1 | Weight=5 | Weight=3 | Weight=2 |
If you are swapping Song 2, you don't have to re-compute Song 5: it's 3 steps away from Song 2, but its weight was 2, so it won't "see" Song 2.
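The swap loop described in this answer can be sketched roughly as follows. This is an illustration only: it assumes songs are dicts with "artist" and "genre" keys, the names `closest_match` and `improve` and their parameters are hypothetical, and for simplicity it recomputes all weights on every pass instead of using the cache described above.

```python
import random

def closest_match(playlist, slot, max_steps=20):
    """Steps to the nearest song sharing an artist or genre, capped at max_steps."""
    song = playlist[slot]
    for step in range(1, max_steps + 1):
        for other in (slot - step, slot + step):
            if 0 <= other < len(playlist):  # out-of-bounds slots never match
                candidate = playlist[other]
                if (song["artist"] == candidate["artist"]
                        or song["genre"] == candidate["genre"]):
                    return step
    return max_steps

def improve(playlist, iterations=1000, max_steps=20, seed=0):
    """Repeatedly swap the worst-placed song with a random one; undo bad swaps."""
    rng = random.Random(seed)
    playlist = list(playlist)
    if len(playlist) < 2:
        return playlist
    slots = range(len(playlist))
    for _ in range(iterations):
        weights = [closest_match(playlist, i, max_steps) for i in slots]
        worst = min(slots, key=lambda i: weights[i])
        other = rng.randrange(len(playlist))
        if other == worst:
            continue
        playlist[worst], playlist[other] = playlist[other], playlist[worst]
        # The song from `worst` now sits at `other`, and vice versa.
        moved_worst = closest_match(playlist, other, max_steps)
        moved_other = closest_match(playlist, worst, max_steps)
        if moved_worst < weights[worst] or moved_other < weights[other]:
            # Either swapped song got worse: reverse the swap.
            playlist[worst], playlist[other] = playlist[other], playlist[worst]
    return playlist
```

A fixed `seed` keeps runs repeatable while experimenting; the result is only a local improvement, not an optimum.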
Your problem is probably NP-hard. To get a sense of it, here's a reduction to CLIQUE (an NP-hard problem). That doesn't prove that your problem is NP-hard, but at least gives an idea that there is a connection between the two problems. (To show definitively that your problem is NP-hard, you need a reduction in the other direction: show that CLIQUE can be reduced to your problem. I feel that it is possible, but getting the details right is fussy.)
Suppose you have n=6 songs, A, B, C, D, E, and F. Lay them out in a chart like this:
1 2 3 4 5 6
A A A A A A
B B B B B B
C C C C C C
D D D D D D
E E E E E E
F F F F F F
Connect each item with an edge to every item in every other column, except for items in the same row. So A in column 1 is connected to B, C, D, E, F in column 2, B, C, D, E, F in column 3, and so on. There are n^2 = 36 nodes in the graph and n*(n-1)^2 + n*(n-1)*(n-2) + n*(n-1)*(n-3) + ... = n*(n-1)*n*(n-1)/2 = O(n^4) edges in the graph.
A playlist is a maximum clique in this graph, in other words a selection which is mutually consistent (no song is played twice). So far, not so hard: it's possible to find many maximum cliques very quickly (just permutations of the songs).
Now we add information about the similarity of the songs as edge weights. Two songs that are similar and close get a low edge weight. Two songs that are similar and far apart get a higher edge weight. Now the problem is to find a maximum clique with maximum total edge weight, in other words the NP-hard problem CLIQUE.
There are some algorithms for attacking CLIQUE, but of course they're exponential in time. The best you're going to be able to do in a reasonable amount of time is either to run one of those algorithms and take the best result it can generate in that time, or to randomly generate permutations for a given amount of time and pick the one with the highest score. You might be able to get better results for natural data using something like simulated annealing to solve the optimization problem, but CLIQUE is "hard to approximate" so I have the feeling you won't get much better results that way than by randomly generating proposals and picking the highest scoring.
Here is my idea: you create a graph, where songs are vertices, and paths represent their diversity.
For example we have five songs:
"A", country, authored by John Doe
"B", country, authored by Jane Dean
"C", techno, authored by Stan Chang
"D", techno, authored by John Doe
"E", country, authored by John Doe
We assign weight 2 to artist and 1 to genre, and use the multiplicative inverse as the path's value. Some of the paths will look like this:
A-B: 2*1 + 1*0 = 2 => value of the path is 1/2 = 0.5
A-C: 2*1 + 1*1 = 3 => value of the path is 1/3 = 0.33
A-D: 2*0 + 1*1 = 1 => value of the path is 1/1 = 1
A-E: 2*0 + 1*0 = 0 => value of the path is 1/0 = MAX_DOUBLE
You can have as many categories as you want, weighted as you wish.
Once you have calculated all paths between all songs, all you have to do is use some heuristic algorithm for the Travelling Salesman Problem.
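As a rough illustration, the path values for the five example songs can be computed as below. The nearest-neighbour tour is just one simple TSP heuristic chosen here as an assumption; the answer does not name a specific one, and `path_value` and `nearest_neighbour_tour` are hypothetical names.

```python
import sys

# (genre, artist) for the five example songs above.
SONGS = {
    "A": ("country", "John Doe"),
    "B": ("country", "Jane Dean"),
    "C": ("techno", "Stan Chang"),
    "D": ("techno", "John Doe"),
    "E": ("country", "John Doe"),
}
ARTIST_WEIGHT = 2
GENRE_WEIGHT = 1

def path_value(s, t):
    """Inverse of the weighted diversity between two songs (lower is better)."""
    genre_s, artist_s = SONGS[s]
    genre_t, artist_t = SONGS[t]
    diversity = (ARTIST_WEIGHT * (artist_s != artist_t)
                 + GENRE_WEIGHT * (genre_s != genre_t))
    return sys.float_info.max if diversity == 0 else 1 / diversity

def nearest_neighbour_tour(start):
    """Greedy heuristic: always move to the cheapest unvisited song."""
    tour = [start]
    remaining = set(SONGS) - {start}
    while remaining:
        nxt = min(sorted(remaining), key=lambda t: path_value(tour[-1], t))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

print(path_value("A", "B"), path_value("A", "D"))
print(nearest_neighbour_tour("A"))
```

The greedy tour ignores the cost of returning to the start, so a proper TSP heuristic (2-opt, for instance) would be the better choice when the playlist repeats.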
EDIT:
I'd like to throw another constraint on the problem: the "maximal distance" should take into account the fact that the playlist may be on repeat. This means that simply putting two songs by the same artist at opposite ends of the playlist will fail since they will be "next to" each other when the list repeats.
Part of the Travelling Salesman Problem is that in the end you return to your origin point, so you will have the same song at both ends of your playlist, and both paths (from the song and to the song) will be calculated with the best efficiency allowed by the heuristic used. So all you have to do is remove the last entry from your result (because it's the same as the first one) and you can safely repeat without breaking your requirements.
A brute force algorithm for that is easy.
maxDistance = 0
foreach ordering
    distance = 0
    foreach category
        for i=1 to numsongs
            for j=i+1 to numsongs
                if song i and song j in this ordering have same value for this category
                    distance = distance + (j-i)*weight_for_this_category
                endif
            endfor
        endfor
    endfor
    if ( distance > maxDistance )
        maxDistance = distance
        mark ordering as best one so far
    endif
endfor
But that algorithm has worse than exponential complexity in the number of songs (it tries every ordering, i.e. n! of them), so it will take unmanageable amounts of time pretty fast. The hard part comes in doing it in a reasonable time.
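For very small collections the brute-force search above can be written directly. This is a sketch assuming each song is a dict keyed by category name; `score`, `best_ordering` and the sample weights are made up for illustration.

```python
from itertools import permutations

def score(ordering, weights):
    """Sum of weighted distances between songs sharing a category value."""
    total = 0
    for category, weight in weights.items():
        for i in range(len(ordering)):
            for j in range(i + 1, len(ordering)):
                if ordering[i][category] == ordering[j][category]:
                    total += (j - i) * weight
    return total

def best_ordering(songs, weights):
    """Try every ordering (n! of them) and keep the highest-scoring one."""
    return max(permutations(songs), key=lambda o: score(o, weights))

songs = [
    {"title": "A", "artist": "X", "genre": "g1"},
    {"title": "B", "artist": "X", "genre": "g2"},
    {"title": "C", "artist": "Y", "genre": "g3"},
]
weights = {"artist": 2, "genre": 1}
best = best_ordering(songs, weights)
print([s["title"] for s in best], score(best, weights))
```

Ten songs already mean 3,628,800 orderings, so this is only useful for checking a faster heuristic against known optima.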
I was thinking about a "spring" approach. If new items are added to the end of the list, they squish the similar items forward.
If I add Pink Floyd to the list, then all other Pink Floyd songs get squished to make space.
I would implement the least common dimensions before the most common dimensions, to ensure the more common dimensions are better managed.
For each tag in the song, ordered by the count of that tag in the list, ascending:
    Evenly space the earlier songs, knowing that a new song is being added
Add the song

Taguchi Method Programming Example

I've been asked to research some programming related to the "Taguchi Method", especially as it relates to Multi-variant testing. This is one of the first subjects I've tried to research that I've found zero, nada, zilch, code examples for, especially considering its mathematical basis.
I've found some books describing the math involved but it looks like I'm going to be doing some math brush up unless I can find some code examples I can relate to.
Is this one of those rare things that once you work out the programming, it's so valuable that no one shares? Or do I just fail at Taguchi + google?
Taguchi designs are the same thing as covering arrays. The basic idea is that if you have F data "fields" and every one can have N different values, it is possible to construct N^F different test cases. A covering array is basically a set of test cases that together cover all possible pairwise combinations of two field values, and the idea is to generate as small a set as possible. E.g. if F=3 and N=3, you have 27 possible test cases, but it is enough to have nine test cases if you aim for pairwise coverage:
Field A | Field B | Field C
---------------------------
   1    |    1    |    1
   1    |    2    |    2
   1    |    3    |    3
   2    |    1    |    2
   2    |    2    |    3
   2    |    3    |    1
   3    |    1    |    3
   3    |    2    |    1
   3    |    3    |    2
In this table, you can choose any two fields and any two values and you can always find a row that contains the chosen values for the chosen fields.
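The pairwise-coverage claim for this table is easy to check mechanically. A minimal sketch, with `ROWS`, `VALUES` and `covers_all_pairs` as illustrative names:

```python
from itertools import combinations, product

# The nine test cases from the table above: (Field A, Field B, Field C).
ROWS = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]
VALUES = (1, 2, 3)

def covers_all_pairs(rows, values):
    """True if every pair of fields takes every combination of two values."""
    num_fields = len(rows[0])
    wanted = set(product(values, repeat=2))
    for f1, f2 in combinations(range(num_fields), 2):
        seen = {(row[f1], row[f2]) for row in rows}
        if seen != wanted:
            return False
    return True

print(covers_all_pairs(ROWS, VALUES))       # the full table
print(covers_all_pairs(ROWS[:-1], VALUES))  # the table with one row dropped
```

Dropping any single row breaks coverage here, because each pair of fields needs all 9 value combinations and there are only 9 rows.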
Generating Taguchi designs in general is a difficult combinatorial problem.
You can generate Taguchi designs by various methods:
Branch and bound
Stochastic search (e.g. tabu search or simulated annealing)
Greedy search
Specific mathematical constructions for some specific structures

Pre RTree step: Divide a set of points into rectangular regions each containing one point

Given my current position (lat, long), I want to quickly find the nearest neighbor in a points-of-interest problem. Thus I intend to use an R-Tree database, which allows for quick lookup. However, the database must be populated first, of course. Therefore, I need to determine the rectangular regions that cover the area, where each region contains one point of interest.
My question is how do I preprocess the data, i.e. how do I subdivide the area into these rectangular sub-regions? (I want rectangular regions because they are easily added to the RTree - in contrast to more general Voronoi regions).
/John
Edit: The approach below works, but ignores the critical feature of R-trees: the splitting behavior of R-tree nodes is well defined, and maintains a balanced tree (through B-tree-like properties). So in fact, all you have to do is:
Pick the maximum number of rectangles per page
Create seed rectangles (use points furthest away from each other, centroids, whatever).
For each point, choose a rectangle to put it into
    If it already falls into a single rectangle, put it in there
    If it does not, extend the rectangle that needs to be extended least (different ways to measure "least" exist -- area works)
    If multiple rectangles apply -- choose one based on how full it is, or some other heuristic
    If overflow occurs -- use the quadratic split to move things around...
And so on, using R-tree algorithms straight out of a text book.
I think the method below is ok for finding your initial seed rectangles; but you don't want to populate your whole R-tree that way. Doing the splits and rebalancing all the time can be a bit expensive, so you will probably want to do a decent chunk of the work with the KD approach below; just not all of the work.
The KD approach: enclose everything in a rectangle.
If the number of points in the rectangle is > threshold, sweep in direction D until you cover half the points.
Divide into rectangles left and right of (or above and below) the splitting point.
Call the same procedure recursively on the new rectangles, with the next direction (if you were going left to right, you will now go top to bottom, and vice versa).
The advantage this has over the divide-into-squares approach offered by another poster is that it accommodates skewed point distributions better.
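The KD approach can be sketched as a short recursive function. This is an illustration under assumptions: points are (x, y) tuples, rectangles are (xmin, ymin, xmax, ymax), the split line is placed midway between the two middle points, and the name `kd_regions` is hypothetical.

```python
def kd_regions(points, rect, threshold=1, axis=0):
    """Recursively split rect until each region holds at most threshold points.

    axis 0 sweeps left to right, axis 1 sweeps bottom to top; the
    direction alternates at every level of the recursion.
    """
    if len(points) <= threshold:
        return [(rect, points)]
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # the sweep stops after half the points
    split = (ordered[mid - 1][axis] + ordered[mid][axis]) / 2
    xmin, ymin, xmax, ymax = rect
    if axis == 0:
        first, second = (xmin, ymin, split, ymax), (split, ymin, xmax, ymax)
    else:
        first, second = (xmin, ymin, xmax, split), (xmin, split, xmax, ymax)
    return (kd_regions(ordered[:mid], first, threshold, 1 - axis)
            + kd_regions(ordered[mid:], second, threshold, 1 - axis))

pois = [(0, 0), (1, 5), (2, 2), (8, 1), (9, 9)]
for region, contents in kd_regions(pois, (0, 0, 10, 10)):
    print(region, contents)
```

Points that share the split coordinate end up on a region boundary, so a real implementation would need a tie-breaking rule.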
The Oracle Spatial Cartridge documentation describes a tessellation algorithm that can do what you want. In short:
enclose all your points in a square
if the square contains 1 point - index the square
if the square does not contain any points - ignore it
if the square contains more than 1 point
    split the square into 4 equal squares and repeat the analysis for each new square
The result should look something like this: http://download-uk.oracle.com/docs/cd/A64702_01/doc/cartridg.805/a53264/sdo_ina5.gif
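The tessellation steps listed above can be sketched as a recursive function. This is not Oracle's implementation, only an illustration: squares are (x, y, size) with half-open edges, the name `tessellate` is hypothetical, and the `min_size` guard is an added assumption so that duplicate points cannot cause endless splitting.

```python
def tessellate(points, x, y, size, min_size=1e-9):
    """Return the non-empty squares (x, y, size) with the points they contain.

    A square covers x <= px < x + size and y <= py < y + size, so the
    enclosing square should be slightly larger than the data extent.
    """
    inside = [(px, py) for px, py in points
              if x <= px < x + size and y <= py < y + size]
    if not inside:
        return []  # empty square: ignore it
    if len(inside) == 1 or size <= min_size:
        return [((x, y, size), inside)]  # index this square
    half = size / 2
    cells = []
    for dx in (0, half):
        for dy in (0, half):
            cells.extend(tessellate(inside, x + dx, y + dy, half, min_size))
    return cells

pois = [(1, 1), (6, 6), (7, 7), (2, 6)]
for square, contents in tessellate(pois, 0, 0, 8):
    print(square, contents)
```

Clustered points produce many small squares while sparse areas stay coarse, which is the behaviour the other answer contrasts with the KD split.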
I think you are missing something in the problem statement. Assume you have N points (x, y) such that every point has a unique x- and y-coordinate. You can then divide your area into N rectangles just by dividing it into N vertical columns. But that does not help you to solve the nearest POI problem easily, does it? So I think there is something about the rectangle structure that you haven't articulated yet.
Illustration:
| | | | |5| | |
|1| | | | |6| |
| | |3| | | | |
| | | | | | | |
| |2| | | | | |
| | | | | | |7|
| | | |4| | | |
The numbers are POIs and the vertical lines show a subdivision into 7 rectangular areas. But clearly there isn't much "interesting" information in the subdivision. Is there some additional criterion on the subdivision which you haven't mentioned?
