Can Watson Visual Recognition determine density? - ibm-watson

Bouquets of flowers are a fairly accurate analogy for our problem domain.
For an example, let's assume a test image of thirty flowers:
- Roses: 10
- Poppies: 9
- Daisies: 5
- Lilies: 5
- Sunflowers: 1
Is there a training approach that might get Watson to look at pictures of bouquets and be able to reply with a density of a given flower type, or even a ratio or something?
If there are any ideas, should we train with images of single/isolated flowers of each type, or with images of multiple/grouped flowers?
...or a combination of both?
ANY ideas/suggestions would be welcome!!!
EDIT:
Alternatively, rather than making classes by flower-type, we could class by action-needed ??
But maybe that's a different enough idea to be its own question.

In part, it depends on how much control you have over the images that you need to classify, and the granularity of the classification that you need to make. If, for example, you're guaranteed to always have a top-down view of the bouquet that shows all the different flowers clearly, and other extraneous objects are generally not in the scene, then you probably could train a classifier for something like five density levels for each flower type. For example, the Daisy classifier would have five classes: 0 to 20% daisies, 20 to 40% daisies, 40 to 60% daisies, 60 to 80% daisies, and over 80% daisies.

Related

Algorithm sorting details, but without excluding

I have come across a problem.
I’m not asking for help how to construct what I’m searching for, but only to guide me to what I’m looking for! 😊
The thing I want to create is some sort of ‘Sorting Algorithm/Mechanism’.
Example:
Imagine I have a database with over 1000 pictures of different vehicles.
A person sees a vehicle, he now tries to get as much information and details about that vehicle, such as:
Shape
number of wheels
number and shape of windows
number and shape of light(s)
number and shape of exhaust(s)
Etc…
He then gives me all information about that vehicle he saw. BUT! Without telling me anything about:
Make and model.
…
I will now take that information and tell my database to sort every vehicle, so that it arranges all 1000 vehicles by best match, based on the description it has been given.
But it should NOT exclude any vehicle!
So…
If the person tells me that the vehicle only has 4 wheels, but in reality it has 5 (he might not have seen the fifth wheel) it should just get a bad score in the # of wheels.
But if every other aspect matches that vehicle perfectly, it will still get a high score.
That way we don’t exclude the vehicle that he has seen, and we still have a chance to find the correct vehicle.
The whole point of this mechanism is, as said, to narrow things down: instead of looking through 1000 vehicles, we only need to look through the best matches, which is 10 to maybe 50 vehicles out of 1000 (hopefully).
I tried to describe it the best I could in a language that isn’t ‘my father’s tongue’. So bear with me.
Again, I’m not looking for anybody telling me how to make this algorithm, I’m pretty sure nobody even wants or has the time to do that for me, without getting paid somehow...
But I just need to know where to look regarding learning and understanding how to create this mess of a mechanism.
Kind regards
Gent!
Assuming that all your pictures have been indexed with the relevant fields (number of wheels, window shapes...), and given that they are not too numerous (a thousand is peanuts for a computer), you can proceed as follows:
for every criterion, weight the possible discrepancies (e.g. one wheel too many costs 5, one wheel too few costs 10, bad window shape costs 8...). Do this in a coherent way so that the costs of the criteria are well balanced.
to perform a search, evaluate the total discrepancy cost of every car, and sort the values increasingly. Report the first ten.
Technically, what you are after is called a "nearest neighbor search" in a high dimensional space. This problem has been well studied. There are fast solutions but they are extremely complex, and in your case are absolutely not worth using.
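The discrepancy-cost ranking described above can be sketched roughly as follows. This is only an illustration: the `Vehicle` fields, the sample data, and the reading of "too many" and "too few" as relative to the witness description are assumptions, and the cost values are simply the example numbers from the answer.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    name: str
    wheels: int
    window_shape: str

# Illustrative costs taken from the example numbers in the answer.
COST_WHEEL_TOO_MANY = 5    # vehicle has more wheels than were reported
COST_WHEEL_TOO_FEW = 10    # vehicle has fewer wheels than were reported
COST_BAD_WINDOW_SHAPE = 8

def discrepancy(description: dict, vehicle: Vehicle) -> int:
    """Total cost of the mismatches between a description and one vehicle."""
    cost = 0
    diff = vehicle.wheels - description["wheels"]
    if diff > 0:
        cost += diff * COST_WHEEL_TOO_MANY
    else:
        cost += -diff * COST_WHEEL_TOO_FEW
    if vehicle.window_shape != description["window_shape"]:
        cost += COST_BAD_WINDOW_SHAPE
    return cost

def best_matches(description: dict, vehicles: list, limit: int = 10) -> list:
    """Sort every vehicle by increasing cost; nothing is filtered out before the cut."""
    return sorted(vehicles, key=lambda v: discrepancy(description, v))[:limit]

# Hypothetical sample data.
vehicles = [
    Vehicle("B", 4, "square"),
    Vehicle("C", 3, "round"),
    Vehicle("A", 5, "round"),
]
description = {"wheels": 4, "window_shape": "round"}
ranked = best_matches(description, vehicles)
```

Here the five-wheeled vehicle "A" still ranks first even though the witness reported four wheels, because it only pays the smaller "one wheel too many" cost and matches on everything else.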
The default way of doing this, for example in artificial intelligence, is to encode all properties as a vector and apply a weight to each property. The distance can then be calculated using any metric you like. In your case Manhattan distance should be fine. So in pseudocode:
distance(first_car, second_car):
    return abs(first_car.n_wheels - second_car.n_wheels) * wheels_weight + ... +
           abs(first_car.n_windows - second_car.n_windows) * windows_weight
This works fine for simple properties like the number of wheels. For more complex properties like the shape of a window you'll probably need to split it up into multiple attributes depending on your requirements on similarity.
Weights are usually picked in such a way as to normalize all values, if their range is known. Optionally an additional factor can be multiplied to increase the impact of a specific attribute on the overall distance.
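A minimal runnable sketch of the weighting scheme just described, assuming each attribute has a known value range. The attribute names follow the pseudocode above; the ranges and importance factors are made-up values for illustration only.

```python
# Hypothetical (min, max) ranges used to normalise each difference to 0..1.
RANGES = {"n_wheels": (2, 18), "n_windows": (0, 40)}

# Optional extra factors that raise the impact of a specific attribute.
IMPORTANCE = {"n_wheels": 2.0, "n_windows": 1.0}

def distance(first_car: dict, second_car: dict) -> float:
    """Weighted Manhattan distance over the attributes listed in RANGES."""
    total = 0.0
    for attr, (low, high) in RANGES.items():
        weight = IMPORTANCE[attr] / (high - low)  # normalise, then scale
        total += abs(first_car[attr] - second_car[attr]) * weight
    return total

seen = {"n_wheels": 4, "n_windows": 6}
candidate = {"n_wheels": 6, "n_windows": 6}
d = distance(seen, candidate)  # 2 wheels apart: 2 * (2.0 / 16) = 0.25
```

Sorting all vehicles by `distance(seen, vehicle)` then gives the best-match ordering without excluding anything.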

How to determine optimum hidden layers and neurons based on inputs and outputs in a NN?

I'm referring mostly to this paper here: http://clgiles.ist.psu.edu/papers/UMD-CS-TR-3617.what.size.neural.net.to.use.pdf
Current Setup:
I'm currently trying to port the neural-genetic AI solution that I have lying around into a multi-purpose multi-agent tool. So, for example, it should work as an AI in a game engine for moving around entities and let 'em shoot and destroy the enemy (so e.g. 4 inputs like distance x,y and angle x,y and 2 outputs like accelerate left,right).
The state so far is that I'm using the same amount of genomes as there are agents to determine the fittest agents. 20% of the fittest agents are combined with each other (zz, zw genomes selected) and create 2 babies for the new population each. The rest of the new population per-new-generation is selected randomly across the old population, including the fittest with-an-unfit-genome.
That works pretty well to prime the AI, after generation 50-100 it is pretty much human-unbeatable in a Breakout clone and a little Tank game where you can shoot and move around.
As I had the idea to use one evolution population for each "type of Agent", the question is now whether it is possible to determine the number of hidden layers and the number of neurons in the hidden layers generically.
My setup for the tank game is 4 inputs, 3 outputs and 1 hidden layer with 12 neurons that worked the best (around 50 generations to be really strong).
My setup for a breakout game is 6 inputs, 2 outputs and 2 hidden layers with 12 neurons that seems to work best.
Done Research:
So, back to the paper: on page 32 you can see that more neurons per hidden layer of course need more time for priming, but the more neurons there are in between, the better the chances of arriving at the function without noise.
I currently prime my AI only using the fitness increase on successfully being better than the last try.
So in a tank game it means he successfully shot the other tank (wounded him 4 times is better, then enemy is dead) and won the round.
In the breakout game it's similar as I have a paddle that the AI can move around and it can collect points. "Getting shot" or negative treatment here is that it forgot to catch the ball. So potential noise input would be 2 output values (move-left, move-right) that depend on 4 input values (ball x, y, degx, degy).
Questions:
So, what kind of calculation for the amount of hidden layers and amount of neurons do you think can be a good tradeoff to have no noise that kills the genome evolution?
What is the minimum amount of agents until you can say that "it evolves further"? My current training setup is always around having 50 agents that learn in parallel (so they basically simulate 50 games in parallel "behind the scenes").
In sum, for most problems, one could probably get decent performance (even without a second optimization step) by setting the hidden layer configuration using just two rules: (i) number of hidden layers equals one; and (ii) the number of neurons in that layer is the mean of the neurons in the input and output layers.
-doug
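The two rules in the answer above fit in a few lines. This is a sketch of that rule of thumb only; the function name is made up, and rounding the mean up when it is not a whole number is an assumption, since the answer does not say how to round.

```python
import math

def rule_of_thumb_hidden_layers(n_inputs: int, n_outputs: int) -> list:
    """Return the hidden layer sizes: a single layer sized at the input/output mean."""
    mean = (n_inputs + n_outputs) / 2
    return [max(1, math.ceil(mean))]  # rounding up is an assumption

tank = rule_of_thumb_hidden_layers(4, 3)      # 4 inputs, 3 outputs
breakout = rule_of_thumb_hidden_layers(6, 2)  # 6 inputs, 2 outputs
```

For the tank and Breakout setups in the question this suggests a single layer of 4 neurons in both cases, far fewer than the 12 the asker found to work best, so it is probably best treated as a starting point for the genetic search rather than a final answer.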
In short: it's an ongoing area of research. Most (all that I know of) ANNs that use numerous neurons and hidden layers don't set a static number of either; instead they use algorithms to continuously modify these values, usually constructing and destroying them when outputs converge/diverge.
Since it sounds like you're already using some evolutionary computing, consider looking into Andrew Turner's work on CGPANN, I remember it getting pretty decent improvements on benchmarks similar to your work.

Getting win percentages for Texas hold'em poker without monte carlo/exhaustive enumeration

Sorry I'm just starting this project and don't have any ideas or code, I'm asking more of a theoretical question than a programming one.
It seems that every google search provides the same responses and it's very hard to find an answer to this question:
Is there a way to calculate win percentages for texas holdem poker (the same way they do on poker after dark or other televised poker events) without using the monte carlo/exhaustive enumeration methods. Assuming all cards are face up and we know every card in the deck.
Every response on other forums just seems to be "use pokerstove" or something similar, I'm looking for the theory to write the code.
Thanks.
Is there a way to calculate win percentages for texas holdem poker (the same way they do on poker after dark or other televised poker events) without using the monte carlo/exhaustive enumeration methods.
In specific instances it is possible...
You can use a perfect lookup table for two-player heads-up preflop matchups: note that the "typical" 169 vs 169 approximation ain't good enough (say Jh Th vs 9h 8h ain't really "JTs vs 98s": I mean, that would be quite a gross approximation).
Besides that if you have a lot of memory and if you can live with gigantic cache misses, you technically could precompute gigantic lookup tables (say on the server side) and do lookups for other cases (e.g. for every possible three players all-in matchups preflop), but you'd really need a lot of memory : )
Note that "full enumeration" at flop and turn ain't an issue: at flop there's only 2 more cards to come, so there are typically only C(45,2) [two players all-in at flop, we know 2*2 holecards + 3 community cards -- hence leaving 990 possibilities] or C(43,2) [three players all-in at flop, we know 3*2 holecards + 3 community cards].
So an actual evaluator would not use one but several methods. For example:
lookup table for two players all-in preflop (the fastest)
full enum for any number of players all-in at flop or turn because it's tiny (max 990 possibilities) -- very fast
monte-carlo or full enum for three players or more all-in preflop -- incredibly slower
It is interesting to see here that in the most typical cases you'll get the result very, very fast: most actual all-ins involve two players, not three or more.
So you're either looking up in a "1 vs 1 preflop" lookup table or doing a full C(45,2) or C(44,1) enum (which are, in both cases, amazingly fast).
It's really only the "three players or more all-in preflop" case which does take time.
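The enumeration sizes quoted above can be checked with a small counting sketch. It assumes every hole card is face up, as in the question; the function name is made up for illustration.

```python
from math import comb

DECK_SIZE = 52

def remaining_boards(n_players: int, community_cards_known: int) -> int:
    """Number of ways to deal the community cards still to come."""
    known = 2 * n_players + community_cards_known  # hole cards + board so far
    to_come = 5 - community_cards_known
    return comb(DECK_SIZE - known, to_come)

two_players_flop = remaining_boards(2, 3)    # C(45, 2)
three_players_flop = remaining_boards(3, 3)  # C(43, 2)
two_players_turn = remaining_boards(2, 4)    # C(44, 1)
```

An evaluator would loop over exactly this many boards, rank each player's best hand on each one, and tally wins and ties.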
The answer is no.
There is no closed form computation that you can do to compute poker equities. Using combinatorics, you can identify and solve many subproblems, which speeds up computation.
For example, if you are considering all five card hands, there are 52 choose 5 = 2,598,960 different hands. But knowing that suits are equivalent and using combinatorial methods (either analytic or computational), you can reduce the space of all hands to 134,459 classes, each weighted according to the number of different hands in its equivalence class.
There are also various ways of using exhaustive evaluations tailored to your application. If you need to perform some subset of evaluations repeatedly, you can use caches or precomputed lookup tables targeted to your specific needs.

What is fuzzy logic?

I'm working with a couple of AI algorithms at school and I find people use the words Fuzzy Logic to explain any situation that they can solve with a couple of cases. When I go back to the books I just read about how instead of a state going from On to Off it's a diagonal line and something can be in both states but in different "levels".
I've read the wikipedia entry and a couple of tutorials and even programmed stuff that "uses fuzzy logic" (an edge detector and a 1-wheel self-controlled robot) and still I find it very confusing going from Theory to Code... for you, in the less complicated definition, what is fuzzy logic?
Fuzzy logic is logic where state membership is, essentially, a float with range 0..1 instead of an int 0 or 1. The mileage you get out of it is that things like, for example, the changes you make in a control system are somewhat naturally more fine-tuned than what you'd get with naive binary logic.
An example might be logic that throttles back system activity based on active TCP connections. Say you define "a little bit too many" TCP connections on your machine as 1000 and "a lot too many" as 2000. At any given time, your system has a "too many TCP connections" state from 0 (<= 1000) to 1 (>= 2000), which you can use as a coefficient in applying whatever throttling mechanisms you have available. This is much more forgiving and responsive to system behavior than naive binary logic that only knows how to determine "too many", and throttle completely, or "not too many", and not throttle at all.
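As a rough sketch, the "too many TCP connections" state could be computed like this. The linear ramp between the two thresholds is an assumption; the answer only fixes the end points at 1000 and 2000.

```python
def too_many_connections(active: int, low: int = 1000, high: int = 2000) -> float:
    """Degree (0..1) to which the machine has too many TCP connections."""
    if active <= low:
        return 0.0
    if active >= high:
        return 1.0
    return (active - low) / (high - low)  # assumed linear ramp in between

coefficient = too_many_connections(1500)  # halfway between the thresholds
```

The returned value can be used directly as the throttling coefficient, instead of a boolean that flips from "no throttling" to "full throttling" at a single cut-off.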
I'd like to add to the answers (that have been modded up) that a good way to visualize fuzzy logic is as follows:
Traditionally, with binary logic you would have a graph whose membership function is true or false whereas in a fuzzy logic system, the membership function is not.
1|
 |   /\
 |  /  \
 | /    \
0|/      \
 ------------
   a b  c  d
Assume for a second that the function is "likes peanuts"
a. kinda likes peanuts
b. really likes peanuts
c. kinda likes peanuts
d. doesn't like peanuts
The function itself doesn't have to be triangular and often isn't (it's just easier with ascii art).
A fuzzy system will likely have many of these, some even overlapping (even opposites) like so:
1|   A   B
 |   /\  /\      A = Likes Peanuts
 |  /  \/  \     B = Doesn't Like Peanuts
 | /   /\   \
0|/   /  \   \
 --------------
   a b  c d
so now c is "kinda likes peanuts, kinda doesn't like peanuts" and d is "really doesn't like peanuts"
And you can program accordingly based on that info.
Hope this helps for the visual learners out there.
The best definition of fuzzy logic is given by its inventor Lotfi Zadeh:
“Fuzzy logic means of representing problems to computers in a way akin to the way human solve them and the essence of fuzzy logic is that everything is a matter of degree.”
What it means to solve problems with computers the way humans do can be explained with a simple example from a basketball game: if a player wants to guard another player, he should first consider how tall that player is and how good his playing skills are. If the player he wants to guard is tall and plays very slowly relative to him, then he will use his instinct to decide whether he should guard that player, as there is uncertainty for him. The important point in this example is that the properties are relative to the player, and that there is a degree for the height and playing skill of the rival player. Fuzzy logic provides a deterministic way to handle this uncertain situation.
There are some steps to process fuzzy logic (Figure 1). First comes fuzzification, where crisp inputs are converted to fuzzy inputs; second, these inputs are processed with fuzzy rules to create a fuzzy output; and last comes defuzzification, which produces a degree for the result, since in fuzzy logic there can be more than one result, each with a different degree.
Figure 1 – Fuzzy Process Steps (David M. Bourg P.192)
To exemplify the fuzzy process steps, the previous basketball situation can be used. As mentioned in the example, the rival player is 1.87 meters tall, which is quite tall relative to our player, and can dribble at 3 m/s, which is slow relative to our player. In addition to these data, some rules, called fuzzy rules, are needed, such as:
If player is short but not fast then guard
If player is fast but not short then don’t guard
If player is tall then don’t guard
If player is average tall and average fast then guard
Figure 2 – how tall
Figure 3- how fast
According to the rules and the input data, the fuzzy system creates an output such as: the degree for guard is 0.7, the degree for sometimes guard is 0.4, and the degree for never guard is 0.2.
Figure 4-output fuzzy sets
The last step, defuzzification, creates a crisp output: a number that may determine the energy we should use to guard the player during the game. The centre of mass is a common method for creating the output. In this phase, the weights used to calculate the mean point depend entirely on the implementation. In this application, a high weight is given to guard or not guard and a low weight to sometimes guard. (David M. Bourg, 2004)
Figure 5- fuzzy output (David M. Bourg P.204)
Output = [0.7 * (-10) + 0.4 * 1 + 0.2 * 10] / (0.7 + 0.4 + 0.2) ≈ -3.5
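The calculation above is a weighted average of the output-set weights, using the degrees as coefficients. A small sketch that reproduces it, with the degrees and the weights (-10, 1, 10) taken from the formula; the dictionary keys and the function name are just labels chosen here.

```python
def defuzzify(degrees: dict, weights: dict) -> float:
    """Centre-of-mass style defuzzification: degree-weighted mean of the set weights."""
    numerator = sum(degrees[name] * weights[name] for name in degrees)
    denominator = sum(degrees.values())
    return numerator / denominator

degrees = {"guard": 0.7, "sometimes guard": 0.4, "never guard": 0.2}
weights = {"guard": -10, "sometimes guard": 1, "never guard": 10}

output = defuzzify(degrees, weights)  # (-7 + 0.4 + 2) / 1.3
```

The result is about -3.54, which the formula above rounds to -3.5.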
As a result, fuzzy logic is used under uncertainty to make a decision and to find the degree of that decision. The problem with fuzzy logic is that as the number of inputs increases, the number of rules increases exponentially.
For more information and its possible application in a game, check out a little article I wrote.
To build off of chaos' answer, a formal logic is nothing but an inductively defined set that maps sentences to a valuation. At least, that's how a model theorist thinks of logic. In the case of a sentential boolean logic:
(basis clause) For all A, v(A) in {0,1}
(iterative) For the following connectives,
v(!A) = 1 - v(A)
v(A & B) = min{v(A), v(B)}
v(A | B) = max{v(A), v(B)}
(closure) All sentences in a boolean sentential logic are evaluated per above.
A fuzzy logic would be inductively defined with one change:
(basis clause) For all A, v(A) in [0,1]
(iterative) For the following connectives,
v(!A) = 1 - v(A)
v(A & B) = min{v(A), v(B)}
v(A | B) = max{v(A), v(B)}
(closure) All sentences in a fuzzy sentential logic are evaluated per above.
Notice the only difference in the underlying logic is the permission to evaluate a sentence as having the "truth value" of 0.5. An important question for a fuzzy logic model is the threshold that counts for truth satisfaction. This is to ask: for a valuation v(A), for what value D is it the case that v(A) > D means that A is satisfied?
If you really want to find out more about non-classical logics like fuzzy logic, I would recommend either An Introduction to Non-Classical Logic: From If to Is or Possibilities and Paradox
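The valuation clauses above translate almost directly into code. A minimal sketch, with made-up function names; the threshold D is left as a parameter because, as noted, choosing it is a modelling decision.

```python
def v_not(a: float) -> float:
    return 1.0 - a

def v_and(a: float, b: float) -> float:
    return min(a, b)

def v_or(a: float, b: float) -> float:
    return max(a, b)

def satisfied(value: float, threshold: float) -> bool:
    """A sentence counts as satisfied when its valuation exceeds the threshold D."""
    return value > threshold

likes_peanuts = 0.75
likes_almonds = 0.25
both = v_and(likes_peanuts, likes_almonds)     # 0.25
either = v_or(likes_peanuts, likes_almonds)    # 0.75
neither = v_not(either)                        # 0.25
```

Restricting the inputs to 0 and 1 gives back the boolean connectives, which is exactly the point made above: only the basis clause differs.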
Putting my coder hat back on, I would be careful with the use of fuzzy logic in real world programming, because of the tendency for a fuzzy logic to be undecidable. Maybe it's too much complexity for little gain. For instance a supervaluational logic may do just fine to help a program model vagueness. Or maybe probability would be good enough. In short, I need to be convinced that the domain model dovetails with a fuzzy logic.
Maybe an example clears up what the benefits can be:
Let's say you want to make a thermostat and you want it to be 24 degrees.
This is how you'd implement it using boolean logic:
Rule1: heat up at full power when it's colder than 21 degrees.
Rule2: cool down at full power when it's warmer than 27 degrees.
Such a system will only once in a while be 24 degrees, and it will be very inefficient.
Now, using fuzzy logic, it would be like something like this:
Rule1: For each degree that it's colder than 24 degrees, turn up the heater one notch (0 at 24).
Rule2: For each degree that it's warmer than 24 degrees, turn up the cooler one notch (0 at 24).
This system will always be somewhere around 24 degrees, and it will only once in a while make a tiny adjustment. It will also be more energy-efficient.
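The two rule sets can be put side by side in a short sketch. The function names and return values are made up for illustration, and "one notch per degree" is read literally as a linear response.

```python
TARGET = 24  # desired temperature in degrees

def boolean_thermostat(temp: float) -> str:
    """Full power outside the 21..27 band, nothing in between."""
    if temp < 21:
        return "heat at full power"
    if temp > 27:
        return "cool at full power"
    return "idle"

def graded_thermostat(temp: float) -> tuple:
    """Return (heater_notches, cooler_notches): one notch per degree off target."""
    heater = max(0, TARGET - temp)
    cooler = max(0, temp - TARGET)
    return heater, cooler

reading = graded_thermostat(22)  # two degrees too cold
```

At 22 degrees the boolean version does nothing at all, while the graded version already turns the heater up two notches.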
Well, you could read the works of Bart Kosko, one of the 'founding fathers'. 'Fuzzy Thinking: The New Science of Fuzzy Logic' from 1994 is readable (and available quite cheaply secondhand via Amazon). Apparently, he has a newer book 'Noise' from 2006 which is also quite approachable.
Basically though (in my paraphrase - not having read the first of those books for several years now), fuzzy logic is about how to deal with the world where something is perhaps 10% cool, 50% warm, and 10% hot, where different decisions may be made on the degree to which the different states are true (and no, it wasn't entirely an accident that those percentages don't add up to 100% - though I'd accept correction if needed).
A very good explanation, with the help of Fuzzy Logic Washing Machines.
I know what you mean about it being difficult to go from concept to code. I'm writing a scoring system that looks at the values of sysinfo and /proc on Linux systems and comes up with a number between 0 and 10, 10 being the absolute worst. A simple example:
You have 3 load averages (1, 5, 15 minute) with (at least) three possible states, good, getting bad, bad. Expanding that, you could have six possible states per average, adding 'about to' to the three that I just noted. Yet, the result of all 18 possibilities can only deduct 1 from the score. Repeat that with swap consumed, actual VM allocated (committed) memory and other stuff .. and you have one big bowl of conditional spaghetti :)
It's as much a definition as it is an art; how you implement the decision making process is always more interesting than the paradigm itself .. whereas in a boolean world, it's rather cut and dried.
It would be very easy for me to say if load1 < 2 deduct 1, but not very accurate at all.
If you can teach a program to do what you would do when evaluating some set of circumstances and keep the code readable, you have implemented a good example of fuzzy logic.
Fuzzy Logic is a problem-solving methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PC or workstation-based data acquisition and control systems. It can be implemented in hardware, software, or a combination of both. Fuzzy Logic provides a simple way to arrive at a definite conclusion based upon vague, ambiguous, imprecise, noisy, or missing input information. The Fuzzy Logic approach to control problems mimics how a person would make decisions, only much faster.
Fuzzy logic has proved to be particularly useful in expert system and other artificial intelligence applications. It is also used in some spell checkers to suggest a list of probable words to replace a misspelled one.
To learn more, just check out: http://en.wikipedia.org/wiki/Fuzzy_logic.
The following is sort of an empirical answer.
A simple (possibly simplistic) answer is that "fuzzy logic" is any logic that returns values other than straight true / false, or 1 / 0. There are a lot of variations on this and they tend to be highly domain specific.
For example, in my previous life I did search engines that used "content similarity searching" as opposed to then common "boolean search". Our similarity system used the Cosine Coefficient of weighted-attribute vectors representing the query and the documents and produced values in the range 0..1. Users would supply "relevance feedback" which was used to shift the query vector in the direction of desirable documents. This is somewhat related to the training done in certain AI systems where the logic gets "rewarded" or "punished" for results of trial runs.
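To make the 0..1 scoring concrete, here is a minimal cosine-coefficient sketch over weighted-attribute vectors. It is not the search engine described above; the sparse-dictionary representation and the sample terms are assumptions for illustration, and relevance feedback is left out.

```python
from math import sqrt

def cosine(query: dict, document: dict) -> float:
    """Cosine coefficient of two sparse vectors; 0..1 when weights are non-negative."""
    dot = sum(weight * document.get(term, 0.0) for term, weight in query.items())
    norm_q = sqrt(sum(w * w for w in query.values()))
    norm_d = sqrt(sum(w * w for w in document.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)

query = {"fuzzy": 1.0, "logic": 1.0}
doc_a = {"fuzzy": 2.0, "logic": 2.0}  # same direction as the query
doc_b = {"poker": 1.0}                # nothing in common with the query

score_a = cosine(query, doc_a)
score_b = cosine(query, doc_b)
```

Ranking documents by this score gives a graded notion of "matches the query" rather than the yes/no answer of a boolean search.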
Right now Netflix is running a competition to find a better suggestion algorithm for their company. See http://www.netflixprize.com/. Effectively all of the algorithms could be characterized as "fuzzy logic"
Fuzzy logic is a calculation approach based on a human-like way of thinking. It is particularly useful when there is a large number of input variables. One online fuzzy logic calculator for two input variables is given here:
http://www.cirvirlab.com/simulation/fuzzy_logic_calculator.php

What optimization problems do you want to have solved?

I love to work on AI optimization software (Genetic Algorithms, Particle Swarm, Ant Colony, ...). Unfortunately I have run out of interesting problems to solve. What problem would you like to have solved?
This list of NP complete problems should keep you busy for a while...
How about the Hutter Prize?
From the entry on Wikipedia:
The Hutter Prize is a cash prize funded by Marcus Hutter which rewards data compression improvements on a specific 100 MB English text file.
[...]
The goal of the Hutter Prize is to encourage research in artificial intelligence (AI). The organizers believe that text compression and AI are equivalent problems.
Basically the idea is that in order to make a compressor which is able to compress data most efficiently, the compressor must be, in Marcus Hutter's words, "smarter". For more information on the relation between artificial intelligence and compression, see the Motivation and FAQ sections of the Hutter Prize website.
Does the Netflix Prize count?
I would like my bank balance optimised so that there is as much money as possible left at the end of the month, instead of the other way round.
What about the game of Go?
Here's an interesting practical problem I came up with while tinkering with color quantization and image compression.
The basic idea is that I would like a program to which I give a picture and which reduces the number of colors in it as much as possible without me noticing. Since every person has a different sensitivity of the eye (and eyes have different sensitivity to red/green/blue intensities), it should be possible to specify this sensitivity threshold in some way.
In other words, in a truecolor picture, replace every pixel's color with another color so that:
The total count of different colors in a picture would be the smallest possible; and
Every new pixel would have its color no further from the original color than some user-specified value D.
The D can be defined in different ways, pick your favorite. For example:
Separate red, green and blue components for specifying the maximum possible deviation for each of them (for every pixel you get a rectangular cuboid of valid replacement values);
A real number which would represent the maximum allowable distance in the RGB cube (for every pixel you get a sphere of valid replacement values);
Something in between or completely different.
Most efficient solution to a given set of Sudoku puzzles. (excluding brute-force methods)

Resources