Increasing States in Tsetlin Automata

What is the effect of increasing the number of states in Tsetlin Automata? Will both learning speed and accuracy increase?

Learning speed will decrease, while accuracy will increase. With more states per action, the automaton needs a longer run of penalties before it switches actions, so it adapts more slowly but its decisions become more robust to noisy feedback.
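To make that trade-off concrete, here is a minimal sketch of a two-action Tsetlin automaton with n states per action (the type and function names are my own, purely for illustration):

```c
/* Minimal two-action Tsetlin automaton sketch.
 * States 0..n-1 select action 0, states n..2n-1 select action 1.
 * A reward pushes the state deeper into the current half, a penalty
 * pushes it toward the boundary between the halves. */
typedef struct {
    int n;      /* states per action (memory depth) */
    int state;  /* current state in [0, 2n-1] */
} tsetlin_t;

int ta_action(const tsetlin_t *ta) {
    return ta->state < ta->n ? 0 : 1;
}

void ta_update(tsetlin_t *ta, int reward) {
    if (ta_action(ta) == 0) {
        if (reward) { if (ta->state > 0) ta->state--; }              /* deeper into action 0 */
        else        { ta->state++; }                                 /* may cross into action 1 */
    } else {
        if (reward) { if (ta->state < 2 * ta->n - 1) ta->state++; }  /* deeper into action 1 */
        else        { ta->state--; }                                 /* may cross into action 0 */
    }
}
```

With a larger n, more consecutive penalties are needed before ta_update crosses the boundary between the two halves, which is exactly why convergence slows while the learned action becomes more stable under noisy feedback.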

Related

Sample size calculation for experimental design

I have three treatments (wild type, Mutant1 and Mutant2); I would like input on how to decide the sample size needed to detect a statistically significant effect (alpha < 0.05) with high statistical power (1 - beta = 0.8).
Questions
I understand that we need information about the effect size. How do we approach this problem if we don't know the expected effect size beforehand? One option is a trial experiment to estimate the effect size. In that case, what sample size should the trial start with: as high as n=10 or as low as n=3? Can n=3 per treatment provide a good estimate of the effect size, or is n=10 better? To be specific: suppose we have resources for n=10 at most and must choose between n=3 and n=10 for this trial.
This question is better asked in https://stats.stackexchange.com.
I would discourage you from trying to estimate effect sizes from pilot experiments with low n. Your estimates will be quite noisy, and this is rarely done (at least in my field of neuroscience). Instead, I would suggest you estimate your effect size from the literature. Have other people measured something similar to what you are planning to do? What sample sizes do they use? What kind of effect sizes do they report?
If you were going to go ahead with the plan to run a pilot study, I would recommend pre-registering your experimental design (https://www.cos.io/initiatives/prereg). Something like:
"We will test the effects of mutation 1 and mutation 2 on XXXX (compared to wild type) in a cohort of 30 mice (10 in each group). Based on the results of this study, we will then conduct a power analysis and reproduce the experiments with the sample size required to achieve a power of 0.8 at p=0.05.
Our criteria for excluding animals from the power analysis will be .....
The statistical test for estimating effect size will be......"
etc.
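In case it helps with that power-analysis step, the usual normal-approximation formula for a single two-group comparison (two-sided alpha = 0.05, power 0.8) gives a rough per-group sample size as a function of the standardized effect size d:

```latex
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{d^2}
  = \frac{2\,(1.96 + 0.84)^2}{d^2} \approx \frac{15.7}{d^2}
```

So a large effect of d = 1 needs roughly 16 animals per group, and d = 0.5 needs roughly 63 per group. This is only a sketch for one pairwise comparison; with three groups you would normally plan around an ANOVA or adjust alpha for multiple comparisons, but it illustrates why an n = 3 pilot rarely pins the effect size down well enough to plan the main experiment.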

Is Monte Carlo Tree Search an appropriate method for this problem size (large action/state space)?

I'm doing research on a finite-horizon decision problem with t = 1, ..., 40 periods. In every time step t, the (only) agent has to choose an action a(t) ∈ A(t) while in state s(t) ∈ S(t). The chosen action a(t) in state s(t) determines the transition to the following state s(t+1). So this is a finite-horizon Markov decision problem.
In my case the following holds: A(t) = A and S(t) = S, where the size of A is 6,000,000 and the size of S is 10^8. Furthermore, the transition function is stochastic.
Since I'm relatively new to the theory of Monte Carlo Tree Search (MCTS), I ask myself: is MCTS an appropriate method for my problem, in particular given the large sizes of A and S and the stochastic transition function?
I have already read a lot of papers about MCTS (e.g. on progressive widening and double progressive widening, which sound quite promising), but maybe someone can share their experience applying MCTS to similar problems, or suggest appropriate methods for this kind of problem (large state/action space and a stochastic transition function).
With 6 million stochastic actions per state, I don't think any kind of simulation is realistically going to differentiate between those moves without running essentially forever.
100 million states isn't a lot, however: you can store the value for all of them in less than a gigabyte of memory, and something like value iteration or policy iteration would solve this optimally much faster.
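To show what that looks like, here is a toy sketch of finite-horizon value iteration (backward induction). The MDP below is a tiny placeholder I made up; for your problem you would replace the dense transition table with, for each (state, action), a short list of reachable successors and their probabilities, and two value arrays in float would take roughly 2 × 400 MB for 10^8 states.

```c
#include <stdio.h>
#include <string.h>

/* Toy finite-horizon value iteration (backward induction).
 * Sizes and the MDP itself are placeholders for illustration only. */
#define NS 4    /* number of states  (placeholder) */
#define NA 2    /* number of actions (placeholder) */
#define T  40   /* horizon */

static double P[NS][NA][NS];  /* P[s][a][s'] transition probabilities */
static double R[NS][NA];      /* expected immediate reward */

static void build_toy_mdp(void) {
    for (int s = 0; s < NS; s++)
        for (int a = 0; a < NA; a++) {
            int target = (a == 0) ? s : (s + 1) % NS;   /* action 1 moves forward */
            P[s][a][target]            = 0.9;
            P[s][a][(target + 1) % NS] = 0.1;           /* stochastic slip */
            R[s][a] = (target == NS - 1) ? 1.0 : 0.0;
        }
}

int main(void) {
    static double V[NS], Vnext[NS];
    build_toy_mdp();
    memset(Vnext, 0, sizeof Vnext);          /* terminal value V_T = 0 */

    for (int t = T - 1; t >= 0; t--) {       /* sweep backward over time */
        for (int s = 0; s < NS; s++) {
            double best = -1e300;
            for (int a = 0; a < NA; a++) {
                double q = R[s][a];
                for (int sp = 0; sp < NS; sp++)
                    q += P[s][a][sp] * Vnext[sp];
                if (q > best) best = q;
            }
            V[s] = best;
        }
        memcpy(Vnext, V, sizeof V);
    }
    for (int s = 0; s < NS; s++)
        printf("V_0(state %d) = %.3f\n", s, V[s]);
    return 0;
}
```

The cost is 40 sweeps, each proportional to the number of (state, action, successor) triples, so the practical question is how many of the 6 million actions are actually applicable in a given state and how many successors each one has.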

Why does LevelDB make its lower level 10 times bigger than upper one?

According to the official documentation, each level in LevelDB is 10 times bigger than the one above it.
The question is: why 10? Not 2? Not 20? Is it the result of some rigorous mathematical calculation, or does it just work?
I have read the original LSM-tree paper. I can understand the multi-component part, because it would be too hard to merge the C0 tree with a very large C1 tree. But the paper says nothing about what the best parameter is.
Am I right? It is actually an interview question. How can I answer it properly if there is no single best parameter?
10x is a reasonable value; it is probably not the result of a rigorous derivation.
The coefficient can't be too small, because that would create too many levels, which is unfriendly to reads and results in more space amplification.
It also can't be too large: as you mentioned, the cost of compaction increases, since each compaction involves a higher average number of participating SSTs.
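A rough back-of-the-envelope (my own sketch, not from the LevelDB docs) makes the trade-off visible. With a level size ratio T and total data N times the size of the first level:

```latex
L \approx \log_T N
\qquad
\text{write amplification} \approx T \cdot L \approx T \cdot \log_T N
```

A small T means many levels, which hurts reads and space usage; a large T means every compaction rewrites each key against roughly T times as much lower-level data, which hurts writes. T = 10 sits in the flat middle of that curve, which matches the "reasonable but not rigorous" characterization above.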

Microcontroller Peak Detection in C using slope

I am making a finger plethysmograph (FP) using an LED and a receiver. The sensor produces an analog pulse waveform that is filtered, amplified, and fed into a microcontroller input with a range of 0-3.3 V. This signal is converted into digital form.
Sampling rate is 8 MHz, processor frequency is 26 MHz, and precision is 10 or 8 bits.
I am having problems coming up with a robust method for peak detection. I want to be able to detect heart pulses from the finger plethysmograph. I have managed to produce an accurate measurement of heart rate using a threshold method. However, the FP is extremely sensitive to movement, and the offset of the signal changes when the finger moves. The peaks of the signal still show up, but with a varying voltage offset.
Therefore, I am proposing a peak detection method that uses the slope to detect peaks. For example, if a peak is produced, the slope before and after the maximum point will be positive and negative, respectively.
How feasible do you think this method is? Is there an easier way to perform peak detection using a microcontroller?
You can still get false peak detections when the device is moved. This will be present whether you are timing average peak duration or applying an FFT (fast Fourier transform).
With an FFT you should be able to ignore peaks outside the range of frequencies you are considering (i.e. those below 30 bpm or above 300 bpm, say).
As Kenny suggests, 8MHz might overwhelm a 26MHz chip. Any particular reason for such a high sampling rate?
Like some of the comments, I would also recommend lowering your sample rate since you only care about pulse (i.e. heart rate) for now. So, assuming you're going to be looking at resting heart rate, you'll be in the sub-1Hz to 2Hz range (60 BPM = 1Hz), depending on subject health, age, etc.
In order to isolate the frequency range of interest, I would also recommend a simple, low-order digital filter. If you have access to Matlab, you can play around with Digital Filter Design using its Filter Design and Analysis Tool (Introduction to the FDATool). As you'll find out, Digital Filtering (wiki) is not computationally expensive since it is a matter of multiplication and addition.
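To illustrate how cheap such a filter can be, here is a generic first-order IIR low-pass sketch (not a tuned design; the cutoff and sample rate you pass in are whatever you settle on, e.g. with FDATool):

```c
/* First-order IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
 * One multiply and two adds per sample, so it runs easily on a small
 * microcontroller. All values here are placeholders. */
#define PI_F 3.14159265f

typedef struct {
    float alpha;  /* smoothing coefficient in (0, 1] */
    float y;      /* previous output */
} lowpass_t;

void lowpass_init(lowpass_t *f, float cutoff_hz, float sample_hz) {
    float dt = 1.0f / sample_hz;
    float rc = 1.0f / (2.0f * PI_F * cutoff_hz);  /* RC = 1 / (2*pi*fc) */
    f->alpha = dt / (rc + dt);
    f->y = 0.0f;
}

float lowpass_step(lowpass_t *f, float x) {
    f->y += f->alpha * (x - f->y);
    return f->y;
}
```

For example, lowpass_init(&f, 5.0f, 100.0f) gives roughly a 5 Hz cutoff at a 100 Hz sample rate, which comfortably covers the 0.5-5 Hz (30-300 bpm) band mentioned above.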
To answer the detection part of your question, YES, it is certainly feasible to implement peak detection on the plethysmograph waveform within a microcontroller. Taking your example, a slope-based peak detection algorithm would operate on your waveform data, searching for changes in slope, essentially where the slope waveform crosses zero.
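As a rough illustration of that idea (a minimal sketch, not production code; the slope spread and refractory spacing below are placeholder values you would tune):

```c
/* Slope-based peak detector: flag a peak where the coarse slope changes
 * sign from positive to negative, with a refractory period so detections
 * can't be physiologically too close together. Constants are placeholders. */
#define SPREAD       4   /* slope computed between samples k and k-SPREAD */
#define MIN_SPACING 30   /* minimum samples between detected peaks        */

int detect_peaks(const int *x, int n, int *peaks, int max_peaks) {
    int count = 0, last_peak = -MIN_SPACING, prev_slope = 0;
    for (int k = SPREAD; k < n; k++) {
        int slope = x[k] - x[k - SPREAD];            /* coarse slope */
        if (prev_slope > 0 && slope <= 0 &&          /* + to - crossing */
            k - last_peak >= MIN_SPACING &&
            count < max_peaks) {
            peaks[count++] = k - SPREAD / 2;         /* approximate peak index */
            last_peak = k;
        }
        if (slope != 0) prev_slope = slope;          /* ignore flat stretches */
    }
    return count;
}
```

Because only the sign of the slope matters, a slowly drifting voltage offset has little effect, which is the main advantage over a fixed threshold.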
Here are a few other things to consider about your application:
Calculating slope can have a "spread" (i.e. do you find the slope between adjacent samples, or samples which are a few samples apart?)
What if your peak detection algorithm locates peaks that are too close together, or too far apart, in a physiological sense?
A Pulse Oximeter (wiki) often utilizes LEDs which emit Red and Infrared light. How does the frequency of the LED affect the plethysmograph? (HINT: It may not be significant, but I believe you'll find one wavelength to yield greater amplitudes in your frequency range of interest.)
Of course you'll find a variety of potential algorithms if you do a literature search but I think slope-based detection is great for its simplicity. Hope it helps.
If you can detect the period using zero crossings, even at 10x oversampling of a 10 Hz signal, you can fit a line to the quick-and-dirty edge to find the exact period, then subtract the new wave's samples in that period from the previous wave's samples to get the DC offset. The period measurement will have the precision of your sample rate. Doing operations on time- and amplitude-normalized data will be much easier.
This idea is computationally light compared to FFT, which still needs additional data processing.

Genetic Algorithm Sudoku - optimizing mutation

I am in the process of writing a genetic algorithm to solve Sudoku puzzles and was hoping for some input. The algorithm solves puzzles occasionally (about 1 out of 10 times on the same puzzle, with a maximum of 1,000,000 iterations), and I am trying to get a little input about mutation rates, repopulation, and splicing. Any input is greatly appreciated, as this is brand new to me and I feel like I am not doing things 100% correctly.
A quick overview of the algorithm
Fitness Function
Counts the number of unique values from 1 through 9 in each column, row, and 3×3 sub-box. The count of unique values in each subset is divided by 9, giving a floating-point value between 0 and 1. The sum of these values is divided by 27, providing a total fitness value between 0 and 1, where 1 indicates a solved puzzle.
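As I read that description, the fitness function boils down to something like the following sketch (assuming the board is a 9×9 array of values 1-9; the indexing scheme is my own):

```c
/* Fitness as described above: for each of the 27 units (9 rows, 9 columns,
 * 9 boxes), count distinct values 1..9 and divide by 9, then average the
 * 27 unit scores. A solved board scores exactly 1.0. */
double fitness(const int board[9][9]) {
    double total = 0.0;
    for (int u = 0; u < 27; u++) {
        int seen[10] = {0}, unique = 0;
        for (int i = 0; i < 9; i++) {
            int r, c;
            if (u < 9)       { r = u; c = i; }       /* row    */
            else if (u < 18) { r = i; c = u - 9; }   /* column */
            else {                                   /* box    */
                r = ((u - 18) / 3) * 3 + i / 3;
                c = ((u - 18) % 3) * 3 + i % 3;
            }
            int v = board[r][c];
            if (v >= 1 && v <= 9 && !seen[v]) { seen[v] = 1; unique++; }
        }
        total += unique / 9.0;
    }
    return total / 27.0;
}
```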
Population Size:
100
Selection:
Roulette wheel method. Chromosomes are selected at random, where chromosomes with higher fitness values have a slightly better chance of selection.
Reproduction:
Two randomly selected chromosomes/boards swap a randomly selected subset (row, column, or 3×3 box). The choice of subset (which row, column, or box) is random. The resulting boards are introduced into the population.
Reproduction Rate: 12% of population per cycle
There are six reproductions per iteration resulting in 12 new chromosomes per cycle of the algorithm.
Mutation: mutation is applied to 2 percent of the population after 10 iterations with no improvement in the highest fitness.
Listed below are the three mutation methods, which have varying selection probabilities (a sketch of the first one follows the list).
1: Swap randomly selected numbers. The method selects two random numbers and swaps them throughout the board. This method seems to have the greatest impact early in the algorithm's growth pattern. 25% chance of selection.
2: Introduce random changes: randomly select two cells and change their values. This method seems to help keep the algorithm from converging prematurely. 65% chance of selection.
3: Count the number of each value in the board. A solved board contains a count of 9 for each number between 1 and 9. This method takes any number that occurs fewer than 9 times and randomly swaps it with a number that occurs more than 9 times. This seems to have a positive impact on the algorithm but is only used sparingly. 10% chance of selection.
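For reference, the first method as I understand it looks roughly like this (a sketch only; it ignores any fixed clue cells, which your implementation presumably protects):

```c
#include <stdlib.h>

/* Mutation method 1 sketch: pick two distinct values and swap every
 * occurrence of them across the whole board. */
void mutate_swap_values(int board[9][9]) {
    int a = rand() % 9 + 1;
    int b = rand() % 9 + 1;
    while (b == a) b = rand() % 9 + 1;
    for (int r = 0; r < 9; r++)
        for (int c = 0; c < 9; c++) {
            if (board[r][c] == a)      board[r][c] = b;
            else if (board[r][c] == b) board[r][c] = a;
        }
}
```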
My main question is: at what rate should I apply mutation? It seems that as I increase mutation, I get faster initial results. However, as the population approaches a correct solution, I think the higher rate of change introduces too many bad chromosomes and genes into the population. On the other hand, with the lower rate of change the algorithm seems to converge too early.
One last question is whether there is a better approach to mutation.
You can anneal the mutation rate over time to get the sort of convergence behavior you're describing. But I actually think there are probably bigger gains to be had by modifying other parts of your algorithm.
Roulette wheel selection applies a very high degree of selection pressure in general. It tends to cause a pretty rapid loss of diversity fairly early in the process. Binary tournament selection is usually a better place to start experimenting. It's a more gradual form of pressure, and just as importantly, it's much better controlled.
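For concreteness, binary tournament selection is just the following (a minimal sketch; fitness is whatever per-individual score your population already stores):

```c
#include <stdlib.h>

/* Binary tournament selection sketch: draw two individuals uniformly at
 * random and keep the index of the fitter one. Pressure stays mild and is
 * easy to control (e.g. by using larger tournaments). */
int tournament_select(const double *fitness, int pop_size) {
    int i = rand() % pop_size;
    int j = rand() % pop_size;
    return (fitness[i] >= fitness[j]) ? i : j;
}
```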
With a less aggressive selection mechanism, you can afford to produce more offspring, since you don't have to worry about producing so many near-copies of the best one or two individuals. Rather than 12% of the population producing offspring (possibly less because of repetition of parents in the mating pool), I'd go with 100%. You don't necessarily need to literally make sure every parent participates, but just generate the same number of offspring as you have parents.
Some form of mild elitism will probably then be helpful so that you don't lose good parents. Maybe keep the best 2-5 individuals from the parent population if they're better than the worst 2-5 offspring.
With elitism, you can use a bit higher mutation rate. All three of your operators seem useful. (Note that #3 is actually a form of local search embedded in your genetic algorithm. That's often a huge win in terms of performance. You could in fact extend #3 into a much more sophisticated method that looped until it couldn't figure out how to make any further improvements.)
I don't see an obvious better/worse set of weights for your three mutation operators. I think at that point, you're firmly within the realm of experimental parameter tuning. Another idea is to inject a bit of knowledge into the process and, for example, say that early on in the process, you choose between them randomly. Later, as the algorithm is converging, favor the mutation operators that you think are more likely to help finish "almost-solved" boards.
I once made a fairly competent Sudoku solver, using GA. Blogged about the details (including different representations and mutation) here:
http://fakeguido.blogspot.com/2010/05/solving-sudoku-with-genetic-algorithms.html
