Mathematica FindRoot: exploring the parameter space with loops

I am solving three non-linear equations in three variables (H0D,H0S and H1S) using FindRoot. In addition to the three variables of interest, there are four parameters in these equations that I would like to be able to vary. My parameters and the range in which I want to vary them are as follows:
CF∈{0,15} , CR∈{0,8} , T∈{0,0.35} , H1R∈{40,79}
The problem is that my non-linear system may not have any solutions for part of this parameter range. What I basically want to ask is if there is a smart way to find out exactly what part of my parameter range admits real solutions.
I could run FindRoot inside a loop, but because of the non-linearity FindRoot is very sensitive to initial conditions, so an error message could frequently be due to bad initial conditions rather than the absence of a solution.
Is there a way for me to find out what parameter space works, short of plugging 10^4 combinations of parameter values by hand and playing around with the initial conditions and hoping that FindRoot gives me a solution?
Thanks a lot,


Multiple IF QUARTILEs returning wrong values

I am using a nested IF statement inside a QUARTILE wrapper, and it only partly works: it returns values that are slightly off from what I get when I calculate the quartile manually over the same range of values.
I've looked around, but most of the posts and research are about designing the formula; I haven't come across anything that explains the odd behaviour I'm observing.
My formula (ctrl+shift+enter as it's an array formula): =QUARTILE(IF(((F2:$F$10=$W$4)*($Q$2:$Q$10=$W$3))*($E$2:$E$10=W$2),IF($O$2:$O$10<>"",$O$2:$O$10)),1)
The full dataset:
The formula has 3 AND conditions that need to be met, and it should return this range:
The 25% value is then calculated based on that range.
If I take the output from the formula, 25%-ile (QUARTILE,1) is 0.8803, but if I calculate it manually based on the data points right above, it comes out to 0.8685 and I can't see why.
I suspect the IF statements are picking up a slightly different range, or that the values meeting the IF conditions come from different rows than I expect.
If you look at the table here you can see that there is more than one way of estimating a quartile (or other percentile) from a sample, and Excel has two. The one you are doing by hand must be like Quartile.exc and the one you are using in the formula is like Quartile.inc.
Basically both formulas work out the rank of the quartile value. If it isn't an integer it interpolates (e.g. if it was 1.5, that means the quartile lies half way between the first and second numbers in ascending order). You might think that there wouldn't be much difference, but for small samples there is a massive difference:
Quartile.exc      Quartile.inc
Rank=(N+1)/4      Rank=(N+3)/4
Here's how it would look with your data
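As a rough illustration of the two rank formulas, here is a minimal Python sketch. The function names and the sample values are made up for the example (they are not the data from the question), and the code only mirrors the interpolation rule described above rather than calling Excel itself.

```python
def value_at_rank(sorted_vals, rank):
    """Linearly interpolate the value at a 1-based, possibly fractional rank."""
    n = len(sorted_vals)
    if rank < 1 or rank > n:
        raise ValueError("rank falls outside the sample; too few data points")
    lower = int(rank)        # integer part of the rank
    frac = rank - lower      # fractional part used for interpolation
    if frac == 0:
        return sorted_vals[lower - 1]
    below = sorted_vals[lower - 1]
    above = sorted_vals[lower]
    return below + frac * (above - below)


def first_quartile_inc(values):
    """First quartile using rank (N+3)/4, the inclusive method."""
    s = sorted(values)
    return value_at_rank(s, (len(s) + 3) / 4)


def first_quartile_exc(values):
    """First quartile using rank (N+1)/4, the exclusive method."""
    s = sorted(values)
    return value_at_rank(s, (len(s) + 1) / 4)


# Hypothetical sample, only to show that the two methods disagree for small N.
sample = [0.80, 0.85, 0.90, 0.95, 1.00]
q1_inc = first_quartile_inc(sample)  # rank 2.0: the second value
q1_exc = first_quartile_exc(sample)  # rank 1.5: halfway between first and second
print(q1_inc, q1_exc)
```

With only five points the two estimates already differ, which is the kind of gap described in the question.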

The order of features affects the results of a neural network

I am confused really.
I have a simple ordered set of features: all the letters and a few symbols, with each feature counting how many times that character occurs in a string.
My selection as a result is as follows
I have a test sample of 65 values, and the MLP can get 46 correct.
Now if I change the order of the features to a random order, train with the same data, and evaluate the same values, I get a different number of correct predictions, e.g. 49.
Results are consistent (the same order will yield the same accuracy) but the accuracy changes between random orders.
The question is, is this supposed to happen? I cannot see how this is backed up by the theory. Am I missing something large here?
PS. I am using WEKA's implementation of the MLP
I'm not familiar with the WEKA implementation of the MLP but that doesn't seem like something that should be happening with a neural network algorithm.
It almost seems like it's getting stuck in some sort of local minimum. The algorithm may be initializing the weights of the individual neurons the same way every time. Changing the feature order would then pair each feature with a different initial weight, which amounts to a different starting point, so the algorithm arrives at the same answer every time for a given feature order but at a different answer for a different order. The "local minimum" might also be determined by the algorithm only going through a certain number of iterations each time.
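A small sketch of that idea, assuming the initial weights come from a fixed random seed (an assumption for illustration; I have not checked how WEKA seeds its MLP). The helper names and the feature vector are hypothetical. With the same seeded weights, a reordered input produces a different initial output from a single neuron, so training starts from a different point.

```python
import random


def initial_weights(n, seed=0):
    """Draw n initial weights from a fixed seed, so every run gets the same ones."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def neuron_output(weights, x):
    """Weighted sum of the inputs (activation omitted for simplicity)."""
    return sum(w * xi for w, xi in zip(weights, x))


# Hypothetical character-count features for one string.
x = [3.0, 0.0, 1.0, 2.0]
perm = [2, 0, 3, 1]                # a reordering of the feature columns
x_perm = [x[i] for i in perm]

w = initial_weights(len(x))        # same seed, so same weights by position

out_original = neuron_output(w, x)
out_permuted = neuron_output(w, x_perm)

# Only if the weights are permuted the same way do we get the original output back.
w_perm = [w[i] for i in perm]
out_matched = neuron_output(w_perm, x_perm)

print(out_original, out_permuted, out_matched)
```

So reordering the features with a fixed seed behaves like choosing a different random initialization, which is consistent with results being repeatable per order but different across orders.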

AI - what is included in the state

I'm taking my first course in AI, and I have to define some problems in my homework (not yet solve them, just supply a definition).
So I have to define the following for the Boolean satisfiability problem:
What is a state?
What is the initial state?
What is a final state?
What are the operators?
My question is: Should the formula be a part of the state?
Considerations so far:
The operator doesn't change it, and it's constant through the computation, so it's not.
If I do include it, in theory the search space gets much bigger, since more states are possible, but in reality the formula can't be changed, so I get a big state and a branching factor that does not correspond to it.
It's varying from one execution to the next, so it should be a part of the state.
You need only really consider the varying parts of the problem to be a state when conducting a search such as this, although I'd say in this case it really comes down to how you define the problem.
The search space for a given run of the algorithm depends upon the input formula, but after that it is fixed, i.e. you are searching the space of n-length bit vectors, where n is the number of variables in the formula. So the formula is not part of the state because it does not vary.
The counter claim is that you are searching in a larger space of formula-vector pairs, but as you cannot change the formula as part of the problem, this has not really increased the size of the search space. So I would not make the claim that "If I do include it, in theory the search space gets much bigger". It does not, the reachable states are the same, the branching is the same, the space that requires exploring to solve the problem is the same.
Given this, my answer would be that the formula is not part of the state, but defines the nature of the state space. So the answers to your four questions will each be functionally dependent on the formula in some way, but the state itself depends only on the number of variables in the formula.
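One way to picture this is a minimal Python sketch in which the formula lives in the problem definition and the state is just a bit vector. The class name, the CNF encoding (signed integers for literals), the all-False initial state and the "flip one variable" operator are my own assumptions for illustration, not the only valid formulation.

```python
from collections import deque


class SatProblem:
    """The formula is fixed per problem instance; it is not part of the state."""

    def __init__(self, clauses, num_vars):
        # CNF as a list of clauses; literal k means variable k, -k its negation.
        self.clauses = clauses
        self.num_vars = num_vars

    def initial_state(self):
        # A state is a tuple of n booleans; start with every variable False.
        return (False,) * self.num_vars

    def successors(self, state):
        # Operator: flip exactly one variable.
        for i in range(self.num_vars):
            yield state[:i] + (not state[i],) + state[i + 1:]

    def is_goal(self, state):
        # Final state: every clause has at least one satisfied literal.
        return all(
            any(state[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in self.clauses
        )


def breadth_first_search(problem):
    """Return a satisfying assignment, or None if no reachable state is a goal."""
    start = problem.initial_state()
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if problem.is_goal(state):
            return state
        for nxt in problem.successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


# (x1 or x2) and (not x1 or x2)
solution = breadth_first_search(SatProblem([[1, 2], [-1, 2]], num_vars=2))
# x1 and not x1 has no satisfying assignment
no_solution = breadth_first_search(SatProblem([[1], [-1]], num_vars=1))
print(solution, no_solution)
```

Changing the formula creates a different `SatProblem`, but the states explored in any one run are still only the 2^n bit vectors.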
Hope that makes sense!
This is just a note for future readers - Not an answer
Vic Smith is right. Another way to look at the fact that in theory there are more states but in practice not (my second dot) is to think of them as separate, disjoint state spaces. For example, for the formula "X or Y" there is one space, and for "not X and Y" there is another one, and they have no common nodes in the representation.
So it can vary from one execution to another, but still has the same "reachable" states, and same branching factor. And each execution has a different starting state.

Help--100% accuracy with LibSVM?

Nominally a good problem to have, but I'm pretty sure it is because something funny is going on...
As context, I'm working on a problem in the facial expression/recognition space, so getting 100% accuracy seems incredibly implausible (not that it would be plausible in most applications...). I'm guessing there is either some consistent bias in the data set that is making it overly easy for an SVM to pull out the answer, =or=, more likely, I've done something wrong on the SVM side.
I'm looking for suggestions to help understand what is going on--is it me (=my usage of LibSVM)? Or is it the data?
The details:
~2500 labeled data vectors/instances (transformed video frames of individuals--<20 individual persons total), binary classification problem. ~900 features/instance. Unbalanced data set at about a 1:4 ratio.
Ran to separate the data into test (500 instances) and train (remaining).
Ran "svm-train -t 0 ". (Note: apparently no need for '-w1 1 -w-1 4'...)
Ran svm-predict on the test file. Accuracy=100%!
Things tried:
Checked about 10 times over that I'm not training & testing on the same data files, through some inadvertent command-line argument error
re-ran (even with -s 1) multiple times and did train/test on multiple different data sets (in case I randomly hit upon the most magical train/test pa
ran a simple diff-like check to confirm that the test file is not a subset of the training data
svm-scale on the data has no effect on accuracy (accuracy=100%). (Although the number of support vectors does drop from nSV=127, nBSV=64 to nSV=72, nBSV=0.)
((weird)) using the default RBF kernel (vice linear -- i.e., removing '-t 0') results in accuracy going to garbage(?!)
(sanity check) running svm-predict using a model trained on a scaled data set against an unscaled data set results in accuracy = 80% (i.e., it always guesses the dominant class). This is strictly a sanity check to make sure that somehow svm-predict is nominally acting right on my machine.
Tentative conclusion?:
Something with the data is wacked--somehow, within the data set, there is a subtle, experimenter-driven effect that the SVM is picking up on.
(This doesn't, on first pass, explain why the RBF kernel gives garbage results, however.)
Would greatly appreciate any suggestions on a) how to fix my usage of LibSVM (if that is actually the problem) or b) determine what subtle experimenter-bias in the data LibSVM is picking up on.
Two other ideas:
Make sure you're not training and testing on the same data. This sounds kind of dumb, but in computer vision applications you should take care that you're not repeating data (say, two frames of the same video falling in different folds), that you're not training and testing on the same individual, etc. It is more subtle than it sounds.
Make sure you search for gamma and C parameters for the RBF kernel. There are good theoretical (asymptotic) results that justify that a linear classifier is just a degenerate RBF classifier. So you should just look for a good (C, gamma) pair.
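To make the first point concrete, here is a minimal sketch of a split that keeps every individual entirely in either the training or the test set. The function name, the `person_ids` list and the 20% test fraction are hypothetical; the indices it returns would then be used to write the LibSVM train and test files.

```python
import random


def split_by_group(groups, test_fraction=0.2, seed=0):
    """Split instance indices so that no group appears on both sides."""
    unique = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Hold out whole groups (e.g. persons or videos), at least one.
    n_test = max(1, round(len(unique) * test_fraction))
    test_groups = set(unique[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


# Hypothetical example: one person id per video frame.
person_ids = ["p1", "p1", "p2", "p2", "p2", "p3", "p4", "p4", "p5", "p5"]
train_idx, test_idx = split_by_group(person_ids)

train_people = {person_ids[i] for i in train_idx}
test_people = {person_ids[i] for i in test_idx}
print(sorted(train_people), sorted(test_people))
```

If accuracy drops sharply with a split like this, the classifier was probably recognizing individuals or near-duplicate frames rather than the expression.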
Notwithstanding that the devil is in the details, here are three simple tests you could try:
Quickie (~2 minutes): Run the data through a decision tree algorithm. This is available in Matlab via classregtree, or you can load into R and use rpart. This could tell you if one or just a few features happen to give a perfect separation.
Not-so-quickie (~10-60 minutes, depending on your infrastructure): Iteratively split the features (i.e. from 900 to 2 sets of 450), train, and test. If one of the subsets gives you perfect classification, split it again. It would take fewer than 10 such splits to find out where the problem variables are. If it happens to "break" with many variables remaining (or even in the first split), select a different random subset of features, shave off fewer variables at a time, etc. It can't possibly need all 900 to split the data.
Deeper analysis (minutes to several hours): try permutations of labels. If you can permute all of them and still get perfect separation, you have some problem in your train/test setup. If you select increasingly larger subsets to permute (or, if going in the other direction, to leave static), you can see where you begin to lose separability. Alternatively, consider decreasing your training set size and if you get separability even with a very small training set, then something is weird.
Method #1 is fast & should be insightful. There are some other methods I could recommend, but #1 and #2 are easy and it would be odd if they don't give any insights.
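If you don't have a decision tree package handy, a crude stand-in for method #1 is to check each feature on its own for a threshold that separates the two classes perfectly. This is only a sketch with a made-up function name and toy data, and it is weaker than a real tree because it looks at one feature and one split at a time.

```python
def perfectly_separating_features(rows, labels):
    """Return indices of features where one threshold splits the two classes."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected a binary classification problem")
    found = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == classes[0]]
        b = [r[j] for r, y in zip(rows, labels) if y == classes[1]]
        # Separable on feature j if the two value ranges do not overlap.
        if max(a) < min(b) or max(b) < min(a):
            found.append(j)
    return found


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0]]
labels = [-1, -1, 1, 1]
suspects = perfectly_separating_features(rows, labels)
print(suspects)
```

Any feature index it reports on the real data is a good candidate for the experimenter-driven effect suspected in the question.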

Solving the problem of finding parts which work well with each other

I have a database of items. They are parts for cars, and some parts of the same kind (e.g. cams/pistons) work better than others in particular combinations (e.g. one product will work well with another, while another combination of 2 parts may not).
There are so many possible permutations, what solutions apply to this problem?
So far, I feel that these are possible approaches (Where I have question marks, something tells me these are solutions but I am not 100% confident they are).
Neural networks (?)
Collection-based approach (selection of parts in a collection for cam, and likewise for pistons in another collection, all work well with each other)
Business rules engine (?)
What are good ways to tackle this sort of problem?
The answer largely depends on how you calculate 'works better'.
1) Independent values
Assume that 'works better' is a function f of a combination of items x=(a,b,c,d,...), and (!) that there are no regularities that let you decide whether f(x') is bigger or smaller than f(x) knowing only x, f(x) and x' (which would allow you to find the xmax faster). Then you will have to calculate f for all combinations at least once.
Once you calculate it for all combinations you can sort. If you will need to look up data in a partitioned way, using SQL/RDBMS might be a good approach (for example, finding top 5 best solutions but without such and such part).
For extra points, after calculating all of the results and storing them you could analyze them statistically and try to establish patterns.
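A minimal sketch of this brute-force case, in Python rather than SQL. The part names and the score table are invented for the example; in practice f would be whatever measure of 'works better' you have.

```python
from itertools import product

# Hypothetical part catalogue.
cams = ["cam_A", "cam_B"]
pistons = ["piston_X", "piston_Y"]

# Hypothetical scores standing in for the 'works better' function f.
PAIR_SCORE = {
    ("cam_A", "piston_X"): 0.9,
    ("cam_A", "piston_Y"): 0.4,
    ("cam_B", "piston_X"): 0.6,
    ("cam_B", "piston_Y"): 0.7,
}


def f(combo):
    """Score one combination of parts."""
    return PAIR_SCORE[combo]


def rank_combinations(part_lists, score):
    """Evaluate every combination once and sort from best to worst."""
    scored = [(score(combo), combo) for combo in product(*part_lists)]
    scored.sort(reverse=True)
    return scored


ranked = rank_combinations([cams, pistons], f)

# Partitioned lookup: best combination that does not use cam_A.
best_without_cam_a = next(combo for _, combo in ranked if "cam_A" not in combo)
print(ranked[0], best_without_cam_a)
```

The number of combinations is the product of the list sizes, so this only stays practical while that product is small enough to enumerate.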
2) Dependent values
If you can establish some regularities (and maybe you can) regarding the values, the search for the max value can be simplified and sped up.
For example, if you know that the function you are trying to maximize is a linear combination of all the parameters, then you could look into linear programming.
If it is not...
