I am currently trying to write a code to solve a non linear system of equations. I am using the functions of the gsl library, more specifically the multiroot_fdf_solver.
My problem is that it currently doesn't want to converge. More specifically, I have the following behavior:
-if my initial conditions are close to the result, the gsl_multiroot_fdf_solver_iterate does not update the parameters at all. I tried to display the results on the different steps, and I have for all the parameters dx = NaN (I think this quite srange), the status of gsl_multiroot_fdf_solver_iterate is "success" and the status of gsl_multiroot_test_residual is "the iteration has not converged yet"
-the parameters are only updated if my initial conditions are really far from the expected result. Obvisously in this case it does not converge to the right values.
I have already checked multiple times the expression of my function and my Jacobian, and they seem good.
I have to precise that my Jacobian (and my system as well) are quite complicated expression with many trigonometric function.
Would you have any idea of what it could be? Is it possible that if the expression of the Jacobian is too complicated, it has troubles to compute it?
Thank you in advance for your answers, I am really stucked at this point.
Related
The N2 diagram for my full problem is below.
The N2 diagram for the coupled portion of the problem is below.
I have a DirectSolver handling the coupling between LLTForces and ImplicitLiftingLine, and an LNBGS solver handling the coupling between LiftingLineGroup and TestCL.
The gist for the problem is here: https://gist.github.com/eufren/31c0e569ed703b2aea3e2ef5360610f7
I have implemented guess_nonlinear() on ImplicitLiftingLine, which should use various outputs from LLTGeometry to give a good initial guess for the vortex strengths based on a linearised form of the governing equations.
def guess_nonlinear(self, inputs, outputs, resids):
freestream_unit_vector = inputs['freestream_unit_vector']
freestream_velocity = inputs['freestream_velocity']
n = inputs['normal_vectors']
A = inputs['surface_areas']
l = inputs['bound_vortices']
ic_tot = inputs['influence_coefficients_total']
v_inf = freestream_velocity
v_inf_vec = v_inf*freestream_unit_vector
lin_numerator = np.pi * v_inf * A * np.sum(n * v_inf_vec, axis=1)
lin_denominator = (np.linalg.norm(np.cross(v_inf_vec, l), axis=1) - np.pi * v_inf * A * np.sum(np.sum(n * ic_tot, axis=2), axis=1))
lin_vtx_str = lin_numerator / lin_denominator
outputs['vortex_strengths'] = lin_vtx_str
However, when the problem is run for the first time, any inputs not explicitly set with p.set_val() are all 1s. This causes guess_nonlinear() to give a bad output and so the system fails to converge:
As far as I can tell, the execution order for the LLT group is correct, and the geometry components should be being executed before the implicit component. I'm confused as to why this doesn't seem to actually be happening when the code is run, and instead these inputs are taking their default values.
What do I need to change to get this to work properly? Additionally, I've found difficulty in getting LNBGS to converge (hence adding guess_nonlinear()) during optimisation - only DirectSolver gets all the way through the optimisation without issues, but it's very slow for large numbers of LLT nodes). How can I improve the linear and nonlinear solver selection, and improve the reliability of the iterative solver?
Note: Thanks for providing a testable example. It made figuring out the answer to your question a lot simpler. Your problem was a bit subtle and I would not have been able to give a good answer without runnable code
Your first question: "Why are all the inputs 1"
"Short" Answer
You have put the nonlinear solver to high in the model hierarchy, which then included a key precurser component that computed your input values. By moving the solver down to a lower level of the model, I was able to ensure that the precurser component (LTTGeometry) ran and had valid outputs before you got to the guess_nonlinear of implicit component.
Here is what you had (Notice the implicit solver included LTTGeometry even though the data cycle does not require that component:
I moved both the nonlinear solver and the linear solver down into the LTTCycle group, which then allows the LTTGeometry component to execute before getting to the nonlinear solver and guess_nonlinear step:
My fix is only partially correct, since there is a secondary cycle from the TestCL component that also needs a solver and does not have one. However, that cycle still does not involve the LTTGeometry group. So the fully correct fix is to restructure you model top run geometry first, and then put the LTTCycle and TestCL groups together so you can run a solver over just them. That was a bit more hacking than I wanted to do on your test problem, but you can see the general idea from the adjusted N2 above.
Long Answer
The guess_nonlinear sequence in OpenMDAO does NOT run the compute method of explicit components or of groups. It follows the execution hierarchy, and calls any guess_nonlinear that it finds. So that means that any explicit components you have in your model will NOT get executed, their outputs will not get updated with computed values, and those computed values will not get passed to the inputs of downstream components.
Things get a little tricky when you have deep model hierarchies. The guess_nonlinear method is called as the first step in the nonlinear solver process. If you have a NonLinearRunOnce solver at the top level, it will follow the compute chain down the line calling compute or solve_nonlinear on each child and doing a data transfer after each one. If one of those children happens to be a group with a nonlinear solver, then that solver will call guess_nonlinear on its children (grandchildren of the top group with the NonLinearRunOnce solver) as the first step. So any outputs that were computed by the siblings of this group will be valid, but none of the outputs from the grandchild level will have been computed yet.
You may be wondering why not just have the guess_nonlinear method call the compute for any explicit components? There is a difficult to balance trade off here. If you assume that all explicit components are very cheap to run, then it might make sense to run the compute methods --- or it might not. A lot depends on the cyclic data structure. If some early component in the group needs guesses from the later one, then running its compute isn't going to help you much at all. Perhaps more importantly though, not all explicit components are cheap to run. You might have a very expensive computation, and calling compute as part of the guess process would be way too costly.
The compromise here, if you need some kind of top level guess process, is that you can implement guess_nonlinear at the group level. It's less common to do, but it gives you total control over what happens. You can call whatever you need to call in whatever sequence.
So the absolute key thing to remember is that the only data you have available to you when a guess_nonlinear is called is any data that was computed before your containing solver was executed. That means any thing that was computed before you got to the model scope of the containing solver (not the scope of the component with the guess_method itself).
Your second question: "How can I speed this up when the number of nodes gets large?"
This one not possible to give a generic answer to at all. I noticed that you have already specified sparse partial derivatives. That is a great start, but if its still not fast enough for you then it means you're reaching the limits of what you can do with a DirectSolver. You note that this solver is the only one that gets you through the optimization without issues, which I will take to mean that ScipyKryloventer link description here and PetscKrylov are not converging the linear system well for you --- at least not by themselves. Thats not surprising, as krylov solvers almost always require some kind of preconditioner... and this is why I can't offer a generic answer. Setting up efficient linear solvers for larger-scale compute is a tricky subject. If you look into the literature, you'll find some good suggestions. You can also study open source implementations like VSPAero for some tips.
effectively, you've reached the limit of what simple linear solvers can offer you. From this point forward, OpenMDAO can help a bit by making it easier to implement some preconditioning, but you'll have to suffer the math side yourself.
I don't understand this issue:
Issue: HIS Metriken - Cyclomatic (CR-MET4): [function_name] 13>10
It appears in Klocwork analysis while checking the issues of Code: METRICS.E.HIS_Metriken___Cyclomatic__CR_MET4_
Can anyone support?
Thanks
Do you see all those ifs, elses, loops in that function?
Those are the problem, you need to either design this function's logic more elegantly or split it into more functions with well-defined purposes.
By the way, I can only see that problematic function of yours because I am especially clairvoyant. For this kind of question you should normally show your code, just to be fair towards all those other users which cannot read your mind like I did.
Naaa, not really. The cyclomatic complexity is a measure for number of potential paths through your function. And that you have crossed the treshold of 10 by 3 means your function must be full of control structs, which create many paths.
EDIT
SOLVED
Solution was to use the long double versions of sin & cos: sinl & cosl.
It is my first post here, so bear with me :).
I come today here to ask for your input on a small problem that I am having with a C application at work. Basically, I am computing an Extended Kalman Filter and one of my formulas (that I store in a variable) has multiple computations of sin and cos, at least 16 in total in the same line. I want to decrease the time it takes for the computation to be done, so the idea is to compute each cos and sin separately, store them in a variable, and then replace the variables back in the formula.
So I did this:
const ComputationType sin_Roll = compute_sin((ComputationType)(Roll));
const ComputationType sin_Pitch = compute_sin((ComputationType)(Pitch));
const ComputationType cos_Pitch = compute_cos((ComputationType)(Pitch));
const ComputationType cos_Roll = compute_cos((ComputationType)(Roll));
Where ComputationType is a macro definition (renaming) of the type Double. I know it looks ugly, a lot of maybe unnecessary castings, but this code is generated in Python and it was specifically designed so....
Also, compute_cos and compute_sin are defined as such:
#define compute_sin(a) sinf(a)
#define compute_cos(a) cosf(a)
My problem is the value I get from the "optimized" formula is different from the value of the original one.
I will post the code of both and I apologise in advance because it is very ugly and hard to follow but the main points where cos and sin have been replaced can be seen. This is my task, to clean it up and optimize it, but I am doing it step by step to make sure I don't introduce a bug.
So, the new value is:
ComputationType newValue = (ComputationType)(((((((ComputationType)-1.0))*(sin_Pitch))+((DT)*((((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(cos_Pitch)*(cos_Roll))+(((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(cos_Pitch)*(sin_Roll)))))*(cos_Pitch)*(cos_Roll))+((((DT)*((((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(cos_Roll)*(sin_Pitch))+(((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(sin_Pitch)*(sin_Roll))))+(cos_Pitch))*(cos_Roll)*(sin_Pitch))+((((ComputationType)-1.0))*(DT)*((((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(cos_Roll))+((((ComputationType)-1.0))*((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(sin_Roll)))*(sin_Roll)));
And the original is:
ComputationType originalValue = (ComputationType)(((((((ComputationType)-1.0))*(compute_sin((ComputationType)(Pitch))))+((DT)*((((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(compute_cos((ComputationType)(Pitch)))*(compute_cos((ComputationType)(Roll))))+(((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(compute_cos((ComputationType)(Pitch)))*(compute_sin((ComputationType)(Roll)))))))*(compute_cos((ComputationType)(Pitch)))*(compute_cos((ComputationType)(Roll))))+((((DT)*((((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(compute_cos((ComputationType)(Roll)))*(compute_sin((ComputationType)(Pitch))))+(((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(compute_sin((ComputationType)(Pitch)))*(compute_sin((ComputationType)(Roll))))))+(compute_cos((ComputationType)(Pitch))))*(compute_cos((ComputationType)(Roll)))*(compute_sin((ComputationType)(Pitch))))+((((ComputationType)-1.0))*(DT)*((((Gz)+((((ComputationType)-1.0))*(Dg_z)))*(compute_cos((ComputationType)(Roll))))+((((ComputationType)-1.0))*((Dg_y)+((((ComputationType)-1.0))*(Gy)))*(compute_sin((ComputationType)(Roll)))))*(compute_sin((ComputationType)(Roll)))));
What I want is to get the same value as in the original formula. To compare them I use memcmp.
Any help is welcome. I kindly thank you in advance :).
EDIT
I will post also the values that I get.
New value : -1.2214615708217025e-005
Original value : -1.2214615708215651e-005
They are similar up to a point, but the application is safety critical and it is necessary to validate the results.
You can not meet your expectation for a couple of reasons.
By altering the code you adjust the machine instructions being used in subtle ways that will impact the final value.
For instance if originally it was using fused multiplies and adds and this is no longer happening it will change the result.
You don't mention the target architecture. Some architectures retain more than 64bits in the floating point pipeline. These extra bits get rounded when forced into 64bit memory. Again altering how this works will have minor effects on the final output.
I am using MATLAB to build a code that does automatic tuning of the three PID controller gains. The way I am thinking of it, is to minimize the error (the difference between the desired state and the obtained one) of my system, for that, I coded a function that accepts the PID gains as input parameters and returns the calculated error, namely:
errors_vector = closedLoopSimulation(pidGains)
Since I have three set points (input commands), then the dimension of the output errors_vector is 3*N, where N is the number of time samples I have (1000 in my case). So that is the function I want to minimize, and for doing so, I tried using fminunc command, namely:
pidGains_ini = [2.4 0.1 0.4];
func = #closedLoopSimulation;
[pid, fval] = fminunc(func, pidGains_ini)
However, when I run the last piece of code, I get this error:
User supplied objective function must return a scalar value.
which is clearly due to the fact that that errors_vector is a 3*1000 array and not a scalar.
My questions would be, from the programming point of view, is there a way that I can make fminunc minimize functions that return arrays?
On the other hand, and from the Control Theory point of view, is there another way which I can optimize the PID gains automatically?
I hope I made myself clear enough.
Thanks
Minimizing a vector is not very well defined (there is something called multi-objective or multi-criteria optimization but that is somewhat specialized). "Normal" optimization methods can only minimize (or maximize) scalar objectives. I suspect in your case you could form such an objective by taking the sum of the squared errors and minimize that. To be complete: this is standard operating procedure and is often called "least squares".
I'm performing gaussian mixture model classification, and based on that, used "mvnpdf" function in MATLAB.
As far as I know the function returns a multi variate probability density for the data points or elements passed to it.
However I'm trying to recreate it on C and I assumed that mvnpdf is the regular Gaussian distribution (clearly it is not) because the results don't match.
Does anyone know how "mvnpdf" works ? Because I haven't been able to find documentation on it .
The documentation for mvnpdf is here
if you are looking for the exact code just put a break point at the point where you call it and see how it works
Okay I actually found a decent link that explains in detail what's happening inside .
This might be a better link to look at - http://octave.sourceforge.net/statistics/function/mvnpdf.html