nonlinear curve fitting - initial values

wschrabi-disabl · ‎Feb 18, 2009

Dear experts,

I found the worksheet about Levenberg-Marquardt in the forum. My question: which method is best when you do not have good initial values for the estimation? I adapted the sheet after I saw that GenfitX makes it possible to try a large range of initial values. So I first run GenfitX and then check the result with the other methods. Can I say that GenfitX does the best job when the initial estimates are bad?

Thanks for your comment

ptc-1368288 · ‎Feb 18, 2009

>Dear experts,
>I found in the forum the sheet about levenberg-marquardt. My question. Which method is the best when you do not know good initial values for estimation....
__________________

Will look later on.

There is no best method. LM can fail, depending on the underlying function. QN and CG are occasionally the only route. LM fits more types of models, that's all.

jmG

ValeryOchkov · ‎Feb 18, 2009

I sometimes use a two-step method:
1. Find guess values as the solution of a Given-MinErr block for 3 points.
2. Run genfit on all points.
Val
http://twt.mpei.ac.ru/ochkov/v_ochkov.htm
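A minimal Python sketch of this two-step idea, outside Mathcad. The model a*exp(b*x) + c, the synthetic data and all function names are assumptions made for illustration; the damped least-squares loop stands in for genfit and is not its actual implementation.

```python
import math

def model(x, a, b, c):
    # Hypothetical model, not taken from the thread.
    return a * math.exp(b * x) + c

def three_point_guess(p0, p1, p2):
    """Step 1: pass the model exactly through three equally spaced points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    h = x1 - x0                                   # assumes x2 - x1 == h
    b = math.log((y2 - y1) / (y1 - y0)) / h       # ratio of differences = exp(b*h)
    a = (y1 - y0) / (math.exp(b * x1) - math.exp(b * x0))
    c = y0 - a * math.exp(b * x0)
    return [a, b, c]

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    sol = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(M[k][col] * sol[col] for col in range(k + 1, n))
        sol[k] = (M[k][n] - tail) / M[k][k]
    return sol

def sse(p, xs, ys):
    return sum((model(x, *p) - y) ** 2 for x, y in zip(xs, ys))

def refine(p, xs, ys, iters=50, lam=1e-3):
    """Step 2: damped least squares (Levenberg-Marquardt style) on all points."""
    n = len(p)
    for _ in range(iters):
        a, b, _c = p
        J = [[math.exp(b * x), a * x * math.exp(b * x), 1.0] for x in xs]
        r = [model(x, *p) - y for x, y in zip(xs, ys)]
        A = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
        g = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        for i in range(n):
            A[i][i] *= 1.0 + lam                  # Marquardt damping of the diagonal
        step = solve(A, [-gi for gi in g])
        trial = [pi + si for pi, si in zip(p, step)]
        if sse(trial, xs, ys) < sse(p, xs, ys):
            p, lam = trial, lam / 10.0            # accept, damp less
        else:
            lam = min(lam * 10.0, 1e12)           # reject, damp more
    return p

# Synthetic data: known coefficients plus a small deterministic disturbance.
xs = [0.5 * i for i in range(17)]
ys = [model(x, 2.0, -0.5, 1.0) + 0.01 * math.sin(7 * i) for i, x in enumerate(xs)]

guess = three_point_guess((xs[0], ys[0]), (xs[8], ys[8]), (xs[16], ys[16]))
fit = refine(guess, xs, ys)
print("three-point guess:", guess)
print("refined fit:      ", fit)
```

The three points are taken at the start, middle and end of the data so that the closed-form solution applies; for a model without such a closed form, a small solve block on three points would play the same role.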

wschrabi-disabl · ‎Feb 18, 2009

Dear jmG,
thanks a lot, but (sorry for my stupid question) what do the abbreviations QN and CG stand for?

ptc-1368288 · ‎Feb 18, 2009

On 2/18/2009 9:14:18 AM, wschrabi wrote:
>Dear jmG,
>thanks a lot, but (sorry for
>my stupid question) what do the
>abbreviations QN and CG stand for?
_________________________

1. LM = as you know
2. QN = Quasi Newton
3. CG = Conjugate gradient

jmG

ptc-1368288 · ‎Feb 18, 2009

If you seriously intend to acquire fitting skills, this forum is a solid body of expertise, a haystack of projects done (and done well!). For curve fitting, Mathcad has no competition. Attached is the 49-page worksheet mentioned previously (11.2a).

I will review that work sheet in case some tools have been left to feed the cat, but it might take a day (+).

Please: enjoy and visit again.

jmG

wschrabi-disabl · ‎Feb 18, 2009

Thank you very much, jmG. I will study the Worksheet.

jmG, one question about your worksheet. Mathcad 11.2a has a predefined function F(x), and F is also the unit farad.
I have problems with F(x); when I replace F(x) with U(x) it works fine, but I do not want to change the whole sheet. Could you give me advice on how to handle this? Thanks

ptc-1368288 · ‎Feb 18, 2009

On 2/18/2009 12:33:31 PM, wschrabi wrote:
>Thank you very much, jmG. I
>will study the Worksheet.
___________________________

The rules for using Genfit are simple:

1. Use the Mathcad built-in Genfit.

2. Guess wisely ... there are tricks for many models. If no prepared tricks exist or are otherwise known, then guess manually until your guessed function crosses the data set at least once ... crossing twice is better. Crossing twice fails with some models, and you are then left with purely manual trial and error, as no fitting method is known.

3. If the model is a "normalized rational approximation" P(x)/[1 + Q(x)], then set all coefficients of order higher than 1 to zero and adjust the remaining simple linear fit until the guessed function crosses the data set.

4. Once the initial values are set, generate the vector of derivatives ... manually using the symbolics, or as automated as you can if you are on version 13 or 14. If you are on version 11 or lower, you can use Valery's module, which builds the vector of derivatives with respect to the parameters ... the Maple symbolic parser does it all, and well. But in version 13 this superb facility was broken.
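The "crosses the data set at least once" check in rule 2 can be automated by counting sign changes of (data - guessed curve). Below is a rough Python sketch; the model, the synthetic data and the function name `count_crossings` are assumptions for illustration and are not part of Genfit.

```python
import math

def model(x, a, b, c):
    # Hypothetical model used only to have something to guess at.
    return a * math.exp(b * x) + c

def count_crossings(params, xs, ys):
    """Count sign changes of (data - guessed curve) along the data set."""
    signs = []
    for x, y in zip(xs, ys):
        d = y - model(x, *params)
        if d != 0.0:                  # a point exactly on the curve carries no sign
            signs.append(d > 0.0)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

xs = [0.5 * i for i in range(17)]
ys = [model(x, 2.0, -0.5, 1.0) for x in xs]      # synthetic, noise-free data

guesses = [(1.0, -0.5, 1.0), (3.0, -1.0, 0.8), (3.0, -1.0, 1.1)]
crossings = [count_crossings(g, xs, ys) for g in guesses]
for g, n in zip(guesses, crossings):
    print(g, "crossings:", n)
```

With noisy data the residual sign can flip between neighbouring points for reasons unrelated to the guess, so in practice one would probably smooth the residuals or inspect the plot as well.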

But remember: it is not mathematically possible to recover the exact coefficients of a function that has been perturbed by noise. This is where the rule of thumb for declaring coefficients exact comes from, and by applying it you include the full traceability of the fit.

The attached worksheet is worth reading as it is the core algorithm of the Mathcad Genfit. To what extent it is actually refined and optimized: don't know. My only contribution in that work sheet is the collection and more tutoring. As you can see, it trails back to Mathcad 6.0. It might not work above 11 because of the undocumented "Iteration" in the for loop ?

Fitting a family of curves is quite a challenge too! Up until now none were missed in this collab. One thing a CAS doesn't like much is astronomically huge or small numbers. Sometimes scaling the data set helps.

jmG

ptc-1368288 · ‎Feb 18, 2009

... "initialise wisely"

That particular example does not need such wise guesses. For the demo, the guessed function crossing the data set should look like this.

jmG

ptc-1368288 · ‎Feb 18, 2009

...

Below Marlett, your example in the manual N mode. No need for TOL or CTOL. Look at it carefully and check the previous assertion about rounding coefficients: check that a correlation of 0.999 is statistically = 1, which supports the rounding as "true". This manual N only needs to be slaved to the correlation, and then Genfit is fully automated in ½ page of exposed material. Again about the initial values: some are relatively easy, some are not so easy, and your example is not easy. After a few rounds of trial and error ... bingo, the fit.

If you are on version 13 or up, you need to build the vector of derivatives with respect to the parameters. The module you are proposing from the source versus another approach needs to be considered, but only users of 13 and up can do that [I'm at 11.2a].

"simply deceptive or deceptively simple"
One of the best quotes in this collab.
I would bet (but not certify) that the Mathcad built-in Genfit is the basic algo automated on the correlation, thus traceable to the source code.

jmG

wschrabi-disabl · ‎Feb 19, 2009

jmG, one question about your worksheet. Mathcad 11.2a has a predefined function F(x), and F is also the unit farad.

I have problems with F(x); when I replace F(x) with U(x) it works fine, but I do not want to change the whole sheet. Could you give me advice on how to handle this? Thanks

wschrabi-disabl · ‎Feb 19, 2009

Dear jmG,

I have Mathcad 11.2a and Mathcad 14, and I would like to run the sheet in Mathcad 14. But I get this error with the diff() command. I replaced diff with d/d, but now I get the error here.
See picture.

And one other question: when I open a Mathcad 11 sheet, all the variables are replaced with the names of units, for example j = joule, V = volts and so on. Does anyone here have an idea how to switch this off?

ptc-1368288 · ‎Feb 19, 2009

On 2/19/2009 5:17:37 AM, wschrabi wrote:
>Dear jmG,
>i have MCad11.2a and
>MCad14, and I would like to
>let it run in MCAD 14. But I
>got this error with the diff()
>command. I substituted the
>diff with d/d but now I got
>the error here.
>see picture.
>
>And one other
>question: When I open a MCv11
>sheet then all the variables
>are substituted with the name
>of units. E.g. j = joule, V
>= volts and so on. Has anyone
>here an idea how to switch
>this off?
_______________________________

You have several questions in one. As for the *.PNG: this is Valery's module from Nov 2007. It is based on Maple and does not work with the MuPAD-based symbolics of 13 and 14. I have no idea how to make it work; you should try the module in your original blue sheet, or manually create the vector of derivatives with respect to the parameters.

As for variable names turning into unit names when you open a Mathcad 11 sheet: I can't answer, except that your 14 corrupts your 11, as you no longer have a "virgin 11". You could try switching the unit system OFF? The best you can expect is a working version from some "mixed versions" user who made it work for themselves. Or, if you have two separate PCs or otherwise two independent installations of 11 and 14, run it in 11 only.
If you can at least recognize the modules of interest, you could extract them and make them work in 14. What I can't understand is function names, i.e. declarations of an algorithm, turning into unit symbols. That is beyond imagination, as it breaks the fundamental principle that maths are scalar. It is probably rooted too deep for any user to reach the scalar coding. It could be that your 11 is not 11.2a and/or your 14 is not an updated version. Only some volunteer can help.

jmG

wschrabi-disabl · ‎Feb 19, 2009

Thanks a lot for your comment, I will use v11 in future.

But now a question (maybe a stupid one for you), and I would be thankful for your comment. I have also done the chemistry example from your sheet in Mathematica v7.0. I imported the data from your sheet into Mathematica, but I have trouble interpreting the error message from FindFit. Normally it works fine, and I used the model described in your sheet. The basic question, however, is this: A0, A1, x0 and p are the system parameters, which should be obtained from the mathematical process. But, as I see it, these values are not calculated from the data; they are defined in your sheet. Do I understand it wrong? Could you explain this to me, please?

ptc-1368288 · ‎Feb 19, 2009

On 2/19/2009 10:07:49 AM, wschrabi wrote:
...
>The basic Question is
>however, the A0 A1 x0 and p
>are the system parameters,

==> Yes they are the "model" parameters

>which should be obtained from
>the mathematical process.

==> YES, obtained from the fitting technique.

>But - as I think - these values
>are not calculated from the
>data they are defined in your sheet.

==> YES, obtained from the corresponding data set
==> But you took a data set out of the blue

>Do I understand it
>wrong? Could you explain this
>to me, please?

==> Use the right data set attached.
==> Try again in Mathematica.

jmG

wschrabi-disabl · ‎Feb 19, 2009

Now I got the solution. These steps were missing in your documentation. But you did write that some steps are missing.

ptc-1368288 · ‎Feb 19, 2009

>No, you can see the dots of the data. They are not out of the blue <.
______________________

YES, you have taken an unknown data set.
Convince yourself, click on the *.gif attached .
If your data set would be the same as mine ,
the plot would be the same, but yours is not.

The model in my work sheet with the coefficients
is specific to the data set, not a regress type.

jmG

wschrabi-disabl · ‎Feb 19, 2009

Yes, the data is different because I modified it. But when I use your data, the same result occurs.

ptc-1368288 · ‎Feb 19, 2009

On 2/19/2009 3:21:57 PM, wschrabi wrote:
>Yes the data is different as I
>modified it. But when I use
>your data, the same result
>occurs.
___________________________

A model is specific to a data set, or to a family of data sets represented by the same model. You have several examples in the big worksheet. When you use my data and the "same result occurs", you mean the Mathematica LM fails; conclude that either their LM is incorrect or the model is not fittable by LM.

Genfit is Genfit and generally of little use in curve fitting; Minerr does the same much faster. Genfit is used mostly just to match others' traceability, as simple as that.

jmG

wschrabi-disabl · ‎Feb 20, 2009

Thanks jmG. Yes, I also tried Method->Automatic, but the model was not fittable by Mathematica. You are right, MC11 with minerr does a great job. Thanks for spending your time on my novice questions.

TomGutman · ‎Feb 19, 2009

You can't directly take a derivative with respect to an expression (including a subscripted variable). But it can be done. I use the functions from the Jacobian etc. work sheet to calculate gradients and Jacobians (and other vector analysis related derivatives). This sheet is known to work in MC14 (at least the latest posted version). The easy genfit setup includes a more tailored routine for the genfit required derivatives. I think it works in MC14, but am not sure. It should be easy to make it work, as it uses the same techniques as in the Jacobian etc. sheet (and one could easily use the Gradient function from there for the derivatives). I haven't paid much attention to genfit since finding out that minerr does as good a job (just as fast and accurate) with much less work. Oh, the MC14 version of genfit does not need the derivatives. It is capable of estimating them numerically if necessary.
__________________
Tom Gutman
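For readers who want to see what "estimating the derivatives numerically" amounts to, here is a small Python sketch that compares central-difference derivatives with hand-written analytic ones. The model, the step size rule and the function names are assumptions for illustration; how MC14's genfit actually estimates its derivatives is not documented in this thread.

```python
import math

def model(x, p):
    # Hypothetical model: a*exp(b*x) + c
    a, b, c = p
    return a * math.exp(b * x) + c

def analytic_row(x, p):
    """Partial derivatives of the model with respect to a, b and c."""
    a, b, _c = p
    e = math.exp(b * x)
    return [e, a * x * e, 1.0]

def numeric_row(x, p, rel_step=1e-6):
    """Central-difference estimate of the same partial derivatives."""
    row = []
    for k in range(len(p)):
        h = rel_step * max(1.0, abs(p[k]))
        up = p[:k] + [p[k] + h] + p[k + 1:]
        dn = p[:k] + [p[k] - h] + p[k + 1:]
        row.append((model(x, up) - model(x, dn)) / (2.0 * h))
    return row

p = [2.0, -0.5, 1.0]
xs = [0.5 * i for i in range(17)]
worst = max(
    abs(n - a)
    for x in xs
    for n, a in zip(numeric_row(x, p), analytic_row(x, p))
)
print("largest analytic/numeric difference:", worst)
```

The numeric version needs two extra model evaluations per parameter and per point, which is the usual price for not supplying the derivative vector by hand.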

RichardJ · ‎Feb 18, 2009

genfit, and genfitX, which is based on genfit, use Levenberg-Marquardt. As does minerr, which works fine when used in the correct way.

All numeric non-linear least squares algorithms are iterative, and run the risk of getting trapped in local minima (or even completely failing to converge) if the guess values are too far off. The best approach is to figure out a way to get reasonable initial parameter estimates, but how to do that depends entirely on the data. Can you tell us more about the data and the model you are trying to fit?

Richard
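One generic way to reduce the dependence on a single guess is to run the same iterative fit from a range of starting values and keep the result with the smallest sum of squares. The Python sketch below does this for a hypothetical single exponential y = a*exp(-beta*x) with synthetic data; the function names and the start grid are assumptions, and this is not a description of how GenfitX works internally.

```python
import math

def fit_exponential(xs, ys, a, beta, iters=200):
    """Damped Gauss-Newton (Levenberg-Marquardt style) for y = a*exp(-beta*x)."""
    def sse(a_, beta_):
        return sum((a_ * math.exp(-beta_ * x) - y) ** 2 for x, y in zip(xs, ys))

    lam, cost = 1e-3, sse(a, beta)
    for _ in range(iters):
        try:
            s11 = s12 = s22 = g1 = g2 = 0.0
            for x, y in zip(xs, ys):
                e = math.exp(-beta * x)
                r, j1, j2 = a * e - y, e, -a * x * e   # residual and its derivatives
                s11 += j1 * j1
                s12 += j1 * j2
                s22 += j2 * j2
                g1 += j1 * r
                g2 += j2 * r
            d11 = s11 * (1.0 + lam) + 1e-12            # damped diagonal
            d22 = s22 * (1.0 + lam) + 1e-12
            det = d11 * d22 - s12 * s12
            da = (-g1 * d22 + g2 * s12) / det          # 2x2 solve by Cramer's rule
            db = (-g2 * d11 + g1 * s12) / det
            trial = sse(a + da, beta + db)
        except (OverflowError, ZeroDivisionError):
            trial = float("inf")                       # treat as a rejected step
        if trial < cost:
            a, beta, cost, lam = a + da, beta + db, trial, lam / 10.0
        else:
            lam = min(lam * 10.0, 1e12)
    return a, beta, cost

# Synthetic, noise-free data with a = 2 and beta = 3.9.
xs = [0.05 * i for i in range(41)]
ys = [2.0 * math.exp(-3.9 * x) for x in xs]

starts = (0.1, 1.0, 4.0, 10.0, 30.0)
results = [fit_exponential(xs, ys, 1.0, b0) for b0 in starts]
for b0, (a_fit, b_fit, cost) in zip(starts, results):
    print(f"start beta={b0:5.1f} -> a={a_fit:.4f} beta={b_fit:.4f} sse={cost:.3e}")

best_a, best_beta, best_cost = min(results, key=lambda t: t[2])
print("best:", best_a, best_beta)
```

A scan like this only helps when the grid of starts happens to include a point in the basin of the right minimum, so it complements a reasoned initial estimate rather than replacing it.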

wschrabi-disabl · ‎Feb 18, 2009

Thanks Richard, well, I am still in the testing phase. But if you look at the attached sheet, you can see it's a sum of exp functions. I discovered that LM does not succeed (when beta is too far from 3.9), but GenfitX gives good results. The sheet is just adapted and not my own, so sorry for my naive questions. Thanks anyway for your advice.

RichardJ · ‎Feb 18, 2009

Actually, I misremembered what GenfitX is. It's not based on genfit, but if I recall correctly it does use the LM algorithm (Tom will have to confirm that - he wrote it). So minerr, genfit, GenfitX, and Dave Powley's LM algorithm all use the same basic algorithm: Levenberg-Marquardt. The reason for the problems with minerr in that worksheet is that it's not set up correctly. It is set up to minimize the error function, but it should be set up to minimize the vector of residuals. If set up correctly, it works. Any difference in behavior between the different functions in the sheet is due to the way they are implemented internally, not anything intrinsic to the algorithm used.

You are better off sticking to the built in Mathcad functions that implement LM: genfit and minerr. There are plenty of example worksheets in this forum showing how to set these up (in my opinion, minerr is easier to set up). If you have trouble setting up your real fit in Mathcad, or if these functions do not converge correctly with your real data and real function, then we would need to see the data and the function.

Richard
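The difference between handing a solver the scalar error function and handing it the vector of residuals can be sketched in Python. With the residual vector and its Jacobian, one Gauss-Newton step is enough for a model that is linear in its parameters; the scalar sum of squares alone does not expose that structure. The straight-line model, the data and the function names are assumptions for illustration, not minerr's internals.

```python
def residual_vector(p, xs, ys):
    """One residual per data point: model value minus measurement."""
    return [p[0] + p[1] * x - y for x, y in zip(xs, ys)]

def error_function(p, xs, ys):
    """A single scalar: the sum of squared residuals."""
    return sum(r * r for r in residual_vector(p, xs, ys))

def gauss_newton_step(p, xs, ys):
    """Solve (J^T J) d = -J^T r, where each Jacobian row is [1, x]."""
    r = residual_vector(p, xs, ys)
    s11 = float(len(xs))
    s12 = sum(xs)
    s22 = sum(x * x for x in xs)
    g1 = sum(r)
    g2 = sum(x * ri for x, ri in zip(xs, r))
    det = s11 * s22 - s12 * s12
    d0 = (-g1 * s22 + g2 * s12) / det
    d1 = (-g2 * s11 + g1 * s12) / det
    return [p[0] + d0, p[1] + d1]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]      # synthetic data on the line y = 1 + 2x

start = [10.0, -7.0]                  # deliberately poor guess
after = gauss_newton_step(start, xs, ys)
print("error before:", error_function(start, xs, ys))
print("error after: ", error_function(after, xs, ys))
print("parameters:  ", after)
```

For a nonlinear model the same step is repeated with damping, which is the Levenberg-Marquardt iteration; the point here is only that the step is built from the individual residuals, not from their sum.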

wschrabi-disabl · ‎Feb 18, 2009

Thanks a lot Richard, you also gave me something to think about.

ptc-1368288 · ‎Feb 18, 2009

>Can I say, that GenfitX does the best work due to bad initial estimation ?
_______________________________

That would be a catastrophic premise for curve fitting, which is an art, and sometimes even more so because the other art is how to manipulate and interpret everything a CAS can or can't do for you.

Fitting a pure data set as you propose is not interesting: data sets are never so pure. No matter which method you debate, you must initialise, which is not always obvious. But in your example there is no need for a fit, as the coefficients are retrieved exactly from the manual fit and PWMinerr just confirmed them.

There are other points. A noisy data set can't come out as a pure fit, because the fit is floating. That said, a fit can be declared good and the coefficients rounded as soon as you get them with 3 or 4 "exact decimals". That is a rule of thumb, but just consider it "the rule".
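To see what "the fit is floating" means in numbers, the following Python sketch fits the same underlying line under several noise realizations and prints the coefficients, so one can see which decimals stay put. A straight line is used only because its least-squares fit has a closed form; the noise level, the seeds and the function name are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Closed-form least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [0.1 * i for i in range(50)]
slopes, intercepts = [], []
for seed in range(5):
    rng = random.Random(seed)                    # reproducible noise
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.05) for x in xs]
    m, b = fit_line(xs, ys)
    slopes.append(m)
    intercepts.append(b)
    print(f"seed {seed}: slope={m:.4f} intercept={b:.4f}")

spread = max(slopes) - min(slopes)
print("slope spread across noise realizations:", spread)
```

The decimals that do not change from one realization to the next are the ones worth reporting; the rest reflect the noise, not the model.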

"PWMinerr" has been tested and tested so exhaustively that the answer is not to be demonstrated again (like a conjecture):

1. it is the fastest fitting method, so there is no need to time it
2. it is the least failing method, globally.

If your company intends to standardize on a single method: error! There is no single universal fitting method; you can trust me on that, and this collaboratory too. Your worksheet is interesting, but interesting only. Did you save Guy Beadie's great worksheet about initializing Genfit?

Conclusion:
Curve fitting needs a toolbox because it is an art. Genfit is only LM in 11.2a (I understood 13 and 14 have more options?). Some data sets can only be fitted manually to a model. Visit the PTC library: last October Mona put up a worksheet [47 pages] about curve fitting. It's not my whole toolbox, but it is very substantial, with the most pertinent examples.
You can try ORIGINLAB and their fitting wizard, which is not bad at all as it contains an extensive library of models. Their fitting is LM only, it fails as well as succeeds, and it comes with all the applicable (or recognized as applicable) stats.

Read the worksheet posted last night, "ODE MINERR". DEs are also models, and not the least of them when thinking in terms of "modelling".

Thanks for your collaboration.

jmG

wschrabi-disabl · ‎Feb 18, 2009

Thanks a lot jmG, you gave me a lot to think about. I will do it later on.
Wschrabi

wschrabi-disabl · ‎Feb 19, 2009

On 2/18/2009 10:28:07 AM, jmG wrote:
>Genfit
>is only LM in 11.2a ( I
>understood 13, 14 have more
>options

It seems to me that MC v14 is scrap, as I read a lot of bug reports here in the forum; moreover, I was not able to adapt the sheet from 11.2a to 14. I gave up. I will continue working with 11.2a.

thanks

ptc-1368288 · ‎Feb 19, 2009

On 2/19/2009 7:35:13 AM, wschrabi wrote:
>On 2/18/2009 10:28:07 AM, jmG wrote:
> Genfit
>>is only LM in 11.2a ( I
>>understood 13, 14 have more
>>options
>
>It seems to me that MC v 14 is scrap,
>as I read a lot of bug reports here in
>the forum, moreover I was not able to
>adapt the sheet from 11.2a to 14. I gave
>it up. I will continue working with 11.2a.
>
>thanks
_________________________

If you have a box called a PC running XP SP2, and if you have the original 11 plus the two packs, you should have no problem running that 49-page [excluding collapsed areas] worksheet.

jmG

wschrabi-disabl · ‎Feb 19, 2009

Thanks jmG, I could run it on my 11.2a/XP (SP3) PC, but I had to replace F(x) with U(x). The F is interpreted incorrectly, I do not know why. By the way, congratulations on your great, extensive worksheet about fitting. Thanks a lot.