[WIP] Switch to unconstrained optimizers - step 1 #840

dmbates · 2025-07-19T22:38:26Z

An attempt to make it easier to switch to unconstrained optimizers by first changing all lower bounds to -Inf
There is a separate rectify! step after the optimization to flip the signs on columns of \lambda where the diagonal element converges to a negative value.
One of the tests for GLMMs is still giving problems with negative diagonal elements on \lambda. Not sure why
~~test/prima.jl is giving errors so I disabled it. I'm too tired to work out what is happening there.~~

dmbates · 2025-07-19T23:08:10Z

I'll look into the failures on evaluation of the logdet in the model for the Dyestuff2 data tomorrow. It should be a straightforward evaluation because the model (should) converge to \theta = 0.

dmbates · 2025-07-20T14:15:29Z

I had the wrong location for the first failure in the tests - it was in checking the fit of only(models(:penicillin)). This sort of a good news/bad news situation. It is a simple model, two balanced and crossed grouping factors with 24 and 6 levels. The optimizer takes different paths on Apple M-series (with AppleAccelerate) and x86_64 (with MKL) processors. In fact the initial value at [1.0, 1.0] is different on the different processors. The good news is that this is a relatively simple model on which we can do intensive examination. The bad news is that there are differences on this relatively simple model. The evaluation of the logdet(fm) just involves some scalar operations, a rank-k downdate, and the Cholesky factorization of the downdated square matrix of size 6.

dmbates · 2025-07-20T15:09:59Z

I am trying to determine where the iterations in LN_BOBYQA start to differ in the penicillin fit according to the processor and BLAS implementation. It is difficult to imagine accelerated BLAS giving different Cholesky or rankUpdate values on a 6 by 6 matrix but that seems to be the case. It may be better to switch to LN_NEWUOA as the default, because we are setting the lower bounds to -Inf, so we can avoid all the complexities of the bounded code.

I propose:

switching to LN_NEWUOA as the default but keeping, for now, the lowerbd vector, even though it is not passed to the optimizer
checking the initial value of the objective for the penicillin model on different processor/BLAS types.
checking the number of iterations and final values when using LN_NEWUOA on different processor/BLAS types.

Any other ideas of how to proceed are welcome.

andreasnoack · 2025-07-20T16:55:32Z

I would expect item two to be the more fruitful one, i.e. to focus on differences related to evaluating just the objective function across microarchitectures and leave out the effect of the optimization algorithm. I think it would be interesting to see differences in the initial values and in a value where the two start to differ. It doesn't matter which of the microarchitectures created the value, but the same should be used when evaluating both. Again, to leave out the effect of the optimization. I'm speculating if the objective function is ill-conditioned for some input parameters. (Here, I mean ill-conditioned in the general sense that measures an arbitrary function and not just the linear solve conditioning.)

I think it would be beneficial to stick with BOBYQA for now. It means one moving part less, and I don't suspect that disabling the bounds in BOBYQA by setting the lower bound to -Inf should be the issue.

dmbates · 2025-07-21T13:47:42Z

In addition to checking for consistency in the evaluations at the initial values, which tend to have exact integer parameter values for the first few evaluations of the objective in the optimization, I was thinking of using extended precision to evaluate a "reference" value for the result. For a simple model, such as that for the penicillin data, the calculations are all scalar evaluations except for a rankUpdate operation and the Cholesky factorization of the resultant 6 by 6 positive definite matrix.

@andreasnoack or @palday Would you have recommendations for what extended precision representation to use? There is BigFloat but I think there is another representation that essentially extends the mantisa of a Float64. I can't remember the name, unfortunately.

dmbates · 2025-07-21T13:53:43Z

@andreasnoack I am okay with using LN_BOBYQA as the default for a while longer but definitely feel we should eliminate the lowerbd stuff if we are not going to use it. It may be a result of getting older and less mentally flexible but I am finding the current code to be overly complex in places. Like any large project there is a tendency to add code to patch special cases and justify this by adding more tests. Eventually a person looking at the code for the first time is left to wonder "Why is there all this code about a lower bound when the lower bound is always set to -Inf?". If you don't know the history it is nonsense.

dmbates · 2025-07-21T15:01:35Z

It seems that some early differences in the evaluation of the objective for even a simple model like that for the :penicillin data change with accelerated BLAS
This model is a simple computation because it is a balanced, fully-crossed design resulting in a very simple structure for the A matrix. In the early iterations of the optimizer the parameter values are integers and the most complicated parts of the evaluation up to the Cholesky factorization are square roots of integers and inverses of these.

On a Macbook with multiple threads there are differences in the evaluations of the objective according to whether AppleAccelerate or OpenBLAS are used

dmbates@MacBookPro MixedModels % julia -tauto --sta=no --quiet --proj  # use OpenBLAS
julia> using MixedModels

julia> include("./test/modelcache.jl")
models (generic function with 1 method)

julia> m = refit!(first(models(:penicillin)); fitlog=true);

julia> first(m.optsum.fitlog, 8)
8-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.0, 1.0], 364.6267798165887)
 ([2.0, 1.0], 369.17598910041096)
 ([1.0, 2.0], 341.1011743244961)
 ([0.0, 1.0], 441.9062007516885)
 ([1.0, 0.0], 634.3549438457203)
 ([1.444405652744627, 1.5955538627700872], 340.7336990803933)
 ([0.49423753019996997, 1.907291795816359], 376.6434207973661)
 ([1.8130833383192728, 1.5302193144504488], 343.9702954340243)

julia> last(m.optsum.fitlog, 4)
4-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.5375663750437891, 3.2196669828561943], 332.18834868575027)
 ([1.5374512614983462, 3.219888903724198], 332.1883489935133)
 ([1.5375635758177966, 3.219607909552041], 332.18834870145923)
 ([1.5375908459147303, 3.2196603414283036], 332.1883486779673)

dmbates@MacBookPro MixedModels % julia -tauto --quiet --proj  # use AppleAccelerate
[ Info: using OhMyREPL.jl
[ Info: using AppleAccelerate.jl
[ Info: using Revise.jl
julia> using MixedModels

julia> include("./test/modelcache.jl")
models (generic function with 1 method)

julia> m = refit!(first(models(:penicillin)); fitlog=true);

julia> first(m.optsum.fitlog, 8)
8-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.0, 1.0], 364.62677981658675)
 ([2.0, 1.0], 369.17598910040977)
 ([1.0, 2.0], 341.1011743245018)
 ([0.0, 1.0], 441.9062007516885)
 ([1.0, 0.0], 634.3549438457203)
 ([1.4444056527446194, 1.5955538627700523], 340.73369908052746)
 ([0.49423753020029537, 1.9072917958173385], 376.6434207973354)
 ([1.8130833383206328, 1.5302193144497196], 343.9702954341842)

julia> last(m.optsum.fitlog, 4)
4-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.5375832064702692, 3.219592495951509], 332.1883486973754)
 ([1.5375061859551695, 3.2196562755740685], 332.1883487819558)
 ([1.5375898376921937, 3.219692275843751], 332.188348672713)
 ([1.5375939045981573, 3.219792193110907], 332.1883486700085)

So right from the beginning the evaluation of the objective is a bit different for AppleAccelerate vs OpenBLAS and from there on the trajectory will be slightly different

dmbates · 2025-07-21T15:09:20Z

On an x86_64 machine I get, with OpenBLAS

dmbates@plinth:~/.julia/dev/MixedModels$ julia -t8 --sta=no --quiet --proj
julia> using MixedModels

julia> include("./test/modelcache.jl")
models (generic function with 1 method)

julia> m = refit!(only(models(:penicillin)); fitlog=true);

julia> first(m.optsum.fitlog, 8)
8-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.0, 1.0], 364.6267798165887)
 ([2.0, 1.0], 369.17598910041096)
 ([1.0, 2.0], 341.1011743244961)
 ([0.0, 1.0], 441.9062007516885)
 ([1.0, 0.0], 634.3549438457203)
 ([1.444405652744627, 1.5955538627700872], 340.7336990803923)
 ([0.4942375301999674, 1.9072917958163513], 376.64342079733973)
 ([1.8130833383189084, 1.5302193144504623], 343.97029543404193)

julia> last(m.optsum.fitlog, 4)
4-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.5368792590556977, 3.2176454905841227], 332.1883597436415)
 ([1.5375182176811606, 3.219536746361558], 332.18834878933217)
 ([1.537643747886617, 3.2205288361596702], 332.18834933611157)
 ([1.5375133789110618, 3.219886141303852], 332.18834878807843)

and with MKL

dmbates@plinth:~/.julia/dev/MixedModels$ julia -t8 --quiet --proj
julia> using MixedModels

julia> include("./test/modelcache.jl")
models (generic function with 1 method)

julia> m = refit!(only(models(:penicillin)); fitlog=true);

julia> first(m.optsum.fitlog, 8)
8-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.0, 1.0], 364.62677981659255)
 ([2.0, 1.0], 369.1759891004115)
 ([1.0, 2.0], 341.10117432449795)
 ([0.0, 1.0], 441.90620075169636)
 ([1.0, 0.0], 634.3549438457203)
 ([1.4444056527446676, 1.5955538627700978], 340.73369908034823)
 ([0.4942375301998896, 1.9072917958160005], 376.64342079740277)
 ([1.8130833383190654, 1.530219314450679], 343.97029543397673)

julia> last(m.optsum.fitlog, 4)
4-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.5375663667436432, 3.2196670002814987], 332.1883486857041)
 ([1.537451252957679, 3.21988892102474], 332.18834899335576)
 ([1.5375634872760895, 3.2196078285002994], 332.18834870172884)
 ([1.5375910290916945, 3.2196608178035246], 332.18834867791)

dmbates · 2025-07-21T18:59:00Z

Some of the most interesting results are for the parameter vectors [0.0, 1.0] and [1.0, 0.0], because the model simplifies when one parameter is 0.0. In most cases m.L has a total of 6 non-zero blocks but these models can be reduced to 4 non-zero blocks of which one is an identity matrix (the [1,1] block for [0.0, 1.0] and the [2,2] block for [1.0, 0.0]). Notice that the values of the objective are the same for "M-series, OpenBLAS", "M-series, AppleAccelerate", and "x86_64, OpenBLAS". The objective only changes, relative to the others, with "x86_64, MKL". The Lapack operations for this evaluation are dsyrk and dpotrf.

dmbates · 2025-07-21T19:39:03Z

It is even more simplified in the case of the [0.0, 1.0] parameter vector because the second diagonal block of m.A is 24.0 * I(6) and the Cholesky factor, after adding I is exactly 5.0 * I(6)

julia> m.L[3]
6×6 Matrix{Float64}:
 5.0  0.0  0.0  0.0  0.0  0.0
 0.0  5.0  0.0  0.0  0.0  0.0
 0.0  0.0  5.0  0.0  0.0  0.0
 0.0  0.0  0.0  5.0  0.0  0.0
 0.0  0.0  0.0  0.0  5.0  0.0
 0.0  0.0  0.0  0.0  0.0  5.0

julia> m.L[1]
24×24 Diagonal{Float64, Vector{Float64}}:
 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0

julia> m.L[6]      # the last diagonal block (conceptually this is LowerTriangular, the [1,2] position is used as scratch storage)
2×2 Matrix{Float64}:
  2.4     3308.0
 55.1333    12.5951

julia> pwrss(m) # penalized sum of squared residuals
158.6355555555629

julia> abs2(last(last(m.L)))  # the square of the [2,2] element of m.L[6]
158.6355555555629

julia> 2 * 6 * log(5.)     # logdet of the [2,2] block of L (recall the logdet is the logarithm of the determinant of L*L', not just L
19.313254949209202

julia> logdet(m)
19.313254949209202

dmbates · 2025-07-21T20:20:34Z

When using MKL on x86_64 the last element of last(m.L) changes, hence the penalized sum of squared residuals changes

julia> abs2(last(last(m.L)))
158.6355555555715

We can reproduce the calculation using BigFloat. In this case the results are simplified because the cholesky factor in m.L[3] is exactly 5.0 * I(6).

julia> m.L[5] == m.A[5] ./ 5.0
true

So the downdate of copyto!(last(m.L), last(m.A)) followed by the Cholesky factorization in BigFloat can be written

julia> BigFloat.(last(m.A))
2×2 Matrix{BigFloat}:
  144.0   3308.0
 3308.0  76582.0

julia> m.L[5]
2×6 Matrix{Float64}:
   4.8    4.8    4.8    4.8    4.8   4.8
 120.8  105.4  119.6  109.8  110.2  95.8

julia> L5 = BigFloat.(m.A[5]) ./ 5
2×6 Matrix{BigFloat}:
   4.8    4.8    4.8    4.8    4.8   4.8
 120.8  105.4  119.6  109.8  110.2  95.8

julia> first(L5)
4.800000000000000000000000000000000000000000000000000000000000000000000000000028

julia> L6 = cholesky(Symmetric(BigFloat.(last(m.A)) - L5 * L5', :L))
Cholesky{BigFloat, Matrix{BigFloat}}
L factor:
2×2 LowerTriangular{BigFloat, Matrix{BigFloat}}:
  2.4       ⋅ 
 55.1333  12.5951

julia> last(L6.L)
12.59506076029629781458345910991177392468429196879571875004312834742566832337298

julia> last(last(m.L))
12.595060760296931

When using OpenBLAS on x86_64 the results are

julia> m = refit!(only(models(:penicillin)); fitlog=true);

julia> objective!(m, [0.0, 1.0])
441.9062007516885

julia> last(last(m.L))
12.595060760296588

so the OpenBLAS result is slighly closer to the BigFloat value.

Frankly I am a bit surprised that there is a difference in the Cholesky factor of a 2 by 2 symmetric matrix.

dmbates · 2025-07-21T20:22:44Z

Does anyone have access to a Windows computer to check there computations with OpenBLAS and with MKL? I don't have anything machines Windows. I suppose we could put something like this into the CI runs.

andreasnoack · 2025-07-21T21:38:23Z

I don't think the difference in the Cholesky matters too much here. For the same input, I get very similar results for OpenBLAS and MKL. With

julia> repr(m.A[6] - m.L[5]*m.L[5]')
"[5.7599999999999625 132.31999999999925; 132.31999999999925 3198.3199999999924]"

I'm getting

OpenBLAS

julia> cholesky([5.7599999999999625 132.31999999999925; 132.31999999999925 3198.3199999999924]).L[2,2]
12.595060760296588

MKL

julia> cholesky([5.7599999999999625 132.31999999999925; 132.31999999999925 3198.3199999999924]).L[2,2]
12.59506076029659

which are only one ULP apart. So I think the difference between the two Choleskys is because of the inputs. With OpenBLAS, there is one ULP error in L[5]

julia> print(m.L[5])
[4.800000000000001 4.800000000000001 4.800000000000001 4.800000000000001 4.800000000000001 4.800000000000001; 120.80000000000001 105.4 119.60000000000001 109.80000000000001 110.2 95.80000000000001]

so I have

julia> m.A[5]/5 == m.L[5]
false

but true with MKL. I think this difference could explain the difference in the Cholesky result.

Just speculating, but maybe the downdate m.A[6] - m.L[5]*m.L[5]' could introduce a large cancellation error for some values of θ?

Update:

Could be that OpenBLAS is multiplying by the reciprocal instead of dividing

julia> (m.A[5]*(1/5))[2,6] == m.L[5][2,6]
true

julia> (m.A[5]/5)[2,6] == m.L[5][2,6]
false

dmbates · 2025-07-22T18:04:48Z

That's interesting. The calculation of m.L[5] is performed as an rdiv! with m.L[3] as LowerTriangular so it may be evaluated differently. It is just a coincidence for this particular model and parameter vector that m.L[3] is 5 * I(6).

dmbates · 2025-07-22T18:05:58Z

I would like to follow up on this thread but I am having difficulty thinking of what to do next. Suggestions welcome.

codecov · 2025-08-01T22:18:08Z

Codecov Report

❌ Patch coverage is 98.03922% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 95.62%. Comparing base (42950ae) to head (2619450).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/pca.jl	83.33%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #840      +/-   ##
==========================================
- Coverage   96.27%   95.62%   -0.66%     
==========================================
  Files          38       38              
  Lines        3654     3682      +28     
==========================================
+ Hits         3518     3521       +3     
- Misses        136      161      +25

Flag	Coverage Δ
current	`95.30% <98.03%> (-0.65%)`	⬇️
minimum	`95.62% <98.03%> (-0.66%)`	⬇️
nightly	`?`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

dmbates · 2025-08-04T19:24:00Z

I am beginning to see the problem with the style enforcer. For some reason JuliaFormatter on my computer changes successive ':' arguments by removing spaces around them (i.e. foo(k, :, :) to foo(k,:,:)) whereas the version on github wants them back the other way. I'll just forgo running format on the files before committing.

palday · 2025-08-04T19:39:51Z

I am beginning to see the problem with the style enforcer. For some reason JuliaFormatter on my computer changes successive ':' arguments by removing spaces around them (i.e. foo(k, :, :) to foo(k,:,:)) whereas the version on github wants them back the other way. I'll just forgo running format on the files before committing.

I think part of the problem is the difference between v1 and v2 of JuliaFormatter -- I had similar problems at some point when I was running a local version different from the version in the action.

dmbates · 2025-08-05T16:37:06Z

I just noticed that the forwarddiff tests take 15-20 seconds on my laptop whereas the finitediff tests take about half a second. There is one more test in forwarddiff but it is evaluating a 3 by 3 Hessian.

dmbates · 2025-08-05T16:40:09Z

In addition to removing the lowerbd field of optsum I was thinking of changing the structure of fitlog to be a matrix table (in the sense of Tables.table). The first change would not need to be considered breaking after this but the second would. Should I try to incorporate all of these in "one big beautiful PR" or go piecemeal?

palday · 2025-08-06T17:27:52Z

In addition to removing the lowerbd field of optsum I was thinking of changing the structure of fitlog to be a matrix table (in the sense of Tables.table). The first change would not need to be considered breaking after this but the second would. Should I try to incorporate all of these in "one big beautiful PR" or go piecemeal?

I'm onboard with these changes, but please go piecemeal. 😄

palday · 2025-08-06T17:33:51Z

src/profile/thetapr.jl

    osj = optsum
    opt = Opt(osj)
+    pmj = m.parmap[j]
+    lbj = pmj[2] == pmj[3] ? zero(T) : T(-Inf)


this pattern has come up enough, I wonder if it makes sense to define a method lowerbd(parmapj)

[noblock]

andreasnoack · 2025-08-06T20:50:00Z

Sorry for the silence here. I was busy with vacation. I just tried this out with the test cases that we have and this branch seems to avoid any of the issues that we'd hit in the past, i.e. both #705 and #833. No regressions appeared either so I'm in favor of moving forward.

Regarding #840 (comment) then I don't think it has any immediate implications for the transition to unconstrained optimization. However, it might be worth analyzing if the rank downgrade introduces numerical issues under some conditions and if they can be avoided. Such issues would be unrelated to the optimization approach, but they might explain the required changes to the test values here.

…dModels.jl into db/unconstrained

dmbates · 2025-08-12T16:31:36Z

I added a timingtable function definition to test/modelcache.jl but without adding any calls to it. If you call it separately for the :insteval models it produces

julia> timingtable()
Table with 6 columns and 16 rows:
      modnum  bkend  optmzr     feval  objtiv     time
    ┌─────────────────────────────────────────────────────
 1  │ 1       nlopt  LN_NEWUOA  81     2.37722e5  0.424785
 2  │ 1       nlopt  LN_BOBYQA  107    2.37722e5  0.560039
 3  │ 1       prima  newuoa     92     2.37722e5  0.483105
 4  │ 1       prima  bobyqa     104    2.37722e5  0.546217
 5  │ 2       nlopt  LN_NEWUOA  50     2.37586e5  0.249036
 6  │ 2       nlopt  LN_BOBYQA  46     2.37586e5  0.230194
 7  │ 2       prima  newuoa     47     2.37586e5  0.235359
 8  │ 2       prima  bobyqa     46     2.37586e5  0.230809
 9  │ 3       nlopt  LN_NEWUOA  196    2.37649e5  1.13438
 10 │ 3       nlopt  LN_BOBYQA  170    2.37649e5  0.985796
 11 │ 3       prima  newuoa     203    2.37649e5  1.1752
 12 │ 3       prima  bobyqa     173    2.37649e5  1.00095
 13 │ 4       nlopt  LN_NEWUOA  181    2.37647e5  1.05708
 14 │ 4       nlopt  LN_BOBYQA  490    2.37647e5  2.85961
 15 │ 4       prima  newuoa     211    2.37647e5  1.23718
 16 │ 4       prima  bobyqa     293    2.37647e5  1.71672

allowing us to see which are the faster and slower (mostly determined by the number of function evaluations but not entirely - :prima seems to have more overhead than does :nlopt) optimizers. I have been thinking of splitting the objective into two columns, the minimum observed objective for each model, which will be the same for all optimizers, and the deviation from the minimum. That way we can see which optimizer gives the best value.

Shall I add a column for the number of parameters in the model?

palday · 2025-08-13T03:06:42Z

I have been thinking of splitting the objective into two columns, the minimum observed objective for each model, which will be the same for all optimizers, and the deviation from the minimum. That way we can see which optimizer gives the best value.

How about each optimizer's final objective value and that value's deviation from the best value across optimizers?

Shall I add a column for the number of parameters in the model?

Yes.

dmbates · 2025-08-15T17:16:12Z

Finally this version is green on the tests except for the nightly version of Julia and that failure is due to compilation failures in other packages. I will hold off on declaring this to be v5.0.0 until further checking and enhancements take place. I think we should remove the lowerbd property in Optsum, but I am willing to be persuaded otherwise (it's easier to keep a redundant property in the struct than to remove it and find we need to replace it later).

palday · 2025-08-16T17:28:18Z

Finally this version is green on the tests except for the nightly version of Julia and that failure is due to compilation failures in other packages. I will hold off on declaring this to be v5.0.0 until further checking and enhancements take place. I think we should remove the lowerbd property in Optsum, but I am willing to be persuaded otherwise (it's easier to keep a redundant property in the struct than to remove it and find we need to replace it later).

Yep, you can do the lowerbd purging in a follow-up PR, but let's merge this one! 🚀

…nconstrained

dmbates · 2025-08-20T15:03:33Z

@palday Woud you be willing to do the honors and set the new version number, adjust the NEWS.md file, etc., and make a new release? You are much more familiar with the details of doing that than am I.

palday · 2025-08-21T02:39:33Z

@dmbates There are a few small things that I want to do in a follow-up PR before making the breaking release, but I've added an appropriate NEWS entry and marked the version as 5.0.0-DEV. I'm going to go ahead and merge this PR. 😎

Change lower bounds to -Inf, repair tests

4428868

dmbates marked this pull request as draft July 19, 2025 22:38

run JuliaFormatter on src dir

a043bfd

Merge branch 'main' into db/unconstrained

821028a

dmbates added 9 commits July 23, 2025 11:40

Merge branch 'main' into db/unconstrained

d9c7c4f

Comparison values from AppleAccelerate

fecd528

Merge branch 'main' into db/unconstrained

39bc629

[ci skip] allow PRIMA::newuoa optimizer

06cb130

[ci skip] don't pass lower bounds to optimizer

6e4df6c

Initial comparison of optimizers

c9d7f6d

[ci skip] Adjust test targets and tolerances

4449996

[ci skip] Adjust tolerance on test for x86_64, Linux, OpenBLAS

c220b4d

Force ci

17f2de0

Loosen tolerances (again)

0f02487

palday reviewed Aug 6, 2025

View reviewed changes

palday approved these changes Aug 6, 2025

View reviewed changes

palday and others added 4 commits August 6, 2025 16:41

format

7920b0a

Add missing import

74730a3

Merge branch 'db/unconstrained' of https://github.com/JuliaStats/Mixe…

7ee7e68

…dModels.jl into db/unconstrained

Default optimizer to :LN_NEWUOA. Add timingtable function.

1fc5c68

Account for re-ordering of models.

8febdf7

dmbates added 5 commits August 13, 2025 11:16

Default optimizer to :LN_NEWUOA, adjust tests, add timingtable

e4bf8cd

Remove test of nonsensical model.

c3fb5bb

Add atol to first(std(fmnc)) test.

569b299

Add tolerance in prima test.

93eb118

More tolerance additions

f72da72

Merge branch 'main' of github.com:JuliaStats/MixedModels.jl into db/u…

e596f70

…nconstrained

palday added 2 commits August 20, 2025 21:30

bump version to 5.0.0-DEV

269b555

NEWS update

2619450

palday merged commit e9a139b into main Aug 21, 2025
8 of 9 checks passed

palday deleted the db/unconstrained branch August 21, 2025 02:39

palday mentioned this pull request Aug 21, 2025

[WIP] Switch to unconstrained optimization followed by "rectifying" θ #838

Closed

[WIP] Switch to unconstrained optimizers - step 1 #840

[WIP] Switch to unconstrained optimizers - step 1 #840

Uh oh!

Conversation

dmbates commented Jul 19, 2025 • edited by palday Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dmbates commented Jul 19, 2025

Uh oh!

dmbates commented Jul 20, 2025

Uh oh!

dmbates commented Jul 20, 2025

Uh oh!

andreasnoack commented Jul 20, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

dmbates commented Jul 21, 2025

Uh oh!

andreasnoack commented Jul 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

OpenBLAS

MKL

Update:

Uh oh!

dmbates commented Jul 22, 2025

Uh oh!

dmbates commented Jul 22, 2025

Uh oh!

codecov bot commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

dmbates commented Aug 4, 2025

Uh oh!

palday commented Aug 4, 2025

Uh oh!

dmbates commented Aug 5, 2025

Uh oh!

dmbates commented Aug 5, 2025

Uh oh!

palday commented Aug 6, 2025

Uh oh!

palday Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andreasnoack commented Aug 6, 2025

Uh oh!

dmbates commented Aug 12, 2025

Uh oh!

palday commented Aug 13, 2025

Uh oh!

dmbates commented Aug 15, 2025

Uh oh!

palday commented Aug 16, 2025

Uh oh!

dmbates commented Aug 20, 2025

Uh oh!

palday commented Aug 21, 2025

Uh oh!

Uh oh!

Uh oh!

dmbates commented Jul 19, 2025 •

edited by palday

Loading

andreasnoack commented Jul 21, 2025 •

edited

Loading

codecov bot commented Aug 1, 2025 •

edited

Loading

palday Aug 6, 2025 •

edited

Loading