
Commit 0df3739

incorporate Dominique comments
1 parent afd7753 commit 0df3739

File tree

2 files changed: +31 -55 lines changed


.cirrus.yml

Lines changed: 0 additions & 26 deletions
This file was deleted.

paper/paper.md

Lines changed: 31 additions & 29 deletions
@@ -51,29 +51,31 @@ Moreover, they can handle cases where Hessian approximations are unbounded[@diou
 
 # Statement of need
 
-## Unified framework for nonsmooth methods
+## Model-based framework for nonsmooth methods
 
-There exists a way to solve \eqref{eq:nlp} in Julia via [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl).
-It implements several proximal algorithms for nonsmooth optimization.
-However, the available examples only consider convex instances of $h$, namely the $\ell_1$ norm and there are no tests for memory allocations.
-Moreover, it implements only one quasi-Newton method (L-BFGS) and does not support other Hessian approximations.
+There exists a way to solve \eqref{eq:nlp} in Julia using [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), which implements first-order line search–based methods for composite optimization.
+These methods are generally splitting schemes that alternate between gradient steps on the smooth part $f$, or directions derived from it, and proximal steps on the nonsmooth part $h$.
+By contrast, **RegularizedOptimization.jl** focuses on model-based approaches such as trust-region and regularization algorithms.
+As shown in [@aravkin-baraldi-orban-2022], model-based methods typically require fewer evaluations of the objective and its gradient than first-order line search methods, at the expense of solving more involved subproblems.
+Although these subproblems may require many proximal iterations, each proximal computation is inexpensive, making the overall approach efficient for large-scale problems.
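To make the distinction concrete, one iteration of such a splitting scheme can be sketched with ProximalOperators.jl; the quadratic smooth term, data, and step size below are illustrative only and not part of either package's API.

```julia
using ProximalOperators

# Illustrative composite problem: f(x) = ½‖Ax − b‖², h(x) = λ‖x‖₁
A = [1.0 0.0; 0.0 2.0]
b = [1.0, -1.0]
∇f(x) = A' * (A * x - b)
h = NormL1(0.1)

# One forward-backward (proximal gradient) iteration:
# a gradient step on the smooth part f, then a proximal step on the nonsmooth part h.
x = zeros(2)
t = 0.1                              # step size (illustrative)
x, _ = prox(h, x - t * ∇f(x), t)
```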
 
-**RegularizedOptimization.jl**, in contrast, implements a broad class of regularization-based algorithms for solving problems of the form $f(x) + h(x)$, where $f$ is smooth and $h$ is nonsmooth.
-The package offers a consistent API to formulate optimization problems and apply different regularization methods.
+Building on this perspective, **RegularizedOptimization.jl** implements a broad class of regularization-based algorithms for solving problems of the form $f(x) + h(x)$, where $f$ is smooth and $h$ is nonsmooth.
+The package provides a consistent API to formulate optimization problems and apply different regularization methods.
 It integrates seamlessly with the [JuliaSmoothOptimizers](https://github.com/JuliaSmoothOptimizers) ecosystem, an academic organization for nonlinear optimization software development, testing, and benchmarking.
-Specifically, **RegularizedOptimization.jl** interoperates with:
 
-- **Definition of smooth problems $f$** via [NLPModels.jl](https://github.com/JuliaSmoothOptimizers/NLPModels.jl) [@orban-siqueira-nlpmodels-2020] which provides a standardized Julia API for representing nonlinear programming (NLP) problems.
+On the one hand, smooth problems $f$ can be defined via [NLPModels.jl](https://github.com/JuliaSmoothOptimizers/NLPModels.jl) [@orban-siqueira-nlpmodels-2020], which provides a standardized Julia API for representing nonlinear programming (NLP) problems.
 Large collections of such problems are available in [Cutest.jl](https://github.com/JuliaSmoothOptimizers/CUTEst.jl) [@orban-siqueira-cutest-2020] and [OptimizationProblems.jl](https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl) [@migot-orban-siqueira-optimizationproblems-2023].
-Another option is to use [RegularizedProblems.jl](https://github.com/JuliaSmoothOptimizers/RegularizedProblems.jl), which provides instances commonly used in the nonsmooth optimization literature.
-- **Hessian approximations (quasi-Newton, diagonal approximations)** via [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl), which represents Hessians as linear operators and implements efficient Hessian–vector products.
-- **Definition of nonsmooth terms $h$** via [ProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ProximalOperators.jl), which offers a large collection of nonsmooth functions, and [ShiftedProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ShiftedProximalOperators.jl), which provides shifted proximal mappings for nonsmooth functions.
+Another option is to use [RegularizedProblems.jl](https://github.com/JuliaSmoothOptimizers/RegularizedProblems.jl), which provides problem instances commonly used in the nonsmooth optimization literature.
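For instance, a smooth term $f$ can be taken from RegularizedProblems.jl and queried through the standard NLPModels.jl API; the basis-pursuit denoising constructor `bpdn_model` and its argument below are used purely for illustration.

```julia
using NLPModels, RegularizedProblems

# A basis-pursuit denoising instance from RegularizedProblems.jl
bpdn, _, _ = bpdn_model(10)

# Every NLPModel exposes the same smooth-problem API
x0 = bpdn.meta.x0
fx = obj(bpdn, x0)     # objective value f(x0)
gx = grad(bpdn, x0)    # gradient ∇f(x0)
```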
+
+On the other hand, Hessian approximations of these functions, including quasi-Newton and diagonal schemes, can be specified through [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl), which represents Hessians as linear operators and implements efficient Hessian–vector products.
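As an illustration, a limited-memory BFGS operator from LinearOperators.jl can be updated with step and gradient-difference pairs and applied to vectors without forming a matrix; the data below is synthetic.

```julia
using LinearOperators

n = 100
B = LBFGSOperator(n)    # limited-memory BFGS Hessian approximation

# Update with a (step, gradient difference) pair; here y comes from a quadratic
# with Hessian 2I, so the curvature condition s'y > 0 holds.
s = randn(n)
y = 2.0 .* s
push!(B, s, y)

# Hessian–vector product without forming B explicitly
v = randn(n)
Bv = B * v
```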
+
+Finally, nonsmooth terms $h$ can be modeled using [ProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ProximalOperators.jl), which provides a broad collection of nonsmooth functions, together with [ShiftedProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ShiftedProximalOperators.jl), which provides shifted proximal mappings for nonsmooth functions.
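For example, the $\ell_0$ regularizer used in the example below and its proximal mapping are available directly from ProximalOperators.jl; the vector and step size here are illustrative.

```julia
using ProximalOperators

λ = 1.0e-1
h = NormL0(λ)              # nonconvex ℓ0 regularizer

x = [0.3, -0.05, 1.2, 0.0]
γ = 1.0                    # proximal step size
y, hy = prox(h, x, γ)      # y = prox of h at x, hy = h(y)
```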
 
 This modularity makes it easy to benchmark existing solvers available in the repository [@diouane-habiboullah-orban-2024], [@aravkin-baraldi-orban-2022], [@aravkin-baraldi-orban-2024], and [@leconte-orban-2023-2].
 
 ## Support for inexact subproblem solves
 
-Solvers in **RegularizedOptimization.jl** allow inexact resolution of trust-region and cubic-regularized subproblems using first-order that are implemented in the package itself such as the quadratic regularization method R2[@aravkin-baraldi-orban-2022] and R2DH[@diouane-habiboullah-orban-2024] with trust-region variants TRDH[@leconte-orban-2023-2]
+Solvers in **RegularizedOptimization.jl** allow inexact resolution of trust-region and quadratic-regularized subproblems using first-order methods implemented in the package itself, such as the quadratic regularization method R2 [@aravkin-baraldi-orban-2022] and R2DH [@diouane-habiboullah-orban-2024], together with the trust-region variant TRDH [@leconte-orban-2023-2].
 
 This is crucial for large-scale problems where exact subproblem solutions are prohibitive.
 
@@ -86,35 +88,35 @@ In contrast, many problems admit efficient implementations of Hessian–vector o
 ## In-place methods
 
 All solvers in **RegularizedOptimization.jl** are implemented in an in-place fashion, minimizing memory allocations and improving performance.
-This is particularly important for large-scale problems, where memory usage can become a bottleneck.
-Even in low-dimensional settings, Julia may exhibit significantly slower performance due to extra allocations, making the in-place design a key feature of the package.
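As a minimal sketch of what this design permits, assuming the `reg_nlp`, `solver_r2dh`, `stats`, and `f` objects constructed in the example below, the solver and statistics objects can be created once and reused across several `solve!` calls:

```julia
# Reuse the preallocated solver and stats objects across several solves,
# e.g. to restart from different initial regularization parameters.
for σ0 in (1.0, 1.0e-3, 1.0e-6)
  solve!(solver_r2dh, reg_nlp, stats, x = f.meta.x0, σk = σ0, atol = 2e-5, rtol = 2e-5)
end
```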
 
 # Examples
 
-A simple example is the solution of a regularized quadratic problem with an $\ell_1$ penalty, as described in [@aravkin-baraldi-orban-2022].
-Such problems are common in statistical learning and compressed sensing applications.The formulation is
-$$
-\min_{x \in \mathbb{R}^n} \ \tfrac{1}{2}\|Ax-b\|_2^2+\lambda\|x\|_0,
+We consider two examples in which both the smooth part $f$ and the nonsmooth part $h$ are nonconvex.
+
+A first example addresses an image recognition task using a support vector machine (SVM) similar to those in [@aravkin-baraldi-orban-2022] and [@diouane-habiboullah-orban-2024].
+The formulation is
 $$
-where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $\lambda>0$ is a regularization parameter.
+\min_{x \in \mathbb{R}^n} \ \tfrac{1}{2} \|\mathbf{1} - \tanh(b \odot \langle A, x \rangle)\|^2 + \lambda \|x\|_0,
+$$
+where $\lambda = 10^{-1}$ and $A \in \mathbb{R}^{m \times n}$, with $n = 784$ the vectorized size of each image and $m = 13{,}007$ the number of images in the training dataset.
 
 ```julia
 using LinearAlgebra, Random
 using ProximalOperators
 using NLPModels, NLPModelsModifiers, RegularizedProblems, RegularizedOptimization, SolverCore
+using MLDatasets
 
-# Set random seed for reproducibility
-Random.seed!(1234)
+random_seed = 1234
+Random.seed!(random_seed)
 
-# Define a basis pursuit denoising problem
-compound = 10
-bpdn_model, _, _ = bpdn_model(compound)
+# Load the MNIST dataset
+model, _, _ = RegularizedProblems.svm_train_model()
 
 # Define the Hessian approximation
-f = SpectralGradientModel(bpdn)
+f = LBFGSModel(model)
 
-# Define the nonsmooth regularizer (L1 norm)
-λ = norm(grad(bpdn_model, zeros(bpdn_model.meta.nvar)), Inf) / 10
+# Define the nonsmooth regularizer (L0 norm)
+λ = 1.0e-1
 h = NormL0(λ)
 
 # Define the regularized NLP model
@@ -125,7 +127,7 @@ solver_r2dh= R2DHSolver(reg_nlp)
 stats = RegularizedExecutionStats(reg_nlp)
 
 # Solve the problem
-solve!(solver_r2dh, reg_nlp, stats, x = f.meta.x0, σk = 1.0, atol = 1e-8, rtol = 1e-8, verbose = 1)
+solve!(solver_r2dh, reg_nlp, stats, x = f.meta.x0, σk = 1e-6, atol = 2e-5, rtol = 2e-5, verbose = 1)
 
 ```
 
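Assuming the returned `stats` object exposes the standard SolverCore execution-statistics fields, the outcome of the run can then be inspected as follows:

```julia
println(stats.status)       # termination status, e.g. :first_order
println(stats.objective)    # final objective value
x_sol = stats.solution      # approximate solution
```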