Moreover, they can handle cases where Hessian approximations are unbounded [@diouane-habiboullah-orban-2024].
# Statement of need
## Model-based framework for nonsmooth methods
There exists a way to solve \eqref{eq:nlp} in Julia using [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), which implements first-order line search–based methods for composite optimization.
These methods are generally splitting schemes that alternate between gradient steps on the smooth part $f$, or directions derived from it, and proximal steps on the nonsmooth part $h$.
By contrast, **RegularizedOptimization.jl** focuses on model-based approaches such as trust-region and regularization algorithms.
As shown in [@aravkin-baraldi-orban-2022], model-based methods typically require fewer evaluations of the objective and its gradient than first-order line search methods, at the expense of solving more involved subproblems.
Although these subproblems may require many proximal iterations, each proximal computation is inexpensive, making the overall approach efficient for large-scale problems.
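For reference, the snippet below is a bare-bones sketch of such a splitting scheme: a fixed-step proximal-gradient loop that alternates a gradient step on $f$ with a proximal step on $h$. It assumes only the `prox` interface of ProximalOperators.jl; the quadratic $f(x) = \tfrac{1}{2}\|Ax - b\|^2$ and the convex penalty $h = \lambda\|\cdot\|_1$ are illustrative choices, not the actual implementation of ProximalAlgorithms.jl.

```julia
using LinearAlgebra
using ProximalOperators  # provides `prox` for a large collection of nonsmooth functions

# Bare-bones proximal gradient for min f(x) + h(x),
# with f(x) = ½‖Ax − b‖² and h = λ‖·‖₁ (illustrative choices only).
function prox_gradient(A, b, h; iters = 100)
    L = opnorm(A' * A)                    # Lipschitz constant of ∇f
    x = zeros(size(A, 2))
    for _ in 1:iters
        g = A' * (A * x - b)              # gradient step on the smooth part f
        x, _ = prox(h, x - g / L, 1 / L)  # proximal step on the nonsmooth part h
    end
    return x
end

x = prox_gradient(randn(20, 10), randn(20), NormL1(0.1))
```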
Building on this perspective, **RegularizedOptimization.jl** implements a broad class of regularization-based algorithms for solving problems of the form $f(x) + h(x)$, where $f$ is smooth and $h$ is nonsmooth.
The package provides a consistent API to formulate optimization problems and apply different regularization methods.
It integrates seamlessly with the [JuliaSmoothOptimizers](https://github.com/JuliaSmoothOptimizers) ecosystem, an academic organization for nonlinear optimization software development, testing, and benchmarking.
On the one hand, smooth problems $f$ can be defined via [NLPModels.jl](https://github.com/JuliaSmoothOptimizers/NLPModels.jl)[@orban-siqueira-nlpmodels-2020], which provides a standardized Julia API for representing nonlinear programming (NLP) problems.
Large collections of such problems are available in [CUTEst.jl](https://github.com/JuliaSmoothOptimizers/CUTEst.jl)[@orban-siqueira-cutest-2020] and [OptimizationProblems.jl](https://github.com/JuliaSmoothOptimizers/OptimizationProblems.jl)[@migot-orban-siqueira-optimizationproblems-2023].
Another option is to use [RegularizedProblems.jl](https://github.com/JuliaSmoothOptimizers/RegularizedProblems.jl), which provides problem instances commonly used in the nonsmooth optimization literature.
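As a minimal sketch of the smooth side, the snippet below builds a model and queries it through the standardized NLPModels API; ADNLPModels.jl is used here purely for convenience (it is an assumption, not a package named above), and the Rosenbrock function is an arbitrary illustrative choice.

```julia
using ADNLPModels, NLPModels  # ADNLPModels builds NLPModels-compliant problems from plain Julia functions

# An arbitrary smooth f (Rosenbrock), exposed through the standardized NLPModels API
f(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
nlp = ADNLPModel(f, [-1.2, 1.0])

fx = obj(nlp, nlp.meta.x0)   # objective value at the starting point
gx = grad(nlp, nlp.meta.x0)  # gradient, computed by automatic differentiation
```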
On the other hand, Hessian approximations of these functions, including quasi-Newton and diagonal schemes, can be specified through [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl), which represents Hessians as linear operators and implements efficient Hessian–vector products.
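A small sketch of this mechanism, assuming only the LinearOperators.jl interface: a limited-memory BFGS operator is updated with a correction pair and applied to a vector without ever forming a dense Hessian.

```julia
using LinearOperators

n = 5
B = LBFGSOperator(n)     # limited-memory BFGS approximation of a Hessian
s, y = rand(n), rand(n)  # a (step, gradient-difference) correction pair
push!(B, s, y)           # update the quasi-Newton approximation
v = rand(n)
Bv = B * v               # Hessian–vector product, evaluated matrix-free
```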
Finally, nonsmooth terms $h$ can be modeled using [ProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ProximalOperators.jl), which provides a broad collection of nonsmooth functions, together with [ShiftedProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ShiftedProximalOperators.jl), which provides shifted proximal mappings for nonsmooth functions.
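For the nonsmooth side, a minimal sketch using only ProximalOperators.jl: the nonconvex $\ell_0$ penalty that also appears in the example below, evaluated through its proximal mapping (the values of $\lambda$ and $x$ are arbitrary).

```julia
using ProximalOperators

λ = 0.1
h = NormL0(λ)             # nonconvex ℓ₀ penalty λ‖x‖₀
x = [0.3, -0.01, 2.0, 0.0]
y, hy = prox(h, x, 1.0)   # proximal point of h at x with step 1.0, and h evaluated at that point
```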
This modularity makes it easy to benchmark existing solvers available in the repository [@diouane-habiboullah-orban-2024], [@aravkin-baraldi-orban-2022], [@aravkin-baraldi-orban-2024], and [@leconte-orban-2023-2].
## Support for inexact subproblem solves
Solvers in **RegularizedOptimization.jl** allow inexact resolution of trust-region and quadratic-regularized subproblems using first-order methods implemented in the package itself, such as the quadratic regularization method R2 [@aravkin-baraldi-orban-2022] and R2DH [@diouane-habiboullah-orban-2024], along with the trust-region variant TRDH [@leconte-orban-2023-2].
This is crucial for large-scale problems where exact subproblem solutions are prohibitive.
In contrast, many problems admit efficient implementations of Hessian–vector operations.
## In-place methods
All solvers in **RegularizedOptimization.jl** are implemented in an in-place fashion, minimizing memory allocations and improving performance.
This is particularly important for large-scale problems, where memory usage can become a bottleneck.
Even in low-dimensional settings, Julia may exhibit significantly slower performance due to extra allocations, making the in-place design a key feature of the package.
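The snippet below is a generic illustration of why the in-place idiom matters in Julia; it is not code from the package. The allocating variant creates a fresh vector on every call, while the in-place variant writes into a preallocated buffer.

```julia
using LinearAlgebra

n = 100_000
x = randn(n)
g = similar(x)        # preallocated buffer, reused across calls
H = Diagonal(rand(n)) # cheap stand-in for a Hessian approximation

step_alloc(H, x)       = H * x          # allocates a new vector on every call
step_inplace!(g, H, x) = mul!(g, H, x)  # writes the product into g, no allocation

step_alloc(H, x); step_inplace!(g, H, x)  # warm-up to exclude compilation overhead
@time step_alloc(H, x)                    # allocates a fresh n-vector
@time step_inplace!(g, H, x)              # 0 bytes allocated
```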
# Examples
We consider two examples in which both the smooth part $f$ and the nonsmooth part $h$ are nonconvex.
A first example addresses an image recognition task using a support vector machine (SVM) similar to those in [@aravkin-baraldi-orban-2022] and [@diouane-habiboullah-orban-2024].
The formulation is
$$
\min_{x \in \mathbb{R}^n} \ \tfrac{1}{2} \|\mathbf{1} - \tanh(b \odot \langle A, x \rangle)\|^2 + \lambda \|x\|_0,
$$
where $\lambda = 10^{-1}$ and $A \in \mathbb{R}^{m \times n}$, with $n = 784$ the vectorized size of each image and $m = 13{,}007$ the number of images in the training dataset.
```julia
using LinearAlgebra, Random
using ProximalOperators
using NLPModels, NLPModelsModifiers, RegularizedProblems, RegularizedOptimization, SolverCore
0 commit comments