asd
76
venv/lib/python3.12/site-packages/scipy/optimize/README
Normal file
@ -0,0 +1,76 @@
From the website for the L-BFGS-B code (at
http://www.ece.northwestern.edu/~nocedal/lbfgsb.html):

"""
L-BFGS-B is a limited-memory quasi-Newton code for bound-constrained
optimization, i.e. for problems where the only constraints are of the
form l<= x <= u.
"""

This is a Python wrapper (using F2PY) written by David M. Cooke
<cookedm@physics.mcmaster.ca> and released as version 0.9 on April 9, 2004.
The wrapper was slightly modified by Joonas Paalasmaa for the 3.0 version
in March 2012.

License of L-BFGS-B (Fortran code)
==================================

The version included here (in lbfgsb.f) is 3.0 (released April 25, 2011). It was
written by Ciyou Zhu, Richard Byrd, and Jorge Nocedal <nocedal@ece.nwu.edu>. It
carries the following condition for use:

"""
This software is freely available, but we expect that all publications
describing work using this software, or all commercial products using it,
quote at least one of the references given below. This software is released
under the BSD License.

References
  * R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound
    Constrained Optimization, (1995), SIAM Journal on Scientific and
    Statistical Computing, 16, 5, pp. 1190-1208.
  * C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B,
    FORTRAN routines for large scale bound constrained optimization (1997),
    ACM Transactions on Mathematical Software, 23, 4, pp. 550 - 560.
  * J.L. Morales and J. Nocedal. L-BFGS-B: Remark on Algorithm 778: L-BFGS-B,
    FORTRAN routines for large scale bound constrained optimization (2011),
    ACM Transactions on Mathematical Software, 38, 1.
"""

The Python wrapper
==================

This code uses F2PY (http://cens.ioc.ee/projects/f2py2e/) to generate
the wrapper around the Fortran code.

The Python code and wrapper are copyrighted 2004 by David M. Cooke
<cookedm@physics.mcmaster.ca>.

Example usage
=============

An example of the usage is given at the bottom of the lbfgsb.py file.
Run it with 'python lbfgsb.py'.
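
In current SciPy releases the wrapper is usually reached through
`scipy.optimize.minimize` rather than by running the module directly. The
following is a minimal sketch, assuming NumPy and SciPy are installed; the
objective, starting point, and bounds are illustrative choices and are not
taken from this README.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Illustrative problem: minimize the Rosenbrock function subject to simple
# bounds l <= x <= u, the only kind of constraint L-BFGS-B handles.
x0 = np.array([0.5, 0.5, 0.5])
bounds = [(0.0, 0.8)] * 3  # the unconstrained minimum (1, 1, 1) is excluded

res = minimize(rosen, x0, jac=rosen_der, method="L-BFGS-B", bounds=bounds)

print(res.x)    # every component stays inside [0.0, 0.8]
print(res.fun)  # objective value at the bounded minimizer
```

Supplying the analytic gradient through `jac` avoids finite-difference
evaluations; if it is omitted, the gradient is estimated numerically.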

License for the Python wrapper
==============================

Copyright (c) 2004 David M. Cooke <cookedm@physics.mcmaster.ca>

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
452
venv/lib/python3.12/site-packages/scipy/optimize/__init__.py
Normal file
@ -0,0 +1,452 @@
"""
=====================================================
Optimization and root finding (:mod:`scipy.optimize`)
=====================================================

.. currentmodule:: scipy.optimize

.. toctree::
   :hidden:

   optimize.cython_optimize

SciPy ``optimize`` provides functions for minimizing (or maximizing)
objective functions, possibly subject to constraints. It includes
solvers for nonlinear problems (with support for both local and global
optimization algorithms), linear programming, constrained
and nonlinear least-squares, root finding, and curve fitting.

Common functions and objects, shared across different solvers, are:

.. autosummary::
   :toctree: generated/

   show_options - Show specific options of optimization solvers.
   OptimizeResult - The optimization result returned by some optimizers.
   OptimizeWarning - The optimization encountered problems.


Optimization
============

Scalar functions optimization
-----------------------------

.. autosummary::
   :toctree: generated/

   minimize_scalar - Interface for minimizers of univariate functions

The `minimize_scalar` function supports the following methods:

.. toctree::

   optimize.minimize_scalar-brent
   optimize.minimize_scalar-bounded
   optimize.minimize_scalar-golden

Local (multivariate) optimization
---------------------------------

.. autosummary::
   :toctree: generated/

   minimize - Interface for minimizers of multivariate functions.

The `minimize` function supports the following methods:

.. toctree::

   optimize.minimize-neldermead
   optimize.minimize-powell
   optimize.minimize-cg
   optimize.minimize-bfgs
   optimize.minimize-newtoncg
   optimize.minimize-lbfgsb
   optimize.minimize-tnc
   optimize.minimize-cobyla
   optimize.minimize-cobyqa
   optimize.minimize-slsqp
   optimize.minimize-trustconstr
   optimize.minimize-dogleg
   optimize.minimize-trustncg
   optimize.minimize-trustkrylov
   optimize.minimize-trustexact

Constraints are passed to the `minimize` function as a single object or
as a list of objects from the following classes:

.. autosummary::
   :toctree: generated/

   NonlinearConstraint - Class defining general nonlinear constraints.
   LinearConstraint - Class defining general linear constraints.

Simple bound constraints are handled separately and there is a special class
for them:

.. autosummary::
   :toctree: generated/

   Bounds - Bound constraints.
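
The constraint and bound classes listed above can be combined in a single
`minimize` call. The following is a minimal sketch; the objective, the
constraint values, and the starting point are illustrative assumptions chosen
so that the answer is easy to check by hand.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, NonlinearConstraint, minimize

# Illustrative objective: squared distance from the point (2, 2).
def objective(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2

# Linear constraint x0 + x1 <= 2, written as -inf <= A @ x <= 2.
linear = LinearConstraint(np.array([[1.0, 1.0]]), -np.inf, 2.0)

# Nonlinear constraint: stay inside the disc x0**2 + x1**2 <= 4.
nonlinear = NonlinearConstraint(lambda x: x[0] ** 2 + x[1] ** 2, -np.inf, 4.0)

# Simple bounds are passed separately through the `bounds` argument.
bounds = Bounds([0.0, 0.0], [1.5, 1.5])

res = minimize(objective, x0=np.array([0.5, 0.5]), method="trust-constr",
               constraints=[linear, nonlinear], bounds=bounds)

print(res.x)  # expected to be close to (1, 1), where x0 + x1 = 2 is active
```

Here the gradients are estimated by finite differences because no `jac` is
supplied; passing analytic derivatives is usually faster and more accurate.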

Quasi-Newton strategies implementing the `HessianUpdateStrategy`
interface can be used to approximate the Hessian in the `minimize`
function (available only for the 'trust-constr' method). Available
quasi-Newton methods implementing this interface are:

.. autosummary::
   :toctree: generated/

   BFGS - Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
   SR1 - Symmetric-rank-1 Hessian update strategy.
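
As a hedged illustration of the quasi-Newton strategies above, the sketch
below passes `BFGS()` and `SR1()` instances as the `hess` argument of
`minimize` with the 'trust-constr' method. The Rosenbrock test function and
the starting point are illustrative choices.

```python
import numpy as np
from scipy.optimize import BFGS, SR1, minimize, rosen, rosen_der

x0 = np.array([0.5, 0.5])

# A HessianUpdateStrategy instance passed as `hess` replaces the exact
# Hessian with an approximation built up from successive gradients.
res_bfgs = minimize(rosen, x0, jac=rosen_der, hess=BFGS(), method="trust-constr")
res_sr1 = minimize(rosen, x0, jac=rosen_der, hess=SR1(), method="trust-constr")

print(res_bfgs.x, res_sr1.x)  # both are expected to approach (1, 1)
```

A fresh strategy object is created for each call because the object stores
the running Hessian approximation.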

.. _global_optimization:

Global optimization
-------------------

.. autosummary::
   :toctree: generated/

   basinhopping - Basinhopping stochastic optimizer.
   brute - Brute force searching optimizer.
   differential_evolution - Stochastic optimizer using differential evolution.

   shgo - Simplicial homology global optimizer.
   dual_annealing - Dual annealing stochastic optimizer.
   direct - DIRECT (Dividing Rectangles) optimizer.

Least-squares and curve fitting
===============================

Nonlinear least-squares
-----------------------

.. autosummary::
   :toctree: generated/

   least_squares - Solve a nonlinear least-squares problem with bounds on the variables.

Linear least-squares
--------------------

.. autosummary::
   :toctree: generated/

   nnls - Linear least-squares problem with non-negativity constraint.
   lsq_linear - Linear least-squares problem with bound constraints.
   isotonic_regression - Least squares problem of isotonic regression via PAVA.

Curve fitting
-------------

.. autosummary::
   :toctree: generated/

   curve_fit -- Fit curve to a set of points.

Root finding
============

Scalar functions
----------------
.. autosummary::
   :toctree: generated/

   root_scalar - Unified interface for nonlinear solvers of scalar functions.
   brentq - quadratic interpolation Brent method.
   brenth - Brent method, modified by Harris with hyperbolic extrapolation.
   ridder - Ridder's method.
   bisect - Bisection method.
   newton - Newton's method (also Secant and Halley's methods).
   toms748 - Alefeld, Potra & Shi Algorithm 748.
   RootResults - The root finding result returned by some root finders.

The `root_scalar` function supports the following methods:

.. toctree::

   optimize.root_scalar-brentq
   optimize.root_scalar-brenth
   optimize.root_scalar-bisect
   optimize.root_scalar-ridder
   optimize.root_scalar-newton
   optimize.root_scalar-toms748
   optimize.root_scalar-secant
   optimize.root_scalar-halley



The table below lists situations and appropriate methods, along with
*asymptotic* convergence rates per iteration (and per function evaluation)
for successful convergence to a simple root(*).
Bisection is the slowest of them all, adding one bit of accuracy for each
function evaluation, but is guaranteed to converge.
The other bracketing methods all (eventually) increase the number of accurate
bits by about 50% for every function evaluation.
The derivative-based methods, all built on `newton`, can converge quite quickly
if the initial value is close to the root. They can also be applied to
functions defined on (a subset of) the complex plane.

+-------------+----------+----------+-----------+-------------+-------------+----------------+
| Domain of f | Bracket? |    Derivatives?      | Solvers     |         Convergence          |
+             +          +----------+-----------+             +-------------+----------------+
|             |          | `fprime` | `fprime2` |             | Guaranteed? | Rate(s)(*)     |
+=============+==========+==========+===========+=============+=============+================+
| `R`         | Yes      | N/A      | N/A       | - bisection | - Yes       | - 1 "Linear"   |
|             |          |          |           | - brentq    | - Yes       | - >=1, <= 1.62 |
|             |          |          |           | - brenth    | - Yes       | - >=1, <= 1.62 |
|             |          |          |           | - ridder    | - Yes       | - 2.0 (1.41)   |
|             |          |          |           | - toms748   | - Yes       | - 2.7 (1.65)   |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C`  | No       | No       | No        | secant      | No          | 1.62 (1.62)    |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C`  | No       | Yes      | No        | newton      | No          | 2.00 (1.41)    |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C`  | No       | Yes      | Yes       | halley      | No          | 3.00 (1.44)    |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
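
The trade-offs in the table can be observed by running several methods on the
same problem. The following is a minimal sketch; the cubic, the bracket, and
the starting points are illustrative assumptions.

```python
from scipy.optimize import root_scalar

def f(x):
    # Illustrative cubic with a single simple real root near 2.0946.
    return x**3 - 2.0 * x - 5.0

def fprime(x):
    return 3.0 * x**2 - 2.0

# Bracketing methods need f(a) and f(b) to have opposite signs.
calls = {}
for method in ("bisect", "brentq", "brenth", "ridder", "toms748"):
    sol = root_scalar(f, bracket=[2.0, 3.0], method=method)
    calls[method] = sol.function_calls
    print(f"{method:8s} root={sol.root:.10f} function calls={sol.function_calls}")

# Derivative-based methods start from a point (or two) rather than a bracket.
sol_newton = root_scalar(f, x0=2.0, fprime=fprime, method="newton")
sol_secant = root_scalar(f, x0=2.0, x1=3.0, method="secant")
print(sol_newton.root, sol_secant.root)
```

Bisection typically needs the most function evaluations of the bracketing
methods, which matches its linear convergence rate in the table.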

.. seealso::

   `scipy.optimize.cython_optimize` -- Typed Cython versions of root finding functions

Fixed point finding:

.. autosummary::
   :toctree: generated/

   fixed_point - Single-variable fixed-point solver.

Multidimensional
----------------

.. autosummary::
   :toctree: generated/

   root - Unified interface for nonlinear solvers of multivariate functions.

The `root` function supports the following methods:

.. toctree::

   optimize.root-hybr
   optimize.root-lm
   optimize.root-broyden1
   optimize.root-broyden2
   optimize.root-anderson
   optimize.root-linearmixing
   optimize.root-diagbroyden
   optimize.root-excitingmixing
   optimize.root-krylov
   optimize.root-dfsane

Linear programming / MILP
=========================

.. autosummary::
   :toctree: generated/

   milp -- Mixed integer linear programming.
   linprog -- Unified interface for minimizers of linear programming problems.

The `linprog` function supports the following methods:

.. toctree::

   optimize.linprog-simplex
   optimize.linprog-interior-point
   optimize.linprog-revised_simplex
   optimize.linprog-highs-ipm
   optimize.linprog-highs-ds
   optimize.linprog-highs
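
A small linear program shows how `linprog` is called with one of the methods
listed above. This is a minimal sketch; the coefficients are illustrative and
chosen so that the optimum can be verified by checking the vertices by hand.

```python
from scipy.optimize import linprog

# Illustrative LP: maximize x + 2y, i.e. minimize -x - 2y, subject to
#   x + y <= 4,  x + 3y <= 6,  x >= 0,  y >= 0.
c = [-1.0, -2.0]
A_ub = [[1.0, 1.0], [1.0, 3.0]]
b_ub = [4.0, 6.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)],
              method="highs")

print(res.x, res.fun)  # optimum at the vertex (3, 1) with objective -5
```

`linprog` always minimizes, so a maximization problem is expressed by negating
the objective coefficients.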

The simplex, interior-point, and revised simplex methods support callback
functions, such as:

.. autosummary::
   :toctree: generated/

   linprog_verbose_callback -- Sample callback function for linprog (simplex).

Assignment problems
===================

.. autosummary::
   :toctree: generated/

   linear_sum_assignment -- Solves the linear-sum assignment problem.
   quadratic_assignment -- Solves the quadratic assignment problem.

The `quadratic_assignment` function supports the following methods:

.. toctree::

   optimize.qap-faq
   optimize.qap-2opt
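
For the linear-sum case listed above, the following minimal sketch shows the
call and how to read its result. The cost matrix is an illustrative
assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j] is the (illustrative) cost of assigning worker i to job j.
cost = np.array([[4, 1, 3],
                 [2, 0, 5],
                 [3, 2, 2]])

row_ind, col_ind = linear_sum_assignment(cost)
total = cost[row_ind, col_ind].sum()

print(col_ind, total)  # assignment 0->1, 1->0, 2->2 with total cost 5
```

The two returned index arrays pair each row with its assigned column, so
fancy indexing into the cost matrix gives the cost of each chosen pair.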

Utilities
=========

Finite-difference approximation
-------------------------------

.. autosummary::
   :toctree: generated/

   approx_fprime - Approximate the gradient of a scalar function.
   check_grad - Check the supplied derivative using finite differences.
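
Both finite-difference helpers can be tried on the Rosenbrock benchmark
function, whose analytic gradient ships with SciPy. This is a minimal sketch;
the evaluation point and step size are illustrative choices.

```python
import numpy as np
from scipy.optimize import approx_fprime, check_grad, rosen, rosen_der

x = np.array([1.3, 0.7, 0.8])

# Forward-difference estimate of the gradient of the Rosenbrock function.
grad_fd = approx_fprime(x, rosen, 1e-8)

# check_grad returns the 2-norm of the difference between the analytic
# gradient and a finite-difference estimate at x.
err = check_grad(rosen, rosen_der, x)

print(grad_fd)
print(err)  # small when the analytic gradient is implemented correctly
```

A large value from `check_grad` is a common first sign that a hand-written
gradient has a sign or indexing error.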


Line search
-----------

.. autosummary::
   :toctree: generated/

   bracket - Bracket a minimum, given two starting points.
   line_search - Return a step that satisfies the strong Wolfe conditions.

Hessian approximation
---------------------

.. autosummary::
   :toctree: generated/

   LbfgsInvHessProduct - Linear operator for L-BFGS approximate inverse Hessian.
   HessianUpdateStrategy - Interface for implementing Hessian update strategies

Benchmark problems
------------------

.. autosummary::
   :toctree: generated/

   rosen - The Rosenbrock function.
   rosen_der - The derivative of the Rosenbrock function.
   rosen_hess - The Hessian matrix of the Rosenbrock function.
   rosen_hess_prod - Product of the Rosenbrock Hessian with a vector.

Legacy functions
================

The functions below are not recommended for use in new scripts;
all of these methods are accessible via the newer, more consistent
interfaces listed above.
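
To illustrate the relationship between the two styles, the sketch below runs
the legacy `fmin` function and the equivalent `minimize` call with the
Nelder-Mead method. The starting point and tolerances are illustrative
assumptions.

```python
import numpy as np
from scipy.optimize import fmin, minimize, rosen

x0 = np.array([1.2, 0.8])

# Legacy interface: returns only the minimizer by default.
x_legacy = fmin(rosen, x0, xtol=1e-8, ftol=1e-8, disp=False)

# Recommended interface: returns an OptimizeResult with diagnostics.
res = minimize(rosen, x0, method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})

print(x_legacy, res.x, res.nit, res.success)
```

The newer interface reports convergence status and iteration counts in one
result object, which is the main practical reason to prefer it.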

Optimization
------------

General-purpose multivariate methods:

.. autosummary::
   :toctree: generated/

   fmin - Nelder-Mead Simplex algorithm.
   fmin_powell - Powell's (modified) conjugate direction method.
   fmin_cg - Non-linear (Polak-Ribiere) conjugate gradient algorithm.
   fmin_bfgs - Quasi-Newton method (Broyden-Fletcher-Goldfarb-Shanno).
   fmin_ncg - Line-search Newton Conjugate Gradient.

Constrained multivariate methods:

.. autosummary::
   :toctree: generated/

   fmin_l_bfgs_b - Zhu, Byrd, and Nocedal's constrained optimizer.
   fmin_tnc - Truncated Newton code.
   fmin_cobyla - Constrained optimization by linear approximation.
   fmin_slsqp - Minimization using sequential least-squares programming.

Univariate (scalar) minimization methods:

.. autosummary::
   :toctree: generated/

   fminbound - Bounded minimization of a scalar function.
   brent - 1-D function minimization using Brent method.
   golden - 1-D function minimization using Golden Section method.

Least-squares
-------------

.. autosummary::
   :toctree: generated/

   leastsq - Minimize the sum of squares of M equations in N unknowns.

Root finding
------------

General nonlinear solvers:

.. autosummary::
   :toctree: generated/

   fsolve - Non-linear multivariable equation solver.
   broyden1 - Broyden's first method.
   broyden2 - Broyden's second method.
   NoConvergence - Exception raised when nonlinear solver does not converge.

Large-scale nonlinear solvers:

.. autosummary::
   :toctree: generated/

   newton_krylov
   anderson

   BroydenFirst
   InverseJacobian
   KrylovJacobian

Simple iteration solvers:

.. autosummary::
   :toctree: generated/

   excitingmixing
   linearmixing
   diagbroyden

"""  # noqa: E501

from ._optimize import *
from ._minimize import *
from ._root import *
from ._root_scalar import *
from ._minpack_py import *
from ._zeros_py import *
from ._lbfgsb_py import fmin_l_bfgs_b, LbfgsInvHessProduct
from ._tnc import fmin_tnc
from ._cobyla_py import fmin_cobyla
from ._nonlin import *
from ._slsqp_py import fmin_slsqp
from ._nnls import nnls
from ._basinhopping import basinhopping
from ._linprog import linprog, linprog_verbose_callback
from ._lsap import linear_sum_assignment
from ._differentialevolution import differential_evolution
from ._lsq import least_squares, lsq_linear
from ._isotonic import isotonic_regression
from ._constraints import (NonlinearConstraint,
                           LinearConstraint,
                           Bounds)
from ._hessian_update_strategy import HessianUpdateStrategy, BFGS, SR1
from ._shgo import shgo
from ._dual_annealing import dual_annealing
from ._qap import quadratic_assignment
from ._direct_py import direct
from ._milp import milp

# Deprecated namespaces, to be removed in v2.0.0
from . import (
    cobyla, lbfgsb, linesearch, minpack, minpack2, moduleTNC, nonlin, optimize,
    slsqp, tnc, zeros
)

__all__ = [s for s in dir() if not s.startswith('_')]

from scipy._lib._testutils import PytestTester
test = PytestTester(__name__)
del PytestTester
|
||||
@ -0,0 +1,753 @@
"""
basinhopping: The basinhopping global optimization algorithm
"""
import numpy as np
import math
import inspect
import scipy.optimize
from scipy._lib._util import check_random_state

__all__ = ['basinhopping']


_params = (inspect.Parameter('res_new', kind=inspect.Parameter.KEYWORD_ONLY),
           inspect.Parameter('res_old', kind=inspect.Parameter.KEYWORD_ONLY))
_new_accept_test_signature = inspect.Signature(parameters=_params)


class Storage:
    """
    Class used to store the lowest energy structure
    """
    def __init__(self, minres):
        self._add(minres)

    def _add(self, minres):
        self.minres = minres
        self.minres.x = np.copy(minres.x)

    def update(self, minres):
        if minres.success and (minres.fun < self.minres.fun
                               or not self.minres.success):
            self._add(minres)
            return True
        else:
            return False

    def get_lowest(self):
        return self.minres


class BasinHoppingRunner:
    """This class implements the core of the basinhopping algorithm.

    x0 : ndarray
        The starting coordinates.
    minimizer : callable
        The local minimizer, with signature ``result = minimizer(x)``.
        The return value is an `optimize.OptimizeResult` object.
    step_taking : callable
        This function displaces the coordinates randomly. Signature should
        be ``x_new = step_taking(x)``. Note that `x` may be modified in-place.
    accept_tests : list of callables
        Each test is passed the kwargs `f_new`, `x_new`, `f_old` and
        `x_old` (or `res_new` and `res_old` if its signature consists of
        exactly those two keyword-only parameters). These tests will be used
        to judge whether or not to accept the step. The acceptable return
        values are True, False, or ``"force accept"``. If any of the tests
        return False then the step is rejected.
        If ``"force accept"``, then this will override any other tests in
        order to accept the step. This can be used, for example, to forcefully
        escape from a local minimum that ``basinhopping`` is trapped in.
    disp : bool, optional
        Display status messages.

    """
    def __init__(self, x0, minimizer, step_taking, accept_tests, disp=False):
        self.x = np.copy(x0)
        self.minimizer = minimizer
        self.step_taking = step_taking
        self.accept_tests = accept_tests
        self.disp = disp

        self.nstep = 0

        # initialize return object
        self.res = scipy.optimize.OptimizeResult()
        self.res.minimization_failures = 0

        # do initial minimization
        minres = minimizer(self.x)
        if not minres.success:
            self.res.minimization_failures += 1
            if self.disp:
                print("warning: basinhopping: local minimization failure")
        self.x = np.copy(minres.x)
        self.energy = minres.fun
        self.incumbent_minres = minres  # best minimize result found so far
        if self.disp:
            print("basinhopping step %d: f %g" % (self.nstep, self.energy))

        # initialize storage class
        self.storage = Storage(minres)

        if hasattr(minres, "nfev"):
            self.res.nfev = minres.nfev
        if hasattr(minres, "njev"):
            self.res.njev = minres.njev
        if hasattr(minres, "nhev"):
            self.res.nhev = minres.nhev

    def _monte_carlo_step(self):
        """Do one Monte Carlo iteration

        Randomly displace the coordinates, minimize, and decide whether
        or not to accept the new coordinates.
        """
        # Take a random step.  Make a copy of x because the step_taking
        # algorithm might change x in place
        x_after_step = np.copy(self.x)
        x_after_step = self.step_taking(x_after_step)

        # do a local minimization
        minres = self.minimizer(x_after_step)
        x_after_quench = minres.x
        energy_after_quench = minres.fun
        if not minres.success:
            self.res.minimization_failures += 1
            if self.disp:
                print("warning: basinhopping: local minimization failure")
        if hasattr(minres, "nfev"):
            self.res.nfev += minres.nfev
        if hasattr(minres, "njev"):
            self.res.njev += minres.njev
        if hasattr(minres, "nhev"):
            self.res.nhev += minres.nhev

        # accept the move based on self.accept_tests. If any test is False,
        # then reject the step.  If any test returns the special string
        # 'force accept', then accept the step regardless. This can be used
        # to forcefully escape from a local minimum if normal basin hopping
        # steps are not sufficient.
        accept = True
        for test in self.accept_tests:
            if inspect.signature(test) == _new_accept_test_signature:
                testres = test(res_new=minres, res_old=self.incumbent_minres)
            else:
                testres = test(f_new=energy_after_quench, x_new=x_after_quench,
                               f_old=self.energy, x_old=self.x)

            if testres == 'force accept':
                accept = True
                break
            elif testres is None:
                raise ValueError("accept_tests must return True, False, or "
                                 "'force accept'")
            elif not testres:
                accept = False

        # Report the result of the acceptance test to the take step class.
        # This is for adaptive step taking
        if hasattr(self.step_taking, "report"):
            self.step_taking.report(accept, f_new=energy_after_quench,
                                    x_new=x_after_quench, f_old=self.energy,
                                    x_old=self.x)

        return accept, minres

    def one_cycle(self):
        """Do one cycle of the basinhopping algorithm
        """
        self.nstep += 1
        new_global_min = False

        accept, minres = self._monte_carlo_step()

        if accept:
            self.energy = minres.fun
            self.x = np.copy(minres.x)
            self.incumbent_minres = minres  # best minimize result found so far
            new_global_min = self.storage.update(minres)

        # print some information
        if self.disp:
            self.print_report(minres.fun, accept)
            if new_global_min:
                print("found new global minimum on step %d with function"
                      " value %g" % (self.nstep, self.energy))

        # save some variables as BasinHoppingRunner attributes
        self.xtrial = minres.x
        self.energy_trial = minres.fun
        self.accept = accept

        return new_global_min

    def print_report(self, energy_trial, accept):
        """print a status update"""
        minres = self.storage.get_lowest()
        print("basinhopping step %d: f %g trial_f %g accepted %d "
              " lowest_f %g" % (self.nstep, self.energy, energy_trial,
                                accept, minres.fun))
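
The acceptance tests handled by the runner above are supplied by users through
the `accept_test` argument of `basinhopping`. The following is a minimal
sketch of a custom test that rejects steps leaving a box; the objective, the
bounds, and the helper name `within_bounds` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import basinhopping

def func(x):
    # Illustrative 1-D objective with many local minima.
    return np.cos(14.5 * x[0] - 0.3) + (x[0] + 0.2) * x[0]

def within_bounds(**kwargs):
    # Hypothetical acceptance test: called with f_new, x_new, f_old, x_old.
    x = kwargs["x_new"]
    return bool(np.all(x >= -1.1) and np.all(x <= 1.1))

res = basinhopping(func, x0=[0.0], niter=50,
                   minimizer_kwargs={"method": "BFGS"},
                   accept_test=within_bounds)

print(res.x, res.fun)
```

Because the test returns False for any trial point outside the box, such
steps are rejected and never replace the stored lowest result.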


class AdaptiveStepsize:
    """
    Class to implement adaptive stepsize.

    This class wraps the step taking class and modifies the stepsize to
    ensure the true acceptance rate is as close as possible to the target.

    Parameters
    ----------
    takestep : callable
        The step taking routine.  Must contain modifiable attribute
        takestep.stepsize
    accept_rate : float, optional
        The target step acceptance rate
    interval : int, optional
        Interval for how often to update the stepsize
    factor : float, optional
        The step size is multiplied or divided by this factor upon each
        update.
    verbose : bool, optional
        Print information about each update

    """
    def __init__(self, takestep, accept_rate=0.5, interval=50, factor=0.9,
                 verbose=True):
        self.takestep = takestep
        self.target_accept_rate = accept_rate
        self.interval = interval
        self.factor = factor
        self.verbose = verbose

        self.nstep = 0
        self.nstep_tot = 0
        self.naccept = 0

    def __call__(self, x):
        return self.take_step(x)

    def _adjust_step_size(self):
        old_stepsize = self.takestep.stepsize
        accept_rate = float(self.naccept) / self.nstep
        if accept_rate > self.target_accept_rate:
            # We're accepting too many steps. This generally means we're
            # trapped in a basin. Take bigger steps.
            self.takestep.stepsize /= self.factor
        else:
            # We're not accepting enough steps. Take smaller steps.
            self.takestep.stepsize *= self.factor
        if self.verbose:
            print(f"adaptive stepsize: acceptance rate {accept_rate:f} target "
                  f"{self.target_accept_rate:f} new stepsize "
                  f"{self.takestep.stepsize:g} old stepsize {old_stepsize:g}")

    def take_step(self, x):
        self.nstep += 1
        self.nstep_tot += 1
        if self.nstep % self.interval == 0:
            self._adjust_step_size()
        return self.takestep(x)

    def report(self, accept, **kwargs):
        "called by basinhopping to report the result of the step"
        if accept:
            self.naccept += 1
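
The update rule used by the class above can be isolated in a few lines. The
function below is a hypothetical standalone helper written for illustration;
it is not part of SciPy and only mirrors the arithmetic of
`_adjust_step_size`.

```python
def adjust_stepsize(stepsize, naccept, nstep, target=0.5, factor=0.9):
    """Hypothetical helper mirroring the adaptive step-size update rule."""
    accept_rate = naccept / nstep
    if accept_rate > target:
        # Too many accepted steps: enlarge the step to leave the basin.
        return stepsize / factor
    # Too few accepted steps: shrink the step.
    return stepsize * factor

bigger = adjust_stepsize(0.5, naccept=40, nstep=50)   # 0.5 / 0.9
smaller = adjust_stepsize(0.5, naccept=10, nstep=50)  # 0.5 * 0.9
print(bigger, smaller)
```

With a factor below 1, dividing grows the step and multiplying shrinks it, so
the acceptance rate is nudged toward the target after every interval.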


class RandomDisplacement:
    """Add a random displacement of maximum size `stepsize` to each coordinate.

    Calling this updates `x` in-place.

    Parameters
    ----------
    stepsize : float, optional
        Maximum stepsize in any dimension
    random_gen : {None, int, `numpy.random.Generator`,
                  `numpy.random.RandomState`}, optional

        If `random_gen` is None (or `np.random`), the
        `numpy.random.RandomState` singleton is used.
        If `random_gen` is an int, a new ``RandomState`` instance is used,
        seeded with `random_gen`.
        If `random_gen` is already a ``Generator`` or ``RandomState``
        instance then that instance is used.

    """

    def __init__(self, stepsize=0.5, random_gen=None):
        self.stepsize = stepsize
        self.random_gen = check_random_state(random_gen)

    def __call__(self, x):
        x += self.random_gen.uniform(-self.stepsize, self.stepsize,
                                     np.shape(x))
        return x


class MinimizerWrapper:
    """
    wrap a minimizer function as a minimizer class
    """
    def __init__(self, minimizer, func=None, **kwargs):
        self.minimizer = minimizer
        self.func = func
        self.kwargs = kwargs

    def __call__(self, x0):
        if self.func is None:
            return self.minimizer(x0, **self.kwargs)
        else:
            return self.minimizer(self.func, x0, **self.kwargs)


class Metropolis:
    """Metropolis acceptance criterion.

    Parameters
    ----------
    T : float
        The "temperature" parameter for the accept or reject criterion.
    random_gen : {None, int, `numpy.random.Generator`,
                  `numpy.random.RandomState`}, optional

        If `random_gen` is None (or `np.random`), the
        `numpy.random.RandomState` singleton is used.
        If `random_gen` is an int, a new ``RandomState`` instance is used,
        seeded with `random_gen`.
        If `random_gen` is already a ``Generator`` or ``RandomState``
        instance then that instance is used.
        Random number generator used for acceptance test.

    """

    def __init__(self, T, random_gen=None):
        # Avoid ZeroDivisionError since "MBH can be regarded as a special case
        # of the BH framework with the Metropolis criterion, where temperature
        # T = 0." (Reject all steps that increase energy.)
        self.beta = 1.0 / T if T != 0 else float('inf')
        self.random_gen = check_random_state(random_gen)

    def accept_reject(self, res_new, res_old):
        """
        Assuming the local search underlying res_new was successful:
        If new energy is lower than old, it will always be accepted.
        If new is higher than old, there is a chance it will be accepted,
        less likely for larger differences.
        """
        with np.errstate(invalid='ignore'):
            # The energy values being fed to Metropolis are 1-length arrays, and if
            # they are equal, their difference is 0, which gets multiplied by beta,
            # which is inf, and array([0]) * float('inf') causes
            #
            # RuntimeWarning: invalid value encountered in multiply
            #
            # Ignore this warning so when the algorithm is on a flat plane, it always
            # accepts the step, to try to move off the plane.
            prod = -(res_new.fun - res_old.fun) * self.beta
            w = math.exp(min(0, prod))

        rand = self.random_gen.uniform()
        return w >= rand and (res_new.success or not res_old.success)

    def __call__(self, *, res_new, res_old):
        """
        res_new and res_old are mandatory keyword-only arguments
        """
        return bool(self.accept_reject(res_new, res_old))
|
||||
|
||||
|
||||
def basinhopping(func, x0, niter=100, T=1.0, stepsize=0.5,
|
||||
minimizer_kwargs=None, take_step=None, accept_test=None,
|
||||
callback=None, interval=50, disp=False, niter_success=None,
|
||||
seed=None, *, target_accept_rate=0.5, stepwise_factor=0.9):
|
||||
"""Find the global minimum of a function using the basin-hopping algorithm.
|
||||
|
||||
Basin-hopping is a two-phase method that combines a global stepping
|
||||
algorithm with local minimization at each step. Designed to mimic
|
||||
the natural process of energy minimization of clusters of atoms, it works
|
||||
well for similar problems with "funnel-like, but rugged" energy landscapes
|
||||
[5]_.
|
||||
|
||||
As the step-taking, step acceptance, and minimization methods are all
|
||||
customizable, this function can also be used to implement other two-phase
|
||||
methods.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable ``f(x, *args)``
|
||||
Function to be optimized. ``args`` can be passed as an optional item
|
||||
in the dict `minimizer_kwargs`.
|
||||
x0 : array_like
|
||||
Initial guess.
|
||||
niter : integer, optional
|
||||
The number of basin-hopping iterations. There will be a total of
|
||||
``niter + 1`` runs of the local minimizer.
|
||||
T : float, optional
|
||||
The "temperature" parameter for the acceptance or rejection criterion.
|
||||
Higher "temperatures" mean that larger jumps in function value will be
|
||||
accepted. For best results `T` should be comparable to the
|
||||
separation (in function value) between local minima.
|
||||
stepsize : float, optional
|
||||
Maximum step size for use in the random displacement.
|
||||
minimizer_kwargs : dict, optional
|
||||
Extra keyword arguments to be passed to the local minimizer
|
||||
`scipy.optimize.minimize`. Some important options could be:
|
||||
|
||||
method : str
|
||||
The minimization method (e.g. ``"L-BFGS-B"``)
|
||||
args : tuple
|
||||
Extra arguments passed to the objective function (`func`) and
|
||||
its derivatives (Jacobian, Hessian).
|
||||
|
||||
take_step : callable ``take_step(x)``, optional
|
||||
Replace the default step-taking routine with this routine. The default
|
||||
step-taking routine is a random displacement of the coordinates, but
|
||||
other step-taking algorithms may be better for some systems.
|
||||
`take_step` can optionally have the attribute ``take_step.stepsize``.
|
||||
If this attribute exists, then `basinhopping` will adjust
|
||||
``take_step.stepsize`` in order to try to optimize the global minimum
|
||||
search.
|
||||
accept_test : callable, ``accept_test(f_new=f_new, x_new=x_new, f_old=f_old, x_old=x_old)``, optional
|
||||
Define a test which will be used to judge whether to accept the
|
||||
step. This will be used in addition to the Metropolis test based on
|
||||
"temperature" `T`. The acceptable return values are True,
|
||||
False, or ``"force accept"``. If any of the tests return False
|
||||
then the step is rejected. If ``"force accept"`` is returned, it overrides any
|
||||
other tests in order to accept the step. This can be used, for example,
|
||||
to forcefully escape from a local minimum that `basinhopping` is
|
||||
trapped in.
|
||||
callback : callable, ``callback(x, f, accept)``, optional
|
||||
A callback function which will be called for all minima found. ``x``
|
||||
and ``f`` are the coordinates and function value of the trial minimum,
|
||||
and ``accept`` is whether that minimum was accepted. This can
|
||||
be used, for example, to save the lowest N minima found. Also,
|
||||
`callback` can be used to specify a user defined stop criterion by
|
||||
optionally returning True to stop the `basinhopping` routine.
|
||||
interval : integer, optional
|
||||
Interval, in iterations, at which the `stepsize` is updated.
|
||||
disp : bool, optional
|
||||
Set to True to print status messages.
|
||||
niter_success : integer, optional
|
||||
Stop the run if the global minimum candidate remains the same for this
|
||||
number of iterations.
|
||||
seed : {None, int, `numpy.random.Generator`, `numpy.random.RandomState`}, optional
|
||||
|
||||
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
|
||||
singleton is used.
|
||||
If `seed` is an int, a new ``RandomState`` instance is used,
|
||||
seeded with `seed`.
|
||||
If `seed` is already a ``Generator`` or ``RandomState`` instance then
|
||||
that instance is used.
|
||||
Specify `seed` for repeatable minimizations. The random numbers
|
||||
generated with this seed only affect the default Metropolis
|
||||
`accept_test` and the default `take_step`. If you supply your own
|
||||
`take_step` and `accept_test`, and these functions use random
|
||||
number generation, then those functions are responsible for the state
|
||||
of their random number generator.
|
||||
target_accept_rate : float, optional
|
||||
The target acceptance rate that is used to adjust the `stepsize`.
|
||||
If the current acceptance rate is greater than the target,
|
||||
then the `stepsize` is increased. Otherwise, it is decreased.
|
||||
Range is (0, 1). Default is 0.5.
|
||||
|
||||
.. versionadded:: 1.8.0
|
||||
|
||||
stepwise_factor : float, optional
|
||||
The `stepsize` is multiplied or divided by this stepwise factor upon
|
||||
each update. Range is (0, 1). Default is 0.9.
|
||||
|
||||
.. versionadded:: 1.8.0
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
The optimization result represented as an `OptimizeResult` object.
|
||||
Important attributes are: ``x`` the solution array, ``fun`` the value
|
||||
of the function at the solution, and ``message`` which describes the
|
||||
cause of the termination. The ``OptimizeResult`` object returned by the
|
||||
selected minimizer at the lowest minimum is also contained within this
|
||||
object and can be accessed through the ``lowest_optimization_result``
|
||||
attribute. See `OptimizeResult` for a description of other attributes.
|
||||
|
||||
See Also
|
||||
--------
|
||||
minimize :
|
||||
The local minimization function called once for each basinhopping step.
|
||||
`minimizer_kwargs` is passed to this routine.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Basin-hopping is a stochastic algorithm which attempts to find the global
|
||||
minimum of a smooth scalar function of one or more variables [1]_ [2]_ [3]_
|
||||
[4]_. The algorithm in its current form was described by David Wales and
|
||||
Jonathan Doye [2]_ http://www-wales.ch.cam.ac.uk/.
|
||||
|
||||
The algorithm is iterative with each cycle composed of the following
|
||||
features
|
||||
|
||||
1) random perturbation of the coordinates
|
||||
|
||||
2) local minimization
|
||||
|
||||
3) accept or reject the new coordinates based on the minimized function
|
||||
value
|
||||
|
||||
The acceptance test used here is the Metropolis criterion of standard Monte
|
||||
Carlo algorithms, although there are many other possibilities [3]_.
|
||||
|
||||
This global minimization method has been shown to be extremely efficient
|
||||
for a wide variety of problems in physics and chemistry. It is
|
||||
particularly useful when the function has many minima separated by large
|
||||
barriers. See the `Cambridge Cluster Database
|
||||
<https://www-wales.ch.cam.ac.uk/CCD.html>`_ for databases of molecular
|
||||
systems that have been optimized primarily using basin-hopping. This
|
||||
database includes minimization problems exceeding 300 degrees of freedom.
|
||||
|
||||
See the free software program `GMIN <https://www-wales.ch.cam.ac.uk/GMIN>`_
|
||||
for a Fortran implementation of basin-hopping. This implementation has many
|
||||
variations of the procedure described above, including more
|
||||
advanced step-taking algorithms and alternative acceptance criteria.
|
||||
|
||||
For stochastic global optimization there is no way to determine if the true
|
||||
global minimum has actually been found. Instead, as a consistency check,
|
||||
the algorithm can be run from a number of different random starting points
|
||||
to ensure the lowest minimum found in each example has converged to the
|
||||
global minimum. For this reason, `basinhopping` will by default simply
|
||||
run for the number of iterations `niter` and return the lowest minimum
|
||||
found. It is left to the user to ensure that this is in fact the global
|
||||
minimum.
|
||||
|
||||
Choosing `stepsize`: This is a crucial parameter in `basinhopping` and
|
||||
depends on the problem being solved. The step is chosen uniformly in the
|
||||
region from x0-stepsize to x0+stepsize, in each dimension. Ideally, it
|
||||
should be comparable to the typical separation (in argument values) between
|
||||
local minima of the function being optimized. `basinhopping` will, by
|
||||
default, adjust `stepsize` to find an optimal value, but this may take
|
||||
many iterations. You will get quicker results if you set a sensible
|
||||
initial value for ``stepsize``.
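The adjustment described above can be sketched in a few lines. This is only an illustration that assumes the rule stated for `target_accept_rate` and `stepwise_factor` (increase the step when the acceptance rate exceeds the target, otherwise decrease it). The helper name `adapt_stepsize` and its arguments are hypothetical and not part of SciPy's API.

```python
def adapt_stepsize(stepsize, n_accepted, n_steps,
                   target_accept_rate=0.5, stepwise_factor=0.9):
    """Return an updated step size (hypothetical helper, for illustration)."""
    accept_rate = n_accepted / n_steps
    if accept_rate > target_accept_rate:
        # Too many steps accepted: the search is probably staying inside
        # one basin, so take larger steps. Dividing by a factor in (0, 1)
        # increases the step size.
        return stepsize / stepwise_factor
    # Too few steps accepted: take smaller steps.
    return stepsize * stepwise_factor


larger = adapt_stepsize(0.5, n_accepted=40, n_steps=50)
smaller = adapt_stepsize(0.5, n_accepted=10, n_steps=50)
```

Because the factor lies in (0, 1), a poor initial `stepsize` needs many updates to reach a useful value, which is why a sensible starting value saves iterations.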
|
||||
|
||||
Choosing `T`: The parameter `T` is the "temperature" used in the
|
||||
Metropolis criterion. Basinhopping steps are always accepted if
|
||||
``func(xnew) < func(xold)``. Otherwise, they are accepted with
|
||||
probability::
|
||||
|
||||
exp( -(func(xnew) - func(xold)) / T )
|
||||
|
||||
So, for best results, `T` should be comparable to the typical
|
||||
difference (in function values) between local minima. (The height of
|
||||
"walls" between local minima is irrelevant.)
|
||||
|
||||
If `T` is 0, the algorithm becomes Monotonic Basin-Hopping, in which all
|
||||
steps that increase energy are rejected.
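As a rough illustration of the criterion described above, the following standalone sketch evaluates the acceptance rule for scalar function values. It is a simplification under stated assumptions, not the implementation used by `basinhopping`: the function name `metropolis_accept` is hypothetical, and it draws from Python's `random` module instead of a NumPy generator.

```python
import math
import random


def metropolis_accept(f_new, f_old, T, rng=random):
    """Return True if a step from f_old to f_new passes the test.

    Hypothetical helper, for illustration only.
    """
    diff = f_new - f_old
    if diff <= 0:
        # Downhill and flat steps are always accepted. Handling them first
        # also avoids computing 0 * inf when T == 0.
        return True
    # T == 0 gives beta = inf, so every uphill step is rejected
    # (monotonic basin-hopping).
    beta = 1.0 / T if T != 0 else float("inf")
    # Uphill steps are accepted with probability exp(-diff / T).
    return rng.random() < math.exp(-diff * beta)


rng = random.Random(0)
trials = 10000
rate = sum(metropolis_accept(2.0, 1.0, 1.0, rng) for _ in range(trials)) / trials
```

With a difference of 1 and ``T = 1``, the observed acceptance rate should be close to ``exp(-1)``, about 0.37.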
|
||||
|
||||
.. versionadded:: 0.12.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Wales, David J. 2003, Energy Landscapes, Cambridge University Press,
|
||||
Cambridge, UK.
|
||||
.. [2] Wales, D J, and Doye J P K, Global Optimization by Basin-Hopping and
|
||||
the Lowest Energy Structures of Lennard-Jones Clusters Containing up to
|
||||
110 Atoms. Journal of Physical Chemistry A, 1997, 101, 5111.
|
||||
.. [3] Li, Z. and Scheraga, H. A., Monte Carlo-minimization approach to the
|
||||
multiple-minima problem in protein folding, Proc. Natl. Acad. Sci. USA,
|
||||
1987, 84, 6611.
|
||||
.. [4] Wales, D. J. and Scheraga, H. A., Global optimization of clusters,
|
||||
crystals, and biomolecules, Science, 1999, 285, 1368.
|
||||
.. [5] Olson, B., Hashmi, I., Molloy, K., and Shehu, A., Basin Hopping as
|
||||
a General and Versatile Optimization Framework for the Characterization
|
||||
of Biological Macromolecules, Advances in Artificial Intelligence,
|
||||
Volume 2012 (2012), Article ID 674832, :doi:`10.1155/2012/674832`
|
||||
|
||||
Examples
|
||||
--------
|
||||
The following example is a 1-D minimization problem, with many
|
||||
local minima superimposed on a parabola.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize import basinhopping
|
||||
>>> func = lambda x: np.cos(14.5 * x - 0.3) + (x + 0.2) * x
|
||||
>>> x0 = [1.]
|
||||
|
||||
Basinhopping, internally, uses a local minimization algorithm. We will use
|
||||
the parameter `minimizer_kwargs` to tell basinhopping which algorithm to
|
||||
use and how to set up that minimizer. This parameter will be passed to
|
||||
`scipy.optimize.minimize`.
|
||||
|
||||
>>> minimizer_kwargs = {"method": "BFGS"}
|
||||
>>> ret = basinhopping(func, x0, minimizer_kwargs=minimizer_kwargs,
|
||||
... niter=200)
|
||||
>>> # the global minimum is:
|
||||
>>> ret.x, ret.fun
|
||||
-0.1951, -1.0009
|
||||
|
||||
Next consider a 2-D minimization problem. Also, this time, we
|
||||
will use gradient information to significantly speed up the search.
|
||||
|
||||
>>> def func2d(x):
|
||||
... f = np.cos(14.5 * x[0] - 0.3) + (x[1] + 0.2) * x[1] + (x[0] +
|
||||
... 0.2) * x[0]
|
||||
... df = np.zeros(2)
|
||||
... df[0] = -14.5 * np.sin(14.5 * x[0] - 0.3) + 2. * x[0] + 0.2
|
||||
... df[1] = 2. * x[1] + 0.2
|
||||
... return f, df
|
||||
|
||||
We'll also use a different local minimization algorithm. Also, we must tell
|
||||
the minimizer that our function returns both energy and gradient (Jacobian).
|
||||
|
||||
>>> minimizer_kwargs = {"method":"L-BFGS-B", "jac":True}
|
||||
>>> x0 = [1.0, 1.0]
|
||||
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
|
||||
... niter=200)
|
||||
>>> print("global minimum: x = [%.4f, %.4f], f(x) = %.4f" % (ret.x[0],
|
||||
... ret.x[1],
|
||||
... ret.fun))
|
||||
global minimum: x = [-0.1951, -0.1000], f(x) = -1.0109
|
||||
|
||||
Here is an example using a custom step-taking routine. Imagine you want
|
||||
the first coordinate to take larger steps than the rest of the coordinates.
|
||||
This can be implemented like so:
|
||||
|
||||
>>> class MyTakeStep:
|
||||
... def __init__(self, stepsize=0.5):
|
||||
... self.stepsize = stepsize
|
||||
... self.rng = np.random.default_rng()
|
||||
... def __call__(self, x):
|
||||
... s = self.stepsize
|
||||
... x[0] += self.rng.uniform(-2.*s, 2.*s)
|
||||
... x[1:] += self.rng.uniform(-s, s, x[1:].shape)
|
||||
... return x
|
||||
|
||||
Since ``MyTakeStep.stepsize`` exists, basinhopping will adjust the magnitude
|
||||
of `stepsize` to optimize the search. We'll use the same 2-D function as
|
||||
before:
|
||||
|
||||
>>> mytakestep = MyTakeStep()
|
||||
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
|
||||
... niter=200, take_step=mytakestep)
|
||||
>>> print("global minimum: x = [%.4f, %.4f], f(x) = %.4f" % (ret.x[0],
|
||||
... ret.x[1],
|
||||
... ret.fun))
|
||||
global minimum: x = [-0.1951, -0.1000], f(x) = -1.0109
|
||||
|
||||
Now, let's do an example using a custom callback function which prints the
|
||||
value of every minimum found.
|
||||
|
||||
>>> def print_fun(x, f, accepted):
|
||||
... print("at minimum %.4f accepted %d" % (f, int(accepted)))
|
||||
|
||||
We'll run it for only 10 basinhopping steps this time.
|
||||
|
||||
>>> rng = np.random.default_rng()
|
||||
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
|
||||
... niter=10, callback=print_fun, seed=rng)
|
||||
at minimum 0.4159 accepted 1
|
||||
at minimum -0.4317 accepted 1
|
||||
at minimum -1.0109 accepted 1
|
||||
at minimum -0.9073 accepted 1
|
||||
at minimum -0.4317 accepted 0
|
||||
at minimum -0.1021 accepted 1
|
||||
at minimum -0.7425 accepted 1
|
||||
at minimum -0.9073 accepted 1
|
||||
at minimum -0.4317 accepted 0
|
||||
at minimum -0.7425 accepted 1
|
||||
at minimum -0.9073 accepted 1
|
||||
|
||||
The minimum at -1.0109 is actually the global minimum, found already on the
2nd iteration (the first line of output reports the initial minimization).
|
||||
|
||||
""" # numpy/numpydoc#87 # noqa: E501
|
||||
if target_accept_rate <= 0. or target_accept_rate >= 1.:
|
||||
raise ValueError('target_accept_rate has to be in range (0, 1)')
|
||||
if stepwise_factor <= 0. or stepwise_factor >= 1.:
|
||||
raise ValueError('stepwise_factor has to be in range (0, 1)')
|
||||
|
||||
x0 = np.array(x0)
|
||||
|
||||
# set up the np.random generator
|
||||
rng = check_random_state(seed)
|
||||
|
||||
# set up minimizer
|
||||
if minimizer_kwargs is None:
|
||||
minimizer_kwargs = dict()
|
||||
wrapped_minimizer = MinimizerWrapper(scipy.optimize.minimize, func,
|
||||
**minimizer_kwargs)
|
||||
|
||||
# set up step-taking algorithm
|
||||
if take_step is not None:
|
||||
if not callable(take_step):
|
||||
raise TypeError("take_step must be callable")
|
||||
# if take_step.stepsize exists then use AdaptiveStepsize to control
|
||||
# take_step.stepsize
|
||||
if hasattr(take_step, "stepsize"):
|
||||
take_step_wrapped = AdaptiveStepsize(
|
||||
take_step, interval=interval,
|
||||
accept_rate=target_accept_rate,
|
||||
factor=stepwise_factor,
|
||||
verbose=disp)
|
||||
else:
|
||||
take_step_wrapped = take_step
|
||||
else:
|
||||
# use default
|
||||
displace = RandomDisplacement(stepsize=stepsize, random_gen=rng)
|
||||
take_step_wrapped = AdaptiveStepsize(displace, interval=interval,
|
||||
accept_rate=target_accept_rate,
|
||||
factor=stepwise_factor,
|
||||
verbose=disp)
|
||||
|
||||
# set up accept tests
|
||||
accept_tests = []
|
||||
if accept_test is not None:
|
||||
if not callable(accept_test):
|
||||
raise TypeError("accept_test must be callable")
|
||||
accept_tests = [accept_test]
|
||||
|
||||
# use default
|
||||
metropolis = Metropolis(T, random_gen=rng)
|
||||
accept_tests.append(metropolis)
|
||||
|
||||
if niter_success is None:
|
||||
niter_success = niter + 2
|
||||
|
||||
bh = BasinHoppingRunner(x0, wrapped_minimizer, take_step_wrapped,
|
||||
accept_tests, disp=disp)
|
||||
|
||||
# The wrapped minimizer is called once during construction of
|
||||
# BasinHoppingRunner, so run the callback
|
||||
if callable(callback):
|
||||
callback(bh.storage.minres.x, bh.storage.minres.fun, True)
|
||||
|
||||
# start main iteration loop
|
||||
count, i = 0, 0
|
||||
message = ["requested number of basinhopping iterations completed"
|
||||
" successfully"]
|
||||
for i in range(niter):
|
||||
new_global_min = bh.one_cycle()
|
||||
|
||||
if callable(callback):
|
||||
# should we pass a copy of x?
|
||||
val = callback(bh.xtrial, bh.energy_trial, bh.accept)
|
||||
if val is not None:
|
||||
if val:
|
||||
message = ["callback function requested stop early by "
|
||||
"returning True"]
|
||||
break
|
||||
|
||||
count += 1
|
||||
if new_global_min:
|
||||
count = 0
|
||||
elif count > niter_success:
|
||||
message = ["success condition satisfied"]
|
||||
break
|
||||
|
||||
# prepare return object
|
||||
res = bh.res
|
||||
res.lowest_optimization_result = bh.storage.get_lowest()
|
||||
res.x = np.copy(res.lowest_optimization_result.x)
|
||||
res.fun = res.lowest_optimization_result.fun
|
||||
res.message = message
|
||||
res.nit = i + 1
|
||||
res.success = res.lowest_optimization_result.success
|
||||
return res
|
||||
Binary file not shown.
666
venv/lib/python3.12/site-packages/scipy/optimize/_bracket.py
Normal file
@ -0,0 +1,666 @@
|
||||
import numpy as np
|
||||
import scipy._lib._elementwise_iterative_method as eim
|
||||
from scipy._lib._util import _RichResult
|
||||
|
||||
_ELIMITS = -1 # used in _bracket_root
|
||||
_ESTOPONESIDE = 2 # used in _bracket_root
|
||||
|
||||
def _bracket_root_iv(func, xl0, xr0, xmin, xmax, factor, args, maxiter):
|
||||
|
||||
if not callable(func):
|
||||
raise ValueError('`func` must be callable.')
|
||||
|
||||
if not np.iterable(args):
|
||||
args = (args,)
|
||||
|
||||
xl0 = np.asarray(xl0)[()]
|
||||
if not np.issubdtype(xl0.dtype, np.number) or np.iscomplex(xl0).any():
|
||||
raise ValueError('`xl0` must be numeric and real.')
|
||||
|
||||
xr0 = xl0 + 1 if xr0 is None else xr0
|
||||
xmin = -np.inf if xmin is None else xmin
|
||||
xmax = np.inf if xmax is None else xmax
|
||||
factor = 2. if factor is None else factor
|
||||
xl0, xr0, xmin, xmax, factor = np.broadcast_arrays(xl0, xr0, xmin, xmax, factor)
|
||||
|
||||
if not np.issubdtype(xr0.dtype, np.number) or np.iscomplex(xr0).any():
|
||||
raise ValueError('`xr0` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(xmin.dtype, np.number) or np.iscomplex(xmin).any():
|
||||
raise ValueError('`xmin` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(xmax.dtype, np.number) or np.iscomplex(xmax).any():
|
||||
raise ValueError('`xmax` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(factor.dtype, np.number) or np.iscomplex(factor).any():
|
||||
raise ValueError('`factor` must be numeric and real.')
|
||||
if not np.all(factor > 1):
|
||||
raise ValueError('All elements of `factor` must be greater than 1.')
|
||||
|
||||
maxiter = np.asarray(maxiter)
|
||||
message = '`maxiter` must be a non-negative integer.'
|
||||
if (not np.issubdtype(maxiter.dtype, np.number) or maxiter.shape != tuple()
|
||||
or np.iscomplex(maxiter)):
|
||||
raise ValueError(message)
|
||||
maxiter_int = int(maxiter[()])
|
||||
if not maxiter == maxiter_int or maxiter < 0:
|
||||
raise ValueError(message)
|
||||
|
||||
return func, xl0, xr0, xmin, xmax, factor, args, maxiter
|
||||
|
||||
|
||||
def _bracket_root(func, xl0, xr0=None, *, xmin=None, xmax=None, factor=None,
|
||||
args=(), maxiter=1000):
|
||||
"""Bracket the root of a monotonic scalar function of one variable
|
||||
|
||||
This function works elementwise when `xl0`, `xr0`, `xmin`, `xmax`, `factor`, and
|
||||
the elements of `args` are broadcastable arrays.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function for which the root is to be bracketed.
|
||||
The signature must be::
|
||||
|
||||
func(x: ndarray, *args) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real and ``args`` is a tuple,
|
||||
which may contain an arbitrary number of arrays that are broadcastable
|
||||
with `x`. ``func`` must be an elementwise function: each element
|
||||
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
|
||||
xl0, xr0 : float array_like
|
||||
Starting guess of bracket, which need not contain a root. If `xr0` is
|
||||
not provided, ``xr0 = xl0 + 1``. Must be broadcastable with one another.
|
||||
xmin, xmax : float array_like, optional
|
||||
Minimum and maximum allowable endpoints of the bracket, inclusive. Must
|
||||
be broadcastable with `xl0` and `xr0`.
|
||||
factor : float array_like, default: 2
|
||||
The factor used to grow the bracket. See notes for details.
|
||||
args : tuple, optional
|
||||
Additional positional arguments to be passed to `func`. Must be arrays
|
||||
broadcastable with `xl0`, `xr0`, `xmin`, and `xmax`. If the callable to be
|
||||
bracketed requires arguments that are not broadcastable with these
|
||||
arrays, wrap that callable with `func` such that `func` accepts
|
||||
only `x` and broadcastable arrays.
|
||||
maxiter : int, optional
|
||||
The maximum number of iterations of the algorithm to perform.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes. The descriptions are written as though the values will be
|
||||
scalars; however, if `func` returns an array, the outputs will be
|
||||
arrays of the same shape.
|
||||
|
||||
xl, xr : float
|
||||
The lower and upper ends of the bracket, if the algorithm
|
||||
terminated successfully.
|
||||
fl, fr : float
|
||||
The function value at the lower and upper ends of the bracket.
|
||||
nfev : int
|
||||
The number of function evaluations required to find the bracket.
|
||||
This is distinct from the number of times `func` is *called*
|
||||
because the function may be evaluated at multiple points in a single
|
||||
call.
|
||||
nit : int
|
||||
The number of iterations of the algorithm that were performed.
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
|
||||
- ``0`` : The algorithm produced a valid bracket.
|
||||
- ``-1`` : The bracket expanded to the allowable limits without finding a bracket.
|
||||
- ``-2`` : The maximum number of iterations was reached.
|
||||
- ``-3`` : A non-finite value was encountered.
|
||||
- ``-4`` : Iteration was terminated by `callback`.
|
||||
- ``-5`` : The initial bracket does not satisfy ``xmin <= xl0 < xr0 <= xmax``.
|
||||
- ``1`` : The algorithm is proceeding normally (in `callback` only).
|
||||
- ``2`` : A bracket was found in the opposite search direction (in `callback` only).
|
||||
|
||||
success : bool
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
|
||||
Notes
|
||||
-----
|
||||
This function generalizes an algorithm found in pieces throughout
|
||||
`scipy.stats`. The strategy is to iteratively grow the bracket `(l, r)`
|
||||
until ``func(l)`` and ``func(r)`` have opposite signs. The bracket grows to the left as follows.
|
||||
|
||||
- If `xmin` is not provided, the distance between `xl0` and `l` is iteratively
|
||||
increased by `factor`.
|
||||
- If `xmin` is provided, the distance between `xmin` and `l` is iteratively
|
||||
decreased by `factor`. Note that this also *increases* the bracket size.
|
||||
|
||||
Growth of the bracket to the right is analogous.
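The two growth rules can be sketched for a single scalar leftward search as follows. This mirrors the update performed in `pre_func_eval` in this file, where the distance for an unlimited search is measured from the opposite (fixed) initial endpoint. The function name `grow_left` and its `steps` argument are hypothetical and exist only for this illustration.

```python
import math


def grow_left(xl0, xr0, xmin=-math.inf, factor=2.0, steps=3):
    """Return successive left endpoints (hypothetical helper)."""
    points = []
    if math.isinf(xmin):
        # No limit: the signed distance from the fixed end `xr0` to the
        # moving end is multiplied by `factor` on each iteration.
        d = xl0 - xr0
        for _ in range(steps):
            d *= factor
            points.append(xr0 + d)
    else:
        # With a limit: the signed distance from the moving end to `xmin`
        # is divided by `factor`, so the endpoint approaches `xmin`.
        d = xmin - xl0
        for _ in range(steps):
            d /= factor
            points.append(xmin - d)
    return points


unlimited = grow_left(0.0, 1.0)
limited = grow_left(0.0, 1.0, xmin=-10.0)
```

In both cases the left endpoint moves further left on every iteration, but the limited search can never step past `xmin`.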
|
||||
|
||||
Growth of the bracket in one direction stops when the endpoint is no longer
|
||||
finite, the function value at the endpoint is no longer finite, or the
|
||||
endpoint reaches its limiting value (`xmin` or `xmax`). Iteration terminates
|
||||
when the bracket stops growing in both directions, the bracket surrounds
|
||||
the root, or a root is found (accidentally).
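The "bracket surrounds the root, or a root is found" condition amounts to a sign comparison between the function values at the previous and current positions of the moving end. The sketch below restates that check for plain scalars. The helper names `sign` and `brackets_root` are hypothetical, and the vectorized code in this file uses `np.sign` on arrays instead.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def brackets_root(f_last, f):
    """Return True if the two function values enclose or hit a root."""
    s_last, s = sign(f_last), sign(f)
    # Opposite signs mean a root lies between the two points. A zero
    # value means one of the points is itself a root.
    return s_last == -s or s_last == 0 or s == 0


found = brackets_root(-1.0, 2.0)
not_found = brackets_root(1.0, 3.0)
exact_root = brackets_root(0.0, 5.0)
```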
|
||||
|
||||
If two brackets are found - that is, a bracket is found on both sides in
|
||||
the same iteration, the smaller of the two is returned.
|
||||
If roots of the function are found, both `l` and `r` are set to the
|
||||
leftmost root.
|
||||
|
||||
""" # noqa: E501
|
||||
# Todo:
|
||||
# - find bracket with sign change in specified direction
|
||||
# - Add tolerance
|
||||
# - allow factor < 1?
|
||||
|
||||
callback = None # works; I just don't want to test it
|
||||
temp = _bracket_root_iv(func, xl0, xr0, xmin, xmax, factor, args, maxiter)
|
||||
func, xl0, xr0, xmin, xmax, factor, args, maxiter = temp
|
||||
|
||||
xs = (xl0, xr0)
|
||||
temp = eim._initialize(func, xs, args)
|
||||
func, xs, fs, args, shape, dtype, xp = temp # line split for PEP8
|
||||
xl0, xr0 = xs
|
||||
xmin = np.broadcast_to(xmin, shape).astype(dtype, copy=False).ravel()
|
||||
xmax = np.broadcast_to(xmax, shape).astype(dtype, copy=False).ravel()
|
||||
invalid_bracket = ~((xmin <= xl0) & (xl0 < xr0) & (xr0 <= xmax))
|
||||
|
||||
# The approach is to treat the left and right searches as though they were
|
||||
# (almost) totally independent one-sided bracket searches. (The interaction
|
||||
# is considered when checking for termination and preparing the result
|
||||
# object.)
|
||||
# `x` is the "moving" end of the bracket
|
||||
x = np.concatenate(xs)
|
||||
f = np.concatenate(fs)
|
||||
invalid_bracket = np.concatenate((invalid_bracket, invalid_bracket))
|
||||
n = len(x) // 2
|
||||
|
||||
# `x_last` is the previous location of the moving end of the bracket. If
|
||||
# the signs of `f` and `f_last` are different, `x` and `x_last` form a
|
||||
# bracket.
|
||||
x_last = np.concatenate((x[n:], x[:n]))
|
||||
f_last = np.concatenate((f[n:], f[:n]))
|
||||
# `x0` is the "fixed" end of the bracket.
|
||||
x0 = x_last
|
||||
# We don't need to retain the corresponding function value, since the
|
||||
# fixed end of the bracket is only needed to compute the new value of the
|
||||
# moving end; it is never returned.
|
||||
limit = np.concatenate((xmin, xmax))
|
||||
|
||||
factor = np.broadcast_to(factor, shape).astype(dtype, copy=False).ravel()
|
||||
factor = np.concatenate((factor, factor))
|
||||
|
||||
active = np.arange(2*n)
|
||||
args = [np.concatenate((arg, arg)) for arg in args]
|
||||
|
||||
# This is needed due to inner workings of `eim._loop`.
|
||||
# We're abusing it a tiny bit.
|
||||
shape = shape + (2,)
|
||||
|
||||
# `d` is for "distance".
|
||||
# For searches without a limit, the distance between the fixed end of the
|
||||
# bracket `x0` and the moving end `x` will grow by `factor` each iteration.
|
||||
# For searches with a limit, the distance between the `limit` and moving
|
||||
# end of the bracket `x` will shrink by `factor` each iteration.
|
||||
i = np.isinf(limit)
|
||||
ni = ~i
|
||||
d = np.zeros_like(x)
|
||||
d[i] = x[i] - x0[i]
|
||||
d[ni] = limit[ni] - x[ni]
|
||||
|
||||
status = np.full_like(x, eim._EINPROGRESS, dtype=int) # in progress
|
||||
status[invalid_bracket] = eim._EINPUTERR
|
||||
nit, nfev = 0, 1 # one function evaluation per side performed above
|
||||
|
||||
work = _RichResult(x=x, x0=x0, f=f, limit=limit, factor=factor,
|
||||
active=active, d=d, x_last=x_last, f_last=f_last,
|
||||
nit=nit, nfev=nfev, status=status, args=args,
|
||||
xl=None, xr=None, fl=None, fr=None, n=n)
|
||||
res_work_pairs = [('status', 'status'), ('xl', 'xl'), ('xr', 'xr'),
|
||||
('nit', 'nit'), ('nfev', 'nfev'), ('fl', 'fl'),
|
||||
('fr', 'fr'), ('x', 'x'), ('f', 'f'),
|
||||
('x_last', 'x_last'), ('f_last', 'f_last')]
|
||||
|
||||
def pre_func_eval(work):
|
||||
# Initialize moving end of bracket
|
||||
x = np.zeros_like(work.x)
|
||||
|
||||
# Unlimited brackets grow by `factor` by increasing distance from fixed
|
||||
# end to moving end.
|
||||
i = np.isinf(work.limit) # indices of unlimited brackets
|
||||
work.d[i] *= work.factor[i]
|
||||
x[i] = work.x0[i] + work.d[i]
|
||||
|
||||
# Limited brackets grow by decreasing the distance from the limit to
|
||||
# the moving end.
|
||||
ni = ~i # indices of limited brackets
|
||||
work.d[ni] /= work.factor[ni]
|
||||
x[ni] = work.limit[ni] - work.d[ni]
|
||||
|
||||
return x
|
||||
|
||||
def post_func_eval(x, f, work):
|
||||
# Keep track of the previous location of the moving end so that we can
|
||||
# return a narrower bracket. (The alternative is to remember the
|
||||
# original fixed end, but then the bracket would be wider than needed.)
|
||||
work.x_last = work.x
|
||||
work.f_last = work.f
|
||||
work.x = x
|
||||
work.f = f
|
||||
|
||||
def check_termination(work):
|
||||
# Condition 0: initial bracket is invalid
|
||||
stop = (work.status == eim._EINPUTERR)
|
||||
|
||||
# Condition 1: a valid bracket (or the root itself) has been found
|
||||
sf = np.sign(work.f)
|
||||
sf_last = np.sign(work.f_last)
|
||||
i = ((sf_last == -sf) | (sf_last == 0) | (sf == 0)) & ~stop
|
||||
work.status[i] = eim._ECONVERGED
|
||||
stop[i] = True
|
||||
|
||||
# Condition 2: the other side's search found a valid bracket.
|
||||
# (If we just found a bracket with the rightward search, we can stop
|
||||
# the leftward search, and vice-versa.)
|
||||
# To do this, we need to set the status of the other side's search;
|
||||
# this is tricky because `work.status` contains only the *active*
|
||||
# elements, so we don't immediately know the index of the element we
|
||||
# need to set - or even if it's still there. (That search may have
|
||||
# terminated already, e.g. by reaching its `limit`.)
|
||||
# To facilitate this, `work.active` contains a unique integer index of
# each search. Indices `k` (`k < n`) and `k + n` correspond with a
|
||||
# leftward and rightward search, respectively. Elements are removed
|
||||
# from `work.active` just as they are removed from `work.status`, so
|
||||
# we use `work.active` to help find the right location in
|
||||
# `work.status`.
|
||||
# Get the integer indices of the elements that can also stop
|
||||
also_stop = (work.active[i] + work.n) % (2*work.n)
|
||||
# Check whether they are still active.
|
||||
# To start, we need to find out where in `work.active` they would
|
||||
# appear if they are indeed there.
|
||||
j = np.searchsorted(work.active, also_stop)
|
||||
# If the location exceeds the length of the `work.active`, they are
|
||||
# not there.
|
||||
j = j[j < len(work.active)]
|
||||
# Check whether they are still there.
|
||||
j = j[also_stop == work.active[j]]
|
||||
# Now convert these to boolean indices to use with `work.status`.
|
||||
i = np.zeros_like(stop)
|
||||
i[j] = True # boolean indices of elements that can also stop
|
||||
i = i & ~stop
|
||||
work.status[i] = _ESTOPONESIDE
|
||||
stop[i] = True
|
||||
|
||||
# Condition 3: moving end of bracket reaches limit
|
||||
i = (work.x == work.limit) & ~stop
|
||||
work.status[i] = _ELIMITS
|
||||
stop[i] = True
|
||||
|
||||
# Condition 4: non-finite value encountered
|
||||
i = ~(np.isfinite(work.x) & np.isfinite(work.f)) & ~stop
|
||||
work.status[i] = eim._EVALUEERR
|
||||
stop[i] = True
|
||||
|
||||
return stop
|
||||
|
||||
def post_termination_check(work):
|
||||
pass
|
||||
|
||||
def customize_result(res, shape):
|
||||
n = len(res['x']) // 2
|
||||
|
||||
# To avoid ambiguity, below we refer to `xl0`, the initial left endpoint
|
||||
# as `a` and `xr0`, the initial right endpoint, as `b`.
|
||||
# Because we treat the two one-sided searches as though they were
|
||||
# independent, what we keep track of in `work` and what we want to
|
||||
# return in `res` look quite different. Combine the results from the
|
||||
# two one-sided searches before reporting the results to the user.
|
||||
# - "a" refers to the leftward search (the moving end started at `a`)
|
||||
# - "b" refers to the rightward search (the moving end started at `b`)
|
||||
# - "l" refers to the left end of the bracket (closer to -oo)
|
||||
# - "r" refers to the right end of the bracket (closer to +oo)
|
||||
xal = res['x'][:n]
|
||||
xar = res['x_last'][:n]
|
||||
xbl = res['x_last'][n:]
|
||||
xbr = res['x'][n:]
|
||||
|
||||
fal = res['f'][:n]
|
||||
far = res['f_last'][:n]
|
||||
fbl = res['f_last'][n:]
|
||||
fbr = res['f'][n:]
|
||||
|
||||
# Initialize the brackets and corresponding function values to return
|
||||
# to the user. Brackets may not be valid (e.g. there is no root,
|
||||
# there weren't enough iterations, NaN encountered), but we still need
|
||||
# to return something. One option would be all NaNs, but what I've
|
||||
# chosen here is the left- and right-most points at which the function
|
||||
# has been evaluated. This gives the user some information about what
|
||||
# interval of the real line has been searched and shows that there is
|
||||
# no sign change between the two ends.
|
||||
xl = xal.copy()
|
||||
fl = fal.copy()
|
||||
xr = xbr.copy()
|
||||
fr = fbr.copy()
|
||||
|
||||
# `status` indicates whether the bracket is valid or not. If so,
|
||||
# we want to adjust the bracket we return to be the narrowest possible
|
||||
# given the points at which we evaluated the function.
|
||||
# For example if bracket "a" is valid and smaller than bracket "b" OR
|
||||
# if bracket "a" is valid and bracket "b" is not valid, we want to
|
||||
# return bracket "a" (and vice versa).
|
||||
sa = res['status'][:n]
|
||||
sb = res['status'][n:]
|
||||
|
||||
da = xar - xal
|
||||
db = xbr - xbl
|
||||
|
||||
i1 = ((da <= db) & (sa == 0)) | ((sa == 0) & (sb != 0))
|
||||
i2 = ((db <= da) & (sb == 0)) | ((sb == 0) & (sa != 0))
|
||||
|
||||
xr[i1] = xar[i1]
|
||||
fr[i1] = far[i1]
|
||||
xl[i2] = xbl[i2]
|
||||
fl[i2] = fbl[i2]
|
||||
|
||||
# Finish assembling the result object
|
||||
res['xl'] = xl
|
||||
res['xr'] = xr
|
||||
res['fl'] = fl
|
||||
res['fr'] = fr
|
||||
|
||||
res['nit'] = np.maximum(res['nit'][:n], res['nit'][n:])
|
||||
res['nfev'] = res['nfev'][:n] + res['nfev'][n:]
|
||||
# If the status on one side is zero, the status is zero. In any case,
|
||||
# report the status from one side only.
|
||||
res['status'] = np.choose(sa == 0, (sb, sa))
|
||||
res['success'] = (res['status'] == 0)
|
||||
|
||||
del res['x']
|
||||
del res['f']
|
||||
del res['x_last']
|
||||
del res['f_last']
|
||||
|
||||
return shape[:-1]
|
||||
|
||||
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
|
||||
pre_func_eval, post_func_eval, check_termination,
|
||||
post_termination_check, customize_result, res_work_pairs,
|
||||
xp)
|
||||
|
||||
|
||||
def _bracket_minimum_iv(func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter):
|
||||
|
||||
if not callable(func):
|
||||
raise ValueError('`func` must be callable.')
|
||||
|
||||
if not np.iterable(args):
|
||||
args = (args,)
|
||||
|
||||
xm0 = np.asarray(xm0)[()]
|
||||
if not np.issubdtype(xm0.dtype, np.number) or np.iscomplex(xm0).any():
|
||||
raise ValueError('`xm0` must be numeric and real.')
|
||||
|
||||
xmin = -np.inf if xmin is None else xmin
|
||||
xmax = np.inf if xmax is None else xmax
|
||||
|
||||
# If xl0 (xr0) is not supplied, fill with a dummy value for the sake
|
||||
# of broadcasting. We need to wait until xmin (xmax) has been validated
|
||||
# to compute the default values.
|
||||
xl0_not_supplied = False
|
||||
if xl0 is None:
|
||||
xl0 = np.nan
|
||||
xl0_not_supplied = True
|
||||
|
||||
xr0_not_supplied = False
|
||||
if xr0 is None:
|
||||
xr0 = np.nan
|
||||
xr0_not_supplied = True
|
||||
|
||||
factor = 2.0 if factor is None else factor
|
||||
xl0, xm0, xr0, xmin, xmax, factor = np.broadcast_arrays(
|
||||
xl0, xm0, xr0, xmin, xmax, factor
|
||||
)
|
||||
|
||||
if not np.issubdtype(xl0.dtype, np.number) or np.iscomplex(xl0).any():
|
||||
raise ValueError('`xl0` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(xr0.dtype, np.number) or np.iscomplex(xr0).any():
|
||||
raise ValueError('`xr0` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(xmin.dtype, np.number) or np.iscomplex(xmin).any():
|
||||
raise ValueError('`xmin` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(xmax.dtype, np.number) or np.iscomplex(xmax).any():
|
||||
raise ValueError('`xmax` must be numeric and real.')
|
||||
|
||||
if not np.issubdtype(factor.dtype, np.number) or np.iscomplex(factor).any():
|
||||
raise ValueError('`factor` must be numeric and real.')
|
||||
if not np.all(factor > 1):
|
||||
raise ValueError('All elements of `factor` must be greater than 1.')
|
||||
|
||||
# Calculate default values of xl0 and/or xr0 if they have not been supplied
|
||||
# by the user. We need to be careful to ensure xl0 and xr0 are not outside
|
||||
# of (xmin, xmax).
|
||||
if xl0_not_supplied:
|
||||
xl0 = xm0 - np.minimum((xm0 - xmin)/16, 0.5)
|
||||
if xr0_not_supplied:
|
||||
xr0 = xm0 + np.minimum((xmax - xm0)/16, 0.5)
|
||||
|
||||
maxiter = np.asarray(maxiter)
|
||||
message = '`maxiter` must be a non-negative integer.'
|
||||
if (not np.issubdtype(maxiter.dtype, np.number) or maxiter.shape != tuple()
|
||||
or np.iscomplex(maxiter)):
|
||||
raise ValueError(message)
|
||||
maxiter_int = int(maxiter[()])
|
||||
if not maxiter == maxiter_int or maxiter < 0:
|
||||
raise ValueError(message)
|
||||
|
||||
return func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter
|
||||
|
||||
|
||||
def _bracket_minimum(func, xm0, *, xl0=None, xr0=None, xmin=None, xmax=None,
|
||||
factor=None, args=(), maxiter=1000):
|
||||
"""Bracket the minimum of a unimodal scalar function of one variable
|
||||
|
||||
This function works elementwise when `xm0`, `xl0`, `xr0`, `xmin`, `xmax`,
|
||||
and the elements of `args` are broadcastable arrays.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function for which the minimum is to be bracketed.
|
||||
The signature must be::
|
||||
|
||||
func(x: ndarray, *args) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real and ``args`` is a tuple,
|
||||
which may contain an arbitrary number of arrays that are broadcastable
|
||||
with ``x``. `func` must be an elementwise function: each element
|
||||
``func(x)[i]`` must equal ``func(x[i])`` for all indices `i`.
|
||||
xm0: float array_like
|
||||
Starting guess for middle point of bracket.
|
||||
xl0, xr0: float array_like, optional
|
||||
Starting guesses for left and right endpoints of the bracket. Must be
|
||||
broadcastable with one another and with `xm0`.
|
||||
xmin, xmax : float array_like, optional
|
||||
Minimum and maximum allowable endpoints of the bracket, inclusive. Must
|
||||
be broadcastable with `xl0`, `xm0`, and `xr0`.
|
||||
factor : float array_like, optional
|
||||
Controls expansion of bracket endpoint in downhill direction. Works
|
||||
differently in the cases where a limit is set in the downhill direction
|
||||
with `xmax` or `xmin`. See Notes.
|
||||
args : tuple, optional
|
||||
Additional positional arguments to be passed to `func`. Must be arrays
|
||||
broadcastable with `xl0`, `xm0`, `xr0`, `xmin`, and `xmax`. If the
|
||||
callable to be bracketed requires arguments that are not broadcastable
|
||||
with these arrays, wrap that callable with `func` such that `func`
|
||||
accepts only ``x`` and broadcastable arrays.
|
||||
maxiter : int, optional
|
||||
The maximum number of iterations of the algorithm to perform. The number
|
||||
of function evaluations is three greater than the number of iterations.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes. The descriptions are written as though the values will be
|
||||
scalars; however, if `func` returns an array, the outputs will be
|
||||
arrays of the same shape.
|
||||
|
||||
xl, xm, xr : float
|
||||
The left, middle, and right points of the bracket, if the algorithm
|
||||
terminated successfully.
|
||||
fl, fm, fr : float
|
||||
The function value at the left, middle, and right points of the bracket.
|
||||
nfev : int
|
||||
The number of function evaluations required to find the bracket.
|
||||
nit : int
|
||||
The number of iterations of the algorithm that were performed.
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
|
||||
- ``0`` : The algorithm produced a valid bracket.
|
||||
- ``-1`` : The bracket expanded to the allowable limits. Assuming
|
||||
unimodality, this implies the endpoint at the limit is a
|
||||
minimizer.
|
||||
- ``-2`` : The maximum number of iterations was reached.
|
||||
- ``-3`` : A non-finite value was encountered.
|
||||
- ``-4`` : Not used by this function (reserved for termination by a `callback`, which is not exposed here).
|
||||
- ``-5`` : The initial bracket does not satisfy
|
||||
`xmin <= xl0 < xm0 < xr0 <= xmax`.
|
||||
|
||||
success : bool
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
|
||||
Notes
|
||||
-----
|
||||
Similar to `scipy.optimize.bracket`, this function seeks to find real
|
||||
points ``xl < xm < xr`` such that ``f(xl) >= f(xm)`` and ``f(xr) >= f(xm)``,
|
||||
where at least one of the inequalities is strict. Unlike `scipy.optimize.bracket`,
|
||||
this function can operate in a vectorized manner on array input, so long as
|
||||
the input arrays are broadcastable with each other. Also unlike
|
||||
`scipy.optimize.bracket`, users may specify minimum and maximum endpoints
|
||||
for the desired bracket.
|
||||
|
||||
Given an initial trio of points ``xl = xl0``, ``xm = xm0``, ``xr = xr0``,
|
||||
the algorithm checks if these points already give a valid bracket. If not,
|
||||
a new endpoint, ``w``, is chosen in the "downhill" direction, ``xm`` becomes the new
|
||||
opposite endpoint, and either `xl` or `xr` becomes the new middle point,
|
||||
depending on which direction is downhill. The algorithm repeats from here.
|
||||
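As a minimal sketch of the validity test described above (not the code used
by this function; the helper name ``is_valid_bracket`` is hypothetical), a
trio of points brackets a minimum when the middle value is no greater than
either end value and strictly less than at least one of them:

```python
import numpy as np

def is_valid_bracket(fl, fm, fr):
    # Hypothetical helper: elementwise test that f(xl) >= f(xm) and
    # f(xr) >= f(xm), with at least one of the inequalities strict.
    fl, fm, fr = np.broadcast_arrays(fl, fm, fr)
    return ((fl >= fm) & (fr > fm)) | ((fl > fm) & (fr >= fm))

f = lambda x: (x - 1.0)**2
# First trio (0, 1, 2) straddles the minimizer at 1; second trio (2, 3, 4)
# lies entirely on the uphill side, so it is not a bracket.
valid = is_valid_bracket(f(np.array([0.0, 2.0])),
                         f(np.array([1.0, 3.0])),
                         f(np.array([2.0, 4.0])))
```

A completely flat trio is rejected because neither inequality is strict.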
|
||||
The new endpoint `w` is chosen differently depending on whether or not a
|
||||
boundary `xmin` or `xmax` has been set in the downhill direction. Without
|
||||
loss of generality, suppose the downhill direction is to the right, so that
|
||||
``f(xl) > f(xm) > f(xr)``. If there is no boundary to the right, then `w`
|
||||
is chosen to be ``xr0 + factor**k * (xr0 - xm0)`` on iteration ``k``, where `factor` is controlled by
|
||||
the user (defaults to 2.0) so that step sizes increase in geometric proportion.
|
||||
If there is a boundary, `xmax` in this case, then `w` is chosen to be
|
||||
``xmax - (xmax - xr)/factor``, with steps slowing to a stop at
|
||||
`xmax`. This cautious approach ensures that a minimum near but distinct from
|
||||
the boundary isn't missed while also detecting whether or not `xmax` is
|
||||
a minimizer when `xmax` is reached after a finite number of steps.
|
||||
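The two stepping rules can be illustrated with a short sketch that mirrors
the update performed in ``pre_func_eval`` below, assuming the downhill
direction is to the right. The helper name ``endpoint_schedule`` and its
parameter ``n`` are hypothetical and exist only for this illustration:

```python
import numpy as np

def endpoint_schedule(xm0, xr0, factor=2.0, xmax=np.inf, n=4):
    # Hypothetical helper: list the first `n` trial endpoints.
    if np.isinf(xmax):
        # No boundary: the offset from xr0 is multiplied by `factor`
        # on every iteration, so steps grow geometrically.
        step = xr0 - xm0
        return [xr0 + step * factor**k for k in range(1, n + 1)]
    # Boundary at xmax: the remaining gap to xmax is divided by
    # `factor` on every iteration, so steps slow down near the limit.
    gap = xmax - xr0
    return [xmax - gap / factor**k for k in range(1, n + 1)]

unbounded = endpoint_schedule(0.0, 1.0)            # no limit to the right
bounded = endpoint_schedule(0.0, 2.0, xmax=10.0)   # limit at x = 10
```

With ``factor = 2`` the unbounded endpoints move away from ``xr0`` by 2, 4,
8, 16, while the bounded endpoints halve the remaining distance to ``xmax``
each time.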
""" # noqa: E501
|
||||
callback = None # works; I just don't want to test it
|
||||
|
||||
temp = _bracket_minimum_iv(func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter)
|
||||
func, xm0, xl0, xr0, xmin, xmax, factor, args, maxiter = temp
|
||||
|
||||
xs = (xl0, xm0, xr0)
|
||||
temp = eim._initialize(func, xs, args)
|
||||
func, xs, fs, args, shape, dtype, xp = temp
|
||||
|
||||
xl0, xm0, xr0 = xs
|
||||
fl0, fm0, fr0 = fs
|
||||
xmin = np.broadcast_to(xmin, shape).astype(dtype, copy=False).ravel()
|
||||
xmax = np.broadcast_to(xmax, shape).astype(dtype, copy=False).ravel()
|
||||
invalid_bracket = ~((xmin <= xl0) & (xl0 < xm0) & (xm0 < xr0) & (xr0 <= xmax))
|
||||
# We will modify factor later on so make a copy. np.broadcast_to returns
|
||||
# a read-only view.
|
||||
factor = np.broadcast_to(factor, shape).astype(dtype, copy=True).ravel()
|
||||
|
||||
# To simplify the logic, swap xl and xr if f(xl) < f(xr). We should always be
|
||||
# marching downhill in the direction from xl to xr.
|
||||
comp = fl0 < fr0
|
||||
xl0[comp], xr0[comp] = xr0[comp], xl0[comp]
|
||||
fl0[comp], fr0[comp] = fr0[comp], fl0[comp]
|
||||
# We only need the boundary in the direction we're traveling.
|
||||
limit = np.where(comp, xmin, xmax)
|
||||
|
||||
unlimited = np.isinf(limit)
|
||||
limited = ~unlimited
|
||||
step = np.empty_like(xl0)
|
||||
|
||||
step[unlimited] = (xr0[unlimited] - xm0[unlimited])
|
||||
step[limited] = (limit[limited] - xr0[limited])
|
||||
|
||||
# Step size is divided by factor for case where there is a limit.
|
||||
factor[limited] = 1 / factor[limited]
|
||||
|
||||
status = np.full_like(xl0, eim._EINPROGRESS, dtype=int)
|
||||
status[invalid_bracket] = eim._EINPUTERR
|
||||
nit, nfev = 0, 3
|
||||
|
||||
work = _RichResult(xl=xl0, xm=xm0, xr=xr0, xr0=xr0, fl=fl0, fm=fm0, fr=fr0,
|
||||
step=step, limit=limit, limited=limited, factor=factor, nit=nit,
|
||||
nfev=nfev, status=status, args=args)
|
||||
|
||||
res_work_pairs = [('status', 'status'), ('xl', 'xl'), ('xm', 'xm'), ('xr', 'xr'),
|
||||
('nit', 'nit'), ('nfev', 'nfev'), ('fl', 'fl'), ('fm', 'fm'),
|
||||
('fr', 'fr')]
|
||||
|
||||
def pre_func_eval(work):
|
||||
work.step *= work.factor
|
||||
x = np.empty_like(work.xr)
|
||||
x[~work.limited] = work.xr0[~work.limited] + work.step[~work.limited]
|
||||
x[work.limited] = work.limit[work.limited] - work.step[work.limited]
|
||||
# Since the new bracket endpoint is calculated from an offset with the
|
||||
# limit, it may be the case that the new endpoint equals the old endpoint,
|
||||
# when the old endpoint is sufficiently close to the limit. We use the
|
||||
# limit itself as the new endpoint in these cases.
|
||||
x[work.limited] = np.where(
|
||||
x[work.limited] == work.xr[work.limited],
|
||||
work.limit[work.limited],
|
||||
x[work.limited],
|
||||
)
|
||||
return x
|
||||
|
||||
def post_func_eval(x, f, work):
|
||||
work.xl, work.xm, work.xr = work.xm, work.xr, x
|
||||
work.fl, work.fm, work.fr = work.fm, work.fr, f
|
||||
|
||||
def check_termination(work):
|
||||
# Condition 0: Initial bracket is invalid.
|
||||
stop = (work.status == eim._EINPUTERR)
|
||||
|
||||
# Condition 1: A valid bracket has been found.
|
||||
i = (
|
||||
(work.fl >= work.fm) & (work.fr > work.fm)
|
||||
| (work.fl > work.fm) & (work.fr >= work.fm)
|
||||
) & ~stop
|
||||
work.status[i] = eim._ECONVERGED
|
||||
stop[i] = True
|
||||
|
||||
# Condition 2: Moving end of bracket reaches limit.
|
||||
i = (work.xr == work.limit) & ~stop
|
||||
work.status[i] = _ELIMITS
|
||||
stop[i] = True
|
||||
|
||||
# Condition 3: non-finite value encountered
|
||||
i = ~(np.isfinite(work.xr) & np.isfinite(work.fr)) & ~stop
|
||||
work.status[i] = eim._EVALUEERR
|
||||
stop[i] = True
|
||||
|
||||
return stop
|
||||
|
||||
def post_termination_check(work):
|
||||
pass
|
||||
|
||||
def customize_result(res, shape):
|
||||
# Reorder entries of xl and xr if they were swapped due to f(xl0) < f(xr0).
|
||||
comp = res['xl'] > res['xr']
|
||||
res['xl'][comp], res['xr'][comp] = res['xr'][comp], res['xl'][comp]
|
||||
res['fl'][comp], res['fr'][comp] = res['fr'][comp], res['fl'][comp]
|
||||
return shape
|
||||
|
||||
return eim._loop(work, callback, shape,
|
||||
maxiter, func, args, dtype,
|
||||
pre_func_eval, post_func_eval,
|
||||
check_termination, post_termination_check,
|
||||
customize_result, res_work_pairs, xp)
|
||||
@ -0,0 +1,549 @@
|
||||
import math
|
||||
import numpy as np
|
||||
import scipy._lib._elementwise_iterative_method as eim
|
||||
from scipy._lib._util import _RichResult
|
||||
from scipy._lib._array_api import xp_clip, xp_minimum, xp_sign
|
||||
|
||||
# TODO:
|
||||
# - (maybe?) don't use fancy indexing assignment
|
||||
# - figure out how to replace the new `try`/`except`s
|
||||
|
||||
|
||||
def _chandrupatla(func, a, b, *, args=(), xatol=None, xrtol=None,
|
||||
fatol=None, frtol=0, maxiter=None, callback=None):
|
||||
"""Find the root of an elementwise function using Chandrupatla's algorithm.
|
||||
|
||||
For each element of the output of `func`, `_chandrupatla` seeks the scalar
|
||||
root that makes the element 0. This function allows for `a`, `b`, and the
|
||||
output of `func` to be of any broadcastable shapes.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function whose root is desired. The signature must be::
|
||||
|
||||
func(x: ndarray, *args) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real and ``args`` is a tuple,
|
||||
which may contain an arbitrary number of components of any type(s).
|
||||
``func`` must be an elementwise function: each element ``func(x)[i]``
|
||||
must equal ``func(x[i])`` for all indices ``i``. `_chandrupatla`
|
||||
seeks an array ``x`` such that ``func(x)`` is an array of zeros.
|
||||
a, b : array_like
|
||||
The lower and upper bounds of the root of the function. Must be
|
||||
broadcastable with one another.
|
||||
args : tuple, optional
|
||||
Additional positional arguments to be passed to `func`.
|
||||
xatol, xrtol, fatol, frtol : float, optional
|
||||
Absolute and relative tolerances on the root and function value.
|
||||
See Notes for details.
|
||||
maxiter : int, optional
|
||||
The maximum number of iterations of the algorithm to perform.
|
||||
The default is the maximum possible number of bisections within
|
||||
the (normal) floating point numbers of the relevant dtype.
|
||||
callback : callable, optional
|
||||
An optional user-supplied function to be called before the first
|
||||
iteration and after each iteration.
|
||||
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
|
||||
similar to that returned by `_chandrupatla` (but containing the current
|
||||
iterate's values of all variables). If `callback` raises a
|
||||
``StopIteration``, the algorithm will terminate immediately and
|
||||
`_chandrupatla` will return a result.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes. The descriptions are written as though the values will be
|
||||
scalars; however, if `func` returns an array, the outputs will be
|
||||
arrays of the same shape.
|
||||
|
||||
x : float
|
||||
The root of the function, if the algorithm terminated successfully.
|
||||
nfev : int
|
||||
The number of times the function was called to find the root.
|
||||
nit : int
|
||||
The number of iterations of Chandrupatla's algorithm performed.
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
``0`` : The algorithm converged to the specified tolerances.
|
||||
``-1`` : The algorithm encountered an invalid bracket.
|
||||
``-2`` : The maximum number of iterations was reached.
|
||||
``-3`` : A non-finite value was encountered.
|
||||
``-4`` : Iteration was terminated by `callback`.
|
||||
``1`` : The algorithm is proceeding normally (in `callback` only).
|
||||
success : bool
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
fun : float
|
||||
The value of `func` evaluated at `x`.
|
||||
xl, xr : float
|
||||
The lower and upper ends of the bracket.
|
||||
fl, fr : float
|
||||
The function value at the lower and upper ends of the bracket.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Implemented based on Chandrupatla's original paper [1]_.
|
||||
|
||||
If ``xl`` and ``xr`` are the left and right ends of the bracket,
|
||||
``xmin = xl if abs(func(xl)) <= abs(func(xr)) else xr``,
|
||||
and ``fmin0 = min(abs(func(a)), abs(func(b)))``, then the algorithm is considered to
|
||||
have converged when ``abs(xr - xl) < xatol + abs(xmin) * xrtol`` or
|
||||
``abs(func(xmin)) <= fatol + abs(fmin0) * frtol``. This is equivalent to the
|
||||
termination condition described in [1]_ with ``xrtol = 4e-10``,
|
||||
``xatol = 1e-5``, and ``fatol = frtol = 0``. The default values are
|
||||
``xatol = 4*tiny``, ``xrtol = 4*eps``, ``frtol = 0``, and ``fatol = tiny``,
|
||||
where ``eps`` and ``tiny`` are the precision and smallest normal number
|
||||
of the result ``dtype`` of function inputs and outputs.
|
||||
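As a rough illustration of the defaults described above, the following
sketch computes them for ``float64`` with NumPy and restates the
bracket-width criterion. The helper name ``x_converged`` is hypothetical,
and the ``maxiter`` expression follows the default used in the function body
below:

```python
import math
import numpy as np

finfo = np.finfo(np.float64)
xatol = 4 * finfo.smallest_normal   # default absolute tolerance on x
xrtol = 4 * finfo.eps               # default relative tolerance on x
fatol = finfo.smallest_normal       # default absolute tolerance on f
# Default `maxiter`: number of halvings separating the largest and the
# smallest normal floating point numbers of the dtype.
maxiter = math.log2(finfo.max) - math.log2(finfo.smallest_normal)

def x_converged(xl, xr, xmin):
    # Hypothetical helper covering only the bracket-width criterion.
    return abs(xr - xl) < xatol + abs(xmin) * xrtol
```

For ``float64`` this gives a default of roughly 2046 iterations, and a
bracket around ``1.0`` counts as converged once its width falls below about
four units of machine precision.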
|
||||
References
|
||||
----------
|
||||
|
||||
.. [1] Chandrupatla, Tirupathi R.
|
||||
"A new hybrid quadratic/bisection algorithm for finding the zero of a
|
||||
nonlinear function without using derivatives".
|
||||
Advances in Engineering Software, 28(3), 145-149.
|
||||
https://doi.org/10.1016/s0965-9978(96)00051-8
|
||||
|
||||
See Also
|
||||
--------
|
||||
brentq, brenth, ridder, bisect, newton
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> from scipy import optimize
|
||||
>>> def f(x, c):
|
||||
... return x**3 - 2*x - c
|
||||
>>> c = 5
|
||||
>>> res = optimize._chandrupatla._chandrupatla(f, 0, 3, args=(c,))
|
||||
>>> res.x
|
||||
2.0945514818937463
|
||||
|
||||
>>> c = [3, 4, 5]
|
||||
>>> res = optimize._chandrupatla._chandrupatla(f, 0, 3, args=(c,))
|
||||
>>> res.x
|
||||
array([1.8932892 , 2. , 2.09455148])
|
||||
|
||||
"""
|
||||
res = _chandrupatla_iv(func, args, xatol, xrtol,
|
||||
fatol, frtol, maxiter, callback)
|
||||
func, args, xatol, xrtol, fatol, frtol, maxiter, callback = res
|
||||
|
||||
# Initialization
|
||||
temp = eim._initialize(func, (a, b), args)
|
||||
func, xs, fs, args, shape, dtype, xp = temp
|
||||
x1, x2 = xs
|
||||
f1, f2 = fs
|
||||
status = xp.full_like(x1, eim._EINPROGRESS, dtype=xp.int32) # in progress
|
||||
nit, nfev = 0, 2 # two function evaluations performed above
|
||||
finfo = xp.finfo(dtype)
|
||||
xatol = 4*finfo.smallest_normal if xatol is None else xatol
|
||||
xrtol = 4*finfo.eps if xrtol is None else xrtol
|
||||
fatol = finfo.smallest_normal if fatol is None else fatol
|
||||
frtol = frtol * xp_minimum(xp.abs(f1), xp.abs(f2))
|
||||
maxiter = (math.log2(finfo.max) - math.log2(finfo.smallest_normal)
|
||||
if maxiter is None else maxiter)
|
||||
work = _RichResult(x1=x1, f1=f1, x2=x2, f2=f2, x3=None, f3=None, t=0.5,
|
||||
xatol=xatol, xrtol=xrtol, fatol=fatol, frtol=frtol,
|
||||
nit=nit, nfev=nfev, status=status)
|
||||
res_work_pairs = [('status', 'status'), ('x', 'xmin'), ('fun', 'fmin'),
|
||||
('nit', 'nit'), ('nfev', 'nfev'), ('xl', 'x1'),
|
||||
('fl', 'f1'), ('xr', 'x2'), ('fr', 'f2')]
|
||||
|
||||
def pre_func_eval(work):
|
||||
# [1] Figure 1 (first box)
|
||||
x = work.x1 + work.t * (work.x2 - work.x1)
|
||||
return x
|
||||
|
||||
def post_func_eval(x, f, work):
|
||||
# [1] Figure 1 (first diamond and boxes)
|
||||
# Note: y/n are reversed in figure; compare to BASIC in appendix
|
||||
work.x3, work.f3 = (xp.asarray(work.x2, copy=True),
|
||||
xp.asarray(work.f2, copy=True))
|
||||
j = xp.sign(f) == xp.sign(work.f1)
|
||||
nj = ~j
|
||||
work.x3[j], work.f3[j] = work.x1[j], work.f1[j]
|
||||
work.x2[nj], work.f2[nj] = work.x1[nj], work.f1[nj]
|
||||
work.x1, work.f1 = x, f
|
||||
|
||||
def check_termination(work):
|
||||
# [1] Figure 1 (second diamond)
|
||||
# Check for all terminal conditions and record statuses.
|
||||
|
||||
# See [1] Section 4 (first two sentences)
|
||||
i = xp.abs(work.f1) < xp.abs(work.f2)
|
||||
work.xmin = xp.where(i, work.x1, work.x2)
|
||||
work.fmin = xp.where(i, work.f1, work.f2)
|
||||
stop = xp.zeros_like(work.x1, dtype=xp.bool) # termination condition met
|
||||
|
||||
# If function value tolerance is met, report successful convergence,
|
||||
# regardless of other conditions. Note that `frtol` has been redefined
|
||||
# as `frtol = frtol * minimum(abs(f1), abs(f2))`, where `f1` and `f2` are the
|
||||
# function evaluated at the original ends of the bracket.
|
||||
i = xp.abs(work.fmin) <= work.fatol + work.frtol
|
||||
work.status[i] = eim._ECONVERGED
|
||||
stop[i] = True
|
||||
|
||||
# If the bracket is no longer valid, report failure (unless a function
|
||||
# tolerance is met, as detected above).
|
||||
i = (xp_sign(work.f1) == xp_sign(work.f2)) & ~stop
|
||||
NaN = xp.asarray(xp.nan, dtype=work.xmin.dtype)
|
||||
work.xmin[i], work.fmin[i], work.status[i] = NaN, NaN, eim._ESIGNERR
|
||||
stop[i] = True
|
||||
|
||||
# If the abscissae are non-finite or both function values are NaN,
|
||||
# report failure.
|
||||
x_nonfinite = ~(xp.isfinite(work.x1) & xp.isfinite(work.x2))
|
||||
f_nan = xp.isnan(work.f1) & xp.isnan(work.f2)
|
||||
i = (x_nonfinite | f_nan) & ~stop
|
||||
work.xmin[i], work.fmin[i], work.status[i] = NaN, NaN, eim._EVALUEERR
|
||||
stop[i] = True
|
||||
|
||||
# This is the convergence criterion used in bisect. Chandrupatla's
|
||||
# criterion is equivalent to this except with a factor of 4 on `xrtol`.
|
||||
work.dx = xp.abs(work.x2 - work.x1)
|
||||
work.tol = xp.abs(work.xmin) * work.xrtol + work.xatol
|
||||
i = work.dx < work.tol
|
||||
work.status[i] = eim._ECONVERGED
|
||||
stop[i] = True
|
||||
|
||||
return stop
|
||||
|
||||
def post_termination_check(work):
|
||||
# [1] Figure 1 (third diamond and boxes / Equation 1)
|
||||
xi1 = (work.x1 - work.x2) / (work.x3 - work.x2)
|
||||
phi1 = (work.f1 - work.f2) / (work.f3 - work.f2)
|
||||
alpha = (work.x3 - work.x1) / (work.x2 - work.x1)
|
||||
j = ((1 - xp.sqrt(1 - xi1)) < phi1) & (phi1 < xp.sqrt(xi1))
|
||||
|
||||
f1j, f2j, f3j, alphaj = work.f1[j], work.f2[j], work.f3[j], alpha[j]
|
||||
t = xp.full_like(alpha, 0.5)
|
||||
t[j] = (f1j / (f1j - f2j) * f3j / (f3j - f2j)
|
||||
- alphaj * f1j / (f3j - f1j) * f2j / (f2j - f3j))
|
||||
|
||||
# [1] Figure 1 (last box; see also BASIC in appendix with comment
|
||||
# "Adjust T Away from the Interval Boundary")
|
||||
tl = 0.5 * work.tol / work.dx
|
||||
work.t = xp_clip(t, tl, 1 - tl)
|
||||
|
||||
def customize_result(res, shape):
|
||||
xl, xr, fl, fr = res['xl'], res['xr'], res['fl'], res['fr']
|
||||
i = res['xl'] < res['xr']
|
||||
res['xl'] = xp.where(i, xl, xr)
|
||||
res['xr'] = xp.where(i, xr, xl)
|
||||
res['fl'] = xp.where(i, fl, fr)
|
||||
res['fr'] = xp.where(i, fr, fl)
|
||||
return shape
|
||||
|
||||
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
|
||||
pre_func_eval, post_func_eval, check_termination,
|
||||
post_termination_check, customize_result, res_work_pairs,
|
||||
xp=xp)
|
||||
|
||||
|
||||
def _chandrupatla_iv(func, args, xatol, xrtol,
|
||||
fatol, frtol, maxiter, callback):
|
||||
# Input validation for `_chandrupatla`
|
||||
|
||||
if not callable(func):
|
||||
raise ValueError('`func` must be callable.')
|
||||
|
||||
if not np.iterable(args):
|
||||
args = (args,)
|
||||
|
||||
# tolerances are floats, not arrays; OK to use NumPy
|
||||
tols = np.asarray([xatol if xatol is not None else 1,
|
||||
xrtol if xrtol is not None else 1,
|
||||
fatol if fatol is not None else 1,
|
||||
frtol if frtol is not None else 1])
|
||||
if (not np.issubdtype(tols.dtype, np.number) or np.any(tols < 0)
|
||||
or np.any(np.isnan(tols)) or tols.shape != (4,)):
|
||||
raise ValueError('Tolerances must be non-negative scalars.')
|
||||
|
||||
if maxiter is not None:
|
||||
maxiter_int = int(maxiter)
|
||||
if maxiter != maxiter_int or maxiter < 0:
|
||||
raise ValueError('`maxiter` must be a non-negative integer.')
|
||||
|
||||
if callback is not None and not callable(callback):
|
||||
raise ValueError('`callback` must be callable.')
|
||||
|
||||
return func, args, xatol, xrtol, fatol, frtol, maxiter, callback
|
||||
|
||||
|
||||
def _chandrupatla_minimize(func, x1, x2, x3, *, args=(), xatol=None,
|
||||
xrtol=None, fatol=None, frtol=None, maxiter=100,
|
||||
callback=None):
|
||||
"""Find the minimizer of an elementwise function.
|
||||
|
||||
For each element of the output of `func`, `_chandrupatla_minimize` seeks
|
||||
the scalar minimizer that minimizes the element. This function allows for
|
||||
`x1`, `x2`, `x3`, and the elements of `args` to be arrays of any
|
||||
broadcastable shapes.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function whose minimizer is desired. The signature must be::
|
||||
|
||||
func(x: ndarray, *args) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real and ``args`` is a tuple,
|
||||
which may contain an arbitrary number of arrays that are broadcastable
|
||||
with `x`. ``func`` must be an elementwise function: each element
|
||||
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
|
||||
`_chandrupatla_minimize` seeks an array ``x`` such that ``func(x)`` is an array
|
||||
of minima.
|
||||
x1, x2, x3 : array_like
|
||||
The abscissae of a standard scalar minimization bracket. A bracket is
|
||||
valid if ``x1 < x2 < x3`` and ``func(x1) > func(x2) <= func(x3)``.
|
||||
Must be broadcastable with one another and `args`.
|
||||
args : tuple, optional
|
||||
Additional positional arguments to be passed to `func`. Must be arrays
|
||||
broadcastable with `x1`, `x2`, and `x3`. If the callable to be
|
||||
minimized requires arguments that are not broadcastable with `x`,
|
||||
wrap that callable with `func` such that `func` accepts only `x` and
|
||||
broadcastable arrays.
|
||||
xatol, xrtol, fatol, frtol : float, optional
|
||||
Absolute and relative tolerances on the minimizer and function value.
|
||||
See Notes for details.
|
||||
maxiter : int, optional
|
||||
The maximum number of iterations of the algorithm to perform.
|
||||
callback : callable, optional
|
||||
An optional user-supplied function to be called before the first
|
||||
iteration and after each iteration.
|
||||
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
|
||||
similar to that returned by `_chandrupatla_minimize` (but containing
|
||||
the current iterate's values of all variables). If `callback` raises a
|
||||
``StopIteration``, the algorithm will terminate immediately and
|
||||
`_chandrupatla_minimize` will return a result.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes. (The descriptions are written as though the values will be
|
||||
scalars; however, if `func` returns an array, the outputs will be
|
||||
arrays of the same shape.)
|
||||
|
||||
success : bool
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
``0`` : The algorithm converged to the specified tolerances.
|
||||
``-1`` : The algorithm encountered an invalid bracket.
|
||||
``-2`` : The maximum number of iterations was reached.
|
||||
``-3`` : A non-finite value was encountered.
|
||||
``-4`` : Iteration was terminated by `callback`.
|
||||
``1`` : The algorithm is proceeding normally (in `callback` only).
|
||||
x : float
|
||||
The minimizer of the function, if the algorithm terminated
|
||||
successfully.
|
||||
fun : float
|
||||
The value of `func` evaluated at `x`.
|
||||
nfev : int
|
||||
The number of points at which `func` was evaluated.
|
||||
nit : int
|
||||
The number of iterations of the algorithm that were performed.
|
||||
xl, xm, xr : float
|
||||
The final three-point bracket.
|
||||
fl, fm, fr : float
|
||||
The function value at the bracket points.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Implemented based on Chandrupatla's original paper [1]_.
|
||||
|
||||
If ``x1 < x2 < x3`` are the points of the bracket and ``f1 > f2 <= f3``
|
||||
are the values of ``func`` at those points, then the algorithm is
|
||||
considered to have converged when ``x3 - x1 <= abs(x2)*xrtol + xatol``
|
||||
or ``(f1 - 2*f2 + f3)/2 <= abs(f2)*frtol + fatol``. Note that the first of
|
||||
these differs from the termination conditions described in [1]_. The
|
||||
default value of `xrtol` is the square root of the precision of the
|
||||
appropriate dtype, and ``xatol = fatol = frtol`` is the smallest normal
|
||||
number of the appropriate dtype.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Chandrupatla, Tirupathi R. (1998).
|
||||
"An efficient quadratic fit-sectioning algorithm for minimization
|
||||
without derivatives".
|
||||
Computer Methods in Applied Mechanics and Engineering, 152 (1-2),
|
||||
211-217. https://doi.org/10.1016/S0045-7825(97)00190-4
|
||||
|
||||
See Also
|
||||
--------
|
||||
golden, brent, bounded
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> from scipy.optimize._chandrupatla import _chandrupatla_minimize
|
||||
>>> def f(x, args=1):
|
||||
... return (x - args)**2
|
||||
>>> res = _chandrupatla_minimize(f, -5, 0, 5)
|
||||
>>> res.x
|
||||
1.0
|
||||
>>> c = [1, 1.5, 2]
|
||||
>>> res = _chandrupatla_minimize(f, -5, 0, 5, args=(c,))
|
||||
>>> res.x
|
||||
array([1. , 1.5, 2. ])
|
||||
"""
|
||||
res = _chandrupatla_iv(func, args, xatol, xrtol,
|
||||
fatol, frtol, maxiter, callback)
|
||||
func, args, xatol, xrtol, fatol, frtol, maxiter, callback = res
|
||||
|
||||
# Initialization
|
||||
xs = (x1, x2, x3)
|
||||
temp = eim._initialize(func, xs, args)
|
||||
func, xs, fs, args, shape, dtype, xp = temp # line split for PEP8
|
||||
x1, x2, x3 = xs
|
||||
f1, f2, f3 = fs
|
||||
phi = dtype.type(0.5 + 0.5*5**0.5) # golden ratio
|
||||
status = np.full_like(x1, eim._EINPROGRESS, dtype=int) # in progress
|
||||
nit, nfev = 0, 3 # three function evaluations performed above
|
||||
fatol = np.finfo(dtype).tiny if fatol is None else fatol
|
||||
frtol = np.finfo(dtype).tiny if frtol is None else frtol
|
||||
xatol = np.finfo(dtype).tiny if xatol is None else xatol
|
||||
xrtol = np.sqrt(np.finfo(dtype).eps) if xrtol is None else xrtol
|
||||
|
||||
# Ensure that x1 < x2 < x3 initially.
|
||||
xs, fs = np.vstack((x1, x2, x3)), np.vstack((f1, f2, f3))
|
||||
i = np.argsort(xs, axis=0)
|
||||
x1, x2, x3 = np.take_along_axis(xs, i, axis=0)
|
||||
f1, f2, f3 = np.take_along_axis(fs, i, axis=0)
|
||||
q0 = x3.copy() # "At the start, q0 is set at x3..." ([1] after (7))
|
||||
|
||||
work = _RichResult(x1=x1, f1=f1, x2=x2, f2=f2, x3=x3, f3=f3, phi=phi,
|
||||
xatol=xatol, xrtol=xrtol, fatol=fatol, frtol=frtol,
|
||||
nit=nit, nfev=nfev, status=status, q0=q0, args=args)
|
||||
res_work_pairs = [('status', 'status'),
|
||||
('x', 'x2'), ('fun', 'f2'),
|
||||
('nit', 'nit'), ('nfev', 'nfev'),
|
||||
('xl', 'x1'), ('xm', 'x2'), ('xr', 'x3'),
|
||||
('fl', 'f1'), ('fm', 'f2'), ('fr', 'f3')]
|
||||
|
||||
def pre_func_eval(work):
|
||||
# `_check_termination` is called first -> `x3 - x2 > x2 - x1`
|
||||
# But let's calculate a few terms that we'll reuse
|
||||
x21 = work.x2 - work.x1
|
||||
x32 = work.x3 - work.x2
|
||||
|
||||
# [1] Section 3. "The quadratic minimum point Q1 is calculated using
|
||||
# the relations developed in the previous section." [1] Section 2 (5/6)
|
||||
A = x21 * (work.f3 - work.f2)
|
||||
B = x32 * (work.f1 - work.f2)
|
||||
C = A / (A + B)
|
||||
# q1 = C * (work.x1 + work.x2) / 2 + (1 - C) * (work.x2 + work.x3) / 2
|
||||
q1 = 0.5 * (C*(work.x1 - work.x3) + work.x2 + work.x3) # much faster
|
||||
# this is an array, so multiplying by 0.5 does not change dtype
|
||||
|
||||
# "If Q1 and Q0 are sufficiently close... Q1 is accepted if it is
|
||||
# sufficiently away from the inside point x2"
|
||||
i = abs(q1 - work.q0) < 0.5 * abs(x21) # [1] (7)
|
||||
xi = q1[i]
|
||||
# Later, after (9), "If the point Q1 is in a +/- xtol neighborhood of
|
||||
# x2, the new point is chosen in the larger interval at a distance
|
||||
# tol away from x2."
|
||||
# See also QBASIC code after "Accept Ql adjust if close to X2".
|
||||
j = abs(q1[i] - work.x2[i]) <= work.xtol[i]
|
||||
xi[j] = work.x2[i][j] + np.sign(x32[i][j]) * work.xtol[i][j]
|
||||
|
||||
# "If condition (7) is not satisfied, golden sectioning of the larger
|
||||
# interval is carried out to introduce the new point."
|
||||
# (For simplicity, we go ahead and calculate it for all points, but we
|
||||
# change the elements for which the condition was satisfied.)
|
||||
x = work.x2 + (2 - work.phi) * x32
|
||||
x[i] = xi
|
||||
|
||||
# "We define Q0 as the value of Q1 at the previous iteration."
|
||||
work.q0 = q1
|
||||
return x
|
||||
|
||||
def post_func_eval(x, f, work):
|
||||
# Standard logic for updating a three-point bracket based on a new
|
||||
# point. In QBASIC code, see "IF SGN(X-X2) = SGN(X3-X2) THEN...".
|
||||
# There is an awful lot of data copying going on here; this would
|
||||
# probably benefit from code optimization or implementation in Pythran.
|
||||
i = np.sign(x - work.x2) == np.sign(work.x3 - work.x2)
|
||||
xi, x1i, x2i, x3i = x[i], work.x1[i], work.x2[i], work.x3[i]
|
||||
fi, f1i, f2i, f3i = f[i], work.f1[i], work.f2[i], work.f3[i]
|
||||
j = fi > f2i
|
||||
x3i[j], f3i[j] = xi[j], fi[j]
|
||||
j = ~j
|
||||
x1i[j], f1i[j], x2i[j], f2i[j] = x2i[j], f2i[j], xi[j], fi[j]
|
||||
|
||||
ni = ~i
|
||||
xni, x1ni, x2ni, x3ni = x[ni], work.x1[ni], work.x2[ni], work.x3[ni]
|
||||
fni, f1ni, f2ni, f3ni = f[ni], work.f1[ni], work.f2[ni], work.f3[ni]
|
||||
j = fni > f2ni
|
||||
x1ni[j], f1ni[j] = xni[j], fni[j]
|
||||
j = ~j
|
||||
x3ni[j], f3ni[j], x2ni[j], f2ni[j] = x2ni[j], f2ni[j], xni[j], fni[j]
|
||||
|
||||
work.x1[i], work.x2[i], work.x3[i] = x1i, x2i, x3i
|
||||
work.f1[i], work.f2[i], work.f3[i] = f1i, f2i, f3i
|
||||
work.x1[ni], work.x2[ni], work.x3[ni] = x1ni, x2ni, x3ni
|
||||
work.f1[ni], work.f2[ni], work.f3[ni] = f1ni, f2ni, f3ni
|
||||
|
||||
def check_termination(work):
|
||||
# Check for all terminal conditions and record statuses.
|
||||
stop = np.zeros_like(work.x1, dtype=bool) # termination condition met
|
||||
|
||||
# Bracket is invalid; stop and don't return minimizer/minimum
|
||||
i = ((work.f2 > work.f1) | (work.f2 > work.f3))
|
||||
work.x2[i], work.f2[i] = np.nan, np.nan
|
||||
stop[i], work.status[i] = True, eim._ESIGNERR
|
||||
|
||||
# Non-finite values; stop and don't return minimizer/minimum
|
||||
finite = np.isfinite(work.x1+work.x2+work.x3+work.f1+work.f2+work.f3)
|
||||
i = ~(finite | stop)
|
||||
work.x2[i], work.f2[i] = np.nan, np.nan
|
||||
stop[i], work.status[i] = True, eim._EVALUEERR
|
||||
|
||||
# [1] Section 3 "Points 1 and 3 are interchanged if necessary to make
|
||||
# the (x2, x3) the larger interval."
|
||||
# Note: I had used np.choose; this is much faster. This would be a good
|
||||
# place to save e.g. `work.x3 - work.x2` for reuse, but I tried and
|
||||
# didn't notice a speed boost, so let's keep it simple.
|
||||
i = abs(work.x3 - work.x2) < abs(work.x2 - work.x1)
|
||||
temp = work.x1[i]
|
||||
work.x1[i] = work.x3[i]
|
||||
work.x3[i] = temp
|
||||
temp = work.f1[i]
|
||||
work.f1[i] = work.f3[i]
|
||||
work.f3[i] = temp
|
||||
|
||||
# [1] Section 3 (bottom of page 212)
|
||||
# "We set a tolerance value xtol..."
|
||||
work.xtol = abs(work.x2) * work.xrtol + work.xatol # [1] (8)
|
||||
# "The convergence based on interval is achieved when..."
|
||||
# Note: Equality allowed in case of `xtol=0`
|
||||
i = abs(work.x3 - work.x2) <= 2 * work.xtol # [1] (9)
|
||||
|
||||
# "We define ftol using..."
|
||||
ftol = abs(work.f2) * work.frtol + work.fatol # [1] (10)
|
||||
# "The convergence based on function values is achieved when..."
|
||||
# Note 1: modify in place to incorporate tolerance on function value.
|
||||
# Note 2: factor of 2 is not in the text; see QBASIC start of DO loop
|
||||
i |= (work.f1 - 2 * work.f2 + work.f3) <= 2*ftol # [1] (11)
|
||||
i &= ~stop
|
||||
stop[i], work.status[i] = True, eim._ECONVERGED
|
||||
|
||||
return stop
|
||||
|
||||
def post_termination_check(work):
|
||||
pass
|
||||
|
||||
def customize_result(res, shape):
|
||||
xl, xr, fl, fr = res['xl'], res['xr'], res['fl'], res['fr']
|
||||
i = res['xl'] < res['xr']
|
||||
res['xl'] = np.choose(i, (xr, xl))
|
||||
res['xr'] = np.choose(i, (xl, xr))
|
||||
res['fl'] = np.choose(i, (fr, fl))
|
||||
res['fr'] = np.choose(i, (fl, fr))
|
||||
return shape
|
||||
|
||||
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
|
||||
pre_func_eval, post_func_eval, check_termination,
|
||||
post_termination_check, customize_result, res_work_pairs,
|
||||
xp=xp)
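# Illustrative sketch (not part of SciPy): the quadratic-fit step and the
# termination tests used above, rewritten for plain Python scalars. The helper
# names `quadratic_fit_minimum` and `bracket_converged` are hypothetical; the
# formulas follow `pre_func_eval` and `check_termination`, assuming a valid,
# non-flat bracket (``f2 <= f1``, ``f2 <= f3`` and ``A + B != 0``).

```python
def quadratic_fit_minimum(x1, x2, x3, f1, f2, f3):
    """Abscissa of the vertex of the parabola through the three bracket points."""
    a = (x2 - x1) * (f3 - f2)
    b = (x3 - x2) * (f1 - f2)
    c = a / (a + b)  # assumes a + b != 0, i.e. the three values are not all equal
    return 0.5 * (c * (x1 - x3) + x2 + x3)


def bracket_converged(x1, x2, x3, f1, f2, f3, xrtol, xatol, frtol, fatol):
    """Scalar version of the two convergence tests in `check_termination`."""
    xtol = abs(x2) * xrtol + xatol
    ftol = abs(f2) * frtol + fatol
    # The code above swaps points so that (x2, x3) is the larger interval;
    # taking the maximum of the two half-widths is equivalent for scalars.
    larger = max(abs(x3 - x2), abs(x2 - x1))
    return larger <= 2 * xtol or (f1 - 2 * f2 + f3) <= 2 * ftol


def f(x):
    return (x - 1) ** 2


xs = (-5.0, 0.0, 5.0)
fs = tuple(f(x) for x in xs)
# For an exactly quadratic function, a single fit lands on the minimizer.
q1 = quadratic_fit_minimum(*xs, *fs)
```

# Because `f` is itself a parabola, the fitted vertex coincides with its
# minimizer; for general functions the step is only an estimate, which is why
# the loop above falls back to golden sectioning when condition (7) fails.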
|
||||
Binary file not shown.
316
venv/lib/python3.12/site-packages/scipy/optimize/_cobyla_py.py
Normal file
@ -0,0 +1,316 @@
|
||||
"""
|
||||
Interface to Constrained Optimization By Linear Approximation
|
||||
|
||||
Functions
|
||||
---------
|
||||
.. autosummary::
|
||||
:toctree: generated/
|
||||
|
||||
fmin_cobyla
|
||||
|
||||
"""
|
||||
|
||||
import functools
|
||||
from threading import RLock
|
||||
|
||||
import numpy as np
|
||||
from scipy.optimize import _cobyla as cobyla
|
||||
from ._optimize import (OptimizeResult, _check_unknown_options,
|
||||
_prepare_scalar_function)
|
||||
try:
|
||||
from itertools import izip
|
||||
except ImportError:
|
||||
izip = zip
|
||||
|
||||
__all__ = ['fmin_cobyla']
|
||||
|
||||
# Workaround as _cobyla.minimize is not threadsafe
|
||||
# due to an unknown f2py bug and can segfault,
|
||||
# see gh-9658.
|
||||
_module_lock = RLock()
|
||||
def synchronized(func):
|
||||
@functools.wraps(func)
|
||||
def wrapper(*args, **kwargs):
|
||||
with _module_lock:
|
||||
return func(*args, **kwargs)
|
||||
return wrapper
|
||||
|
||||
@synchronized
|
||||
def fmin_cobyla(func, x0, cons, args=(), consargs=None, rhobeg=1.0,
|
||||
rhoend=1e-4, maxfun=1000, disp=None, catol=2e-4,
|
||||
*, callback=None):
|
||||
"""
|
||||
Minimize a function using the Constrained Optimization By Linear
|
||||
Approximation (COBYLA) method. This method wraps a FORTRAN
|
||||
implementation of the algorithm.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
Function to minimize. In the form func(x, \\*args).
|
||||
x0 : ndarray
|
||||
Initial guess.
|
||||
cons : sequence
|
||||
Constraint functions; must all be ``>=0`` (a single function
|
||||
if only 1 constraint). Each function takes the parameters `x`
|
||||
as its first argument, and it can return either a single number or
|
||||
an array or list of numbers.
|
||||
args : tuple, optional
|
||||
Extra arguments to pass to function.
|
||||
consargs : tuple, optional
|
||||
Extra arguments to pass to constraint functions (default of None means
|
||||
use same extra arguments as those passed to func).
|
||||
Use ``()`` for no extra arguments.
|
||||
rhobeg : float, optional
|
||||
Reasonable initial changes to the variables.
|
||||
rhoend : float, optional
|
||||
Final accuracy in the optimization (not precisely guaranteed). This
|
||||
is a lower bound on the size of the trust region.
|
||||
disp : {0, 1, 2, 3}, optional
|
||||
Controls the frequency of output; 0 implies no output.
|
||||
maxfun : int, optional
|
||||
Maximum number of function evaluations.
|
||||
catol : float, optional
|
||||
Absolute tolerance for constraint violations.
|
||||
callback : callable, optional
|
||||
Called after each iteration, as ``callback(x)``, where ``x`` is the
|
||||
current parameter vector.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : ndarray
|
||||
The argument that minimises `func`.
|
||||
|
||||
See also
|
||||
--------
|
||||
minimize: Interface to minimization algorithms for multivariate
|
||||
functions. See the 'COBYLA' `method` in particular.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This algorithm is based on linear approximations to the objective
|
||||
function and each constraint. We briefly describe the algorithm.
|
||||
|
||||
Suppose the function is being minimized over k variables. At the
|
||||
jth iteration the algorithm has k+1 points v_1, ..., v_(k+1),
|
||||
an approximate solution x_j, and a radius RHO_j.
|
||||
The algorithm builds affine (i.e., linear plus a constant) approximations to the objective
|
||||
function and constraint functions such that their function values
|
||||
agree with the linear approximation on the k+1 points v_1, ..., v_(k+1).
|
||||
This gives a linear program to solve (where the linear approximations
|
||||
of the constraint functions are constrained to be non-negative).
|
||||
|
||||
However, the linear approximations are likely only good
|
||||
approximations near the current simplex, so the linear program is
|
||||
given the further requirement that the solution, which
|
||||
will become x_(j+1), must be within RHO_j from x_j. RHO_j only
|
||||
decreases, never increases. The initial RHO_j is rhobeg and the
|
||||
final RHO_j is rhoend. In this way COBYLA's iterations behave
|
||||
like a trust region algorithm.
|
||||
|
||||
Additionally, the linear program may be inconsistent, or the
|
||||
approximation may give poor improvement. For details about
|
||||
how these issues are resolved, as well as how the points v_i are
|
||||
updated, refer to the source code or the references below.
|
||||
|
||||
|
||||
References
|
||||
----------
|
||||
Powell M.J.D. (1994), "A direct search optimization method that models
|
||||
the objective and constraint functions by linear interpolation.", in
|
||||
Advances in Optimization and Numerical Analysis, eds. S. Gomez and
|
||||
J-P Hennart, Kluwer Academic (Dordrecht), pp. 51-67
|
||||
|
||||
Powell M.J.D. (1998), "Direct search algorithms for optimization
|
||||
calculations", Acta Numerica 7, 287-336
|
||||
|
||||
Powell M.J.D. (2007), "A view of algorithms for optimization without
|
||||
derivatives", Cambridge University Technical Report DAMTP 2007/NA03
|
||||
|
||||
|
||||
Examples
|
||||
--------
|
||||
Minimize the objective function f(x,y) = x*y subject
|
||||
to the constraints x**2 + y**2 < 1 and y > 0::
|
||||
|
||||
>>> def objective(x):
|
||||
... return x[0]*x[1]
|
||||
...
|
||||
>>> def constr1(x):
|
||||
... return 1 - (x[0]**2 + x[1]**2)
|
||||
...
|
||||
>>> def constr2(x):
|
||||
... return x[1]
|
||||
...
|
||||
>>> from scipy.optimize import fmin_cobyla
|
||||
>>> fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], rhoend=1e-7)
|
||||
array([-0.70710685, 0.70710671])
|
||||
|
||||
The exact solution is (-sqrt(2)/2, sqrt(2)/2).
|
||||
|
||||
|
||||
|
||||
"""
|
||||
err = "cons must be a sequence of callable functions or a single"\
|
||||
" callable function."
|
||||
try:
|
||||
len(cons)
|
||||
except TypeError as e:
|
||||
if callable(cons):
|
||||
cons = [cons]
|
||||
else:
|
||||
raise TypeError(err) from e
|
||||
else:
|
||||
for thisfunc in cons:
|
||||
if not callable(thisfunc):
|
||||
raise TypeError(err)
|
||||
|
||||
if consargs is None:
|
||||
consargs = args
|
||||
|
||||
# build constraints
|
||||
con = tuple({'type': 'ineq', 'fun': c, 'args': consargs} for c in cons)
|
||||
|
||||
# options
|
||||
opts = {'rhobeg': rhobeg,
|
||||
'tol': rhoend,
|
||||
'disp': disp,
|
||||
'maxiter': maxfun,
|
||||
'catol': catol,
|
||||
'callback': callback}
|
||||
|
||||
sol = _minimize_cobyla(func, x0, args, constraints=con,
|
||||
**opts)
|
||||
if disp and not sol['success']:
|
||||
print(f"COBYLA failed to find a solution: {sol.message}")
|
||||
return sol['x']
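# Illustrative sketch (not part of SciPy): checking a returned point against
# the ``>= 0`` constraints from the docstring example. The helper
# `max_constraint_violation` is hypothetical, and measuring the violation as
# the largest negative constraint value is an assumption about how `maxcv` is
# defined; the wrapper above only compares that number with `catol`.

```python
import math


def objective(x):
    return x[0] * x[1]


def constr1(x):
    return 1 - (x[0] ** 2 + x[1] ** 2)


def constr2(x):
    return x[1]


def max_constraint_violation(x, cons):
    """Largest amount by which any ``c(x) >= 0`` constraint is violated."""
    return max(0.0, *(-c(x) for c in cons))


x_reported = [-0.70710685, 0.70710671]  # value printed in the docstring example
x_exact = [-math.sqrt(2) / 2, math.sqrt(2) / 2]
catol = 2e-4  # default absolute tolerance of fmin_cobyla

violation = max_constraint_violation(x_reported, [constr1, constr2])
is_feasible = violation <= catol
gap_to_exact = abs(objective(x_reported) - objective(x_exact))
```

# A point counts as acceptable here when its violation does not exceed
# `catol`, which mirrors the ``info[3] > catol`` test in `_minimize_cobyla`.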
|
||||
|
||||
|
||||
@synchronized
|
||||
def _minimize_cobyla(fun, x0, args=(), constraints=(),
|
||||
rhobeg=1.0, tol=1e-4, maxiter=1000,
|
||||
disp=False, catol=2e-4, callback=None, bounds=None,
|
||||
**unknown_options):
|
||||
"""
|
||||
Minimize a scalar function of one or more variables using the
|
||||
Constrained Optimization BY Linear Approximation (COBYLA) algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
rhobeg : float
|
||||
Reasonable initial changes to the variables.
|
||||
tol : float
|
||||
Final accuracy in the optimization (not precisely guaranteed).
|
||||
This is a lower bound on the size of the trust region.
|
||||
disp : bool
|
||||
Set to True to print convergence messages. If False,
|
||||
`verbosity` is ignored and set to 0.
|
||||
maxiter : int
|
||||
Maximum number of function evaluations.
|
||||
catol : float
|
||||
Tolerance (absolute) for constraint violations
|
||||
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
maxfun = maxiter
|
||||
rhoend = tol
|
||||
iprint = int(bool(disp))
|
||||
|
||||
# check constraints
|
||||
if isinstance(constraints, dict):
|
||||
constraints = (constraints, )
|
||||
|
||||
if bounds:
|
||||
i_lb = np.isfinite(bounds.lb)
|
||||
if np.any(i_lb):
|
||||
def lb_constraint(x, *args, **kwargs):
|
||||
return x[i_lb] - bounds.lb[i_lb]
|
||||
|
||||
constraints.append({'type': 'ineq', 'fun': lb_constraint})
|
||||
|
||||
i_ub = np.isfinite(bounds.ub)
|
||||
if np.any(i_ub):
|
||||
def ub_constraint(x):
|
||||
return bounds.ub[i_ub] - x[i_ub]
|
||||
|
||||
constraints.append({'type': 'ineq', 'fun': ub_constraint})
|
||||
|
||||
for ic, con in enumerate(constraints):
|
||||
# check type
|
||||
try:
|
||||
ctype = con['type'].lower()
|
||||
except KeyError as e:
|
||||
raise KeyError('Constraint %d has no type defined.' % ic) from e
|
||||
except TypeError as e:
|
||||
raise TypeError('Constraints must be defined using a '
|
||||
'dictionary.') from e
|
||||
except AttributeError as e:
|
||||
raise TypeError("Constraint's type must be a string.") from e
|
||||
else:
|
||||
if ctype != 'ineq':
|
||||
raise ValueError("Constraints of type '%s' not handled by "
|
||||
"COBYLA." % con['type'])
|
||||
|
||||
# check function
|
||||
if 'fun' not in con:
|
||||
raise KeyError('Constraint %d has no function defined.' % ic)
|
||||
|
||||
# check extra arguments
|
||||
if 'args' not in con:
|
||||
con['args'] = ()
|
||||
|
||||
# m is the total number of constraint values
|
||||
# it takes into account that some constraints may be vector-valued
|
||||
cons_lengths = []
|
||||
for c in constraints:
|
||||
f = c['fun'](x0, *c['args'])
|
||||
try:
|
||||
cons_length = len(f)
|
||||
except TypeError:
|
||||
cons_length = 1
|
||||
cons_lengths.append(cons_length)
|
||||
m = sum(cons_lengths)
|
||||
|
||||
# create the ScalarFunction, cobyla doesn't require derivative function
|
||||
def _jac(x, *args):
|
||||
return None
|
||||
|
||||
sf = _prepare_scalar_function(fun, x0, args=args, jac=_jac)
|
||||
|
||||
def calcfc(x, con):
|
||||
f = sf.fun(x)
|
||||
i = 0
|
||||
for size, c in izip(cons_lengths, constraints):
|
||||
con[i: i + size] = c['fun'](x, *c['args'])
|
||||
i += size
|
||||
return f
|
||||
|
||||
def wrapped_callback(x):
|
||||
if callback is not None:
|
||||
callback(np.copy(x))
|
||||
|
||||
info = np.zeros(4, np.float64)
|
||||
xopt, info = cobyla.minimize(calcfc, m=m, x=np.copy(x0), rhobeg=rhobeg,
|
||||
rhoend=rhoend, iprint=iprint, maxfun=maxfun,
|
||||
dinfo=info, callback=wrapped_callback)
|
||||
|
||||
if info[3] > catol:
|
||||
# Check constraint violation
|
||||
info[0] = 4
|
||||
|
||||
return OptimizeResult(x=xopt,
|
||||
status=int(info[0]),
|
||||
success=info[0] == 1,
|
||||
message={1: 'Optimization terminated successfully.',
|
||||
2: 'Maximum number of function evaluations '
|
||||
'has been exceeded.',
|
||||
3: 'Rounding errors are becoming damaging '
|
||||
'in COBYLA subroutine.',
|
||||
4: 'Did not converge to a solution '
|
||||
'satisfying the constraints. See '
|
||||
'`maxcv` for magnitude of violation.',
|
||||
5: 'NaN result encountered.'
|
||||
}.get(info[0], 'Unknown exit status.'),
|
||||
nfev=int(info[1]),
|
||||
fun=info[2],
|
||||
maxcv=info[3])
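# Illustrative sketch (not part of SciPy): how scalar and vector-valued
# constraints are flattened into one vector of length `m`, as done by
# `cons_lengths` and `calcfc` above. Plain lists stand in for NumPy arrays,
# and the helper names are hypothetical.

```python
def constraint_lengths(constraints, x0):
    """Number of values each constraint function returns at `x0`."""
    lengths = []
    for c in constraints:
        value = c['fun'](x0, *c.get('args', ()))
        try:
            lengths.append(len(value))
        except TypeError:  # scalar return value
            lengths.append(1)
    return lengths


def stack_constraints(constraints, lengths, x):
    """Evaluate every constraint at `x` and write the values into one list."""
    con = [0.0] * sum(lengths)
    i = 0
    for size, c in zip(lengths, constraints):
        value = c['fun'](x, *c.get('args', ()))
        try:
            values = list(value)
        except TypeError:  # scalar return value
            values = [value]
        con[i:i + size] = values
        i += size
    return con


constraints = [
    {'type': 'ineq', 'fun': lambda x: 1 - (x[0] ** 2 + x[1] ** 2)},
    {'type': 'ineq', 'fun': lambda x: [x[0] + 1, x[1]]},  # two values at once
]
x0 = [0.0, 0.1]
lengths = constraint_lengths(constraints, x0)
m = sum(lengths)
stacked = stack_constraints(constraints, lengths, x0)
```

# The lengths are measured once at `x0`, so this scheme assumes each
# constraint function returns the same number of values at every point.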
|
||||
@ -0,0 +1,62 @@
|
||||
import numpy as np
|
||||
|
||||
from ._optimize import _check_unknown_options
|
||||
|
||||
|
||||
def _minimize_cobyqa(fun, x0, args=(), bounds=None, constraints=(),
|
||||
callback=None, disp=False, maxfev=None, maxiter=None,
|
||||
f_target=-np.inf, feasibility_tol=1e-8,
|
||||
initial_tr_radius=1.0, final_tr_radius=1e-6, scale=False,
|
||||
**unknown_options):
|
||||
"""
|
||||
Minimize a scalar function of one or more variables using the
|
||||
Constrained Optimization BY Quadratic Approximations (COBYQA) algorithm [1]_.
|
||||
|
||||
.. versionadded:: 1.14.0
|
||||
|
||||
Options
|
||||
-------
|
||||
disp : bool
|
||||
Set to True to print information about the optimization procedure.
|
||||
maxfev : int
|
||||
Maximum number of function evaluations.
|
||||
maxiter : int
|
||||
Maximum number of iterations.
|
||||
f_target : float
|
||||
Target value for the objective function. The optimization procedure is
|
||||
terminated when the objective function value of a feasible point (see
|
||||
`feasibility_tol` below) is less than or equal to this target.
|
||||
feasibility_tol : float
|
||||
Absolute tolerance for the constraint violation.
|
||||
initial_tr_radius : float
|
||||
Initial trust-region radius. Typically, this value should be in the
|
||||
order of one tenth of the greatest expected change to the variables.
|
||||
final_tr_radius : float
|
||||
Final trust-region radius. It should indicate the accuracy required in
|
||||
the final values of the variables. If provided, this option overrides
|
||||
the value of `tol` in the `minimize` function.
|
||||
scale : bool
|
||||
Set to True to scale the variables according to the bounds. If True and
|
||||
if all the lower and upper bounds are finite, the variables are scaled
|
||||
to be within the range :math:`[-1, 1]`. If any of the lower or upper
|
||||
bounds is infinite, the variables are not scaled.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] COBYQA
|
||||
https://www.cobyqa.com/stable/
|
||||
"""
|
||||
from .._lib.cobyqa import minimize # import here to avoid circular imports
|
||||
|
||||
_check_unknown_options(unknown_options)
|
||||
options = {
|
||||
'disp': bool(disp),
|
||||
'maxfev': int(maxfev) if maxfev is not None else 500 * len(x0),
|
||||
'maxiter': int(maxiter) if maxiter is not None else 1000 * len(x0),
|
||||
'target': float(f_target),
|
||||
'feasibility_tol': float(feasibility_tol),
|
||||
'radius_init': float(initial_tr_radius),
|
||||
'radius_final': float(final_tr_radius),
|
||||
'scale': bool(scale),
|
||||
}
|
||||
return minimize(fun, x0, args, bounds, constraints, callback, options)
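# Illustrative sketch (not part of SciPy or COBYQA): one affine map that sends
# variables with finite bounds into ``[-1, 1]``, as described for the `scale`
# option. The helper `scale_to_unit_box` is hypothetical, and the exact
# transformation used internally by COBYQA is an assumption here. It also
# assumes ``lb < ub`` for every variable.

```python
import math


def scale_to_unit_box(x, lb, ub):
    """Map each x[i] from [lb[i], ub[i]] to [-1, 1]; skip if any bound is infinite."""
    bounds = list(lb) + list(ub)
    if not all(math.isfinite(b) for b in bounds):
        return list(x)  # documented behaviour: no scaling with infinite bounds
    return [(2.0 * xi - (lo + hi)) / (hi - lo) for xi, lo, hi in zip(x, lb, ub)]


scaled = scale_to_unit_box([0.0, 5.0, 10.0], [0.0, 0.0, 0.0], [10.0, 10.0, 10.0])
unscaled = scale_to_unit_box([1.0], [0.0], [math.inf])
```

# Under this map the lower bound goes to -1, the midpoint to 0 and the upper
# bound to 1, so trust-region radii apply evenly across all variables.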
|
||||
590
venv/lib/python3.12/site-packages/scipy/optimize/_constraints.py
Normal file
@ -0,0 +1,590 @@
|
||||
"""Constraints definition for minimize."""
|
||||
import numpy as np
|
||||
from ._hessian_update_strategy import BFGS
|
||||
from ._differentiable_functions import (
|
||||
VectorFunction, LinearVectorFunction, IdentityVectorFunction)
|
||||
from ._optimize import OptimizeWarning
|
||||
from warnings import warn, catch_warnings, simplefilter, filterwarnings
|
||||
from scipy.sparse import issparse
|
||||
|
||||
|
||||
def _arr_to_scalar(x):
|
||||
# If x is a numpy array, return x.item(). This will
|
||||
# fail if the array has more than one element.
|
||||
return x.item() if isinstance(x, np.ndarray) else x
|
||||
|
||||
|
||||
class NonlinearConstraint:
|
||||
"""Nonlinear constraint on the variables.
|
||||
|
||||
The constraint has the general inequality form::
|
||||
|
||||
lb <= fun(x) <= ub
|
||||
|
||||
Here the vector of independent variables x is passed as ndarray of shape
|
||||
(n,) and ``fun`` returns a vector with m components.
|
||||
|
||||
It is possible to use equal bounds to represent an equality constraint or
|
||||
infinite bounds to represent a one-sided constraint.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
The function defining the constraint.
|
||||
The signature is ``fun(x) -> array_like, shape (m,)``.
|
||||
lb, ub : array_like
|
||||
Lower and upper bounds on the constraint. Each array must have the
|
||||
shape (m,) or be a scalar, in the latter case a bound will be the same
|
||||
for all components of the constraint. Use ``np.inf`` with an
|
||||
appropriate sign to specify a one-sided constraint.
|
||||
Set components of `lb` and `ub` equal to represent an equality
|
||||
constraint. Note that you can mix constraints of different types:
|
||||
interval, one-sided or equality, by setting different components of
|
||||
`lb` and `ub` as necessary.
|
||||
jac : {callable, '2-point', '3-point', 'cs'}, optional
|
||||
Method of computing the Jacobian matrix (an m-by-n matrix,
|
||||
where element (i, j) is the partial derivative of f[i] with
|
||||
respect to x[j]). The keywords {'2-point', '3-point',
|
||||
'cs'} select a finite difference scheme for the numerical estimation.
|
||||
A callable must have the following signature:
|
||||
``jac(x) -> {ndarray, sparse matrix}, shape (m, n)``.
|
||||
Default is '2-point'.
|
||||
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy, None}, optional
|
||||
Method for computing the Hessian matrix. The keywords
|
||||
{'2-point', '3-point', 'cs'} select a finite difference scheme for
|
||||
numerical estimation. Alternatively, objects implementing
|
||||
`HessianUpdateStrategy` interface can be used to approximate the
|
||||
Hessian. Currently available implementations are:
|
||||
|
||||
- `BFGS` (default option)
|
||||
- `SR1`
|
||||
|
||||
A callable must return the Hessian matrix of ``dot(fun, v)`` and
|
||||
must have the following signature:
|
||||
``hess(x, v) -> {LinearOperator, sparse matrix, array_like}, shape (n, n)``.
|
||||
Here ``v`` is an ndarray with shape (m,) containing Lagrange multipliers.
|
||||
keep_feasible : array_like of bool, optional
|
||||
Whether to keep the constraint components feasible throughout
|
||||
iterations. A single value sets this property for all components.
|
||||
Default is False. Has no effect for equality constraints.
|
||||
finite_diff_rel_step: None or array_like, optional
|
||||
Relative step size for the finite difference approximation. Default is
|
||||
None, which will select a reasonable value automatically depending
|
||||
on the finite difference scheme.
|
||||
finite_diff_jac_sparsity: {None, array_like, sparse matrix}, optional
|
||||
Defines the sparsity structure of the Jacobian matrix for finite
|
||||
difference estimation, its shape must be (m, n). If the Jacobian has
|
||||
only a few non-zero elements in *each* row, providing the sparsity
|
||||
structure will greatly speed up the computations. A zero entry means
|
||||
that a corresponding element in the Jacobian is identically zero.
|
||||
If provided, forces the use of 'lsmr' trust-region solver.
|
||||
If None (default) then dense differencing will be used.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Finite difference schemes {'2-point', '3-point', 'cs'} may be used for
|
||||
approximating either the Jacobian or the Hessian. We, however, do not allow
|
||||
their use for approximating both simultaneously. Hence whenever the Jacobian
|
||||
is estimated via finite-differences, we require the Hessian to be estimated
|
||||
using one of the quasi-Newton strategies.
|
||||
|
||||
The scheme 'cs' is potentially the most accurate, but requires the function
|
||||
to correctly handle complex inputs and be analytically continuable to the
|
||||
complex plane. The scheme '3-point' is more accurate than '2-point' but
|
||||
requires twice as many operations.
|
||||
|
||||
Examples
|
||||
--------
|
||||
Constrain ``x[0] < sin(x[1]) + 1.9``
|
||||
|
||||
>>> from scipy.optimize import NonlinearConstraint
|
||||
>>> import numpy as np
|
||||
>>> con = lambda x: x[0] - np.sin(x[1])
|
||||
>>> nlc = NonlinearConstraint(con, -np.inf, 1.9)
|
||||
|
||||
"""
|
||||
def __init__(self, fun, lb, ub, jac='2-point', hess=BFGS(),
|
||||
keep_feasible=False, finite_diff_rel_step=None,
|
||||
finite_diff_jac_sparsity=None):
|
||||
self.fun = fun
|
||||
self.lb = lb
|
||||
self.ub = ub
|
||||
self.finite_diff_rel_step = finite_diff_rel_step
|
||||
self.finite_diff_jac_sparsity = finite_diff_jac_sparsity
|
||||
self.jac = jac
|
||||
self.hess = hess
|
||||
self.keep_feasible = keep_feasible
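# Illustrative sketch (not part of SciPy): how one pair of `lb`/`ub` arrays
# can mix equality, interval and one-sided components, as the docstring above
# describes. The helper `classify_bounds` and its labels are hypothetical.

```python
import math


def classify_bounds(lb, ub):
    """Label each component of ``lb <= fun(x) <= ub`` by the kind of constraint."""
    kinds = []
    for lo, hi in zip(lb, ub):
        if lo == hi:
            kinds.append('equality')
        elif math.isfinite(lo) and math.isfinite(hi):
            kinds.append('interval')
        elif math.isfinite(lo) or math.isfinite(hi):
            kinds.append('one-sided')
        else:
            kinds.append('unconstrained')
    return kinds


lb = [0.0, -math.inf, 1.0, -math.inf]
ub = [0.0, 1.9, 2.0, math.inf]
kinds = classify_bounds(lb, ub)
```

# The second component matches the docstring example, where
# ``-np.inf <= x[0] - sin(x[1]) <= 1.9`` expresses a one-sided constraint.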
|
||||
|
||||
|
||||
class LinearConstraint:
|
||||
"""Linear constraint on the variables.
|
||||
|
||||
The constraint has the general inequality form::
|
||||
|
||||
lb <= A.dot(x) <= ub
|
||||
|
||||
Here the vector of independent variables x is passed as ndarray of shape
|
||||
(n,) and the matrix A has shape (m, n).
|
||||
|
||||
It is possible to use equal bounds to represent an equality constraint or
|
||||
infinite bounds to represent a one-sided constraint.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : {array_like, sparse matrix}, shape (m, n)
|
||||
Matrix defining the constraint.
|
||||
lb, ub : dense array_like, optional
|
||||
Lower and upper limits on the constraint. Each array must have the
|
||||
shape (m,) or be a scalar, in the latter case a bound will be the same
|
||||
for all components of the constraint. Use ``np.inf`` with an
|
||||
appropriate sign to specify a one-sided constraint.
|
||||
Set components of `lb` and `ub` equal to represent an equality
|
||||
constraint. Note that you can mix constraints of different types:
|
||||
interval, one-sided or equality, by setting different components of
|
||||
`lb` and `ub` as necessary. Defaults to ``lb = -np.inf``
|
||||
and ``ub = np.inf`` (no limits).
|
||||
keep_feasible : dense array_like of bool, optional
|
||||
Whether to keep the constraint components feasible throughout
|
||||
iterations. A single value sets this property for all components.
|
||||
Default is False. Has no effect for equality constraints.
|
||||
"""
|
||||
def _input_validation(self):
|
||||
if self.A.ndim != 2:
|
||||
message = "`A` must have exactly two dimensions."
|
||||
raise ValueError(message)
|
||||
|
||||
try:
|
||||
shape = self.A.shape[0:1]
|
||||
self.lb = np.broadcast_to(self.lb, shape)
|
||||
self.ub = np.broadcast_to(self.ub, shape)
|
||||
self.keep_feasible = np.broadcast_to(self.keep_feasible, shape)
|
||||
except ValueError:
|
||||
message = ("`lb`, `ub`, and `keep_feasible` must be broadcastable "
|
||||
"to shape `A.shape[0:1]`")
|
||||
raise ValueError(message)
|
||||
|
||||
def __init__(self, A, lb=-np.inf, ub=np.inf, keep_feasible=False):
|
||||
if not issparse(A):
|
||||
# In some cases, if the constraint is not valid, this emits a
|
||||
# VisibleDeprecationWarning about ragged nested sequences
|
||||
# before eventually causing an error. `scipy.optimize.milp` would
|
||||
# prefer that this just error out immediately so it can handle it
|
||||
# rather than concerning the user.
|
||||
with catch_warnings():
|
||||
simplefilter("error")
|
||||
self.A = np.atleast_2d(A).astype(np.float64)
|
||||
else:
|
||||
self.A = A
|
||||
if issparse(lb) or issparse(ub):
|
||||
raise ValueError("Constraint limits must be dense arrays.")
|
||||
self.lb = np.atleast_1d(lb).astype(np.float64)
|
||||
self.ub = np.atleast_1d(ub).astype(np.float64)
|
||||
|
||||
if issparse(keep_feasible):
|
||||
raise ValueError("`keep_feasible` must be a dense array.")
|
||||
self.keep_feasible = np.atleast_1d(keep_feasible).astype(bool)
|
||||
self._input_validation()
|
||||
|
||||
def residual(self, x):
|
||||
"""
|
||||
Calculate the residual between the constraint function and the limits
|
||||
|
||||
For a linear constraint of the form::
|
||||
|
||||
lb <= A@x <= ub
|
||||
|
||||
the lower and upper residuals between ``A@x`` and the limits are values
|
||||
``sl`` and ``sb`` such that::
|
||||
|
||||
lb + sl == A@x == ub - sb
|
||||
|
||||
When all elements of ``sl`` and ``sb`` are nonnegative, all elements of
|
||||
the constraint are satisfied; a negative element in ``sl`` or ``sb``
|
||||
indicates that the corresponding element of the constraint is not
|
||||
satisfied.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x: array_like
|
||||
Vector of independent variables
|
||||
|
||||
Returns
|
||||
-------
|
||||
sl, sb : array-like
|
||||
The lower and upper residuals
|
||||
"""
|
||||
return self.A@x - self.lb, self.ub - self.A@x
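# Illustrative sketch (not part of SciPy): the residuals ``sl = A@x - lb`` and
# ``sb = ub - A@x`` computed with plain lists, to show how their signs are
# read. The helper `linear_residual` is hypothetical.

```python
import math


def linear_residual(A, x, lb, ub):
    """Lower and upper residuals of ``lb <= A @ x <= ub`` for list inputs."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    sl = [v - lo for v, lo in zip(Ax, lb)]
    sb = [hi - v for v, hi in zip(Ax, ub)]
    return sl, sb


A = [[1.0, 1.0], [1.0, -1.0]]
lb = [0.0, -math.inf]
ub = [2.0, 0.0]
sl, sb = linear_residual(A, [1.0, 2.0], lb, ub)
# A component is satisfied when both of its residuals are nonnegative.
satisfied = [s >= 0 and t >= 0 for s, t in zip(sl, sb)]
```

# Here ``A @ x`` is ``[3, -1]``: the first row exceeds its upper limit of 2,
# so its upper residual is negative, while the one-sided second row holds.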
|
||||
|
||||
|
||||
class Bounds:
|
||||
"""Bounds constraint on the variables.
|
||||
|
||||
The constraint has the general inequality form::
|
||||
|
||||
lb <= x <= ub
|
||||
|
||||
It is possible to use equal bounds to represent an equality constraint or
|
||||
infinite bounds to represent a one-sided constraint.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
lb, ub : dense array_like, optional
|
||||
Lower and upper bounds on independent variables. `lb`, `ub`, and
|
||||
`keep_feasible` must be the same shape or broadcastable.
|
||||
Set components of `lb` and `ub` equal
|
||||
to fix a variable. Use ``np.inf`` with an appropriate sign to disable
|
||||
bounds on all or some variables. Note that you can mix constraints of
|
||||
different types: interval, one-sided or equality, by setting different
|
||||
components of `lb` and `ub` as necessary. Defaults to ``lb = -np.inf``
|
||||
and ``ub = np.inf`` (no bounds).
|
||||
keep_feasible : dense array_like of bool, optional
|
||||
Whether to keep the constraint components feasible throughout
|
||||
iterations. Must be broadcastable with `lb` and `ub`.
|
||||
Default is False. Has no effect for equality constraints.
|
||||
"""
|
||||
def _input_validation(self):
|
||||
try:
|
||||
res = np.broadcast_arrays(self.lb, self.ub, self.keep_feasible)
|
||||
self.lb, self.ub, self.keep_feasible = res
|
||||
except ValueError:
|
||||
message = "`lb`, `ub`, and `keep_feasible` must be broadcastable."
|
||||
raise ValueError(message)
|
||||
|
||||
def __init__(self, lb=-np.inf, ub=np.inf, keep_feasible=False):
|
||||
if issparse(lb) or issparse(ub):
|
||||
raise ValueError("Lower and upper bounds must be dense arrays.")
|
||||
self.lb = np.atleast_1d(lb)
|
||||
self.ub = np.atleast_1d(ub)
|
||||
|
||||
if issparse(keep_feasible):
|
||||
raise ValueError("`keep_feasible` must be a dense array.")
|
||||
self.keep_feasible = np.atleast_1d(keep_feasible).astype(bool)
|
||||
self._input_validation()
|
||||
|
||||
def __repr__(self):
|
||||
start = f"{type(self).__name__}({self.lb!r}, {self.ub!r}"
|
||||
if np.any(self.keep_feasible):
|
||||
end = f", keep_feasible={self.keep_feasible!r})"
|
||||
else:
|
||||
end = ")"
|
||||
return start + end
|
||||
|
||||
def residual(self, x):
|
||||
"""Calculate the residual (slack) between the input and the bounds
|
||||
|
||||
For a bound constraint of the form::
|
||||
|
||||
lb <= x <= ub
|
||||
|
||||
the lower and upper residuals between `x` and the bounds are values
|
||||
``sl`` and ``sb`` such that::
|
||||
|
||||
lb + sl == x == ub - sb
|
||||
|
||||
When all elements of ``sl`` and ``sb`` are nonnegative, all elements of
|
||||
``x`` lie within the bounds; a negative element in ``sl`` or ``sb``
|
||||
indicates that the corresponding element of ``x`` is out of bounds.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x : array_like
|
||||
Vector of independent variables
|
||||
|
||||
Returns
|
||||
-------
|
||||
sl, sb : array-like
|
||||
The lower and upper residuals
|
||||
"""
|
||||
return x - self.lb, self.ub - x
|
||||
|
||||
|
||||
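The `Bounds` docstring says interval, one-sided and equality bounds can be mixed component by component. The sketch below illustrates that reading with a hypothetical `classify_bounds` helper. It is not part of SciPy, and it assumes `lb` and `ub` are already equal-length sequences.

```python
import math


def classify_bounds(lb, ub):
    """Label each (lb, ub) pair by the kind of bound it expresses."""
    kinds = []
    for lo, hi in zip(lb, ub):
        if lo == hi:
            kinds.append("fixed")       # equal bounds pin the variable
        elif lo == -math.inf and hi == math.inf:
            kinds.append("free")        # no bound on either side
        elif lo == -math.inf or hi == math.inf:
            kinds.append("one-sided")   # exactly one side is bounded
        else:
            kinds.append("interval")    # finite lower and upper bounds
    return kinds


kinds = classify_bounds(
    [0.0, 2.0, -math.inf, -math.inf],
    [1.0, 2.0, 5.0, math.inf],
)
```

Here `kinds` evaluates to `['interval', 'fixed', 'one-sided', 'free']`.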
class PreparedConstraint:
|
||||
"""Constraint prepared from a user defined constraint.
|
||||
|
||||
On creation it will check whether a constraint definition is valid and
|
||||
the initial point is feasible. If created successfully, it will contain
|
||||
the attributes listed below.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
constraint : {NonlinearConstraint, LinearConstraint, Bounds}
|
||||
Constraint to check and prepare.
|
||||
x0 : array_like
|
||||
Initial vector of independent variables.
|
||||
sparse_jacobian : bool or None, optional
|
||||
If bool, then the Jacobian of the constraint will be converted
|
||||
to the corresponding format if necessary. If None (default), such
|
||||
conversion is not made.
|
||||
finite_diff_bounds : 2-tuple, optional
|
||||
Lower and upper bounds on the independent variables for the finite
|
||||
difference approximation, if applicable. Defaults to no bounds.
|
||||
|
||||
Attributes
|
||||
----------
|
||||
fun : {VectorFunction, LinearVectorFunction, IdentityVectorFunction}
|
||||
Function defining the constraint wrapped by one of the convenience
|
||||
classes.
|
||||
bounds : 2-tuple
|
||||
Contains lower and upper bounds for the constraints --- lb and ub.
|
||||
These are converted to ndarray and have a size equal to the number of
|
||||
the constraints.
|
||||
keep_feasible : ndarray
|
||||
Array indicating which components must be kept feasible with a size
|
||||
equal to the number of the constraints.
|
||||
"""
|
||||
def __init__(self, constraint, x0, sparse_jacobian=None,
|
||||
finite_diff_bounds=(-np.inf, np.inf)):
|
||||
if isinstance(constraint, NonlinearConstraint):
|
||||
fun = VectorFunction(constraint.fun, x0,
|
||||
constraint.jac, constraint.hess,
|
||||
constraint.finite_diff_rel_step,
|
||||
constraint.finite_diff_jac_sparsity,
|
||||
finite_diff_bounds, sparse_jacobian)
|
||||
elif isinstance(constraint, LinearConstraint):
|
||||
fun = LinearVectorFunction(constraint.A, x0, sparse_jacobian)
|
||||
elif isinstance(constraint, Bounds):
|
||||
fun = IdentityVectorFunction(x0, sparse_jacobian)
|
||||
else:
|
||||
raise ValueError("`constraint` of an unknown type is passed.")
|
||||
|
||||
m = fun.m
|
||||
|
||||
lb = np.asarray(constraint.lb, dtype=float)
|
||||
ub = np.asarray(constraint.ub, dtype=float)
|
||||
keep_feasible = np.asarray(constraint.keep_feasible, dtype=bool)
|
||||
|
||||
lb = np.broadcast_to(lb, m)
|
||||
ub = np.broadcast_to(ub, m)
|
||||
keep_feasible = np.broadcast_to(keep_feasible, m)
|
||||
|
||||
if keep_feasible.shape != (m,):
|
||||
raise ValueError("`keep_feasible` has a wrong shape.")
|
||||
|
||||
mask = keep_feasible & (lb != ub)
|
||||
f0 = fun.f
|
||||
if np.any(f0[mask] < lb[mask]) or np.any(f0[mask] > ub[mask]):
|
||||
raise ValueError("`x0` is infeasible with respect to some "
|
||||
"inequality constraint with `keep_feasible` "
|
||||
"set to True.")
|
||||
|
||||
self.fun = fun
|
||||
self.bounds = (lb, ub)
|
||||
self.keep_feasible = keep_feasible
|
||||
|
||||
def violation(self, x):
|
||||
"""How much the constraint is exceeded by.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x : array-like
|
||||
Vector of independent variables
|
||||
|
||||
Returns
|
||||
-------
|
||||
excess : array-like
|
||||
How much the constraint is exceeded by, for each of the
|
||||
constraints specified by `PreparedConstraint.fun`.
|
||||
"""
|
||||
with catch_warnings():
|
||||
# Ignore the following warning, it's not important when
|
||||
# figuring out total violation
|
||||
# UserWarning: delta_grad == 0.0. Check if the approximated
|
||||
# function is linear
|
||||
filterwarnings("ignore", "delta_grad", UserWarning)
|
||||
ev = self.fun.fun(np.asarray(x))
|
||||
|
||||
excess_lb = np.maximum(self.bounds[0] - ev, 0)
|
||||
excess_ub = np.maximum(ev - self.bounds[1], 0)
|
||||
|
||||
return excess_lb + excess_ub
|
||||
|
||||
|
||||
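The arithmetic in `violation` can be checked by hand. The following is a plain-Python sketch of the same formula, assuming the constraint values have already been evaluated. The function name `violation_sketch` is illustrative only.

```python
import math


def violation_sketch(values, lb, ub):
    """Per-component excess: how far each value lies outside [lb, ub]."""
    return [
        max(lo - v, 0.0) + max(v - hi, 0.0)  # at most one term is nonzero
        for v, lo, hi in zip(values, lb, ub)
    ]


excess = violation_sketch(
    [-1.0, 0.5, 3.0],
    lb=[0.0, 0.0, -math.inf],
    ub=[1.0, 1.0, 2.0],
)
```

The first value is 1 below its lower bound, the second is feasible, and the third is 1 above its upper bound, so `excess` is `[1.0, 0.0, 1.0]`.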
def new_bounds_to_old(lb, ub, n):
|
||||
"""Convert the new bounds representation to the old one.
|
||||
|
||||
The new representation is a tuple (lb, ub) and the old one is a list
|
||||
containing n tuples, the ith containing the lower and upper bound on the ith
|
||||
variable.
|
||||
If any of the entries in lb/ub are -np.inf/np.inf they are replaced by
|
||||
None.
|
||||
"""
|
||||
lb = np.broadcast_to(lb, n)
|
||||
ub = np.broadcast_to(ub, n)
|
||||
|
||||
lb = [float(x) if x > -np.inf else None for x in lb]
|
||||
ub = [float(x) if x < np.inf else None for x in ub]
|
||||
|
||||
return list(zip(lb, ub))
|
||||
|
||||
|
||||
def old_bound_to_new(bounds):
|
||||
"""Convert the old bounds representation to the new one.
|
||||
|
||||
The new representation is a tuple (lb, ub) and the old one is a list
|
||||
containing n tuples, the ith containing the lower and upper bound on the ith
|
||||
variable.
|
||||
If any of the entries in lb/ub are None they are replaced by
|
||||
-np.inf/np.inf.
|
||||
"""
|
||||
lb, ub = zip(*bounds)
|
||||
|
||||
# Convert occurrences of None to -inf or inf, and replace occurrences of
|
||||
# any numpy array x with x.item(). Then wrap the results in numpy arrays.
|
||||
lb = np.array([float(_arr_to_scalar(x)) if x is not None else -np.inf
|
||||
for x in lb])
|
||||
ub = np.array([float(_arr_to_scalar(x)) if x is not None else np.inf
|
||||
for x in ub])
|
||||
|
||||
return lb, ub
|
||||
|
||||
|
||||
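The two bound representations are easy to confuse, so here is a small round trip. It is a sketch in plain Python under the assumption that `lb` and `ub` already have one entry per variable. `to_old` and `to_new` are hypothetical names, not the functions defined above.

```python
import math


def to_old(lb, ub):
    """(lb, ub) sequences -> list of (low, high) pairs, None for 'no bound'."""
    return [
        (lo if lo > -math.inf else None, hi if hi < math.inf else None)
        for lo, hi in zip(lb, ub)
    ]


def to_new(bounds):
    """List of (low, high) pairs -> (lb, ub) lists, infinities for None."""
    lb = [-math.inf if lo is None else float(lo) for lo, _ in bounds]
    ub = [math.inf if hi is None else float(hi) for _, hi in bounds]
    return lb, ub


old = to_old([0.0, -math.inf], [1.0, math.inf])
new = to_new(old)
```

`old` is `[(0.0, 1.0), (None, None)]`, and converting back recovers the original infinities.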
def strict_bounds(lb, ub, keep_feasible, n_vars):
|
||||
"""Remove bounds which are not asked to be kept feasible."""
|
||||
strict_lb = np.resize(lb, n_vars).astype(float)
|
||||
strict_ub = np.resize(ub, n_vars).astype(float)
|
||||
keep_feasible = np.resize(keep_feasible, n_vars)
|
||||
strict_lb[~keep_feasible] = -np.inf
|
||||
strict_ub[~keep_feasible] = np.inf
|
||||
return strict_lb, strict_ub
|
||||
|
||||
|
||||
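As an illustration of what `strict_bounds` returns, the sketch below relaxes every bound whose `keep_feasible` flag is false. It is a simplified stand-in that assumes all three inputs already have one entry per variable, whereas the function above resizes them with `np.resize`. The name `strict_bounds_sketch` is hypothetical.

```python
import math


def strict_bounds_sketch(lb, ub, keep_feasible):
    """Keep a bound only where feasibility must be maintained."""
    strict_lb = [lo if keep else -math.inf for lo, keep in zip(lb, keep_feasible)]
    strict_ub = [hi if keep else math.inf for hi, keep in zip(ub, keep_feasible)]
    return strict_lb, strict_ub


strict_lb, strict_ub = strict_bounds_sketch(
    lb=[0.0, 0.0], ub=[1.0, 1.0], keep_feasible=[True, False]
)
```

Only the first variable keeps its bounds. The second becomes unbounded on both sides.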
def new_constraint_to_old(con, x0):
|
||||
"""
|
||||
Converts new-style constraint objects to old-style constraint dictionaries.
|
||||
"""
|
||||
if isinstance(con, NonlinearConstraint):
|
||||
if (con.finite_diff_jac_sparsity is not None or
|
||||
con.finite_diff_rel_step is not None or
|
||||
not isinstance(con.hess, BFGS) or # misses user specified BFGS
|
||||
np.any(con.keep_feasible)):
|
||||
warn("Constraint options `finite_diff_jac_sparsity`, "
"`finite_diff_rel_step`, `keep_feasible`, and `hess` "
"are ignored by this method.",
|
||||
OptimizeWarning, stacklevel=3)
|
||||
|
||||
fun = con.fun
|
||||
if callable(con.jac):
|
||||
jac = con.jac
|
||||
else:
|
||||
jac = None
|
||||
|
||||
else: # LinearConstraint
|
||||
if np.any(con.keep_feasible):
|
||||
warn("Constraint option `keep_feasible` is ignored by this method.",
|
||||
OptimizeWarning, stacklevel=3)
|
||||
|
||||
A = con.A
|
||||
if issparse(A):
|
||||
A = A.toarray()
|
||||
def fun(x):
|
||||
return np.dot(A, x)
|
||||
def jac(x):
|
||||
return A
|
||||
|
||||
# FIXME: when bugs in VectorFunction/LinearVectorFunction are worked out,
|
||||
# use pcon.fun.fun and pcon.fun.jac. Until then, get fun/jac above.
|
||||
pcon = PreparedConstraint(con, x0)
|
||||
lb, ub = pcon.bounds
|
||||
|
||||
i_eq = lb == ub
|
||||
i_bound_below = np.logical_xor(lb != -np.inf, i_eq)
|
||||
i_bound_above = np.logical_xor(ub != np.inf, i_eq)
|
||||
i_unbounded = np.logical_and(lb == -np.inf, ub == np.inf)
|
||||
|
||||
if np.any(i_unbounded):
|
||||
warn("At least one constraint is unbounded above and below. Such "
|
||||
"constraints are ignored.",
|
||||
OptimizeWarning, stacklevel=3)
|
||||
|
||||
ceq = []
|
||||
if np.any(i_eq):
|
||||
def f_eq(x):
|
||||
y = np.array(fun(x)).flatten()
|
||||
return y[i_eq] - lb[i_eq]
|
||||
ceq = [{"type": "eq", "fun": f_eq}]
|
||||
|
||||
if jac is not None:
|
||||
def j_eq(x):
|
||||
dy = jac(x)
|
||||
if issparse(dy):
|
||||
dy = dy.toarray()
|
||||
dy = np.atleast_2d(dy)
|
||||
return dy[i_eq, :]
|
||||
ceq[0]["jac"] = j_eq
|
||||
|
||||
cineq = []
|
||||
n_bound_below = np.sum(i_bound_below)
|
||||
n_bound_above = np.sum(i_bound_above)
|
||||
if n_bound_below + n_bound_above:
|
||||
def f_ineq(x):
|
||||
y = np.zeros(n_bound_below + n_bound_above)
|
||||
y_all = np.array(fun(x)).flatten()
|
||||
y[:n_bound_below] = y_all[i_bound_below] - lb[i_bound_below]
|
||||
y[n_bound_below:] = -(y_all[i_bound_above] - ub[i_bound_above])
|
||||
return y
|
||||
cineq = [{"type": "ineq", "fun": f_ineq}]
|
||||
|
||||
if jac is not None:
|
||||
def j_ineq(x):
|
||||
dy = np.zeros((n_bound_below + n_bound_above, len(x0)))
|
||||
dy_all = jac(x)
|
||||
if issparse(dy_all):
|
||||
dy_all = dy_all.toarray()
|
||||
dy_all = np.atleast_2d(dy_all)
|
||||
dy[:n_bound_below, :] = dy_all[i_bound_below]
|
||||
dy[n_bound_below:, :] = -dy_all[i_bound_above]
|
||||
return dy
|
||||
cineq[0]["jac"] = j_ineq
|
||||
|
||||
old_constraints = ceq + cineq
|
||||
|
||||
if len(old_constraints) > 1:
|
||||
warn("Equality and inequality constraints are specified in the same "
|
||||
"element of the constraint list. For efficient use with this "
|
||||
"method, equality and inequality constraints should be specified "
|
||||
"in separate elements of the constraint list. ",
|
||||
OptimizeWarning, stacklevel=3)
|
||||
return old_constraints
|
||||
|
||||
|
||||
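The index masks in `new_constraint_to_old` decide which rows become equality constraints and which become inequalities of the form `g(x) >= 0`. The following plain-Python sketch mirrors that splitting for functions only, with no Jacobians. `split_constraint` and `g` are hypothetical names, and lists stand in for arrays.

```python
import math


def split_constraint(fun, lb, ub):
    """Split lb <= fun(x) <= ub into old-style 'eq' / 'ineq' dictionaries."""
    m = len(lb)
    i_eq = [lo == hi for lo, hi in zip(lb, ub)]
    # A row is bounded below/above if its limit is finite and it is not an equality.
    i_below = [(lo != -math.inf) != eq for lo, eq in zip(lb, i_eq)]
    i_above = [(hi != math.inf) != eq for hi, eq in zip(ub, i_eq)]

    def f_eq(x):
        y = fun(x)
        return [y[i] - lb[i] for i in range(m) if i_eq[i]]

    def f_ineq(x):
        y = fun(x)
        below = [y[i] - lb[i] for i in range(m) if i_below[i]]  # y - lb >= 0
        above = [ub[i] - y[i] for i in range(m) if i_above[i]]  # ub - y >= 0
        return below + above

    old = []
    if any(i_eq):
        old.append({"type": "eq", "fun": f_eq})
    if any(i_below) or any(i_above):
        old.append({"type": "ineq", "fun": f_ineq})
    return old


def g(x):
    return [x[0] + x[1], x[0] - x[1], x[0]]


old = split_constraint(g, lb=[1.0, 0.0, -math.inf], ub=[1.0, 2.0, 3.0])
x = [1.0, 0.5]
eq_values = old[0]["fun"](x)
ineq_values = old[1]["fun"](x)
```

Row 0 is an equality, row 1 contributes both a lower and an upper inequality, and row 2 contributes only an upper one. Mixing the two kinds in a single constraint object is what triggers the efficiency warning above.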
def old_constraint_to_new(ic, con):
|
||||
"""
|
||||
Converts old-style constraint dictionaries to new-style constraint objects.
|
||||
"""
|
||||
# check type
|
||||
try:
|
||||
ctype = con['type'].lower()
|
||||
except KeyError as e:
|
||||
raise KeyError('Constraint %d has no type defined.' % ic) from e
|
||||
except TypeError as e:
|
||||
raise TypeError(
|
||||
'Constraints must be a sequence of dictionaries.'
|
||||
) from e
|
||||
except AttributeError as e:
|
||||
raise TypeError("Constraint's type must be a string.") from e
|
||||
else:
|
||||
if ctype not in ['eq', 'ineq']:
|
||||
raise ValueError("Unknown constraint type '%s'." % con['type'])
|
||||
if 'fun' not in con:
|
||||
raise ValueError('Constraint %d has no function defined.' % ic)
|
||||
|
||||
lb = 0
|
||||
if ctype == 'eq':
|
||||
ub = 0
|
||||
else:
|
||||
ub = np.inf
|
||||
|
||||
jac = '2-point'
|
||||
if 'args' in con:
|
||||
args = con['args']
|
||||
def fun(x):
|
||||
return con["fun"](x, *args)
|
||||
if 'jac' in con:
|
||||
def jac(x):
|
||||
return con["jac"](x, *args)
|
||||
else:
|
||||
fun = con['fun']
|
||||
if 'jac' in con:
|
||||
jac = con['jac']
|
||||
|
||||
return NonlinearConstraint(fun, lb, ub, jac)
|
||||
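The reverse conversion reduces to a small lookup: an `'eq'` dictionary becomes the limits `(0, 0)` and an `'ineq'` dictionary becomes `(0, inf)`, with any `args` bound into the function. The sketch below shows only that mapping. `old_to_limits` is a hypothetical helper, and it returns a plain tuple rather than a `NonlinearConstraint`.

```python
import math


def old_to_limits(con):
    """Return (fun, lb, ub) for an old-style constraint dictionary."""
    ctype = con["type"].lower()
    if ctype not in ("eq", "ineq"):
        raise ValueError(f"Unknown constraint type {con['type']!r}.")
    args = con.get("args", ())

    def fun(x):
        # Bind the extra positional arguments once, as the conversion above does.
        return con["fun"](x, *args)

    ub = 0.0 if ctype == "eq" else math.inf
    return fun, 0.0, ub


fun, lb, ub = old_to_limits(
    {"type": "ineq", "fun": lambda x, c: x[0] - c, "args": (2.0,)}
)
value = fun([5.0])
```

With `c = 2.0` the wrapped function returns `3.0` at `x = [5.0]`, which satisfies `0 <= fun(x) <= inf`.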
728
venv/lib/python3.12/site-packages/scipy/optimize/_dcsrch.py
Normal file
@ -0,0 +1,728 @@
|
||||
import numpy as np
|
||||
|
||||
"""
|
||||
# 2023 - ported from minpack2.dcsrch, dcstep (Fortran) to Python
|
||||
c MINPACK-1 Project. June 1983.
|
||||
c Argonne National Laboratory.
|
||||
c Jorge J. More' and David J. Thuente.
|
||||
c
|
||||
c MINPACK-2 Project. November 1993.
|
||||
c Argonne National Laboratory and University of Minnesota.
|
||||
c Brett M. Averick, Richard G. Carter, and Jorge J. More'.
|
||||
"""
|
||||
|
||||
# NOTE this file was linted by black on first commit, and can be kept that way.
|
||||
|
||||
|
||||
class DCSRCH:
|
||||
"""
|
||||
Parameters
|
||||
----------
|
||||
phi : callable phi(alpha)
|
||||
Function at point `alpha`
|
||||
derphi : callable phi'(alpha)
|
||||
Objective function derivative. Returns a scalar.
|
||||
ftol : float
|
||||
A nonnegative tolerance for the sufficient decrease condition.
|
||||
gtol : float
|
||||
A nonnegative tolerance for the curvature condition.
|
||||
xtol : float
|
||||
A nonnegative relative tolerance for an acceptable step. The
|
||||
subroutine exits with a warning if the relative difference between
|
||||
sty and stx is less than xtol.
|
||||
stpmin : float
|
||||
A nonnegative lower bound for the step.
|
||||
stpmax : float
|
||||
A nonnegative upper bound for the step.
|
||||
|
||||
Notes
|
||||
-----
|
||||
|
||||
This subroutine finds a step that satisfies a sufficient
|
||||
decrease condition and a curvature condition.
|
||||
|
||||
Each call of the subroutine updates an interval with
|
||||
endpoints stx and sty. The interval is initially chosen
|
||||
so that it contains a minimizer of the modified function
|
||||
|
||||
psi(stp) = f(stp) - f(0) - ftol*stp*f'(0).
|
||||
|
||||
If psi(stp) <= 0 and f'(stp) >= 0 for some step, then the
|
||||
interval is chosen so that it contains a minimizer of f.
|
||||
|
||||
The algorithm is designed to find a step that satisfies
|
||||
the sufficient decrease condition
|
||||
|
||||
f(stp) <= f(0) + ftol*stp*f'(0),
|
||||
|
||||
and the curvature condition
|
||||
|
||||
abs(f'(stp)) <= gtol*abs(f'(0)).
|
||||
|
||||
If ftol is less than gtol and if, for example, the function
|
||||
is bounded below, then there is always a step which satisfies
|
||||
both conditions.
|
||||
|
||||
If no step can be found that satisfies both conditions, then
|
||||
the algorithm stops with a warning. In this case stp only
|
||||
satisfies the sufficient decrease condition.
|
||||
|
||||
A typical invocation of dcsrch has the following outline:
|
||||
|
||||
Evaluate the function at stp = 0.0d0; store in f.
|
||||
Evaluate the gradient at stp = 0.0d0; store in g.
|
||||
Choose a starting step stp.
|
||||
|
||||
task = 'START'
|
||||
10 continue
|
||||
call dcsrch(stp,f,g,ftol,gtol,xtol,task,stpmin,stpmax,
|
||||
isave,dsave)
|
||||
if (task .eq. 'FG') then
|
||||
Evaluate the function and the gradient at stp
|
||||
go to 10
|
||||
end if
|
||||
|
||||
NOTE: The user must not alter work arrays between calls.
|
||||
|
||||
The subroutine statement is
|
||||
|
||||
subroutine dcsrch(f,g,stp,ftol,gtol,xtol,stpmin,stpmax,
|
||||
task,isave,dsave)
|
||||
where
|
||||
|
||||
stp is a double precision variable.
|
||||
On entry stp is the current estimate of a satisfactory
|
||||
step. On initial entry, a positive initial estimate
|
||||
must be provided.
|
||||
On exit stp is the current estimate of a satisfactory step
|
||||
if task = 'FG'. If task = 'CONV' then stp satisfies
|
||||
the sufficient decrease and curvature condition.
|
||||
|
||||
f is a double precision variable.
|
||||
On initial entry f is the value of the function at 0.
|
||||
On subsequent entries f is the value of the
|
||||
function at stp.
|
||||
On exit f is the value of the function at stp.
|
||||
|
||||
g is a double precision variable.
|
||||
On initial entry g is the derivative of the function at 0.
|
||||
On subsequent entries g is the derivative of the
|
||||
function at stp.
|
||||
On exit g is the derivative of the function at stp.
|
||||
|
||||
ftol is a double precision variable.
|
||||
On entry ftol specifies a nonnegative tolerance for the
|
||||
sufficient decrease condition.
|
||||
On exit ftol is unchanged.
|
||||
|
||||
gtol is a double precision variable.
|
||||
On entry gtol specifies a nonnegative tolerance for the
|
||||
curvature condition.
|
||||
On exit gtol is unchanged.
|
||||
|
||||
xtol is a double precision variable.
|
||||
On entry xtol specifies a nonnegative relative tolerance
|
||||
for an acceptable step. The subroutine exits with a
|
||||
warning if the relative difference between sty and stx
|
||||
is less than xtol.
|
||||
|
||||
On exit xtol is unchanged.
|
||||
|
||||
task is a character variable of length at least 60.
|
||||
On initial entry task must be set to 'START'.
|
||||
On exit task indicates the required action:
|
||||
|
||||
If task(1:2) = 'FG' then evaluate the function and
|
||||
derivative at stp and call dcsrch again.
|
||||
|
||||
If task(1:4) = 'CONV' then the search is successful.
|
||||
|
||||
If task(1:4) = 'WARN' then the subroutine is not able
|
||||
to satisfy the convergence conditions. The exit value of
|
||||
stp contains the best point found during the search.
|
||||
|
||||
If task(1:5) = 'ERROR' then there is an error in the
|
||||
input arguments.
|
||||
|
||||
On exit with convergence, a warning or an error, the
|
||||
variable task contains additional information.
|
||||
|
||||
stpmin is a double precision variable.
|
||||
On entry stpmin is a nonnegative lower bound for the step.
|
||||
On exit stpmin is unchanged.
|
||||
|
||||
stpmax is a double precision variable.
|
||||
On entry stpmax is a nonnegative upper bound for the step.
|
||||
On exit stpmax is unchanged.
|
||||
|
||||
isave is an integer work array of dimension 2.
|
||||
|
||||
dsave is a double precision work array of dimension 13.
|
||||
|
||||
Subprograms called
|
||||
|
||||
MINPACK-2 ... dcstep
|
||||
MINPACK-1 Project. June 1983.
|
||||
Argonne National Laboratory.
|
||||
Jorge J. More' and David J. Thuente.
|
||||
|
||||
MINPACK-2 Project. November 1993.
|
||||
Argonne National Laboratory and University of Minnesota.
|
||||
Brett M. Averick, Richard G. Carter, and Jorge J. More'.
|
||||
"""
|
||||
|
||||
def __init__(self, phi, derphi, ftol, gtol, xtol, stpmin, stpmax):
|
||||
self.stage = None
|
||||
self.ginit = None
|
||||
self.gtest = None
|
||||
self.gx = None
|
||||
self.gy = None
|
||||
self.finit = None
|
||||
self.fx = None
|
||||
self.fy = None
|
||||
self.stx = None
|
||||
self.sty = None
|
||||
self.stmin = None
|
||||
self.stmax = None
|
||||
self.width = None
|
||||
self.width1 = None
|
||||
|
||||
# leave all assessment of tolerances/limits to the first call of
|
||||
# this object
|
||||
self.ftol = ftol
|
||||
self.gtol = gtol
|
||||
self.xtol = xtol
|
||||
self.stpmin = stpmin
|
||||
self.stpmax = stpmax
|
||||
|
||||
self.phi = phi
|
||||
self.derphi = derphi
|
||||
|
||||
def __call__(self, alpha1, phi0=None, derphi0=None, maxiter=100):
|
||||
"""
|
||||
Parameters
|
||||
----------
|
||||
alpha1 : float
|
||||
alpha1 is the current estimate of a satisfactory
|
||||
step. A positive initial estimate must be provided.
|
||||
phi0 : float
|
||||
the value of `phi` at 0 (if known).
|
||||
derphi0 : float
|
||||
the derivative of `phi` at 0 (if known).
|
||||
maxiter : int
Maximum number of iterations of the line search.
|
||||
|
||||
Returns
|
||||
-------
|
||||
alpha : float
|
||||
Step size, or None if no suitable step was found.
|
||||
phi : float
|
||||
Value of `phi` at the new point `alpha`.
|
||||
phi0 : float
|
||||
Value of `phi` at `alpha=0`.
|
||||
task : bytes
|
||||
On exit task indicates status information.
|
||||
|
||||
If task[:4] == b'CONV' then the search is successful.
|
||||
|
||||
If task[:4] == b'WARN' then the subroutine is not able
|
||||
to satisfy the convergence conditions. The exit value of
|
||||
stp contains the best point found during the search.
|
||||
|
||||
If task[:5] == b'ERROR' then there is an error in the
|
||||
input arguments.
|
||||
"""
|
||||
if phi0 is None:
|
||||
phi0 = self.phi(0.0)
|
||||
if derphi0 is None:
|
||||
derphi0 = self.derphi(0.0)
|
||||
|
||||
phi1 = phi0
|
||||
derphi1 = derphi0
|
||||
|
||||
task = b"START"
|
||||
for i in range(maxiter):
|
||||
stp, phi1, derphi1, task = self._iterate(
|
||||
alpha1, phi1, derphi1, task
|
||||
)
|
||||
|
||||
if not np.isfinite(stp):
|
||||
task = b"WARN"
|
||||
stp = None
|
||||
break
|
||||
|
||||
if task[:2] == b"FG":
|
||||
alpha1 = stp
|
||||
phi1 = self.phi(stp)
|
||||
derphi1 = self.derphi(stp)
|
||||
else:
|
||||
break
|
||||
else:
|
||||
# maxiter reached, the line search did not converge
|
||||
stp = None
|
||||
task = b"WARNING: dcsrch did not converge within max iterations"
|
||||
|
||||
if task[:5] == b"ERROR" or task[:4] == b"WARN":
|
||||
stp = None # failed
|
||||
|
||||
return stp, phi1, phi0, task
|
||||
|
||||
def _iterate(self, stp, f, g, task):
|
||||
"""
|
||||
Parameters
|
||||
----------
|
||||
stp : float
|
||||
The current estimate of a satisfactory step. On initial entry, a
|
||||
positive initial estimate must be provided.
|
||||
f : float
|
||||
On first call f is the value of the function at 0. On subsequent
|
||||
entries f should be the value of the function at stp.
|
||||
g : float
|
||||
On initial entry g is the derivative of the function at 0. On
|
||||
subsequent entries g is the derivative of the function at stp.
|
||||
task : bytes
|
||||
On initial entry task must be set to 'START'.
|
||||
|
||||
On exit with convergence, a warning or an error, the
|
||||
variable task contains additional information.
|
||||
|
||||
|
||||
Returns
|
||||
-------
|
||||
stp, f, g, task: tuple
|
||||
|
||||
stp : float
|
||||
the current estimate of a satisfactory step if task = 'FG'. If
|
||||
task = 'CONV' then stp satisfies the sufficient decrease and
|
||||
curvature condition.
|
||||
f : float
|
||||
the value of the function at stp.
|
||||
g : float
|
||||
the derivative of the function at stp.
|
||||
task : bytes
|
||||
On exit task indicates the required action:
|
||||
|
||||
If task[:2] == b'FG' then evaluate the function and
|
||||
derivative at stp and call dcsrch again.
|
||||
|
||||
If task[:4] == b'CONV' then the search is successful.
|
||||
|
||||
If task[:4] == b'WARN' then the subroutine is not able
|
||||
to satisfy the convergence conditions. The exit value of
|
||||
stp contains the best point found during the search.
|
||||
|
||||
If task[:5] == b'ERROR' then there is an error in the
|
||||
input arguments.
|
||||
"""
|
||||
p5 = 0.5
|
||||
p66 = 0.66
|
||||
xtrapl = 1.1
|
||||
xtrapu = 4.0
|
||||
|
||||
if task[:5] == b"START":
|
||||
if stp < self.stpmin:
|
||||
task = b"ERROR: STP .LT. STPMIN"
|
||||
if stp > self.stpmax:
|
||||
task = b"ERROR: STP .GT. STPMAX"
|
||||
if g >= 0:
|
||||
task = b"ERROR: INITIAL G .GE. ZERO"
|
||||
if self.ftol < 0:
|
||||
task = b"ERROR: FTOL .LT. ZERO"
|
||||
if self.gtol < 0:
|
||||
task = b"ERROR: GTOL .LT. ZERO"
|
||||
if self.xtol < 0:
|
||||
task = b"ERROR: XTOL .LT. ZERO"
|
||||
if self.stpmin < 0:
|
||||
task = b"ERROR: STPMIN .LT. ZERO"
|
||||
if self.stpmax < self.stpmin:
|
||||
task = b"ERROR: STPMAX .LT. STPMIN"
|
||||
|
||||
if task[:5] == b"ERROR":
|
||||
return stp, f, g, task
|
||||
|
||||
# Initialize local variables.
|
||||
|
||||
self.brackt = False
|
||||
self.stage = 1
|
||||
self.finit = f
|
||||
self.ginit = g
|
||||
self.gtest = self.ftol * self.ginit
|
||||
self.width = self.stpmax - self.stpmin
|
||||
self.width1 = self.width / p5
|
||||
|
||||
# The variables stx, fx, gx contain the values of the step,
|
||||
# function, and derivative at the best step.
|
||||
# The variables sty, fy, gy contain the value of the step,
|
||||
# function, and derivative at sty.
|
||||
# The variables stp, f, g contain the values of the step,
|
||||
# function, and derivative at stp.
|
||||
|
||||
self.stx = 0.0
|
||||
self.fx = self.finit
|
||||
self.gx = self.ginit
|
||||
self.sty = 0.0
|
||||
self.fy = self.finit
|
||||
self.gy = self.ginit
|
||||
self.stmin = 0
|
||||
self.stmax = stp + xtrapu * stp
|
||||
task = b"FG"
|
||||
return stp, f, g, task
|
||||
|
||||
# in the original Fortran this was a location to restore variables
|
||||
# we don't need to do that because they're attributes.
|
||||
|
||||
# If psi(stp) <= 0 and f'(stp) >= 0 for some step, then the
|
||||
# algorithm enters the second stage.
|
||||
ftest = self.finit + stp * self.gtest
|
||||
|
||||
if self.stage == 1 and f <= ftest and g >= 0:
|
||||
self.stage = 2
|
||||
|
||||
# test for warnings
|
||||
if self.brackt and (stp <= self.stmin or stp >= self.stmax):
|
||||
task = b"WARNING: ROUNDING ERRORS PREVENT PROGRESS"
|
||||
if self.brackt and self.stmax - self.stmin <= self.xtol * self.stmax:
|
||||
task = b"WARNING: XTOL TEST SATISFIED"
|
||||
if stp == self.stpmax and f <= ftest and g <= self.gtest:
|
||||
task = b"WARNING: STP = STPMAX"
|
||||
if stp == self.stpmin and (f > ftest or g >= self.gtest):
|
||||
task = b"WARNING: STP = STPMIN"
|
||||
|
||||
# test for convergence
|
||||
if f <= ftest and abs(g) <= self.gtol * -self.ginit:
|
||||
task = b"CONVERGENCE"
|
||||
|
||||
# test for termination
|
||||
if task[:4] == b"WARN" or task[:4] == b"CONV":
|
||||
return stp, f, g, task
|
||||
|
||||
# A modified function is used to predict the step during the
|
||||
# first stage if a lower function value has been obtained but
|
||||
# the decrease is not sufficient.
|
||||
if self.stage == 1 and f <= self.fx and f > ftest:
|
||||
# Define the modified function and derivative values.
|
||||
fm = f - stp * self.gtest
|
||||
fxm = self.fx - self.stx * self.gtest
|
||||
fym = self.fy - self.sty * self.gtest
|
||||
gm = g - self.gtest
|
||||
gxm = self.gx - self.gtest
|
||||
gym = self.gy - self.gtest
|
||||
|
||||
# Call dcstep to update stx, sty, and to compute the new step.
|
||||
# dcstep can have several operations which can produce NaN
|
||||
# e.g. inf/inf. Filter these out.
|
||||
with np.errstate(invalid="ignore", over="ignore"):
|
||||
tup = dcstep(
|
||||
self.stx,
|
||||
fxm,
|
||||
gxm,
|
||||
self.sty,
|
||||
fym,
|
||||
gym,
|
||||
stp,
|
||||
fm,
|
||||
gm,
|
||||
self.brackt,
|
||||
self.stmin,
|
||||
self.stmax,
|
||||
)
|
||||
self.stx, fxm, gxm, self.sty, fym, gym, stp, self.brackt = tup
|
||||
|
||||
# Reset the function and derivative values for f
|
||||
self.fx = fxm + self.stx * self.gtest
|
||||
self.fy = fym + self.sty * self.gtest
|
||||
self.gx = gxm + self.gtest
|
||||
self.gy = gym + self.gtest
|
||||
|
||||
else:
|
||||
# Call dcstep to update stx, sty, and to compute the new step.
|
||||
# dcstep can have several operations which can produce NaN
|
||||
# e.g. inf/inf. Filter these out.
|
||||
|
||||
with np.errstate(invalid="ignore", over="ignore"):
|
||||
tup = dcstep(
|
||||
self.stx,
|
||||
self.fx,
|
||||
self.gx,
|
||||
self.sty,
|
||||
self.fy,
|
||||
self.gy,
|
||||
stp,
|
||||
f,
|
||||
g,
|
||||
self.brackt,
|
||||
self.stmin,
|
||||
self.stmax,
|
||||
)
|
||||
(
|
||||
self.stx,
|
||||
self.fx,
|
||||
self.gx,
|
||||
self.sty,
|
||||
self.fy,
|
||||
self.gy,
|
||||
stp,
|
||||
self.brackt,
|
||||
) = tup
|
||||
|
||||
# Decide if a bisection step is needed
|
||||
if self.brackt:
|
||||
if abs(self.sty - self.stx) >= p66 * self.width1:
|
||||
stp = self.stx + p5 * (self.sty - self.stx)
|
||||
self.width1 = self.width
|
||||
self.width = abs(self.sty - self.stx)
|
||||
|
||||
# Set the minimum and maximum steps allowed for stp.
|
||||
if self.brackt:
|
||||
self.stmin = min(self.stx, self.sty)
|
||||
self.stmax = max(self.stx, self.sty)
|
||||
else:
|
||||
self.stmin = stp + xtrapl * (stp - self.stx)
|
||||
self.stmax = stp + xtrapu * (stp - self.stx)
|
||||
|
||||
# Force the step to be within the bounds stpmax and stpmin.
|
||||
stp = np.clip(stp, self.stpmin, self.stpmax)
|
||||
|
||||
# If further progress is not possible, let stp be the best
|
||||
# point obtained during the search.
|
||||
if (
|
||||
self.brackt
|
||||
and (stp <= self.stmin or stp >= self.stmax)
|
||||
or (
|
||||
self.brackt
|
||||
and self.stmax - self.stmin <= self.xtol * self.stmax
|
||||
)
|
||||
):
|
||||
stp = self.stx
|
||||
|
||||
# Obtain another function and derivative
|
||||
task = b"FG"
|
||||
return stp, f, g, task
|
||||
|
||||
|
||||
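The two acceptance tests that the line search aims for, sufficient decrease and curvature, can be checked directly for a candidate step. The sketch below does only that check on a toy quadratic. It does not reproduce the bracketing logic, and `check_step`, `phi` and `derphi` are illustrative names chosen here.

```python
def check_step(phi, derphi, alpha, ftol, gtol):
    """Return (sufficient_decrease, curvature) for the step alpha."""
    phi0 = phi(0.0)
    derphi0 = derphi(0.0)
    # f(stp) <= f(0) + ftol*stp*f'(0)
    sufficient_decrease = phi(alpha) <= phi0 + ftol * alpha * derphi0
    # abs(f'(stp)) <= gtol*abs(f'(0))
    curvature = abs(derphi(alpha)) <= gtol * abs(derphi0)
    return sufficient_decrease, curvature


def phi(alpha):
    return (alpha - 1.0) ** 2  # minimizer at alpha = 1


def derphi(alpha):
    return 2.0 * (alpha - 1.0)


near_min = check_step(phi, derphi, 0.9, ftol=1e-4, gtol=0.9)
too_short = check_step(phi, derphi, 0.01, ftol=1e-4, gtol=0.9)
too_long = check_step(phi, derphi, 2.5, ftol=1e-4, gtol=0.9)
```

A step near the minimizer passes both tests. A very short step decreases the function but leaves the slope too steep, and an overly long step fails both.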
def dcstep(stx, fx, dx, sty, fy, dy, stp, fp, dp, brackt, stpmin, stpmax):
|
||||
"""
|
||||
Subroutine dcstep
|
||||
|
||||
This subroutine computes a safeguarded step for a search
|
||||
procedure and updates an interval that contains a step that
|
||||
satisfies a sufficient decrease and a curvature condition.
|
||||
|
||||
The parameter stx contains the step with the least function
|
||||
value. If brackt is set to .true. then a minimizer has
|
||||
been bracketed in an interval with endpoints stx and sty.
|
||||
The parameter stp contains the current step.
|
||||
The subroutine assumes that if brackt is set to .true. then
|
||||
|
||||
min(stx,sty) < stp < max(stx,sty),
|
||||
|
||||
and that the derivative at stx is negative in the direction
|
||||
of the step.
|
||||
|
||||
The subroutine statement is
|
||||
|
||||
subroutine dcstep(stx,fx,dx,sty,fy,dy,stp,fp,dp,brackt,
|
||||
stpmin,stpmax)
|
||||
|
||||
where
|
||||
|
||||
stx is a double precision variable.
|
||||
On entry stx is the best step obtained so far and is an
|
||||
endpoint of the interval that contains the minimizer.
|
||||
On exit stx is the updated best step.
|
||||
|
||||
fx is a double precision variable.
|
||||
On entry fx is the function at stx.
|
||||
On exit fx is the function at stx.
|
||||
|
||||
dx is a double precision variable.
|
||||
On entry dx is the derivative of the function at
|
||||
stx. The derivative must be negative in the direction of
|
||||
the step, that is, dx and stp - stx must have opposite
|
||||
signs.
|
||||
On exit dx is the derivative of the function at stx.
|
||||
|
||||
sty is a double precision variable.
|
||||
On entry sty is the second endpoint of the interval that
|
||||
contains the minimizer.
|
||||
On exit sty is the updated endpoint of the interval that
|
||||
contains the minimizer.
|
||||
|
||||
fy is a double precision variable.
|
||||
On entry fy is the function at sty.
|
||||
On exit fy is the function at sty.
|
||||
|
||||
dy is a double precision variable.
|
||||
On entry dy is the derivative of the function at sty.
|
||||
On exit dy is the derivative of the function at the exit sty.
|
||||
|
||||
stp is a double precision variable.
|
||||
On entry stp is the current step. If brackt is set to .true.
|
||||
then on input stp must be between stx and sty.
|
||||
On exit stp is a new trial step.
|
||||
|
||||
fp is a double precision variable.
|
||||
On entry fp is the function at stp.
|
||||
On exit fp is unchanged.
|
||||
|
||||
dp is a double precision variable.
|
||||
On entry dp is the derivative of the function at stp.
|
||||
On exit dp is unchanged.
|
||||
|
||||
brackt is a logical variable.
|
||||
On entry brackt specifies if a minimizer has been bracketed.
|
||||
Initially brackt must be set to .false.
|
||||
On exit brackt specifies if a minimizer has been bracketed.
|
||||
When a minimizer is bracketed brackt is set to .true.
|
||||
|
||||
stpmin is a double precision variable.
|
||||
On entry stpmin is a lower bound for the step.
|
||||
On exit stpmin is unchanged.
|
||||
|
||||
stpmax is a double precision variable.
|
||||
On entry stpmax is an upper bound for the step.
|
||||
On exit stpmax is unchanged.
|
||||
|
||||
MINPACK-1 Project. June 1983
|
||||
Argonne National Laboratory.
|
||||
Jorge J. More' and David J. Thuente.
|
||||
|
||||
MINPACK-2 Project. November 1993.
|
||||
Argonne National Laboratory and University of Minnesota.
|
||||
Brett M. Averick and Jorge J. More'.
|
||||
|
||||
"""
|
||||
    sgn_dp = np.sign(dp)
    sgn_dx = np.sign(dx)

    # sgnd = dp * (dx / abs(dx))
    sgnd = sgn_dp * sgn_dx

    # First case: A higher function value. The minimum is bracketed.
    # If the cubic step is closer to stx than the quadratic step, the
    # cubic step is taken, otherwise the average of the cubic and
    # quadratic steps is taken.
    if fp > fx:
        theta = 3.0 * (fx - fp) / (stp - stx) + dx + dp
        s = max(abs(theta), abs(dx), abs(dp))
        gamma = s * np.sqrt((theta / s) ** 2 - (dx / s) * (dp / s))
        if stp < stx:
            gamma *= -1
        p = (gamma - dx) + theta
        q = ((gamma - dx) + gamma) + dp
        r = p / q
        stpc = stx + r * (stp - stx)
        stpq = stx + ((dx / ((fx - fp) / (stp - stx) + dx)) / 2.0) * (stp - stx)
        if abs(stpc - stx) <= abs(stpq - stx):
            stpf = stpc
        else:
            stpf = stpc + (stpq - stpc) / 2.0
        brackt = True
    elif sgnd < 0.0:
        # Second case: A lower function value and derivatives of opposite
        # sign. The minimum is bracketed. If the cubic step is farther from
        # stp than the secant step, the cubic step is taken, otherwise the
        # secant step is taken.
        theta = 3 * (fx - fp) / (stp - stx) + dx + dp
        s = max(abs(theta), abs(dx), abs(dp))
        gamma = s * np.sqrt((theta / s) ** 2 - (dx / s) * (dp / s))
        if stp > stx:
            gamma *= -1
        p = (gamma - dp) + theta
        q = ((gamma - dp) + gamma) + dx
        r = p / q
        stpc = stp + r * (stx - stp)
        stpq = stp + (dp / (dp - dx)) * (stx - stp)
        if abs(stpc - stp) > abs(stpq - stp):
            stpf = stpc
        else:
            stpf = stpq
        brackt = True
    elif abs(dp) < abs(dx):
        # Third case: A lower function value, derivatives of the same sign,
        # and the magnitude of the derivative decreases.

        # The cubic step is computed only if the cubic tends to infinity
        # in the direction of the step or if the minimum of the cubic
        # is beyond stp. Otherwise the cubic step is defined to be the
        # secant step.
        theta = 3 * (fx - fp) / (stp - stx) + dx + dp
        s = max(abs(theta), abs(dx), abs(dp))

        # The case gamma = 0 only arises if the cubic does not tend
        # to infinity in the direction of the step.
        gamma = s * np.sqrt(max(0, (theta / s) ** 2 - (dx / s) * (dp / s)))
        if stp > stx:
            gamma = -gamma
        p = (gamma - dp) + theta
        q = (gamma + (dx - dp)) + gamma
        r = p / q
        if r < 0 and gamma != 0:
            stpc = stp + r * (stx - stp)
        elif stp > stx:
            stpc = stpmax
        else:
            stpc = stpmin
        stpq = stp + (dp / (dp - dx)) * (stx - stp)

        if brackt:
            # A minimizer has been bracketed. If the cubic step is
            # closer to stp than the secant step, the cubic step is
            # taken, otherwise the secant step is taken.
            if abs(stpc - stp) < abs(stpq - stp):
                stpf = stpc
            else:
                stpf = stpq

            if stp > stx:
                stpf = min(stp + 0.66 * (sty - stp), stpf)
            else:
                stpf = max(stp + 0.66 * (sty - stp), stpf)
        else:
            # A minimizer has not been bracketed. If the cubic step is
            # farther from stp than the secant step, the cubic step is
            # taken, otherwise the secant step is taken.
            if abs(stpc - stp) > abs(stpq - stp):
                stpf = stpc
            else:
                stpf = stpq
            stpf = np.clip(stpf, stpmin, stpmax)

    else:
        # Fourth case: A lower function value, derivatives of the same sign,
        # and the magnitude of the derivative does not decrease. If the
        # minimum is not bracketed, the step is either stpmin or stpmax,
        # otherwise the cubic step is taken.
        if brackt:
            theta = 3.0 * (fp - fy) / (sty - stp) + dy + dp
            s = max(abs(theta), abs(dy), abs(dp))
            gamma = s * np.sqrt((theta / s) ** 2 - (dy / s) * (dp / s))
            if stp > sty:
                gamma = -gamma
            p = (gamma - dp) + theta
            q = ((gamma - dp) + gamma) + dy
            r = p / q
            stpc = stp + r * (sty - stp)
            stpf = stpc
        elif stp > stx:
            stpf = stpmax
        else:
            stpf = stpmin

    # Update the interval which contains a minimizer.
    if fp > fx:
        sty = stp
        fy = fp
        dy = dp
    else:
        if sgnd < 0:
            sty = stx
            fy = fx
            dy = dx
        stx = stp
        fx = fp
        dx = dp

    # Compute the new step.
    stp = stpf

    return stx, fx, dx, sty, fy, dy, stp, brackt
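The cubic step used in the first case of `dcstep` can be checked in isolation. The following is a minimal sketch, not part of the file: the helper name `cubic_step` and the test function `f(x) = x**3 - 3*x` are assumptions chosen for illustration. It repeats the `theta` / `gamma` arithmetic for the branch `fp > fx` and relies on the fact that a cubic interpolating values and derivatives at two points reproduces a cubic exactly, so the step should land on the local minimizer at `x = 1`.

```python
import math


def cubic_step(stx, fx, dx, stp, fp, dp):
    """Minimizer of the cubic matching (fx, dx) at stx and (fp, dp) at stp.

    Hypothetical helper that repeats the arithmetic of the first case
    (fp > fx); it is an illustration, not the library routine.
    """
    theta = 3.0 * (fx - fp) / (stp - stx) + dx + dp
    # Dividing by s keeps the quantity under the square root well scaled.
    s = max(abs(theta), abs(dx), abs(dp))
    gamma = s * math.sqrt((theta / s) ** 2 - (dx / s) * (dp / s))
    if stp < stx:
        gamma = -gamma
    p = (gamma - dx) + theta
    q = ((gamma - dx) + gamma) + dp
    return stx + (p / q) * (stp - stx)


def f(x):
    # Assumed test function with a local minimizer at x = 1.
    return x**3 - 3.0 * x


def df(x):
    return 3.0 * x**2 - 3.0


# Bracket [0, 2]: f(2) > f(0) and f'(0) < 0, as the first case requires.
step = cubic_step(0.0, f(0.0), df(0.0), 2.0, f(2.0), df(2.0))
print(step)  # expected to be very close to 1.0
```

For a non-cubic objective the same call only approximates the minimizer, which is why `dcstep` compares it against the quadratic or secant step before accepting it.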
|
||||
@ -0,0 +1,693 @@
|
||||
import numpy as np
import scipy.sparse as sps
from ._numdiff import approx_derivative, group_columns
from ._hessian_update_strategy import HessianUpdateStrategy
from scipy.sparse.linalg import LinearOperator
from scipy._lib._array_api import atleast_nd, array_namespace


FD_METHODS = ('2-point', '3-point', 'cs')


def _wrapper_fun(fun, args=()):
    ncalls = [0]

    def wrapped(x):
        ncalls[0] += 1
        # Send a copy because the user may overwrite it.
        # Overwriting results in undefined behaviour because
        # fun(self.x) will change self.x, with the two no longer linked.
        fx = fun(np.copy(x), *args)
        # Make sure the function returns a true scalar
        if not np.isscalar(fx):
            try:
                fx = np.asarray(fx).item()
            except (TypeError, ValueError) as e:
                raise ValueError(
                    "The user-provided objective function "
                    "must return a scalar value."
                ) from e
        return fx
    return wrapped, ncalls


def _wrapper_grad(grad, fun=None, args=(), finite_diff_options=None):
    ncalls = [0]

    if callable(grad):
        def wrapped(x, **kwds):
            # kwds present to give function same signature as numdiff variant
            ncalls[0] += 1
            return np.atleast_1d(grad(np.copy(x), *args))
        return wrapped, ncalls

    elif grad in FD_METHODS:
        def wrapped1(x, f0=None):
            ncalls[0] += 1
            return approx_derivative(
                fun, x, f0=f0, **finite_diff_options
            )

        return wrapped1, ncalls


def _wrapper_hess(hess, grad=None, x0=None, args=(), finite_diff_options=None):
    if callable(hess):
        H = hess(np.copy(x0), *args)
        ncalls = [1]

        if sps.issparse(H):
            def wrapped(x, **kwds):
                ncalls[0] += 1
                return sps.csr_matrix(hess(np.copy(x), *args))

            H = sps.csr_matrix(H)

        elif isinstance(H, LinearOperator):
            def wrapped(x, **kwds):
                ncalls[0] += 1
                return hess(np.copy(x), *args)

        else:  # dense
            def wrapped(x, **kwds):
                ncalls[0] += 1
                return np.atleast_2d(np.asarray(hess(np.copy(x), *args)))

            H = np.atleast_2d(np.asarray(H))

        return wrapped, ncalls, H
    elif hess in FD_METHODS:
        ncalls = [0]

        def wrapped1(x, f0=None):
            return approx_derivative(
                grad, x, f0=f0, **finite_diff_options
            )

        return wrapped1, ncalls, None
|
||||
|
||||
|
||||
class ScalarFunction:
|
||||
"""Scalar function and its derivatives.
|
||||
|
||||
This class defines a scalar function F: R^n->R and methods for
|
||||
computing or approximating its first and second derivatives.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
evaluates the scalar function. Must be of the form ``fun(x, *args)``,
|
||||
where ``x`` is the argument in the form of a 1-D array and ``args`` is
|
||||
a tuple of any additional fixed parameters needed to completely specify
|
||||
the function. Should return a scalar.
|
||||
x0 : array-like
|
||||
Provides an initial set of variables for evaluating fun. Array of real
|
||||
elements of size (n,), where 'n' is the number of independent
|
||||
variables.
|
||||
args : tuple, optional
|
||||
Any additional fixed parameters needed to completely specify the scalar
|
||||
function.
|
||||
grad : {callable, '2-point', '3-point', 'cs'}
|
||||
Method for computing the gradient vector.
|
||||
If it is a callable, it should be a function that returns the gradient
|
||||
vector:
|
||||
|
||||
``grad(x, *args) -> array_like, shape (n,)``
|
||||
|
||||
where ``x`` is an array with shape (n,) and ``args`` is a tuple with
|
||||
the fixed parameters.
|
||||
Alternatively, the keywords {'2-point', '3-point', 'cs'} can be used
|
||||
to select a finite difference scheme for numerical estimation of the
|
||||
gradient with a relative step size. These finite difference schemes
|
||||
obey any specified `bounds`.
|
||||
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy}
|
||||
Method for computing the Hessian matrix. If it is callable, it should
|
||||
return the Hessian matrix:
|
||||
|
||||
``hess(x, *args) -> {LinearOperator, spmatrix, array}, (n, n)``
|
||||
|
||||
where x is an (n,) ndarray and `args` is a tuple with the fixed
parameters. Alternatively, the keywords {'2-point', '3-point', 'cs'}
select a finite difference scheme for numerical estimation. Objects
implementing the `HessianUpdateStrategy` interface can also be used to
approximate the Hessian.
|
||||
Whenever the gradient is estimated via finite-differences, the Hessian
|
||||
cannot be estimated with options {'2-point', '3-point', 'cs'} and needs
|
||||
to be estimated using one of the quasi-Newton strategies.
|
||||
finite_diff_rel_step : None or array_like
|
||||
Relative step size to use. The absolute step size is computed as
|
||||
``h = finite_diff_rel_step * sign(x0) * max(1, abs(x0))``, possibly
|
||||
adjusted to fit into the bounds. For ``method='3-point'`` the sign
|
||||
of `h` is ignored. If None then finite_diff_rel_step is selected
|
||||
automatically.
|
||||
finite_diff_bounds : tuple of array_like
|
||||
Lower and upper bounds on independent variables. Defaults to no bounds,
|
||||
(-np.inf, np.inf). Each bound must match the size of `x0` or be a
|
||||
scalar; in the latter case the bound will be the same for all
|
||||
variables. Use it to limit the range of function evaluation.
|
||||
epsilon : None or array_like, optional
|
||||
Absolute step size to use, possibly adjusted to fit into the bounds.
|
||||
For ``method='3-point'`` the sign of `epsilon` is ignored. By default
relative steps are used; absolute steps are used only if
``epsilon is not None``.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This class implements memoization logic. There are methods `fun`,
`grad`, `hess` and corresponding attributes `f`, `g` and `H`. The following
things should be considered:
|
||||
|
||||
1. Use only public methods `fun`, `grad` and `hess`.
|
||||
2. After one of the methods is called, the corresponding attribute
|
||||
will be set. However, a subsequent call with a different argument
|
||||
of *any* of the methods may overwrite the attribute.
|
||||
"""
|
||||
def __init__(self, fun, x0, args, grad, hess, finite_diff_rel_step,
|
||||
finite_diff_bounds, epsilon=None):
|
||||
if not callable(grad) and grad not in FD_METHODS:
|
||||
raise ValueError(
|
||||
f"`grad` must be either callable or one of {FD_METHODS}."
|
||||
)
|
||||
|
||||
if not (callable(hess) or hess in FD_METHODS
|
||||
or isinstance(hess, HessianUpdateStrategy)):
|
||||
raise ValueError(
|
||||
f"`hess` must be either callable, HessianUpdateStrategy"
|
||||
f" or one of {FD_METHODS}."
|
||||
)
|
||||
|
||||
if grad in FD_METHODS and hess in FD_METHODS:
|
||||
raise ValueError("Whenever the gradient is estimated via "
|
||||
"finite-differences, we require the Hessian "
|
||||
"to be estimated using one of the "
|
||||
"quasi-Newton strategies.")
|
||||
|
||||
self.xp = xp = array_namespace(x0)
|
||||
_x = atleast_nd(x0, ndim=1, xp=xp)
|
||||
_dtype = xp.float64
|
||||
if xp.isdtype(_x.dtype, "real floating"):
|
||||
_dtype = _x.dtype
|
||||
|
||||
# original arguments
|
||||
self._wrapped_fun, self._nfev = _wrapper_fun(fun, args=args)
|
||||
self._orig_fun = fun
|
||||
self._orig_grad = grad
|
||||
self._orig_hess = hess
|
||||
self._args = args
|
||||
|
||||
# promotes to floating
|
||||
self.x = xp.astype(_x, _dtype)
|
||||
self.x_dtype = _dtype
|
||||
self.n = self.x.size
|
||||
self.f_updated = False
|
||||
self.g_updated = False
|
||||
self.H_updated = False
|
||||
|
||||
self._lowest_x = None
|
||||
self._lowest_f = np.inf
|
||||
|
||||
finite_diff_options = {}
|
||||
if grad in FD_METHODS:
|
||||
finite_diff_options["method"] = grad
|
||||
finite_diff_options["rel_step"] = finite_diff_rel_step
|
||||
finite_diff_options["abs_step"] = epsilon
|
||||
finite_diff_options["bounds"] = finite_diff_bounds
|
||||
if hess in FD_METHODS:
|
||||
finite_diff_options["method"] = hess
|
||||
finite_diff_options["rel_step"] = finite_diff_rel_step
|
||||
finite_diff_options["abs_step"] = epsilon
|
||||
finite_diff_options["as_linear_operator"] = True
|
||||
|
||||
# Initial function evaluation
|
||||
self._update_fun()
|
||||
|
||||
# Initial gradient evaluation
|
||||
self._wrapped_grad, self._ngev = _wrapper_grad(
|
||||
grad,
|
||||
fun=self._wrapped_fun,
|
||||
args=args,
|
||||
finite_diff_options=finite_diff_options
|
||||
)
|
||||
self._update_grad()
|
||||
|
||||
# Hessian evaluation
|
||||
if callable(hess):
|
||||
self._wrapped_hess, self._nhev, self.H = _wrapper_hess(
|
||||
hess, x0=x0, args=args
|
||||
)
|
||||
self.H_updated = True
|
||||
elif hess in FD_METHODS:
|
||||
self._wrapped_hess, self._nhev, self.H = _wrapper_hess(
|
||||
hess,
|
||||
grad=self._wrapped_grad,
|
||||
x0=x0,
|
||||
finite_diff_options=finite_diff_options
|
||||
)
|
||||
self._update_grad()
|
||||
self.H = self._wrapped_hess(self.x, f0=self.g)
|
||||
self.H_updated = True
|
||||
elif isinstance(hess, HessianUpdateStrategy):
|
||||
self.H = hess
|
||||
self.H.initialize(self.n, 'hess')
|
||||
self.H_updated = True
|
||||
self.x_prev = None
|
||||
self.g_prev = None
|
||||
self._nhev = [0]
|
||||
|
||||
    @property
    def nfev(self):
        return self._nfev[0]

    @property
    def ngev(self):
        return self._ngev[0]

    @property
    def nhev(self):
        return self._nhev[0]

    def _update_x(self, x):
        if isinstance(self._orig_hess, HessianUpdateStrategy):
            self._update_grad()
            self.x_prev = self.x
            self.g_prev = self.g
            # ensure that self.x is a copy of x. Don't store a reference
            # otherwise the memoization doesn't work properly.

            _x = atleast_nd(x, ndim=1, xp=self.xp)
            self.x = self.xp.astype(_x, self.x_dtype)
            self.f_updated = False
            self.g_updated = False
            self.H_updated = False
            self._update_hess()
        else:
            # ensure that self.x is a copy of x. Don't store a reference
            # otherwise the memoization doesn't work properly.
            _x = atleast_nd(x, ndim=1, xp=self.xp)
            self.x = self.xp.astype(_x, self.x_dtype)
            self.f_updated = False
            self.g_updated = False
            self.H_updated = False

    def _update_fun(self):
        if not self.f_updated:
            fx = self._wrapped_fun(self.x)
            if fx < self._lowest_f:
                self._lowest_x = self.x
                self._lowest_f = fx

            self.f = fx
            self.f_updated = True

    def _update_grad(self):
        if not self.g_updated:
            if self._orig_grad in FD_METHODS:
                self._update_fun()
            self.g = self._wrapped_grad(self.x, f0=self.f)
            self.g_updated = True

    def _update_hess(self):
        if not self.H_updated:
            if self._orig_hess in FD_METHODS:
                self._update_grad()
                self.H = self._wrapped_hess(self.x, f0=self.g)
            elif isinstance(self._orig_hess, HessianUpdateStrategy):
                self._update_grad()
                self.H.update(self.x - self.x_prev, self.g - self.g_prev)
            else:  # should be callable(hess)
                self.H = self._wrapped_hess(self.x)

            self.H_updated = True

    def fun(self, x):
        if not np.array_equal(x, self.x):
            self._update_x(x)
        self._update_fun()
        return self.f

    def grad(self, x):
        if not np.array_equal(x, self.x):
            self._update_x(x)
        self._update_grad()
        return self.g

    def hess(self, x):
        if not np.array_equal(x, self.x):
            self._update_x(x)
        self._update_hess()
        return self.H

    def fun_and_grad(self, x):
        if not np.array_equal(x, self.x):
            self._update_x(x)
        self._update_fun()
        self._update_grad()
        return self.f, self.g
|
||||
|
||||
|
||||
class VectorFunction:
|
||||
"""Vector function and its derivatives.
|
||||
|
||||
This class defines a vector function F: R^n->R^m and methods for
|
||||
computing or approximating its first and second derivatives.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This class implements memoization logic. There are methods `fun`,
`jac`, `hess` and corresponding attributes `f`, `J` and `H`. The following
things should be considered:
|
||||
|
||||
1. Use only public methods `fun`, `jac` and `hess`.
|
||||
2. After one of the methods is called, the corresponding attribute
|
||||
will be set. However, a subsequent call with a different argument
|
||||
of *any* of the methods may overwrite the attribute.
|
||||
"""
|
||||
def __init__(self, fun, x0, jac, hess,
|
||||
finite_diff_rel_step, finite_diff_jac_sparsity,
|
||||
finite_diff_bounds, sparse_jacobian):
|
||||
if not callable(jac) and jac not in FD_METHODS:
|
||||
raise ValueError(f"`jac` must be either callable or one of {FD_METHODS}.")
|
||||
|
||||
if not (callable(hess) or hess in FD_METHODS
|
||||
or isinstance(hess, HessianUpdateStrategy)):
|
||||
raise ValueError("`hess` must be either callable, "
f"HessianUpdateStrategy or one of {FD_METHODS}.")
|
||||
|
||||
if jac in FD_METHODS and hess in FD_METHODS:
|
||||
raise ValueError("Whenever the Jacobian is estimated via "
|
||||
"finite-differences, we require the Hessian to "
|
||||
"be estimated using one of the quasi-Newton "
|
||||
"strategies.")
|
||||
|
||||
self.xp = xp = array_namespace(x0)
|
||||
_x = atleast_nd(x0, ndim=1, xp=xp)
|
||||
_dtype = xp.float64
|
||||
if xp.isdtype(_x.dtype, "real floating"):
|
||||
_dtype = _x.dtype
|
||||
|
||||
# promotes to floating
|
||||
self.x = xp.astype(_x, _dtype)
|
||||
self.x_dtype = _dtype
|
||||
|
||||
self.n = self.x.size
|
||||
self.nfev = 0
|
||||
self.njev = 0
|
||||
self.nhev = 0
|
||||
self.f_updated = False
|
||||
self.J_updated = False
|
||||
self.H_updated = False
|
||||
|
||||
finite_diff_options = {}
|
||||
if jac in FD_METHODS:
|
||||
finite_diff_options["method"] = jac
|
||||
finite_diff_options["rel_step"] = finite_diff_rel_step
|
||||
if finite_diff_jac_sparsity is not None:
|
||||
sparsity_groups = group_columns(finite_diff_jac_sparsity)
|
||||
finite_diff_options["sparsity"] = (finite_diff_jac_sparsity,
|
||||
sparsity_groups)
|
||||
finite_diff_options["bounds"] = finite_diff_bounds
|
||||
self.x_diff = np.copy(self.x)
|
||||
if hess in FD_METHODS:
|
||||
finite_diff_options["method"] = hess
|
||||
finite_diff_options["rel_step"] = finite_diff_rel_step
|
||||
finite_diff_options["as_linear_operator"] = True
|
||||
self.x_diff = np.copy(self.x)
|
||||
if jac in FD_METHODS and hess in FD_METHODS:
|
||||
raise ValueError("Whenever the Jacobian is estimated via "
|
||||
"finite-differences, we require the Hessian to "
|
||||
"be estimated using one of the quasi-Newton "
|
||||
"strategies.")
|
||||
|
||||
# Function evaluation
|
||||
def fun_wrapped(x):
|
||||
self.nfev += 1
|
||||
return np.atleast_1d(fun(x))
|
||||
|
||||
def update_fun():
|
||||
self.f = fun_wrapped(self.x)
|
||||
|
||||
self._update_fun_impl = update_fun
|
||||
update_fun()
|
||||
|
||||
self.v = np.zeros_like(self.f)
|
||||
self.m = self.v.size
|
||||
|
||||
# Jacobian Evaluation
|
||||
if callable(jac):
|
||||
self.J = jac(self.x)
|
||||
self.J_updated = True
|
||||
self.njev += 1
|
||||
|
||||
if (sparse_jacobian or
|
||||
sparse_jacobian is None and sps.issparse(self.J)):
|
||||
def jac_wrapped(x):
|
||||
self.njev += 1
|
||||
return sps.csr_matrix(jac(x))
|
||||
self.J = sps.csr_matrix(self.J)
|
||||
self.sparse_jacobian = True
|
||||
|
||||
elif sps.issparse(self.J):
|
||||
def jac_wrapped(x):
|
||||
self.njev += 1
|
||||
return jac(x).toarray()
|
||||
self.J = self.J.toarray()
|
||||
self.sparse_jacobian = False
|
||||
|
||||
else:
|
||||
def jac_wrapped(x):
|
||||
self.njev += 1
|
||||
return np.atleast_2d(jac(x))
|
||||
self.J = np.atleast_2d(self.J)
|
||||
self.sparse_jacobian = False
|
||||
|
||||
def update_jac():
|
||||
self.J = jac_wrapped(self.x)
|
||||
|
||||
elif jac in FD_METHODS:
|
||||
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
|
||||
**finite_diff_options)
|
||||
self.J_updated = True
|
||||
|
||||
if (sparse_jacobian or
|
||||
sparse_jacobian is None and sps.issparse(self.J)):
|
||||
def update_jac():
|
||||
self._update_fun()
|
||||
self.J = sps.csr_matrix(
|
||||
approx_derivative(fun_wrapped, self.x, f0=self.f,
|
||||
**finite_diff_options))
|
||||
self.J = sps.csr_matrix(self.J)
|
||||
self.sparse_jacobian = True
|
||||
|
||||
elif sps.issparse(self.J):
|
||||
def update_jac():
|
||||
self._update_fun()
|
||||
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
|
||||
**finite_diff_options).toarray()
|
||||
self.J = self.J.toarray()
|
||||
self.sparse_jacobian = False
|
||||
|
||||
else:
|
||||
def update_jac():
|
||||
self._update_fun()
|
||||
self.J = np.atleast_2d(
|
||||
approx_derivative(fun_wrapped, self.x, f0=self.f,
|
||||
**finite_diff_options))
|
||||
self.J = np.atleast_2d(self.J)
|
||||
self.sparse_jacobian = False
|
||||
|
||||
self._update_jac_impl = update_jac
|
||||
|
||||
# Define Hessian
|
||||
if callable(hess):
|
||||
self.H = hess(self.x, self.v)
|
||||
self.H_updated = True
|
||||
self.nhev += 1
|
||||
|
||||
if sps.issparse(self.H):
|
||||
def hess_wrapped(x, v):
|
||||
self.nhev += 1
|
||||
return sps.csr_matrix(hess(x, v))
|
||||
self.H = sps.csr_matrix(self.H)
|
||||
|
||||
elif isinstance(self.H, LinearOperator):
|
||||
def hess_wrapped(x, v):
|
||||
self.nhev += 1
|
||||
return hess(x, v)
|
||||
|
||||
else:
|
||||
def hess_wrapped(x, v):
|
||||
self.nhev += 1
|
||||
return np.atleast_2d(np.asarray(hess(x, v)))
|
||||
self.H = np.atleast_2d(np.asarray(self.H))
|
||||
|
||||
def update_hess():
|
||||
self.H = hess_wrapped(self.x, self.v)
|
||||
elif hess in FD_METHODS:
|
||||
def jac_dot_v(x, v):
|
||||
return jac_wrapped(x).T.dot(v)
|
||||
|
||||
def update_hess():
|
||||
self._update_jac()
|
||||
self.H = approx_derivative(jac_dot_v, self.x,
|
||||
f0=self.J.T.dot(self.v),
|
||||
args=(self.v,),
|
||||
**finite_diff_options)
|
||||
update_hess()
|
||||
self.H_updated = True
|
||||
elif isinstance(hess, HessianUpdateStrategy):
|
||||
self.H = hess
|
||||
self.H.initialize(self.n, 'hess')
|
||||
self.H_updated = True
|
||||
self.x_prev = None
|
||||
self.J_prev = None
|
||||
|
||||
def update_hess():
|
||||
self._update_jac()
|
||||
# When v is updated before x was updated, then x_prev and
|
||||
# J_prev are None and we need this check.
|
||||
if self.x_prev is not None and self.J_prev is not None:
|
||||
delta_x = self.x - self.x_prev
|
||||
delta_g = self.J.T.dot(self.v) - self.J_prev.T.dot(self.v)
|
||||
self.H.update(delta_x, delta_g)
|
||||
|
||||
self._update_hess_impl = update_hess
|
||||
|
||||
if isinstance(hess, HessianUpdateStrategy):
|
||||
def update_x(x):
|
||||
self._update_jac()
|
||||
self.x_prev = self.x
|
||||
self.J_prev = self.J
|
||||
_x = atleast_nd(x, ndim=1, xp=self.xp)
|
||||
self.x = self.xp.astype(_x, self.x_dtype)
|
||||
self.f_updated = False
|
||||
self.J_updated = False
|
||||
self.H_updated = False
|
||||
self._update_hess()
|
||||
else:
|
||||
def update_x(x):
|
||||
_x = atleast_nd(x, ndim=1, xp=self.xp)
|
||||
self.x = self.xp.astype(_x, self.x_dtype)
|
||||
self.f_updated = False
|
||||
self.J_updated = False
|
||||
self.H_updated = False
|
||||
|
||||
self._update_x_impl = update_x
|
||||
|
||||
    def _update_v(self, v):
        if not np.array_equal(v, self.v):
            self.v = v
            self.H_updated = False

    def _update_x(self, x):
        if not np.array_equal(x, self.x):
            self._update_x_impl(x)

    def _update_fun(self):
        if not self.f_updated:
            self._update_fun_impl()
            self.f_updated = True

    def _update_jac(self):
        if not self.J_updated:
            self._update_jac_impl()
            self.J_updated = True

    def _update_hess(self):
        if not self.H_updated:
            self._update_hess_impl()
            self.H_updated = True

    def fun(self, x):
        self._update_x(x)
        self._update_fun()
        return self.f

    def jac(self, x):
        self._update_x(x)
        self._update_jac()
        return self.J

    def hess(self, x, v):
        # v should be updated before x.
        self._update_v(v)
        self._update_x(x)
        self._update_hess()
        return self.H
|
||||
|
||||
|
||||
class LinearVectorFunction:
    """Linear vector function and its derivatives.

    Defines a linear function F = A x, where x is an n-dimensional vector
    and A is an m-by-n matrix. The Jacobian is constant and equal to A. The
    Hessian is identically zero and is returned as a csr matrix.
    """
    def __init__(self, A, x0, sparse_jacobian):
        if sparse_jacobian or sparse_jacobian is None and sps.issparse(A):
            self.J = sps.csr_matrix(A)
            self.sparse_jacobian = True
        elif sps.issparse(A):
            self.J = A.toarray()
            self.sparse_jacobian = False
        else:
            # np.asarray makes sure A is ndarray and not matrix
            self.J = np.atleast_2d(np.asarray(A))
            self.sparse_jacobian = False

        self.m, self.n = self.J.shape

        self.xp = xp = array_namespace(x0)
        _x = atleast_nd(x0, ndim=1, xp=xp)
        _dtype = xp.float64
        if xp.isdtype(_x.dtype, "real floating"):
            _dtype = _x.dtype

        # promotes to floating
        self.x = xp.astype(_x, _dtype)
        self.x_dtype = _dtype

        self.f = self.J.dot(self.x)
        self.f_updated = True

        self.v = np.zeros(self.m, dtype=float)
        self.H = sps.csr_matrix((self.n, self.n))

    def _update_x(self, x):
        if not np.array_equal(x, self.x):
            _x = atleast_nd(x, ndim=1, xp=self.xp)
            self.x = self.xp.astype(_x, self.x_dtype)
            self.f_updated = False

    def fun(self, x):
        self._update_x(x)
        if not self.f_updated:
            self.f = self.J.dot(x)
            self.f_updated = True
        return self.f

    def jac(self, x):
        self._update_x(x)
        return self.J

    def hess(self, x, v):
        self._update_x(x)
        self.v = v
        return self.H


class IdentityVectorFunction(LinearVectorFunction):
    """Identity vector function and its derivatives.

    The Jacobian is the identity matrix, returned as a dense array when
    `sparse_jacobian=False` and as a csr matrix otherwise. The Hessian is
    identically zero and it is returned as a csr matrix.
    """
    def __init__(self, x0, sparse_jacobian):
        n = len(x0)
        if sparse_jacobian or sparse_jacobian is None:
            A = sps.eye(n, format='csr')
            sparse_jacobian = True
        else:
            A = np.eye(n)
            sparse_jacobian = False
        super().__init__(A, x0, sparse_jacobian)
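The memoization described in the `ScalarFunction` and `VectorFunction` notes can be illustrated with a much smaller stand-in. The sketch below is an assumption-laden toy, not SciPy code: the class name `MemoizedScalar` is hypothetical, it works on plain Python lists, and it caches only the function value. It keeps the same ingredients as the classes above: a stored copy of the last `x`, an `f_updated` flag that is cleared when `x` changes, and an evaluation counter.

```python
class MemoizedScalar:
    """Toy cache of f at the most recent x (hypothetical, not SciPy's class)."""

    def __init__(self, fun, x0):
        self._fun = fun
        self.nfev = 0
        self.x = list(x0)       # store a copy, never a reference
        self.f_updated = False
        self._update_fun()      # initial evaluation, as in the classes above

    def _update_x(self, x):
        self.x = list(x)
        self.f_updated = False  # the cached value is stale for the new x

    def _update_fun(self):
        if not self.f_updated:
            self.nfev += 1
            self.f = self._fun(list(self.x))
            self.f_updated = True

    def fun(self, x):
        if list(x) != self.x:
            self._update_x(x)
        self._update_fun()
        return self.f


sf = MemoizedScalar(lambda x: sum(xi * xi for xi in x), [1.0, 2.0])
sf.fun([1.0, 2.0])  # same point as x0: served from the cache
sf.fun([3.0, 4.0])  # new point: triggers one more evaluation
print(sf.nfev, sf.f)
```

Only two evaluations are counted here, one in the constructor and one for the new point. This is also why the notes warn that calling any public method with a different argument may overwrite the cached attributes.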
|
||||
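The step-size rule quoted in the `finite_diff_rel_step` description, ``h = finite_diff_rel_step * sign(x0) * max(1, abs(x0))``, can be tried out with a plain forward difference. This is a sketch under stated assumptions, not the library's implementation: the helper names `absolute_step` and `forward_gradient` are hypothetical, `sign(0)` is taken as `+1`, the default relative step is assumed to be the square root of machine epsilon, and no bounds adjustment is attempted.

```python
import math
import sys


def absolute_step(x0, rel_step):
    """h_i = rel_step * sign(x0_i) * max(1, |x0_i|), with sign(0) taken as +1."""
    return [rel_step * math.copysign(1.0, xi) * max(1.0, abs(xi)) for xi in x0]


def forward_gradient(fun, x0, rel_step=None):
    """Two-point (forward) difference gradient using the step rule above."""
    if rel_step is None:
        # Assumed default for this sketch: sqrt of machine epsilon.
        rel_step = math.sqrt(sys.float_info.epsilon)
    f0 = fun(x0)
    grad = []
    for i, h in enumerate(absolute_step(x0, rel_step)):
        x = list(x0)
        x[i] += h                      # perturb one coordinate at a time
        grad.append((fun(x) - f0) / h)
    return grad


# Gradient of sum(x_i^2) is 2*x, so the result should be near [2, -4, 0].
g = forward_gradient(lambda x: sum(xi * xi for xi in x), [1.0, -2.0, 0.0])
print(g)
```

Scaling the step by `max(1, abs(x0))` keeps it meaningful both for large coordinates and for coordinates near zero, where a purely relative step would vanish.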
@ -0,0 +1,856 @@
|
||||
# mypy: disable-error-code="attr-defined"
import numpy as np
import scipy._lib._elementwise_iterative_method as eim
from scipy._lib._util import _RichResult

_EERRORINCREASE = -1  # used in _differentiate

def _differentiate_iv(func, x, args, atol, rtol, maxiter, order, initial_step,
                      step_factor, step_direction, preserve_shape, callback):
    # Input validation for `_differentiate`

    if not callable(func):
        raise ValueError('`func` must be callable.')

    # x has more complex IV that is taken care of during initialization
    x = np.asarray(x)
    dtype = x.dtype if np.issubdtype(x.dtype, np.inexact) else np.float64

    if not np.iterable(args):
        args = (args,)

    if atol is None:
        atol = np.finfo(dtype).tiny

    if rtol is None:
        rtol = np.sqrt(np.finfo(dtype).eps)

    message = 'Tolerances and step parameters must be non-negative scalars.'
    tols = np.asarray([atol, rtol, initial_step, step_factor])
    if (not np.issubdtype(tols.dtype, np.number)
            or np.any(tols < 0)
            or tols.shape != (4,)):
        raise ValueError(message)
    initial_step, step_factor = tols[2:].astype(dtype)

    maxiter_int = int(maxiter)
    if maxiter != maxiter_int or maxiter <= 0:
        raise ValueError('`maxiter` must be a positive integer.')

    order_int = int(order)
    if order_int != order or order <= 0:
        raise ValueError('`order` must be a positive integer.')

    step_direction = np.sign(step_direction).astype(dtype)
    x, step_direction = np.broadcast_arrays(x, step_direction)
    x, step_direction = x[()], step_direction[()]

    message = '`preserve_shape` must be True or False.'
    if preserve_shape not in {True, False}:
        raise ValueError(message)

    if callback is not None and not callable(callback):
        raise ValueError('`callback` must be callable.')

    return (func, x, args, atol, rtol, maxiter_int, order_int, initial_step,
            step_factor, step_direction, preserve_shape, callback)
|
||||
|
||||
|
||||
def _differentiate(func, x, *, args=(), atol=None, rtol=None, maxiter=10,
|
||||
order=8, initial_step=0.5, step_factor=2.0,
|
||||
step_direction=0, preserve_shape=False, callback=None):
|
||||
"""Evaluate the derivative of an elementwise scalar function numerically.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function whose derivative is desired. The signature must be::
|
||||
|
||||
func(x: ndarray, *fargs) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real number and ``fargs`` is a tuple,
|
||||
which may contain an arbitrary number of arrays that are broadcastable
|
||||
with `x`. ``func`` must be an elementwise function: each element
|
||||
``func(x)[i]`` must equal ``func(x[i])`` for all indices ``i``.
|
||||
x : array_like
|
||||
Abscissae at which to evaluate the derivative.
|
||||
args : tuple, optional
|
||||
Additional positional arguments to be passed to `func`. Must be arrays
|
||||
broadcastable with `x`. If the callable to be differentiated requires
|
||||
arguments that are not broadcastable with `x`, wrap that callable with
|
||||
`func`. See Examples.
|
||||
atol, rtol : float, optional
|
||||
Absolute and relative tolerances for the stopping condition: iteration
|
||||
will stop when ``res.error < atol + rtol * abs(res.df)``. The default
|
||||
`atol` is the smallest normal number of the appropriate dtype, and
|
||||
the default `rtol` is the square root of the precision of the
|
||||
appropriate dtype.
|
||||
order : int, default: 8
|
||||
The (positive integer) order of the finite difference formula to be
|
||||
used. Odd integers will be rounded up to the next even integer.
|
||||
initial_step : float, default: 0.5
|
||||
The (absolute) initial step size for the finite difference derivative
|
||||
approximation.
|
||||
step_factor : float, default: 2.0
|
||||
The factor by which the step size is *reduced* in each iteration; i.e.
|
||||
the step size in iteration 1 is ``initial_step/step_factor``. If
|
||||
``step_factor < 1``, subsequent steps will be greater than the initial
|
||||
step; this may be useful if steps smaller than some threshold are
|
||||
undesirable (e.g. due to subtractive cancellation error).
|
||||
maxiter : int, default: 10
|
||||
The maximum number of iterations of the algorithm to perform. See
|
||||
Notes.
|
||||
step_direction : array_like
|
||||
An array representing the direction of the finite difference steps (for
|
||||
use when `x` lies near the boundary of the domain of the function).
|
||||
Must be broadcastable with `x` and all `args`.
|
||||
Where 0 (default), central differences are used; where negative (e.g.
|
||||
-1), steps are non-positive; and where positive (e.g. 1), all steps are
|
||||
non-negative.
|
||||
preserve_shape : bool, default: False
|
||||
In the following, "arguments of `func`" refers to the array ``x`` and
|
||||
any arrays within ``fargs``. Let ``shape`` be the broadcasted shape
|
||||
of `x` and all elements of `args` (which is conceptually
|
||||
distinct from ``fargs`` passed into `f`).
|
||||
|
||||
- When ``preserve_shape=False`` (default), `f` must accept arguments
|
||||
of *any* broadcastable shapes.
|
||||
|
||||
- When ``preserve_shape=True``, `f` must accept arguments of shape
|
||||
``shape`` *or* ``shape + (n,)``, where ``n`` is the number of
|
||||
abscissae at which the function is being evaluated.
|
||||
|
||||
In either case, for each scalar element ``xi`` within `x`, the array
|
||||
returned by `f` must include the scalar ``f(xi)`` at the same index.
|
||||
Consequently, the shape of the output is always the shape of the input
|
||||
``x``.
|
||||
|
||||
See Examples.
|
||||
callback : callable, optional
|
||||
An optional user-supplied function to be called before the first
|
||||
iteration and after each iteration.
|
||||
Called as ``callback(res)``, where ``res`` is a ``_RichResult``
|
||||
similar to that returned by `_differentiate` (but containing the
|
||||
current iterate's values of all variables). If `callback` raises a
|
||||
``StopIteration``, the algorithm will terminate immediately and
|
||||
`_differentiate` will return a result.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes. (The descriptions are written as though the values will be
|
||||
scalars; however, if `func` returns an array, the outputs will be
|
||||
arrays of the same shape.)
|
||||
|
||||
success : bool
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
``0`` : The algorithm converged to the specified tolerances.
|
||||
``-1`` : The error estimate increased, so iteration was terminated.
|
||||
``-2`` : The maximum number of iterations was reached.
|
||||
``-3`` : A non-finite value was encountered.
|
||||
``-4`` : Iteration was terminated by `callback`.
|
||||
``1`` : The algorithm is proceeding normally (in `callback` only).
|
||||
df : float
|
||||
The derivative of `func` at `x`, if the algorithm terminated
|
||||
successfully.
|
||||
error : float
|
||||
An estimate of the error: the magnitude of the difference between
|
||||
the current estimate of the derivative and the estimate in the
|
||||
previous iteration.
|
||||
nit : int
|
||||
The number of iterations performed.
|
||||
nfev : int
|
||||
The number of points at which `func` was evaluated.
|
||||
x : float
|
||||
The value at which the derivative of `func` was evaluated
|
||||
(after broadcasting with `args` and `step_direction`).
|
||||
|
||||
Notes
|
||||
-----
|
||||
The implementation was inspired by jacobi [1]_, numdifftools [2]_, and
|
||||
DERIVEST [3]_, but the implementation follows the theory of Taylor series
|
||||
more straightforwardly (and arguably naively so).
|
||||
In the first iteration, the derivative is estimated using a finite
|
||||
difference formula of order `order` with maximum step size `initial_step`.
|
||||
Each subsequent iteration, the maximum step size is reduced by
|
||||
`step_factor`, and the derivative is estimated again until a termination
|
||||
condition is reached. The error estimate is the magnitude of the difference
|
||||
between the current derivative approximation and that of the previous
|
||||
iteration.
|
||||
|
||||
The stencils of the finite difference formulae are designed such that
|
||||
abscissae are "nested": after `func` is evaluated at ``order + 1``
|
||||
points in the first iteration, `func` is evaluated at only two new points
|
||||
in each subsequent iteration; ``order - 1`` previously evaluated function
|
||||
values required by the finite difference formula are reused, and two
|
||||
function values (evaluations at the points furthest from `x`) are unused.
|
||||
|
||||
Step sizes are absolute. When the step size is small relative to the
|
||||
magnitude of `x`, precision is lost; for example, if `x` is ``1e20``, the
|
||||
default initial step size of ``0.5`` cannot be resolved. Accordingly,
|
||||
consider using larger initial step sizes for large magnitudes of `x`.
|
||||
|
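The loss of precision described above can be seen directly in double
precision. The following is a small sketch, not part of `_differentiate`;
the scaled step ``0.5 * abs(x)`` is only an assumed, illustrative choice.

```python
import numpy as np

x = 1e20
h = 0.5                       # default `initial_step`

lost = (x + h) - x            # the step vanishes in floating point
spacing = np.spacing(x)       # gap between adjacent doubles near x

h_scaled = 0.5 * abs(x)       # hypothetical step scaled to the magnitude of x
resolved = (x + h_scaled) - x
```

Here ``spacing`` is far larger than ``h``, so ``x + h`` rounds back to ``x``,
whereas the scaled step is recovered.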
||||
The default tolerances are challenging to satisfy at points where the
|
||||
true derivative is exactly zero. If the derivative may be exactly zero,
|
||||
consider specifying an absolute tolerance (e.g. ``atol=1e-16``) to
|
||||
improve convergence.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Hans Dembinski (@HDembinski). jacobi.
|
||||
https://github.com/HDembinski/jacobi
|
||||
.. [2] Per A. Brodtkorb and John D'Errico. numdifftools.
|
||||
https://numdifftools.readthedocs.io/en/latest/
|
||||
.. [3] John D'Errico. DERIVEST: Adaptive Robust Numerical Differentiation.
|
||||
https://www.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation
|
||||
.. [4] Numerical Differentiation. Wikipedia.
|
||||
https://en.wikipedia.org/wiki/Numerical_differentiation
|
||||
|
||||
Examples
|
||||
--------
|
||||
Evaluate the derivative of ``np.exp`` at several points ``x``.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize._differentiate import _differentiate
|
||||
>>> f = np.exp
|
||||
>>> df = np.exp # true derivative
|
||||
>>> x = np.linspace(1, 2, 5)
|
||||
>>> res = _differentiate(f, x)
|
||||
>>> res.df # approximation of the derivative
|
||||
array([2.71828183, 3.49034296, 4.48168907, 5.75460268, 7.3890561 ])
|
||||
>>> res.error # estimate of the error
|
||||
array(
|
||||
[7.12940817e-12, 9.16688947e-12, 1.17594823e-11, 1.50972568e-11, 1.93942640e-11]
|
||||
)
|
||||
>>> abs(res.df - df(x)) # true error
|
||||
array(
|
||||
[3.06421555e-14, 3.01980663e-14, 5.06261699e-14, 6.30606678e-14, 8.34887715e-14]
|
||||
)
|
||||
|
||||
Show the convergence of the approximation as the step size is reduced.
|
||||
Each iteration, the step size is reduced by `step_factor`, so for
|
||||
sufficiently small initial step, each iteration reduces the error by a
|
||||
factor of ``1/step_factor**order`` until finite precision arithmetic
|
||||
inhibits further improvement.
|
||||
|
||||
>>> import matplotlib.pyplot as plt
|
||||
>>> iter = list(range(1, 12)) # maximum iterations
|
||||
>>> hfac = 2 # step size reduction per iteration
|
||||
>>> hdir = [-1, 0, 1] # compare left-, central-, and right- steps
|
||||
>>> order = 4 # order of differentiation formula
|
||||
>>> x = 1
|
||||
>>> ref = df(x)
|
||||
>>> errors = [] # true error
|
||||
>>> for i in iter:
|
||||
... res = _differentiate(f, x, maxiter=i, step_factor=hfac,
|
||||
... step_direction=hdir, order=order,
|
||||
... atol=0, rtol=0) # prevent early termination
|
||||
... errors.append(abs(res.df - ref))
|
||||
>>> errors = np.array(errors)
|
||||
>>> plt.semilogy(iter, errors[:, 0], label='left differences')
|
||||
>>> plt.semilogy(iter, errors[:, 1], label='central differences')
|
||||
>>> plt.semilogy(iter, errors[:, 2], label='right differences')
|
||||
>>> plt.xlabel('iteration')
|
||||
>>> plt.ylabel('error')
|
||||
>>> plt.legend()
|
||||
>>> plt.show()
|
||||
>>> (errors[1, 1] / errors[0, 1], 1 / hfac**order)
|
||||
(0.06215223140159822, 0.0625)
|
||||
|
||||
The implementation is vectorized over `x`, `step_direction`, and `args`.
|
||||
The function is evaluated once before the first iteration to perform input
|
||||
validation and standardization, and once per iteration thereafter.
|
||||
|
||||
>>> def f(x, p):
|
||||
... print('here')
|
||||
... f.nit += 1
|
||||
... return x**p
|
||||
>>> f.nit = 0
|
||||
>>> def df(x, p):
|
||||
... return p*x**(p-1)
|
||||
>>> x = np.arange(1, 5)
|
||||
>>> p = np.arange(1, 6).reshape((-1, 1))
|
||||
>>> hdir = np.arange(-1, 2).reshape((-1, 1, 1))
|
||||
>>> res = _differentiate(f, x, args=(p,), step_direction=hdir, maxiter=1)
|
||||
>>> np.allclose(res.df, df(x, p))
|
||||
True
|
||||
>>> res.df.shape
|
||||
(3, 5, 4)
|
||||
>>> f.nit
|
||||
2
|
||||
|
||||
By default, `preserve_shape` is False, and therefore the callable
|
||||
`f` may be called with arrays of any broadcastable shapes.
|
||||
For example:
|
||||
|
||||
>>> shapes = []
|
||||
>>> def f(x, c):
|
||||
... shape = np.broadcast_shapes(x.shape, c.shape)
|
||||
... shapes.append(shape)
|
||||
... return np.sin(c*x)
|
||||
>>>
|
||||
>>> c = [1, 5, 10, 20]
|
||||
>>> res = _differentiate(f, 0, args=(c,))
|
||||
>>> shapes
|
||||
[(4,), (4, 8), (4, 2), (3, 2), (2, 2), (1, 2)]
|
||||
|
||||
To understand where these shapes are coming from - and to better
|
||||
understand how `_differentiate` computes accurate results - note that
|
||||
higher values of ``c`` correspond with higher frequency sinusoids.
|
||||
The higher frequency sinusoids make the function's derivative change
|
||||
faster, so more function evaluations are required to achieve the target
|
||||
accuracy:
|
||||
|
||||
>>> res.nfev
|
||||
array([11, 13, 15, 17])
|
||||
|
||||
The initial ``shape``, ``(4,)``, corresponds with evaluating the
|
||||
function at a single abscissa and all four frequencies; this is used
|
||||
for input validation and to determine the size and dtype of the arrays
|
||||
that store results. The next shape corresponds with evaluating the
|
||||
function at an initial grid of abscissae and all four frequencies.
|
||||
Successive calls to the function evaluate the function at two more
|
||||
abscissae, increasing the effective order of the approximation by two.
|
||||
However, in later function evaluations, the function is evaluated at
|
||||
fewer frequencies because the corresponding derivative has already
|
||||
converged to the required tolerance. This saves function evaluations to
|
||||
improve performance, but it requires the function to accept arguments of
|
||||
any shape.
|
||||
|
||||
"Vector-valued" functions are unlikely to satisfy this requirement.
|
||||
For example, consider
|
||||
|
||||
>>> def f(x):
|
||||
... return [x, np.sin(3*x), x+np.sin(10*x), np.sin(20*x)*(x-1)**2]
|
||||
|
||||
This function is not compatible with `_differentiate` as written; for instance,
|
||||
the shape of the output will not be the same as the shape of ``x``. Such a
|
||||
function *could* be converted to a compatible form with the introduction of
|
||||
additional parameters, but this would be inconvenient. In such cases,
|
||||
a simpler solution would be to use `preserve_shape`.
|
||||
|
||||
>>> shapes = []
|
||||
>>> def f(x):
|
||||
... shapes.append(x.shape)
|
||||
... x0, x1, x2, x3 = x
|
||||
... return [x0, np.sin(3*x1), x2+np.sin(10*x2), np.sin(20*x3)*(x3-1)**2]
|
||||
>>>
|
||||
>>> x = np.zeros(4)
|
||||
>>> res = _differentiate(f, x, preserve_shape=True)
|
||||
>>> shapes
|
||||
[(4,), (4, 8), (4, 2), (4, 2), (4, 2), (4, 2)]
|
||||
|
||||
Here, the shape of ``x`` is ``(4,)``. With ``preserve_shape=True``, the
|
||||
function may be called with argument ``x`` of shape ``(4,)`` or ``(4, n)``,
|
||||
and this is what we observe.
|
||||
|
||||
"""
|
||||
# TODO (followup):
|
||||
# - investigate behavior at saddle points
|
||||
# - array initial_step / step_factor?
|
||||
# - multivariate functions?
|
||||
|
||||
res = _differentiate_iv(func, x, args, atol, rtol, maxiter, order, initial_step,
|
||||
step_factor, step_direction, preserve_shape, callback)
|
||||
(func, x, args, atol, rtol, maxiter, order,
|
||||
h0, fac, hdir, preserve_shape, callback) = res
|
||||
|
||||
# Initialization
|
||||
# Since f(x) (no step) is not needed for central differences, it may be
|
||||
# possible to eliminate this function evaluation. However, it's useful for
|
||||
# input validation and standardization, and everything else is designed to
|
||||
# reduce function calls, so let's keep it simple.
|
||||
temp = eim._initialize(func, (x,), args, preserve_shape=preserve_shape)
|
||||
func, xs, fs, args, shape, dtype, xp = temp
|
||||
x, f = xs[0], fs[0]
|
||||
df = np.full_like(f, np.nan)
|
||||
# Ideally we'd broadcast the shape of `hdir` in `_elementwise_algo_init`, but
|
||||
# it's simpler to do it here than to generalize `_elementwise_algo_init` further.
|
||||
# `hdir` and `x` are already broadcasted in `_differentiate_iv`, so we know
|
||||
# that `hdir` can be broadcasted to the final shape.
|
||||
hdir = np.broadcast_to(hdir, shape).flatten()
|
||||
|
||||
status = np.full_like(x, eim._EINPROGRESS, dtype=int) # in progress
|
||||
nit, nfev = 0, 1 # one function evaluation performed above
|
||||
# Boolean indices of left, central, right, and (all) one-sided steps
|
||||
il = hdir < 0
|
||||
ic = hdir == 0
|
||||
ir = hdir > 0
|
||||
io = il | ir
|
||||
|
||||
# Most of these attributes are reasonably obvious, but:
|
||||
# - `fs` holds all the function values of all active `x`. The zeroth
|
||||
# axis corresponds with active points `x`, the first axis corresponds
|
||||
# with the different steps (in the order described in
|
||||
# `_differentiate_weights`).
|
||||
# - `terms` (which could probably use a better name) is half the `order`,
|
||||
# which is always even.
|
||||
work = _RichResult(x=x, df=df, fs=f[:, np.newaxis], error=np.nan, h=h0,
|
||||
df_last=np.nan, error_last=np.nan, h0=h0, fac=fac,
|
||||
atol=atol, rtol=rtol, nit=nit, nfev=nfev,
|
||||
status=status, dtype=dtype, terms=(order+1)//2,
|
||||
hdir=hdir, il=il, ic=ic, ir=ir, io=io)
|
||||
# This is the correspondence between terms in the `work` object and the
|
||||
# final result. In this case, the mapping is trivial. Note that `success`
|
||||
# is prepended automatically.
|
||||
res_work_pairs = [('status', 'status'), ('df', 'df'), ('error', 'error'),
|
||||
('nit', 'nit'), ('nfev', 'nfev'), ('x', 'x')]
|
||||
|
||||
def pre_func_eval(work):
|
||||
"""Determine the abscissae at which the function needs to be evaluated.
|
||||
|
||||
See `_differentiate_weights` for a description of the stencil (pattern
|
||||
of the abscissae).
|
||||
|
||||
In the first iteration, there is only one stored function value in
|
||||
`work.fs`, `f(x)`, so we need to evaluate at `order` new points. In
|
||||
subsequent iterations, we evaluate at two new points. Note that
|
||||
`work.x` is always flattened into a 1D array after broadcasting with
|
||||
all `args`, so we add a new axis at the end and evaluate all points
|
||||
in one call to the function.
|
||||
|
||||
For improvement:
|
||||
- Consider measuring the step size actually taken, since `(x + h) - x`
|
||||
is not identically equal to `h` with floating point arithmetic.
|
||||
- Adjust the step size automatically if `x` is too big to resolve the
|
||||
step.
|
||||
- We could probably save some work if there are no central difference
|
||||
steps or no one-sided steps.
|
||||
"""
|
||||
n = work.terms # half the order
|
||||
h = work.h # step size
|
||||
c = work.fac # step reduction factor
|
||||
d = c**0.5 # square root of step reduction factor (one-sided stencil)
|
||||
# Note - no need to be careful about dtypes until we allocate `x_eval`
|
||||
|
||||
if work.nit == 0:
|
||||
hc = h / c**np.arange(n)
|
||||
hc = np.concatenate((-hc[::-1], hc))
|
||||
else:
|
||||
hc = np.asarray([-h, h]) / c**(n-1)
|
||||
|
||||
if work.nit == 0:
|
||||
hr = h / d**np.arange(2*n)
|
||||
else:
|
||||
hr = np.asarray([h, h/d]) / c**(n-1)
|
||||
|
||||
n_new = 2*n if work.nit == 0 else 2 # number of new abscissae
|
||||
x_eval = np.zeros((len(work.hdir), n_new), dtype=work.dtype)
|
||||
il, ic, ir = work.il, work.ic, work.ir
|
||||
x_eval[ir] = work.x[ir, np.newaxis] + hr
|
||||
x_eval[ic] = work.x[ic, np.newaxis] + hc
|
||||
x_eval[il] = work.x[il, np.newaxis] - hr
|
||||
return x_eval
|
||||
|
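# Illustration only: a minimal sketch of why the central stencil is "nested".
# It mirrors the `hc` expressions in `pre_func_eval`, but the helper name
# `central_steps` and the variables below are assumptions for this example.

```python
import numpy as np


def central_steps(h, c, n):
    # Signed offsets of the 2*n non-central stencil points for step size h.
    hc = h / c**np.arange(n)
    return np.concatenate((-hc[::-1], hc))


h, c, n = 0.5, 2.0, 4                 # initial_step, step_factor, order // 2
first = central_steps(h, c, n)        # offsets used in the first iteration
second = central_steps(h / c, c, n)   # offsets after one step reduction

new_points = np.setdiff1d(second, first)   # offsets that need new evaluations
dropped = np.setdiff1d(first, second)      # offsets no longer in the formula
```

# Only two offsets are new (the innermost pair), and the outermost pair drops
# out of the formula, which matches the two-new-points-per-iteration behaviour
# described in the docstring above.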
||||
def post_func_eval(x, f, work):
|
||||
"""Estimate the derivative and error from the function evaluations.
|
||||
|
||||
As in `pre_func_eval`: in the first iteration, there is only one stored
|
||||
function value in `work.fs`, `f(x)`, so we need to add the `order` new
|
||||
points. In subsequent iterations, we add two new points. The tricky
|
||||
part is getting the order to match that of the weights, which is
|
||||
described in `_differentiate_weights`.
|
||||
|
||||
For improvement:
|
||||
- Change the order of the weights (and steps in `pre_func_eval`) to
|
||||
simplify `work_fc` concatenation and eliminate `fc` concatenation.
|
||||
- It would be simple to do one-step Richardson extrapolation with `df`
|
||||
and `df_last` to increase the order of the estimate and/or improve
|
||||
the error estimate.
|
||||
- Process the function evaluations in a more numerically favorable
|
||||
way. For instance, combining the pairs of central difference evals
|
||||
into a second-order approximation and using Richardson extrapolation
|
||||
to produce a higher order approximation seemed to retain accuracy up
|
||||
to very high order.
|
||||
- Alternatively, we could use `polyfit` like Jacobi. An advantage of
|
||||
fitting a polynomial to more points than necessary is improved noise
|
||||
tolerance.
|
||||
"""
|
||||
n = work.terms
|
||||
n_new = n if work.nit == 0 else 1
|
||||
il, ic, io = work.il, work.ic, work.io
|
||||
|
||||
# Central difference
|
||||
# `work_fc` is *all* the points at which the function has been evaluated
|
||||
# `fc` is the points we're using *this iteration* to produce the estimate
|
||||
work_fc = (f[ic, :n_new], work.fs[ic, :], f[ic, -n_new:])
|
||||
work_fc = np.concatenate(work_fc, axis=-1)
|
||||
if work.nit == 0:
|
||||
fc = work_fc
|
||||
else:
|
||||
fc = (work_fc[:, :n], work_fc[:, n:n+1], work_fc[:, -n:])
|
||||
fc = np.concatenate(fc, axis=-1)
|
||||
|
||||
# One-sided difference
|
||||
work_fo = np.concatenate((work.fs[io, :], f[io, :]), axis=-1)
|
||||
if work.nit == 0:
|
||||
fo = work_fo
|
||||
else:
|
||||
fo = np.concatenate((work_fo[:, 0:1], work_fo[:, -2*n:]), axis=-1)
|
||||
|
||||
work.fs = np.zeros((len(ic), work.fs.shape[-1] + 2*n_new))
|
||||
work.fs[ic] = work_fc
|
||||
work.fs[io] = work_fo
|
||||
|
||||
wc, wo = _differentiate_weights(work, n)
|
||||
work.df_last = work.df.copy()
|
||||
work.df[ic] = fc @ wc / work.h
|
||||
work.df[io] = fo @ wo / work.h
|
||||
work.df[il] *= -1
|
||||
|
||||
work.h /= work.fac
|
||||
work.error_last = work.error
|
||||
# Simple error estimate - the difference in derivative estimates between
|
||||
# this iteration and the last. This is typically conservative because if
|
||||
# convergence has begun, the true error is much closer to the difference
|
||||
# between the current estimate and the *next* error estimate. However,
|
||||
# we could use Richardson extrapolation to produce an error estimate that
|
||||
# is one order higher, and take the difference between that and
|
||||
# `work.df` (which would just be a constant factor that depends on `fac`).
|
||||
work.error = abs(work.df - work.df_last)
|
||||
|
||||
def check_termination(work):
|
||||
"""Terminate due to convergence, non-finite values, or error increase"""
|
||||
stop = np.zeros_like(work.df).astype(bool)
|
||||
|
||||
i = work.error < work.atol + work.rtol*abs(work.df)
|
||||
work.status[i] = eim._ECONVERGED
|
||||
stop[i] = True
|
||||
|
||||
if work.nit > 0:
|
||||
i = ~((np.isfinite(work.x) & np.isfinite(work.df)) | stop)
|
||||
work.df[i], work.status[i] = np.nan, eim._EVALUEERR
|
||||
stop[i] = True
|
||||
|
||||
# With infinite precision, there is a step size below which
|
||||
# all smaller step sizes will reduce the error. But in floating point
|
||||
# arithmetic, catastrophic cancellation will begin to cause the error
|
||||
# to increase again. This heuristic tries to avoid step sizes that are
|
||||
# too small. There may be more theoretically sound approaches for
|
||||
# detecting a step size that minimizes the total error, but this
|
||||
# heuristic seems simple and effective.
|
||||
i = (work.error > work.error_last*10) & ~stop
|
||||
work.status[i] = _EERRORINCREASE
|
||||
stop[i] = True
|
||||
|
||||
return stop
|
||||
|
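# Illustration only: a minimal elementwise sketch of the convergence test and
# the error-increase heuristic used in `check_termination`. The helper names
# and sample arrays are assumptions for this example; the non-finite check and
# the status bookkeeping are omitted.

```python
import numpy as np


def converged(error, df, atol, rtol):
    # Same inequality as the stopping condition documented for `atol`/`rtol`.
    return error < atol + rtol * np.abs(df)


def error_increased(error, error_last, factor=10):
    # Heuristic: the error estimate grew by more than `factor`.
    return error > factor * error_last


df = np.array([1.0, 0.0, 2.0])
error = np.array([1e-10, 1e-10, 1e-3])
error_last = np.array([1e-8, 1e-8, 1e-5])

atol = np.finfo(np.float64).tiny            # smallest normal number
rtol = np.sqrt(np.finfo(np.float64).eps)    # square root of the precision

done = converged(error, df, atol, rtol)
worse = error_increased(error, error_last) & ~done
```

# The second element has a derivative of exactly zero, so the relative term
# vanishes and the tiny default `atol` is not met; this is the situation in
# which the docstring suggests passing a larger `atol`.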
||||
def post_termination_check(work):
|
||||
return
|
||||
|
||||
def customize_result(res, shape):
|
||||
return shape
|
||||
|
||||
return eim._loop(work, callback, shape, maxiter, func, args, dtype,
|
||||
pre_func_eval, post_func_eval, check_termination,
|
||||
post_termination_check, customize_result, res_work_pairs,
|
||||
xp, preserve_shape)
|
||||
|
||||
|
||||
def _differentiate_weights(work, n):
|
||||
# This produces the weights of the finite difference formula for a given
|
||||
# stencil. In experiments, use of a second-order central difference formula
|
||||
# with Richardson extrapolation was more accurate numerically, but it was
|
||||
# more complicated, and it would have become even more complicated when
|
||||
# adding support for one-sided differences. However, now that all the
|
||||
# function evaluation values are stored, they can be processed in whatever
|
||||
# way is desired to produce the derivative estimate. We leave alternative
|
||||
# approaches to future work. To be more self-contained, here is the theory
|
||||
# for deriving the weights below.
|
||||
#
|
||||
# Recall that the Taylor expansion of a univariate, scalar-valued function
|
||||
# about a point `x` may be expressed as:
|
||||
# f(x + h) = f(x) + f'(x)*h + f''(x)/2!*h**2 + O(h**3)
|
||||
# Suppose we evaluate f(x), f(x+h), and f(x-h). We have:
|
||||
# f(x) = f(x)
|
||||
# f(x + h) = f(x) + f'(x)*h + f''(x)/2!*h**2 + O(h**3)
|
||||
# f(x - h) = f(x) - f'(x)*h + f''(x)/2!*h**2 + O(h**3)
|
||||
# We can solve for weights `wi` such that:
|
||||
# w1*f(x) = w1*(f(x))
|
||||
# + w2*f(x + h) = w2*(f(x) + f'(x)*h + f''(x)/2!*h**2) + O(h**3)
|
||||
# + w3*f(x - h) = w3*(f(x) - f'(x)*h + f''(x)/2!*h**2) + O(h**3)
|
||||
# = 0 + f'(x)*h + 0 + O(h**3)
|
||||
# Then
|
||||
# f'(x) ~ (w1*f(x) + w2*f(x+h) + w3*f(x-h))/h
|
||||
# is a finite difference derivative approximation with error O(h**2),
|
||||
# and so it is said to be a "second-order" approximation. Under certain
|
||||
# conditions (e.g. well-behaved function, `h` sufficiently small), the
|
||||
# error in the approximation will decrease with h**2; that is, if `h` is
|
||||
# reduced by a factor of 2, the error is reduced by a factor of 4.
|
||||
#
|
||||
# By default, we use eighth-order formulae. Our central-difference formula
|
||||
# uses abscissae:
|
||||
# x-h/c**3, x-h/c**2, x-h/c, x-h, x, x+h, x+h/c, x+h/c**2, x+h/c**3
|
||||
# where `c` is the step factor. (Typically, the step factor is greater than
|
||||
# one, so the outermost points - as written above - are actually closest to
|
||||
# `x`.) This "stencil" is chosen so that each iteration, the step can be
|
||||
# reduced by the factor `c`, and most of the function evaluations can be
|
||||
# reused with the new step size. For example, in the next iteration, we
|
||||
# will have:
|
||||
# x-h/c**4, x-h/c**3, x-h/c**2, x-h/c, x, x+h/c, x+h/c**2, x+h/c**3, x+h/c**4
|
||||
# We do not reuse `x-h` and `x+h` for the new derivative estimate.
|
||||
# While this would increase the order of the formula and thus the
|
||||
# theoretical convergence rate, it is also less stable numerically.
|
||||
# (As noted above, there are other ways of processing the values that are
|
||||
# more stable. Thus, even now we store `f(x-h)` and `f(x+h)` in `work.fs`
|
||||
# to simplify future development of this sort of improvement.)
|
||||
#
|
||||
# The (right) one-sided formula is produced similarly using abscissae
|
||||
# x, x+h, x+h/d, x+h/d**2, ..., x+h/d**6, x+h/d**7
|
||||
# where `d` is the square root of `c`. (The left one-sided formula simply
|
||||
# uses -h.) When the step size is reduced by factor `c = d**2`, we have
|
||||
# abscissae:
|
||||
# x, x+h/d**2, x+h/d**3, ..., x+h/d**8, x+h/d**9
|
||||
# `d` is chosen as the square root of `c` so that the rate of the step-size
|
||||
# reduction is the same per iteration as in the central difference case.
|
||||
# Note that because the central difference formulas are inherently of even
|
||||
# order, for simplicity, we use only even-order formulas for one-sided
|
||||
# differences, too.
|
||||
|
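# Illustration only: a minimal sketch of the derivation above for the
# three-point case f(x), f(x+h), f(x-h). The helper name `fd_weights` and the
# test function `np.sin` are assumptions for this example; the code below this
# comment block builds the same kind of Vandermonde system for larger stencils.

```python
import numpy as np


def fd_weights(steps):
    # Solve for w such that f'(x) ~ sum(w[i] * f(x + steps[i]*h)) / h.
    steps = np.asarray(steps, dtype=np.float64)
    # Row k of A holds steps**k, the coefficient multiplying f^(k)(x)*h**k/k!.
    A = np.vander(steps, increasing=True).T
    b = np.zeros(len(steps))
    b[1] = 1.0  # keep the f'(x)*h term and cancel the others
    return np.linalg.solve(A, b)


w = fd_weights([0.0, 1.0, -1.0])   # weights for f(x), f(x+h), f(x-h)

x, h = 1.0, 1e-3
approx = (w[0]*np.sin(x) + w[1]*np.sin(x + h) + w[2]*np.sin(x - h)) / h
err = abs(approx - np.cos(x))      # error should scale like h**2
```

# The solution is the familiar second-order central difference,
# (f(x+h) - f(x-h)) / (2*h), with zero weight on f(x).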
||||
# It's possible for the user to specify `fac` in, say, double precision but
|
||||
# `x` and `args` in single precision. `fac` gets converted to single
|
||||
# precision, but we should always use double precision for the intermediate
|
||||
# calculations here to avoid additional error in the weights.
|
||||
fac = work.fac.astype(np.float64)
|
||||
|
||||
# Note that if the user switches back to floating point precision with
|
||||
# `x` and `args`, then `fac` will not necessarily equal the (lower
|
||||
# precision) cached `_differentiate_weights.fac`, and the weights will
|
||||
# need to be recalculated. This could be fixed, but it's late, and of
|
||||
# low consequence.
|
||||
if fac != _differentiate_weights.fac:
|
||||
_differentiate_weights.central = []
|
||||
_differentiate_weights.right = []
|
||||
_differentiate_weights.fac = fac
|
||||
|
||||
if len(_differentiate_weights.central) != 2*n + 1:
|
||||
# Central difference weights. Consider refactoring this; it could
|
||||
# probably be more compact.
|
||||
i = np.arange(-n, n + 1)
|
||||
p = np.abs(i) - 1. # center point has power `p` -1, but sign `s` is 0
|
||||
s = np.sign(i)
|
||||
|
||||
h = s / fac ** p
|
||||
A = np.vander(h, increasing=True).T
|
||||
b = np.zeros(2*n + 1)
|
||||
b[1] = 1
|
||||
weights = np.linalg.solve(A, b)
|
||||
|
||||
# Enforce identities to improve accuracy
|
||||
weights[n] = 0
|
||||
for i in range(n):
|
||||
weights[-i-1] = -weights[i]
|
||||
|
||||
# Cache the weights. We only need to calculate them once unless
|
||||
# the step factor changes.
|
||||
_differentiate_weights.central = weights
|
||||
|
||||
# One-sided difference weights. The left one-sided weights (with
|
||||
# negative steps) are simply the negative of the right one-sided
|
||||
# weights, so no need to compute them separately.
|
||||
i = np.arange(2*n + 1)
|
||||
p = i - 1.
|
||||
s = np.sign(i)
|
||||
|
||||
h = s / np.sqrt(fac) ** p
|
||||
A = np.vander(h, increasing=True).T
|
||||
b = np.zeros(2 * n + 1)
|
||||
b[1] = 1
|
||||
weights = np.linalg.solve(A, b)
|
||||
|
||||
_differentiate_weights.right = weights
|
||||
|
||||
return (_differentiate_weights.central.astype(work.dtype, copy=False),
|
||||
_differentiate_weights.right.astype(work.dtype, copy=False))
|
||||
_differentiate_weights.central = []
|
||||
_differentiate_weights.right = []
|
||||
_differentiate_weights.fac = None
|
||||
|
||||
|
||||
def _jacobian(func, x, *, atol=None, rtol=None, maxiter=10,
|
||||
order=8, initial_step=0.5, step_factor=2.0):
|
||||
r"""Evaluate the Jacobian of a function numerically.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The function whose Jacobian is desired. The signature must be::
|
||||
|
||||
func(x: ndarray) -> ndarray
|
||||
|
||||
where each element of ``x`` is a finite real. If the function to be
|
||||
differentiated accepts additional arguments, wrap it (e.g. using
|
||||
`functools.partial` or ``lambda``) and pass the wrapped callable
|
||||
into `_jacobian`. See Notes regarding vectorization and the dimensionality
|
||||
of the input and output.
|
||||
x : array_like
|
||||
Points at which to evaluate the Jacobian. Must have at least one dimension.
|
||||
See Notes regarding the dimensionality and vectorization.
|
||||
atol, rtol : float, optional
|
||||
Absolute and relative tolerances for the stopping condition: iteration
|
||||
will stop for each element of the Jacobian when
|
||||
``res.error < atol + rtol * abs(res.df)``. The default `atol` is the
|
||||
smallest normal number of the appropriate dtype, and the default `rtol`
|
||||
is the square root of the precision of the appropriate dtype.
|
||||
order : int, default: 8
|
||||
The (positive integer) order of the finite difference formula to be
|
||||
used. Odd integers will be rounded up to the next even integer.
|
||||
initial_step : float, default: 0.5
|
||||
The (absolute) initial step size for the finite difference derivative
|
||||
approximation.
|
||||
step_factor : float, default: 2.0
|
||||
The factor by which the step size is *reduced* in each iteration; i.e.
|
||||
the step size in iteration 1 is ``initial_step/step_factor``. If
|
||||
``step_factor < 1``, subsequent steps will be greater than the initial
|
||||
step; this may be useful if steps smaller than some threshold are
|
||||
undesirable (e.g. due to subtractive cancellation error).
|
||||
maxiter : int, default: 10
|
||||
The maximum number of iterations of the algorithm to perform.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : _RichResult
|
||||
An instance of `scipy._lib._util._RichResult` with the following
|
||||
attributes.
|
||||
|
||||
success : bool array
|
||||
``True`` when the algorithm terminated successfully (status ``0``).
|
||||
status : int array
|
||||
An integer representing the exit status of the algorithm.
|
||||
``0`` : The algorithm converged to the specified tolerances.
|
||||
``-1`` : The error estimate increased, so iteration was terminated.
|
||||
``-2`` : The maximum number of iterations was reached.
|
||||
``-3`` : A non-finite value was encountered.
|
||||
``-4`` : Iteration was terminated by `callback`.
|
||||
``1`` : The algorithm is proceeding normally (in `callback` only).
|
||||
df : float array
|
||||
The Jacobian of `func` at `x`, if the algorithm terminated
|
||||
successfully.
|
||||
error : float array
|
||||
An estimate of the error: the magnitude of the difference between
|
||||
the current estimate of the derivative and the estimate in the
|
||||
previous iteration.
|
||||
nit : int array
|
||||
The number of iterations performed.
|
||||
nfev : int array
|
||||
The number of points at which `func` was evaluated.
|
||||
x : float array
|
||||
The value at which the derivative of `func` was evaluated.
|
||||
|
||||
See Also
|
||||
--------
|
||||
_differentiate
|
||||
|
||||
Notes
|
||||
-----
|
||||
Suppose we wish to evaluate the Jacobian of a function
|
||||
:math:`f: \mathbf{R}^m \rightarrow \mathbf{R}^n`, and assign to variables
|
||||
``m`` and ``n`` the positive integer values of :math:`m` and :math:`n`,
|
||||
respectively. If we wish to evaluate the Jacobian at a single point,
|
||||
then:
|
||||
|
||||
- argument `x` must be an array of shape ``(m,)``
|
||||
- argument `func` must be vectorized to accept an array of shape ``(m, p)``.
|
||||
The first axis represents the :math:`m` inputs of :math:`f`; the second
|
||||
is for evaluating the function at multiple points in a single call.
|
||||
- argument `func` must return an array of shape ``(n, p)``. The first
|
||||
axis represents the :math:`n` outputs of :math:`f`; the second
|
||||
is for the result of evaluating the function at multiple points.
|
||||
- attribute ``df`` of the result object will be an array of shape ``(n, m)``,
|
||||
the Jacobian.
|
||||
|
||||
This function is also vectorized in the sense that the Jacobian can be
|
||||
evaluated at ``k`` points in a single call. In this case, `x` would be an
|
||||
array of shape ``(m, k)``, `func` would accept an array of shape
|
||||
``(m, k, p)`` and return an array of shape ``(n, k, p)``, and the ``df``
|
||||
attribute of the result would have shape ``(n, m, k)``.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Jacobian matrix and determinant, *Wikipedia*,
|
||||
https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant
|
||||
|
||||
Examples
|
||||
--------
|
||||
The Rosenbrock function maps from :math:`\mathbf{R}^m \rightarrow \mathbf{R}`;
|
||||
the SciPy implementation `scipy.optimize.rosen` is vectorized to accept an
|
||||
array of shape ``(m, p)`` and return an array of shape ``(p,)``. Suppose we wish
|
||||
to evaluate the Jacobian (AKA the gradient because the function returns a scalar)
|
||||
at ``[0.5, 0.5, 0.5]``.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize._differentiate import _jacobian as jacobian
|
||||
>>> from scipy.optimize import rosen, rosen_der
|
||||
>>> m = 3
|
||||
>>> x = np.full(m, 0.5)
|
||||
>>> res = jacobian(rosen, x)
|
||||
>>> ref = rosen_der(x) # reference value of the gradient
|
||||
>>> res.df, ref
|
||||
(array([-51., -1., 50.]), array([-51., -1., 50.]))
|
||||
|
||||
As an example of a function with multiple outputs, consider Example 4
|
||||
from [1]_.
|
||||
|
||||
>>> def f(x):
|
||||
... x1, x2, x3 = x
|
||||
... return [x1, 5*x3, 4*x2**2 - 2*x3, x3*np.sin(x1)]
|
||||
|
||||
The true Jacobian is given by:
|
||||
|
||||
>>> def df(x):
|
||||
... x1, x2, x3 = x
|
||||
... one = np.ones_like(x1)
|
||||
... return [[one, 0*one, 0*one],
|
||||
... [0*one, 0*one, 5*one],
|
||||
... [0*one, 8*x2, -2*one],
|
||||
... [x3*np.cos(x1), 0*one, np.sin(x1)]]
|
||||
|
||||
Evaluate the Jacobian at an arbitrary point.
|
||||
|
||||
>>> rng = np.random.default_rng(389252938452)
|
||||
>>> x = rng.random(size=3)
|
||||
>>> res = jacobian(f, x)
|
||||
>>> ref = df(x)
|
||||
>>> res.df.shape == (4, 3)
|
||||
True
|
||||
>>> np.allclose(res.df, ref)
|
||||
True
|
||||
|
||||
Evaluate the Jacobian at 10 arbitrary points in a single call.
|
||||
|
||||
>>> x = rng.random(size=(3, 10))
|
||||
>>> res = jacobian(f, x)
|
||||
>>> ref = df(x)
|
||||
>>> res.df.shape == (4, 3, 10)
|
||||
True
|
||||
>>> np.allclose(res.df, ref)
|
||||
True
|
||||
|
||||
"""
|
||||
x = np.asarray(x)
|
||||
int_dtype = np.issubdtype(x.dtype, np.integer)
|
||||
x0 = np.asarray(x, dtype=float) if int_dtype else x
|
||||
|
||||
if x0.ndim < 1:
|
||||
message = "Argument `x` must be at least 1-D."
|
||||
raise ValueError(message)
|
||||
|
||||
m = x0.shape[0]
|
||||
i = np.arange(m)
|
||||
|
||||
def wrapped(x):
|
||||
p = () if x.ndim == x0.ndim else (x.shape[-1],) # number of abscissae
|
||||
new_dims = (1,) if x.ndim == x0.ndim else (1, -1)
|
||||
new_shape = (m, m) + x0.shape[1:] + p
|
||||
xph = np.expand_dims(x0, new_dims)
|
||||
xph = np.broadcast_to(xph, new_shape).copy()
|
||||
xph[i, i] = x
|
||||
return func(xph)
|
||||
|
||||
res = _differentiate(wrapped, x, atol=atol, rtol=rtol,
|
||||
maxiter=maxiter, order=order, initial_step=initial_step,
|
||||
step_factor=step_factor, preserve_shape=True)
|
||||
del res.x # the user knows `x`, and the way it gets broadcasted is meaningless here
|
||||
return res
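The shape conventions described in the `_jacobian` docstring can be sketched with plain NumPy. This is a minimal illustration, not the implementation above: it assumes a fixed-step central difference instead of the adaptive `_differentiate` machinery, and the helper name `central_difference_jacobian` and the step `h` are hypothetical. As in `wrapped`, each column of the perturbed input changes exactly one component of `x`, so a single vectorized call with ``p = m`` abscissae yields one Jacobian column per abscissa.

```python
import numpy as np


def f(x):
    # Vectorized test function: x has shape (m, p) with m = 3,
    # and the result has shape (n, p) with n = 4.
    x1, x2, x3 = x
    return np.stack([x1, 5*x3, 4*x2**2 - 2*x3, x3*np.sin(x1)])


def central_difference_jacobian(func, x, h=1e-6):
    """Hypothetical helper: approximate the (n, m) Jacobian at x of shape (m,)."""
    x = np.asarray(x, dtype=float)
    m = x.shape[0]
    steps = h * np.eye(m)
    # Column j of x_plus / x_minus is x with only component j perturbed.
    x_plus = x[:, np.newaxis] + steps    # shape (m, m)
    x_minus = x[:, np.newaxis] - steps   # shape (m, m)
    # Each call returns shape (n, m); column j approximates d f / d x_j.
    return (func(x_plus) - func(x_minus)) / (2 * h)


x = np.array([0.3, 0.7, 1.1])
jac = central_difference_jacobian(f, x)

# Analytic Jacobian of the same function, for comparison.
x1, x2, x3 = x
expected = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.0, 5.0],
                     [0.0, 8*x2, -2.0],
                     [x3*np.cos(x1), 0.0, np.sin(x1)]])
```

A fixed step is simpler than the iterative refinement controlled by `initial_step` and `step_factor`, but it provides no error estimate, which is the main thing the adaptive scheme adds.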
|
||||
Binary file not shown.
278
venv/lib/python3.12/site-packages/scipy/optimize/_direct_py.py
Normal file
@ -0,0 +1,278 @@
|
||||
from __future__ import annotations
|
||||
from typing import ( # noqa: UP035
|
||||
Any, Callable, Iterable, TYPE_CHECKING
|
||||
)
|
||||
|
||||
import numpy as np
|
||||
from scipy.optimize import OptimizeResult
|
||||
from ._constraints import old_bound_to_new, Bounds
|
||||
from ._direct import direct as _direct # type: ignore
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import numpy.typing as npt
|
||||
|
||||
__all__ = ['direct']
|
||||
|
||||
ERROR_MESSAGES = (
|
||||
"Number of function evaluations done is larger than maxfun={}",
|
||||
"Number of iterations is larger than maxiter={}",
|
||||
"u[i] < l[i] for some i",
|
||||
"maxfun is too large",
|
||||
"Initialization failed",
|
||||
"There was an error in the creation of the sample points",
|
||||
"An error occurred while the function was sampled",
|
||||
"Maximum number of levels has been reached.",
|
||||
"Forced stop",
|
||||
"Invalid arguments",
|
||||
"Out of memory",
|
||||
)
|
||||
|
||||
SUCCESS_MESSAGES = (
|
||||
("The best function value found is within a relative error={} "
|
||||
"of the (known) global optimum f_min"),
|
||||
("The volume of the hyperrectangle containing the lowest function value "
|
||||
"found is below vol_tol={}"),
|
||||
("The side length measure of the hyperrectangle containing the lowest "
|
||||
"function value found is below len_tol={}"),
|
||||
)
|
||||
|
||||
|
||||
def direct(
|
||||
func: Callable[[npt.ArrayLike, tuple[Any]], float],
|
||||
bounds: Iterable | Bounds,
|
||||
*,
|
||||
args: tuple = (),
|
||||
eps: float = 1e-4,
|
||||
maxfun: int | None = None,
|
||||
maxiter: int = 1000,
|
||||
locally_biased: bool = True,
|
||||
f_min: float = -np.inf,
|
||||
f_min_rtol: float = 1e-4,
|
||||
vol_tol: float = 1e-16,
|
||||
len_tol: float = 1e-6,
|
||||
callback: Callable[[npt.ArrayLike], None] | None = None
|
||||
) -> OptimizeResult:
|
||||
"""
|
||||
Finds the global minimum of a function using the
|
||||
DIRECT algorithm.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The objective function to be minimized.
|
||||
``func(x, *args) -> float``
|
||||
where ``x`` is a 1-D array with shape (n,) and ``args`` is a tuple of
|
||||
the fixed parameters needed to completely specify the function.
|
||||
bounds : sequence or `Bounds`
|
||||
Bounds for variables. There are two ways to specify the bounds:
|
||||
|
||||
1. Instance of `Bounds` class.
|
||||
2. ``(min, max)`` pairs for each element in ``x``.
|
||||
|
||||
args : tuple, optional
|
||||
Any additional fixed parameters needed to
|
||||
completely specify the objective function.
|
||||
eps : float, optional
|
||||
Minimal required difference of the objective function values
|
||||
between the current best hyperrectangle and the next potentially
|
||||
optimal hyperrectangle to be divided. In consequence, `eps` serves as a
|
||||
tradeoff between local and global search: the smaller, the more local
|
||||
the search becomes. Default is 1e-4.
|
||||
maxfun : int or None, optional
|
||||
Approximate upper bound on objective function evaluations.
|
||||
If `None`, will be automatically set to ``1000 * N`` where ``N``
|
||||
represents the number of dimensions. Will be capped if necessary to
|
||||
limit DIRECT's RAM usage to approx. 1 GiB. This will only occur for very
|
||||
high-dimensional problems and excessive `maxfun`. Default is `None`.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations. Default is 1000.
|
||||
locally_biased : bool, optional
|
||||
If `True` (default), use the locally biased variant of the
|
||||
algorithm known as DIRECT_L. If `False`, use the original unbiased
|
||||
DIRECT algorithm. For hard problems with many local minima,
|
||||
`False` is recommended.
|
||||
f_min : float, optional
|
||||
Function value of the global optimum. Set this value only if the
|
||||
global optimum is known. Default is ``-np.inf``, so that this
|
||||
termination criterion is deactivated.
|
||||
f_min_rtol : float, optional
|
||||
Terminate the optimization once the relative error between the
|
||||
current best minimum `f` and the supplied global minimum `f_min`
|
||||
is smaller than `f_min_rtol`. This parameter is only used if
|
||||
`f_min` is also set. Must lie between 0 and 1. Default is 1e-4.
|
||||
vol_tol : float, optional
|
||||
Terminate the optimization once the volume of the hyperrectangle
|
||||
containing the lowest function value is smaller than `vol_tol`
|
||||
of the complete search space. Must lie between 0 and 1.
|
||||
Default is 1e-16.
|
||||
len_tol : float, optional
|
||||
If `locally_biased=True`, terminate the optimization once half of
|
||||
the normalized maximal side length of the hyperrectangle containing
|
||||
the lowest function value is smaller than `len_tol`.
|
||||
If `locally_biased=False`, terminate the optimization once half of
|
||||
the normalized diagonal of the hyperrectangle containing the lowest
|
||||
function value is smaller than `len_tol`. Must lie between 0 and 1.
|
||||
Default is 1e-6.
|
||||
callback : callable, optional
|
||||
A callback function with signature ``callback(xk)`` where ``xk``
|
||||
represents the best solution (the point ``x``) found so far.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
The optimization result represented as an ``OptimizeResult`` object.
|
||||
Important attributes are: ``x`` the solution array, ``success`` a
|
||||
Boolean flag indicating if the optimizer exited successfully and
|
||||
``message`` which describes the cause of the termination. See
|
||||
`OptimizeResult` for a description of other attributes.
|
||||
|
||||
Notes
|
||||
-----
|
||||
DIviding RECTangles (DIRECT) is a deterministic global
|
||||
optimization algorithm capable of minimizing a black box function with
|
||||
its variables subject to lower and upper bound constraints by sampling
|
||||
potential solutions in the search space [1]_. The algorithm starts by
|
||||
normalising the search space to an n-dimensional unit hypercube.
|
||||
It samples the function at the center of this hypercube and at 2n
|
||||
(n is the number of variables) more points, 2 in each coordinate
|
||||
direction. Using these function values, DIRECT then divides the
|
||||
domain into hyperrectangles, each having exactly one of the sampling
|
||||
points as its center. In each iteration, DIRECT chooses, using the `eps`
|
||||
parameter which defaults to 1e-4, some of the existing hyperrectangles
|
||||
to be further divided. This division process continues until either the
|
||||
maximum number of iterations or maximum function evaluations allowed
|
||||
are exceeded, or the hyperrectangle containing the minimal value found
|
||||
so far becomes small enough. If `f_min` is specified, the optimization
|
||||
will stop once this function value is reached within a relative tolerance.
|
||||
The locally biased variant of DIRECT (originally called DIRECT_L) [2]_ is
|
||||
used by default. It makes the search more locally biased and more
|
||||
efficient for cases with only a few local minima.
|
||||
|
||||
A note about termination criteria: `vol_tol` refers to the volume of the
|
||||
hyperrectangle containing the lowest function value found so far. This
|
||||
volume decreases exponentially with increasing dimensionality of the
|
||||
problem. Therefore `vol_tol` should be decreased to avoid premature
|
||||
termination of the algorithm for higher dimensions. This does not hold
|
||||
for `len_tol`: it refers either to half of the maximal side length
|
||||
(for ``locally_biased=True``) or half of the diagonal of the
|
||||
hyperrectangle (for ``locally_biased=False``).
|
||||
|
||||
This code is based on the DIRECT 2.0.4 Fortran code by Gablonsky et al. at
|
||||
https://ctk.math.ncsu.edu/SOFTWARE/DIRECTv204.tar.gz .
|
||||
This original version was initially converted via f2c and then cleaned up
|
||||
and reorganized by Steven G. Johnson, August 2007, for the NLopt project.
|
||||
The `direct` function wraps the C implementation.
|
||||
|
||||
.. versionadded:: 1.9.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Jones, D.R., Perttunen, C.D. & Stuckman, B.E. Lipschitzian
|
||||
optimization without the Lipschitz constant. J Optim Theory Appl
|
||||
79, 157-181 (1993).
|
||||
.. [2] Gablonsky, J., Kelley, C. A Locally-Biased form of the DIRECT
|
||||
Algorithm. Journal of Global Optimization 21, 27-37 (2001).
|
||||
|
||||
Examples
|
||||
--------
|
||||
The following example is a 2-D problem with four local minima: minimizing
|
||||
the Styblinski-Tang function
|
||||
(https://en.wikipedia.org/wiki/Test_functions_for_optimization).
|
||||
|
||||
>>> from scipy.optimize import direct, Bounds
|
||||
>>> def styblinski_tang(pos):
|
||||
... x, y = pos
|
||||
... return 0.5 * (x**4 - 16*x**2 + 5*x + y**4 - 16*y**2 + 5*y)
|
||||
>>> bounds = Bounds([-4., -4.], [4., 4.])
|
||||
>>> result = direct(styblinski_tang, bounds)
|
||||
>>> result.x, result.fun, result.nfev
|
||||
(array([-2.90321597, -2.90321597]), -78.3323279095383, 2011)
|
||||
|
||||
The correct global minimum was found but with a huge number of function
|
||||
evaluations (2011). Loosening the termination tolerances `vol_tol` and
|
||||
`len_tol` can be used to stop DIRECT earlier.
|
||||
|
||||
>>> result = direct(styblinski_tang, bounds, len_tol=1e-3)
|
||||
>>> result.x, result.fun, result.nfev
|
||||
(array([-2.9044353, -2.9044353]), -78.33230330754142, 207)
|
||||
|
||||
"""
|
||||
# convert bounds to new Bounds class if necessary
|
||||
if not isinstance(bounds, Bounds):
|
||||
if isinstance(bounds, list) or isinstance(bounds, tuple):
|
||||
lb, ub = old_bound_to_new(bounds)
|
||||
bounds = Bounds(lb, ub)
|
||||
else:
|
||||
message = ("bounds must be a sequence or "
|
||||
"instance of Bounds class")
|
||||
raise ValueError(message)
|
||||
|
||||
lb = np.ascontiguousarray(bounds.lb, dtype=np.float64)
|
||||
ub = np.ascontiguousarray(bounds.ub, dtype=np.float64)
|
||||
|
||||
# validate bounds
|
||||
# check that lower bounds are smaller than upper bounds
|
||||
if not np.all(lb < ub):
|
||||
raise ValueError('Bounds are not consistent min < max')
|
||||
# check for infs
|
||||
if (np.any(np.isinf(lb)) or np.any(np.isinf(ub))):
|
||||
raise ValueError("Bounds must not be inf.")
|
||||
|
||||
# validate tolerances
|
||||
if (vol_tol < 0 or vol_tol > 1):
|
||||
raise ValueError("vol_tol must be between 0 and 1.")
|
||||
if (len_tol < 0 or len_tol > 1):
|
||||
raise ValueError("len_tol must be between 0 and 1.")
|
||||
if (f_min_rtol < 0 or f_min_rtol > 1):
|
||||
raise ValueError("f_min_rtol must be between 0 and 1.")
|
||||
|
||||
# validate maxfun and maxiter
|
||||
if maxfun is None:
|
||||
maxfun = 1000 * lb.shape[0]
|
||||
if not isinstance(maxfun, int):
|
||||
raise ValueError("maxfun must be of type int.")
|
||||
if maxfun < 0:
|
||||
raise ValueError("maxfun must be >= 0.")
|
||||
if not isinstance(maxiter, int):
|
||||
raise ValueError("maxiter must be of type int.")
|
||||
if maxiter < 0:
|
||||
raise ValueError("maxiter must be >= 0.")
|
||||
|
||||
# validate boolean parameters
|
||||
if not isinstance(locally_biased, bool):
|
||||
raise ValueError("locally_biased must be True or False.")
|
||||
|
||||
def _func_wrap(x, args=None):
|
||||
x = np.asarray(x)
|
||||
if args is None:
|
||||
f = func(x)
|
||||
else:
|
||||
f = func(x, *args)
|
||||
# always return a float
|
||||
return np.asarray(f).item()
|
||||
|
||||
# TODO: fix disp argument
|
||||
x, fun, ret_code, nfev, nit = _direct(
|
||||
_func_wrap,
|
||||
np.asarray(lb), np.asarray(ub),
|
||||
args,
|
||||
False, eps, maxfun, maxiter,
|
||||
locally_biased,
|
||||
f_min, f_min_rtol,
|
||||
vol_tol, len_tol, callback
|
||||
)
|
||||
|
||||
format_val = (maxfun, maxiter, f_min_rtol, vol_tol, len_tol)
|
||||
if ret_code > 2:
|
||||
message = SUCCESS_MESSAGES[ret_code - 3].format(
|
||||
format_val[ret_code - 1])
|
||||
elif 0 < ret_code <= 2:
|
||||
message = ERROR_MESSAGES[ret_code - 1].format(format_val[ret_code - 1])
|
||||
elif 0 > ret_code > -100:
|
||||
message = ERROR_MESSAGES[abs(ret_code) + 1]
|
||||
else:
|
||||
message = ERROR_MESSAGES[ret_code + 99]
|
||||
|
||||
return OptimizeResult(x=np.asarray(x), fun=fun, status=ret_code,
|
||||
success=ret_code > 2, message=message,
|
||||
nfev=nfev, nit=nit)
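The way `direct` turns the integer `ret_code` into a `message` and a `success` flag can be sketched as a standalone function. This is an illustrative sketch only: the function name `describe_ret_code` is hypothetical, the message texts are abbreviated stand-ins for the tuples defined above, and the codes noted in the comments follow from the index arithmetic in the source, not from a statement about which codes the C routine actually returns.

```python
# Abbreviated stand-ins for ERROR_MESSAGES and SUCCESS_MESSAGES (same order).
ERROR_MESSAGES = (
    "maxfun={} exceeded",                 # ret_code 1
    "maxiter={} exceeded",                # ret_code 2
    "u[i] < l[i] for some i",             # ret_code -1
    "maxfun is too large",                # ret_code -2
    "Initialization failed",              # ret_code -3
    "Error creating sample points",       # ret_code -4
    "Error while sampling the function",  # ret_code -5
    "Maximum number of levels reached",   # ret_code -6
    "Forced stop",                        # ret_code -102
    "Invalid arguments",                  # ret_code -101
    "Out of memory",                      # ret_code -100
)
SUCCESS_MESSAGES = (
    "within relative error={} of f_min",       # ret_code 3
    "hyperrectangle volume below vol_tol={}",  # ret_code 4
    "side length measure below len_tol={}",    # ret_code 5
)


def describe_ret_code(ret_code, maxfun, maxiter, f_min_rtol, vol_tol, len_tol):
    """Hypothetical helper mirroring the message selection in `direct`."""
    format_val = (maxfun, maxiter, f_min_rtol, vol_tol, len_tol)
    if ret_code > 2:
        # Codes 3, 4, 5 report a satisfied tolerance.
        message = SUCCESS_MESSAGES[ret_code - 3].format(format_val[ret_code - 1])
    elif 0 < ret_code <= 2:
        # Codes 1, 2 report an exhausted evaluation or iteration budget.
        message = ERROR_MESSAGES[ret_code - 1].format(format_val[ret_code - 1])
    elif 0 > ret_code > -100:
        message = ERROR_MESSAGES[abs(ret_code) + 1]
    else:
        # Codes -100 and below index the tuple from its end.
        message = ERROR_MESSAGES[ret_code + 99]
    return ret_code > 2, message


ok, msg = describe_ret_code(4, 2000, 1000, 1e-4, 1e-16, 1e-6)
```

Only codes greater than 2 count as success, so a run that stops because `maxfun` or `maxiter` was exhausted reports ``success=False`` even when the returned point is a good one.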
|
||||
@ -0,0 +1,732 @@
|
||||
# Dual Annealing implementation.
|
||||
# Copyright (c) 2018 Sylvain Gubian <sylvain.gubian@pmi.com>,
|
||||
# Yang Xiang <yang.xiang@pmi.com>
|
||||
# Author: Sylvain Gubian, Yang Xiang, PMP S.A.
|
||||
|
||||
"""
|
||||
A Dual Annealing global optimization algorithm
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
from scipy.optimize import OptimizeResult
|
||||
from scipy.optimize import minimize, Bounds
|
||||
from scipy.special import gammaln
|
||||
from scipy._lib._util import check_random_state
|
||||
from scipy.optimize._constraints import new_bounds_to_old
|
||||
|
||||
__all__ = ['dual_annealing']
|
||||
|
||||
|
||||
class VisitingDistribution:
|
||||
"""
|
||||
Class used to generate new coordinates based on the distorted
|
||||
Cauchy-Lorentz distribution. Depending on the steps within the strategy
|
||||
chain, the class implements the strategy for generating new location
|
||||
changes.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
lb : array_like
|
||||
A 1-D NumPy ndarray containing lower bounds of the generated
|
||||
components. Neither NaN nor inf is allowed.
|
||||
ub : array_like
|
||||
A 1-D NumPy ndarray containing upper bounds for the generated
|
||||
components. Neither NaN nor inf is allowed.
|
||||
visiting_param : float
|
||||
Parameter for visiting distribution. Default value is 2.62.
|
||||
Higher values give the visiting distribution a heavier tail, which
|
||||
makes the algorithm jump to a more distant region.
|
||||
The value range is (1, 3]. Its value is fixed for the life of the
|
||||
object.
|
||||
rand_gen : {`~numpy.random.RandomState`, `~numpy.random.Generator`}
|
||||
A `~numpy.random.RandomState`, `~numpy.random.Generator` object
|
||||
for using the current state of the created random generator container.
|
||||
|
||||
"""
|
||||
TAIL_LIMIT = 1.e8
|
||||
MIN_VISIT_BOUND = 1.e-10
|
||||
|
||||
def __init__(self, lb, ub, visiting_param, rand_gen):
|
||||
# if you wish to make _visiting_param adjustable during the life of
|
||||
# the object then _factor2, _factor3, _factor5, _d1, _factor6 will
|
||||
# have to be dynamically calculated in `visit_fn`. They're factored
|
||||
# out here so they don't need to be recalculated all the time.
|
||||
self._visiting_param = visiting_param
|
||||
self.rand_gen = rand_gen
|
||||
self.lower = lb
|
||||
self.upper = ub
|
||||
self.bound_range = ub - lb
|
||||
|
||||
# these are invariant numbers unless visiting_param changes
|
||||
self._factor2 = np.exp((4.0 - self._visiting_param) * np.log(
|
||||
self._visiting_param - 1.0))
|
||||
self._factor3 = np.exp((2.0 - self._visiting_param) * np.log(2.0)
|
||||
/ (self._visiting_param - 1.0))
|
||||
self._factor4_p = np.sqrt(np.pi) * self._factor2 / (self._factor3 * (
|
||||
3.0 - self._visiting_param))
|
||||
|
||||
self._factor5 = 1.0 / (self._visiting_param - 1.0) - 0.5
|
||||
self._d1 = 2.0 - self._factor5
|
||||
self._factor6 = np.pi * (1.0 - self._factor5) / np.sin(
|
||||
np.pi * (1.0 - self._factor5)) / np.exp(gammaln(self._d1))
|
||||
|
||||
def visiting(self, x, step, temperature):
|
||||
""" Based on the step in the strategy chain, new coordinates are
|
||||
generated by changing all components at the same time or only
|
||||
one of them. The new values are computed with the `visit_fn` method.
|
||||
"""
|
||||
dim = x.size
|
||||
if step < dim:
|
||||
# Changing all coordinates with a new visiting value
|
||||
visits = self.visit_fn(temperature, dim)
|
||||
upper_sample, lower_sample = self.rand_gen.uniform(size=2)
|
||||
visits[visits > self.TAIL_LIMIT] = self.TAIL_LIMIT * upper_sample
|
||||
visits[visits < -self.TAIL_LIMIT] = -self.TAIL_LIMIT * lower_sample
|
||||
x_visit = visits + x
|
||||
a = x_visit - self.lower
|
||||
b = np.fmod(a, self.bound_range) + self.bound_range
|
||||
x_visit = np.fmod(b, self.bound_range) + self.lower
|
||||
x_visit[np.fabs(
|
||||
x_visit - self.lower) < self.MIN_VISIT_BOUND] += 1.e-10
|
||||
else:
|
||||
# Changing only one coordinate at a time based on strategy
|
||||
# chain step
|
||||
x_visit = np.copy(x)
|
||||
visit = self.visit_fn(temperature, 1)[0]
|
||||
if visit > self.TAIL_LIMIT:
|
||||
visit = self.TAIL_LIMIT * self.rand_gen.uniform()
|
||||
elif visit < -self.TAIL_LIMIT:
|
||||
visit = -self.TAIL_LIMIT * self.rand_gen.uniform()
|
||||
index = step - dim
|
||||
x_visit[index] = visit + x[index]
|
||||
a = x_visit[index] - self.lower[index]
|
||||
b = np.fmod(a, self.bound_range[index]) + self.bound_range[index]
|
||||
x_visit[index] = np.fmod(b, self.bound_range[
|
||||
index]) + self.lower[index]
|
||||
if np.fabs(x_visit[index] - self.lower[
|
||||
index]) < self.MIN_VISIT_BOUND:
|
||||
x_visit[index] += self.MIN_VISIT_BOUND
|
||||
return x_visit
|
||||
|
||||
def visit_fn(self, temperature, dim):
|
||||
""" Formula Visita from p. 405 of reference [2] """
|
||||
x, y = self.rand_gen.normal(size=(dim, 2)).T
|
||||
|
||||
factor1 = np.exp(np.log(temperature) / (self._visiting_param - 1.0))
|
||||
factor4 = self._factor4_p * factor1
|
||||
|
||||
# sigmax
|
||||
x *= np.exp(-(self._visiting_param - 1.0) * np.log(
|
||||
self._factor6 / factor4) / (3.0 - self._visiting_param))
|
||||
|
||||
den = np.exp((self._visiting_param - 1.0) * np.log(np.fabs(y)) /
|
||||
(3.0 - self._visiting_param))
|
||||
|
||||
return x / den
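The periodic wrapping that `VisitingDistribution.visiting` applies to candidate points can be sketched in isolation. This is a minimal sketch under stated assumptions: the helper name `wrap_into_bounds` is hypothetical, and the small nudge the source applies to values within `MIN_VISIT_BOUND` of the lower bound is omitted. The double `np.fmod` is needed because `np.fmod` keeps the sign of the dividend, so a point below the lower bound would otherwise stay negative.

```python
import numpy as np


def wrap_into_bounds(x, lower, upper):
    """Hypothetical helper: fold x periodically into [lower, upper)."""
    bound_range = upper - lower
    a = x - lower
    # First fmod maps a into (-range, range); adding range makes it positive.
    b = np.fmod(a, bound_range) + bound_range
    # Second fmod maps the positive value into [0, range).
    return np.fmod(b, bound_range) + lower


lower = np.array([0.0, -1.0])
upper = np.array([10.0, 1.0])

# 12.5 overshoots the first interval; -3.5 undershoots the second.
wrapped = wrap_into_bounds(np.array([12.5, -3.5]), lower, upper)

# A point already inside the bounds is left unchanged.
unchanged = wrap_into_bounds(np.array([3.0, 0.0]), lower, upper)
```

Wrapping instead of clipping keeps heavy-tailed jumps from piling up on the boundary, at the cost of treating the search domain as periodic.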
|
||||
|
||||
|
||||
class EnergyState:
|
||||
"""
|
||||
Class used to record the energy state. At any time, it knows the
|
||||
currently used coordinates and the most recent best location.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
lower : array_like
|
||||
A 1-D NumPy ndarray containing lower bounds for generating the initial
|
||||
random components in the `reset` method.
|
||||
upper : array_like
|
||||
A 1-D NumPy ndarray containing upper bounds for generating the initial
|
||||
random components in the `reset` method. Neither NaN nor inf is allowed.
|
||||
callback : callable, ``callback(x, f, context)``, optional
|
||||
A callback function which will be called for all minima found.
|
||||
``x`` and ``f`` are the coordinates and function value of the
|
||||
latest minimum found, and `context` has value in [0, 1, 2]
|
||||
"""
|
||||
# Maximum number of trials for generating a valid starting point
|
||||
MAX_REINIT_COUNT = 1000
|
||||
|
||||
def __init__(self, lower, upper, callback=None):
|
||||
self.ebest = None
|
||||
self.current_energy = None
|
||||
self.current_location = None
|
||||
self.xbest = None
|
||||
self.lower = lower
|
||||
self.upper = upper
|
||||
self.callback = callback
|
||||
|
||||
def reset(self, func_wrapper, rand_gen, x0=None):
|
||||
"""
|
||||
Initialize the current location in the search domain. If `x0` is not
|
||||
provided, a random location within the bounds is generated.
|
||||
"""
|
||||
if x0 is None:
|
||||
self.current_location = rand_gen.uniform(self.lower, self.upper,
|
||||
size=len(self.lower))
|
||||
else:
|
||||
self.current_location = np.copy(x0)
|
||||
init_error = True
|
||||
reinit_counter = 0
|
||||
while init_error:
|
||||
self.current_energy = func_wrapper.fun(self.current_location)
|
||||
if self.current_energy is None:
|
||||
raise ValueError('Objective function is returning None')
|
||||
if (not np.isfinite(self.current_energy) or np.isnan(
|
||||
self.current_energy)):
|
||||
if reinit_counter >= EnergyState.MAX_REINIT_COUNT:
|
||||
init_error = False
|
||||
message = (
|
||||
'Stopping algorithm because function '
|
||||
'creates NaN or (+/-) infinity values even after '
|
||||
'trying new random parameters'
|
||||
)
|
||||
raise ValueError(message)
|
||||
self.current_location = rand_gen.uniform(self.lower,
|
||||
self.upper,
|
||||
size=self.lower.size)
|
||||
reinit_counter += 1
|
||||
else:
|
||||
init_error = False
|
||||
# If first time reset, initialize ebest and xbest
|
||||
if self.ebest is None and self.xbest is None:
|
||||
self.ebest = self.current_energy
|
||||
self.xbest = np.copy(self.current_location)
|
||||
# Otherwise, we keep them in case of reannealing reset
|
||||
|
||||
def update_best(self, e, x, context):
|
||||
self.ebest = e
|
||||
self.xbest = np.copy(x)
|
||||
if self.callback is not None:
|
||||
val = self.callback(x, e, context)
|
||||
if val is not None:
|
||||
if val:
|
||||
return ('Callback function requested to stop early by '
|
||||
'returning True')
|
||||
|
||||
def update_current(self, e, x):
|
||||
self.current_energy = e
|
||||
self.current_location = np.copy(x)
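The retry logic in `EnergyState.reset` (re-draw a random location until the objective is finite, and give up after a fixed number of attempts) can be sketched as a free function. This is an illustrative sketch, not the class above: the name `find_finite_start`, the `max_tries` parameter, and the `picky` objective are hypothetical, and the sketch always starts from a random point where the source can start from a user-supplied `x0`.

```python
import numpy as np


def find_finite_start(func, lower, upper, rng, max_tries=1000):
    """Hypothetical helper: draw uniform points until func returns a finite value."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = rng.uniform(lower, upper, size=lower.size)
    for _ in range(max_tries + 1):
        energy = func(x)
        if energy is None:
            raise ValueError("Objective function is returning None")
        if np.isfinite(energy):  # False for NaN as well as +/- infinity
            return x, energy
        # Non-finite value: try a fresh random location within the bounds.
        x = rng.uniform(lower, upper, size=lower.size)
    raise ValueError("No finite objective value found within the retry budget")


def picky(x):
    # Hypothetical objective that is undefined (NaN) on half of the domain.
    return float(np.sum(x**2)) if x[0] > 0 else float("nan")


rng = np.random.default_rng(0)
start, energy = find_finite_start(picky, [-1.0, -1.0], [1.0, 1.0], rng)
```

Since `np.isfinite` already returns ``False`` for NaN, a single finiteness check covers both cases.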
|
||||
|
||||
|
||||
class StrategyChain:
|
||||
"""
|
||||
Class that implements within a Markov chain the strategy for location
|
||||
acceptance and local search decision making.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
acceptance_param : float
|
||||
Parameter for acceptance distribution. It is used to control the
|
||||
probability of acceptance. The lower the acceptance parameter, the
|
||||
smaller the probability of acceptance. Default value is -5.0 with
|
||||
a range (-1e4, -5].
|
||||
visit_dist : VisitingDistribution
|
||||
Instance of `VisitingDistribution` class.
|
||||
func_wrapper : ObjectiveFunWrapper
|
||||
Instance of `ObjectiveFunWrapper` class.
|
||||
minimizer_wrapper: LocalSearchWrapper
|
||||
Instance of `LocalSearchWrapper` class.
|
||||
rand_gen : {None, int, `numpy.random.Generator`,
|
||||
`numpy.random.RandomState`}, optional
|
||||
|
||||
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
|
||||
singleton is used.
|
||||
If `seed` is an int, a new ``RandomState`` instance is used,
|
||||
seeded with `seed`.
|
||||
If `seed` is already a ``Generator`` or ``RandomState`` instance then
|
||||
that instance is used.
|
||||
energy_state: EnergyState
|
||||
Instance of `EnergyState` class.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self, acceptance_param, visit_dist, func_wrapper,
|
||||
minimizer_wrapper, rand_gen, energy_state):
|
||||
# Local strategy chain minimum energy and location
|
||||
self.emin = energy_state.current_energy
|
||||
self.xmin = np.array(energy_state.current_location)
|
||||
# Global optimizer state
|
||||
self.energy_state = energy_state
|
||||
# Acceptance parameter
|
||||
self.acceptance_param = acceptance_param
|
||||
# Visiting distribution instance
|
||||
self.visit_dist = visit_dist
|
||||
# Wrapper to objective function
|
||||
self.func_wrapper = func_wrapper
|
||||
# Wrapper to the local minimizer
|
||||
self.minimizer_wrapper = minimizer_wrapper
|
||||
self.not_improved_idx = 0
|
||||
self.not_improved_max_idx = 1000
|
||||
self._rand_gen = rand_gen
|
||||
self.temperature_step = 0
|
||||
self.K = 100 * len(energy_state.current_location)
|
||||
|
||||
def accept_reject(self, j, e, x_visit):
|
||||
r = self._rand_gen.uniform()
|
||||
pqv_temp = 1.0 - ((1.0 - self.acceptance_param) *
|
||||
(e - self.energy_state.current_energy) / self.temperature_step)
|
||||
if pqv_temp <= 0.:
|
||||
pqv = 0.
|
||||
else:
|
||||
pqv = np.exp(np.log(pqv_temp) / (
|
||||
1. - self.acceptance_param))
|
||||
|
||||
if r <= pqv:
|
||||
# We accept the new location and update state
|
||||
self.energy_state.update_current(e, x_visit)
|
||||
self.xmin = np.copy(self.energy_state.current_location)
|
||||
|
||||
# No improvement for a long time
|
||||
if self.not_improved_idx >= self.not_improved_max_idx:
|
||||
if j == 0 or self.energy_state.current_energy < self.emin:
|
||||
self.emin = self.energy_state.current_energy
|
||||
self.xmin = np.copy(self.energy_state.current_location)
|
||||
|
||||
def run(self, step, temperature):
|
||||
self.temperature_step = temperature / float(step + 1)
|
||||
self.not_improved_idx += 1
|
||||
for j in range(self.energy_state.current_location.size * 2):
|
||||
if j == 0:
|
||||
if step == 0:
|
||||
self.energy_state_improved = True
|
||||
else:
|
||||
self.energy_state_improved = False
|
||||
x_visit = self.visit_dist.visiting(
|
||||
self.energy_state.current_location, j, temperature)
|
||||
# Calling the objective function
|
||||
e = self.func_wrapper.fun(x_visit)
|
||||
if e < self.energy_state.current_energy:
|
||||
# We have got a better energy value
|
||||
self.energy_state.update_current(e, x_visit)
|
||||
if e < self.energy_state.ebest:
|
||||
val = self.energy_state.update_best(e, x_visit, 0)
|
||||
if val is not None:
|
||||
if val:
|
||||
return val
|
||||
self.energy_state_improved = True
|
||||
self.not_improved_idx = 0
|
||||
else:
|
||||
# We have not improved but do we accept the new location?
|
||||
self.accept_reject(j, e, x_visit)
|
||||
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
|
||||
return ('Maximum number of function calls reached '
|
||||
'during annealing')
|
||||
# End of StrategyChain loop
|
||||
|
||||
def local_search(self):
|
||||
# Decision making for performing a local search
|
||||
# based on strategy chain results
|
||||
# If energy has been improved or there has been no improvement for too long,
|
||||
# performing a local search with the best strategy chain location
|
||||
if self.energy_state_improved:
|
||||
# Global energy has improved, let's see if LS improves further
|
||||
e, x = self.minimizer_wrapper.local_search(self.energy_state.xbest,
|
||||
self.energy_state.ebest)
|
||||
if e < self.energy_state.ebest:
|
||||
self.not_improved_idx = 0
|
||||
val = self.energy_state.update_best(e, x, 1)
|
||||
if val is not None:
|
||||
if val:
|
||||
return val
|
||||
self.energy_state.update_current(e, x)
|
||||
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
|
||||
return ('Maximum number of function calls reached '
|
||||
'during local search')
|
||||
# Check probability of a need to perform a LS even if no improvement
|
||||
do_ls = False
|
||||
if self.K < 90 * len(self.energy_state.current_location):
|
||||
pls = np.exp(self.K * (
|
||||
self.energy_state.ebest - self.energy_state.current_energy) /
|
||||
self.temperature_step)
|
||||
if pls >= self._rand_gen.uniform():
|
||||
do_ls = True
|
||||
# Global energy not improved, let's see what LS gives
|
||||
# on the best strategy chain location
|
||||
if self.not_improved_idx >= self.not_improved_max_idx:
|
||||
do_ls = True
|
||||
if do_ls:
|
||||
e, x = self.minimizer_wrapper.local_search(self.xmin, self.emin)
|
||||
self.xmin = np.copy(x)
|
||||
self.emin = e
|
||||
self.not_improved_idx = 0
|
||||
self.not_improved_max_idx = self.energy_state.current_location.size
|
||||
if e < self.energy_state.ebest:
|
||||
val = self.energy_state.update_best(
|
||||
self.emin, self.xmin, 2)
|
||||
if val is not None:
|
||||
if val:
|
||||
return val
|
||||
self.energy_state.update_current(e, x)
|
||||
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
|
||||
return ('Maximum number of function calls reached '
|
||||
'during dual annealing')
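The acceptance rule evaluated in `StrategyChain.accept_reject` can be written as a pure function of the energy increase and the step temperature. This is a minimal sketch: the function name `acceptance_probability` and its argument names are hypothetical, the default of -5.0 is taken from the class docstring, and the random draw and state updates of the method are left out.

```python
import math


def acceptance_probability(delta_e, temperature_step, acceptance_param=-5.0):
    """Hypothetical helper: probability of accepting an energy increase delta_e."""
    pqv_temp = 1.0 - (1.0 - acceptance_param) * delta_e / temperature_step
    if pqv_temp <= 0.0:
        # The increase is too large for this temperature: never accept.
        return 0.0
    return math.exp(math.log(pqv_temp) / (1.0 - acceptance_param))


p_equal = acceptance_probability(0.0, 1.0)   # no energy change
p_small = acceptance_probability(0.1, 1.0)   # modest increase
p_large = acceptance_probability(1.0, 1.0)   # large increase
p_hot = acceptance_probability(0.1, 2.0)     # same increase, higher temperature
```

In `run`, this rule is consulted only when the candidate energy is not lower than the current energy, so `delta_e` is non-negative there and the result stays within [0, 1].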
|
||||
|
||||
|
||||
class ObjectiveFunWrapper:
|
||||
|
||||
def __init__(self, func, maxfun=1e7, *args):
|
||||
self.func = func
|
||||
self.args = args
|
||||
# Number of objective function evaluations
|
||||
self.nfev = 0
|
||||
# Number of gradient function evaluations, if used
|
||||
self.ngev = 0
|
||||
# Number of Hessian evaluations of the objective function, if used
|
||||
self.nhev = 0
|
||||
self.maxfun = maxfun
|
||||
|
||||
def fun(self, x):
|
||||
self.nfev += 1
|
||||
return self.func(x, *self.args)
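How a counting wrapper of this kind is used to enforce an evaluation budget can be sketched as follows. The class `CountingObjective` is a hypothetical stand-in with the same constructor signature as `ObjectiveFunWrapper`, and `shifted_square` is an illustrative objective. Since `maxfun` precedes `*args` in the signature, extra objective arguments can only be passed positionally after an explicit `maxfun`.

```python
class CountingObjective:
    """Hypothetical stand-in that counts calls to the wrapped objective."""

    def __init__(self, func, maxfun=1e7, *args):
        self.func = func
        self.args = args
        self.nfev = 0          # number of objective evaluations so far
        self.maxfun = maxfun   # evaluation budget checked by the caller

    def fun(self, x):
        self.nfev += 1
        return self.func(x, *self.args)


def shifted_square(x, shift):
    return (x - shift) ** 2


# maxfun=3, and the extra argument shift=1.0 is forwarded to the objective.
wrapper = CountingObjective(shifted_square, 3, 1.0)

values = []
x = 0.0
while wrapper.nfev < wrapper.maxfun:
    values.append(wrapper.fun(x))
    x += 1.0
```

The wrapper itself never refuses a call. The budget is enforced by callers comparing `nfev` with `maxfun` after each evaluation, which is how `StrategyChain.run` and `StrategyChain.local_search` use it.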
|
||||
|
||||
|
||||
class LocalSearchWrapper:
|
||||
"""
|
||||
Class used to wrap around the minimizer used for local search.
|
||||
The default local minimizer is SciPy's `minimize` with the L-BFGS-B method.
|
||||
"""
|
||||
|
||||
LS_MAXITER_RATIO = 6
|
||||
LS_MAXITER_MIN = 100
|
||||
LS_MAXITER_MAX = 1000
|
||||
|
||||
def __init__(self, search_bounds, func_wrapper, *args, **kwargs):
|
||||
self.func_wrapper = func_wrapper
|
||||
self.kwargs = kwargs
|
||||
self.jac = self.kwargs.get('jac', None)
|
||||
self.hess = self.kwargs.get('hess', None)
|
||||
self.hessp = self.kwargs.get('hessp', None)
|
||||
self.kwargs.pop("args", None)
|
||||
self.minimizer = minimize
|
||||
bounds_list = list(zip(*search_bounds))
|
||||
self.lower = np.array(bounds_list[0])
|
||||
self.upper = np.array(bounds_list[1])
|
||||
|
||||
# If no minimizer specified, use SciPy minimize with 'L-BFGS-B' method
|
||||
if not self.kwargs:
|
||||
n = len(self.lower)
|
||||
ls_max_iter = min(max(n * self.LS_MAXITER_RATIO,
|
||||
self.LS_MAXITER_MIN),
|
||||
self.LS_MAXITER_MAX)
|
||||
self.kwargs['method'] = 'L-BFGS-B'
|
||||
self.kwargs['options'] = {
|
||||
'maxiter': ls_max_iter,
|
||||
}
|
||||
self.kwargs['bounds'] = list(zip(self.lower, self.upper))
|
||||
else:
|
||||
if callable(self.jac):
|
||||
def wrapped_jac(x):
|
||||
return self.jac(x, *args)
|
||||
self.kwargs['jac'] = wrapped_jac
|
||||
if callable(self.hess):
|
||||
def wrapped_hess(x):
|
||||
return self.hess(x, *args)
|
||||
self.kwargs['hess'] = wrapped_hess
|
||||
if callable(self.hessp):
|
||||
def wrapped_hessp(x, p):
|
||||
return self.hessp(x, p, *args)
|
||||
self.kwargs['hessp'] = wrapped_hessp
|
||||
|
||||
def local_search(self, x, e):
|
||||
# Run local search from the given x location where energy value is e
|
||||
x_tmp = np.copy(x)
|
||||
mres = self.minimizer(self.func_wrapper.fun, x, **self.kwargs)
|
||||
if 'njev' in mres:
|
||||
self.func_wrapper.ngev += mres.njev
|
||||
if 'nhev' in mres:
|
||||
self.func_wrapper.nhev += mres.nhev
|
||||
# Check that the result is finite and within bounds
|
||||
is_finite = np.all(np.isfinite(mres.x)) and np.isfinite(mres.fun)
|
||||
in_bounds = np.all(mres.x >= self.lower) and np.all(
|
||||
mres.x <= self.upper)
|
||||
is_valid = is_finite and in_bounds
|
||||
|
||||
# Use the new point only if it is valid and gives a better result
|
||||
if is_valid and mres.fun < e:
|
||||
return mres.fun, mres.x
|
||||
else:
|
||||
return e, x_tmp
|
||||
|
||||
|
||||
def dual_annealing(func, bounds, args=(), maxiter=1000,
|
||||
minimizer_kwargs=None, initial_temp=5230.,
|
||||
restart_temp_ratio=2.e-5, visit=2.62, accept=-5.0,
|
||||
maxfun=1e7, seed=None, no_local_search=False,
|
||||
callback=None, x0=None):
|
||||
"""
|
||||
Find the global minimum of a function using Dual Annealing.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
The objective function to be minimized. Must be in the form
|
||||
``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array
|
||||
and ``args`` is a tuple of any additional fixed parameters needed to
|
||||
completely specify the function.
|
||||
bounds : sequence or `Bounds`
|
||||
Bounds for variables. There are two ways to specify the bounds:
|
||||
|
||||
1. Instance of `Bounds` class.
|
||||
2. Sequence of ``(min, max)`` pairs for each element in `x`.
|
||||
|
||||
args : tuple, optional
|
||||
Any additional fixed parameters needed to completely specify the
|
||||
objective function.
|
||||
maxiter : int, optional
|
||||
The maximum number of global search iterations. Default value is 1000.
|
||||
minimizer_kwargs : dict, optional
|
||||
Keyword arguments to be passed to the local minimizer
|
||||
(`minimize`). An important option could be ``method`` for the minimizer
|
||||
method to use.
|
||||
If no keyword arguments are provided, the local minimizer defaults to
|
||||
'L-BFGS-B' and uses the already supplied bounds. If `minimizer_kwargs`
|
||||
is specified, then the dict must contain all parameters required to
|
||||
control the local minimization. `args` is ignored in this dict, as it is
|
||||
passed automatically. `bounds` is not automatically passed on to the
|
||||
local minimizer as the method may not support them.
|
||||
initial_temp : float, optional
|
||||
The initial temperature. Use higher values to facilitate a wider
|
||||
search of the energy landscape, allowing dual_annealing to escape
|
||||
local minima that it is trapped in. Default value is 5230. Range is
|
||||
(0.01, 5.e4].
|
||||
restart_temp_ratio : float, optional
|
||||
During the annealing process, the temperature decreases; when it
|
||||
reaches ``initial_temp * restart_temp_ratio``, the reannealing process
|
||||
is triggered. Default value of the ratio is 2e-5. Range is (0, 1).
|
||||
visit : float, optional
|
||||
Parameter for visiting distribution. Default value is 2.62. Higher
|
||||
values give the visiting distribution a heavier tail, which makes
|
||||
the algorithm jump to a more distant region. The value range is (1, 3].
|
||||
accept : float, optional
|
||||
Parameter for acceptance distribution. It is used to control the
|
||||
probability of acceptance. The lower the acceptance parameter, the
|
||||
smaller the probability of acceptance. Default value is -5.0 with
|
||||
a range (-1e4, -5].
|
||||
maxfun : int, optional
|
||||
Soft limit for the number of objective function calls. If the
|
||||
algorithm is in the middle of a local search, this number will be
|
||||
exceeded, and the algorithm will stop just after the local search is
|
||||
done. Default value is 1e7.
|
||||
seed : {None, int, `numpy.random.Generator`, `numpy.random.RandomState`}, optional
|
||||
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
|
||||
singleton is used.
|
||||
If `seed` is an int, a new ``RandomState`` instance is used,
|
||||
seeded with `seed`.
|
||||
If `seed` is already a ``Generator`` or ``RandomState`` instance then
|
||||
that instance is used.
|
||||
Specify `seed` for repeatable minimizations. The random numbers
|
||||
generated with this seed only affect the visiting distribution function
|
||||
and new coordinates generation.
|
||||
no_local_search : bool, optional
|
||||
If `no_local_search` is set to True, a traditional Generalized
|
||||
Simulated Annealing will be performed with no local search
|
||||
strategy applied.
|
||||
callback : callable, optional
|
||||
A callback function with signature ``callback(x, f, context)``,
|
||||
which will be called for all minima found.
|
||||
``x`` and ``f`` are the coordinates and function value of the
|
||||
latest minimum found, and ``context`` takes a value in [0, 1, 2], with the
|
||||
following meaning:
|
||||
|
||||
- 0: minimum detected in the annealing process.
|
||||
- 1: detection occurred in the local search process.
|
||||
- 2: detection done in the dual annealing process.
|
||||
|
||||
If the callback implementation returns True, the algorithm will stop.
|
||||
x0 : ndarray, shape(n,), optional
|
||||
Coordinates of a single N-D starting point.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
The optimization result represented as an `OptimizeResult` object.
|
||||
Important attributes are: ``x`` the solution array, ``fun`` the value
|
||||
of the function at the solution, and ``message`` which describes the
|
||||
cause of the termination.
|
||||
See `OptimizeResult` for a description of other attributes.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This function implements the Dual Annealing optimization. This stochastic
|
||||
approach derived from [3]_ combines the generalization of CSA (Classical
|
||||
Simulated Annealing) and FSA (Fast Simulated Annealing) [1]_ [2]_ coupled
|
||||
to a strategy for applying a local search on accepted locations [4]_.
|
||||
An alternative implementation of this same algorithm is described in [5]_
|
||||
and benchmarks are presented in [6]_. This approach introduces an advanced
|
||||
method to refine the solution found by the generalized annealing
|
||||
process. This algorithm uses a distorted Cauchy-Lorentz visiting
|
||||
distribution, with its shape controlled by the parameter :math:`q_{v}`
|
||||
|
||||
.. math::
|
||||
|
||||
g_{q_{v}}(\\Delta x(t)) \\propto \\frac{ \\
|
||||
\\left[T_{q_{v}}(t) \\right]^{-\\frac{D}{3-q_{v}}}}{ \\
|
||||
\\left[{1+(q_{v}-1)\\frac{(\\Delta x(t))^{2}} { \\
|
||||
\\left[T_{q_{v}}(t)\\right]^{\\frac{2}{3-q_{v}}}}}\\right]^{ \\
|
||||
\\frac{1}{q_{v}-1}+\\frac{D-1}{2}}}
|
||||
|
||||
Where :math:`t` is the artificial time. This visiting distribution is used
|
||||
to generate a trial jump distance :math:`\\Delta x(t)` of variable
|
||||
:math:`x(t)` under artificial temperature :math:`T_{q_{v}}(t)`.
|
||||
|
||||
From the starting point, after calling the visiting distribution
|
||||
function, the acceptance probability is computed as follows:
|
||||
|
||||
.. math::
|
||||
|
||||
p_{q_{a}} = \\min{\\{1,\\left[1-(1-q_{a}) \\beta \\Delta E \\right]^{ \\
|
||||
\\frac{1}{1-q_{a}}}\\}}
|
||||
|
||||
Where :math:`q_{a}` is an acceptance parameter. For :math:`q_{a}<1`, zero
|
||||
acceptance probability is assigned to the cases where
|
||||
|
||||
.. math::
|
||||
|
||||
[1-(1-q_{a}) \\beta \\Delta E] < 0
|
||||
|
||||
The artificial temperature :math:`T_{q_{v}}(t)` is decreased according to
|
||||
|
||||
.. math::
|
||||
|
||||
T_{q_{v}}(t) = T_{q_{v}}(1) \\frac{2^{q_{v}-1}-1}{\\left( \\
|
||||
1 + t\\right)^{q_{v}-1}-1}
|
||||
|
||||
Where :math:`q_{v}` is the visiting parameter.
|
||||
|
||||
.. versionadded:: 1.2.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Tsallis C. Possible generalization of Boltzmann-Gibbs
|
||||
statistics. Journal of Statistical Physics, 52, 479-487 (1988).
|
||||
.. [2] Tsallis C, Stariolo DA. Generalized Simulated Annealing.
|
||||
Physica A, 233, 395-406 (1996).
|
||||
.. [3] Xiang Y, Sun DY, Fan W, Gong XG. Generalized Simulated
|
||||
Annealing Algorithm and Its Application to the Thomson Model.
|
||||
Physics Letters A, 233, 216-220 (1997).
|
||||
.. [4] Xiang Y, Gong XG. Efficiency of Generalized Simulated
|
||||
Annealing. Physical Review E, 62, 4473 (2000).
|
||||
.. [5] Xiang Y, Gubian S, Suomela B, Hoeng J. Generalized
|
||||
Simulated Annealing for Efficient Global Optimization: the GenSA
|
||||
Package for R. The R Journal, Volume 5/1 (2013).
|
||||
.. [6] Mullen, K. Continuous Global Optimization in R. Journal of
|
||||
Statistical Software, 60(6), 1 - 45, (2014).
|
||||
:doi:`10.18637/jss.v060.i06`
|
||||
|
||||
Examples
|
||||
--------
|
||||
The following example is a 10-D problem, with many local minima.
|
||||
The function involved is called Rastrigin
|
||||
(https://en.wikipedia.org/wiki/Rastrigin_function)
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize import dual_annealing
|
||||
>>> func = lambda x: np.sum(x*x - 10*np.cos(2*np.pi*x)) + 10*np.size(x)
|
||||
>>> lw = [-5.12] * 10
|
||||
>>> up = [5.12] * 10
|
||||
>>> ret = dual_annealing(func, bounds=list(zip(lw, up)))
|
||||
>>> ret.x
|
||||
array([-4.26437714e-09, -3.91699361e-09, -1.86149218e-09, -3.97165720e-09,
|
||||
-6.29151648e-09, -6.53145322e-09, -3.93616815e-09, -6.55623025e-09,
|
||||
-6.05775280e-09, -5.00668935e-09]) # random
|
||||
>>> ret.fun
|
||||
0.000000
|
||||
|
||||
"""
|
||||
|
||||
if isinstance(bounds, Bounds):
|
||||
bounds = new_bounds_to_old(bounds.lb, bounds.ub, len(bounds.lb))
|
||||
|
||||
if x0 is not None and not len(x0) == len(bounds):
|
||||
raise ValueError('Bounds size does not match x0')
|
||||
|
||||
lu = list(zip(*bounds))
|
||||
lower = np.array(lu[0])
|
||||
upper = np.array(lu[1])
|
||||
# Check that restart temperature ratio is correct
|
||||
if restart_temp_ratio <= 0. or restart_temp_ratio >= 1.:
|
||||
raise ValueError('Restart temperature ratio has to be in range (0, 1)')
|
||||
# Checking bounds are valid
|
||||
if (np.any(np.isinf(lower)) or np.any(np.isinf(upper)) or np.any(
|
||||
np.isnan(lower)) or np.any(np.isnan(upper))):
|
||||
raise ValueError('Some bounds values are inf values or nan values')
|
||||
# Checking that bounds are consistent
|
||||
if not np.all(lower < upper):
|
||||
raise ValueError('Bounds are not consistent min < max')
|
||||
# Checking that bounds are the same length
|
||||
if not len(lower) == len(upper):
|
||||
raise ValueError('Bounds do not have the same dimensions')
|
||||
|
||||
# Wrapper for the objective function
|
||||
func_wrapper = ObjectiveFunWrapper(func, maxfun, *args)
|
||||
|
||||
# minimizer_kwargs has to be a dict, not None
|
||||
minimizer_kwargs = minimizer_kwargs or {}
|
||||
|
||||
minimizer_wrapper = LocalSearchWrapper(
|
||||
bounds, func_wrapper, *args, **minimizer_kwargs)
|
||||
|
||||
# Initialization of random Generator for reproducible runs if seed provided
|
||||
rand_state = check_random_state(seed)
|
||||
# Initialization of the energy state
|
||||
energy_state = EnergyState(lower, upper, callback)
|
||||
energy_state.reset(func_wrapper, rand_state, x0)
|
||||
# Minimum value of annealing temperature reached to perform
|
||||
# re-annealing
|
||||
temperature_restart = initial_temp * restart_temp_ratio
|
||||
# VisitingDistribution instance
|
||||
visit_dist = VisitingDistribution(lower, upper, visit, rand_state)
|
||||
# Strategy chain instance
|
||||
strategy_chain = StrategyChain(accept, visit_dist, func_wrapper,
|
||||
minimizer_wrapper, rand_state, energy_state)
|
||||
need_to_stop = False
|
||||
iteration = 0
|
||||
message = []
|
||||
# OptimizeResult object to be returned
|
||||
optimize_res = OptimizeResult()
|
||||
optimize_res.success = True
|
||||
optimize_res.status = 0
|
||||
|
||||
t1 = np.exp((visit - 1) * np.log(2.0)) - 1.0
|
||||
# Run the search loop
|
||||
while not need_to_stop:
|
||||
for i in range(maxiter):
|
||||
# Compute temperature for this step
|
||||
s = float(i) + 2.0
|
||||
t2 = np.exp((visit - 1) * np.log(s)) - 1.0
|
||||
temperature = initial_temp * t1 / t2
|
||||
if iteration >= maxiter:
|
||||
message.append("Maximum number of iteration reached")
|
||||
need_to_stop = True
|
||||
break
|
||||
# Need a re-annealing process?
|
||||
if temperature < temperature_restart:
|
||||
energy_state.reset(func_wrapper, rand_state)
|
||||
break
|
||||
# starting strategy chain
|
||||
val = strategy_chain.run(i, temperature)
|
||||
if val is not None:
|
||||
message.append(val)
|
||||
need_to_stop = True
|
||||
optimize_res.success = False
|
||||
break
|
||||
# Possible local search at the end of the strategy chain
|
||||
if not no_local_search:
|
||||
val = strategy_chain.local_search()
|
||||
if val is not None:
|
||||
message.append(val)
|
||||
need_to_stop = True
|
||||
optimize_res.success = False
|
||||
break
|
||||
iteration += 1
|
||||
|
||||
# Setting the OptimizeResult values
|
||||
optimize_res.x = energy_state.xbest
|
||||
optimize_res.fun = energy_state.ebest
|
||||
optimize_res.nit = iteration
|
||||
optimize_res.nfev = func_wrapper.nfev
|
||||
optimize_res.njev = func_wrapper.ngev
|
||||
optimize_res.nhev = func_wrapper.nhev
|
||||
optimize_res.message = message
|
||||
return optimize_res
|
||||
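The temperature schedule and acceptance rule described in the `dual_annealing` notes above can be sketched in plain Python as follows. This is an illustrative sketch, not the SciPy implementation: the helper names `visiting_temperature`, `acceptance_probability`, and `first_restart_step` are hypothetical, and `beta` is assumed here to be an inverse-temperature-like factor supplied by the caller. The default values mirror the defaults in the `dual_annealing` signature.

```python
import math


def visiting_temperature(step, initial_temp=5230.0, visit=2.62):
    """Temperature at a zero-based step, mirroring the search loop.

    T(t) = T(1) * (2**(q_v - 1) - 1) / ((1 + t)**(q_v - 1) - 1),
    with t = step + 1, so that step 0 yields ``initial_temp``.
    """
    t1 = math.exp((visit - 1.0) * math.log(2.0)) - 1.0
    t2 = math.exp((visit - 1.0) * math.log(step + 2.0)) - 1.0
    return initial_temp * t1 / t2


def acceptance_probability(delta_e, beta, accept=-5.0):
    """Generalized acceptance probability from the Notes section.

    p = min(1, [1 - (1 - q_a) * beta * delta_e] ** (1 / (1 - q_a)))
    A non-positive bracket is mapped to probability zero.
    ``beta`` is an assumption of this sketch (caller-supplied factor).
    """
    base = 1.0 - (1.0 - accept) * beta * delta_e
    if base <= 0.0:
        return 0.0
    return min(1.0, base ** (1.0 / (1.0 - accept)))


def first_restart_step(initial_temp=5230.0, restart_temp_ratio=2e-5,
                       visit=2.62, max_steps=10000):
    """First step whose temperature falls below the restart threshold."""
    threshold = initial_temp * restart_temp_ratio
    for step in range(max_steps):
        if visiting_temperature(step, initial_temp, visit) < threshold:
            return step
    return None  # threshold not reached within max_steps
```

Calling `first_restart_step()` with different `restart_temp_ratio` values gives a rough feel for how soon re-annealing would be triggered for a given `visit` parameter; an energy decrease (`delta_e < 0`) is always accepted under this rule.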
Binary file not shown.
@ -0,0 +1,475 @@
|
||||
"""Hessian update strategies for quasi-Newton optimization methods."""
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
from scipy.linalg import get_blas_funcs, issymmetric
|
||||
from warnings import warn
|
||||
|
||||
|
||||
__all__ = ['HessianUpdateStrategy', 'BFGS', 'SR1']
|
||||
|
||||
|
||||
class HessianUpdateStrategy:
|
||||
"""Interface for implementing Hessian update strategies.
|
||||
|
||||
Many optimization methods make use of Hessian (or inverse Hessian)
|
||||
approximations, such as the quasi-Newton methods BFGS, SR1, L-BFGS.
|
||||
Some of these approximations, however, do not actually need to store
|
||||
the entire matrix or can compute the internal matrix product with a
|
||||
given vector in a very efficient manner. This class serves as an
|
||||
abstract interface between the optimization algorithm and the
|
||||
quasi-Newton update strategies, giving freedom of implementation
|
||||
to store and update the internal matrix as efficiently as possible.
|
||||
Different choices of initialization and update procedure will result
|
||||
in different quasi-Newton strategies.
|
||||
|
||||
Four methods should be implemented in derived classes: ``initialize``,
|
||||
``update``, ``dot`` and ``get_matrix``.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Any instance of a class that implements this interface,
|
||||
can be accepted by the method ``minimize`` and used by
|
||||
the compatible solvers to approximate the Hessian (or
|
||||
inverse Hessian) used by the optimization algorithms.
|
||||
"""
|
||||
|
||||
def initialize(self, n, approx_type):
|
||||
"""Initialize internal matrix.
|
||||
|
||||
Allocate internal memory for storing and updating
|
||||
the Hessian or its inverse.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
n : int
|
||||
Problem dimension.
|
||||
approx_type : {'hess', 'inv_hess'}
|
||||
Selects either the Hessian or the inverse Hessian.
|
||||
When set to 'hess' the Hessian will be stored and updated.
|
||||
When set to 'inv_hess' its inverse will be used instead.
|
||||
"""
|
||||
raise NotImplementedError("The method ``initialize(n, approx_type)``"
|
||||
" is not implemented.")
|
||||
|
||||
def update(self, delta_x, delta_grad):
|
||||
"""Update internal matrix.
|
||||
|
||||
Update Hessian matrix or its inverse (depending on how 'approx_type'
|
||||
is defined) using information about the last evaluated points.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
delta_x : ndarray
|
||||
The difference between two points the gradient
|
||||
function has been evaluated at: ``delta_x = x2 - x1``.
|
||||
delta_grad : ndarray
|
||||
The difference between the gradients:
|
||||
``delta_grad = grad(x2) - grad(x1)``.
|
||||
"""
|
||||
raise NotImplementedError("The method ``update(delta_x, delta_grad)``"
|
||||
" is not implemented.")
|
||||
|
||||
def dot(self, p):
|
||||
"""Compute the product of the internal matrix with the given vector.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
p : array_like
|
||||
1-D array representing a vector.
|
||||
|
||||
Returns
|
||||
-------
|
||||
Hp : array
|
||||
1-D array representing the result of multiplying the approximation matrix
|
||||
by vector p.
|
||||
"""
|
||||
raise NotImplementedError("The method ``dot(p)``"
|
||||
" is not implemented.")
|
||||
|
||||
def get_matrix(self):
|
||||
"""Return current internal matrix.
|
||||
|
||||
Returns
|
||||
-------
|
||||
H : ndarray, shape (n, n)
|
||||
Dense matrix containing either the Hessian
|
||||
or its inverse (depending on how 'approx_type'
|
||||
is defined).
|
||||
"""
|
||||
raise NotImplementedError("The method ``get_matrix()``"
|
||||
" is not implemented.")
|
||||
|
||||
|
||||
class FullHessianUpdateStrategy(HessianUpdateStrategy):
|
||||
"""Hessian update strategy with full dimensional internal representation.
|
||||
"""
|
||||
_syr = get_blas_funcs('syr', dtype='d') # Symmetric rank 1 update
|
||||
_syr2 = get_blas_funcs('syr2', dtype='d') # Symmetric rank 2 update
|
||||
# Symmetric matrix-vector product
|
||||
_symv = get_blas_funcs('symv', dtype='d')
|
||||
|
||||
def __init__(self, init_scale='auto'):
|
||||
self.init_scale = init_scale
|
||||
# Until initialize is called we can't really use the class,
|
||||
# so it makes sense to set everything to None.
|
||||
self.first_iteration = None
|
||||
self.approx_type = None
|
||||
self.B = None
|
||||
self.H = None
|
||||
|
||||
def initialize(self, n, approx_type):
|
||||
"""Initialize internal matrix.
|
||||
|
||||
Allocate internal memory for storing and updating
|
||||
the Hessian or its inverse.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
n : int
|
||||
Problem dimension.
|
||||
approx_type : {'hess', 'inv_hess'}
|
||||
Selects either the Hessian or the inverse Hessian.
|
||||
When set to 'hess' the Hessian will be stored and updated.
|
||||
When set to 'inv_hess' its inverse will be used instead.
|
||||
"""
|
||||
self.first_iteration = True
|
||||
self.n = n
|
||||
self.approx_type = approx_type
|
||||
if approx_type not in ('hess', 'inv_hess'):
|
||||
raise ValueError("`approx_type` must be 'hess' or 'inv_hess'.")
|
||||
# Create matrix
|
||||
if self.approx_type == 'hess':
|
||||
self.B = np.eye(n, dtype=float)
|
||||
else:
|
||||
self.H = np.eye(n, dtype=float)
|
||||
|
||||
def _auto_scale(self, delta_x, delta_grad):
|
||||
# Heuristic to scale matrix at first iteration.
|
||||
# Described in Nocedal and Wright "Numerical Optimization"
|
||||
# p.143 formula (6.20).
|
||||
s_norm2 = np.dot(delta_x, delta_x)
|
||||
y_norm2 = np.dot(delta_grad, delta_grad)
|
||||
ys = np.abs(np.dot(delta_grad, delta_x))
|
||||
if ys == 0.0 or y_norm2 == 0 or s_norm2 == 0:
|
||||
return 1
|
||||
if self.approx_type == 'hess':
|
||||
return y_norm2 / ys
|
||||
else:
|
||||
return ys / y_norm2
|
||||
|
||||
def _update_implementation(self, delta_x, delta_grad):
|
||||
raise NotImplementedError("The method ``_update_implementation``"
|
||||
" is not implemented.")
|
||||
|
||||
def update(self, delta_x, delta_grad):
|
||||
"""Update internal matrix.
|
||||
|
||||
Update Hessian matrix or its inverse (depending on how 'approx_type'
|
||||
is defined) using information about the last evaluated points.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
delta_x : ndarray
|
||||
The difference between two points the gradient
|
||||
function has been evaluated at: ``delta_x = x2 - x1``.
|
||||
delta_grad : ndarray
|
||||
The difference between the gradients:
|
||||
``delta_grad = grad(x2) - grad(x1)``.
|
||||
"""
|
||||
if np.all(delta_x == 0.0):
|
||||
return
|
||||
if np.all(delta_grad == 0.0):
|
||||
warn('delta_grad == 0.0. Check if the approximated '
|
||||
'function is linear. If the function is linear '
|
||||
'better results can be obtained by defining the '
|
||||
'Hessian as zero instead of using quasi-Newton '
|
||||
'approximations.',
|
||||
UserWarning, stacklevel=2)
|
||||
return
|
||||
if self.first_iteration:
|
||||
# Get user-specified scale
|
||||
if isinstance(self.init_scale, str) and self.init_scale == "auto":
|
||||
scale = self._auto_scale(delta_x, delta_grad)
|
||||
else:
|
||||
scale = self.init_scale
|
||||
|
||||
# Check for complex: numpy will silently cast a complex array to
|
||||
# a real one but not so for scalar as it raises a TypeError.
|
||||
# Checking here brings a consistent behavior.
|
||||
replace = False
|
||||
if np.size(scale) == 1:
|
||||
# to account for the legacy behavior having the exact same cast
|
||||
scale = float(scale)
|
||||
elif np.iscomplexobj(scale):
|
||||
raise TypeError("init_scale contains complex elements, "
|
||||
"must be real.")
|
||||
else: # test explicitly for allowed shapes and values
|
||||
replace = True
|
||||
if self.approx_type == 'hess':
|
||||
shape = np.shape(self.B)
|
||||
dtype = self.B.dtype
|
||||
else:
|
||||
shape = np.shape(self.H)
|
||||
dtype = self.H.dtype
|
||||
# copy, will replace the original
|
||||
scale = np.array(scale, dtype=dtype, copy=True)
|
||||
|
||||
# it has to match the shape of the matrix for the multiplication,
|
||||
# no implicit broadcasting is allowed
|
||||
if shape != (init_shape := np.shape(scale)):
|
||||
raise ValueError("If init_scale is an array, it must have the "
|
||||
f"dimensions of the hess/inv_hess: {shape}."
|
||||
f" Got {init_shape}.")
|
||||
if not issymmetric(scale):
|
||||
raise ValueError("If init_scale is an array, it must be"
|
||||
" symmetric (passing scipy.linalg.issymmetric)"
|
||||
" to be an approximation of a hess/inv_hess.")
|
||||
|
||||
# Scale initial matrix with ``scale * np.eye(n)`` or replace
|
||||
# This is not ideal, we could assign the scale directly in
|
||||
# initialize, but we would need to
|
||||
if self.approx_type == 'hess':
|
||||
if replace:
|
||||
self.B = scale
|
||||
else:
|
||||
self.B *= scale
|
||||
else:
|
||||
if replace:
|
||||
self.H = scale
|
||||
else:
|
||||
self.H *= scale
|
||||
self.first_iteration = False
|
||||
self._update_implementation(delta_x, delta_grad)
|
||||
|
||||
def dot(self, p):
|
||||
"""Compute the product of the internal matrix with the given vector.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
p : array_like
|
||||
1-D array representing a vector.
|
||||
|
||||
Returns
|
||||
-------
|
||||
Hp : array
|
||||
1-D array representing the result of multiplying the approximation matrix
|
||||
by vector p.
|
||||
"""
|
||||
if self.approx_type == 'hess':
|
||||
return self._symv(1, self.B, p)
|
||||
else:
|
||||
return self._symv(1, self.H, p)
|
||||
|
||||
def get_matrix(self):
|
||||
"""Return the current internal matrix.
|
||||
|
||||
Returns
|
||||
-------
|
||||
M : ndarray, shape (n, n)
|
||||
Dense matrix containing either the Hessian or its inverse
|
||||
(depending on how `approx_type` was defined).
|
||||
"""
|
||||
if self.approx_type == 'hess':
|
||||
M = np.copy(self.B)
|
||||
else:
|
||||
M = np.copy(self.H)
|
||||
li = np.tril_indices_from(M, k=-1)
|
||||
M[li] = M.T[li]
|
||||
return M
|
||||
|
||||
|
||||
class BFGS(FullHessianUpdateStrategy):
|
||||
"""Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
exception_strategy : {'skip_update', 'damp_update'}, optional
|
||||
Define how to proceed when the curvature condition is violated.
|
||||
Set it to 'skip_update' to just skip the update. Or, alternatively,
|
||||
set it to 'damp_update' to interpolate between the actual BFGS
|
||||
result and the unmodified matrix. Both exception strategies
|
||||
are explained in [1]_, p.536-537.
|
||||
min_curvature : float, optional
|
||||
This number, scaled by a normalization factor, defines the
|
||||
minimum curvature ``dot(delta_grad, delta_x)`` allowed to go
|
||||
unaffected by the exception strategy. By default it is equal to
|
||||
1e-8 when ``exception_strategy = 'skip_update'`` and equal
|
||||
to 0.2 when ``exception_strategy = 'damp_update'``.
|
||||
init_scale : {float, np.array, 'auto'}, optional
|
||||
This parameter can be used to initialize the Hessian or its
|
||||
inverse. When a float is given, the relevant array is initialized
|
||||
to ``np.eye(n) * init_scale``, where ``n`` is the problem dimension.
|
||||
Alternatively, if a precisely ``(n, n)`` shaped, symmetric array is given,
|
||||
this array will be used. Otherwise an error is generated.
|
||||
Set it to 'auto' in order to use an automatic heuristic for choosing
|
||||
the initial scale. The heuristic is described in [1]_, p.143.
|
||||
The default is 'auto'.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The update is based on the description in [1]_, p.140.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
|
||||
def __init__(self, exception_strategy='skip_update', min_curvature=None,
|
||||
init_scale='auto'):
|
||||
if exception_strategy == 'skip_update':
|
||||
if min_curvature is not None:
|
||||
self.min_curvature = min_curvature
|
||||
else:
|
||||
self.min_curvature = 1e-8
|
||||
elif exception_strategy == 'damp_update':
|
||||
if min_curvature is not None:
|
||||
self.min_curvature = min_curvature
|
||||
else:
|
||||
self.min_curvature = 0.2
|
||||
else:
|
||||
raise ValueError("`exception_strategy` must be 'skip_update' "
|
||||
"or 'damp_update'.")
|
||||
|
||||
super().__init__(init_scale)
|
||||
self.exception_strategy = exception_strategy
|
||||
|
||||
def _update_inverse_hessian(self, ys, Hy, yHy, s):
|
||||
"""Update the inverse Hessian matrix.
|
||||
|
||||
BFGS update using the formula:
|
||||
|
||||
``H <- H + ((H*y).T*y + s.T*y)/(s.T*y)^2 * (s*s.T)
|
||||
- 1/(s.T*y) * ((H*y)*s.T + s*(H*y).T)``
|
||||
|
||||
where ``s = delta_x`` and ``y = delta_grad``. This formula is
|
||||
equivalent to (6.17) in [1]_ written in a more efficient way
|
||||
for implementation.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
self.H = self._syr2(-1.0 / ys, s, Hy, a=self.H)
|
||||
self.H = self._syr((ys + yHy) / ys ** 2, s, a=self.H)
|
||||
|
||||
def _update_hessian(self, ys, Bs, sBs, y):
|
||||
"""Update the Hessian matrix.
|
||||
|
||||
BFGS update using the formula:
|
||||
|
||||
``B <- B - (B*s)*(B*s).T/(s.T*(B*s)) + y*y.T/(s.T*y)``
|
||||
|
||||
where ``s`` is short for ``delta_x`` and ``y`` is short
|
||||
for ``delta_grad``. Formula (6.19) in [1]_.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
self.B = self._syr(1.0 / ys, y, a=self.B)
|
||||
self.B = self._syr(-1.0 / sBs, Bs, a=self.B)
|
||||
|
||||
def _update_implementation(self, delta_x, delta_grad):
|
||||
# Auxiliary variables w and z
|
||||
if self.approx_type == 'hess':
|
||||
w = delta_x
|
||||
z = delta_grad
|
||||
else:
|
||||
w = delta_grad
|
||||
z = delta_x
|
||||
# Do some common operations
|
||||
wz = np.dot(w, z)
|
||||
Mw = self.dot(w)
|
||||
wMw = Mw.dot(w)
|
||||
# Guarantee that wMw > 0 by reinitializing matrix.
|
||||
# While this is always true in exact arithmetic,
|
||||
# an indefinite matrix may appear due to roundoff errors.
|
||||
if wMw <= 0.0:
|
||||
scale = self._auto_scale(delta_x, delta_grad)
|
||||
# Reinitialize matrix
|
||||
if self.approx_type == 'hess':
|
||||
self.B = scale * np.eye(self.n, dtype=float)
|
||||
else:
|
||||
self.H = scale * np.eye(self.n, dtype=float)
|
||||
# Do common operations for new matrix
|
||||
Mw = self.dot(w)
|
||||
wMw = Mw.dot(w)
|
||||
# Check if curvature condition is violated
|
||||
if wz <= self.min_curvature * wMw:
|
||||
# If the option 'skip_update' is set
|
||||
# we just skip the update when the condition
|
||||
# is violated.
|
||||
if self.exception_strategy == 'skip_update':
|
||||
return
|
||||
# If the option 'damp_update' is set we
|
||||
# interpolate between the actual BFGS
|
||||
# result and the unmodified matrix.
|
||||
elif self.exception_strategy == 'damp_update':
|
||||
update_factor = (1-self.min_curvature) / (1 - wz/wMw)
|
||||
z = update_factor*z + (1-update_factor)*Mw
|
||||
wz = np.dot(w, z)
|
||||
# Update matrix
|
||||
if self.approx_type == 'hess':
|
||||
self._update_hessian(wz, Mw, wMw, z)
|
||||
else:
|
||||
self._update_inverse_hessian(wz, Mw, wMw, z)
|
||||
|
||||
|
||||
class SR1(FullHessianUpdateStrategy):
|
||||
"""Symmetric-rank-1 Hessian update strategy.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
min_denominator : float, optional
|
||||
This number, scaled by a normalization factor,
|
||||
defines the minimum denominator magnitude allowed
|
||||
in the update. When the condition is violated we skip
|
||||
the update. The default is ``1e-8``.
|
||||
init_scale : {float, np.array, 'auto'}, optional
|
||||
This parameter can be used to initialize the Hessian or its
|
||||
inverse. When a float is given, the relevant array is initialized
|
||||
to ``np.eye(n) * init_scale``, where ``n`` is the problem dimension.
|
||||
Alternatively, if a precisely ``(n, n)`` shaped, symmetric array is given,
|
||||
this array will be used. Otherwise an error is generated.
|
||||
Set it to 'auto' in order to use an automatic heuristic for choosing
|
||||
the initial scale. The heuristic is described in [1]_, p.143.
|
||||
The default is 'auto'.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The update is based on the description in [1]_, p.144-146.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
|
||||
def __init__(self, min_denominator=1e-8, init_scale='auto'):
|
||||
self.min_denominator = min_denominator
|
||||
super().__init__(init_scale)
|
||||
|
||||
def _update_implementation(self, delta_x, delta_grad):
|
||||
# Auxiliary variables w and z
|
||||
if self.approx_type == 'hess':
|
||||
w = delta_x
|
||||
z = delta_grad
|
||||
else:
|
||||
w = delta_grad
|
||||
z = delta_x
|
||||
# Do some common operations
|
||||
Mw = self.dot(w)
|
||||
z_minus_Mw = z - Mw
|
||||
denominator = np.dot(w, z_minus_Mw)
|
||||
# If the denominator is too small
|
||||
# we just skip the update.
|
||||
if np.abs(denominator) <= self.min_denominator*norm(w)*norm(z_minus_Mw):
|
||||
return
|
||||
# Update matrix
|
||||
if self.approx_type == 'hess':
|
||||
self.B = self._syr(1/denominator, z_minus_Mw, a=self.B)
|
||||
else:
|
||||
self.H = self._syr(1/denominator, z_minus_Mw, a=self.H)
|
||||
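The SR1 update and its skip rule can be tried on a small quadratic. The sketch below is an illustration only: it covers just the direct-Hessian case (`approx_type == 'hess'`), uses plain Python lists instead of the BLAS-backed `_syr` call, and the names `dot`, `norm` and `sr1_update` are assumptions made for this example rather than SciPy API.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def sr1_update(B, delta_x, delta_grad, min_denominator=1e-8):
    """Return the SR1-updated Hessian approximation (B itself if skipped)."""
    n = len(delta_x)
    Bw = [dot(row, delta_x) for row in B]
    residual = [z - m for z, m in zip(delta_grad, Bw)]  # z - Mw
    denominator = dot(delta_x, residual)
    # Skip rule: denominator too small relative to the vectors involved.
    if abs(denominator) <= min_denominator * norm(delta_x) * norm(residual):
        return B
    # Rank-one correction B + r r^T / (w^T r).
    return [[B[i][j] + residual[i] * residual[j] / denominator
             for j in range(n)] for i in range(n)]


# Quadratic with Hessian A, so the gradient change is A times the step.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
for step in ([1.0, 0.0], [0.0, 1.0]):
    grad_change = [dot(row, step) for row in A]
    B = sr1_update(B, step, grad_change)
print(B)  # -> [[2.0, 1.0], [1.0, 3.0]]
```

In this example two linearly independent steps are enough to recover `A` exactly. When the current approximation already reproduces the gradient change, the residual is zero and the update is skipped.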
Binary file not shown.
Binary file not shown.
@ -0,0 +1,106 @@
# cython: language_level=3

from libcpp cimport bool
from libcpp.string cimport string

cdef extern from "HConst.h" nogil:

    const int HIGHS_CONST_I_INF "kHighsIInf"
    const double HIGHS_CONST_INF "kHighsInf"
    const double kHighsTiny
    const double kHighsZero
    const int kHighsThreadLimit

    cdef enum HighsDebugLevel:
        HighsDebugLevel_kHighsDebugLevelNone "kHighsDebugLevelNone" = 0
        HighsDebugLevel_kHighsDebugLevelCheap "kHighsDebugLevelCheap"
        HighsDebugLevel_kHighsDebugLevelCostly "kHighsDebugLevelCostly"
        HighsDebugLevel_kHighsDebugLevelExpensive "kHighsDebugLevelExpensive"
        HighsDebugLevel_kHighsDebugLevelMin "kHighsDebugLevelMin" = HighsDebugLevel_kHighsDebugLevelNone
        HighsDebugLevel_kHighsDebugLevelMax "kHighsDebugLevelMax" = HighsDebugLevel_kHighsDebugLevelExpensive

    ctypedef enum HighsModelStatus:
        HighsModelStatusNOTSET "HighsModelStatus::kNotset" = 0
        HighsModelStatusLOAD_ERROR "HighsModelStatus::kLoadError"
        HighsModelStatusMODEL_ERROR "HighsModelStatus::kModelError"
        HighsModelStatusPRESOLVE_ERROR "HighsModelStatus::kPresolveError"
        HighsModelStatusSOLVE_ERROR "HighsModelStatus::kSolveError"
        HighsModelStatusPOSTSOLVE_ERROR "HighsModelStatus::kPostsolveError"
        HighsModelStatusMODEL_EMPTY "HighsModelStatus::kModelEmpty"
        HighsModelStatusOPTIMAL "HighsModelStatus::kOptimal"
        HighsModelStatusINFEASIBLE "HighsModelStatus::kInfeasible"
        HighsModelStatus_UNBOUNDED_OR_INFEASIBLE "HighsModelStatus::kUnboundedOrInfeasible"
        HighsModelStatusUNBOUNDED "HighsModelStatus::kUnbounded"
        HighsModelStatusREACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND "HighsModelStatus::kObjectiveBound"
        HighsModelStatusREACHED_OBJECTIVE_TARGET "HighsModelStatus::kObjectiveTarget"
        HighsModelStatusREACHED_TIME_LIMIT "HighsModelStatus::kTimeLimit"
        HighsModelStatusREACHED_ITERATION_LIMIT "HighsModelStatus::kIterationLimit"
        HighsModelStatusUNKNOWN "HighsModelStatus::kUnknown"
        HighsModelStatusHIGHS_MODEL_STATUS_MIN "HighsModelStatus::kMin" = HighsModelStatusNOTSET
        HighsModelStatusHIGHS_MODEL_STATUS_MAX "HighsModelStatus::kMax" = HighsModelStatusUNKNOWN

    cdef enum HighsBasisStatus:
        HighsBasisStatusLOWER "HighsBasisStatus::kLower" = 0, # (slack) variable is at its lower bound [including fixed variables]
        HighsBasisStatusBASIC "HighsBasisStatus::kBasic" # (slack) variable is basic
        HighsBasisStatusUPPER "HighsBasisStatus::kUpper" # (slack) variable is at its upper bound
        HighsBasisStatusZERO "HighsBasisStatus::kZero" # free variable is non-basic and set to zero
        HighsBasisStatusNONBASIC "HighsBasisStatus::kNonbasic" # nonbasic with no specific bound information - useful for users and postsolve

    cdef enum SolverOption:
        SOLVER_OPTION_SIMPLEX "SolverOption::SOLVER_OPTION_SIMPLEX" = -1
        SOLVER_OPTION_CHOOSE "SolverOption::SOLVER_OPTION_CHOOSE"
        SOLVER_OPTION_IPM "SolverOption::SOLVER_OPTION_IPM"

    cdef enum PrimalDualStatus:
        PrimalDualStatusSTATUS_NOT_SET "PrimalDualStatus::STATUS_NOT_SET" = -1
        PrimalDualStatusSTATUS_MIN "PrimalDualStatus::STATUS_MIN" = PrimalDualStatusSTATUS_NOT_SET
        PrimalDualStatusSTATUS_NO_SOLUTION "PrimalDualStatus::STATUS_NO_SOLUTION"
        PrimalDualStatusSTATUS_UNKNOWN "PrimalDualStatus::STATUS_UNKNOWN"
        PrimalDualStatusSTATUS_INFEASIBLE_POINT "PrimalDualStatus::STATUS_INFEASIBLE_POINT"
        PrimalDualStatusSTATUS_FEASIBLE_POINT "PrimalDualStatus::STATUS_FEASIBLE_POINT"
        PrimalDualStatusSTATUS_MAX "PrimalDualStatus::STATUS_MAX" = PrimalDualStatusSTATUS_FEASIBLE_POINT

    cdef enum HighsOptionType:
        HighsOptionTypeBOOL "HighsOptionType::kBool" = 0
        HighsOptionTypeINT "HighsOptionType::kInt"
        HighsOptionTypeDOUBLE "HighsOptionType::kDouble"
        HighsOptionTypeSTRING "HighsOptionType::kString"

    # workaround for lack of enum class support in Cython < 3.x
    # cdef enum class ObjSense(int):
    #     ObjSenseMINIMIZE "ObjSense::kMinimize" = 1
    #     ObjSenseMAXIMIZE "ObjSense::kMaximize" = -1

    cdef cppclass ObjSense:
        pass

    cdef ObjSense ObjSenseMINIMIZE "ObjSense::kMinimize"
    cdef ObjSense ObjSenseMAXIMIZE "ObjSense::kMaximize"

    # cdef enum class MatrixFormat(int):
    #     MatrixFormatkColwise "MatrixFormat::kColwise" = 1
    #     MatrixFormatkRowwise "MatrixFormat::kRowwise"
    #     MatrixFormatkRowwisePartitioned "MatrixFormat::kRowwisePartitioned"

    cdef cppclass MatrixFormat:
        pass

    cdef MatrixFormat MatrixFormatkColwise "MatrixFormat::kColwise"
    cdef MatrixFormat MatrixFormatkRowwise "MatrixFormat::kRowwise"
    cdef MatrixFormat MatrixFormatkRowwisePartitioned "MatrixFormat::kRowwisePartitioned"

    # cdef enum class HighsVarType(int):
    #     kContinuous "HighsVarType::kContinuous"
    #     kInteger "HighsVarType::kInteger"
    #     kSemiContinuous "HighsVarType::kSemiContinuous"
    #     kSemiInteger "HighsVarType::kSemiInteger"
    #     kImplicitInteger "HighsVarType::kImplicitInteger"

    cdef cppclass HighsVarType:
        pass

    cdef HighsVarType kContinuous "HighsVarType::kContinuous"
    cdef HighsVarType kInteger "HighsVarType::kInteger"
    cdef HighsVarType kSemiContinuous "HighsVarType::kSemiContinuous"
    cdef HighsVarType kSemiInteger "HighsVarType::kSemiInteger"
    cdef HighsVarType kImplicitInteger "HighsVarType::kImplicitInteger"
@ -0,0 +1,56 @@
# cython: language_level=3

from libc.stdio cimport FILE

from libcpp cimport bool
from libcpp.string cimport string

from .HighsStatus cimport HighsStatus
from .HighsOptions cimport HighsOptions
from .HighsInfo cimport HighsInfo
from .HighsLp cimport (
    HighsLp,
    HighsSolution,
    HighsBasis,
    ObjSense,
)
from .HConst cimport HighsModelStatus

cdef extern from "Highs.h":
    # From HiGHS/src/Highs.h
    cdef cppclass Highs:
        HighsStatus passHighsOptions(const HighsOptions& options)
        HighsStatus passModel(const HighsLp& lp)
        HighsStatus run()
        HighsStatus setHighsLogfile(FILE* logfile)
        HighsStatus setHighsOutput(FILE* output)
        HighsStatus writeHighsOptions(const string filename, const bool report_only_non_default_values = true)

        # split up for cython below
        #const HighsModelStatus& getModelStatus(const bool scaled_model = False) const
        const HighsModelStatus & getModelStatus() const

        const HighsInfo& getHighsInfo "getInfo" () const
        string modelStatusToString(const HighsModelStatus model_status) const
        #HighsStatus getHighsInfoValue(const string& info, int& value)
        HighsStatus getHighsInfoValue(const string& info, double& value) const
        const HighsOptions& getHighsOptions() const

        const HighsLp& getLp() const

        HighsStatus writeSolution(const string filename, const bool pretty) const

        HighsStatus setBasis()
        const HighsSolution& getSolution() const
        const HighsBasis& getBasis() const

        bool changeObjectiveSense(const ObjSense sense)

        HighsStatus setHighsOptionValueBool "setOptionValue" (const string & option, const bool value)
        HighsStatus setHighsOptionValueInt "setOptionValue" (const string & option, const int value)
        HighsStatus setHighsOptionValueStr "setOptionValue" (const string & option, const string & value)
        HighsStatus setHighsOptionValueDbl "setOptionValue" (const string & option, const double value)

        string primalDualStatusToString(const int primal_dual_status)

        void resetGlobalScheduler(bool blocking)
@ -0,0 +1,20 @@
# cython: language_level=3


cdef extern from "HighsIO.h" nogil:
    # workaround for lack of enum class support in Cython < 3.x
    # cdef enum class HighsLogType(int):
    #     kInfo "HighsLogType::kInfo" = 1
    #     kDetailed "HighsLogType::kDetailed"
    #     kVerbose "HighsLogType::kVerbose"
    #     kWarning "HighsLogType::kWarning"
    #     kError "HighsLogType::kError"

    cdef cppclass HighsLogType:
        pass

    cdef HighsLogType kInfo "HighsLogType::kInfo"
    cdef HighsLogType kDetailed "HighsLogType::kDetailed"
    cdef HighsLogType kVerbose "HighsLogType::kVerbose"
    cdef HighsLogType kWarning "HighsLogType::kWarning"
    cdef HighsLogType kError "HighsLogType::kError"
@ -0,0 +1,22 @@
# cython: language_level=3

cdef extern from "HighsInfo.h" nogil:
    # From HiGHS/src/lp_data/HighsInfo.h
    cdef cppclass HighsInfo:
        # Inherited from HighsInfoStruct:
        int mip_node_count
        int simplex_iteration_count
        int ipm_iteration_count
        int crossover_iteration_count
        int primal_solution_status
        int dual_solution_status
        int basis_validity
        double objective_function_value
        double mip_dual_bound
        double mip_gap
        int num_primal_infeasibilities
        double max_primal_infeasibility
        double sum_primal_infeasibilities
        int num_dual_infeasibilities
        double max_dual_infeasibility
        double sum_dual_infeasibilities
@ -0,0 +1,46 @@
# cython: language_level=3

from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector

from .HConst cimport HighsBasisStatus, ObjSense, HighsVarType
from .HighsSparseMatrix cimport HighsSparseMatrix


cdef extern from "HighsLp.h" nogil:
    # From HiGHS/src/lp_data/HighsLp.h
    cdef cppclass HighsLp:
        int num_col_
        int num_row_

        vector[double] col_cost_
        vector[double] col_lower_
        vector[double] col_upper_
        vector[double] row_lower_
        vector[double] row_upper_

        HighsSparseMatrix a_matrix_

        ObjSense sense_
        double offset_

        string model_name_

        vector[string] row_names_
        vector[string] col_names_

        vector[HighsVarType] integrality_

        bool isMip() const

    cdef cppclass HighsSolution:
        vector[double] col_value
        vector[double] col_dual
        vector[double] row_value
        vector[double] row_dual

    cdef cppclass HighsBasis:
        bool valid_
        vector[HighsBasisStatus] col_status
        vector[HighsBasisStatus] row_status
@ -0,0 +1,9 @@
# cython: language_level=3

from .HighsStatus cimport HighsStatus
from .HighsLp cimport HighsLp
from .HighsOptions cimport HighsOptions

cdef extern from "HighsLpUtils.h" nogil:
    # From HiGHS/src/lp_data/HighsLpUtils.h
    HighsStatus assessLp(HighsLp& lp, const HighsOptions& options)
@ -0,0 +1,10 @@
# cython: language_level=3

from libcpp.string cimport string

from .HConst cimport HighsModelStatus

cdef extern from "HighsModelUtils.h" nogil:
    # From HiGHS/src/lp_data/HighsModelUtils.h
    string utilHighsModelStatusToString(const HighsModelStatus model_status)
    string utilBasisStatusToString(const int primal_dual_status)
@ -0,0 +1,110 @@
# cython: language_level=3

from libc.stdio cimport FILE

from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector

from .HConst cimport HighsOptionType

cdef extern from "HighsOptions.h" nogil:

    cdef cppclass OptionRecord:
        HighsOptionType type
        string name
        string description
        bool advanced

    cdef cppclass OptionRecordBool(OptionRecord):
        bool* value
        bool default_value

    cdef cppclass OptionRecordInt(OptionRecord):
        int* value
        int lower_bound
        int default_value
        int upper_bound

    cdef cppclass OptionRecordDouble(OptionRecord):
        double* value
        double lower_bound
        double default_value
        double upper_bound

    cdef cppclass OptionRecordString(OptionRecord):
        string* value
        string default_value

    cdef cppclass HighsOptions:
        # From HighsOptionsStruct:

        # Options read from the command line
        string model_file
        string presolve
        string solver
        string parallel
        double time_limit
        string options_file

        # Options read from the file
        double infinite_cost
        double infinite_bound
        double small_matrix_value
        double large_matrix_value
        double primal_feasibility_tolerance
        double dual_feasibility_tolerance
        double ipm_optimality_tolerance
        double dual_objective_value_upper_bound
        int highs_debug_level
        int simplex_strategy
        int simplex_scale_strategy
        int simplex_crash_strategy
        int simplex_dual_edge_weight_strategy
        int simplex_primal_edge_weight_strategy
        int simplex_iteration_limit
        int simplex_update_limit
        int ipm_iteration_limit
        int highs_min_threads
        int highs_max_threads
        int message_level
        string solution_file
        bool write_solution_to_file
        bool write_solution_pretty

        # Advanced options
        bool run_crossover
        bool mps_parser_type_free
        int keep_n_rows
        int allowed_simplex_matrix_scale_factor
        int allowed_simplex_cost_scale_factor
        int simplex_dualise_strategy
        int simplex_permute_strategy
        int dual_simplex_cleanup_strategy
        int simplex_price_strategy
        int dual_chuzc_sort_strategy
        bool simplex_initial_condition_check
        double simplex_initial_condition_tolerance
        double dual_steepest_edge_weight_log_error_threshhold
        double dual_simplex_cost_perturbation_multiplier
        double start_crossover_tolerance
        bool less_infeasible_DSE_check
        bool less_infeasible_DSE_choose_row
        bool use_original_HFactor_logic

        # Options for MIP solver
        int mip_max_nodes
        int mip_report_level

        # Switch for MIP solver
        bool mip

        # Options for HighsPrintMessage and HighsLogMessage
        FILE* logfile
        FILE* output
        int message_level
        string solution_file
        bool write_solution_to_file
        bool write_solution_pretty

        vector[OptionRecord*] records
@ -0,0 +1,9 @@
# cython: language_level=3

from libcpp cimport bool

from .HighsOptions cimport HighsOptions

cdef extern from "HighsRuntimeOptions.h" nogil:
    # From HiGHS/src/lp_data/HighsRuntimeOptions.h
    bool loadOptions(int argc, char** argv, HighsOptions& options)
@ -0,0 +1,12 @@
# cython: language_level=3

from libcpp.string cimport string

cdef extern from "HighsStatus.h" nogil:
    ctypedef enum HighsStatus:
        HighsStatusError "HighsStatus::kError" = -1
        HighsStatusOK "HighsStatus::kOk" = 0
        HighsStatusWarning "HighsStatus::kWarning" = 1


    string highsStatusToString(HighsStatus status)
@ -0,0 +1,95 @@
# cython: language_level=3

from libcpp cimport bool

cdef extern from "SimplexConst.h" nogil:

    cdef enum SimplexAlgorithm:
        PRIMAL "SimplexAlgorithm::kPrimal" = 0
        DUAL "SimplexAlgorithm::kDual"

    cdef enum SimplexStrategy:
        SIMPLEX_STRATEGY_MIN "SimplexStrategy::kSimplexStrategyMin" = 0
        SIMPLEX_STRATEGY_CHOOSE "SimplexStrategy::kSimplexStrategyChoose" = SIMPLEX_STRATEGY_MIN
        SIMPLEX_STRATEGY_DUAL "SimplexStrategy::kSimplexStrategyDual"
        SIMPLEX_STRATEGY_DUAL_PLAIN "SimplexStrategy::kSimplexStrategyDualPlain" = SIMPLEX_STRATEGY_DUAL
        SIMPLEX_STRATEGY_DUAL_TASKS "SimplexStrategy::kSimplexStrategyDualTasks"
        SIMPLEX_STRATEGY_DUAL_MULTI "SimplexStrategy::kSimplexStrategyDualMulti"
        SIMPLEX_STRATEGY_PRIMAL "SimplexStrategy::kSimplexStrategyPrimal"
        SIMPLEX_STRATEGY_MAX "SimplexStrategy::kSimplexStrategyMax" = SIMPLEX_STRATEGY_PRIMAL
        SIMPLEX_STRATEGY_NUM "SimplexStrategy::kSimplexStrategyNum"

    cdef enum SimplexCrashStrategy:
        SIMPLEX_CRASH_STRATEGY_MIN "SimplexCrashStrategy::kSimplexCrashStrategyMin" = 0
        SIMPLEX_CRASH_STRATEGY_OFF "SimplexCrashStrategy::kSimplexCrashStrategyOff" = SIMPLEX_CRASH_STRATEGY_MIN
        SIMPLEX_CRASH_STRATEGY_LTSSF_K "SimplexCrashStrategy::kSimplexCrashStrategyLtssfK"
        SIMPLEX_CRASH_STRATEGY_LTSSF "SimplexCrashStrategy::kSimplexCrashStrategyLtssf" = SIMPLEX_CRASH_STRATEGY_LTSSF_K
        SIMPLEX_CRASH_STRATEGY_BIXBY "SimplexCrashStrategy::kSimplexCrashStrategyBixby"
        SIMPLEX_CRASH_STRATEGY_LTSSF_PRI "SimplexCrashStrategy::kSimplexCrashStrategyLtssfPri"
        SIMPLEX_CRASH_STRATEGY_LTSF_K "SimplexCrashStrategy::kSimplexCrashStrategyLtsfK"
        SIMPLEX_CRASH_STRATEGY_LTSF_PRI "SimplexCrashStrategy::kSimplexCrashStrategyLtsfPri"
        SIMPLEX_CRASH_STRATEGY_LTSF "SimplexCrashStrategy::kSimplexCrashStrategyLtsf"
        SIMPLEX_CRASH_STRATEGY_BIXBY_NO_NONZERO_COL_COSTS "SimplexCrashStrategy::kSimplexCrashStrategyBixbyNoNonzeroColCosts"
        SIMPLEX_CRASH_STRATEGY_BASIC "SimplexCrashStrategy::kSimplexCrashStrategyBasic"
        SIMPLEX_CRASH_STRATEGY_TEST_SING "SimplexCrashStrategy::kSimplexCrashStrategyTestSing"
        SIMPLEX_CRASH_STRATEGY_MAX "SimplexCrashStrategy::kSimplexCrashStrategyMax" = SIMPLEX_CRASH_STRATEGY_TEST_SING

    cdef enum SimplexEdgeWeightStrategy:
        SIMPLEX_EDGE_WEIGHT_STRATEGY_MIN "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyMin" = -1
        SIMPLEX_EDGE_WEIGHT_STRATEGY_CHOOSE "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyChoose" = SIMPLEX_EDGE_WEIGHT_STRATEGY_MIN
        SIMPLEX_EDGE_WEIGHT_STRATEGY_DANTZIG "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyDantzig"
        SIMPLEX_EDGE_WEIGHT_STRATEGY_DEVEX "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyDevex"
        SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategySteepestEdge"
        SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategySteepestEdgeUnitInitial"
        SIMPLEX_EDGE_WEIGHT_STRATEGY_MAX "SimplexEdgeWeightStrategy::kSimplexEdgeWeightStrategyMax" = SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL

    cdef enum SimplexPriceStrategy:
        SIMPLEX_PRICE_STRATEGY_MIN = 0
        SIMPLEX_PRICE_STRATEGY_COL = SIMPLEX_PRICE_STRATEGY_MIN
        SIMPLEX_PRICE_STRATEGY_ROW
        SIMPLEX_PRICE_STRATEGY_ROW_SWITCH
        SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH
        SIMPLEX_PRICE_STRATEGY_MAX = SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH

    cdef enum SimplexDualChuzcStrategy:
        SIMPLEX_DUAL_CHUZC_STRATEGY_MIN = 0
        SIMPLEX_DUAL_CHUZC_STRATEGY_CHOOSE = SIMPLEX_DUAL_CHUZC_STRATEGY_MIN
        SIMPLEX_DUAL_CHUZC_STRATEGY_QUAD
        SIMPLEX_DUAL_CHUZC_STRATEGY_HEAP
        SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH
        SIMPLEX_DUAL_CHUZC_STRATEGY_MAX = SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH

    cdef enum InvertHint:
        INVERT_HINT_NO = 0
        INVERT_HINT_UPDATE_LIMIT_REACHED
        INVERT_HINT_SYNTHETIC_CLOCK_SAYS_INVERT
        INVERT_HINT_POSSIBLY_OPTIMAL
        INVERT_HINT_POSSIBLY_PRIMAL_UNBOUNDED
        INVERT_HINT_POSSIBLY_DUAL_UNBOUNDED
        INVERT_HINT_POSSIBLY_SINGULAR_BASIS
        INVERT_HINT_PRIMAL_INFEASIBLE_IN_PRIMAL_SIMPLEX
        INVERT_HINT_CHOOSE_COLUMN_FAIL
        INVERT_HINT_Count

    cdef enum DualEdgeWeightMode:
        DANTZIG "DualEdgeWeightMode::DANTZIG" = 0
        DEVEX "DualEdgeWeightMode::DEVEX"
        STEEPEST_EDGE "DualEdgeWeightMode::STEEPEST_EDGE"
        Count "DualEdgeWeightMode::Count"

    cdef enum PriceMode:
        ROW "PriceMode::ROW" = 0
        COL "PriceMode::COL"

    const int PARALLEL_THREADS_DEFAULT
    const int DUAL_TASKS_MIN_THREADS
    const int DUAL_MULTI_MIN_THREADS

    const bool invert_if_row_out_negative

    const int NONBASIC_FLAG_TRUE
    const int NONBASIC_FLAG_FALSE

    const int NONBASIC_MOVE_UP
    const int NONBASIC_MOVE_DN
    const int NONBASIC_MOVE_ZE
@ -0,0 +1,7 @@
# cython: language_level=3

cdef extern from "highs_c_api.h" nogil:
    int Highs_passLp(void* highs, int numcol, int numrow, int numnz,
                     double* colcost, double* collower, double* colupper,
                     double* rowlower, double* rowupper,
                     int* astart, int* aindex, double* avalue)
158
venv/lib/python3.12/site-packages/scipy/optimize/_isotonic.py
Normal file
@ -0,0 +1,158 @@
from __future__ import annotations
from typing import TYPE_CHECKING

import numpy as np

from ._optimize import OptimizeResult
from ._pava_pybind import pava

if TYPE_CHECKING:
    import numpy.typing as npt


__all__ = ["isotonic_regression"]


def isotonic_regression(
    y: npt.ArrayLike,
    *,
    weights: npt.ArrayLike | None = None,
    increasing: bool = True,
) -> OptimizeResult:
    r"""Nonparametric isotonic regression.

    A (not strictly) monotonically increasing array `x` with the same length
    as `y` is calculated by the pool adjacent violators algorithm (PAVA), see
    [1]_. See the Notes section for more details.

    Parameters
    ----------
    y : (N,) array_like
        Response variable.
    weights : (N,) array_like or None
        Case weights.
    increasing : bool
        If True, fit monotonic increasing, i.e. isotonic, regression.
        If False, fit a monotonic decreasing, i.e. antitonic, regression.
        Default is True.

    Returns
    -------
    res : OptimizeResult
        The optimization result represented as an ``OptimizeResult`` object.
        Important attributes are:

        - ``x``: The isotonic regression solution, i.e. an increasing (or
          decreasing) array of the same length as y, with elements in the
          range from min(y) to max(y).
        - ``weights`` : Array with the sum of case weights for each block
          (or pool) B.
        - ``blocks``: Array of length B+1 with the indices of the start
          positions of each block (or pool) B. The j-th block is given by
          ``x[blocks[j]:blocks[j+1]]`` for which all values are the same.

    Notes
    -----
    Given data :math:`y` and case weights :math:`w`, the isotonic regression
    solves the following optimization problem:

    .. math::

        \operatorname{argmin}_{x_i} \sum_i w_i (y_i - x_i)^2 \quad
        \text{subject to } x_i \leq x_j \text{ whenever } i \leq j \,.

    For every input value :math:`y_i`, it generates a value :math:`x_i` such
    that :math:`x` is increasing (but not strictly), i.e.
    :math:`x_i \leq x_{i+1}`. This is accomplished by the PAVA.
    The solution consists of pools or blocks, i.e. neighboring elements of
    :math:`x`, e.g. :math:`x_i` and :math:`x_{i+1}`, that all have the same
    value.

    Most interestingly, the solution stays the same if the squared loss is
    replaced by the wide class of Bregman functions which are the unique
    class of strictly consistent scoring functions for the mean, see [2]_
    and references therein.

    The implemented version of PAVA according to [1]_ has a computational
    complexity of O(N) with input size N.

    References
    ----------
    .. [1] Busing, F. M. T. A. (2022).
           Monotone Regression: A Simple and Fast O(n) PAVA Implementation.
           Journal of Statistical Software, Code Snippets, 102(1), 1-25.
           :doi:`10.18637/jss.v102.c01`
    .. [2] Jordan, A.I., Mühlemann, A. & Ziegel, J.F.
           Characterizing the optimal solutions to the isotonic regression
           problem for identifiable functionals.
           Ann Inst Stat Math 74, 489-514 (2022).
           :doi:`10.1007/s10463-021-00808-0`

    Examples
    --------
    This example demonstrates that ``isotonic_regression`` really solves a
    constrained optimization problem.

    >>> import numpy as np
    >>> from scipy.optimize import isotonic_regression, minimize
    >>> y = [1.5, 1.0, 4.0, 6.0, 5.7, 5.0, 7.8, 9.0, 7.5, 9.5, 9.0]
    >>> def objective(yhat, y):
    ...     return np.sum((yhat - y)**2)
    >>> def constraint(yhat, y):
    ...     # This is for a monotonically increasing regression.
    ...     return np.diff(yhat)
    >>> result = minimize(objective, x0=y, args=(y,),
    ...                   constraints=[{'type': 'ineq',
    ...                                 'fun': lambda x: constraint(x, y)}])
    >>> result.x
    array([1.25      , 1.25      , 4.        , 5.56666667, 5.56666667,
           5.56666667, 7.8       , 8.25      , 8.25      , 9.25      ,
           9.25      ])
    >>> result = isotonic_regression(y)
    >>> result.x
    array([1.25      , 1.25      , 4.        , 5.56666667, 5.56666667,
           5.56666667, 7.8       , 8.25      , 8.25      , 9.25      ,
           9.25      ])

    The big advantage of ``isotonic_regression`` compared to calling
    ``minimize`` is that it is more user friendly, i.e. one does not need to
    define objective and constraint functions, and that it is orders of
    magnitude faster. On commodity hardware (in 2023), for normally distributed
    input y of length 1000, the minimizer takes about 4 seconds, while
    ``isotonic_regression`` takes about 200 microseconds.
    """
    yarr = np.atleast_1d(y)  # Check yarr.ndim == 1 is implicit (pybind11) in pava.
    order = slice(None) if increasing else slice(None, None, -1)
    x = np.array(yarr[order], order="C", dtype=np.float64, copy=True)
    if weights is None:
        wx = np.ones_like(yarr, dtype=np.float64)
    else:
        warr = np.atleast_1d(weights)

        if not (yarr.ndim == warr.ndim == 1 and yarr.shape[0] == warr.shape[0]):
            raise ValueError(
                "Input arrays y and w must have one dimension of equal length."
            )
        if np.any(warr <= 0):
            raise ValueError("Weights w must be strictly positive.")

        wx = np.array(warr[order], order="C", dtype=np.float64, copy=True)
    n = x.shape[0]
    r = np.full(shape=n + 1, fill_value=-1, dtype=np.intp)
    x, wx, r, b = pava(x, wx, r)
    # Now that we know the number of blocks b, we only keep the relevant part
    # of r and wx.
    # For information: Due to the pava implementation, after the last block
    # index, there might be smaller numbers appended to r, e.g.
    # r = [0, 10, 8, 7] which in the end should be r = [0, 10].
    r = r[:b + 1]
    wx = wx[:b]
    if not increasing:
        x = x[::-1]
        wx = wx[::-1]
        r = r[-1] - r[::-1]
    return OptimizeResult(
        x=x,
        weights=wx,
        blocks=r,
    )
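To make the block structure returned by `isotonic_regression` concrete, here is a minimal pure-Python sketch of the pool adjacent violators idea. It is the textbook stack-based variant, not the implementation from Busing (2022) and not SciPy's compiled `pava` routine; the name `pava_sketch` is hypothetical, and the sketch assumes strictly positive weights and an increasing fit.

```python
def pava_sketch(y, weights=None):
    """Weighted least-squares isotonic fit via pooling adjacent violators.

    Returns (x, block_weights, block_starts), where block_starts has one
    more entry than there are blocks, so block j spans
    x[block_starts[j]:block_starts[j + 1]].
    """
    n = len(y)
    w = [1.0] * n if weights is None else [float(v) for v in weights]
    # Each block is stored as [start index, weighted mean, total weight].
    blocks = []
    for i in range(n):
        blocks.append([i, float(y[i]), w[i]])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][1] > blocks[-1][1]:
            _, mean2, weight2 = blocks.pop()
            start1, mean1, weight1 = blocks.pop()
            total = weight1 + weight2
            pooled = (mean1 * weight1 + mean2 * weight2) / total
            blocks.append([start1, pooled, total])
    starts = [b[0] for b in blocks] + [n]
    x = []
    for j, block in enumerate(blocks):
        x.extend([block[1]] * (starts[j + 1] - starts[j]))
    return x, [b[2] for b in blocks], starts


y = [1.5, 1.0, 4.0, 6.0, 5.7, 5.0, 7.8, 9.0, 7.5, 9.5, 9.0]
x, block_weights, block_starts = pava_sketch(y)
print(block_starts)  # -> [0, 2, 3, 6, 7, 9, 11]
```

For the data from the docstring example, the fitted values agree with the `result.x` shown there, and `block_starts` plays the role of the `blocks` attribute: consecutive entries delimit runs of equal values in `x`.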
Binary file not shown.
543
venv/lib/python3.12/site-packages/scipy/optimize/_lbfgsb_py.py
Normal file
@ -0,0 +1,543 @@
|
||||
"""
|
||||
Functions
|
||||
---------
|
||||
.. autosummary::
|
||||
:toctree: generated/
|
||||
|
||||
fmin_l_bfgs_b
|
||||
|
||||
"""
|
||||
|
||||
## License for the Python wrapper
|
||||
## ==============================
|
||||
|
||||
## Copyright (c) 2004 David M. Cooke <cookedm@physics.mcmaster.ca>
|
||||
|
||||
## Permission is hereby granted, free of charge, to any person obtaining a
|
||||
## copy of this software and associated documentation files (the "Software"),
|
||||
## to deal in the Software without restriction, including without limitation
|
||||
## the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
## and/or sell copies of the Software, and to permit persons to whom the
|
||||
## Software is furnished to do so, subject to the following conditions:
|
||||
|
||||
## The above copyright notice and this permission notice shall be included in
|
||||
## all copies or substantial portions of the Software.
|
||||
|
||||
## THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
## IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
## FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
## AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
## LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
## FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
## DEALINGS IN THE SOFTWARE.
|
||||
|
||||
## Modifications by Travis Oliphant and Enthought, Inc. for inclusion in SciPy
|
||||
|
||||
import numpy as np
|
||||
from numpy import array, asarray, float64, zeros
|
||||
from . import _lbfgsb
|
||||
from ._optimize import (MemoizeJac, OptimizeResult, _call_callback_maybe_halt,
|
||||
_wrap_callback, _check_unknown_options,
|
||||
_prepare_scalar_function)
|
||||
from ._constraints import old_bound_to_new
|
||||
|
||||
from scipy.sparse.linalg import LinearOperator
|
||||
|
||||
__all__ = ['fmin_l_bfgs_b', 'LbfgsInvHessProduct']
|
||||
|
||||
|
||||
def fmin_l_bfgs_b(func, x0, fprime=None, args=(),
|
||||
approx_grad=0,
|
||||
bounds=None, m=10, factr=1e7, pgtol=1e-5,
|
||||
epsilon=1e-8,
|
||||
iprint=-1, maxfun=15000, maxiter=15000, disp=None,
|
||||
callback=None, maxls=20):
|
||||
"""
|
||||
Minimize a function func using the L-BFGS-B algorithm.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable f(x,*args)
|
||||
Function to minimize.
|
||||
x0 : ndarray
|
||||
Initial guess.
|
||||
fprime : callable fprime(x,*args), optional
|
||||
The gradient of `func`. If None, then `func` returns the function
|
||||
value and the gradient (``f, g = func(x, *args)``), unless
|
||||
`approx_grad` is True in which case `func` returns only ``f``.
|
||||
args : sequence, optional
|
||||
Arguments to pass to `func` and `fprime`.
|
||||
approx_grad : bool, optional
|
||||
Whether to approximate the gradient numerically (in which case
|
||||
`func` returns only the function value).
|
||||
bounds : list, optional
|
||||
``(min, max)`` pairs for each element in ``x``, defining
|
||||
the bounds on that parameter. Use None or +-inf for one of ``min`` or
|
||||
``max`` when there is no bound in that direction.
|
||||
m : int, optional
|
||||
The maximum number of variable metric corrections
|
||||
used to define the limited memory matrix. (The limited memory BFGS
|
||||
method does not store the full hessian but uses this many terms in an
|
||||
approximation to it.)
|
||||
factr : float, optional
|
||||
The iteration stops when
|
||||
``(f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps``,
|
||||
where ``eps`` is the machine precision, which is automatically
|
||||
generated by the code. Typical values for `factr` are: 1e12 for
|
||||
low accuracy; 1e7 for moderate accuracy; 10.0 for extremely
|
||||
high accuracy. See Notes for relationship to `ftol`, which is exposed
|
||||
(instead of `factr`) by the `scipy.optimize.minimize` interface to
|
||||
L-BFGS-B.
|
||||
pgtol : float, optional
|
||||
The iteration will stop when
|
||||
``max{|proj g_i | i = 1, ..., n} <= pgtol``
|
||||
where ``proj g_i`` is the i-th component of the projected gradient.
|
||||
epsilon : float, optional
|
||||
Step size used when `approx_grad` is True, for numerically
|
||||
calculating the gradient
|
||||
iprint : int, optional
|
||||
Controls the frequency of output. ``iprint < 0`` means no output;
|
||||
``iprint = 0`` print only one line at the last iteration;
|
||||
``0 < iprint < 99`` print also f and ``|proj g|`` every iprint iterations;
|
||||
``iprint = 99`` print details of every iteration except n-vectors;
|
||||
``iprint = 100`` print also the changes of active set and final x;
|
||||
``iprint > 100`` print details of every iteration including x and g.
|
||||
disp : int, optional
|
||||
        If zero, then no output. If a positive number, then this overrides
|
||||
`iprint` (i.e., `iprint` gets the value of `disp`).
|
||||
maxfun : int, optional
|
||||
Maximum number of function evaluations. Note that this function
|
||||
may violate the limit because of evaluating gradients by numerical
|
||||
differentiation.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations.
|
||||
callback : callable, optional
|
||||
Called after each iteration, as ``callback(xk)``, where ``xk`` is the
|
||||
current parameter vector.
|
||||
maxls : int, optional
|
||||
Maximum number of line search steps (per iteration). Default is 20.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : array_like
|
||||
Estimated position of the minimum.
|
||||
f : float
|
||||
Value of `func` at the minimum.
|
||||
d : dict
|
||||
Information dictionary.
|
||||
|
||||
* d['warnflag'] is
|
||||
|
||||
- 0 if converged,
|
||||
- 1 if too many function evaluations or too many iterations,
|
||||
- 2 if stopped for another reason, given in d['task']
|
||||
|
||||
        * d['grad'] is the gradient at the minimum (should be close to 0)
|
||||
* d['funcalls'] is the number of function calls made.
|
||||
* d['nit'] is the number of iterations.
|
||||
|
||||
See also
|
||||
--------
|
||||
minimize: Interface to minimization algorithms for multivariate
|
||||
functions. See the 'L-BFGS-B' `method` in particular. Note that the
|
||||
`ftol` option is made available via that interface, while `factr` is
|
||||
provided via this interface, where `factr` is the factor multiplying
|
||||
the default machine floating-point precision to arrive at `ftol`:
|
||||
``ftol = factr * numpy.finfo(float).eps``.
|
||||
|
||||
Notes
|
||||
-----
|
||||
License of L-BFGS-B (FORTRAN code):
|
||||
|
||||
    The version included here (in Fortran code) is 3.0
|
||||
(released April 25, 2011). It was written by Ciyou Zhu, Richard Byrd,
|
||||
and Jorge Nocedal <nocedal@ece.nwu.edu>. It carries the following
|
||||
condition for use:
|
||||
|
||||
This software is freely available, but we expect that all publications
|
||||
describing work using this software, or all commercial products using it,
|
||||
quote at least one of the references given below. This software is released
|
||||
under the BSD License.
|
||||
|
||||
References
|
||||
----------
|
||||
* R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound
|
||||
Constrained Optimization, (1995), SIAM Journal on Scientific and
|
||||
Statistical Computing, 16, 5, pp. 1190-1208.
|
||||
* C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B,
|
||||
FORTRAN routines for large scale bound constrained optimization (1997),
|
||||
ACM Transactions on Mathematical Software, 23, 4, pp. 550 - 560.
|
||||
* J.L. Morales and J. Nocedal. L-BFGS-B: Remark on Algorithm 778: L-BFGS-B,
|
||||
FORTRAN routines for large scale bound constrained optimization (2011),
|
||||
ACM Transactions on Mathematical Software, 38, 1.
|
||||
|
||||
Examples
|
||||
--------
|
||||
    Solve a linear regression problem via `fmin_l_bfgs_b`. To do this, first we
    define an objective function ``f(m, b) = sum((y - y_model)**2)``, where `y`
    describes the observations and `y_model` the prediction of the linear model,
    ``y_model = m*x + b``. The bounds for the parameters ``m`` and ``b`` are
    arbitrarily chosen as ``(0, 5)`` and ``(5, 10)`` for this example.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize import fmin_l_bfgs_b
|
||||
>>> X = np.arange(0, 10, 1)
|
||||
>>> M = 2
|
||||
>>> B = 3
|
||||
>>> Y = M * X + B
|
||||
>>> def func(parameters, *args):
|
||||
... x = args[0]
|
||||
... y = args[1]
|
||||
... m, b = parameters
|
||||
... y_model = m*x + b
|
||||
... error = sum(np.power((y - y_model), 2))
|
||||
... return error
|
||||
|
||||
>>> initial_values = np.array([0.0, 1.0])
|
||||
|
||||
>>> x_opt, f_opt, info = fmin_l_bfgs_b(func, x0=initial_values, args=(X, Y),
|
||||
... approx_grad=True)
|
||||
>>> x_opt, f_opt
|
||||
array([1.99999999, 3.00000006]), 1.7746231151323805e-14 # may vary
|
||||
|
||||
The optimized parameters in ``x_opt`` agree with the ground truth parameters
|
||||
    ``m`` and ``b``. Next, let us perform a bound-constrained optimization using the `bounds`
|
||||
parameter.
|
||||
|
||||
>>> bounds = [(0, 5), (5, 10)]
|
||||
    >>> x_opt, f_opt, info = fmin_l_bfgs_b(func, x0=initial_values, args=(X, Y),
|
||||
... approx_grad=True, bounds=bounds)
|
||||
>>> x_opt, f_opt
|
||||
array([1.65990508, 5.31649385]), 15.721334516453945 # may vary
|
||||
"""
|
||||
    # handle fprime/approx_grad
    if approx_grad:
        fun = func
        jac = None
    elif fprime is None:
        fun = MemoizeJac(func)
        jac = fun.derivative
    else:
        fun = func
        jac = fprime
|
||||
|
||||
# build options
|
||||
callback = _wrap_callback(callback)
|
||||
opts = {'disp': disp,
|
||||
'iprint': iprint,
|
||||
'maxcor': m,
|
||||
'ftol': factr * np.finfo(float).eps,
|
||||
'gtol': pgtol,
|
||||
'eps': epsilon,
|
||||
'maxfun': maxfun,
|
||||
'maxiter': maxiter,
|
||||
'callback': callback,
|
||||
'maxls': maxls}
|
||||
|
||||
res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
|
||||
**opts)
|
||||
d = {'grad': res['jac'],
|
||||
'task': res['message'],
|
||||
'funcalls': res['nfev'],
|
||||
'nit': res['nit'],
|
||||
'warnflag': res['status']}
|
||||
f = res['fun']
|
||||
x = res['x']
|
||||
|
||||
return x, f, d
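
The docstring above states that when `fprime` is None and `approx_grad` is false, `func` must return both the function value and the gradient. A minimal sketch of that calling convention follows. The objective `value_and_grad`, the `target` vector, and the bounds are illustrative assumptions and do not come from the source.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b


def value_and_grad(x, target):
    """Return f(x) and its gradient for f(x) = sum((x - target)**2)."""
    diff = x - target
    return float(np.dot(diff, diff)), 2.0 * diff


# Hypothetical problem data: the unconstrained minimum is at `target`,
# but the bounds exclude it in the second and third coordinates.
target = np.array([1.0, -2.0, 3.0])
bounds = [(0.0, 5.0), (0.0, 5.0), (None, 2.0)]  # None means "no bound"

# fprime is left as None, so func is expected to return (f, g).
x_opt, f_opt, info = fmin_l_bfgs_b(value_and_grad, x0=np.zeros(3),
                                   args=(target,), bounds=bounds)
```

With these assumed bounds the solver should stop at the projection of `target` onto the box, and `info['warnflag']` reports whether it converged.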
|
||||
|
||||
|
||||
def _minimize_lbfgsb(fun, x0, args=(), jac=None, bounds=None,
|
||||
disp=None, maxcor=10, ftol=2.2204460492503131e-09,
|
||||
gtol=1e-5, eps=1e-8, maxfun=15000, maxiter=15000,
|
||||
iprint=-1, callback=None, maxls=20,
|
||||
finite_diff_rel_step=None, **unknown_options):
|
||||
"""
|
||||
Minimize a scalar function of one or more variables using the L-BFGS-B
|
||||
algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
disp : None or int
|
||||
        If `disp is None` (the default), then the supplied version of `iprint`
        is used. If `disp is not None`, then it overrides the supplied version
        of `iprint`: ``disp == 0`` sets ``iprint = -1`` (no output), and any
        other value is used as `iprint`.
|
||||
maxcor : int
|
||||
The maximum number of variable metric corrections used to
|
||||
define the limited memory matrix. (The limited memory BFGS
|
||||
        method does not store the full Hessian but uses this many terms
|
||||
in an approximation to it.)
|
||||
ftol : float
|
||||
The iteration stops when ``(f^k -
|
||||
f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol``.
|
||||
gtol : float
|
||||
The iteration will stop when ``max{|proj g_i | i = 1, ..., n}
|
||||
<= gtol`` where ``proj g_i`` is the i-th component of the
|
||||
projected gradient.
|
||||
eps : float or ndarray
|
||||
If `jac is None` the absolute step size used for numerical
|
||||
approximation of the jacobian via forward differences.
|
||||
maxfun : int
|
||||
Maximum number of function evaluations. Note that this function
|
||||
may violate the limit because of evaluating gradients by numerical
|
||||
differentiation.
|
||||
maxiter : int
|
||||
Maximum number of iterations.
|
||||
iprint : int, optional
|
||||
Controls the frequency of output. ``iprint < 0`` means no output;
|
||||
``iprint = 0`` print only one line at the last iteration;
|
||||
``0 < iprint < 99`` print also f and ``|proj g|`` every iprint iterations;
|
||||
``iprint = 99`` print details of every iteration except n-vectors;
|
||||
``iprint = 100`` print also the changes of active set and final x;
|
||||
``iprint > 100`` print details of every iteration including x and g.
|
||||
maxls : int, optional
|
||||
Maximum number of line search steps (per iteration). Default is 20.
|
||||
finite_diff_rel_step : None or array_like, optional
|
||||
If `jac in ['2-point', '3-point', 'cs']` the relative step size to
|
||||
use for numerical approximation of the jacobian. The absolute step
|
||||
size is computed as ``h = rel_step * sign(x) * max(1, abs(x))``,
|
||||
possibly adjusted to fit into the bounds. For ``method='3-point'``
|
||||
the sign of `h` is ignored. If None (default) then step is selected
|
||||
automatically.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The option `ftol` is exposed via the `scipy.optimize.minimize` interface,
|
||||
but calling `scipy.optimize.fmin_l_bfgs_b` directly exposes `factr`. The
|
||||
relationship between the two is ``ftol = factr * numpy.finfo(float).eps``.
|
||||
I.e., `factr` multiplies the default machine floating-point precision to
|
||||
arrive at `ftol`.
|
||||
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
m = maxcor
|
||||
pgtol = gtol
|
||||
factr = ftol / np.finfo(float).eps
|
||||
|
||||
x0 = asarray(x0).ravel()
|
||||
n, = x0.shape
|
||||
|
||||
# historically old-style bounds were/are expected by lbfgsb.
|
||||
# That's still the case but we'll deal with new-style from here on,
|
||||
# it's easier
|
||||
if bounds is None:
|
||||
pass
|
||||
elif len(bounds) != n:
|
||||
raise ValueError('length of x0 != length of bounds')
|
||||
else:
|
||||
bounds = np.array(old_bound_to_new(bounds))
|
||||
|
||||
# check bounds
|
||||
if (bounds[0] > bounds[1]).any():
|
||||
raise ValueError(
|
||||
"LBFGSB - one of the lower bounds is greater than an upper bound."
|
||||
)
|
||||
|
||||
# initial vector must lie within the bounds. Otherwise ScalarFunction and
|
||||
# approx_derivative will cause problems
|
||||
x0 = np.clip(x0, bounds[0], bounds[1])
|
||||
|
||||
if disp is not None:
|
||||
if disp == 0:
|
||||
iprint = -1
|
||||
else:
|
||||
iprint = disp
|
||||
|
||||
# _prepare_scalar_function can use bounds=None to represent no bounds
|
||||
sf = _prepare_scalar_function(fun, x0, jac=jac, args=args, epsilon=eps,
|
||||
bounds=bounds,
|
||||
finite_diff_rel_step=finite_diff_rel_step)
|
||||
|
||||
func_and_grad = sf.fun_and_grad
|
||||
|
||||
fortran_int = _lbfgsb.types.intvar.dtype
|
||||
|
||||
    nbd = zeros(n, fortran_int)
    low_bnd = zeros(n, float64)
    upper_bnd = zeros(n, float64)
    bounds_map = {(-np.inf, np.inf): 0,
                  (1, np.inf): 1,
                  (1, 1): 2,
                  (-np.inf, 1): 3}

    if bounds is not None:
        for i in range(0, n):
            l, u = bounds[0, i], bounds[1, i]
            if not np.isinf(l):
                low_bnd[i] = l
                l = 1
            if not np.isinf(u):
                upper_bnd[i] = u
                u = 1
            nbd[i] = bounds_map[l, u]
|
||||
|
||||
if not maxls > 0:
|
||||
raise ValueError('maxls must be positive.')
|
||||
|
||||
x = array(x0, float64)
|
||||
f = array(0.0, float64)
|
||||
g = zeros((n,), float64)
|
||||
wa = zeros(2*m*n + 5*n + 11*m*m + 8*m, float64)
|
||||
iwa = zeros(3*n, fortran_int)
|
||||
task = zeros(1, 'S60')
|
||||
csave = zeros(1, 'S60')
|
||||
lsave = zeros(4, fortran_int)
|
||||
isave = zeros(44, fortran_int)
|
||||
dsave = zeros(29, float64)
|
||||
|
||||
task[:] = 'START'
|
||||
|
||||
n_iterations = 0
|
||||
|
||||
    while True:
        # g may become float32 if a user provides a function that calculates
        # the Jacobian in float32 (see gh-18730). The underlying Fortran code
        # expects float64, so upcast it
        g = g.astype(np.float64)
        # x, f, g, wa, iwa, task, csave, lsave, isave, dsave = \
        _lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
                       pgtol, wa, iwa, task, iprint, csave, lsave,
                       isave, dsave, maxls)
        task_str = task.tobytes()
        if task_str.startswith(b'FG'):
            # The minimization routine wants f and g at the current x.
            # Note that interruptions due to maxfun are postponed
            # until the completion of the current minimization iteration.
            # Overwrite f and g:
            f, g = func_and_grad(x)
        elif task_str.startswith(b'NEW_X'):
            # new iteration
            n_iterations += 1

            intermediate_result = OptimizeResult(x=x, fun=f)
            if _call_callback_maybe_halt(callback, intermediate_result):
                task[:] = 'STOP: CALLBACK REQUESTED HALT'
            if n_iterations >= maxiter:
                task[:] = 'STOP: TOTAL NO. of ITERATIONS REACHED LIMIT'
            elif sf.nfev > maxfun:
                task[:] = ('STOP: TOTAL NO. of f AND g EVALUATIONS '
                           'EXCEEDS LIMIT')
        else:
            break
|
||||
|
||||
task_str = task.tobytes().strip(b'\x00').strip()
|
||||
if task_str.startswith(b'CONV'):
|
||||
warnflag = 0
|
||||
elif sf.nfev > maxfun or n_iterations >= maxiter:
|
||||
warnflag = 1
|
||||
else:
|
||||
warnflag = 2
|
||||
|
||||
# These two portions of the workspace are described in the mainlb
|
||||
# subroutine in lbfgsb.f. See line 363.
|
||||
s = wa[0: m*n].reshape(m, n)
|
||||
y = wa[m*n: 2*m*n].reshape(m, n)
|
||||
|
||||
# See lbfgsb.f line 160 for this portion of the workspace.
|
||||
    # isave(31) = the total number of BFGS updates prior to the current iteration;
|
||||
n_bfgs_updates = isave[30]
|
||||
|
||||
n_corrs = min(n_bfgs_updates, maxcor)
|
||||
hess_inv = LbfgsInvHessProduct(s[:n_corrs], y[:n_corrs])
|
||||
|
||||
task_str = task_str.decode()
|
||||
return OptimizeResult(fun=f, jac=g, nfev=sf.nfev,
|
||||
njev=sf.ngev,
|
||||
nit=n_iterations, status=warnflag, message=task_str,
|
||||
x=x, success=(warnflag == 0), hess_inv=hess_inv)
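
The loop over `bounds_map` above turns each `(lower, upper)` pair into the integer bound-type code stored in `nbd`. The sketch below reproduces that encoding in vectorized form. The helper name `encode_bounds` is hypothetical and is not part of SciPy.

```python
import numpy as np


def encode_bounds(lower, upper):
    """Return the bound-type code for each variable.

    0: unbounded, 1: lower bound only, 2: both bounds, 3: upper bound only.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Same test as the source loop: a bound is active when it is not infinite.
    has_lower = ~np.isinf(lower)
    has_upper = ~np.isinf(upper)
    nbd = np.zeros(lower.shape, dtype=int)
    nbd[has_lower & ~has_upper] = 1
    nbd[has_lower & has_upper] = 2
    nbd[~has_lower & has_upper] = 3
    return nbd


# One variable of each kind, in the order of the codes 0, 1, 2, 3.
codes = encode_bounds([-np.inf, 0.0, 0.0, -np.inf],
                      [np.inf, np.inf, 1.0, 1.0])
```

The source uses a dictionary lookup per variable instead of boolean masks; the mask form is only a compact way to show the same mapping.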
|
||||
|
||||
|
||||
class LbfgsInvHessProduct(LinearOperator):
|
||||
"""Linear operator for the L-BFGS approximate inverse Hessian.
|
||||
|
||||
This operator computes the product of a vector with the approximate inverse
|
||||
of the Hessian of the objective function, using the L-BFGS limited
|
||||
memory approximation to the inverse Hessian, accumulated during the
|
||||
optimization.
|
||||
|
||||
Objects of this class implement the ``scipy.sparse.linalg.LinearOperator``
|
||||
interface.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
sk : array_like, shape=(n_corr, n)
|
||||
Array of `n_corr` most recent updates to the solution vector.
|
||||
(See [1]).
|
||||
yk : array_like, shape=(n_corr, n)
|
||||
Array of `n_corr` most recent updates to the gradient. (See [1]).
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Nocedal, Jorge. "Updating quasi-Newton matrices with limited
|
||||
storage." Mathematics of computation 35.151 (1980): 773-782.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self, sk, yk):
|
||||
"""Construct the operator."""
|
||||
if sk.shape != yk.shape or sk.ndim != 2:
|
||||
raise ValueError('sk and yk must have matching shape, (n_corrs, n)')
|
||||
n_corrs, n = sk.shape
|
||||
|
||||
super().__init__(dtype=np.float64, shape=(n, n))
|
||||
|
||||
self.sk = sk
|
||||
self.yk = yk
|
||||
self.n_corrs = n_corrs
|
||||
self.rho = 1 / np.einsum('ij,ij->i', sk, yk)
|
||||
|
||||
def _matvec(self, x):
|
||||
"""Efficient matrix-vector multiply with the BFGS matrices.
|
||||
|
||||
This calculation is described in Section (4) of [1].
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x : ndarray
|
||||
An array with shape (n,) or (n,1).
|
||||
|
||||
Returns
|
||||
-------
|
||||
y : ndarray
|
||||
The matrix-vector product
|
||||
|
||||
"""
|
||||
        s, y, n_corrs, rho = self.sk, self.yk, self.n_corrs, self.rho
        q = np.array(x, dtype=self.dtype, copy=True)
        if q.ndim == 2 and q.shape[1] == 1:
            q = q.reshape(-1)

        alpha = np.empty(n_corrs)

        for i in range(n_corrs-1, -1, -1):
            alpha[i] = rho[i] * np.dot(s[i], q)
            q = q - alpha[i]*y[i]

        r = q
        for i in range(n_corrs):
            beta = rho[i] * np.dot(y[i], r)
            r = r + s[i] * (alpha[i] - beta)

        return r
|
||||
|
||||
def todense(self):
|
||||
"""Return a dense array representation of this operator.
|
||||
|
||||
Returns
|
||||
-------
|
||||
arr : ndarray, shape=(n, n)
|
||||
An array with the same shape and containing
|
||||
the same data represented by this `LinearOperator`.
|
||||
|
||||
"""
|
||||
s, y, n_corrs, rho = self.sk, self.yk, self.n_corrs, self.rho
|
||||
I = np.eye(*self.shape, dtype=self.dtype)
|
||||
Hk = I
|
||||
|
||||
for i in range(n_corrs):
|
||||
A1 = I - s[i][:, np.newaxis] * y[i][np.newaxis, :] * rho[i]
|
||||
A2 = I - y[i][:, np.newaxis] * s[i][np.newaxis, :] * rho[i]
|
||||
|
||||
Hk = np.dot(A1, np.dot(Hk, A2)) + (rho[i] * s[i][:, np.newaxis] *
|
||||
s[i][np.newaxis, :])
|
||||
return Hk
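
`_matvec` and `todense` above compute the same operator in two ways: a two-loop recursion that never forms a matrix, and an explicit product of rank-one updates. The NumPy-only sketch below re-implements both under hypothetical names (`two_loop`, `dense_inverse_hessian`) so the agreement can be checked on random data. The test data is an assumption chosen so that every curvature term `s.y` is positive.

```python
import numpy as np


def two_loop(sk, yk, x):
    """Apply the L-BFGS inverse-Hessian approximation to x (no dense matrix)."""
    rho = 1.0 / np.einsum('ij,ij->i', sk, yk)
    n_corrs = sk.shape[0]
    q = np.array(x, dtype=float)
    alpha = np.empty(n_corrs)
    for i in range(n_corrs - 1, -1, -1):   # newest pair first
        alpha[i] = rho[i] * np.dot(sk[i], q)
        q = q - alpha[i] * yk[i]
    r = q                                   # initial matrix is the identity
    for i in range(n_corrs):                # oldest pair first
        beta = rho[i] * np.dot(yk[i], r)
        r = r + sk[i] * (alpha[i] - beta)
    return r


def dense_inverse_hessian(sk, yk):
    """Build the same operator as an explicit (n, n) array."""
    eye = np.eye(sk.shape[1])
    H = eye
    for s, y in zip(sk, yk):
        rho = 1.0 / np.dot(s, y)
        A1 = eye - rho * np.outer(s, y)
        A2 = eye - rho * np.outer(y, s)
        H = A1 @ H @ A2 + rho * np.outer(s, s)
    return H


rng = np.random.default_rng(0)
sk = rng.normal(size=(3, 5))
B = rng.normal(size=(5, 5))
A = B @ B.T + 5.0 * np.eye(5)   # symmetric positive definite "true Hessian"
yk = sk @ A                     # guarantees s.y > 0 for every pair
x = rng.normal(size=5)
```

The two-loop form costs O(n_corrs * n) per product, while the dense form needs O(n^2) storage, which is why the operator interface prefers the former.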
|
||||
896
venv/lib/python3.12/site-packages/scipy/optimize/_linesearch.py
Normal file
@ -0,0 +1,896 @@
|
||||
"""
|
||||
Functions
|
||||
---------
|
||||
.. autosummary::
|
||||
:toctree: generated/
|
||||
|
||||
line_search_armijo
|
||||
line_search_wolfe1
|
||||
line_search_wolfe2
|
||||
scalar_search_wolfe1
|
||||
scalar_search_wolfe2
|
||||
|
||||
"""
|
||||
from warnings import warn
|
||||
|
||||
from ._dcsrch import DCSRCH
|
||||
import numpy as np
|
||||
|
||||
__all__ = ['LineSearchWarning', 'line_search_wolfe1', 'line_search_wolfe2',
|
||||
'scalar_search_wolfe1', 'scalar_search_wolfe2',
|
||||
'line_search_armijo']
|
||||
|
||||
class LineSearchWarning(RuntimeWarning):
|
||||
pass
|
||||
|
||||
|
||||
def _check_c1_c2(c1, c2):
    if not (0 < c1 < c2 < 1):
        raise ValueError("'c1' and 'c2' do not satisfy "
                         "'0 < c1 < c2 < 1'.")
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# Minpack's Wolfe line and scalar searches
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
def line_search_wolfe1(f, fprime, xk, pk, gfk=None,
|
||||
old_fval=None, old_old_fval=None,
|
||||
args=(), c1=1e-4, c2=0.9, amax=50, amin=1e-8,
|
||||
xtol=1e-14):
|
||||
"""
|
||||
    As `scalar_search_wolfe1`, but perform the line search along direction `pk`.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
f : callable
|
||||
Function `f(x)`
|
||||
fprime : callable
|
||||
Gradient of `f`
|
||||
xk : array_like
|
||||
Current point
|
||||
pk : array_like
|
||||
Search direction
|
||||
gfk : array_like, optional
|
||||
Gradient of `f` at point `xk`
|
||||
old_fval : float, optional
|
||||
Value of `f` at point `xk`
|
||||
old_old_fval : float, optional
|
||||
Value of `f` at point preceding `xk`
|
||||
|
||||
The rest of the parameters are the same as for `scalar_search_wolfe1`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
stp, f_count, g_count, fval, old_fval
|
||||
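        Step size, numbers of function and gradient evaluations, and the new
        and old function values; `stp`, `fval`, and `old_fval` are as
        returned by `scalar_search_wolfe1`.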
|
||||
gval : array
|
||||
Gradient of `f` at the final point
|
||||
|
||||
Notes
|
||||
-----
|
||||
Parameters `c1` and `c2` must satisfy ``0 < c1 < c2 < 1``.
|
||||
|
||||
"""
|
||||
if gfk is None:
|
||||
gfk = fprime(xk, *args)
|
||||
|
||||
gval = [gfk]
|
||||
gc = [0]
|
||||
fc = [0]
|
||||
|
||||
def phi(s):
|
||||
fc[0] += 1
|
||||
return f(xk + s*pk, *args)
|
||||
|
||||
def derphi(s):
|
||||
gval[0] = fprime(xk + s*pk, *args)
|
||||
gc[0] += 1
|
||||
return np.dot(gval[0], pk)
|
||||
|
||||
derphi0 = np.dot(gfk, pk)
|
||||
|
||||
stp, fval, old_fval = scalar_search_wolfe1(
|
||||
phi, derphi, old_fval, old_old_fval, derphi0,
|
||||
c1=c1, c2=c2, amax=amax, amin=amin, xtol=xtol)
|
||||
|
||||
return stp, fc[0], gc[0], fval, old_fval, gval[0]
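
`line_search_wolfe1` reduces the multivariate problem to a scalar one by defining `phi(s) = f(xk + s*pk)` and its derivative, while counting evaluations in closures. The sketch below isolates that pattern. The helper `restrict_to_line` and the quadratic test function are hypothetical; the point and direction are the ones used in the `line_search` docstring example.

```python
import numpy as np


def restrict_to_line(f, fprime, xk, pk):
    """Return phi, derphi, and a dict counting their evaluations."""
    counts = {"f": 0, "g": 0}

    def phi(alpha):
        counts["f"] += 1
        return f(xk + alpha * pk)

    def derphi(alpha):
        counts["g"] += 1
        # Directional derivative: gradient at the trial point dotted with pk.
        return float(np.dot(fprime(xk + alpha * pk), pk))

    return phi, derphi, counts


def f(x):
    return float(np.dot(x, x))


def grad(x):
    return 2.0 * x


xk = np.array([1.8, 1.7])
pk = np.array([-1.0, -1.0])
phi, derphi, counts = restrict_to_line(f, grad, xk, pk)
value_at_start = phi(0.0)       # f(xk)
slope_at_start = derphi(0.0)    # grad(xk) . pk, negative for a descent direction
value_at_unit_step = phi(1.0)   # f(xk + pk)
```

The source keeps its counters in one-element lists (`fc`, `gc`) rather than a dict; both are ways to mutate state from inside a closure.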
|
||||
|
||||
|
||||
def scalar_search_wolfe1(phi, derphi, phi0=None, old_phi0=None, derphi0=None,
|
||||
c1=1e-4, c2=0.9,
|
||||
amax=50, amin=1e-8, xtol=1e-14):
|
||||
"""
|
||||
Scalar function search for alpha that satisfies strong Wolfe conditions
|
||||
|
||||
    The search runs over ``alpha > 0``; the search direction is assumed to be
    a descent direction (``derphi0 < 0``).
|
||||
|
||||
Parameters
|
||||
----------
|
||||
phi : callable phi(alpha)
|
||||
Function at point `alpha`
|
||||
derphi : callable phi'(alpha)
|
||||
Objective function derivative. Returns a scalar.
|
||||
phi0 : float, optional
|
||||
Value of phi at 0
|
||||
old_phi0 : float, optional
|
||||
Value of phi at previous point
|
||||
derphi0 : float, optional
|
||||
        Value of derphi at 0
|
||||
c1 : float, optional
|
||||
Parameter for Armijo condition rule.
|
||||
c2 : float, optional
|
||||
Parameter for curvature condition rule.
|
||||
amax, amin : float, optional
|
||||
Maximum and minimum step size
|
||||
xtol : float, optional
|
||||
Relative tolerance for an acceptable step.
|
||||
|
||||
Returns
|
||||
-------
|
||||
alpha : float
|
||||
Step size, or None if no suitable step was found
|
||||
phi : float
|
||||
Value of `phi` at the new point `alpha`
|
||||
phi0 : float
|
||||
Value of `phi` at `alpha=0`
|
||||
|
||||
Notes
|
||||
-----
|
||||
Uses routine DCSRCH from MINPACK.
|
||||
|
||||
Parameters `c1` and `c2` must satisfy ``0 < c1 < c2 < 1`` as described in [1]_.
|
||||
|
||||
References
|
||||
----------
|
||||
|
||||
    .. [1] Nocedal, J., & Wright, S. J. (2006). Numerical optimization.
       Springer Series in Operations Research and Financial Engineering.
       Springer Nature.
|
||||
|
||||
"""
|
||||
_check_c1_c2(c1, c2)
|
||||
|
||||
if phi0 is None:
|
||||
phi0 = phi(0.)
|
||||
if derphi0 is None:
|
||||
derphi0 = derphi(0.)
|
||||
|
||||
if old_phi0 is not None and derphi0 != 0:
|
||||
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
|
||||
if alpha1 < 0:
|
||||
alpha1 = 1.0
|
||||
else:
|
||||
alpha1 = 1.0
|
||||
|
||||
maxiter = 100
|
||||
|
||||
dcsrch = DCSRCH(phi, derphi, c1, c2, xtol, amin, amax)
|
||||
stp, phi1, phi0, task = dcsrch(
|
||||
alpha1, phi0=phi0, derphi0=derphi0, maxiter=maxiter
|
||||
)
|
||||
|
||||
return stp, phi1, phi0
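
The docstring above refers to the strong Wolfe conditions with parameters `c1` (Armijo, or sufficient decrease) and `c2` (curvature). The sketch below checks both conditions for a given step. The helper `satisfies_strong_wolfe` and the scalar test function are hypothetical, and a descent direction (`derphi(0) < 0`) is assumed.

```python
def satisfies_strong_wolfe(phi, derphi, alpha, c1=1e-4, c2=0.9):
    """Return True if step `alpha` meets both strong Wolfe conditions."""
    phi0 = phi(0.0)
    derphi0 = derphi(0.0)   # assumed negative (descent direction)
    # Armijo condition: the step must reduce phi by a fraction of the
    # decrease predicted by the initial slope.
    sufficient_decrease = phi(alpha) <= phi0 + c1 * alpha * derphi0
    # Curvature condition: the slope magnitude must have shrunk enough.
    curvature = abs(derphi(alpha)) <= -c2 * derphi0
    return sufficient_decrease and curvature


def phi(alpha):
    return (alpha - 1.0) ** 2


def derphi(alpha):
    return 2.0 * (alpha - 1.0)


exact_step_ok = satisfies_strong_wolfe(phi, derphi, 1.0)    # minimizer of phi
tiny_step_ok = satisfies_strong_wolfe(phi, derphi, 0.01)    # slope still steep
long_step_ok = satisfies_strong_wolfe(phi, derphi, 2.5)     # phi increased
```

A very short step passes the Armijo test but fails the curvature test, which is how the second condition rules out steps that make negligible progress.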
|
||||
|
||||
|
||||
line_search = line_search_wolfe1
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# Pure-Python Wolfe line and scalar searches
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# Note: `line_search_wolfe2` is the public `scipy.optimize.line_search`
|
||||
|
||||
def line_search_wolfe2(f, myfprime, xk, pk, gfk=None, old_fval=None,
|
||||
old_old_fval=None, args=(), c1=1e-4, c2=0.9, amax=None,
|
||||
extra_condition=None, maxiter=10):
|
||||
"""Find alpha that satisfies strong Wolfe conditions.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
f : callable f(x,*args)
|
||||
Objective function.
|
||||
myfprime : callable f'(x,*args)
|
||||
Objective function gradient.
|
||||
xk : ndarray
|
||||
Starting point.
|
||||
pk : ndarray
|
||||
Search direction. The search direction must be a descent direction
|
||||
for the algorithm to converge.
|
||||
gfk : ndarray, optional
|
||||
Gradient value for x=xk (xk being the current parameter
|
||||
estimate). Will be recomputed if omitted.
|
||||
old_fval : float, optional
|
||||
Function value for x=xk. Will be recomputed if omitted.
|
||||
old_old_fval : float, optional
|
||||
Function value for the point preceding x=xk.
|
||||
args : tuple, optional
|
||||
Additional arguments passed to objective function.
|
||||
c1 : float, optional
|
||||
Parameter for Armijo condition rule.
|
||||
c2 : float, optional
|
||||
Parameter for curvature condition rule.
|
||||
amax : float, optional
|
||||
Maximum step size
|
||||
extra_condition : callable, optional
|
||||
A callable of the form ``extra_condition(alpha, x, f, g)``
|
||||
returning a boolean. Arguments are the proposed step ``alpha``
|
||||
and the corresponding ``x``, ``f`` and ``g`` values. The line search
|
||||
accepts the value of ``alpha`` only if this
|
||||
callable returns ``True``. If the callable returns ``False``
|
||||
for the step length, the algorithm will continue with
|
||||
new iterates. The callable is only called for iterates
|
||||
satisfying the strong Wolfe conditions.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations to perform.
|
||||
|
||||
Returns
|
||||
-------
|
||||
alpha : float or None
|
||||
        Step size for which ``x_new = xk + alpha * pk``,
|
||||
or None if the line search algorithm did not converge.
|
||||
fc : int
|
||||
Number of function evaluations made.
|
||||
gc : int
|
||||
Number of gradient evaluations made.
|
||||
new_fval : float or None
|
||||
        New function value ``f(x_new) = f(xk + alpha*pk)``,
|
||||
or None if the line search algorithm did not converge.
|
||||
old_fval : float
|
||||
        Old function value ``f(xk)``.
|
||||
new_slope : float or None
|
||||
The local slope along the search direction at the
|
||||
new value ``<myfprime(x_new), pk>``,
|
||||
or None if the line search algorithm did not converge.
|
||||
|
||||
|
||||
Notes
|
||||
-----
|
||||
Uses the line search algorithm to enforce strong Wolfe
|
||||
conditions. See Wright and Nocedal, 'Numerical Optimization',
|
||||
1999, pp. 59-61.
|
||||
|
||||
The search direction `pk` must be a descent direction (e.g.
|
||||
``-myfprime(xk)``) to find a step length that satisfies the strong Wolfe
|
||||
conditions. If the search direction is not a descent direction (e.g.
|
||||
``myfprime(xk)``), then `alpha`, `new_fval`, and `new_slope` will be None.
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize import line_search
|
||||
|
||||
    An objective function and its gradient are defined.
|
||||
|
||||
>>> def obj_func(x):
|
||||
... return (x[0])**2+(x[1])**2
|
||||
>>> def obj_grad(x):
|
||||
... return [2*x[0], 2*x[1]]
|
||||
|
||||
We can find alpha that satisfies strong Wolfe conditions.
|
||||
|
||||
>>> start_point = np.array([1.8, 1.7])
|
||||
>>> search_gradient = np.array([-1.0, -1.0])
|
||||
>>> line_search(obj_func, obj_grad, start_point, search_gradient)
|
||||
(1.0, 2, 1, 1.1300000000000001, 6.13, [1.6, 1.4])
|
||||
|
||||
"""
|
||||
fc = [0]
|
||||
gc = [0]
|
||||
gval = [None]
|
||||
gval_alpha = [None]
|
||||
|
||||
def phi(alpha):
|
||||
fc[0] += 1
|
||||
return f(xk + alpha * pk, *args)
|
||||
|
||||
fprime = myfprime
|
||||
|
||||
def derphi(alpha):
|
||||
gc[0] += 1
|
||||
gval[0] = fprime(xk + alpha * pk, *args) # store for later use
|
||||
gval_alpha[0] = alpha
|
||||
return np.dot(gval[0], pk)
|
||||
|
||||
if gfk is None:
|
||||
gfk = fprime(xk, *args)
|
||||
derphi0 = np.dot(gfk, pk)
|
||||
|
||||
if extra_condition is not None:
|
||||
# Add the current gradient as argument, to avoid needless
|
||||
# re-evaluation
|
||||
def extra_condition2(alpha, phi):
|
||||
if gval_alpha[0] != alpha:
|
||||
derphi(alpha)
|
||||
x = xk + alpha * pk
|
||||
return extra_condition(alpha, x, phi, gval[0])
|
||||
else:
|
||||
extra_condition2 = None
|
||||
|
||||
alpha_star, phi_star, old_fval, derphi_star = scalar_search_wolfe2(
|
||||
phi, derphi, old_fval, old_old_fval, derphi0, c1, c2, amax,
|
||||
extra_condition2, maxiter=maxiter)
|
||||
|
||||
if derphi_star is None:
|
||||
warn('The line search algorithm did not converge',
|
||||
LineSearchWarning, stacklevel=2)
|
||||
else:
|
||||
        # derphi_star is a number (derphi), so use the most recently
        # calculated gradient that was used in computing it
        # (derphi = gfk*pk). This is the gradient at the next step, so
        # there is no need to compute it again in the outer loop.
|
||||
derphi_star = gval[0]
|
||||
|
||||
return alpha_star, fc[0], gc[0], phi_star, old_fval, derphi_star
|
||||
|
||||
|
||||
def scalar_search_wolfe2(phi, derphi, phi0=None,
|
||||
old_phi0=None, derphi0=None,
|
||||
c1=1e-4, c2=0.9, amax=None,
|
||||
extra_condition=None, maxiter=10):
|
||||
"""Find alpha that satisfies strong Wolfe conditions.
|
||||
|
||||
    The search runs over ``alpha > 0``; the search direction is assumed to be
    a descent direction (``derphi0 < 0``).
|
||||
|
||||
Parameters
|
||||
----------
|
||||
phi : callable phi(alpha)
|
||||
Objective scalar function.
|
||||
derphi : callable phi'(alpha)
|
||||
Objective function derivative. Returns a scalar.
|
||||
phi0 : float, optional
|
||||
Value of phi at 0.
|
||||
old_phi0 : float, optional
|
||||
Value of phi at previous point.
|
||||
derphi0 : float, optional
|
||||
Value of derphi at 0
|
||||
c1 : float, optional
|
||||
Parameter for Armijo condition rule.
|
||||
c2 : float, optional
|
||||
Parameter for curvature condition rule.
|
||||
amax : float, optional
|
||||
Maximum step size.
|
||||
extra_condition : callable, optional
|
||||
A callable of the form ``extra_condition(alpha, phi_value)``
|
||||
returning a boolean. The line search accepts the value
|
||||
of ``alpha`` only if this callable returns ``True``.
|
||||
If the callable returns ``False`` for the step length,
|
||||
the algorithm will continue with new iterates.
|
||||
The callable is only called for iterates satisfying
|
||||
the strong Wolfe conditions.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations to perform.
|
||||
|
||||
Returns
|
||||
-------
|
||||
alpha_star : float or None
|
||||
Best alpha, or None if the line search algorithm did not converge.
|
||||
phi_star : float
|
||||
phi at alpha_star.
|
||||
phi0 : float
|
||||
phi at 0.
|
||||
derphi_star : float or None
|
||||
derphi at alpha_star, or None if the line search algorithm
|
||||
did not converge.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Uses the line search algorithm to enforce strong Wolfe
|
||||
conditions. See Wright and Nocedal, 'Numerical Optimization',
|
||||
1999, pp. 59-61.
|
||||
|
||||
"""
|
||||
_check_c1_c2(c1, c2)
|
||||
|
||||
if phi0 is None:
|
||||
phi0 = phi(0.)
|
||||
|
||||
if derphi0 is None:
|
||||
derphi0 = derphi(0.)
|
||||
|
||||
alpha0 = 0
|
||||
if old_phi0 is not None and derphi0 != 0:
|
||||
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
|
||||
else:
|
||||
alpha1 = 1.0
|
||||
|
||||
if alpha1 < 0:
|
||||
alpha1 = 1.0
|
||||
|
||||
if amax is not None:
|
||||
alpha1 = min(alpha1, amax)
|
||||
|
||||
phi_a1 = phi(alpha1)
|
||||
    # derphi_a1 = derphi(alpha1) is evaluated below
|
||||
|
||||
phi_a0 = phi0
|
||||
derphi_a0 = derphi0
|
||||
|
||||
if extra_condition is None:
|
||||
def extra_condition(alpha, phi):
|
||||
return True
|
||||
|
||||
for i in range(maxiter):
|
||||
if alpha1 == 0 or (amax is not None and alpha0 > amax):
|
||||
# alpha1 == 0: This shouldn't happen. Perhaps the increment has
|
||||
# slipped below machine precision?
|
||||
alpha_star = None
|
||||
phi_star = phi0
|
||||
phi0 = old_phi0
|
||||
derphi_star = None
|
||||
|
||||
if alpha1 == 0:
|
||||
msg = 'Rounding errors prevent the line search from converging'
|
||||
else:
|
||||
                msg = ("The line search algorithm could not find a solution "
                       f"less than or equal to amax: {amax}")
|
||||
|
||||
warn(msg, LineSearchWarning, stacklevel=2)
|
||||
break
|
||||
|
||||
not_first_iteration = i > 0
|
||||
if (phi_a1 > phi0 + c1 * alpha1 * derphi0) or \
|
||||
((phi_a1 >= phi_a0) and not_first_iteration):
|
||||
alpha_star, phi_star, derphi_star = \
|
||||
_zoom(alpha0, alpha1, phi_a0,
|
||||
phi_a1, derphi_a0, phi, derphi,
|
||||
phi0, derphi0, c1, c2, extra_condition)
|
||||
break
|
||||
|
||||
derphi_a1 = derphi(alpha1)
|
||||
if (abs(derphi_a1) <= -c2*derphi0):
|
||||
if extra_condition(alpha1, phi_a1):
|
||||
alpha_star = alpha1
|
||||
phi_star = phi_a1
|
||||
derphi_star = derphi_a1
|
||||
break
|
||||
|
||||
if (derphi_a1 >= 0):
|
||||
alpha_star, phi_star, derphi_star = \
|
||||
_zoom(alpha1, alpha0, phi_a1,
|
||||
phi_a0, derphi_a1, phi, derphi,
|
||||
phi0, derphi0, c1, c2, extra_condition)
|
||||
break
|
||||
|
||||
alpha2 = 2 * alpha1 # increase by factor of two on each iteration
|
||||
if amax is not None:
|
||||
alpha2 = min(alpha2, amax)
|
||||
alpha0 = alpha1
|
||||
alpha1 = alpha2
|
||||
phi_a0 = phi_a1
|
||||
phi_a1 = phi(alpha1)
|
||||
derphi_a0 = derphi_a1
|
||||
|
||||
else:
|
||||
# stopping test maxiter reached
|
||||
alpha_star = alpha1
|
||||
phi_star = phi_a1
|
||||
derphi_star = None
|
||||
warn('The line search algorithm did not converge',
|
||||
LineSearchWarning, stacklevel=2)
|
||||
|
||||
return alpha_star, phi_star, phi0, derphi_star
|
||||
|
||||
|
||||
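The two tests applied inside the loop above (sufficient decrease, then the strong curvature condition) can be checked for a single trial step in isolation. The sketch below is illustrative only: the helper name `satisfies_strong_wolfe` is hypothetical and not part of SciPy, and the values `c1=1e-4`, `c2=0.9` are assumed for the example.

```python
def satisfies_strong_wolfe(phi, derphi, alpha, c1=1e-4, c2=0.9):
    """Return (sufficient_decrease, curvature) flags for a trial step alpha.

    phi is the 1-D restriction alpha -> f(xk + alpha*pk) and derphi is its
    derivative. Hypothetical helper, for illustration only.
    """
    phi0 = phi(0.0)
    derphi0 = derphi(0.0)
    # Sufficient decrease (Armijo) condition.
    sufficient_decrease = phi(alpha) <= phi0 + c1 * alpha * derphi0
    # Strong curvature condition; derphi0 < 0 for a descent direction.
    curvature = abs(derphi(alpha)) <= -c2 * derphi0
    return sufficient_decrease, curvature


def phi(alpha):
    # Simple convex model with its minimum at alpha = 1.
    return (alpha - 1.0) ** 2


def derphi(alpha):
    return 2.0 * (alpha - 1.0)


print(satisfies_strong_wolfe(phi, derphi, 1.0))   # exact minimizer
print(satisfies_strong_wolfe(phi, derphi, 0.01))  # very short step
```

A very short step passes the decrease test but fails the curvature test, which is why the search keeps doubling `alpha1` until one of the bracketing conditions triggers `_zoom`.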
def _cubicmin(a, fa, fpa, b, fb, c, fc):
    """
    Finds the minimizer for a cubic polynomial that goes through the
    points (a,fa), (b,fb), and (c,fc) with derivative at a of fpa.

    If no minimizer can be found, return None.

    """
    # f(x) = A *(x-a)^3 + B*(x-a)^2 + C*(x-a) + D

    with np.errstate(divide='raise', over='raise', invalid='raise'):
        try:
            C = fpa
            db = b - a
            dc = c - a
            denom = (db * dc) ** 2 * (db - dc)
            d1 = np.empty((2, 2))
            d1[0, 0] = dc ** 2
            d1[0, 1] = -db ** 2
            d1[1, 0] = -dc ** 3
            d1[1, 1] = db ** 3
            [A, B] = np.dot(d1, np.asarray([fb - fa - C * db,
                                            fc - fa - C * dc]).flatten())
            A /= denom
            B /= denom
            radical = B * B - 3 * A * C
            xmin = a + (-B + np.sqrt(radical)) / (3 * A)
        except ArithmeticError:
            return None
    if not np.isfinite(xmin):
        return None
    return xmin


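The same cubic fit can be written without NumPy, which makes the algebra easier to follow. This is a sketch under stated assumptions, not the library routine: `cubic_min` is a hypothetical name, and arithmetic failures are caught through plain Python exceptions instead of `np.errstate`.

```python
import math


def cubic_min(a, fa, fpa, b, fb, c, fc):
    """Minimizer of the cubic through (a, fa), (b, fb), (c, fc) with
    slope fpa at a, or None when the arithmetic breaks down.

    Model: f(x) = A*(x-a)**3 + B*(x-a)**2 + C*(x-a) + D, with D = fa, C = fpa.
    """
    try:
        C = fpa
        db = b - a
        dc = c - a
        denom = (db * dc) ** 2 * (db - dc)
        # Residuals left after removing the known constant and linear terms.
        rb = fb - fa - C * db
        rc = fc - fa - C * dc
        # Solve the 2x2 system for A and B (same matrix as in _cubicmin).
        A = (dc ** 2 * rb - db ** 2 * rc) / denom
        B = (-dc ** 3 * rb + db ** 3 * rc) / denom
        radical = B * B - 3 * A * C
        xmin = a + (-B + math.sqrt(radical)) / (3 * A)
    except (ZeroDivisionError, ValueError, OverflowError):
        return None
    return xmin if math.isfinite(xmin) else None


# f(x) = x**3 - 3*x has a local minimum at x = 1.
print(cubic_min(0, 0, -3, 2, 2, 3, 18))
# Coincident sample points make the system singular.
print(cubic_min(0, 0, -3, 2, 2, 2, 2))
```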
def _quadmin(a, fa, fpa, b, fb):
    """
    Finds the minimizer for a quadratic polynomial that goes through
    the points (a,fa), (b,fb) with derivative at a of fpa.

    """
    # f(x) = B*(x-a)^2 + C*(x-a) + D
    with np.errstate(divide='raise', over='raise', invalid='raise'):
        try:
            D = fa
            C = fpa
            db = b - a * 1.0
            B = (fb - D - C * db) / (db * db)
            xmin = a - C / (2.0 * B)
        except ArithmeticError:
            return None
    if not np.isfinite(xmin):
        return None
    return xmin


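As a quick check of the quadratic formula, here is a minimal pure-Python sketch. The name `quad_min` is hypothetical, and degenerate inputs are reported through `ZeroDivisionError` rather than NumPy's floating-point error state.

```python
def quad_min(a, fa, fpa, b, fb):
    """Stationary point of the quadratic through (a, fa) and (b, fb) with
    slope fpa at a, or None for degenerate input.

    Model: f(x) = B*(x-a)**2 + C*(x-a) + D, with D = fa and C = fpa.
    """
    try:
        db = b - a
        B = (fb - fa - fpa * db) / (db * db)
        # Setting f'(x) = 2*B*(x-a) + C to zero gives x = a - C / (2*B).
        return a - fpa / (2.0 * B)
    except ZeroDivisionError:
        return None


# f(x) = (x - 2)**2 sampled at x = 0 (value 4, slope -4) and x = 1 (value 1).
print(quad_min(0.0, 4.0, -4.0, 1.0, 1.0))
# Data lying on a straight line has no curvature, so no stationary point.
print(quad_min(0.0, 0.0, -1.0, 1.0, -1.0))
```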
def _zoom(a_lo, a_hi, phi_lo, phi_hi, derphi_lo,
          phi, derphi, phi0, derphi0, c1, c2, extra_condition):
    """Zoom stage of approximate linesearch satisfying strong Wolfe conditions.

    Part of the optimization algorithm in `scalar_search_wolfe2`.

    Notes
    -----
    Implements Algorithm 3.6 (zoom) in Wright and Nocedal,
    'Numerical Optimization', 1999, p. 61.

    """

    maxiter = 10
    i = 0
    delta1 = 0.2  # cubic interpolant check
    delta2 = 0.1  # quadratic interpolant check
    phi_rec = phi0
    a_rec = 0
    while True:
        # Interpolate to find a trial step length between a_lo and a_hi.
        # Use cubic interpolation first; if the result is within
        # delta * dalpha of an end point, or outside the interval bounded
        # by a_lo and a_hi, use quadratic interpolation; if that result
        # is still too close, use bisection.

        dalpha = a_hi - a_lo
        if dalpha < 0:
            a, b = a_hi, a_lo
        else:
            a, b = a_lo, a_hi

        # minimizer of cubic interpolant
        # (uses phi_lo, derphi_lo, phi_hi, and the most recent value of phi)
        #
        # if the result is too close to the end points (or out of the
        # interval), then use quadratic interpolation with phi_lo,
        # derphi_lo and phi_hi; if the result is still too close to the
        # end points (or out of the interval), then use bisection

        if (i > 0):
            cchk = delta1 * dalpha
            a_j = _cubicmin(a_lo, phi_lo, derphi_lo, a_hi, phi_hi,
                            a_rec, phi_rec)
        if (i == 0) or (a_j is None) or (a_j > b - cchk) or (a_j < a + cchk):
            qchk = delta2 * dalpha
            a_j = _quadmin(a_lo, phi_lo, derphi_lo, a_hi, phi_hi)
            if (a_j is None) or (a_j > b-qchk) or (a_j < a+qchk):
                a_j = a_lo + 0.5*dalpha

        # Check new value of a_j

        phi_aj = phi(a_j)
        if (phi_aj > phi0 + c1*a_j*derphi0) or (phi_aj >= phi_lo):
            phi_rec = phi_hi
            a_rec = a_hi
            a_hi = a_j
            phi_hi = phi_aj
        else:
            derphi_aj = derphi(a_j)
            if abs(derphi_aj) <= -c2*derphi0 and extra_condition(a_j, phi_aj):
                a_star = a_j
                val_star = phi_aj
                valprime_star = derphi_aj
                break
            if derphi_aj*(a_hi - a_lo) >= 0:
                phi_rec = phi_hi
                a_rec = a_hi
                a_hi = a_lo
                phi_hi = phi_lo
            else:
                phi_rec = phi_lo
                a_rec = a_lo
            a_lo = a_j
            phi_lo = phi_aj
            derphi_lo = derphi_aj
        i += 1
        if (i > maxiter):
            # Failed to find a conforming step size
            a_star = None
            val_star = None
            valprime_star = None
            break
    return a_star, val_star, valprime_star


#------------------------------------------------------------------------------
# Armijo line and scalar searches
#------------------------------------------------------------------------------

def line_search_armijo(f, xk, pk, gfk, old_fval, args=(), c1=1e-4, alpha0=1):
    """Minimize over alpha, the function ``f(xk+alpha pk)``.

    Parameters
    ----------
    f : callable
        Function to be minimized.
    xk : array_like
        Current point.
    pk : array_like
        Search direction.
    gfk : array_like
        Gradient of `f` at point `xk`.
    old_fval : float
        Value of `f` at point `xk`.
    args : tuple, optional
        Optional arguments.
    c1 : float, optional
        Value to control stopping criterion.
    alpha0 : scalar, optional
        Value of `alpha` at start of the optimization.

    Returns
    -------
    alpha
    f_count
    f_val_at_alpha

    Notes
    -----
    Uses the interpolation algorithm (Armijo backtracking) as suggested by
    Wright and Nocedal in 'Numerical Optimization', 1999, pp. 56-57

    """
    xk = np.atleast_1d(xk)
    fc = [0]

    def phi(alpha1):
        fc[0] += 1
        return f(xk + alpha1*pk, *args)

    if old_fval is None:
        phi0 = phi(0.)
    else:
        phi0 = old_fval  # compute f(xk) -- done in past loop

    derphi0 = np.dot(gfk, pk)
    alpha, phi1 = scalar_search_armijo(phi, phi0, derphi0, c1=c1,
                                       alpha0=alpha0)
    return alpha, fc[0], phi1


def line_search_BFGS(f, xk, pk, gfk, old_fval, args=(), c1=1e-4, alpha0=1):
    """
    Compatibility wrapper for `line_search_armijo`
    """
    r = line_search_armijo(f, xk, pk, gfk, old_fval, args=args, c1=c1,
                           alpha0=alpha0)
    return r[0], r[1], 0, r[2]


def scalar_search_armijo(phi, phi0, derphi0, c1=1e-4, alpha0=1, amin=0):
    """Minimize over alpha, the function ``phi(alpha)``.

    Uses the interpolation algorithm (Armijo backtracking) as suggested by
    Wright and Nocedal in 'Numerical Optimization', 1999, pp. 56-57

    alpha > 0 is assumed to be a descent direction.

    Returns
    -------
    alpha
    phi1

    """
    phi_a0 = phi(alpha0)
    if phi_a0 <= phi0 + c1*alpha0*derphi0:
        return alpha0, phi_a0

    # Otherwise, compute the minimizer of a quadratic interpolant:

    alpha1 = -(derphi0) * alpha0**2 / 2.0 / (phi_a0 - phi0 - derphi0 * alpha0)
    phi_a1 = phi(alpha1)

    if (phi_a1 <= phi0 + c1*alpha1*derphi0):
        return alpha1, phi_a1

    # Otherwise, loop with cubic interpolation until we find an alpha which
    # satisfies the first Wolfe condition (since we are backtracking, we will
    # assume that the value of alpha is not too small and satisfies the second
    # condition).

    while alpha1 > amin:       # we are assuming alpha>0 is a descent direction
        factor = alpha0**2 * alpha1**2 * (alpha1-alpha0)
        a = alpha0**2 * (phi_a1 - phi0 - derphi0*alpha1) - \
            alpha1**2 * (phi_a0 - phi0 - derphi0*alpha0)
        a = a / factor
        b = -alpha0**3 * (phi_a1 - phi0 - derphi0*alpha1) + \
            alpha1**3 * (phi_a0 - phi0 - derphi0*alpha0)
        b = b / factor

        alpha2 = (-b + np.sqrt(abs(b**2 - 3 * a * derphi0))) / (3.0*a)
        phi_a2 = phi(alpha2)

        if (phi_a2 <= phi0 + c1*alpha2*derphi0):
            return alpha2, phi_a2

        if (alpha1 - alpha2) > alpha1 / 2.0 or (1 - alpha2/alpha1) < 0.96:
            alpha2 = alpha1 / 2.0

        alpha0 = alpha1
        alpha1 = alpha2
        phi_a0 = phi_a1
        phi_a1 = phi_a2

    # Failed to find a suitable step length
    return None, phi_a1


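For comparison, the simplest form of Armijo backtracking just halves the step until the sufficient-decrease test passes. The sketch below deliberately swaps the quadratic and cubic interpolation used above for plain halving; `backtracking_armijo` and its `amin` default are assumptions made for this illustration, not SciPy API.

```python
def backtracking_armijo(phi, phi0, derphi0, c1=1e-4, alpha0=1.0, amin=1e-10):
    """Halve alpha until phi(alpha) <= phi0 + c1*alpha*derphi0.

    Returns (alpha, phi(alpha)) on success, or (None, last_value) when the
    step shrinks below amin. Hypothetical helper using plain halving.
    """
    alpha = alpha0
    phi_a = None
    while alpha > amin:
        phi_a = phi(alpha)
        if phi_a <= phi0 + c1 * alpha * derphi0:
            return alpha, phi_a
        alpha *= 0.5  # no interpolation: simply shrink the step
    return None, phi_a


def phi(alpha):
    # f(x) = x**2 along the direction p = -4 from x = 1.
    return (1.0 - 4.0 * alpha) ** 2


# phi(0) = 1 and phi'(0) = 2 * 1 * (-4) = -8.
print(backtracking_armijo(phi, phi0=1.0, derphi0=-8.0))
```

Interpolation usually needs fewer function evaluations than halving because each trial step uses the values already computed, which is the motivation for the quadratic and cubic models in `scalar_search_armijo`.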
#------------------------------------------------------------------------------
# Non-monotone line search for DF-SANE
#------------------------------------------------------------------------------

def _nonmonotone_line_search_cruz(f, x_k, d, prev_fs, eta,
                                  gamma=1e-4, tau_min=0.1, tau_max=0.5):
    """
    Nonmonotone backtracking line search as described in [1]_

    Parameters
    ----------
    f : callable
        Function returning a tuple ``(f, F)`` where ``f`` is the value
        of a merit function and ``F`` the residual.
    x_k : ndarray
        Initial position.
    d : ndarray
        Search direction.
    prev_fs : list of float
        List of previous merit function values. Should have ``len(prev_fs) <= M``
        where ``M`` is the nonmonotonicity window parameter.
    eta : float
        Allowed merit function increase, see [1]_
    gamma, tau_min, tau_max : float, optional
        Search parameters, see [1]_

    Returns
    -------
    alpha : float
        Step length
    xp : ndarray
        Next position
    fp : float
        Merit function value at next position
    Fp : ndarray
        Residual at next position

    References
    ----------
    .. [1] "Spectral residual method without gradient information for solving
           large-scale nonlinear systems of equations." W. La Cruz,
           J.M. Martinez, M. Raydan. Math. Comp. **75**, 1429 (2006).

    """
    f_k = prev_fs[-1]
    f_bar = max(prev_fs)

    alpha_p = 1
    alpha_m = 1
    alpha = 1

    while True:
        xp = x_k + alpha_p * d
        fp, Fp = f(xp)

        if fp <= f_bar + eta - gamma * alpha_p**2 * f_k:
            alpha = alpha_p
            break

        alpha_tp = alpha_p**2 * f_k / (fp + (2*alpha_p - 1)*f_k)

        xp = x_k - alpha_m * d
        fp, Fp = f(xp)

        if fp <= f_bar + eta - gamma * alpha_m**2 * f_k:
            alpha = -alpha_m
            break

        alpha_tm = alpha_m**2 * f_k / (fp + (2*alpha_m - 1)*f_k)

        alpha_p = np.clip(alpha_tp, tau_min * alpha_p, tau_max * alpha_p)
        alpha_m = np.clip(alpha_tm, tau_min * alpha_m, tau_max * alpha_m)

    return alpha, xp, fp, Fp


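The step-length update at the end of each loop iteration can be examined on its own. The helper below is a hypothetical restatement of that update for a single direction, with `min`/`max` standing in for `np.clip`; it assumes a positive merit value `f_k`.

```python
def next_trial_step(alpha, f_k, fp, tau_min=0.1, tau_max=0.5):
    """Propose the next step length after a rejected trial.

    alpha is the rejected step, f_k the current merit value, and fp the
    merit value at the rejected trial point. The proposal is safeguarded
    to the interval [tau_min*alpha, tau_max*alpha].
    """
    alpha_t = alpha ** 2 * f_k / (fp + (2 * alpha - 1) * f_k)
    lower, upper = tau_min * alpha, tau_max * alpha
    return min(max(alpha_t, lower), upper)


# Moderate increase in the merit function: the proposal is used as-is.
print(next_trial_step(1.0, f_k=1.0, fp=3.0))
# Very large increase: the proposal is clipped at tau_min * alpha.
print(next_trial_step(1.0, f_k=1.0, fp=100.0))
```

The safeguard guarantees that every rejected trial shrinks the step by a factor between `tau_min` and `tau_max`, so the step can neither collapse too quickly nor stall.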
def _nonmonotone_line_search_cheng(f, x_k, d, f_k, C, Q, eta,
                                   gamma=1e-4, tau_min=0.1, tau_max=0.5,
                                   nu=0.85):
    """
    Nonmonotone line search from [1]_

    Parameters
    ----------
    f : callable
        Function returning a tuple ``(f, F)`` where ``f`` is the value
        of a merit function and ``F`` the residual.
    x_k : ndarray
        Initial position.
    d : ndarray
        Search direction.
    f_k : float
        Initial merit function value.
    C, Q : float
        Control parameters. On the first iteration, give values
        Q=1.0, C=f_k
    eta : float
        Allowed merit function increase, see [1]_
    nu, gamma, tau_min, tau_max : float, optional
        Search parameters, see [1]_

    Returns
    -------
    alpha : float
        Step length
    xp : ndarray
        Next position
    fp : float
        Merit function value at next position
    Fp : ndarray
        Residual at next position
    C : float
        New value for the control parameter C
    Q : float
        New value for the control parameter Q

    References
    ----------
    .. [1] W. Cheng & D.-H. Li, ''A derivative-free nonmonotone line
           search and its application to the spectral residual
           method'', IMA J. Numer. Anal. 29, 814 (2009).

    """
    alpha_p = 1
    alpha_m = 1
    alpha = 1

    while True:
        xp = x_k + alpha_p * d
        fp, Fp = f(xp)

        if fp <= C + eta - gamma * alpha_p**2 * f_k:
            alpha = alpha_p
            break

        alpha_tp = alpha_p**2 * f_k / (fp + (2*alpha_p - 1)*f_k)

        xp = x_k - alpha_m * d
        fp, Fp = f(xp)

        if fp <= C + eta - gamma * alpha_m**2 * f_k:
            alpha = -alpha_m
            break

        alpha_tm = alpha_m**2 * f_k / (fp + (2*alpha_m - 1)*f_k)

        alpha_p = np.clip(alpha_tp, tau_min * alpha_p, tau_max * alpha_p)
        alpha_m = np.clip(alpha_tm, tau_min * alpha_m, tau_max * alpha_m)

    # Update C and Q
    Q_next = nu * Q + 1
    C = (nu * Q * (C + eta) + fp) / Q_next
    Q = Q_next

    return alpha, xp, fp, Fp, C, Q
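The `C` and `Q` update at the end of the function is a weighted running average of merit values. A small sketch, with the hypothetical helper name `update_reference`, shows how `nu` controls the amount of memory:

```python
def update_reference(C, Q, fp, eta, nu=0.85):
    """Return the next (C, Q) pair after accepting a step with merit value fp.

    C is the reference value used in the acceptance test, and Q the
    accumulated weight. Same arithmetic as the "Update C and Q" block above.
    """
    Q_next = nu * Q + 1
    C_next = (nu * Q * (C + eta) + fp) / Q_next
    return C_next, Q_next


# With nu = 0.85 the old reference value still carries most of the weight.
print(update_reference(C=2.0, Q=1.0, fp=1.0, eta=0.0))
# With nu = 0 the reference collapses to the latest merit value.
print(update_reference(C=2.0, Q=1.0, fp=1.0, eta=0.0, nu=0.0))
```

Setting `nu` to zero makes the reference equal to the most recent merit value, which recovers an ordinary monotone acceptance test; values closer to one allow larger temporary increases.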
716
venv/lib/python3.12/site-packages/scipy/optimize/_linprog.py
Normal file
@ -0,0 +1,716 @@
"""
A top-level linear programming interface.

.. versionadded:: 0.15.0

Functions
---------
.. autosummary::
   :toctree: generated/

    linprog
    linprog_verbose_callback
    linprog_terse_callback

"""

import numpy as np

from ._optimize import OptimizeResult, OptimizeWarning
from warnings import warn
from ._linprog_highs import _linprog_highs
from ._linprog_ip import _linprog_ip
from ._linprog_simplex import _linprog_simplex
from ._linprog_rs import _linprog_rs
from ._linprog_doc import (_linprog_highs_doc, _linprog_ip_doc,  # noqa: F401
                           _linprog_rs_doc, _linprog_simplex_doc,
                           _linprog_highs_ipm_doc, _linprog_highs_ds_doc)
from ._linprog_util import (
    _parse_linprog, _presolve, _get_Abc, _LPProblem, _autoscale,
    _postsolve, _check_result, _display_summary)
from copy import deepcopy

__all__ = ['linprog', 'linprog_verbose_callback', 'linprog_terse_callback']

__docformat__ = "restructuredtext en"

LINPROG_METHODS = [
    'simplex', 'revised simplex', 'interior-point', 'highs', 'highs-ds', 'highs-ipm'
]


def linprog_verbose_callback(res):
    """
    A sample callback function demonstrating the linprog callback interface.
    This callback produces detailed output to sys.stdout before each iteration
    and after the final iteration of the simplex algorithm.

    Parameters
    ----------
    res : A `scipy.optimize.OptimizeResult` consisting of the following fields:

        x : 1-D array
            The independent variable vector which optimizes the linear
            programming problem.
        fun : float
            Value of the objective function.
        success : bool
            True if the algorithm succeeded in finding an optimal solution.
        slack : 1-D array
            The values of the slack variables. Each slack variable corresponds
            to an inequality constraint. If the slack is zero, then the
            corresponding constraint is active.
        con : 1-D array
            The (nominally zero) residuals of the equality constraints, that is,
            ``b_eq - A_eq @ x``
        phase : int
            The phase of the optimization being executed. In phase 1 a basic
            feasible solution is sought and the tableau T has an additional row
            representing an alternate objective function.
        status : int
            An integer representing the exit status of the optimization::

                 0 : Optimization terminated successfully
                 1 : Iteration limit reached
                 2 : Problem appears to be infeasible
                 3 : Problem appears to be unbounded
                 4 : Serious numerical difficulties encountered

        nit : int
            The number of iterations performed.
        message : str
            A string descriptor of the exit status of the optimization.
    """
    x = res['x']
    fun = res['fun']
    phase = res['phase']
    status = res['status']
    nit = res['nit']
    message = res['message']
    complete = res['complete']

    saved_printoptions = np.get_printoptions()
    np.set_printoptions(linewidth=500,
                        formatter={'float': lambda x: f"{x: 12.4f}"})
    if status:
        print('--------- Simplex Early Exit -------\n')
        print(f'The simplex method exited early with status {status:d}')
        print(message)
    elif complete:
        print('--------- Simplex Complete --------\n')
        print(f'Iterations required: {nit}')
    else:
        print(f'--------- Iteration {nit:d} ---------\n')

    if nit > 0:
        if phase == 1:
            print('Current Pseudo-Objective Value:')
        else:
            print('Current Objective Value:')
        print('f = ', fun)
        print()
        print('Current Solution Vector:')
        print('x = ', x)
        print()

    np.set_printoptions(**saved_printoptions)


def linprog_terse_callback(res):
    """
    A sample callback function demonstrating the linprog callback interface.
    This callback produces brief output to sys.stdout before each iteration
    and after the final iteration of the simplex algorithm.

    Parameters
    ----------
    res : A `scipy.optimize.OptimizeResult` consisting of the following fields:

        x : 1-D array
            The independent variable vector which optimizes the linear
            programming problem.
        fun : float
            Value of the objective function.
        success : bool
            True if the algorithm succeeded in finding an optimal solution.
        slack : 1-D array
            The values of the slack variables. Each slack variable corresponds
            to an inequality constraint. If the slack is zero, then the
            corresponding constraint is active.
        con : 1-D array
            The (nominally zero) residuals of the equality constraints, that is,
            ``b_eq - A_eq @ x``.
        phase : int
            The phase of the optimization being executed. In phase 1 a basic
            feasible solution is sought and the tableau T has an additional row
            representing an alternate objective function.
        status : int
            An integer representing the exit status of the optimization::

                 0 : Optimization terminated successfully
                 1 : Iteration limit reached
                 2 : Problem appears to be infeasible
                 3 : Problem appears to be unbounded
                 4 : Serious numerical difficulties encountered

        nit : int
            The number of iterations performed.
        message : str
            A string descriptor of the exit status of the optimization.
    """
    nit = res['nit']
    x = res['x']

    if nit == 0:
        print("Iter: X:")
    print(f"{nit: <5d} ", end="")
    print(x)


def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
            bounds=(0, None), method='highs', callback=None,
            options=None, x0=None, integrality=None):
    r"""
    Linear programming: minimize a linear objective function subject to linear
    equality and inequality constraints.

    Linear programming solves problems of the following form:

    .. math::

        \min_x \ & c^T x \\
        \mbox{such that} \ & A_{ub} x \leq b_{ub},\\
        & A_{eq} x = b_{eq},\\
        & l \leq x \leq u ,

    where :math:`x` is a vector of decision variables; :math:`c`,
    :math:`b_{ub}`, :math:`b_{eq}`, :math:`l`, and :math:`u` are vectors; and
    :math:`A_{ub}` and :math:`A_{eq}` are matrices.

    Alternatively, that's:

        - minimize ::

            c @ x

        - such that ::

            A_ub @ x <= b_ub
            A_eq @ x == b_eq
            lb <= x <= ub

    Note that by default ``lb = 0`` and ``ub = None``. Other bounds can be
    specified with ``bounds``.

    Parameters
    ----------
    c : 1-D array
        The coefficients of the linear objective function to be minimized.
    A_ub : 2-D array, optional
        The inequality constraint matrix. Each row of ``A_ub`` specifies the
        coefficients of a linear inequality constraint on ``x``.
    b_ub : 1-D array, optional
        The inequality constraint vector. Each element represents an
        upper bound on the corresponding value of ``A_ub @ x``.
    A_eq : 2-D array, optional
        The equality constraint matrix. Each row of ``A_eq`` specifies the
        coefficients of a linear equality constraint on ``x``.
    b_eq : 1-D array, optional
        The equality constraint vector. Each element of ``A_eq @ x`` must equal
        the corresponding element of ``b_eq``.
    bounds : sequence, optional
        A sequence of ``(min, max)`` pairs for each element in ``x``, defining
        the minimum and maximum values of that decision variable.
        If a single tuple ``(min, max)`` is provided, then ``min`` and ``max``
        will serve as bounds for all decision variables.
        Use ``None`` to indicate that there is no bound. For instance, the
        default bound ``(0, None)`` means that all decision variables are
        non-negative, and the pair ``(None, None)`` means no bounds at all,
        i.e. all variables are allowed to be any real.
    method : str, optional
        The algorithm used to solve the standard form problem.
        :ref:`'highs' <optimize.linprog-highs>` (default),
        :ref:`'highs-ds' <optimize.linprog-highs-ds>`,
        :ref:`'highs-ipm' <optimize.linprog-highs-ipm>`,
        :ref:`'interior-point' <optimize.linprog-interior-point>` (legacy),
        :ref:`'revised simplex' <optimize.linprog-revised_simplex>` (legacy),
        and
        :ref:`'simplex' <optimize.linprog-simplex>` (legacy) are supported.
        The legacy methods are deprecated and will be removed in SciPy 1.11.0.
    callback : callable, optional
        If a callback function is provided, it will be called at least once per
        iteration of the algorithm. The callback function must accept a single
        `scipy.optimize.OptimizeResult` consisting of the following fields:

        x : 1-D array
            The current solution vector.
        fun : float
            The current value of the objective function ``c @ x``.
        success : bool
            ``True`` when the algorithm has completed successfully.
        slack : 1-D array
            The (nominally positive) values of the slack,
            ``b_ub - A_ub @ x``.
        con : 1-D array
            The (nominally zero) residuals of the equality constraints,
            ``b_eq - A_eq @ x``.
        phase : int
            The phase of the algorithm being executed.
        status : int
            An integer representing the status of the algorithm.

            ``0`` : Optimization proceeding nominally.

            ``1`` : Iteration limit reached.

            ``2`` : Problem appears to be infeasible.

            ``3`` : Problem appears to be unbounded.

            ``4`` : Numerical difficulties encountered.

        nit : int
            The current iteration number.
        message : str
            A string descriptor of the algorithm status.

        Callback functions are not currently supported by the HiGHS methods.

    options : dict, optional
        A dictionary of solver options. All methods accept the following
        options:

        maxiter : int
            Maximum number of iterations to perform.
            Default: see method-specific documentation.
        disp : bool
            Set to ``True`` to print convergence messages.
            Default: ``False``.
        presolve : bool
            Set to ``False`` to disable automatic presolve.
            Default: ``True``.

        All methods except the HiGHS solvers also accept:

        tol : float
            A tolerance which determines when a residual is "close enough" to
            zero to be considered exactly zero.
        autoscale : bool
            Set to ``True`` to automatically perform equilibration.
            Consider using this option if the numerical values in the
            constraints are separated by several orders of magnitude.
            Default: ``False``.
        rr : bool
            Set to ``False`` to disable automatic redundancy removal.
            Default: ``True``.
        rr_method : string
            Method used to identify and remove redundant rows from the
            equality constraint matrix after presolve. For problems with
            dense input, the available methods for redundancy removal are:

            "SVD":
                Repeatedly performs singular value decomposition on
                the matrix, detecting redundant rows based on nonzeros
                in the left singular vectors that correspond with
                zero singular values. May be fast when the matrix is
                nearly full rank.
            "pivot":
                Uses the algorithm presented in [5]_ to identify
                redundant rows.
            "ID":
                Uses a randomized interpolative decomposition.
                Identifies columns of the matrix transpose not used in
                a full-rank interpolative decomposition of the matrix.
            None:
                Uses "SVD" if the matrix is nearly full rank, that is,
                the difference between the matrix rank and the number
                of rows is less than five. If not, uses "pivot". The
                behavior of this default is subject to change without
                prior notice.

            Default: None.
            For problems with sparse input, this option is ignored, and the
            pivot-based algorithm presented in [5]_ is used.

        For method-specific options, see
        :func:`show_options('linprog') <show_options>`.

    x0 : 1-D array, optional
        Guess values of the decision variables, which will be refined by
        the optimization algorithm. This argument is currently used only by the
        'revised simplex' method, and can only be used if `x0` represents a
        basic feasible solution.

    integrality : 1-D array or int, optional
        Indicates the type of integrality constraint on each decision variable.

        ``0`` : Continuous variable; no integrality constraint.

        ``1`` : Integer variable; decision variable must be an integer
        within `bounds`.

        ``2`` : Semi-continuous variable; decision variable must be within
        `bounds` or take value ``0``.

        ``3`` : Semi-integer variable; decision variable must be an integer
        within `bounds` or take value ``0``.

        By default, all variables are continuous.

        For mixed integrality constraints, supply an array of shape `c.shape`.
        To infer a constraint on each decision variable from shorter inputs,
        the argument will be broadcast to `c.shape` using `np.broadcast_to`.

        This argument is currently used only by the ``'highs'`` method and
        ignored otherwise.

    Returns
    -------
    res : OptimizeResult
        A :class:`scipy.optimize.OptimizeResult` consisting of the fields
        below. Note that the return types of the fields may depend on whether
        the optimization was successful, therefore it is recommended to check
        `OptimizeResult.status` before relying on the other fields:

        x : 1-D array
            The values of the decision variables that minimize the
            objective function while satisfying the constraints.
        fun : float
            The optimal value of the objective function ``c @ x``.
        slack : 1-D array
            The (nominally positive) values of the slack variables,
            ``b_ub - A_ub @ x``.
        con : 1-D array
            The (nominally zero) residuals of the equality constraints,
            ``b_eq - A_eq @ x``.
        success : bool
            ``True`` when the algorithm succeeds in finding an optimal
            solution.
        status : int
            An integer representing the exit status of the algorithm.

            ``0`` : Optimization terminated successfully.

            ``1`` : Iteration limit reached.

            ``2`` : Problem appears to be infeasible.

            ``3`` : Problem appears to be unbounded.

            ``4`` : Numerical difficulties encountered.

        nit : int
            The total number of iterations performed in all phases.
        message : str
            A string descriptor of the exit status of the algorithm.

    See Also
    --------
    show_options : Additional options accepted by the solvers.

Notes
|
||||
-----
|
||||
This section describes the available solvers that can be selected by the
|
||||
'method' parameter.
|
||||
|
||||
`'highs-ds'` and
|
||||
`'highs-ipm'` are interfaces to the
|
||||
HiGHS simplex and interior-point method solvers [13]_, respectively.
|
||||
`'highs'` (default) chooses between
|
||||
the two automatically. These are the fastest linear
|
||||
programming solvers in SciPy, especially for large, sparse problems;
|
||||
which of these two is faster is problem-dependent.
|
||||
The other solvers (`'interior-point'`, `'revised simplex'`, and
|
||||
`'simplex'`) are legacy methods and will be removed in SciPy 1.11.0.
|
||||
|
||||
Method *highs-ds* is a wrapper of the C++ high performance dual
|
||||
revised simplex implementation (HSOL) [13]_, [14]_. Method *highs-ipm*
|
||||
is a wrapper of a C++ implementation of an **i**\ nterior-\ **p**\ oint
|
||||
**m**\ ethod [13]_; it features a crossover routine, so it is as accurate
|
||||
as a simplex solver. Method *highs* chooses between the two automatically.
|
||||
For new code involving `linprog`, we recommend explicitly choosing one of
|
||||
these three method values.
|
||||
|
||||
.. versionadded:: 1.6.0
|
||||
|
||||
Method *interior-point* uses the primal-dual path following algorithm
|
||||
as outlined in [4]_. This algorithm supports sparse constraint matrices and
|
||||
is typically faster than the simplex methods, especially for large, sparse
|
||||
problems. Note, however, that the solution returned may be slightly less
|
||||
accurate than those of the simplex methods and will not, in general,
|
||||
correspond with a vertex of the polytope defined by the constraints.
|
||||
|
||||
.. versionadded:: 1.0.0
|
||||
|
||||
Method *revised simplex* uses the revised simplex method as described in
|
||||
[9]_, except that a factorization [11]_ of the basis matrix, rather than
|
||||
its inverse, is efficiently maintained and used to solve the linear systems
|
||||
at each iteration of the algorithm.
|
||||
|
||||
.. versionadded:: 1.3.0
|
||||
|
||||
Method *simplex* uses a traditional, full-tableau implementation of
|
||||
Dantzig's simplex algorithm [1]_, [2]_ (*not* the
|
||||
Nelder-Mead simplex). This algorithm is included for backwards
|
||||
compatibility and educational purposes.
|
||||
|
||||
.. versionadded:: 0.15.0
|
||||
|
||||
Before applying *interior-point*, *revised simplex*, or *simplex*,
|
||||
a presolve procedure based on [8]_ attempts
|
||||
to identify trivial infeasibilities, trivial unboundedness, and potential
|
||||
problem simplifications. Specifically, it checks for:
|
||||
|
||||
- rows of zeros in ``A_eq`` or ``A_ub``, representing trivial constraints;
|
||||
- columns of zeros in ``A_eq`` `and` ``A_ub``, representing unconstrained
|
||||
variables;
|
||||
- column singletons in ``A_eq``, representing fixed variables; and
|
||||
- column singletons in ``A_ub``, representing simple bounds.
|
||||
|
||||
If presolve reveals that the problem is unbounded (e.g. an unconstrained
|
||||
and unbounded variable has negative cost) or infeasible (e.g., a row of
|
||||
zeros in ``A_eq`` corresponds with a nonzero in ``b_eq``), the solver
|
||||
terminates with the appropriate status code. Note that presolve terminates
|
||||
as soon as any sign of unboundedness is detected; consequently, a problem
|
||||
may be reported as unbounded when in reality the problem is infeasible
|
||||
(but infeasibility has not been detected yet). Therefore, if it is
|
||||
important to know whether the problem is actually infeasible, solve the
|
||||
problem again with option ``presolve=False``.
|
||||
|
||||
If neither infeasibility nor unboundedness is detected in a single pass
|
||||
of the presolve, bounds are tightened where possible and fixed
|
||||
variables are removed from the problem. Then, linearly dependent rows
|
||||
of the ``A_eq`` matrix are removed (unless they represent an
|
||||
infeasibility) to avoid numerical difficulties in the primary solve
|
||||
routine. Note that rows that are nearly linearly dependent (within a
|
||||
prescribed tolerance) may also be removed, which can change the optimal
|
||||
solution in rare cases. If this is a concern, eliminate redundancy from
|
||||
your problem formulation and run with option ``rr=False`` or
|
||||
``presolve=False``.
|
||||
|
||||
Several potential improvements can be made here: additional presolve
|
||||
checks outlined in [8]_ should be implemented, the presolve routine should
|
||||
be run multiple times (until no further simplifications can be made), and
|
||||
more of the efficiency improvements from [5]_ should be implemented in the
|
||||
redundancy removal routines.
|
||||
|
||||
After presolve, the problem is transformed to standard form by converting
|
||||
the (tightened) simple bounds to upper bound constraints, introducing
|
||||
non-negative slack variables for inequality constraints, and expressing
|
||||
unbounded variables as the difference between two non-negative variables.
|
||||
Optionally, the problem is automatically scaled via equilibration [12]_.
|
||||
The selected algorithm solves the standard form problem, and a
|
||||
postprocessing routine converts the result to a solution to the original
|
||||
problem.
|
||||
|
||||
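As a rough illustration of the standard-form transformation described above (not the code `linprog` itself runs), the sketch below appends one non-negative slack per inequality row and splits each free variable into the difference of two non-negative variables. The helper name ``to_standard_form`` and the ``free`` mask are assumptions made for this example, and simple bounds other than non-negativity are deliberately left out.

```python
import numpy as np


def to_standard_form(c, A_ub, b_ub, free):
    """Rewrite min c@x s.t. A_ub@x <= b_ub as min c_std@z s.t. A_std@z = b_ub, z >= 0.

    `free` is a boolean mask marking variables with no bounds; all other
    variables are assumed to be non-negative already (illustrative only).
    """
    c = np.asarray(c, dtype=float)
    A_ub = np.asarray(A_ub, dtype=float)
    b_ub = np.asarray(b_ub, dtype=float)
    free = np.asarray(free, dtype=bool)
    m = A_ub.shape[0]

    # Free variable x_j = x_j_plus - x_j_minus: the original column plays the
    # role of x_j_plus and a negated copy is appended for x_j_minus.
    A_split = np.hstack([A_ub, -A_ub[:, free]])
    c_split = np.concatenate([c, -c[free]])

    # One slack per inequality row turns A_ub @ x <= b_ub into an equality.
    A_std = np.hstack([A_split, np.eye(m)])
    c_std = np.concatenate([c_split, np.zeros(m)])
    return c_std, A_std, b_ub


c_std, A_std, b_std = to_standard_form(
    c=[-1, 4], A_ub=[[-3, 1], [1, 2]], b_ub=[6, 4], free=[True, False])

# The original point x = (1, 1) maps to z = (x0_plus, x1, x0_minus, s0, s1).
z = np.array([1.0, 1.0, 0.0, 8.0, 1.0])
print(A_std @ z)   # reproduces b_ub
print(c_std @ z)   # same objective value as c @ x
```

The slack values in ``z`` are simply ``b_ub - A_ub @ x``, so any feasible point of the inequality form has a non-negative counterpart in the equality form with the same objective value.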
References
|
||||
----------
|
||||
.. [1] Dantzig, George B., Linear programming and extensions. Rand
|
||||
Corporation Research Study Princeton Univ. Press, Princeton, NJ,
|
||||
1963
|
||||
.. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
|
||||
Mathematical Programming", McGraw-Hill, Chapter 4.
|
||||
.. [3] Bland, Robert G. New finite pivoting rules for the simplex method.
|
||||
Mathematics of Operations Research (2), 1977: pp. 103-107.
|
||||
.. [4] Andersen, Erling D., and Knud D. Andersen. "The MOSEK interior point
|
||||
optimizer for linear programming: an implementation of the
|
||||
homogeneous algorithm." High performance optimization. Springer US,
|
||||
2000. 197-232.
|
||||
.. [5] Andersen, Erling D. "Finding all linearly dependent rows in
|
||||
large-scale linear programming." Optimization Methods and Software
|
||||
6.3 (1995): 219-227.
|
||||
.. [6] Freund, Robert M. "Primal-Dual Interior-Point Methods for Linear
|
||||
Programming based on Newton's Method." Unpublished Course Notes,
|
||||
March 2004. Available 2/25/2017 at
|
||||
https://ocw.mit.edu/courses/sloan-school-of-management/15-084j-nonlinear-programming-spring-2004/lecture-notes/lec14_int_pt_mthd.pdf
|
||||
.. [7] Fourer, Robert. "Solving Linear Programs by Interior-Point Methods."
|
||||
Unpublished Course Notes, August 26, 2005. Available 2/25/2017 at
|
||||
http://www.4er.org/CourseNotes/Book%20B/B-III.pdf
|
||||
.. [8] Andersen, Erling D., and Knud D. Andersen. "Presolving in linear
|
||||
programming." Mathematical Programming 71.2 (1995): 221-245.
|
||||
.. [9] Bertsimas, Dimitris, and J. Tsitsiklis. "Introduction to linear
|
||||
programming." Athena Scientific 1 (1997): 997.
|
||||
.. [10] Andersen, Erling D., et al. Implementation of interior point
|
||||
methods for large scale linear programming. HEC/Universite de
|
||||
Geneve, 1996.
|
||||
.. [11] Bartels, Richard H. "A stabilization of the simplex method."
|
||||
Journal in Numerische Mathematik 16.5 (1971): 414-434.
|
||||
.. [12] Tomlin, J. A. "On scaling linear programming problems."
|
||||
Mathematical Programming Study 4 (1975): 146-166.
|
||||
.. [13] Huangfu, Q., Galabova, I., Feldmeier, M., and Hall, J. A. J.
|
||||
"HiGHS - high performance software for linear optimization."
|
||||
https://highs.dev/
|
||||
.. [14] Huangfu, Q. and Hall, J. A. J. "Parallelizing the dual revised
|
||||
simplex method." Mathematical Programming Computation, 10 (1),
|
||||
119-142, 2018. DOI: 10.1007/s12532-017-0130-5
|
||||
|
||||
Examples
|
||||
--------
|
||||
Consider the following problem:
|
||||
|
||||
.. math::
|
||||
|
||||
\min_{x_0, x_1} \ -x_0 + 4x_1 & \\
|
||||
\mbox{such that} \ -3x_0 + x_1 & \leq 6,\\
|
||||
-x_0 - 2x_1 & \geq -4,\\
|
||||
x_1 & \geq -3.
|
||||
|
||||
The problem is not presented in the form accepted by `linprog`. This is
|
||||
easily remedied by converting the "greater than" inequality
|
||||
constraint to a "less than" inequality constraint by
|
||||
multiplying both sides by a factor of :math:`-1`. Note also that the last
|
||||
constraint is really the simple bound :math:`-3 \leq x_1 \leq \infty`.
|
||||
Finally, since there are no bounds on :math:`x_0`, we must explicitly
|
||||
specify the bounds :math:`-\infty \leq x_0 \leq \infty`, as the
|
||||
default is for variables to be non-negative. After collecting coefficients
|
||||
into arrays and tuples, the input for this problem is:
|
||||
|
||||
>>> from scipy.optimize import linprog
|
||||
>>> c = [-1, 4]
|
||||
>>> A = [[-3, 1], [1, 2]]
|
||||
>>> b = [6, 4]
|
||||
>>> x0_bounds = (None, None)
|
||||
>>> x1_bounds = (-3, None)
|
||||
>>> res = linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds])
|
||||
>>> res.fun
|
||||
-22.0
|
||||
>>> res.x
|
||||
array([10., -3.])
|
||||
>>> res.message
|
||||
'Optimization terminated successfully. (HiGHS Status 7: Optimal)'
|
||||
|
||||
The marginals (AKA dual values / shadow prices / Lagrange multipliers)
|
||||
and residuals (slacks) are also available.
|
||||
|
||||
>>> res.ineqlin
|
||||
residual: [ 3.900e+01 0.000e+00]
|
||||
marginals: [-0.000e+00 -1.000e+00]
|
||||
|
||||
For example, because the marginal associated with the second inequality
|
||||
constraint is -1, we expect the optimal value of the objective function
|
||||
to decrease by ``eps`` if we add a small amount ``eps`` to the right hand
|
||||
side of the second inequality constraint:
|
||||
|
||||
>>> eps = 0.05
|
||||
>>> b[1] += eps
|
||||
>>> linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds]).fun
|
||||
-22.05
|
||||
|
||||
Also, because the residual on the first inequality constraint is 39, we
|
||||
can decrease the right hand side of the first constraint by 39 without
|
||||
affecting the optimal solution.
|
||||
|
||||
>>> b = [6, 4] # reset to original values
|
||||
>>> b[0] -= 39
|
||||
>>> linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds]).fun
|
||||
-22.0
|
||||
|
||||
"""
|
||||
|
||||
meth = method.lower()
|
||||
methods = {"highs", "highs-ds", "highs-ipm",
|
||||
"simplex", "revised simplex", "interior-point"}
|
||||
|
||||
if meth not in methods:
|
||||
raise ValueError(f"Unknown solver '{method}'")
|
||||
|
||||
if x0 is not None and meth != "revised simplex":
|
||||
warning_message = "x0 is used only when method is 'revised simplex'. "
|
||||
warn(warning_message, OptimizeWarning, stacklevel=2)
|
||||
|
||||
if np.any(integrality) and not meth == "highs":
|
||||
integrality = None
|
||||
warning_message = ("Only `method='highs'` supports integer "
|
||||
"constraints. Ignoring `integrality`.")
|
||||
warn(warning_message, OptimizeWarning, stacklevel=2)
|
||||
elif np.any(integrality):
|
||||
integrality = np.broadcast_to(integrality, np.shape(c))
|
||||
else:
|
||||
integrality = None
|
||||
|
||||
lp = _LPProblem(c, A_ub, b_ub, A_eq, b_eq, bounds, x0, integrality)
|
||||
lp, solver_options = _parse_linprog(lp, options, meth)
|
||||
tol = solver_options.get('tol', 1e-9)
|
||||
|
||||
# Give unmodified problem to HiGHS
|
||||
if meth.startswith('highs'):
|
||||
if callback is not None:
|
||||
raise NotImplementedError("HiGHS solvers do not support the "
|
||||
"callback interface.")
|
||||
highs_solvers = {'highs-ipm': 'ipm', 'highs-ds': 'simplex',
|
||||
'highs': None}
|
||||
|
||||
sol = _linprog_highs(lp, solver=highs_solvers[meth],
|
||||
**solver_options)
|
||||
sol['status'], sol['message'] = (
|
||||
_check_result(sol['x'], sol['fun'], sol['status'], sol['slack'],
|
||||
sol['con'], lp.bounds, tol, sol['message'],
|
||||
integrality))
|
||||
sol['success'] = sol['status'] == 0
|
||||
return OptimizeResult(sol)
|
||||
|
||||
warn(f"`method='{meth}'` is deprecated and will be removed in SciPy "
|
||||
"1.11.0. Please use one of the HiGHS solvers (e.g. "
|
||||
"`method='highs'`) in new code.", DeprecationWarning, stacklevel=2)
|
||||
|
||||
iteration = 0
|
||||
complete = False # will become True if solved in presolve
|
||||
undo = []
|
||||
|
||||
# Keep the original arrays to calculate slack/residuals for original
|
||||
# problem.
|
||||
lp_o = deepcopy(lp)
|
||||
|
||||
# Solve trivial problem, eliminate variables, tighten bounds, etc.
|
||||
rr_method = solver_options.pop('rr_method', None) # need to pop these;
|
||||
rr = solver_options.pop('rr', True) # they're not passed to methods
|
||||
c0 = 0 # we might get a constant term in the objective
|
||||
if solver_options.pop('presolve', True):
|
||||
(lp, c0, x, undo, complete, status, message) = _presolve(lp, rr,
|
||||
rr_method,
|
||||
tol)
|
||||
|
||||
C, b_scale = 1, 1 # for trivial unscaling if autoscale is not used
|
||||
postsolve_args = (lp_o._replace(bounds=lp.bounds), undo, C, b_scale)
|
||||
|
||||
if not complete:
|
||||
A, b, c, c0, x0 = _get_Abc(lp, c0)
|
||||
if solver_options.pop('autoscale', False):
|
||||
A, b, c, x0, C, b_scale = _autoscale(A, b, c, x0)
|
||||
postsolve_args = postsolve_args[:-2] + (C, b_scale)
|
||||
|
||||
if meth == 'simplex':
|
||||
x, status, message, iteration = _linprog_simplex(
|
||||
c, c0=c0, A=A, b=b, callback=callback,
|
||||
postsolve_args=postsolve_args, **solver_options)
|
||||
elif meth == 'interior-point':
|
||||
x, status, message, iteration = _linprog_ip(
|
||||
c, c0=c0, A=A, b=b, callback=callback,
|
||||
postsolve_args=postsolve_args, **solver_options)
|
||||
elif meth == 'revised simplex':
|
||||
x, status, message, iteration = _linprog_rs(
|
||||
c, c0=c0, A=A, b=b, x0=x0, callback=callback,
|
||||
postsolve_args=postsolve_args, **solver_options)
|
||||
|
||||
# Eliminate artificial variables, re-introduce presolved variables, etc.
|
||||
disp = solver_options.get('disp', False)
|
||||
|
||||
x, fun, slack, con = _postsolve(x, postsolve_args, complete)
|
||||
|
||||
status, message = _check_result(x, fun, status, slack, con, lp_o.bounds,
|
||||
tol, message, integrality)
|
||||
|
||||
if disp:
|
||||
_display_summary(message, status, fun, iteration)
|
||||
|
||||
sol = {
|
||||
'x': x,
|
||||
'fun': fun,
|
||||
'slack': slack,
|
||||
'con': con,
|
||||
'status': status,
|
||||
'message': message,
|
||||
'nit': iteration,
|
||||
'success': status == 0}
|
||||
|
||||
return OptimizeResult(sol)
|
||||
1434
venv/lib/python3.12/site-packages/scipy/optimize/_linprog_doc.py
Normal file
File diff suppressed because it is too large
@ -0,0 +1,440 @@
|
||||
"""HiGHS Linear Optimization Methods
|
||||
|
||||
Interface to HiGHS linear optimization software.
|
||||
https://highs.dev/
|
||||
|
||||
.. versionadded:: 1.5.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Q. Huangfu and J.A.J. Hall. "Parallelizing the dual revised simplex
|
||||
method." Mathematical Programming Computation, 10 (1), 119-142,
|
||||
2018. DOI: 10.1007/s12532-017-0130-5
|
||||
|
||||
"""
|
||||
|
||||
import inspect
|
||||
import numpy as np
|
||||
from ._optimize import OptimizeWarning, OptimizeResult
|
||||
from warnings import warn
|
||||
from ._highs._highs_wrapper import _highs_wrapper
|
||||
from ._highs._highs_constants import (
|
||||
CONST_INF,
|
||||
MESSAGE_LEVEL_NONE,
|
||||
HIGHS_OBJECTIVE_SENSE_MINIMIZE,
|
||||
|
||||
MODEL_STATUS_NOTSET,
|
||||
MODEL_STATUS_LOAD_ERROR,
|
||||
MODEL_STATUS_MODEL_ERROR,
|
||||
MODEL_STATUS_PRESOLVE_ERROR,
|
||||
MODEL_STATUS_SOLVE_ERROR,
|
||||
MODEL_STATUS_POSTSOLVE_ERROR,
|
||||
MODEL_STATUS_MODEL_EMPTY,
|
||||
MODEL_STATUS_OPTIMAL,
|
||||
MODEL_STATUS_INFEASIBLE,
|
||||
MODEL_STATUS_UNBOUNDED_OR_INFEASIBLE,
|
||||
MODEL_STATUS_UNBOUNDED,
|
||||
MODEL_STATUS_REACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND
|
||||
as MODEL_STATUS_RDOVUB,
|
||||
MODEL_STATUS_REACHED_OBJECTIVE_TARGET,
|
||||
MODEL_STATUS_REACHED_TIME_LIMIT,
|
||||
MODEL_STATUS_REACHED_ITERATION_LIMIT,
|
||||
|
||||
HIGHS_SIMPLEX_STRATEGY_DUAL,
|
||||
|
||||
HIGHS_SIMPLEX_CRASH_STRATEGY_OFF,
|
||||
|
||||
HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_CHOOSE,
|
||||
HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_DANTZIG,
|
||||
HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_DEVEX,
|
||||
HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE,
|
||||
)
|
||||
from scipy.sparse import csc_matrix, vstack, issparse
|
||||
|
||||
|
||||
def _highs_to_scipy_status_message(highs_status, highs_message):
|
||||
"""Converts HiGHS status number/message to SciPy status number/message"""
|
||||
|
||||
scipy_statuses_messages = {
|
||||
None: (4, "HiGHS did not provide a status code. "),
|
||||
MODEL_STATUS_NOTSET: (4, ""),
|
||||
MODEL_STATUS_LOAD_ERROR: (4, ""),
|
||||
MODEL_STATUS_MODEL_ERROR: (2, ""),
|
||||
MODEL_STATUS_PRESOLVE_ERROR: (4, ""),
|
||||
MODEL_STATUS_SOLVE_ERROR: (4, ""),
|
||||
MODEL_STATUS_POSTSOLVE_ERROR: (4, ""),
|
||||
MODEL_STATUS_MODEL_EMPTY: (4, ""),
|
||||
MODEL_STATUS_RDOVUB: (4, ""),
|
||||
MODEL_STATUS_REACHED_OBJECTIVE_TARGET: (4, ""),
|
||||
MODEL_STATUS_OPTIMAL: (0, "Optimization terminated successfully. "),
|
||||
MODEL_STATUS_REACHED_TIME_LIMIT: (1, "Time limit reached. "),
|
||||
MODEL_STATUS_REACHED_ITERATION_LIMIT: (1, "Iteration limit reached. "),
|
||||
MODEL_STATUS_INFEASIBLE: (2, "The problem is infeasible. "),
|
||||
MODEL_STATUS_UNBOUNDED: (3, "The problem is unbounded. "),
|
||||
MODEL_STATUS_UNBOUNDED_OR_INFEASIBLE: (4, "The problem is unbounded "
|
||||
"or infeasible. ")}
|
||||
unrecognized = (4, "The HiGHS status code was not recognized. ")
|
||||
scipy_status, scipy_message = (
|
||||
scipy_statuses_messages.get(highs_status, unrecognized))
|
||||
scipy_message = (f"{scipy_message}"
|
||||
f"(HiGHS Status {highs_status}: {highs_message})")
|
||||
return scipy_status, scipy_message
|
||||
|
||||
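The effect of this status translation can be observed through the public `linprog` interface. The sketch below is only an illustration, assuming SciPy is installed with the HiGHS solvers available; the second problem is a made-up infeasible case, and the exact non-zero status it reports is left unasserted because it depends on how HiGHS classifies the problem.

```python
from scipy.optimize import linprog

# Feasible problem taken from the `linprog` docstring example.
ok = linprog([-1, 4], A_ub=[[-3, 1], [1, 2]], b_ub=[6, 4],
             bounds=[(None, None), (-3, None)], method="highs")

# x0 <= -1 together with the bound x0 >= 0 has no solution.
bad = linprog([1], A_ub=[[1]], b_ub=[-1], bounds=[(0, None)],
              method="highs")

for res in (ok, bad):
    # `status` is the SciPy code; `message` embeds the HiGHS status text.
    print(res.status, res.success, res.message)
```

Checking ``res.success`` (or ``res.status == 0``) before using ``res.x`` is the safer pattern, since ``x`` may be ``None`` when no solution is available.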
|
||||
def _replace_inf(x):
|
||||
# Replace `np.inf` with CONST_INF
|
||||
infs = np.isinf(x)
|
||||
with np.errstate(invalid="ignore"):
|
||||
x[infs] = np.sign(x[infs])*CONST_INF
|
||||
return x
|
||||
|
||||
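The infinity replacement performed by `_replace_inf` can be sketched in isolation as follows. The constant ``BIG`` is a stand-in chosen for this example (the real value of ``CONST_INF`` comes from the HiGHS constants module and is not shown here), and unlike the function above this version works on a copy instead of modifying its argument in place.

```python
import numpy as np

BIG = 1e30  # hypothetical stand-in for CONST_INF


def replace_inf(x, big=BIG):
    """Return a float copy of `x` with +/-inf replaced by +/-big."""
    x = np.array(x, dtype=float)          # np.array copies by default
    infs = np.isinf(x)
    x[infs] = np.sign(x[infs]) * big      # keep the sign of each infinity
    return x


bounds = replace_inf([0.0, np.inf, -np.inf, 2.5])
print(bounds)
```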
|
||||
def _convert_to_highs_enum(option, option_str, choices):
|
||||
# If option is in the choices we can look it up, if not use
|
||||
# the default value taken from function signature and warn:
|
||||
try:
|
||||
return choices[option.lower()]
|
||||
except AttributeError:
|
||||
return choices[option]
|
||||
except KeyError:
|
||||
sig = inspect.signature(_linprog_highs)
|
||||
default_str = sig.parameters[option_str].default
|
||||
warn(f"Option {option_str} is {option}, but only values in "
|
||||
f"{set(choices.keys())} are allowed. Using default: "
|
||||
f"{default_str}.",
|
||||
OptimizeWarning, stacklevel=3)
|
||||
return choices[default_str]
|
||||
|
||||
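The lookup-with-fallback pattern used by `_convert_to_highs_enum` can be sketched on its own. Everything named here (``lookup_choice``, ``CHOICES`` and its integer codes) is hypothetical and only stands in for the HiGHS enum constants; the sketch also normalizes the key before the lookup, so string and non-string options share one fallback path, which is a design choice of this example and not a description of the function above.

```python
import warnings

# Hypothetical integer codes standing in for the HiGHS enum constants.
CHOICES = {"dantzig": 1, "devex": 2, "steepest": 3, None: None}


def lookup_choice(option, name, choices, default=None):
    """Return choices[option] (case-insensitive for strings), else the default."""
    key = option.lower() if isinstance(option, str) else option
    try:
        return choices[key]
    except KeyError:
        warnings.warn(
            f"Option {name} is {option!r}, but only values in "
            f"{list(choices)} are allowed. Using default: {default!r}.",
            UserWarning, stacklevel=2)
        return choices[default]


print(lookup_choice("DEVEX", "strategy", CHOICES))
```

Passing an unknown value such as ``"bogus"`` emits a single warning and returns the entry for ``default``.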
|
||||
def _linprog_highs(lp, solver, time_limit=None, presolve=True,
|
||||
disp=False, maxiter=None,
|
||||
dual_feasibility_tolerance=None,
|
||||
primal_feasibility_tolerance=None,
|
||||
ipm_optimality_tolerance=None,
|
||||
simplex_dual_edge_weight_strategy=None,
|
||||
mip_rel_gap=None,
|
||||
mip_max_nodes=None,
|
||||
**unknown_options):
|
||||
r"""
|
||||
Solve a linear programming problem using one of the HiGHS
|
||||
solvers.
|
||||
|
||||
User-facing documentation is in _linprog_doc.py.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
lp : _LPProblem
|
||||
A ``scipy.optimize._linprog_util._LPProblem`` ``namedtuple``.
|
||||
solver : "ipm" or "simplex" or None
|
||||
Which HiGHS solver to use. If ``None``, "simplex" will be used.
|
||||
|
||||
Options
|
||||
-------
|
||||
maxiter : int
|
||||
The maximum number of iterations to perform in either phase. For
|
||||
``solver='ipm'``, this does not include the number of crossover
|
||||
iterations. Default is the largest possible value for an ``int``
|
||||
on the platform.
|
||||
disp : bool
|
||||
Set to ``True`` if indicators of optimization status are to be printed
|
||||
to the console each iteration; default ``False``.
|
||||
time_limit : float
|
||||
The maximum time in seconds allotted to solve the problem; default is
|
||||
the largest possible value for a ``double`` on the platform.
|
||||
presolve : bool
|
||||
Presolve attempts to identify trivial infeasibilities,
|
||||
identify trivial unboundedness, and simplify the problem before
|
||||
sending it to the main solver. It is generally recommended
|
||||
to keep the default setting ``True``; set to ``False`` if presolve is
|
||||
to be disabled.
|
||||
dual_feasibility_tolerance : double
|
||||
Dual feasibility tolerance. Default is 1e-07.
|
||||
The minimum of this and ``primal_feasibility_tolerance``
|
||||
is used for the feasibility tolerance when ``solver='ipm'``.
|
||||
primal_feasibility_tolerance : double
|
||||
Primal feasibility tolerance. Default is 1e-07.
|
||||
The minimum of this and ``dual_feasibility_tolerance``
|
||||
is used for the feasibility tolerance when ``solver='ipm'``.
|
||||
ipm_optimality_tolerance : double
|
||||
Optimality tolerance for ``solver='ipm'``. Default is 1e-08.
|
||||
Minimum allowable value is 1e-12; the value must also be smaller than the largest
|
||||
possible value for a ``double`` on the platform.
|
||||
simplex_dual_edge_weight_strategy : str (default: None)
|
||||
Strategy for simplex dual edge weights. The default, ``None``,
|
||||
automatically selects one of the following.
|
||||
|
||||
``'dantzig'`` uses Dantzig's original strategy of choosing the most
|
||||
negative reduced cost.
|
||||
|
||||
``'devex'`` uses the strategy described in [15]_.
|
||||
|
||||
``'steepest'`` uses the exact steepest edge strategy as described in
|
||||
[16]_.
|
||||
|
||||
``'steepest-devex'`` begins with the exact steepest edge strategy
|
||||
until the computation is too costly or inexact and then switches to
|
||||
the devex method.
|
||||
|
||||
Currently, using ``None`` always selects ``'steepest-devex'``, but this
|
||||
may change as new options become available.
|
||||
|
||||
mip_max_nodes : int
|
||||
The maximum number of nodes allotted to solve the problem; default is
|
||||
the largest possible value for a ``HighsInt`` on the platform.
|
||||
Ignored if not using the MIP solver.
|
||||
unknown_options : dict
|
||||
Optional arguments not used by this particular solver. If
|
||||
``unknown_options`` is non-empty, a warning is issued listing all
|
||||
unused options.
|
||||
|
||||
Returns
|
||||
-------
|
||||
sol : dict
|
||||
A dictionary consisting of the fields:
|
||||
|
||||
x : 1D array
|
||||
The values of the decision variables that minimize the
|
||||
objective function while satisfying the constraints.
|
||||
fun : float
|
||||
The optimal value of the objective function ``c @ x``.
|
||||
slack : 1D array
|
||||
The (nominally positive) values of the slack,
|
||||
``b_ub - A_ub @ x``.
|
||||
con : 1D array
|
||||
The (nominally zero) residuals of the equality constraints,
|
||||
``b_eq - A_eq @ x``.
|
||||
success : bool
|
||||
``True`` when the algorithm succeeds in finding an optimal
|
||||
solution.
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
|
||||
``0`` : Optimization terminated successfully.
|
||||
|
||||
``1`` : Iteration or time limit reached.
|
||||
|
||||
``2`` : Problem appears to be infeasible.
|
||||
|
||||
``3`` : Problem appears to be unbounded.
|
||||
|
||||
``4`` : The HiGHS solver ran into a problem.
|
||||
|
||||
message : str
|
||||
A string descriptor of the exit status of the algorithm.
|
||||
nit : int
|
||||
The total number of iterations performed.
|
||||
For ``solver='simplex'``, this includes iterations in all
|
||||
phases. For ``solver='ipm'``, this does not include
|
||||
crossover iterations.
|
||||
crossover_nit : int
|
||||
The number of primal/dual pushes performed during the
|
||||
crossover routine for ``solver='ipm'``. This is ``0``
|
||||
for ``solver='simplex'``.
|
||||
ineqlin : OptimizeResult
|
||||
Solution and sensitivity information corresponding to the
|
||||
inequality constraints, `b_ub`. A dictionary consisting of the
|
||||
fields:
|
||||
|
||||
residual : np.ndarray
|
||||
The (nominally positive) values of the slack variables,
|
||||
``b_ub - A_ub @ x``. This quantity is also commonly
|
||||
referred to as "slack".
|
||||
|
||||
marginals : np.ndarray
|
||||
The sensitivity (partial derivative) of the objective
|
||||
function with respect to the right-hand side of the
|
||||
inequality constraints, `b_ub`.
|
||||
|
||||
eqlin : OptimizeResult
|
||||
Solution and sensitivity information corresponding to the
|
||||
equality constraints, `b_eq`. A dictionary consisting of the
|
||||
fields:
|
||||
|
||||
residual : np.ndarray
|
||||
The (nominally zero) residuals of the equality constraints,
|
||||
``b_eq - A_eq @ x``.
|
||||
|
||||
marginals : np.ndarray
|
||||
The sensitivity (partial derivative) of the objective
|
||||
function with respect to the right-hand side of the
|
||||
equality constraints, `b_eq`.
|
||||
|
||||
lower, upper : OptimizeResult
|
||||
Solution and sensitivity information corresponding to the
|
||||
lower and upper bounds on decision variables, `bounds`.
|
||||
|
||||
residual : np.ndarray
|
||||
The (nominally positive) values of the quantity
|
||||
``x - lb`` (lower) or ``ub - x`` (upper).
|
||||
|
||||
marginals : np.ndarray
|
||||
The sensitivity (partial derivative) of the objective
|
||||
function with respect to the lower and upper
|
||||
`bounds`.
|
||||
|
||||
mip_node_count : int
|
||||
The number of subproblems or "nodes" solved by the MILP
|
||||
solver. Only present when `integrality` is not `None`.
|
||||
|
||||
mip_dual_bound : float
|
||||
The MILP solver's final estimate of the lower bound on the
|
||||
optimal solution. Only present when `integrality` is not
|
||||
`None`.
|
||||
|
||||
mip_gap : float
|
||||
The difference between the final objective function value
|
||||
and the final dual bound, scaled by the final objective
|
||||
function value. Only present when `integrality` is not
|
||||
`None`.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The result fields `ineqlin`, `eqlin`, `lower`, and `upper` all contain
|
||||
`marginals`, or partial derivatives of the objective function with respect
|
||||
to the right-hand side of each constraint. These partial derivatives are
|
||||
also referred to as "Lagrange multipliers", "dual values", and
|
||||
"shadow prices". The sign convention of `marginals` is opposite that
|
||||
of Lagrange multipliers produced by many nonlinear solvers.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [15] Harris, Paula MJ. "Pivot selection methods of the Devex LP code."
|
||||
Mathematical programming 5.1 (1973): 1-28.
|
||||
.. [16] Goldfarb, Donald, and John Ker Reid. "A practicable steepest-edge
|
||||
simplex algorithm." Mathematical Programming 12.1 (1977): 361-371.
|
||||
"""
|
||||
if unknown_options:
|
||||
message = (f"Unrecognized options detected: {unknown_options}. "
|
||||
"These will be passed to HiGHS verbatim.")
|
||||
warn(message, OptimizeWarning, stacklevel=3)
|
||||
|
||||
# Map options to HiGHS enum values
|
||||
simplex_dual_edge_weight_strategy_enum = _convert_to_highs_enum(
|
||||
simplex_dual_edge_weight_strategy,
|
||||
'simplex_dual_edge_weight_strategy',
|
||||
choices={'dantzig': HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_DANTZIG,
|
||||
'devex': HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_DEVEX,
|
||||
'steepest-devex': HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_CHOOSE,
|
||||
'steepest':
|
||||
HIGHS_SIMPLEX_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE,
|
||||
None: None})
|
||||
|
||||
c, A_ub, b_ub, A_eq, b_eq, bounds, x0, integrality = lp
|
||||
|
||||
lb, ub = bounds.T.copy() # separate bounds; the copy makes them C-contiguous
|
||||
# highs_wrapper solves LHS <= A*x <= RHS, not equality constraints
|
||||
with np.errstate(invalid="ignore"):
|
||||
lhs_ub = -np.ones_like(b_ub)*np.inf # LHS of UB constraints is -inf
|
||||
rhs_ub = b_ub # RHS of UB constraints is b_ub
|
||||
lhs_eq = b_eq # Equality constraint is inequality
|
||||
rhs_eq = b_eq # constraint with LHS=RHS
|
||||
lhs = np.concatenate((lhs_ub, lhs_eq))
|
||||
rhs = np.concatenate((rhs_ub, rhs_eq))
|
||||
|
||||
if issparse(A_ub) or issparse(A_eq):
|
||||
A = vstack((A_ub, A_eq))
|
||||
else:
|
||||
A = np.vstack((A_ub, A_eq))
|
||||
A = csc_matrix(A)
|
||||
|
||||
options = {
|
||||
'presolve': presolve,
|
||||
'sense': HIGHS_OBJECTIVE_SENSE_MINIMIZE,
|
||||
'solver': solver,
|
||||
'time_limit': time_limit,
|
||||
'highs_debug_level': MESSAGE_LEVEL_NONE,
|
||||
'dual_feasibility_tolerance': dual_feasibility_tolerance,
|
||||
'ipm_optimality_tolerance': ipm_optimality_tolerance,
|
||||
'log_to_console': disp,
|
||||
'mip_max_nodes': mip_max_nodes,
|
||||
'output_flag': disp,
|
||||
'primal_feasibility_tolerance': primal_feasibility_tolerance,
|
||||
'simplex_dual_edge_weight_strategy':
|
||||
simplex_dual_edge_weight_strategy_enum,
|
||||
'simplex_strategy': HIGHS_SIMPLEX_STRATEGY_DUAL,
|
||||
'simplex_crash_strategy': HIGHS_SIMPLEX_CRASH_STRATEGY_OFF,
|
||||
'ipm_iteration_limit': maxiter,
|
||||
'simplex_iteration_limit': maxiter,
|
||||
'mip_rel_gap': mip_rel_gap,
|
||||
}
|
||||
options.update(unknown_options)
|
||||
|
||||
# np.inf doesn't work; use very large constant
|
||||
rhs = _replace_inf(rhs)
|
||||
lhs = _replace_inf(lhs)
|
||||
lb = _replace_inf(lb)
|
||||
ub = _replace_inf(ub)
|
||||
|
||||
if integrality is None or np.sum(integrality) == 0:
|
||||
integrality = np.empty(0)
|
||||
else:
|
||||
integrality = np.array(integrality)
|
||||
|
||||
res = _highs_wrapper(c, A.indptr, A.indices, A.data, lhs, rhs,
|
||||
lb, ub, integrality.astype(np.uint8), options)
|
||||
|
||||
# HiGHS represents constraints as lhs/rhs, so
|
||||
# Ax + s = b => Ax = b - s
|
||||
# and we need to split up s by A_ub and A_eq
|
||||
if 'slack' in res:
|
||||
slack = res['slack']
|
||||
con = np.array(slack[len(b_ub):])
|
||||
slack = np.array(slack[:len(b_ub)])
|
||||
else:
|
||||
slack, con = None, None
|
||||
|
||||
# lagrange multipliers for equalities/inequalities and upper/lower bounds
|
||||
if 'lambda' in res:
|
||||
lamda = res['lambda']
|
||||
marg_ineqlin = np.array(lamda[:len(b_ub)])
|
||||
marg_eqlin = np.array(lamda[len(b_ub):])
|
||||
marg_upper = np.array(res['marg_bnds'][1, :])
|
||||
marg_lower = np.array(res['marg_bnds'][0, :])
|
||||
else:
|
||||
marg_ineqlin, marg_eqlin = None, None
|
||||
marg_upper, marg_lower = None, None
|
||||
|
||||
# this needs to be updated if we start choosing the solver intelligently
|
||||
|
||||
# Convert to scipy-style status and message
|
||||
highs_status = res.get('status', None)
|
||||
highs_message = res.get('message', None)
|
||||
status, message = _highs_to_scipy_status_message(highs_status,
|
||||
highs_message)
|
||||
|
||||
x = np.array(res['x']) if 'x' in res else None
|
||||
sol = {'x': x,
|
||||
'slack': slack,
|
||||
'con': con,
|
||||
'ineqlin': OptimizeResult({
|
||||
'residual': slack,
|
||||
'marginals': marg_ineqlin,
|
||||
}),
|
||||
'eqlin': OptimizeResult({
|
||||
'residual': con,
|
||||
'marginals': marg_eqlin,
|
||||
}),
|
||||
'lower': OptimizeResult({
|
||||
'residual': None if x is None else x - lb,
|
||||
'marginals': marg_lower,
|
||||
}),
|
||||
'upper': OptimizeResult({
|
||||
'residual': None if x is None else ub - x,
|
||||
'marginals': marg_upper
|
||||
}),
|
||||
'fun': res.get('fun'),
|
||||
'status': status,
|
||||
'success': res['status'] == MODEL_STATUS_OPTIMAL,
|
||||
'message': message,
|
||||
'nit': res.get('simplex_nit', 0) or res.get('ipm_nit', 0),
|
||||
'crossover_nit': res.get('crossover_nit'),
|
||||
}
|
||||
|
||||
if np.any(x) and integrality is not None:
|
||||
sol.update({
|
||||
'mip_node_count': res.get('mip_node_count', 0),
|
||||
'mip_dual_bound': res.get('mip_dual_bound', 0.0),
|
||||
'mip_gap': res.get('mip_gap', 0.0),
|
||||
})
|
||||
|
||||
return sol
|
||||
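The row-bound formulation used in `_linprog_highs` (every constraint written as ``lhs <= A @ x <= rhs``) and the later split of the combined slack vector can be sketched with plain NumPy. The helper name ``stack_row_bounds`` and the small problem are assumptions for illustration; the real function additionally converts ``A`` to a sparse CSC matrix and replaces infinities with a large constant before calling the wrapper.

```python
import numpy as np


def stack_row_bounds(A_ub, b_ub, A_eq, b_eq):
    """Stack inequality and equality rows as lhs <= A @ x <= rhs."""
    A_ub = np.asarray(A_ub, dtype=float)
    A_eq = np.asarray(A_eq, dtype=float)
    b_ub = np.asarray(b_ub, dtype=float)
    b_eq = np.asarray(b_eq, dtype=float)

    # Inequality rows have no lower limit; equality rows have lhs == rhs.
    lhs = np.concatenate([np.full(len(b_ub), -np.inf), b_eq])
    rhs = np.concatenate([b_ub, b_eq])
    A = np.vstack([A_ub, A_eq])
    return A, lhs, rhs


A, lhs, rhs = stack_row_bounds(A_ub=[[1, 1]], b_ub=[4],
                               A_eq=[[1, -1]], b_eq=[-1])

x = np.array([1.0, 2.0])
combined = rhs - A @ x          # one value per stacked row
n_ub = 1                        # number of inequality rows
slack, con = combined[:n_ub], combined[n_ub:]
print(slack, con)               # inequality slack, equality residual
```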
1126
venv/lib/python3.12/site-packages/scipy/optimize/_linprog_ip.py
Normal file
File diff suppressed because it is too large
572
venv/lib/python3.12/site-packages/scipy/optimize/_linprog_rs.py
Normal file
@ -0,0 +1,572 @@
|
||||
"""Revised simplex method for linear programming
|
||||
|
||||
The *revised simplex* method uses the method described in [1]_, except
|
||||
that a factorization [2]_ of the basis matrix, rather than its inverse,
|
||||
is efficiently maintained and used to solve the linear systems at each
|
||||
iteration of the algorithm.
|
||||
|
||||
.. versionadded:: 1.3.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Bertsimas, Dimitris, and J. Tsitsiklis. "Introduction to linear
|
||||
programming." Athena Scientific 1 (1997): 997.
|
||||
.. [2] Bartels, Richard H. "A stabilization of the simplex method."
|
||||
Journal in Numerische Mathematik 16.5 (1971): 414-434.
|
||||
|
||||
"""
|
||||
# Author: Matt Haberland
|
||||
|
||||
import numpy as np
|
||||
from numpy.linalg import LinAlgError
|
||||
|
||||
from scipy.linalg import solve
|
||||
from ._optimize import _check_unknown_options
|
||||
from ._bglu_dense import LU
|
||||
from ._bglu_dense import BGLU as BGLU
|
||||
from ._linprog_util import _postsolve
|
||||
from ._optimize import OptimizeResult
|
||||
|
||||
|
||||
def _phase_one(A, b, x0, callback, postsolve_args, maxiter, tol, disp,
               maxupdate, mast, pivot):
    """
    The purpose of phase one is to find an initial basic feasible solution
    (BFS) to the original problem.

    Generates an auxiliary problem with a trivial BFS and an objective that
    minimizes infeasibility of the original problem. Solves the auxiliary
    problem using the main simplex routine (phase two). This either yields
    a BFS to the original problem or determines that the original problem is
    infeasible. If feasible, phase one detects redundant rows in the original
    constraint matrix and removes them, then chooses additional indices as
    necessary to complete a basis/BFS for the original problem.
    """

    m, n = A.shape
    status = 0

    # generate auxiliary problem to get initial BFS
    A, b, c, basis, x, status = _generate_auxiliary_problem(A, b, x0, tol)

    if status == 6:
        residual = c.dot(x)
        iter_k = 0
        return x, basis, A, b, residual, status, iter_k

    # solve auxiliary problem
    phase_one_n = n
    iter_k = 0
    x, basis, status, iter_k = _phase_two(c, A, x, basis, callback,
                                          postsolve_args,
                                          maxiter, tol, disp,
                                          maxupdate, mast, pivot,
                                          iter_k, phase_one_n)

    # check for infeasibility
    residual = c.dot(x)
    if status == 0 and residual > tol:
        status = 2

    # drive artificial variables out of basis
    # TODO: test redundant row removal better
    # TODO: make solve more efficient with BGLU? This could take a while.
    keep_rows = np.ones(m, dtype=bool)
    for basis_column in basis[basis >= n]:
        B = A[:, basis]
        try:
            basis_finder = np.abs(solve(B, A))  # inefficient
            pertinent_row = np.argmax(basis_finder[:, basis_column])
            eligible_columns = np.ones(n, dtype=bool)
            eligible_columns[basis[basis < n]] = 0
            eligible_column_indices = np.where(eligible_columns)[0]
            index = np.argmax(basis_finder[:, :n]
                              [pertinent_row, eligible_columns])
            new_basis_column = eligible_column_indices[index]
            if basis_finder[pertinent_row, new_basis_column] < tol:
                keep_rows[pertinent_row] = False
            else:
                basis[basis == basis_column] = new_basis_column
        except LinAlgError:
            status = 4

    # form solution to original problem
    A = A[keep_rows, :n]
    basis = basis[keep_rows]
    x = x[:n]
    m = A.shape[0]
    return x, basis, A, b, residual, status, iter_k


def _get_more_basis_columns(A, basis):
    """
    Called when the auxiliary problem terminates with artificial columns in
    the basis, which must be removed and replaced with non-artificial
    columns. Finds additional columns that do not make the matrix singular.
    """
    m, n = A.shape

    # options for inclusion are those that aren't already in the basis
    a = np.arange(m+n)
    bl = np.zeros(len(a), dtype=bool)
    bl[basis] = 1
    options = a[~bl]
    options = options[options < n]  # and they have to be non-artificial

    # form basis matrix
    B = np.zeros((m, m))
    B[:, 0:len(basis)] = A[:, basis]

    if (basis.size > 0 and
            np.linalg.matrix_rank(B[:, :len(basis)]) < len(basis)):
        raise Exception("Basis has dependent columns")

    rank = 0  # just enter the loop
    for i in range(n):  # somewhat arbitrary, but we need another way out
        # permute the options, and take as many as needed
        new_basis = np.random.permutation(options)[:m-len(basis)]
        B[:, len(basis):] = A[:, new_basis]  # update the basis matrix
        rank = np.linalg.matrix_rank(B)  # check the rank
        if rank == m:
            break

    return np.concatenate((basis, new_basis))


def _generate_auxiliary_problem(A, b, x0, tol):
    """
    Modifies original problem to create an auxiliary problem with a trivial
    initial basic feasible solution and an objective that minimizes
    infeasibility in the original problem.

    Conceptually, this is done by stacking an identity matrix on the right of
    the original constraint matrix, adding artificial variables to correspond
    with each of these new columns, and generating a cost vector that is all
    zeros except for ones corresponding with each of the new variables.

    An initial basic feasible solution is trivial: all variables are zero
    except for the artificial variables, which are set equal to the
    corresponding element of the right hand side `b`.

    Running the simplex method on this auxiliary problem drives all of the
    artificial variables - and thus the cost - to zero if the original problem
    is feasible. The original problem is declared infeasible otherwise.

    Much of the complexity below is to improve efficiency by using singleton
    columns in the original problem where possible, thus generating artificial
    variables only as necessary, and using an initial 'guess' basic feasible
    solution.
    """
    status = 0
    m, n = A.shape

    if x0 is not None:
        x = x0
    else:
        x = np.zeros(n)

    r = b - A@x  # residual; this must be all zeros for feasibility

    A[r < 0] = -A[r < 0]  # express problem with RHS positive for trivial BFS
    b[r < 0] = -b[r < 0]  # to the auxiliary problem
    r[r < 0] *= -1

    # Rows which we will need to find a trivial way to zero.
    # This should just be the rows where there is a nonzero residual.
    # But then we would not necessarily have a column singleton in every row.
    # This makes it difficult to find an initial basis.
    if x0 is None:
        nonzero_constraints = np.arange(m)
    else:
        nonzero_constraints = np.where(r > tol)[0]

    # these are (at least some of) the initial basis columns
    basis = np.where(np.abs(x) > tol)[0]

    if len(nonzero_constraints) == 0 and len(basis) <= m:  # already a BFS
        c = np.zeros(n)
        basis = _get_more_basis_columns(A, basis)
        return A, b, c, basis, x, status
    elif (len(nonzero_constraints) > m - len(basis) or
          np.any(x < 0)):  # can't get trivial BFS
        c = np.zeros(n)
        status = 6
        return A, b, c, basis, x, status

    # chooses existing columns appropriate for inclusion in initial basis
    cols, rows = _select_singleton_columns(A, r)

    # find the rows we need to zero that we _can_ zero with column singletons
    i_tofix = np.isin(rows, nonzero_constraints)
    # these columns can't already be in the basis, though
    # we are going to add them to the basis and change the corresponding x val
    i_notinbasis = np.logical_not(np.isin(cols, basis))
    i_fix_without_aux = np.logical_and(i_tofix, i_notinbasis)
    rows = rows[i_fix_without_aux]
    cols = cols[i_fix_without_aux]

    # indices of the rows we can only zero with auxiliary variable
    # these rows will get a one in each auxiliary column
    arows = nonzero_constraints[np.logical_not(
                                np.isin(nonzero_constraints, rows))]
    n_aux = len(arows)
    acols = n + np.arange(n_aux)  # indices of auxiliary columns

    basis_ng = np.concatenate((cols, acols))  # basis columns not from guess
    basis_ng_rows = np.concatenate((rows, arows))  # rows we need to zero

    # add auxiliary singleton columns
    A = np.hstack((A, np.zeros((m, n_aux))))
    A[arows, acols] = 1

    # generate initial BFS
    x = np.concatenate((x, np.zeros(n_aux)))
    x[basis_ng] = r[basis_ng_rows]/A[basis_ng_rows, basis_ng]

    # generate costs to minimize infeasibility
    c = np.zeros(n_aux + n)
    c[acols] = 1

    # basis columns correspond with nonzeros in guess, those with column
    # singletons we used to zero remaining constraints, and any additional
    # columns to get a full set (m columns)
    basis = np.concatenate((basis, basis_ng))
    basis = _get_more_basis_columns(A, basis)  # add columns as needed

    return A, b, c, basis, x, status


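The conceptual construction described in the docstring above (identity block on the right, unit costs on the artificial variables, artificial variables set to the right-hand side) can be sketched as follows. This is a simplified illustration, not the routine above: it uses plain lists rather than NumPy, adds one artificial variable per row instead of reusing singleton columns, and `build_auxiliary_problem` is a hypothetical name.

```python
def build_auxiliary_problem(A, b):
    """Return (A_aux, b_aux, c_aux, x0, basis) for a textbook phase-one problem."""
    m, n = len(A), len(A[0])
    A_aux, b_aux = [], []
    for i, (row, rhs) in enumerate(zip(A, b)):
        sign = -1 if rhs < 0 else 1  # flip rows so the right-hand side is nonnegative
        identity_row = [1 if k == i else 0 for k in range(m)]
        A_aux.append([sign * a for a in row] + identity_row)
        b_aux.append(sign * rhs)
    c_aux = [0] * n + [1] * m      # cost only on the artificial variables
    x0 = [0] * n + b_aux           # artificial variables carry the right-hand side
    basis = list(range(n, n + m))  # the artificial columns form the initial basis
    return A_aux, b_aux, c_aux, x0, basis


A_aux, b_aux, c_aux, x0, basis = build_auxiliary_problem([[1, 2], [3, -1]], [4, -5])
```

The second row is negated because its right-hand side is negative, so the starting point `x0` is nonnegative and satisfies `A_aux @ x0 == b_aux` by construction.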
def _select_singleton_columns(A, b):
    """
    Finds singleton columns for which the singleton entry is of the same sign
    as the right-hand side; these columns are eligible for inclusion in an
    initial basis. Determines the rows in which the singleton entries are
    located. For each of these rows, returns the indices of the one singleton
    column and its corresponding row.
    """
    # find indices of all singleton columns and corresponding row indices
    column_indices = np.nonzero(np.sum(np.abs(A) != 0, axis=0) == 1)[0]
    columns = A[:, column_indices]  # array of singleton columns
    row_indices = np.zeros(len(column_indices), dtype=int)
    nonzero_rows, nonzero_columns = np.nonzero(columns)
    row_indices[nonzero_columns] = nonzero_rows  # corresponding row indices

    # keep only singletons with entries that have same sign as RHS
    # this is necessary because all elements of BFS must be non-negative
    same_sign = A[row_indices, column_indices]*b[row_indices] >= 0
    column_indices = column_indices[same_sign][::-1]
    row_indices = row_indices[same_sign][::-1]
    # Reversing the order so that steps below select rightmost columns
    # for initial basis, which will tend to be slack variables. (If the
    # guess corresponds with a basic feasible solution but a constraint
    # is not satisfied with the corresponding slack variable zero, the slack
    # variable must be basic.)

    # for each row, keep rightmost singleton column with an entry in that row
    unique_row_indices, first_columns = np.unique(row_indices,
                                                  return_index=True)
    return column_indices[first_columns], unique_row_indices


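As a rough illustration of the selection rule above, the following sketch finds, for each row, the rightmost singleton column whose entry has the same sign as the right-hand side. It assumes plain nested lists instead of NumPy arrays, and `singleton_columns` is a hypothetical helper, not part of the module.

```python
def singleton_columns(A, b):
    """Map each row index to the rightmost eligible singleton column."""
    m, n = len(A), len(A[0])
    chosen = {}
    for j in range(n):
        nonzero_rows = [i for i in range(m) if A[i][j] != 0]
        if len(nonzero_rows) != 1:
            continue  # not a singleton column
        i = nonzero_rows[0]
        if A[i][j] * b[i] >= 0:  # same sign as the RHS keeps x[j] nonnegative
            chosen[i] = j        # later (rightmost) columns overwrite earlier ones
    return chosen


A = [[2, 1, 0, 0],
     [1, 0, -1, 0],
     [3, 0, 0, 4]]
b = [6, 2, 8]
result = singleton_columns(A, b)
```

Column 2 is a singleton, but its entry is negative while `b[1]` is positive, so it is rejected; row 1 would therefore need an artificial variable.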
def _find_nonzero_rows(A, tol):
    """
    Returns logical array indicating the locations of rows with at least
    one nonzero element.
    """
    return np.any(np.abs(A) > tol, axis=1)


def _select_enter_pivot(c_hat, bl, a, rule="bland", tol=1e-12):
    """
    Selects a pivot to enter the basis. Currently Bland's rule - the smallest
    index that has a negative reduced cost - is the default.
    """
    if rule.lower() == "mrc":  # index with minimum reduced cost
        return a[~bl][np.argmin(c_hat)]
    else:  # smallest index w/ negative reduced cost
        return a[~bl][c_hat < -tol][0]


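The difference between the two entering rules can be seen on a small example. The sketch below assumes the reduced costs are given only for the nonbasic columns, as in the function above, but uses plain lists; `select_entering` is a hypothetical name.

```python
def select_entering(reduced_costs, nonbasic, rule="bland", tol=1e-12):
    """Pick the entering column index among the nonbasic columns."""
    if rule.lower() == "mrc":
        # minimum reduced cost: position of the most negative entry
        k = min(range(len(reduced_costs)), key=reduced_costs.__getitem__)
        return nonbasic[k]
    # Bland's rule: first nonbasic index whose reduced cost is negative
    for index, cost in zip(nonbasic, reduced_costs):
        if cost < -tol:
            return index
    raise ValueError("no negative reduced cost: the current basis is optimal")


nonbasic = [1, 3, 4]
reduced_costs = [-0.5, 2.0, -3.0]
bland_choice = select_entering(reduced_costs, nonbasic, rule="bland")
mrc_choice = select_entering(reduced_costs, nonbasic, rule="mrc")
```

Bland's rule picks column 1 (the first improving column), while the minimum-reduced-cost rule picks column 4 (the steepest improvement per unit step).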
def _display_iter(phase, iteration, slack, con, fun):
    """
    Print indicators of optimization status to the console.
    """
    header = iteration % 20 == 0

    if header:
        print("Phase",
              "Iteration",
              "Minimum Slack      ",
              "Constraint Residual",
              "Objective          ")

    # :<X.Y left aligns Y digits in X digit spaces
    fmt = '{0:<6}{1:<10}{2:<20.13}{3:<20.13}{4:<20.13}'
    try:
        slack = np.min(slack)
    except ValueError:
        slack = "NA"
    print(fmt.format(phase, iteration, slack, np.linalg.norm(con), fun))


def _display_and_callback(phase_one_n, x, postsolve_args, status,
                          iteration, disp, callback):
    """
    Postsolve the current iterate, then report it through ``callback``
    and/or print it to the console.
    """
    if phase_one_n is not None:
        phase = 1
        x_postsolve = x[:phase_one_n]
    else:
        phase = 2
        x_postsolve = x
    x_o, fun, slack, con = _postsolve(x_postsolve,
                                      postsolve_args)

    if callback is not None:
        res = OptimizeResult({'x': x_o, 'fun': fun, 'slack': slack,
                              'con': con, 'nit': iteration,
                              'phase': phase, 'complete': False,
                              'status': status, 'message': "",
                              'success': False})
        callback(res)
    if disp:
        _display_iter(phase, iteration, slack, con, fun)


def _phase_two(c, A, x, b, callback, postsolve_args, maxiter, tol, disp,
               maxupdate, mast, pivot, iteration=0, phase_one_n=None):
    """
    The heart of the simplex method. Beginning with a basic feasible solution,
    moves to adjacent basic feasible solutions with successively lower
    reduced cost. Terminates when there are no basic feasible solutions with
    lower reduced cost or if the problem is determined to be unbounded.

    This implementation follows the revised simplex method based on LU
    decomposition. Rather than maintaining a tableau or an inverse of the
    basis matrix, we keep a factorization of the basis matrix that allows
    efficient solution of linear systems while avoiding stability issues
    associated with inverted matrices.
    """
    m, n = A.shape
    status = 0
    a = np.arange(n)  # indices of columns of A
    ab = np.arange(m)  # indices of columns of B
    if maxupdate:
        # basis matrix factorization object; similar to B = A[:, b]
        B = BGLU(A, b, maxupdate, mast)
    else:
        B = LU(A, b)

    for iteration in range(iteration, maxiter):

        if disp or callback is not None:
            _display_and_callback(phase_one_n, x, postsolve_args, status,
                                  iteration, disp, callback)

        bl = np.zeros(len(a), dtype=bool)
        bl[b] = 1

        xb = x[b]  # basic variables
        cb = c[b]  # basic costs

        try:
            v = B.solve(cb, transposed=True)  # similar to v = solve(B.T, cb)
        except LinAlgError:
            status = 4
            break

        # TODO: cythonize?
        c_hat = c - v.dot(A)  # reduced cost
        c_hat = c_hat[~bl]
        # Above is much faster than:
        # N = A[:, ~bl]  # slow!
        # c_hat = c[~bl] - v.T.dot(N)
        # Can we perform the multiplication only on the nonbasic columns?

        if np.all(c_hat >= -tol):  # all reduced costs nonnegative -> terminate
            break

        j = _select_enter_pivot(c_hat, bl, a, rule=pivot, tol=tol)
        u = B.solve(A[:, j])  # similar to u = solve(B, A[:, j])

        i = u > tol  # if none of the u are positive, unbounded
        if not np.any(i):
            status = 3
            break

        th = xb[i]/u[i]
        l = np.argmin(th)  # implicitly selects smallest subscript
        th_star = th[l]  # step size

        x[b] = x[b] - th_star*u  # take step
        x[j] = th_star
        B.update(ab[i][l], j)  # modify basis
        b = B.b  # similar to b[ab[i][l]] =

    else:
        # If the end of the for loop is reached (without a break statement),
        # then another step has been taken, so the iteration counter should
        # increment, info should be displayed, and callback should be called.
        iteration += 1
        status = 1
        if disp or callback is not None:
            _display_and_callback(phase_one_n, x, postsolve_args, status,
                                  iteration, disp, callback)

    return x, b, status, iteration


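The ratio test and step at the end of each iteration above can be illustrated in isolation. The sketch below is a simplified stand-in that assumes the direction `u` (the entering column expressed in the current basis) is already known, uses plain lists, and updates the basis list directly rather than through a factorization object; `ratio_test_step` is a hypothetical name.

```python
def ratio_test_step(x, basis, u, j, tol=1e-12):
    """Take one simplex step in place.

    Returns the position in ``basis`` that leaves, or None if no component
    of ``u`` is positive (the problem is unbounded along this direction).
    """
    candidates = [(x[basis[k]] / u[k], k) for k in range(len(basis)) if u[k] > tol]
    if not candidates:
        return None
    theta, leave = min(candidates)  # smallest ratio; ties go to the smallest position
    for k, col in enumerate(basis):
        x[col] -= theta * u[k]      # move the basic variables along the direction
    x[j] = theta                    # the entering variable takes the step size
    basis[leave] = j
    return leave


x = [0, 0, 4, 6]
basis = [2, 3]
leave = ratio_test_step(x, basis, u=[1, 3], j=0)
```

Here the ratios are 4/1 and 6/3, so the second basic variable reaches zero first and leaves the basis, and the entering variable takes the value 2.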
def _linprog_rs(c, c0, A, b, x0, callback, postsolve_args,
                maxiter=5000, tol=1e-12, disp=False,
                maxupdate=10, mast=False, pivot="mrc",
                **unknown_options):
    """
    Solve the following linear programming problem via a two-phase
    revised simplex algorithm.::

        minimize:     c @ x

        subject to:  A @ x == b
                     0 <= x < oo

    User-facing documentation is in _linprog_doc.py.

    Parameters
    ----------
    c : 1-D array
        Coefficients of the linear objective function to be minimized.
    c0 : float
        Constant term in objective function due to fixed (and eliminated)
        variables. (Currently unused.)
    A : 2-D array
        2-D array which, when matrix-multiplied by ``x``, gives the values of
        the equality constraints at ``x``.
    b : 1-D array
        1-D array of values representing the RHS of each equality constraint
        (row) in ``A``.
    x0 : 1-D array, optional
        Starting values of the independent variables, which will be refined by
        the optimization algorithm. For the revised simplex method, these must
        correspond with a basic feasible solution.
    callback : callable, optional
        If a callback function is provided, it will be called within each
        iteration of the algorithm. The callback function must accept a single
        `scipy.optimize.OptimizeResult` consisting of the following fields:

            x : 1-D array
                Current solution vector.
            fun : float
                Current value of the objective function ``c @ x``.
            success : bool
                True only when an algorithm has completed successfully,
                so this is always False as the callback function is called
                only while the algorithm is still iterating.
            slack : 1-D array
                The values of the slack variables. Each slack variable
                corresponds to an inequality constraint. If the slack is zero,
                the corresponding constraint is active.
            con : 1-D array
                The (nominally zero) residuals of the equality constraints,
                that is, ``b - A_eq @ x``.
            phase : int
                The phase of the algorithm being executed.
            status : int
                For revised simplex, this is always 0 because if a different
                status is detected, the algorithm terminates.
            nit : int
                The number of iterations performed.
            message : str
                A string descriptor of the exit status of the optimization.
    postsolve_args : tuple
        Data needed by _postsolve to convert the solution to the standard-form
        problem into the solution to the original problem.

    Options
    -------
    maxiter : int
       The maximum number of iterations to perform in either phase.
    tol : float
        The tolerance which determines when a solution is "close enough" to
        zero in Phase 1 to be considered a basic feasible solution or close
        enough to positive to serve as an optimal solution.
    disp : bool
        Set to ``True`` if indicators of optimization status are to be printed
        to the console each iteration.
    maxupdate : int
        The maximum number of updates performed on the LU factorization.
        After this many updates are reached, the basis matrix is factorized
        from scratch.
    mast : bool
        Minimize Amortized Solve Time. If enabled, the average time to solve
        a linear system using the basis factorization is measured. Typically,
        the average solve time will decrease with each successive solve after
        initial factorization, as factorization takes much more time than the
        solve operation (and updates). Eventually, however, the updated
        factorization becomes sufficiently complex that the average solve time
        begins to increase. When this is detected, the basis is refactorized
        from scratch. Enable this option to maximize speed at the risk of
        nondeterministic behavior. Ignored if ``maxupdate`` is 0.
    pivot : "mrc" or "bland"
        Pivot rule: Minimum Reduced Cost (default) or Bland's rule. Choose
        Bland's rule if iteration limit is reached and cycling is suspected.
    unknown_options : dict
        Optional arguments not used by this particular solver. If
        `unknown_options` is non-empty a warning is issued listing all
        unused options.

    Returns
    -------
    x : 1-D array
        Solution vector.
    status : int
        An integer representing the exit status of the optimization::

         0 : Optimization terminated successfully
         1 : Iteration limit reached
         2 : Problem appears to be infeasible
         3 : Problem appears to be unbounded
         4 : Numerical difficulties encountered
         5 : No constraints; turn presolve on
         6 : Guess x0 cannot be converted to a basic feasible solution

    message : str
        A string descriptor of the exit status of the optimization.
    iteration : int
        The number of iterations taken to solve the problem.
    """

    _check_unknown_options(unknown_options)

    messages = ["Optimization terminated successfully.",
                "Iteration limit reached.",
                "The problem appears infeasible, as the phase one auxiliary "
                "problem terminated successfully with a residual of {0:.1e}, "
                "greater than the tolerance {1} required for the solution to "
                "be considered feasible. Consider increasing the tolerance to "
                "be greater than {0:.1e}. If this tolerance is unacceptably "
                "large, the problem is likely infeasible.",
                "The problem is unbounded, as the simplex algorithm found "
                "a basic feasible solution from which there is a direction "
                "with negative reduced cost in which all decision variables "
                "increase.",
                "Numerical difficulties encountered; consider trying "
                "method='interior-point'.",
                "Problems with no constraints are trivially solved; please "
                "turn presolve on.",
                "The guess x0 cannot be converted to a basic feasible "
                "solution. "
                ]

    if A.size == 0:  # address test_unbounded_below_no_presolve_corrected
        return np.zeros(c.shape), 5, messages[5], 0

    x, basis, A, b, residual, status, iteration = (
        _phase_one(A, b, x0, callback, postsolve_args,
                   maxiter, tol, disp, maxupdate, mast, pivot))

    if status == 0:
        x, basis, status, iteration = _phase_two(c, A, x, basis, callback,
                                                 postsolve_args,
                                                 maxiter, tol, disp,
                                                 maxupdate, mast, pivot,
                                                 iteration)

    return x, status, messages[status].format(residual, tol), iteration
@ -0,0 +1,661 @@
"""Simplex method for linear programming

The *simplex* method uses a traditional, full-tableau implementation of
Dantzig's simplex algorithm [1]_, [2]_ (*not* the Nelder-Mead simplex).
This algorithm is included for backwards compatibility and educational
purposes.

.. versionadded:: 0.15.0

Warnings
--------

The simplex method may encounter numerical difficulties when pivot
values are close to the specified tolerance. If encountered, try
removing any redundant constraints, changing the pivot strategy to Bland's
rule, or increasing the tolerance value.

Alternatively, more robust methods may be used. See
:ref:`'interior-point' <optimize.linprog-interior-point>` and
:ref:`'revised simplex' <optimize.linprog-revised_simplex>`.

References
----------
.. [1] Dantzig, George B., Linear programming and extensions. Rand
       Corporation Research Study Princeton Univ. Press, Princeton, NJ,
       1963.
.. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
       Mathematical Programming", McGraw-Hill, Chapter 4.
"""

import numpy as np
from warnings import warn
from ._optimize import OptimizeResult, OptimizeWarning, _check_unknown_options
from ._linprog_util import _postsolve


def _pivot_col(T, tol=1e-9, bland=False):
    """
    Given a linear programming simplex tableau, determine the column
    of the variable to enter the basis.

    Parameters
    ----------
    T : 2-D array
        A 2-D array representing the simplex tableau, T, corresponding to the
        linear programming problem. It should have the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0]]

        for a Phase 2 problem, or the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0],
         [c'[0],   c'[1],   ..., c'[n_total],   0]]

        for a Phase 1 problem (a problem in which a basic feasible solution is
        sought prior to optimizing the actual objective). ``T`` is modified in
        place by ``_solve_simplex``.
    tol : float
        Elements in the objective row larger than -tol will not be considered
        for pivoting. Nominally this value is zero, but numerical issues
        cause a tolerance about zero to be necessary.
    bland : bool
        If True, use Bland's rule for selection of the column (select the
        first column with a negative coefficient in the objective row,
        regardless of magnitude).

    Returns
    -------
    status: bool
        True if a suitable pivot column was found, otherwise False.
        A return of False indicates that the linear programming simplex
        algorithm is complete.
    col: int
        The index of the column of the pivot element.
        If status is False, col will be returned as nan.
    """
    ma = np.ma.masked_where(T[-1, :-1] >= -tol, T[-1, :-1], copy=False)
    if ma.count() == 0:
        return False, np.nan
    if bland:
        # ma.mask is sometimes 0d
        return True, np.nonzero(np.logical_not(np.atleast_1d(ma.mask)))[0][0]
    return True, np.ma.nonzero(ma == ma.min())[0][0]


def _pivot_row(T, basis, pivcol, phase, tol=1e-9, bland=False):
    """
    Given a linear programming simplex tableau, determine the row for the
    pivot operation.

    Parameters
    ----------
    T : 2-D array
        A 2-D array representing the simplex tableau, T, corresponding to the
        linear programming problem. It should have the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0]]

        for a Phase 2 problem, or the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0],
         [c'[0],   c'[1],   ..., c'[n_total],   0]]

        for a Phase 1 problem (a problem in which a basic feasible solution is
        sought prior to optimizing the actual objective). ``T`` is modified in
        place by ``_solve_simplex``.
    basis : array
        A list of the current basic variables.
    pivcol : int
        The index of the pivot column.
    phase : int
        The phase of the simplex algorithm (1 or 2).
    tol : float
        Elements in the pivot column smaller than tol will not be considered
        for pivoting. Nominally this value is zero, but numerical issues
        cause a tolerance about zero to be necessary.
    bland : bool
        If True, use Bland's rule for selection of the row (if more than one
        row can be used, choose the one with the lowest variable index).

    Returns
    -------
    status: bool
        True if a suitable pivot row was found, otherwise False. A return
        of False indicates that the linear programming problem is unbounded.
    row: int
        The index of the row of the pivot element. If status is False, row
        will be returned as nan.
    """
    if phase == 1:
        k = 2
    else:
        k = 1
    ma = np.ma.masked_where(T[:-k, pivcol] <= tol, T[:-k, pivcol], copy=False)
    if ma.count() == 0:
        return False, np.nan
    mb = np.ma.masked_where(T[:-k, pivcol] <= tol, T[:-k, -1], copy=False)
    q = mb / ma
    min_rows = np.ma.nonzero(q == q.min())[0]
    if bland:
        return True, min_rows[np.argmin(np.take(basis, min_rows))]
    return True, min_rows[0]


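The minimum-ratio test with Bland's tie-break can be sketched without masked arrays. The version below is an illustrative stand-in that assumes the tableau is a list of lists and returns `None` instead of `nan` when no row qualifies; `pivot_row` is a hypothetical name.

```python
def pivot_row(T, basis, pivcol, phase, tol=1e-9, bland=False):
    """Return (found, row) for the leaving row of a full tableau."""
    k = 2 if phase == 1 else 1  # number of objective rows at the bottom
    rows = [i for i in range(len(T) - k) if T[i][pivcol] > tol]
    if not rows:
        return False, None      # no positive entry: unbounded direction
    ratios = {i: T[i][-1] / T[i][pivcol] for i in rows}
    best = min(ratios.values())
    ties = [i for i in rows if ratios[i] == best]
    if bland:
        # among tied rows, prefer the one whose basic variable has the lowest index
        return True, min(ties, key=lambda i: basis[i])
    return True, ties[0]


T = [[2, 1, 0, 1, 8],
     [1, 3, 1, 0, 4],
     [-3, -2, 0, 0, 0]]
basis = [3, 2]
default_choice = pivot_row(T, basis, pivcol=0, phase=2)
bland_choice = pivot_row(T, basis, pivcol=0, phase=2, bland=True)
```

Both constraint rows have ratio 4, so the default picks the first tied row, while Bland's rule picks row 1 because its basic variable (index 2) is lower than that of row 0 (index 3).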
def _apply_pivot(T, basis, pivrow, pivcol, tol=1e-9):
    """
    Pivot the simplex tableau inplace on the element given by (pivrow, pivcol).
    The entering variable corresponds to the column given by pivcol, forcing
    the variable basis[pivrow] to leave the basis.

    Parameters
    ----------
    T : 2-D array
        A 2-D array representing the simplex tableau, T, corresponding to the
        linear programming problem. It should have the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0]]

        for a Phase 2 problem, or the form:

        [[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
         [A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
         .
         .
         .
         [A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
         [c[0],    c[1],    ..., c[n_total],    0],
         [c'[0],   c'[1],   ..., c'[n_total],   0]]

        for a Phase 1 problem (a problem in which a basic feasible solution is
        sought prior to optimizing the actual objective). ``T`` is modified in
        place by ``_solve_simplex``.
    basis : 1-D array
        An array of the indices of the basic variables, such that basis[i]
        contains the column corresponding to the basic variable for row i.
        Basis is modified in place by _apply_pivot.
    pivrow : int
        Row index of the pivot.
    pivcol : int
        Column index of the pivot.
    tol : float
        Tolerance used only for the stability warning: a warning is issued
        when the pivot value is close to ``tol``.
    """
    basis[pivrow] = pivcol
    pivval = T[pivrow, pivcol]
    T[pivrow] = T[pivrow] / pivval
    for irow in range(T.shape[0]):
        if irow != pivrow:
            T[irow] = T[irow] - T[pivrow] * T[irow, pivcol]

    # The selected pivot should never lead to a pivot value less than the tol.
    if np.isclose(pivval, tol, atol=0, rtol=1e4):
        message = (
            f"The pivot operation produces a pivot value of:{pivval: .1e}, "
            "which is only slightly greater than the specified "
            f"tolerance{tol: .1e}. This may lead to issues regarding the "
            "numerical stability of the simplex method. "
            "Removing redundant constraints, changing the pivot strategy "
            "via Bland's rule or increasing the tolerance may "
            "help reduce the issue.")
        warn(message, OptimizeWarning, stacklevel=5)


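The row operations performed by the pivot can be checked by hand on a small tableau. The sketch below assumes a list-of-lists tableau with exact `fractions.Fraction` entries so the arithmetic is easy to verify, and it omits the stability warning; `apply_pivot` is a hypothetical name.

```python
from fractions import Fraction


def apply_pivot(T, basis, pivrow, pivcol):
    """Gauss-Jordan pivot on T[pivrow][pivcol], modifying T and basis in place."""
    basis[pivrow] = pivcol
    pivval = T[pivrow][pivcol]
    T[pivrow] = [v / pivval for v in T[pivrow]]  # scale the pivot row to a unit pivot
    for i in range(len(T)):
        if i != pivrow:
            factor = T[i][pivcol]
            # eliminate the pivot column from every other row
            T[i] = [a - factor * p for a, p in zip(T[i], T[pivrow])]


T = [[Fraction(v) for v in row] for row in
     [[2, 1, 1, 0, 8],
      [1, 3, 0, 1, 9],
      [-3, -2, 0, 0, 0]]]
basis = [2, 3]
apply_pivot(T, basis, pivrow=0, pivcol=0)
```

After the pivot, column 0 is a unit column with its one in row 0, and the bottom-right entry holds 12, the negative of the objective value `-3 * 4` at the new basic solution.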
def _solve_simplex(T, n, basis, callback, postsolve_args,
|
||||
maxiter=1000, tol=1e-9, phase=2, bland=False, nit0=0,
|
||||
):
|
||||
"""
|
||||
Solve a linear programming problem in "standard form" using the Simplex
|
||||
Method. Linear Programming is intended to solve the following problem form:
|
||||
|
||||
Minimize::
|
||||
|
||||
c @ x
|
||||
|
||||
Subject to::
|
||||
|
||||
A @ x == b
|
||||
x >= 0
|
||||
|
||||
Parameters
|
||||
----------
|
||||
T : 2-D array
|
||||
A 2-D array representing the simplex tableau, T, corresponding to the
|
||||
linear programming problem. It should have the form:
|
||||
|
||||
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
|
||||
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
|
||||
.
|
||||
.
|
||||
.
|
||||
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
|
||||
[c[0], c[1], ..., c[n_total], 0]]
|
||||
|
||||
for a Phase 2 problem, or the form:
|
||||
|
||||
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
|
||||
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
|
||||
.
|
||||
.
|
||||
.
|
||||
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
|
||||
[c[0], c[1], ..., c[n_total], 0],
|
||||
[c'[0], c'[1], ..., c'[n_total], 0]]
|
||||
|
||||
for a Phase 1 problem (a problem in which a basic feasible solution is
|
||||
sought prior to maximizing the actual objective. ``T`` is modified in
|
||||
place by ``_solve_simplex``.
|
||||
    n : int
        The number of true variables in the problem.
    basis : 1-D array
        An array of the indices of the basic variables, such that basis[i]
        contains the column corresponding to the basic variable for row i.
        ``basis`` is modified in place by ``_solve_simplex``.
    callback : callable, optional
        If a callback function is provided, it will be called within each
        iteration of the algorithm. The callback must accept a
        `scipy.optimize.OptimizeResult` consisting of the following fields:

            x : 1-D array
                Current solution vector
            fun : float
                Current value of the objective function
            success : bool
                True only when a phase has completed successfully. This
                will be False for most iterations.
            slack : 1-D array
                The values of the slack variables. Each slack variable
                corresponds to an inequality constraint. If the slack is zero,
                the corresponding constraint is active.
            con : 1-D array
                The (nominally zero) residuals of the equality constraints,
                that is, ``b - A_eq @ x``
            phase : int
                The phase of the optimization being executed. In phase 1 a basic
                feasible solution is sought and ``T`` has an additional row
                representing an alternate objective function.
            status : int
                An integer representing the exit status of the optimization::

                     0 : Optimization terminated successfully
                     1 : Iteration limit reached
                     2 : Problem appears to be infeasible
                     3 : Problem appears to be unbounded
                     4 : Serious numerical difficulties encountered

            nit : int
                The number of iterations performed.
            message : str
                A string descriptor of the exit status of the optimization.
    postsolve_args : tuple
        Data needed by _postsolve to convert the solution to the standard-form
        problem into the solution to the original problem.
    maxiter : int
        The maximum number of iterations to perform before aborting the
        optimization.
    tol : float
        The tolerance which determines when a solution is "close enough" to
        zero in Phase 1 to be considered a basic feasible solution or close
        enough to positive to serve as an optimal solution.
    phase : int
        The phase of the optimization being executed. In phase 1 a basic
        feasible solution is sought and ``T`` has an additional row
        representing an alternate objective function.
    bland : bool
        If True, choose pivots using Bland's rule [3]_. In problems which
        fail to converge due to cycling, using Bland's rule can provide
        convergence at the expense of a less optimal path about the simplex.
    nit0 : int
        The initial iteration number used to keep an accurate iteration total
        in a two-phase problem.

    Returns
    -------
    nit : int
        The number of iterations. Used to keep an accurate iteration total
        in the two-phase problem.
    status : int
        An integer representing the exit status of the optimization::

         0 : Optimization terminated successfully
         1 : Iteration limit reached
         2 : Problem appears to be infeasible
         3 : Problem appears to be unbounded
         4 : Serious numerical difficulties encountered

    """
    nit = nit0
    status = 0
    message = ''
    complete = False

    if phase == 1:
        m = T.shape[1]-2
    elif phase == 2:
        m = T.shape[1]-1
    else:
        raise ValueError("Argument 'phase' to _solve_simplex must be 1 or 2")

    if phase == 2:
        # Check if any artificial variables are still in the basis.
        # If yes, check if any coefficients from this row and a column
        # corresponding to one of the non-artificial variables is non-zero.
        # If found, pivot at this term. If not, start phase 2.
        # Do this for all artificial variables in the basis.
        # Ref: "An Introduction to Linear Programming and Game Theory"
        # by Paul R. Thie, Gerard E. Keough, 3rd Ed,
        # Chapter 3.7 Redundant Systems (p. 102)
        for pivrow in [row for row in range(basis.size)
                       if basis[row] > T.shape[1] - 2]:
            non_zero_row = [col for col in range(T.shape[1] - 1)
                            if abs(T[pivrow, col]) > tol]
            if len(non_zero_row) > 0:
                pivcol = non_zero_row[0]
                _apply_pivot(T, basis, pivrow, pivcol, tol)
                nit += 1

    if len(basis[:m]) == 0:
        solution = np.empty(T.shape[1] - 1, dtype=np.float64)
    else:
        solution = np.empty(max(T.shape[1] - 1, max(basis[:m]) + 1),
                            dtype=np.float64)

    while not complete:
        # Find the pivot column
        pivcol_found, pivcol = _pivot_col(T, tol, bland)
        if not pivcol_found:
            pivcol = np.nan
            pivrow = np.nan
            status = 0
            complete = True
        else:
            # Find the pivot row
            pivrow_found, pivrow = _pivot_row(T, basis, pivcol, phase, tol, bland)
            if not pivrow_found:
                status = 3
                complete = True

        if callback is not None:
            solution[:] = 0
            solution[basis[:n]] = T[:n, -1]
            x = solution[:m]
            x, fun, slack, con = _postsolve(
                x, postsolve_args
            )
            res = OptimizeResult({
                'x': x,
                'fun': fun,
                'slack': slack,
                'con': con,
                'status': status,
                'message': message,
                'nit': nit,
                'success': status == 0 and complete,
                'phase': phase,
                'complete': complete,
                })
            callback(res)

        if not complete:
            if nit >= maxiter:
                # Iteration limit exceeded
                status = 1
                complete = True
            else:
                _apply_pivot(T, basis, pivrow, pivcol, tol)
                nit += 1
    return nit, status
|
||||
|
||||
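The `bland` flag above switches the rule used to choose the entering column. The helper `_pivot_col` is not shown in this section, so the following is only a minimal sketch of the idea, assuming the objective row is a plain list whose last entry is the objective value. The name `pick_pivot_col` is hypothetical and this is not SciPy's implementation.

```python
def pick_pivot_col(objective_row, tol=1e-9, bland=False):
    """Return (found, column) for the entering variable.

    objective_row holds the reduced costs followed by the objective
    value in its last position, so the last entry is skipped.
    """
    costs = objective_row[:-1]
    # Candidate columns are those whose reduced cost is below -tol.
    candidates = [j for j, cj in enumerate(costs) if cj < -tol]
    if not candidates:
        # No improving column: the current basis is optimal.
        return False, None
    if bland:
        # Bland's rule: lowest index among the candidates (avoids cycling).
        return True, candidates[0]
    # Otherwise take the most negative reduced cost.
    return True, min(candidates, key=lambda j: costs[j])


row = [-1.0, -3.0, 0.0, 2.0, 10.0]
print(pick_pivot_col(row))              # (True, 1)
print(pick_pivot_col(row, bland=True))  # (True, 0)
```

With `bland=True` the first eligible column is chosen even though column 1 has the more negative cost, which is the trade-off the docstring describes: guaranteed termination at the price of a possibly longer path.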
def _linprog_simplex(c, c0, A, b, callback, postsolve_args,
                     maxiter=1000, tol=1e-9, disp=False, bland=False,
                     **unknown_options):
    """
    Minimize a linear objective function subject to linear equality and
    non-negativity constraints using the two phase simplex method.
    Linear programming is intended to solve problems of the following form:

    Minimize::

        c @ x

    Subject to::

        A @ x == b
            x >= 0

    User-facing documentation is in _linprog_doc.py.

    Parameters
    ----------
    c : 1-D array
        Coefficients of the linear objective function to be minimized.
    c0 : float
        Constant term in objective function due to fixed (and eliminated)
        variables. (Purely for display.)
    A : 2-D array
        2-D array such that ``A @ x`` gives the values of the equality
        constraints at ``x``.
    b : 1-D array
        1-D array of values representing the right hand side of each equality
        constraint (row) in ``A``.
    callback : callable, optional
        If a callback function is provided, it will be called within each
        iteration of the algorithm. The callback function must accept a single
        `scipy.optimize.OptimizeResult` consisting of the following fields:

            x : 1-D array
                Current solution vector
            fun : float
                Current value of the objective function
            success : bool
                True when an algorithm has completed successfully.
            slack : 1-D array
                The values of the slack variables. Each slack variable
                corresponds to an inequality constraint. If the slack is zero,
                the corresponding constraint is active.
            con : 1-D array
                The (nominally zero) residuals of the equality constraints,
                that is, ``b - A_eq @ x``
            phase : int
                The phase of the algorithm being executed.
            status : int
                An integer representing the status of the optimization::

                     0 : Algorithm proceeding nominally
                     1 : Iteration limit reached
                     2 : Problem appears to be infeasible
                     3 : Problem appears to be unbounded
                     4 : Serious numerical difficulties encountered

            nit : int
                The number of iterations performed.
            message : str
                A string descriptor of the exit status of the optimization.
    postsolve_args : tuple
        Data needed by _postsolve to convert the solution to the standard-form
        problem into the solution to the original problem.

    Options
    -------
    maxiter : int
        The maximum number of iterations to perform.
    disp : bool
        If True, print exit status message to sys.stdout.
    tol : float
        The tolerance which determines when a solution is "close enough" to
        zero in Phase 1 to be considered a basic feasible solution or close
        enough to positive to serve as an optimal solution.
    bland : bool
        If True, use Bland's anti-cycling rule [3]_ to choose pivots to
        prevent cycling. If False, choose pivots which should lead to a
        converged solution more quickly. The latter method is subject to
        cycling (non-convergence) in rare instances.
    unknown_options : dict
        Optional arguments not used by this particular solver. If
        `unknown_options` is non-empty a warning is issued listing all
        unused options.

    Returns
    -------
    x : 1-D array
        Solution vector.
    status : int
        An integer representing the exit status of the optimization::

         0 : Optimization terminated successfully
         1 : Iteration limit reached
         2 : Problem appears to be infeasible
         3 : Problem appears to be unbounded
         4 : Serious numerical difficulties encountered

    message : str
        A string descriptor of the exit status of the optimization.
    iteration : int
        The number of iterations taken to solve the problem.

    References
    ----------
    .. [1] Dantzig, George B., Linear programming and extensions. Rand
           Corporation Research Study Princeton Univ. Press, Princeton, NJ,
           1963
    .. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
           Mathematical Programming", McGraw-Hill, Chapter 4.
    .. [3] Bland, Robert G. New finite pivoting rules for the simplex method.
           Mathematics of Operations Research (2), 1977: pp. 103-107.

    Notes
    -----
    The expected problem formulation differs between the top level ``linprog``
    module and the method specific solvers. The method specific solvers expect a
    problem in standard form:

    Minimize::

        c @ x

    Subject to::

        A @ x == b
            x >= 0

    Whereas the top level ``linprog`` module expects a problem of form:

    Minimize::

        c @ x

    Subject to::

        A_ub @ x <= b_ub
        A_eq @ x == b_eq
         lb <= x <= ub

    where ``lb = 0`` and ``ub = None`` unless set in ``bounds``.

    The original problem contains equality, upper-bound and variable constraints
    whereas the method specific solver requires equality constraints and
    variable non-negativity.

    The ``linprog`` module converts the original problem to standard form by
    converting the simple bounds to upper bound constraints, introducing
    non-negative slack variables for inequality constraints, and expressing
    unbounded variables as the difference between two non-negative variables.
    """
    _check_unknown_options(unknown_options)

    status = 0
    messages = {0: "Optimization terminated successfully.",
                1: "Iteration limit reached.",
                2: "Optimization failed. Unable to find a feasible"
                   " starting point.",
                3: "Optimization failed. The problem appears to be unbounded.",
                4: "Optimization failed. Singular matrix encountered."}

    n, m = A.shape

    # All constraints must have b >= 0.
    is_negative_constraint = np.less(b, 0)
    A[is_negative_constraint] *= -1
    b[is_negative_constraint] *= -1

    # As all constraints are equality constraints the artificial variables
    # will also be basic variables.
    av = np.arange(n) + m
    basis = av.copy()

    # Format the phase one tableau by adding artificial variables and stacking
    # the constraints, the objective row and pseudo-objective row.
    row_constraints = np.hstack((A, np.eye(n), b[:, np.newaxis]))
    row_objective = np.hstack((c, np.zeros(n), c0))
    row_pseudo_objective = -row_constraints.sum(axis=0)
    row_pseudo_objective[av] = 0
    T = np.vstack((row_constraints, row_objective, row_pseudo_objective))

    nit1, status = _solve_simplex(T, n, basis, callback=callback,
                                  postsolve_args=postsolve_args,
                                  maxiter=maxiter, tol=tol, phase=1,
                                  bland=bland
                                  )
    # if pseudo objective is zero, remove the last row from the tableau and
    # proceed to phase 2
    nit2 = nit1
    if abs(T[-1, -1]) < tol:
        # Remove the pseudo-objective row from the tableau
        T = T[:-1, :]
        # Remove the artificial variable columns from the tableau
        T = np.delete(T, av, 1)
    else:
        # Failure to find a feasible starting point
        status = 2
        messages[status] = (
            "Phase 1 of the simplex method failed to find a feasible "
            "solution. The pseudo-objective function evaluates to {0:.1e} "
            "which exceeds the required tolerance of {1} for a solution to be "
            "considered 'close enough' to zero to be a basic solution. "
            "Consider increasing the tolerance to be greater than {0:.1e}. "
            "If this tolerance is unacceptably large the problem may be "
            "infeasible.".format(abs(T[-1, -1]), tol)
        )

    if status == 0:
        # Phase 2
        nit2, status = _solve_simplex(T, n, basis, callback=callback,
                                      postsolve_args=postsolve_args,
                                      maxiter=maxiter, tol=tol, phase=2,
                                      bland=bland, nit0=nit1
                                      )

    solution = np.zeros(n + m)
    solution[basis[:n]] = T[:n, -1]
    x = solution[:m]

    return x, status, messages[status], int(nit2)
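The phase-one tableau assembled in `_linprog_simplex` can be easier to follow on a tiny problem. The sketch below rebuilds the same layout with plain Python lists instead of NumPy arrays; `phase_one_tableau` is a hypothetical helper written for illustration, and unlike the function above it does not modify its inputs in place.

```python
def phase_one_tableau(A, b, c, c0=0.0):
    """Build a phase-1 tableau and initial basis from row lists."""
    n_rows = len(A)      # number of equality constraints
    n_cols = len(A[0])   # number of variables
    rows = []
    for i in range(n_rows):
        # Flip the sign of any constraint with a negative right-hand side.
        sign = -1.0 if b[i] < 0 else 1.0
        # One artificial variable per constraint (identity block).
        identity = [1.0 if k == i else 0.0 for k in range(n_rows)]
        rows.append([sign * a for a in A[i]] + identity + [sign * b[i]])
    objective = list(c) + [0.0] * n_rows + [c0]
    # Pseudo-objective: negated column sums, zeroed on artificial columns.
    width = n_cols + n_rows + 1
    pseudo = [-sum(r[j] for r in rows) for j in range(width)]
    for k in range(n_rows):
        pseudo[n_cols + k] = 0.0
    # The artificial variables form the initial basis.
    basis = [n_cols + k for k in range(n_rows)]
    return rows + [objective, pseudo], basis


T, basis = phase_one_tableau(A=[[1.0, 1.0], [1.0, -1.0]],
                             b=[4.0, -2.0], c=[1.0, 2.0])
for row in T:
    print(row)
print(basis)  # [2, 3]
```

The second constraint is negated because its right-hand side is negative, and the last entry of the pseudo-objective row (here -6.0) is the quantity phase 1 drives toward zero.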
1522
venv/lib/python3.12/site-packages/scipy/optimize/_linprog_util.py
Normal file
File diff suppressed because it is too large
Binary file not shown.
@ -0,0 +1,5 @@
"""This module contains least-squares algorithms."""
from .least_squares import least_squares
from .lsq_linear import lsq_linear

__all__ = ['least_squares', 'lsq_linear']
183
venv/lib/python3.12/site-packages/scipy/optimize/_lsq/bvls.py
Normal file
@ -0,0 +1,183 @@
"""Bounded-variable least-squares algorithm."""
import numpy as np
from numpy.linalg import norm, lstsq
from scipy.optimize import OptimizeResult

from .common import print_header_linear, print_iteration_linear


def compute_kkt_optimality(g, on_bound):
    """Compute the maximum violation of KKT conditions."""
    g_kkt = g * on_bound
    free_set = on_bound == 0
    g_kkt[free_set] = np.abs(g[free_set])
    return np.max(g_kkt)
|
||||
|
||||
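The measure returned by `compute_kkt_optimality` can be read element by element. The following pure-Python sketch (the name `kkt_optimality` is hypothetical) assumes the same encoding as above: -1 for a variable at its lower bound, 1 at its upper bound, and 0 for a free variable.

```python
def kkt_optimality(g, on_bound):
    """Largest KKT violation for gradient g and bound markers on_bound."""
    violations = []
    for gi, state in zip(g, on_bound):
        if state == 0:
            # Free variable: the gradient component must vanish.
            violations.append(abs(gi))
        else:
            # Bound variable: positive only if moving off the bound
            # into the feasible region would decrease the cost.
            violations.append(gi * state)
    return max(violations)


print(kkt_optimality([0.5, -2.0, 1.0], [0, -1, 1]))  # 2.0
print(kkt_optimality([0.0, 3.0, -1.0], [0, -1, 1]))  # 0.0
```

In the second call every bound variable is held in place by its gradient sign and the free gradient is zero, so no condition is violated.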
def bvls(A, b, x_lsq, lb, ub, tol, max_iter, verbose, rcond=None):
    m, n = A.shape

    x = x_lsq.copy()
    on_bound = np.zeros(n)

    mask = x <= lb
    x[mask] = lb[mask]
    on_bound[mask] = -1

    mask = x >= ub
    x[mask] = ub[mask]
    on_bound[mask] = 1

    free_set = on_bound == 0
    active_set = ~free_set
    free_set, = np.nonzero(free_set)

    r = A.dot(x) - b
    cost = 0.5 * np.dot(r, r)
    initial_cost = cost
    g = A.T.dot(r)

    cost_change = None
    step_norm = None
    iteration = 0

    if verbose == 2:
        print_header_linear()

    # This is the initialization loop. The requirement is that the
    # least-squares solution on free variables is feasible before BVLS starts.
    # One possible initialization is to set all variables to lower or upper
    # bounds, but many iterations may be required from this state later on.
    # The implemented ad-hoc procedure, which intuitively should give a better
    # initial state, is: find the least-squares solution on the current free
    # variables; if it is feasible, stop; otherwise, set the violating
    # variables to the corresponding bounds and continue on the reduced set
    # of free variables.

    while free_set.size > 0:
        if verbose == 2:
            optimality = compute_kkt_optimality(g, on_bound)
            print_iteration_linear(iteration, cost, cost_change, step_norm,
                                   optimality)

        iteration += 1
        x_free_old = x[free_set].copy()

        A_free = A[:, free_set]
        b_free = b - A.dot(x * active_set)
        z = lstsq(A_free, b_free, rcond=rcond)[0]

        lbv = z < lb[free_set]
        ubv = z > ub[free_set]
        v = lbv | ubv

        if np.any(lbv):
            ind = free_set[lbv]
            x[ind] = lb[ind]
            active_set[ind] = True
            on_bound[ind] = -1

        if np.any(ubv):
            ind = free_set[ubv]
            x[ind] = ub[ind]
            active_set[ind] = True
            on_bound[ind] = 1

        ind = free_set[~v]
        x[ind] = z[~v]

        r = A.dot(x) - b
        cost_new = 0.5 * np.dot(r, r)
        cost_change = cost - cost_new
        cost = cost_new
        g = A.T.dot(r)
        step_norm = norm(x[free_set] - x_free_old)

        if np.any(v):
            free_set = free_set[~v]
        else:
            break

    if max_iter is None:
        max_iter = n
    max_iter += iteration

    termination_status = None

    # Main BVLS loop.

    optimality = compute_kkt_optimality(g, on_bound)
    for iteration in range(iteration, max_iter):  # BVLS Loop A
        if verbose == 2:
            print_iteration_linear(iteration, cost, cost_change,
                                   step_norm, optimality)

        if optimality < tol:
            termination_status = 1

        if termination_status is not None:
            break

        move_to_free = np.argmax(g * on_bound)
        on_bound[move_to_free] = 0

        while True:   # BVLS Loop B

            free_set = on_bound == 0
            active_set = ~free_set
            free_set, = np.nonzero(free_set)

            x_free = x[free_set]
            x_free_old = x_free.copy()
            lb_free = lb[free_set]
            ub_free = ub[free_set]

            A_free = A[:, free_set]
            b_free = b - A.dot(x * active_set)
            z = lstsq(A_free, b_free, rcond=rcond)[0]

            lbv, = np.nonzero(z < lb_free)
            ubv, = np.nonzero(z > ub_free)
            v = np.hstack((lbv, ubv))

            if v.size > 0:
                alphas = np.hstack((
                    lb_free[lbv] - x_free[lbv],
                    ub_free[ubv] - x_free[ubv])) / (z[v] - x_free[v])

                i = np.argmin(alphas)
                i_free = v[i]
                alpha = alphas[i]

                x_free *= 1 - alpha
                x_free += alpha * z
                x[free_set] = x_free

                if i < lbv.size:
                    on_bound[free_set[i_free]] = -1
                else:
                    on_bound[free_set[i_free]] = 1
            else:
                x_free = z
                x[free_set] = x_free
                break

        step_norm = norm(x_free - x_free_old)

        r = A.dot(x) - b
        cost_new = 0.5 * np.dot(r, r)
        cost_change = cost - cost_new

        if cost_change < tol * cost:
            termination_status = 2
        cost = cost_new

        g = A.T.dot(r)
        optimality = compute_kkt_optimality(g, on_bound)

    if termination_status is None:
        termination_status = 0

    return OptimizeResult(
        x=x, fun=r, cost=cost, optimality=optimality, active_mask=on_bound,
        nit=iteration + 1, status=termination_status,
        initial_cost=initial_cost)
733
venv/lib/python3.12/site-packages/scipy/optimize/_lsq/common.py
Normal file
@ -0,0 +1,733 @@
"""Functions used by least-squares algorithms."""
from math import copysign

import numpy as np
from numpy.linalg import norm

from scipy.linalg import cho_factor, cho_solve, LinAlgError
from scipy.sparse import issparse
from scipy.sparse.linalg import LinearOperator, aslinearoperator


EPS = np.finfo(float).eps


# Functions related to a trust-region problem.


def intersect_trust_region(x, s, Delta):
    """Find the intersection of a line with the boundary of a trust region.

    This function solves the quadratic equation with respect to t
    ||(x + s*t)||**2 = Delta**2.

    Returns
    -------
    t_neg, t_pos : tuple of float
        Negative and positive roots.

    Raises
    ------
    ValueError
        If `s` is zero or `x` is not within the trust region.
    """
    a = np.dot(s, s)
    if a == 0:
        raise ValueError("`s` is zero.")

    b = np.dot(x, s)

    c = np.dot(x, x) - Delta**2
    if c > 0:
        raise ValueError("`x` is not within the trust region.")

    d = np.sqrt(b*b - a*c)  # Root from one fourth of the discriminant.

    # Computations below avoid loss of significance, see "Numerical Recipes".
    q = -(b + copysign(d, b))
    t1 = q / a
    t2 = c / q

    if t1 < t2:
        return t1, t2
    else:
        return t2, t1
|
||||
|
||||
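A small worked case may help check the root formula used in `intersect_trust_region`. The sketch below repeats the same arithmetic with the standard library only; `line_sphere_roots` is a hypothetical name, and the inputs are plain lists rather than NumPy arrays.

```python
from math import copysign, sqrt


def line_sphere_roots(x, s, delta):
    """Roots t of ||x + s*t||**2 == delta**2, smaller root first."""
    a = sum(si * si for si in s)
    if a == 0:
        raise ValueError("`s` is zero.")
    b = sum(xi * si for xi, si in zip(x, s))
    c = sum(xi * xi for xi in x) - delta**2
    if c > 0:
        raise ValueError("`x` is not within the trust region.")
    d = sqrt(b * b - a * c)
    # q has the same sign as -b, so b and d never cancel each other.
    q = -(b + copysign(d, b))
    t1, t2 = q / a, c / q
    return (t1, t2) if t1 < t2 else (t2, t1)


# Start at (1, 0), move along the first axis, radius 2:
# the boundary is reached at x = -2 (t = -3) and x = 2 (t = 1).
print(line_sphere_roots([1.0, 0.0], [1.0, 0.0], 2.0))  # (-3.0, 1.0)
```

Computing the second root as `c / q` rather than from the usual quadratic formula is what avoids subtracting two nearly equal numbers.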
def solve_lsq_trust_region(n, m, uf, s, V, Delta, initial_alpha=None,
                           rtol=0.01, max_iter=10):
    """Solve a trust-region problem arising in least-squares minimization.

    This function implements a method described by J. J. More [1]_ and used
    in MINPACK, but it relies on a single SVD of the Jacobian instead of a
    series of Cholesky decompositions. Before running this function, compute:
    ``U, s, VT = svd(J, full_matrices=False)``.

    Parameters
    ----------
    n : int
        Number of variables.
    m : int
        Number of residuals.
    uf : ndarray
        Computed as U.T.dot(f).
    s : ndarray
        Singular values of J.
    V : ndarray
        Transpose of VT.
    Delta : float
        Radius of a trust region.
    initial_alpha : float, optional
        Initial guess for alpha, which might be available from a previous
        iteration. If None, determined automatically.
    rtol : float, optional
        Stopping tolerance for the root-finding procedure. Namely, the
        solution ``p`` will satisfy ``abs(norm(p) - Delta) < rtol * Delta``.
    max_iter : int, optional
        Maximum allowed number of iterations for the root-finding procedure.

    Returns
    -------
    p : ndarray, shape (n,)
        Found solution of a trust-region problem.
    alpha : float
        Positive value such that (J.T*J + alpha*I)*p = -J.T*f.
        Sometimes called Levenberg-Marquardt parameter.
    n_iter : int
        Number of iterations made by root-finding procedure. Zero means
        that Gauss-Newton step was selected as the solution.

    References
    ----------
    .. [1] More, J. J., "The Levenberg-Marquardt Algorithm: Implementation
           and Theory," Numerical Analysis, ed. G. A. Watson, Lecture Notes
           in Mathematics 630, Springer Verlag, pp. 105-116, 1977.
    """
    def phi_and_derivative(alpha, suf, s, Delta):
        """Function of which to find zero.

        It is defined as "norm of regularized (by alpha) least-squares
        solution minus `Delta`". Refer to [1]_.
        """
        denom = s**2 + alpha
        p_norm = norm(suf / denom)
        phi = p_norm - Delta
        phi_prime = -np.sum(suf ** 2 / denom**3) / p_norm
        return phi, phi_prime

    suf = s * uf

    # Check if J has full rank and try Gauss-Newton step.
    if m >= n:
        threshold = EPS * m * s[0]
        full_rank = s[-1] > threshold
    else:
        full_rank = False

    if full_rank:
        p = -V.dot(uf / s)
        if norm(p) <= Delta:
            return p, 0.0, 0

    alpha_upper = norm(suf) / Delta

    if full_rank:
        phi, phi_prime = phi_and_derivative(0.0, suf, s, Delta)
        alpha_lower = -phi / phi_prime
    else:
        alpha_lower = 0.0

    if initial_alpha is None or not full_rank and initial_alpha == 0:
        alpha = max(0.001 * alpha_upper, (alpha_lower * alpha_upper)**0.5)
    else:
        alpha = initial_alpha

    for it in range(max_iter):
        if alpha < alpha_lower or alpha > alpha_upper:
            alpha = max(0.001 * alpha_upper, (alpha_lower * alpha_upper)**0.5)

        phi, phi_prime = phi_and_derivative(alpha, suf, s, Delta)

        if phi < 0:
            alpha_upper = alpha

        ratio = phi / phi_prime
        alpha_lower = max(alpha_lower, alpha - ratio)
        alpha -= (phi + Delta) * ratio / Delta

        if np.abs(phi) < rtol * Delta:
            break

    p = -V.dot(suf / (s**2 + alpha))

    # Make the norm of p equal to Delta; p is changed only slightly by
    # this. It is done to prevent p from lying outside the trust region
    # (which can cause problems later).
    p *= Delta / norm(p)

    return p, alpha, it + 1


def solve_trust_region_2d(B, g, Delta):
    """Solve a general trust-region problem in 2 dimensions.

    The problem is reformulated as a 4th order algebraic equation,
    the solution of which is found by numpy.roots.

    Parameters
    ----------
    B : ndarray, shape (2, 2)
        Symmetric matrix, defines a quadratic term of the function.
    g : ndarray, shape (2,)
        Defines a linear term of the function.
    Delta : float
        Radius of a trust region.

    Returns
    -------
    p : ndarray, shape (2,)
        Found solution.
    newton_step : bool
        Whether the returned solution is the Newton step which lies within
        the trust region.
    """
    try:
        R, lower = cho_factor(B)
        p = -cho_solve((R, lower), g)
        if np.dot(p, p) <= Delta**2:
            return p, True
    except LinAlgError:
        pass

    a = B[0, 0] * Delta**2
    b = B[0, 1] * Delta**2
    c = B[1, 1] * Delta**2

    d = g[0] * Delta
    f = g[1] * Delta

    coeffs = np.array(
        [-b + d, 2 * (a - c + f), 6 * b, 2 * (-a + c + f), -b - d])
    t = np.roots(coeffs)  # Can handle leading zeros.
    t = np.real(t[np.isreal(t)])

    p = Delta * np.vstack((2 * t / (1 + t**2), (1 - t**2) / (1 + t**2)))
    value = 0.5 * np.sum(p * B.dot(p), axis=0) + np.dot(g, p)
    i = np.argmin(value)
    p = p[:, i]

    return p, False


def update_tr_radius(Delta, actual_reduction, predicted_reduction,
                     step_norm, bound_hit):
    """Update the radius of a trust region based on the cost reduction.

    Returns
    -------
    Delta : float
        New radius.
    ratio : float
        Ratio between actual and predicted reductions.
    """
    if predicted_reduction > 0:
        ratio = actual_reduction / predicted_reduction
    elif predicted_reduction == actual_reduction == 0:
        ratio = 1
    else:
        ratio = 0

    if ratio < 0.25:
        Delta = 0.25 * step_norm
    elif ratio > 0.75 and bound_hit:
        Delta *= 2.0

    return Delta, ratio
|
||||
|
||||
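The three outcomes of `update_tr_radius` (shrink, grow, keep) are easiest to see on concrete numbers. The snippet below restates the same logic under the hypothetical name `update_radius` purely so that it runs on its own; the input values are made up for illustration.

```python
def update_radius(delta, actual, predicted, step_norm, bound_hit):
    """Standalone restatement of the radius update rule shown above."""
    if predicted > 0:
        ratio = actual / predicted
    elif predicted == actual == 0:
        ratio = 1
    else:
        ratio = 0
    if ratio < 0.25:
        # Model was a poor predictor: shrink relative to the step taken.
        delta = 0.25 * step_norm
    elif ratio > 0.75 and bound_hit:
        # Model was accurate and the step hit the boundary: expand.
        delta *= 2.0
    return delta, ratio


print(update_radius(1.0, 0.1, 1.0, 0.8, False))  # radius shrinks
print(update_radius(1.0, 0.9, 1.0, 1.0, True))   # radius doubles
print(update_radius(1.0, 0.5, 1.0, 0.6, False))  # radius unchanged
```

Note that the shrink case uses the length of the step actually taken, not the previous radius, so a short rejected step contracts the region more aggressively.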
# Construction and minimization of quadratic functions.


def build_quadratic_1d(J, g, s, diag=None, s0=None):
    """Parameterize a multivariate quadratic function along a line.

    The resulting univariate quadratic function is given as follows::

        f(t) = 0.5 * (s0 + s*t).T * (J.T*J + diag) * (s0 + s*t) +
               g.T * (s0 + s*t)

    Parameters
    ----------
    J : ndarray, sparse matrix or LinearOperator, shape (m, n)
        Jacobian matrix, affects the quadratic term.
    g : ndarray, shape (n,)
        Gradient, defines the linear term.
    s : ndarray, shape (n,)
        Direction vector of a line.
    diag : None or ndarray with shape (n,), optional
        Additional diagonal part, affects the quadratic term.
        If None, assumed to be 0.
    s0 : None or ndarray with shape (n,), optional
        Initial point. If None, assumed to be 0.

    Returns
    -------
    a : float
        Coefficient for t**2.
    b : float
        Coefficient for t.
    c : float
        Free term. Returned only if `s0` is provided.
    """
    v = J.dot(s)
    a = np.dot(v, v)
    if diag is not None:
        a += np.dot(s * diag, s)
    a *= 0.5

    b = np.dot(g, s)

    if s0 is not None:
        u = J.dot(s0)
        b += np.dot(u, v)
        c = 0.5 * np.dot(u, u) + np.dot(g, s0)
        if diag is not None:
            b += np.dot(s0 * diag, s)
            c += 0.5 * np.dot(s0 * diag, s0)
        return a, b, c
    else:
        return a, b


def minimize_quadratic_1d(a, b, lb, ub, c=0):
    """Minimize a 1-D quadratic function subject to bounds.

    The free term `c` is 0 by default. Bounds must be finite.

    Returns
    -------
    t : float
        Minimum point.
    y : float
        Minimum value.
    """
    t = [lb, ub]
    if a != 0:
        extremum = -0.5 * b / a
        if lb < extremum < ub:
            t.append(extremum)
    t = np.asarray(t)
    y = t * (a * t + b) + c
    min_index = np.argmin(y)
    return t[min_index], y[min_index]
|
||||
|
||||
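`minimize_quadratic_1d` compares the two endpoints with the interior stationary point, if one exists. A pure-Python sketch of the same candidate comparison follows; `minimize_quadratic_on_interval` is a hypothetical name and the inputs are illustrative.

```python
def minimize_quadratic_on_interval(a, b, lb, ub, c=0.0):
    """Minimize a*t**2 + b*t + c over the finite interval [lb, ub]."""
    candidates = [lb, ub]
    if a != 0:
        extremum = -0.5 * b / a
        if lb < extremum < ub:
            candidates.append(extremum)
    values = [t * (a * t + b) + c for t in candidates]
    # Pick the candidate with the smallest function value.
    best = min(range(len(candidates)), key=values.__getitem__)
    return candidates[best], values[best]


print(minimize_quadratic_on_interval(1.0, -2.0, 0.0, 3.0))   # (1.0, -1.0)
print(minimize_quadratic_on_interval(1.0, -2.0, 2.0, 3.0))   # (2.0, 0.0)
print(minimize_quadratic_on_interval(-1.0, 0.0, -1.0, 2.0))  # (2.0, -4.0)
```

The third call is concave, so the stationary point at `t = 0` is a maximum. It is still added as a candidate, but the value comparison discards it in favour of an endpoint, which is why no sign check on `a` is needed.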
def evaluate_quadratic(J, g, s, diag=None):
    """Compute values of a quadratic function arising in least squares.

    The function is 0.5 * s.T * (J.T * J + diag) * s + g.T * s.

    Parameters
    ----------
    J : ndarray, sparse matrix or LinearOperator, shape (m, n)
        Jacobian matrix, affects the quadratic term.
    g : ndarray, shape (n,)
        Gradient, defines the linear term.
    s : ndarray, shape (k, n) or (n,)
        Array containing steps as rows.
    diag : ndarray, shape (n,), optional
        Additional diagonal part, affects the quadratic term.
        If None, assumed to be 0.

    Returns
    -------
    values : ndarray with shape (k,) or float
        Values of the function. If `s` was 2-D, then ndarray is
        returned, otherwise, float is returned.
    """
    if s.ndim == 1:
        Js = J.dot(s)
        q = np.dot(Js, Js)
        if diag is not None:
            q += np.dot(s * diag, s)
    else:
        Js = J.dot(s.T)
        q = np.sum(Js**2, axis=0)
        if diag is not None:
            q += np.sum(diag * s**2, axis=1)

    l = np.dot(s, g)

    return 0.5 * q + l


# Utility functions to work with bound constraints.


def in_bounds(x, lb, ub):
    """Check if a point lies within bounds."""
    return np.all((x >= lb) & (x <= ub))


def step_size_to_bound(x, s, lb, ub):
    """Compute the minimum step size required to reach a bound.

    The function computes a positive scalar t, such that x + s * t is on
    the bound.

    Returns
    -------
    step : float
        Computed step. Non-negative value.
    hits : ndarray of int with shape of x
        Each element indicates whether a corresponding variable reaches the
        bound:

             *  0 - the bound was not hit.
             * -1 - the lower bound was hit.
             *  1 - the upper bound was hit.
    """
    non_zero = np.nonzero(s)
    s_non_zero = s[non_zero]
    steps = np.empty_like(x)
    steps.fill(np.inf)
    with np.errstate(over='ignore'):
        steps[non_zero] = np.maximum((lb - x)[non_zero] / s_non_zero,
                                     (ub - x)[non_zero] / s_non_zero)
    min_step = np.min(steps)
    return min_step, np.equal(steps, min_step) * np.sign(s).astype(int)
|
||||
|
||||
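The ratio test in `step_size_to_bound` can be traced by hand on two variables. This is a pure-Python sketch under the assumption that `x` starts inside the bounds; `step_to_bound` and `_sign` are hypothetical names used only here.

```python
import math


def _sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def step_to_bound(x, s, lb, ub):
    """Smallest t >= 0 with x + s*t on a bound, and which bounds are hit."""
    steps = []
    for xi, si, lo, hi in zip(x, s, lb, ub):
        if si == 0:
            # This coordinate does not move, so it never reaches a bound.
            steps.append(math.inf)
        else:
            # For a feasible x one ratio is <= 0 (the bound behind the
            # step) and the other is the distance to the bound ahead.
            steps.append(max((lo - xi) / si, (hi - xi) / si))
    min_step = min(steps)
    hits = [_sign(si) if st == min_step else 0 for st, si in zip(steps, s)]
    return min_step, hits


print(step_to_bound([0.5, 0.5], [1.0, -0.5], [0.0, 0.0], [1.0, 1.0]))
# (0.5, [1, 0])
```

The first variable reaches its upper bound after a step of 0.5, while the second would need a step of 1.0 to reach its lower bound, so only the first is reported as a hit.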
def find_active_constraints(x, lb, ub, rtol=1e-10):
    """Determine which constraints are active at a given point.

    The threshold is computed using `rtol` and the absolute value of the
    closest bound.

    Returns
    -------
    active : ndarray of int with shape of x
        Each component shows whether the corresponding constraint is active:

             *  0 - a constraint is not active.
             * -1 - a lower bound is active.
             *  1 - an upper bound is active.
    """
    active = np.zeros_like(x, dtype=int)

    if rtol == 0:
        active[x <= lb] = -1
        active[x >= ub] = 1
        return active

    lower_dist = x - lb
    upper_dist = ub - x

    lower_threshold = rtol * np.maximum(1, np.abs(lb))
    upper_threshold = rtol * np.maximum(1, np.abs(ub))

    lower_active = (np.isfinite(lb) &
                    (lower_dist <= np.minimum(upper_dist, lower_threshold)))
    active[lower_active] = -1

    upper_active = (np.isfinite(ub) &
                    (upper_dist <= np.minimum(lower_dist, upper_threshold)))
    active[upper_active] = 1

    return active


def make_strictly_feasible(x, lb, ub, rstep=1e-10):
|
||||
"""Shift a point to the interior of a feasible region.
|
||||
|
||||
Each element of the returned vector is at least at a relative distance
|
||||
`rstep` from the closest bound. If ``rstep=0`` then `np.nextafter` is used.
|
||||
"""
|
||||
x_new = x.copy()
|
||||
|
||||
active = find_active_constraints(x, lb, ub, rstep)
|
||||
lower_mask = np.equal(active, -1)
|
||||
upper_mask = np.equal(active, 1)
|
||||
|
||||
if rstep == 0:
|
||||
x_new[lower_mask] = np.nextafter(lb[lower_mask], ub[lower_mask])
|
||||
x_new[upper_mask] = np.nextafter(ub[upper_mask], lb[upper_mask])
|
||||
else:
|
||||
x_new[lower_mask] = (lb[lower_mask] +
|
||||
rstep * np.maximum(1, np.abs(lb[lower_mask])))
|
||||
x_new[upper_mask] = (ub[upper_mask] -
|
||||
rstep * np.maximum(1, np.abs(ub[upper_mask])))
|
||||
|
||||
tight_bounds = (x_new < lb) | (x_new > ub)
|
||||
x_new[tight_bounds] = 0.5 * (lb[tight_bounds] + ub[tight_bounds])
|
||||
|
||||
return x_new
|
||||
|
||||
|
||||
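To make the relative threshold and the "tight bounds" fallback concrete, here is a scalar sketch of the same idea. The function name `shift_inside` is hypothetical, only the `rstep > 0` branch is covered, and the upper bound is tested first so that it wins ties, as it does when the masks above are applied in order.

```python
import math


def shift_inside(x, lb, ub, rstep=1e-10):
    """Move a scalar x off a nearby bound by a relative distance rstep."""
    lower_thr = rstep * max(1.0, abs(lb))
    upper_thr = rstep * max(1.0, abs(ub))
    lower_dist = x - lb
    upper_dist = ub - x

    if math.isfinite(ub) and upper_dist <= min(lower_dist, upper_thr):
        x_new = ub - upper_thr
    elif math.isfinite(lb) and lower_dist <= min(upper_dist, lower_thr):
        x_new = lb + lower_thr
    else:
        x_new = x

    # If the bounds are closer together than the shift, fall back to the
    # midpoint of the interval.
    if x_new < lb or x_new > ub:
        x_new = 0.5 * (lb + ub)
    return x_new


print(shift_inside(0.0, 0.0, 1.0))   # moved slightly above the lower bound
print(shift_inside(0.5, 0.0, 1.0))   # interior point is left unchanged
```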
def CL_scaling_vector(x, g, lb, ub):
    """Compute Coleman-Li scaling vector and its derivatives.

    Components of a vector v are defined as follows::

               | ub[i] - x[i], if g[i] < 0 and ub[i] < np.inf
        v[i] = | x[i] - lb[i], if g[i] > 0 and lb[i] > -np.inf
               | 1,           otherwise

    According to this definition v[i] >= 0 for all i. It differs from the
    definition in paper [1]_ (eq. (2.2)), where the absolute value of v is
    used. Both definitions are equivalent down the line.
    Derivatives of v with respect to x take value 1, -1 or 0 depending on the
    case.

    Returns
    -------
    v : ndarray with shape of x
        Scaling vector.
    dv : ndarray with shape of x
        Derivatives of v[i] with respect to x[i], diagonal elements of v's
        Jacobian.

    References
    ----------
    .. [1] M.A. Branch, T.F. Coleman, and Y. Li, "A Subspace, Interior,
           and Conjugate Gradient Method for Large-Scale Bound-Constrained
           Minimization Problems," SIAM Journal on Scientific Computing,
           Vol. 21, Number 1, pp 1-23, 1999.
    """
    v = np.ones_like(x)
    dv = np.zeros_like(x)

    mask = (g < 0) & np.isfinite(ub)
    v[mask] = ub[mask] - x[mask]
    dv[mask] = -1

    mask = (g > 0) & np.isfinite(lb)
    v[mask] = x[mask] - lb[mask]
    dv[mask] = 1

    return v, dv
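The three cases of the Coleman-Li definition can be traced with a plain-Python sketch. The helper name `cl_scaling` and the list-based interface are assumptions made for illustration only.

```python
import math


def cl_scaling(x, g, lb, ub):
    """Return (v, dv) following the three-case definition of v[i]."""
    v, dv = [], []
    for xi, gi, lo, hi in zip(x, g, lb, ub):
        if gi < 0 and math.isfinite(hi):
            # Descent direction points towards a finite upper bound.
            v.append(hi - xi)
            dv.append(-1.0)
        elif gi > 0 and math.isfinite(lo):
            # Descent direction points towards a finite lower bound.
            v.append(xi - lo)
            dv.append(1.0)
        else:
            # No finite bound in the descent direction (or zero gradient).
            v.append(1.0)
            dv.append(0.0)
    return v, dv


v, dv = cl_scaling([0.25, 0.5, 3.0], [-1.0, 2.0, 1.0],
                   [0.0, 0.0, -math.inf], [1.0, 1.0, math.inf])
print(v, dv)  # prints: [0.75, 0.5, 1.0] [-1.0, 1.0, 0.0]
```

Each v[i] is the distance to the bound that the anti-gradient is heading for, which is why it shrinks to zero as a variable approaches that bound.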
def reflective_transformation(y, lb, ub):
    """Compute reflective transformation and its gradient."""
    if in_bounds(y, lb, ub):
        return y, np.ones_like(y)

    lb_finite = np.isfinite(lb)
    ub_finite = np.isfinite(ub)

    x = y.copy()
    g_negative = np.zeros_like(y, dtype=bool)

    mask = lb_finite & ~ub_finite
    x[mask] = np.maximum(y[mask], 2 * lb[mask] - y[mask])
    g_negative[mask] = y[mask] < lb[mask]

    mask = ~lb_finite & ub_finite
    x[mask] = np.minimum(y[mask], 2 * ub[mask] - y[mask])
    g_negative[mask] = y[mask] > ub[mask]

    mask = lb_finite & ub_finite
    d = ub - lb
    t = np.remainder(y[mask] - lb[mask], 2 * d[mask])
    x[mask] = lb[mask] + np.minimum(t, 2 * d[mask] - t)
    g_negative[mask] = t > d[mask]

    g = np.ones_like(y)
    g[g_negative] = -1

    return x, g
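For a single variable with two finite bounds, the reflection amounts to folding the real line back and forth across the interval. The scalar sketch below uses a hypothetical helper `reflect`. Python's float `%` returns a result with the sign of the divisor, which is the behaviour this sketch relies on.

```python
def reflect(y, lb, ub):
    """Fold y into [lb, ub]; return the folded point and the slope (+1 or -1)."""
    d = ub - lb
    t = (y - lb) % (2 * d)        # position inside one period of length 2 * d
    x = lb + min(t, 2 * d - t)    # second half of the period runs backwards
    slope = -1.0 if t > d else 1.0
    return x, slope


print(reflect(1.25, 0.0, 1.0))   # prints: (0.75, -1.0)
print(reflect(-0.5, 0.0, 1.0))   # prints: (0.5, -1.0)
print(reflect(2.25, 0.0, 1.0))   # prints: (0.25, 1.0)
```

A point already inside the interval maps to itself with slope 1, which matches the early return in the function above.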
# Functions to display algorithm's progress.


def print_header_nonlinear():
    print("{:^15}{:^15}{:^15}{:^15}{:^15}{:^15}"
          .format("Iteration", "Total nfev", "Cost", "Cost reduction",
                  "Step norm", "Optimality"))


def print_iteration_nonlinear(iteration, nfev, cost, cost_reduction,
                              step_norm, optimality):
    if cost_reduction is None:
        cost_reduction = " " * 15
    else:
        cost_reduction = f"{cost_reduction:^15.2e}"

    if step_norm is None:
        step_norm = " " * 15
    else:
        step_norm = f"{step_norm:^15.2e}"

    print("{:^15}{:^15}{:^15.4e}{}{}{:^15.2e}"
          .format(iteration, nfev, cost, cost_reduction,
                  step_norm, optimality))


def print_header_linear():
    print("{:^15}{:^15}{:^15}{:^15}{:^15}"
          .format("Iteration", "Cost", "Cost reduction", "Step norm",
                  "Optimality"))


def print_iteration_linear(iteration, cost, cost_reduction, step_norm,
                           optimality):
    if cost_reduction is None:
        cost_reduction = " " * 15
    else:
        cost_reduction = f"{cost_reduction:^15.2e}"

    if step_norm is None:
        step_norm = " " * 15
    else:
        step_norm = f"{step_norm:^15.2e}"

    print(f"{iteration:^15}{cost:^15.4e}{cost_reduction}{step_norm}{optimality:^15.2e}")


# Simple helper functions.


def compute_grad(J, f):
    """Compute gradient of the least-squares cost function."""
    if isinstance(J, LinearOperator):
        return J.rmatvec(f)
    else:
        return J.T.dot(f)


def compute_jac_scale(J, scale_inv_old=None):
    """Compute variables scale based on the Jacobian matrix."""
    if issparse(J):
        scale_inv = np.asarray(J.power(2).sum(axis=0)).ravel()**0.5
    else:
        scale_inv = np.sum(J**2, axis=0)**0.5

    if scale_inv_old is None:
        scale_inv[scale_inv == 0] = 1
    else:
        scale_inv = np.maximum(scale_inv, scale_inv_old)

    return 1 / scale_inv, scale_inv
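The scaling rule in `compute_jac_scale` is the Euclidean norm of each Jacobian column, kept from shrinking across iterations. A dependency-free sketch for a dense Jacobian stored as a list of rows follows. The name `jac_scale` and the nested-list layout are assumptions for illustration.

```python
import math


def jac_scale(J, scale_inv_old=None):
    """Return (scale, scale_inv) from the column norms of a dense J."""
    n = len(J[0])
    scale_inv = [math.sqrt(sum(row[j] ** 2 for row in J)) for j in range(n)]
    if scale_inv_old is None:
        # First call: a zero column would give an infinite scale, so use 1.
        scale_inv = [s if s != 0 else 1.0 for s in scale_inv]
    else:
        # Later calls: never let the inverse scale decrease.
        scale_inv = [max(s, old) for s, old in zip(scale_inv, scale_inv_old)]
    return [1 / s for s in scale_inv], scale_inv


scale, scale_inv = jac_scale([[3.0, 0.0], [4.0, 0.0]])
print(scale, scale_inv)  # prints: [0.2, 1.0] [5.0, 1.0]

scale2, scale_inv2 = jac_scale([[1.0, 0.0], [0.0, 2.0]], scale_inv)
print(scale_inv2)        # prints: [5.0, 2.0]
```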
def left_multiplied_operator(J, d):
    """Return diag(d) J as LinearOperator."""
    J = aslinearoperator(J)

    def matvec(x):
        return d * J.matvec(x)

    def matmat(X):
        return d[:, np.newaxis] * J.matmat(X)

    def rmatvec(x):
        return J.rmatvec(x.ravel() * d)

    return LinearOperator(J.shape, matvec=matvec, matmat=matmat,
                          rmatvec=rmatvec)


def right_multiplied_operator(J, d):
    """Return J diag(d) as LinearOperator."""
    J = aslinearoperator(J)

    def matvec(x):
        return J.matvec(np.ravel(x) * d)

    def matmat(X):
        return J.matmat(X * d[:, np.newaxis])

    def rmatvec(x):
        return d * J.rmatvec(x)

    return LinearOperator(J.shape, matvec=matvec, matmat=matmat,
                          rmatvec=rmatvec)


def regularized_lsq_operator(J, diag):
    """Return a matrix arising in regularized least squares as LinearOperator.

    The matrix is
        [ J ]
        [ D ]
    where D is a diagonal matrix with elements from `diag`.
    """
    J = aslinearoperator(J)
    m, n = J.shape

    def matvec(x):
        return np.hstack((J.matvec(x), diag * x))

    def rmatvec(x):
        x1 = x[:m]
        x2 = x[m:]
        return J.rmatvec(x1) + diag * x2

    return LinearOperator((m + n, n), matvec=matvec, rmatvec=rmatvec)


def right_multiply(J, d, copy=True):
    """Compute J diag(d).

    If `copy` is False, `J` is modified in place (unless it is a
    LinearOperator).
    """
    if copy and not isinstance(J, LinearOperator):
        J = J.copy()

    if issparse(J):
        J.data *= d.take(J.indices, mode='clip')  # scikit-learn recipe.
    elif isinstance(J, LinearOperator):
        J = right_multiplied_operator(J, d)
    else:
        J *= d

    return J


def left_multiply(J, d, copy=True):
    """Compute diag(d) J.

    If `copy` is False, `J` is modified in place (unless it is a
    LinearOperator).
    """
    if copy and not isinstance(J, LinearOperator):
        J = J.copy()

    if issparse(J):
        J.data *= np.repeat(d, np.diff(J.indptr))  # scikit-learn recipe.
    elif isinstance(J, LinearOperator):
        J = left_multiplied_operator(J, d)
    else:
        J *= d[:, np.newaxis]

    return J


def check_termination(dF, F, dx_norm, x_norm, ratio, ftol, xtol):
    """Check termination condition for nonlinear least squares."""
    ftol_satisfied = dF < ftol * F and ratio > 0.25
    xtol_satisfied = dx_norm < xtol * (xtol + x_norm)

    if ftol_satisfied and xtol_satisfied:
        return 4
    elif ftol_satisfied:
        return 2
    elif xtol_satisfied:
        return 3
    else:
        return None
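The status codes returned by `check_termination` are easiest to read against concrete numbers. The snippet below restates the two tests in a standalone function (the name `termination_status` and the default tolerances are chosen here for illustration) and evaluates a few hand-picked cases.

```python
def termination_status(dF, F, dx_norm, x_norm, ratio, ftol=1e-8, xtol=1e-8):
    """Return 2 (ftol), 3 (xtol), 4 (both) or None (keep iterating)."""
    # The cost test only counts when the quadratic model agreed well
    # enough with the actual reduction (ratio > 0.25).
    ftol_ok = dF < ftol * F and ratio > 0.25
    xtol_ok = dx_norm < xtol * (xtol + x_norm)
    if ftol_ok and xtol_ok:
        return 4
    if ftol_ok:
        return 2
    if xtol_ok:
        return 3
    return None


print(termination_status(1e-12, 1.0, 1e-3, 1.0, ratio=0.9))   # prints: 2
print(termination_status(1e-3, 1.0, 1e-12, 1.0, ratio=0.9))   # prints: 3
print(termination_status(1e-12, 1.0, 1e-12, 1.0, ratio=0.9))  # prints: 4
print(termination_status(1e-12, 1.0, 1e-3, 1.0, ratio=0.1))   # prints: None
```

The last case shows that a tiny cost change is ignored when the step was poorly predicted by the model.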
def scale_for_robust_loss_function(J, f, rho):
    """Scale Jacobian and residuals for a robust loss function.

    Arrays are modified in place.
    """
    J_scale = rho[1] + 2 * rho[2] * f**2
    J_scale[J_scale < EPS] = EPS
    J_scale **= 0.5

    f *= rho[1] / J_scale

    return left_multiply(J, J_scale, copy=False), f
331
venv/lib/python3.12/site-packages/scipy/optimize/_lsq/dogbox.py
Normal file
@ -0,0 +1,331 @@
"""
Dogleg algorithm with rectangular trust regions for least-squares minimization.

The description of the algorithm can be found in [Voglis]_. The algorithm does
trust-region iterations, but the shape of trust regions is rectangular as
opposed to conventional elliptical. The intersection of a trust region and
an initial feasible region is again some rectangle. Thus, on each iteration a
bound-constrained quadratic optimization problem is solved.

A quadratic problem is solved by the well-known dogleg approach, where the
function is minimized along a piecewise-linear "dogleg" path [NumOpt]_,
Chapter 4. If the Jacobian is not rank-deficient then the function is
decreasing along this path, and optimization amounts to simply following
this path as long as a point stays within the bounds. A constrained Cauchy
step (along the anti-gradient) is considered for safety in rank-deficient
cases; in these situations convergence might be slow.

If during iterations some variable hits the initial bound and the component
of the anti-gradient points outside the feasible region, then the next dogleg
step won't make any progress. In this state such variables satisfy first-order
optimality conditions and they are excluded before computing the next dogleg
step.

The Gauss-Newton step can be computed exactly by `numpy.linalg.lstsq` (for
dense Jacobian matrices) or by the iterative procedure
`scipy.sparse.linalg.lsmr` (for dense and sparse matrices, or a Jacobian given
as a LinearOperator). The second option makes it possible to solve very large
problems (up to a couple of million residuals on a regular PC), provided the
Jacobian matrix is sufficiently sparse. Note, however, that dogbox is not very
good at solving problems with a large number of constraints, because variables
are excluded and included on each iteration (the required number of function
evaluations might be high or the accuracy of a solution will be poor), thus
its large-scale usage is probably limited to unconstrained problems.

References
----------
.. [Voglis] C. Voglis and I. E. Lagaris, "A Rectangular Trust Region Dogleg
            Approach for Unconstrained and Bound Constrained Nonlinear
            Optimization", WSEAS International Conference on Applied
            Mathematics, Corfu, Greece, 2004.
.. [NumOpt] J. Nocedal and S. J. Wright, "Numerical optimization, 2nd edition".
"""
import numpy as np
from numpy.linalg import lstsq, norm

from scipy.sparse.linalg import LinearOperator, aslinearoperator, lsmr
from scipy.optimize import OptimizeResult

from .common import (
    step_size_to_bound, in_bounds, update_tr_radius, evaluate_quadratic,
    build_quadratic_1d, minimize_quadratic_1d, compute_grad,
    compute_jac_scale, check_termination, scale_for_robust_loss_function,
    print_header_nonlinear, print_iteration_nonlinear)


def lsmr_operator(Jop, d, active_set):
    """Compute LinearOperator to use in LSMR by dogbox algorithm.

    `active_set` mask is used to exclude active variables from computations
    of matrix-vector products.
    """
    m, n = Jop.shape

    def matvec(x):
        x_free = x.ravel().copy()
        x_free[active_set] = 0
        # Use the masked copy so that active variables are excluded,
        # which keeps matvec consistent with rmatvec below.
        return Jop.matvec(x_free * d)

    def rmatvec(x):
        r = d * Jop.rmatvec(x)
        r[active_set] = 0
        return r

    return LinearOperator((m, n), matvec=matvec, rmatvec=rmatvec, dtype=float)


def find_intersection(x, tr_bounds, lb, ub):
    """Find intersection of trust-region bounds and initial bounds.

    Returns
    -------
    lb_total, ub_total : ndarray with shape of x
        Lower and upper bounds of the intersection region.
    orig_l, orig_u : ndarray of bool with shape of x
        True means that an original bound is taken as a corresponding bound
        in the intersection region.
    tr_l, tr_u : ndarray of bool with shape of x
        True means that a trust-region bound is taken as a corresponding bound
        in the intersection region.
    """
    lb_centered = lb - x
    ub_centered = ub - x

    lb_total = np.maximum(lb_centered, -tr_bounds)
    ub_total = np.minimum(ub_centered, tr_bounds)

    orig_l = np.equal(lb_total, lb_centered)
    orig_u = np.equal(ub_total, ub_centered)

    tr_l = np.equal(lb_total, -tr_bounds)
    tr_u = np.equal(ub_total, tr_bounds)

    return lb_total, ub_total, orig_l, orig_u, tr_l, tr_u
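The rectangle intersection in `find_intersection` works in coordinates centered at the current point. A list-based sketch of the same bookkeeping is given below. The helper name `intersect_box` is hypothetical, and only the bounds and the "which bound won" flags for the lower side are returned to keep it short.

```python
import math


def intersect_box(x, tr_bounds, lb, ub):
    """Intersect the box [lb, ub] with the trust region, centered at x."""
    lb_centered = [lo - xi for lo, xi in zip(lb, x)]
    ub_centered = [hi - xi for hi, xi in zip(ub, x)]
    lb_total = [max(lo, -t) for lo, t in zip(lb_centered, tr_bounds)]
    ub_total = [min(hi, t) for hi, t in zip(ub_centered, tr_bounds)]
    # True where the original (problem) lower bound is the binding one.
    orig_l = [a == b for a, b in zip(lb_total, lb_centered)]
    # True where the trust-region lower bound is the binding one.
    tr_l = [a == -t for a, t in zip(lb_total, tr_bounds)]
    return lb_total, ub_total, orig_l, tr_l


lb_total, ub_total, orig_l, tr_l = intersect_box(
    [0.5, 0.0], [1.0, 1.0], [0.0, -math.inf], [1.0, math.inf])
print(lb_total, ub_total)  # prints: [-0.5, -1.0] [0.5, 1.0]
print(orig_l, tr_l)        # prints: [True, False] [False, True]
```

The first variable is limited by its own bounds, while the unbounded second variable is limited only by the trust region.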
def dogleg_step(x, newton_step, g, a, b, tr_bounds, lb, ub):
    """Find dogleg step in a rectangular region.

    Returns
    -------
    step : ndarray, shape (n,)
        Computed dogleg step.
    bound_hits : ndarray of int, shape (n,)
        Each component shows whether a corresponding variable hits the
        initial bound after the step is taken:
            *  0 - a variable doesn't hit the bound.
            * -1 - lower bound is hit.
            *  1 - upper bound is hit.
    tr_hit : bool
        Whether the step hit the boundary of the trust-region.
    """
    lb_total, ub_total, orig_l, orig_u, tr_l, tr_u = find_intersection(
        x, tr_bounds, lb, ub
    )
    bound_hits = np.zeros_like(x, dtype=int)

    if in_bounds(newton_step, lb_total, ub_total):
        return newton_step, bound_hits, False

    to_bounds, _ = step_size_to_bound(np.zeros_like(x), -g, lb_total, ub_total)

    # The classical dogleg algorithm would check if the Cauchy step fits into
    # the bounds, and just return its constrained version if not. But in a
    # rectangular trust region it makes sense to try to improve the
    # constrained Cauchy step too. Thus, we don't distinguish these two cases.

    cauchy_step = -minimize_quadratic_1d(a, b, 0, to_bounds)[0] * g

    step_diff = newton_step - cauchy_step
    step_size, hits = step_size_to_bound(cauchy_step, step_diff,
                                         lb_total, ub_total)
    bound_hits[(hits < 0) & orig_l] = -1
    bound_hits[(hits > 0) & orig_u] = 1
    tr_hit = np.any((hits < 0) & tr_l | (hits > 0) & tr_u)

    return cauchy_step + step_size * step_diff, bound_hits, tr_hit


def dogbox(fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale,
           loss_function, tr_solver, tr_options, verbose):
    f = f0
    f_true = f.copy()
    nfev = 1

    J = J0
    njev = 1

    if loss_function is not None:
        rho = loss_function(f)
        cost = 0.5 * np.sum(rho[0])
        J, f = scale_for_robust_loss_function(J, f, rho)
    else:
        cost = 0.5 * np.dot(f, f)

    g = compute_grad(J, f)

    jac_scale = isinstance(x_scale, str) and x_scale == 'jac'
    if jac_scale:
        scale, scale_inv = compute_jac_scale(J)
    else:
        scale, scale_inv = x_scale, 1 / x_scale

    Delta = norm(x0 * scale_inv, ord=np.inf)
    if Delta == 0:
        Delta = 1.0

    on_bound = np.zeros_like(x0, dtype=int)
    on_bound[np.equal(x0, lb)] = -1
    on_bound[np.equal(x0, ub)] = 1

    x = x0
    step = np.empty_like(x0)

    if max_nfev is None:
        max_nfev = x0.size * 100

    termination_status = None
    iteration = 0
    step_norm = None
    actual_reduction = None

    if verbose == 2:
        print_header_nonlinear()

    while True:
        active_set = on_bound * g < 0
        free_set = ~active_set

        g_free = g[free_set]
        g_full = g.copy()
        g[active_set] = 0

        g_norm = norm(g, ord=np.inf)
        if g_norm < gtol:
            termination_status = 1

        if verbose == 2:
            print_iteration_nonlinear(iteration, nfev, cost, actual_reduction,
                                      step_norm, g_norm)

        if termination_status is not None or nfev == max_nfev:
            break

        x_free = x[free_set]
        lb_free = lb[free_set]
        ub_free = ub[free_set]
        scale_free = scale[free_set]

        # Compute (Gauss-)Newton and build quadratic model for Cauchy step.
        if tr_solver == 'exact':
            J_free = J[:, free_set]
            newton_step = lstsq(J_free, -f, rcond=-1)[0]

            # Coefficients for the quadratic model along the anti-gradient.
            a, b = build_quadratic_1d(J_free, g_free, -g_free)
        elif tr_solver == 'lsmr':
            Jop = aslinearoperator(J)

            # We compute the lsmr step in scaled variables and then
            # transform back to normal variables. If lsmr gave the exact lsq
            # solution, this would be equivalent to not doing any
            # transformations, but from experience it's better this way.

            # We pass active_set to make computations as if we selected
            # the free subset of J columns, but without actually doing any
            # slicing, which is expensive for sparse matrices and impossible
            # for LinearOperator.

            lsmr_op = lsmr_operator(Jop, scale, active_set)
            newton_step = -lsmr(lsmr_op, f, **tr_options)[0][free_set]
            newton_step *= scale_free

            # Components of g for active variables were zeroed, so this call
            # is correct and equivalent to using J_free and g_free.
            a, b = build_quadratic_1d(Jop, g, -g)

        actual_reduction = -1.0
        while actual_reduction <= 0 and nfev < max_nfev:
            tr_bounds = Delta * scale_free

            step_free, on_bound_free, tr_hit = dogleg_step(
                x_free, newton_step, g_free, a, b, tr_bounds, lb_free, ub_free)

            step.fill(0.0)
            step[free_set] = step_free

            if tr_solver == 'exact':
                predicted_reduction = -evaluate_quadratic(J_free, g_free,
                                                          step_free)
            elif tr_solver == 'lsmr':
                predicted_reduction = -evaluate_quadratic(Jop, g, step)

            # gh11403 ensure that solution is fully within bounds.
            x_new = np.clip(x + step, lb, ub)

            f_new = fun(x_new)
            nfev += 1

            step_h_norm = norm(step * scale_inv, ord=np.inf)

            if not np.all(np.isfinite(f_new)):
                Delta = 0.25 * step_h_norm
                continue

            # Usual trust-region step quality estimation.
            if loss_function is not None:
                cost_new = loss_function(f_new, cost_only=True)
            else:
                cost_new = 0.5 * np.dot(f_new, f_new)
            actual_reduction = cost - cost_new

            Delta, ratio = update_tr_radius(
                Delta, actual_reduction, predicted_reduction,
                step_h_norm, tr_hit
            )

            step_norm = norm(step)
            termination_status = check_termination(
                actual_reduction, cost, step_norm, norm(x), ratio, ftol, xtol)

            if termination_status is not None:
                break

        if actual_reduction > 0:
            on_bound[free_set] = on_bound_free

            x = x_new
            # Set variables exactly at the boundary.
            mask = on_bound == -1
            x[mask] = lb[mask]
            mask = on_bound == 1
            x[mask] = ub[mask]

            f = f_new
            f_true = f.copy()

            cost = cost_new

            J = jac(x, f)
            njev += 1

            if loss_function is not None:
                rho = loss_function(f)
                J, f = scale_for_robust_loss_function(J, f, rho)

            g = compute_grad(J, f)

            if jac_scale:
                scale, scale_inv = compute_jac_scale(J, scale_inv)
        else:
            step_norm = 0
            actual_reduction = 0

        iteration += 1

    if termination_status is None:
        termination_status = 0

    return OptimizeResult(
        x=x, cost=cost, fun=f_true, jac=J, grad=g_full, optimality=g_norm,
        active_mask=on_bound, nfev=nfev, njev=njev, status=termination_status)
Binary file not shown.
@ -0,0 +1,967 @@
"""Generic interface for least-squares minimization."""
from warnings import warn

import numpy as np
from numpy.linalg import norm

from scipy.sparse import issparse
from scipy.sparse.linalg import LinearOperator
from scipy.optimize import _minpack, OptimizeResult
from scipy.optimize._numdiff import approx_derivative, group_columns
from scipy.optimize._minimize import Bounds

from .trf import trf
from .dogbox import dogbox
from .common import EPS, in_bounds, make_strictly_feasible


TERMINATION_MESSAGES = {
    -1: "Improper input parameters status returned from `leastsq`",
    0: "The maximum number of function evaluations is exceeded.",
    1: "`gtol` termination condition is satisfied.",
    2: "`ftol` termination condition is satisfied.",
    3: "`xtol` termination condition is satisfied.",
    4: "Both `ftol` and `xtol` termination conditions are satisfied."
}


FROM_MINPACK_TO_COMMON = {
    0: -1,  # Improper input parameters from MINPACK.
    1: 2,
    2: 3,
    3: 4,
    4: 1,
    5: 0
    # There are 6, 7, 8 for too small tolerance parameters,
    # but we guard against it by checking ftol, xtol, gtol beforehand.
}


def call_minpack(fun, x0, jac, ftol, xtol, gtol, max_nfev, x_scale, diff_step):
    n = x0.size

    if diff_step is None:
        epsfcn = EPS
    else:
        epsfcn = diff_step**2

    # Compute MINPACK's `diag`, which is inverse of our `x_scale` and
    # ``x_scale='jac'`` corresponds to ``diag=None``.
    if isinstance(x_scale, str) and x_scale == 'jac':
        diag = None
    else:
        diag = 1 / x_scale

    full_output = True
    col_deriv = False
    factor = 100.0

    if jac is None:
        if max_nfev is None:
            # n squared to account for Jacobian evaluations.
            max_nfev = 100 * n * (n + 1)
        x, info, status = _minpack._lmdif(
            fun, x0, (), full_output, ftol, xtol, gtol,
            max_nfev, epsfcn, factor, diag)
    else:
        if max_nfev is None:
            max_nfev = 100 * n
        x, info, status = _minpack._lmder(
            fun, jac, x0, (), full_output, col_deriv,
            ftol, xtol, gtol, max_nfev, factor, diag)

    f = info['fvec']

    if callable(jac):
        J = jac(x)
    else:
        J = np.atleast_2d(approx_derivative(fun, x))

    cost = 0.5 * np.dot(f, f)
    g = J.T.dot(f)
    g_norm = norm(g, ord=np.inf)

    nfev = info['nfev']
    njev = info.get('njev', None)

    status = FROM_MINPACK_TO_COMMON[status]
    active_mask = np.zeros_like(x0, dtype=int)

    return OptimizeResult(
        x=x, cost=cost, fun=f, jac=J, grad=g, optimality=g_norm,
        active_mask=active_mask, nfev=nfev, njev=njev, status=status)


def prepare_bounds(bounds, n):
    lb, ub = (np.asarray(b, dtype=float) for b in bounds)
    if lb.ndim == 0:
        lb = np.resize(lb, n)

    if ub.ndim == 0:
        ub = np.resize(ub, n)

    return lb, ub


def check_tolerance(ftol, xtol, gtol, method):
    def check(tol, name):
        if tol is None:
            tol = 0
        elif tol < EPS:
            warn(f"Setting `{name}` below the machine epsilon ({EPS:.2e}) effectively "
                 f"disables the corresponding termination condition.",
                 stacklevel=3)
        return tol

    ftol = check(ftol, "ftol")
    xtol = check(xtol, "xtol")
    gtol = check(gtol, "gtol")

    if method == "lm" and (ftol < EPS or xtol < EPS or gtol < EPS):
        raise ValueError("All tolerances must be higher than machine epsilon "
                         f"({EPS:.2e}) for method 'lm'.")
    elif ftol < EPS and xtol < EPS and gtol < EPS:
        raise ValueError("At least one of the tolerances must be higher than "
                         f"machine epsilon ({EPS:.2e}).")

    return ftol, xtol, gtol


def check_x_scale(x_scale, x0):
    if isinstance(x_scale, str) and x_scale == 'jac':
        return x_scale

    try:
        x_scale = np.asarray(x_scale, dtype=float)
        valid = np.all(np.isfinite(x_scale)) and np.all(x_scale > 0)
    except (ValueError, TypeError):
        valid = False

    if not valid:
        raise ValueError("`x_scale` must be 'jac' or array_like with "
                         "positive numbers.")

    if x_scale.ndim == 0:
        x_scale = np.resize(x_scale, x0.shape)

    if x_scale.shape != x0.shape:
        raise ValueError("Inconsistent shapes between `x_scale` and `x0`.")

    return x_scale


def check_jac_sparsity(jac_sparsity, m, n):
    if jac_sparsity is None:
        return None

    if not issparse(jac_sparsity):
        jac_sparsity = np.atleast_2d(jac_sparsity)

    if jac_sparsity.shape != (m, n):
        raise ValueError("`jac_sparsity` has wrong shape.")

    return jac_sparsity, group_columns(jac_sparsity)


# Loss functions.


def huber(z, rho, cost_only):
    mask = z <= 1
    rho[0, mask] = z[mask]
    rho[0, ~mask] = 2 * z[~mask]**0.5 - 1
    if cost_only:
        return
    rho[1, mask] = 1
    rho[1, ~mask] = z[~mask]**-0.5
    rho[2, mask] = 0
    rho[2, ~mask] = -0.5 * z[~mask]**-1.5


def soft_l1(z, rho, cost_only):
    t = 1 + z
    rho[0] = 2 * (t**0.5 - 1)
    if cost_only:
        return
    rho[1] = t**-0.5
    rho[2] = -0.5 * t**-1.5


def cauchy(z, rho, cost_only):
    rho[0] = np.log1p(z)
    if cost_only:
        return
    t = 1 + z
    rho[1] = 1 / t
    rho[2] = -1 / t**2


def arctan(z, rho, cost_only):
    rho[0] = np.arctan(z)
    if cost_only:
        return
    t = 1 + z**2
    rho[1] = 1 / t
    rho[2] = -2 * z / t**2


IMPLEMENTED_LOSSES = dict(linear=None, huber=huber, soft_l1=soft_l1,
                          cauchy=cauchy, arctan=arctan)
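Each loss above fills three rows: rho(z), its first derivative and its second derivative. A quick way to convince oneself that the rows are consistent is to compare the stored first derivative with a central finite difference of rho. The scalar re-statements below are illustrative copies of the formulas; the helper `numeric_derivative` and the step `h` are assumptions of this sketch.

```python
import math


def huber(z):
    """Return (rho, rho', rho'') of the Huber loss for a scalar z >= 0."""
    if z <= 1:
        return z, 1.0, 0.0
    return 2 * math.sqrt(z) - 1, z ** -0.5, -0.5 * z ** -1.5


def soft_l1(z):
    """Return (rho, rho', rho'') of the soft-L1 loss for a scalar z >= 0."""
    t = 1 + z
    return 2 * (math.sqrt(t) - 1), t ** -0.5, -0.5 * t ** -1.5


def cauchy(z):
    """Return (rho, rho', rho'') of the Cauchy loss for a scalar z >= 0."""
    t = 1 + z
    return math.log1p(z), 1 / t, -1 / t ** 2


def numeric_derivative(loss, z, h=1e-6):
    """Central finite-difference estimate of rho'(z)."""
    return (loss(z + h)[0] - loss(z - h)[0]) / (2 * h)


for loss in (huber, soft_l1, cauchy):
    analytic = loss(3.0)[1]
    estimate = numeric_derivative(loss, 3.0)
    print(loss.__name__, abs(analytic - estimate) < 1e-6)
```

For inliers (z <= 1) the Huber loss reduces to rho(z) = z, so ordinary least squares is recovered there.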
def construct_loss_function(m, loss, f_scale):
|
||||
if loss == 'linear':
|
||||
return None
|
||||
|
||||
if not callable(loss):
|
||||
loss = IMPLEMENTED_LOSSES[loss]
|
||||
rho = np.empty((3, m))
|
||||
|
||||
def loss_function(f, cost_only=False):
|
||||
z = (f / f_scale) ** 2
|
||||
loss(z, rho, cost_only=cost_only)
|
||||
if cost_only:
|
||||
return 0.5 * f_scale ** 2 * np.sum(rho[0])
|
||||
rho[0] *= f_scale ** 2
|
||||
rho[2] /= f_scale ** 2
|
||||
return rho
|
||||
else:
|
||||
def loss_function(f, cost_only=False):
|
||||
z = (f / f_scale) ** 2
|
||||
rho = loss(z)
|
||||
if cost_only:
|
||||
return 0.5 * f_scale ** 2 * np.sum(rho[0])
|
||||
rho[0] *= f_scale ** 2
|
||||
rho[2] /= f_scale ** 2
|
||||
return rho
|
||||
|
||||
return loss_function
|
||||
|
||||
|
||||
def least_squares(
|
||||
fun, x0, jac='2-point', bounds=(-np.inf, np.inf), method='trf',
|
||||
ftol=1e-8, xtol=1e-8, gtol=1e-8, x_scale=1.0, loss='linear',
|
||||
f_scale=1.0, diff_step=None, tr_solver=None, tr_options={},
|
||||
jac_sparsity=None, max_nfev=None, verbose=0, args=(), kwargs={}):
|
||||
"""Solve a nonlinear least-squares problem with bounds on the variables.
|
||||
|
||||
Given the residuals f(x) (an m-D real function of n real
|
||||
variables) and the loss function rho(s) (a scalar function), `least_squares`
|
||||
finds a local minimum of the cost function F(x)::
|
||||
|
||||
minimize F(x) = 0.5 * sum(rho(f_i(x)**2), i = 0, ..., m - 1)
|
||||
subject to lb <= x <= ub
|
||||
|
||||
The purpose of the loss function rho(s) is to reduce the influence of
|
||||
outliers on the solution.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
Function which computes the vector of residuals, with the signature
|
||||
``fun(x, *args, **kwargs)``, i.e., the minimization proceeds with
|
||||
respect to its first argument. The argument ``x`` passed to this
|
||||
function is an ndarray of shape (n,) (never a scalar, even for n=1).
|
||||
It must allocate and return a 1-D array_like of shape (m,) or a scalar.
|
||||
If the argument ``x`` is complex or the function ``fun`` returns
|
||||
complex residuals, it must be wrapped in a real function of real
|
||||
arguments, as shown at the end of the Examples section.
|
||||
x0 : array_like with shape (n,) or float
|
||||
Initial guess on independent variables. If float, it will be treated
|
||||
as a 1-D array with one element. When `method` is 'trf', the initial
|
||||
guess might be slightly adjusted to lie sufficiently within the given
|
||||
`bounds`.
|
||||
jac : {'2-point', '3-point', 'cs', callable}, optional
|
||||
Method of computing the Jacobian matrix (an m-by-n matrix, where
|
||||
element (i, j) is the partial derivative of f[i] with respect to
|
||||
x[j]). The keywords select a finite difference scheme for numerical
|
||||
estimation. The scheme '3-point' is more accurate, but requires
|
||||
twice as many operations as '2-point' (default). The scheme 'cs'
|
||||
uses complex steps, and while potentially the most accurate, it is
|
||||
applicable only when `fun` correctly handles complex inputs and
|
||||
can be analytically continued to the complex plane. Method 'lm'
|
||||
always uses the '2-point' scheme. If callable, it is used as
|
||||
``jac(x, *args, **kwargs)`` and should return a good approximation
|
||||
(or the exact value) for the Jacobian as an array_like (np.atleast_2d
|
||||
is applied), a sparse matrix (csr_matrix preferred for performance) or
|
||||
a `scipy.sparse.linalg.LinearOperator`.
|
||||
bounds : 2-tuple of array_like or `Bounds`, optional
|
||||
There are two ways to specify bounds:
|
||||
|
||||
1. Instance of `Bounds` class
|
||||
2. Lower and upper bounds on independent variables. Defaults to no
|
||||
bounds. Each array must match the size of `x0` or be a scalar,
|
||||
in the latter case a bound will be the same for all variables.
|
||||
Use ``np.inf`` with an appropriate sign to disable bounds on all
|
||||
or some variables.
|
||||
method : {'trf', 'dogbox', 'lm'}, optional
|
||||
Algorithm to perform minimization.
|
||||
|
||||
* 'trf' : Trust Region Reflective algorithm, particularly suitable
|
||||
for large sparse problems with bounds. Generally robust method.
|
||||
* 'dogbox' : dogleg algorithm with rectangular trust regions,
|
||||
typical use case is small problems with bounds. Not recommended
|
||||
for problems with rank-deficient Jacobian.
|
||||
* 'lm' : Levenberg-Marquardt algorithm as implemented in MINPACK.
|
||||
Doesn't handle bounds or sparse Jacobians. Usually the most
|
||||
efficient method for small unconstrained problems.
|
||||
|
||||
Default is 'trf'. See Notes for more information.
|
||||
ftol : float or None, optional
|
||||
Tolerance for termination by the change of the cost function. Default
|
||||
is 1e-8. The optimization process is stopped when ``dF < ftol * F``,
|
||||
and there was adequate agreement between a local quadratic model and
|
||||
the true model in the last step.
|
||||
|
||||
If None and 'method' is not 'lm', the termination by this condition is
|
||||
disabled. If 'method' is 'lm', this tolerance must be higher than
|
||||
machine epsilon.
|
||||
xtol : float or None, optional
|
||||
Tolerance for termination by the change of the independent variables.
|
||||
Default is 1e-8. The exact condition depends on the `method` used:
|
||||
|
||||
* For 'trf' and 'dogbox' : ``norm(dx) < xtol * (xtol + norm(x))``.
|
||||
* For 'lm' : ``Delta < xtol * norm(xs)``, where ``Delta`` is
|
||||
a trust-region radius and ``xs`` is the value of ``x``
|
||||
scaled according to `x_scale` parameter (see below).
|
||||
|
||||
If None and 'method' is not 'lm', the termination by this condition is
|
||||
disabled. If 'method' is 'lm', this tolerance must be higher than
|
||||
machine epsilon.
|
||||
gtol : float or None, optional
|
||||
Tolerance for termination by the norm of the gradient. Default is 1e-8.
|
||||
The exact condition depends on the `method` used:
|
||||
|
||||
* For 'trf' : ``norm(g_scaled, ord=np.inf) < gtol``, where
|
||||
``g_scaled`` is the value of the gradient scaled to account for
|
||||
the presence of the bounds [STIR]_.
|
||||
* For 'dogbox' : ``norm(g_free, ord=np.inf) < gtol``, where
|
||||
``g_free`` is the gradient with respect to the variables which
|
||||
are not in the optimal state on the boundary.
|
||||
* For 'lm' : the maximum absolute value of the cosine of angles
|
||||
between columns of the Jacobian and the residual vector is less
|
||||
than `gtol`, or the residual vector is zero.
|
||||
|
||||
If None and 'method' is not 'lm', the termination by this condition is
|
||||
disabled. If 'method' is 'lm', this tolerance must be higher than
|
||||
machine epsilon.
|
||||
x_scale : array_like or 'jac', optional
|
||||
Characteristic scale of each variable. Setting `x_scale` is equivalent
|
||||
to reformulating the problem in scaled variables ``xs = x / x_scale``.
|
||||
An alternative view is that the size of a trust region along the j-th
|
||||
dimension is proportional to ``x_scale[j]``. Improved convergence may
|
||||
be achieved by setting `x_scale` such that a step of a given size
|
||||
along any of the scaled variables has a similar effect on the cost
|
||||
function. If set to 'jac', the scale is iteratively updated using the
|
||||
inverse norms of the columns of the Jacobian matrix (as described in
|
||||
[JJMore]_).
|
||||
loss : str or callable, optional
|
||||
Determines the loss function. The following keyword values are allowed:
|
||||
|
||||
* 'linear' (default) : ``rho(z) = z``. Gives a standard
|
||||
least-squares problem.
|
||||
* 'soft_l1' : ``rho(z) = 2 * ((1 + z)**0.5 - 1)``. The smooth
|
||||
approximation of l1 (absolute value) loss. Usually a good
|
||||
choice for robust least squares.
|
||||
* 'huber' : ``rho(z) = z if z <= 1 else 2*z**0.5 - 1``. Works
|
||||
similarly to 'soft_l1'.
|
||||
* 'cauchy' : ``rho(z) = ln(1 + z)``. Severely weakens outliers'
|
||||
influence, but may cause difficulties in the optimization process.
|
||||
* 'arctan' : ``rho(z) = arctan(z)``. Limits the maximum loss on
|
||||
a single residual, has properties similar to 'cauchy'.
|
||||
|
||||
If callable, it must take a 1-D ndarray ``z=f**2`` and return an
|
||||
array_like with shape (3, m) where row 0 contains function values,
|
||||
row 1 contains first derivatives and row 2 contains second
|
||||
derivatives. Method 'lm' supports only 'linear' loss.
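A minimal sketch of the callable form, assuming we simply re-implement the 'soft_l1' formula by hand. The name ``soft_l1_loss`` is hypothetical; the point is only the required ``(3, m)`` layout of the return value.

```python
import numpy as np


def soft_l1_loss(z):
    # Rows: rho(z), rho'(z), rho''(z) for rho(z) = 2 * ((1 + z)**0.5 - 1).
    z = np.asarray(z, dtype=float)
    t = 1.0 + z
    rho = np.empty((3, z.size))
    rho[0] = 2.0 * (np.sqrt(t) - 1.0)  # function values
    rho[1] = t ** -0.5                 # first derivatives
    rho[2] = -0.5 * t ** -1.5          # second derivatives
    return rho


residuals = np.array([0.0, 1.0, 3.0])
values = soft_l1_loss(residuals ** 2)  # the loss is evaluated at z = f**2
```

A function of this shape can be passed as ``loss=soft_l1_loss`` with the 'trf' or 'dogbox' methods.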
|
||||
f_scale : float, optional
|
||||
Value of soft margin between inlier and outlier residuals, default
|
||||
is 1.0. The loss function is evaluated as follows
|
||||
``rho_(f**2) = C**2 * rho(f**2 / C**2)``, where ``C`` is `f_scale`,
|
||||
and ``rho`` is determined by `loss` parameter. This parameter has
|
||||
no effect with ``loss='linear'``, but for other `loss` values it is
|
||||
of crucial importance.
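To make the role of ``C`` concrete, here is a small sketch that evaluates the scaled formula by hand for the 'cauchy' loss. The function name ``scaled_cauchy`` is an assumption for illustration and is not part of SciPy.

```python
import numpy as np


def scaled_cauchy(f, f_scale=1.0):
    # rho_(f**2) = C**2 * rho(f**2 / C**2) with rho(z) = ln(1 + z).
    f = np.asarray(f, dtype=float)
    z = (f / f_scale) ** 2
    return f_scale ** 2 * np.log1p(z)


f = np.array([0.05, 0.1, 5.0])          # the last residual is an outlier
plain = f ** 2                          # contribution under 'linear' loss
robust = scaled_cauchy(f, f_scale=0.1)  # contribution under scaled 'cauchy'
```

Residuals well below ``f_scale`` contribute almost as in plain least squares, while the contribution of the outlier grows only logarithmically.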
|
||||
max_nfev : None or int, optional
|
||||
Maximum number of function evaluations before the termination.
|
||||
If None (default), the value is chosen automatically:
|
||||
|
||||
* For 'trf' and 'dogbox' : 100 * n.
|
||||
* For 'lm' : 100 * n if `jac` is callable and 100 * n * (n + 1)
|
||||
otherwise (because 'lm' counts function calls in Jacobian
|
||||
estimation).
|
||||
|
||||
diff_step : None or array_like, optional
|
||||
Determines the relative step size for the finite difference
|
||||
approximation of the Jacobian. The actual step is computed as
|
||||
``x * diff_step``. If None (default), then `diff_step` is taken to be
|
||||
a conventional "optimal" power of machine epsilon for the finite
|
||||
difference scheme used [NR]_.
|
||||
tr_solver : {None, 'exact', 'lsmr'}, optional
|
||||
Method for solving trust-region subproblems, relevant only for 'trf'
|
||||
and 'dogbox' methods.
|
||||
|
||||
* 'exact' is suitable for not very large problems with dense
|
||||
Jacobian matrices. The computational complexity per iteration is
|
||||
comparable to a singular value decomposition of the Jacobian
|
||||
matrix.
|
||||
* 'lsmr' is suitable for problems with sparse and large Jacobian
|
||||
matrices. It uses the iterative procedure
|
||||
`scipy.sparse.linalg.lsmr` for finding a solution of a linear
|
||||
least-squares problem and only requires matrix-vector product
|
||||
evaluations.
|
||||
|
||||
If None (default), the solver is chosen based on the type of Jacobian
|
||||
returned on the first iteration.
|
||||
tr_options : dict, optional
|
||||
Keyword options passed to trust-region solver.
|
||||
|
||||
* ``tr_solver='exact'``: `tr_options` are ignored.
|
||||
* ``tr_solver='lsmr'``: options for `scipy.sparse.linalg.lsmr`.
|
||||
Additionally, ``method='trf'`` supports 'regularize' option
|
||||
(bool, default is True), which adds a regularization term to the
|
||||
normal equation, which improves convergence if the Jacobian is
|
||||
rank-deficient [Byrd]_ (eq. 3.4).
|
||||
|
||||
jac_sparsity : {None, array_like, sparse matrix}, optional
|
||||
Defines the sparsity structure of the Jacobian matrix for finite
|
||||
difference estimation; its shape must be (m, n). If the Jacobian has
|
||||
only a few non-zero elements in *each* row, providing the sparsity
|
||||
structure will greatly speed up the computations [Curtis]_. A zero
|
||||
entry means that a corresponding element in the Jacobian is identically
|
||||
zero. If provided, forces the use of 'lsmr' trust-region solver.
|
||||
If None (default), then dense differencing will be used. Has no effect
|
||||
for 'lm' method.
|
||||
verbose : {0, 1, 2}, optional
|
||||
Level of algorithm's verbosity:
|
||||
|
||||
* 0 (default) : work silently.
|
||||
* 1 : display a termination report.
|
||||
* 2 : display progress during iterations (not supported by 'lm'
|
||||
method).
|
||||
|
||||
args, kwargs : tuple and dict, optional
|
||||
Additional arguments passed to `fun` and `jac`. Both empty by default.
|
||||
The calling signature is ``fun(x, *args, **kwargs)`` and the same for
|
||||
`jac`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
result : OptimizeResult
|
||||
`OptimizeResult` with the following fields defined:
|
||||
|
||||
x : ndarray, shape (n,)
|
||||
Solution found.
|
||||
cost : float
|
||||
Value of the cost function at the solution.
|
||||
fun : ndarray, shape (m,)
|
||||
Vector of residuals at the solution.
|
||||
jac : ndarray, sparse matrix or LinearOperator, shape (m, n)
|
||||
Modified Jacobian matrix at the solution, in the sense that J^T J
|
||||
is a Gauss-Newton approximation of the Hessian of the cost function.
|
||||
The type is the same as the one used by the algorithm.
|
||||
grad : ndarray, shape (n,)
|
||||
Gradient of the cost function at the solution.
|
||||
optimality : float
|
||||
First-order optimality measure. In unconstrained problems, it is
|
||||
always the uniform norm of the gradient. In constrained problems,
|
||||
it is the quantity which was compared with `gtol` during iterations.
|
||||
active_mask : ndarray of int, shape (n,)
|
||||
Each component shows whether a corresponding constraint is active
|
||||
(that is, whether a variable is at the bound):
|
||||
|
||||
* 0 : a constraint is not active.
|
||||
* -1 : a lower bound is active.
|
||||
* 1 : an upper bound is active.
|
||||
|
||||
Might be somewhat arbitrary for 'trf' method as it generates a
|
||||
sequence of strictly feasible iterates and `active_mask` is
|
||||
determined within a tolerance threshold.
|
||||
nfev : int
|
||||
Number of function evaluations done. Methods 'trf' and 'dogbox' do
|
||||
not count function calls for numerical Jacobian approximation, as
|
||||
opposed to 'lm' method.
|
||||
njev : int or None
|
||||
Number of Jacobian evaluations done. If numerical Jacobian
|
||||
approximation is used in 'lm' method, it is set to None.
|
||||
status : int
|
||||
The reason for algorithm termination:
|
||||
|
||||
* -1 : improper input parameters status returned from MINPACK.
|
||||
* 0 : the maximum number of function evaluations is exceeded.
|
||||
* 1 : `gtol` termination condition is satisfied.
|
||||
* 2 : `ftol` termination condition is satisfied.
|
||||
* 3 : `xtol` termination condition is satisfied.
|
||||
* 4 : Both `ftol` and `xtol` termination conditions are satisfied.
|
||||
|
||||
message : str
|
||||
Verbal description of the termination reason.
|
||||
success : bool
|
||||
True if one of the convergence criteria is satisfied (`status` > 0).
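A short sketch of how the `status` and `success` fields might be inspected after a run. The residual function is the Rosenbrock example used later in this docstring; the second call deliberately limits the evaluation budget to show a non-converged result.

```python
import numpy as np
from scipy.optimize import least_squares


def residuals(x):
    return np.array([10 * (x[1] - x[0] ** 2), 1 - x[0]])


x0 = np.array([2.0, 2.0])

# Normal run: one of the tolerance criteria should be met (status > 0).
res = least_squares(residuals, x0)
converged = res.success and res.status in (1, 2, 3, 4)

# Budget of a single function evaluation: the solver stops with status 0.
res_short = least_squares(residuals, x0, max_nfev=1)
```

In calling code, checking ``res.success`` before using ``res.x`` avoids silently accepting a run that hit `max_nfev`.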
|
||||
|
||||
See Also
|
||||
--------
|
||||
leastsq : A legacy wrapper for the MINPACK implementation of the
|
||||
Levenberg-Marquardt algorithm.
|
||||
curve_fit : Least-squares minimization applied to a curve-fitting problem.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Method 'lm' (Levenberg-Marquardt) calls a wrapper over least-squares
|
||||
algorithms implemented in MINPACK (lmder, lmdif). It runs the
|
||||
Levenberg-Marquardt algorithm formulated as a trust-region type algorithm.
|
||||
The implementation is based on the paper [JJMore]_; it is very robust and
|
||||
efficient with a lot of smart tricks. It should be your first choice
|
||||
for unconstrained problems. Note that it doesn't support bounds. Also,
|
||||
it doesn't work when m < n.
|
||||
|
||||
Method 'trf' (Trust Region Reflective) is motivated by the process of
|
||||
solving a system of equations, which constitute the first-order optimality
|
||||
condition for a bound-constrained minimization problem as formulated in
|
||||
[STIR]_. The algorithm iteratively solves trust-region subproblems
|
||||
augmented by a special diagonal quadratic term and with trust-region shape
|
||||
determined by the distance from the bounds and the direction of the
|
||||
gradient. These enhancements help to avoid making steps directly into bounds
|
||||
and efficiently explore the whole space of variables. To further improve
|
||||
convergence, the algorithm considers search directions reflected from the
|
||||
bounds. To obey theoretical requirements, the algorithm keeps iterates
|
||||
strictly feasible. With dense Jacobians trust-region subproblems are
|
||||
solved by an exact method very similar to the one described in [JJMore]_
|
||||
(and implemented in MINPACK). The difference from the MINPACK
|
||||
implementation is that a singular value decomposition of a Jacobian
|
||||
matrix is done once per iteration, instead of a QR decomposition and a series
|
||||
of Givens rotation eliminations. For large sparse Jacobians a 2-D subspace
|
||||
approach of solving trust-region subproblems is used [STIR]_, [Byrd]_.
|
||||
The subspace is spanned by a scaled gradient and an approximate
|
||||
Gauss-Newton solution delivered by `scipy.sparse.linalg.lsmr`. When no
|
||||
constraints are imposed the algorithm is very similar to MINPACK and has
|
||||
generally comparable performance. The algorithm is quite robust in
|
||||
unbounded and bounded problems, thus it is chosen as the default algorithm.
|
||||
|
||||
Method 'dogbox' operates in a trust-region framework, but considers
|
||||
rectangular trust regions as opposed to conventional ellipsoids [Voglis]_.
|
||||
The intersection of a current trust region and initial bounds is again
|
||||
rectangular, so on each iteration a quadratic minimization problem subject
|
||||
to bound constraints is solved approximately by Powell's dogleg method
|
||||
[NumOpt]_. The required Gauss-Newton step can be computed exactly for
|
||||
dense Jacobians or approximately by `scipy.sparse.linalg.lsmr` for large
|
||||
sparse Jacobians. The algorithm is likely to exhibit slow convergence when
|
||||
the rank of the Jacobian is less than the number of variables. The algorithm
|
||||
often outperforms 'trf' in bounded problems with a small number of
|
||||
variables.
|
||||
|
||||
Robust loss functions are implemented as described in [BA]_. The idea
|
||||
is to modify a residual vector and a Jacobian matrix on each iteration
|
||||
such that the computed gradient and Gauss-Newton Hessian approximation match
|
||||
the true gradient and Hessian approximation of the cost function. Then
|
||||
the algorithm proceeds in a normal way, i.e., robust loss functions are
|
||||
implemented as a simple wrapper over standard least-squares algorithms.
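The gradient-matching property can be checked numerically. The sketch below shows one possible rescaling of the residuals and Jacobian for a given ``rho`` array. The helper name ``rescale_for_loss`` and the clamping constant are assumptions for illustration, so this should be read as a sketch of the idea rather than the exact internal routine.

```python
import numpy as np


def rescale_for_loss(J, f, rho):
    # rho has shape (3, m): loss values, first and second derivatives at f**2.
    w = rho[1] + 2.0 * rho[2] * f ** 2
    # Guard against non-positive weights before taking the square root.
    w = np.sqrt(np.maximum(w, np.finfo(float).eps))
    # Scale each row of J by w and each residual by rho' / w.
    return J * w[:, None], f * rho[1] / w


# Small dense problem with the Cauchy loss, rho(z) = ln(1 + z).
J = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
f = np.array([0.5, -2.0, 0.1])
z = f ** 2
rho = np.vstack([np.log1p(z), 1.0 / (1.0 + z), -1.0 / (1.0 + z) ** 2])

J_s, f_s = rescale_for_loss(J, f, rho)
grad_true = J.T @ (rho[1] * f)  # gradient of 0.5 * sum(rho(f**2))
```

With this scaling, ``J_s.T @ f_s`` reproduces ``grad_true``, so an ordinary least-squares step on ``(J_s, f_s)`` uses the gradient of the robust cost.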
|
||||
|
||||
.. versionadded:: 0.17.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [STIR] M. A. Branch, T. F. Coleman, and Y. Li, "A Subspace, Interior,
|
||||
and Conjugate Gradient Method for Large-Scale Bound-Constrained
|
||||
Minimization Problems," SIAM Journal on Scientific Computing,
|
||||
Vol. 21, Number 1, pp 1-23, 1999.
|
||||
.. [NR] William H. Press et al., "Numerical Recipes. The Art of Scientific
|
||||
Computing. 3rd edition", Sec. 5.7.
|
||||
.. [Byrd] R. H. Byrd, R. B. Schnabel and G. A. Shultz, "Approximate
|
||||
solution of the trust region problem by minimization over
|
||||
two-dimensional subspaces", Math. Programming, 40, pp. 247-263,
|
||||
1988.
|
||||
.. [Curtis] A. Curtis, M. J. D. Powell, and J. Reid, "On the estimation of
|
||||
sparse Jacobian matrices", Journal of the Institute of
|
||||
Mathematics and its Applications, 13, pp. 117-120, 1974.
|
||||
.. [JJMore] J. J. More, "The Levenberg-Marquardt Algorithm: Implementation
|
||||
and Theory," Numerical Analysis, ed. G. A. Watson, Lecture
|
||||
Notes in Mathematics 630, Springer Verlag, pp. 105-116, 1977.
|
||||
.. [Voglis] C. Voglis and I. E. Lagaris, "A Rectangular Trust Region
|
||||
Dogleg Approach for Unconstrained and Bound Constrained
|
||||
Nonlinear Optimization", WSEAS International Conference on
|
||||
Applied Mathematics, Corfu, Greece, 2004.
|
||||
.. [NumOpt] J. Nocedal and S. J. Wright, "Numerical optimization,
|
||||
2nd edition", Chapter 4.
|
||||
.. [BA] B. Triggs et al., "Bundle Adjustment - A Modern Synthesis",
|
||||
Proceedings of the International Workshop on Vision Algorithms:
|
||||
Theory and Practice, pp. 298-372, 1999.
|
||||
|
||||
Examples
|
||||
--------
|
||||
In this example we find a minimum of the Rosenbrock function without bounds
|
||||
on independent variables.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> def fun_rosenbrock(x):
...     return np.array([10 * (x[1] - x[0]**2), (1 - x[0])])
|
||||
|
||||
Notice that we only provide the vector of the residuals. The algorithm
|
||||
constructs the cost function as a sum of squares of the residuals, which
|
||||
gives the Rosenbrock function. The exact minimum is at ``x = [1.0, 1.0]``.
|
||||
|
||||
>>> from scipy.optimize import least_squares
|
||||
>>> x0_rosenbrock = np.array([2, 2])
|
||||
>>> res_1 = least_squares(fun_rosenbrock, x0_rosenbrock)
|
||||
>>> res_1.x
|
||||
array([ 1., 1.])
|
||||
>>> res_1.cost
|
||||
9.8669242910846867e-30
|
||||
>>> res_1.optimality
|
||||
8.8928864934219529e-14
|
||||
|
||||
We now constrain the variables, in such a way that the previous solution
|
||||
becomes infeasible. Specifically, we require that ``x[1] >= 1.5``, and
|
||||
``x[0]`` is left unconstrained. To this end, we specify the `bounds` parameter
|
||||
to `least_squares` in the form ``bounds=([-np.inf, 1.5], np.inf)``.
|
||||
|
||||
We also provide the analytic Jacobian:
|
||||
|
||||
>>> def jac_rosenbrock(x):
...     return np.array([
...         [-20 * x[0], 10],
...         [-1, 0]])
|
||||
|
||||
Putting this all together, we see that the new solution lies on the bound:
|
||||
|
||||
>>> res_2 = least_squares(fun_rosenbrock, x0_rosenbrock, jac_rosenbrock,
|
||||
... bounds=([-np.inf, 1.5], np.inf))
|
||||
>>> res_2.x
|
||||
array([ 1.22437075, 1.5 ])
|
||||
>>> res_2.cost
|
||||
0.025213093946805685
|
||||
>>> res_2.optimality
|
||||
1.5885401433157753e-07
|
||||
|
||||
Now we solve a system of equations (i.e., the cost function should be zero
|
||||
at a minimum) for a Broyden tridiagonal vector-valued function of 100000
|
||||
variables:
|
||||
|
||||
>>> def fun_broyden(x):
...     f = (3 - x) * x + 1
...     f[1:] -= x[:-1]
...     f[:-1] -= 2 * x[1:]
...     return f
|
||||
|
||||
The corresponding Jacobian matrix is sparse. We tell the algorithm to
|
||||
estimate it by finite differences and provide the sparsity structure of
|
||||
the Jacobian to significantly speed up this process.
|
||||
|
||||
>>> from scipy.sparse import lil_matrix
|
||||
>>> def sparsity_broyden(n):
...     sparsity = lil_matrix((n, n), dtype=int)
...     i = np.arange(n)
...     sparsity[i, i] = 1
...     i = np.arange(1, n)
...     sparsity[i, i - 1] = 1
...     i = np.arange(n - 1)
...     sparsity[i, i + 1] = 1
...     return sparsity
|
||||
...
|
||||
>>> n = 100000
|
||||
>>> x0_broyden = -np.ones(n)
|
||||
...
|
||||
>>> res_3 = least_squares(fun_broyden, x0_broyden,
|
||||
... jac_sparsity=sparsity_broyden(n))
|
||||
>>> res_3.cost
|
||||
4.5687069299604613e-23
|
||||
>>> res_3.optimality
|
||||
1.1650454296851518e-11
|
||||
|
||||
Let's also solve a curve fitting problem using a robust loss function to
|
||||
take care of outliers in the data. Define the model function as
|
||||
``y = a + b * exp(c * t)``, where t is a predictor variable, y is an
|
||||
observation and a, b, c are parameters to estimate.
|
||||
|
||||
First, define the function which generates the data with noise and
|
||||
outliers, define the model parameters, and generate data:
|
||||
|
||||
>>> from numpy.random import default_rng
|
||||
>>> rng = default_rng()
|
||||
>>> def gen_data(t, a, b, c, noise=0., n_outliers=0, seed=None):
...     rng = default_rng(seed)
...
...     y = a + b * np.exp(t * c)
...
...     error = noise * rng.standard_normal(t.size)
...     outliers = rng.integers(0, t.size, n_outliers)
...     error[outliers] *= 10
...
...     return y + error
|
||||
...
|
||||
>>> a = 0.5
|
||||
>>> b = 2.0
|
||||
>>> c = -1
|
||||
>>> t_min = 0
|
||||
>>> t_max = 10
|
||||
>>> n_points = 15
|
||||
...
|
||||
>>> t_train = np.linspace(t_min, t_max, n_points)
|
||||
>>> y_train = gen_data(t_train, a, b, c, noise=0.1, n_outliers=3)
|
||||
|
||||
Define a function for computing residuals and an initial estimate of
|
||||
parameters.
|
||||
|
||||
>>> def fun(x, t, y):
...     return x[0] + x[1] * np.exp(x[2] * t) - y
|
||||
...
|
||||
>>> x0 = np.array([1.0, 1.0, 0.0])
|
||||
|
||||
Compute a standard least-squares solution:
|
||||
|
||||
>>> res_lsq = least_squares(fun, x0, args=(t_train, y_train))
|
||||
|
||||
Now compute two solutions with two different robust loss functions. The
|
||||
parameter `f_scale` is set to 0.1, meaning that inlier residuals should
|
||||
not significantly exceed 0.1 (the noise level used).
|
||||
|
||||
>>> res_soft_l1 = least_squares(fun, x0, loss='soft_l1', f_scale=0.1,
|
||||
... args=(t_train, y_train))
|
||||
>>> res_log = least_squares(fun, x0, loss='cauchy', f_scale=0.1,
|
||||
... args=(t_train, y_train))
|
||||
|
||||
And, finally, plot all the curves. We see that by selecting an appropriate
|
||||
`loss` we can get estimates close to optimal even in the presence of
|
||||
strong outliers. But keep in mind that generally it is recommended to try
|
||||
'soft_l1' or 'huber' losses first (if at all necessary) as the other two
|
||||
options may cause difficulties in the optimization process.
|
||||
|
||||
>>> t_test = np.linspace(t_min, t_max, n_points * 10)
|
||||
>>> y_true = gen_data(t_test, a, b, c)
|
||||
>>> y_lsq = gen_data(t_test, *res_lsq.x)
|
||||
>>> y_soft_l1 = gen_data(t_test, *res_soft_l1.x)
|
||||
>>> y_log = gen_data(t_test, *res_log.x)
|
||||
...
|
||||
>>> import matplotlib.pyplot as plt
|
||||
>>> plt.plot(t_train, y_train, 'o')
|
||||
>>> plt.plot(t_test, y_true, 'k', linewidth=2, label='true')
|
||||
>>> plt.plot(t_test, y_lsq, label='linear loss')
|
||||
>>> plt.plot(t_test, y_soft_l1, label='soft_l1 loss')
|
||||
>>> plt.plot(t_test, y_log, label='cauchy loss')
|
||||
>>> plt.xlabel("t")
|
||||
>>> plt.ylabel("y")
|
||||
>>> plt.legend()
|
||||
>>> plt.show()
|
||||
|
||||
In the next example, we show how complex-valued residual functions of
|
||||
complex variables can be optimized with ``least_squares()``. Consider the
|
||||
following function:
|
||||
|
||||
>>> def f(z):
...     return z - (0.5 + 0.5j)
|
||||
|
||||
We wrap it into a function of real variables that returns real residuals
|
||||
by simply handling the real and imaginary parts as independent variables:
|
||||
|
||||
>>> def f_wrap(x):
...     fx = f(x[0] + 1j*x[1])
...     return np.array([fx.real, fx.imag])
|
||||
|
||||
Thus, instead of the original m-D complex function of n complex
|
||||
variables we optimize a 2m-D real function of 2n real variables:
|
||||
|
||||
>>> from scipy.optimize import least_squares
|
||||
>>> res_wrapped = least_squares(f_wrap, (0.1, 0.1), bounds=([0, 0], [1, 1]))
|
||||
>>> z = res_wrapped.x[0] + res_wrapped.x[1]*1j
|
||||
>>> z
|
||||
(0.49999999999925893+0.49999999999925893j)
|
||||
|
||||
"""
|
||||
if method not in ['trf', 'dogbox', 'lm']:
|
||||
raise ValueError("`method` must be 'trf', 'dogbox' or 'lm'.")
|
||||
|
||||
if jac not in ['2-point', '3-point', 'cs'] and not callable(jac):
|
||||
raise ValueError("`jac` must be '2-point', '3-point', 'cs' or "
|
||||
"callable.")
|
||||
|
||||
if tr_solver not in [None, 'exact', 'lsmr']:
|
||||
raise ValueError("`tr_solver` must be None, 'exact' or 'lsmr'.")
|
||||
|
||||
if loss not in IMPLEMENTED_LOSSES and not callable(loss):
|
||||
raise ValueError("`loss` must be one of {} or a callable."
|
||||
.format(IMPLEMENTED_LOSSES.keys()))
|
||||
|
||||
if method == 'lm' and loss != 'linear':
|
||||
raise ValueError("method='lm' supports only 'linear' loss function.")
|
||||
|
||||
if verbose not in [0, 1, 2]:
|
||||
raise ValueError("`verbose` must be in [0, 1, 2].")
|
||||
|
||||
if max_nfev is not None and max_nfev <= 0:
|
||||
raise ValueError("`max_nfev` must be None or positive integer.")
|
||||
|
||||
if np.iscomplexobj(x0):
|
||||
raise ValueError("`x0` must be real.")
|
||||
|
||||
x0 = np.atleast_1d(x0).astype(float)
|
||||
|
||||
if x0.ndim > 1:
|
||||
raise ValueError("`x0` must have at most 1 dimension.")
|
||||
|
||||
if isinstance(bounds, Bounds):
|
||||
lb, ub = bounds.lb, bounds.ub
|
||||
bounds = (lb, ub)
|
||||
else:
|
||||
if len(bounds) == 2:
|
||||
lb, ub = prepare_bounds(bounds, x0.shape[0])
|
||||
else:
|
||||
raise ValueError("`bounds` must contain 2 elements.")
|
||||
|
||||
if method == 'lm' and not np.all((lb == -np.inf) & (ub == np.inf)):
|
||||
raise ValueError("Method 'lm' doesn't support bounds.")
|
||||
|
||||
if lb.shape != x0.shape or ub.shape != x0.shape:
|
||||
raise ValueError("Inconsistent shapes between bounds and `x0`.")
|
||||
|
||||
if np.any(lb >= ub):
|
||||
raise ValueError("Each lower bound must be strictly less than each "
|
||||
"upper bound.")
|
||||
|
||||
if not in_bounds(x0, lb, ub):
|
||||
raise ValueError("`x0` is infeasible.")
|
||||
|
||||
x_scale = check_x_scale(x_scale, x0)
|
||||
|
||||
ftol, xtol, gtol = check_tolerance(ftol, xtol, gtol, method)
|
||||
|
||||
if method == 'trf':
|
||||
x0 = make_strictly_feasible(x0, lb, ub)
|
||||
|
||||
def fun_wrapped(x):
|
||||
return np.atleast_1d(fun(x, *args, **kwargs))
|
||||
|
||||
f0 = fun_wrapped(x0)
|
||||
|
||||
if f0.ndim != 1:
|
||||
raise ValueError("`fun` must return at most 1-d array_like. "
|
||||
f"f0.shape: {f0.shape}")
|
||||
|
||||
if not np.all(np.isfinite(f0)):
|
||||
raise ValueError("Residuals are not finite in the initial point.")
|
||||
|
||||
n = x0.size
|
||||
m = f0.size
|
||||
|
||||
if method == 'lm' and m < n:
|
||||
raise ValueError("Method 'lm' doesn't work when the number of "
|
||||
"residuals is less than the number of variables.")
|
||||
|
||||
loss_function = construct_loss_function(m, loss, f_scale)
|
||||
if callable(loss):
|
||||
rho = loss_function(f0)
|
||||
if rho.shape != (3, m):
|
||||
raise ValueError("The return value of `loss` callable has wrong "
|
||||
"shape.")
|
||||
initial_cost = 0.5 * np.sum(rho[0])
|
||||
elif loss_function is not None:
|
||||
initial_cost = loss_function(f0, cost_only=True)
|
||||
else:
|
||||
initial_cost = 0.5 * np.dot(f0, f0)
|
||||
|
||||
if callable(jac):
|
||||
J0 = jac(x0, *args, **kwargs)
|
||||
|
||||
if issparse(J0):
|
||||
J0 = J0.tocsr()
|
||||
|
||||
def jac_wrapped(x, _=None):
|
||||
return jac(x, *args, **kwargs).tocsr()
|
||||
|
||||
elif isinstance(J0, LinearOperator):
|
||||
def jac_wrapped(x, _=None):
|
||||
return jac(x, *args, **kwargs)
|
||||
|
||||
else:
|
||||
J0 = np.atleast_2d(J0)
|
||||
|
||||
def jac_wrapped(x, _=None):
|
||||
return np.atleast_2d(jac(x, *args, **kwargs))
|
||||
|
||||
else: # Estimate Jacobian by finite differences.
|
||||
if method == 'lm':
|
||||
if jac_sparsity is not None:
|
||||
raise ValueError("method='lm' does not support "
|
||||
"`jac_sparsity`.")
|
||||
|
||||
if jac != '2-point':
|
||||
warn(f"jac='{jac}' works equivalently to '2-point' for method='lm'.",
|
||||
stacklevel=2)
|
||||
|
||||
J0 = jac_wrapped = None
|
||||
else:
|
||||
if jac_sparsity is not None and tr_solver == 'exact':
|
||||
raise ValueError("tr_solver='exact' is incompatible "
|
||||
"with `jac_sparsity`.")
|
||||
|
||||
jac_sparsity = check_jac_sparsity(jac_sparsity, m, n)
|
||||
|
||||
def jac_wrapped(x, f):
|
||||
J = approx_derivative(fun, x, rel_step=diff_step, method=jac,
|
||||
f0=f, bounds=bounds, args=args,
|
||||
kwargs=kwargs, sparsity=jac_sparsity)
|
||||
if J.ndim != 2: # J is guaranteed not sparse.
|
||||
J = np.atleast_2d(J)
|
||||
|
||||
return J
|
||||
|
||||
J0 = jac_wrapped(x0, f0)
|
||||
|
||||
if J0 is not None:
|
||||
if J0.shape != (m, n):
|
||||
raise ValueError(
|
||||
f"The return value of `jac` has wrong shape: expected {(m, n)}, "
|
||||
f"actual {J0.shape}."
|
||||
)
|
||||
|
||||
if not isinstance(J0, np.ndarray):
|
||||
if method == 'lm':
|
||||
raise ValueError("method='lm' works only with dense "
|
||||
"Jacobian matrices.")
|
||||
|
||||
if tr_solver == 'exact':
|
||||
raise ValueError(
|
||||
"tr_solver='exact' works only with dense "
|
||||
"Jacobian matrices.")
|
||||
|
||||
jac_scale = isinstance(x_scale, str) and x_scale == 'jac'
|
||||
if isinstance(J0, LinearOperator) and jac_scale:
|
||||
raise ValueError("x_scale='jac' can't be used when `jac` "
|
||||
"returns LinearOperator.")
|
||||
|
||||
if tr_solver is None:
|
||||
if isinstance(J0, np.ndarray):
|
||||
tr_solver = 'exact'
|
||||
else:
|
||||
tr_solver = 'lsmr'
|
||||
|
||||
if method == 'lm':
|
||||
result = call_minpack(fun_wrapped, x0, jac_wrapped, ftol, xtol, gtol,
|
||||
max_nfev, x_scale, diff_step)
|
||||
|
||||
elif method == 'trf':
|
||||
result = trf(fun_wrapped, jac_wrapped, x0, f0, J0, lb, ub, ftol, xtol,
|
||||
gtol, max_nfev, x_scale, loss_function, tr_solver,
|
||||
tr_options.copy(), verbose)
|
||||
|
||||
elif method == 'dogbox':
|
||||
if tr_solver == 'lsmr' and 'regularize' in tr_options:
|
||||
warn("The keyword 'regularize' in `tr_options` is not relevant "
|
||||
"for 'dogbox' method.",
|
||||
stacklevel=2)
|
||||
tr_options = tr_options.copy()
|
||||
del tr_options['regularize']
|
||||
|
||||
result = dogbox(fun_wrapped, jac_wrapped, x0, f0, J0, lb, ub, ftol,
|
||||
xtol, gtol, max_nfev, x_scale, loss_function,
|
||||
tr_solver, tr_options, verbose)
|
||||
|
||||
result.message = TERMINATION_MESSAGES[result.status]
|
||||
result.success = result.status > 0
|
||||
|
||||
if verbose >= 1:
|
||||
print(result.message)
|
||||
print("Function evaluations {}, initial cost {:.4e}, final cost "
|
||||
"{:.4e}, first-order optimality {:.2e}."
|
||||
.format(result.nfev, initial_cost, result.cost,
|
||||
result.optimality))
|
||||
|
||||
return result
|
||||
@ -0,0 +1,362 @@
|
||||
"""Linear least squares with bound constraints on independent variables."""
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
from scipy.sparse import issparse, csr_matrix
|
||||
from scipy.sparse.linalg import LinearOperator, lsmr
|
||||
from scipy.optimize import OptimizeResult
|
||||
from scipy.optimize._minimize import Bounds
|
||||
|
||||
from .common import in_bounds, compute_grad
|
||||
from .trf_linear import trf_linear
|
||||
from .bvls import bvls
|
||||
|
||||
|
||||
def prepare_bounds(bounds, n):
|
||||
if len(bounds) != 2:
|
||||
raise ValueError("`bounds` must contain 2 elements.")
|
||||
lb, ub = (np.asarray(b, dtype=float) for b in bounds)
|
||||
|
||||
if lb.ndim == 0:
|
||||
lb = np.resize(lb, n)
|
||||
|
||||
if ub.ndim == 0:
|
||||
ub = np.resize(ub, n)
|
||||
|
||||
return lb, ub
|
||||
|
||||
|
||||
TERMINATION_MESSAGES = {
|
||||
-1: "The algorithm was not able to make progress on the last iteration.",
|
||||
0: "The maximum number of iterations is exceeded.",
|
||||
1: "The first-order optimality measure is less than `tol`.",
|
||||
2: "The relative change of the cost function is less than `tol`.",
|
||||
3: "The unconstrained solution is optimal."
|
||||
}
|
||||
|
||||
|
||||
def lsq_linear(A, b, bounds=(-np.inf, np.inf), method='trf', tol=1e-10,
|
||||
lsq_solver=None, lsmr_tol=None, max_iter=None,
|
||||
verbose=0, *, lsmr_maxiter=None,):
|
||||
r"""Solve a linear least-squares problem with bounds on the variables.
|
||||
|
||||
Given an m-by-n design matrix A and a target vector b with m elements,
|
||||
`lsq_linear` solves the following optimization problem::
|
||||
|
||||
minimize 0.5 * ||A x - b||**2
|
||||
subject to lb <= x <= ub
|
||||
|
||||
This optimization problem is convex, hence a found minimum (if iterations
|
||||
have converged) is guaranteed to be global.
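As a small sketch of the problem statement above, consider an identity design matrix. The problem then separates per variable, so the bounded minimizer is simply ``b`` clipped to the bounds. The data below are made up for illustration.

```python
import numpy as np
from scipy.optimize import lsq_linear

A = np.eye(3)
b = np.array([2.0, -1.0, 0.5])

# A scalar bound applies to every variable: 0 <= x[i] <= 1.
res = lsq_linear(A, b, bounds=(0.0, 1.0))

expected = np.clip(b, 0.0, 1.0)  # closed-form answer for this special case
```

Here ``res.x`` should agree with ``expected`` up to the solver tolerance.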
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : array_like, sparse matrix or LinearOperator, shape (m, n)
|
||||
Design matrix. Can be `scipy.sparse.linalg.LinearOperator`.
|
||||
b : array_like, shape (m,)
|
||||
Target vector.
|
||||
bounds : 2-tuple of array_like or `Bounds`, optional
|
||||
Lower and upper bounds on parameters. Defaults to no bounds.
|
||||
There are two ways to specify the bounds:
|
||||
|
||||
- Instance of `Bounds` class.
|
||||
|
||||
- 2-tuple of array_like: Each element of the tuple must be either
|
||||
an array with the length equal to the number of parameters, or a
|
||||
scalar (in which case the bound is taken to be the same for all
|
||||
parameters). Use ``np.inf`` with an appropriate sign to disable
|
||||
bounds on all or some parameters.
|
||||
|
||||
method : 'trf' or 'bvls', optional
|
||||
Method to perform minimization.
|
||||
|
||||
* 'trf' : Trust Region Reflective algorithm adapted for a linear
|
||||
least-squares problem. This is an interior-point-like method
|
||||
and the required number of iterations is weakly correlated with
|
||||
the number of variables.
|
||||
* 'bvls' : Bounded-variable least-squares algorithm. This is
|
||||
an active set method, which requires the number of iterations
|
||||
comparable to the number of variables. Can't be used when `A` is
|
||||
sparse or LinearOperator.
|
||||
|
||||
Default is 'trf'.
|
||||
tol : float, optional
|
||||
Tolerance parameter. The algorithm terminates if a relative change
|
||||
of the cost function is less than `tol` on the last iteration.
|
||||
Additionally, the first-order optimality measure is considered:
|
||||
|
||||
* ``method='trf'`` terminates if the uniform norm of the gradient,
|
||||
scaled to account for the presence of the bounds, is less than
|
||||
`tol`.
|
||||
* ``method='bvls'`` terminates if Karush-Kuhn-Tucker conditions
|
||||
are satisfied within `tol` tolerance.
|
||||
|
||||
lsq_solver : {None, 'exact', 'lsmr'}, optional
|
||||
Method of solving unbounded least-squares problems throughout
|
||||
iterations:
|
||||
|
||||
* 'exact' : Use dense QR or SVD decomposition approach. Can't be
|
||||
used when `A` is sparse or LinearOperator.
|
||||
* 'lsmr' : Use `scipy.sparse.linalg.lsmr` iterative procedure
|
||||
which requires only matrix-vector product evaluations. Can't
|
||||
be used with ``method='bvls'``.
|
||||
|
||||
If None (default), the solver is chosen based on type of `A`.
|
||||
lsmr_tol : None, float or 'auto', optional
|
||||
Tolerance parameters 'atol' and 'btol' for `scipy.sparse.linalg.lsmr`.
|
||||
If None (default), it is set to ``1e-2 * tol``. If 'auto', the
|
||||
tolerance will be adjusted based on the optimality of the current
|
||||
iterate, which can speed up the optimization process, but is not always
|
||||
reliable.
|
||||
max_iter : None or int, optional
|
||||
Maximum number of iterations before termination. If None (default), it
|
||||
is set to 100 for ``method='trf'`` or to the number of variables for
|
||||
``method='bvls'`` (not counting iterations for 'bvls' initialization).
|
||||
verbose : {0, 1, 2}, optional
|
||||
Level of algorithm's verbosity:
|
||||
|
||||
* 0 : work silently (default).
|
||||
* 1 : display a termination report.
|
||||
* 2 : display progress during iterations.
|
||||
lsmr_maxiter : None or int, optional
|
||||
Maximum number of iterations for the lsmr least squares solver,
|
||||
if it is used (by setting ``lsq_solver='lsmr'``). If None (default), it
|
||||
uses lsmr's default of ``min(m, n)`` where ``m`` and ``n`` are the
|
||||
number of rows and columns of `A`, respectively. Has no effect if
|
||||
``lsq_solver='exact'``.
|
||||
|
||||
Returns
|
||||
-------
|
||||
OptimizeResult with the following fields defined:
|
||||
x : ndarray, shape (n,)
|
||||
Solution found.
|
||||
cost : float
|
||||
Value of the cost function at the solution.
|
||||
fun : ndarray, shape (m,)
|
||||
Vector of residuals at the solution.
|
||||
optimality : float
|
||||
First-order optimality measure. The exact meaning depends on `method`,
|
||||
refer to the description of `tol` parameter.
|
||||
active_mask : ndarray of int, shape (n,)
|
||||
Each component shows whether a corresponding constraint is active
|
||||
(that is, whether a variable is at the bound):
|
||||
|
||||
* 0 : a constraint is not active.
|
||||
* -1 : a lower bound is active.
|
||||
* 1 : an upper bound is active.
|
||||
|
||||
Might be somewhat arbitrary for the `trf` method as it generates a
|
||||
sequence of strictly feasible iterates and active_mask is determined
|
||||
within a tolerance threshold.
|
||||
unbounded_sol : tuple
|
||||
Unbounded least squares solution tuple returned by the least squares
|
||||
solver (set with `lsq_solver` option). If `lsq_solver` is not set or is
|
||||
set to ``'exact'``, the tuple contains an ndarray of shape (n,) with
|
||||
the unbounded solution, an ndarray with the sum of squared residuals,
|
||||
an int with the rank of `A`, and an ndarray with the singular values
|
||||
of `A` (see NumPy's ``linalg.lstsq`` for more information). If
|
||||
`lsq_solver` is set to ``'lsmr'``, the tuple contains an ndarray of
|
||||
shape (n,) with the unbounded solution, an int with the exit code,
|
||||
an int with the number of iterations, and five floats with
|
||||
various norms and the condition number of `A` (see SciPy's
|
||||
``sparse.linalg.lsmr`` for more information). This output can be
|
||||
useful for determining the convergence of the least squares solver,
|
||||
particularly the iterative ``'lsmr'`` solver. The unbounded least
|
||||
squares problem is to minimize ``0.5 * ||A x - b||**2``.
|
||||
nit : int
|
||||
Number of iterations. Zero if the unconstrained solution is optimal.
|
||||
status : int
|
||||
Reason for algorithm termination:
|
||||
|
||||
* -1 : the algorithm was not able to make progress on the last
|
||||
iteration.
|
||||
* 0 : the maximum number of iterations is exceeded.
|
||||
* 1 : the first-order optimality measure is less than `tol`.
|
||||
* 2 : the relative change of the cost function is less than `tol`.
|
||||
* 3 : the unconstrained solution is optimal.
|
||||
|
||||
message : str
|
||||
Verbal description of the termination reason.
|
||||
success : bool
|
||||
True if one of the convergence criteria is satisfied (`status` > 0).
|
||||
|
||||
See Also
|
||||
--------
|
||||
nnls : Linear least squares with non-negativity constraint.
|
||||
least_squares : Nonlinear least squares with bounds on the variables.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The algorithm first computes the unconstrained least-squares solution by
|
||||
`numpy.linalg.lstsq` or `scipy.sparse.linalg.lsmr` depending on
|
||||
`lsq_solver`. This solution is returned as optimal if it lies within the
|
||||
bounds.
|
||||
|
||||
Method 'trf' runs the adaptation of the algorithm described in [STIR]_ for
|
||||
a linear least-squares problem. The iterations are essentially the same as
|
||||
in the nonlinear least-squares algorithm, but as the quadratic function
|
||||
model is always accurate, we don't need to track or modify the radius of
|
||||
a trust region. The line search (backtracking) is used as a safety net
|
||||
when a selected step does not decrease the cost function. For a more
|
||||
detailed description of the algorithm, see `scipy.optimize.least_squares`.
|
||||
|
||||
Method 'bvls' runs a Python implementation of the algorithm described in
|
||||
[BVLS]_. The algorithm maintains active and free sets of variables, on
|
||||
each iteration chooses a new variable to move from the active set to the
|
||||
free set and then solves the unconstrained least-squares problem on free
|
||||
variables. This algorithm is guaranteed to give an accurate solution
|
||||
eventually, but may require up to n iterations for a problem with n
|
||||
variables. Additionally, an ad-hoc initialization procedure is
|
||||
implemented, that determines which variables to set free or active
|
||||
initially. It takes some number of iterations before actual BVLS starts,
|
||||
but can significantly reduce the number of further iterations.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [STIR] M. A. Branch, T. F. Coleman, and Y. Li, "A Subspace, Interior,
|
||||
and Conjugate Gradient Method for Large-Scale Bound-Constrained
|
||||
Minimization Problems," SIAM Journal on Scientific Computing,
|
||||
Vol. 21, Number 1, pp 1-23, 1999.
|
||||
.. [BVLS] P. B. Stark and R. L. Parker, "Bounded-Variable Least-Squares:
|
||||
an Algorithm and Applications", Computational Statistics, 10,
|
||||
129-141, 1995.
|
||||
|
||||
Examples
|
||||
--------
|
||||
In this example, a problem with a large sparse matrix and bounds on the
|
||||
variables is solved.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> from scipy.sparse import rand
|
||||
>>> from scipy.optimize import lsq_linear
|
||||
>>> rng = np.random.default_rng()
|
||||
...
|
||||
>>> m = 20000
|
||||
>>> n = 10000
|
||||
...
|
||||
>>> A = rand(m, n, density=1e-4, random_state=rng)
|
||||
>>> b = rng.standard_normal(m)
|
||||
...
|
||||
>>> lb = rng.standard_normal(n)
|
||||
>>> ub = lb + 1
|
||||
...
|
||||
>>> res = lsq_linear(A, b, bounds=(lb, ub), lsmr_tol='auto', verbose=1)
|
||||
# may vary
|
||||
The relative change of the cost function is less than `tol`.
|
||||
Number of iterations 16, initial cost 1.5039e+04, final cost 1.1112e+04,
|
||||
first-order optimality 4.66e-08.
|
||||
"""
|
||||
if method not in ['trf', 'bvls']:
|
||||
raise ValueError("`method` must be 'trf' or 'bvls'")
|
||||
|
||||
if lsq_solver not in [None, 'exact', 'lsmr']:
|
||||
raise ValueError("`lsq_solver` must be None, 'exact' or 'lsmr'.")
|
||||
|
||||
if verbose not in [0, 1, 2]:
|
||||
raise ValueError("`verbose` must be in [0, 1, 2].")
|
||||
|
||||
if issparse(A):
|
||||
A = csr_matrix(A)
|
||||
elif not isinstance(A, LinearOperator):
|
||||
A = np.atleast_2d(np.asarray(A))
|
||||
|
||||
if method == 'bvls':
|
||||
if lsq_solver == 'lsmr':
|
||||
raise ValueError("method='bvls' can't be used with "
|
||||
"lsq_solver='lsmr'")
|
||||
|
||||
if not isinstance(A, np.ndarray):
|
||||
raise ValueError("method='bvls' can't be used with `A` being "
|
||||
"sparse or LinearOperator.")
|
||||
|
||||
if lsq_solver is None:
|
||||
if isinstance(A, np.ndarray):
|
||||
lsq_solver = 'exact'
|
||||
else:
|
||||
lsq_solver = 'lsmr'
|
||||
elif lsq_solver == 'exact' and not isinstance(A, np.ndarray):
|
||||
raise ValueError("`exact` solver can't be used when `A` is "
|
||||
"sparse or LinearOperator.")
|
||||
|
||||
if len(A.shape) != 2: # No ndim for LinearOperator.
|
||||
raise ValueError("`A` must have at most 2 dimensions.")
|
||||
|
||||
if max_iter is not None and max_iter <= 0:
|
||||
raise ValueError("`max_iter` must be None or positive integer.")
|
||||
|
||||
m, n = A.shape
|
||||
|
||||
b = np.atleast_1d(b)
|
||||
if b.ndim != 1:
|
||||
raise ValueError("`b` must have at most 1 dimension.")
|
||||
|
||||
if b.size != m:
|
||||
raise ValueError("Inconsistent shapes between `A` and `b`.")
|
||||
|
||||
if isinstance(bounds, Bounds):
|
||||
lb = bounds.lb
|
||||
ub = bounds.ub
|
||||
else:
|
||||
lb, ub = prepare_bounds(bounds, n)
|
||||
|
||||
if lb.shape != (n,) and ub.shape != (n,):
|
||||
raise ValueError("Bounds have wrong shape.")
|
||||
|
||||
if np.any(lb >= ub):
|
||||
raise ValueError("Each lower bound must be strictly less than each "
|
||||
"upper bound.")
|
||||
|
||||
if lsmr_maxiter is not None and lsmr_maxiter < 1:
|
||||
raise ValueError("`lsmr_maxiter` must be None or positive integer.")
|
||||
|
||||
if not ((isinstance(lsmr_tol, float) and lsmr_tol > 0) or
|
||||
lsmr_tol in ('auto', None)):
|
||||
raise ValueError("`lsmr_tol` must be None, 'auto', or positive float.")
|
||||
|
||||
if lsq_solver == 'exact':
|
||||
unbd_lsq = np.linalg.lstsq(A, b, rcond=-1)
|
||||
elif lsq_solver == 'lsmr':
|
||||
first_lsmr_tol = lsmr_tol # tol of first call to lsmr
|
||||
if lsmr_tol is None or lsmr_tol == 'auto':
|
||||
first_lsmr_tol = 1e-2 * tol # default if lsmr_tol not defined
|
||||
unbd_lsq = lsmr(A, b, maxiter=lsmr_maxiter,
|
||||
atol=first_lsmr_tol, btol=first_lsmr_tol)
|
||||
x_lsq = unbd_lsq[0] # extract the solution from the least squares solver
|
||||
|
||||
if in_bounds(x_lsq, lb, ub):
|
||||
r = A @ x_lsq - b
|
||||
cost = 0.5 * np.dot(r, r)
|
||||
termination_status = 3
|
||||
termination_message = TERMINATION_MESSAGES[termination_status]
|
||||
g = compute_grad(A, r)
|
||||
g_norm = norm(g, ord=np.inf)
|
||||
|
||||
if verbose > 0:
|
||||
print(termination_message)
|
||||
print(f"Final cost {cost:.4e}, first-order optimality {g_norm:.2e}")
|
||||
|
||||
return OptimizeResult(
|
||||
x=x_lsq, fun=r, cost=cost, optimality=g_norm,
|
||||
active_mask=np.zeros(n), unbounded_sol=unbd_lsq,
|
||||
nit=0, status=termination_status,
|
||||
message=termination_message, success=True)
|
||||
|
||||
if method == 'trf':
|
||||
res = trf_linear(A, b, x_lsq, lb, ub, tol, lsq_solver, lsmr_tol,
|
||||
max_iter, verbose, lsmr_maxiter=lsmr_maxiter)
|
||||
elif method == 'bvls':
|
||||
res = bvls(A, b, x_lsq, lb, ub, tol, max_iter, verbose)
|
||||
|
||||
res.unbounded_sol = unbd_lsq
|
||||
res.message = TERMINATION_MESSAGES[res.status]
|
||||
res.success = res.status > 0
|
||||
|
||||
if verbose > 0:
|
||||
print(res.message)
|
||||
print(
|
||||
f"Number of iterations {res.nit}, initial cost {res.initial_cost:.4e}, "
|
||||
f"final cost {res.cost:.4e}, first-order optimality {res.optimality:.2e}."
|
||||
)
|
||||
|
||||
del res.initial_cost
|
||||
|
||||
return res
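The early-return branch of `lsq_linear` above (status 3, "The unconstrained solution is optimal.") can be illustrated with NumPy alone. This is a minimal sketch, not the SciPy implementation: `in_bounds_sketch` is a hypothetical stand-in for the `in_bounds` helper imported from `.common`, the matrix and bounds are made-up data, and `rcond=None` is used here instead of the `rcond=-1` seen in the source.

```python
import numpy as np


def in_bounds_sketch(x, lb, ub):
    """Hypothetical stand-in for the `in_bounds` helper: True if lb <= x <= ub."""
    return bool(np.all((x >= lb) & (x <= ub)))


# Small dense problem (illustrative data only).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 4.0])
lb = np.array([0.0, 0.0])
ub = np.array([5.0, 5.0])

# Step 1: unconstrained least-squares solution (the 'exact' solver path).
x_lsq = np.linalg.lstsq(A, b, rcond=None)[0]

# Step 2: if it already satisfies the bounds, it is the bounded optimum.
inside = in_bounds_sketch(x_lsq, lb, ub)

# Step 3: quantities reported in the result for this branch.
r = A @ x_lsq - b                        # residual vector (`fun`)
cost = 0.5 * np.dot(r, r)                # 0.5 * ||A x - b||**2
g = A.T @ r                              # gradient of the cost
g_norm = np.linalg.norm(g, ord=np.inf)   # first-order optimality

# With tighter upper bounds the shortcut no longer applies, and the
# iterative 'trf' or 'bvls' method would be needed instead.
inside_tight = in_bounds_sketch(x_lsq, lb, np.array([1.0, 1.0]))
```

Because the gradient of an unconstrained least-squares solution is zero up to rounding, `g_norm` is tiny in this branch, which is consistent with the function returning `nit=0` and `success=True` without running any iterations.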
|
||||
560
venv/lib/python3.12/site-packages/scipy/optimize/_lsq/trf.py
Normal file
@ -0,0 +1,560 @@
|
||||
"""Trust Region Reflective algorithm for least-squares optimization.
|
||||
|
||||
The algorithm is based on ideas from the paper [STIR]_. The main idea is to
|
||||
account for the presence of the bounds by appropriate scaling of the variables (or,
|
||||
equivalently, changing a trust-region shape). Let's introduce a vector v:
|
||||
|
||||
| ub[i] - x[i], if g[i] < 0 and ub[i] < np.inf
|
||||
v[i] = | x[i] - lb[i], if g[i] > 0 and lb[i] > -np.inf
|
||||
| 1, otherwise
|
||||
|
||||
where g is the gradient of a cost function and lb, ub are the bounds. Its
|
||||
components are distances to the bounds at which the anti-gradient points (if
|
||||
this distance is finite). Define a scaling matrix D = diag(v**0.5).
|
||||
First-order optimality conditions can be stated as
|
||||
|
||||
D^2 g(x) = 0.
|
||||
|
||||
This means that components of the gradient should be zero for strictly interior
|
||||
variables, and components must point inside the feasible region for variables
|
||||
on the bound.
|
||||
|
||||
Now consider this system of equations as a new optimization problem. If the
|
||||
point x is strictly interior (not on the bound), then the left-hand side is
|
||||
differentiable and the Newton step for it satisfies
|
||||
|
||||
(D^2 H + diag(g) Jv) p = -D^2 g
|
||||
|
||||
where H is the Hessian matrix (or its J^T J approximation in least squares),
|
||||
Jv is the Jacobian matrix of v with components -1, 1 or 0, such that all
|
||||
elements of matrix C = diag(g) Jv are non-negative. Introduce the change
|
||||
of the variables x = D x_h (_h would be "hat" in LaTeX). In the new variables,
|
||||
we have a Newton step satisfying
|
||||
|
||||
B_h p_h = -g_h,
|
||||
|
||||
where B_h = D H D + C, g_h = D g. In least squares B_h = J_h^T J_h + C, where
|
||||
J_h = J D. Note that J_h and g_h are proper Jacobian and gradient with respect
|
||||
to "hat" variables. To guarantee global convergence we formulate a
|
||||
trust-region problem based on the Newton step in the new variables:
|
||||
|
||||
0.5 * p_h^T B_h p_h + g_h^T p_h -> min, ||p_h|| <= Delta
|
||||
|
||||
In the original space B = H + D^{-1} C D^{-1}, and the equivalent trust-region
|
||||
problem is
|
||||
|
||||
0.5 * p^T B p + g^T p -> min, ||D^{-1} p|| <= Delta
|
||||
|
||||
Here, the meaning of the matrix D becomes clearer: it alters the shape
|
||||
of a trust-region, such that large steps towards the bounds are not allowed.
|
||||
In the implementation, the trust-region problem is solved in "hat" space,
|
||||
but handling of the bounds is done in the original space (see below and read
|
||||
the code).
|
||||
|
||||
The introduction of the matrix D does not allow us to ignore the bounds: the
|
||||
algorithm must keep iterates strictly feasible (to satisfy the aforementioned
|
||||
differentiability). The parameter theta controls the step back from the boundary
|
||||
(see the code for details).
|
||||
|
||||
The algorithm does another important trick. If the trust-region solution
|
||||
doesn't fit into the bounds, then a reflected (from the first encountered
|
||||
bound) search direction is considered. For motivation and analysis refer to
|
||||
[STIR]_ paper (and other papers of the authors). In practice, it doesn't need
|
||||
much justification: the algorithm simply chooses the best step among
|
||||
three: a constrained trust-region step, a reflected step and a constrained
|
||||
Cauchy step (a minimizer along -g_h in "hat" space, or -D^2 g in the original
|
||||
space).
|
||||
|
||||
Another feature is that a trust-region radius control strategy is modified to
|
||||
account for the appearance of the diagonal C matrix (called diag_h in the code).
|
||||
|
||||
Note that all described peculiarities are completely gone as we consider
|
||||
problems without bounds (the algorithm becomes a standard trust-region type
|
||||
algorithm very similar to the ones implemented in MINPACK).
|
||||
|
||||
The implementation supports two methods of solving the trust-region problem.
|
||||
The first, called 'exact', applies SVD to the Jacobian and then solves the problem
|
||||
very accurately using the algorithm described in [JJMore]_. It is not
|
||||
applicable to large problems. The second, called 'lsmr', uses the 2-D subspace
|
||||
approach (sometimes called "indefinite dogleg"), where the problem is solved
|
||||
in a subspace spanned by the gradient and the approximate Gauss-Newton step
|
||||
found by ``scipy.sparse.linalg.lsmr``. A 2-D trust-region problem is
|
||||
reformulated as a 4th order algebraic equation and solved very accurately by
|
||||
``numpy.roots``. The subspace approach allows solving very large problems
|
||||
(up to a couple of million residuals on a regular PC), provided the Jacobian
|
||||
matrix is sufficiently sparse.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [STIR] Branch, M.A., T.F. Coleman, and Y. Li, "A Subspace, Interior,
|
||||
and Conjugate Gradient Method for Large-Scale Bound-Constrained
|
||||
Minimization Problems," SIAM Journal on Scientific Computing,
|
||||
Vol. 21, Number 1, pp 1-23, 1999.
|
||||
.. [JJMore] More, J. J., "The Levenberg-Marquardt Algorithm: Implementation
|
||||
and Theory," Numerical Analysis, ed. G. A. Watson, Lecture
|
||||
Notes in Mathematics 630, Springer Verlag, pp. 105-116, 1977.
|
||||
"""
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
from scipy.linalg import svd, qr
|
||||
from scipy.sparse.linalg import lsmr
|
||||
from scipy.optimize import OptimizeResult
|
||||
|
||||
from .common import (
|
||||
step_size_to_bound, find_active_constraints, in_bounds,
|
||||
make_strictly_feasible, intersect_trust_region, solve_lsq_trust_region,
|
||||
solve_trust_region_2d, minimize_quadratic_1d, build_quadratic_1d,
|
||||
evaluate_quadratic, right_multiplied_operator, regularized_lsq_operator,
|
||||
CL_scaling_vector, compute_grad, compute_jac_scale, check_termination,
|
||||
update_tr_radius, scale_for_robust_loss_function, print_header_nonlinear,
|
||||
print_iteration_nonlinear)
|
||||
|
||||
|
||||
def trf(fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale,
|
||||
loss_function, tr_solver, tr_options, verbose):
|
||||
# For efficiency, it makes sense to run the simplified version of the
|
||||
# algorithm when no bounds are imposed. We decided to write two
|
||||
# separate functions. This violates the DRY principle, but keeps each
|
||||
# function as readable as possible.
|
||||
if np.all(lb == -np.inf) and np.all(ub == np.inf):
|
||||
return trf_no_bounds(
|
||||
fun, jac, x0, f0, J0, ftol, xtol, gtol, max_nfev, x_scale,
|
||||
loss_function, tr_solver, tr_options, verbose)
|
||||
else:
|
||||
return trf_bounds(
|
||||
fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale,
|
||||
loss_function, tr_solver, tr_options, verbose)
|
||||
|
||||
|
||||
def select_step(x, J_h, diag_h, g_h, p, p_h, d, Delta, lb, ub, theta):
|
||||
"""Select the best step according to Trust Region Reflective algorithm."""
|
||||
if in_bounds(x + p, lb, ub):
|
||||
p_value = evaluate_quadratic(J_h, g_h, p_h, diag=diag_h)
|
||||
return p, p_h, -p_value
|
||||
|
||||
p_stride, hits = step_size_to_bound(x, p, lb, ub)
|
||||
|
||||
# Compute the reflected direction.
|
||||
r_h = np.copy(p_h)
|
||||
r_h[hits.astype(bool)] *= -1
|
||||
r = d * r_h
|
||||
|
||||
# Restrict trust-region step, such that it hits the bound.
|
||||
p *= p_stride
|
||||
p_h *= p_stride
|
||||
x_on_bound = x + p
|
||||
|
||||
# The reflected direction will first cross either the feasible region or the
|
||||
# trust-region boundary.
|
||||
_, to_tr = intersect_trust_region(p_h, r_h, Delta)
|
||||
to_bound, _ = step_size_to_bound(x_on_bound, r, lb, ub)
|
||||
|
||||
# Find lower and upper bounds on a step size along the reflected
|
||||
# direction, considering the strict feasibility requirement. There is no
|
||||
# single correct way to do that, the chosen approach seems to work best
|
||||
# on test problems.
|
||||
r_stride = min(to_bound, to_tr)
|
||||
if r_stride > 0:
|
||||
r_stride_l = (1 - theta) * p_stride / r_stride
|
||||
if r_stride == to_bound:
|
||||
r_stride_u = theta * to_bound
|
||||
else:
|
||||
r_stride_u = to_tr
|
||||
else:
|
||||
r_stride_l = 0
|
||||
r_stride_u = -1
|
||||
|
||||
# Check if reflection step is available.
|
||||
if r_stride_l <= r_stride_u:
|
||||
a, b, c = build_quadratic_1d(J_h, g_h, r_h, s0=p_h, diag=diag_h)
|
||||
r_stride, r_value = minimize_quadratic_1d(
|
||||
a, b, r_stride_l, r_stride_u, c=c)
|
||||
r_h *= r_stride
|
||||
r_h += p_h
|
||||
r = r_h * d
|
||||
else:
|
||||
r_value = np.inf
|
||||
|
||||
# Now correct p_h to make it strictly interior.
|
||||
p *= theta
|
||||
p_h *= theta
|
||||
p_value = evaluate_quadratic(J_h, g_h, p_h, diag=diag_h)
|
||||
|
||||
ag_h = -g_h
|
||||
ag = d * ag_h
|
||||
|
||||
to_tr = Delta / norm(ag_h)
|
||||
to_bound, _ = step_size_to_bound(x, ag, lb, ub)
|
||||
if to_bound < to_tr:
|
||||
ag_stride = theta * to_bound
|
||||
else:
|
||||
ag_stride = to_tr
|
||||
|
||||
a, b = build_quadratic_1d(J_h, g_h, ag_h, diag=diag_h)
|
||||
ag_stride, ag_value = minimize_quadratic_1d(a, b, 0, ag_stride)
|
||||
ag_h *= ag_stride
|
||||
ag *= ag_stride
|
||||
|
||||
if p_value < r_value and p_value < ag_value:
|
||||
return p, p_h, -p_value
|
||||
elif r_value < p_value and r_value < ag_value:
|
||||
return r, r_h, -r_value
|
||||
else:
|
||||
return ag, ag_h, -ag_value
|
||||
|
||||
|
||||
def trf_bounds(fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev,
|
||||
x_scale, loss_function, tr_solver, tr_options, verbose):
|
||||
x = x0.copy()
|
||||
|
||||
f = f0
|
||||
f_true = f.copy()
|
||||
nfev = 1
|
||||
|
||||
J = J0
|
||||
njev = 1
|
||||
m, n = J.shape
|
||||
|
||||
if loss_function is not None:
|
||||
rho = loss_function(f)
|
||||
cost = 0.5 * np.sum(rho[0])
|
||||
J, f = scale_for_robust_loss_function(J, f, rho)
|
||||
else:
|
||||
cost = 0.5 * np.dot(f, f)
|
||||
|
||||
g = compute_grad(J, f)
|
||||
|
||||
jac_scale = isinstance(x_scale, str) and x_scale == 'jac'
|
||||
if jac_scale:
|
||||
scale, scale_inv = compute_jac_scale(J)
|
||||
else:
|
||||
scale, scale_inv = x_scale, 1 / x_scale
|
||||
|
||||
v, dv = CL_scaling_vector(x, g, lb, ub)
|
||||
v[dv != 0] *= scale_inv[dv != 0]
|
||||
Delta = norm(x0 * scale_inv / v**0.5)
|
||||
if Delta == 0:
|
||||
Delta = 1.0
|
||||
|
||||
g_norm = norm(g * v, ord=np.inf)
|
||||
|
||||
f_augmented = np.zeros(m + n)
|
||||
if tr_solver == 'exact':
|
||||
J_augmented = np.empty((m + n, n))
|
||||
elif tr_solver == 'lsmr':
|
||||
reg_term = 0.0
|
||||
regularize = tr_options.pop('regularize', True)
|
||||
|
||||
if max_nfev is None:
|
||||
max_nfev = x0.size * 100
|
||||
|
||||
alpha = 0.0 # "Levenberg-Marquardt" parameter
|
||||
|
||||
termination_status = None
|
||||
iteration = 0
|
||||
step_norm = None
|
||||
actual_reduction = None
|
||||
|
||||
if verbose == 2:
|
||||
print_header_nonlinear()
|
||||
|
||||
while True:
|
||||
v, dv = CL_scaling_vector(x, g, lb, ub)
|
||||
|
||||
g_norm = norm(g * v, ord=np.inf)
|
||||
if g_norm < gtol:
|
||||
termination_status = 1
|
||||
|
||||
if verbose == 2:
|
||||
print_iteration_nonlinear(iteration, nfev, cost, actual_reduction,
|
||||
step_norm, g_norm)
|
||||
|
||||
if termination_status is not None or nfev == max_nfev:
|
||||
break
|
||||
|
||||
# Now compute variables in "hat" space. Here, we also account for
|
||||
# scaling introduced by `x_scale` parameter. This part is a bit tricky,
|
||||
# you have to write down the formulas and see how the trust-region
|
||||
# problem is formulated when the two types of scaling are applied.
|
||||
# The idea is that first we apply `x_scale` and then apply Coleman-Li
|
||||
# approach in the new variables.
|
||||
|
||||
# v is recomputed in the variables after applying `x_scale`, note that
|
||||
# components which were identically 1 are not affected.
|
||||
v[dv != 0] *= scale_inv[dv != 0]
|
||||
|
||||
# Here, we apply two types of scaling.
|
||||
d = v**0.5 * scale
|
||||
|
||||
# C = diag(g * scale) Jv
|
||||
diag_h = g * dv * scale
|
||||
|
||||
# After all this has been done, we continue normally.
|
||||
|
||||
# "hat" gradient.
|
||||
g_h = d * g
|
||||
|
||||
f_augmented[:m] = f
|
||||
if tr_solver == 'exact':
|
||||
J_augmented[:m] = J * d
|
||||
J_h = J_augmented[:m] # Memory view.
|
||||
J_augmented[m:] = np.diag(diag_h**0.5)
|
||||
U, s, V = svd(J_augmented, full_matrices=False)
|
||||
V = V.T
|
||||
uf = U.T.dot(f_augmented)
|
||||
elif tr_solver == 'lsmr':
|
||||
J_h = right_multiplied_operator(J, d)
|
||||
|
||||
if regularize:
|
||||
a, b = build_quadratic_1d(J_h, g_h, -g_h, diag=diag_h)
|
||||
to_tr = Delta / norm(g_h)
|
||||
ag_value = minimize_quadratic_1d(a, b, 0, to_tr)[1]
|
||||
reg_term = -ag_value / Delta**2
|
||||
|
||||
lsmr_op = regularized_lsq_operator(J_h, (diag_h + reg_term)**0.5)
|
||||
gn_h = lsmr(lsmr_op, f_augmented, **tr_options)[0]
|
||||
S = np.vstack((g_h, gn_h)).T
|
||||
S, _ = qr(S, mode='economic')
|
||||
JS = J_h.dot(S) # LinearOperator does dot too.
|
||||
B_S = np.dot(JS.T, JS) + np.dot(S.T * diag_h, S)
|
||||
g_S = S.T.dot(g_h)
|
||||
|
||||
# theta controls the step-back ratio from the bounds.
|
||||
theta = max(0.995, 1 - g_norm)
|
||||
|
||||
actual_reduction = -1
|
||||
while actual_reduction <= 0 and nfev < max_nfev:
|
||||
if tr_solver == 'exact':
|
||||
p_h, alpha, n_iter = solve_lsq_trust_region(
|
||||
n, m, uf, s, V, Delta, initial_alpha=alpha)
|
||||
elif tr_solver == 'lsmr':
|
||||
p_S, _ = solve_trust_region_2d(B_S, g_S, Delta)
|
||||
p_h = S.dot(p_S)
|
||||
|
||||
p = d * p_h # Trust-region solution in the original space.
|
||||
step, step_h, predicted_reduction = select_step(
|
||||
x, J_h, diag_h, g_h, p, p_h, d, Delta, lb, ub, theta)
|
||||
|
||||
x_new = make_strictly_feasible(x + step, lb, ub, rstep=0)
|
||||
f_new = fun(x_new)
|
||||
nfev += 1
|
||||
|
||||
step_h_norm = norm(step_h)
|
||||
|
||||
if not np.all(np.isfinite(f_new)):
|
||||
Delta = 0.25 * step_h_norm
|
||||
continue
|
||||
|
||||
# Usual trust-region step quality estimation.
|
||||
if loss_function is not None:
|
||||
cost_new = loss_function(f_new, cost_only=True)
|
||||
else:
|
||||
cost_new = 0.5 * np.dot(f_new, f_new)
|
||||
actual_reduction = cost - cost_new
|
||||
Delta_new, ratio = update_tr_radius(
|
||||
Delta, actual_reduction, predicted_reduction,
|
||||
step_h_norm, step_h_norm > 0.95 * Delta)
|
||||
|
||||
step_norm = norm(step)
|
||||
termination_status = check_termination(
|
||||
actual_reduction, cost, step_norm, norm(x), ratio, ftol, xtol)
|
||||
if termination_status is not None:
|
||||
break
|
||||
|
||||
alpha *= Delta / Delta_new
|
||||
Delta = Delta_new
|
||||
|
||||
if actual_reduction > 0:
|
||||
x = x_new
|
||||
|
||||
f = f_new
|
||||
f_true = f.copy()
|
||||
|
||||
cost = cost_new
|
||||
|
||||
J = jac(x, f)
|
||||
njev += 1
|
||||
|
||||
if loss_function is not None:
|
||||
rho = loss_function(f)
|
||||
J, f = scale_for_robust_loss_function(J, f, rho)
|
||||
|
||||
g = compute_grad(J, f)
|
||||
|
||||
if jac_scale:
|
||||
scale, scale_inv = compute_jac_scale(J, scale_inv)
|
||||
else:
|
||||
step_norm = 0
|
||||
actual_reduction = 0
|
||||
|
||||
iteration += 1
|
||||
|
||||
if termination_status is None:
|
||||
termination_status = 0
|
||||
|
||||
active_mask = find_active_constraints(x, lb, ub, rtol=xtol)
|
||||
return OptimizeResult(
|
||||
x=x, cost=cost, fun=f_true, jac=J, grad=g, optimality=g_norm,
|
||||
active_mask=active_mask, nfev=nfev, njev=njev,
|
||||
status=termination_status)
|
||||
|
||||
|
||||
def trf_no_bounds(fun, jac, x0, f0, J0, ftol, xtol, gtol, max_nfev,
|
||||
x_scale, loss_function, tr_solver, tr_options, verbose):
|
||||
x = x0.copy()
|
||||
|
||||
f = f0
|
||||
f_true = f.copy()
|
||||
nfev = 1
|
||||
|
||||
J = J0
|
||||
njev = 1
|
||||
m, n = J.shape
|
||||
|
||||
if loss_function is not None:
|
||||
rho = loss_function(f)
|
||||
cost = 0.5 * np.sum(rho[0])
|
||||
J, f = scale_for_robust_loss_function(J, f, rho)
|
||||
else:
|
||||
cost = 0.5 * np.dot(f, f)
|
||||
|
||||
g = compute_grad(J, f)
|
||||
|
||||
jac_scale = isinstance(x_scale, str) and x_scale == 'jac'
|
||||
if jac_scale:
|
||||
scale, scale_inv = compute_jac_scale(J)
|
||||
else:
|
||||
scale, scale_inv = x_scale, 1 / x_scale
|
||||
|
||||
Delta = norm(x0 * scale_inv)
|
||||
if Delta == 0:
|
||||
Delta = 1.0
|
||||
|
||||
if tr_solver == 'lsmr':
|
||||
reg_term = 0
|
||||
damp = tr_options.pop('damp', 0.0)
|
||||
regularize = tr_options.pop('regularize', True)
|
||||
|
||||
if max_nfev is None:
|
||||
max_nfev = x0.size * 100
|
||||
|
||||
alpha = 0.0 # "Levenberg-Marquardt" parameter
|
||||
|
||||
termination_status = None
|
||||
iteration = 0
|
||||
step_norm = None
|
||||
actual_reduction = None
|
||||
|
||||
if verbose == 2:
|
||||
print_header_nonlinear()
|
||||
|
||||
while True:
|
||||
g_norm = norm(g, ord=np.inf)
|
||||
if g_norm < gtol:
|
||||
termination_status = 1
|
||||
|
||||
if verbose == 2:
|
||||
print_iteration_nonlinear(iteration, nfev, cost, actual_reduction,
|
||||
step_norm, g_norm)
|
||||
|
||||
if termination_status is not None or nfev == max_nfev:
|
||||
break
|
||||
|
||||
d = scale
|
||||
g_h = d * g
|
||||
|
||||
if tr_solver == 'exact':
|
||||
J_h = J * d
|
||||
U, s, V = svd(J_h, full_matrices=False)
|
||||
V = V.T
|
||||
uf = U.T.dot(f)
|
||||
elif tr_solver == 'lsmr':
|
||||
J_h = right_multiplied_operator(J, d)
|
||||
|
||||
if regularize:
|
||||
a, b = build_quadratic_1d(J_h, g_h, -g_h)
|
||||
to_tr = Delta / norm(g_h)
|
||||
ag_value = minimize_quadratic_1d(a, b, 0, to_tr)[1]
|
||||
reg_term = -ag_value / Delta**2
|
||||
|
||||
damp_full = (damp**2 + reg_term)**0.5
|
||||
gn_h = lsmr(J_h, f, damp=damp_full, **tr_options)[0]
|
||||
S = np.vstack((g_h, gn_h)).T
|
||||
S, _ = qr(S, mode='economic')
|
||||
JS = J_h.dot(S)
|
||||
B_S = np.dot(JS.T, JS)
|
||||
g_S = S.T.dot(g_h)
|
||||
|
||||
actual_reduction = -1
|
||||
while actual_reduction <= 0 and nfev < max_nfev:
|
||||
if tr_solver == 'exact':
|
||||
step_h, alpha, n_iter = solve_lsq_trust_region(
|
||||
n, m, uf, s, V, Delta, initial_alpha=alpha)
|
||||
elif tr_solver == 'lsmr':
|
||||
p_S, _ = solve_trust_region_2d(B_S, g_S, Delta)
|
||||
step_h = S.dot(p_S)
|
||||
|
||||
predicted_reduction = -evaluate_quadratic(J_h, g_h, step_h)
|
||||
step = d * step_h
|
||||
x_new = x + step
|
||||
f_new = fun(x_new)
|
||||
nfev += 1
|
||||
|
||||
step_h_norm = norm(step_h)
|
||||
|
||||
if not np.all(np.isfinite(f_new)):
|
||||
Delta = 0.25 * step_h_norm
|
||||
continue
|
||||
|
||||
# Usual trust-region step quality estimation.
|
||||
if loss_function is not None:
|
||||
cost_new = loss_function(f_new, cost_only=True)
|
||||
else:
|
||||
cost_new = 0.5 * np.dot(f_new, f_new)
|
||||
actual_reduction = cost - cost_new
|
||||
|
||||
Delta_new, ratio = update_tr_radius(
|
||||
Delta, actual_reduction, predicted_reduction,
|
||||
step_h_norm, step_h_norm > 0.95 * Delta)
|
||||
|
||||
step_norm = norm(step)
|
||||
termination_status = check_termination(
|
||||
actual_reduction, cost, step_norm, norm(x), ratio, ftol, xtol)
|
||||
if termination_status is not None:
|
||||
break
|
||||
|
||||
alpha *= Delta / Delta_new
|
||||
Delta = Delta_new
|
||||
|
||||
if actual_reduction > 0:
|
||||
x = x_new
|
||||
|
||||
f = f_new
|
||||
f_true = f.copy()
|
||||
|
||||
cost = cost_new
|
||||
|
||||
J = jac(x, f)
|
||||
njev += 1
|
||||
|
||||
if loss_function is not None:
|
||||
rho = loss_function(f)
|
||||
J, f = scale_for_robust_loss_function(J, f, rho)
|
||||
|
||||
g = compute_grad(J, f)
|
||||
|
||||
if jac_scale:
|
||||
scale, scale_inv = compute_jac_scale(J, scale_inv)
|
||||
else:
|
||||
step_norm = 0
|
||||
actual_reduction = 0
|
||||
|
||||
iteration += 1
|
||||
|
||||
if termination_status is None:
|
||||
termination_status = 0
|
||||
|
||||
active_mask = np.zeros_like(x)
|
||||
return OptimizeResult(
|
||||
x=x, cost=cost, fun=f_true, jac=J, grad=g, optimality=g_norm,
|
||||
active_mask=active_mask, nfev=nfev, njev=njev,
|
||||
status=termination_status)
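The loop above delegates the radius update to `update_tr_radius`, whose body is not part of this excerpt. As a rough illustration only, a textbook-style update driven by the ratio of actual to predicted reduction might look like the sketch below. The function name `update_radius_sketch` and the 0.25 / 0.75 thresholds are assumptions made for this example, not a copy of the library helper.

```python
def update_radius_sketch(delta, actual_reduction, predicted_reduction,
                         step_norm, bound_hit):
    """Hypothetical trust-region radius update (illustrative thresholds)."""
    # Ratio of the achieved cost decrease to the decrease the model promised.
    if predicted_reduction > 0:
        ratio = actual_reduction / predicted_reduction
    elif predicted_reduction == actual_reduction == 0:
        ratio = 1.0
    else:
        ratio = 0.0

    if ratio < 0.25:
        # Poor agreement with the model: shrink around the attempted step.
        delta = 0.25 * step_norm
    elif ratio > 0.75 and bound_hit:
        # Good agreement and the step reached the boundary: expand.
        delta *= 2.0
    return delta, ratio


good = update_radius_sketch(1.0, 0.9, 1.0, 1.0, True)
poor = update_radius_sketch(1.0, 0.1, 1.0, 0.8, False)
print(good, poor)
```

A step is then accepted only when the actual reduction is positive, which matches the `if actual_reduction > 0:` branch in the loop above.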
|
||||
@ -0,0 +1,249 @@
|
||||
"""The adaptation of Trust Region Reflective algorithm for a linear
|
||||
least-squares problem."""
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
from scipy.linalg import qr, solve_triangular
|
||||
from scipy.sparse.linalg import lsmr
|
||||
from scipy.optimize import OptimizeResult
|
||||
|
||||
from .givens_elimination import givens_elimination
|
||||
from .common import (
|
||||
EPS, step_size_to_bound, find_active_constraints, in_bounds,
|
||||
make_strictly_feasible, build_quadratic_1d, evaluate_quadratic,
|
||||
minimize_quadratic_1d, CL_scaling_vector, reflective_transformation,
|
||||
print_header_linear, print_iteration_linear, compute_grad,
|
||||
regularized_lsq_operator, right_multiplied_operator)
|
||||
|
||||
|
||||
def regularized_lsq_with_qr(m, n, R, QTb, perm, diag, copy_R=True):
    """Solve regularized least squares using information from QR-decomposition.

    The initial problem is to solve the following system in a least-squares
    sense::

        A x = b
        D x = 0

    where D is a diagonal matrix. The method is based on QR decomposition
    of the form A P = Q R, where P is a column permutation matrix, Q is an
    orthogonal matrix and R is an upper triangular matrix.

    Parameters
    ----------
    m, n : int
        Initial shape of A.
    R : ndarray, shape (n, n)
        Upper triangular matrix from QR decomposition of A.
    QTb : ndarray, shape (n,)
        First n components of Q^T b.
    perm : ndarray, shape (n,)
        Array defining column permutation of A, such that the i-th column of
        P is the perm[i]-th column of the identity matrix.
    diag : ndarray, shape (n,)
        Array containing diagonal elements of D.
    copy_R : bool, optional
        If True (default), work on a copy of R; otherwise R is modified
        in place.

    Returns
    -------
    x : ndarray, shape (n,)
        Found least-squares solution.
    """
    if copy_R:
        R = R.copy()
    v = QTb.copy()

    givens_elimination(R, v, diag[perm])

    abs_diag_R = np.abs(np.diag(R))
    threshold = EPS * max(m, n) * np.max(abs_diag_R)
    nns, = np.nonzero(abs_diag_R > threshold)

    R = R[np.ix_(nns, nns)]
    v = v[nns]

    x = np.zeros(n)
    x[perm[nns]] = solve_triangular(R, v)

    return x
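The docstring above states the problem as solving `A x = b` and `D x = 0` together in a least-squares sense. A minimal dense sketch of that same problem, assuming small inputs and using only NumPy, stacks the two systems and compares the result with the normal equations. The helper name `regularized_lsq_dense` is hypothetical and does not reproduce the QR/Givens approach used above.

```python
import numpy as np


def regularized_lsq_dense(A, b, diag):
    """Hypothetical reference solver: minimize ||A x - b||^2 + ||D x||^2."""
    m, n = A.shape
    # Stack the regularization rows D x = 0 under A x = b.
    A_aug = np.vstack((A, np.diag(diag)))
    b_aug = np.concatenate((b, np.zeros(n)))
    x, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return x


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
diag = np.array([0.5, 1.0, 2.0])

x_stacked = regularized_lsq_dense(A, b, diag)
# The same minimizer satisfies (A^T A + D^2) x = A^T b.
x_normal = np.linalg.solve(A.T @ A + np.diag(diag ** 2), A.T @ b)
print(np.allclose(x_stacked, x_normal))
```

The stacked form is convenient as a cross-check, while reusing an existing QR factorization avoids refactoring `A` on every iteration.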
|
||||
|
||||
|
||||
def backtracking(A, g, x, p, theta, p_dot_g, lb, ub):
    """Find an appropriate step size using backtracking line search."""
    alpha = 1
    while True:
        x_new, _ = reflective_transformation(x + alpha * p, lb, ub)
        step = x_new - x
        cost_change = -evaluate_quadratic(A, g, step)
        if cost_change > -0.1 * alpha * p_dot_g:
            break
        alpha *= 0.5

    active = find_active_constraints(x_new, lb, ub)
    if np.any(active != 0):
        x_new, _ = reflective_transformation(x + theta * alpha * p, lb, ub)
        x_new = make_strictly_feasible(x_new, lb, ub, rstep=0)
        step = x_new - x
        cost_change = -evaluate_quadratic(A, g, step)

    return x, step, cost_change
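The halving loop above accepts a step once the model decrease exceeds `-0.1 * alpha * p_dot_g`. The sketch below isolates that acceptance test on a one-dimensional quadratic, without bounds or reflection. The name `backtrack` and the `max_halvings` safeguard are assumptions added for this example; the function above loops without a cap.

```python
def backtrack(model_decrease, slope, shrink=0.5, sufficient=0.1,
              max_halvings=50):
    """Hypothetical backtracking search.

    model_decrease(alpha) is the cost reduction for the step alpha * p and
    slope is the directional derivative p . g (negative for descent).
    """
    alpha = 1.0
    for _ in range(max_halvings):
        # Sufficient-decrease test, same form as in the loop above.
        if model_decrease(alpha) > -sufficient * alpha * slope:
            return alpha
        alpha *= shrink
    raise RuntimeError("no acceptable step found")


# Cost q(t) = 0.5 * t**2 at t = 1 (gradient g = 1), direction p = -4.
g, p = 1.0, -4.0


def decrease(alpha):
    return 0.5 * 1.0 ** 2 - 0.5 * (1.0 + alpha * p) ** 2


alpha = backtrack(decrease, slope=p * g)
print(alpha)
```

Here the full step and the half step overshoot the minimizer, so the search settles on a quarter step.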
|
||||
|
||||
|
||||
def select_step(x, A_h, g_h, c_h, p, p_h, d, lb, ub, theta):
    """Select the best step according to Trust Region Reflective algorithm."""
    if in_bounds(x + p, lb, ub):
        return p

    p_stride, hits = step_size_to_bound(x, p, lb, ub)
    r_h = np.copy(p_h)
    r_h[hits.astype(bool)] *= -1
    r = d * r_h

    # Restrict step, such that it hits the bound.
    p *= p_stride
    p_h *= p_stride
    x_on_bound = x + p

    # Find the step size along reflected direction.
    r_stride_u, _ = step_size_to_bound(x_on_bound, r, lb, ub)

    # Stay interior.
    r_stride_l = (1 - theta) * r_stride_u
    r_stride_u *= theta

    if r_stride_u > 0:
        a, b, c = build_quadratic_1d(A_h, g_h, r_h, s0=p_h, diag=c_h)
        r_stride, r_value = minimize_quadratic_1d(
            a, b, r_stride_l, r_stride_u, c=c)
        r_h = p_h + r_h * r_stride
        r = d * r_h
    else:
        r_value = np.inf

    # Now correct p_h to make it strictly interior.
    p_h *= theta
    p *= theta
    p_value = evaluate_quadratic(A_h, g_h, p_h, diag=c_h)

    ag_h = -g_h
    ag = d * ag_h
    ag_stride_u, _ = step_size_to_bound(x, ag, lb, ub)
    ag_stride_u *= theta
    a, b = build_quadratic_1d(A_h, g_h, ag_h, diag=c_h)
    ag_stride, ag_value = minimize_quadratic_1d(a, b, 0, ag_stride_u)
    ag *= ag_stride

    if p_value < r_value and p_value < ag_value:
        return p
    elif r_value < p_value and r_value < ag_value:
        return r
    else:
        return ag
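Both the reflected and the anti-gradient candidates above are chosen by minimizing a one-dimensional quadratic over an interval of admissible strides. The helper that does this, `minimize_quadratic_1d`, is imported from `.common` and is not shown in this excerpt, so the following is only a sketch of the idea under that assumption: compare the two endpoints with the vertex when the vertex lies inside the interval. The name `minimize_quadratic_on_interval` is hypothetical.

```python
def minimize_quadratic_on_interval(a, b, lower, upper, c=0.0):
    """Hypothetical helper: minimize a*t**2 + b*t + c for lower <= t <= upper."""
    candidates = [lower, upper]
    if a != 0:
        vertex = -0.5 * b / a
        # The stationary point only matters when it is strictly inside.
        if lower < vertex < upper:
            candidates.append(vertex)
    values = [t * (a * t + b) + c for t in candidates]
    best = min(range(len(candidates)), key=values.__getitem__)
    return candidates[best], values[best]


print(minimize_quadratic_on_interval(1.0, -2.0, 0.0, 3.0))   # interior vertex
print(minimize_quadratic_on_interval(1.0, -2.0, 2.0, 3.0))   # left endpoint
print(minimize_quadratic_on_interval(-1.0, 0.0, -1.0, 2.0))  # concave case
```

Checking the endpoints explicitly also handles a concave quadratic, where the vertex is a maximum and the minimum sits on the boundary.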
|
||||
|
||||
|
||||
def trf_linear(A, b, x_lsq, lb, ub, tol, lsq_solver, lsmr_tol,
               max_iter, verbose, *, lsmr_maxiter=None):
    m, n = A.shape
    x, _ = reflective_transformation(x_lsq, lb, ub)
    x = make_strictly_feasible(x, lb, ub, rstep=0.1)

    if lsq_solver == 'exact':
        QT, R, perm = qr(A, mode='economic', pivoting=True)
        QT = QT.T

        if m < n:
            R = np.vstack((R, np.zeros((n - m, n))))

        QTr = np.zeros(n)
        k = min(m, n)
    elif lsq_solver == 'lsmr':
        r_aug = np.zeros(m + n)
        auto_lsmr_tol = False
        if lsmr_tol is None:
            lsmr_tol = 1e-2 * tol
        elif lsmr_tol == 'auto':
            auto_lsmr_tol = True

    r = A.dot(x) - b
    g = compute_grad(A, r)
    cost = 0.5 * np.dot(r, r)
    initial_cost = cost

    termination_status = None
    step_norm = None
    cost_change = None

    if max_iter is None:
        max_iter = 100

    if verbose == 2:
        print_header_linear()

    for iteration in range(max_iter):
        v, dv = CL_scaling_vector(x, g, lb, ub)
        g_scaled = g * v
        g_norm = norm(g_scaled, ord=np.inf)
        if g_norm < tol:
            termination_status = 1

        if verbose == 2:
            print_iteration_linear(iteration, cost, cost_change,
                                   step_norm, g_norm)

        if termination_status is not None:
            break

        diag_h = g * dv
        diag_root_h = diag_h ** 0.5
        d = v ** 0.5
        g_h = d * g

        A_h = right_multiplied_operator(A, d)
        if lsq_solver == 'exact':
            QTr[:k] = QT.dot(r)
            p_h = -regularized_lsq_with_qr(m, n, R * d[perm], QTr, perm,
                                           diag_root_h, copy_R=False)
        elif lsq_solver == 'lsmr':
            lsmr_op = regularized_lsq_operator(A_h, diag_root_h)
            r_aug[:m] = r
            if auto_lsmr_tol:
                eta = 1e-2 * min(0.5, g_norm)
                lsmr_tol = max(EPS, min(0.1, eta * g_norm))
            p_h = -lsmr(lsmr_op, r_aug, maxiter=lsmr_maxiter,
                        atol=lsmr_tol, btol=lsmr_tol)[0]

        p = d * p_h

        p_dot_g = np.dot(p, g)
        if p_dot_g > 0:
            termination_status = -1

        theta = 1 - min(0.005, g_norm)
        step = select_step(x, A_h, g_h, diag_h, p, p_h, d, lb, ub, theta)
        cost_change = -evaluate_quadratic(A, g, step)

        # Perhaps almost never executed, the idea is that `p` is descent
        # direction thus we must find acceptable cost decrease using simple
        # "backtracking", otherwise the algorithm's logic would break.
        if cost_change < 0:
            x, step, cost_change = backtracking(
                A, g, x, p, theta, p_dot_g, lb, ub)
        else:
            x = make_strictly_feasible(x + step, lb, ub, rstep=0)

        step_norm = norm(step)
        r = A.dot(x) - b
        g = compute_grad(A, r)

        if cost_change < tol * cost:
            termination_status = 2

        cost = 0.5 * np.dot(r, r)

    if termination_status is None:
        termination_status = 0

    active_mask = find_active_constraints(x, lb, ub, rtol=tol)

    return OptimizeResult(
        x=x, fun=r, cost=cost, optimality=g_norm, active_mask=active_mask,
        nit=iteration + 1, status=termination_status,
        initial_cost=initial_cost)
|
||||
392
venv/lib/python3.12/site-packages/scipy/optimize/_milp.py
Normal file
@ -0,0 +1,392 @@
|
||||
import warnings
|
||||
import numpy as np
|
||||
from scipy.sparse import csc_array, vstack, issparse
|
||||
from scipy._lib._util import VisibleDeprecationWarning
|
||||
from ._highs._highs_wrapper import _highs_wrapper # type: ignore[import-not-found,import-untyped]
|
||||
from ._constraints import LinearConstraint, Bounds
|
||||
from ._optimize import OptimizeResult
|
||||
from ._linprog_highs import _highs_to_scipy_status_message
|
||||
|
||||
|
||||
def _constraints_to_components(constraints):
    """
    Convert sequence of constraints to a single set of components A, b_l, b_u.

    `constraints` could be

    1. A LinearConstraint
    2. A tuple representing a LinearConstraint
    3. An invalid object
    4. A sequence composed entirely of objects of type 1/2
    5. A sequence containing at least one object of type 3

    We want to accept 1, 2, and 4 and reject 3 and 5.
    """
    message = ("`constraints` (or each element within `constraints`) must be "
               "convertible into an instance of "
               "`scipy.optimize.LinearConstraint`.")
    As = []
    b_ls = []
    b_us = []

    # Accept case 1 by standardizing as case 4
    if isinstance(constraints, LinearConstraint):
        constraints = [constraints]
    else:
        # Reject case 3
        try:
            iter(constraints)
        except TypeError as exc:
            raise ValueError(message) from exc

        # Accept case 2 by standardizing as case 4
        if len(constraints) == 3:
            # argument could be a single tuple representing a LinearConstraint
            try:
                constraints = [LinearConstraint(*constraints)]
            except (TypeError, ValueError, VisibleDeprecationWarning):
                # argument was not a tuple representing a LinearConstraint
                pass

    # Address cases 4/5
    for constraint in constraints:
        # if it's not a LinearConstraint or something that represents a
        # LinearConstraint at this point, it's invalid
        if not isinstance(constraint, LinearConstraint):
            try:
                constraint = LinearConstraint(*constraint)
            except TypeError as exc:
                raise ValueError(message) from exc
        As.append(csc_array(constraint.A))
        b_ls.append(np.atleast_1d(constraint.lb).astype(np.float64))
        b_us.append(np.atleast_1d(constraint.ub).astype(np.float64))

    if len(As) > 1:
        A = vstack(As, format="csc")
        b_l = np.concatenate(b_ls)
        b_u = np.concatenate(b_us)
    else:  # avoid unnecessary copying
        A = As[0]
        b_l = b_ls[0]
        b_u = b_us[0]

    return A, b_l, b_u
|
||||
|
||||
|
||||
def _milp_iv(c, integrality, bounds, constraints, options):
|
||||
# objective IV
|
||||
if issparse(c):
|
||||
raise ValueError("`c` must be a dense array.")
|
||||
c = np.atleast_1d(c).astype(np.float64)
|
||||
if c.ndim != 1 or c.size == 0 or not np.all(np.isfinite(c)):
|
||||
message = ("`c` must be a one-dimensional array of finite numbers "
|
||||
"with at least one element.")
|
||||
raise ValueError(message)
|
||||
|
||||
# integrality IV
|
||||
if issparse(integrality):
|
||||
raise ValueError("`integrality` must be a dense array.")
|
||||
message = ("`integrality` must contain integers 0-3 and be broadcastable "
|
||||
"to `c.shape`.")
|
||||
if integrality is None:
|
||||
integrality = 0
|
||||
try:
|
||||
integrality = np.broadcast_to(integrality, c.shape).astype(np.uint8)
|
||||
except ValueError:
|
||||
raise ValueError(message)
|
||||
if integrality.min() < 0 or integrality.max() > 3:
|
||||
raise ValueError(message)
|
||||
|
||||
# bounds IV
|
||||
if bounds is None:
|
||||
bounds = Bounds(0, np.inf)
|
||||
elif not isinstance(bounds, Bounds):
|
||||
message = ("`bounds` must be convertible into an instance of "
|
||||
"`scipy.optimize.Bounds`.")
|
||||
try:
|
||||
bounds = Bounds(*bounds)
|
||||
except TypeError as exc:
|
||||
raise ValueError(message) from exc
|
||||
|
||||
try:
|
||||
lb = np.broadcast_to(bounds.lb, c.shape).astype(np.float64)
|
||||
ub = np.broadcast_to(bounds.ub, c.shape).astype(np.float64)
|
||||
except (ValueError, TypeError) as exc:
|
||||
message = ("`bounds.lb` and `bounds.ub` must contain reals and "
|
||||
"be broadcastable to `c.shape`.")
|
||||
raise ValueError(message) from exc
|
||||
|
||||
# constraints IV
|
||||
if not constraints:
|
||||
constraints = [LinearConstraint(np.empty((0, c.size)),
|
||||
np.empty((0,)), np.empty((0,)))]
|
||||
try:
|
||||
A, b_l, b_u = _constraints_to_components(constraints)
|
||||
except ValueError as exc:
|
||||
message = ("`constraints` (or each element within `constraints`) must "
|
||||
"be convertible into an instance of "
|
||||
"`scipy.optimize.LinearConstraint`.")
|
||||
raise ValueError(message) from exc
|
||||
|
||||
if A.shape != (b_l.size, c.size):
|
||||
message = "The shape of `A` must be (len(b_l), len(c))."
|
||||
raise ValueError(message)
|
||||
indptr, indices, data = A.indptr, A.indices, A.data.astype(np.float64)
|
||||
|
||||
# options IV
|
||||
options = options or {}
|
||||
supported_options = {'disp', 'presolve', 'time_limit', 'node_limit',
|
||||
'mip_rel_gap'}
|
||||
unsupported_options = set(options).difference(supported_options)
|
||||
if unsupported_options:
|
||||
message = (f"Unrecognized options detected: {unsupported_options}. "
|
||||
"These will be passed to HiGHS verbatim.")
|
||||
warnings.warn(message, RuntimeWarning, stacklevel=3)
|
||||
options_iv = {'log_to_console': options.pop("disp", False),
|
||||
'mip_max_nodes': options.pop("node_limit", None)}
|
||||
options_iv.update(options)
|
||||
|
||||
return c, integrality, lb, ub, indptr, indices, data, b_l, b_u, options_iv
|
||||
|
||||
|
||||
def milp(c, *, integrality=None, bounds=None, constraints=None, options=None):
|
||||
r"""
|
||||
Mixed-integer linear programming
|
||||
|
||||
Solves problems of the following form:
|
||||
|
||||
.. math::
|
||||
|
||||
\min_x \ & c^T x \\
|
||||
\mbox{such that} \ & b_l \leq A x \leq b_u,\\
|
||||
& l \leq x \leq u, \\
|
||||
& x_i \in \mathbb{Z}, i \in X_i
|
||||
|
||||
where :math:`x` is a vector of decision variables;
|
||||
:math:`c`, :math:`b_l`, :math:`b_u`, :math:`l`, and :math:`u` are vectors;
|
||||
:math:`A` is a matrix, and :math:`X_i` is the set of indices of
|
||||
decision variables that must be integral. (In this context, a
|
||||
variable that can assume only integer values is said to be "integral";
|
||||
it has an "integrality" constraint.)
|
||||
|
||||
Alternatively, that's:
|
||||
|
||||
minimize::
|
||||
|
||||
c @ x
|
||||
|
||||
such that::
|
||||
|
||||
b_l <= A @ x <= b_u
|
||||
l <= x <= u
|
||||
Specified elements of x must be integers
|
||||
|
||||
By default, ``l = 0`` and ``u = np.inf`` unless specified with
|
||||
``bounds``.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
c : 1D dense array_like
|
||||
The coefficients of the linear objective function to be minimized.
|
||||
`c` is converted to a double precision array before the problem is
|
||||
solved.
|
||||
integrality : 1D dense array_like, optional
|
||||
Indicates the type of integrality constraint on each decision variable.
|
||||
|
||||
``0`` : Continuous variable; no integrality constraint.
|
||||
|
||||
``1`` : Integer variable; decision variable must be an integer
|
||||
within `bounds`.
|
||||
|
||||
``2`` : Semi-continuous variable; decision variable must be within
|
||||
`bounds` or take value ``0``.
|
||||
|
||||
``3`` : Semi-integer variable; decision variable must be an integer
|
||||
within `bounds` or take value ``0``.
|
||||
|
||||
By default, all variables are continuous. `integrality` is converted
|
||||
to an array of integers before the problem is solved.
|
||||
|
||||
bounds : scipy.optimize.Bounds, optional
|
||||
Bounds on the decision variables. Lower and upper bounds are converted
|
||||
to double precision arrays before the problem is solved. The
|
||||
``keep_feasible`` parameter of the `Bounds` object is ignored. If
|
||||
not specified, all decision variables are constrained to be
|
||||
non-negative.
|
||||
constraints : sequence of scipy.optimize.LinearConstraint, optional
|
||||
Linear constraints of the optimization problem. Arguments may be
|
||||
one of the following:
|
||||
|
||||
1. A single `LinearConstraint` object
|
||||
2. A single tuple that can be converted to a `LinearConstraint` object
|
||||
as ``LinearConstraint(*constraints)``
|
||||
3. A sequence composed entirely of objects of type 1. and 2.
|
||||
|
||||
Before the problem is solved, all values are converted to double
|
||||
precision, and the matrices of constraint coefficients are converted to
|
||||
instances of `scipy.sparse.csc_array`. The ``keep_feasible`` parameter
|
||||
of `LinearConstraint` objects is ignored.
|
||||
options : dict, optional
|
||||
A dictionary of solver options. The following keys are recognized.
|
||||
|
||||
disp : bool (default: ``False``)
|
||||
Set to ``True`` if indicators of optimization status are to be
|
||||
printed to the console during optimization.
|
||||
node_limit : int, optional
|
||||
The maximum number of nodes (linear program relaxations) to solve
|
||||
before stopping. Default is no maximum number of nodes.
|
||||
presolve : bool (default: ``True``)
|
||||
Presolve attempts to identify trivial infeasibilities,
|
||||
identify trivial unboundedness, and simplify the problem before
|
||||
sending it to the main solver.
|
||||
time_limit : float, optional
|
||||
The maximum number of seconds allotted to solve the problem.
|
||||
Default is no time limit.
|
||||
mip_rel_gap : float, optional
|
||||
Termination criterion for MIP solver: solver will terminate when
|
||||
the gap between the primal objective value and the dual objective
|
||||
bound, scaled by the primal objective value, is <= mip_rel_gap.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
An instance of :class:`scipy.optimize.OptimizeResult`. The object
|
||||
is guaranteed to have the following attributes.
|
||||
|
||||
status : int
|
||||
An integer representing the exit status of the algorithm.
|
||||
|
||||
``0`` : Optimal solution found.
|
||||
|
||||
``1`` : Iteration or time limit reached.
|
||||
|
||||
``2`` : Problem is infeasible.
|
||||
|
||||
``3`` : Problem is unbounded.
|
||||
|
||||
``4`` : Other; see message for details.
|
||||
|
||||
success : bool
|
||||
``True`` when an optimal solution is found and ``False`` otherwise.
|
||||
|
||||
message : str
|
||||
A string descriptor of the exit status of the algorithm.
|
||||
|
||||
The following attributes will also be present, but the values may be
|
||||
``None``, depending on the solution status.
|
||||
|
||||
x : ndarray
|
||||
The values of the decision variables that minimize the
|
||||
objective function while satisfying the constraints.
|
||||
fun : float
|
||||
The optimal value of the objective function ``c @ x``.
|
||||
mip_node_count : int
|
||||
The number of subproblems or "nodes" solved by the MILP solver.
|
||||
mip_dual_bound : float
|
||||
The MILP solver's final estimate of the lower bound on the optimal
|
||||
solution.
|
||||
mip_gap : float
|
||||
The difference between the primal objective value and the dual
|
||||
objective bound, scaled by the primal objective value.
|
||||
|
||||
Notes
|
||||
-----
|
||||
`milp` is a wrapper of the HiGHS linear optimization software [1]_. The
|
||||
algorithm is deterministic, and it typically finds the global optimum of
|
||||
moderately challenging mixed-integer linear programs (when it exists).
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Huangfu, Q., Galabova, I., Feldmeier, M., and Hall, J. A. J.
|
||||
"HiGHS - high performance software for linear optimization."
|
||||
https://highs.dev/
|
||||
.. [2] Huangfu, Q. and Hall, J. A. J. "Parallelizing the dual revised
|
||||
simplex method." Mathematical Programming Computation, 10 (1),
|
||||
119-142, 2018. DOI: 10.1007/s12532-017-0130-5
|
||||
|
||||
Examples
|
||||
--------
|
||||
Consider the problem at
|
||||
https://en.wikipedia.org/wiki/Integer_programming#Example, which is
|
||||
expressed as a maximization problem of two variables. Since `milp` requires
|
||||
that the problem be expressed as a minimization problem, the objective
|
||||
function coefficients on the decision variables are:
|
||||
|
||||
>>> import numpy as np
|
||||
>>> c = -np.array([0, 1])
|
||||
|
||||
Note the negative sign: we maximize the original objective function
|
||||
by minimizing the negative of the objective function.
|
||||
|
||||
We collect the coefficients of the constraints into arrays like:
|
||||
|
||||
>>> A = np.array([[-1, 1], [3, 2], [2, 3]])
|
||||
>>> b_u = np.array([1, 12, 12])
|
||||
>>> b_l = np.full_like(b_u, -np.inf, dtype=float)
|
||||
|
||||
Because there is no lower limit on these constraints, we have defined a
|
||||
variable ``b_l`` full of values representing negative infinity. This may
|
||||
be unfamiliar to users of `scipy.optimize.linprog`, which only accepts
|
||||
"less than" (or "upper bound") inequality constraints of the form
|
||||
``A_ub @ x <= b_u``. By accepting both ``b_l`` and ``b_u`` of constraints
|
||||
``b_l <= A_ub @ x <= b_u``, `milp` makes it easy to specify "greater than"
|
||||
inequality constraints, "less than" inequality constraints, and equality
|
||||
constraints concisely.
|
||||
|
||||
These arrays are collected into a single `LinearConstraint` object like:
|
||||
|
||||
>>> from scipy.optimize import LinearConstraint
|
||||
>>> constraints = LinearConstraint(A, b_l, b_u)
|
||||
|
||||
The non-negativity bounds on the decision variables are enforced by
|
||||
default, so we do not need to provide an argument for `bounds`.
|
||||
|
||||
Finally, the problem states that both decision variables must be integers:
|
||||
|
||||
>>> integrality = np.ones_like(c)
|
||||
|
||||
We solve the problem like:
|
||||
|
||||
>>> from scipy.optimize import milp
|
||||
>>> res = milp(c=c, constraints=constraints, integrality=integrality)
|
||||
>>> res.x
|
||||
[2.0, 2.0]
|
||||
|
||||
Note that had we solved the relaxed problem (without integrality
|
||||
constraints):
|
||||
|
||||
>>> res = milp(c=c, constraints=constraints) # OR:
|
||||
>>> # from scipy.optimize import linprog; res = linprog(c, A, b_u)
|
||||
>>> res.x
|
||||
[1.8, 2.8]
|
||||
|
||||
we would not have obtained the correct solution by rounding to the nearest
|
||||
integers.
|
||||
|
||||
Other examples are given :ref:`in the tutorial <tutorial-optimize_milp>`.
|
||||
|
||||
"""
|
||||
args_iv = _milp_iv(c, integrality, bounds, constraints, options)
|
||||
c, integrality, lb, ub, indptr, indices, data, b_l, b_u, options = args_iv
|
||||
|
||||
highs_res = _highs_wrapper(c, indptr, indices, data, b_l, b_u,
|
||||
lb, ub, integrality, options)
|
||||
|
||||
res = {}
|
||||
|
||||
# Convert to scipy-style status and message
|
||||
highs_status = highs_res.get('status', None)
|
||||
highs_message = highs_res.get('message', None)
|
||||
status, message = _highs_to_scipy_status_message(highs_status,
|
||||
highs_message)
|
||||
res['status'] = status
|
||||
res['message'] = message
|
||||
res['success'] = (status == 0)
|
||||
x = highs_res.get('x', None)
|
||||
res['x'] = np.array(x) if x is not None else None
|
||||
res['fun'] = highs_res.get('fun', None)
|
||||
res['mip_node_count'] = highs_res.get('mip_node_count', None)
|
||||
res['mip_dual_bound'] = highs_res.get('mip_dual_bound', None)
|
||||
res['mip_gap'] = highs_res.get('mip_gap', None)
|
||||
|
||||
return OptimizeResult(res)
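The input validation above accepts `constraints` as a single tuple `(A, b_l, b_u)` and `bounds` as a `Bounds` object, and recognizes the `time_limit` option. A small usage sketch of those paths is a three-item 0/1 knapsack. The item values, weights, and capacity below are made up for illustration, and the example assumes a SciPy version that provides `milp`.

```python
import numpy as np
from scipy.optimize import Bounds, milp

# Hypothetical data: maximize total value subject to one weight limit.
values = np.array([10.0, 13.0, 7.0])
weights = np.array([[3.0, 4.0, 2.0]])
capacity = 6.0

res = milp(
    c=-values,                                # milp minimizes, so negate
    integrality=np.ones_like(values),         # every variable is an integer
    bounds=Bounds(0, 1),                      # 0 <= x_i <= 1, i.e. binary
    constraints=(weights, -np.inf, capacity), # tuple form of LinearConstraint
    options={"time_limit": 10.0},
)

chosen = np.round(res.x).astype(int)
print(res.status, chosen, -res.fun)
```

Items 1 and 2 together weigh exactly 6 and are worth 20, which beats the other feasible subsets, so the solver is expected to pick those two.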
|
||||
1116
venv/lib/python3.12/site-packages/scipy/optimize/_minimize.py
Normal file
File diff suppressed because it is too large
Binary file not shown.
Binary file not shown.
1164
venv/lib/python3.12/site-packages/scipy/optimize/_minpack_py.py
Normal file
File diff suppressed because it is too large
Binary file not shown.
164
venv/lib/python3.12/site-packages/scipy/optimize/_nnls.py
Normal file
@ -0,0 +1,164 @@
|
||||
import numpy as np
|
||||
from scipy.linalg import solve, LinAlgWarning
|
||||
import warnings
|
||||
|
||||
__all__ = ['nnls']
|
||||
|
||||
|
||||
def nnls(A, b, maxiter=None, *, atol=None):
    """
    Solve ``argmin_x || Ax - b ||_2`` for ``x>=0``.

    This problem, often called Nonnegative Least Squares (NNLS), is a convex
    optimization problem with convex constraints. It typically arises when
    ``x`` models quantities for which only nonnegative values are
    attainable: weights of ingredients, component costs, and so on.

    Parameters
    ----------
    A : (m, n) ndarray
        Coefficient array
    b : (m,) ndarray, float
        Right-hand side vector.
    maxiter : int, optional
        Maximum number of iterations. Default value is ``3 * n``.
    atol : float, optional
        Tolerance value used in the algorithm to assess closeness to zero in
        the projected residual ``A.T @ (A x - b)`` entries. Increasing this
        value relaxes the solution constraints. A typical relaxation value can
        be selected as ``max(m, n) * np.linalg.norm(A, 1) * np.spacing(1.)``.
        This value is not set as the default because the norm operation
        becomes expensive for large problems, so it should be used only when
        necessary.

    Returns
    -------
    x : ndarray
        Solution vector.
    rnorm : float
        The 2-norm of the residual, ``|| Ax-b ||_2``.

    See Also
    --------
    lsq_linear : Linear least squares with bounds on the variables

    Notes
    -----
    The code is based on [2]_ which is an improved version of the classical
    algorithm of [1]_. It utilizes an active set method and solves the KKT
    (Karush-Kuhn-Tucker) conditions for the non-negative least squares problem.

    References
    ----------
    .. [1] : Lawson C., Hanson R.J., "Solving Least Squares Problems", SIAM,
       1995, :doi:`10.1137/1.9781611971217`
    .. [2] : Bro, Rasmus and de Jong, Sijmen, "A Fast Non-Negativity-
       Constrained Least Squares Algorithm", Journal Of Chemometrics, 1997,
       :doi:`10.1002/(SICI)1099-128X(199709/10)11:5<393::AID-CEM483>3.0.CO;2-L`

    Examples
    --------
    >>> import numpy as np
    >>> from scipy.optimize import nnls
    ...
    >>> A = np.array([[1, 0], [1, 0], [0, 1]])
    >>> b = np.array([2, 1, 1])
    >>> nnls(A, b)
    (array([1.5, 1. ]), 0.7071067811865475)

    >>> b = np.array([-1, -1, -1])
    >>> nnls(A, b)
    (array([0., 0.]), 1.7320508075688772)

    """

    A = np.asarray_chkfinite(A)
    b = np.asarray_chkfinite(b)

    if len(A.shape) != 2:
        raise ValueError("Expected a two-dimensional array (matrix)" +
                         f", but the shape of A is {A.shape}")
    if len(b.shape) != 1:
        raise ValueError("Expected a one-dimensional array (vector)" +
                         f", but the shape of b is {b.shape}")

    m, n = A.shape

    if m != b.shape[0]:
        raise ValueError(
                "Incompatible dimensions. The first dimension of " +
                f"A is {m}, while the shape of b is {(b.shape[0], )}")

    x, rnorm, mode = _nnls(A, b, maxiter, tol=atol)
    if mode != 1:
        raise RuntimeError("Maximum number of iterations reached.")

    return x, rnorm
|
||||
|
||||
|
||||
def _nnls(A, b, maxiter=None, tol=None):
    """
    This is a single RHS algorithm from ref [2] above. For multiple RHS
    support, the algorithm is given in :doi:`10.1002/cem.889`
    """
    m, n = A.shape

    AtA = A.T @ A
    Atb = b @ A  # Result is 1D - let NumPy figure it out

    if not maxiter:
        maxiter = 3*n
    if tol is None:
        tol = 10 * max(m, n) * np.spacing(1.)

    # Initialize vars
    x = np.zeros(n, dtype=np.float64)
    s = np.zeros(n, dtype=np.float64)
    # Inactive constraint switches
    P = np.zeros(n, dtype=bool)

    # Projected residual
    w = Atb.copy().astype(np.float64)  # x=0. Skip (-AtA @ x) term

    # Overall iteration counter
    # Outer loop is not counted, inner iter is counted across outer spins
    iter = 0

    while (not P.all()) and (w[~P] > tol).any():  # B
        # Get the "most" active coeff index and move to inactive set
        k = np.argmax(w * (~P))  # B.2
        P[k] = True  # B.3

        # Iteration solution
        s[:] = 0.
        # B.4
        with warnings.catch_warnings():
            warnings.filterwarnings('ignore', message='Ill-conditioned matrix',
                                    category=LinAlgWarning)
            s[P] = solve(AtA[np.ix_(P, P)], Atb[P], assume_a='sym', check_finite=False)

        # Inner loop
        while (iter < maxiter) and (s[P].min() < 0):  # C.1
            iter += 1
            inds = P * (s < 0)
            alpha = (x[inds] / (x[inds] - s[inds])).min()  # C.2
            x *= (1 - alpha)
            x += alpha*s
            P[x <= tol] = False
            with warnings.catch_warnings():
                warnings.filterwarnings('ignore', message='Ill-conditioned matrix',
                                        category=LinAlgWarning)
                s[P] = solve(AtA[np.ix_(P, P)], Atb[P], assume_a='sym',
                             check_finite=False)
            s[~P] = 0  # C.6

        x[:] = s[:]
        w[:] = Atb - AtA @ x

        if iter == maxiter:
            # Typically following line should return
            # return x, np.linalg.norm(A@x - b), -1
            # however at the top level, -1 raises an exception wasting norm
            # Instead return dummy number 0.
            return x, 0., -1

    return x, np.linalg.norm(A@x - b), 1
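The outer loop above stops once no entry of `w = Atb - AtA @ x` outside the passive set exceeds the tolerance, which corresponds to the KKT conditions of the nonnegative least-squares problem: `x >= 0`, `w <= 0`, and `x * w == 0` elementwise. A minimal NumPy check of those conditions, applied to the two solutions quoted in the `nnls` docstring, could look like the sketch below. The helper name `nnls_kkt_satisfied` and its tolerance are assumptions made for this example.

```python
import numpy as np


def nnls_kkt_satisfied(A, b, x, tol=1e-10):
    """Hypothetical check of the NNLS optimality (KKT) conditions."""
    w = A.T @ (b - A @ x)  # same quantity as Atb - AtA @ x above
    feasible = np.all(x >= -tol)
    no_ascent = np.all(w <= tol)               # no coordinate can still improve
    complementary = np.all(np.abs(x * w) <= tol)
    return bool(feasible and no_ascent and complementary)


A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

ok_first = nnls_kkt_satisfied(A, np.array([2.0, 1.0, 1.0]), np.array([1.5, 1.0]))
ok_second = nnls_kkt_satisfied(A, np.array([-1.0, -1.0, -1.0]), np.zeros(2))
not_optimal = nnls_kkt_satisfied(A, np.array([2.0, 1.0, 1.0]), np.zeros(2))
print(ok_first, ok_second, not_optimal)
```

The last call uses `x = 0` for the first right-hand side, where `w` is positive, so the check reports that the point is not optimal.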
|
||||
1585
venv/lib/python3.12/site-packages/scipy/optimize/_nonlin.py
Normal file
File diff suppressed because it is too large
779
venv/lib/python3.12/site-packages/scipy/optimize/_numdiff.py
Normal file
@ -0,0 +1,779 @@
|
||||
"""Routines for numerical differentiation."""
|
||||
import functools
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
|
||||
from scipy.sparse.linalg import LinearOperator
|
||||
from ..sparse import issparse, csc_matrix, csr_matrix, coo_matrix, find
|
||||
from ._group_columns import group_dense, group_sparse
|
||||
from scipy._lib._array_api import atleast_nd, array_namespace
|
||||
|
||||
|
||||
def _adjust_scheme_to_bounds(x0, h, num_steps, scheme, lb, ub):
|
||||
"""Adjust final difference scheme to the presence of bounds.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x0 : ndarray, shape (n,)
|
||||
Point at which we wish to estimate derivative.
|
||||
h : ndarray, shape (n,)
|
||||
Desired absolute finite difference steps.
|
||||
num_steps : int
|
||||
Number of `h` steps in one direction required to implement finite
|
||||
difference scheme. For example, 2 means that we need to evaluate
|
||||
f(x0 + 2 * h) or f(x0 - 2 * h)
|
||||
scheme : {'1-sided', '2-sided'}
|
||||
Whether steps in one or both directions are required. In other
|
||||
words '1-sided' applies to forward and backward schemes, '2-sided'
|
||||
applies to center schemes.
|
||||
lb : ndarray, shape (n,)
|
||||
Lower bounds on independent variables.
|
||||
ub : ndarray, shape (n,)
|
||||
Upper bounds on independent variables.
|
||||
|
||||
Returns
|
||||
-------
|
||||
h_adjusted : ndarray, shape (n,)
|
||||
Adjusted absolute step sizes. Step size decreases only if a sign flip
|
||||
or switching to a one-sided scheme does not allow a full step to be taken.
|
||||
use_one_sided : ndarray of bool, shape (n,)
|
||||
Whether to switch to one-sided scheme. Informative only for
|
||||
``scheme='2-sided'``.
|
||||
"""
|
||||
if scheme == '1-sided':
|
||||
use_one_sided = np.ones_like(h, dtype=bool)
|
||||
elif scheme == '2-sided':
|
||||
h = np.abs(h)
|
||||
use_one_sided = np.zeros_like(h, dtype=bool)
|
||||
else:
|
||||
raise ValueError("`scheme` must be '1-sided' or '2-sided'.")
|
||||
|
||||
if np.all((lb == -np.inf) & (ub == np.inf)):
|
||||
return h, use_one_sided
|
||||
|
||||
h_total = h * num_steps
|
||||
h_adjusted = h.copy()
|
||||
|
||||
lower_dist = x0 - lb
|
||||
upper_dist = ub - x0
|
||||
|
||||
if scheme == '1-sided':
|
||||
x = x0 + h_total
|
||||
violated = (x < lb) | (x > ub)
|
||||
fitting = np.abs(h_total) <= np.maximum(lower_dist, upper_dist)
|
||||
h_adjusted[violated & fitting] *= -1
|
||||
|
||||
forward = (upper_dist >= lower_dist) & ~fitting
|
||||
h_adjusted[forward] = upper_dist[forward] / num_steps
|
||||
backward = (upper_dist < lower_dist) & ~fitting
|
||||
h_adjusted[backward] = -lower_dist[backward] / num_steps
|
||||
elif scheme == '2-sided':
|
||||
central = (lower_dist >= h_total) & (upper_dist >= h_total)
|
||||
|
||||
forward = (upper_dist >= lower_dist) & ~central
|
||||
h_adjusted[forward] = np.minimum(
|
||||
h[forward], 0.5 * upper_dist[forward] / num_steps)
|
||||
use_one_sided[forward] = True
|
||||
|
||||
backward = (upper_dist < lower_dist) & ~central
|
||||
h_adjusted[backward] = -np.minimum(
|
||||
h[backward], 0.5 * lower_dist[backward] / num_steps)
|
||||
use_one_sided[backward] = True
|
||||
|
||||
min_dist = np.minimum(upper_dist, lower_dist) / num_steps
|
||||
adjusted_central = (~central & (np.abs(h_adjusted) <= min_dist))
|
||||
h_adjusted[adjusted_central] = min_dist[adjusted_central]
|
||||
use_one_sided[adjusted_central] = False
|
||||
|
||||
return h_adjusted, use_one_sided
|
||||
|
||||
|
||||
@functools.lru_cache
|
||||
def _eps_for_method(x0_dtype, f0_dtype, method):
|
||||
"""
|
||||
Calculates relative EPS step to use for a given data type
|
||||
and numdiff step method.
|
||||
|
||||
Progressively smaller steps are used for larger floating point types.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
f0_dtype: np.dtype
|
||||
dtype of function evaluation
|
||||
|
||||
x0_dtype: np.dtype
|
||||
dtype of parameter vector
|
||||
|
||||
method: {'2-point', '3-point', 'cs'}
|
||||
|
||||
Returns
|
||||
-------
|
||||
EPS: float
|
||||
relative step size. May be np.float16, np.float32, np.float64
|
||||
|
||||
Notes
|
||||
-----
|
||||
The default relative step will be np.float64. However, if x0 or f0 are
|
||||
smaller floating point types (np.float16, np.float32), then the smallest
|
||||
floating point type is chosen.
|
||||
"""
|
||||
# the default EPS value
|
||||
EPS = np.finfo(np.float64).eps
|
||||
|
||||
x0_is_fp = False
|
||||
if np.issubdtype(x0_dtype, np.inexact):
|
||||
# if you're a floating point type then over-ride the default EPS
|
||||
EPS = np.finfo(x0_dtype).eps
|
||||
x0_itemsize = np.dtype(x0_dtype).itemsize
|
||||
x0_is_fp = True
|
||||
|
||||
if np.issubdtype(f0_dtype, np.inexact):
|
||||
f0_itemsize = np.dtype(f0_dtype).itemsize
|
||||
# choose the smallest itemsize between x0 and f0
|
||||
if x0_is_fp and f0_itemsize < x0_itemsize:
|
||||
EPS = np.finfo(f0_dtype).eps
|
||||
|
||||
if method in ["2-point", "cs"]:
|
||||
return EPS**0.5
|
||||
elif method in ["3-point"]:
|
||||
return EPS**(1/3)
|
||||
else:
|
||||
raise RuntimeError("Unknown step method, should be one of "
|
||||
"{'2-point', '3-point', 'cs'}")
|
||||
|
||||
|
||||
def _compute_absolute_step(rel_step, x0, f0, method):
|
||||
"""
|
||||
Computes an absolute step from a relative step for finite difference
|
||||
calculation.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
rel_step: None or array-like
|
||||
Relative step for the finite difference calculation
|
||||
x0 : np.ndarray
|
||||
Parameter vector
|
||||
f0 : np.ndarray or scalar
|
||||
method : {'2-point', '3-point', 'cs'}
|
||||
|
||||
Returns
|
||||
-------
|
||||
h : float
|
||||
The absolute step size
|
||||
|
||||
Notes
|
||||
-----
|
||||
`h` will always be np.float64. However, if `x0` or `f0` are
|
||||
smaller floating point dtypes (e.g. np.float32), then the absolute
|
||||
step size will be calculated from the smallest floating point size.
|
||||
"""
|
||||
# this is used instead of np.sign(x0) because we need
|
||||
# sign_x0 to be 1 when x0 == 0.
|
||||
sign_x0 = (x0 >= 0).astype(float) * 2 - 1
|
||||
|
||||
rstep = _eps_for_method(x0.dtype, f0.dtype, method)
|
||||
|
||||
if rel_step is None:
|
||||
abs_step = rstep * sign_x0 * np.maximum(1.0, np.abs(x0))
|
||||
else:
|
||||
# User has requested specific relative steps.
|
||||
# Don't multiply by max(1, abs(x0)) because if x0 < 1 then their
|
||||
# requested step is not used.
|
||||
abs_step = rel_step * sign_x0 * np.abs(x0)
|
||||
|
||||
# however we don't want an abs_step of 0, which can happen if
|
||||
# rel_step is 0, or x0 is 0. Instead, substitute a realistic step
|
||||
dx = ((x0 + abs_step) - x0)
|
||||
abs_step = np.where(dx == 0,
|
||||
rstep * sign_x0 * np.maximum(1.0, np.abs(x0)),
|
||||
abs_step)
|
||||
|
||||
return abs_step
|
||||
|
||||
|
||||
def _prepare_bounds(bounds, x0):
|
||||
"""
|
||||
Prepares new-style bounds from a two-tuple specifying the lower and upper
|
||||
limits for values in x0. If a value is not bounded then the lower/upper bound
|
||||
will be expected to be -np.inf/np.inf.
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> _prepare_bounds([(0, 1, 2), (1, 2, np.inf)], [0.5, 1.5, 2.5])
|
||||
(array([0., 1., 2.]), array([ 1., 2., inf]))
|
||||
"""
|
||||
lb, ub = (np.asarray(b, dtype=float) for b in bounds)
|
||||
if lb.ndim == 0:
|
||||
lb = np.resize(lb, x0.shape)
|
||||
|
||||
if ub.ndim == 0:
|
||||
ub = np.resize(ub, x0.shape)
|
||||
|
||||
return lb, ub
|
||||
|
||||
|
||||
def group_columns(A, order=0):
|
||||
"""Group columns of a 2-D matrix for sparse finite differencing [1]_.
|
||||
|
||||
Two columns are in the same group if in each row at least one of them
|
||||
has a zero. A greedy sequential algorithm is used to construct groups.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : array_like or sparse matrix, shape (m, n)
|
||||
Matrix of which to group columns.
|
||||
order : int, iterable of int with shape (n,) or None
|
||||
Permutation array which defines the order of columns enumeration.
|
||||
If int or None, a random permutation is used with `order` used as
|
||||
a random seed. Default is 0, that is use a random permutation but
|
||||
guarantee repeatability.
|
||||
|
||||
Returns
|
||||
-------
|
||||
groups : ndarray of int, shape (n,)
|
||||
Contains values from 0 to n_groups-1, where n_groups is the number
|
||||
of found groups. Each value ``groups[i]`` is an index of a group to
|
||||
which the ith column is assigned. The procedure is helpful only if
|
||||
n_groups is significantly less than n.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] A. Curtis, M. J. D. Powell, and J. Reid, "On the estimation of
|
||||
sparse Jacobian matrices", Journal of the Institute of Mathematics
|
||||
and its Applications, 13 (1974), pp. 117-120.
|
||||
"""
|
||||
if issparse(A):
|
||||
A = csc_matrix(A)
|
||||
else:
|
||||
A = np.atleast_2d(A)
|
||||
A = (A != 0).astype(np.int32)
|
||||
|
||||
if A.ndim != 2:
|
||||
raise ValueError("`A` must be 2-dimensional.")
|
||||
|
||||
m, n = A.shape
|
||||
|
||||
if order is None or np.isscalar(order):
|
||||
rng = np.random.RandomState(order)
|
||||
order = rng.permutation(n)
|
||||
else:
|
||||
order = np.asarray(order)
|
||||
if order.shape != (n,):
|
||||
raise ValueError("`order` has incorrect shape.")
|
||||
|
||||
A = A[:, order]
|
||||
|
||||
if issparse(A):
|
||||
groups = group_sparse(m, n, A.indices, A.indptr)
|
||||
else:
|
||||
groups = group_dense(m, n, A)
|
||||
|
||||
groups[order] = groups.copy()
|
||||
|
||||
return groups
|
||||
|
||||
|
||||
def approx_derivative(fun, x0, method='3-point', rel_step=None, abs_step=None,
|
||||
f0=None, bounds=(-np.inf, np.inf), sparsity=None,
|
||||
as_linear_operator=False, args=(), kwargs={}):
|
||||
"""Compute finite difference approximation of the derivatives of a
|
||||
vector-valued function.
|
||||
|
||||
If a function maps from R^n to R^m, its derivatives form an m-by-n matrix
|
||||
called the Jacobian, where an element (i, j) is a partial derivative of
|
||||
f[i] with respect to x[j].
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
Function of which to estimate the derivatives. The argument x
|
||||
passed to this function is ndarray of shape (n,) (never a scalar
|
||||
even if n=1). It must return 1-D array_like of shape (m,) or a scalar.
|
||||
x0 : array_like of shape (n,) or float
|
||||
Point at which to estimate the derivatives. Float will be converted
|
||||
to a 1-D array.
|
||||
method : {'3-point', '2-point', 'cs'}, optional
|
||||
Finite difference method to use:
|
||||
- '2-point' - use the first order accuracy forward or backward
|
||||
difference.
|
||||
- '3-point' - use central difference in interior points and the
|
||||
second order accuracy forward or backward difference
|
||||
near the boundary.
|
||||
- 'cs' - use a complex-step finite difference scheme. This assumes
|
||||
that the user function is real-valued and can be
|
||||
analytically continued to the complex plane. Otherwise,
|
||||
produces bogus results.
|
||||
rel_step : None or array_like, optional
|
||||
Relative step size to use. If None (default) the absolute step size is
|
||||
computed as ``h = rel_step * sign(x0) * max(1, abs(x0))``, with
|
||||
`rel_step` being selected automatically, see Notes. Otherwise
|
||||
``h = rel_step * sign(x0) * abs(x0)``. For ``method='3-point'`` the
|
||||
sign of `h` is ignored. The calculated step size is possibly adjusted
|
||||
to fit into the bounds.
|
||||
abs_step : array_like, optional
|
||||
Absolute step size to use, possibly adjusted to fit into the bounds.
|
||||
For ``method='3-point'`` the sign of `abs_step` is ignored. By default
|
||||
relative steps are used, only if ``abs_step is not None`` are absolute
|
||||
steps used.
|
||||
f0 : None or array_like, optional
|
||||
If not None it is assumed to be equal to ``fun(x0)``, in this case
|
||||
the ``fun(x0)`` is not called. Default is None.
|
||||
bounds : tuple of array_like, optional
|
||||
Lower and upper bounds on independent variables. Defaults to no bounds.
|
||||
Each bound must match the size of `x0` or be a scalar, in the latter
|
||||
case the bound will be the same for all variables. Use it to limit the
|
||||
range of function evaluation. Bounds checking is not implemented
|
||||
when `as_linear_operator` is True.
|
||||
sparsity : {None, array_like, sparse matrix, 2-tuple}, optional
|
||||
Defines a sparsity structure of the Jacobian matrix. If the Jacobian
|
||||
matrix is known to have only few non-zero elements in each row, then
|
||||
it's possible to estimate its several columns by a single function
|
||||
evaluation [2]_. To perform such economic computations two ingredients
|
||||
are required:
|
||||
|
||||
* structure : array_like or sparse matrix of shape (m, n). A zero
|
||||
element means that a corresponding element of the Jacobian
|
||||
identically equals to zero.
|
||||
* groups : array_like of shape (n,). A column grouping for a given
|
||||
sparsity structure, use `group_columns` to obtain it.
|
||||
|
||||
A single array or a sparse matrix is interpreted as a sparsity
|
||||
structure, and groups are computed inside the function. A tuple is
|
||||
interpreted as (structure, groups). If None (default), a standard
|
||||
dense differencing will be used.
|
||||
|
||||
Note that sparse differencing makes sense only for large Jacobian
|
||||
matrices where each row contains few non-zero elements.
|
||||
as_linear_operator : bool, optional
|
||||
When True the function returns an `scipy.sparse.linalg.LinearOperator`.
|
||||
Otherwise it returns a dense array or a sparse matrix depending on
|
||||
`sparsity`. The linear operator provides an efficient way of computing
|
||||
``J.dot(p)`` for any vector ``p`` of shape (n,), but does not allow
|
||||
direct access to individual elements of the matrix. By default
|
||||
`as_linear_operator` is False.
|
||||
args, kwargs : tuple and dict, optional
|
||||
Additional arguments passed to `fun`. Both empty by default.
|
||||
The calling signature is ``fun(x, *args, **kwargs)``.
|
||||
|
||||
Returns
|
||||
-------
|
||||
J : {ndarray, sparse matrix, LinearOperator}
|
||||
Finite difference approximation of the Jacobian matrix.
|
||||
If `as_linear_operator` is True returns a LinearOperator
|
||||
with shape (m, n). Otherwise it returns a dense array or sparse
|
||||
matrix depending on how `sparsity` is defined. If `sparsity`
|
||||
is None then a ndarray with shape (m, n) is returned. If
|
||||
`sparsity` is not None returns a csr_matrix with shape (m, n).
|
||||
For sparse matrices and linear operators it is always returned as
|
||||
a 2-D structure, for ndarrays, if m=1 it is returned
|
||||
as a 1-D gradient array with shape (n,).
|
||||
|
||||
See Also
|
||||
--------
|
||||
check_derivative : Check correctness of a function computing derivatives.
|
||||
|
||||
Notes
|
||||
-----
|
||||
If `rel_step` is not provided, it is assigned as ``EPS**(1/s)``, where EPS is
|
||||
determined from the smallest floating point dtype of `x0` or `fun(x0)`,
|
||||
``np.finfo(x0.dtype).eps``, s=2 for '2-point' method and
|
||||
s=3 for '3-point' method. Such relative step approximately minimizes a sum
|
||||
of truncation and round-off errors, see [1]_. Relative steps are used by
|
||||
default. However, absolute steps are used when ``abs_step is not None``.
|
||||
If any of the absolute or relative steps produces an indistinguishable
|
||||
difference from the original `x0`, ``(x0 + dx) - x0 == 0``, then an
|
||||
automatic step size is substituted for that particular entry.
|
||||
|
||||
A finite difference scheme for '3-point' method is selected automatically.
|
||||
The well-known central difference scheme is used for points sufficiently
|
||||
far from the boundary, and 3-point forward or backward scheme is used for
|
||||
points near the boundary. Both schemes have the second-order accuracy in
|
||||
terms of Taylor expansion. Refer to [3]_ for the formulas of 3-point
|
||||
forward and backward difference schemes.
|
||||
|
||||
For dense differencing when m=1 Jacobian is returned with a shape (n,),
|
||||
on the other hand when n=1 Jacobian is returned with a shape (m, 1).
|
||||
Our motivation is the following: a) It handles a case of gradient
|
||||
computation (m=1) in a conventional way. b) It clearly separates these two
|
||||
different cases. c) In all cases np.atleast_2d can be called to get 2-D
|
||||
Jacobian with correct dimensions.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] W. H. Press et al., "Numerical Recipes. The Art of Scientific
|
||||
Computing. 3rd edition", sec. 5.7.
|
||||
|
||||
.. [2] A. Curtis, M. J. D. Powell, and J. Reid, "On the estimation of
|
||||
sparse Jacobian matrices", Journal of the Institute of Mathematics
|
||||
and its Applications, 13 (1974), pp. 117-120.
|
||||
|
||||
.. [3] B. Fornberg, "Generation of Finite Difference Formulas on
|
||||
Arbitrarily Spaced Grids", Mathematics of Computation 51, 1988.
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize._numdiff import approx_derivative
|
||||
>>>
|
||||
>>> def f(x, c1, c2):
|
||||
... return np.array([x[0] * np.sin(c1 * x[1]),
|
||||
... x[0] * np.cos(c2 * x[1])])
|
||||
...
|
||||
>>> x0 = np.array([1.0, 0.5 * np.pi])
|
||||
>>> approx_derivative(f, x0, args=(1, 2))
|
||||
array([[ 1., 0.],
|
||||
[-1., 0.]])
|
||||
|
||||
Bounds can be used to limit the region of function evaluation.
|
||||
In the example below we compute left and right derivative at point 1.0.
|
||||
|
||||
>>> def g(x):
|
||||
... return x**2 if x >= 1 else x
|
||||
...
|
||||
>>> x0 = 1.0
|
||||
>>> approx_derivative(g, x0, bounds=(-np.inf, 1.0))
|
||||
array([ 1.])
|
||||
>>> approx_derivative(g, x0, bounds=(1.0, np.inf))
|
||||
array([ 2.])
|
||||
"""
|
||||
if method not in ['2-point', '3-point', 'cs']:
|
||||
raise ValueError("Unknown method '%s'. " % method)
|
||||
|
||||
xp = array_namespace(x0)
|
||||
_x = atleast_nd(x0, ndim=1, xp=xp)
|
||||
_dtype = xp.float64
|
||||
if xp.isdtype(_x.dtype, "real floating"):
|
||||
_dtype = _x.dtype
|
||||
|
||||
# promotes to floating
|
||||
x0 = xp.astype(_x, _dtype)
|
||||
|
||||
if x0.ndim > 1:
|
||||
raise ValueError("`x0` must have at most 1 dimension.")
|
||||
|
||||
lb, ub = _prepare_bounds(bounds, x0)
|
||||
|
||||
if lb.shape != x0.shape or ub.shape != x0.shape:
|
||||
raise ValueError("Inconsistent shapes between bounds and `x0`.")
|
||||
|
||||
if as_linear_operator and not (np.all(np.isinf(lb))
|
||||
and np.all(np.isinf(ub))):
|
||||
raise ValueError("Bounds not supported when "
|
||||
"`as_linear_operator` is True.")
|
||||
|
||||
def fun_wrapped(x):
|
||||
# send user function same fp type as x0. (but only if cs is not being
|
||||
# used)
|
||||
if xp.isdtype(x.dtype, "real floating"):
|
||||
x = xp.astype(x, x0.dtype)
|
||||
|
||||
f = np.atleast_1d(fun(x, *args, **kwargs))
|
||||
if f.ndim > 1:
|
||||
raise RuntimeError("`fun` return value has "
|
||||
"more than 1 dimension.")
|
||||
return f
|
||||
|
||||
if f0 is None:
|
||||
f0 = fun_wrapped(x0)
|
||||
else:
|
||||
f0 = np.atleast_1d(f0)
|
||||
if f0.ndim > 1:
|
||||
raise ValueError("`f0` passed has more than 1 dimension.")
|
||||
|
||||
if np.any((x0 < lb) | (x0 > ub)):
|
||||
raise ValueError("`x0` violates bound constraints.")
|
||||
|
||||
if as_linear_operator:
|
||||
if rel_step is None:
|
||||
rel_step = _eps_for_method(x0.dtype, f0.dtype, method)
|
||||
|
||||
return _linear_operator_difference(fun_wrapped, x0,
|
||||
f0, rel_step, method)
|
||||
else:
|
||||
# by default we use rel_step
|
||||
if abs_step is None:
|
||||
h = _compute_absolute_step(rel_step, x0, f0, method)
|
||||
else:
|
||||
# user specifies an absolute step
|
||||
sign_x0 = (x0 >= 0).astype(float) * 2 - 1
|
||||
h = abs_step
|
||||
|
||||
# cannot have a zero step. This might happen if x0 is very large
|
||||
# or small. In which case fall back to relative step.
|
||||
dx = ((x0 + h) - x0)
|
||||
h = np.where(dx == 0,
|
||||
_eps_for_method(x0.dtype, f0.dtype, method) *
|
||||
sign_x0 * np.maximum(1.0, np.abs(x0)),
|
||||
h)
|
||||
|
||||
if method == '2-point':
|
||||
h, use_one_sided = _adjust_scheme_to_bounds(
|
||||
x0, h, 1, '1-sided', lb, ub)
|
||||
elif method == '3-point':
|
||||
h, use_one_sided = _adjust_scheme_to_bounds(
|
||||
x0, h, 1, '2-sided', lb, ub)
|
||||
elif method == 'cs':
|
||||
use_one_sided = False
|
||||
|
||||
if sparsity is None:
|
||||
return _dense_difference(fun_wrapped, x0, f0, h,
|
||||
use_one_sided, method)
|
||||
else:
|
||||
if not issparse(sparsity) and len(sparsity) == 2:
|
||||
structure, groups = sparsity
|
||||
else:
|
||||
structure = sparsity
|
||||
groups = group_columns(sparsity)
|
||||
|
||||
if issparse(structure):
|
||||
structure = csc_matrix(structure)
|
||||
else:
|
||||
structure = np.atleast_2d(structure)
|
||||
|
||||
groups = np.atleast_1d(groups)
|
||||
return _sparse_difference(fun_wrapped, x0, f0, h,
|
||||
use_one_sided, structure,
|
||||
groups, method)
|
||||
|
||||
|
||||
def _linear_operator_difference(fun, x0, f0, h, method):
|
||||
m = f0.size
|
||||
n = x0.size
|
||||
|
||||
if method == '2-point':
|
||||
def matvec(p):
|
||||
if np.array_equal(p, np.zeros_like(p)):
|
||||
return np.zeros(m)
|
||||
dx = h / norm(p)
|
||||
x = x0 + dx*p
|
||||
df = fun(x) - f0
|
||||
return df / dx
|
||||
|
||||
elif method == '3-point':
|
||||
def matvec(p):
|
||||
if np.array_equal(p, np.zeros_like(p)):
|
||||
return np.zeros(m)
|
||||
dx = 2*h / norm(p)
|
||||
x1 = x0 - (dx/2)*p
|
||||
x2 = x0 + (dx/2)*p
|
||||
f1 = fun(x1)
|
||||
f2 = fun(x2)
|
||||
df = f2 - f1
|
||||
return df / dx
|
||||
|
||||
elif method == 'cs':
|
||||
def matvec(p):
|
||||
if np.array_equal(p, np.zeros_like(p)):
|
||||
return np.zeros(m)
|
||||
dx = h / norm(p)
|
||||
x = x0 + dx*p*1.j
|
||||
f1 = fun(x)
|
||||
df = f1.imag
|
||||
return df / dx
|
||||
|
||||
else:
|
||||
raise RuntimeError("Never be here.")
|
||||
|
||||
return LinearOperator((m, n), matvec)
|
||||
|
||||
|
||||
def _dense_difference(fun, x0, f0, h, use_one_sided, method):
|
||||
m = f0.size
|
||||
n = x0.size
|
||||
J_transposed = np.empty((n, m))
|
||||
x1 = x0.copy()
|
||||
x2 = x0.copy()
|
||||
xc = x0.astype(complex, copy=True)
|
||||
|
||||
for i in range(h.size):
|
||||
if method == '2-point':
|
||||
x1[i] += h[i]
|
||||
dx = x1[i] - x0[i] # Recompute dx as exactly representable number.
|
||||
df = fun(x1) - f0
|
||||
elif method == '3-point' and use_one_sided[i]:
|
||||
x1[i] += h[i]
|
||||
x2[i] += 2 * h[i]
|
||||
dx = x2[i] - x0[i]
|
||||
f1 = fun(x1)
|
||||
f2 = fun(x2)
|
||||
df = -3.0 * f0 + 4 * f1 - f2
|
||||
elif method == '3-point' and not use_one_sided[i]:
|
||||
x1[i] -= h[i]
|
||||
x2[i] += h[i]
|
||||
dx = x2[i] - x1[i]
|
||||
f1 = fun(x1)
|
||||
f2 = fun(x2)
|
||||
df = f2 - f1
|
||||
elif method == 'cs':
|
||||
xc[i] += h[i] * 1.j
|
||||
f1 = fun(xc)
|
||||
df = f1.imag
|
||||
dx = h[i]
|
||||
else:
|
||||
raise RuntimeError("Never be here.")
|
||||
|
||||
J_transposed[i] = df / dx
|
||||
x1[i] = x2[i] = xc[i] = x0[i]
|
||||
|
||||
if m == 1:
|
||||
J_transposed = np.ravel(J_transposed)
|
||||
|
||||
return J_transposed.T
|
||||
|
||||
|
||||
def _sparse_difference(fun, x0, f0, h, use_one_sided,
|
||||
structure, groups, method):
|
||||
m = f0.size
|
||||
n = x0.size
|
||||
row_indices = []
|
||||
col_indices = []
|
||||
fractions = []
|
||||
|
||||
n_groups = np.max(groups) + 1
|
||||
for group in range(n_groups):
|
||||
# Perturb variables which are in the same group simultaneously.
|
||||
e = np.equal(group, groups)
|
||||
h_vec = h * e
|
||||
if method == '2-point':
|
||||
x = x0 + h_vec
|
||||
dx = x - x0
|
||||
df = fun(x) - f0
|
||||
# The result is written to columns which correspond to perturbed
|
||||
# variables.
|
||||
cols, = np.nonzero(e)
|
||||
# Find all non-zero elements in selected columns of Jacobian.
|
||||
i, j, _ = find(structure[:, cols])
|
||||
# Restore column indices in the full array.
|
||||
j = cols[j]
|
||||
elif method == '3-point':
|
||||
# Here we do conceptually the same but separate one-sided
|
||||
# and two-sided schemes.
|
||||
x1 = x0.copy()
|
||||
x2 = x0.copy()
|
||||
|
||||
mask_1 = use_one_sided & e
|
||||
x1[mask_1] += h_vec[mask_1]
|
||||
x2[mask_1] += 2 * h_vec[mask_1]
|
||||
|
||||
mask_2 = ~use_one_sided & e
|
||||
x1[mask_2] -= h_vec[mask_2]
|
||||
x2[mask_2] += h_vec[mask_2]
|
||||
|
||||
dx = np.zeros(n)
|
||||
dx[mask_1] = x2[mask_1] - x0[mask_1]
|
||||
dx[mask_2] = x2[mask_2] - x1[mask_2]
|
||||
|
||||
f1 = fun(x1)
|
||||
f2 = fun(x2)
|
||||
|
||||
cols, = np.nonzero(e)
|
||||
i, j, _ = find(structure[:, cols])
|
||||
j = cols[j]
|
||||
|
||||
mask = use_one_sided[j]
|
||||
df = np.empty(m)
|
||||
|
||||
rows = i[mask]
|
||||
df[rows] = -3 * f0[rows] + 4 * f1[rows] - f2[rows]
|
||||
|
||||
rows = i[~mask]
|
||||
df[rows] = f2[rows] - f1[rows]
|
||||
elif method == 'cs':
|
||||
f1 = fun(x0 + h_vec*1.j)
|
||||
df = f1.imag
|
||||
dx = h_vec
|
||||
cols, = np.nonzero(e)
|
||||
i, j, _ = find(structure[:, cols])
|
||||
j = cols[j]
|
||||
else:
|
||||
raise ValueError("Never be here.")
|
||||
|
||||
# All that's left is to compute the fraction. We store i, j and
|
||||
# fractions as separate arrays and later construct coo_matrix.
|
||||
row_indices.append(i)
|
||||
col_indices.append(j)
|
||||
fractions.append(df[i] / dx[j])
|
||||
|
||||
row_indices = np.hstack(row_indices)
|
||||
col_indices = np.hstack(col_indices)
|
||||
fractions = np.hstack(fractions)
|
||||
J = coo_matrix((fractions, (row_indices, col_indices)), shape=(m, n))
|
||||
return csr_matrix(J)
|
||||
|
||||
|
||||
def check_derivative(fun, jac, x0, bounds=(-np.inf, np.inf), args=(),
|
||||
kwargs={}):
|
||||
"""Check correctness of a function computing derivatives (Jacobian or
|
||||
gradient) by comparison with a finite difference approximation.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
Function of which to estimate the derivatives. The argument x
|
||||
passed to this function is ndarray of shape (n,) (never a scalar
|
||||
even if n=1). It must return 1-D array_like of shape (m,) or a scalar.
|
||||
jac : callable
|
||||
Function which computes Jacobian matrix of `fun`. It must work with
|
||||
argument x the same way as `fun`. The return value must be array_like
|
||||
or sparse matrix with an appropriate shape.
|
||||
x0 : array_like of shape (n,) or float
|
||||
Point at which to estimate the derivatives. Float will be converted
|
||||
to 1-D array.
|
||||
bounds : 2-tuple of array_like, optional
|
||||
Lower and upper bounds on independent variables. Defaults to no bounds.
|
||||
Each bound must match the size of `x0` or be a scalar, in the latter
|
||||
case the bound will be the same for all variables. Use it to limit the
|
||||
range of function evaluation.
|
||||
args, kwargs : tuple and dict, optional
|
||||
Additional arguments passed to `fun` and `jac`. Both empty by default.
|
||||
The calling signature is ``fun(x, *args, **kwargs)`` and the same
|
||||
for `jac`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
accuracy : float
|
||||
The maximum among all relative errors for elements with absolute values
|
||||
higher than 1 and absolute errors for elements with absolute values
|
||||
less than or equal to 1. If `accuracy` is on the order of 1e-6 or lower,
|
||||
then it is likely that your `jac` implementation is correct.
|
||||
|
||||
See Also
|
||||
--------
|
||||
approx_derivative : Compute finite difference approximation of derivative.
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize._numdiff import check_derivative
|
||||
>>>
|
||||
>>>
|
||||
>>> def f(x, c1, c2):
|
||||
... return np.array([x[0] * np.sin(c1 * x[1]),
|
||||
... x[0] * np.cos(c2 * x[1])])
|
||||
...
|
||||
>>> def jac(x, c1, c2):
|
||||
... return np.array([
|
||||
... [np.sin(c1 * x[1]), c1 * x[0] * np.cos(c1 * x[1])],
|
||||
... [np.cos(c2 * x[1]), -c2 * x[0] * np.sin(c2 * x[1])]
|
||||
... ])
|
||||
...
|
||||
>>>
|
||||
>>> x0 = np.array([1.0, 0.5 * np.pi])
|
||||
>>> check_derivative(f, jac, x0, args=(1, 2))
|
||||
2.4492935982947064e-16
|
||||
"""
|
||||
J_to_test = jac(x0, *args, **kwargs)
|
||||
if issparse(J_to_test):
|
||||
J_diff = approx_derivative(fun, x0, bounds=bounds, sparsity=J_to_test,
|
||||
args=args, kwargs=kwargs)
|
||||
J_to_test = csr_matrix(J_to_test)
|
||||
abs_err = J_to_test - J_diff
|
||||
i, j, abs_err_data = find(abs_err)
|
||||
J_diff_data = np.asarray(J_diff[i, j]).ravel()
|
||||
return np.max(np.abs(abs_err_data) /
|
||||
np.maximum(1, np.abs(J_diff_data)))
|
||||
else:
|
||||
J_diff = approx_derivative(fun, x0, bounds=bounds,
|
||||
args=args, kwargs=kwargs)
|
||||
abs_err = np.abs(J_to_test - J_diff)
|
||||
return np.max(abs_err / np.maximum(1, np.abs(J_diff)))
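The step-size rule and the three difference schemes used in this file can be illustrated for a scalar function with the standard library alone. This is a simplified sketch, not the module's code: `default_abs_step` and `scalar_derivative` are hypothetical names, bounds handling is omitted, and the `one_sided` flag stands in for what `_adjust_scheme_to_bounds` would decide.

```python
import math
import sys

# Machine epsilon for float64, the same value as np.finfo(np.float64).eps.
EPS = sys.float_info.epsilon


def default_abs_step(x, method):
    """Hypothetical scalar analogue of the default step rule."""
    # EPS**(1/2) for the first-order scheme, EPS**(1/3) for second-order ones.
    rel = EPS ** 0.5 if method == "2-point" else EPS ** (1 / 3)
    sign = 1.0 if x >= 0 else -1.0  # sign is +1 when x == 0
    return rel * sign * max(1.0, abs(x))


def scalar_derivative(f, x, method="3-point", one_sided=False):
    """Finite-difference estimate of f'(x); no bounds are considered."""
    h = default_abs_step(x, method)
    if method == "2-point":
        x1 = x + h
        dx = x1 - x  # recompute dx as an exactly representable number
        return (f(x1) - f(x)) / dx
    if method == "3-point" and one_sided:
        # Second-order one-sided scheme: (-3 f0 + 4 f1 - f2) / (2 h).
        x1, x2 = x + h, x + 2 * h
        dx = x2 - x
        return (-3.0 * f(x) + 4.0 * f(x1) - f(x2)) / dx
    if method == "3-point":
        h = abs(h)  # the sign of h is ignored for the central scheme
        x1, x2 = x - h, x + h
        return (f(x2) - f(x1)) / (x2 - x1)
    raise ValueError("method must be '2-point' or '3-point'")


for m in ("2-point", "3-point"):
    err = abs(scalar_derivative(math.sin, 1.0, m) - math.cos(1.0))
    print(m, err)
```

The larger relative step for the 3-point schemes reflects their smaller truncation error: a bigger `h` reduces round-off without truncation dominating the total error.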
|
||||
4093
venv/lib/python3.12/site-packages/scipy/optimize/_optimize.py
Normal file
File diff suppressed because it is too large
Binary file not shown.
731
venv/lib/python3.12/site-packages/scipy/optimize/_qap.py
Normal file
@ -0,0 +1,731 @@
|
||||
import numpy as np
|
||||
import operator
|
||||
from . import (linear_sum_assignment, OptimizeResult)
|
||||
from ._optimize import _check_unknown_options
|
||||
|
||||
from scipy._lib._util import check_random_state
|
||||
import itertools
|
||||
|
||||
QUADRATIC_ASSIGNMENT_METHODS = ['faq', '2opt']
|
||||
|
||||
def quadratic_assignment(A, B, method="faq", options=None):
|
||||
r"""
|
||||
Approximates solution to the quadratic assignment problem and
|
||||
the graph matching problem.
|
||||
|
||||
Quadratic assignment solves problems of the following form:
|
||||
|
||||
.. math::
|
||||
|
||||
\min_P & \ {\ \text{trace}(A^T P B P^T)}\\
|
||||
\mbox{s.t. } & {P \ \in \ \mathcal{P}}\\
|
||||
|
||||
where :math:`\mathcal{P}` is the set of all permutation matrices,
|
||||
and :math:`A` and :math:`B` are square matrices.
|
||||
|
||||
Graph matching tries to *maximize* the same objective function.
|
||||
This algorithm can be thought of as finding the alignment of the
|
||||
nodes of two graphs that minimizes the number of induced edge
|
||||
disagreements, or, in the case of weighted graphs, the sum of squared
|
||||
edge weight differences.
|
||||
|
||||
Note that the quadratic assignment problem is NP-hard. The results given
|
||||
here are approximations and are not guaranteed to be optimal.
|
||||
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array, square
|
||||
The square matrix :math:`A` in the objective function above.
|
||||
|
||||
B : 2-D array, square
|
||||
The square matrix :math:`B` in the objective function above.
|
||||
|
||||
method : str in {'faq', '2opt'} (default: 'faq')
|
||||
The algorithm used to solve the problem.
|
||||
:ref:`'faq' <optimize.qap-faq>` (default) and
|
||||
:ref:`'2opt' <optimize.qap-2opt>` are available.
|
||||
|
||||
options : dict, optional
|
||||
A dictionary of solver options. All solvers support the following:
|
||||
|
||||
maximize : bool (default: False)
|
||||
Maximizes the objective function if ``True``.
|
||||
|
||||
partial_match : 2-D array of integers, optional (default: None)
|
||||
Fixes part of the matching. Also known as a "seed" [2]_.
|
||||
|
||||
Each row of `partial_match` specifies a pair of matched nodes:
|
||||
node ``partial_match[i, 0]`` of `A` is matched to node
|
||||
``partial_match[i, 1]`` of `B`. The array has shape ``(m, 2)``,
|
||||
where ``m`` is not greater than the number of nodes, :math:`n`.
|
||||
|
||||
rng : {None, int, `numpy.random.Generator`,
|
||||
`numpy.random.RandomState`}, optional
|
||||
|
||||
If `rng` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `rng` is an int, a new ``RandomState`` instance is used,
seeded with `rng`.
If `rng` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
|
||||
|
||||
For method-specific options, see
|
||||
:func:`show_options('quadratic_assignment') <show_options>`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
`OptimizeResult` containing the following fields.
|
||||
|
||||
col_ind : 1-D array
|
||||
Column indices corresponding to the best permutation found of the
|
||||
nodes of `B`.
|
||||
fun : float
|
||||
The objective value of the solution.
|
||||
nit : int
|
||||
The number of iterations performed during optimization.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The default method :ref:`'faq' <optimize.qap-faq>` uses the Fast
|
||||
Approximate QAP algorithm [1]_; it typically offers the best combination of
|
||||
speed and accuracy.
|
||||
Method :ref:`'2opt' <optimize.qap-2opt>` can be computationally expensive,
|
||||
but may be a useful alternative, or it can be used to refine the solution
|
||||
returned by another method.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] J.T. Vogelstein, J.M. Conroy, V. Lyzinski, L.J. Podrazik,
|
||||
S.G. Kratzer, E.T. Harley, D.E. Fishkind, R.J. Vogelstein, and
|
||||
C.E. Priebe, "Fast approximate quadratic programming for graph
|
||||
matching," PLOS one, vol. 10, no. 4, p. e0121002, 2015,
|
||||
:doi:`10.1371/journal.pone.0121002`
|
||||
|
||||
.. [2] D. Fishkind, S. Adali, H. Patsolic, L. Meng, D. Singh, V. Lyzinski,
|
||||
C. Priebe, "Seeded graph matching", Pattern Recognit. 87 (2019):
|
||||
203-215, :doi:`10.1016/j.patcog.2018.09.014`
|
||||
|
||||
.. [3] "2-opt," Wikipedia.
|
||||
https://en.wikipedia.org/wiki/2-opt
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> import numpy as np
|
||||
>>> from scipy.optimize import quadratic_assignment
|
||||
>>> A = np.array([[0, 80, 150, 170], [80, 0, 130, 100],
|
||||
... [150, 130, 0, 120], [170, 100, 120, 0]])
|
||||
>>> B = np.array([[0, 5, 2, 7], [0, 0, 3, 8],
|
||||
... [0, 0, 0, 3], [0, 0, 0, 0]])
|
||||
>>> res = quadratic_assignment(A, B)
|
||||
>>> print(res)
|
||||
fun: 3260
|
||||
col_ind: [0 3 2 1]
|
||||
nit: 9
|
||||
|
||||
To see the relationship between the returned ``col_ind`` and ``fun``,
|
||||
use ``col_ind`` to form the best permutation matrix found, then evaluate
|
||||
the objective function :math:`f(P) = trace(A^T P B P^T )`.
|
||||
|
||||
>>> perm = res['col_ind']
|
||||
>>> P = np.eye(len(A), dtype=int)[perm]
|
||||
>>> fun = np.trace(A.T @ P @ B @ P.T)
|
||||
>>> print(fun)
|
||||
3260
|
||||
|
||||
Alternatively, to avoid constructing the permutation matrix explicitly,
|
||||
directly permute the rows and columns of the distance matrix.
|
||||
|
||||
>>> fun = np.trace(A.T @ B[perm][:, perm])
|
||||
>>> print(fun)
|
||||
3260
|
||||
|
||||
Although not guaranteed in general, ``quadratic_assignment`` happens to
|
||||
have found the globally optimal solution.
|
||||
|
||||
>>> from itertools import permutations
|
||||
>>> perm_opt, fun_opt = None, np.inf
|
||||
>>> for perm in permutations([0, 1, 2, 3]):
|
||||
... perm = np.array(perm)
|
||||
... fun = np.trace(A.T @ B[perm][:, perm])
|
||||
... if fun < fun_opt:
|
||||
... fun_opt, perm_opt = fun, perm
|
||||
>>> print(np.array_equal(perm_opt, res['col_ind']))
|
||||
True
|
||||
|
||||
Here is an example for which the default method,
|
||||
:ref:`'faq' <optimize.qap-faq>`, does not find the global optimum.
|
||||
|
||||
>>> A = np.array([[0, 5, 8, 6], [5, 0, 5, 1],
|
||||
... [8, 5, 0, 2], [6, 1, 2, 0]])
|
||||
>>> B = np.array([[0, 1, 8, 4], [1, 0, 5, 2],
|
||||
... [8, 5, 0, 5], [4, 2, 5, 0]])
|
||||
>>> res = quadratic_assignment(A, B)
|
||||
>>> print(res)
|
||||
fun: 178
|
||||
col_ind: [1 0 3 2]
|
||||
nit: 13
|
||||
|
||||
If accuracy is important, consider using :ref:`'2opt' <optimize.qap-2opt>`
|
||||
to refine the solution.
|
||||
|
||||
>>> guess = np.array([np.arange(len(A)), res.col_ind]).T
|
||||
>>> res = quadratic_assignment(A, B, method="2opt",
|
||||
... options = {'partial_guess': guess})
|
||||
>>> print(res)
|
||||
fun: 176
|
||||
col_ind: [1 2 3 0]
|
||||
nit: 17
|
||||
|
||||
"""
|
||||
|
||||
if options is None:
|
||||
options = {}
|
||||
|
||||
method = method.lower()
|
||||
methods = {"faq": _quadratic_assignment_faq,
|
||||
"2opt": _quadratic_assignment_2opt}
|
||||
if method not in methods:
|
||||
raise ValueError(f"method {method} must be in {methods}.")
|
||||
res = methods[method](A, B, **options)
|
||||
return res
|
||||
|
||||
|
||||
def _calc_score(A, B, perm):
|
||||
# equivalent to objective function but avoids matmul
|
||||
return np.sum(A * B[perm][:, perm])
|
||||
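The comment in `_calc_score` relies on the identity trace(A^T P B P^T) = sum(A * B[perm][:, perm]). A minimal numerical check of that identity is sketched below; the matrices and the permutation are made-up illustration values, not taken from the source.

```python
import numpy as np

# Hypothetical small inputs, chosen only for illustration.
A = np.array([[0, 2, 3], [2, 0, 1], [3, 1, 0]])
B = np.array([[0, 5, 4], [5, 0, 6], [4, 6, 0]])
perm = np.array([2, 0, 1])

# Permutation matrix whose i-th row is the perm[i]-th unit vector,
# so that P @ B @ P.T equals B[perm][:, perm].
P = np.eye(len(A), dtype=int)[perm]

# Objective written with explicit matrix products.
trace_form = np.trace(A.T @ P @ B @ P.T)

# Same quantity as an elementwise product and sum (no matmul).
sum_form = np.sum(A * B[perm][:, perm])
```

The elementwise form skips the matrix products entirely, which matters when the score is recomputed for every candidate swap.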
|
||||
|
||||
def _common_input_validation(A, B, partial_match):
|
||||
A = np.atleast_2d(A)
|
||||
B = np.atleast_2d(B)
|
||||
|
||||
if partial_match is None:
|
||||
partial_match = np.array([[], []]).T
|
||||
partial_match = np.atleast_2d(partial_match).astype(int)
|
||||
|
||||
msg = None
|
||||
if A.shape[0] != A.shape[1]:
|
||||
msg = "`A` must be square"
|
||||
elif B.shape[0] != B.shape[1]:
|
||||
msg = "`B` must be square"
|
||||
elif A.ndim != 2 or B.ndim != 2:
|
||||
msg = "`A` and `B` must have exactly two dimensions"
|
||||
elif A.shape != B.shape:
|
||||
msg = "`A` and `B` matrices must be of equal size"
|
||||
elif partial_match.shape[0] > A.shape[0]:
|
||||
msg = "`partial_match` can have only as many seeds as there are nodes"
|
||||
elif partial_match.shape[1] != 2:
|
||||
msg = "`partial_match` must have two columns"
|
||||
elif partial_match.ndim != 2:
|
||||
msg = "`partial_match` must have exactly two dimensions"
|
||||
elif (partial_match < 0).any():
|
||||
msg = "`partial_match` must contain only positive indices"
|
||||
elif (partial_match >= len(A)).any():
|
||||
msg = "`partial_match` entries must be less than number of nodes"
|
||||
elif (not len(set(partial_match[:, 0])) == len(partial_match[:, 0]) or
|
||||
not len(set(partial_match[:, 1])) == len(partial_match[:, 1])):
|
||||
msg = "`partial_match` column entries must be unique"
|
||||
|
||||
if msg is not None:
|
||||
raise ValueError(msg)
|
||||
|
||||
return A, B, partial_match
|
||||
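To make the expected layout of `partial_match` concrete, the sketch below re-implements a subset of the checks above in a standalone helper. The helper name `check_partial_match` and its short return strings are assumptions made for this illustration; they are not part of SciPy.

```python
import numpy as np

def check_partial_match(partial_match, n):
    """Hypothetical helper: return an error string, or None if the seeds look valid."""
    pm = np.atleast_2d(partial_match).astype(int)
    if pm.shape[0] > n:
        return "too many seeds"
    if pm.shape[1] != 2:
        return "need two columns"
    if (pm < 0).any() or (pm >= n).any():
        return "index out of range"
    # Each node of A and each node of B may appear at most once.
    if len(set(pm[:, 0])) != len(pm) or len(set(pm[:, 1])) != len(pm):
        return "duplicate node"
    return None

# Node 0 of A is matched to node 2 of B, node 3 of A to node 1 of B.
ok = check_partial_match([[0, 2], [3, 1]], n=4)
# Node 2 of B is used twice, which the uniqueness check rejects.
dup = check_partial_match([[0, 2], [1, 2]], n=4)
```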
|
||||
|
||||
def _quadratic_assignment_faq(A, B,
|
||||
maximize=False, partial_match=None, rng=None,
|
||||
P0="barycenter", shuffle_input=False, maxiter=30,
|
||||
tol=0.03, **unknown_options):
|
||||
r"""Solve the quadratic assignment problem (approximately).
|
||||
|
||||
This function solves the Quadratic Assignment Problem (QAP) and the
|
||||
Graph Matching Problem (GMP) using the Fast Approximate QAP Algorithm
|
||||
(FAQ) [1]_.
|
||||
|
||||
Quadratic assignment solves problems of the following form:
|
||||
|
||||
.. math::
|
||||
|
||||
\min_P & \ {\ \text{trace}(A^T P B P^T)}\\
|
||||
\mbox{s.t. } & {P \ \in \ \mathcal{P}}\\
|
||||
|
||||
where :math:`\mathcal{P}` is the set of all permutation matrices,
|
||||
and :math:`A` and :math:`B` are square matrices.
|
||||
|
||||
Graph matching tries to *maximize* the same objective function.
|
||||
This algorithm can be thought of as finding the alignment of the
|
||||
nodes of two graphs that minimizes the number of induced edge
|
||||
disagreements, or, in the case of weighted graphs, the sum of squared
|
||||
edge weight differences.
|
||||
|
||||
Note that the quadratic assignment problem is NP-hard. The results given
|
||||
here are approximations and are not guaranteed to be optimal.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array, square
|
||||
The square matrix :math:`A` in the objective function above.
|
||||
B : 2-D array, square
|
||||
The square matrix :math:`B` in the objective function above.
|
||||
method : str in {'faq', '2opt'} (default: 'faq')
|
||||
The algorithm used to solve the problem. This is the method-specific
|
||||
documentation for 'faq'.
|
||||
:ref:`'2opt' <optimize.qap-2opt>` is also available.
|
||||
|
||||
Options
|
||||
-------
|
||||
maximize : bool (default: False)
|
||||
Maximizes the objective function if ``True``.
|
||||
partial_match : 2-D array of integers, optional (default: None)
|
||||
Fixes part of the matching. Also known as a "seed" [2]_.
|
||||
|
||||
Each row of `partial_match` specifies a pair of matched nodes:
|
||||
node ``partial_match[i, 0]`` of `A` is matched to node
|
||||
``partial_match[i, 1]`` of `B`. The array has shape ``(m, 2)``, where
|
||||
``m`` is not greater than the number of nodes, :math:`n`.
|
||||
|
||||
rng : {None, int, `numpy.random.Generator`,
|
||||
`numpy.random.RandomState`}, optional
|
||||
|
||||
If `rng` is None (or `np.random`), the `numpy.random.RandomState`
|
||||
singleton is used.
|
||||
If `rng` is an int, a new ``RandomState`` instance is used,
|
||||
seeded with `rng`.
|
||||
If `rng` is already a ``Generator`` or ``RandomState`` instance then
|
||||
that instance is used.
|
||||
P0 : 2-D array, "barycenter", or "randomized" (default: "barycenter")
|
||||
Initial position. Must be a doubly-stochastic matrix [3]_.
|
||||
|
||||
If the initial position is an array, it must be a doubly stochastic
|
||||
matrix of size :math:`m' \times m'` where :math:`m' = n - m`.
|
||||
|
||||
If ``"barycenter"`` (default), the initial position is the barycenter
|
||||
of the Birkhoff polytope (the space of doubly stochastic matrices).
|
||||
This is a :math:`m' \times m'` matrix with all entries equal to
|
||||
:math:`1 / m'`.
|
||||
|
||||
If ``"randomized"`` the initial search position is
|
||||
:math:`P_0 = (J + K) / 2`, where :math:`J` is the barycenter and
|
||||
:math:`K` is a random doubly stochastic matrix.
|
||||
shuffle_input : bool (default: False)
|
||||
Set to `True` to resolve degenerate gradients randomly. For
|
||||
non-degenerate gradients this option has no effect.
|
||||
maxiter : int, positive (default: 30)
|
||||
Integer specifying the max number of Frank-Wolfe iterations performed.
|
||||
tol : float (default: 0.03)
|
||||
Tolerance for termination. Frank-Wolfe iteration terminates when
|
||||
:math:`\frac{||P_{i}-P_{i+1}||_F}{\sqrt{m'}} \leq tol`,
|
||||
where :math:`i` is the iteration number.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
`OptimizeResult` containing the following fields.
|
||||
|
||||
col_ind : 1-D array
|
||||
Column indices corresponding to the best permutation found of the
|
||||
nodes of `B`.
|
||||
fun : float
|
||||
The objective value of the solution.
|
||||
nit : int
|
||||
The number of Frank-Wolfe iterations performed.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The algorithm may be sensitive to the initial permutation matrix (or
|
||||
search "position") due to the possibility of several local minima
|
||||
within the feasible region. A barycenter initialization is more likely to
|
||||
result in a better solution than a single random initialization. However,
|
||||
calling ``quadratic_assignment`` several times with different random
|
||||
initializations may result in a better optimum at the cost of longer
|
||||
total execution time.
|
||||
|
||||
Examples
|
||||
--------
|
||||
As mentioned above, a barycenter initialization often results in a better
|
||||
solution than a single random initialization.
|
||||
|
||||
>>> from numpy.random import default_rng
|
||||
>>> rng = default_rng()
|
||||
>>> n = 15
|
||||
>>> A = rng.random((n, n))
|
||||
>>> B = rng.random((n, n))
|
||||
>>> res = quadratic_assignment(A, B) # FAQ is default method
|
||||
>>> print(res.fun)
|
||||
46.871483385480545 # may vary
|
||||
|
||||
>>> options = {"P0": "randomized"} # use randomized initialization
|
||||
>>> res = quadratic_assignment(A, B, options=options)
|
||||
>>> print(res.fun)
|
||||
47.224831071310625 # may vary
|
||||
|
||||
However, consider running from several randomized initializations and
|
||||
keeping the best result.
|
||||
|
||||
>>> res = min([quadratic_assignment(A, B, options=options)
|
||||
... for i in range(30)], key=lambda x: x.fun)
|
||||
>>> print(res.fun)
|
||||
46.671852533681516 # may vary
|
||||
|
||||
The '2opt' method can be used to further refine the results.
|
||||
|
||||
>>> options = {"partial_guess": np.array([np.arange(n), res.col_ind]).T}
|
||||
>>> res = quadratic_assignment(A, B, method="2opt", options=options)
|
||||
>>> print(res.fun)
|
||||
46.47160735721583 # may vary
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] J.T. Vogelstein, J.M. Conroy, V. Lyzinski, L.J. Podrazik,
|
||||
S.G. Kratzer, E.T. Harley, D.E. Fishkind, R.J. Vogelstein, and
|
||||
C.E. Priebe, "Fast approximate quadratic programming for graph
|
||||
matching," PLOS one, vol. 10, no. 4, p. e0121002, 2015,
|
||||
:doi:`10.1371/journal.pone.0121002`
|
||||
|
||||
.. [2] D. Fishkind, S. Adali, H. Patsolic, L. Meng, D. Singh, V. Lyzinski,
|
||||
C. Priebe, "Seeded graph matching", Pattern Recognit. 87 (2019):
|
||||
203-215, :doi:`10.1016/j.patcog.2018.09.014`
|
||||
|
||||
.. [3] "Doubly stochastic Matrix," Wikipedia.
|
||||
https://en.wikipedia.org/wiki/Doubly_stochastic_matrix
|
||||
|
||||
"""
|
||||
|
||||
_check_unknown_options(unknown_options)
|
||||
|
||||
maxiter = operator.index(maxiter)
|
||||
|
||||
# ValueError check
|
||||
A, B, partial_match = _common_input_validation(A, B, partial_match)
|
||||
|
||||
msg = None
|
||||
if isinstance(P0, str) and P0 not in {'barycenter', 'randomized'}:
|
||||
msg = "Invalid 'P0' parameter string"
|
||||
elif maxiter <= 0:
|
||||
msg = "'maxiter' must be a positive integer"
|
||||
elif tol <= 0:
|
||||
msg = "'tol' must be a positive float"
|
||||
if msg is not None:
|
||||
raise ValueError(msg)
|
||||
|
||||
rng = check_random_state(rng)
|
||||
n = len(A) # number of vertices in graphs
|
||||
n_seeds = len(partial_match) # number of seeds
|
||||
n_unseed = n - n_seeds
|
||||
|
||||
# [1] Algorithm 1 Line 1 - choose initialization
|
||||
if not isinstance(P0, str):
|
||||
P0 = np.atleast_2d(P0)
|
||||
if P0.shape != (n_unseed, n_unseed):
|
||||
msg = "`P0` matrix must have shape m' x m', where m'=n-m"
|
||||
elif ((P0 < 0).any() or not np.allclose(np.sum(P0, axis=0), 1)
|
||||
or not np.allclose(np.sum(P0, axis=1), 1)):
|
||||
msg = "`P0` matrix must be doubly stochastic"
|
||||
if msg is not None:
|
||||
raise ValueError(msg)
|
||||
elif P0 == 'barycenter':
|
||||
P0 = np.ones((n_unseed, n_unseed)) / n_unseed
|
||||
elif P0 == 'randomized':
|
||||
J = np.ones((n_unseed, n_unseed)) / n_unseed
|
||||
# generate a nxn matrix where each entry is a random number [0, 1]
|
||||
# would use rand, but Generators don't have it
|
||||
# would use random, but old mtrand.RandomStates don't have it
|
||||
K = _doubly_stochastic(rng.uniform(size=(n_unseed, n_unseed)))
|
||||
P0 = (J + K) / 2
|
||||
|
||||
# check trivial cases
|
||||
if n == 0 or n_seeds == n:
|
||||
score = _calc_score(A, B, partial_match[:, 1])
|
||||
res = {"col_ind": partial_match[:, 1], "fun": score, "nit": 0}
|
||||
return OptimizeResult(res)
|
||||
|
||||
obj_func_scalar = 1
|
||||
if maximize:
|
||||
obj_func_scalar = -1
|
||||
|
||||
nonseed_B = np.setdiff1d(range(n), partial_match[:, 1])
|
||||
if shuffle_input:
|
||||
nonseed_B = rng.permutation(nonseed_B)
|
||||
|
||||
nonseed_A = np.setdiff1d(range(n), partial_match[:, 0])
|
||||
perm_A = np.concatenate([partial_match[:, 0], nonseed_A])
|
||||
perm_B = np.concatenate([partial_match[:, 1], nonseed_B])
|
||||
|
||||
# definitions according to Seeded Graph Matching [2].
|
||||
A11, A12, A21, A22 = _split_matrix(A[perm_A][:, perm_A], n_seeds)
|
||||
B11, B12, B21, B22 = _split_matrix(B[perm_B][:, perm_B], n_seeds)
|
||||
const_sum = A21 @ B21.T + A12.T @ B12
|
||||
|
||||
P = P0
|
||||
# [1] Algorithm 1 Line 2 - loop while stopping criteria not met
|
||||
for n_iter in range(1, maxiter+1):
|
||||
# [1] Algorithm 1 Line 3 - compute the gradient of f(P) = -tr(APB^tP^t)
|
||||
grad_fp = (const_sum + A22 @ P @ B22.T + A22.T @ P @ B22)
|
||||
# [1] Algorithm 1 Line 4 - get direction Q by solving Eq. 8
|
||||
_, cols = linear_sum_assignment(grad_fp, maximize=maximize)
|
||||
Q = np.eye(n_unseed)[cols]
|
||||
|
||||
# [1] Algorithm 1 Line 5 - compute the step size
|
||||
# Noting that e.g. trace(Ax) = trace(A)*x, expand and re-collect
|
||||
# terms as ax**2 + bx + c. c does not affect location of minimum
|
||||
# and can be ignored. Also, note that trace(A@B) = (A.T*B).sum();
|
||||
# apply where possible for efficiency.
|
||||
R = P - Q
|
||||
b21 = ((R.T @ A21) * B21).sum()
|
||||
b12 = ((R.T @ A12.T) * B12.T).sum()
|
||||
AR22 = A22.T @ R
|
||||
BR22 = B22 @ R.T
|
||||
b22a = (AR22 * B22.T[cols]).sum()
|
||||
b22b = (A22 * BR22[cols]).sum()
|
||||
a = (AR22.T * BR22).sum()
|
||||
b = b21 + b12 + b22a + b22b
|
||||
# critical point of ax^2 + bx + c is at x = -b/(2*a)
|
||||
# if a * obj_func_scalar > 0, it is a minimum
|
||||
# if minimum is not in [0, 1], only endpoints need to be considered
|
||||
if a*obj_func_scalar > 0 and 0 <= -b/(2*a) <= 1:
|
||||
alpha = -b/(2*a)
|
||||
else:
|
||||
alpha = np.argmin([0, (b + a)*obj_func_scalar])
|
||||
|
||||
# [1] Algorithm 1 Line 6 - Update P
|
||||
P_i1 = alpha * P + (1 - alpha) * Q
|
||||
if np.linalg.norm(P - P_i1) / np.sqrt(n_unseed) < tol:
|
||||
P = P_i1
|
||||
break
|
||||
P = P_i1
|
||||
# [1] Algorithm 1 Line 7 - end main loop
|
||||
|
||||
# [1] Algorithm 1 Line 8 - project onto the set of permutation matrices
|
||||
_, col = linear_sum_assignment(P, maximize=True)
|
||||
perm = np.concatenate((np.arange(n_seeds), col + n_seeds))
|
||||
|
||||
unshuffled_perm = np.zeros(n, dtype=int)
|
||||
unshuffled_perm[perm_A] = perm_B[perm]
|
||||
|
||||
score = _calc_score(A, B, unshuffled_perm)
|
||||
res = {"col_ind": unshuffled_perm, "fun": score, "nit": n_iter}
|
||||
return OptimizeResult(res)
|
||||
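The step-size rule in the Frank-Wolfe loop above is an exact line search on a quadratic in `alpha` over the interval [0, 1]. The standalone sketch below isolates that rule; the function name `step_size` is an assumption for illustration, and the coefficients `a` and `b` are passed in directly rather than computed from the matrices.

```python
import numpy as np

def step_size(a, b, maximize=False):
    """Pick alpha in [0, 1] that optimizes a*alpha**2 + b*alpha (illustrative sketch)."""
    sign = -1 if maximize else 1
    # An interior critical point is the optimum only if the curvature
    # has the right sign and the point lies inside [0, 1].
    if a * sign > 0 and 0 <= -b / (2 * a) <= 1:
        return -b / (2 * a)
    # Otherwise compare the endpoints: the value is 0 at alpha = 0
    # and a + b at alpha = 1.
    return int(np.argmin([0, (a + b) * sign]))

interior = step_size(1.0, -1.0)      # vertex at 0.5 lies inside [0, 1]
left_end = step_size(1.0, 1.0)       # vertex at -0.5, so an endpoint is used
right_end = step_size(-1.0, 0.5)     # concave case, endpoint alpha = 1 is lower
```

Because the constant term does not move the optimum, only `a` and `b` are needed, which matches the comment in the source about ignoring `c`.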
|
||||
|
||||
def _split_matrix(X, n):
|
||||
# definitions according to Seeded Graph Matching [2].
|
||||
upper, lower = X[:n], X[n:]
|
||||
return upper[:, :n], upper[:, n:], lower[:, :n], lower[:, n:]
|
||||
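As a quick illustration of the block layout produced by `_split_matrix`, the sketch below copies the two-line body into a standalone function (named `split_matrix` here only to keep the example self-contained) and applies it to a made-up 4 x 4 array with one seed.

```python
import numpy as np

def split_matrix(X, n):
    # First n rows/columns correspond to seeded nodes, the rest are unseeded.
    upper, lower = X[:n], X[n:]
    return upper[:, :n], upper[:, n:], lower[:, :n], lower[:, n:]

X = np.arange(16).reshape(4, 4)   # illustrative data
X11, X12, X21, X22 = split_matrix(X, 1)

# Reassembling the four blocks recovers the original matrix.
rebuilt = np.block([[X11, X12], [X21, X22]])
```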
|
||||
|
||||
def _doubly_stochastic(P, tol=1e-3):
|
||||
# Adapted from @btaba implementation
|
||||
# https://github.com/btaba/sinkhorn_knopp
|
||||
# of Sinkhorn-Knopp algorithm
|
||||
# https://projecteuclid.org/euclid.pjm/1102992505
|
||||
|
||||
max_iter = 1000
|
||||
c = 1 / P.sum(axis=0)
|
||||
r = 1 / (P @ c)
|
||||
P_eps = P
|
||||
|
||||
for it in range(max_iter):
|
||||
if ((np.abs(P_eps.sum(axis=1) - 1) < tol).all() and
|
||||
(np.abs(P_eps.sum(axis=0) - 1) < tol).all()):
|
||||
# All column/row sums ~= 1 within threshold
|
||||
break
|
||||
|
||||
c = 1 / (r @ P)
|
||||
r = 1 / (P @ c)
|
||||
P_eps = r[:, None] * P * c
|
||||
|
||||
return P_eps
|
||||
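The idea behind `_doubly_stochastic` is Sinkhorn-Knopp balancing: alternately rescale columns and rows of a positive matrix until all sums are close to 1. The sketch below is a simplified variant that normalizes the matrix directly instead of tracking the scaling vectors `r` and `c`; the function name and the random test matrix are assumptions for illustration.

```python
import numpy as np

def sinkhorn_knopp(P, tol=1e-3, max_iter=1000):
    """Balance a positive square matrix so rows and columns sum to ~1 (sketch)."""
    M = np.array(P, dtype=float)
    for _ in range(max_iter):
        rows_ok = (np.abs(M.sum(axis=1) - 1) < tol).all()
        cols_ok = (np.abs(M.sum(axis=0) - 1) < tol).all()
        if rows_ok and cols_ok:
            break
        M = M / M.sum(axis=0, keepdims=True)   # make columns sum to 1
        M = M / M.sum(axis=1, keepdims=True)   # make rows sum to 1
    return M

rng = np.random.default_rng(0)
K = sinkhorn_knopp(rng.uniform(size=(5, 5)))

# "randomized" start as described in the docstring: average with the barycenter.
J = np.ones((5, 5)) / 5
P0 = (J + K) / 2
```

Averaging with the barycenter keeps `P0` doubly stochastic up to the balancing tolerance, since the set of doubly stochastic matrices is convex.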
|
||||
|
||||
def _quadratic_assignment_2opt(A, B, maximize=False, rng=None,
|
||||
partial_match=None,
|
||||
partial_guess=None,
|
||||
**unknown_options):
|
||||
r"""Solve the quadratic assignment problem (approximately).
|
||||
|
||||
This function solves the Quadratic Assignment Problem (QAP) and the
|
||||
Graph Matching Problem (GMP) using the 2-opt algorithm [1]_.
|
||||
|
||||
Quadratic assignment solves problems of the following form:
|
||||
|
||||
.. math::
|
||||
|
||||
\min_P & \ {\ \text{trace}(A^T P B P^T)}\\
|
||||
\mbox{s.t. } & {P \ \in \ \mathcal{P}}\\
|
||||
|
||||
where :math:`\mathcal{P}` is the set of all permutation matrices,
|
||||
and :math:`A` and :math:`B` are square matrices.
|
||||
|
||||
Graph matching tries to *maximize* the same objective function.
|
||||
This algorithm can be thought of as finding the alignment of the
|
||||
nodes of two graphs that minimizes the number of induced edge
|
||||
disagreements, or, in the case of weighted graphs, the sum of squared
|
||||
edge weight differences.
|
||||
|
||||
Note that the quadratic assignment problem is NP-hard. The results given
|
||||
here are approximations and are not guaranteed to be optimal.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array, square
|
||||
The square matrix :math:`A` in the objective function above.
|
||||
B : 2-D array, square
|
||||
The square matrix :math:`B` in the objective function above.
|
||||
method : str in {'faq', '2opt'} (default: 'faq')
|
||||
The algorithm used to solve the problem. This is the method-specific
|
||||
documentation for '2opt'.
|
||||
:ref:`'faq' <optimize.qap-faq>` is also available.
|
||||
|
||||
Options
|
||||
-------
|
||||
maximize : bool (default: False)
|
||||
Maximizes the objective function if ``True``.
|
||||
rng : {None, int, `numpy.random.Generator`,
|
||||
`numpy.random.RandomState`}, optional
|
||||
|
||||
If `rng` is None (or `np.random`), the `numpy.random.RandomState`
|
||||
singleton is used.
|
||||
If `rng` is an int, a new ``RandomState`` instance is used,
|
||||
seeded with `rng`.
|
||||
If `rng` is already a ``Generator`` or ``RandomState`` instance then
|
||||
that instance is used.
|
||||
partial_match : 2-D array of integers, optional (default: None)
|
||||
Fixes part of the matching. Also known as a "seed" [2]_.
|
||||
|
||||
Each row of `partial_match` specifies a pair of matched nodes: node
|
||||
``partial_match[i, 0]`` of `A` is matched to node
|
||||
``partial_match[i, 1]`` of `B`. The array has shape ``(m, 2)``,
|
||||
where ``m`` is not greater than the number of nodes, :math:`n`.
|
||||
|
||||
.. note::
|
||||
`partial_match` must be sorted by the first column.
|
||||
|
||||
partial_guess : 2-D array of integers, optional (default: None)
|
||||
A guess for the matching between the two matrices. Unlike
|
||||
`partial_match`, `partial_guess` does not fix the indices; they are
|
||||
still free to be optimized.
|
||||
|
||||
Each row of `partial_guess` specifies a pair of matched nodes: node
|
||||
``partial_guess[i, 0]`` of `A` is matched to node
|
||||
``partial_guess[i, 1]`` of `B`. The array has shape ``(m, 2)``,
|
||||
where ``m`` is not greater than the number of nodes, :math:`n`.
|
||||
|
||||
.. note::
|
||||
`partial_guess` must be sorted by the first column.
|
||||
|
||||
Returns
|
||||
-------
|
||||
res : OptimizeResult
|
||||
`OptimizeResult` containing the following fields.
|
||||
|
||||
col_ind : 1-D array
|
||||
Column indices corresponding to the best permutation found of the
|
||||
nodes of `B`.
|
||||
fun : float
|
||||
The objective value of the solution.
|
||||
nit : int
|
||||
The number of iterations performed during optimization.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This is a greedy algorithm that works similarly to bubble sort: beginning
|
||||
with an initial permutation, it iteratively swaps pairs of indices to
|
||||
improve the objective function until no such improvements are possible.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] "2-opt," Wikipedia.
|
||||
https://en.wikipedia.org/wiki/2-opt
|
||||
|
||||
.. [2] D. Fishkind, S. Adali, H. Patsolic, L. Meng, D. Singh, V. Lyzinski,
|
||||
C. Priebe, "Seeded graph matching", Pattern Recognit. 87 (2019):
|
||||
203-215, :doi:`10.1016/j.patcog.2018.09.014`
|
||||
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
rng = check_random_state(rng)
|
||||
A, B, partial_match = _common_input_validation(A, B, partial_match)
|
||||
|
||||
N = len(A)
|
||||
# check trivial cases
|
||||
if N == 0 or partial_match.shape[0] == N:
|
||||
score = _calc_score(A, B, partial_match[:, 1])
|
||||
res = {"col_ind": partial_match[:, 1], "fun": score, "nit": 0}
|
||||
return OptimizeResult(res)
|
||||
|
||||
if partial_guess is None:
|
||||
partial_guess = np.array([[], []]).T
|
||||
partial_guess = np.atleast_2d(partial_guess).astype(int)
|
||||
|
||||
msg = None
|
||||
if partial_guess.shape[0] > A.shape[0]:
|
||||
msg = ("`partial_guess` can have only as "
|
||||
"many entries as there are nodes")
|
||||
elif partial_guess.shape[1] != 2:
|
||||
msg = "`partial_guess` must have two columns"
|
||||
elif partial_guess.ndim != 2:
|
||||
msg = "`partial_guess` must have exactly two dimensions"
|
||||
elif (partial_guess < 0).any():
|
||||
msg = "`partial_guess` must contain only positive indices"
|
||||
elif (partial_guess >= len(A)).any():
|
||||
msg = "`partial_guess` entries must be less than number of nodes"
|
||||
elif (not len(set(partial_guess[:, 0])) == len(partial_guess[:, 0]) or
|
||||
not len(set(partial_guess[:, 1])) == len(partial_guess[:, 1])):
|
||||
msg = "`partial_guess` column entries must be unique"
|
||||
if msg is not None:
|
||||
raise ValueError(msg)
|
||||
|
||||
fixed_rows = None
|
||||
if partial_match.size or partial_guess.size:
|
||||
# use partial_match and partial_guess for initial permutation,
|
||||
# but randomly permute the rest.
|
||||
guess_rows = np.zeros(N, dtype=bool)
|
||||
guess_cols = np.zeros(N, dtype=bool)
|
||||
fixed_rows = np.zeros(N, dtype=bool)
|
||||
fixed_cols = np.zeros(N, dtype=bool)
|
||||
perm = np.zeros(N, dtype=int)
|
||||
|
||||
rg, cg = partial_guess.T
|
||||
guess_rows[rg] = True
|
||||
guess_cols[cg] = True
|
||||
perm[guess_rows] = cg
|
||||
|
||||
# match overrides guess
|
||||
rf, cf = partial_match.T
|
||||
fixed_rows[rf] = True
|
||||
fixed_cols[cf] = True
|
||||
perm[fixed_rows] = cf
|
||||
|
||||
random_rows = ~fixed_rows & ~guess_rows
|
||||
random_cols = ~fixed_cols & ~guess_cols
|
||||
perm[random_rows] = rng.permutation(np.arange(N)[random_cols])
|
||||
else:
|
||||
perm = rng.permutation(np.arange(N))
|
||||
|
||||
best_score = _calc_score(A, B, perm)
|
||||
|
||||
i_free = np.arange(N)
|
||||
if fixed_rows is not None:
|
||||
i_free = i_free[~fixed_rows]
|
||||
|
||||
better = operator.gt if maximize else operator.lt
|
||||
n_iter = 0
|
||||
done = False
|
||||
while not done:
|
||||
# equivalent to nested for loops over the free indices: i, then j from i on
|
||||
for i, j in itertools.combinations_with_replacement(i_free, 2):
|
||||
n_iter += 1
|
||||
perm[i], perm[j] = perm[j], perm[i]
|
||||
score = _calc_score(A, B, perm)
|
||||
if better(score, best_score):
|
||||
best_score = score
|
||||
break
|
||||
# faster to swap back than to create a new list every time
|
||||
perm[i], perm[j] = perm[j], perm[i]
|
||||
else: # no swaps made
|
||||
done = True
|
||||
|
||||
res = {"col_ind": perm, "fun": best_score, "nit": n_iter}
|
||||
return OptimizeResult(res)
|
||||
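The swap loop above can be reduced to a few lines when there are no seeds or guesses. The sketch below is a simplified stand-in under that assumption: it starts from a caller-supplied permutation, only minimizes, and uses `itertools.combinations` (skipping the no-op `i == j` pairs that `combinations_with_replacement` would include). The names `score` and `two_opt` are illustrative; the matrices are the ones from the `quadratic_assignment` docstring example.

```python
import itertools
import numpy as np

def score(A, B, perm):
    # Objective value for a permutation, without building P explicitly.
    return np.sum(A * B[perm][:, perm])

def two_opt(A, B, perm):
    """Greedy pairwise-swap local search (minimization only, illustrative)."""
    perm = np.array(perm)
    best = score(A, B, perm)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(perm)), 2):
            perm[i], perm[j] = perm[j], perm[i]      # try a swap
            s = score(A, B, perm)
            if s < best:
                best, improved = s, True             # keep it and restart the scan
                break
            perm[i], perm[j] = perm[j], perm[i]      # undo the swap
    return perm, best

A = np.array([[0, 5, 8, 6], [5, 0, 5, 1], [8, 5, 0, 2], [6, 1, 2, 0]])
B = np.array([[0, 1, 8, 4], [1, 0, 5, 2], [8, 5, 0, 5], [4, 2, 5, 0]])
start = np.arange(4)
perm, best = two_opt(A, B, start)
```

The result is only a local optimum with respect to single swaps, so its quality depends on the starting permutation.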
@ -0,0 +1,522 @@
|
||||
"""
|
||||
Routines for removing redundant (linearly dependent) equations from linear
|
||||
programming equality constraints.
|
||||
"""
|
||||
# Author: Matt Haberland
|
||||
|
||||
import numpy as np
|
||||
from scipy.linalg import svd
|
||||
from scipy.linalg.interpolative import interp_decomp
|
||||
import scipy
|
||||
from scipy.linalg.blas import dtrsm
|
||||
|
||||
|
||||
def _row_count(A):
|
||||
"""
|
||||
Counts the number of nonzeros in each row of input array A.
|
||||
Nonzeros are defined as any element with absolute value greater than
|
||||
tol = 1e-13. This value should probably be an input to the function.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array
|
||||
An array representing a matrix
|
||||
|
||||
Returns
|
||||
-------
|
||||
rowcount : 1-D array
|
||||
Number of nonzeros in each row of A
|
||||
|
||||
"""
|
||||
tol = 1e-13
|
||||
return np.array((abs(A) > tol).sum(axis=1)).flatten()
|
||||
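A small illustration of the tolerance-based count follows. In this sketch `tol` is exposed as a keyword argument, as the docstring suggests it probably should be; that signature is an assumption and differs from the source, where the value is hard-coded.

```python
import numpy as np

def row_count(A, tol=1e-13):
    # Entries with magnitude at or below tol are treated as zero.
    return np.array((abs(A) > tol).sum(axis=1)).flatten()

A = np.array([[1.0, 0.0, 1e-15],
              [0.0, 0.0, 0.0],
              [2.0, -3.0, 4.0]])
counts = row_count(A)
```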
|
||||
|
||||
def _get_densest(A, eligibleRows):
|
||||
"""
|
||||
Returns the index of the densest row of A. Ignores rows that are not
|
||||
eligible for consideration.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array
|
||||
An array representing a matrix
|
||||
eligibleRows : 1-D logical array
|
||||
Values indicate whether the corresponding row of A is eligible
|
||||
to be considered
|
||||
|
||||
Returns
|
||||
-------
|
||||
i_densest : int
|
||||
Index of the densest row in A eligible for consideration
|
||||
|
||||
"""
|
||||
rowCounts = _row_count(A)
|
||||
return np.argmax(rowCounts * eligibleRows)
|
||||
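The masking trick in `_get_densest` multiplies the per-row counts by the boolean eligibility array, so ineligible rows score zero. The inline sketch below uses made-up data to show the effect.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
eligible = np.array([False, True, True])   # row 0 is excluded

# Nonzeros per row, using the same tolerance as _row_count.
row_counts = (np.abs(A) > 1e-13).sum(axis=1)

# Ineligible rows contribute 0, so argmax picks the densest eligible row.
i_densest = int(np.argmax(row_counts * eligible))
```

Note that if no row is eligible, `argmax` still returns index 0, so callers presumably need to ensure at least one eligible row exists.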
|
||||
|
||||
def _remove_zero_rows(A, b):
|
||||
"""
|
||||
Eliminates trivial equations from system of equations defined by Ax = b
|
||||
and identifies trivial infeasibilities
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
b : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
|
||||
Returns
|
||||
-------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
b : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
status: int
|
||||
An integer indicating the status of the removal operation
|
||||
0: No infeasibility identified
|
||||
2: Trivially infeasible
|
||||
message : str
|
||||
A string descriptor of the exit status of the optimization.
|
||||
|
||||
"""
|
||||
status = 0
|
||||
message = ""
|
||||
i_zero = _row_count(A) == 0
|
||||
A = A[np.logical_not(i_zero), :]
|
||||
if not np.allclose(b[i_zero], 0):
|
||||
status = 2
|
||||
message = "There is a zero row in A_eq with a nonzero corresponding " \
|
||||
"entry in b_eq. The problem is infeasible."
|
||||
b = b[np.logical_not(i_zero)]
|
||||
return A, b, status, message
|
||||
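The two outcomes of `_remove_zero_rows` can be illustrated with a compact stand-in. The function below is a simplified sketch (it omits the message string and inlines the row count); its name and the sample data are assumptions for illustration.

```python
import numpy as np

def remove_zero_rows(A, b, tol=1e-13):
    """Drop all-zero rows of A; status 2 flags 0 = nonzero, i.e. infeasibility."""
    zero = (np.abs(A) > tol).sum(axis=1) == 0
    status = 0 if np.allclose(b[zero], 0) else 2
    return A[~zero], b[~zero], status

A = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [3.0, 4.0]])

# Zero row paired with a zero right-hand side: the equation 0 = 0 is dropped.
A_ok, b_ok, status_ok = remove_zero_rows(A, np.array([1.0, 0.0, 2.0]))

# Zero row paired with a nonzero right-hand side: 0 = 5 cannot be satisfied.
_, _, status_bad = remove_zero_rows(A, np.array([1.0, 5.0, 2.0]))
```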
|
||||
|
||||
def bg_update_dense(plu, perm_r, v, j):
|
||||
LU, p = plu
|
||||
|
||||
vperm = v[perm_r]
|
||||
u = dtrsm(1, LU, vperm, lower=1, diag=1)
|
||||
LU[:j+1, j] = u[:j+1]
|
||||
l = u[j+1:]
|
||||
piv = LU[j, j]
|
||||
LU[j+1:, j] += (l/piv)
|
||||
return LU, p
|
||||
|
||||
|
||||
def _remove_redundancy_pivot_dense(A, rhs, true_rank=None):
|
||||
"""
|
||||
Eliminates redundant equations from system of equations defined by Ax = b
|
||||
and identifies infeasibilities.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
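A : 2-D array
|
||||
A matrix representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
true_rank : int, optional
|
||||
The rank of A, if known. If given, the search stops as soon as
|
||||
``m - true_rank`` dependent rows have been found, where ``m`` is the
|
||||
number of rows of A after zero rows are removed.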
|
||||
|
||||
Returns
|
||||
-------
|
||||
A : 2-D array
|
||||
A matrix representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
status: int
|
||||
An integer indicating the status of the system
|
||||
0: No infeasibility identified
|
||||
2: Trivially infeasible
|
||||
message : str
|
||||
A string descriptor of the exit status of the optimization.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [2] Andersen, Erling D. "Finding all linearly dependent rows in
|
||||
large-scale linear programming." Optimization Methods and Software
|
||||
6.3 (1995): 219-227.
|
||||
|
||||
"""
|
||||
tolapiv = 1e-8
|
||||
tolprimal = 1e-8
|
||||
status = 0
|
||||
message = ""
|
||||
inconsistent = ("There is a linear combination of rows of A_eq that "
|
||||
"results in zero, suggesting a redundant constraint. "
|
||||
"However the same linear combination of b_eq is "
|
||||
"nonzero, suggesting that the constraints conflict "
|
||||
"and the problem is infeasible.")
|
||||
A, rhs, status, message = _remove_zero_rows(A, rhs)
|
||||
|
||||
if status != 0:
|
||||
return A, rhs, status, message
|
||||
|
||||
m, n = A.shape
|
||||
|
||||
v = list(range(m)) # Artificial column indices.
|
||||
b = list(v) # Basis column indices.
|
||||
# This is better as a list than a set because column order of basis matrix
|
||||
# needs to be consistent.
|
||||
d = [] # Indices of dependent rows
|
||||
perm_r = None
|
||||
|
||||
A_orig = A
|
||||
A = np.zeros((m, m + n), order='F')
|
||||
np.fill_diagonal(A, 1)
|
||||
A[:, m:] = A_orig
|
||||
e = np.zeros(m)
|
||||
|
||||
js_candidates = np.arange(m, m+n, dtype=int) # candidate columns for basis
|
||||
# manual masking was faster than masked array
|
||||
js_mask = np.ones(js_candidates.shape, dtype=bool)
|
||||
|
||||
# Implements basic algorithm from [2]
|
||||
# Uses some of the suggested improvements (removing zero rows and
|
||||
# Bartels-Golub update idea).
|
||||
# Removing column singletons would be easy, but it is not as important
|
||||
# because the procedure is performed only on the equality constraint
|
||||
# matrix from the original problem - not on the canonical form matrix,
|
||||
# which would have many more column singletons due to slack variables
|
||||
# from the inequality constraints.
|
||||
# The thoughts on "crashing" the initial basis are only really useful if
|
||||
# the matrix is sparse.
|
||||
|
||||
lu = np.eye(m, order='F'), np.arange(m) # initial LU is trivial
|
||||
perm_r = lu[1]
|
||||
for i in v:
|
||||
|
||||
e[i] = 1
|
||||
if i > 0:
|
||||
e[i-1] = 0
|
||||
|
||||
try: # fails for i==0 and any time it gets ill-conditioned
|
||||
j = b[i-1]
|
||||
lu = bg_update_dense(lu, perm_r, A[:, j], i-1)
|
||||
except Exception:
|
||||
lu = scipy.linalg.lu_factor(A[:, b])
|
||||
LU, p = lu
|
||||
perm_r = list(range(m))
|
||||
for i1, i2 in enumerate(p):
|
||||
perm_r[i1], perm_r[i2] = perm_r[i2], perm_r[i1]
|
||||
|
||||
pi = scipy.linalg.lu_solve(lu, e, trans=1)
|
||||
|
||||
js = js_candidates[js_mask]
|
||||
batch = 50
|
||||
|
||||
# This is a tiny bit faster than looping over columns individually,
|
||||
# like for j in js: if abs(A[:,j].transpose().dot(pi)) > tolapiv:
|
||||
for j_index in range(0, len(js), batch):
|
||||
j_indices = js[j_index: min(j_index+batch, len(js))]
|
||||
|
||||
c = abs(A[:, j_indices].transpose().dot(pi))
|
||||
if (c > tolapiv).any():
|
||||
j = js[j_index + np.argmax(c)] # very independent column
|
||||
b[i] = j
|
||||
js_mask[j-m] = False
|
||||
break
|
||||
else:
|
||||
bibar = pi.T.dot(rhs.reshape(-1, 1))
|
||||
bnorm = np.linalg.norm(rhs)
|
||||
if abs(bibar)/(1+bnorm) > tolprimal: # inconsistent
|
||||
status = 2
|
||||
message = inconsistent
|
||||
return A_orig, rhs, status, message
|
||||
else: # dependent
|
||||
d.append(i)
|
||||
if true_rank is not None and len(d) == m - true_rank:
|
||||
break # found all redundancies
|
||||
|
||||
keep = set(range(m))
|
||||
keep = list(keep - set(d))
|
||||
return A_orig[keep, :], rhs[keep], status, message
|
||||
|
||||
|
||||
def _remove_redundancy_pivot_sparse(A, rhs):
|
||||
"""
|
||||
Eliminates redundant equations from system of equations defined by Ax = b
|
||||
and identifies infeasibilities.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D sparse matrix
|
||||
A matrix representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
|
||||
Returns
|
||||
-------
|
||||
A : 2-D sparse matrix
|
||||
A matrix representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
status: int
|
||||
An integer indicating the status of the system
|
||||
0: No infeasibility identified
|
||||
2: Trivially infeasible
|
||||
message : str
|
||||
A string descriptor of the exit status of the optimization.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [2] Andersen, Erling D. "Finding all linearly dependent rows in
|
||||
large-scale linear programming." Optimization Methods and Software
|
||||
6.3 (1995): 219-227.
|
||||
|
||||
"""
|
||||
|
||||
tolapiv = 1e-8
|
||||
tolprimal = 1e-8
|
||||
status = 0
|
||||
message = ""
|
||||
inconsistent = ("There is a linear combination of rows of A_eq that "
|
||||
"results in zero, suggesting a redundant constraint. "
|
||||
"However the same linear combination of b_eq is "
|
||||
"nonzero, suggesting that the constraints conflict "
|
||||
"and the problem is infeasible.")
|
||||
A, rhs, status, message = _remove_zero_rows(A, rhs)
|
||||
|
||||
if status != 0:
|
||||
return A, rhs, status, message
|
||||
|
||||
m, n = A.shape
|
||||
|
||||
v = list(range(m)) # Artificial column indices.
|
||||
b = list(v) # Basis column indices.
|
||||
# This is better as a list than a set because column order of basis matrix
|
||||
# needs to be consistent.
|
||||
k = set(range(m, m+n)) # Structural column indices.
|
||||
d = [] # Indices of dependent rows
|
||||
|
||||
A_orig = A
|
||||
A = scipy.sparse.hstack((scipy.sparse.eye(m), A)).tocsc()
|
||||
e = np.zeros(m)
|
||||
|
||||
# Implements basic algorithm from [2]
|
||||
# Uses only one of the suggested improvements (removing zero rows).
|
||||
# Removing column singletons would be easy, but it is not as important
|
||||
# because the procedure is performed only on the equality constraint
|
||||
# matrix from the original problem - not on the canonical form matrix,
|
||||
# which would have many more column singletons due to slack variables
|
||||
# from the inequality constraints.
|
||||
# The thoughts on "crashing" the initial basis sound useful, but the
|
||||
# description of the procedure seems to assume a lot of familiarity with
|
||||
# the subject; it is not very explicit. I already went through enough
|
||||
# trouble getting the basic algorithm working, so I was not interested in
|
||||
# trying to decipher this, too. (Overall, the paper is fraught with
|
||||
# mistakes and ambiguities - which is strange, because the rest of
|
||||
# Andersen's papers are quite good.)
|
||||
# I tried and tried and tried to improve performance using the
|
||||
# Bartels-Golub update. It works, but it's only practical if the LU
|
||||
# factorization can be specialized as described, and that is not possible
|
||||
# until the SciPy SuperLU interface permits control over column
|
||||
# permutation - see issue #7700.
|
||||
|
||||
for i in v:
|
||||
B = A[:, b]
|
||||
|
||||
e[i] = 1
|
||||
if i > 0:
|
||||
e[i-1] = 0
|
||||
|
||||
pi = scipy.sparse.linalg.spsolve(B.transpose(), e).reshape(-1, 1)
|
||||
|
||||
js = list(k-set(b)) # not efficient, but this is not the time sink...
|
||||
|
||||
# Due to overhead, it tends to be faster (for problems tested) to
|
||||
# compute the full matrix-vector product rather than individual
|
||||
# vector-vector products (with the chance of terminating as soon
|
||||
# as any are nonzero). For very large matrices, it might be worth
|
||||
# it to compute, say, 100 or 1000 at a time and stop when a nonzero
|
||||
# is found.
|
||||
|
||||
c = (np.abs(A[:, js].transpose().dot(pi)) > tolapiv).nonzero()[0]
|
||||
if len(c) > 0: # independent
|
||||
j = js[c[0]]
|
||||
# in a previous commit, the previous line was changed to choose
|
||||
# index j corresponding with the maximum dot product.
|
||||
# While this avoided issues with almost
|
||||
# singular matrices, it slowed the routine in most NETLIB tests.
|
||||
# I think this is because these columns were denser than the
|
||||
# first column with nonzero dot product (c[0]).
|
||||
# It would be nice to have a heuristic that balances sparsity with
|
||||
# high dot product, but I don't think it's worth the time to
|
||||
# develop one right now. Bartels-Golub update is a much higher
|
||||
# priority.
|
||||
b[i] = j # replace artificial column
|
||||
else:
|
||||
bibar = pi.T.dot(rhs.reshape(-1, 1))
|
||||
bnorm = np.linalg.norm(rhs)
|
||||
if abs(bibar)/(1 + bnorm) > tolprimal:
|
||||
status = 2
|
||||
message = inconsistent
|
||||
return A_orig, rhs, status, message
|
||||
else: # dependent
|
||||
d.append(i)
|
||||
|
||||
keep = set(range(m))
|
||||
keep = list(keep - set(d))
|
||||
return A_orig[keep, :], rhs[keep], status, message
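The three outcomes distinguished by the pivoting routines above (independent row, dependent row, conflicting row) can be illustrated with a small exact-arithmetic sketch. This is not Andersen's algorithm and not SciPy's implementation: it swaps in plain Gaussian elimination over `fractions.Fraction`, and the helper name `classify_rows` is hypothetical. Only the status codes (0 and 2) follow the docstrings above.

```python
from fractions import Fraction


def classify_rows(A, b):
    """Return (keep, dependent, status) for the system A x = b.

    status is 0 when no conflict is found, and 2 when a combination of
    rows cancels the left-hand side but not the right-hand side.
    """
    reduced = []              # (pivot column, reduced row, reduced rhs)
    keep, dependent = [], []
    for i, (row, rhs) in enumerate(zip(A, b)):
        row = [Fraction(x) for x in row]
        rhs = Fraction(rhs)
        # Subtract the component along every independent row found so far.
        for pivot_col, prow, prhs in reduced:
            factor = row[pivot_col] / prow[pivot_col]
            if factor:
                row = [x - factor * y for x, y in zip(row, prow)]
                rhs -= factor * prhs
        pivot = next((j for j, x in enumerate(row) if x), None)
        if pivot is not None:
            reduced.append((pivot, row, rhs))
            keep.append(i)                 # independent row
        elif rhs != 0:
            return keep, dependent, 2      # reads "0 = nonzero": infeasible
        else:
            dependent.append(i)            # redundant row, safe to drop
    return keep, dependent, 0


# Row 1 is twice row 0, so it is redundant when b agrees with that.
keep, dependent, status = classify_rows([[1, 2], [2, 4], [0, 1]], [3, 6, 1])
```

Exact rationals avoid the tolerance questions (`tolapiv`, `tolprimal`) that the floating-point routines have to handle, at the cost of being far too slow for real problems.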
|
||||
|
||||
|
||||
def _remove_redundancy_svd(A, b):
|
||||
"""
|
||||
Eliminates redundant equations from system of equations defined by Ax = b
|
||||
and identifies infeasibilities.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
b : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
|
||||
Returns
|
||||
-------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
b : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
status: int
|
||||
An integer indicating the status of the system
|
||||
0: No infeasibility identified
|
||||
2: Trivially infeasible
|
||||
message : str
|
||||
A string descriptor of the exit status of the optimization.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [2] Andersen, Erling D. "Finding all linearly dependent rows in
|
||||
large-scale linear programming." Optimization Methods and Software
|
||||
6.3 (1995): 219-227.
|
||||
|
||||
"""
|
||||
|
||||
A, b, status, message = _remove_zero_rows(A, b)
|
||||
|
||||
if status != 0:
|
||||
return A, b, status, message
|
||||
|
||||
U, s, Vh = svd(A)
|
||||
eps = np.finfo(float).eps
|
||||
tol = s.max() * max(A.shape) * eps
|
||||
|
||||
m, n = A.shape
|
||||
s_min = s[-1] if m <= n else 0
|
||||
|
||||
# this algorithm is faster than that of [2] when the nullspace is small
|
||||
# but it could probably be improved by randomized algorithms and with
|
||||
# a sparse implementation.
|
||||
# it relies on repeated singular value decomposition to find linearly
|
||||
# dependent rows (as identified by columns of U that correspond with zero
|
||||
# singular values). Unfortunately, only one row can be removed per
|
||||
# decomposition (I tried otherwise; doing so can cause problems.)
|
||||
# It would be nice if we could do truncated SVD like sp.sparse.linalg.svds
|
||||
# but that function is unreliable at finding singular values near zero.
|
||||
# Finding max eigenvalue L of A A^T, then largest eigenvalue (and
|
||||
# associated eigenvector) of -A A^T + L I (I is identity) via power
|
||||
# iteration would also work in theory, but is only efficient if the
|
||||
# smallest nonzero eigenvalue of A A^T is close to the largest nonzero
|
||||
# eigenvalue.
|
||||
|
||||
while abs(s_min) < tol:
|
||||
v = U[:, -1] # TODO: return these so user can eliminate from problem?
|
||||
# rows need to be represented in significant amount
|
||||
eligibleRows = np.abs(v) > tol * 10e6
|
||||
if not np.any(eligibleRows) or np.any(np.abs(v.dot(A)) > tol):
|
||||
status = 4
|
||||
message = ("Due to numerical issues, redundant equality "
|
||||
"constraints could not be removed automatically. "
|
||||
"Try providing your constraint matrices as sparse "
|
||||
"matrices to activate sparse presolve, try turning "
|
||||
"off redundancy removal, or try turning off presolve "
|
||||
"altogether.")
|
||||
break
|
||||
if np.any(np.abs(v.dot(b)) > tol * 100): # factor of 100 to fix 10038 and 10349
|
||||
status = 2
|
||||
message = ("There is a linear combination of rows of A_eq that "
|
||||
"results in zero, suggesting a redundant constraint. "
|
||||
"However the same linear combination of b_eq is "
|
||||
"nonzero, suggesting that the constraints conflict "
|
||||
"and the problem is infeasible.")
|
||||
break
|
||||
|
||||
i_remove = _get_densest(A, eligibleRows)
|
||||
A = np.delete(A, i_remove, axis=0)
|
||||
b = np.delete(b, i_remove)
|
||||
U, s, Vh = svd(A)
|
||||
m, n = A.shape
|
||||
s_min = s[-1] if m <= n else 0
|
||||
|
||||
return A, b, status, message
|
||||
|
||||
|
||||
def _remove_redundancy_id(A, rhs, rank=None, randomized=True):
|
||||
"""Eliminates redundant equations from a system of equations.
|
||||
|
||||
Eliminates redundant equations from system of equations defined by Ax = b
|
||||
and identifies infeasibilities.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
rank : int, optional
|
||||
The rank of A
|
||||
randomized: bool, optional
|
||||
True for randomized interpolative decomposition
|
||||
|
||||
Returns
|
||||
-------
|
||||
A : 2-D array
|
||||
An array representing the left-hand side of a system of equations
|
||||
rhs : 1-D array
|
||||
An array representing the right-hand side of a system of equations
|
||||
status: int
|
||||
An integer indicating the status of the system
|
||||
0: No infeasibility identified
|
||||
2: Trivially infeasible
|
||||
message : str
|
||||
A string descriptor of the exit status of the optimization.
|
||||
|
||||
"""
|
||||
|
||||
status = 0
|
||||
message = ""
|
||||
inconsistent = ("There is a linear combination of rows of A_eq that "
|
||||
"results in zero, suggesting a redundant constraint. "
|
||||
"However the same linear combination of b_eq is "
|
||||
"nonzero, suggesting that the constraints conflict "
|
||||
"and the problem is infeasible.")
|
||||
|
||||
A, rhs, status, message = _remove_zero_rows(A, rhs)
|
||||
|
||||
if status != 0:
|
||||
return A, rhs, status, message
|
||||
|
||||
m, n = A.shape
|
||||
|
||||
k = rank
|
||||
if rank is None:
|
||||
k = np.linalg.matrix_rank(A)
|
||||
|
||||
idx, proj = interp_decomp(A.T, k, rand=randomized)
|
||||
|
||||
# first k entries in idx are indices of the independent rows
|
||||
# remaining entries are the indices of the m-k dependent rows
|
||||
# proj provides linear combinations of rows of A2 that form the
|
||||
# remaining m-k (dependent) rows. The same linear combination of entries
|
||||
# in rhs2 must give the remaining m-k entries. If not, the system is
|
||||
# inconsistent, and the problem is infeasible.
|
||||
if not np.allclose(rhs[idx[:k]] @ proj, rhs[idx[k:]]):
|
||||
status = 2
|
||||
message = inconsistent
|
||||
|
||||
# sort indices because the other redundancy removal routines leave rows
|
||||
# in original order and tests were written with that in mind
|
||||
idx = sorted(idx[:k])
|
||||
A2 = A[idx, :]
|
||||
rhs2 = rhs[idx]
|
||||
return A2, rhs2, status, message
|
||||
732
venv/lib/python3.12/site-packages/scipy/optimize/_root.py
Normal file
@ -0,0 +1,732 @@
|
||||
"""
|
||||
Unified interfaces to root finding algorithms.
|
||||
|
||||
Functions
|
||||
---------
|
||||
- root : find a root of a vector function.
|
||||
"""
|
||||
__all__ = ['root']
|
||||
|
||||
import numpy as np
|
||||
|
||||
from warnings import warn
|
||||
|
||||
from ._optimize import MemoizeJac, OptimizeResult, _check_unknown_options
|
||||
from ._minpack_py import _root_hybr, leastsq
|
||||
from ._spectral import _root_df_sane
|
||||
from . import _nonlin as nonlin
|
||||
|
||||
|
||||
ROOT_METHODS = ['hybr', 'lm', 'broyden1', 'broyden2', 'anderson',
|
||||
'linearmixing', 'diagbroyden', 'excitingmixing', 'krylov',
|
||||
'df-sane']
|
||||
|
||||
|
||||
def root(fun, x0, args=(), method='hybr', jac=None, tol=None, callback=None,
|
||||
options=None):
|
||||
r"""
|
||||
Find a root of a vector function.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
fun : callable
|
||||
A vector function to find a root of.
|
||||
x0 : ndarray
|
||||
Initial guess.
|
||||
args : tuple, optional
|
||||
Extra arguments passed to the objective function and its Jacobian.
|
||||
method : str, optional
|
||||
Type of solver. Should be one of
|
||||
|
||||
- 'hybr' :ref:`(see here) <optimize.root-hybr>`
|
||||
- 'lm' :ref:`(see here) <optimize.root-lm>`
|
||||
- 'broyden1' :ref:`(see here) <optimize.root-broyden1>`
|
||||
- 'broyden2' :ref:`(see here) <optimize.root-broyden2>`
|
||||
- 'anderson' :ref:`(see here) <optimize.root-anderson>`
|
||||
- 'linearmixing' :ref:`(see here) <optimize.root-linearmixing>`
|
||||
- 'diagbroyden' :ref:`(see here) <optimize.root-diagbroyden>`
|
||||
- 'excitingmixing' :ref:`(see here) <optimize.root-excitingmixing>`
|
||||
- 'krylov' :ref:`(see here) <optimize.root-krylov>`
|
||||
- 'df-sane' :ref:`(see here) <optimize.root-dfsane>`
|
||||
|
||||
jac : bool or callable, optional
|
||||
If `jac` is a Boolean and is True, `fun` is assumed to return the
|
||||
value of the Jacobian along with the objective function. If False, the
|
||||
Jacobian will be estimated numerically.
|
||||
`jac` can also be a callable returning the Jacobian of `fun`. In
|
||||
this case, it must accept the same arguments as `fun`.
|
||||
tol : float, optional
|
||||
Tolerance for termination. For detailed control, use solver-specific
|
||||
options.
|
||||
callback : function, optional
|
||||
Optional callback function. It is called on every iteration as
|
||||
``callback(x, f)`` where `x` is the current solution and `f`
|
||||
the corresponding residual. Supported by all methods except 'hybr' and 'lm'.
|
||||
options : dict, optional
|
||||
A dictionary of solver options. E.g., `xtol` or `maxiter`, see
|
||||
:obj:`show_options()` for details.
|
||||
|
||||
Returns
|
||||
-------
|
||||
sol : OptimizeResult
|
||||
The solution represented as an ``OptimizeResult`` object.
|
||||
Important attributes are: ``x`` the solution array, ``success`` a
|
||||
Boolean flag indicating if the algorithm exited successfully and
|
||||
``message`` which describes the cause of the termination. See
|
||||
`OptimizeResult` for a description of other attributes.
|
||||
|
||||
See also
|
||||
--------
|
||||
show_options : Additional options accepted by the solvers
|
||||
|
||||
Notes
|
||||
-----
|
||||
This section describes the available solvers that can be selected by the
|
||||
'method' parameter. The default method is *hybr*.
|
||||
|
||||
Method *hybr* uses a modification of the Powell hybrid method as
|
||||
implemented in MINPACK [1]_.
|
||||
|
||||
Method *lm* solves the system of nonlinear equations in a least squares
|
||||
sense using a modification of the Levenberg-Marquardt algorithm as
|
||||
implemented in MINPACK [1]_.
|
||||
|
||||
Method *df-sane* is a derivative-free spectral method. [3]_
|
||||
|
||||
Methods *broyden1*, *broyden2*, *anderson*, *linearmixing*,
|
||||
*diagbroyden*, *excitingmixing*, *krylov* are inexact Newton methods,
|
||||
with backtracking or full line searches [2]_. Each method corresponds
|
||||
to a particular Jacobian approximation.
|
||||
|
||||
- Method *broyden1* uses Broyden's first Jacobian approximation; it is
|
||||
known as Broyden's good method.
|
||||
- Method *broyden2* uses Broyden's second Jacobian approximation; it
|
||||
is known as Broyden's bad method.
|
||||
- Method *anderson* uses (extended) Anderson mixing.
|
||||
- Method *krylov* uses a Krylov approximation for the inverse Jacobian. It
|
||||
is suitable for large-scale problems.
|
||||
- Method *diagbroyden* uses diagonal Broyden Jacobian approximation.
|
||||
- Method *linearmixing* uses a scalar Jacobian approximation.
|
||||
- Method *excitingmixing* uses a tuned diagonal Jacobian
|
||||
approximation.
|
||||
|
||||
.. warning::
|
||||
|
||||
The algorithms implemented for methods *diagbroyden*,
|
||||
*linearmixing* and *excitingmixing* may be useful for specific
|
||||
problems, but whether they will work may depend strongly on the
|
||||
problem.
|
||||
|
||||
.. versionadded:: 0.11.0
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] More, Jorge J., Burton S. Garbow, and Kenneth E. Hillstrom.
|
||||
1980. User Guide for MINPACK-1.
|
||||
.. [2] C. T. Kelley. 1995. Iterative Methods for Linear and Nonlinear
|
||||
Equations. Society for Industrial and Applied Mathematics.
|
||||
<https://archive.siam.org/books/kelley/fr16/>
|
||||
.. [3] W. La Cruz, J.M. Martinez, M. Raydan. Math. Comp. 75, 1429 (2006).
|
||||
|
||||
Examples
|
||||
--------
|
||||
The following functions define a system of nonlinear equations and its
|
||||
Jacobian.
|
||||
|
||||
>>> import numpy as np
|
||||
>>> def fun(x):
|
||||
... return [x[0] + 0.5 * (x[0] - x[1])**3 - 1.0,
|
||||
... 0.5 * (x[1] - x[0])**3 + x[1]]
|
||||
|
||||
>>> def jac(x):
|
||||
... return np.array([[1 + 1.5 * (x[0] - x[1])**2,
|
||||
... -1.5 * (x[0] - x[1])**2],
|
||||
... [-1.5 * (x[1] - x[0])**2,
|
||||
... 1 + 1.5 * (x[1] - x[0])**2]])
|
||||
|
||||
A solution can be obtained as follows.
|
||||
|
||||
>>> from scipy import optimize
|
||||
>>> sol = optimize.root(fun, [0, 0], jac=jac, method='hybr')
|
||||
>>> sol.x
|
||||
array([ 0.8411639, 0.1588361])
|
||||
|
||||
**Large problem**
|
||||
|
||||
Suppose that we needed to solve the following integrodifferential
|
||||
equation on the square :math:`[0,1]\times[0,1]`:
|
||||
|
||||
.. math::
|
||||
|
||||
\nabla^2 P = 10 \left(\int_0^1\int_0^1\cosh(P)\,dx\,dy\right)^2
|
||||
|
||||
with :math:`P(x,1) = 1` and :math:`P=0` elsewhere on the boundary of
|
||||
the square.
|
||||
|
||||
The solution can be found using the ``method='krylov'`` solver:
|
||||
|
||||
>>> from scipy import optimize
|
||||
>>> # parameters
|
||||
>>> nx, ny = 75, 75
|
||||
>>> hx, hy = 1./(nx-1), 1./(ny-1)
|
||||
|
||||
>>> P_left, P_right = 0, 0
|
||||
>>> P_top, P_bottom = 1, 0
|
||||
|
||||
>>> def residual(P):
|
||||
... d2x = np.zeros_like(P)
|
||||
... d2y = np.zeros_like(P)
|
||||
...
|
||||
... d2x[1:-1] = (P[2:] - 2*P[1:-1] + P[:-2]) / hx/hx
|
||||
... d2x[0] = (P[1] - 2*P[0] + P_left)/hx/hx
|
||||
... d2x[-1] = (P_right - 2*P[-1] + P[-2])/hx/hx
|
||||
...
|
||||
... d2y[:,1:-1] = (P[:,2:] - 2*P[:,1:-1] + P[:,:-2])/hy/hy
|
||||
... d2y[:,0] = (P[:,1] - 2*P[:,0] + P_bottom)/hy/hy
|
||||
... d2y[:,-1] = (P_top - 2*P[:,-1] + P[:,-2])/hy/hy
|
||||
...
|
||||
... return d2x + d2y - 10*np.cosh(P).mean()**2
|
||||
|
||||
>>> guess = np.zeros((nx, ny), float)
|
||||
>>> sol = optimize.root(residual, guess, method='krylov')
|
||||
>>> print('Residual: %g' % abs(residual(sol.x)).max())
|
||||
Residual: 5.7972e-06 # may vary
|
||||
|
||||
>>> import matplotlib.pyplot as plt
|
||||
>>> x, y = np.mgrid[0:1:(nx*1j), 0:1:(ny*1j)]
|
||||
>>> plt.pcolormesh(x, y, sol.x, shading='gouraud')
|
||||
>>> plt.colorbar()
|
||||
>>> plt.show()
|
||||
|
||||
"""
|
||||
def _wrapped_fun(*fargs):
|
||||
"""
|
||||
Wrapped `fun` to track the number of times
|
||||
the function has been called.
|
||||
"""
|
||||
_wrapped_fun.nfev += 1
|
||||
return fun(*fargs)
|
||||
|
||||
_wrapped_fun.nfev = 0
|
||||
|
||||
if not isinstance(args, tuple):
|
||||
args = (args,)
|
||||
|
||||
meth = method.lower()
|
||||
if options is None:
|
||||
options = {}
|
||||
|
||||
if callback is not None and meth in ('hybr', 'lm'):
|
||||
warn('Method %s does not accept callback.' % method,
|
||||
RuntimeWarning, stacklevel=2)
|
||||
|
||||
# fun also returns the Jacobian
|
||||
if not callable(jac) and meth in ('hybr', 'lm'):
|
||||
if bool(jac):
|
||||
fun = MemoizeJac(fun)
|
||||
jac = fun.derivative
|
||||
else:
|
||||
jac = None
|
||||
|
||||
# set default tolerances
|
||||
if tol is not None:
|
||||
options = dict(options)
|
||||
if meth in ('hybr', 'lm'):
|
||||
options.setdefault('xtol', tol)
|
||||
elif meth in ('df-sane',):
|
||||
options.setdefault('ftol', tol)
|
||||
elif meth in ('broyden1', 'broyden2', 'anderson', 'linearmixing',
|
||||
'diagbroyden', 'excitingmixing', 'krylov'):
|
||||
options.setdefault('xtol', tol)
|
||||
options.setdefault('xatol', np.inf)
|
||||
options.setdefault('ftol', np.inf)
|
||||
options.setdefault('fatol', np.inf)
|
||||
|
||||
if meth == 'hybr':
|
||||
sol = _root_hybr(_wrapped_fun, x0, args=args, jac=jac, **options)
|
||||
elif meth == 'lm':
|
||||
sol = _root_leastsq(_wrapped_fun, x0, args=args, jac=jac, **options)
|
||||
elif meth == 'df-sane':
|
||||
_warn_jac_unused(jac, method)
|
||||
sol = _root_df_sane(_wrapped_fun, x0, args=args, callback=callback,
|
||||
**options)
|
||||
elif meth in ('broyden1', 'broyden2', 'anderson', 'linearmixing',
|
||||
'diagbroyden', 'excitingmixing', 'krylov'):
|
||||
_warn_jac_unused(jac, method)
|
||||
sol = _root_nonlin_solve(_wrapped_fun, x0, args=args, jac=jac,
|
||||
_method=meth, _callback=callback,
|
||||
**options)
|
||||
else:
|
||||
raise ValueError('Unknown solver %s' % method)
|
||||
|
||||
sol.nfev = _wrapped_fun.nfev
|
||||
return sol
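The way `root` turns the single `tol` argument into solver-specific options can be restated as a standalone helper. The sketch below mirrors the branching in the function above using only the standard library; the name `apply_default_tol` and the constant `NONLIN_METHODS` are hypothetical and are not part of SciPy.

```python
import math

NONLIN_METHODS = ('broyden1', 'broyden2', 'anderson', 'linearmixing',
                  'diagbroyden', 'excitingmixing', 'krylov')


def apply_default_tol(method, tol, options=None):
    """Return a copy of `options` with `tol` filled in for `method`."""
    options = dict(options or {})
    if tol is None:
        return options
    meth = method.lower()
    if meth in ('hybr', 'lm'):
        options.setdefault('xtol', tol)
    elif meth == 'df-sane':
        options.setdefault('ftol', tol)
    elif meth in NONLIN_METHODS:
        options.setdefault('xtol', tol)
        # The other criteria are switched off unless the caller set them.
        for name in ('xatol', 'ftol', 'fatol'):
            options.setdefault(name, math.inf)
    return options
```

Because every assignment goes through `setdefault`, an explicit entry in `options` always takes precedence over `tol`.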
|
||||
|
||||
|
||||
def _warn_jac_unused(jac, method):
|
||||
if jac is not None:
|
||||
warn(f'Method {method} does not use the jacobian (jac).',
|
||||
RuntimeWarning, stacklevel=2)
|
||||
|
||||
|
||||
def _root_leastsq(fun, x0, args=(), jac=None,
|
||||
col_deriv=0, xtol=1.49012e-08, ftol=1.49012e-08,
|
||||
gtol=0.0, maxiter=0, eps=0.0, factor=100, diag=None,
|
||||
**unknown_options):
|
||||
"""
|
||||
Solve for least squares with Levenberg-Marquardt
|
||||
|
||||
Options
|
||||
-------
|
||||
col_deriv : bool
|
||||
non-zero to specify that the Jacobian function computes derivatives
|
||||
down the columns (faster, because there is no transpose operation).
|
||||
ftol : float
|
||||
Relative error desired in the sum of squares.
|
||||
xtol : float
|
||||
Relative error desired in the approximate solution.
|
||||
gtol : float
|
||||
Orthogonality desired between the function vector and the columns
|
||||
of the Jacobian.
|
||||
maxiter : int
|
||||
The maximum number of calls to the function. If zero, then
|
||||
100*(N+1) is the maximum where N is the number of elements in x0.
|
||||
eps : float
|
||||
A suitable step length for the forward-difference approximation of
|
||||
the Jacobian (used when `jac` is None). If `eps` is less than the machine
|
||||
precision, it is assumed that the relative errors in the functions
|
||||
are of the order of the machine precision.
|
||||
factor : float
|
||||
A parameter determining the initial step bound
|
||||
(``factor * || diag * x||``). Should be in interval ``(0.1, 100)``.
|
||||
diag : sequence
|
||||
N positive entries that serve as scale factors for the variables.
|
||||
"""
|
||||
nfev = 0
|
||||
def _wrapped_fun(*fargs):
|
||||
"""
|
||||
Wrapped `fun` to track the number of times
|
||||
the function has been called.
|
||||
"""
|
||||
nonlocal nfev
|
||||
nfev += 1
|
||||
return fun(*fargs)
|
||||
|
||||
_check_unknown_options(unknown_options)
|
||||
x, cov_x, info, msg, ier = leastsq(_wrapped_fun, x0, args=args,
|
||||
Dfun=jac, full_output=True,
|
||||
col_deriv=col_deriv, xtol=xtol,
|
||||
ftol=ftol, gtol=gtol,
|
||||
maxfev=maxiter, epsfcn=eps,
|
||||
factor=factor, diag=diag)
|
||||
sol = OptimizeResult(x=x, message=msg, status=ier,
|
||||
success=ier in (1, 2, 3, 4), cov_x=cov_x,
|
||||
fun=info.pop('fvec'), method="lm")
|
||||
sol.update(info)
|
||||
sol.nfev = nfev
|
||||
return sol
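Both `root` and `_root_leastsq` count function evaluations by wrapping the user's callable, the first through a function attribute and the second through a `nonlocal` counter. A minimal standalone sketch of the attribute variant follows; the name `count_calls` is hypothetical.

```python
def count_calls(fun):
    """Wrap `fun` so that the number of evaluations is stored in `.nfev`."""
    def wrapped(*args):
        wrapped.nfev += 1
        return fun(*args)

    wrapped.nfev = 0
    return wrapped


square = count_calls(lambda x: x * x)
values = [square(2), square(3)]
```

Keeping the counter on the wrapper itself lets the caller read it after the solver returns, which is how `sol.nfev` is filled in above.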
|
||||
|
||||
|
||||
def _root_nonlin_solve(fun, x0, args=(), jac=None,
|
||||
_callback=None, _method=None,
|
||||
nit=None, disp=False, maxiter=None,
|
||||
ftol=None, fatol=None, xtol=None, xatol=None,
|
||||
tol_norm=None, line_search='armijo', jac_options=None,
|
||||
**unknown_options):
|
||||
_check_unknown_options(unknown_options)
|
||||
|
||||
f_tol = fatol
|
||||
f_rtol = ftol
|
||||
x_tol = xatol
|
||||
x_rtol = xtol
|
||||
verbose = disp
|
||||
if jac_options is None:
|
||||
jac_options = dict()
|
||||
|
||||
jacobian = {'broyden1': nonlin.BroydenFirst,
|
||||
'broyden2': nonlin.BroydenSecond,
|
||||
'anderson': nonlin.Anderson,
|
||||
'linearmixing': nonlin.LinearMixing,
|
||||
'diagbroyden': nonlin.DiagBroyden,
|
||||
'excitingmixing': nonlin.ExcitingMixing,
|
||||
'krylov': nonlin.KrylovJacobian
|
||||
}[_method]
|
||||
|
||||
if args:
|
||||
if jac is True:
|
||||
def f(x):
|
||||
return fun(x, *args)[0]
|
||||
else:
|
||||
def f(x):
|
||||
return fun(x, *args)
|
||||
else:
|
||||
f = fun
|
||||
|
||||
x, info = nonlin.nonlin_solve(f, x0, jacobian=jacobian(**jac_options),
|
||||
iter=nit, verbose=verbose,
|
||||
maxiter=maxiter, f_tol=f_tol,
|
||||
f_rtol=f_rtol, x_tol=x_tol,
|
||||
x_rtol=x_rtol, tol_norm=tol_norm,
|
||||
line_search=line_search,
|
||||
callback=_callback, full_output=True,
|
||||
raise_exception=False)
|
||||
sol = OptimizeResult(x=x, method=_method)
|
||||
sol.update(info)
|
||||
return sol
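To make the idea of a scalar Jacobian approximation concrete, here is a minimal one-dimensional sketch of linear mixing. It assumes the approximation J = -1/alpha described in the option docstrings, omits the line search and the relative tolerances that the real solver applies, and uses a hypothetical helper name `linear_mixing`. It is not the `nonlin.LinearMixing` implementation.

```python
import math


def linear_mixing(f, x0, alpha=0.5, fatol=1e-10, maxiter=200):
    """Iterate x <- x + alpha * f(x) until |f(x)| <= fatol."""
    x = float(x0)
    for nit in range(maxiter):
        fx = f(x)
        if abs(fx) <= fatol:
            return x, nit
        # With J = -1/alpha, the quasi-Newton step -f(x)/J is alpha * f(x).
        x += alpha * fx
    raise RuntimeError("linear mixing did not converge")


# Solve cos(x) - x = 0, starting from x = 1.
x_star, iterations = linear_mixing(lambda x: math.cos(x) - x, 1.0)
```

Whether the iteration converges depends on `alpha` and on the slope of `f` near the root, which is consistent with the warning in the `root` docstring that the mixing methods are strongly problem-dependent.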
|
||||
|
||||
def _root_broyden1_doc():
|
||||
"""
|
||||
Options
|
||||
-------
|
||||
nit : int, optional
|
||||
Number of iterations to make. If omitted (default), make as many
|
||||
as required to meet tolerances.
|
||||
disp : bool, optional
|
||||
Print status to stdout on every iteration.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations to make.
|
||||
ftol : float, optional
|
||||
Relative tolerance for the residual. If omitted, not used.
|
||||
fatol : float, optional
|
||||
Absolute tolerance (in max-norm) for the residual.
|
||||
If omitted, default is 6e-6.
|
||||
xtol : float, optional
|
||||
Relative minimum step size. If omitted, not used.
|
||||
xatol : float, optional
|
||||
Absolute minimum step size, as determined from the Jacobian
|
||||
approximation. If the step size is smaller than this, optimization
|
||||
is terminated as successful. If omitted, not used.
|
||||
tol_norm : function(vector) -> scalar, optional
|
||||
Norm to use in convergence check. Default is the maximum norm.
|
||||
line_search : {None, 'armijo' (default), 'wolfe'}, optional
|
||||
Which type of a line search to use to determine the step size in
|
||||
the direction given by the Jacobian approximation. Defaults to
|
||||
'armijo'.
|
||||
jac_options : dict, optional
|
||||
Options for the respective Jacobian approximation.
|
||||
alpha : float, optional
|
||||
Initial guess for the Jacobian is (-1/alpha).
|
||||
reduction_method : str or tuple, optional
|
||||
Method used in ensuring that the rank of the Broyden
|
||||
matrix stays low. Can either be a string giving the
|
||||
name of the method, or a tuple of the form ``(method,
|
||||
param1, param2, ...)`` that gives the name of the
|
||||
method and values for additional parameters.
|
||||
|
||||
Methods available:
|
||||
|
||||
- ``restart``
|
||||
Drop all matrix columns. Has no
|
||||
extra parameters.
|
||||
- ``simple``
|
||||
Drop oldest matrix column. Has no
|
||||
extra parameters.
|
||||
- ``svd``
|
||||
Keep only the most significant SVD
|
||||
components.
|
||||
|
||||
Extra parameters:
|
||||
|
||||
- ``to_retain``
|
||||
Number of SVD components to
|
||||
retain when rank reduction is done.
|
||||
Default is ``max_rank - 2``.
|
||||
max_rank : int, optional
|
||||
Maximum rank for the Broyden matrix.
|
||||
Default is infinity (i.e., no rank reduction).
|
||||
|
||||
Examples
|
||||
--------
|
||||
>>> def func(x):
|
||||
... return np.cos(x) + x[::-1] - [1, 2, 3, 4]
|
||||
...
|
||||
>>> from scipy import optimize
|
||||
>>> res = optimize.root(func, [1, 1, 1, 1], method='broyden1', tol=1e-14)
|
||||
>>> x = res.x
|
||||
>>> x
|
||||
array([4.04674914, 3.91158389, 2.71791677, 1.61756251])
|
||||
>>> np.cos(x) + x[::-1]
|
||||
array([1., 2., 3., 4.])
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def _root_broyden2_doc():
|
||||
"""
|
||||
Options
|
||||
-------
|
||||
nit : int, optional
|
||||
Number of iterations to make. If omitted (default), make as many
|
||||
as required to meet tolerances.
|
||||
disp : bool, optional
|
||||
Print status to stdout on every iteration.
|
||||
maxiter : int, optional
|
||||
Maximum number of iterations to make.
|
||||
ftol : float, optional
|
||||
Relative tolerance for the residual. If omitted, not used.
|
||||
fatol : float, optional
|
||||
Absolute tolerance (in max-norm) for the residual.
|
||||
If omitted, default is 6e-6.
|
||||
xtol : float, optional
|
||||
Relative minimum step size. If omitted, not used.
|
||||
xatol : float, optional
|
||||
Absolute minimum step size, as determined from the Jacobian
|
||||
approximation. If the step size is smaller than this, optimization
|
||||
is terminated as successful. If omitted, not used.
|
||||
tol_norm : function(vector) -> scalar, optional
|
||||
Norm to use in convergence check. Default is the maximum norm.
|
||||
line_search : {None, 'armijo' (default), 'wolfe'}, optional
|
||||
Which type of a line search to use to determine the step size in
|
||||
the direction given by the Jacobian approximation. Defaults to
|
||||
'armijo'.
|
||||
jac_options : dict, optional
|
||||
Options for the respective Jacobian approximation.
|
||||
|
||||
alpha : float, optional
|
||||
Initial guess for the Jacobian is (-1/alpha).
|
||||
reduction_method : str or tuple, optional
|
||||
Method used in ensuring that the rank of the Broyden
|
||||
matrix stays low. Can either be a string giving the
|
||||
name of the method, or a tuple of the form ``(method,
|
||||
param1, param2, ...)`` that gives the name of the
|
||||
method and values for additional parameters.
|
||||
|
||||
Methods available:
|
||||
|
||||
- ``restart``
|
||||
Drop all matrix columns. Has no
|
||||
extra parameters.
|
||||
- ``simple``
|
||||
Drop oldest matrix column. Has no
|
||||
extra parameters.
|
||||
- ``svd``
|
||||
Keep only the most significant SVD
|
||||
components.
|
||||
|
||||
Extra parameters:
|
||||
|
||||
- ``to_retain``
|
||||
Number of SVD components to
|
||||
retain when rank reduction is done.
|
||||
Default is ``max_rank - 2``.
|
||||
max_rank : int, optional
|
||||
Maximum rank for the Broyden matrix.
|
||||
Default is infinity (i.e., no rank reduction).
|
||||
"""
|
||||
pass
|
||||
|
||||
def _root_anderson_doc():
    """
    Options
    -------
    nit : int, optional
        Number of iterations to make. If omitted (default), make as many
        as required to meet tolerances.
    disp : bool, optional
        Print status to stdout on every iteration.
    maxiter : int, optional
        Maximum number of iterations to make.
    ftol : float, optional
        Relative tolerance for the residual. If omitted, not used.
    fatol : float, optional
        Absolute tolerance (in max-norm) for the residual.
        If omitted, default is 6e-6.
    xtol : float, optional
        Relative minimum step size. If omitted, not used.
    xatol : float, optional
        Absolute minimum step size, as determined from the Jacobian
        approximation. If the step size is smaller than this, optimization
        is terminated as successful. If omitted, not used.
    tol_norm : function(vector) -> scalar, optional
        Norm to use in convergence check. Default is the maximum norm.
    line_search : {None, 'armijo' (default), 'wolfe'}, optional
        Which type of a line search to use to determine the step size in
        the direction given by the Jacobian approximation. Defaults to
        'armijo'.
    jac_options : dict, optional
        Options for the respective Jacobian approximation.

            alpha : float, optional
                Initial guess for the Jacobian is (-1/alpha).
            M : int, optional
                Number of previous vectors to retain. Defaults to 5.
            w0 : float, optional
                Regularization parameter for numerical stability.
                Good values are of the order of 0.01 (small compared
                to unity).
    """
    pass

def _root_linearmixing_doc():
    """
    Options
    -------
    nit : int, optional
        Number of iterations to make. If omitted (default), make as many
        as required to meet tolerances.
    disp : bool, optional
        Print status to stdout on every iteration.
    maxiter : int, optional
        Maximum number of iterations to make.
    ftol : float, optional
        Relative tolerance for the residual. If omitted, not used.
    fatol : float, optional
        Absolute tolerance (in max-norm) for the residual.
        If omitted, default is 6e-6.
    xtol : float, optional
        Relative minimum step size. If omitted, not used.
    xatol : float, optional
        Absolute minimum step size, as determined from the Jacobian
        approximation. If the step size is smaller than this, optimization
        is terminated as successful. If omitted, not used.
    tol_norm : function(vector) -> scalar, optional
        Norm to use in convergence check. Default is the maximum norm.
    line_search : {None, 'armijo' (default), 'wolfe'}, optional
        Which type of a line search to use to determine the step size in
        the direction given by the Jacobian approximation. Defaults to
        'armijo'.
    jac_options : dict, optional
        Options for the respective Jacobian approximation.

            alpha : float, optional
                Initial guess for the Jacobian is (-1/alpha).
    """
    pass

def _root_diagbroyden_doc():
    """
    Options
    -------
    nit : int, optional
        Number of iterations to make. If omitted (default), make as many
        as required to meet tolerances.
    disp : bool, optional
        Print status to stdout on every iteration.
    maxiter : int, optional
        Maximum number of iterations to make.
    ftol : float, optional
        Relative tolerance for the residual. If omitted, not used.
    fatol : float, optional
        Absolute tolerance (in max-norm) for the residual.
        If omitted, default is 6e-6.
    xtol : float, optional
        Relative minimum step size. If omitted, not used.
    xatol : float, optional
        Absolute minimum step size, as determined from the Jacobian
        approximation. If the step size is smaller than this, optimization
        is terminated as successful. If omitted, not used.
    tol_norm : function(vector) -> scalar, optional
        Norm to use in convergence check. Default is the maximum norm.
    line_search : {None, 'armijo' (default), 'wolfe'}, optional
        Which type of a line search to use to determine the step size in
        the direction given by the Jacobian approximation. Defaults to
        'armijo'.
    jac_options : dict, optional
        Options for the respective Jacobian approximation.

            alpha : float, optional
                Initial guess for the Jacobian is (-1/alpha).
    """
    pass

def _root_excitingmixing_doc():
    """
    Options
    -------
    nit : int, optional
        Number of iterations to make. If omitted (default), make as many
        as required to meet tolerances.
    disp : bool, optional
        Print status to stdout on every iteration.
    maxiter : int, optional
        Maximum number of iterations to make.
    ftol : float, optional
        Relative tolerance for the residual. If omitted, not used.
    fatol : float, optional
        Absolute tolerance (in max-norm) for the residual.
        If omitted, default is 6e-6.
    xtol : float, optional
        Relative minimum step size. If omitted, not used.
    xatol : float, optional
        Absolute minimum step size, as determined from the Jacobian
        approximation. If the step size is smaller than this, optimization
        is terminated as successful. If omitted, not used.
    tol_norm : function(vector) -> scalar, optional
        Norm to use in convergence check. Default is the maximum norm.
    line_search : {None, 'armijo' (default), 'wolfe'}, optional
        Which type of a line search to use to determine the step size in
        the direction given by the Jacobian approximation. Defaults to
        'armijo'.
    jac_options : dict, optional
        Options for the respective Jacobian approximation.

            alpha : float, optional
                Initial Jacobian approximation is (-1/alpha).
            alphamax : float, optional
                The entries of the diagonal Jacobian are kept in the range
                ``[alpha, alphamax]``.
    """
    pass

def _root_krylov_doc():
    """
    Options
    -------
    nit : int, optional
        Number of iterations to make. If omitted (default), make as many
        as required to meet tolerances.
    disp : bool, optional
        Print status to stdout on every iteration.
    maxiter : int, optional
        Maximum number of iterations to make.
    ftol : float, optional
        Relative tolerance for the residual. If omitted, not used.
    fatol : float, optional
        Absolute tolerance (in max-norm) for the residual.
        If omitted, default is 6e-6.
    xtol : float, optional
        Relative minimum step size. If omitted, not used.
    xatol : float, optional
        Absolute minimum step size, as determined from the Jacobian
        approximation. If the step size is smaller than this, optimization
        is terminated as successful. If omitted, not used.
    tol_norm : function(vector) -> scalar, optional
        Norm to use in convergence check. Default is the maximum norm.
    line_search : {None, 'armijo' (default), 'wolfe'}, optional
        Which type of a line search to use to determine the step size in
        the direction given by the Jacobian approximation. Defaults to
        'armijo'.
    jac_options : dict, optional
        Options for the respective Jacobian approximation.

            rdiff : float, optional
                Relative step size to use in numerical differentiation.
            method : str or callable, optional
                Krylov method to use to approximate the Jacobian. Can be a string,
                or a function implementing the same interface as the iterative
                solvers in `scipy.sparse.linalg`. If a string, needs to be one of:
                ``'lgmres'``, ``'gmres'``, ``'bicgstab'``, ``'cgs'``, ``'minres'``,
                ``'tfqmr'``.

                The default is `scipy.sparse.linalg.lgmres`.
            inner_M : LinearOperator or InverseJacobian
                Preconditioner for the inner Krylov iteration.
                Note that you can also use inverse Jacobians as (adaptive)
                preconditioners. For example,

                >>> jac = BroydenFirst()
                >>> kjac = KrylovJacobian(inner_M=jac.inverse)

                If the preconditioner has a method named 'update', it will
                be called as ``update(x, f)`` after each nonlinear step,
                with ``x`` giving the current point, and ``f`` the current
                function value.
            inner_tol, inner_maxiter, ...
                Parameters to pass on to the "inner" Krylov solver.
                See `scipy.sparse.linalg.gmres` for details.
            outer_k : int, optional
                Size of the subspace kept across LGMRES nonlinear
                iterations.

                See `scipy.sparse.linalg.lgmres` for details.
    """
    pass
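The option docstrings above describe a two-level layout: solver-wide settings sit at the top level of `options`, and settings for the Jacobian approximation are nested under `jac_options`. The following is a minimal sketch of that layout for `method='krylov'`. The numeric values are illustrative assumptions, not recommendations, and `solve_with_krylov` is a hypothetical helper name that is not part of SciPy.

```python
# Illustrative values only; none of these numbers are recommendations.
KRYLOV_OPTIONS = {
    "fatol": 6e-6,            # absolute residual tolerance (max-norm)
    "maxiter": 50,            # cap on the number of nonlinear iterations
    "line_search": "armijo",  # alternatives listed above: "wolfe" or None
    "jac_options": {          # options for the Krylov Jacobian approximation
        "method": "lgmres",   # inner Krylov solver
        "inner_maxiter": 20,  # passed on to the inner solver
        "outer_k": 10,        # LGMRES subspace size kept across iterations
    },
}


def solve_with_krylov(fun, x0, options=KRYLOV_OPTIONS):
    """Hypothetical helper: run scipy.optimize.root with method='krylov'."""
    # Imported lazily so the options layout can be inspected without SciPy.
    from scipy.optimize import root
    return root(fun, x0, method="krylov", options=options)
```

Keeping the nested dictionary separate from the call makes it easy to see which keys belong to the nonlinear solver and which are forwarded to the Jacobian approximation.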
525
venv/lib/python3.12/site-packages/scipy/optimize/_root_scalar.py
Normal file
@ -0,0 +1,525 @@
"""
Unified interfaces to root finding algorithms for real or complex
scalar functions.

Functions
---------
- root_scalar : find a root of a scalar function.
"""
import numpy as np

from . import _zeros_py as optzeros
from ._numdiff import approx_derivative

__all__ = ['root_scalar']

ROOT_SCALAR_METHODS = ['bisect', 'brentq', 'brenth', 'ridder', 'toms748',
                       'newton', 'secant', 'halley']


class MemoizeDer:
    """Decorator that caches the value and derivative(s) of a function each
    time it is called.

    This is a simplistic memoizer that calls and caches a single value
    of `f(x, *args)`.
    It assumes that `args` does not change between invocations.
    It supports the use case of a root-finder where `args` is fixed,
    `x` changes, and only rarely, if at all, does x assume the same value
    more than once."""
    def __init__(self, fun):
        self.fun = fun
        self.vals = None
        self.x = None
        self.n_calls = 0

    def __call__(self, x, *args):
        r"""Calculate f or use cached value if available"""
        # Derivative may be requested before the function itself, always check
        if self.vals is None or x != self.x:
            fg = self.fun(x, *args)
            self.x = x
            self.n_calls += 1
            self.vals = fg[:]
        return self.vals[0]

    def fprime(self, x, *args):
        r"""Calculate f' or use a cached value if available"""
        if self.vals is None or x != self.x:
            self(x, *args)
        return self.vals[1]

    def fprime2(self, x, *args):
        r"""Calculate f'' or use a cached value if available"""
        if self.vals is None or x != self.x:
            self(x, *args)
        return self.vals[2]

    def ncalls(self):
        return self.n_calls


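The single-point caching idea in `MemoizeDer` can be tried out in isolation. The sketch below is a standalone illustration, not SciPy's class: `CachedDerivatives` and `_ensure` are hypothetical names, and the test function is the cubic used in the `root_scalar` examples.

```python
class CachedDerivatives:
    """Cache the last (f, f', f'') tuple returned by ``fun`` for one x."""

    def __init__(self, fun):
        self.fun = fun
        self.x = None
        self.vals = None
        self.n_calls = 0

    def _ensure(self, x, *args):
        # Re-evaluate only when x differs from the cached point.
        if self.vals is None or x != self.x:
            self.vals = tuple(self.fun(x, *args))
            self.x = x
            self.n_calls += 1

    def __call__(self, x, *args):
        self._ensure(x, *args)
        return self.vals[0]

    def fprime(self, x, *args):
        self._ensure(x, *args)
        return self.vals[1]

    def fprime2(self, x, *args):
        self._ensure(x, *args)
        return self.vals[2]


def f_p_pp(x):
    # Value, first derivative and second derivative of x**3 - 1.
    return x**3 - 1, 3 * x**2, 6 * x


cached = CachedDerivatives(f_p_pp)
value = cached(2.0)               # evaluates f_p_pp once
slope = cached.fprime(2.0)        # served from the cache
curvature = cached.fprime2(2.0)   # served from the cache
calls_after_one_point = cached.n_calls
```

Because all three accessors share one cached tuple, asking for the value and both derivatives at the same `x` costs a single evaluation of the user function, which is why the memoized call count is the one reported back to the caller.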
def root_scalar(f, args=(), method=None, bracket=None,
                fprime=None, fprime2=None,
                x0=None, x1=None,
                xtol=None, rtol=None, maxiter=None,
                options=None):
    """
    Find a root of a scalar function.

    Parameters
    ----------
    f : callable
        A function to find a root of.
    args : tuple, optional
        Extra arguments passed to the objective function and its derivative(s).
    method : str, optional
        Type of solver. Should be one of

            - 'bisect' :ref:`(see here) <optimize.root_scalar-bisect>`
            - 'brentq' :ref:`(see here) <optimize.root_scalar-brentq>`
            - 'brenth' :ref:`(see here) <optimize.root_scalar-brenth>`
            - 'ridder' :ref:`(see here) <optimize.root_scalar-ridder>`
            - 'toms748' :ref:`(see here) <optimize.root_scalar-toms748>`
            - 'newton' :ref:`(see here) <optimize.root_scalar-newton>`
            - 'secant' :ref:`(see here) <optimize.root_scalar-secant>`
            - 'halley' :ref:`(see here) <optimize.root_scalar-halley>`

    bracket : sequence of 2 floats, optional
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    x0 : float, optional
        Initial guess.
    x1 : float, optional
        A second guess.
    fprime : bool or callable, optional
        If `fprime` is a boolean and is True, `f` is assumed to return the
        value of the objective function and of the derivative.
        `fprime` can also be a callable returning the derivative of `f`. In
        this case, it must accept the same arguments as `f`.
    fprime2 : bool or callable, optional
        If `fprime2` is a boolean and is True, `f` is assumed to return the
        value of the objective function and of the
        first and second derivatives.
        `fprime2` can also be a callable returning the second derivative of `f`.
        In this case, it must accept the same arguments as `f`.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        A dictionary of solver options. E.g., ``k``, see
        :obj:`show_options()` for details.

    Returns
    -------
    sol : RootResults
        The solution represented as a ``RootResults`` object.
        Important attributes are: ``root`` the solution, ``converged`` a
        boolean flag indicating if the algorithm exited successfully and
        ``flag`` which describes the cause of the termination. See
        `RootResults` for a description of other attributes.

    See also
    --------
    show_options : Additional options accepted by the solvers
    root : Find a root of a vector function.

    Notes
    -----
    This section describes the available solvers that can be selected by the
    'method' parameter.

    The default is to use the best method available for the situation
    presented.
    If a bracket is provided, it may use one of the bracketing methods.
    If a derivative and an initial value are specified, it may
    select one of the derivative-based methods.
    If no method is judged applicable, it will raise an Exception.

    Arguments for each method are as follows (x=required, o=optional).

    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | method                                        | f | args | bracket | x0 | x1 | fprime | fprime2 | xtol | rtol | maxiter | options |
    +===============================================+===+======+=========+====+====+========+=========+======+======+=========+=========+
    | :ref:`bisect <optimize.root_scalar-bisect>`   | x | o    | x       |    |    |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`brentq <optimize.root_scalar-brentq>`   | x | o    | x       |    |    |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`brenth <optimize.root_scalar-brenth>`   | x | o    | x       |    |    |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`ridder <optimize.root_scalar-ridder>`   | x | o    | x       |    |    |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`toms748 <optimize.root_scalar-toms748>` | x | o    | x       |    |    |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`secant <optimize.root_scalar-secant>`   | x | o    |         | x  | o  |        |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`newton <optimize.root_scalar-newton>`   | x | o    |         | x  |    | o      |         | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+
    | :ref:`halley <optimize.root_scalar-halley>`   | x | o    |         | x  |    | x      | x       | o    | o    | o       | o       |
    +-----------------------------------------------+---+------+---------+----+----+--------+---------+------+------+---------+---------+

    Examples
    --------

    Find the root of a simple cubic

    >>> from scipy import optimize
    >>> def f(x):
    ...     return (x**3 - 1)  # only one real root at x = 1

    >>> def fprime(x):
    ...     return 3*x**2

    The `brentq` method takes as input a bracket

    >>> sol = optimize.root_scalar(f, bracket=[0, 3], method='brentq')
    >>> sol.root, sol.iterations, sol.function_calls
    (1.0, 10, 11)

    The `newton` method takes as input a single point and uses the
    derivative(s).

    >>> sol = optimize.root_scalar(f, x0=0.2, fprime=fprime, method='newton')
    >>> sol.root, sol.iterations, sol.function_calls
    (1.0, 11, 22)

    The function can provide the value and derivative(s) in a single call.

    >>> def f_p_pp(x):
    ...     return (x**3 - 1), 3*x**2, 6*x

    >>> sol = optimize.root_scalar(
    ...     f_p_pp, x0=0.2, fprime=True, method='newton'
    ... )
    >>> sol.root, sol.iterations, sol.function_calls
    (1.0, 11, 11)

    >>> sol = optimize.root_scalar(
    ...     f_p_pp, x0=0.2, fprime=True, fprime2=True, method='halley'
    ... )
    >>> sol.root, sol.iterations, sol.function_calls
    (1.0, 7, 8)


    """  # noqa: E501
    if not isinstance(args, tuple):
        args = (args,)

    if options is None:
        options = {}

    # fun also returns the derivative(s)
    is_memoized = False
    if fprime2 is not None and not callable(fprime2):
        if bool(fprime2):
            f = MemoizeDer(f)
            is_memoized = True
            fprime2 = f.fprime2
            fprime = f.fprime
        else:
            fprime2 = None
    if fprime is not None and not callable(fprime):
        if bool(fprime):
            f = MemoizeDer(f)
            is_memoized = True
            fprime = f.fprime
        else:
            fprime = None

    # respect solver-specific default tolerances - only pass in if actually set
    kwargs = {}
    for k in ['xtol', 'rtol', 'maxiter']:
        v = locals().get(k)
        if v is not None:
            kwargs[k] = v

    # Set any solver-specific options
    if options:
        kwargs.update(options)
    # Always request full_output from the underlying method as _root_scalar
    # always returns a RootResults object
    kwargs.update(full_output=True, disp=False)

    # Pick a method if not specified.
    # Use the "best" method available for the situation.
    if not method:
        # Compare against None rather than relying on truthiness, which is
        # ambiguous when `bracket` is a NumPy array.
        if bracket is not None:
            method = 'brentq'
        elif x0 is not None:
            if fprime:
                if fprime2:
                    method = 'halley'
                else:
                    method = 'newton'
            elif x1 is not None:
                method = 'secant'
            else:
                method = 'newton'
    if not method:
        raise ValueError('Unable to select a solver as neither bracket '
                         'nor starting point provided.')

    meth = method.lower()
    map2underlying = {'halley': 'newton', 'secant': 'newton'}

    try:
        methodc = getattr(optzeros, map2underlying.get(meth, meth))
    except AttributeError as e:
        raise ValueError('Unknown solver %s' % meth) from e

    if meth in ['bisect', 'ridder', 'brentq', 'brenth', 'toms748']:
        if not isinstance(bracket, (list, tuple, np.ndarray)):
            raise ValueError('Bracket needed for %s' % method)

        a, b = bracket[:2]
        try:
            r, sol = methodc(f, a, b, args=args, **kwargs)
        except ValueError as e:
            # gh-17622 fixed some bugs in low-level solvers by raising an error
            # (rather than returning incorrect results) when the callable
            # returns a NaN. It did so by wrapping the callable rather than
            # modifying compiled code, so the iteration count is not available.
            if hasattr(e, "_x"):
                sol = optzeros.RootResults(root=e._x,
                                           iterations=np.nan,
                                           function_calls=e._function_calls,
                                           flag=str(e), method=method)
            else:
                raise

    elif meth in ['secant']:
        if x0 is None:
            raise ValueError('x0 must not be None for %s' % method)
        if 'xtol' in kwargs:
            kwargs['tol'] = kwargs.pop('xtol')
        r, sol = methodc(f, x0, args=args, fprime=None, fprime2=None,
                         x1=x1, **kwargs)
    elif meth in ['newton']:
        if x0 is None:
            raise ValueError('x0 must not be None for %s' % method)
        if not fprime:
            # approximate fprime with finite differences

            def fprime(x, *args):
                # `root_scalar` doesn't actually seem to support vectorized
                # use of `newton`. In that case, `approx_derivative` will
                # always get scalar input. Nonetheless, it always returns an
                # array, so we extract the element to produce scalar output.
                return approx_derivative(f, x, method='2-point', args=args)[0]

        if 'xtol' in kwargs:
            kwargs['tol'] = kwargs.pop('xtol')
        r, sol = methodc(f, x0, args=args, fprime=fprime, fprime2=None,
                         **kwargs)
    elif meth in ['halley']:
        if x0 is None:
            raise ValueError('x0 must not be None for %s' % method)
        if not fprime:
            raise ValueError('fprime must be specified for %s' % method)
        if not fprime2:
            raise ValueError('fprime2 must be specified for %s' % method)
        if 'xtol' in kwargs:
            kwargs['tol'] = kwargs.pop('xtol')
        r, sol = methodc(f, x0, args=args, fprime=fprime, fprime2=fprime2, **kwargs)
    else:
        raise ValueError('Unknown solver %s' % method)

    if is_memoized:
        # Replace the function_calls count with the memoized count.
        # Avoids double and triple-counting.
        n_calls = f.n_calls
        sol.function_calls = n_calls

    return sol


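The default-method choice inside `root_scalar` follows a fixed order: a bracket wins, then derivative information decides between Halley and Newton, then a second guess selects the secant method. The standalone sketch below restates that order for illustration. `pick_method` is a hypothetical name, and the sketch assumes that any non-None bracket counts as "provided".

```python
def pick_method(bracket=None, x0=None, x1=None, fprime=None, fprime2=None):
    """Return the solver name chosen when no method is given (sketch)."""
    if bracket is not None:
        # A bracketing interval always selects Brent's method.
        return "brentq"
    if x0 is not None:
        if fprime:
            # Second-derivative information upgrades Newton to Halley.
            return "halley" if fprime2 else "newton"
        if x1 is not None:
            # Two starting points and no derivative: secant iteration.
            return "secant"
        # Only x0: Newton with a finite-difference derivative.
        return "newton"
    raise ValueError("Unable to select a solver as neither bracket "
                     "nor starting point provided.")


choices = [
    pick_method(bracket=(0, 3)),
    pick_method(x0=0.2, fprime=True),
    pick_method(x0=0.2, fprime=True, fprime2=True),
    pick_method(x0=0.2, x1=0.3),
    pick_method(x0=0.2),
]
```

Writing the selection as a pure function makes the precedence easy to test: a bracket takes priority even when `x0` and derivatives are also supplied.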
def _root_scalar_brentq_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    bracket : sequence of 2 floats, required
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_brenth_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    bracket : sequence of 2 floats, required
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass

def _root_scalar_toms748_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    bracket : sequence of 2 floats, required
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_secant_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    x0 : float, required
        Initial guess.
    x1 : float, optional
        A second guess.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_newton_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function and its derivative.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    x0 : float, required
        Initial guess.
    fprime : bool or callable, optional
        If `fprime` is a boolean and is True, `f` is assumed to return the
        value of the derivative along with the objective function.
        `fprime` can also be a callable returning the derivative of `f`. In
        this case, it must accept the same arguments as `f`.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_halley_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function and its derivatives.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    x0 : float, required
        Initial guess.
    fprime : bool or callable, required
        If `fprime` is a boolean and is True, `f` is assumed to return the
        value of the derivative along with the objective function.
        `fprime` can also be a callable returning the derivative of `f`. In
        this case, it must accept the same arguments as `f`.
    fprime2 : bool or callable, required
        If `fprime2` is a boolean and is True, `f` is assumed to return the
        value of the 1st and 2nd derivatives along with the objective function.
        `fprime2` can also be a callable returning the 2nd derivative of `f`.
        In this case, it must accept the same arguments as `f`.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_ridder_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    bracket : sequence of 2 floats, required
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass


def _root_scalar_bisect_doc():
    r"""
    Options
    -------
    args : tuple, optional
        Extra arguments passed to the objective function.
    bracket : sequence of 2 floats, required
        An interval bracketing a root. `f(x, *args)` must have different
        signs at the two endpoints.
    xtol : float, optional
        Tolerance (absolute) for termination.
    rtol : float, optional
        Tolerance (relative) for termination.
    maxiter : int, optional
        Maximum number of iterations.
    options : dict, optional
        Specifies any method-specific options not covered above.

    """
    pass
1598
venv/lib/python3.12/site-packages/scipy/optimize/_shgo.py
Normal file
File diff suppressed because it is too large
@ -0,0 +1,460 @@
import collections
from abc import ABC, abstractmethod

import numpy as np

from scipy._lib._util import MapWrapper


class VertexBase(ABC):
    """
    Base class for a vertex.
    """
    def __init__(self, x, nn=None, index=None):
        """
        Initialize a vertex object.

        Parameters
        ----------
        x : tuple or vector
            The geometric location (domain).
        nn : list, optional
            Nearest neighbour list.
        index : int, optional
            Index of vertex.
        """
        self.x = x
        self.hash = hash(self.x)  # Save precomputed hash

        if nn is not None:
            self.nn = set(nn)  # can use .indexupdate to add a new list
        else:
            self.nn = set()

        self.index = index

    def __hash__(self):
        return self.hash

    def __getattr__(self, item):
        if item not in ['x_a']:
            raise AttributeError(f"{type(self)} object has no attribute "
                                 f"'{item}'")
        if item == 'x_a':
            self.x_a = np.array(self.x)
            return self.x_a

    @abstractmethod
    def connect(self, v):
        raise NotImplementedError("This method is only implemented with an "
                                  "associated child of the base class.")

    @abstractmethod
    def disconnect(self, v):
        raise NotImplementedError("This method is only implemented with an "
                                  "associated child of the base class.")

    def star(self):
        """Returns the star domain ``st(v)`` of this vertex ``v``.

        Returns
        -------
        st : set
            A set containing all the vertices in ``st(v)``
        """
        # Note: self.st is the same set object as self.nn, so adding self
        # here also adds it to the neighbour set.
        self.st = self.nn
        self.st.add(self)
        return self.st


class VertexScalarField(VertexBase):
    """
    Add homology properties of a scalar field f: R^n --> R associated with
    the geometry built from the VertexBase class
    """

    def __init__(self, x, field=None, nn=None, index=None, field_args=(),
                 g_cons=None, g_cons_args=()):
        """
        Parameters
        ----------
        x : tuple
            vector of vertex coordinates
        field : callable, optional
            a scalar field f: R^n --> R associated with the geometry
        nn : list, optional
            list of nearest neighbours
        index : int, optional
            index of the vertex
        field_args : tuple, optional
            additional arguments to be passed to field
        g_cons : callable, optional
            constraints on the vertex
        g_cons_args : tuple, optional
            additional arguments to be passed to g_cons

        """
        super().__init__(x, nn=nn, index=index)

        # Note Vertex is only initiated once for all x so only
        # evaluated once
        # self.feasible = None

        # self.f is externally defined by the cache to allow parallel
        # processing
        # None type that will break arithmetic operations unless defined
        # self.f = None

        self.check_min = True
        self.check_max = True

    def connect(self, v):
        """Connects self to another vertex object v.

        Parameters
        ----------
        v : VertexBase or VertexScalarField object
        """
        if v is not self and v not in self.nn:
            self.nn.add(v)
            v.nn.add(self)

            # Flags for checking homology properties:
            self.check_min = True
            self.check_max = True
            v.check_min = True
            v.check_max = True

    def disconnect(self, v):
        if v in self.nn:
            self.nn.remove(v)
            v.nn.remove(self)

            # Flags for checking homology properties:
            self.check_min = True
            self.check_max = True
            v.check_min = True
            v.check_max = True

    def minimiser(self):
        """Check whether this vertex is strictly less than all its
        neighbours"""
        if self.check_min:
            self._min = all(self.f < v.f for v in self.nn)
            self.check_min = False

        return self._min

    def maximiser(self):
        """
        Check whether this vertex is strictly greater than all its
        neighbours.
        """
        if self.check_max:
            self._max = all(self.f > v.f for v in self.nn)
            self.check_max = False

        return self._max


class VertexVectorField(VertexBase):
|
||||
"""
|
||||
Add homology properties of a vector field f: R^n --> R^m associated with
|
||||
the geometry built from the VertexBase class.
|
||||
"""
|
||||
|
||||
def __init__(self, x, sfield=None, vfield=None, field_args=(),
|
||||
vfield_args=(), g_cons=None,
|
||||
g_cons_args=(), nn=None, index=None):
|
||||
super().__init__(x, nn=nn, index=index)
|
||||
|
||||
raise NotImplementedError("This class is still a work in progress")
|
||||
|
||||
|
||||
class VertexCacheBase:
|
||||
"""Base class for a vertex cache for a simplicial complex."""
|
||||
def __init__(self):
|
||||
|
||||
self.cache = collections.OrderedDict()
|
||||
self.nfev = 0  # Number of field evaluations
|
||||
self.index = -1
|
||||
|
||||
def __iter__(self):
|
||||
for v in self.cache:
|
||||
yield self.cache[v]
|
||||
return
|
||||
|
||||
def size(self):
|
||||
"""Returns the size of the vertex cache."""
|
||||
return self.index + 1
|
||||
|
||||
def print_out(self):
|
||||
headlen = len(f"Vertex cache of size: {len(self.cache)}:")
|
||||
print('=' * headlen)
|
||||
print(f"Vertex cache of size: {len(self.cache)}:")
|
||||
print('=' * headlen)
|
||||
for v in self.cache:
|
||||
self.cache[v].print_out()
|
||||
|
||||
|
||||
class VertexCube(VertexBase):
|
||||
"""Vertex class to be used for a pure simplicial complex with no associated
|
||||
differential geometry (single level domain that exists in R^n)"""
|
||||
def __init__(self, x, nn=None, index=None):
|
||||
super().__init__(x, nn=nn, index=index)
|
||||
|
||||
def connect(self, v):
|
||||
if v is not self and v not in self.nn:
|
||||
self.nn.add(v)
|
||||
v.nn.add(self)
|
||||
|
||||
def disconnect(self, v):
|
||||
if v in self.nn:
|
||||
self.nn.remove(v)
|
||||
v.nn.remove(self)
|
||||
|
||||
|
||||
class VertexCacheIndex(VertexCacheBase):
|
||||
def __init__(self):
|
||||
"""
|
||||
Class for a vertex cache for a simplicial complex without an associated
|
||||
field. Useful only for building and visualising a domain complex.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
"""
|
||||
super().__init__()
|
||||
self.Vertex = VertexCube
|
||||
|
||||
def __getitem__(self, x, nn=None):
|
||||
try:
|
||||
return self.cache[x]
|
||||
except KeyError:
|
||||
self.index += 1
|
||||
xval = self.Vertex(x, index=self.index)
|
||||
# logging.info("New generated vertex at x = {}".format(x))
|
||||
# NOTE: Surprisingly high performance increase if logging
|
||||
# is commented out
|
||||
self.cache[x] = xval
|
||||
return self.cache[x]
|
||||
|
||||
|
||||
class VertexCacheField(VertexCacheBase):
|
||||
def __init__(self, field=None, field_args=(), g_cons=None, g_cons_args=(),
|
||||
workers=1):
|
||||
"""
|
||||
Class for a vertex cache for a simplicial complex with an associated
|
||||
field.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
field : callable
|
||||
Scalar or vector field callable.
|
||||
field_args : tuple, optional
|
||||
Any additional fixed parameters needed to completely specify the
|
||||
field function
|
||||
g_cons : dict or sequence of dict, optional
|
||||
Constraints definition.
|
||||
Function(s) on ``R**n``; ``x`` is feasible when ``g(x, *args) >= 0``.
|
||||
g_cons_args : tuple, optional
|
||||
Any additional fixed parameters needed to completely specify the
|
||||
constraint functions
|
||||
workers : int, optional
|
||||
Uses `multiprocessing.Pool <multiprocessing>` to compute the field
|
||||
functions in parallel.
|
||||
|
||||
"""
|
||||
super().__init__()
|
||||
self.index = -1
|
||||
self.Vertex = VertexScalarField
|
||||
self.field = field
|
||||
self.field_args = field_args
|
||||
self.wfield = FieldWrapper(field, field_args) # if workers is not 1
|
||||
|
||||
self.g_cons = g_cons
|
||||
self.g_cons_args = g_cons_args
|
||||
self.wgcons = ConstraintWrapper(g_cons, g_cons_args)
|
||||
self.gpool = set() # A set of tuples to process for feasibility
|
||||
|
||||
# Field processing objects
|
||||
self.fpool = set() # A set of tuples to process for scalar function
|
||||
self.sfc_lock = False # True if self.fpool is non-Empty
|
||||
|
||||
self.workers = workers
|
||||
self._mapwrapper = MapWrapper(workers)
|
||||
|
||||
if workers == 1:
|
||||
self.process_gpool = self.proc_gpool
|
||||
if g_cons is None:
|
||||
self.process_fpool = self.proc_fpool_nog
|
||||
else:
|
||||
self.process_fpool = self.proc_fpool_g
|
||||
else:
|
||||
self.process_gpool = self.pproc_gpool
|
||||
if g_cons is None:
|
||||
self.process_fpool = self.pproc_fpool_nog
|
||||
else:
|
||||
self.process_fpool = self.pproc_fpool_g
|
||||
|
||||
def __getitem__(self, x, nn=None):
|
||||
try:
|
||||
return self.cache[x]
|
||||
except KeyError:
|
||||
self.index += 1
|
||||
xval = self.Vertex(x, field=self.field, nn=nn, index=self.index,
|
||||
field_args=self.field_args,
|
||||
g_cons=self.g_cons,
|
||||
g_cons_args=self.g_cons_args)
|
||||
|
||||
self.cache[x] = xval # Define in cache
|
||||
self.gpool.add(xval) # Add to pool for processing feasibility
|
||||
self.fpool.add(xval) # Add to pool for processing field values
|
||||
return self.cache[x]
|
||||
|
||||
def __getstate__(self):
|
||||
self_dict = self.__dict__.copy()
|
||||
self_dict.pop('pool', None)  # 'pool' is never set in __init__; avoid KeyError
|
||||
return self_dict
|
||||
|
||||
def process_pools(self):
|
||||
if self.g_cons is not None:
|
||||
self.process_gpool()
|
||||
self.process_fpool()
|
||||
self.proc_minimisers()
|
||||
|
||||
def feasibility_check(self, v):
|
||||
v.feasible = True
|
||||
for g, args in zip(self.g_cons, self.g_cons_args):
|
||||
# constraint may return more than 1 value.
|
||||
if np.any(g(v.x_a, *args) < 0.0):
|
||||
v.f = np.inf
|
||||
v.feasible = False
|
||||
break
|
||||
|
||||
def compute_sfield(self, v):
|
||||
"""Compute the scalar field values of a vertex object `v`.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
v : VertexBase or VertexScalarField object
|
||||
"""
|
||||
try:
|
||||
v.f = self.field(v.x_a, *self.field_args)
|
||||
self.nfev += 1
|
||||
except AttributeError:
|
||||
v.f = np.inf
|
||||
# logging.warning(f"Field function not found at x = {self.x_a}")
|
||||
if np.isnan(v.f):
|
||||
v.f = np.inf
|
||||
|
||||
def proc_gpool(self):
|
||||
"""Process all constraints."""
|
||||
if self.g_cons is not None:
|
||||
for v in self.gpool:
|
||||
self.feasibility_check(v)
|
||||
# Clean the pool
|
||||
self.gpool = set()
|
||||
|
||||
def pproc_gpool(self):
|
||||
"""Process all constraints in parallel."""
|
||||
gpool_l = []
|
||||
for v in self.gpool:
|
||||
gpool_l.append(v.x_a)
|
||||
|
||||
G = self._mapwrapper(self.wgcons.gcons, gpool_l)
|
||||
for v, g in zip(self.gpool, G):
|
||||
v.feasible = g # set vertex object attribute v.feasible = g (bool)
|
||||
|
||||
def proc_fpool_g(self):
|
||||
"""Process all field functions with constraints supplied."""
|
||||
for v in self.fpool:
|
||||
if v.feasible:
|
||||
self.compute_sfield(v)
|
||||
# Clean the pool
|
||||
self.fpool = set()
|
||||
|
||||
def proc_fpool_nog(self):
|
||||
"""Process all field functions with no constraints supplied."""
|
||||
for v in self.fpool:
|
||||
self.compute_sfield(v)
|
||||
# Clean the pool
|
||||
self.fpool = set()
|
||||
|
||||
def pproc_fpool_g(self):
|
||||
"""
|
||||
Process all field functions with constraints supplied in parallel.
|
||||
"""
|
||||
self.wfield.func
|
||||
fpool_l = []
|
||||
for v in self.fpool:
|
||||
if v.feasible:
|
||||
fpool_l.append(v.x_a)
|
||||
else:
|
||||
v.f = np.inf
|
||||
F = self._mapwrapper(self.wfield.func, fpool_l)
|
||||
for va, f in zip(fpool_l, F):
|
||||
vt = tuple(va)
|
||||
self[vt].f = f # set vertex object attribute v.f = f
|
||||
self.nfev += 1
|
||||
# Clean the pool
|
||||
self.fpool = set()
|
||||
|
||||
def pproc_fpool_nog(self):
|
||||
"""
|
||||
Process all field functions with no constraints supplied in parallel.
|
||||
"""
|
||||
self.wfield.func
|
||||
fpool_l = []
|
||||
for v in self.fpool:
|
||||
fpool_l.append(v.x_a)
|
||||
F = self._mapwrapper(self.wfield.func, fpool_l)
|
||||
for va, f in zip(fpool_l, F):
|
||||
vt = tuple(va)
|
||||
self[vt].f = f # set vertex object attribute v.f = f
|
||||
self.nfev += 1
|
||||
# Clean the pool
|
||||
self.fpool = set()
|
||||
|
||||
def proc_minimisers(self):
|
||||
"""Check for minimisers."""
|
||||
for v in self:
|
||||
v.minimiser()
|
||||
v.maximiser()
|
||||
|
||||
|
||||
class ConstraintWrapper:
|
||||
"""Object to wrap constraints to pass to `multiprocessing.Pool`."""
|
||||
def __init__(self, g_cons, g_cons_args):
|
||||
self.g_cons = g_cons
|
||||
self.g_cons_args = g_cons_args
|
||||
|
||||
def gcons(self, v_x_a):
|
||||
vfeasible = True
|
||||
for g, args in zip(self.g_cons, self.g_cons_args):
|
||||
# constraint may return more than 1 value.
|
||||
if np.any(g(v_x_a, *args) < 0.0):
|
||||
vfeasible = False
|
||||
break
|
||||
return vfeasible
|
||||
|
||||
|
||||
class FieldWrapper:
|
||||
"""Object to wrap field to pass to `multiprocessing.Pool`."""
|
||||
def __init__(self, field, field_args):
|
||||
self.field = field
|
||||
self.field_args = field_args
|
||||
|
||||
def func(self, v_x_a):
|
||||
try:
|
||||
v_f = self.field(v_x_a, *self.field_args)
|
||||
except Exception:
|
||||
v_f = np.inf
|
||||
if np.isnan(v_f):
|
||||
v_f = np.inf
|
||||
|
||||
return v_f
|
||||
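The vertex cache above creates a vertex the first time a coordinate tuple is looked up, queues it for evaluation, and later decides whether it is a local minimiser by comparing its field value with those of its neighbours. The following is a minimal standard-library sketch of that pattern, not the SciPy implementation: `MiniVertex` and `MiniCache` are hypothetical names, constraints and parallel workers are left out, and the toy field is an assumption made for illustration.

```python
import math


class MiniVertex:
    """Hypothetical stand-in for VertexScalarField (illustration only)."""

    def __init__(self, x, index):
        self.x = x              # coordinates as a hashable tuple
        self.index = index
        self.nn = set()         # neighbouring vertices
        self.f = None           # field value, filled in by the cache

    def connect(self, v):
        # Edges are symmetric: each vertex records the other.
        if v is not self and v not in self.nn:
            self.nn.add(v)
            v.nn.add(self)

    def minimiser(self):
        # True when the field value is strictly below every neighbour's.
        return all(self.f < v.f for v in self.nn)


class MiniCache:
    """Hypothetical stand-in for VertexCacheField without constraints."""

    def __init__(self, field):
        self.field = field
        self.cache = {}
        self.fpool = set()      # vertices awaiting a field evaluation
        self.nfev = 0           # number of field evaluations

    def __getitem__(self, x):
        try:
            return self.cache[x]
        except KeyError:
            v = MiniVertex(x, index=len(self.cache))
            self.cache[x] = v
            self.fpool.add(v)
            return v

    def process_pools(self):
        for v in self.fpool:
            f = self.field(v.x)
            v.f = math.inf if math.isnan(f) else f   # NaN is treated as +inf
            self.nfev += 1
        self.fpool = set()      # clean the pool


# Toy field (an assumption): squared distance from the origin.
cache = MiniCache(lambda x: sum(xi * xi for xi in x))
origin, right, up = cache[(0.0, 0.0)], cache[(1.0, 0.0)], cache[(0.0, 1.0)]
origin.connect(right)
origin.connect(up)
cache.process_pools()
is_min = origin.minimiser()
```

Looking up `cache[(0.0, 0.0)]` a second time returns the same object and triggers no new evaluation, which is the point of deferring field evaluations to `process_pools`.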
Binary file not shown.
510
venv/lib/python3.12/site-packages/scipy/optimize/_slsqp_py.py
Normal file
@ -0,0 +1,510 @@
|
||||
"""
|
||||
This module implements the Sequential Least Squares Programming optimization
|
||||
algorithm (SLSQP), originally developed by Dieter Kraft.
|
||||
See http://www.netlib.org/toms/733
|
||||
|
||||
Functions
|
||||
---------
|
||||
.. autosummary::
|
||||
:toctree: generated/
|
||||
|
||||
approx_jacobian
|
||||
fmin_slsqp
|
||||
|
||||
"""
|
||||
|
||||
__all__ = ['approx_jacobian', 'fmin_slsqp']
|
||||
|
||||
import numpy as np
|
||||
from scipy.optimize._slsqp import slsqp
|
||||
from numpy import (zeros, array, linalg, append, concatenate, finfo,
|
||||
sqrt, vstack, isfinite, atleast_1d)
|
||||
from ._optimize import (OptimizeResult, _check_unknown_options,
|
||||
_prepare_scalar_function, _clip_x_for_func,
|
||||
_check_clip_x)
|
||||
from ._numdiff import approx_derivative
|
||||
from ._constraints import old_bound_to_new, _arr_to_scalar
|
||||
from scipy._lib._array_api import atleast_nd, array_namespace
|
||||
|
||||
|
||||
__docformat__ = "restructuredtext en"
|
||||
|
||||
_epsilon = sqrt(finfo(float).eps)
|
||||
|
||||
|
||||
def approx_jacobian(x, func, epsilon, *args):
|
||||
"""
|
||||
Approximate the Jacobian matrix of a callable function.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
x : array_like
|
||||
The state vector at which to compute the Jacobian matrix.
|
||||
func : callable f(x,*args)
|
||||
The vector-valued function.
|
||||
epsilon : float
|
||||
The perturbation used to determine the partial derivatives.
|
||||
args : sequence
|
||||
Additional arguments passed to func.
|
||||
|
||||
Returns
|
||||
-------
|
||||
An array of dimensions ``(lenf, lenx)`` where ``lenf`` is the length
|
||||
of the outputs of `func`, and ``lenx`` is the number of elements in
|
||||
`x`.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The approximation is done using forward differences.
|
||||
|
||||
"""
|
||||
# approx_derivative returns (m, n) == (lenf, lenx)
|
||||
jac = approx_derivative(func, x, method='2-point', abs_step=epsilon,
|
||||
args=args)
|
||||
# if func returns a scalar jac.shape will be (lenx,). Make sure
|
||||
# it's at least a 2D array.
|
||||
return np.atleast_2d(jac)
|
||||
|
||||
|
||||
def fmin_slsqp(func, x0, eqcons=(), f_eqcons=None, ieqcons=(), f_ieqcons=None,
|
||||
bounds=(), fprime=None, fprime_eqcons=None,
|
||||
fprime_ieqcons=None, args=(), iter=100, acc=1.0E-6,
|
||||
iprint=1, disp=None, full_output=0, epsilon=_epsilon,
|
||||
callback=None):
|
||||
"""
|
||||
Minimize a function using Sequential Least Squares Programming
|
||||
|
||||
Python interface function for the SLSQP Optimization subroutine
|
||||
originally implemented by Dieter Kraft.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable f(x,*args)
|
||||
Objective function. Must return a scalar.
|
||||
x0 : 1-D ndarray of float
|
||||
Initial guess for the independent variable(s).
|
||||
eqcons : list, optional
|
||||
A list of functions of length n such that
|
||||
eqcons[j](x,*args) == 0.0 in a successfully optimized
|
||||
problem.
|
||||
f_eqcons : callable f(x,*args), optional
|
||||
Returns a 1-D array in which each element must equal 0.0 in a
|
||||
successfully optimized problem. If f_eqcons is specified,
|
||||
eqcons is ignored.
|
||||
ieqcons : list, optional
|
||||
A list of functions of length n such that
|
||||
ieqcons[j](x,*args) >= 0.0 in a successfully optimized
|
||||
problem.
|
||||
f_ieqcons : callable f(x,*args), optional
|
||||
Returns a 1-D ndarray in which each element must be greater than or
|
||||
equal to 0.0 in a successfully optimized problem. If
|
||||
f_ieqcons is specified, ieqcons is ignored.
|
||||
bounds : list, optional
|
||||
A list of tuples specifying the lower and upper bound
|
||||
for each independent variable [(xl0, xu0),(xl1, xu1),...]
|
||||
Infinite values will be interpreted as large floating values.
|
||||
fprime : callable `f(x,*args)`, optional
|
||||
A function that evaluates the partial derivatives of func.
|
||||
fprime_eqcons : callable `f(x,*args)`, optional
|
||||
A function of the form `f(x, *args)` that returns the m by n
|
||||
array of equality constraint normals. If not provided,
|
||||
the normals will be approximated. The array returned by
|
||||
fprime_eqcons should be sized as ( len(eqcons), len(x0) ).
|
||||
fprime_ieqcons : callable `f(x,*args)`, optional
|
||||
A function of the form `f(x, *args)` that returns the m by n
|
||||
array of inequality constraint normals. If not provided,
|
||||
the normals will be approximated. The array returned by
|
||||
fprime_ieqcons should be sized as ( len(ieqcons), len(x0) ).
|
||||
args : sequence, optional
|
||||
Additional arguments passed to func and fprime.
|
||||
iter : int, optional
|
||||
The maximum number of iterations.
|
||||
acc : float, optional
|
||||
Requested accuracy.
|
||||
iprint : int, optional
|
||||
The verbosity of fmin_slsqp :
|
||||
|
||||
* iprint <= 0 : Silent operation
|
||||
* iprint == 1 : Print summary upon completion (default)
|
||||
* iprint >= 2 : Print status of each iterate and summary
|
||||
disp : int, optional
|
||||
Overrides the iprint interface (preferred).
|
||||
full_output : bool, optional
|
||||
If False, return only the minimizer of func (default).
|
||||
Otherwise, output final objective function and summary
|
||||
information.
|
||||
epsilon : float, optional
|
||||
The step size for finite-difference derivative estimates.
|
||||
callback : callable, optional
|
||||
Called after each iteration, as ``callback(x)``, where ``x`` is the
|
||||
current parameter vector.
|
||||
|
||||
Returns
|
||||
-------
|
||||
out : ndarray of float
|
||||
The final minimizer of func.
|
||||
fx : ndarray of float, if full_output is true
|
||||
The final value of the objective function.
|
||||
its : int, if full_output is true
|
||||
The number of iterations.
|
||||
imode : int, if full_output is true
|
||||
The exit mode from the optimizer (see below).
|
||||
smode : string, if full_output is true
|
||||
Message describing the exit mode from the optimizer.
|
||||
|
||||
See also
|
||||
--------
|
||||
minimize: Interface to minimization algorithms for multivariate
|
||||
functions. See the 'SLSQP' `method` in particular.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Exit modes are defined as follows ::
|
||||
|
||||
-1 : Gradient evaluation required (g & a)
|
||||
0 : Optimization terminated successfully
|
||||
1 : Function evaluation required (f & c)
|
||||
2 : More equality constraints than independent variables
|
||||
3 : More than 3*n iterations in LSQ subproblem
|
||||
4 : Inequality constraints incompatible
|
||||
5 : Singular matrix E in LSQ subproblem
|
||||
6 : Singular matrix C in LSQ subproblem
|
||||
7 : Rank-deficient equality constraint subproblem HFTI
|
||||
8 : Positive directional derivative for linesearch
|
||||
9 : Iteration limit reached
|
||||
|
||||
Examples
|
||||
--------
|
||||
Examples are given :ref:`in the tutorial <tutorial-sqlsp>`.
|
||||
|
||||
"""
|
||||
if disp is not None:
|
||||
iprint = disp
|
||||
|
||||
opts = {'maxiter': iter,
|
||||
'ftol': acc,
|
||||
'iprint': iprint,
|
||||
'disp': iprint != 0,
|
||||
'eps': epsilon,
|
||||
'callback': callback}
|
||||
|
||||
# Build the constraints as a tuple of dictionaries
|
||||
cons = ()
|
||||
# 1. constraints of the 1st kind (eqcons, ieqcons); no Jacobian; take
|
||||
# the same extra arguments as the objective function.
|
||||
cons += tuple({'type': 'eq', 'fun': c, 'args': args} for c in eqcons)
|
||||
cons += tuple({'type': 'ineq', 'fun': c, 'args': args} for c in ieqcons)
|
||||
# 2. constraints of the 2nd kind (f_eqcons, f_ieqcons) and their Jacobian
|
||||
# (fprime_eqcons, fprime_ieqcons); also take the same extra arguments
|
||||
# as the objective function.
|
||||
if f_eqcons:
|
||||
cons += ({'type': 'eq', 'fun': f_eqcons, 'jac': fprime_eqcons,
|
||||
'args': args}, )
|
||||
if f_ieqcons:
|
||||
cons += ({'type': 'ineq', 'fun': f_ieqcons, 'jac': fprime_ieqcons,
|
||||
'args': args}, )
|
||||
|
||||
res = _minimize_slsqp(func, x0, args, jac=fprime, bounds=bounds,
|
||||
constraints=cons, **opts)
|
||||
if full_output:
|
||||
return res['x'], res['fun'], res['nit'], res['status'], res['message']
|
||||
else:
|
||||
return res['x']
|
||||
|
||||
|
||||
def _minimize_slsqp(func, x0, args=(), jac=None, bounds=None,
|
||||
constraints=(),
|
||||
maxiter=100, ftol=1.0E-6, iprint=1, disp=False,
|
||||
eps=_epsilon, callback=None, finite_diff_rel_step=None,
|
||||
**unknown_options):
|
||||
"""
|
||||
Minimize a scalar function of one or more variables using Sequential
|
||||
Least Squares Programming (SLSQP).
|
||||
|
||||
Options
|
||||
-------
|
||||
ftol : float
|
||||
Precision goal for the value of f in the stopping criterion.
|
||||
eps : float
|
||||
Step size used for numerical approximation of the Jacobian.
|
||||
disp : bool
|
||||
Set to True to print convergence messages. If False,
|
||||
`iprint` is ignored and set to 0.
|
||||
maxiter : int
|
||||
Maximum number of iterations.
|
||||
finite_diff_rel_step : None or array_like, optional
|
||||
If `jac in ['2-point', '3-point', 'cs']` the relative step size to
|
||||
use for numerical approximation of `jac`. The absolute step
|
||||
size is computed as ``h = rel_step * sign(x) * max(1, abs(x))``,
|
||||
possibly adjusted to fit into the bounds. For ``method='3-point'``
|
||||
the sign of `h` is ignored. If None (default) then step is selected
|
||||
automatically.
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
iter = maxiter - 1
|
||||
acc = ftol
|
||||
epsilon = eps
|
||||
|
||||
if not disp:
|
||||
iprint = 0
|
||||
|
||||
# Transform x0 into an array.
|
||||
xp = array_namespace(x0)
|
||||
x0 = atleast_nd(x0, ndim=1, xp=xp)
|
||||
dtype = xp.float64
|
||||
if xp.isdtype(x0.dtype, "real floating"):
|
||||
dtype = x0.dtype
|
||||
x = xp.reshape(xp.astype(x0, dtype), -1)
|
||||
|
||||
# SLSQP is sent 'old-style' bounds, 'new-style' bounds are required by
|
||||
# ScalarFunction
|
||||
if bounds is None or len(bounds) == 0:
|
||||
new_bounds = (-np.inf, np.inf)
|
||||
else:
|
||||
new_bounds = old_bound_to_new(bounds)
|
||||
|
||||
# clip the initial guess to bounds, otherwise ScalarFunction doesn't work
|
||||
x = np.clip(x, new_bounds[0], new_bounds[1])
|
||||
|
||||
# Constraints are triaged per type into a dictionary of tuples
|
||||
if isinstance(constraints, dict):
|
||||
constraints = (constraints, )
|
||||
|
||||
cons = {'eq': (), 'ineq': ()}
|
||||
for ic, con in enumerate(constraints):
|
||||
# check type
|
||||
try:
|
||||
ctype = con['type'].lower()
|
||||
except KeyError as e:
|
||||
raise KeyError('Constraint %d has no type defined.' % ic) from e
|
||||
except TypeError as e:
|
||||
raise TypeError('Constraints must be defined using a '
|
||||
'dictionary.') from e
|
||||
except AttributeError as e:
|
||||
raise TypeError("Constraint's type must be a string.") from e
|
||||
else:
|
||||
if ctype not in ['eq', 'ineq']:
|
||||
raise ValueError("Unknown constraint type '%s'." % con['type'])
|
||||
|
||||
# check function
|
||||
if 'fun' not in con:
|
||||
raise ValueError('Constraint %d has no function defined.' % ic)
|
||||
|
||||
# check Jacobian
|
||||
cjac = con.get('jac')
|
||||
if cjac is None:
|
||||
# approximate Jacobian function. The factory function is needed
|
||||
# to keep a reference to `fun`, see gh-4240.
|
||||
def cjac_factory(fun):
|
||||
def cjac(x, *args):
|
||||
x = _check_clip_x(x, new_bounds)
|
||||
|
||||
if jac in ['2-point', '3-point', 'cs']:
|
||||
return approx_derivative(fun, x, method=jac, args=args,
|
||||
rel_step=finite_diff_rel_step,
|
||||
bounds=new_bounds)
|
||||
else:
|
||||
return approx_derivative(fun, x, method='2-point',
|
||||
abs_step=epsilon, args=args,
|
||||
bounds=new_bounds)
|
||||
|
||||
return cjac
|
||||
cjac = cjac_factory(con['fun'])
|
||||
|
||||
# update constraints' dictionary
|
||||
cons[ctype] += ({'fun': con['fun'],
|
||||
'jac': cjac,
|
||||
'args': con.get('args', ())}, )
|
||||
|
||||
exit_modes = {-1: "Gradient evaluation required (g & a)",
|
||||
0: "Optimization terminated successfully",
|
||||
1: "Function evaluation required (f & c)",
|
||||
2: "More equality constraints than independent variables",
|
||||
3: "More than 3*n iterations in LSQ subproblem",
|
||||
4: "Inequality constraints incompatible",
|
||||
5: "Singular matrix E in LSQ subproblem",
|
||||
6: "Singular matrix C in LSQ subproblem",
|
||||
7: "Rank-deficient equality constraint subproblem HFTI",
|
||||
8: "Positive directional derivative for linesearch",
|
||||
9: "Iteration limit reached"}
|
||||
|
||||
# Set the parameters that SLSQP will need
|
||||
# meq, mieq: number of equality and inequality constraints
|
||||
meq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
|
||||
for c in cons['eq']]))
|
||||
mieq = sum(map(len, [atleast_1d(c['fun'](x, *c['args']))
|
||||
for c in cons['ineq']]))
|
||||
# m = The total number of constraints
|
||||
m = meq + mieq
|
||||
# la = The number of constraints, or 1 if there are no constraints
|
||||
la = array([1, m]).max()
|
||||
# n = The number of independent variables
|
||||
n = len(x)
|
||||
|
||||
# Define the workspaces for SLSQP
|
||||
n1 = n + 1
|
||||
mineq = m - meq + n1 + n1
|
||||
len_w = (3*n1+m)*(n1+1)+(n1-meq+1)*(mineq+2) + 2*mineq+(n1+mineq)*(n1-meq) \
|
||||
+ 2*meq + n1 + ((n+1)*n)//2 + 2*m + 3*n + 3*n1 + 1
|
||||
len_jw = mineq
|
||||
w = zeros(len_w)
|
||||
jw = zeros(len_jw)
|
||||
|
||||
# Decompose bounds into xl and xu
|
||||
if bounds is None or len(bounds) == 0:
|
||||
xl = np.empty(n, dtype=float)
|
||||
xu = np.empty(n, dtype=float)
|
||||
xl.fill(np.nan)
|
||||
xu.fill(np.nan)
|
||||
else:
|
||||
bnds = array([(_arr_to_scalar(l), _arr_to_scalar(u))
|
||||
for (l, u) in bounds], float)
|
||||
if bnds.shape[0] != n:
|
||||
raise IndexError('SLSQP Error: the length of bounds is not '
|
||||
'compatible with that of x0.')
|
||||
|
||||
with np.errstate(invalid='ignore'):
|
||||
bnderr = bnds[:, 0] > bnds[:, 1]
|
||||
|
||||
if bnderr.any():
|
||||
raise ValueError('SLSQP Error: lb > ub in bounds %s.' %
|
||||
', '.join(str(b) for b in bnderr))
|
||||
xl, xu = bnds[:, 0], bnds[:, 1]
|
||||
|
||||
# Mark infinite bounds with nans; the Fortran code understands this
|
||||
infbnd = ~isfinite(bnds)
|
||||
xl[infbnd[:, 0]] = np.nan
|
||||
xu[infbnd[:, 1]] = np.nan
|
||||
|
||||
# ScalarFunction provides function and gradient evaluation
|
||||
sf = _prepare_scalar_function(func, x, jac=jac, args=args, epsilon=eps,
|
||||
finite_diff_rel_step=finite_diff_rel_step,
|
||||
bounds=new_bounds)
|
||||
# gh11403 SLSQP sometimes exceeds bounds by 1 or 2 ULP, make sure this
|
||||
# doesn't get sent to the func/grad evaluator.
|
||||
wrapped_fun = _clip_x_for_func(sf.fun, new_bounds)
|
||||
wrapped_grad = _clip_x_for_func(sf.grad, new_bounds)
|
||||
|
||||
# Initialize the iteration counter and the mode value
|
||||
mode = array(0, int)
|
||||
acc = array(acc, float)
|
||||
majiter = array(iter, int)
|
||||
majiter_prev = 0
|
||||
|
||||
# Initialize internal SLSQP state variables
|
||||
alpha = array(0, float)
|
||||
f0 = array(0, float)
|
||||
gs = array(0, float)
|
||||
h1 = array(0, float)
|
||||
h2 = array(0, float)
|
||||
h3 = array(0, float)
|
||||
h4 = array(0, float)
|
||||
t = array(0, float)
|
||||
t0 = array(0, float)
|
||||
tol = array(0, float)
|
||||
iexact = array(0, int)
|
||||
incons = array(0, int)
|
||||
ireset = array(0, int)
|
||||
itermx = array(0, int)
|
||||
line = array(0, int)
|
||||
n1 = array(0, int)
|
||||
n2 = array(0, int)
|
||||
n3 = array(0, int)
|
||||
|
||||
# Print the header if iprint >= 2
|
||||
if iprint >= 2:
|
||||
print("%5s %5s %16s %16s" % ("NIT", "FC", "OBJFUN", "GNORM"))
|
||||
|
||||
# mode is zero on entry, so call objective, constraints and gradients
|
||||
# there should be no func evaluations here because it's cached from
|
||||
# ScalarFunction
|
||||
fx = wrapped_fun(x)
|
||||
g = append(wrapped_grad(x), 0.0)
|
||||
c = _eval_constraint(x, cons)
|
||||
a = _eval_con_normals(x, cons, la, n, m, meq, mieq)
|
||||
|
||||
while 1:
|
||||
# Call SLSQP
|
||||
slsqp(m, meq, x, xl, xu, fx, c, g, a, acc, majiter, mode, w, jw,
|
||||
alpha, f0, gs, h1, h2, h3, h4, t, t0, tol,
|
||||
iexact, incons, ireset, itermx, line,
|
||||
n1, n2, n3)
|
||||
|
||||
if mode == 1: # objective and constraint evaluation required
|
||||
fx = wrapped_fun(x)
|
||||
c = _eval_constraint(x, cons)
|
||||
|
||||
if mode == -1: # gradient evaluation required
|
||||
g = append(wrapped_grad(x), 0.0)
|
||||
a = _eval_con_normals(x, cons, la, n, m, meq, mieq)
|
||||
|
||||
if majiter > majiter_prev:
|
||||
# call callback if major iteration has incremented
|
||||
if callback is not None:
|
||||
callback(np.copy(x))
|
||||
|
||||
# Print the status of the current iterate if iprint >= 2
|
||||
if iprint >= 2:
|
||||
print("%5i %5i % 16.6E % 16.6E" % (majiter, sf.nfev,
|
||||
fx, linalg.norm(g)))
|
||||
|
||||
# If exit mode is not -1 or 1, slsqp has completed
|
||||
if abs(mode) != 1:
|
||||
break
|
||||
|
||||
majiter_prev = int(majiter)
|
||||
|
||||
# Optimization loop complete. Print status if requested
|
||||
if iprint >= 1:
|
||||
print(exit_modes[int(mode)] + " (Exit mode " + str(mode) + ')')
|
||||
print(" Current function value:", fx)
|
||||
print(" Iterations:", majiter)
|
||||
print(" Function evaluations:", sf.nfev)
|
||||
print(" Gradient evaluations:", sf.ngev)
|
||||
|
||||
return OptimizeResult(x=x, fun=fx, jac=g[:-1], nit=int(majiter),
|
||||
nfev=sf.nfev, njev=sf.ngev, status=int(mode),
|
||||
message=exit_modes[int(mode)], success=(mode == 0))
|
||||
|
||||
|
||||
def _eval_constraint(x, cons):
|
||||
# Compute constraints
|
||||
if cons['eq']:
|
||||
c_eq = concatenate([atleast_1d(con['fun'](x, *con['args']))
|
||||
for con in cons['eq']])
|
||||
else:
|
||||
c_eq = zeros(0)
|
||||
|
||||
if cons['ineq']:
|
||||
c_ieq = concatenate([atleast_1d(con['fun'](x, *con['args']))
|
||||
for con in cons['ineq']])
|
||||
else:
|
||||
c_ieq = zeros(0)
|
||||
|
||||
# Now combine c_eq and c_ieq into a single 1-D array
|
||||
c = concatenate((c_eq, c_ieq))
|
||||
return c
|
||||
|
||||
|
||||
def _eval_con_normals(x, cons, la, n, m, meq, mieq):
|
||||
# Compute the normals of the constraints
|
||||
if cons['eq']:
|
||||
a_eq = vstack([con['jac'](x, *con['args'])
|
||||
for con in cons['eq']])
|
||||
else: # no equality constraint
|
||||
a_eq = zeros((meq, n))
|
||||
|
||||
if cons['ineq']:
|
||||
a_ieq = vstack([con['jac'](x, *con['args'])
|
||||
for con in cons['ineq']])
|
||||
else: # no inequality constraint
|
||||
a_ieq = zeros((mieq, n))
|
||||
|
||||
# Now combine a_eq and a_ieq into a single a matrix
|
||||
if m == 0: # no constraints
|
||||
a = zeros((la, n))
|
||||
else:
|
||||
a = vstack((a_eq, a_ieq))
|
||||
a = concatenate((a, zeros([la, 1])), 1)
|
||||
|
||||
return a
|
||||
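The `approx_jacobian` docstring in this file says the Jacobian is approximated with forward differences and returned with shape ``(lenf, lenx)``. As a rough illustration of that idea only, here is a pure-Python sketch; `forward_jacobian` is a hypothetical helper, not part of SciPy, and it works on plain lists rather than NumPy arrays.

```python
import math


def _as_list(value):
    # Promote a scalar result to a one-element list so the output is 2-D.
    return list(value) if hasattr(value, "__iter__") else [value]


def forward_jacobian(func, x, epsilon):
    """Forward-difference Jacobian as a list of rows, shape (lenf, lenx)."""
    x = [float(xi) for xi in x]
    f0 = _as_list(func(x))
    jac = [[0.0] * len(x) for _ in f0]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += epsilon                 # perturb one coordinate at a time
        fj = _as_list(func(xp))
        for i in range(len(f0)):
            jac[i][j] = (fj[i] - f0[i]) / epsilon
    return jac


# Same default step as `_epsilon` above: sqrt of double-precision epsilon.
eps = math.sqrt(2.0 ** -52)

# f(x) = [x0 * x1, x0 + 2 * x1]; the exact Jacobian is [[x1, x0], [1, 2]].
J = forward_jacobian(lambda x: [x[0] * x[1], x[0] + 2.0 * x[1]],
                     [3.0, 4.0], eps)
```

Each column costs one extra function evaluation, so the full Jacobian needs ``lenx + 1`` calls. That is the usual trade-off of forward differences compared with supplying an analytic `fprime_eqcons` or `fprime_ieqcons`.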
260
venv/lib/python3.12/site-packages/scipy/optimize/_spectral.py
Normal file
@ -0,0 +1,260 @@
|
||||
"""
|
||||
Spectral Algorithm for Nonlinear Equations
|
||||
"""
|
||||
import collections
|
||||
|
||||
import numpy as np
|
||||
from scipy.optimize import OptimizeResult
|
||||
from scipy.optimize._optimize import _check_unknown_options
|
||||
from ._linesearch import _nonmonotone_line_search_cruz, _nonmonotone_line_search_cheng
|
||||
|
||||
class _NoConvergence(Exception):
|
||||
pass
|
||||
|
||||
|
||||
def _root_df_sane(func, x0, args=(), ftol=1e-8, fatol=1e-300, maxfev=1000,
|
||||
fnorm=None, callback=None, disp=False, M=10, eta_strategy=None,
|
||||
sigma_eps=1e-10, sigma_0=1.0, line_search='cruz', **unknown_options):
|
||||
r"""
|
||||
Solve nonlinear equation with the DF-SANE method
|
||||
|
||||
Options
|
||||
-------
|
||||
ftol : float, optional
|
||||
Relative norm tolerance.
|
||||
fatol : float, optional
|
||||
Absolute norm tolerance.
|
||||
Algorithm terminates when ``||func(x)|| < fatol + ftol ||func(x_0)||``.
|
||||
fnorm : callable, optional
|
||||
Norm to use in the convergence check. If None, 2-norm is used.
|
||||
maxfev : int, optional
|
||||
Maximum number of function evaluations.
|
||||
disp : bool, optional
|
||||
Whether to print convergence process to stdout.
|
||||
eta_strategy : callable, optional
|
||||
Choice of the ``eta_k`` parameter, which gives slack for growth
|
||||
of ``||F||**2``. Called as ``eta_k = eta_strategy(k, x, F)`` with
|
||||
`k` the iteration number, `x` the current iterate and `F` the current
|
||||
residual. Should satisfy ``eta_k > 0`` and ``sum(eta, k=0..inf) < inf``.
|
||||
Default: ``||F||**2 / (1 + k)**2``.
|
||||
sigma_eps : float, optional
|
||||
The spectral coefficient is constrained to ``sigma_eps < sigma < 1/sigma_eps``.
|
||||
Default: 1e-10
|
||||
sigma_0 : float, optional
|
||||
Initial spectral coefficient.
|
||||
Default: 1.0
|
||||
M : int, optional
|
||||
Number of iterates to include in the nonmonotonic line search.
|
||||
Default: 10
|
||||
line_search : {'cruz', 'cheng'}
|
||||
Type of line search to employ. 'cruz' is the original one defined in
|
||||
[Martinez & Raydan. Math. Comp. 75, 1429 (2006)], 'cheng' is
|
||||
a modified search defined in [Cheng & Li. IMA J. Numer. Anal. 29, 814 (2009)].
|
||||
Default: 'cruz'
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] "Spectral residual method without gradient information for solving
|
||||
large-scale nonlinear systems of equations." W. La Cruz,
|
||||
J.M. Martinez, M. Raydan. Math. Comp. **75**, 1429 (2006).
|
||||
.. [2] W. La Cruz, Opt. Meth. Software, 29, 24 (2014).
|
||||
.. [3] W. Cheng, D.-H. Li. IMA J. Numer. Anal. **29**, 814 (2009).
|
||||
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
|
||||
if line_search not in ('cheng', 'cruz'):
|
||||
raise ValueError(f"Invalid value {line_search!r} for 'line_search'")
|
||||
|
||||
nexp = 2
|
||||
|
||||
if eta_strategy is None:
|
||||
# Different choice from [1], as their eta is not invariant
|
||||
# vs. scaling of F.
|
||||
def eta_strategy(k, x, F):
|
||||
# Obtain squared 2-norm of the initial residual from the outer scope
|
||||
return f_0 / (1 + k)**2
|
||||
|
||||
if fnorm is None:
|
||||
def fnorm(F):
|
||||
# Obtain squared 2-norm of the current residual from the outer scope
|
||||
return f_k**(1.0/nexp)
|
||||
|
||||
def fmerit(F):
|
||||
return np.linalg.norm(F)**nexp
|
||||
|
||||
nfev = [0]
|
||||
f, x_k, x_shape, f_k, F_k, is_complex = _wrap_func(func, x0, fmerit,
|
||||
nfev, maxfev, args)
|
||||
|
||||
k = 0
|
||||
f_0 = f_k
|
||||
sigma_k = sigma_0
|
||||
|
||||
F_0_norm = fnorm(F_k)
|
||||
|
||||
# For the 'cruz' line search
|
||||
prev_fs = collections.deque([f_k], M)
|
||||
|
||||
# For the 'cheng' line search
|
||||
Q = 1.0
|
||||
C = f_0
|
||||
|
||||
converged = False
|
||||
message = "too many function evaluations required"
|
||||
|
||||
while True:
|
||||
F_k_norm = fnorm(F_k)
|
||||
|
||||
if disp:
|
||||
print("iter %d: ||F|| = %g, sigma = %g" % (k, F_k_norm, sigma_k))
|
||||
|
||||
if callback is not None:
|
||||
callback(x_k, F_k)
|
||||
|
||||
if F_k_norm < ftol * F_0_norm + fatol:
|
||||
# Converged!
|
||||
message = "successful convergence"
|
||||
converged = True
|
||||
break
|
||||
|
||||
# Control spectral parameter, from [2]
|
||||
if abs(sigma_k) > 1/sigma_eps:
|
||||
sigma_k = 1/sigma_eps * np.sign(sigma_k)
|
||||
elif abs(sigma_k) < sigma_eps:
|
||||
sigma_k = sigma_eps
|
||||
|
||||
# Line search direction
|
||||
d = -sigma_k * F_k
|
||||
|
||||
# Nonmonotone line search
|
||||
eta = eta_strategy(k, x_k, F_k)
|
||||
try:
|
||||
if line_search == 'cruz':
|
||||
alpha, xp, fp, Fp = _nonmonotone_line_search_cruz(f, x_k, d, prev_fs,
|
||||
eta=eta)
|
||||
elif line_search == 'cheng':
|
||||
alpha, xp, fp, Fp, C, Q = _nonmonotone_line_search_cheng(f, x_k, d, f_k,
|
||||
C, Q, eta=eta)
|
||||
except _NoConvergence:
|
||||
break
|
||||
|
||||
# Update spectral parameter
|
||||
s_k = xp - x_k
|
||||
y_k = Fp - F_k
|
||||
sigma_k = np.vdot(s_k, s_k) / np.vdot(s_k, y_k)
|
||||
|
||||
# Take step
|
||||
x_k = xp
|
||||
F_k = Fp
|
||||
f_k = fp
|
||||
|
||||
# Store function value
|
||||
if line_search == 'cruz':
|
||||
prev_fs.append(fp)
|
||||
|
||||
k += 1
|
||||
|
||||
x = _wrap_result(x_k, is_complex, shape=x_shape)
|
||||
F = _wrap_result(F_k, is_complex)
|
||||
|
||||
result = OptimizeResult(x=x, success=converged,
|
||||
message=message,
|
||||
fun=F, nfev=nfev[0], nit=k, method="df-sane")
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def _wrap_func(func, x0, fmerit, nfev_list, maxfev, args=()):
|
||||
"""
|
||||
Wrap a function and an initial value so that (i) complex values
|
||||
are mapped to reals, (ii) the value of the merit function
|
||||
fmerit(F) is computed at the same time, and (iii) the evaluation count
|
||||
is maintained and _NoConvergence is raised once it reaches maxfev.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable
|
||||
Function to wrap
|
||||
x0 : ndarray
|
||||
Initial value
|
||||
fmerit : callable
|
||||
Merit function fmerit(f) for computing merit value from residual.
|
||||
nfev_list : list
|
||||
List to store number of evaluations in. Should be [0] in the beginning.
|
||||
maxfev : int
|
||||
Maximum number of evaluations before _NoConvergence is raised.
|
||||
args : tuple
|
||||
Extra arguments to func
|
||||
|
||||
Returns
|
||||
-------
|
||||
wrap_func : callable
|
||||
Wrapped function, to be called as
|
||||
``f, F = wrap_func(x)``
|
||||
x0_wrap : ndarray of float
|
||||
Wrapped initial value; raveled to 1-D and complex
|
||||
values mapped to reals.
|
||||
x0_shape : tuple
|
||||
Shape of the initial value array
|
||||
f : float
|
||||
Merit function at F
|
||||
F : ndarray of float
|
||||
Residual at x0_wrap
|
||||
is_complex : bool
|
||||
Whether complex values were mapped to reals
|
||||
|
||||
"""
|
||||
x0 = np.asarray(x0)
|
||||
x0_shape = x0.shape
|
||||
F = np.asarray(func(x0, *args)).ravel()
|
||||
is_complex = np.iscomplexobj(x0) or np.iscomplexobj(F)
|
||||
x0 = x0.ravel()
|
||||
|
||||
nfev_list[0] = 1
|
||||
|
||||
if is_complex:
|
||||
def wrap_func(x):
|
||||
if nfev_list[0] >= maxfev:
|
||||
raise _NoConvergence()
|
||||
nfev_list[0] += 1
|
||||
z = _real2complex(x).reshape(x0_shape)
|
||||
v = np.asarray(func(z, *args)).ravel()
|
||||
F = _complex2real(v)
|
||||
f = fmerit(F)
|
||||
return f, F
|
||||
|
||||
x0 = _complex2real(x0)
|
||||
F = _complex2real(F)
|
||||
else:
|
||||
def wrap_func(x):
|
||||
if nfev_list[0] >= maxfev:
|
||||
raise _NoConvergence()
|
||||
nfev_list[0] += 1
|
||||
x = x.reshape(x0_shape)
|
||||
F = np.asarray(func(x, *args)).ravel()
|
||||
f = fmerit(F)
|
||||
return f, F
|
||||
|
||||
return wrap_func, x0, x0_shape, fmerit(F), F, is_complex
|
||||
|
||||
|
||||
def _wrap_result(result, is_complex, shape=None):
|
||||
"""
|
||||
Convert from real to complex and reshape result arrays.
|
||||
"""
|
||||
if is_complex:
|
||||
z = _real2complex(result)
|
||||
else:
|
||||
z = result
|
||||
if shape is not None:
|
||||
z = z.reshape(shape)
|
||||
return z
|
||||
|
||||
|
||||
def _real2complex(x):
|
||||
return np.ascontiguousarray(x, dtype=float).view(np.complex128)
|
||||
|
||||
|
||||
def _complex2real(z):
|
||||
return np.ascontiguousarray(z, dtype=complex).view(np.float64)
|
||||
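The iteration in `_root_df_sane` above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not SciPy's implementation: the name `toy_spectral_residual` is hypothetical, the line search just halves `alpha` instead of using the safeguarded interpolation in `_nonmonotone_line_search_cruz`, the constant `gamma` is an assumed value, and complex inputs and `maxfev` handling are left out.

```python
import math
from collections import deque


def toy_spectral_residual(F, x0, ftol=1e-8, max_iter=100, M=10,
                          gamma=1e-4, sigma_eps=1e-10):
    """Illustrative spectral residual iteration (hypothetical helper)."""
    x = [float(v) for v in x0]
    Fx = F(x)
    f = sum(v * v for v in Fx)            # merit value ||F(x)||**2
    f_0 = f
    sigma = 1.0
    prev_fs = deque([f], maxlen=M)        # last M merit values

    for k in range(max_iter):
        if math.sqrt(f) <= ftol * math.sqrt(f_0):
            return x, True
        eta = f_0 / (1 + k) ** 2          # summable slack for growth of ||F||**2
        d = [-sigma * v for v in Fx]      # search direction
        alpha = 1.0
        while True:
            # Nonmonotone acceptance bound built from the recent history.
            bound = max(prev_fs) + eta - gamma * alpha ** 2 * f
            # Try the step along +d, then along -d, before shrinking alpha.
            for sign in (1.0, -1.0):
                xp = [xi + sign * alpha * di for xi, di in zip(x, d)]
                Fp = F(xp)
                fp = sum(v * v for v in Fp)
                if fp <= bound:
                    break
            else:
                alpha *= 0.5              # assumed simple backtracking rule
                continue
            break

        # Spectral coefficient: <s, s> / <s, y>
        s = [a - b for a, b in zip(xp, x)]
        y = [a - b for a, b in zip(Fp, Fx)]
        sy = sum(a * b for a, b in zip(s, y))
        if sy != 0.0:
            sigma = sum(a * a for a in s) / sy
        # Keep |sigma| within [sigma_eps, 1/sigma_eps].
        if abs(sigma) > 1 / sigma_eps:
            sigma = math.copysign(1 / sigma_eps, sigma)
        elif abs(sigma) < sigma_eps:
            sigma = sigma_eps

        x, Fx, f = xp, Fp, fp
        prev_fs.append(f)
    return x, False
```

Calling `toy_spectral_residual(lambda x: [2*x[0] - 2, 4*x[1] - 4], [0.0, 0.0])` should return a point close to `[1.0, 1.0]` together with a success flag. In practice the documented entry point is `scipy.optimize.root(..., method='df-sane')`.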
430
venv/lib/python3.12/site-packages/scipy/optimize/_tnc.py
Normal file
@ -0,0 +1,430 @@
|
||||
# TNC Python interface
|
||||
# @(#) $Jeannot: tnc.py,v 1.11 2005/01/28 18:27:31 js Exp $
|
||||
|
||||
# Copyright (c) 2004-2005, Jean-Sebastien Roy (js@jeannot.org)
|
||||
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a
|
||||
# copy of this software and associated documentation files (the
|
||||
# "Software"), to deal in the Software without restriction, including
|
||||
# without limitation the rights to use, copy, modify, merge, publish,
|
||||
# distribute, sublicense, and/or sell copies of the Software, and to
|
||||
# permit persons to whom the Software is furnished to do so, subject to
|
||||
# the following conditions:
|
||||
|
||||
# The above copyright notice and this permission notice shall be included
|
||||
# in all copies or substantial portions of the Software.
|
||||
|
||||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
|
||||
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
||||
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
|
||||
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
|
||||
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
|
||||
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
|
||||
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
"""
|
||||
TNC: A Python interface to the TNC non-linear optimizer
|
||||
|
||||
TNC is a non-linear optimizer. To use it, you must provide a function to
|
||||
minimize. The function must take one argument: the list of coordinates where to
|
||||
evaluate the function; and it must return either a tuple, whose first element is the
|
||||
value of the function, and whose second element is the gradient of the function
|
||||
(as a list of values); or None, to abort the minimization.
|
||||
"""
|
||||
|
||||
from scipy.optimize import _moduleTNC as moduleTNC
|
||||
from ._optimize import (MemoizeJac, OptimizeResult, _check_unknown_options,
|
||||
_prepare_scalar_function)
|
||||
from ._constraints import old_bound_to_new
|
||||
from scipy._lib._array_api import atleast_nd, array_namespace
|
||||
|
||||
from numpy import inf, array, zeros
|
||||
|
||||
__all__ = ['fmin_tnc']
|
||||
|
||||
|
||||
MSG_NONE = 0 # No messages
|
||||
MSG_ITER = 1 # One line per iteration
|
||||
MSG_INFO = 2 # Informational messages
|
||||
MSG_VERS = 4 # Version info
|
||||
MSG_EXIT = 8 # Exit reasons
|
||||
MSG_ALL = MSG_ITER + MSG_INFO + MSG_VERS + MSG_EXIT
|
||||
|
||||
MSGS = {
|
||||
MSG_NONE: "No messages",
|
||||
MSG_ITER: "One line per iteration",
|
||||
MSG_INFO: "Informational messages",
|
||||
MSG_VERS: "Version info",
|
||||
MSG_EXIT: "Exit reasons",
|
||||
MSG_ALL: "All messages"
|
||||
}
|
||||
|
||||
INFEASIBLE = -1 # Infeasible (lower bound > upper bound)
|
||||
LOCALMINIMUM = 0 # Local minimum reached (|pg| ~= 0)
|
||||
FCONVERGED = 1 # Converged (|f_n-f_(n-1)| ~= 0)
|
||||
XCONVERGED = 2 # Converged (|x_n-x_(n-1)| ~= 0)
|
||||
MAXFUN = 3 # Max. number of function evaluations reached
|
||||
LSFAIL = 4 # Linear search failed
|
||||
CONSTANT = 5 # All lower bounds are equal to the upper bounds
|
||||
NOPROGRESS = 6 # Unable to progress
|
||||
USERABORT = 7 # User requested end of minimization
|
||||
|
||||
RCSTRINGS = {
|
||||
INFEASIBLE: "Infeasible (lower bound > upper bound)",
|
||||
LOCALMINIMUM: "Local minimum reached (|pg| ~= 0)",
|
||||
FCONVERGED: "Converged (|f_n-f_(n-1)| ~= 0)",
|
||||
XCONVERGED: "Converged (|x_n-x_(n-1)| ~= 0)",
|
||||
MAXFUN: "Max. number of function evaluations reached",
|
||||
LSFAIL: "Linear search failed",
|
||||
CONSTANT: "All lower bounds are equal to the upper bounds",
|
||||
NOPROGRESS: "Unable to progress",
|
||||
USERABORT: "User requested end of minimization"
|
||||
}
|
||||
|
||||
# Changes to interface made by Travis Oliphant, Apr. 2004 for inclusion in
|
||||
# SciPy
|
||||
|
||||
|
||||
def fmin_tnc(func, x0, fprime=None, args=(), approx_grad=0,
|
||||
bounds=None, epsilon=1e-8, scale=None, offset=None,
|
||||
messages=MSG_ALL, maxCGit=-1, maxfun=None, eta=-1,
|
||||
stepmx=0, accuracy=0, fmin=0, ftol=-1, xtol=-1, pgtol=-1,
|
||||
rescale=-1, disp=None, callback=None):
|
||||
"""
|
||||
Minimize a function with variables subject to bounds, using
|
||||
gradient information in a truncated Newton algorithm. This
|
||||
method wraps a C implementation of the algorithm.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
func : callable ``func(x, *args)``
|
||||
Function to minimize. Must do one of:
|
||||
|
||||
1. Return f and g, where f is the value of the function and g its
|
||||
gradient (a list of floats).
|
||||
|
||||
2. Return the function value but supply gradient function
|
||||
separately as `fprime`.
|
||||
|
||||
3. Return the function value and set ``approx_grad=True``.
|
||||
|
||||
If the function returns None, the minimization
|
||||
is aborted.
|
||||
x0 : array_like
|
||||
Initial estimate of minimum.
|
||||
fprime : callable ``fprime(x, *args)``, optional
|
||||
Gradient of `func`. If None, then either `func` must return the
|
||||
function value and the gradient (``f,g = func(x, *args)``)
|
||||
or `approx_grad` must be True.
|
||||
args : tuple, optional
|
||||
Arguments to pass to function.
|
||||
approx_grad : bool, optional
|
||||
If true, approximate the gradient numerically.
|
||||
bounds : list, optional
|
||||
(min, max) pairs for each element in x0, defining the
|
||||
bounds on that parameter. Use None or +/-inf for one of
|
||||
min or max when there is no bound in that direction.
|
||||
epsilon : float, optional
|
||||
Used if approx_grad is True. The stepsize in a finite
|
||||
difference approximation for fprime.
|
||||
scale : array_like, optional
|
||||
Scaling factors to apply to each variable. If None, the
|
||||
factors are up-low for interval bounded variables and
|
||||
1+|x| for the others. Defaults to None.
|
||||
offset : array_like, optional
|
||||
Value to subtract from each variable. If None, the
|
||||
offsets are (up+low)/2 for interval bounded variables
|
||||
and x for the others.
|
||||
messages : int, optional
|
||||
Bit mask used to select the messages displayed during
|
||||
minimization; values are defined in the MSGS dict. Defaults to
|
||||
MSG_ALL.
|
||||
disp : int, optional
|
||||
Integer interface to messages. 0 = no message, 5 = all messages
|
||||
maxCGit : int, optional
|
||||
Maximum number of hessian*vector evaluations per main
|
||||
iteration. If maxCGit == 0, the direction chosen is
|
||||
-gradient. If maxCGit < 0, maxCGit is set to
|
||||
max(1,min(50,n/2)). Defaults to -1.
|
||||
maxfun : int, optional
|
||||
Maximum number of function evaluations. If None, maxfun is
|
||||
set to max(100, 10*len(x0)). Defaults to None. Note that this function
|
||||
may violate the limit because of evaluating gradients by numerical
|
||||
differentiation.
|
||||
eta : float, optional
|
||||
Severity of the line search. If < 0 or > 1, set to 0.25.
|
||||
Defaults to -1.
|
||||
stepmx : float, optional
|
||||
Maximum step for the line search. May be increased during
|
||||
call. If too small, it will be set to 10.0. Defaults to 0.
|
||||
accuracy : float, optional
|
||||
Relative precision for finite difference calculations. If
|
||||
<= machine_precision, set to sqrt(machine_precision).
|
||||
Defaults to 0.
|
||||
fmin : float, optional
|
||||
Minimum function value estimate. Defaults to 0.
|
||||
ftol : float, optional
|
||||
Precision goal for the value of f in the stopping criterion.
|
||||
If ftol < 0.0, ftol is set to 0.0. Defaults to -1.
|
||||
xtol : float, optional
|
||||
Precision goal for the value of x in the stopping
|
||||
criterion (after applying x scaling factors). If xtol <
|
||||
0.0, xtol is set to sqrt(machine_precision). Defaults to
|
||||
-1.
|
||||
pgtol : float, optional
|
||||
Precision goal for the value of the projected gradient in
|
||||
the stopping criterion (after applying x scaling factors).
|
||||
If pgtol < 0.0, pgtol is set to 1e-2 * sqrt(accuracy).
|
||||
Setting it to 0.0 is not recommended. Defaults to -1.
|
||||
rescale : float, optional
|
||||
Scaling factor (in log10) used to trigger f value
|
||||
rescaling. If 0, rescale at each iteration. If a large
|
||||
value, never rescale. If < 0, rescale is set to 1.3.
|
||||
callback : callable, optional
|
||||
Called after each iteration, as callback(xk), where xk is the
|
||||
current parameter vector.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : ndarray
|
||||
The solution.
|
||||
nfeval : int
|
||||
The number of function evaluations.
|
||||
rc : int
|
||||
Return code, see below
|
||||
|
||||
See also
|
||||
--------
|
||||
minimize: Interface to minimization algorithms for multivariate
|
||||
functions. See the 'TNC' `method` in particular.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The underlying algorithm is truncated Newton, also called
|
||||
Newton Conjugate-Gradient. This method differs from
|
||||
scipy.optimize.fmin_ncg in that
|
||||
|
||||
1. it wraps a C implementation of the algorithm
|
||||
2. it allows each variable to be given an upper and lower bound.
|
||||
|
||||
The algorithm incorporates the bound constraints by determining
|
||||
the descent direction as in an unconstrained truncated Newton,
|
||||
but never taking a step-size large enough to leave the space
|
||||
of feasible x's. The algorithm keeps track of a set of
|
||||
currently active constraints, and ignores them when computing
|
||||
the minimum allowable step size. (The x's associated with the
|
||||
active constraint are kept fixed.) If the maximum allowable
|
||||
step size is zero then a new constraint is added. At the end
|
||||
of each iteration one of the constraints may be deemed no
|
||||
longer active and removed. A constraint is considered
|
||||
no longer active if it is currently active
|
||||
but the gradient for that variable points inward from the
|
||||
constraint. The specific constraint removed is the one
|
||||
associated with the variable of largest index whose
|
||||
constraint is no longer active.
|
||||
|
||||
Return codes are defined as follows::
|
||||
|
||||
-1 : Infeasible (lower bound > upper bound)
|
||||
0 : Local minimum reached (|pg| ~= 0)
|
||||
1 : Converged (|f_n-f_(n-1)| ~= 0)
|
||||
2 : Converged (|x_n-x_(n-1)| ~= 0)
|
||||
3 : Max. number of function evaluations reached
|
||||
4 : Linear search failed
|
||||
5 : All lower bounds are equal to the upper bounds
|
||||
6 : Unable to progress
|
||||
7 : User requested end of minimization
|
||||
|
||||
References
|
||||
----------
|
||||
Wright S., Nocedal J. (2006), 'Numerical Optimization'
|
||||
|
||||
Nash S.G. (1984), "Newton-Type Minimization Via the Lanczos Method",
|
||||
SIAM Journal on Numerical Analysis 21, pp. 770-778
|
||||
|
||||
"""
|
||||
# handle fprime/approx_grad
|
||||
if approx_grad:
|
||||
fun = func
|
||||
jac = None
|
||||
elif fprime is None:
|
||||
fun = MemoizeJac(func)
|
||||
jac = fun.derivative
|
||||
else:
|
||||
fun = func
|
||||
jac = fprime
|
||||
|
||||
if disp is not None: # disp takes precedence over messages
|
||||
mesg_num = disp
|
||||
else:
|
||||
mesg_num = {0:MSG_NONE, 1:MSG_ITER, 2:MSG_INFO, 3:MSG_VERS,
|
||||
4:MSG_EXIT, 5:MSG_ALL}.get(messages, MSG_ALL)
|
||||
# build options
|
||||
opts = {'eps': epsilon,
|
||||
'scale': scale,
|
||||
'offset': offset,
|
||||
'mesg_num': mesg_num,
|
||||
'maxCGit': maxCGit,
|
||||
'maxfun': maxfun,
|
||||
'eta': eta,
|
||||
'stepmx': stepmx,
|
||||
'accuracy': accuracy,
|
||||
'minfev': fmin,
|
||||
'ftol': ftol,
|
||||
'xtol': xtol,
|
||||
'gtol': pgtol,
|
||||
'rescale': rescale,
|
||||
'disp': False}
|
||||
|
||||
res = _minimize_tnc(fun, x0, args, jac, bounds, callback=callback, **opts)
|
||||
|
||||
return res['x'], res['nfev'], res['status']
|
||||
|
||||
|
||||
def _minimize_tnc(fun, x0, args=(), jac=None, bounds=None,
|
||||
eps=1e-8, scale=None, offset=None, mesg_num=None,
|
||||
maxCGit=-1, eta=-1, stepmx=0, accuracy=0,
|
||||
minfev=0, ftol=-1, xtol=-1, gtol=-1, rescale=-1, disp=False,
|
||||
callback=None, finite_diff_rel_step=None, maxfun=None,
|
||||
**unknown_options):
|
||||
"""
|
||||
Minimize a scalar function of one or more variables using a truncated
|
||||
Newton (TNC) algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
eps : float or ndarray
|
||||
If `jac is None` the absolute step size used for numerical
|
||||
approximation of the jacobian via forward differences.
|
||||
scale : list of floats
|
||||
Scaling factors to apply to each variable. If None, the
|
||||
factors are up-low for interval bounded variables and
|
||||
1+|x| for the others. Defaults to None.
|
||||
offset : float
|
||||
Value to subtract from each variable. If None, the
|
||||
offsets are (up+low)/2 for interval bounded variables
|
||||
and x for the others.
|
||||
disp : bool
|
||||
Set to True to print convergence messages.
|
||||
maxCGit : int
|
||||
Maximum number of hessian*vector evaluations per main
|
||||
iteration. If maxCGit == 0, the direction chosen is
|
||||
-gradient. If maxCGit < 0, maxCGit is set to
|
||||
max(1,min(50,n/2)). Defaults to -1.
|
||||
eta : float
|
||||
Severity of the line search. If < 0 or > 1, set to 0.25.
|
||||
Defaults to -1.
|
||||
stepmx : float
|
||||
Maximum step for the line search. May be increased during
|
||||
call. If too small, it will be set to 10.0. Defaults to 0.
|
||||
accuracy : float
|
||||
Relative precision for finite difference calculations. If
|
||||
<= machine_precision, set to sqrt(machine_precision).
|
||||
Defaults to 0.
|
||||
minfev : float
|
||||
Minimum function value estimate. Defaults to 0.
|
||||
ftol : float
|
||||
Precision goal for the value of f in the stopping criterion.
|
||||
If ftol < 0.0, ftol is set to 0.0. Defaults to -1.
|
||||
xtol : float
|
||||
Precision goal for the value of x in the stopping
|
||||
criterion (after applying x scaling factors). If xtol <
|
||||
0.0, xtol is set to sqrt(machine_precision). Defaults to
|
||||
-1.
|
||||
gtol : float
|
||||
Precision goal for the value of the projected gradient in
|
||||
the stopping criterion (after applying x scaling factors).
|
||||
If gtol < 0.0, gtol is set to 1e-2 * sqrt(accuracy).
|
||||
Setting it to 0.0 is not recommended. Defaults to -1.
|
||||
rescale : float
|
||||
Scaling factor (in log10) used to trigger f value
|
||||
rescaling. If 0, rescale at each iteration. If a large
|
||||
value, never rescale. If < 0, rescale is set to 1.3.
|
||||
finite_diff_rel_step : None or array_like, optional
|
||||
If `jac in ['2-point', '3-point', 'cs']` the relative step size to
|
||||
use for numerical approximation of the jacobian. The absolute step
|
||||
size is computed as ``h = rel_step * sign(x) * max(1, abs(x))``,
|
||||
possibly adjusted to fit into the bounds. For ``method='3-point'``
|
||||
the sign of `h` is ignored. If None (default) then step is selected
|
||||
automatically.
|
||||
maxfun : int
|
||||
Maximum number of function evaluations. If None, `maxfun` is
|
||||
set to max(100, 10*len(x0)). Defaults to None.
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
fmin = minfev
|
||||
pgtol = gtol
|
||||
|
||||
xp = array_namespace(x0)
|
||||
x0 = atleast_nd(x0, ndim=1, xp=xp)
|
||||
dtype = xp.float64
|
||||
if xp.isdtype(x0.dtype, "real floating"):
|
||||
dtype = x0.dtype
|
||||
x0 = xp.reshape(xp.astype(x0, dtype), -1)
|
||||
|
||||
n = len(x0)
|
||||
|
||||
if bounds is None:
|
||||
bounds = [(None,None)] * n
|
||||
if len(bounds) != n:
|
||||
raise ValueError('length of x0 != length of bounds')
|
||||
new_bounds = old_bound_to_new(bounds)
|
||||
|
||||
if mesg_num is not None:
|
||||
messages = {0:MSG_NONE, 1:MSG_ITER, 2:MSG_INFO, 3:MSG_VERS,
|
||||
4:MSG_EXIT, 5:MSG_ALL}.get(mesg_num, MSG_ALL)
|
||||
elif disp:
|
||||
messages = MSG_ALL
|
||||
else:
|
||||
messages = MSG_NONE
|
||||
|
||||
sf = _prepare_scalar_function(fun, x0, jac=jac, args=args, epsilon=eps,
|
||||
finite_diff_rel_step=finite_diff_rel_step,
|
||||
bounds=new_bounds)
|
||||
func_and_grad = sf.fun_and_grad
|
||||
|
||||
"""
|
||||
low, up : the bounds (lists of floats)
|
||||
if low is None, the lower bounds are removed.
|
||||
if up is None, the upper bounds are removed.
|
||||
low and up defaults to None
|
||||
"""
|
||||
low = zeros(n)
|
||||
up = zeros(n)
|
||||
for i in range(n):
|
||||
if bounds[i] is None:
|
||||
l, u = -inf, inf
|
||||
else:
|
||||
l,u = bounds[i]
|
||||
if l is None:
|
||||
low[i] = -inf
|
||||
else:
|
||||
low[i] = l
|
||||
if u is None:
|
||||
up[i] = inf
|
||||
else:
|
||||
up[i] = u
|
||||
|
||||
if scale is None:
|
||||
scale = array([])
|
||||
|
||||
if offset is None:
|
||||
offset = array([])
|
||||
|
||||
if maxfun is None:
|
||||
maxfun = max(100, 10*len(x0))
|
||||
|
||||
rc, nf, nit, x, funv, jacv = moduleTNC.tnc_minimize(
|
||||
func_and_grad, x0, low, up, scale,
|
||||
offset, messages, maxCGit, maxfun,
|
||||
eta, stepmx, accuracy, fmin, ftol,
|
||||
xtol, pgtol, rescale, callback
|
||||
)
|
||||
# the TNC documentation states: "On output, x, f and g may be very
|
||||
# slightly out of sync because of scaling". Therefore re-evaluate
|
||||
# func_and_grad so they are synced.
|
||||
funv, jacv = func_and_grad(x)
|
||||
|
||||
return OptimizeResult(x=x, fun=funv, jac=jacv, nfev=sf.nfev,
|
||||
nit=nit, status=rc, message=RCSTRINGS[rc],
|
||||
success=(-1 < rc < 3))
|
||||
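The `messages` bit mask and the integer `disp` levels used in `_tnc.py` above can be illustrated with a small standalone sketch. The constants are copied from the module; the helper names `disp_to_mask` and `describe_mask` are hypothetical and exist only for this example.

```python
# Message flags as defined in the module above.
MSG_NONE = 0   # No messages
MSG_ITER = 1   # One line per iteration
MSG_INFO = 2   # Informational messages
MSG_VERS = 4   # Version info
MSG_EXIT = 8   # Exit reasons
MSG_ALL = MSG_ITER + MSG_INFO + MSG_VERS + MSG_EXIT

_FLAG_LABELS = [
    (MSG_ITER, "One line per iteration"),
    (MSG_INFO, "Informational messages"),
    (MSG_VERS, "Version info"),
    (MSG_EXIT, "Exit reasons"),
]


def disp_to_mask(disp):
    """Map an integer level 0..5 to a message mask (hypothetical helper).

    Unknown levels fall back to MSG_ALL, mirroring the
    ``{...}.get(mesg_num, MSG_ALL)`` lookup in the source.
    """
    table = {0: MSG_NONE, 1: MSG_ITER, 2: MSG_INFO,
             3: MSG_VERS, 4: MSG_EXIT, 5: MSG_ALL}
    return table.get(disp, MSG_ALL)


def describe_mask(mask):
    """List the labels of every flag set in ``mask`` (hypothetical helper)."""
    return [label for bit, label in _FLAG_LABELS if mask & bit]
```

For instance, `describe_mask(MSG_ITER | MSG_EXIT)` lists the per-iteration and exit-reason labels, while `disp_to_mask(5)` yields the same value as `MSG_ALL`.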
@ -0,0 +1,12 @@
|
||||
from ._trlib import TRLIBQuadraticSubproblem
|
||||
|
||||
__all__ = ['TRLIBQuadraticSubproblem', 'get_trlib_quadratic_subproblem']
|
||||
|
||||
|
||||
def get_trlib_quadratic_subproblem(tol_rel_i=-2.0, tol_rel_b=-3.0, disp=False):
|
||||
def subproblem_factory(x, fun, jac, hess, hessp):
|
||||
return TRLIBQuadraticSubproblem(x, fun, jac, hess, hessp,
|
||||
tol_rel_i=tol_rel_i,
|
||||
tol_rel_b=tol_rel_b,
|
||||
disp=disp)
|
||||
return subproblem_factory
|
||||
Binary file not shown.
304
venv/lib/python3.12/site-packages/scipy/optimize/_trustregion.py
Normal file
@ -0,0 +1,304 @@
|
||||
"""Trust-region optimization."""
|
||||
import math
|
||||
import warnings
|
||||
|
||||
import numpy as np
|
||||
import scipy.linalg
|
||||
from ._optimize import (_check_unknown_options, _status_message,
|
||||
OptimizeResult, _prepare_scalar_function,
|
||||
_call_callback_maybe_halt)
|
||||
from scipy.optimize._hessian_update_strategy import HessianUpdateStrategy
|
||||
from scipy.optimize._differentiable_functions import FD_METHODS
|
||||
__all__ = []
|
||||
|
||||
|
||||
def _wrap_function(function, args):
|
||||
# wraps a minimizer function to count number of evaluations
|
||||
# and to easily provide an args kwd.
|
||||
ncalls = [0]
|
||||
if function is None:
|
||||
return ncalls, None
|
||||
|
||||
def function_wrapper(x, *wrapper_args):
|
||||
ncalls[0] += 1
|
||||
# A copy of x is sent to the user function (gh13740)
|
||||
return function(np.copy(x), *(wrapper_args + args))
|
||||
|
||||
return ncalls, function_wrapper
|
||||
|
||||
|
||||
class BaseQuadraticSubproblem:
|
||||
"""
|
||||
Base/abstract class defining the quadratic model for trust-region
|
||||
minimization. Child classes must implement the ``solve`` method.
|
||||
|
||||
Values of the objective function, Jacobian and Hessian (if provided) at
|
||||
the current iterate ``x`` are evaluated on demand and then stored as
|
||||
attributes ``fun``, ``jac``, ``hess``.
|
||||
"""
|
||||
|
||||
def __init__(self, x, fun, jac, hess=None, hessp=None):
|
||||
self._x = x
|
||||
self._f = None
|
||||
self._g = None
|
||||
self._h = None
|
||||
self._g_mag = None
|
||||
self._cauchy_point = None
|
||||
self._newton_point = None
|
||||
self._fun = fun
|
||||
self._jac = jac
|
||||
self._hess = hess
|
||||
self._hessp = hessp
|
||||
|
||||
def __call__(self, p):
|
||||
return self.fun + np.dot(self.jac, p) + 0.5 * np.dot(p, self.hessp(p))
|
||||
|
||||
@property
|
||||
def fun(self):
|
||||
"""Value of objective function at current iteration."""
|
||||
if self._f is None:
|
||||
self._f = self._fun(self._x)
|
||||
return self._f
|
||||
|
||||
@property
|
||||
def jac(self):
|
||||
"""Value of Jacobian of objective function at current iteration."""
|
||||
if self._g is None:
|
||||
self._g = self._jac(self._x)
|
||||
return self._g
|
||||
|
||||
@property
|
||||
def hess(self):
|
||||
"""Value of Hessian of objective function at current iteration."""
|
||||
if self._h is None:
|
||||
self._h = self._hess(self._x)
|
||||
return self._h
|
||||
|
||||
def hessp(self, p):
|
||||
if self._hessp is not None:
|
||||
return self._hessp(self._x, p)
|
||||
else:
|
||||
return np.dot(self.hess, p)
|
||||
|
||||
@property
|
||||
def jac_mag(self):
|
||||
"""Magnitude of jacobian of objective function at current iteration."""
|
||||
if self._g_mag is None:
|
||||
self._g_mag = scipy.linalg.norm(self.jac)
|
||||
return self._g_mag
|
||||
|
||||
def get_boundaries_intersections(self, z, d, trust_radius):
|
||||
"""
|
||||
Solve the scalar quadratic equation ``||z + t d|| == trust_radius``.
|
||||
This is like a line-sphere intersection.
|
||||
Return the two values of t, sorted from low to high.
|
||||
"""
|
||||
a = np.dot(d, d)
|
||||
b = 2 * np.dot(z, d)
|
||||
c = np.dot(z, z) - trust_radius**2
|
||||
sqrt_discriminant = math.sqrt(b*b - 4*a*c)
|
||||
|
||||
# The following calculation is mathematically
|
||||
# equivalent to:
|
||||
# ta = (-b - sqrt_discriminant) / (2*a)
|
||||
# tb = (-b + sqrt_discriminant) / (2*a)
|
||||
# but produces smaller round-off errors.
|
||||
# See Matrix Computations p. 97
|
||||
# for a better justification.
|
||||
aux = b + math.copysign(sqrt_discriminant, b)
|
||||
ta = -aux / (2*a)
|
||||
tb = -2*c / aux
|
||||
return sorted([ta, tb])
|
||||
|
||||
def solve(self, trust_radius):
|
||||
raise NotImplementedError('The solve method should be implemented by '
|
||||
'the child class')
|
||||
|
||||
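The round-off argument behind `get_boundaries_intersections` can be shown in a standalone sketch using only the standard library. The function name `line_sphere_intersections` and the two guard clauses are assumptions added for this illustration; the method in the class above has no such guards.

```python
import math


def line_sphere_intersections(z, d, trust_radius):
    """Solve ||z + t*d|| == trust_radius for t (illustrative sketch)."""
    a = sum(di * di for di in d)
    b = 2 * sum(zi * di for zi, di in zip(z, d))
    c = sum(zi * zi for zi in z) - trust_radius ** 2
    if a == 0:
        raise ValueError("direction d must be nonzero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        raise ValueError("the line does not intersect the sphere")
    sqrt_discriminant = math.sqrt(discriminant)

    # Adding two quantities of the same sign avoids the cancellation that
    # the textbook formula (-b +/- sqrt_discriminant) / (2*a) can suffer.
    aux = b + math.copysign(sqrt_discriminant, b)
    if aux == 0:
        # b == 0 and discriminant == 0: double root at t = 0.
        return [0.0, 0.0]
    ta = -aux / (2 * a)
    tb = -2 * c / aux      # second root from the product of roots, c / a
    return sorted([ta, tb])
```

With `z = (1, 0)`, `d = (1, 0)` and `trust_radius = 3`, the two values of `t` are -4 and 2, since `1 - 4 = -3` and `1 + 2 = 3` both lie on the sphere of radius 3.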
|
||||
def _minimize_trust_region(fun, x0, args=(), jac=None, hess=None, hessp=None,
|
||||
subproblem=None, initial_trust_radius=1.0,
|
||||
max_trust_radius=1000.0, eta=0.15, gtol=1e-4,
|
||||
maxiter=None, disp=False, return_all=False,
|
||||
callback=None, inexact=True, **unknown_options):
|
||||
"""
|
||||
Minimization of scalar function of one or more variables using a
|
||||
trust-region algorithm.
|
||||
|
||||
Options for the trust-region algorithm are:
|
||||
initial_trust_radius : float
|
||||
Initial trust radius.
|
||||
max_trust_radius : float
|
||||
Never propose steps that are longer than this value.
|
||||
eta : float
|
||||
Trust region related acceptance stringency for proposed steps.
|
||||
gtol : float
|
||||
Gradient norm must be less than `gtol`
|
||||
before successful termination.
|
||||
maxiter : int
|
||||
Maximum number of iterations to perform.
|
||||
disp : bool
|
||||
If True, print convergence message.
|
||||
inexact : bool
|
||||
Accuracy to solve subproblems. If True, requires fewer nonlinear
|
||||
iterations, but more vector products. Only effective for method
|
||||
trust-krylov.
|
||||
|
||||
This function is called by the `minimize` function.
|
||||
It is not supposed to be called directly.
|
||||
"""
|
||||
_check_unknown_options(unknown_options)
|
||||
|
||||
if jac is None:
|
||||
raise ValueError('Jacobian is currently required for trust-region '
|
||||
'methods')
|
||||
if hess is None and hessp is None:
|
||||
raise ValueError('Either the Hessian or the Hessian-vector product '
|
||||
'is currently required for trust-region methods')
|
||||
if subproblem is None:
|
||||
raise ValueError('A subproblem solving strategy is required for '
|
||||
'trust-region methods')
|
||||
if not (0 <= eta < 0.25):
|
||||
raise Exception('invalid acceptance stringency')
|
||||
if max_trust_radius <= 0:
|
||||
raise Exception('the max trust radius must be positive')
|
||||
if initial_trust_radius <= 0:
|
||||
raise ValueError('the initial trust radius must be positive')
|
||||
if initial_trust_radius >= max_trust_radius:
|
||||
raise ValueError('the initial trust radius must be less than the '
|
||||
'max trust radius')
|
||||
|
||||
# force the initial guess into a nice format
|
||||
x0 = np.asarray(x0).flatten()
|
||||
|
||||
# A ScalarFunction representing the problem. This caches calls to fun, jac,
|
||||
# hess.
|
||||
sf = _prepare_scalar_function(fun, x0, jac=jac, hess=hess, args=args)
|
||||
fun = sf.fun
|
||||
jac = sf.grad
|
||||
if callable(hess):
|
||||
hess = sf.hess
|
||||
elif callable(hessp):
|
||||
# this elif statement must come before examining whether hess
|
||||
# is estimated by FD methods or a HessianUpdateStrategy
|
||||
pass
|
||||
elif (hess in FD_METHODS or isinstance(hess, HessianUpdateStrategy)):
|
||||
# If the Hessian is being estimated by finite differences or a
|
||||
# Hessian update strategy then ScalarFunction.hess returns a
|
||||
# LinearOperator or a HessianUpdateStrategy. This enables the
|
||||
# calculation/creation of a hessp. BUT you only want to do this
|
||||
# if the user *hasn't* provided a callable(hessp) function.
|
||||
hess = None
|
||||
|
||||
def hessp(x, p, *args):
|
||||
return sf.hess(x).dot(p)
|
||||
else:
|
||||
raise ValueError('Either the Hessian or the Hessian-vector product '
|
||||
'is currently required for trust-region methods')
|
||||
|
||||
# ScalarFunction doesn't represent hessp
|
||||
nhessp, hessp = _wrap_function(hessp, args)
|
||||
|
||||
# limit the number of iterations
|
||||
if maxiter is None:
|
||||
maxiter = len(x0)*200
|
||||
|
||||
# init the search status
|
||||
warnflag = 0
|
||||
|
||||
# initialize the search
|
||||
trust_radius = initial_trust_radius
|
||||
x = x0
|
||||
if return_all:
|
||||
allvecs = [x]
|
||||
m = subproblem(x, fun, jac, hess, hessp)
|
||||
k = 0
|
||||
|
||||
# search for the function min
|
||||
# do not even start if the gradient is small enough
|
||||
while m.jac_mag >= gtol:
|
||||
|
||||
# Solve the sub-problem.
|
||||
# This gives us the proposed step relative to the current position
|
||||
# and it tells us whether the proposed step
|
||||
# has reached the trust region boundary or not.
|
||||
try:
|
||||
p, hits_boundary = m.solve(trust_radius)
|
||||
except np.linalg.LinAlgError:
|
||||
warnflag = 3
|
||||
break
|
||||
|
||||
# calculate the predicted value at the proposed point
|
||||
predicted_value = m(p)
|
||||
|
||||
# define the local approximation at the proposed point
|
||||
x_proposed = x + p
|
||||
m_proposed = subproblem(x_proposed, fun, jac, hess, hessp)
|
||||
|
||||
# evaluate the ratio defined in equation (4.4)
|
||||
actual_reduction = m.fun - m_proposed.fun
|
||||
predicted_reduction = m.fun - predicted_value
|
||||
if predicted_reduction <= 0:
|
||||
warnflag = 2
|
||||
break
|
||||
rho = actual_reduction / predicted_reduction
|
||||
|
||||
# update the trust radius according to the actual/predicted ratio
|
||||
if rho < 0.25:
|
||||
trust_radius *= 0.25
|
||||
elif rho > 0.75 and hits_boundary:
|
||||
trust_radius = min(2*trust_radius, max_trust_radius)
|
||||
|
||||
# if the ratio is high enough then accept the proposed step
|
||||
if rho > eta:
|
||||
x = x_proposed
|
||||
m = m_proposed
|
||||
|
||||
# append the best guess, call back, increment the iteration count
|
||||
if return_all:
|
||||
allvecs.append(np.copy(x))
|
||||
k += 1
|
||||
|
||||
intermediate_result = OptimizeResult(x=x, fun=m.fun)
|
||||
if _call_callback_maybe_halt(callback, intermediate_result):
|
||||
break
|
||||
|
||||
# check if the gradient is small enough to stop
|
||||
if m.jac_mag < gtol:
|
||||
warnflag = 0
|
||||
break
|
||||
|
||||
# check if we have looked at enough iterations
|
||||
if k >= maxiter:
|
||||
warnflag = 1
|
||||
break
|
||||
|
||||
# print some stuff if requested
|
||||
status_messages = (
|
||||
_status_message['success'],
|
||||
_status_message['maxiter'],
|
||||
'A bad approximation caused failure to predict improvement.',
|
||||
'A linalg error occurred, such as a non-psd Hessian.',
|
||||
)
|
||||
if disp:
|
||||
if warnflag == 0:
|
||||
print(status_messages[warnflag])
|
||||
else:
|
||||
warnings.warn(status_messages[warnflag], RuntimeWarning, stacklevel=3)
|
||||
print(" Current function value: %f" % m.fun)
|
||||
print(" Iterations: %d" % k)
|
||||
print(" Function evaluations: %d" % sf.nfev)
|
||||
print(" Gradient evaluations: %d" % sf.ngev)
|
||||
print(" Hessian evaluations: %d" % (sf.nhev + nhessp[0]))
|
||||
|
||||
result = OptimizeResult(x=x, success=(warnflag == 0), status=warnflag,
|
||||
fun=m.fun, jac=m.jac, nfev=sf.nfev, njev=sf.ngev,
|
||||
nhev=sf.nhev + nhessp[0], nit=k,
|
||||
message=status_messages[warnflag])
|
||||
|
||||
if hess is not None:
|
||||
result['hess'] = m.hess
|
||||
|
||||
if return_all:
|
||||
result['allvecs'] = allvecs
|
||||
|
||||
return result
|
||||
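The radius update inside the loop above can be isolated as a small pure function. What follows is only a sketch: `update_trust_radius` is a hypothetical helper name, not part of SciPy, and it mirrors just the branch on `rho` and `hits_boundary` shown in the listing.

```python
def update_trust_radius(rho, trust_radius, hits_boundary, max_trust_radius):
    """Return the next trust radius given the actual/predicted ratio rho.

    Hypothetical helper for illustration; the thresholds 0.25 and 0.75 and
    the factors 0.25 and 2 are copied from the loop in the listing.
    """
    if rho < 0.25:
        # Poor agreement between model and function: shrink the region.
        return 0.25 * trust_radius
    if rho > 0.75 and hits_boundary:
        # Good agreement and the step was limited by the boundary: grow,
        # but never beyond the maximum radius.
        return min(2 * trust_radius, max_trust_radius)
    # Otherwise keep the radius unchanged.
    return trust_radius
```

Note that acceptance of the step is a separate decision in the listing (`rho > eta`), so a step can be accepted while the radius still shrinks.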
@ -0,0 +1,6 @@
|
||||
"""This module contains the equality constrained SQP solver."""
|
||||
|
||||
|
||||
from .minimize_trustregion_constr import _minimize_trustregion_constr
|
||||
|
||||
__all__ = ['_minimize_trustregion_constr']
|
||||
@ -0,0 +1,390 @@
|
||||
import numpy as np
|
||||
import scipy.sparse as sps
|
||||
|
||||
|
||||
class CanonicalConstraint:
|
||||
"""Canonical constraint to use with trust-constr algorithm.
|
||||
|
||||
It represents the set of constraints of the form::
|
||||
|
||||
f_eq(x) = 0
|
||||
f_ineq(x) <= 0
|
||||
|
||||
where ``f_eq`` and ``f_ineq`` are evaluated by a single function, see
|
||||
below.
|
||||
|
||||
The class is supposed to be instantiated by factory methods, which
|
||||
should prepare the parameters listed below.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
n_eq, n_ineq : int
|
||||
Number of equality and inequality constraints respectively.
|
||||
fun : callable
|
||||
Function defining the constraints. The signature is
|
||||
``fun(x) -> c_eq, c_ineq``, where ``c_eq`` is ndarray with `n_eq`
|
||||
components and ``c_ineq`` is ndarray with `n_ineq` components.
|
||||
jac : callable
|
||||
Function to evaluate the Jacobian of the constraint. The signature
|
||||
is ``jac(x) -> J_eq, J_ineq``, where ``J_eq`` and ``J_ineq`` are
|
||||
either ndarray or csr_matrix of shapes (n_eq, n) and (n_ineq, n),
|
||||
respectively.
|
||||
hess : callable
|
||||
Function to evaluate the Hessian of the constraints multiplied
|
||||
by Lagrange multipliers, that is
|
||||
``dot(f_eq, v_eq) + dot(f_ineq, v_ineq)``. The signature is
|
||||
``hess(x, v_eq, v_ineq) -> H``, where ``H`` has an implied
|
||||
shape (n, n) and provides a matrix-vector product operation
|
||||
``H.dot(p)``.
|
||||
keep_feasible : ndarray, shape (n_ineq,)
|
||||
Mask indicating which inequality constraints should be kept feasible.
|
||||
"""
|
||||
def __init__(self, n_eq, n_ineq, fun, jac, hess, keep_feasible):
|
||||
self.n_eq = n_eq
|
||||
self.n_ineq = n_ineq
|
||||
self.fun = fun
|
||||
self.jac = jac
|
||||
self.hess = hess
|
||||
self.keep_feasible = keep_feasible
|
||||
|
||||
@classmethod
|
||||
def from_PreparedConstraint(cls, constraint):
|
||||
"""Create an instance from a `PreparedConstraint` object."""
|
||||
lb, ub = constraint.bounds
|
||||
cfun = constraint.fun
|
||||
keep_feasible = constraint.keep_feasible
|
||||
|
||||
if np.all(lb == -np.inf) and np.all(ub == np.inf):
|
||||
return cls.empty(cfun.n)
|
||||
elif np.all(lb == ub):
|
||||
return cls._equal_to_canonical(cfun, lb)
|
||||
elif np.all(lb == -np.inf):
|
||||
return cls._less_to_canonical(cfun, ub, keep_feasible)
|
||||
elif np.all(ub == np.inf):
|
||||
return cls._greater_to_canonical(cfun, lb, keep_feasible)
|
||||
else:
|
||||
return cls._interval_to_canonical(cfun, lb, ub, keep_feasible)
|
||||
|
||||
@classmethod
|
||||
def empty(cls, n):
|
||||
"""Create an "empty" instance.
|
||||
|
||||
This "empty" instance is required to allow working with unconstrained
|
||||
problems as if they have some constraints.
|
||||
"""
|
||||
empty_fun = np.empty(0)
|
||||
empty_jac = np.empty((0, n))
|
||||
empty_hess = sps.csr_matrix((n, n))
|
||||
|
||||
def fun(x):
|
||||
return empty_fun, empty_fun
|
||||
|
||||
def jac(x):
|
||||
return empty_jac, empty_jac
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
return empty_hess
|
||||
|
||||
return cls(0, 0, fun, jac, hess, np.empty(0, dtype=np.bool_))
|
||||
|
||||
@classmethod
|
||||
def concatenate(cls, canonical_constraints, sparse_jacobian):
|
||||
"""Concatenate multiple `CanonicalConstraint` into one.
|
||||
|
||||
`sparse_jacobian` (bool) determines the Jacobian format of the
|
||||
concatenated constraint. Note that items in `canonical_constraints`
|
||||
must have their Jacobians in the same format.
|
||||
"""
|
||||
def fun(x):
|
||||
if canonical_constraints:
|
||||
eq_all, ineq_all = zip(
|
||||
*[c.fun(x) for c in canonical_constraints])
|
||||
else:
|
||||
eq_all, ineq_all = [], []
|
||||
|
||||
return np.hstack(eq_all), np.hstack(ineq_all)
|
||||
|
||||
if sparse_jacobian:
|
||||
vstack = sps.vstack
|
||||
else:
|
||||
vstack = np.vstack
|
||||
|
||||
def jac(x):
|
||||
if canonical_constraints:
|
||||
eq_all, ineq_all = zip(
|
||||
*[c.jac(x) for c in canonical_constraints])
|
||||
else:
|
||||
eq_all, ineq_all = [], []
|
||||
|
||||
return vstack(eq_all), vstack(ineq_all)
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
hess_all = []
|
||||
index_eq = 0
|
||||
index_ineq = 0
|
||||
for c in canonical_constraints:
|
||||
vc_eq = v_eq[index_eq:index_eq + c.n_eq]
|
||||
vc_ineq = v_ineq[index_ineq:index_ineq + c.n_ineq]
|
||||
hess_all.append(c.hess(x, vc_eq, vc_ineq))
|
||||
index_eq += c.n_eq
|
||||
index_ineq += c.n_ineq
|
||||
|
||||
def matvec(p):
|
||||
result = np.zeros_like(p)
|
||||
for h in hess_all:
|
||||
result += h.dot(p)
|
||||
return result
|
||||
|
||||
n = x.shape[0]
|
||||
return sps.linalg.LinearOperator((n, n), matvec, dtype=float)
|
||||
|
||||
n_eq = sum(c.n_eq for c in canonical_constraints)
|
||||
n_ineq = sum(c.n_ineq for c in canonical_constraints)
|
||||
keep_feasible = np.hstack([c.keep_feasible for c in
|
||||
canonical_constraints])
|
||||
|
||||
return cls(n_eq, n_ineq, fun, jac, hess, keep_feasible)
|
||||
|
||||
@classmethod
|
||||
def _equal_to_canonical(cls, cfun, value):
|
||||
empty_fun = np.empty(0)
|
||||
n = cfun.n
|
||||
|
||||
n_eq = value.shape[0]
|
||||
n_ineq = 0
|
||||
keep_feasible = np.empty(0, dtype=bool)
|
||||
|
||||
if cfun.sparse_jacobian:
|
||||
empty_jac = sps.csr_matrix((0, n))
|
||||
else:
|
||||
empty_jac = np.empty((0, n))
|
||||
|
||||
def fun(x):
|
||||
return cfun.fun(x) - value, empty_fun
|
||||
|
||||
def jac(x):
|
||||
return cfun.jac(x), empty_jac
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
return cfun.hess(x, v_eq)
|
||||
|
||||
return cls(n_eq, n_ineq, fun, jac, hess, keep_feasible)
|
||||
|
||||
@classmethod
|
||||
def _less_to_canonical(cls, cfun, ub, keep_feasible):
|
||||
empty_fun = np.empty(0)
|
||||
n = cfun.n
|
||||
if cfun.sparse_jacobian:
|
||||
empty_jac = sps.csr_matrix((0, n))
|
||||
else:
|
||||
empty_jac = np.empty((0, n))
|
||||
|
||||
finite_ub = ub < np.inf
|
||||
n_eq = 0
|
||||
n_ineq = np.sum(finite_ub)
|
||||
|
||||
if np.all(finite_ub):
|
||||
def fun(x):
|
||||
return empty_fun, cfun.fun(x) - ub
|
||||
|
||||
def jac(x):
|
||||
return empty_jac, cfun.jac(x)
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
return cfun.hess(x, v_ineq)
|
||||
else:
|
||||
finite_ub = np.nonzero(finite_ub)[0]
|
||||
keep_feasible = keep_feasible[finite_ub]
|
||||
ub = ub[finite_ub]
|
||||
|
||||
def fun(x):
|
||||
return empty_fun, cfun.fun(x)[finite_ub] - ub
|
||||
|
||||
def jac(x):
|
||||
return empty_jac, cfun.jac(x)[finite_ub]
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
v = np.zeros(cfun.m)
|
||||
v[finite_ub] = v_ineq
|
||||
return cfun.hess(x, v)
|
||||
|
||||
return cls(n_eq, n_ineq, fun, jac, hess, keep_feasible)
|
||||
|
||||
@classmethod
|
||||
def _greater_to_canonical(cls, cfun, lb, keep_feasible):
|
||||
empty_fun = np.empty(0)
|
||||
n = cfun.n
|
||||
if cfun.sparse_jacobian:
|
||||
empty_jac = sps.csr_matrix((0, n))
|
||||
else:
|
||||
empty_jac = np.empty((0, n))
|
||||
|
||||
finite_lb = lb > -np.inf
|
||||
n_eq = 0
|
||||
n_ineq = np.sum(finite_lb)
|
||||
|
||||
if np.all(finite_lb):
|
||||
def fun(x):
|
||||
return empty_fun, lb - cfun.fun(x)
|
||||
|
||||
def jac(x):
|
||||
return empty_jac, -cfun.jac(x)
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
return cfun.hess(x, -v_ineq)
|
||||
else:
|
||||
finite_lb = np.nonzero(finite_lb)[0]
|
||||
keep_feasible = keep_feasible[finite_lb]
|
||||
lb = lb[finite_lb]
|
||||
|
||||
def fun(x):
|
||||
return empty_fun, lb - cfun.fun(x)[finite_lb]
|
||||
|
||||
def jac(x):
|
||||
return empty_jac, -cfun.jac(x)[finite_lb]
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
v = np.zeros(cfun.m)
|
||||
v[finite_lb] = -v_ineq
|
||||
return cfun.hess(x, v)
|
||||
|
||||
return cls(n_eq, n_ineq, fun, jac, hess, keep_feasible)
|
||||
|
||||
@classmethod
|
||||
def _interval_to_canonical(cls, cfun, lb, ub, keep_feasible):
|
||||
lb_inf = lb == -np.inf
|
||||
ub_inf = ub == np.inf
|
||||
equal = lb == ub
|
||||
less = lb_inf & ~ub_inf
|
||||
greater = ub_inf & ~lb_inf
|
||||
interval = ~equal & ~lb_inf & ~ub_inf
|
||||
|
||||
equal = np.nonzero(equal)[0]
|
||||
less = np.nonzero(less)[0]
|
||||
greater = np.nonzero(greater)[0]
|
||||
interval = np.nonzero(interval)[0]
|
||||
n_less = less.shape[0]
|
||||
n_greater = greater.shape[0]
|
||||
n_interval = interval.shape[0]
|
||||
n_ineq = n_less + n_greater + 2 * n_interval
|
||||
n_eq = equal.shape[0]
|
||||
|
||||
keep_feasible = np.hstack((keep_feasible[less],
|
||||
keep_feasible[greater],
|
||||
keep_feasible[interval],
|
||||
keep_feasible[interval]))
|
||||
|
||||
def fun(x):
|
||||
f = cfun.fun(x)
|
||||
eq = f[equal] - lb[equal]
|
||||
le = f[less] - ub[less]
|
||||
ge = lb[greater] - f[greater]
|
||||
il = f[interval] - ub[interval]
|
||||
ig = lb[interval] - f[interval]
|
||||
return eq, np.hstack((le, ge, il, ig))
|
||||
|
||||
def jac(x):
|
||||
J = cfun.jac(x)
|
||||
eq = J[equal]
|
||||
le = J[less]
|
||||
ge = -J[greater]
|
||||
il = J[interval]
|
||||
ig = -il
|
||||
if sps.issparse(J):
|
||||
ineq = sps.vstack((le, ge, il, ig))
|
||||
else:
|
||||
ineq = np.vstack((le, ge, il, ig))
|
||||
return eq, ineq
|
||||
|
||||
def hess(x, v_eq, v_ineq):
|
||||
n_start = 0
|
||||
v_l = v_ineq[n_start:n_start + n_less]
|
||||
n_start += n_less
|
||||
v_g = v_ineq[n_start:n_start + n_greater]
|
||||
n_start += n_greater
|
||||
v_il = v_ineq[n_start:n_start + n_interval]
|
||||
n_start += n_interval
|
||||
v_ig = v_ineq[n_start:n_start + n_interval]
|
||||
|
||||
v = np.zeros_like(lb)
|
||||
v[equal] = v_eq
|
||||
v[less] = v_l
|
||||
v[greater] = -v_g
|
||||
v[interval] = v_il - v_ig
|
||||
|
||||
return cfun.hess(x, v)
|
||||
|
||||
return cls(n_eq, n_ineq, fun, jac, hess, keep_feasible)
|
||||
|
||||
|
||||
def initial_constraints_as_canonical(n, prepared_constraints, sparse_jacobian):
|
||||
"""Convert initial values of the constraints to the canonical format.
|
||||
|
||||
The purpose is to avoid one additional call to the constraints at the initial
|
||||
point. It takes saved values in `PreparedConstraint`, modifies and
|
||||
concatenates them to the canonical constraint format.
|
||||
"""
|
||||
c_eq = []
|
||||
c_ineq = []
|
||||
J_eq = []
|
||||
J_ineq = []
|
||||
|
||||
for c in prepared_constraints:
|
||||
f = c.fun.f
|
||||
J = c.fun.J
|
||||
lb, ub = c.bounds
|
||||
if np.all(lb == ub):
|
||||
c_eq.append(f - lb)
|
||||
J_eq.append(J)
|
||||
elif np.all(lb == -np.inf):
|
||||
finite_ub = ub < np.inf
|
||||
c_ineq.append(f[finite_ub] - ub[finite_ub])
|
||||
J_ineq.append(J[finite_ub])
|
||||
elif np.all(ub == np.inf):
|
||||
finite_lb = lb > -np.inf
|
||||
c_ineq.append(lb[finite_lb] - f[finite_lb])
|
||||
J_ineq.append(-J[finite_lb])
|
||||
else:
|
||||
lb_inf = lb == -np.inf
|
||||
ub_inf = ub == np.inf
|
||||
equal = lb == ub
|
||||
less = lb_inf & ~ub_inf
|
||||
greater = ub_inf & ~lb_inf
|
||||
interval = ~equal & ~lb_inf & ~ub_inf
|
||||
|
||||
c_eq.append(f[equal] - lb[equal])
|
||||
c_ineq.append(f[less] - ub[less])
|
||||
c_ineq.append(lb[greater] - f[greater])
|
||||
c_ineq.append(f[interval] - ub[interval])
|
||||
c_ineq.append(lb[interval] - f[interval])
|
||||
|
||||
J_eq.append(J[equal])
|
||||
J_ineq.append(J[less])
|
||||
J_ineq.append(-J[greater])
|
||||
J_ineq.append(J[interval])
|
||||
J_ineq.append(-J[interval])
|
||||
|
||||
c_eq = np.hstack(c_eq) if c_eq else np.empty(0)
|
||||
c_ineq = np.hstack(c_ineq) if c_ineq else np.empty(0)
|
||||
|
||||
if sparse_jacobian:
|
||||
vstack = sps.vstack
|
||||
empty = sps.csr_matrix((0, n))
|
||||
else:
|
||||
vstack = np.vstack
|
||||
empty = np.empty((0, n))
|
||||
|
||||
J_eq = vstack(J_eq) if J_eq else empty
|
||||
J_ineq = vstack(J_ineq) if J_ineq else empty
|
||||
|
||||
return c_eq, c_ineq, J_eq, J_ineq
|
||||
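The interval-to-canonical conversion above can be illustrated on plain Python lists. This is a minimal sketch under stated assumptions: `to_canonical` is a hypothetical name, it handles only already-evaluated constraint values (no Jacobians or Hessians), and it follows the ordering used by `_interval_to_canonical` (less, greater, interval upper, interval lower).

```python
import math


def to_canonical(f, lb, ub):
    """Split values f with bounds lb <= f <= ub into (eq, ineq) residuals.

    A point is feasible when every eq entry is 0 and every ineq entry
    is <= 0. Hypothetical helper for illustration only.
    """
    eq, le, ge, il, ig = [], [], [], [], []
    for fi, lo, hi in zip(f, lb, ub):
        if lo == hi:
            eq.append(fi - lo)        # f = lb        ->  f - lb = 0
        elif lo == -math.inf and hi < math.inf:
            le.append(fi - hi)        # f <= ub       ->  f - ub <= 0
        elif hi == math.inf and lo > -math.inf:
            ge.append(lo - fi)        # f >= lb       ->  lb - f <= 0
        elif lo > -math.inf and hi < math.inf:
            il.append(fi - hi)        # lb <= f <= ub ->  two inequalities
            ig.append(lo - fi)
        # Components with lb = -inf and ub = inf impose no constraint.
    return eq, le + ge + il + ig
```

A two-sided interval contributes two inequality rows, which is why the listing counts `n_ineq = n_less + n_greater + 2 * n_interval`.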
@ -0,0 +1,231 @@
|
||||
"""Byrd-Omojokun Trust-Region SQP method."""
|
||||
|
||||
from scipy.sparse import eye as speye
|
||||
from .projections import projections
|
||||
from .qp_subproblem import modified_dogleg, projected_cg, box_intersections
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
|
||||
__all__ = ['equality_constrained_sqp']
|
||||
|
||||
|
||||
def default_scaling(x):
|
||||
n, = np.shape(x)
|
||||
return speye(n)
|
||||
|
||||
|
||||
def equality_constrained_sqp(fun_and_constr, grad_and_jac, lagr_hess,
|
||||
x0, fun0, grad0, constr0,
|
||||
jac0, stop_criteria,
|
||||
state,
|
||||
initial_penalty,
|
||||
initial_trust_radius,
|
||||
factorization_method,
|
||||
trust_lb=None,
|
||||
trust_ub=None,
|
||||
scaling=default_scaling):
|
||||
"""Solve nonlinear equality-constrained problem using trust-region SQP.
|
||||
|
||||
Solve optimization problem:
|
||||
|
||||
minimize fun(x)
|
||||
subject to: constr(x) = 0
|
||||
|
||||
using Byrd-Omojokun Trust-Region SQP method described in [1]_. Several
|
||||
implementation details are based on [2]_ and [3]_, p. 549.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Lalee, Marucha, Jorge Nocedal, and Todd Plantenga. "On the
|
||||
implementation of an algorithm for large-scale equality
|
||||
constrained optimization." SIAM Journal on
|
||||
Optimization 8.3 (1998): 682-706.
|
||||
.. [2] Byrd, Richard H., Mary E. Hribar, and Jorge Nocedal.
|
||||
"An interior point algorithm for large-scale nonlinear
|
||||
programming." SIAM Journal on Optimization 9.4 (1999): 877-900.
|
||||
.. [3] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
PENALTY_FACTOR = 0.3 # Rho from formula (3.51), reference [2]_, p.891.
|
||||
LARGE_REDUCTION_RATIO = 0.9
|
||||
INTERMEDIARY_REDUCTION_RATIO = 0.3
|
||||
SUFFICIENT_REDUCTION_RATIO = 1e-8 # Eta from reference [2]_, p.892.
|
||||
TRUST_ENLARGEMENT_FACTOR_L = 7.0
|
||||
TRUST_ENLARGEMENT_FACTOR_S = 2.0
|
||||
MAX_TRUST_REDUCTION = 0.5
|
||||
MIN_TRUST_REDUCTION = 0.1
|
||||
SOC_THRESHOLD = 0.1
|
||||
TR_FACTOR = 0.8 # Zeta from formula (3.21), reference [2]_, p.885.
|
||||
BOX_FACTOR = 0.5
|
||||
|
||||
n, = np.shape(x0) # Number of parameters
|
||||
|
||||
# Set default lower and upper bounds.
|
||||
if trust_lb is None:
|
||||
trust_lb = np.full(n, -np.inf)
|
||||
if trust_ub is None:
|
||||
trust_ub = np.full(n, np.inf)
|
||||
|
||||
# Initial values
|
||||
x = np.copy(x0)
|
||||
trust_radius = initial_trust_radius
|
||||
penalty = initial_penalty
|
||||
# Compute Values
|
||||
f = fun0
|
||||
c = grad0
|
||||
b = constr0
|
||||
A = jac0
|
||||
S = scaling(x)
|
||||
# Get projections
|
||||
try:
|
||||
Z, LS, Y = projections(A, factorization_method)
|
||||
except ValueError as e:
|
||||
if str(e) == "expected square matrix":
|
||||
# can be the case if there are more equality
|
||||
# constraints than independent variables
|
||||
raise ValueError(
|
||||
"The 'expected square matrix' error can occur if there are"
|
||||
" more equality constraints than independent variables."
|
||||
" Consider how your constraints are set up, or use"
|
||||
" factorization_method='SVDFactorization'."
|
||||
) from e
|
||||
else:
|
||||
raise e
|
||||
|
||||
# Compute least-squares Lagrange multipliers
|
||||
v = -LS.dot(c)
|
||||
# Compute Hessian
|
||||
H = lagr_hess(x, v)
|
||||
|
||||
# Update state parameters
|
||||
optimality = norm(c + A.T.dot(v), np.inf)
|
||||
constr_violation = norm(b, np.inf) if len(b) > 0 else 0
|
||||
cg_info = {'niter': 0, 'stop_cond': 0,
|
||||
'hits_boundary': False}
|
||||
|
||||
last_iteration_failed = False
|
||||
while not stop_criteria(state, x, last_iteration_failed,
|
||||
optimality, constr_violation,
|
||||
trust_radius, penalty, cg_info):
|
||||
# Normal Step - `dn`
|
||||
# minimize 1/2*||A dn + b||^2
|
||||
# subject to:
|
||||
# ||dn|| <= TR_FACTOR * trust_radius
|
||||
# BOX_FACTOR * lb <= dn <= BOX_FACTOR * ub.
|
||||
dn = modified_dogleg(A, Y, b,
|
||||
TR_FACTOR*trust_radius,
|
||||
BOX_FACTOR*trust_lb,
|
||||
BOX_FACTOR*trust_ub)
|
||||
|
||||
# Tangential Step - `dt`
|
||||
# Solve the QP problem:
|
||||
# minimize 1/2 dt.T H dt + dt.T (H dn + c)
|
||||
# subject to:
|
||||
# A dt = 0
|
||||
# ||dt|| <= sqrt(trust_radius**2 - ||dn||**2)
|
||||
# lb - dn <= dt <= ub - dn
|
||||
c_t = H.dot(dn) + c
|
||||
b_t = np.zeros_like(b)
|
||||
trust_radius_t = np.sqrt(trust_radius**2 - np.linalg.norm(dn)**2)
|
||||
lb_t = trust_lb - dn
|
||||
ub_t = trust_ub - dn
|
||||
dt, cg_info = projected_cg(H, c_t, Z, Y, b_t,
|
||||
trust_radius_t,
|
||||
lb_t, ub_t)
|
||||
|
||||
# Compute update (normal + tangential steps).
|
||||
d = dn + dt
|
||||
|
||||
# Compute second order model (without the constant term f): 1/2 d.T H d + c.T d.
|
||||
quadratic_model = 1/2*(H.dot(d)).dot(d) + c.T.dot(d)
|
||||
# Compute linearized constraint: l = A d + b.
|
||||
linearized_constr = A.dot(d)+b
|
||||
# Compute new penalty parameter according to formula (3.52),
|
||||
# reference [2]_, p.891.
|
||||
vpred = norm(b) - norm(linearized_constr)
|
||||
# Guarantee `vpred` always positive,
|
||||
# regardless of roundoff errors.
|
||||
vpred = max(1e-16, vpred)
|
||||
previous_penalty = penalty
|
||||
if quadratic_model > 0:
|
||||
new_penalty = quadratic_model / ((1-PENALTY_FACTOR)*vpred)
|
||||
penalty = max(penalty, new_penalty)
|
||||
# Compute predicted reduction according to formula (3.52),
|
||||
# reference [2]_, p.891.
|
||||
predicted_reduction = -quadratic_model + penalty*vpred
|
||||
|
||||
# Compute merit function at current point
|
||||
merit_function = f + penalty*norm(b)
|
||||
# Evaluate function and constraints at trial point
|
||||
x_next = x + S.dot(d)
|
||||
f_next, b_next = fun_and_constr(x_next)
|
||||
# Compute merit function at trial point
|
||||
merit_function_next = f_next + penalty*norm(b_next)
|
||||
# Compute actual reduction according to formula (3.54),
|
||||
# reference [2]_, p.892.
|
||||
actual_reduction = merit_function - merit_function_next
|
||||
# Compute reduction ratio
|
||||
reduction_ratio = actual_reduction / predicted_reduction
|
||||
|
||||
# Second order correction (SOC), reference [2]_, p.892.
|
||||
if reduction_ratio < SUFFICIENT_REDUCTION_RATIO and \
|
||||
norm(dn) <= SOC_THRESHOLD * norm(dt):
|
||||
# Compute second order correction
|
||||
y = -Y.dot(b_next)
|
||||
# Make sure increment is inside box constraints
|
||||
_, t, intersect = box_intersections(d, y, trust_lb, trust_ub)
|
||||
# Compute tentative point
|
||||
x_soc = x + S.dot(d + t*y)
|
||||
f_soc, b_soc = fun_and_constr(x_soc)
|
||||
# Recompute actual reduction
|
||||
merit_function_soc = f_soc + penalty*norm(b_soc)
|
||||
actual_reduction_soc = merit_function - merit_function_soc
|
||||
# Recompute reduction ratio
|
||||
reduction_ratio_soc = actual_reduction_soc / predicted_reduction
|
||||
if intersect and reduction_ratio_soc >= SUFFICIENT_REDUCTION_RATIO:
|
||||
x_next = x_soc
|
||||
f_next = f_soc
|
||||
b_next = b_soc
|
||||
reduction_ratio = reduction_ratio_soc
|
||||
|
||||
# Readjust trust region step, formula (3.55), reference [2]_, p.892.
|
||||
if reduction_ratio >= LARGE_REDUCTION_RATIO:
|
||||
trust_radius = max(TRUST_ENLARGEMENT_FACTOR_L * norm(d),
|
||||
trust_radius)
|
||||
elif reduction_ratio >= INTERMEDIARY_REDUCTION_RATIO:
|
||||
trust_radius = max(TRUST_ENLARGEMENT_FACTOR_S * norm(d),
|
||||
trust_radius)
|
||||
# Reduce trust region step, according to reference [1]_, p.696.
|
||||
elif reduction_ratio < SUFFICIENT_REDUCTION_RATIO:
|
||||
trust_reduction = ((1-SUFFICIENT_REDUCTION_RATIO) /
|
||||
(1-reduction_ratio))
|
||||
new_trust_radius = trust_reduction * norm(d)
|
||||
if new_trust_radius >= MAX_TRUST_REDUCTION * trust_radius:
|
||||
trust_radius *= MAX_TRUST_REDUCTION
|
||||
elif new_trust_radius >= MIN_TRUST_REDUCTION * trust_radius:
|
||||
trust_radius = new_trust_radius
|
||||
else:
|
||||
trust_radius *= MIN_TRUST_REDUCTION
|
||||
|
||||
# Update iteration
|
||||
if reduction_ratio >= SUFFICIENT_REDUCTION_RATIO:
|
||||
x = x_next
|
||||
f, b = f_next, b_next
|
||||
c, A = grad_and_jac(x)
|
||||
S = scaling(x)
|
||||
# Get projections
|
||||
Z, LS, Y = projections(A, factorization_method)
|
||||
# Compute least-squares Lagrange multipliers
|
||||
v = -LS.dot(c)
|
||||
# Compute Hessian
|
||||
H = lagr_hess(x, v)
|
||||
# Set Flag
|
||||
last_iteration_failed = False
|
||||
# Optimality values
|
||||
optimality = norm(c + A.T.dot(v), np.inf)
|
||||
constr_violation = norm(b, np.inf) if len(b) > 0 else 0
|
||||
else:
|
||||
penalty = previous_penalty
|
||||
last_iteration_failed = True
|
||||
|
||||
return x, state
|
||||
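The trust-radius adjustment at the end of each SQP iteration above can be read as a pure function of the reduction ratio, the step norm, and the current radius. The sketch below is illustrative only: `update_sqp_trust_radius` is a hypothetical name, and the constants are copied from the top of `equality_constrained_sqp`.

```python
LARGE_REDUCTION_RATIO = 0.9
INTERMEDIARY_REDUCTION_RATIO = 0.3
SUFFICIENT_REDUCTION_RATIO = 1e-8
TRUST_ENLARGEMENT_FACTOR_L = 7.0
TRUST_ENLARGEMENT_FACTOR_S = 2.0
MAX_TRUST_REDUCTION = 0.5
MIN_TRUST_REDUCTION = 0.1


def update_sqp_trust_radius(reduction_ratio, step_norm, trust_radius):
    """Return the next trust radius (hypothetical helper, illustration only)."""
    if reduction_ratio >= LARGE_REDUCTION_RATIO:
        # Very good step: enlarge aggressively, never shrinking.
        return max(TRUST_ENLARGEMENT_FACTOR_L * step_norm, trust_radius)
    if reduction_ratio >= INTERMEDIARY_REDUCTION_RATIO:
        # Reasonable step: enlarge moderately, never shrinking.
        return max(TRUST_ENLARGEMENT_FACTOR_S * step_norm, trust_radius)
    if reduction_ratio < SUFFICIENT_REDUCTION_RATIO:
        # Rejected step: shrink, with the shrink factor clamped to
        # [MIN_TRUST_REDUCTION, MAX_TRUST_REDUCTION] times the old radius.
        trust_reduction = ((1 - SUFFICIENT_REDUCTION_RATIO)
                           / (1 - reduction_ratio))
        new_trust_radius = trust_reduction * step_norm
        if new_trust_radius >= MAX_TRUST_REDUCTION * trust_radius:
            return MAX_TRUST_REDUCTION * trust_radius
        if new_trust_radius >= MIN_TRUST_REDUCTION * trust_radius:
            return new_trust_radius
        return MIN_TRUST_REDUCTION * trust_radius
    # Accepted but unremarkable step: keep the radius.
    return trust_radius
```

The clamping means a rejected step always reduces the radius by at least a factor of two and at most a factor of ten, whatever the size of the reduction ratio.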
@ -0,0 +1,564 @@
|
||||
import time
|
||||
import numpy as np
|
||||
from scipy.sparse.linalg import LinearOperator
|
||||
from .._differentiable_functions import VectorFunction
|
||||
from .._constraints import (
|
||||
NonlinearConstraint, LinearConstraint, PreparedConstraint, Bounds, strict_bounds)
|
||||
from .._hessian_update_strategy import BFGS
|
||||
from .._optimize import OptimizeResult
|
||||
from .._differentiable_functions import ScalarFunction
|
||||
from .equality_constrained_sqp import equality_constrained_sqp
|
||||
from .canonical_constraint import (CanonicalConstraint,
|
||||
initial_constraints_as_canonical)
|
||||
from .tr_interior_point import tr_interior_point
|
||||
from .report import BasicReport, SQPReport, IPReport
|
||||
|
||||
|
||||
TERMINATION_MESSAGES = {
|
||||
0: "The maximum number of function evaluations is exceeded.",
|
||||
1: "`gtol` termination condition is satisfied.",
|
||||
2: "`xtol` termination condition is satisfied.",
|
||||
3: "`callback` function requested termination."
|
||||
}
|
||||
|
||||
|
||||
class HessianLinearOperator:
|
||||
"""Build LinearOperator from hessp"""
|
||||
def __init__(self, hessp, n):
|
||||
self.hessp = hessp
|
||||
self.n = n
|
||||
|
||||
def __call__(self, x, *args):
|
||||
def matvec(p):
|
||||
return self.hessp(x, p, *args)
|
||||
|
||||
return LinearOperator((self.n, self.n), matvec=matvec)
|
||||
|
||||
|
||||
class LagrangianHessian:
|
||||
"""The Hessian of the Lagrangian as LinearOperator.
|
||||
|
||||
The Lagrangian is computed as the objective function plus all the
|
||||
constraints multiplied with some numbers (Lagrange multipliers).
|
||||
"""
|
||||
def __init__(self, n, objective_hess, constraints_hess):
|
||||
self.n = n
|
||||
self.objective_hess = objective_hess
|
||||
self.constraints_hess = constraints_hess
|
||||
|
||||
def __call__(self, x, v_eq=np.empty(0), v_ineq=np.empty(0)):
|
||||
H_objective = self.objective_hess(x)
|
||||
H_constraints = self.constraints_hess(x, v_eq, v_ineq)
|
||||
|
||||
def matvec(p):
|
||||
return H_objective.dot(p) + H_constraints.dot(p)
|
||||
|
||||
return LinearOperator((self.n, self.n), matvec)
|
||||
|
||||
|
||||
def update_state_sqp(state, x, last_iteration_failed, objective, prepared_constraints,
|
||||
start_time, tr_radius, constr_penalty, cg_info):
|
||||
state.nit += 1
|
||||
state.nfev = objective.nfev
|
||||
state.njev = objective.ngev
|
||||
state.nhev = objective.nhev
|
||||
state.constr_nfev = [c.fun.nfev if isinstance(c.fun, VectorFunction) else 0
|
||||
for c in prepared_constraints]
|
||||
state.constr_njev = [c.fun.njev if isinstance(c.fun, VectorFunction) else 0
|
||||
for c in prepared_constraints]
|
||||
state.constr_nhev = [c.fun.nhev if isinstance(c.fun, VectorFunction) else 0
|
||||
for c in prepared_constraints]
|
||||
|
||||
if not last_iteration_failed:
|
||||
state.x = x
|
||||
state.fun = objective.f
|
||||
state.grad = objective.g
|
||||
state.v = [c.fun.v for c in prepared_constraints]
|
||||
state.constr = [c.fun.f for c in prepared_constraints]
|
||||
state.jac = [c.fun.J for c in prepared_constraints]
|
||||
# Compute Lagrangian Gradient
|
||||
state.lagrangian_grad = np.copy(state.grad)
|
||||
for c in prepared_constraints:
|
||||
state.lagrangian_grad += c.fun.J.T.dot(c.fun.v)
|
||||
state.optimality = np.linalg.norm(state.lagrangian_grad, np.inf)
|
||||
# Compute maximum constraint violation
|
||||
state.constr_violation = 0
|
||||
for i in range(len(prepared_constraints)):
|
||||
lb, ub = prepared_constraints[i].bounds
|
||||
c = state.constr[i]
|
||||
state.constr_violation = np.max([state.constr_violation,
|
||||
np.max(lb - c),
|
||||
np.max(c - ub)])
|
||||
|
||||
state.execution_time = time.time() - start_time
|
||||
state.tr_radius = tr_radius
|
||||
state.constr_penalty = constr_penalty
|
||||
state.cg_niter += cg_info["niter"]
|
||||
state.cg_stop_cond = cg_info["stop_cond"]
|
||||
|
||||
return state
|
||||
|
||||
|
||||
def update_state_ip(state, x, last_iteration_failed, objective,
|
||||
prepared_constraints, start_time,
|
||||
tr_radius, constr_penalty, cg_info,
|
||||
barrier_parameter, barrier_tolerance):
|
||||
state = update_state_sqp(state, x, last_iteration_failed, objective,
|
||||
prepared_constraints, start_time, tr_radius,
|
||||
constr_penalty, cg_info)
|
||||
state.barrier_parameter = barrier_parameter
|
||||
state.barrier_tolerance = barrier_tolerance
|
||||
return state
|
||||
|
||||
|
||||
def _minimize_trustregion_constr(fun, x0, args, grad,
|
||||
hess, hessp, bounds, constraints,
|
||||
xtol=1e-8, gtol=1e-8,
|
||||
barrier_tol=1e-8,
|
||||
sparse_jacobian=None,
|
||||
callback=None, maxiter=1000,
|
||||
verbose=0, finite_diff_rel_step=None,
|
||||
initial_constr_penalty=1.0, initial_tr_radius=1.0,
|
||||
initial_barrier_parameter=0.1,
|
||||
initial_barrier_tolerance=0.1,
|
||||
factorization_method=None,
|
||||
disp=False):
|
||||
"""Minimize a scalar function subject to constraints.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
gtol : float, optional
|
||||
Tolerance for termination by the norm of the Lagrangian gradient.
|
||||
The algorithm will terminate when both the infinity norm (i.e., max
|
||||
abs value) of the Lagrangian gradient and the constraint violation
|
||||
are smaller than ``gtol``. Default is 1e-8.
|
||||
xtol : float, optional
|
||||
Tolerance for termination by the change of the independent variable.
|
||||
The algorithm will terminate when ``tr_radius < xtol``, where
|
||||
``tr_radius`` is the radius of the trust region used in the algorithm.
|
||||
Default is 1e-8.
|
||||
barrier_tol : float, optional
|
||||
Threshold on the barrier parameter for the algorithm termination.
|
||||
When inequality constraints are present, the algorithm will terminate
|
||||
only when the barrier parameter is less than `barrier_tol`.
|
||||
Default is 1e-8.
|
||||
sparse_jacobian : {bool, None}, optional
|
||||
Determines how to represent Jacobians of the constraints. If bool,
|
||||
then Jacobians of all the constraints will be converted to the
|
||||
corresponding format. If None (default), then Jacobians won't be
|
||||
converted, but the algorithm can proceed only if they all have the
|
||||
same format.
|
||||
initial_tr_radius : float, optional
|
||||
Initial trust radius. The trust radius gives the maximum distance
|
||||
between solution points in consecutive iterations. It reflects the
|
||||
trust the algorithm puts in the local approximation of the optimization
|
||||
problem. For an accurate local approximation the trust-region should be
|
||||
large and for an approximation valid only close to the current point it
|
||||
should be a small one. The trust radius is automatically updated throughout
|
||||
the optimization process, with ``initial_tr_radius`` being its initial value.
|
||||
Default is 1 (recommended in [1]_, p. 19).
|
||||
initial_constr_penalty : float, optional
|
||||
Initial constraints penalty parameter. The penalty parameter is used for
|
||||
balancing the requirements of decreasing the objective function
|
||||
and satisfying the constraints. It is used for defining the merit function:
|
||||
``merit_function(x) = fun(x) + constr_penalty * constr_norm_l2(x)``,
|
||||
where ``constr_norm_l2(x)`` is the l2 norm of a vector containing all
|
||||
the constraints. The merit function is used for accepting or rejecting
|
||||
trial points and ``constr_penalty`` weights the two conflicting goals
|
||||
of reducing objective function and constraints. The penalty is automatically
|
||||
updated throughout the optimization process, with
|
||||
``initial_constr_penalty`` being its initial value. Default is 1
|
||||
(recommended in [1]_, p. 19).
|
||||
initial_barrier_parameter, initial_barrier_tolerance : float, optional
|
||||
Initial barrier parameter and initial tolerance for the barrier subproblem.
|
||||
Both are used only when inequality constraints are present. For dealing with
|
||||
optimization problems ``min_x f(x)`` subject to inequality constraints
|
||||
``c(x) <= 0`` the algorithm introduces slack variables, solving the problem
|
||||
``min_(x,s) f(x) - barrier_parameter*sum(ln(s))`` subject to the equality
|
||||
constraints ``c(x) + s = 0`` instead of the original problem. This subproblem
|
||||
is solved for decreasing values of ``barrier_parameter`` and with decreasing
|
||||
tolerances for the termination, starting with ``initial_barrier_parameter``
|
||||
for the barrier parameter and ``initial_barrier_tolerance`` for the
|
||||
barrier tolerance. Default is 0.1 for both values (recommended in [1]_ p. 19).
|
||||
Also note that ``barrier_parameter`` and ``barrier_tolerance`` are updated
|
||||
with the same prefactor.
|
||||
factorization_method : string or None, optional
|
||||
Method to factorize the Jacobian of the constraints. Use None (default)
|
||||
for the auto selection or one of:
|
||||
|
||||
- 'NormalEquation' (requires scikit-sparse)
|
||||
- 'AugmentedSystem'
|
||||
- 'QRFactorization'
|
||||
- 'SVDFactorization'
|
||||
|
||||
The methods 'NormalEquation' and 'AugmentedSystem' can be used only
|
||||
with sparse constraints. The projections required by the algorithm
|
||||
will be computed using, respectively, the normal equation and the
|
||||
augmented system approaches explained in [1]_. 'NormalEquation'
|
||||
computes the Cholesky factorization of ``A A.T`` and 'AugmentedSystem'
|
||||
performs the LU factorization of an augmented system. They usually
|
||||
provide similar results. 'AugmentedSystem' is used by default for
|
||||
sparse matrices.
|
||||
|
||||
The methods 'QRFactorization' and 'SVDFactorization' can be used
|
||||
only with dense constraints. They compute the required projections
|
||||
using, respectively, QR and SVD factorizations. The 'SVDFactorization'
|
||||
method can cope with Jacobian matrices with deficient row rank and will
|
||||
be used whenever other factorization methods fail (which may imply the
|
||||
conversion of sparse matrices to a dense format when required).
|
||||
By default, 'QRFactorization' is used for dense matrices.
|
||||
finite_diff_rel_step : None or array_like, optional
|
||||
Relative step size for the finite difference approximation.
|
||||
maxiter : int, optional
|
||||
Maximum number of algorithm iterations. Default is 1000.
|
||||
verbose : {0, 1, 2, 3}, optional
|
||||
Level of algorithm's verbosity:
|
||||
|
||||
* 0 (default) : work silently.
|
||||
* 1 : display a termination report.
|
||||
* 2 : display progress during iterations.
|
||||
* 3 : display progress during iterations (more complete report).
|
||||
|
||||
disp : bool, optional
|
||||
If True (default), then `verbose` will be set to 1 if it was 0.
|
||||
|
||||
Returns
|
||||
-------
|
||||
`OptimizeResult` with the fields documented below. Note the following:
|
||||
|
||||
1. All values corresponding to the constraints are ordered as they
|
||||
were passed to the solver. And values corresponding to `bounds`
|
||||
constraints are put *after* other constraints.
|
||||
2. All numbers of function, Jacobian or Hessian evaluations correspond
|
||||
to numbers of actual Python function calls. It means, for example,
|
||||
that if a Jacobian is estimated by finite differences, then the
|
||||
number of Jacobian evaluations will be zero and the number of
|
||||
function evaluations will be incremented by all calls during the
|
||||
finite difference estimation.
|
||||
|
||||
x : ndarray, shape (n,)
|
||||
Solution found.
|
||||
optimality : float
|
||||
Infinity norm of the Lagrangian gradient at the solution.
|
||||
constr_violation : float
|
||||
Maximum constraint violation at the solution.
|
||||
fun : float
|
||||
Objective function at the solution.
|
||||
grad : ndarray, shape (n,)
|
||||
Gradient of the objective function at the solution.
|
||||
lagrangian_grad : ndarray, shape (n,)
|
||||
Gradient of the Lagrangian function at the solution.
|
||||
nit : int
|
||||
Total number of iterations.
|
||||
nfev : integer
|
||||
Number of the objective function evaluations.
|
||||
njev : integer
|
||||
Number of the objective function gradient evaluations.
|
||||
nhev : integer
|
||||
Number of the objective function Hessian evaluations.
|
||||
cg_niter : int
|
||||
Total number of the conjugate gradient method iterations.
|
||||
method : {'equality_constrained_sqp', 'tr_interior_point'}
|
||||
Optimization method used.
|
||||
constr : list of ndarray
|
||||
List of constraint values at the solution.
|
||||
jac : list of {ndarray, sparse matrix}
|
||||
List of the Jacobian matrices of the constraints at the solution.
|
||||
v : list of ndarray
|
||||
List of the Lagrange multipliers for the constraints at the solution.
|
||||
For an inequality constraint a positive multiplier means that the upper
|
||||
bound is active, a negative multiplier means that the lower bound is
|
||||
active and if a multiplier is zero it means the constraint is not
|
||||
active.
|
||||
constr_nfev : list of int
|
||||
Number of constraint evaluations for each of the constraints.
|
||||
constr_njev : list of int
|
||||
Number of Jacobian matrix evaluations for each of the constraints.
|
||||
constr_nhev : list of int
|
||||
Number of Hessian evaluations for each of the constraints.
|
||||
tr_radius : float
|
||||
Radius of the trust region at the last iteration.
|
||||
constr_penalty : float
|
||||
Penalty parameter at the last iteration, see `initial_constr_penalty`.
|
||||
barrier_tolerance : float
|
||||
Tolerance for the barrier subproblem at the last iteration.
|
||||
Only for problems with inequality constraints.
|
||||
barrier_parameter : float
|
||||
Barrier parameter at the last iteration. Only for problems
|
||||
with inequality constraints.
|
||||
execution_time : float
|
||||
Total execution time.
|
||||
message : str
|
||||
Termination message.
|
||||
status : {0, 1, 2, 3}
|
||||
Termination status:
|
||||
|
||||
* 0 : The maximum number of iterations (`maxiter`) is exceeded.
|
||||
* 1 : `gtol` termination condition is satisfied.
|
||||
* 2 : `xtol` termination condition is satisfied.
|
||||
* 3 : `callback` function requested termination.
|
||||
|
||||
cg_stop_cond : int
|
||||
Reason for CG subproblem termination at the last iteration:
|
||||
|
||||
* 0 : CG subproblem not evaluated.
|
||||
* 1 : Iteration limit was reached.
|
||||
* 2 : Reached the trust-region boundary.
|
||||
* 3 : Negative curvature detected.
|
||||
* 4 : Tolerance was satisfied.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Conn, A. R., Gould, N. I., & Toint, P. L.
|
||||
Trust region methods. 2000. SIAM. p. 19.
|
||||
"""
|
||||
x0 = np.atleast_1d(x0).astype(float)
|
||||
n_vars = np.size(x0)
|
||||
if hess is None:
|
||||
if callable(hessp):
|
||||
hess = HessianLinearOperator(hessp, n_vars)
|
||||
else:
|
||||
hess = BFGS()
|
||||
if disp and verbose == 0:
|
||||
verbose = 1
|
||||
|
||||
if bounds is not None:
|
||||
modified_lb = np.nextafter(bounds.lb, -np.inf, where=bounds.lb > -np.inf)
|
||||
modified_ub = np.nextafter(bounds.ub, np.inf, where=bounds.ub < np.inf)
|
||||
modified_lb = np.where(np.isfinite(bounds.lb), modified_lb, bounds.lb)
|
||||
modified_ub = np.where(np.isfinite(bounds.ub), modified_ub, bounds.ub)
|
||||
bounds = Bounds(modified_lb, modified_ub, keep_feasible=bounds.keep_feasible)
|
||||
finite_diff_bounds = strict_bounds(bounds.lb, bounds.ub,
|
||||
bounds.keep_feasible, n_vars)
|
||||
else:
|
||||
finite_diff_bounds = (-np.inf, np.inf)
|
||||
|
||||
# Define Objective Function
|
||||
objective = ScalarFunction(fun, x0, args, grad, hess,
|
||||
finite_diff_rel_step, finite_diff_bounds)
|
||||
|
||||
# Put constraints in list format when needed.
|
||||
if isinstance(constraints, (NonlinearConstraint, LinearConstraint)):
|
||||
constraints = [constraints]
|
||||
|
||||
# Prepare constraints.
|
||||
prepared_constraints = [
|
||||
PreparedConstraint(c, x0, sparse_jacobian, finite_diff_bounds)
|
||||
for c in constraints]
|
||||
|
||||
# Check that all constraints are either sparse or dense.
|
||||
n_sparse = sum(c.fun.sparse_jacobian for c in prepared_constraints)
|
||||
if 0 < n_sparse < len(prepared_constraints):
|
||||
raise ValueError("All constraints must have the same kind of the "
|
||||
"Jacobian --- either all sparse or all dense. "
|
||||
"You can set the sparsity globally by setting "
|
||||
"`sparse_jacobian` to either True or False.")
|
||||
if prepared_constraints:
|
||||
sparse_jacobian = n_sparse > 0
|
||||
|
||||
if bounds is not None:
|
||||
if sparse_jacobian is None:
|
||||
sparse_jacobian = True
|
||||
prepared_constraints.append(PreparedConstraint(bounds, x0,
|
||||
sparse_jacobian))
|
||||
|
||||
# Concatenate initial constraints to the canonical form.
|
||||
c_eq0, c_ineq0, J_eq0, J_ineq0 = initial_constraints_as_canonical(
|
||||
n_vars, prepared_constraints, sparse_jacobian)
|
||||
|
||||
# Prepare all canonical constraints and concatenate them into one.
|
||||
canonical_all = [CanonicalConstraint.from_PreparedConstraint(c)
|
||||
for c in prepared_constraints]
|
||||
|
||||
if len(canonical_all) == 0:
|
||||
canonical = CanonicalConstraint.empty(n_vars)
|
||||
elif len(canonical_all) == 1:
|
||||
canonical = canonical_all[0]
|
||||
else:
|
||||
canonical = CanonicalConstraint.concatenate(canonical_all,
|
||||
sparse_jacobian)
|
||||
|
||||
# Generate the Hessian of the Lagrangian.
|
||||
lagrangian_hess = LagrangianHessian(n_vars, objective.hess, canonical.hess)
|
||||
|
||||
# Choose appropriate method
|
||||
if canonical.n_ineq == 0:
|
||||
method = 'equality_constrained_sqp'
|
||||
else:
|
||||
method = 'tr_interior_point'
|
||||
|
||||
# Construct OptimizeResult
|
||||
state = OptimizeResult(
|
||||
nit=0, nfev=0, njev=0, nhev=0,
|
||||
cg_niter=0, cg_stop_cond=0,
|
||||
fun=objective.f, grad=objective.g,
|
||||
lagrangian_grad=np.copy(objective.g),
|
||||
constr=[c.fun.f for c in prepared_constraints],
|
||||
jac=[c.fun.J for c in prepared_constraints],
|
||||
constr_nfev=[0 for c in prepared_constraints],
|
||||
constr_njev=[0 for c in prepared_constraints],
|
||||
constr_nhev=[0 for c in prepared_constraints],
|
||||
v=[c.fun.v for c in prepared_constraints],
|
||||
method=method)
|
||||
|
||||
# Start counting
|
||||
start_time = time.time()
|
||||
|
||||
# Define stop criteria
|
||||
if method == 'equality_constrained_sqp':
|
||||
def stop_criteria(state, x, last_iteration_failed,
|
||||
optimality, constr_violation,
|
||||
tr_radius, constr_penalty, cg_info):
|
||||
state = update_state_sqp(state, x, last_iteration_failed,
|
||||
objective, prepared_constraints,
|
||||
start_time, tr_radius, constr_penalty,
|
||||
cg_info)
|
||||
if verbose == 2:
|
||||
BasicReport.print_iteration(state.nit,
|
||||
state.nfev,
|
||||
state.cg_niter,
|
||||
state.fun,
|
||||
state.tr_radius,
|
||||
state.optimality,
|
||||
state.constr_violation)
|
||||
elif verbose > 2:
|
||||
SQPReport.print_iteration(state.nit,
|
||||
state.nfev,
|
||||
state.cg_niter,
|
||||
state.fun,
|
||||
state.tr_radius,
|
||||
state.optimality,
|
||||
state.constr_violation,
|
||||
state.constr_penalty,
|
||||
state.cg_stop_cond)
|
||||
state.status = None
|
||||
state.niter = state.nit # Alias for callback (backward-compatibility)
|
||||
if callback is not None:
|
||||
callback_stop = False
|
||||
try:
|
||||
callback_stop = callback(state)
|
||||
except StopIteration:
|
||||
callback_stop = True
|
||||
if callback_stop:
|
||||
state.status = 3
|
||||
return True
|
||||
if state.optimality < gtol and state.constr_violation < gtol:
|
||||
state.status = 1
|
||||
elif state.tr_radius < xtol:
|
||||
state.status = 2
|
||||
elif state.nit >= maxiter:
|
||||
state.status = 0
|
||||
return state.status in (0, 1, 2, 3)
|
||||
elif method == 'tr_interior_point':
|
||||
def stop_criteria(state, x, last_iteration_failed, tr_radius,
|
||||
constr_penalty, cg_info, barrier_parameter,
|
||||
barrier_tolerance):
|
||||
state = update_state_ip(state, x, last_iteration_failed,
|
||||
objective, prepared_constraints,
|
||||
start_time, tr_radius, constr_penalty,
|
||||
cg_info, barrier_parameter, barrier_tolerance)
|
||||
if verbose == 2:
|
||||
BasicReport.print_iteration(state.nit,
|
||||
state.nfev,
|
||||
state.cg_niter,
|
||||
state.fun,
|
||||
state.tr_radius,
|
||||
state.optimality,
|
||||
state.constr_violation)
|
||||
elif verbose > 2:
|
||||
IPReport.print_iteration(state.nit,
|
||||
state.nfev,
|
||||
state.cg_niter,
|
||||
state.fun,
|
||||
state.tr_radius,
|
||||
state.optimality,
|
||||
state.constr_violation,
|
||||
state.constr_penalty,
|
||||
state.barrier_parameter,
|
||||
state.cg_stop_cond)
|
||||
state.status = None
|
||||
state.niter = state.nit # Alias for callback (backward compatibility)
|
||||
if callback is not None:
|
||||
callback_stop = False
|
||||
try:
|
||||
callback_stop = callback(state)
|
||||
except StopIteration:
|
||||
callback_stop = True
|
||||
if callback_stop:
|
||||
state.status = 3
|
||||
return True
|
||||
if state.optimality < gtol and state.constr_violation < gtol:
|
||||
state.status = 1
|
||||
elif (state.tr_radius < xtol
|
||||
and state.barrier_parameter < barrier_tol):
|
||||
state.status = 2
|
||||
elif state.nit >= maxiter:
|
||||
state.status = 0
|
||||
return state.status in (0, 1, 2, 3)
|
||||
|
||||
if verbose == 2:
|
||||
BasicReport.print_header()
|
||||
elif verbose > 2:
|
||||
if method == 'equality_constrained_sqp':
|
||||
SQPReport.print_header()
|
||||
elif method == 'tr_interior_point':
|
||||
IPReport.print_header()
|
||||
|
||||
# Call the lower-level solver to do the optimization
|
||||
if method == 'equality_constrained_sqp':
|
||||
def fun_and_constr(x):
|
||||
f = objective.fun(x)
|
||||
c_eq, _ = canonical.fun(x)
|
||||
return f, c_eq
|
||||
|
||||
def grad_and_jac(x):
|
||||
g = objective.grad(x)
|
||||
J_eq, _ = canonical.jac(x)
|
||||
return g, J_eq
|
||||
|
||||
_, result = equality_constrained_sqp(
|
||||
fun_and_constr, grad_and_jac, lagrangian_hess,
|
||||
x0, objective.f, objective.g,
|
||||
c_eq0, J_eq0,
|
||||
stop_criteria, state,
|
||||
initial_constr_penalty, initial_tr_radius,
|
||||
factorization_method)
|
||||
|
||||
elif method == 'tr_interior_point':
|
||||
_, result = tr_interior_point(
|
||||
objective.fun, objective.grad, lagrangian_hess,
|
||||
n_vars, canonical.n_ineq, canonical.n_eq,
|
||||
canonical.fun, canonical.jac,
|
||||
x0, objective.f, objective.g,
|
||||
c_ineq0, J_ineq0, c_eq0, J_eq0,
|
||||
stop_criteria,
|
||||
canonical.keep_feasible,
|
||||
xtol, state, initial_barrier_parameter,
|
||||
initial_barrier_tolerance,
|
||||
initial_constr_penalty, initial_tr_radius,
|
||||
factorization_method)
|
||||
|
||||
# Status 3 occurs when the callback function requests termination,
|
||||
# this is assumed to not be a success.
|
||||
result.success = result.status in (1, 2)
|
||||
result.message = TERMINATION_MESSAGES[result.status]
|
||||
|
||||
# Alias (for backward compatibility with 1.1.0)
|
||||
result.niter = result.nit
|
||||
|
||||
if verbose == 2:
|
||||
BasicReport.print_footer()
|
||||
elif verbose > 2:
|
||||
if method == 'equality_constrained_sqp':
|
||||
SQPReport.print_footer()
|
||||
elif method == 'tr_interior_point':
|
||||
IPReport.print_footer()
|
||||
if verbose >= 1:
|
||||
print(result.message)
|
||||
print("Number of iterations: {}, function evaluations: {}, "
|
||||
"CG iterations: {}, optimality: {:.2e}, "
|
||||
"constraint violation: {:.2e}, execution time: {:4.2} s."
|
||||
.format(result.nit, result.nfev, result.cg_niter,
|
||||
result.optimality, result.constr_violation,
|
||||
result.execution_time))
|
||||
return result
|
||||
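The order of the termination checks in the `equality_constrained_sqp` branch above can be summarized in a small standalone sketch. The helper name `termination_status` and the sample tolerance values are hypothetical and chosen only for illustration; the real code stores the value on `state.status` and additionally handles the callback case (status 3).

```python
def termination_status(optimality, constr_violation, tr_radius, nit,
                       gtol=1e-8, xtol=1e-8, maxiter=1000):
    """Return 1, 2, 0 or None, checking in the same order as the code above.

    The default tolerances here are illustrative sample values only.
    """
    if optimality < gtol and constr_violation < gtol:
        return 1  # `gtol` termination condition is satisfied
    if tr_radius < xtol:
        return 2  # `xtol` termination condition (trust region collapsed)
    if nit >= maxiter:
        return 0  # iteration limit reached
    return None  # no criterion met: keep iterating


converged = termination_status(1e-10, 1e-12, 0.5, nit=12)
stalled = termination_status(1e-3, 1e-12, 1e-9, nit=40)
exhausted = termination_status(1e-3, 1e-3, 0.5, nit=1000)
running = termination_status(1e-3, 1e-3, 0.5, nit=3)
```

Only statuses 1 and 2 are treated as success by the function above; 0 and 3 are reported as unsuccessful.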
@ -0,0 +1,407 @@
|
||||
"""Basic linear factorizations needed by the solver."""
|
||||
|
||||
from scipy.sparse import (bmat, csc_matrix, eye, issparse)
|
||||
from scipy.sparse.linalg import LinearOperator
|
||||
import scipy.linalg
|
||||
import scipy.sparse.linalg
|
||||
try:
|
||||
from sksparse.cholmod import cholesky_AAt
|
||||
sksparse_available = True
|
||||
except ImportError:
|
||||
import warnings
|
||||
sksparse_available = False
|
||||
import numpy as np
|
||||
from warnings import warn
|
||||
|
||||
__all__ = [
|
||||
'orthogonality',
|
||||
'projections',
|
||||
]
|
||||
|
||||
|
||||
def orthogonality(A, g):
|
||||
"""Measure orthogonality between a vector and the null space of a matrix.
|
||||
|
||||
Compute a measure of orthogonality between the null space
|
||||
of the (possibly sparse) matrix ``A`` and a given vector ``g``.
|
||||
|
||||
The formula is a simplified (and cheaper) version of formula (3.13)
|
||||
from [1]_.
|
||||
``orth = norm(A g, ord=2)/(norm(A, ord='fro')*norm(g, ord=2))``.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Gould, Nicholas IM, Mary E. Hribar, and Jorge Nocedal.
|
||||
"On the solution of equality constrained quadratic
|
||||
programming problems arising in optimization."
|
||||
SIAM Journal on Scientific Computing 23.4 (2001): 1376-1395.
|
||||
"""
|
||||
# Compute vector norms
|
||||
norm_g = np.linalg.norm(g)
|
||||
# Compute Frobenius norm of the matrix A
|
||||
if issparse(A):
|
||||
norm_A = scipy.sparse.linalg.norm(A, ord='fro')
|
||||
else:
|
||||
norm_A = np.linalg.norm(A, ord='fro')
|
||||
|
||||
# Check if norms are zero
|
||||
if norm_g == 0 or norm_A == 0:
|
||||
return 0
|
||||
|
||||
norm_A_g = np.linalg.norm(A.dot(g))
|
||||
# Orthogonality measure
|
||||
orth = norm_A_g / (norm_A*norm_g)
|
||||
return orth
|
||||
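To make the orthogonality measure concrete, here is a minimal pure-Python sketch of the same formula, `norm(A g) / (norm(A, 'fro') * norm(g))`, using nested lists instead of NumPy arrays. The name `orthogonality_sketch` and the sample matrix are assumptions made for illustration; this is not the library routine.

```python
import math


def orthogonality_sketch(A, g):
    """Same ratio as `orthogonality`, for a dense matrix given as a list of rows."""
    norm_g = math.sqrt(sum(v * v for v in g))
    norm_A = math.sqrt(sum(a * a for row in A for a in row))  # Frobenius norm
    if norm_g == 0 or norm_A == 0:
        return 0.0
    A_g = [sum(a * v for a, v in zip(row, g)) for row in A]
    return math.sqrt(sum(v * v for v in A_g)) / (norm_A * norm_g)


A = [[1.0, 0.0, 0.0]]
# A vector in the null space of A gives a measure of zero.
in_null_space = orthogonality_sketch(A, [0.0, 2.0, -1.0])
# A vector in the row space of this single-row A gives the maximum value, one.
in_row_space = orthogonality_sketch(A, [3.0, 0.0, 0.0])
```

A value near zero therefore indicates that `g` is (numerically) in the null space of `A`, which is what the refinement loops below test against `orth_tol`.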
|
||||
|
||||
def normal_equation_projections(A, m, n, orth_tol, max_refin, tol):
|
||||
"""Return linear operators for matrix A using ``NormalEquation`` approach.
|
||||
"""
|
||||
# Cholesky factorization
|
||||
factor = cholesky_AAt(A)
|
||||
|
||||
# z = x - A.T inv(A A.T) A x
|
||||
def null_space(x):
|
||||
v = factor(A.dot(x))
|
||||
z = x - A.T.dot(v)
|
||||
|
||||
# Iterative refinement to improve roundoff
|
||||
# errors described in [2]_, algorithm 5.1.
|
||||
k = 0
|
||||
while orthogonality(A, z) > orth_tol:
|
||||
if k >= max_refin:
|
||||
break
|
||||
# z_next = z - A.T inv(A A.T) A z
|
||||
v = factor(A.dot(z))
|
||||
z = z - A.T.dot(v)
|
||||
k += 1
|
||||
|
||||
return z
|
||||
|
||||
# z = inv(A A.T) A x
|
||||
def least_squares(x):
|
||||
return factor(A.dot(x))
|
||||
|
||||
# z = A.T inv(A A.T) x
|
||||
def row_space(x):
|
||||
return A.T.dot(factor(x))
|
||||
|
||||
return null_space, least_squares, row_space
|
||||
|
||||
|
||||
def augmented_system_projections(A, m, n, orth_tol, max_refin, tol):
|
||||
"""Return linear operators for matrix A - ``AugmentedSystem``."""
|
||||
# Form augmented system
|
||||
K = csc_matrix(bmat([[eye(n), A.T], [A, None]]))
|
||||
# LU factorization
|
||||
# TODO: Use a symmetric indefinite factorization
|
||||
# to solve the system twice as fast (because
|
||||
# of the symmetry).
|
||||
try:
|
||||
solve = scipy.sparse.linalg.factorized(K)
|
||||
except RuntimeError:
|
||||
warn("Singular Jacobian matrix. Using dense SVD decomposition to "
|
||||
"perform the factorizations.",
|
||||
stacklevel=3)
|
||||
return svd_factorization_projections(A.toarray(),
|
||||
m, n, orth_tol,
|
||||
max_refin, tol)
|
||||
|
||||
# z = x - A.T inv(A A.T) A x
|
||||
# is computed solving the extended system:
|
||||
# [I A.T] * [ z ] = [x]
|
||||
# [A O ] [aux] [0]
|
||||
def null_space(x):
|
||||
# v = [x]
|
||||
# [0]
|
||||
v = np.hstack([x, np.zeros(m)])
|
||||
# lu_sol = [ z ]
|
||||
# [aux]
|
||||
lu_sol = solve(v)
|
||||
z = lu_sol[:n]
|
||||
|
||||
# Iterative refinement to improve roundoff
|
||||
# errors described in [2]_, algorithm 5.2.
|
||||
k = 0
|
||||
while orthogonality(A, z) > orth_tol:
|
||||
if k >= max_refin:
|
||||
break
|
||||
# new_v = [x] - [I A.T] * [ z ]
|
||||
# [0] [A O ] [aux]
|
||||
new_v = v - K.dot(lu_sol)
|
||||
# [I A.T] * [delta z ] = new_v
|
||||
# [A O ] [delta aux]
|
||||
lu_update = solve(new_v)
|
||||
# [ z ] += [delta z ]
|
||||
# [aux] [delta aux]
|
||||
lu_sol += lu_update
|
||||
z = lu_sol[:n]
|
||||
k += 1
|
||||
|
||||
# return z = x - A.T inv(A A.T) A x
|
||||
return z
|
||||
|
||||
# z = inv(A A.T) A x
|
||||
# is computed solving the extended system:
|
||||
# [I A.T] * [aux] = [x]
|
||||
# [A O ] [ z ] [0]
|
||||
def least_squares(x):
|
||||
# v = [x]
|
||||
# [0]
|
||||
v = np.hstack([x, np.zeros(m)])
|
||||
# lu_sol = [aux]
|
||||
# [ z ]
|
||||
lu_sol = solve(v)
|
||||
# return z = inv(A A.T) A x
|
||||
return lu_sol[n:m+n]
|
||||
|
||||
# z = A.T inv(A A.T) x
|
||||
# is computed solving the extended system:
|
||||
# [I A.T] * [ z ] = [0]
|
||||
# [A O ] [aux] [x]
|
||||
def row_space(x):
|
||||
# v = [0]
|
||||
# [x]
|
||||
v = np.hstack([np.zeros(n), x])
|
||||
# lu_sol = [ z ]
|
||||
# [aux]
|
||||
lu_sol = solve(v)
|
||||
# return z = A.T inv(A A.T) x
|
||||
return lu_sol[:n]
|
||||
|
||||
return null_space, least_squares, row_space
|
||||
|
||||
|
||||
def qr_factorization_projections(A, m, n, orth_tol, max_refin, tol):
|
||||
"""Return linear operators for matrix A using ``QRFactorization`` approach.
|
||||
"""
|
||||
# QRFactorization
|
||||
Q, R, P = scipy.linalg.qr(A.T, pivoting=True, mode='economic')
|
||||
|
||||
if np.linalg.norm(R[-1, :], np.inf) < tol:
|
||||
warn('Singular Jacobian matrix. Using SVD decomposition to ' +
|
||||
'perform the factorizations.',
|
||||
stacklevel=3)
|
||||
return svd_factorization_projections(A, m, n,
|
||||
orth_tol,
|
||||
max_refin,
|
||||
tol)
|
||||
|
||||
# z = x - A.T inv(A A.T) A x
|
||||
def null_space(x):
|
||||
# v = P inv(R) Q.T x
|
||||
aux1 = Q.T.dot(x)
|
||||
aux2 = scipy.linalg.solve_triangular(R, aux1, lower=False)
|
||||
v = np.zeros(m)
|
||||
v[P] = aux2
|
||||
z = x - A.T.dot(v)
|
||||
|
||||
# Iterative refinement to improve roundoff
|
||||
# errors described in [2]_, algorithm 5.1.
|
||||
k = 0
|
||||
while orthogonality(A, z) > orth_tol:
|
||||
if k >= max_refin:
|
||||
break
|
||||
# v = P inv(R) Q.T x
|
||||
aux1 = Q.T.dot(z)
|
||||
aux2 = scipy.linalg.solve_triangular(R, aux1, lower=False)
|
||||
v[P] = aux2
|
||||
# z_next = z - A.T v
|
||||
z = z - A.T.dot(v)
|
||||
k += 1
|
||||
|
||||
return z
|
||||
|
||||
# z = inv(A A.T) A x
|
||||
def least_squares(x):
|
||||
# z = P inv(R) Q.T x
|
||||
aux1 = Q.T.dot(x)
|
||||
aux2 = scipy.linalg.solve_triangular(R, aux1, lower=False)
|
||||
z = np.zeros(m)
|
||||
z[P] = aux2
|
||||
return z
|
||||
|
||||
# z = A.T inv(A A.T) x
|
||||
def row_space(x):
|
||||
# z = Q inv(R.T) P.T x
|
||||
aux1 = x[P]
|
||||
aux2 = scipy.linalg.solve_triangular(R, aux1,
|
||||
lower=False,
|
||||
trans='T')
|
||||
z = Q.dot(aux2)
|
||||
return z
|
||||
|
||||
return null_space, least_squares, row_space
|
||||
|
||||
|
||||
def svd_factorization_projections(A, m, n, orth_tol, max_refin, tol):
|
||||
"""Return linear operators for matrix A using ``SVDFactorization`` approach.
|
||||
"""
|
||||
# SVD Factorization
|
||||
U, s, Vt = scipy.linalg.svd(A, full_matrices=False)
|
||||
|
||||
# Remove dimensions related with very small singular values
|
||||
U = U[:, s > tol]
|
||||
Vt = Vt[s > tol, :]
|
||||
s = s[s > tol]
|
||||
|
||||
# z = x - A.T inv(A A.T) A x
|
||||
def null_space(x):
|
||||
# v = U 1/s V.T x = inv(A A.T) A x
|
||||
aux1 = Vt.dot(x)
|
||||
aux2 = 1/s*aux1
|
||||
v = U.dot(aux2)
|
||||
z = x - A.T.dot(v)
|
||||
|
||||
# Iterative refinement to improve roundoff
|
||||
# errors described in [2]_, algorithm 5.1.
|
||||
k = 0
|
||||
while orthogonality(A, z) > orth_tol:
|
||||
if k >= max_refin:
|
||||
break
|
||||
# v = U 1/s V.T x = inv(A A.T) A x
|
||||
aux1 = Vt.dot(z)
|
||||
aux2 = 1/s*aux1
|
||||
v = U.dot(aux2)
|
||||
# z_next = z - A.T v
|
||||
z = z - A.T.dot(v)
|
||||
k += 1
|
||||
|
||||
return z
|
||||
|
||||
# z = inv(A A.T) A x
|
||||
def least_squares(x):
|
||||
# z = U 1/s V.T x = inv(A A.T) A x
|
||||
aux1 = Vt.dot(x)
|
||||
aux2 = 1/s*aux1
|
||||
z = U.dot(aux2)
|
||||
return z
|
||||
|
||||
# z = A.T inv(A A.T) x
|
||||
def row_space(x):
|
||||
# z = V 1/s U.T x
|
||||
aux1 = U.T.dot(x)
|
||||
aux2 = 1/s*aux1
|
||||
z = Vt.T.dot(aux2)
|
||||
return z
|
||||
|
||||
return null_space, least_squares, row_space
|
||||
|
||||
|
||||
def projections(A, method=None, orth_tol=1e-12, max_refin=3, tol=1e-15):
|
||||
"""Return three linear operators related with a given matrix A.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : sparse matrix (or ndarray), shape (m, n)
|
||||
Matrix ``A`` used in the projection.
|
||||
method : string, optional
|
||||
Method used to compute the given linear
|
||||
operators. Should be one of:
|
||||
|
||||
- 'NormalEquation': The operators
|
||||
will be computed using the
|
||||
so-called normal equation approach
|
||||
explained in [1]_. In order to do
|
||||
so the Cholesky factorization of
|
||||
``(A A.T)`` is computed. Exclusive
|
||||
for sparse matrices.
|
||||
- 'AugmentedSystem': The operators
|
||||
will be computed using the
|
||||
so-called augmented system approach
|
||||
explained in [1]_. Exclusive
|
||||
for sparse matrices.
|
||||
- 'QRFactorization': Compute projections
|
||||
using QR factorization. Exclusive for
|
||||
dense matrices.
|
||||
- 'SVDFactorization': Compute projections
|
||||
using SVD factorization. Exclusive for
|
||||
dense matrices.
|
||||
|
||||
orth_tol : float, optional
|
||||
Tolerance for iterative refinements.
|
||||
max_refin : int, optional
|
||||
Maximum number of iterative refinements.
|
||||
tol : float, optional
|
||||
Tolerance for singular values.
|
||||
|
||||
Returns
|
||||
-------
|
||||
Z : LinearOperator, shape (n, n)
|
||||
Null-space operator. For a given vector ``x``,
|
||||
the null-space operator is equivalent to applying
|
||||
a projection matrix ``P = I - A.T inv(A A.T) A``
|
||||
to the vector. It can be shown that this is
|
||||
equivalent to projecting ``x`` onto the null space
|
||||
of A.
|
||||
LS : LinearOperator, shape (m, n)
|
||||
Least-squares operator. For a given vector ``x``,
|
||||
the least-squares operator is equivalent to applying a
|
||||
pseudoinverse matrix ``pinv(A.T) = inv(A A.T) A``
|
||||
to the vector. It can be shown that this vector
|
||||
``pinv(A.T) x`` is the least-squares solution to
|
||||
``A.T y = x``.
|
||||
Y : LinearOperator, shape (n, m)
|
||||
Row-space operator. For a given vector ``x``,
|
||||
the row-space operator is equivalent to applying a
|
||||
projection matrix ``Q = A.T inv(A A.T)``
|
||||
to the vector. It can be shown that this
|
||||
vector ``y = Q x`` is the minimum norm solution
|
||||
of ``A y = x``.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Uses iterative refinements described in [1]_
|
||||
during the computation of ``Z`` in order to
|
||||
cope with the possibility of large roundoff errors.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Gould, Nicholas IM, Mary E. Hribar, and Jorge Nocedal.
|
||||
"On the solution of equality constrained quadratic
|
||||
programming problems arising in optimization."
|
||||
SIAM Journal on Scientific Computing 23.4 (2001): 1376-1395.
|
||||
"""
|
||||
m, n = np.shape(A)
|
||||
|
||||
# The factorization of an empty matrix
|
||||
# only works for the sparse representation.
|
||||
if m*n == 0:
|
||||
A = csc_matrix(A)
|
||||
|
||||
# Check Argument
|
||||
if issparse(A):
|
||||
if method is None:
|
||||
method = "AugmentedSystem"
|
||||
if method not in ("NormalEquation", "AugmentedSystem"):
|
||||
raise ValueError("Method not allowed for sparse matrix.")
|
||||
if method == "NormalEquation" and not sksparse_available:
|
||||
warnings.warn("Only accepts 'NormalEquation' option when "
|
||||
"scikit-sparse is available. Using "
|
||||
"'AugmentedSystem' option instead.",
|
||||
ImportWarning, stacklevel=3)
|
||||
method = 'AugmentedSystem'
|
||||
else:
|
||||
if method is None:
|
||||
method = "QRFactorization"
|
||||
if method not in ("QRFactorization", "SVDFactorization"):
|
||||
raise ValueError("Method not allowed for dense array.")
|
||||
|
||||
if method == 'NormalEquation':
|
||||
null_space, least_squares, row_space \
|
||||
= normal_equation_projections(A, m, n, orth_tol, max_refin, tol)
|
||||
elif method == 'AugmentedSystem':
|
||||
null_space, least_squares, row_space \
|
||||
= augmented_system_projections(A, m, n, orth_tol, max_refin, tol)
|
||||
elif method == "QRFactorization":
|
||||
null_space, least_squares, row_space \
|
||||
= qr_factorization_projections(A, m, n, orth_tol, max_refin, tol)
|
||||
elif method == "SVDFactorization":
|
||||
null_space, least_squares, row_space \
|
||||
= svd_factorization_projections(A, m, n, orth_tol, max_refin, tol)
|
||||
|
||||
Z = LinearOperator((n, n), null_space)
|
||||
LS = LinearOperator((m, n), least_squares)
|
||||
Y = LinearOperator((n, m), row_space)
|
||||
|
||||
return Z, LS, Y
|
||||
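The three operators returned by `projections` are easiest to see in the special case where `A` has a single row `a`, so that `A A.T` is the scalar `a.a`. The sketch below works under that assumption only; the helper name `single_row_projections` and the sample data are hypothetical, and no factorization or iterative refinement is performed.

```python
def single_row_projections(a):
    """Null-space, least-squares and row-space maps for A = [a] (one row)."""
    aa = sum(v * v for v in a)  # A A.T reduces to the scalar a.a

    def null_space(x):
        # z = x - A.T inv(A A.T) A x
        coef = sum(ai * xi for ai, xi in zip(a, x)) / aa
        return [xi - coef * ai for ai, xi in zip(a, x)]

    def least_squares(x):
        # z = inv(A A.T) A x, a single number when m = 1
        return sum(ai * xi for ai, xi in zip(a, x)) / aa

    def row_space(t):
        # z = A.T inv(A A.T) t, the minimum norm solution of a.z = t
        return [ai * t / aa for ai in a]

    return null_space, least_squares, row_space


a = [1.0, 2.0, 2.0]
null_space, least_squares, row_space = single_row_projections(a)
z = null_space([3.0, 0.0, 0.0])
y = row_space(9.0)
residual_z = sum(ai * zi for ai, zi in zip(a, z))  # close to 0: z is in the null space
residual_y = sum(ai * yi for ai, yi in zip(a, y))  # close to 9: y solves a.y = 9
ls_value = least_squares([3.0, 0.0, 0.0])
```

For general `A` the same three maps require a factorization of `A A.T` or of the augmented system, which is what the method-specific helpers above provide.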
@ -0,0 +1,637 @@
|
||||
"""Equality-constrained quadratic programming solvers."""
|
||||
|
||||
from scipy.sparse import (linalg, bmat, csc_matrix)
|
||||
from math import copysign
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
|
||||
__all__ = [
|
||||
'eqp_kktfact',
|
||||
'sphere_intersections',
|
||||
'box_intersections',
|
||||
'box_sphere_intersections',
|
||||
'inside_box_boundaries',
|
||||
'modified_dogleg',
|
||||
'projected_cg'
|
||||
]
|
||||
|
||||
|
||||
# For comparison with the projected CG
|
||||
def eqp_kktfact(H, c, A, b):
|
||||
"""Solve equality-constrained quadratic programming (EQP) problem.
|
||||
|
||||
Solve ``min 1/2 x.T H x + x.T c`` subject to ``A x + b = 0``
|
||||
using direct factorization of the KKT system.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
H : sparse matrix, shape (n, n)
|
||||
Hessian matrix of the EQP problem.
|
||||
c : array_like, shape (n,)
|
||||
Gradient of the quadratic objective function.
|
||||
A : sparse matrix
|
||||
Jacobian matrix of the EQP problem.
|
||||
b : array_like, shape (m,)
|
||||
Right-hand side of the constraint equation.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : array_like, shape (n,)
|
||||
Solution of the KKT problem.
|
||||
lagrange_multipliers : ndarray, shape (m,)
|
||||
Lagrange multipliers of the KKT problem.
|
||||
"""
|
||||
n, = np.shape(c) # Number of parameters
|
||||
m, = np.shape(b) # Number of constraints
|
||||
|
||||
# Karush-Kuhn-Tucker matrix of coefficients.
|
||||
# Defined as in Nocedal/Wright "Numerical
|
||||
# Optimization" p.452 in Eq. (16.4).
|
||||
kkt_matrix = csc_matrix(bmat([[H, A.T], [A, None]]))
|
||||
# Vector of coefficients.
|
||||
kkt_vec = np.hstack([-c, -b])
|
||||
|
||||
# TODO: Use a symmetric indefinite factorization
|
||||
# to solve the system twice as fast (because
|
||||
# of the symmetry).
|
||||
lu = linalg.splu(kkt_matrix)
|
||||
kkt_sol = lu.solve(kkt_vec)
|
||||
x = kkt_sol[:n]
|
||||
lagrange_multipliers = -kkt_sol[n:n+m]
|
||||
|
||||
return x, lagrange_multipliers
|
||||
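As a hand-checkable illustration of the KKT system solved by `eqp_kktfact`, the sketch below treats the special case of an identity Hessian and a single constraint row `a`, where the system can be eliminated by hand. The function name `eqp_identity_hessian` and the sample problem are assumptions for illustration; the sign convention for the multiplier follows the code above.

```python
def eqp_identity_hessian(c, a, b):
    """Solve min 1/2 x.T x + x.T c  subject to  a.x + b = 0  (H = I, m = 1)."""
    # KKT system:  x + a*y = -c  and  a.x = -b.
    # Eliminating x gives (a.a) * y = b - a.c.
    aa = sum(ai * ai for ai in a)
    ac = sum(ai * ci for ai, ci in zip(a, c))
    y = (b - ac) / aa
    x = [-ci - ai * y for ai, ci in zip(a, c)]
    # As in eqp_kktfact, the multiplier is minus the KKT unknown.
    return x, -y


# Closest point to the origin on the line x0 + x1 - 2 = 0.
x, multiplier = eqp_identity_hessian(c=[0.0, 0.0], a=[1.0, 1.0], b=-2.0)
```

The result can be checked directly: the nearest point to the origin on `x0 + x1 = 2` is `(1, 1)`.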
|
||||
|
||||
def sphere_intersections(z, d, trust_radius,
|
||||
entire_line=False):
|
||||
"""Find the intersection between segment (or line) and spherical constraints.
|
||||
|
||||
Find the intersection between the segment (or line) defined by the
|
||||
parametric equation ``x(t) = z + t*d`` and the ball
|
||||
``||x|| <= trust_radius``.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
z : array_like, shape (n,)
|
||||
Initial point.
|
||||
d : array_like, shape (n,)
|
||||
Direction.
|
||||
trust_radius : float
|
||||
Ball radius.
|
||||
entire_line : bool, optional
|
||||
When ``True``, the function returns the intersection between the line
|
||||
``x(t) = z + t*d`` (``t`` can assume any value) and the ball
|
||||
``||x|| <= trust_radius``. When ``False``, the function returns the intersection
|
||||
between the segment ``x(t) = z + t*d``, ``0 <= t <= 1``, and the ball.
|
||||
|
||||
Returns
|
||||
-------
|
||||
ta, tb : float
|
||||
The line/segment ``x(t) = z + t*d`` is inside the ball for
|
||||
``ta <= t <= tb``.
|
||||
intersect : bool
|
||||
When ``True``, there is an intersection between the line/segment
|
||||
and the sphere. On the other hand, when ``False``, there is no
|
||||
intersection.
|
||||
"""
|
||||
# Special case when d=0
|
||||
if norm(d) == 0:
|
||||
return 0, 0, False
|
||||
# Check for inf trust_radius
|
||||
if np.isinf(trust_radius):
|
||||
if entire_line:
|
||||
ta = -np.inf
|
||||
tb = np.inf
|
||||
else:
|
||||
ta = 0
|
||||
tb = 1
|
||||
intersect = True
|
||||
return ta, tb, intersect
|
||||
|
||||
a = np.dot(d, d)
|
||||
b = 2 * np.dot(z, d)
|
||||
c = np.dot(z, z) - trust_radius**2
|
||||
discriminant = b*b - 4*a*c
|
||||
if discriminant < 0:
|
||||
intersect = False
|
||||
return 0, 0, intersect
|
||||
sqrt_discriminant = np.sqrt(discriminant)
|
||||
|
||||
# The following calculation is mathematically
|
||||
# equivalent to:
|
||||
# ta = (-b - sqrt_discriminant) / (2*a)
|
||||
# tb = (-b + sqrt_discriminant) / (2*a)
|
||||
# but produces smaller round-off errors.
|
||||
# See Golub and Van Loan, "Matrix Computations", p. 97,
|
||||
# for a better justification.
|
||||
aux = b + copysign(sqrt_discriminant, b)
|
||||
ta = -aux / (2*a)
|
||||
tb = -2*c / aux
|
||||
ta, tb = sorted([ta, tb])
|
||||
|
||||
if entire_line:
|
||||
intersect = True
|
||||
else:
|
||||
# Checks to see if intersection happens
|
||||
# within the vector's length.
|
||||
if tb < 0 or ta > 1:
|
||||
intersect = False
|
||||
ta = 0
|
||||
tb = 0
|
||||
else:
|
||||
intersect = True
|
||||
# Restrict intersection interval
|
||||
# between 0 and 1.
|
||||
ta = max(0, ta)
|
||||
tb = min(1, tb)
|
||||
|
||||
return ta, tb, intersect
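# Illustrative sketch, not part of the SciPy source: the cancellation-free
# quadratic root computation described in the comments above, in pure Python.
# The name ``line_ball_interval`` is hypothetical, and a finite radius is
# assumed (the function above handles the infinite case separately).

```python
import math


def line_ball_interval(z, d, radius, entire_line=False):
    """Return (ta, tb, intersect) for x(t) = z + t*d and ||x|| <= radius."""
    a = sum(di * di for di in d)
    if a == 0.0:
        return 0.0, 0.0, False  # zero direction: nothing to intersect
    b = 2.0 * sum(zi * di for zi, di in zip(z, d))
    c = sum(zi * zi for zi in z) - radius ** 2
    discriminant = b * b - 4.0 * a * c
    if discriminant < 0.0:
        return 0.0, 0.0, False
    # b and copysign(sqrt, b) share a sign, so this sum never cancels.
    aux = b + math.copysign(math.sqrt(discriminant), b)
    if aux == 0.0:
        # Only possible when b == 0 and discriminant == 0: double root at 0.
        ta = tb = 0.0
    else:
        # One root from the usual formula, the other from the product c/a.
        ta, tb = sorted([-aux / (2.0 * a), -2.0 * c / aux])
    if not entire_line:
        if tb < 0.0 or ta > 1.0:
            return 0.0, 0.0, False
        ta, tb = max(0.0, ta), min(1.0, tb)
    return ta, tb, True


# A segment starting at the centre of the unit ball and leaving it halfway.
segment = line_ball_interval([0.0, 0.0], [2.0, 0.0], 1.0)
full_line = line_ball_interval([0.0, 0.0], [2.0, 0.0], 1.0, entire_line=True)
```

# The second root uses ``t1 * t2 = c / a``, which avoids subtracting two
# nearly equal numbers when ``b*b`` dominates ``4*a*c``.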
|
||||
|
||||
|
||||
def box_intersections(z, d, lb, ub,
|
||||
entire_line=False):
|
||||
"""Find the intersection between segment (or line) and box constraints.
|
||||
|
||||
Find the intersection between the segment (or line) defined by the
|
||||
parametric equation ``x(t) = z + t*d`` and the rectangular box
|
||||
``lb <= x <= ub``.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
z : array_like, shape (n,)
|
||||
Initial point.
|
||||
d : array_like, shape (n,)
|
||||
Direction.
|
||||
lb : array_like, shape (n,)
|
||||
Lower bounds to each one of the components of ``x``. Used
|
||||
to delimit the rectangular box.
|
||||
ub : array_like, shape (n, )
|
||||
Upper bounds to each one of the components of ``x``. Used
|
||||
to delimit the rectangular box.
|
||||
entire_line : bool, optional
|
||||
When ``True``, the function returns the intersection between the line
|
||||
``x(t) = z + t*d`` (``t`` can assume any value) and the rectangular
|
||||
box. When ``False``, the function returns the intersection between the segment
|
||||
``x(t) = z + t*d``, ``0 <= t <= 1``, and the rectangular box.
|
||||
|
||||
Returns
|
||||
-------
|
||||
ta, tb : float
|
||||
The line/segment ``x(t) = z + t*d`` is inside the box for
|
||||
``ta <= t <= tb``.
|
||||
intersect : bool
|
||||
When ``True``, there is an intersection between the line (or segment)
|
||||
and the rectangular box. On the other hand, when ``False``, there is no
|
||||
intersection.
|
||||
"""
|
||||
# Make sure it is a numpy array
|
||||
z = np.asarray(z)
|
||||
d = np.asarray(d)
|
||||
lb = np.asarray(lb)
|
||||
ub = np.asarray(ub)
|
||||
# Special case when d=0
|
||||
if norm(d) == 0:
|
||||
return 0, 0, False
|
||||
|
||||
# Get values for which d==0
|
||||
zero_d = (d == 0)
|
||||
# If the boundaries are not satisfied for some coordinate
|
||||
# for which "d" is zero, there is no box-line intersection.
|
||||
if (z[zero_d] < lb[zero_d]).any() or (z[zero_d] > ub[zero_d]).any():
|
||||
intersect = False
|
||||
return 0, 0, intersect
|
||||
# Remove values for which d is zero
|
||||
not_zero_d = np.logical_not(zero_d)
|
||||
z = z[not_zero_d]
|
||||
d = d[not_zero_d]
|
||||
lb = lb[not_zero_d]
|
||||
ub = ub[not_zero_d]
|
||||
|
||||
# Find a series of intervals (t_lb[i], t_ub[i]).
|
||||
t_lb = (lb-z) / d
|
||||
t_ub = (ub-z) / d
|
||||
# Get the intersection of all those intervals.
|
||||
ta = max(np.minimum(t_lb, t_ub))
|
||||
tb = min(np.maximum(t_lb, t_ub))
|
||||
|
||||
# Check if intersection is feasible
|
||||
if ta <= tb:
|
||||
intersect = True
|
||||
else:
|
||||
intersect = False
|
||||
# Check whether the intersection happens within the vector's length.
|
||||
if not entire_line:
|
||||
if tb < 0 or ta > 1:
|
||||
intersect = False
|
||||
ta = 0
|
||||
tb = 0
|
||||
else:
|
||||
# Restrict intersection interval between 0 and 1.
|
||||
ta = max(0, ta)
|
||||
tb = min(1, tb)
|
||||
|
||||
return ta, tb, intersect
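# Illustrative sketch, not part of the SciPy source: the same per-coordinate
# interval intersection written as a pure-Python loop instead of NumPy array
# operations. The name ``line_box_interval`` is hypothetical.

```python
def line_box_interval(z, d, lb, ub, entire_line=False):
    """Return (ta, tb, intersect) for x(t) = z + t*d and lb <= x <= ub."""
    ta, tb = float("-inf"), float("inf")
    moved = False
    for zi, di, lo, hi in zip(z, d, lb, ub):
        if di == 0:
            # This coordinate never changes, so it must already be feasible.
            if zi < lo or zi > hi:
                return 0.0, 0.0, False
            continue
        moved = True
        t1, t2 = (lo - zi) / di, (hi - zi) / di
        # Intersect the running interval with this coordinate's interval.
        ta = max(ta, min(t1, t2))
        tb = min(tb, max(t1, t2))
    if not moved:
        return 0.0, 0.0, False  # d is the zero vector
    intersect = ta <= tb
    if not entire_line:
        if tb < 0 or ta > 1:
            return 0.0, 0.0, False
        ta, tb = max(0.0, ta), min(1.0, tb)
    return ta, tb, intersect


# A segment from the centre of the box [-1, 1]^2 that exits through x0 = 1.
segment = line_box_interval([0, 0], [2, 1], [-1, -1], [1, 1])
```

# Infinite bounds need no special casing here because dividing ``inf`` by a
# nonzero ``di`` stays infinite and drops out of the ``max``/``min``.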
|
||||
|
||||
|
||||
def box_sphere_intersections(z, d, lb, ub, trust_radius,
|
||||
entire_line=False,
|
||||
extra_info=False):
|
||||
"""Find the intersection between segment (or line) and box/sphere constraints.
|
||||
|
||||
Find the intersection between the segment (or line) defined by the
|
||||
parametric equation ``x(t) = z + t*d``, the rectangular box
|
||||
``lb <= x <= ub`` and the ball ``||x|| <= trust_radius``.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
z : array_like, shape (n,)
|
||||
Initial point.
|
||||
d : array_like, shape (n,)
|
||||
Direction.
|
||||
lb : array_like, shape (n,)
|
||||
Lower bounds to each one of the components of ``x``. Used
|
||||
to delimit the rectangular box.
|
||||
ub : array_like, shape (n, )
|
||||
Upper bounds to each one of the components of ``x``. Used
|
||||
to delimit the rectangular box.
|
||||
trust_radius : float
|
||||
Ball radius.
|
||||
entire_line : bool, optional
|
||||
When ``True``, the function returns the intersection between the line
|
||||
``x(t) = z + t*d`` (``t`` can assume any value) and the constraints.
|
||||
When ``False``, the function returns the intersection between the segment
|
||||
``x(t) = z + t*d``, ``0 <= t <= 1`` and the constraints.
|
||||
extra_info : bool, optional
|
||||
When ``True``, the function returns ``intersect_sphere`` and ``intersect_box``.
|
||||
|
||||
Returns
|
||||
-------
|
||||
ta, tb : float
|
||||
The line/segment ``x(t) = z + t*d`` is inside the rectangular box and
|
||||
inside the ball for ``ta <= t <= tb``.
|
||||
intersect : bool
|
||||
When ``True``, there is an intersection between the line (or segment)
|
||||
and both constraints. On the other hand, when ``False``, there is no
|
||||
intersection.
|
||||
sphere_info : dict, optional
|
||||
Dictionary ``{ta, tb, intersect}`` containing the interval ``[ta, tb]``
|
||||
for which the line intercepts the ball, and a boolean value indicating
|
||||
whether the sphere is intersected by the line.
|
||||
box_info : dict, optional
|
||||
Dictionary ``{ta, tb, intersect}`` containing the interval ``[ta, tb]``
|
||||
for which the line intercepts the box, and a boolean value indicating
|
||||
whether the box is intersected by the line.
|
||||
"""
|
||||
ta_b, tb_b, intersect_b = box_intersections(z, d, lb, ub,
|
||||
entire_line)
|
||||
ta_s, tb_s, intersect_s = sphere_intersections(z, d,
|
||||
trust_radius,
|
||||
entire_line)
|
||||
ta = np.maximum(ta_b, ta_s)
|
||||
tb = np.minimum(tb_b, tb_s)
|
||||
if intersect_b and intersect_s and ta <= tb:
|
||||
intersect = True
|
||||
else:
|
||||
intersect = False
|
||||
|
||||
if extra_info:
|
||||
sphere_info = {'ta': ta_s, 'tb': tb_s, 'intersect': intersect_s}
|
||||
box_info = {'ta': ta_b, 'tb': tb_b, 'intersect': intersect_b}
|
||||
return ta, tb, intersect, sphere_info, box_info
|
||||
else:
|
||||
return ta, tb, intersect
|
||||
|
||||
|
||||
def inside_box_boundaries(x, lb, ub):
|
||||
"""Check if lb <= x <= ub."""
|
||||
return (lb <= x).all() and (x <= ub).all()
|
||||
|
||||
|
||||
def reinforce_box_boundaries(x, lb, ub):
|
||||
"""Return clipped value of x"""
|
||||
return np.minimum(np.maximum(x, lb), ub)
|
||||
|
||||
|
||||
def modified_dogleg(A, Y, b, trust_radius, lb, ub):
|
||||
"""Approximately minimize ``1/2*|| A x + b ||^2`` inside trust-region.
|
||||
|
||||
Approximately solve the problem of minimizing ``1/2*|| A x + b ||^2``
|
||||
subject to ``||x|| < trust_radius`` and ``lb <= x <= ub`` using a modification
|
||||
of the classical dogleg approach.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : LinearOperator (or sparse matrix or ndarray), shape (m, n)
|
||||
Matrix ``A`` in the minimization problem. It should have
|
||||
dimension ``(m, n)`` such that ``m < n``.
|
||||
Y : LinearOperator (or sparse matrix or ndarray), shape (n, m)
|
||||
LinearOperator that applies the projection matrix
|
||||
``Q = A.T inv(A A.T)`` to the vector. The obtained vector
|
||||
``y = Q x`` is the minimum norm solution of ``A y = x``.
|
||||
b : array_like, shape (m,)
|
||||
Vector ``b`` in the minimization problem.
|
||||
trust_radius : float
|
||||
Trust radius to be considered. Delimits a sphere boundary
|
||||
to the problem.
|
||||
lb : array_like, shape (n,)
|
||||
Lower bounds to each one of the components of ``x``.
|
||||
It is expected that ``lb <= 0``, otherwise the algorithm
|
||||
may fail. If ``lb[i] = -Inf``, the lower
|
||||
bound for the ith component is just ignored.
|
||||
ub : array_like, shape (n, )
|
||||
Upper bounds to each one of the components of ``x``.
|
||||
It is expected that ``ub >= 0``, otherwise the algorithm
|
||||
may fail. If ``ub[i] = Inf``, the upper bound for the ith
|
||||
component is just ignored.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : array_like, shape (n,)
|
||||
Solution to the problem.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Based on the implementation described on pp. 885-886 of [1]_.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Byrd, Richard H., Mary E. Hribar, and Jorge Nocedal.
|
||||
"An interior point algorithm for large-scale nonlinear
|
||||
programming." SIAM Journal on Optimization 9.4 (1999): 877-900.
|
||||
"""
|
||||
# Compute minimum norm minimizer of 1/2*|| A x + b ||^2.
|
||||
newton_point = -Y.dot(b)
|
||||
# Check for interior point
|
||||
if inside_box_boundaries(newton_point, lb, ub) \
|
||||
and norm(newton_point) <= trust_radius:
|
||||
x = newton_point
|
||||
return x
|
||||
|
||||
# Compute gradient vector ``g = A.T b``
|
||||
g = A.T.dot(b)
|
||||
# Compute Cauchy point
|
||||
# ``cauchy_point = -(g.T g / (g.T A.T A g)) g``.
|
||||
A_g = A.dot(g)
|
||||
cauchy_point = -np.dot(g, g) / np.dot(A_g, A_g) * g
|
||||
# Origin
|
||||
origin_point = np.zeros_like(cauchy_point)
|
||||
|
||||
# Check the segment between cauchy_point and newton_point
|
||||
# for a possible solution.
|
||||
z = cauchy_point
|
||||
p = newton_point - cauchy_point
|
||||
_, alpha, intersect = box_sphere_intersections(z, p, lb, ub,
|
||||
trust_radius)
|
||||
if intersect:
|
||||
x1 = z + alpha*p
|
||||
else:
|
||||
# Check the segment between the origin and cauchy_point
|
||||
# for a possible solution.
|
||||
z = origin_point
|
||||
p = cauchy_point
|
||||
_, alpha, _ = box_sphere_intersections(z, p, lb, ub,
|
||||
trust_radius)
|
||||
x1 = z + alpha*p
|
||||
|
||||
# Check the segment between origin and newton_point
|
||||
# for a possible solution.
|
||||
z = origin_point
|
||||
p = newton_point
|
||||
_, alpha, _ = box_sphere_intersections(z, p, lb, ub,
|
||||
trust_radius)
|
||||
x2 = z + alpha*p
|
||||
|
||||
# Return the best solution among x1 and x2.
|
||||
if norm(A.dot(x1) + b) < norm(A.dot(x2) + b):
|
||||
return x1
|
||||
else:
|
||||
return x2
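# Illustrative sketch, not part of the SciPy source: the Cauchy point used by
# the dogleg step above, computed in pure Python for a small dense ``A``.
# The names ``cauchy_point`` and ``residual_norm`` are hypothetical.

```python
import math


def cauchy_point(A, b):
    """Minimize 1/2*||A x + b||^2 along steepest descent starting at x = 0."""
    m, n = len(A), len(A[0])
    # Gradient of the objective at the origin: g = A.T b.
    g = [sum(A[i][j] * b[i] for i in range(m)) for j in range(n)]
    A_g = [sum(A[i][j] * g[j] for j in range(n)) for i in range(m)]
    g_dot_g = sum(v * v for v in g)
    ag_dot_ag = sum(v * v for v in A_g)
    if ag_dot_ag == 0.0:
        return [0.0] * n  # zero gradient: the origin is already stationary
    step = g_dot_g / ag_dot_ag
    return [-step * gj for gj in g]


def residual_norm(A, x, b):
    """Return ||A x + b||."""
    rows = (sum(aij * xj for aij, xj in zip(row, x)) + bi
            for row, bi in zip(A, b))
    return math.sqrt(sum(v * v for v in rows))


# One equation, two unknowns: x0 + 2*x1 - 5 = 0.
A_example, b_example = [[1.0, 2.0]], [-5.0]
x_cauchy = cauchy_point(A_example, b_example)
```

# With a single constraint row the steepest-descent direction is parallel to
# ``A.T``, so in this example the Cauchy point already solves ``A x + b = 0``.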
|
||||
|
||||
|
||||
def projected_cg(H, c, Z, Y, b, trust_radius=np.inf,
|
||||
lb=None, ub=None, tol=None,
|
||||
max_iter=None, max_infeasible_iter=None,
|
||||
return_all=False):
|
||||
"""Solve EQP problem with projected CG method.
|
||||
|
||||
Solve equality-constrained quadratic programming problem
|
||||
``min 1/2 x.T H x + x.T c`` subject to ``A x + b = 0`` and,
|
||||
possibly, to trust region constraints ``||x|| < trust_radius``
|
||||
and box constraints ``lb <= x <= ub``.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
H : LinearOperator (or sparse matrix or ndarray), shape (n, n)
|
||||
Operator for computing ``H v``.
|
||||
c : array_like, shape (n,)
|
||||
Gradient of the quadratic objective function.
|
||||
Z : LinearOperator (or sparse matrix or ndarray), shape (n, n)
|
||||
Operator for projecting ``x`` into the null space of A.
|
||||
Y : LinearOperator (or sparse matrix or ndarray), shape (n, m)
|
||||
Operator that, for a given vector ``b``, computes the smallest
|
||||
norm solution of ``A x + b = 0``.
|
||||
b : array_like, shape (m,)
|
||||
Right-hand side of the constraint equation.
|
||||
trust_radius : float, optional
|
||||
Trust radius to be considered. By default, uses ``trust_radius=inf``,
|
||||
which means no trust radius at all.
|
||||
lb : array_like, shape (n,), optional
|
||||
Lower bounds to each one of the components of ``x``.
|
||||
If ``lb[i] = -Inf`` the lower bound for the i-th
|
||||
component is just ignored (default).
|
||||
ub : array_like, shape (n, ), optional
|
||||
Upper bounds to each one of the components of ``x``.
|
||||
If ``ub[i] = Inf`` the upper bound for the i-th
|
||||
component is just ignored (default).
|
||||
tol : float, optional
|
||||
Tolerance used to interrupt the algorithm.
|
||||
max_iter : int, optional
|
||||
Maximum number of iterations, where ``max_iter <= n-m``.
|
||||
By default, uses ``max_iter = n-m``.
|
||||
max_infeasible_iter : int, optional
|
||||
Maximum infeasible (regarding box constraints) iterations the
|
||||
algorithm is allowed to take.
|
||||
By default, uses ``max_infeasible_iter = n-m``.
|
||||
return_all : bool, optional
|
||||
When ``True``, return the list of all vectors through the iterations.
|
||||
|
||||
Returns
|
||||
-------
|
||||
x : array_like, shape (n,)
|
||||
Solution of the EQP problem.
|
||||
info : dict
|
||||
Dictionary containing the following:
|
||||
|
||||
- niter : Number of iterations.
|
||||
- stop_cond : Reason for algorithm termination:
|
||||
1. Iteration limit was reached;
|
||||
2. Reached the trust-region boundary;
|
||||
3. Negative curvature detected;
|
||||
4. Tolerance was satisfied.
|
||||
- allvecs : List containing all intermediary vectors (optional).
|
||||
- hits_boundary : True if the proposed step is on the boundary
|
||||
of the trust region.
|
||||
|
||||
Notes
|
||||
-----
|
||||
Implementation of Algorithm 6.2 in [1]_.
|
||||
|
||||
In the absence of spherical and box constraints, for sufficient
|
||||
iterations, the method returns a truly optimal result.
|
||||
In the presence of those constraints, the value returned is only
|
||||
an inexpensive approximation of the optimal value.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Gould, Nicholas IM, Mary E. Hribar, and Jorge Nocedal.
|
||||
"On the solution of equality constrained quadratic
|
||||
programming problems arising in optimization."
|
||||
SIAM Journal on Scientific Computing 23.4 (2001): 1376-1395.
|
||||
"""
|
||||
CLOSE_TO_ZERO = 1e-25
|
||||
|
||||
n, = np.shape(c) # Number of parameters
|
||||
m, = np.shape(b) # Number of constraints
|
||||
|
||||
# Initial Values
|
||||
x = Y.dot(-b)
|
||||
r = Z.dot(H.dot(x) + c)
|
||||
g = Z.dot(r)
|
||||
p = -g
|
||||
|
||||
# Store ``x`` value
|
||||
if return_all:
|
||||
allvecs = [x]
|
||||
# Values for the first iteration
|
||||
H_p = H.dot(p)
|
||||
rt_g = norm(g)**2 # g.T g = r.T Z g = r.T g (ref [1]_ p.1389)
|
||||
|
||||
# If ||x|| > trust_radius the problem does not have a solution.
|
||||
tr_distance = trust_radius - norm(x)
|
||||
if tr_distance < 0:
|
||||
raise ValueError("Trust region problem does not have a solution.")
|
||||
# If ||x|| == trust_radius, then x is the solution
|
||||
# to the optimization problem, since x is the
|
||||
# minimum norm solution to A x + b = 0.
|
||||
elif tr_distance < CLOSE_TO_ZERO:
|
||||
info = {'niter': 0, 'stop_cond': 2, 'hits_boundary': True}
|
||||
if return_all:
|
||||
allvecs.append(x)
|
||||
info['allvecs'] = allvecs
|
||||
return x, info
|
||||
|
||||
# Set default tolerance
|
||||
if tol is None:
|
||||
tol = max(min(0.01 * np.sqrt(rt_g), 0.1 * rt_g), CLOSE_TO_ZERO)
|
||||
# Set default lower and upper bounds
|
||||
if lb is None:
|
||||
lb = np.full(n, -np.inf)
|
||||
if ub is None:
|
||||
ub = np.full(n, np.inf)
|
||||
# Set maximum iterations
|
||||
if max_iter is None:
|
||||
max_iter = n-m
|
||||
max_iter = min(max_iter, n-m)
|
||||
# Set maximum infeasible iterations
|
||||
if max_infeasible_iter is None:
|
||||
max_infeasible_iter = n-m
|
||||
|
||||
hits_boundary = False
|
||||
stop_cond = 1
|
||||
counter = 0
|
||||
last_feasible_x = np.zeros_like(x)
|
||||
k = 0
|
||||
for i in range(max_iter):
|
||||
# Stop criteria - Tolerance : r.T g < tol
|
||||
if rt_g < tol:
|
||||
stop_cond = 4
|
||||
break
|
||||
k += 1
|
||||
# Compute curvature
|
||||
pt_H_p = H_p.dot(p)
|
||||
# Stop criteria - Negative curvature
|
||||
if pt_H_p <= 0:
|
||||
if np.isinf(trust_radius):
|
||||
raise ValueError("Negative curvature not allowed "
|
||||
"for unrestricted problems.")
|
||||
else:
|
||||
# Find intersection with constraints
|
||||
_, alpha, intersect = box_sphere_intersections(
|
||||
x, p, lb, ub, trust_radius, entire_line=True)
|
||||
# Update solution
|
||||
if intersect:
|
||||
x = x + alpha*p
|
||||
# Ensure the variables are inside the box constraints.
|
||||
# This is only necessary because of roundoff errors.
|
||||
x = reinforce_box_boundaries(x, lb, ub)
|
||||
# Attribute information
|
||||
stop_cond = 3
|
||||
hits_boundary = True
|
||||
break
|
||||
|
||||
# Get next step
|
||||
alpha = rt_g / pt_H_p
|
||||
x_next = x + alpha*p
|
||||
|
||||
# Stop criteria - Hits boundary
|
||||
if np.linalg.norm(x_next) >= trust_radius:
|
||||
# Find intersection with box constraints
|
||||
_, theta, intersect = box_sphere_intersections(x, alpha*p, lb, ub,
|
||||
trust_radius)
|
||||
# Update solution
|
||||
if intersect:
|
||||
x = x + theta*alpha*p
|
||||
# Ensure the variables are inside the box constraints.
|
||||
# This is only necessary because of roundoff errors.
|
||||
x = reinforce_box_boundaries(x, lb, ub)
|
||||
# Attribute information
|
||||
stop_cond = 2
|
||||
hits_boundary = True
|
||||
break
|
||||
|
||||
# Check if ``x_next`` is inside the box and start counter if it is not.
|
||||
if inside_box_boundaries(x_next, lb, ub):
|
||||
counter = 0
|
||||
else:
|
||||
counter += 1
|
||||
# Whenever outside box constraints keep looking for intersections.
|
||||
if counter > 0:
|
||||
_, theta, intersect = box_sphere_intersections(x, alpha*p, lb, ub,
|
||||
trust_radius)
|
||||
if intersect:
|
||||
last_feasible_x = x + theta*alpha*p
|
||||
# Ensure the variables are inside the box constraints.
|
||||
# This is only necessary because of roundoff errors.
|
||||
last_feasible_x = reinforce_box_boundaries(last_feasible_x,
|
||||
lb, ub)
|
||||
counter = 0
|
||||
# Stop after too many infeasible (regarding box constraints) iterations.
|
||||
if counter > max_infeasible_iter:
|
||||
break
|
||||
# Store ``x_next`` value
|
||||
if return_all:
|
||||
allvecs.append(x_next)
|
||||
|
||||
# Update residual
|
||||
r_next = r + alpha*H_p
|
||||
# Project residual g+ = Z r+
|
||||
g_next = Z.dot(r_next)
|
||||
# Compute the next conjugate direction p
|
||||
rt_g_next = norm(g_next)**2 # g.T g = r.T g (ref [1]_ p.1389)
|
||||
beta = rt_g_next / rt_g
|
||||
p = - g_next + beta*p
|
||||
# Prepare for next iteration
|
||||
x = x_next
|
||||
g = g_next
|
||||
r = g_next
|
||||
rt_g = norm(g)**2 # g.T g = r.T Z g = r.T g (ref [1]_ p.1389)
|
||||
H_p = H.dot(p)
|
||||
|
||||
if not inside_box_boundaries(x, lb, ub):
|
||||
x = last_feasible_x
|
||||
hits_boundary = True
|
||||
info = {'niter': k, 'stop_cond': stop_cond,
|
||||
'hits_boundary': hits_boundary}
|
||||
if return_all:
|
||||
info['allvecs'] = allvecs
|
||||
return x, info
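# Illustrative sketch, not part of the SciPy source: the core projected CG
# recurrence from the loop above, reduced to a single equality constraint
# ``a.x + b = 0`` with no trust region and no box. Under that assumption the
# null-space projection has the closed form ``v - a (a.v) / (a.a)``. The name
# ``projected_cg_single_constraint`` and its helpers are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def projected_cg_single_constraint(H, c, a, b, tol=1e-12, max_iter=None):
    """Minimize 1/2 x.T H x + x.T c subject to a.x + b = 0."""
    n = len(c)
    a_dot_a = dot(a, a)

    def project(v):
        # Orthogonal projection onto the null space of the row vector a.
        scale = dot(a, v) / a_dot_a
        return [vi - scale * ai for vi, ai in zip(v, a)]

    # Start from the minimum norm point satisfying the constraint.
    x = [-b * ai / a_dot_a for ai in a]
    r = [hx + ci for hx, ci in zip(matvec(H, x), c)]
    g = project(r)
    p = [-gi for gi in g]
    rt_g = dot(g, g)  # r.T g equals g.T g for an orthogonal projector
    if max_iter is None:
        max_iter = n - 1  # n minus the number of constraints
    for _ in range(max_iter):
        if rt_g < tol:
            break
        H_p = matvec(H, p)
        curvature = dot(p, H_p)
        if curvature <= 0:
            raise ValueError("negative curvature in the null space")
        alpha = rt_g / curvature
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri + alpha * hi for ri, hi in zip(r, H_p)]
        g = project(r)
        rt_g_next = dot(g, g)
        beta = rt_g_next / rt_g
        p = [-gi + beta * pi for gi, pi in zip(g, p)]
        rt_g = rt_g_next
    return x


# Minimize x0**2 + x1**2 - 2*x0 - 5*x1 subject to x0 + x1 - 1 = 0.
x_cg = projected_cg_single_constraint([[2, 0], [0, 2]], [-2, -5], [1, 1], -1)
```

# Every search direction lies in the null space of ``a``, so each iterate
# keeps satisfying the constraint that the starting point already meets.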
|
||||
@ -0,0 +1,51 @@
|
||||
"""Progress report printers."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
class ReportBase:
|
||||
COLUMN_NAMES: list[str] = NotImplemented
|
||||
COLUMN_WIDTHS: list[int] = NotImplemented
|
||||
ITERATION_FORMATS: list[str] = NotImplemented
|
||||
|
||||
@classmethod
|
||||
def print_header(cls):
|
||||
fmt = ("|"
|
||||
+ "|".join([f"{{:^{x}}}" for x in cls.COLUMN_WIDTHS])
|
||||
+ "|")
|
||||
separators = ['-' * x for x in cls.COLUMN_WIDTHS]
|
||||
print(fmt.format(*cls.COLUMN_NAMES))
|
||||
print(fmt.format(*separators))
|
||||
|
||||
@classmethod
|
||||
def print_iteration(cls, *args):
|
||||
iteration_format = [f"{{:{x}}}" for x in cls.ITERATION_FORMATS]
|
||||
fmt = "|" + "|".join(iteration_format) + "|"
|
||||
print(fmt.format(*args))
|
||||
|
||||
@classmethod
|
||||
def print_footer(cls):
|
||||
print()
|
||||
|
||||
|
||||
class BasicReport(ReportBase):
|
||||
COLUMN_NAMES = ["niter", "f evals", "CG iter", "obj func", "tr radius",
|
||||
"opt", "c viol"]
|
||||
COLUMN_WIDTHS = [7, 7, 7, 13, 10, 10, 10]
|
||||
ITERATION_FORMATS = ["^7", "^7", "^7", "^+13.4e",
|
||||
"^10.2e", "^10.2e", "^10.2e"]
|
||||
|
||||
|
||||
class SQPReport(ReportBase):
|
||||
COLUMN_NAMES = ["niter", "f evals", "CG iter", "obj func", "tr radius",
|
||||
"opt", "c viol", "penalty", "CG stop"]
|
||||
COLUMN_WIDTHS = [7, 7, 7, 13, 10, 10, 10, 10, 7]
|
||||
ITERATION_FORMATS = ["^7", "^7", "^7", "^+13.4e", "^10.2e", "^10.2e",
|
||||
"^10.2e", "^10.2e", "^7"]
|
||||
|
||||
|
||||
class IPReport(ReportBase):
|
||||
COLUMN_NAMES = ["niter", "f evals", "CG iter", "obj func", "tr radius",
|
||||
"opt", "c viol", "penalty", "barrier param", "CG stop"]
|
||||
COLUMN_WIDTHS = [7, 7, 7, 13, 10, 10, 10, 10, 13, 7]
|
||||
ITERATION_FORMATS = ["^7", "^7", "^7", "^+13.4e", "^10.2e", "^10.2e",
|
||||
"^10.2e", "^10.2e", "^13.2e", "^7"]
|
||||
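# Illustrative sketch, not part of the SciPy source: how the format strings in
# ``print_header`` and ``print_iteration`` are assembled, shown for a reduced
# two-column table. ``build_header`` and ``build_row`` are hypothetical helpers
# that return the strings instead of printing them.

```python
COLUMN_NAMES = ["niter", "obj func"]
COLUMN_WIDTHS = [7, 13]
ITERATION_FORMATS = ["^7", "^+13.4e"]


def build_header(names, widths):
    """Return the centred header line and the dashed separator line."""
    # Doubled braces survive the f-string, leaving e.g. "{:^7}" for format().
    fmt = "|" + "|".join(f"{{:^{w}}}" for w in widths) + "|"
    separators = ["-" * w for w in widths]
    return fmt.format(*names), fmt.format(*separators)


def build_row(formats, values):
    """Return one iteration line using the per-column format specs."""
    fmt = "|" + "|".join(f"{{:{spec}}}" for spec in formats) + "|"
    return fmt.format(*values)


header, rule = build_header(COLUMN_NAMES, COLUMN_WIDTHS)
row = build_row(ITERATION_FORMATS, [3, 1.5])
print(header)
print(rule)
print(row)
```

# Each iteration format repeats its column width, which is what keeps the
# rows aligned with the header.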
@ -0,0 +1,296 @@
|
||||
import numpy as np
|
||||
from numpy.testing import assert_array_equal, assert_equal
|
||||
from scipy.optimize._constraints import (NonlinearConstraint, Bounds,
|
||||
PreparedConstraint)
|
||||
from scipy.optimize._trustregion_constr.canonical_constraint \
|
||||
import CanonicalConstraint, initial_constraints_as_canonical
|
||||
|
||||
|
||||
def create_quadratic_function(n, m, rng):
|
||||
a = rng.rand(m)
|
||||
A = rng.rand(m, n)
|
||||
H = rng.rand(m, n, n)
|
||||
HT = np.transpose(H, (1, 2, 0))
|
||||
|
||||
def fun(x):
|
||||
return a + A.dot(x) + 0.5 * H.dot(x).dot(x)
|
||||
|
||||
def jac(x):
|
||||
return A + H.dot(x)
|
||||
|
||||
def hess(x, v):
|
||||
return HT.dot(v)
|
||||
|
||||
return fun, jac, hess
|
||||
|
||||
|
||||
def test_bounds_cases():
|
||||
# Test 1: no constraints.
|
||||
user_constraint = Bounds(-np.inf, np.inf)
|
||||
x0 = np.array([-1, 2])
|
||||
prepared_constraint = PreparedConstraint(user_constraint, x0, False)
|
||||
c = CanonicalConstraint.from_PreparedConstraint(prepared_constraint)
|
||||
|
||||
assert_equal(c.n_eq, 0)
|
||||
assert_equal(c.n_ineq, 0)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [])
|
||||
assert_array_equal(c_ineq, [])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
assert_array_equal(J_eq, np.empty((0, 2)))
|
||||
assert_array_equal(J_ineq, np.empty((0, 2)))
|
||||
|
||||
assert_array_equal(c.keep_feasible, [])
|
||||
|
||||
# Test 2: infinite lower bound.
|
||||
user_constraint = Bounds(-np.inf, [0, np.inf, 1], [False, True, True])
|
||||
x0 = np.array([-1, -2, -3], dtype=float)
|
||||
prepared_constraint = PreparedConstraint(user_constraint, x0, False)
|
||||
c = CanonicalConstraint.from_PreparedConstraint(prepared_constraint)
|
||||
|
||||
assert_equal(c.n_eq, 0)
|
||||
assert_equal(c.n_ineq, 2)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [])
|
||||
assert_array_equal(c_ineq, [-1, -4])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
assert_array_equal(J_eq, np.empty((0, 3)))
|
||||
assert_array_equal(J_ineq, np.array([[1, 0, 0], [0, 0, 1]]))
|
||||
|
||||
assert_array_equal(c.keep_feasible, [False, True])
|
||||
|
||||
# Test 3: infinite upper bound.
|
||||
user_constraint = Bounds([0, 1, -np.inf], np.inf, [True, False, True])
|
||||
x0 = np.array([1, 2, 3], dtype=float)
|
||||
prepared_constraint = PreparedConstraint(user_constraint, x0, False)
|
||||
c = CanonicalConstraint.from_PreparedConstraint(prepared_constraint)
|
||||
|
||||
assert_equal(c.n_eq, 0)
|
||||
assert_equal(c.n_ineq, 2)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [])
|
||||
assert_array_equal(c_ineq, [-1, -1])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
assert_array_equal(J_eq, np.empty((0, 3)))
|
||||
assert_array_equal(J_ineq, np.array([[-1, 0, 0], [0, -1, 0]]))
|
||||
|
||||
assert_array_equal(c.keep_feasible, [True, False])
|
||||
|
||||
# Test 4: interval constraint.
|
||||
user_constraint = Bounds([-1, -np.inf, 2, 3], [1, np.inf, 10, 3],
|
||||
[False, True, True, True])
|
||||
x0 = np.array([0, 10, 8, 5])
|
||||
prepared_constraint = PreparedConstraint(user_constraint, x0, False)
|
||||
c = CanonicalConstraint.from_PreparedConstraint(prepared_constraint)
|
||||
|
||||
assert_equal(c.n_eq, 1)
|
||||
assert_equal(c.n_ineq, 4)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [2])
|
||||
assert_array_equal(c_ineq, [-1, -2, -1, -6])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
assert_array_equal(J_eq, [[0, 0, 0, 1]])
|
||||
assert_array_equal(J_ineq, [[1, 0, 0, 0],
|
||||
[0, 0, 1, 0],
|
||||
[-1, 0, 0, 0],
|
||||
[0, 0, -1, 0]])
|
||||
|
||||
assert_array_equal(c.keep_feasible, [False, True, False, True])
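# Illustrative sketch, not the actual ``CanonicalConstraint`` implementation:
# a pure-Python conversion of ``lb <= x <= ub`` into the canonical form
# ``c_eq = 0`` and ``c_ineq <= 0``. The ordering of the inequalities is an
# assumption inferred from the values asserted in this test module, and the
# name ``bounds_to_canonical`` is hypothetical.

```python
import math


def bounds_to_canonical(x, lb, ub):
    """Return (c_eq, c_ineq) such that feasibility means c_eq == 0, c_ineq <= 0."""
    c_eq, less, greater, interval_ub, interval_lb = [], [], [], [], []
    for xi, lo, hi in zip(x, lb, ub):
        if lo == hi:
            c_eq.append(xi - lo)          # equality: x - lb = 0
        elif math.isinf(lo) and math.isinf(hi):
            continue                      # unconstrained component
        elif math.isinf(lo):
            less.append(xi - hi)          # upper bound only: x - ub <= 0
        elif math.isinf(hi):
            greater.append(lo - xi)       # lower bound only: lb - x <= 0
        else:
            interval_ub.append(xi - hi)   # two-sided bound gives two rows
            interval_lb.append(lo - xi)
    return c_eq, less + greater + interval_ub + interval_lb


inf = math.inf
# Same data as "Test 4" above.
c_eq, c_ineq = bounds_to_canonical([0, 10, 8, 5],
                                   [-1, -inf, 2, 3],
                                   [1, inf, 10, 3])
```

# A two-sided bound contributes two inequality rows, which is why four
# components yield one equality and four inequalities in "Test 4".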
|
||||
|
||||
|
||||
def test_nonlinear_constraint():
|
||||
n = 3
|
||||
m = 5
|
||||
rng = np.random.RandomState(0)
|
||||
x0 = rng.rand(n)
|
||||
|
||||
fun, jac, hess = create_quadratic_function(n, m, rng)
|
||||
f = fun(x0)
|
||||
J = jac(x0)
|
||||
|
||||
lb = [-10, 3, -np.inf, -np.inf, -5]
|
||||
ub = [10, 3, np.inf, 3, np.inf]
|
||||
user_constraint = NonlinearConstraint(
|
||||
fun, lb, ub, jac, hess, [True, False, False, True, False])
|
||||
|
||||
for sparse_jacobian in [False, True]:
|
||||
prepared_constraint = PreparedConstraint(user_constraint, x0,
|
||||
sparse_jacobian)
|
||||
c = CanonicalConstraint.from_PreparedConstraint(prepared_constraint)
|
||||
|
||||
assert_array_equal(c.n_eq, 1)
|
||||
assert_array_equal(c.n_ineq, 4)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [f[1] - lb[1]])
|
||||
assert_array_equal(c_ineq, [f[3] - ub[3], lb[4] - f[4],
|
||||
f[0] - ub[0], lb[0] - f[0]])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
if sparse_jacobian:
|
||||
J_eq = J_eq.toarray()
|
||||
J_ineq = J_ineq.toarray()
|
||||
|
||||
assert_array_equal(J_eq, J[1, None])
|
||||
assert_array_equal(J_ineq, np.vstack((J[3], -J[4], J[0], -J[0])))
|
||||
|
||||
v_eq = rng.rand(c.n_eq)
|
||||
v_ineq = rng.rand(c.n_ineq)
|
||||
v = np.zeros(m)
|
||||
v[1] = v_eq[0]
|
||||
v[3] = v_ineq[0]
|
||||
v[4] = -v_ineq[1]
|
||||
v[0] = v_ineq[2] - v_ineq[3]
|
||||
assert_array_equal(c.hess(x0, v_eq, v_ineq), hess(x0, v))
|
||||
|
||||
assert_array_equal(c.keep_feasible, [True, False, True, True])
|
||||
|
||||
|
||||
def test_concatenation():
|
||||
rng = np.random.RandomState(0)
|
||||
n = 4
|
||||
x0 = rng.rand(n)
|
||||
|
||||
f1 = x0
|
||||
J1 = np.eye(n)
|
||||
lb1 = [-1, -np.inf, -2, 3]
|
||||
ub1 = [1, np.inf, np.inf, 3]
|
||||
bounds = Bounds(lb1, ub1, [False, False, True, False])
|
||||
|
||||
fun, jac, hess = create_quadratic_function(n, 5, rng)
|
||||
f2 = fun(x0)
|
||||
J2 = jac(x0)
|
||||
lb2 = [-10, 3, -np.inf, -np.inf, -5]
|
||||
ub2 = [10, 3, np.inf, 5, np.inf]
|
||||
nonlinear = NonlinearConstraint(
|
||||
fun, lb2, ub2, jac, hess, [True, False, False, True, False])
|
||||
|
||||
for sparse_jacobian in [False, True]:
|
||||
bounds_prepared = PreparedConstraint(bounds, x0, sparse_jacobian)
|
||||
nonlinear_prepared = PreparedConstraint(nonlinear, x0, sparse_jacobian)
|
||||
|
||||
c1 = CanonicalConstraint.from_PreparedConstraint(bounds_prepared)
|
||||
c2 = CanonicalConstraint.from_PreparedConstraint(nonlinear_prepared)
|
||||
c = CanonicalConstraint.concatenate([c1, c2], sparse_jacobian)
|
||||
|
||||
assert_equal(c.n_eq, 2)
|
||||
assert_equal(c.n_ineq, 7)
|
||||
|
||||
c_eq, c_ineq = c.fun(x0)
|
||||
assert_array_equal(c_eq, [f1[3] - lb1[3], f2[1] - lb2[1]])
|
||||
assert_array_equal(c_ineq, [lb1[2] - f1[2], f1[0] - ub1[0],
|
||||
lb1[0] - f1[0], f2[3] - ub2[3],
|
||||
lb2[4] - f2[4], f2[0] - ub2[0],
|
||||
lb2[0] - f2[0]])
|
||||
|
||||
J_eq, J_ineq = c.jac(x0)
|
||||
if sparse_jacobian:
|
||||
J_eq = J_eq.toarray()
|
||||
J_ineq = J_ineq.toarray()
|
||||
|
||||
assert_array_equal(J_eq, np.vstack((J1[3], J2[1])))
|
||||
assert_array_equal(J_ineq, np.vstack((-J1[2], J1[0], -J1[0], J2[3],
|
||||
-J2[4], J2[0], -J2[0])))
|
||||
|
||||
v_eq = rng.rand(c.n_eq)
|
||||
v_ineq = rng.rand(c.n_ineq)
|
||||
v = np.zeros(5)
|
||||
v[1] = v_eq[1]
|
||||
v[3] = v_ineq[3]
|
||||
v[4] = -v_ineq[4]
|
||||
v[0] = v_ineq[5] - v_ineq[6]
|
||||
H = c.hess(x0, v_eq, v_ineq).dot(np.eye(n))
|
||||
assert_array_equal(H, hess(x0, v))
|
||||
|
||||
assert_array_equal(c.keep_feasible,
|
||||
[True, False, False, True, False, True, True])
|
||||
|
||||
|
||||
def test_empty():
|
||||
x = np.array([1, 2, 3])
|
||||
c = CanonicalConstraint.empty(3)
|
||||
assert_equal(c.n_eq, 0)
|
||||
assert_equal(c.n_ineq, 0)
|
||||
|
||||
c_eq, c_ineq = c.fun(x)
|
||||
assert_array_equal(c_eq, [])
|
||||
assert_array_equal(c_ineq, [])
|
||||
|
||||
J_eq, J_ineq = c.jac(x)
|
||||
assert_array_equal(J_eq, np.empty((0, 3)))
|
||||
assert_array_equal(J_ineq, np.empty((0, 3)))
|
||||
|
||||
H = c.hess(x, None, None).toarray()
|
||||
assert_array_equal(H, np.zeros((3, 3)))
|
||||
|
||||
|
||||
def test_initial_constraints_as_canonical():
|
||||
# rng is only used to generate the coefficients of the quadratic
|
||||
# function that is used by the nonlinear constraint.
|
||||
rng = np.random.RandomState(0)
|
||||
|
||||
x0 = np.array([0.5, 0.4, 0.3, 0.2])
|
||||
n = len(x0)
|
||||
|
||||
lb1 = [-1, -np.inf, -2, 3]
|
||||
ub1 = [1, np.inf, np.inf, 3]
|
||||
bounds = Bounds(lb1, ub1, [False, False, True, False])
|
||||
|
||||
fun, jac, hess = create_quadratic_function(n, 5, rng)
|
||||
lb2 = [-10, 3, -np.inf, -np.inf, -5]
|
||||
ub2 = [10, 3, np.inf, 5, np.inf]
|
||||
nonlinear = NonlinearConstraint(
|
||||
fun, lb2, ub2, jac, hess, [True, False, False, True, False])
|
||||
|
||||
for sparse_jacobian in [False, True]:
|
||||
bounds_prepared = PreparedConstraint(bounds, x0, sparse_jacobian)
|
||||
nonlinear_prepared = PreparedConstraint(nonlinear, x0, sparse_jacobian)
|
||||
|
||||
f1 = bounds_prepared.fun.f
|
||||
J1 = bounds_prepared.fun.J
|
||||
f2 = nonlinear_prepared.fun.f
|
||||
J2 = nonlinear_prepared.fun.J
|
||||
|
||||
c_eq, c_ineq, J_eq, J_ineq = initial_constraints_as_canonical(
|
||||
n, [bounds_prepared, nonlinear_prepared], sparse_jacobian)
|
||||
|
||||
assert_array_equal(c_eq, [f1[3] - lb1[3], f2[1] - lb2[1]])
|
||||
assert_array_equal(c_ineq, [lb1[2] - f1[2], f1[0] - ub1[0],
|
||||
lb1[0] - f1[0], f2[3] - ub2[3],
|
||||
lb2[4] - f2[4], f2[0] - ub2[0],
|
||||
lb2[0] - f2[0]])
|
||||
|
||||
if sparse_jacobian:
|
||||
J1 = J1.toarray()
|
||||
J2 = J2.toarray()
|
||||
J_eq = J_eq.toarray()
|
||||
J_ineq = J_ineq.toarray()
|
||||
|
||||
assert_array_equal(J_eq, np.vstack((J1[3], J2[1])))
|
||||
assert_array_equal(J_ineq, np.vstack((-J1[2], J1[0], -J1[0], J2[3],
|
||||
-J2[4], J2[0], -J2[0])))
|
||||
|
||||
|
||||
def test_initial_constraints_as_canonical_empty():
|
||||
n = 3
|
||||
for sparse_jacobian in [False, True]:
|
||||
c_eq, c_ineq, J_eq, J_ineq = initial_constraints_as_canonical(
|
||||
n, [], sparse_jacobian)
|
||||
|
||||
assert_array_equal(c_eq, [])
|
||||
assert_array_equal(c_ineq, [])
|
||||
|
||||
if sparse_jacobian:
|
||||
J_eq = J_eq.toarray()
|
||||
J_ineq = J_ineq.toarray()
|
||||
|
||||
assert_array_equal(J_eq, np.empty((0, n)))
|
||||
assert_array_equal(J_ineq, np.empty((0, n)))
|
||||
@ -0,0 +1,214 @@
|
||||
import numpy as np
|
||||
import scipy.linalg
|
||||
from scipy.sparse import csc_matrix
|
||||
from scipy.optimize._trustregion_constr.projections \
|
||||
import projections, orthogonality
|
||||
from numpy.testing import (TestCase, assert_array_almost_equal,
|
||||
assert_equal, assert_allclose)
|
||||
|
||||
try:
|
||||
from sksparse.cholmod import cholesky_AAt # noqa: F401
|
||||
sksparse_available = True
|
||||
available_sparse_methods = ("NormalEquation", "AugmentedSystem")
|
||||
except ImportError:
|
||||
sksparse_available = False
|
||||
available_sparse_methods = ("AugmentedSystem",)
|
||||
available_dense_methods = ('QRFactorization', 'SVDFactorization')
|
||||
|
||||
|
||||
class TestProjections(TestCase):
|
||||
|
||||
def test_nullspace_and_least_squares_sparse(self):
|
||||
A_dense = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
At_dense = A_dense.T
|
||||
A = csc_matrix(A_dense)
|
||||
test_points = ([1, 2, 3, 4, 5, 6, 7, 8],
|
||||
[1, 10, 3, 0, 1, 6, 7, 8],
|
||||
[1.12, 10, 0, 0, 100000, 6, 0.7, 8])
|
||||
|
||||
for method in available_sparse_methods:
|
||||
Z, LS, _ = projections(A, method)
|
||||
for z in test_points:
|
||||
# Test if x is in the null_space
|
||||
x = Z.matvec(z)
|
||||
assert_array_almost_equal(A.dot(x), 0)
|
||||
# Test orthogonality
|
||||
assert_array_almost_equal(orthogonality(A, x), 0)
|
||||
# Test if x is the least-squares solution
|
||||
x = LS.matvec(z)
|
||||
x2 = scipy.linalg.lstsq(At_dense, z)[0]
|
||||
assert_array_almost_equal(x, x2)
|
||||
|
||||
def test_iterative_refinements_sparse(self):
|
||||
A_dense = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
A = csc_matrix(A_dense)
|
||||
test_points = ([1, 2, 3, 4, 5, 6, 7, 8],
|
||||
[1, 10, 3, 0, 1, 6, 7, 8],
|
||||
[1.12, 10, 0, 0, 100000, 6, 0.7, 8],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3+1e-10])
|
||||
|
||||
for method in available_sparse_methods:
|
||||
Z, LS, _ = projections(A, method, orth_tol=1e-18, max_refin=100)
|
||||
for z in test_points:
|
||||
# Test if x is in the null_space
|
||||
x = Z.matvec(z)
|
||||
atol = 1e-13 * abs(x).max()
|
||||
assert_allclose(A.dot(x), 0, atol=atol)
|
||||
# Test orthogonality
|
||||
assert_allclose(orthogonality(A, x), 0, atol=1e-13)
|
||||
|
||||
def test_rowspace_sparse(self):
|
||||
A_dense = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
A = csc_matrix(A_dense)
|
||||
test_points = ([1, 2, 3],
|
||||
[1, 10, 3],
|
||||
[1.12, 10, 0])
|
||||
|
||||
for method in available_sparse_methods:
|
||||
_, _, Y = projections(A, method)
|
||||
for z in test_points:
|
||||
# Test if x is solution of A x = z
|
||||
x = Y.matvec(z)
|
||||
assert_array_almost_equal(A.dot(x), z)
|
||||
# Test if x is in the row space of A
|
||||
A_ext = np.vstack((A_dense, x))
|
||||
assert_equal(np.linalg.matrix_rank(A_dense),
|
||||
np.linalg.matrix_rank(A_ext))
|
||||
|
||||
def test_nullspace_and_least_squares_dense(self):
|
||||
A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
At = A.T
|
||||
test_points = ([1, 2, 3, 4, 5, 6, 7, 8],
|
||||
[1, 10, 3, 0, 1, 6, 7, 8],
|
||||
[1.12, 10, 0, 0, 100000, 6, 0.7, 8])
|
||||
|
||||
for method in available_dense_methods:
|
||||
Z, LS, _ = projections(A, method)
|
||||
for z in test_points:
|
||||
# Test if x is in the null_space
|
||||
x = Z.matvec(z)
|
||||
assert_array_almost_equal(A.dot(x), 0)
|
||||
# Test orthogonality
|
||||
assert_array_almost_equal(orthogonality(A, x), 0)
|
||||
# Test if x is the least-squares solution
|
||||
x = LS.matvec(z)
|
||||
x2 = scipy.linalg.lstsq(At, z)[0]
|
||||
assert_array_almost_equal(x, x2)
|
||||
|
||||
def test_compare_dense_and_sparse(self):
|
||||
D = np.diag(range(1, 101))
|
||||
A = np.hstack([D, D, D, D])
|
||||
A_sparse = csc_matrix(A)
|
||||
np.random.seed(0)
|
||||
|
||||
Z, LS, Y = projections(A)
|
||||
Z_sparse, LS_sparse, Y_sparse = projections(A_sparse)
|
||||
for k in range(20):
|
||||
z = np.random.normal(size=(400,))
|
||||
assert_array_almost_equal(Z.dot(z), Z_sparse.dot(z))
|
||||
assert_array_almost_equal(LS.dot(z), LS_sparse.dot(z))
|
||||
x = np.random.normal(size=(100,))
|
||||
assert_array_almost_equal(Y.dot(x), Y_sparse.dot(x))
|
||||
|
||||
def test_compare_dense_and_sparse2(self):
|
||||
D1 = np.diag([-1.7, 1, 0.5])
|
||||
D2 = np.diag([1, -0.6, -0.3])
|
||||
D3 = np.diag([-0.3, -1.5, 2])
|
||||
A = np.hstack([D1, D2, D3])
|
||||
A_sparse = csc_matrix(A)
|
||||
np.random.seed(0)
|
||||
|
||||
Z, LS, Y = projections(A)
|
||||
Z_sparse, LS_sparse, Y_sparse = projections(A_sparse)
|
||||
for k in range(1):
|
||||
z = np.random.normal(size=(9,))
|
||||
assert_array_almost_equal(Z.dot(z), Z_sparse.dot(z))
|
||||
assert_array_almost_equal(LS.dot(z), LS_sparse.dot(z))
|
||||
x = np.random.normal(size=(3,))
|
||||
assert_array_almost_equal(Y.dot(x), Y_sparse.dot(x))
|
||||
|
||||
def test_iterative_refinements_dense(self):
|
||||
A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
test_points = ([1, 2, 3, 4, 5, 6, 7, 8],
|
||||
[1, 10, 3, 0, 1, 6, 7, 8],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3+1e-10])
|
||||
|
||||
for method in available_dense_methods:
|
||||
Z, LS, _ = projections(A, method, orth_tol=1e-18, max_refin=10)
|
||||
for z in test_points:
|
||||
# Test if x is in the null_space
|
||||
x = Z.matvec(z)
|
||||
assert_allclose(A.dot(x), 0, rtol=0, atol=2.5e-14)
|
||||
# Test orthogonality
|
||||
assert_allclose(orthogonality(A, x), 0, rtol=0, atol=5e-16)
|
||||
|
||||
def test_rowspace_dense(self):
|
||||
A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
test_points = ([1, 2, 3],
|
||||
[1, 10, 3],
|
||||
[1.12, 10, 0])
|
||||
|
||||
for method in available_dense_methods:
|
||||
_, _, Y = projections(A, method)
|
||||
for z in test_points:
|
||||
# Test if x is solution of A x = z
|
||||
x = Y.matvec(z)
|
||||
assert_array_almost_equal(A.dot(x), z)
|
||||
# Test if x is in the row space of A
|
||||
A_ext = np.vstack((A, x))
|
||||
assert_equal(np.linalg.matrix_rank(A),
|
||||
np.linalg.matrix_rank(A_ext))
|
||||
|
||||
|
||||
class TestOrthogonality(TestCase):
|
||||
|
||||
def test_dense_matrix(self):
|
||||
A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
test_vectors = ([-1.98931144, -1.56363389,
|
||||
-0.84115584, 2.2864762,
|
||||
5.599141, 0.09286976,
|
||||
1.37040802, -0.28145812],
|
||||
[697.92794044, -4091.65114008,
|
||||
-3327.42316335, 836.86906951,
|
||||
99434.98929065, -1285.37653682,
|
||||
-4109.21503806, 2935.29289083])
|
||||
test_expected_orth = (0, 0)
|
||||
|
||||
for i in range(len(test_vectors)):
|
||||
x = test_vectors[i]
|
||||
orth = test_expected_orth[i]
|
||||
assert_array_almost_equal(orthogonality(A, x), orth)
|
||||
|
||||
def test_sparse_matrix(self):
|
||||
A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
|
||||
[0, 8, 7, 0, 1, 5, 9, 0],
|
||||
[1, 0, 0, 0, 0, 1, 2, 3]])
|
||||
A = csc_matrix(A)
|
||||
test_vectors = ([-1.98931144, -1.56363389,
|
||||
-0.84115584, 2.2864762,
|
||||
5.599141, 0.09286976,
|
||||
1.37040802, -0.28145812],
|
||||
[697.92794044, -4091.65114008,
|
||||
-3327.42316335, 836.86906951,
|
||||
99434.98929065, -1285.37653682,
|
||||
-4109.21503806, 2935.29289083])
|
||||
test_expected_orth = (0, 0)
|
||||
|
||||
for i in range(len(test_vectors)):
|
||||
x = test_vectors[i]
|
||||
orth = test_expected_orth[i]
|
||||
assert_array_almost_equal(orthogonality(A, x), orth)
|
||||
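The projection tests above check three operators returned by `projections`: a null-space projector `Z`, a least-squares operator `LS`, and a row-space operator `Y`. As a rough illustration only, the dense sketch below builds the same three maps from the normal-equations formulas, assuming `A` has full row rank. The helper name `dense_projections` is hypothetical and this is not how scipy implements them.

```python
import numpy as np


def dense_projections(A):
    """Return (null_space, least_squares, row_space) callables for a dense,
    full-row-rank matrix A, using the normal equations A A^T.

    Hypothetical helper for illustration; not part of scipy.
    """
    AAt = A @ A.T

    def null_space(z):
        # z - A^T (A A^T)^{-1} A z: removes the row-space component of z.
        return z - A.T @ np.linalg.solve(AAt, A @ z)

    def least_squares(z):
        # Minimizer of ||A^T x - z||_2, i.e. x = (A A^T)^{-1} A z.
        return np.linalg.solve(AAt, A @ z)

    def row_space(w):
        # Minimum-norm solution of A x = w, i.e. x = A^T (A A^T)^{-1} w.
        return A.T @ np.linalg.solve(AAt, w)

    return null_space, least_squares, row_space


A = np.array([[1, 2, 3, 4, 0, 5, 0, 7],
              [0, 8, 7, 0, 1, 5, 9, 0],
              [1, 0, 0, 0, 0, 1, 2, 3]], dtype=float)
Z, LS, Y = dense_projections(A)

z = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
w = np.array([1.12, 10, 0])

x_null = Z(z)   # satisfies A @ x_null ~ 0
x_ls = LS(z)    # least-squares solution of A.T @ x = z
x_row = Y(w)    # satisfies A @ x_row ~ w
```

The checks mirror the assertions in the tests: `A @ Z(z)` should vanish, `LS(z)` should agree with a generic least-squares solver applied to `A.T`, and `A @ Y(w)` should reproduce `w`.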
@ -0,0 +1,645 @@
|
||||
import numpy as np
|
||||
from scipy.sparse import csc_matrix
|
||||
from scipy.optimize._trustregion_constr.qp_subproblem \
|
||||
import (eqp_kktfact,
|
||||
projected_cg,
|
||||
box_intersections,
|
||||
sphere_intersections,
|
||||
box_sphere_intersections,
|
||||
modified_dogleg)
|
||||
from scipy.optimize._trustregion_constr.projections \
|
||||
import projections
|
||||
from numpy.testing import TestCase, assert_array_almost_equal, assert_equal
|
||||
import pytest
|
||||
|
||||
|
||||
class TestEQPDirectFactorization(TestCase):
|
||||
|
||||
# From Example 16.2 Nocedal/Wright "Numerical
|
||||
# Optimization" p.452.
|
||||
def test_nocedal_example(self):
|
||||
H = csc_matrix([[6, 2, 1],
|
||||
[2, 5, 2],
|
||||
[1, 2, 4]])
|
||||
A = csc_matrix([[1, 0, 1],
|
||||
[0, 1, 1]])
|
||||
c = np.array([-8, -3, -3])
|
||||
b = -np.array([3, 0])
|
||||
x, lagrange_multipliers = eqp_kktfact(H, c, A, b)
|
||||
assert_array_almost_equal(x, [2, -1, 1])
|
||||
assert_array_almost_equal(lagrange_multipliers, [3, -2])
|
||||
|
||||
|
||||
class TestSphericalBoundariesIntersections(TestCase):
|
||||
|
||||
def test_2d_sphere_constraints(self):
|
||||
# Interior initial point
|
||||
ta, tb, intersect = sphere_intersections([0, 0],
|
||||
[1, 0], 0.5)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# No intersection between line and circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[0, 1], 1)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Outside initial point pointing toward outside the circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[1, 0], 1)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Outside initial point pointing toward inside the circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[-1, 0], 1.5)
|
||||
assert_array_almost_equal([ta, tb], [0.5, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Initial point on the boundary
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[1, 0], 2)
|
||||
assert_array_almost_equal([ta, tb], [0, 0])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
def test_2d_sphere_constraints_line_intersections(self):
|
||||
# Interior initial point
|
||||
ta, tb, intersect = sphere_intersections([0, 0],
|
||||
[1, 0], 0.5,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-0.5, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# No intersection between line and circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[0, 1], 1,
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Outside initial point pointing toward outside the circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[1, 0], 1,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-3, -1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Outside initial point pointing toward inside the circle
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[-1, 0], 1.5,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0.5, 3.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Initial point on the boundary
|
||||
ta, tb, intersect = sphere_intersections([2, 0],
|
||||
[1, 0], 2,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-4, 0])
|
||||
assert_equal(intersect, True)
|
||||
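The sphere-intersection cases above can be reproduced by solving the quadratic `||z + t*d||^2 = radius^2` for `t`. The pure-Python sketch below is an assumption about the underlying geometry, not scipy's `sphere_intersections` itself. The function name `line_sphere_interval` is hypothetical, and the values returned when there is no intersection are a choice made here.

```python
import math


def line_sphere_interval(z, d, radius, entire_line=False):
    """Return (ta, tb, intersect) such that ||z + t*d|| <= radius for
    ta <= t <= tb. With entire_line=False, t is restricted to [0, 1].

    Hypothetical helper for illustration; not part of scipy.
    """
    a = sum(di * di for di in d)
    b = 2 * sum(zi * di for zi, di in zip(z, d))
    c = sum(zi * zi for zi in z) - radius ** 2
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        # Degenerate direction, or the line misses the sphere entirely.
        return 0.0, 0.0, False
    root = math.sqrt(disc)
    ta = (-b - root) / (2 * a)
    tb = (-b + root) / (2 * a)
    if not entire_line:
        if tb < 0 or ta > 1:
            # The sphere is crossed only outside the segment 0 <= t <= 1.
            return 0.0, 0.0, False
        ta = max(0.0, ta)
        tb = min(1.0, tb)
    return ta, tb, True


inside = line_sphere_interval([0, 0], [1, 0], 0.5)
miss = line_sphere_interval([2, 0], [0, 1], 1)
away = line_sphere_interval([2, 0], [1, 0], 1)
away_line = line_sphere_interval([2, 0], [1, 0], 1, entire_line=True)
toward = line_sphere_interval([2, 0], [-1, 0], 1.5)
on_boundary = line_sphere_interval([2, 0], [1, 0], 2)
```

Clipping to `[0, 1]` is what separates the segment tests from the `entire_line=True` tests: the roots of the quadratic are the same, only the admissible range of `t` changes.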
|
||||
|
||||
class TestBoxBoundariesIntersections(TestCase):
|
||||
|
||||
def test_2d_box_constraints(self):
|
||||
# Box constraint in the direction of vector d
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[1, 1], [3, 3])
|
||||
assert_array_almost_equal([ta, tb], [0.5, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Negative direction
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[1, -3], [3, -1])
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Some constraints are absent (set to +/- inf)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-np.inf, 1],
|
||||
[np.inf, np.inf])
|
||||
assert_array_almost_equal([ta, tb], [0.5, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Intersect on the face of the box
|
||||
ta, tb, intersect = box_intersections([1, 0], [0, 1],
|
||||
[1, 1], [3, 3])
|
||||
assert_array_almost_equal([ta, tb], [1, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Interior initial point
|
||||
ta, tb, intersect = box_intersections([0, 0], [4, 4],
|
||||
[-2, -3], [3, 2])
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# No intersection between line and box constraints
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, -3], [-1, -1])
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, 3], [-1, 1])
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, -np.inf],
|
||||
[-1, np.inf])
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([0, 0], [1, 100],
|
||||
[1, 1], [3, 3])
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([0.99, 0], [0, 2],
|
||||
[1, 1], [3, 3])
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Initial point on the boundary
|
||||
ta, tb, intersect = box_intersections([2, 2], [0, 1],
|
||||
[-2, -2], [2, 2])
|
||||
assert_array_almost_equal([ta, tb], [0, 0])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
def test_2d_box_constraints_entire_line(self):
|
||||
# Box constraint in the direction of vector d
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[1, 1], [3, 3],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0.5, 1.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Negative direction
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[1, -3], [3, -1],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-1.5, -0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Some constraints are absent (set to +/- inf)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-np.inf, 1],
|
||||
[np.inf, np.inf],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0.5, np.inf])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Intersect on the face of the box
|
||||
ta, tb, intersect = box_intersections([1, 0], [0, 1],
|
||||
[1, 1], [3, 3],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [1, 3])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Interior initial point
|
||||
ta, tb, intersect = box_intersections([0, 0], [4, 4],
|
||||
[-2, -3], [3, 2],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-0.5, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# No intersection between line and box constraints
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, -3], [-1, -1],
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, 3], [-1, 1],
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([2, 0], [0, 2],
|
||||
[-3, -np.inf],
|
||||
[-1, np.inf],
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([0, 0], [1, 100],
|
||||
[1, 1], [3, 3],
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_intersections([0.99, 0], [0, 2],
|
||||
[1, 1], [3, 3],
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Initial point on the boundary
|
||||
ta, tb, intersect = box_intersections([2, 2], [0, 1],
|
||||
[-2, -2], [2, 2],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-4, 0])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
def test_3d_box_constraints(self):
|
||||
# Simple case
|
||||
ta, tb, intersect = box_intersections([1, 1, 0], [0, 0, 1],
|
||||
[1, 1, 1], [3, 3, 3])
|
||||
assert_array_almost_equal([ta, tb], [1, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Negative direction
|
||||
ta, tb, intersect = box_intersections([1, 1, 0], [0, 0, -1],
|
||||
[1, 1, 1], [3, 3, 3])
|
||||
assert_equal(intersect, False)
|
||||
|
||||
# Interior point
|
||||
ta, tb, intersect = box_intersections([2, 2, 2], [0, -1, 1],
|
||||
[1, 1, 1], [3, 3, 3])
|
||||
assert_array_almost_equal([ta, tb], [0, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
def test_3d_box_constraints_entire_line(self):
|
||||
# Simple case
|
||||
ta, tb, intersect = box_intersections([1, 1, 0], [0, 0, 1],
|
||||
[1, 1, 1], [3, 3, 3],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [1, 3])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Negative direction
|
||||
ta, tb, intersect = box_intersections([1, 1, 0], [0, 0, -1],
|
||||
[1, 1, 1], [3, 3, 3],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-3, -1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Interior point
|
||||
ta, tb, intersect = box_intersections([2, 2, 2], [0, -1, 1],
|
||||
[1, 1, 1], [3, 3, 3],
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [-1, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
|
||||
class TestBoxSphereBoundariesIntersections(TestCase):
|
||||
|
||||
def test_2d_box_constraints(self):
|
||||
# Both constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-2, 2],
|
||||
[-1, -2], [1, 2], 2,
|
||||
entire_line=False)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# None of the constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-1, 1],
|
||||
[-1, -3], [1, 3], 10,
|
||||
entire_line=False)
|
||||
assert_array_almost_equal([ta, tb], [0, 1])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Box constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[-1, -3], [1, 3], 10,
|
||||
entire_line=False)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Spherical constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[-1, -3], [1, 3], 2,
|
||||
entire_line=False)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.25])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Infeasible problems
|
||||
ta, tb, intersect = box_sphere_intersections([2, 2], [-4, 4],
|
||||
[-1, -3], [1, 3], 2,
|
||||
entire_line=False)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[2, 4], [2, 4], 2,
|
||||
entire_line=False)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
def test_2d_box_constraints_entire_line(self):
|
||||
# Both constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-2, 2],
|
||||
[-1, -2], [1, 2], 2,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# None of the constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-1, 1],
|
||||
[-1, -3], [1, 3], 10,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0, 2])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Box constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[-1, -3], [1, 3], 10,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.5])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Spherical constraints are active
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[-1, -3], [1, 3], 2,
|
||||
entire_line=True)
|
||||
assert_array_almost_equal([ta, tb], [0, 0.25])
|
||||
assert_equal(intersect, True)
|
||||
|
||||
# Infeasible problems
|
||||
ta, tb, intersect = box_sphere_intersections([2, 2], [-4, 4],
|
||||
[-1, -3], [1, 3], 2,
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
ta, tb, intersect = box_sphere_intersections([1, 1], [-4, 4],
|
||||
[2, 4], [2, 4], 2,
|
||||
entire_line=True)
|
||||
assert_equal(intersect, False)
|
||||
|
||||
|
||||
class TestModifiedDogleg(TestCase):
|
||||
|
||||
def test_cauchypoint_equalsto_newtonpoint(self):
|
||||
A = np.array([[1, 8]])
|
||||
b = np.array([-16])
|
||||
_, _, Y = projections(A)
|
||||
newton_point = np.array([0.24615385, 1.96923077])
|
||||
|
||||
# Newton point inside boundaries
|
||||
x = modified_dogleg(A, Y, b, 2, [-np.inf, -np.inf], [np.inf, np.inf])
|
||||
assert_array_almost_equal(x, newton_point)
|
||||
|
||||
# Spherical constraint active
|
||||
x = modified_dogleg(A, Y, b, 1, [-np.inf, -np.inf], [np.inf, np.inf])
|
||||
assert_array_almost_equal(x, newton_point/np.linalg.norm(newton_point))
|
||||
|
||||
# Box constraints active
|
||||
x = modified_dogleg(A, Y, b, 2, [-np.inf, -np.inf], [0.1, np.inf])
|
||||
assert_array_almost_equal(x, (newton_point/newton_point[0]) * 0.1)
|
||||
|
||||
def test_3d_example(self):
|
||||
A = np.array([[1, 8, 1],
|
||||
[4, 2, 2]])
|
||||
b = np.array([-16, 2])
|
||||
Z, LS, Y = projections(A)
|
||||
|
||||
newton_point = np.array([-1.37090909, 2.23272727, -0.49090909])
|
||||
cauchy_point = np.array([0.11165723, 1.73068711, 0.16748585])
|
||||
origin = np.zeros_like(newton_point)
|
||||
|
||||
# newton_point inside boundaries
|
||||
x = modified_dogleg(A, Y, b, 3, [-np.inf, -np.inf, -np.inf],
|
||||
[np.inf, np.inf, np.inf])
|
||||
assert_array_almost_equal(x, newton_point)
|
||||
|
||||
# line between cauchy_point and newton_point contains best point
|
||||
# (spherical constraint is active).
|
||||
x = modified_dogleg(A, Y, b, 2, [-np.inf, -np.inf, -np.inf],
|
||||
[np.inf, np.inf, np.inf])
|
||||
z = cauchy_point
|
||||
d = newton_point-cauchy_point
|
||||
t = ((x-z)/(d))
|
||||
assert_array_almost_equal(t, np.full(3, 0.40807330))
|
||||
assert_array_almost_equal(np.linalg.norm(x), 2)
|
||||
|
||||
# line between cauchy_point and newton_point contains best point
|
||||
# (box constraint is active).
|
||||
x = modified_dogleg(A, Y, b, 5, [-1, -np.inf, -np.inf],
|
||||
[np.inf, np.inf, np.inf])
|
||||
z = cauchy_point
|
||||
d = newton_point-cauchy_point
|
||||
t = ((x-z)/(d))
|
||||
assert_array_almost_equal(t, np.full(3, 0.7498195))
|
||||
assert_array_almost_equal(x[0], -1)
|
||||
|
||||
# line between origin and cauchy_point contains best point
|
||||
# (spherical constraint is active).
|
||||
x = modified_dogleg(A, Y, b, 1, [-np.inf, -np.inf, -np.inf],
|
||||
[np.inf, np.inf, np.inf])
|
||||
z = origin
|
||||
d = cauchy_point
|
||||
t = ((x-z)/(d))
|
||||
assert_array_almost_equal(t, np.full(3, 0.573936265))
|
||||
assert_array_almost_equal(np.linalg.norm(x), 1)
|
||||
|
||||
# line between origin and newton_point contains best point
|
||||
# (box constraint is active).
|
||||
x = modified_dogleg(A, Y, b, 2, [-np.inf, -np.inf, -np.inf],
|
||||
[np.inf, 1, np.inf])
|
||||
z = origin
|
||||
d = newton_point
|
||||
t = ((x-z)/(d))
|
||||
assert_array_almost_equal(t, np.full(3, 0.4478827364))
|
||||
assert_array_almost_equal(x[1], 1)
|
||||
|
||||
|
||||
class TestProjectCG(TestCase):
|
||||
|
||||
# From Example 16.2 Nocedal/Wright "Numerical
|
||||
# Optimization" p.452.
|
||||
def test_nocedal_example(self):
|
||||
H = csc_matrix([[6, 2, 1],
|
||||
[2, 5, 2],
|
||||
[1, 2, 4]])
|
||||
A = csc_matrix([[1, 0, 1],
|
||||
[0, 1, 1]])
|
||||
c = np.array([-8, -3, -3])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b)
|
||||
assert_equal(info["stop_cond"], 4)
|
||||
assert_equal(info["hits_boundary"], False)
|
||||
assert_array_almost_equal(x, [2, -1, 1])
|
||||
|
||||
def test_compare_with_direct_fact(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b, tol=0)
|
||||
x_kkt, _ = eqp_kktfact(H, c, A, b)
|
||||
assert_equal(info["stop_cond"], 1)
|
||||
assert_equal(info["hits_boundary"], False)
|
||||
assert_array_almost_equal(x, x_kkt)
|
||||
|
||||
def test_trust_region_infeasible(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
trust_radius = 1
|
||||
Z, _, Y = projections(A)
|
||||
with pytest.raises(ValueError):
|
||||
projected_cg(H, c, Z, Y, b, trust_radius=trust_radius)
|
||||
|
||||
def test_trust_region_barely_feasible(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
trust_radius = 2.32379000772445021283
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
trust_radius=trust_radius)
|
||||
assert_equal(info["stop_cond"], 2)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(np.linalg.norm(x), trust_radius)
|
||||
assert_array_almost_equal(x, -Y.dot(b))
|
||||
|
||||
def test_hits_boundary(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
trust_radius = 3
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
trust_radius=trust_radius)
|
||||
assert_equal(info["stop_cond"], 2)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(np.linalg.norm(x), trust_radius)
|
||||
|
||||
def test_negative_curvature_unconstrained(self):
|
||||
H = csc_matrix([[1, 2, 1, 3],
|
||||
[2, 0, 2, 4],
|
||||
[1, 2, 0, 2],
|
||||
[3, 4, 2, 0]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 0, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
with pytest.raises(ValueError):
|
||||
projected_cg(H, c, Z, Y, b, tol=0)
|
||||
|
||||
def test_negative_curvature(self):
|
||||
H = csc_matrix([[1, 2, 1, 3],
|
||||
[2, 0, 2, 4],
|
||||
[1, 2, 0, 2],
|
||||
[3, 4, 2, 0]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 0, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
trust_radius = 1000
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
trust_radius=trust_radius)
|
||||
assert_equal(info["stop_cond"], 3)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(np.linalg.norm(x), trust_radius)
|
||||
|
||||
# The box constraints are inactive at the solution but
|
||||
# are active during the iterations.
|
||||
def test_inactive_box_constraints(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
lb=[0.5, -np.inf,
|
||||
-np.inf, -np.inf],
|
||||
return_all=True)
|
||||
x_kkt, _ = eqp_kktfact(H, c, A, b)
|
||||
assert_equal(info["stop_cond"], 1)
|
||||
assert_equal(info["hits_boundary"], False)
|
||||
assert_array_almost_equal(x, x_kkt)
|
||||
|
||||
# The box constraints are active and the termination is
|
||||
# by maximum iterations (infeasible iteration).
|
||||
def test_active_box_constraints_maximum_iterations_reached(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
lb=[0.8, -np.inf,
|
||||
-np.inf, -np.inf],
|
||||
return_all=True)
|
||||
assert_equal(info["stop_cond"], 1)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(A.dot(x), -b)
|
||||
assert_array_almost_equal(x[0], 0.8)
|
||||
|
||||
# The box constraints are active and the termination is
|
||||
# because it hits boundary (without infeasible iteration).
|
||||
def test_active_box_constraints_hits_boundaries(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
trust_radius = 3
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
ub=[np.inf, np.inf, 1.6, np.inf],
|
||||
trust_radius=trust_radius,
|
||||
return_all=True)
|
||||
assert_equal(info["stop_cond"], 2)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(x[2], 1.6)
|
||||
|
||||
# The box constraints are active and the termination is
|
||||
# because it hits boundary (infeasible iteration).
|
||||
def test_active_box_constraints_hits_boundaries_infeasible_iter(self):
|
||||
H = csc_matrix([[6, 2, 1, 3],
|
||||
[2, 5, 2, 4],
|
||||
[1, 2, 4, 5],
|
||||
[3, 4, 5, 7]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 1, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
trust_radius = 4
|
||||
Z, _, Y = projections(A)
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
ub=[np.inf, 0.1, np.inf, np.inf],
|
||||
trust_radius=trust_radius,
|
||||
return_all=True)
|
||||
assert_equal(info["stop_cond"], 2)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(x[1], 0.1)
|
||||
|
||||
# The box constraints are active and the termination is
|
||||
# because of negative curvature (no infeasible iteration).
|
||||
def test_active_box_constraints_negative_curvature(self):
|
||||
H = csc_matrix([[1, 2, 1, 3],
|
||||
[2, 0, 2, 4],
|
||||
[1, 2, 0, 2],
|
||||
[3, 4, 2, 0]])
|
||||
A = csc_matrix([[1, 0, 1, 0],
|
||||
[0, 1, 0, 1]])
|
||||
c = np.array([-2, -3, -3, 1])
|
||||
b = -np.array([3, 0])
|
||||
Z, _, Y = projections(A)
|
||||
trust_radius = 1000
|
||||
x, info = projected_cg(H, c, Z, Y, b,
|
||||
tol=0,
|
||||
ub=[np.inf, np.inf, 100, np.inf],
|
||||
trust_radius=trust_radius)
|
||||
assert_equal(info["stop_cond"], 3)
|
||||
assert_equal(info["hits_boundary"], True)
|
||||
assert_array_almost_equal(x[2], 100)
|
||||
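Several tests above compare `projected_cg` against `eqp_kktfact` on the equality-constrained QP `min 0.5*x'Hx + c'x` subject to `A x + b = 0`. As a minimal dense sketch, the snippet below solves the same KKT system with NumPy for the Nocedal/Wright example used in the tests. The helper name `solve_eqp_kkt` is hypothetical. The sign convention for the multipliers is an assumption chosen so that the result matches the values asserted in `test_nocedal_example`.

```python
import numpy as np


def solve_eqp_kkt(H, c, A, b):
    """Solve min 0.5*x'Hx + c'x  s.t.  A x + b = 0 through the KKT system

        [H  A^T] [x]   [-c]
        [A   0 ] [v] = [-b]

    and return (x, -v). Hypothetical dense helper; not part of scipy.
    """
    n = H.shape[0]
    m = A.shape[0]
    kkt = np.block([[H, A.T],
                    [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, -b])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], -sol[n:]


H = np.array([[6.0, 2.0, 1.0],
              [2.0, 5.0, 2.0],
              [1.0, 2.0, 4.0]])
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
c = np.array([-8.0, -3.0, -3.0])
b = -np.array([3.0, 0.0])

x, multipliers = solve_eqp_kkt(H, c, A, b)
```

A direct factorization like this is only practical for small dense problems. The projected CG tests exercise the iterative alternative, which needs only matrix-vector products with `H` and the projection operators.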
@ -0,0 +1,34 @@
|
||||
import pytest
|
||||
import numpy as np
|
||||
from scipy.optimize import minimize, Bounds
|
||||
|
||||
def test_gh10880():
|
||||
# checks that verbose reporting works with trust-constr for
|
||||
# bound-constrained problems
|
||||
bnds = Bounds(1, 2)
|
||||
opts = {'maxiter': 1000, 'verbose': 2}
|
||||
minimize(lambda x: x**2, x0=2., method='trust-constr',
|
||||
bounds=bnds, options=opts)
|
||||
|
||||
opts = {'maxiter': 1000, 'verbose': 3}
|
||||
minimize(lambda x: x**2, x0=2., method='trust-constr',
|
||||
bounds=bnds, options=opts)
|
||||
|
||||
@pytest.mark.xslow
|
||||
def test_gh12922():
|
||||
# checks that verbose reporting works with trust-constr for
|
||||
# general constraints
|
||||
def objective(x):
|
||||
return np.array([(np.sum((x+1)**4))])
|
||||
|
||||
cons = {'type': 'ineq', 'fun': lambda x: -x[0]**2}
|
||||
n = 25
|
||||
x0 = np.linspace(-5, 5, n)
|
||||
|
||||
opts = {'maxiter': 1000, 'verbose': 2}
|
||||
minimize(objective, x0=x0, method='trust-constr',
|
||||
constraints=cons, options=opts)
|
||||
|
||||
opts = {'maxiter': 1000, 'verbose': 3}
|
||||
minimize(objective, x0=x0, method='trust-constr',
|
||||
constraints=cons, options=opts)
|
||||
@ -0,0 +1,346 @@
|
||||
"""Trust-region interior point method.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Byrd, Richard H., Mary E. Hribar, and Jorge Nocedal.
|
||||
"An interior point algorithm for large-scale nonlinear
|
||||
programming." SIAM Journal on Optimization 9.4 (1999): 877-900.
|
||||
.. [2] Byrd, Richard H., Guanghui Liu, and Jorge Nocedal.
|
||||
"On the local behavior of an interior point method for
|
||||
nonlinear programming." Numerical analysis 1997 (1997): 37-56.
|
||||
.. [3] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
|
||||
Second Edition (2006).
|
||||
"""
|
||||
|
||||
import scipy.sparse as sps
|
||||
import numpy as np
|
||||
from .equality_constrained_sqp import equality_constrained_sqp
|
||||
from scipy.sparse.linalg import LinearOperator
|
||||
|
||||
__all__ = ['tr_interior_point']
|
||||
|
||||
|
||||
class BarrierSubproblem:
|
||||
"""
|
||||
Barrier optimization problem:
|
||||
minimize fun(x) - barrier_parameter*sum(log(s))
|
||||
subject to: constr_eq(x) = 0
|
||||
constr_ineq(x) + s = 0
|
||||
"""
|
||||
|
||||
def __init__(self, x0, s0, fun, grad, lagr_hess, n_vars, n_ineq, n_eq,
|
||||
constr, jac, barrier_parameter, tolerance,
|
||||
enforce_feasibility, global_stop_criteria,
|
||||
xtol, fun0, grad0, constr_ineq0, jac_ineq0, constr_eq0,
|
||||
jac_eq0):
|
||||
# Store parameters
|
||||
self.n_vars = n_vars
|
||||
self.x0 = x0
|
||||
self.s0 = s0
|
||||
self.fun = fun
|
||||
self.grad = grad
|
||||
self.lagr_hess = lagr_hess
|
||||
self.constr = constr
|
||||
self.jac = jac
|
||||
self.barrier_parameter = barrier_parameter
|
||||
self.tolerance = tolerance
|
||||
self.n_eq = n_eq
|
||||
self.n_ineq = n_ineq
|
||||
self.enforce_feasibility = enforce_feasibility
|
||||
self.global_stop_criteria = global_stop_criteria
|
||||
self.xtol = xtol
|
||||
self.fun0 = self._compute_function(fun0, constr_ineq0, s0)
|
||||
self.grad0 = self._compute_gradient(grad0)
|
||||
self.constr0 = self._compute_constr(constr_ineq0, constr_eq0, s0)
|
||||
self.jac0 = self._compute_jacobian(jac_eq0, jac_ineq0, s0)
|
||||
self.terminate = False
|
||||
|
||||
def update(self, barrier_parameter, tolerance):
|
||||
self.barrier_parameter = barrier_parameter
|
||||
self.tolerance = tolerance
|
||||
|
||||
def get_slack(self, z):
|
||||
return z[self.n_vars:self.n_vars+self.n_ineq]
|
||||
|
||||
def get_variables(self, z):
|
||||
return z[:self.n_vars]
|
||||
|
||||
def function_and_constraints(self, z):
|
||||
"""Returns barrier function and constraints at given point.
|
||||
|
||||
For z = [x, s], returns barrier function:
|
||||
function(z) = fun(x) - barrier_parameter*sum(log(s))
|
||||
and barrier constraints:
|
||||
constraints(z) = [ constr_eq(x) ]
|
||||
[ constr_ineq(x) + s ]
|
||||
|
||||
"""
|
||||
# Get variables and slack variables
|
||||
x = self.get_variables(z)
|
||||
s = self.get_slack(z)
|
||||
# Compute function and constraints
|
||||
f = self.fun(x)
|
||||
c_eq, c_ineq = self.constr(x)
|
||||
# Return objective function and constraints
|
||||
return (self._compute_function(f, c_ineq, s),
|
||||
self._compute_constr(c_ineq, c_eq, s))
|
||||
|
||||
def _compute_function(self, f, c_ineq, s):
|
||||
# Use technique from Nocedal and Wright book, ref [3]_, p.576,
|
||||
# to guarantee constraints from `enforce_feasibility`
|
||||
# stay feasible along iterations.
|
||||
s[self.enforce_feasibility] = -c_ineq[self.enforce_feasibility]
|
||||
log_s = [np.log(s_i) if s_i > 0 else -np.inf for s_i in s]
|
||||
# Compute barrier objective function
|
||||
return f - self.barrier_parameter*np.sum(log_s)
|
||||
|
||||
def _compute_constr(self, c_ineq, c_eq, s):
|
||||
# Compute barrier constraint
|
||||
return np.hstack((c_eq,
|
||||
c_ineq + s))
|
||||
|
||||
def scaling(self, z):
|
||||
"""Returns scaling vector.
|
||||
Given by:
|
||||
scaling = [ones(n_vars), s]
|
||||
"""
|
||||
s = self.get_slack(z)
|
||||
diag_elements = np.hstack((np.ones(self.n_vars), s))
|
||||
|
||||
# Diagonal matrix
|
||||
def matvec(vec):
|
||||
return diag_elements*vec
|
||||
return LinearOperator((self.n_vars+self.n_ineq,
|
||||
self.n_vars+self.n_ineq),
|
||||
matvec)
|
||||
|
||||
def gradient_and_jacobian(self, z):
|
||||
"""Returns scaled gradient and Jacobian.
|
||||
|
||||
Return scaled gradient:
|
||||
gradient = [ grad(x) ]
|
||||
[ -barrier_parameter*ones(n_ineq) ]
|
||||
and scaled Jacobian matrix:
|
||||
jacobian = [ jac_eq(x) 0 ]
|
||||
[ jac_ineq(x) S ]
|
||||
Both of them scaled by the previously defined scaling factor.
|
||||
"""
|
||||
# Get variables and slack variables
|
||||
x = self.get_variables(z)
|
||||
s = self.get_slack(z)
|
||||
# Compute first derivatives
|
||||
g = self.grad(x)
|
||||
J_eq, J_ineq = self.jac(x)
|
||||
# Return gradient and Jacobian
|
||||
return (self._compute_gradient(g),
|
||||
self._compute_jacobian(J_eq, J_ineq, s))
|
||||
|
||||
def _compute_gradient(self, g):
|
||||
return np.hstack((g, -self.barrier_parameter*np.ones(self.n_ineq)))
|
||||
|
||||
def _compute_jacobian(self, J_eq, J_ineq, s):
|
||||
if self.n_ineq == 0:
|
||||
return J_eq
|
||||
else:
|
||||
if sps.issparse(J_eq) or sps.issparse(J_ineq):
|
||||
# It is expected that J_eq and J_ineq
|
||||
# are already `csr_matrix` because of
|
||||
# the way ``BoxConstraint``, ``NonlinearConstraint``
|
||||
# and ``LinearConstraint`` are defined.
|
||||
J_eq = sps.csr_matrix(J_eq)
|
||||
J_ineq = sps.csr_matrix(J_ineq)
|
||||
return self._assemble_sparse_jacobian(J_eq, J_ineq, s)
|
||||
else:
|
||||
S = np.diag(s)
|
||||
zeros = np.zeros((self.n_eq, self.n_ineq))
|
||||
# Convert to matrix
|
||||
if sps.issparse(J_ineq):
|
||||
J_ineq = J_ineq.toarray()
|
||||
if sps.issparse(J_eq):
|
||||
J_eq = J_eq.toarray()
|
||||
# Concatenate matrices
|
||||
return np.block([[J_eq, zeros],
|
||||
[J_ineq, S]])
|
||||
|
||||
def _assemble_sparse_jacobian(self, J_eq, J_ineq, s):
|
||||
"""Assemble sparse Jacobian given its components.
|
||||
|
||||
Given ``J_eq``, ``J_ineq`` and ``s`` returns:
|
||||
jacobian = [ J_eq, 0 ]
|
||||
[ J_ineq, diag(s) ]
|
||||
|
||||
It is equivalent to:
|
||||
sps.bmat([[ J_eq, None ],
|
||||
[ J_ineq, diag(s) ]], "csr")
|
||||
but significantly more efficient for this
|
||||
given structure.
|
||||
"""
|
||||
n_vars, n_ineq, n_eq = self.n_vars, self.n_ineq, self.n_eq
|
||||
J_aux = sps.vstack([J_eq, J_ineq], "csr")
|
||||
indptr, indices, data = J_aux.indptr, J_aux.indices, J_aux.data
|
||||
new_indptr = indptr + np.hstack((np.zeros(n_eq, dtype=int),
|
||||
np.arange(n_ineq+1, dtype=int)))
|
||||
size = indices.size+n_ineq
|
||||
new_indices = np.empty(size)
|
||||
new_data = np.empty(size)
|
||||
mask = np.full(size, False, bool)
|
||||
mask[new_indptr[-n_ineq:]-1] = True
|
||||
new_indices[mask] = n_vars+np.arange(n_ineq)
|
||||
new_indices[~mask] = indices
|
||||
new_data[mask] = s
|
||||
new_data[~mask] = data
|
||||
J = sps.csr_matrix((new_data, new_indices, new_indptr),
|
||||
(n_eq + n_ineq, n_vars + n_ineq))
|
||||
return J
|
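The row-pointer trick used by `_assemble_sparse_jacobian` can be checked against `sps.bmat` on a small case. The sketch below is a standalone restatement, not the SciPy method itself: the free function `assemble_sparse_jacobian` and the sample matrices are hypothetical, and it assumes `n_ineq > 0`, which is the only case in which the method above is reached.

```python
import numpy as np
import scipy.sparse as sps


def assemble_sparse_jacobian(J_eq, J_ineq, s):
    """Build [[J_eq, 0], [J_ineq, diag(s)]] directly in CSR form."""
    n_eq, n_vars = J_eq.shape
    n_ineq = J_ineq.shape[0]
    J_aux = sps.vstack([J_eq, J_ineq], "csr")
    indptr, indices, data = J_aux.indptr, J_aux.indices, J_aux.data
    # Each inequality row gains exactly one stored entry (its slack),
    # so its end pointer shifts by 1, 2, ..., n_ineq.
    new_indptr = indptr + np.hstack((np.zeros(n_eq, dtype=int),
                                     np.arange(n_ineq + 1, dtype=int)))
    size = indices.size + n_ineq
    new_indices = np.empty(size, dtype=int)
    new_data = np.empty(size)
    # The slack entry is stored last in each inequality row.
    mask = np.zeros(size, dtype=bool)
    mask[new_indptr[-n_ineq:] - 1] = True
    new_indices[mask] = n_vars + np.arange(n_ineq)
    new_indices[~mask] = indices
    new_data[mask] = s
    new_data[~mask] = data
    return sps.csr_matrix((new_data, new_indices, new_indptr),
                          (n_eq + n_ineq, n_vars + n_ineq))


# Hypothetical sample data: 1 equality, 2 inequalities, 3 variables.
J_eq = sps.csr_matrix([[1.0, 0.0, 2.0]])
J_ineq = sps.csr_matrix([[0.0, 3.0, 0.0], [4.0, 0.0, 5.0]])
s = np.array([0.5, 0.25])

J_fast = assemble_sparse_jacobian(J_eq, J_ineq, s)
J_ref = sps.bmat([[J_eq, None], [J_ineq, sps.diags(s)]], "csr")
```

Both constructions give the same dense matrix here; the direct version avoids the intermediate block conversions that `sps.bmat` performs.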
||||
|
||||
def lagrangian_hessian_x(self, z, v):
|
||||
"""Returns Lagrangian Hessian (in relation to `x`) -> Hx"""
|
||||
x = self.get_variables(z)
|
||||
# Get Lagrange multipliers related to nonlinear equality constraints
|
||||
v_eq = v[:self.n_eq]
|
||||
# Get Lagrange multipliers related to nonlinear ineq. constraints
|
||||
v_ineq = v[self.n_eq:self.n_eq+self.n_ineq]
|
||||
lagr_hess = self.lagr_hess
|
||||
return lagr_hess(x, v_eq, v_ineq)
|
||||
|
||||
def lagrangian_hessian_s(self, z, v):
|
||||
"""Returns scaled Lagrangian Hessian (in relation to `s`) -> S Hs S"""
|
||||
s = self.get_slack(z)
|
||||
# Using the primal formulation:
|
||||
# S Hs S = diag(s)*diag(barrier_parameter/s**2)*diag(s).
|
||||
# Reference [1]_ p. 882, formula (3.1)
|
||||
primal = self.barrier_parameter
|
||||
# Using the primal-dual formulation
|
||||
# S Hs S = diag(s)*diag(v/s)*diag(s)
|
||||
# Reference [1]_ p. 883, formula (3.11)
|
||||
primal_dual = v[-self.n_ineq:]*s
|
||||
# Uses the primal-dual formulation for
|
||||
# positive values of v_ineq, and primal
|
||||
# formulation for the remaining ones.
|
||||
return np.where(v[-self.n_ineq:] > 0, primal_dual, primal)
|
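The choice between the two formulations in `lagrangian_hessian_s` reduces to an elementwise selection on the diagonal of `S Hs S`. A minimal sketch follows, using a hypothetical free function `scaled_slack_hessian` and made-up values for `s` and the inequality multipliers:

```python
import numpy as np


def scaled_slack_hessian(s, v_ineq, barrier_parameter):
    """Diagonal of S Hs S, mixing the primal and primal-dual formulas."""
    # Primal: S * diag(mu / s**2) * S = mu * I, so every entry equals mu.
    primal = barrier_parameter
    # Primal-dual: S * diag(v / s) * S = diag(v * s).
    primal_dual = v_ineq * s
    # Primal-dual where the multiplier is positive, primal elsewhere.
    return np.where(v_ineq > 0, primal_dual, primal)


s = np.array([2.0, 0.5, 1.0])
v_ineq = np.array([0.3, -0.1, 0.0])
diag = scaled_slack_hessian(s, v_ineq, barrier_parameter=0.1)
```

Only the first multiplier is positive, so only the first entry uses `v * s`; the other two fall back to the barrier parameter, which keeps the scaled block positive.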
||||
|
||||
def lagrangian_hessian(self, z, v):
|
||||
"""Returns scaled Lagrangian Hessian"""
|
||||
# Compute Hessian in relation to x and s
|
||||
Hx = self.lagrangian_hessian_x(z, v)
|
||||
if self.n_ineq > 0:
|
||||
S_Hs_S = self.lagrangian_hessian_s(z, v)
|
||||
|
||||
# The scaled Lagrangian Hessian is:
|
||||
# [ Hx 0 ]
|
||||
# [ 0 S Hs S ]
|
||||
def matvec(vec):
|
||||
vec_x = self.get_variables(vec)
|
||||
vec_s = self.get_slack(vec)
|
||||
if self.n_ineq > 0:
|
||||
return np.hstack((Hx.dot(vec_x), S_Hs_S*vec_s))
|
||||
else:
|
||||
return Hx.dot(vec_x)
|
||||
return LinearOperator((self.n_vars+self.n_ineq,
|
||||
self.n_vars+self.n_ineq),
|
||||
matvec)
|
||||
|
||||
def stop_criteria(self, state, z, last_iteration_failed,
|
||||
optimality, constr_violation,
|
||||
trust_radius, penalty, cg_info):
|
||||
"""Stopping criteria for the barrier problem.
|
||||
The criterion proposed here is similar to formula (2.3)
|
||||
from [1]_, p.879.
|
||||
"""
|
||||
x = self.get_variables(z)
|
||||
if self.global_stop_criteria(state, x,
|
||||
last_iteration_failed,
|
||||
trust_radius, penalty,
|
||||
cg_info,
|
||||
self.barrier_parameter,
|
||||
self.tolerance):
|
||||
self.terminate = True
|
||||
return True
|
||||
else:
|
||||
g_cond = (optimality < self.tolerance and
|
||||
constr_violation < self.tolerance)
|
||||
x_cond = trust_radius < self.xtol
|
||||
return g_cond or x_cond
|
||||
|
||||
|
||||
def tr_interior_point(fun, grad, lagr_hess, n_vars, n_ineq, n_eq,
|
||||
constr, jac, x0, fun0, grad0,
|
||||
constr_ineq0, jac_ineq0, constr_eq0,
|
||||
jac_eq0, stop_criteria,
|
||||
enforce_feasibility, xtol, state,
|
||||
initial_barrier_parameter,
|
||||
initial_tolerance,
|
||||
initial_penalty,
|
||||
initial_trust_radius,
|
||||
factorization_method):
|
||||
"""Trust-region interior point method.
|
||||
|
||||
Solve problem:
|
||||
minimize fun(x)
|
||||
subject to: constr_ineq(x) <= 0
|
||||
constr_eq(x) = 0
|
||||
using trust-region interior point method described in [1]_.
|
||||
"""
|
||||
# BOUNDARY_PARAMETER controls the decrease on the slack
|
||||
# variables. Represents ``tau`` from [1]_ p.885, formula (3.18).
|
||||
BOUNDARY_PARAMETER = 0.995
|
||||
# BARRIER_DECAY_RATIO controls the decay of the barrier parameter
|
||||
# and of the subproblem tolerance. Represents ``theta`` from [1]_ p.879.
|
||||
BARRIER_DECAY_RATIO = 0.2
|
||||
# TRUST_ENLARGEMENT controls the enlargement on trust radius
|
||||
# after each iteration
|
||||
TRUST_ENLARGEMENT = 5
|
||||
|
||||
# Default enforce_feasibility
|
||||
if enforce_feasibility is None:
|
||||
enforce_feasibility = np.zeros(n_ineq, bool)
|
||||
# Initial Values
|
||||
barrier_parameter = initial_barrier_parameter
|
||||
tolerance = initial_tolerance
|
||||
trust_radius = initial_trust_radius
|
||||
# Define initial value for the slack variables
|
||||
s0 = np.maximum(-1.5*constr_ineq0, np.ones(n_ineq))
|
||||
# Define barrier subproblem
|
||||
subprob = BarrierSubproblem(
|
||||
x0, s0, fun, grad, lagr_hess, n_vars, n_ineq, n_eq, constr, jac,
|
||||
barrier_parameter, tolerance, enforce_feasibility,
|
||||
stop_criteria, xtol, fun0, grad0, constr_ineq0, jac_ineq0,
|
||||
constr_eq0, jac_eq0)
|
||||
# Define initial parameter for the first iteration.
|
||||
z = np.hstack((x0, s0))
|
||||
fun0_subprob, constr0_subprob = subprob.fun0, subprob.constr0
|
||||
grad0_subprob, jac0_subprob = subprob.grad0, subprob.jac0
|
||||
# Define trust region bounds
|
||||
trust_lb = np.hstack((np.full(subprob.n_vars, -np.inf),
|
||||
np.full(subprob.n_ineq, -BOUNDARY_PARAMETER)))
|
||||
trust_ub = np.full(subprob.n_vars+subprob.n_ineq, np.inf)
|
||||
|
||||
# Solves a sequence of barrier problems
|
||||
while True:
|
||||
# Solve SQP subproblem
|
||||
z, state = equality_constrained_sqp(
|
||||
subprob.function_and_constraints,
|
||||
subprob.gradient_and_jacobian,
|
||||
subprob.lagrangian_hessian,
|
||||
z, fun0_subprob, grad0_subprob,
|
||||
constr0_subprob, jac0_subprob, subprob.stop_criteria,
|
||||
state, initial_penalty, trust_radius,
|
||||
factorization_method, trust_lb, trust_ub, subprob.scaling)
|
||||
if subprob.terminate:
|
||||
break
|
||||
# Update parameters
|
||||
trust_radius = max(initial_trust_radius,
|
||||
TRUST_ENLARGEMENT*state.tr_radius)
|
||||
# TODO: Use more advanced strategies from [2]_
|
||||
# to update these parameters.
|
||||
barrier_parameter *= BARRIER_DECAY_RATIO
|
||||
tolerance *= BARRIER_DECAY_RATIO
|
||||
# Update Barrier Problem
|
||||
subprob.update(barrier_parameter, tolerance)
|
||||
# Compute initial values for next iteration
|
||||
fun0_subprob, constr0_subprob = subprob.function_and_constraints(z)
|
||||
grad0_subprob, jac0_subprob = subprob.gradient_and_jacobian(z)
|
||||
|
||||
# Get x and s
|
||||
x = subprob.get_variables(z)
|
||||
return x, state
|
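The outer loop of `tr_interior_point` solves a sequence of barrier problems while shrinking the barrier parameter by a fixed ratio. The toy sketch below shows only that outer structure, on the hypothetical one-dimensional problem "minimize x subject to 1 - x <= 0", whose barrier subproblem has a closed-form minimizer. The SQP solver, the trust region and the real stopping test are all replaced by simplifications.

```python
BARRIER_DECAY_RATIO = 0.2  # same value as in the source above


def barrier_minimizer(mu):
    # With slack s = x - 1 the barrier objective is x - mu*log(x - 1).
    # Setting its derivative 1 - mu/(x - 1) to zero gives x = 1 + mu.
    return 1.0 + mu


def solve_toy_problem(initial_barrier_parameter=0.1, tol=1e-8):
    mu = initial_barrier_parameter
    iterates = []
    while True:
        # Stand-in for the equality-constrained SQP solve.
        x = barrier_minimizer(mu)
        iterates.append((mu, x))
        # Simplified stopping test (an assumption of this sketch).
        if mu < tol:
            break
        mu *= BARRIER_DECAY_RATIO
    return iterates


iterates = solve_toy_problem()
x_final = iterates[-1][1]
```

Each subproblem solution stays strictly inside the feasible region and approaches the constrained minimizer `x = 1` as the barrier parameter decays.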
||||
@ -0,0 +1,122 @@
|
||||
"""Dog-leg trust-region optimization."""
|
||||
import numpy as np
|
||||
import scipy.linalg
|
||||
from ._trustregion import (_minimize_trust_region, BaseQuadraticSubproblem)
|
||||
|
||||
__all__ = []
|
||||
|
||||
|
||||
def _minimize_dogleg(fun, x0, args=(), jac=None, hess=None,
|
||||
**trust_region_options):
|
||||
"""
|
||||
Minimization of scalar function of one or more variables using
|
||||
the dog-leg trust-region algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
initial_trust_radius : float
|
||||
Initial trust-region radius.
|
||||
max_trust_radius : float
|
||||
Maximum value of the trust-region radius. No steps that are longer
|
||||
than this value will be proposed.
|
||||
eta : float
|
||||
Trust region related acceptance stringency for proposed steps.
|
||||
gtol : float
|
||||
Gradient norm must be less than `gtol` before successful
|
||||
termination.
|
||||
|
||||
"""
|
||||
if jac is None:
|
||||
raise ValueError('Jacobian is required for dogleg minimization')
|
||||
if not callable(hess):
|
||||
raise ValueError('Hessian is required for dogleg minimization')
|
||||
return _minimize_trust_region(fun, x0, args=args, jac=jac, hess=hess,
|
||||
subproblem=DoglegSubproblem,
|
||||
**trust_region_options)
|
||||
|
||||
|
||||
class DoglegSubproblem(BaseQuadraticSubproblem):
|
||||
"""Quadratic subproblem solved by the dogleg method"""
|
||||
|
||||
def cauchy_point(self):
|
||||
"""
|
||||
The Cauchy point is minimal along the direction of steepest descent.
|
||||
"""
|
||||
if self._cauchy_point is None:
|
||||
g = self.jac
|
||||
Bg = self.hessp(g)
|
||||
self._cauchy_point = -(np.dot(g, g) / np.dot(g, Bg)) * g
|
||||
return self._cauchy_point
|
||||
|
||||
def newton_point(self):
|
||||
"""
|
||||
The Newton point is a global minimum of the approximate function.
|
||||
"""
|
||||
if self._newton_point is None:
|
||||
g = self.jac
|
||||
B = self.hess
|
||||
cho_info = scipy.linalg.cho_factor(B)
|
||||
self._newton_point = -scipy.linalg.cho_solve(cho_info, g)
|
||||
return self._newton_point
|
||||
|
||||
def solve(self, trust_radius):
|
||||
"""
|
||||
Minimize a function using the dog-leg trust-region algorithm.
|
||||
|
||||
This algorithm requires function values and first and second derivatives.
|
||||
It also performs a costly Hessian decomposition for most iterations,
|
||||
and the Hessian is required to be positive definite.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
trust_radius : float
|
||||
We are allowed to wander only this far away from the origin.
|
||||
|
||||
Returns
|
||||
-------
|
||||
p : ndarray
|
||||
The proposed step.
|
||||
hits_boundary : bool
|
||||
True if the proposed step is on the boundary of the trust region.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The Hessian is required to be positive definite.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Jorge Nocedal and Stephen Wright,
|
||||
Numerical Optimization, second edition,
|
||||
Springer-Verlag, 2006, page 73.
|
||||
"""
|
||||
|
||||
# Compute the Newton point.
|
||||
# This is the optimum for the quadratic model function.
|
||||
# If it is inside the trust radius then return this point.
|
||||
p_best = self.newton_point()
|
||||
if scipy.linalg.norm(p_best) < trust_radius:
|
||||
hits_boundary = False
|
||||
return p_best, hits_boundary
|
||||
|
||||
# Compute the Cauchy point.
|
||||
# This is the predicted optimum along the direction of steepest descent.
|
||||
p_u = self.cauchy_point()
|
||||
|
||||
# If the Cauchy point is outside the trust region,
|
||||
# then return the point where the path intersects the boundary.
|
||||
p_u_norm = scipy.linalg.norm(p_u)
|
||||
if p_u_norm >= trust_radius:
|
||||
p_boundary = p_u * (trust_radius / p_u_norm)
|
||||
hits_boundary = True
|
||||
return p_boundary, hits_boundary
|
||||
|
||||
# Compute the intersection of the trust region boundary
|
||||
# and the line segment connecting the Cauchy and Newton points.
|
||||
# This requires solving a quadratic equation.
|
||||
# ||p_u + t*(p_best - p_u)||**2 == trust_radius**2
|
||||
# Solve this for positive time t using the quadratic formula.
|
||||
_, tb = self.get_boundaries_intersections(p_u, p_best - p_u,
|
||||
trust_radius)
|
||||
p_boundary = p_u + tb * (p_best - p_u)
|
||||
hits_boundary = True
|
||||
return p_boundary, hits_boundary
|
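The three branches of `DoglegSubproblem.solve` can be reproduced with plain NumPy. This is a sketch under stated assumptions, not the SciPy implementation: `dogleg_step` is a hypothetical free function, it uses `np.linalg.solve` instead of a Cholesky factorization, and it solves the boundary quadratic inline rather than calling `get_boundaries_intersections`. The Hessian `B` is assumed positive definite.

```python
import numpy as np


def dogleg_step(g, B, trust_radius):
    """Return (step, hits_boundary) for gradient g and Hessian B."""
    # Newton point: minimizer of the quadratic model.
    p_newton = -np.linalg.solve(B, g)
    if np.linalg.norm(p_newton) < trust_radius:
        return p_newton, False

    # Cauchy point: model minimizer along the steepest descent direction.
    p_cauchy = -(g @ g) / (g @ B @ g) * g
    cauchy_norm = np.linalg.norm(p_cauchy)
    if cauchy_norm >= trust_radius:
        return p_cauchy * (trust_radius / cauchy_norm), True

    # Positive root t of ||p_cauchy + t*d||**2 == trust_radius**2.
    d = p_newton - p_cauchy
    a = d @ d
    b = 2 * (p_cauchy @ d)
    c = p_cauchy @ p_cauchy - trust_radius**2
    t = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return p_cauchy + t * d, True


g = np.array([1.0, 1.0])
B = np.diag([1.0, 10.0])
p_mid, hit_mid = dogleg_step(g, B, 0.5)      # on the dogleg segment
p_in, hit_in = dogleg_step(g, B, 2.0)        # Newton point is interior
p_small, hit_small = dogleg_step(g, B, 0.1)  # scaled steepest descent
```

With these values the Newton point has norm about 1.005 and the Cauchy point about 0.257, so the three radii exercise the three branches in turn.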
||||
@ -0,0 +1,438 @@
|
||||
"""Nearly exact trust-region optimization subproblem."""
|
||||
import numpy as np
|
||||
from scipy.linalg import (norm, get_lapack_funcs, solve_triangular,
|
||||
cho_solve)
|
||||
from ._trustregion import (_minimize_trust_region, BaseQuadraticSubproblem)
|
||||
|
||||
__all__ = ['_minimize_trustregion_exact',
|
||||
'estimate_smallest_singular_value',
|
||||
'singular_leading_submatrix',
|
||||
'IterativeSubproblem']
|
||||
|
||||
|
||||
def _minimize_trustregion_exact(fun, x0, args=(), jac=None, hess=None,
|
||||
**trust_region_options):
|
||||
"""
|
||||
Minimization of scalar function of one or more variables using
|
||||
a nearly exact trust-region algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
initial_trust_radius : float
|
||||
Initial trust-region radius.
|
||||
max_trust_radius : float
|
||||
Maximum value of the trust-region radius. No steps that are longer
|
||||
than this value will be proposed.
|
||||
eta : float
|
||||
Trust region related acceptance stringency for proposed steps.
|
||||
gtol : float
|
||||
Gradient norm must be less than ``gtol`` before successful
|
||||
termination.
|
||||
"""
|
||||
|
||||
if jac is None:
|
||||
raise ValueError('Jacobian is required for trust region '
|
||||
'exact minimization.')
|
||||
if not callable(hess):
|
||||
raise ValueError('Hessian matrix is required for trust region '
|
||||
'exact minimization.')
|
||||
return _minimize_trust_region(fun, x0, args=args, jac=jac, hess=hess,
|
||||
subproblem=IterativeSubproblem,
|
||||
**trust_region_options)
|
||||
|
||||
|
||||
def estimate_smallest_singular_value(U):
|
||||
"""Given upper triangular matrix ``U`` estimate the smallest singular
|
||||
value and the corresponding right singular vector in O(n**2) operations.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
U : ndarray
|
||||
Square upper triangular matrix.
|
||||
|
||||
Returns
|
||||
-------
|
||||
s_min : float
|
||||
Estimated smallest singular value of the provided matrix.
|
||||
z_min : ndarray
|
||||
Estimated right singular vector.
|
||||
|
||||
Notes
|
||||
-----
|
||||
The procedure is based on [1]_ and is done in two steps. First, it finds
|
||||
a vector ``e`` with components selected from {+1, -1} such that the
|
||||
solution ``w`` from the system ``U.T w = e`` is as large as possible.
|
||||
Next, it solves ``U v = w``. The smallest singular value is close
|
||||
to ``norm(w)/norm(v)`` and the right singular vector is close
|
||||
to ``v/norm(v)``.
|
||||
|
||||
The more ill-conditioned the matrix, the better the estimate.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Cline, A. K., Moler, C. B., Stewart, G. W., Wilkinson, J. H.
|
||||
An estimate for the condition number of a matrix. 1979.
|
||||
SIAM Journal on Numerical Analysis, 16(2), 368-375.
|
||||
"""
|
||||
|
||||
U = np.atleast_2d(U)
|
||||
m, n = U.shape
|
||||
|
||||
if m != n:
|
||||
raise ValueError("A square triangular matrix should be provided.")
|
||||
|
||||
# A vector `e` with components selected from {+1, -1}
|
||||
# is selected so that the solution `w` to the system
|
||||
# `U.T w = e` is as large as possible. Implementation
|
||||
# based on algorithm 3.5.1, p. 142, from Golub and Van Loan (2013),
|
||||
# adapted for lower triangular matrix.
|
||||
|
||||
p = np.zeros(n)
|
||||
w = np.empty(n)
|
||||
|
||||
# Implemented according to: Golub, G. H., Van Loan, C. F. (2013).
|
||||
# "Matrix computations". Fourth Edition. JHU press. pp. 140-142.
|
||||
for k in range(n):
|
||||
wp = (1-p[k]) / U.T[k, k]
|
||||
wm = (-1-p[k]) / U.T[k, k]
|
||||
pp = p[k+1:] + U.T[k+1:, k]*wp
|
||||
pm = p[k+1:] + U.T[k+1:, k]*wm
|
||||
|
||||
if abs(wp) + norm(pp, 1) >= abs(wm) + norm(pm, 1):
|
||||
w[k] = wp
|
||||
p[k+1:] = pp
|
||||
else:
|
||||
w[k] = wm
|
||||
p[k+1:] = pm
|
||||
|
||||
# The system `U v = w` is solved using backward substitution.
|
||||
v = solve_triangular(U, w)
|
||||
|
||||
v_norm = norm(v)
|
||||
w_norm = norm(w)
|
||||
|
||||
# Smallest singular value
|
||||
s_min = w_norm / v_norm
|
||||
|
||||
# Associated vector
|
||||
z_min = v / v_norm
|
||||
|
||||
return s_min, z_min
|
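The estimate above can be compared with the exact value from an SVD. The sketch below restates the two triangular solves in a hypothetical standalone function `estimate_sigma_min`, with a made-up ill-conditioned test matrix. One property holds by construction: since `v = inv(U) w`, the ratio `norm(w)/norm(v)` can never fall below the true smallest singular value.

```python
import numpy as np
from scipy.linalg import solve_triangular


def estimate_sigma_min(U):
    """Estimate the smallest singular value of upper triangular U."""
    n = U.shape[0]
    L = U.T  # lower triangular, so L w = e is solved top to bottom
    p = np.zeros(n)
    w = np.empty(n)
    for k in range(n):
        # Try e_k = +1 and e_k = -1 and keep the one giving larger growth.
        wp = (1 - p[k]) / L[k, k]
        wm = (-1 - p[k]) / L[k, k]
        pp = p[k+1:] + L[k+1:, k] * wp
        pm = p[k+1:] + L[k+1:, k] * wm
        if abs(wp) + np.linalg.norm(pp, 1) >= abs(wm) + np.linalg.norm(pm, 1):
            w[k], p[k+1:] = wp, pp
        else:
            w[k], p[k+1:] = wm, pm
    # Back substitution for U v = w.
    v = solve_triangular(U, w)
    v_norm = np.linalg.norm(v)
    return np.linalg.norm(w) / v_norm, v / v_norm


U = np.array([[1.0, 2.0, 3.0],
              [0.0, 0.5, 1.0],
              [0.0, 0.0, 0.01]])
s_est, z_min = estimate_sigma_min(U)
sigma_exact = np.linalg.svd(U, compute_uv=False)[-1]
```

Here `s_est` is an upper bound on `sigma_exact`, and `z_min` is a unit vector with `norm(U @ z_min) == s_est`.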
||||
|
||||
|
||||
def gershgorin_bounds(H):
|
||||
"""
|
||||
Given a square matrix ``H`` compute upper
|
||||
and lower bounds for its eigenvalues (Gershgorin bounds).
|
||||
Defined in ref. [1]_.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] Conn, A. R., Gould, N. I., & Toint, P. L.
|
||||
Trust region methods. 2000. Siam. pp. 19.
|
||||
"""
|
||||
|
||||
H_diag = np.diag(H)
|
||||
H_diag_abs = np.abs(H_diag)
|
||||
H_row_sums = np.sum(np.abs(H), axis=1)
|
||||
lb = np.min(H_diag + H_diag_abs - H_row_sums)
|
||||
ub = np.max(H_diag - H_diag_abs + H_row_sums)
|
||||
|
||||
return lb, ub
|
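The bounds returned by `gershgorin_bounds` are the extreme ends of the Gershgorin discs: each disc is centered at a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. A small check follows, with a hypothetical symmetric matrix and a helper named `gershgorin_interval` that is not part of SciPy:

```python
import numpy as np


def gershgorin_interval(H):
    """Return (lb, ub) enclosing every eigenvalue of the square matrix H."""
    centers = np.diag(H)
    # Off-diagonal absolute row sums: full row sum minus |diagonal|.
    radii = np.sum(np.abs(H), axis=1) - np.abs(centers)
    return np.min(centers - radii), np.max(centers + radii)


H = np.array([[4.0, 1.0, 0.0],
              [1.0, -2.0, 0.5],
              [0.0, 0.5, 1.0]])
lb, ub = gershgorin_interval(H)
eigenvalues = np.linalg.eigvalsh(H)
```

The expression `H_diag + H_diag_abs - H_row_sums` in the source is the same quantity as `centers - radii` here, written without forming the radii explicitly.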
||||
|
||||
|
||||
def singular_leading_submatrix(A, U, k):
|
||||
"""
|
||||
Compute term that makes the leading ``k`` by ``k``
|
||||
submatrix from ``A`` singular.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
A : ndarray
|
||||
Symmetric matrix that is not positive definite.
|
||||
U : ndarray
|
||||
Upper triangular matrix resulting from an incomplete
|
||||
Cholesky decomposition of matrix ``A``.
|
||||
k : int
|
||||
Positive integer such that the leading k by k submatrix from
|
||||
`A` is the first non-positive definite leading submatrix.
|
||||
|
||||
Returns
|
||||
-------
|
||||
delta : float
|
||||
Amount that should be added to the element (k, k) of the
|
||||
leading k by k submatrix of ``A`` to make it singular.
|
||||
v : ndarray
|
||||
A vector such that ``v.T B v = 0``. Where B is the matrix A after
|
||||
``delta`` is added to its element (k, k).
|
||||
"""
|
||||
|
||||
# Compute delta
|
||||
delta = np.sum(U[:k-1, k-1]**2) - A[k-1, k-1]
|
||||
|
||||
n = len(A)
|
||||
|
||||
# Initialize v
|
||||
v = np.zeros(n)
|
||||
v[k-1] = 1
|
||||
|
||||
# Compute the remaining values of v by solving a triangular system.
|
||||
if k != 1:
|
||||
v[:k-1] = solve_triangular(U[:k-1, :k-1], -U[:k-1, k-1])
|
||||
|
||||
return delta, v
|
||||
|
||||
|
||||
class IterativeSubproblem(BaseQuadraticSubproblem):
|
||||
"""Quadratic subproblem solved by nearly exact iterative method.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This subproblem solver was based on [1]_, [2]_ and [3]_,
|
||||
which implement similar algorithms. The algorithm is basically
|
||||
that of [1]_ but ideas from [2]_ and [3]_ were also used.
|
||||
|
||||
References
|
||||
----------
|
||||
.. [1] A.R. Conn, N.I. Gould, and P.L. Toint, "Trust region methods",
|
||||
Siam, pp. 169-200, 2000.
|
||||
.. [2] J. Nocedal and S. Wright, "Numerical optimization",
|
||||
Springer Science & Business Media. pp. 83-91, 2006.
|
||||
.. [3] J.J. More and D.C. Sorensen, "Computing a trust region step",
|
||||
SIAM Journal on Scientific and Statistical Computing, vol. 4(3),
|
||||
pp. 553-572, 1983.
|
||||
"""
|
||||
|
||||
# UPDATE_COEFF appears in reference [1]_
|
||||
# in formula 7.3.14 (p. 190) named as "theta".
|
||||
# As recommended there, its value is fixed at 0.01.
|
||||
UPDATE_COEFF = 0.01
|
||||
|
||||
EPS = np.finfo(float).eps
|
||||
|
||||
def __init__(self, x, fun, jac, hess, hessp=None,
|
||||
k_easy=0.1, k_hard=0.2):
|
||||
|
||||
super().__init__(x, fun, jac, hess)
|
||||
|
||||
# When the trust-region shrinks in two consecutive
|
||||
# calculations (``tr_radius < previous_tr_radius``)
|
||||
# the lower bound ``lambda_lb`` may be reused,
|
||||
# facilitating the convergence. To indicate no
|
||||
# previous value is known at first ``previous_tr_radius``
|
||||
# is set to -1 and ``lambda_lb`` to None.
|
||||
self.previous_tr_radius = -1
|
||||
self.lambda_lb = None
|
||||
|
||||
self.niter = 0
|
||||
|
||||
# ``k_easy`` and ``k_hard`` are parameters used
|
||||
# to determine the stopping criteria for the iterative
|
||||
# subproblem solver. Take a look at pp. 194-197
|
||||
# from reference [1]_ for a more detailed description.
|
||||
self.k_easy = k_easy
|
||||
self.k_hard = k_hard
|
||||
|
||||
# Get Lapack function for cholesky decomposition.
|
||||
# The implemented SciPy wrapper does not return
|
||||
# the incomplete factorization needed by the method.
|
||||
self.cholesky, = get_lapack_funcs(('potrf',), (self.hess,))
|
||||
|
||||
# Get info about Hessian
|
||||
self.dimension = len(self.hess)
|
||||
self.hess_gershgorin_lb,\
|
||||
self.hess_gershgorin_ub = gershgorin_bounds(self.hess)
|
||||
self.hess_inf = norm(self.hess, np.inf)
|
||||
self.hess_fro = norm(self.hess, 'fro')
|
||||
|
||||
# A constant such that for vectors smaller than that
|
||||
# backward substitution is not reliable. It was established
|
||||
# based on Golub, G. H., Van Loan, C. F. (2013).
|
||||
# "Matrix computations". Fourth Edition. JHU press., p.165.
|
||||
self.CLOSE_TO_ZERO = self.dimension * self.EPS * self.hess_inf
|
||||
|
||||
def _initial_values(self, tr_radius):
|
||||
"""Given a trust radius, return a good initial guess for
|
||||
the damping factor, the lower bound and the upper bound.
|
||||
The values were chosen according to the guidelines in
|
||||
section 7.3.8 (p. 192) from [1]_.
|
||||
"""
|
||||
|
||||
# Upper bound for the damping factor
|
||||
lambda_ub = max(0, self.jac_mag/tr_radius + min(-self.hess_gershgorin_lb,
|
||||
self.hess_fro,
|
||||
self.hess_inf))
|
||||
|
||||
# Lower bound for the damping factor
|
||||
lambda_lb = max(0, -min(self.hess.diagonal()),
|
||||
self.jac_mag/tr_radius - min(self.hess_gershgorin_ub,
|
||||
self.hess_fro,
|
||||
self.hess_inf))
|
||||
|
||||
# Improve bounds with previous info
|
||||
if tr_radius < self.previous_tr_radius:
|
||||
lambda_lb = max(self.lambda_lb, lambda_lb)
|
||||
|
||||
# Initial guess for the damping factor
|
||||
if lambda_lb == 0:
|
||||
lambda_initial = 0
|
||||
else:
|
||||
lambda_initial = max(np.sqrt(lambda_lb * lambda_ub),
|
||||
lambda_lb + self.UPDATE_COEFF*(lambda_ub-lambda_lb))
|
||||
|
||||
return lambda_initial, lambda_lb, lambda_ub
|
||||
|
||||
def solve(self, tr_radius):
|
||||
"""Solve quadratic subproblem"""
|
||||
|
||||
lambda_current, lambda_lb, lambda_ub = self._initial_values(tr_radius)
|
||||
n = self.dimension
|
||||
hits_boundary = True
|
||||
already_factorized = False
|
||||
self.niter = 0
|
||||
|
||||
while True:
|
||||
|
||||
# Compute Cholesky factorization
|
||||
if already_factorized:
|
||||
already_factorized = False
|
||||
else:
|
||||
H = self.hess+lambda_current*np.eye(n)
|
||||
U, info = self.cholesky(H, lower=False,
|
||||
overwrite_a=False,
|
||||
clean=True)
|
||||
|
||||
self.niter += 1
|
||||
|
||||
# Check if factorization succeeded
|
||||
if info == 0 and self.jac_mag > self.CLOSE_TO_ZERO:
|
||||
# Successful factorization
|
||||
|
||||
# Solve `U.T U p = -jac`
|
||||
p = cho_solve((U, False), -self.jac)
|
||||
|
||||
p_norm = norm(p)
|
||||
|
||||
# Check for interior convergence
|
||||
if p_norm <= tr_radius and lambda_current == 0:
|
||||
hits_boundary = False
|
||||
break
|
||||
|
||||
# Solve `U.T w = p`
|
||||
w = solve_triangular(U, p, trans='T')
|
||||
|
||||
w_norm = norm(w)
|
||||
|
||||
# Compute Newton step according to
|
||||
# formula (4.44) p.87 from ref [2]_.
|
||||
delta_lambda = (p_norm/w_norm)**2 * (p_norm-tr_radius)/tr_radius
|
||||
lambda_new = lambda_current + delta_lambda
|
||||
|
||||
if p_norm < tr_radius: # Inside boundary
|
||||
s_min, z_min = estimate_smallest_singular_value(U)
|
||||
|
||||
ta, tb = self.get_boundaries_intersections(p, z_min,
|
||||
tr_radius)
|
||||
|
||||
# Choose `step_len` with the smallest magnitude.
|
||||
# The reason for this choice is explained at
|
||||
# ref [3]_, p. 6 (Immediately before the formula
|
||||
# for `tau`).
|
||||
step_len = min([ta, tb], key=abs)
|
||||
|
||||
# Compute the quadratic term (p.T*H*p)
|
||||
quadratic_term = np.dot(p, np.dot(H, p))
|
||||
|
||||
# Check stop criteria
|
||||
relative_error = ((step_len**2 * s_min**2)
|
||||
/ (quadratic_term + lambda_current*tr_radius**2))
|
||||
if relative_error <= self.k_hard:
|
||||
p += step_len * z_min
|
||||
break
|
||||
|
||||
# Update uncertainty bounds
|
||||
lambda_ub = lambda_current
|
||||
lambda_lb = max(lambda_lb, lambda_current - s_min**2)
|
||||
|
||||
# Compute Cholesky factorization
|
||||
H = self.hess + lambda_new*np.eye(n)
|
||||
c, info = self.cholesky(H, lower=False,
|
||||
overwrite_a=False,
|
||||
clean=True)
|
||||
|
||||
# Check if the factorization has succeeded
|
||||
#
|
||||
if info == 0: # Successful factorization
|
||||
# Update damping factor
|
||||
lambda_current = lambda_new
|
||||
already_factorized = True
|
||||
else: # Unsuccessful factorization
|
||||
# Update uncertainty bounds
|
||||
lambda_lb = max(lambda_lb, lambda_new)
|
||||
|
||||
# Update damping factor
|
||||
lambda_current = max(
|
||||
np.sqrt(lambda_lb * lambda_ub),
|
||||
lambda_lb + self.UPDATE_COEFF*(lambda_ub-lambda_lb)
|
||||
)
|
||||
|
||||
else: # Outside boundary
|
||||
# Check stop criteria
|
||||
relative_error = abs(p_norm - tr_radius) / tr_radius
|
||||
if relative_error <= self.k_easy:
|
||||
break
|
||||
|
||||
# Update uncertainty bounds
|
||||
lambda_lb = lambda_current
|
||||
|
||||
# Update damping factor
|
||||
lambda_current = lambda_new
|
||||
|
||||
elif info == 0 and self.jac_mag <= self.CLOSE_TO_ZERO:
|
||||
# jac_mag very close to zero
|
||||
|
||||
# Check for interior convergence
|
||||
if lambda_current == 0:
|
||||
p = np.zeros(n)
|
||||
hits_boundary = False
|
||||
break
|
||||
|
||||
s_min, z_min = estimate_smallest_singular_value(U)
|
||||
step_len = tr_radius
|
||||
|
||||
# Check stop criteria
|
||||
if (step_len**2 * s_min**2
|
||||
<= self.k_hard * lambda_current * tr_radius**2):
|
||||
p = step_len * z_min
|
||||
break
|
||||
|
||||
# Update uncertainty bounds
|
||||
lambda_ub = lambda_current
|
||||
lambda_lb = max(lambda_lb, lambda_current - s_min**2)
|
||||
|
||||
# Update damping factor
|
||||
lambda_current = max(
|
||||
np.sqrt(lambda_lb * lambda_ub),
|
||||
lambda_lb + self.UPDATE_COEFF*(lambda_ub-lambda_lb)
|
||||
)
|
||||
|
||||
else: # Unsuccessful factorization
|
||||
|
||||
# Compute auxiliary terms
|
||||
delta, v = singular_leading_submatrix(H, U, info)
|
||||
v_norm = norm(v)
|
||||
|
||||
# Update uncertainty interval
|
||||
lambda_lb = max(lambda_lb, lambda_current + delta/v_norm**2)
|
||||
|
||||
# Update damping factor
|
||||
lambda_current = max(
|
||||
np.sqrt(lambda_lb * lambda_ub),
|
||||
lambda_lb + self.UPDATE_COEFF*(lambda_ub-lambda_lb)
|
||||
)
|
||||
|
||||
self.lambda_lb = lambda_lb
|
||||
self.lambda_current = lambda_current
|
||||
self.previous_tr_radius = tr_radius
|
||||
|
||||
return p, hits_boundary
|
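The damping-factor update inside `solve` is a Newton iteration on the secular equation, following the formula cited from ref [2]_. The sketch below isolates that update in a hypothetical function `boundary_multiplier`. It assumes a positive definite Hessian whose Newton step lies outside the trust region, and it drops every safeguard the class uses (the `lambda_lb`/`lambda_ub` interval, the hard case, and the stopping tests), so it is not a substitute for the method above.

```python
import numpy as np
from scipy.linalg import cholesky, cho_solve, solve_triangular


def boundary_multiplier(H, g, tr_radius, n_iter=20):
    """Find lam >= 0 with ||(H + lam*I)^-1 g|| == tr_radius."""
    n = len(g)
    lam = 0.0
    for _ in range(n_iter):
        # Factorize H + lam*I = U.T U with U upper triangular.
        U = cholesky(H + lam * np.eye(n), lower=False)
        # Solve U.T U p = -g, then U.T w = p.
        p = cho_solve((U, False), -g)
        w = solve_triangular(U, p, trans='T')
        p_norm, w_norm = np.linalg.norm(p), np.linalg.norm(w)
        # Newton update of the damping factor.
        lam += (p_norm / w_norm)**2 * (p_norm - tr_radius) / tr_radius
    U = cholesky(H + lam * np.eye(n), lower=False)
    return lam, cho_solve((U, False), -g)


# Hypothetical data: the undamped Newton step has norm about 1.118.
H = np.diag([2.0, 1.0])
g = np.array([1.0, 1.0])
lam, p = boundary_multiplier(H, g, tr_radius=0.5)
```

Starting from `lam = 0`, the iterates increase towards the root, and the resulting step lands on the trust-region boundary.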
||||
@ -0,0 +1,65 @@
|
||||
from ._trustregion import (_minimize_trust_region)
|
||||
from ._trlib import (get_trlib_quadratic_subproblem)
|
||||
|
||||
__all__ = ['_minimize_trust_krylov']
|
||||
|
||||
def _minimize_trust_krylov(fun, x0, args=(), jac=None, hess=None, hessp=None,
|
||||
inexact=True, **trust_region_options):
|
||||
"""
|
||||
Minimization of a scalar function of one or more variables using
|
||||
a nearly exact trust-region algorithm that only requires matrix
|
||||
vector products with the Hessian matrix.
|
||||
|
||||
.. versionadded:: 1.0.0
|
||||
|
||||
Options
|
||||
-------
|
||||
inexact : bool, optional
|
||||
Accuracy to solve subproblems. If True, requires fewer nonlinear
|
||||
iterations, but more vector products.
|
||||
"""
|
||||
|
||||
if jac is None:
|
||||
raise ValueError('Jacobian is required for Krylov trust-region '
|
||||
'minimization.')
|
||||
if hess is None and hessp is None:
|
||||
raise ValueError('Either the Hessian or the Hessian-vector product '
|
||||
'is required for Krylov trust-region minimization')
|
||||
|
||||
# tol_rel specifies the termination tolerance relative to the initial
|
||||
# gradient norm in the Krylov subspace iteration.
|
||||
|
||||
# - tol_rel_i specifies the tolerance for interior convergence.
|
||||
# - tol_rel_b specifies the tolerance for boundary convergence.
|
||||
# In nonlinear programming applications it is not necessary to solve
|
||||
# the boundary case as exactly as the interior case.
|
||||
|
||||
# - setting tol_rel_i=-2 leads to a forcing sequence in the Krylov
|
||||
# subspace iteration leading to quadratic convergence if eventually
|
||||
# the trust region stays inactive.
|
||||
# - setting tol_rel_b=-3 leads to a forcing sequence in the Krylov
|
||||
# subspace iteration leading to superlinear convergence as long
|
||||
# as the iterates hit the trust region boundary.
|
||||
|
||||
# For details consult the documentation of trlib_krylov_min
|
||||
# in _trlib/trlib_krylov.h
|
||||
#
|
||||
# Optimality of this choice of parameters among a range of possibilities
|
||||
# has been tested on the unconstrained subset of the CUTEst library.
|
||||
|
||||
if inexact:
|
||||
return _minimize_trust_region(fun, x0, args=args, jac=jac,
|
||||
hess=hess, hessp=hessp,
|
||||
subproblem=get_trlib_quadratic_subproblem(
|
||||
tol_rel_i=-2.0, tol_rel_b=-3.0,
|
||||
disp=trust_region_options.get('disp', False)
|
||||
),
|
||||
**trust_region_options)
|
||||
else:
|
||||
return _minimize_trust_region(fun, x0, args=args, jac=jac,
|
||||
hess=hess, hessp=hessp,
|
||||
subproblem=get_trlib_quadratic_subproblem(
|
||||
tol_rel_i=1e-8, tol_rel_b=1e-6,
|
||||
disp=trust_region_options.get('disp', False)
|
||||
),
|
||||
**trust_region_options)
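The two branches above differ only in the tolerances passed to the subproblem factory. A minimal sketch of that selection, assuming a hypothetical helper name `krylov_tolerances` (it is not part of the module), could look like this:

```python
def krylov_tolerances(inexact=True):
    """Return (tol_rel_i, tol_rel_b) as chosen in the wrapper above.

    Negative values request the forcing sequences described in the
    comments of the listing; positive values are fixed relative
    tolerances for the interior and boundary cases.
    """
    if inexact:
        return -2.0, -3.0
    return 1e-8, 1e-6


tol_rel_i, tol_rel_b = krylov_tolerances(inexact=False)
```

Factoring the choice out this way would let both branches share a single `_minimize_trust_region` call; the listing keeps them separate.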
|
||||
@ -0,0 +1,126 @@
|
||||
"""Newton-CG trust-region optimization."""
|
||||
import math
|
||||
|
||||
import numpy as np
|
||||
import scipy.linalg
|
||||
from ._trustregion import (_minimize_trust_region, BaseQuadraticSubproblem)
|
||||
|
||||
__all__ = []
|
||||
|
||||
|
||||
def _minimize_trust_ncg(fun, x0, args=(), jac=None, hess=None, hessp=None,
|
||||
**trust_region_options):
|
||||
"""
|
||||
Minimization of scalar function of one or more variables using
|
||||
the Newton conjugate gradient trust-region algorithm.
|
||||
|
||||
Options
|
||||
-------
|
||||
initial_trust_radius : float
|
||||
Initial trust-region radius.
|
||||
max_trust_radius : float
|
||||
Maximum value of the trust-region radius. No steps that are longer
|
||||
than this value will be proposed.
|
||||
eta : float
|
||||
Trust region related acceptance stringency for proposed steps.
|
||||
gtol : float
|
||||
Gradient norm must be less than `gtol` before successful
|
||||
termination.
|
||||
|
||||
"""
|
||||
if jac is None:
|
||||
raise ValueError('Jacobian is required for Newton-CG trust-region '
|
||||
'minimization')
|
||||
if hess is None and hessp is None:
|
||||
raise ValueError('Either the Hessian or the Hessian-vector product '
|
||||
'is required for Newton-CG trust-region minimization')
|
||||
return _minimize_trust_region(fun, x0, args=args, jac=jac, hess=hess,
|
||||
hessp=hessp, subproblem=CGSteihaugSubproblem,
|
||||
**trust_region_options)
|
||||
|
||||
|
||||
class CGSteihaugSubproblem(BaseQuadraticSubproblem):
|
||||
"""Quadratic subproblem solved by a conjugate gradient method"""
|
||||
def solve(self, trust_radius):
|
||||
"""
|
||||
Solve the subproblem using a conjugate gradient method.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
trust_radius : float
|
||||
We are allowed to wander only this far away from the origin.
|
||||
|
||||
Returns
|
||||
-------
|
||||
p : ndarray
|
||||
The proposed step.
|
||||
hits_boundary : bool
|
||||
True if the proposed step is on the boundary of the trust region.
|
||||
|
||||
Notes
|
||||
-----
|
||||
This is algorithm (7.2) of Nocedal and Wright 2nd edition.
|
||||
Only the function that computes the Hessian-vector product is required.
|
||||
The Hessian itself is not required, and the Hessian does
|
||||
not need to be positive semidefinite.
|
||||
"""
|
||||
|
||||
# get the norm of jacobian and define the origin
|
||||
p_origin = np.zeros_like(self.jac)
|
||||
|
||||
# define a default tolerance
|
||||
tolerance = min(0.5, math.sqrt(self.jac_mag)) * self.jac_mag
|
||||
|
||||
# Return the zero step immediately if the gradient norm
|
||||
# is already below the tolerance.
|
||||
if self.jac_mag < tolerance:
|
||||
hits_boundary = False
|
||||
return p_origin, hits_boundary
|
||||
|
||||
# init the state for the first iteration
|
||||
z = p_origin
|
||||
r = self.jac
|
||||
d = -r
|
||||
|
||||
# Search for the min of the approximation of the objective function.
|
||||
while True:
|
||||
|
||||
# do an iteration
|
||||
Bd = self.hessp(d)
|
||||
dBd = np.dot(d, Bd)
|
||||
if dBd <= 0:
|
||||
# Look at the two boundary points.
|
||||
# Find both values of t to get the boundary points such that
|
||||
# ||z + t d|| == trust_radius
|
||||
# and then choose the one with the predicted min value.
|
||||
ta, tb = self.get_boundaries_intersections(z, d, trust_radius)
|
||||
pa = z + ta * d
|
||||
pb = z + tb * d
|
||||
if self(pa) < self(pb):
|
||||
p_boundary = pa
|
||||
else:
|
||||
p_boundary = pb
|
||||
hits_boundary = True
|
||||
return p_boundary, hits_boundary
|
||||
r_squared = np.dot(r, r)
|
||||
alpha = r_squared / dBd
|
||||
z_next = z + alpha * d
|
||||
if scipy.linalg.norm(z_next) >= trust_radius:
|
||||
# Find t >= 0 to get the boundary point such that
|
||||
# ||z + t d|| == trust_radius
|
||||
ta, tb = self.get_boundaries_intersections(z, d, trust_radius)
|
||||
p_boundary = z + tb * d
|
||||
hits_boundary = True
|
||||
return p_boundary, hits_boundary
|
||||
r_next = r + alpha * Bd
|
||||
r_next_squared = np.dot(r_next, r_next)
|
||||
if math.sqrt(r_next_squared) < tolerance:
|
||||
hits_boundary = False
|
||||
return z_next, hits_boundary
|
||||
beta_next = r_next_squared / r_squared
|
||||
d_next = -r_next + beta_next * d
|
||||
|
||||
# update the state for the next iteration
|
||||
z = z_next
|
||||
r = r_next
|
||||
d = d_next
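The boundary steps above rely on finding the values of t with ||z + t d|| == trust_radius. The following is a pure-Python sketch of that computation under stated assumptions: the function name `boundary_intersections` is hypothetical, plain lists stand in for arrays, and d is assumed nonzero with z inside the trust region. It is not the library's `get_boundaries_intersections`.

```python
import math


def boundary_intersections(z, d, trust_radius):
    """Solve ||z + t*d|| == trust_radius for t and return [ta, tb], ta <= tb.

    Expands the norm into the quadratic a*t**2 + b*t + c = 0 and uses a
    cancellation-avoiding form of the quadratic formula.
    """
    a = sum(di * di for di in d)
    b = 2 * sum(zi * di for zi, di in zip(z, d))
    c = sum(zi * zi for zi in z) - trust_radius**2
    sqrt_discriminant = math.sqrt(b * b - 4 * a * c)
    # b and the signed square root have the same sign, so no cancellation.
    aux = b + math.copysign(sqrt_discriminant, b)
    ta = -aux / (2 * a)
    tb = -2 * c / aux
    return sorted([ta, tb])


ta, tb = boundary_intersections([1.0, 0.0], [1.0, 0.0], 2.0)
```

In the listing, the nonpositive-curvature branch evaluates the model at both intersection points, while the step-too-long branch uses only the larger root `tb`, which is the one with t >= 0 when z lies inside the region.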
|
||||
972
venv/lib/python3.12/site-packages/scipy/optimize/_tstutils.py
Normal file
@ -0,0 +1,972 @@
|
||||
r"""
|
||||
Parameters used in test and benchmark methods.
|
||||
|
||||
Collections of test cases suitable for testing 1-D root-finders
|
||||
'original': The original benchmarking functions.
|
||||
Real-valued functions of real-valued inputs on an interval
|
||||
with a zero.
|
||||
f1, .., f3 are continuous and infinitely differentiable
|
||||
f4 has a left- and right- discontinuity at the root
|
||||
f5 has a root at 1 replacing a 1st order pole
|
||||
f6 is randomly positive on one side of the root,
|
||||
randomly negative on the other.
|
||||
f4 - f6 are not continuous at the root.
|
||||
|
||||
'aps': The test problems in the 1995 paper
|
||||
TOMS "Algorithm 748: Enclosing Zeros of Continuous Functions"
|
||||
by Alefeld, Potra and Shi. Real-valued functions of
|
||||
real-valued inputs on an interval with a zero.
|
||||
Suitable for methods which start with an enclosing interval, and
|
||||
derivatives up to 2nd order.
|
||||
|
||||
'complex': Some complex-valued functions of complex-valued inputs.
|
||||
No enclosing bracket is provided.
|
||||
Suitable for methods which use one or more starting values, and
|
||||
derivatives up to 2nd order.
|
||||
|
||||
The test cases are provided as a list of dictionaries. The dictionary
|
||||
keys will be a subset of:
|
||||
["f", "fprime", "fprime2", "args", "bracket", "smoothness",
|
||||
"a", "b", "x0", "x1", "root", "ID"]
|
||||
"""
|
||||
|
||||
# Sources:
|
||||
# [1] Alefeld, G. E. and Potra, F. A. and Shi, Yixun,
|
||||
# "Algorithm 748: Enclosing Zeros of Continuous Functions",
|
||||
# ACM Trans. Math. Softw. Volume 21, Issue 3 (1995)
|
||||
# doi = {10.1145/210089.210111},
|
||||
# [2] Chandrupatla, Tirupathi R. "A new hybrid quadratic/bisection algorithm
|
||||
# for finding the zero of a nonlinear function without using derivatives."
|
||||
# Advances in Engineering Software 28.3 (1997): 145-149.
|
||||
|
||||
from random import random
|
||||
|
||||
import numpy as np
|
||||
|
||||
from scipy.optimize import _zeros_py as cc
|
||||
from scipy._lib._array_api import array_namespace
|
||||
|
||||
# "description" refers to the original functions
|
||||
description = """
|
||||
f2 is a symmetric parabola, x**2 - 1
|
||||
f3 is a quartic polynomial with large hump in interval
|
||||
f4 is step function with a discontinuity at 1
|
||||
f5 is a hyperbola with vertical asymptote at 1
|
||||
f6 has random values positive to left of 1, negative to right
|
||||
|
||||
Of course, these are not real problems. They just test how the
|
||||
'good' solvers behave in bad circumstances where bisection is
|
||||
really the best. A good solver should not be much worse than
|
||||
bisection in such circumstances, while being faster for smooth
|
||||
monotone sorts of functions.
|
||||
"""
|
||||
|
||||
|
||||
def f1(x):
|
||||
r"""f1 is a quadratic with roots at 0 and 1"""
|
||||
return x * (x - 1.)
|
||||
|
||||
|
||||
def f1_fp(x):
|
||||
return 2 * x - 1
|
||||
|
||||
|
||||
def f1_fpp(x):
|
||||
return 2
|
||||
|
||||
|
||||
def f2(x):
|
||||
r"""f2 is a symmetric parabola, x**2 - 1"""
|
||||
return x**2 - 1
|
||||
|
||||
|
||||
def f2_fp(x):
|
||||
return 2 * x
|
||||
|
||||
|
||||
def f2_fpp(x):
|
||||
return 2
|
||||
|
||||
|
||||
def f3(x):
|
||||
r"""A quartic with roots at 0, 1, 2 and 3"""
|
||||
return x * (x - 1.) * (x - 2.) * (x - 3.) # x**4 - 6x**3 + 11x**2 - 6x
|
||||
|
||||
|
||||
def f3_fp(x):
|
||||
return 4 * x**3 - 18 * x**2 + 22 * x - 6
|
||||
|
||||
|
||||
def f3_fpp(x):
|
||||
return 12 * x**2 - 36 * x + 22
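Hand-written derivative helpers such as `f3_fp` are easy to get subtly wrong, so a finite-difference comparison is a cheap sanity check. The sketch below copies `f3` and `f3_fp` from the listing; the helper `central_difference` and the step size `h=1e-5` are illustrative assumptions, not part of the module.

```python
def f3(x):
    """A quartic with roots at 0, 1, 2 and 3 (copied from the listing)."""
    return x * (x - 1.0) * (x - 2.0) * (x - 3.0)


def f3_fp(x):
    """Analytic first derivative of f3 (copied from the listing)."""
    return 4 * x**3 - 18 * x**2 + 22 * x - 6


def central_difference(f, x, h=1e-5):
    """Second-order accurate numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Compare the analytic and numerical derivatives at the starting value x0 = 0.6.
discrepancy = abs(central_difference(f3, 0.6) - f3_fp(0.6))
```

Applying the same check to a first-derivative helper yields an estimate of the second derivative, so each `*_fpp` function can be verified against its `*_fp` counterpart.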
|
||||
|
||||
|
||||
def f4(x):
|
||||
r"""Piecewise linear, left- and right- discontinuous at x=1, the root."""
|
||||
if x > 1:
|
||||
return 1.0 + .1 * x
|
||||
if x < 1:
|
||||
return -1.0 + .1 * x
|
||||
return 0
|
||||
|
||||
|
||||
def f5(x):
|
||||
r"""
|
||||
Hyperbola with a pole at x=1, but pole replaced with 0. Not continuous at root.
|
||||
"""
|
||||
if x != 1:
|
||||
return 1.0 / (1. - x)
|
||||
return 0
|
||||
|
||||
|
||||
# f6(x) returns random value. Without memoization, calling twice with the
|
||||
# same x returns different values, hence a "random value", not a
|
||||
# "function with random values"
|
||||
_f6_cache = {}
|
||||
def f6(x):
|
||||
v = _f6_cache.get(x, None)
|
||||
if v is None:
|
||||
if x > 1:
|
||||
v = random()
|
||||
elif x < 1:
|
||||
v = -random()
|
||||
else:
|
||||
v = 0
|
||||
_f6_cache[x] = v
|
||||
return v
|
||||
|
||||
|
||||
# Each Original test case has
|
||||
# - a function and its two derivatives,
|
||||
# - additional arguments,
|
||||
# - a bracket enclosing a root,
|
||||
# - the order of differentiability (smoothness) on this interval
|
||||
# - a starting value for methods which don't require a bracket
|
||||
# - the root (inside the bracket)
|
||||
# - an Identifier of the test case
|
||||
|
||||
_ORIGINAL_TESTS_KEYS = [
|
||||
"f", "fprime", "fprime2", "args", "bracket", "smoothness", "x0", "root", "ID"
|
||||
]
|
||||
_ORIGINAL_TESTS = [
|
||||
[f1, f1_fp, f1_fpp, (), [0.5, np.sqrt(3)], np.inf, 0.6, 1.0, "original.01.00"],
|
||||
[f2, f2_fp, f2_fpp, (), [0.5, np.sqrt(3)], np.inf, 0.6, 1.0, "original.02.00"],
|
||||
[f3, f3_fp, f3_fpp, (), [0.5, np.sqrt(3)], np.inf, 0.6, 1.0, "original.03.00"],
|
||||
[f4, None, None, (), [0.5, np.sqrt(3)], -1, 0.6, 1.0, "original.04.00"],
|
||||
[f5, None, None, (), [0.5, np.sqrt(3)], -1, 0.6, 1.0, "original.05.00"],
|
||||
[f6, None, None, (), [0.5, np.sqrt(3)], -np.inf, 0.6, 1.0, "original.06.00"]
|
||||
]
|
||||
|
||||
_ORIGINAL_TESTS_DICTS = [
|
||||
dict(zip(_ORIGINAL_TESTS_KEYS, testcase)) for testcase in _ORIGINAL_TESTS
|
||||
]
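To illustrate how a test-case dictionary built with `dict(zip(keys, values))` might be consumed, here is a self-contained sketch. The reduced key list and the `bisect` helper are assumptions made for the example; the real test harness drives SciPy's own root-finders instead.

```python
import math


def f1(x):
    """Quadratic with roots at 0 and 1 (copied from the listing)."""
    return x * (x - 1.0)


# A reduced key set, for illustration only.
keys = ["f", "bracket", "root", "ID"]
case = dict(zip(keys, [f1, [0.5, math.sqrt(3)], 1.0, "original.01.00"]))


def bisect(f, a, b, tol=1e-12, maxiter=200):
    """Plain bisection on a bracket [a, b] with a sign change."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(maxiter):
        mid = 0.5 * (a + b)
        fmid = f(mid)
        if fmid == 0 or (b - a) / 2 < tol:
            return mid
        if fa * fmid < 0:
            b, fb = mid, fmid
        else:
            a, fa = mid, fmid
    return 0.5 * (a + b)


found = bisect(case["f"], *case["bracket"])
```

Keeping the expected root in the same dictionary as the function and bracket lets a single loop check every solver against every case.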
|
||||
|
||||
# ##################
|
||||
# "APS" test cases
|
||||
# Functions and test cases that appear in [1]
|
||||
|
||||
|
||||
def aps01_f(x):
|
||||
r"""Straightforward sum of trigonometric function and polynomial"""
|
||||
return np.sin(x) - x / 2
|
||||
|
||||
|
||||
def aps01_fp(x):
|
||||
return np.cos(x) - 1.0 / 2
|
||||
|
||||
|
||||
def aps01_fpp(x):
|
||||
return -np.sin(x)
|
||||
|
||||
|
||||
def aps02_f(x):
|
||||
r"""poles at x=n**2, 1st and 2nd derivatives at root are also close to 0"""
|
||||
ii = np.arange(1, 21)
|
||||
return -2 * np.sum((2 * ii - 5)**2 / (x - ii**2)**3)
|
||||
|
||||
|
||||
def aps02_fp(x):
|
||||
ii = np.arange(1, 21)
|
||||
return 6 * np.sum((2 * ii - 5)**2 / (x - ii**2)**4)
|
||||
|
||||
|
||||
def aps02_fpp(x):
|
||||
ii = np.arange(1, 21)
|
||||
return -24 * np.sum((2 * ii - 5)**2 / (x - ii**2)**5)
|
||||
|
||||
|
||||
def aps03_f(x, a, b):
|
||||
r"""Rapidly changing at the root"""
|
||||
return a * x * np.exp(b * x)
|
||||
|
||||
|
||||
def aps03_fp(x, a, b):
|
||||
return a * (b * x + 1) * np.exp(b * x)
|
||||
|
||||
|
||||
def aps03_fpp(x, a, b):
|
||||
return a * (b * (b * x + 1) + b) * np.exp(b * x)
|
||||
|
||||
|
||||
def aps04_f(x, n, a):
|
||||
r"""Medium-degree polynomial"""
|
||||
return x**n - a
|
||||
|
||||
|
||||
def aps04_fp(x, n, a):
|
||||
return n * x**(n - 1)
|
||||
|
||||
|
||||
def aps04_fpp(x, n, a):
|
||||
return n * (n - 1) * x**(n - 2)
|
||||
|
||||
|
||||
def aps05_f(x):
|
||||
r"""Simple Trigonometric function"""
|
||||
return np.sin(x) - 1.0 / 2
|
||||
|
||||
|
||||
def aps05_fp(x):
|
||||
return np.cos(x)
|
||||
|
||||
|
||||
def aps05_fpp(x):
|
||||
return -np.sin(x)
|
||||
|
||||
|
||||
def aps06_f(x, n):
|
||||
r"""Exponential rapidly changing from -1 to 1 at x=0"""
|
||||
return 2 * x * np.exp(-n) - 2 * np.exp(-n * x) + 1
|
||||
|
||||
|
||||
def aps06_fp(x, n):
|
||||
return 2 * np.exp(-n) + 2 * n * np.exp(-n * x)
|
||||
|
||||
|
||||
def aps06_fpp(x, n):
|
||||
return -2 * n * n * np.exp(-n * x)
|
||||
|
||||
|
||||
def aps07_f(x, n):
|
||||
r"""Upside down parabola with parametrizable height"""
|
||||
return (1 + (1 - n)**2) * x - (1 - n * x)**2
|
||||
|
||||
|
||||
def aps07_fp(x, n):
|
||||
return (1 + (1 - n)**2) + 2 * n * (1 - n * x)
|
||||
|
||||
|
||||
def aps07_fpp(x, n):
|
||||
return -2 * n * n
|
||||
|
||||
|
||||
def aps08_f(x, n):
|
||||
r"""Degree n polynomial"""
|
||||
return x * x - (1 - x)**n
|
||||
|
||||
|
||||
def aps08_fp(x, n):
|
||||
return 2 * x + n * (1 - x)**(n - 1)
|
||||
|
||||
|
||||
def aps08_fpp(x, n):
|
||||
return 2 - n * (n - 1) * (1 - x)**(n - 2)
|
||||
|
||||
|
||||
def aps09_f(x, n):
|
||||
r"""Upside down quartic with parametrizable height"""
|
||||
return (1 + (1 - n)**4) * x - (1 - n * x)**4
|
||||
|
||||
|
||||
def aps09_fp(x, n):
|
||||
return (1 + (1 - n)**4) + 4 * n * (1 - n * x)**3
|
||||
|
||||
|
||||
def aps09_fpp(x, n):
|
||||
return -12 * n**2 * (1 - n * x)**2
|
||||
|
||||
|
||||
def aps10_f(x, n):
|
||||
r"""Exponential plus a polynomial"""
|
||||
return np.exp(-n * x) * (x - 1) + x**n
|
||||
|
||||
|
||||
def aps10_fp(x, n):
|
||||
return np.exp(-n * x) * (-n * (x - 1) + 1) + n * x**(n - 1)
|
||||
|
||||
|
||||
def aps10_fpp(x, n):
|
||||
return (np.exp(-n * x) * (-n * (-n * (x - 1) + 1) - n)
|
||||
+ n * (n - 1) * x**(n - 2))
|
||||
|
||||
|
||||
def aps11_f(x, n):
|
||||
r"""Rational function with a zero at x=1/n and a pole at x=0"""
|
||||
return (n * x - 1) / ((n - 1) * x)
|
||||
|
||||
|
||||
def aps11_fp(x, n):
|
||||
return 1 / (n - 1) / x**2
|
||||
|
||||
|
||||
def aps11_fpp(x, n):
|
||||
return -2 / (n - 1) / x**3
|
||||
|
||||
|
||||
def aps12_f(x, n):
|
||||
r"""nth root of x, with a zero at x=n"""
|
||||
return np.power(x, 1.0 / n) - np.power(n, 1.0 / n)
|
||||
|
||||
|
||||
def aps12_fp(x, n):
|
||||
return np.power(x, (1.0 - n) / n) / n
|
||||
|
||||
|
||||
def aps12_fpp(x, n):
|
||||
return np.power(x, (1.0 - 2 * n) / n) * (1.0 / n) * (1.0 - n) / n
|
||||
|
||||
|
||||
_MAX_EXPABLE = np.log(np.finfo(float).max)
|
||||
|
||||
|
||||
def aps13_f(x):
|
||||
r"""Function with *all* derivatives 0 at the root"""
|
||||
if x == 0:
|
||||
return 0
|
||||
# x2 = 1.0/x**2
|
||||
# if x2 > 708:
|
||||
# return 0
|
||||
y = 1 / x**2
|
||||
if y > _MAX_EXPABLE:
|
||||
return 0
|
||||
return x / np.exp(y)
|
||||
|
||||
|
||||
def aps13_fp(x):
|
||||
if x == 0:
|
||||
return 0
|
||||
y = 1 / x**2
|
||||
if y > _MAX_EXPABLE:
|
||||
return 0
|
||||
return (1 + 2 / x**2) / np.exp(y)
|
||||
|
||||
|
||||
def aps13_fpp(x):
|
||||
if x == 0:
|
||||
return 0
|
||||
y = 1 / x**2
|
||||
if y > _MAX_EXPABLE:
|
||||
return 0
|
||||
return 2 * (2 - x**2) / x**5 / np.exp(y)
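The `_MAX_EXPABLE` guard above avoids overflowing the exponential when `1 / x**2` is large. A pure-Python sketch of the same idea follows; it uses `math` and `sys` instead of NumPy, and the names `MAX_EXPABLE` and `x_times_exp_neg_inv_square` are chosen for the example. With `math.exp` the guard is not optional, because `math.exp` raises `OverflowError` for arguments above roughly 709.78.

```python
import math
import sys

# Largest argument for which exp() still fits in a double.
MAX_EXPABLE = math.log(sys.float_info.max)


def x_times_exp_neg_inv_square(x):
    """Evaluate x * exp(-1/x**2) without overflowing exp()."""
    if x == 0:
        return 0.0
    y = 1.0 / x**2
    if y > MAX_EXPABLE:
        # exp(y) would overflow; the true value underflows to zero anyway.
        return 0.0
    return x / math.exp(y)


tiny = x_times_exp_neg_inv_square(0.01)
moderate = x_times_exp_neg_inv_square(1.0)
```

Returning zero in the guarded branch is consistent with the function's limit: every derivative vanishes at the root, so values near zero are far below the smallest representable double.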
|
||||
|
||||
|
||||
def aps14_f(x, n):
|
||||
r"""Constant -n/20 for x <= 0, trigonometric+linear for x positive"""
|
||||
if x <= 0:
|
||||
return -n / 20.0
|
||||
return n / 20.0 * (x / 1.5 + np.sin(x) - 1)
|
||||
|
||||
|
||||
def aps14_fp(x, n):
|
||||
if x <= 0:
|
||||
return 0
|
||||
return n / 20.0 * (1.0 / 1.5 + np.cos(x))
|
||||
|
||||
|
||||
def aps14_fpp(x, n):
|
||||
if x <= 0:
|
||||
return 0
|
||||
return -n / 20.0 * (np.sin(x))
|
||||
|
||||
|
||||
def aps15_f(x, n):
|
||||
r"""Exponential on [0, 0.002/(1+n)], constant outside that interval"""
|
||||
if x < 0:
|
||||
return -0.859
|
||||
if x > 2 * 1e-3 / (1 + n):
|
||||
return np.e - 1.859
|
||||
return np.exp((n + 1) * x / 2 * 1000) - 1.859
|
||||
|
||||
|
||||
def aps15_fp(x, n):
|
||||
if not 0 <= x <= 2 * 1e-3 / (1 + n):
|
||||
return 0
|
||||
return np.exp((n + 1) * x / 2 * 1000) * (n + 1) / 2 * 1000
|
||||
|
||||
|
||||
def aps15_fpp(x, n):
|
||||
if not 0 <= x <= 2 * 1e-3 / (1 + n):
|
||||
return 0
|
||||
return np.exp((n + 1) * x / 2 * 1000) * (n + 1) / 2 * 1000 * (n + 1) / 2 * 1000
|
||||
|
||||
|
||||
# Each APS test case has
|
||||
# - a function and its two derivatives,
|
||||
# - additional arguments,
|
||||
# - a bracket enclosing a root,
|
||||
# - the order of differentiability of the function on this interval
|
||||
# - a starting value for methods which don't require a bracket
|
||||
# - the root (inside the bracket)
|
||||
# - an Identifier of the test case
|
||||
#
|
||||
# Algorithm 748 is a bracketing algorithm so a bracketing interval was provided
|
||||
# in [1] for each test case. Newton and Halley methods need a single
|
||||
# starting point x0, which was chosen to be near the middle of the interval,
|
||||
# unless that would have made the problem too easy.
|
||||
|
||||
_APS_TESTS_KEYS = [
|
||||
"f", "fprime", "fprime2", "args", "bracket", "smoothness", "x0", "root", "ID"
|
||||
]
|
||||
_APS_TESTS = [
|
||||
[aps01_f, aps01_fp, aps01_fpp, (), [np.pi / 2, np.pi], np.inf,
|
||||
3, 1.89549426703398094e+00, "aps.01.00"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [1 + 1e-9, 4 - 1e-9], np.inf,
|
||||
2, 3.02291534727305677e+00, "aps.02.00"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [4 + 1e-9, 9 - 1e-9], np.inf,
|
||||
5, 6.68375356080807848e+00, "aps.02.01"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [9 + 1e-9, 16 - 1e-9], np.inf,
|
||||
10, 1.12387016550022114e+01, "aps.02.02"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [16 + 1e-9, 25 - 1e-9], np.inf,
|
||||
17, 1.96760000806234103e+01, "aps.02.03"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [25 + 1e-9, 36 - 1e-9], np.inf,
|
||||
26, 2.98282273265047557e+01, "aps.02.04"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [36 + 1e-9, 49 - 1e-9], np.inf,
|
||||
37, 4.19061161952894139e+01, "aps.02.05"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [49 + 1e-9, 64 - 1e-9], np.inf,
|
||||
50, 5.59535958001430913e+01, "aps.02.06"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [64 + 1e-9, 81 - 1e-9], np.inf,
|
||||
65, 7.19856655865877997e+01, "aps.02.07"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [81 + 1e-9, 100 - 1e-9], np.inf,
|
||||
82, 9.00088685391666701e+01, "aps.02.08"],
|
||||
[aps02_f, aps02_fp, aps02_fpp, (), [100 + 1e-9, 121 - 1e-9], np.inf,
|
||||
101, 1.10026532748330197e+02, "aps.02.09"],
|
||||
[aps03_f, aps03_fp, aps03_fpp, (-40, -1), [-9, 31], np.inf,
|
||||
-2, 0, "aps.03.00"],
|
||||
[aps03_f, aps03_fp, aps03_fpp, (-100, -2), [-9, 31], np.inf,
|
||||
-2, 0, "aps.03.01"],
|
||||
[aps03_f, aps03_fp, aps03_fpp, (-200, -3), [-9, 31], np.inf,
|
||||
-2, 0, "aps.03.02"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (4, 0.2), [0, 5], np.inf,
|
||||
2.5, 6.68740304976422006e-01, "aps.04.00"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (6, 0.2), [0, 5], np.inf,
|
||||
2.5, 7.64724491331730039e-01, "aps.04.01"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (8, 0.2), [0, 5], np.inf,
|
||||
2.5, 8.17765433957942545e-01, "aps.04.02"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (10, 0.2), [0, 5], np.inf,
|
||||
2.5, 8.51339922520784609e-01, "aps.04.03"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (12, 0.2), [0, 5], np.inf,
|
||||
2.5, 8.74485272221167897e-01, "aps.04.04"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (4, 1), [0, 5], np.inf,
|
||||
2.5, 1, "aps.04.05"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (6, 1), [0, 5], np.inf,
|
||||
2.5, 1, "aps.04.06"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (8, 1), [0, 5], np.inf,
|
||||
2.5, 1, "aps.04.07"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (10, 1), [0, 5], np.inf,
|
||||
2.5, 1, "aps.04.08"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (12, 1), [0, 5], np.inf,
|
||||
2.5, 1, "aps.04.09"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (8, 1), [-0.95, 4.05], np.inf,
|
||||
1.5, 1, "aps.04.10"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (10, 1), [-0.95, 4.05], np.inf,
|
||||
1.5, 1, "aps.04.11"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (12, 1), [-0.95, 4.05], np.inf,
|
||||
1.5, 1, "aps.04.12"],
|
||||
[aps04_f, aps04_fp, aps04_fpp, (14, 1), [-0.95, 4.05], np.inf,
|
||||
1.5, 1, "aps.04.13"],
|
||||
[aps05_f, aps05_fp, aps05_fpp, (), [0, 1.5], np.inf,
|
||||
1.3, np.pi / 6, "aps.05.00"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (1,), [0, 1], np.inf,
|
||||
0.5, 4.22477709641236709e-01, "aps.06.00"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (2,), [0, 1], np.inf,
|
||||
0.5, 3.06699410483203705e-01, "aps.06.01"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (3,), [0, 1], np.inf,
|
||||
0.5, 2.23705457654662959e-01, "aps.06.02"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (4,), [0, 1], np.inf,
|
||||
0.5, 1.71719147519508369e-01, "aps.06.03"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (5,), [0, 1], np.inf,
|
||||
0.4, 1.38257155056824066e-01, "aps.06.04"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (20,), [0, 1], np.inf,
|
||||
0.1, 3.46573590208538521e-02, "aps.06.05"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (40,), [0, 1], np.inf,
|
||||
5e-02, 1.73286795139986315e-02, "aps.06.06"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (60,), [0, 1], np.inf,
|
||||
1.0 / 30, 1.15524530093324210e-02, "aps.06.07"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (80,), [0, 1], np.inf,
|
||||
2.5e-02, 8.66433975699931573e-03, "aps.06.08"],
|
||||
[aps06_f, aps06_fp, aps06_fpp, (100,), [0, 1], np.inf,
|
||||
2e-02, 6.93147180559945415e-03, "aps.06.09"],
|
||||
[aps07_f, aps07_fp, aps07_fpp, (5,), [0, 1], np.inf,
|
||||
0.4, 3.84025518406218985e-02, "aps.07.00"],
|
||||
[aps07_f, aps07_fp, aps07_fpp, (10,), [0, 1], np.inf,
|
||||
0.4, 9.90000999800049949e-03, "aps.07.01"],
|
||||
[aps07_f, aps07_fp, aps07_fpp, (20,), [0, 1], np.inf,
|
||||
0.4, 2.49375003906201174e-03, "aps.07.02"],
|
||||
[aps08_f, aps08_fp, aps08_fpp, (2,), [0, 1], np.inf,
|
||||
0.9, 0.5, "aps.08.00"],
|
||||
[aps08_f, aps08_fp, aps08_fpp, (5,), [0, 1], np.inf,
|
||||
0.9, 3.45954815848242059e-01, "aps.08.01"],
|
||||
[aps08_f, aps08_fp, aps08_fpp, (10,), [0, 1], np.inf,
|
||||
0.9, 2.45122333753307220e-01, "aps.08.02"],
|
||||
[aps08_f, aps08_fp, aps08_fpp, (15,), [0, 1], np.inf,
|
||||
0.9, 1.95547623536565629e-01, "aps.08.03"],
|
||||
[aps08_f, aps08_fp, aps08_fpp, (20,), [0, 1], np.inf,
|
||||
0.9, 1.64920957276440960e-01, "aps.08.04"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (1,), [0, 1], np.inf,
|
||||
0.5, 2.75508040999484394e-01, "aps.09.00"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (2,), [0, 1], np.inf,
|
||||
0.5, 1.37754020499742197e-01, "aps.09.01"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (4,), [0, 1], np.inf,
|
||||
0.5, 1.03052837781564422e-02, "aps.09.02"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (5,), [0, 1], np.inf,
|
||||
0.5, 3.61710817890406339e-03, "aps.09.03"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (8,), [0, 1], np.inf,
|
||||
0.5, 4.10872918496395375e-04, "aps.09.04"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (15,), [0, 1], np.inf,
|
||||
0.5, 2.59895758929076292e-05, "aps.09.05"],
|
||||
[aps09_f, aps09_fp, aps09_fpp, (20,), [0, 1], np.inf,
|
||||
0.5, 7.66859512218533719e-06, "aps.09.06"],
|
||||
[aps10_f, aps10_fp, aps10_fpp, (1,), [0, 1], np.inf,
|
||||
0.9, 4.01058137541547011e-01, "aps.10.00"],
|
||||
[aps10_f, aps10_fp, aps10_fpp, (5,), [0, 1], np.inf,
|
||||
0.9, 5.16153518757933583e-01, "aps.10.01"],
|
||||
[aps10_f, aps10_fp, aps10_fpp, (10,), [0, 1], np.inf,
|
||||
0.9, 5.39522226908415781e-01, "aps.10.02"],
|
||||
[aps10_f, aps10_fp, aps10_fpp, (15,), [0, 1], np.inf,
|
||||
0.9, 5.48182294340655241e-01, "aps.10.03"],
|
||||
[aps10_f, aps10_fp, aps10_fpp, (20,), [0, 1], np.inf,
|
||||
0.9, 5.52704666678487833e-01, "aps.10.04"],
|
||||
[aps11_f, aps11_fp, aps11_fpp, (2,), [0.01, 1], np.inf,
|
||||
1e-02, 1.0 / 2, "aps.11.00"],
|
||||
[aps11_f, aps11_fp, aps11_fpp, (5,), [0.01, 1], np.inf,
|
||||
1e-02, 1.0 / 5, "aps.11.01"],
|
||||
[aps11_f, aps11_fp, aps11_fpp, (15,), [0.01, 1], np.inf,
|
||||
1e-02, 1.0 / 15, "aps.11.02"],
|
||||
[aps11_f, aps11_fp, aps11_fpp, (20,), [0.01, 1], np.inf,
|
||||
1e-02, 1.0 / 20, "aps.11.03"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (2,), [1, 100], np.inf,
|
||||
1.1, 2, "aps.12.00"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (3,), [1, 100], np.inf,
|
||||
1.1, 3, "aps.12.01"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (4,), [1, 100], np.inf,
|
||||
1.1, 4, "aps.12.02"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (5,), [1, 100], np.inf,
|
||||
1.1, 5, "aps.12.03"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (6,), [1, 100], np.inf,
|
||||
1.1, 6, "aps.12.04"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (7,), [1, 100], np.inf,
|
||||
1.1, 7, "aps.12.05"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (9,), [1, 100], np.inf,
|
||||
1.1, 9, "aps.12.06"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (11,), [1, 100], np.inf,
|
||||
1.1, 11, "aps.12.07"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (13,), [1, 100], np.inf,
|
||||
1.1, 13, "aps.12.08"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (15,), [1, 100], np.inf,
|
||||
1.1, 15, "aps.12.09"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (17,), [1, 100], np.inf,
|
||||
1.1, 17, "aps.12.10"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (19,), [1, 100], np.inf,
|
||||
1.1, 19, "aps.12.11"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (21,), [1, 100], np.inf,
|
||||
1.1, 21, "aps.12.12"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (23,), [1, 100], np.inf,
|
||||
1.1, 23, "aps.12.13"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (25,), [1, 100], np.inf,
|
||||
1.1, 25, "aps.12.14"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (27,), [1, 100], np.inf,
|
||||
1.1, 27, "aps.12.15"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (29,), [1, 100], np.inf,
|
||||
1.1, 29, "aps.12.16"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (31,), [1, 100], np.inf,
|
||||
1.1, 31, "aps.12.17"],
|
||||
[aps12_f, aps12_fp, aps12_fpp, (33,), [1, 100], np.inf,
|
||||
1.1, 33, "aps.12.18"],
|
||||
[aps13_f, aps13_fp, aps13_fpp, (), [-1, 4], np.inf,
|
||||
1.5, 0, "aps.13.00"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (1,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.00"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (2,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.01"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (3,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.02"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (4,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.03"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (5,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.04"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (6,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.05"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (7,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.06"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (8,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.07"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (9,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.08"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (10,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.09"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (11,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.10"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (12,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.11"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (13,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.12"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (14,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.13"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (15,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.14"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (16,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.15"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (17,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.16"],
|
||||
[aps14_f, aps14_fp, aps14_fpp, (18,), [-1000, np.pi / 2], 0,
|
||||
1, 6.23806518961612433e-01, "aps.14.17"],
|
||||
    [aps14_f, aps14_fp, aps14_fpp, (19,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.18"],
    [aps14_f, aps14_fp, aps14_fpp, (20,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.19"],
    [aps14_f, aps14_fp, aps14_fpp, (21,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.20"],
    [aps14_f, aps14_fp, aps14_fpp, (22,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.21"],
    [aps14_f, aps14_fp, aps14_fpp, (23,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.22"],
    [aps14_f, aps14_fp, aps14_fpp, (24,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.23"],
    [aps14_f, aps14_fp, aps14_fpp, (25,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.24"],
    [aps14_f, aps14_fp, aps14_fpp, (26,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.25"],
    [aps14_f, aps14_fp, aps14_fpp, (27,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.26"],
    [aps14_f, aps14_fp, aps14_fpp, (28,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.27"],
    [aps14_f, aps14_fp, aps14_fpp, (29,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.28"],
    [aps14_f, aps14_fp, aps14_fpp, (30,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.29"],
    [aps14_f, aps14_fp, aps14_fpp, (31,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.30"],
    [aps14_f, aps14_fp, aps14_fpp, (32,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.31"],
    [aps14_f, aps14_fp, aps14_fpp, (33,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.32"],
    [aps14_f, aps14_fp, aps14_fpp, (34,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.33"],
    [aps14_f, aps14_fp, aps14_fpp, (35,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.34"],
    [aps14_f, aps14_fp, aps14_fpp, (36,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.35"],
    [aps14_f, aps14_fp, aps14_fpp, (37,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.36"],
    [aps14_f, aps14_fp, aps14_fpp, (38,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.37"],
    [aps14_f, aps14_fp, aps14_fpp, (39,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.38"],
    [aps14_f, aps14_fp, aps14_fpp, (40,), [-1000, np.pi / 2], 0,
     1, 6.23806518961612433e-01, "aps.14.39"],
    [aps15_f, aps15_fp, aps15_fpp, (20,), [-1000, 1e-4], 0,
     -2, 5.90513055942197166e-05, "aps.15.00"],
    [aps15_f, aps15_fp, aps15_fpp, (21,), [-1000, 1e-4], 0,
     -2, 5.63671553399369967e-05, "aps.15.01"],
    [aps15_f, aps15_fp, aps15_fpp, (22,), [-1000, 1e-4], 0,
     -2, 5.39164094555919196e-05, "aps.15.02"],
    [aps15_f, aps15_fp, aps15_fpp, (23,), [-1000, 1e-4], 0,
     -2, 5.16698923949422470e-05, "aps.15.03"],
    [aps15_f, aps15_fp, aps15_fpp, (24,), [-1000, 1e-4], 0,
     -2, 4.96030966991445609e-05, "aps.15.04"],
    [aps15_f, aps15_fp, aps15_fpp, (25,), [-1000, 1e-4], 0,
     -2, 4.76952852876389951e-05, "aps.15.05"],
    [aps15_f, aps15_fp, aps15_fpp, (26,), [-1000, 1e-4], 0,
     -2, 4.59287932399486662e-05, "aps.15.06"],
    [aps15_f, aps15_fp, aps15_fpp, (27,), [-1000, 1e-4], 0,
     -2, 4.42884791956647841e-05, "aps.15.07"],
    [aps15_f, aps15_fp, aps15_fpp, (28,), [-1000, 1e-4], 0,
     -2, 4.27612902578832391e-05, "aps.15.08"],
    [aps15_f, aps15_fp, aps15_fpp, (29,), [-1000, 1e-4], 0,
     -2, 4.13359139159538030e-05, "aps.15.09"],
    [aps15_f, aps15_fp, aps15_fpp, (30,), [-1000, 1e-4], 0,
     -2, 4.00024973380198076e-05, "aps.15.10"],
    [aps15_f, aps15_fp, aps15_fpp, (31,), [-1000, 1e-4], 0,
     -2, 3.87524192962066869e-05, "aps.15.11"],
    [aps15_f, aps15_fp, aps15_fpp, (32,), [-1000, 1e-4], 0,
     -2, 3.75781035599579910e-05, "aps.15.12"],
    [aps15_f, aps15_fp, aps15_fpp, (33,), [-1000, 1e-4], 0,
     -2, 3.64728652199592355e-05, "aps.15.13"],
    [aps15_f, aps15_fp, aps15_fpp, (34,), [-1000, 1e-4], 0,
     -2, 3.54307833565318273e-05, "aps.15.14"],
    [aps15_f, aps15_fp, aps15_fpp, (35,), [-1000, 1e-4], 0,
     -2, 3.44465949299614980e-05, "aps.15.15"],
    [aps15_f, aps15_fp, aps15_fpp, (36,), [-1000, 1e-4], 0,
     -2, 3.35156058778003705e-05, "aps.15.16"],
    [aps15_f, aps15_fp, aps15_fpp, (37,), [-1000, 1e-4], 0,
     -2, 3.26336162494372125e-05, "aps.15.17"],
    [aps15_f, aps15_fp, aps15_fpp, (38,), [-1000, 1e-4], 0,
     -2, 3.17968568584260013e-05, "aps.15.18"],
    [aps15_f, aps15_fp, aps15_fpp, (39,), [-1000, 1e-4], 0,
     -2, 3.10019354369653455e-05, "aps.15.19"],
    [aps15_f, aps15_fp, aps15_fpp, (40,), [-1000, 1e-4], 0,
     -2, 3.02457906702100968e-05, "aps.15.20"],
    [aps15_f, aps15_fp, aps15_fpp, (100,), [-1000, 1e-4], 0,
     -2, 1.22779942324615231e-05, "aps.15.21"],
    [aps15_f, aps15_fp, aps15_fpp, (200,), [-1000, 1e-4], 0,
     -2, 6.16953939044086617e-06, "aps.15.22"],
    [aps15_f, aps15_fp, aps15_fpp, (300,), [-1000, 1e-4], 0,
     -2, 4.11985852982928163e-06, "aps.15.23"],
    [aps15_f, aps15_fp, aps15_fpp, (400,), [-1000, 1e-4], 0,
     -2, 3.09246238772721682e-06, "aps.15.24"],
    [aps15_f, aps15_fp, aps15_fpp, (500,), [-1000, 1e-4], 0,
     -2, 2.47520442610501789e-06, "aps.15.25"],
    [aps15_f, aps15_fp, aps15_fpp, (600,), [-1000, 1e-4], 0,
     -2, 2.06335676785127107e-06, "aps.15.26"],
    [aps15_f, aps15_fp, aps15_fpp, (700,), [-1000, 1e-4], 0,
     -2, 1.76901200781542651e-06, "aps.15.27"],
    [aps15_f, aps15_fp, aps15_fpp, (800,), [-1000, 1e-4], 0,
     -2, 1.54816156988591016e-06, "aps.15.28"],
    [aps15_f, aps15_fp, aps15_fpp, (900,), [-1000, 1e-4], 0,
     -2, 1.37633453660223511e-06, "aps.15.29"],
    [aps15_f, aps15_fp, aps15_fpp, (1000,), [-1000, 1e-4], 0,
     -2, 1.23883857889971403e-06, "aps.15.30"]
]

_APS_TESTS_DICTS = [dict(zip(_APS_TESTS_KEYS, testcase)) for testcase in _APS_TESTS]


# ##################
# "complex" test cases
# A few simple, complex-valued, functions, defined on the complex plane.


def cplx01_f(z, n, a):
    r"""z**n-a:  Use to find the nth root of a"""
    return z**n - a


def cplx01_fp(z, n, a):
    return n * z**(n - 1)


def cplx01_fpp(z, n, a):
    return n * (n - 1) * z**(n - 2)


def cplx02_f(z, a):
    r"""e**z - a: Use to find the log of a"""
    return np.exp(z) - a


def cplx02_fp(z, a):
    return np.exp(z)


def cplx02_fpp(z, a):
    return np.exp(z)


# Each "complex" test case has
# - a function and its two derivatives,
# - additional arguments,
# - the order of differentiability of the function,
# - two starting values x0 and x1,
# - the root,
# - an identifier of the test case.
#
# No bracketing interval is provided for these cases, so they are not
# suitable for bracketing algorithms such as Algorithm 748. Newton and Halley
# need a single starting point x0; x1 is available as a second starting
# value for methods that need one.


_COMPLEX_TESTS_KEYS = [
    "f", "fprime", "fprime2", "args", "smoothness", "x0", "x1", "root", "ID"
]
_COMPLEX_TESTS = [
    [cplx01_f, cplx01_fp, cplx01_fpp, (2, -1), np.inf,
     (1 + 1j), (0.5 + 0.5j), 1j, "complex.01.00"],
    [cplx01_f, cplx01_fp, cplx01_fpp, (3, 1), np.inf,
     (-1 + 1j), (-0.5 + 2.0j), (-0.5 + np.sqrt(3) / 2 * 1.0j),
     "complex.01.01"],
    [cplx01_f, cplx01_fp, cplx01_fpp, (3, -1), np.inf,
     1j, (0.5 + 0.5j), (0.5 + np.sqrt(3) / 2 * 1.0j),
     "complex.01.02"],
    [cplx01_f, cplx01_fp, cplx01_fpp, (3, 8), np.inf,
     5, 4, 2, "complex.01.03"],
    [cplx02_f, cplx02_fp, cplx02_fpp, (-1,), np.inf,
     (1 + 2j), (0.5 + 0.5j), np.pi * 1.0j, "complex.02.00"],
    [cplx02_f, cplx02_fp, cplx02_fpp, (1j,), np.inf,
     (1 + 2j), (0.5 + 0.5j), np.pi * 0.5j, "complex.02.01"],
]

_COMPLEX_TESTS_DICTS = [
    dict(zip(_COMPLEX_TESTS_KEYS, testcase)) for testcase in _COMPLEX_TESTS
]


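# Illustration only, not part of the module: a minimal sketch of how a
# "complex" test case might be consumed. It runs a plain Newton iteration on
# the values of "complex.01.00" (args=(2, -1), x0=1+1j, root=1j). The names
# `f`, `fprime` and `newton` are hypothetical stand-ins, written out here so
# that the sketch runs without SciPy; this is not the solver the test suite
# actually uses.

```python
def f(z, n, a):
    # Same formula as cplx01_f: z**n - a
    return z**n - a


def fprime(z, n, a):
    # Same formula as cplx01_fp: derivative of z**n - a with respect to z
    return n * z**(n - 1)


def newton(f, fprime, x0, args=(), tol=1e-14, maxiter=50):
    """Plain Newton iteration; it works unchanged for a complex x0."""
    z = x0
    for _ in range(maxiter):
        step = f(z, *args) / fprime(z, *args)
        z = z - step
        if abs(step) < tol:
            break
    return z


# Solve z**2 + 1 = 0 starting from x0 = 1 + 1j.
root = newton(f, fprime, 1 + 1j, args=(2, -1))
print(root)
```

# Starting in the upper half plane, the iteration settles on the root 1j
# listed for this test case rather than on -1j.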
def _add_a_b(tests):
    r"""Add "a" and "b" keys to each test from the "bracket" value"""
    for d in tests:
        for k, v in zip(['a', 'b'], d.get('bracket', [])):
            d[k] = v


_add_a_b(_ORIGINAL_TESTS_DICTS)
_add_a_b(_APS_TESTS_DICTS)
_add_a_b(_COMPLEX_TESTS_DICTS)


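# Illustration only, not part of the module: a small sketch of what
# _add_a_b does to a dict with and without a "bracket" entry. The helper is
# copied under the hypothetical name `add_a_b` so the sketch is
# self-contained, and both example dicts are made up for the demonstration.

```python
def add_a_b(tests):
    # Same logic as _add_a_b: unpack "bracket" into the keys "a" and "b".
    for d in tests:
        for k, v in zip(['a', 'b'], d.get('bracket', [])):
            d[k] = v


# Made-up example records, not taken from the real collections.
with_bracket = {"ID": "demo.bracket", "bracket": [2, 3]}
without_bracket = {"ID": "demo.nobracket", "x0": 1 + 1j}

add_a_b([with_bracket, without_bracket])
print(with_bracket)
print(without_bracket)
```

# Because zip() stops at the shorter input, a dict without "bracket" is left
# unchanged, which is why the call on the "complex" collection is harmless.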
def get_tests(collection='original', smoothness=None):
    r"""Return the requested collection of test cases, as a list of dicts with subset-specific keys

    Allowed values of collection:
    'original': The original benchmarking functions.
         Real-valued functions of real-valued inputs on an interval with a zero.
         f1, .., f3 are continuous and infinitely differentiable
         f4 has a single discontinuity at the root
         f5 has a root at 1 replacing a 1st order pole
         f6 is randomly positive on one side of the root, randomly negative on the other
    'aps': The test problems in the TOMS "Algorithm 748: Enclosing Zeros of Continuous Functions"
         paper by Alefeld, Potra and Shi. Real-valued functions of
         real-valued inputs on an interval with a zero.
         Suitable for methods which start with an enclosing interval, and
         derivatives up to 2nd order.
    'complex': Some complex-valued functions of complex-valued inputs.
         No enclosing bracket is provided.
         Suitable for methods which use one or more starting values, and
         derivatives up to 2nd order.
    'chandrupatla': The test problems that appear in [2]. Real-valued
         functions with an enclosing bracket; no derivatives are provided.
         These cases carry no "smoothness" key, so leave smoothness as None.

    Any other value of collection returns an empty list.

    The dictionary keys will be a subset of
    ["f", "fprime", "fprime2", "args", "bracket", "a", "b", "smoothness", "x0", "x1", "root", "nfeval", "ID"]
    """  # noqa: E501
    collection = collection or "original"
    subsets = {"aps": _APS_TESTS_DICTS,
               "complex": _COMPLEX_TESTS_DICTS,
               "original": _ORIGINAL_TESTS_DICTS,
               "chandrupatla": _CHANDRUPATLA_TESTS_DICTS}
    tests = subsets.get(collection, [])
    if smoothness is not None:
        tests = [tc for tc in tests if tc['smoothness'] >= smoothness]
    return tests


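# Illustration only, not part of the module: a sketch of the smoothness
# filter applied by get_tests. The records in `demo_tests` and the helper
# name `filter_by_smoothness` are hypothetical; only the filtering
# expression is taken from the function above.

```python
import math

# Made-up stand-in records; only the keys used by the filter are present.
demo_tests = [
    {"ID": "demo.00", "smoothness": 0},
    {"ID": "demo.01", "smoothness": 1},
    {"ID": "demo.02", "smoothness": math.inf},
]


def filter_by_smoothness(tests, smoothness=None):
    # Keep every case that is at least `smoothness` times differentiable;
    # None means "no filtering", as in get_tests.
    if smoothness is not None:
        tests = [tc for tc in tests if tc["smoothness"] >= smoothness]
    return tests


ids = [tc["ID"] for tc in filter_by_smoothness(demo_tests, smoothness=1)]
print(ids)
```

# A caller testing a method that needs a first derivative would presumably
# ask for smoothness=1, which drops cases that are only continuous.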
# Backwards compatibility
methods = [cc.bisect, cc.ridder, cc.brenth, cc.brentq]
mstrings = ['cc.bisect', 'cc.ridder', 'cc.brenth', 'cc.brentq']
functions = [f2, f3, f4, f5, f6]
fstrings = ['f2', 'f3', 'f4', 'f5', 'f6']

# ##################
# "Chandrupatla" test cases
# Functions and test cases that appear in [2]

def fun1(x):
    return x**3 - 2*x - 5
fun1.root = 2.0945514815423265  # additional precision using mpmath.findroot


def fun2(x):
    return 1 - 1/x**2
fun2.root = 1


def fun3(x):
    return (x-3)**3
fun3.root = 3


def fun4(x):
    return 6*(x-2)**5
fun4.root = 2


def fun5(x):
    return x**9
fun5.root = 0


def fun6(x):
    return x**19
fun6.root = 0


def fun7(x):
    xp = array_namespace(x)
    return 0 if xp.abs(x) < 3.8e-4 else x*xp.exp(-x**(-2))
fun7.root = 0


def fun8(x):
    xp = array_namespace(x)
    xi = 0.61489
    return -(3062*(1-xi)*xp.exp(-x))/(xi + (1-xi)*xp.exp(-x)) - 1013 + 1628/x
fun8.root = 1.0375360332870405


def fun9(x):
    xp = array_namespace(x)
    return xp.exp(x) - 2 - 0.01/x**2 + .000002/x**3
fun9.root = 0.7032048403631358

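# Illustration only, not part of the module: a sketch of the pattern used
# above, where the known root is stored as an attribute on the function
# object. It copies fun1 and checks that the stored value is a root to
# within floating-point rounding. The tolerance 1e-12 is an assumption
# chosen for this demonstration.

```python
def fun1(x):
    return x**3 - 2*x - 5


# A function is an ordinary object, so it can carry its own reference root.
fun1.root = 2.0945514815423265

residual = fun1(fun1.root)
print(abs(residual) < 1e-12)
```

# Keeping the root next to the function means each row of the test table can
# refer to it as fun1.root instead of repeating the literal.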
# Each "Chandrupatla" test case has
# - a function,
# - a bracketing interval,
# - the root,
# - the number of function evaluations required by Chandrupatla's algorithm,
# - an identifier of the test case.
#
# Chandrupatla's is a bracketing algorithm, so a bracketing interval was
# provided in [2] for each test case. No special support for testing with
# secant/Newton/Halley is provided.

_CHANDRUPATLA_TESTS_KEYS = ["f", "bracket", "root", "nfeval", "ID"]
_CHANDRUPATLA_TESTS = [
    [fun1, [2, 3], fun1.root, 7],
    [fun1, [1, 10], fun1.root, 11],
    [fun1, [1, 100], fun1.root, 14],
    [fun1, [-1e4, 1e4], fun1.root, 23],
    [fun1, [-1e10, 1e10], fun1.root, 43],
    [fun2, [0.5, 1.51], fun2.root, 8],
    [fun2, [1e-4, 1e4], fun2.root, 22],
    [fun2, [1e-6, 1e6], fun2.root, 28],
    [fun2, [1e-10, 1e10], fun2.root, 41],
    [fun2, [1e-12, 1e12], fun2.root, 48],
    [fun3, [0, 5], fun3.root, 21],
    [fun3, [-10, 10], fun3.root, 23],
    [fun3, [-1e4, 1e4], fun3.root, 36],
    [fun3, [-1e6, 1e6], fun3.root, 45],
    [fun3, [-1e10, 1e10], fun3.root, 55],
    [fun4, [0, 5], fun4.root, 21],
    [fun4, [-10, 10], fun4.root, 23],
    [fun4, [-1e4, 1e4], fun4.root, 33],
    [fun4, [-1e6, 1e6], fun4.root, 43],
    [fun4, [-1e10, 1e10], fun4.root, 54],
    [fun5, [-1, 4], fun5.root, 21],
    [fun5, [-2, 5], fun5.root, 22],
    [fun5, [-1, 10], fun5.root, 23],
    [fun5, [-5, 50], fun5.root, 25],
    [fun5, [-10, 100], fun5.root, 26],
    [fun6, [-1., 4.], fun6.root, 21],
    [fun6, [-2., 5.], fun6.root, 22],
    [fun6, [-1., 10.], fun6.root, 23],
    [fun6, [-5., 50.], fun6.root, 25],
    [fun6, [-10., 100.], fun6.root, 26],
    [fun7, [-1, 4], fun7.root, 8],
    [fun7, [-2, 5], fun7.root, 8],
    [fun7, [-1, 10], fun7.root, 11],
    [fun7, [-5, 50], fun7.root, 18],
    [fun7, [-10, 100], fun7.root, 19],
    [fun8, [2e-4, 2], fun8.root, 9],
    [fun8, [2e-4, 3], fun8.root, 10],
    [fun8, [2e-4, 9], fun8.root, 11],
    [fun8, [2e-4, 27], fun8.root, 12],
    [fun8, [2e-4, 81], fun8.root, 14],
    [fun9, [2e-4, 1], fun9.root, 7],
    [fun9, [2e-4, 3], fun9.root, 8],
    [fun9, [2e-4, 9], fun9.root, 10],
    [fun9, [2e-4, 27], fun9.root, 11],
    [fun9, [2e-4, 81], fun9.root, 13],
]
_CHANDRUPATLA_TESTS = [test + [f'{test[0].__name__}.{i%5+1}']
                       for i, test in enumerate(_CHANDRUPATLA_TESTS)]

_CHANDRUPATLA_TESTS_DICTS = [dict(zip(_CHANDRUPATLA_TESTS_KEYS, testcase))
                             for testcase in _CHANDRUPATLA_TESTS]
_add_a_b(_CHANDRUPATLA_TESTS_DICTS)
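
# Illustration only, not part of the module: a sketch of how the ID column
# is generated for the Chandrupatla cases. The shortened table `rows` is
# hypothetical (two functions, five placeholder brackets each); the naming
# expression is the one used above.

```python
def fun1(x):
    return x**3 - 2*x - 5


def fun2(x):
    return 1 - 1/x**2


# Made-up table: five rows per function, with a placeholder in the bracket
# column because only the function name and row index matter for the ID.
rows = [[fun, None] for fun in (fun1, fun2) for _ in range(5)]

# i % 5 + 1 restarts the counter at 1 after every fifth row.
ids = [f'{row[0].__name__}.{i % 5 + 1}' for i, row in enumerate(rows)]
print(ids)
```

# The scheme relies on every function contributing exactly five rows; a
# function with a different number of brackets would shift the numbering of
# all the rows that follow it.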