The Nonlinear Conjugate Gradient (CG) method generalizes its linear namesake to a broader class of convex optimization problems.
bool cg(arma::vec& init_out_vals, std::function<double (const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)> opt_objfn, void* opt_data); bool cg(arma::vec& init_out_vals, std::function<double (const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data)> opt_objfn, void* opt_data, algo_settings_t& settings);
Function arguments:
init_out_vals
a column vector of initial values; will be replaced by the solution values.opt_objfn
the function to be minimized, taking three arguments:
vals_inp
a vector of inputs;grad_out
a vector to store the gradient; andopt_data
additional parameters passed to the function.opt_data
additional parameters passed to the function.settings
parameters controlling the optimization routine; see below.Optimization control parameters:
double err_tol
the tolerance value controlling how small $\| \nabla f \|$ should be before 'convergence' is declared.int iter_max
the maximum number of iterations/updates before the algorithm exits.bool vals_bound
whether the search space is bounded. If true, thenarma::vec lower_bounds
this defines the lower bounds.arma::vec upper_bounds
this defines the upper bounds.int cg_method
which updating formula to use; see the details section below.double cg_restart_threshold
the value $\nu$ in the details section below.Let $x^{(i)}$ denote the function input values at stage $i$ of the algorithm. The Nonlinear Conjugate Gradient updating rule is as follows.
cg_method
.The algorithm stops when $\| \nabla f \|$ is less than err_tol
, or the total number of iterations exceeds iter_max
.
cg_method = 1
Fletcher–Reeves (FR):cg_method = 2
Polak-Ribiere (PR) '+':cg_method = 3
FR-PR Hybrid:cg_method = 4
Hestenes-Stiefel:cg_method = 5
Dai-Yuan:cg_method = 6
Hager-Zhang:As examples, consider minimization of:
Below are plots of each example, restricted to the two-dimensional problem where relevant.
Code solving each example using OptimLib is given below.
#include "optim.hpp" double sphere_fn(const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data) { double obj_val = arma::dot(vals_inp,vals_inp); // if (grad_out) { *grad_out = 2.0*vals_inp; } // return obj_val; } double booth_fn(const arma::vec& vals_inp, arma::vec* grad_out, void* opt_data) { double x_1 = vals_inp(0); double x_2 = vals_inp(1); double obj_val = std::pow(x_1 + 2*x_2 - 7.0,2) + std::pow(2*x_1 + x_2 - 5.0,2); // if (grad_out) { (*grad_out)(0) = 2*(x_1 + 2*x_2 - 7.0) + 2*(2*x_1 + x_2 - 5.0)*2; (*grad_out)(1) = 2*(x_1 + 2*x_2 - 7.0)*2 + 2*(2*x_1 + x_2 - 5.0); } // return obj_val; } int main() { // // sphere function const int test_dim = 5; arma::vec x = arma::ones(test_dim,1); // initial values (1,1,...,1) bool success = optim::cg(x,sphere_fn,nullptr); if (success) { std::cout << "cg: sphere test completed successfully." << std::endl; } else { std::cout << "cg: sphere test completed unsuccessfully." << std::endl; } arma::cout << "cg: solution to sphere test:\n" << x << arma::endl; // // Booth function arma::vec x_2 = arma::zeros(2,1) + 2; // initial values (2,2) bool success_2 = optim::cg(x,booth_fn,nullptr); if (success_2) { std::cout << "cg: Booth test completed successfully." << std::endl; } else { std::cout << "cg: Booth test completed unsuccessfully." << std::endl; } arma::cout << "cg: solution to Booth test:\n" << x_2 << arma::endl; return 0; }