Function Optimization

Most data-driven scientific inference, as well as qualitative, quantitative, and visual analytics, involves formulating, understanding the behavior of, and optimizing objective (cost) functions. Presenting the mathematical foundations for representing and interrogating a diverse spectrum of objective functions provides mechanisms for obtaining effective solutions to complex big data problems. (Multivariate) function optimization (minimization or maximization) is the process of searching for variables x1, x2, x3, ..., xn that either minimize or maximize the multivariate cost function f(x1, x2, x3, ..., xn). In this chapter, we will specifically discuss (1) constrained and unconstrained optimization; (2) Lagrange multipliers; (3) linear, quadratic, and (general) non-linear programming; and (4) data denoising.
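To make the search concrete, here is a minimal sketch (assuming Python with NumPy and SciPy, neither of which is prescribed by this chapter) that finds the variables minimizing a simple bivariate cost function; the test function, starting point, and default solver are illustrative assumptions only.

# Minimal sketch: unconstrained minimization of a toy multivariate cost function
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Quadratic bowl with a unique global minimum at (1, -2)
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

x0 = np.array([0.0, 0.0])   # arbitrary starting vector
result = minimize(f, x0)    # free (unconstrained) minimization
print(result.x)             # approximately [1, -2]
print(result.fun)           # objective value near 0

The same call pattern extends to n variables, since f simply receives a length-n vector.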

22.1 Free (Unconstrained) Optimization

We will start with function optimization without restrictions on the domain of the cost function, $\Omega \ni \{x_i\}$. The extreme value theorem suggests that a solution to the free optimization problem, $\min_{x_1, x_2, \ldots, x_n} f(x_1, x_2, \ldots, x_n)$ or $\max_{x_1, x_2, \ldots, x_n} f(x_1, x_2, \ldots, x_n)$, may be obtained by a gradient vector descent method. This means that we can minimize/maximize the objective function by finding solutions to $\nabla f = \left\{ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right\} = \{0, 0, \ldots, 0\}$. Solutions of this equation, $x_1, \ldots, x_n$, represent candidate (local) minima and maxima. In general, identifying critical points using the gradient or tangent plane, where the partial derivatives are trivial, may not be sufficient to determine the extrema (minima or maxima) of multivariate objective functions. Some critical points may represent inflection points, or local extrema that are far from the global optimum of the objective function. The eigenvalues of the Hessian matrix, which contains the second-order partial derivatives, evaluated at the critical points provide clues for pinpointing extrema. For instance, at a critical point where the gradient is trivial, an invertible Hessian matrix that is (i) positive definite (i.e., all eigenvalues are positive) indicates a local minimum of the objective function, (ii) negative definite (all eigenvalues are negative) indicates a local maximum, and (iii) indefinite (with both positive and negative eigenvalues) indicates a saddle point.
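As a sketch of the Hessian-eigenvalue test just described (illustrative code, not from the chapter; Python with NumPy is assumed), consider the toy function f(x, y) = x^2 - y^2, whose gradient vanishes at the origin. Its Hessian is constant and is hard-coded below from the second-order partial derivatives.

# Classify a critical point by the signs of the Hessian eigenvalues
import numpy as np

# Hessian of f(x, y) = x^2 - y^2:  d2f/dx2 = 2, d2f/dy2 = -2, mixed partials = 0
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

eigvals = np.linalg.eigvalsh(H)   # eigenvalues of the symmetric Hessian
if np.all(eigvals > 0):
    print("local minimum")
elif np.all(eigvals < 0):
    print("local maximum")
elif np.any(eigvals > 0) and np.any(eigvals < 0):
    print("saddle point")         # the case here: eigenvalues -2 and 2
else:
    print("inconclusive (singular Hessian)")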

There are two complementary strategies for avoiding entrapment in local extrema. First, we can run many iterations with different initial vectors. At each iteration, the objective function may reach a (local) maximum, minimum, or saddle point; at the end, we select the overall minimal (or maximal) value across all iterations. Another adaptive strategy involves either adjusting the step sizes or accepting solutions in probability; e.g., simulated annealing is one example of an adaptive optimization technique.
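The multi-start strategy can be sketched as follows (an illustrative Python/SciPy example, not the chapter's own code): launch a local optimizer from many random initial vectors and retain the best result. The Styblinski-Tang test function, the search box, and the number of restarts are assumptions chosen only for demonstration.

# Multi-start local optimization: many random initial vectors, keep the best
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Styblinski-Tang function: several local minima,
    # global minimum near x = (-2.9035, -2.9035)
    return 0.5 * np.sum(x ** 4 - 16 * x ** 2 + 5 * x)

rng = np.random.default_rng(2022)          # fixed seed for reproducibility
best = None
for _ in range(25):                        # 25 random restarts (illustrative)
    x0 = rng.uniform(-5, 5, size=2)        # random initial vector in [-5, 5]^2
    res = minimize(f, x0)                  # local, gradient-based search
    if best is None or res.fun < best.fun:
        best = res                         # keep the overall smallest value

print(best.x, best.fun)

For the probabilistic-acceptance strategy, an off-the-shelf routine such as scipy.optimize.dual_annealing, a generalized simulated annealing implementation, can play a similar role without hand-rolled restarts.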