**Def** <i><u>Bi-level Optimization Problem</u></i>
A bi-level optimization problem consists of two related optimization tasks, each assigned to a decision level (i.e., upper and lower levels). Evaluating an upper-level solution requires solving the lower-level problem.
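For concreteness, a generic bi-level program can be written as follows (a standard formulation; the symbols $F$, $f$, $x$ and $y$ are labels introduced here for illustration):

$$ \min_{x} F(x, y(x)) \quad \text{subject to} \quad y(x) \in \operatorname*{argmin}_{u} f(x, u) $$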
**Def** <i><u>Parameterized Optimization Problem</u></i>
A parameterized optimization problem depends on a parameter $x$, and we study how the feasible set, the value function, and the local or global minimizers of the program change with that parameter. That is, we define a solution mapping as:

$$ S(x) = \operatorname*{argmin}_{y \in C(x)} f(x, y) $$

where $C(x)$ denotes the feasible set for parameter $x$.
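As a simple illustration (an example added here, not from the original notes), for $f(x, y) = (y - x)^2$ with no constraints the solution mapping is single-valued:

$$ S(x) = \operatorname*{argmin}_{y \in \R} \, (y - x)^2 = \{x\} $$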
Differentiable Optimization
**Thrm** <i><u>Dini’s Implicit Function Theorem</u></i>
Let $f : \R^n \times \R^m \to \R^m$ be differentiable in a neighbourhood of $(\bar{x}, \bar{y})$ and such that $f(\bar{x}, \bar{y}) = 0$, and let the partial Jacobian $\frac{\partial f}{\partial y}(\bar{x}, \bar{y})$ be nonsingular. Then the solution mapping

$$ S(x) = \{ y \in \R^m : f(x, y) = 0 \} $$

has a single-valued localization $s$ around $\bar{x}$ for $\bar{y}$ which is differentiable in a neighbourhood of $\bar{x}$ with Jacobian satisfying

$$ \frac{\partial s(x)}{\partial x} = -\left( \frac{\partial f}{\partial y}(x, s(x)) \right)^{-1} \frac{\partial f}{\partial x}(x, s(x)) $$
e.g. Circle example: Consider $f(x, y) = x^2 + y^2 - 1$ (the unit circle); for $y \neq 0$ we have

$$ \frac{\mathrm{d}y}{\mathrm{d}x} = -\left( \frac{\partial f}{\partial y} \right)^{-1} \frac{\partial f}{\partial x} = -\frac{2x}{2y} = -\frac{x}{y} $$
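As a quick numerical check of the circle example, here is a minimal sketch using JAX; the helper name `implicit_dydx` is our own, not from the notes:

```python
import jax
import jax.numpy as jnp

def f(x, y):
    # implicit equation of the unit circle: f(x, y) = 0
    return x ** 2 + y ** 2 - 1.0

def implicit_dydx(x, y):
    # scalar case of Dini's formula: dy/dx = -(df/dy)^{-1} df/dx
    dfdx = jax.grad(f, argnums=0)(x, y)
    dfdy = jax.grad(f, argnums=1)(x, y)
    return -dfdx / dfdy

x = 0.6
y = float(jnp.sqrt(1.0 - x ** 2))  # point on the upper half of the circle
print(implicit_dydx(x, y))         # -0.75
print(-x / y)                      # agrees with the closed form -x/y
```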
**Thrm** <i><u>Differentiating Unconstrained Optimization Problems</u></i>
Let $f : \R^n \times \R^m \to \R$ be twice differentiable and let $y(x) \in \operatorname*{argmin}_{u \in \R^m} f(x, u)$. Then for non-singular Hessian $H = \frac{\partial^2 f(x,y)}{\partial y^2} \in \R^{m \times m}$,

$$ \frac{\partial y(x)}{\partial x} = -H^{-1} B \quad \text{where} \quad B = \frac{\partial^2 f(x,y)}{\partial x \partial y} \in \R^{m \times n} $$
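To illustrate the result, here is a minimal JAX sketch on a toy quadratic $f(x, y) = \frac{1}{2} y^{\mathrm{T}} Q y - x^{\mathrm{T}} y$, whose minimizer $y(x) = Q^{-1} x$ has the known Jacobian $Q^{-1}$; the problem and all names are illustrative assumptions, not from the notes:

```python
import jax
import jax.numpy as jnp

Q = jnp.array([[3.0, 1.0], [1.0, 2.0]])  # symmetric positive definite

def f(x, y):
    # toy objective with closed-form minimizer y(x) = Q^{-1} x
    return 0.5 * y @ Q @ y - x @ y

x = jnp.array([1.0, -1.0])
y = jnp.linalg.solve(Q, x)  # argmin_y f(x, y)

H = jax.hessian(f, argnums=1)(x, y)                         # d^2f/dy^2,   m x m
B = jax.jacobian(jax.grad(f, argnums=1), argnums=0)(x, y)   # d^2f/dxdy,   m x n

dydx = -jnp.linalg.solve(H, B)                  # implicit gradient -H^{-1} B
print(jnp.allclose(dydx, jnp.linalg.inv(Q)))    # True: dy/dx = Q^{-1}
```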
**Thrm** <i><u>Differentiating Equality Constrained Optimization Problems</u></i>
Consider functions $f : \R^n \times \R^m \to \R$ and $h : \R^n \times \R^m \to \R^q$. Let

$$ y(x) \in \operatorname*{argmin}_{u \in \R^m} f(x, u) \quad \text{subject to} \quad h(x, u) = 0 $$

Assume that $y(x)$ exists, that $f$ and $h$ are twice differentiable in the neighbourhood of $(x, y(x))$, and that $\operatorname{rank}\left(\frac{\partial h(x,y)}{\partial y}\right) = q$. Then for $H$ non-singular:

$$ \frac{\partial y(x)}{\partial x} = H^{-1} A^{\mathrm{T}} \left( A H^{-1} A^{\mathrm{T}} \right)^{-1} \left( A H^{-1} B - C \right) - H^{-1} B $$

where

$$ \begin{array}{ll} A = \dfrac{\partial h(x,y)}{\partial y} \in \R^{q \times m} & B = \dfrac{\partial^2 f(x,y)}{\partial x \partial y} - \displaystyle\sum_{i=1}^q \nu_i \dfrac{\partial^2 h_i(x,y)}{\partial x \partial y} \in \R^{m \times n} \\[2ex] C = \dfrac{\partial h(x,y)}{\partial x} \in \R^{q \times n} & H = \dfrac{\partial^2 f(x,y)}{\partial y^2} - \displaystyle\sum_{i=1}^q \nu_i \dfrac{\partial^2 h_i(x,y)}{\partial y^2} \in \R^{m \times m} \end{array} $$

and $\nu \in \R^q$ satisfies $\nu^{\mathrm{T}} A = \frac{\partial f(x,y)}{\partial y}$.

**Fact** <i><u>Dealing with Inequality Constraints</u></i>

Replace inequality constraints with a log-barrier approximation (i.e., augment the objective with $-\frac{1}{t} \sum_i \log(-h_i(x, y))$ for a barrier parameter $t > 0$); treat them as equality constraints if active and ignore them otherwise.

**Fact** At one extreme we can try to back-propagate through the optimization algorithm, while at the other extreme we can use the implicit differentiation result to hand-craft efficient backward-pass code. However, there are also options in between, for example:

- use automatic differentiation to obtain the quantities $A$, $B$, $C$ and $H$ from software implementations of the objective and (active) constraint functions, as in the sketch below;
- implement the optimality condition $\nabla \mathcal{L} = 0$ in software and automatically differentiate that.
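Below is a minimal sketch of the first in-between option, assuming JAX for automatic differentiation; the projection problem and all function names are illustrative choices, not from the notes. It obtains $A$, $B$, $C$, $H$ and $\nu$ by autodiff from implementations of $f$ and $h$, then applies the closed-form result above:

```python
# Illustrative problem: project x onto the hyperplane {y : 1^T y = 1},
# i.e. minimize f(x, y) = ||y - x||^2 subject to h(x, y) = 0.
# Known answer: dy/dx = I - (1/m) 11^T.
import jax
import jax.numpy as jnp

def f(x, y):
    return jnp.sum((y - x) ** 2)

def h(x, y):  # equality constraints, values in R^q
    return jnp.array([jnp.sum(y) - 1.0])

m = 3
x = jnp.array([0.3, -0.2, 1.4])
y = x + (1.0 - jnp.sum(x)) / m  # closed-form minimizer for this toy problem

A = jax.jacobian(h, argnums=1)(x, y)   # dh/dy,  q x m
C = jax.jacobian(h, argnums=0)(x, y)   # dh/dx,  q x n
fy = jax.grad(f, argnums=1)(x, y)
nu, *_ = jnp.linalg.lstsq(A.T, fy)     # solve nu^T A = df/dy

def lagrangian_y(x, y):
    # gradient of the Lagrangian L = f - nu^T h with respect to y
    return jax.grad(f, argnums=1)(x, y) - nu @ jax.jacobian(h, argnums=1)(x, y)

H = jax.jacobian(lagrangian_y, argnums=1)(x, y)  # m x m
B = jax.jacobian(lagrangian_y, argnums=0)(x, y)  # m x n

Hinv_A = jnp.linalg.solve(H, A.T)
Hinv_B = jnp.linalg.solve(H, B)
dydx = Hinv_A @ jnp.linalg.solve(A @ Hinv_A, A @ Hinv_B - C) - Hinv_B
print(jnp.allclose(dydx, jnp.eye(m) - jnp.ones((m, m)) / m))  # True
```

Here $\nu$ is recovered from the stationarity condition $\nu^{\mathrm{T}} A = \frac{\partial f(x,y)}{\partial y}$ by least squares, which is well-posed under the full-rank assumption on $\frac{\partial h}{\partial y}$.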