Bi-level Optimization

**Def** <i><u>Bi-level Optimization Problem</u></i> A bi-level optimization problem comprises two related optimization tasks, each assigned to a decision level (i.e., upper and lower). Evaluating an upper-level solution requires solving the lower-level problem.
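
In symbols, a generic bi-level program can be written (a standard textbook formulation, not reconstructed from this text) as

$$\min_{x \in \mathcal{X}} \; F\big(x, y^{\star}(x)\big) \quad \text{subject to} \quad y^{\star}(x) \in \operatorname*{argmin}_{y \in \mathcal{Y}} f(x, y)$$

where $F$ is the upper-level objective and $f$ the lower-level objective. Hyperparameter optimization is a familiar instance: the upper level tunes hyperparameters $x$ while the lower level trains model weights $y$.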

**Def** <i><u>Parameterized Optimization Problem</u></i> A parameterized optimization problem depends on a parameter, and we study how the feasible set, the value function, and the local or global minimizers of the program change as the parameter changes. That is, we define a solution mapping as
$$S(x) = \operatorname*{argmin}_{y} \{\, f(x, y) : y \in \mathcal{C}(x) \,\}.$$
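
For a concrete instance (an illustrative example, not from the source): take $f(x, y) = (y - x^2)^2$ with $y$ unconstrained. Then the solution mapping is $S(x) = \{x^2\}$, the value function is identically $0$, and the minimizer varies smoothly with the parameter $x$.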

Differentiable Optimization

**Thrm** <i><u>Dini's Implicit Function Theorem</u></i> Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ be differentiable in a neighbourhood of $(\bar{x}, \bar{y})$ and such that $f(\bar{x}, \bar{y}) = 0$, and let $\frac{\partial f}{\partial y}(\bar{x}, \bar{y})$ be nonsingular. Then the solution mapping $S(x) = \{ y \in \mathbb{R}^m : f(x, y) = 0 \}$ has a single-valued localization $s$ around $\bar{x}$ for $\bar{y}$ which is differentiable in a neighbourhood $X$ of $\bar{x}$ with Jacobian satisfying
$$\frac{\partial s(x)}{\partial x} = -\left(\frac{\partial f(x, s(x))}{\partial y}\right)^{-1} \frac{\partial f(x, s(x))}{\partial x} \quad \text{for every } x \in X.$$

**e.g.** <i><u>Circle Example</u></i> Consider $f(x, y) = x^2 + y^2 - 1$ (the unit circle); we have
$$\frac{ds(x)}{dx} = -\left(\frac{\partial f(x, y)}{\partial y}\right)^{-1} \frac{\partial f(x, y)}{\partial x} = -\frac{2x}{2y} = -\frac{x}{y} \quad \text{for } y \neq 0.$$
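
The circle example is easy to verify numerically. The sketch below (a self-contained illustration; all names are ours) compares the implicit-function-theorem derivative $-\left(\partial f/\partial y\right)^{-1} \partial f/\partial x$ against a finite-difference derivative of the upper-semicircle branch:

```python
import numpy as np

# Numerical check of Dini's theorem on the unit circle f(x, y) = x^2 + y^2 - 1.
# Illustrative sketch only; function names are our own.

def f_x(x, y):
    return 2.0 * x          # df/dx

def f_y(x, y):
    return 2.0 * y          # df/dy

def solve_y(x):
    # Upper-semicircle branch of the solution mapping S(x) = {y : f(x, y) = 0}.
    return np.sqrt(1.0 - x**2)

x = 0.6
y = solve_y(x)              # y = 0.8

# Implicit derivative: ds/dx = -(df/dy)^{-1} df/dx = -x/y.
implicit_grad = -f_x(x, y) / f_y(x, y)

# Central finite difference of the localized solution, for comparison.
eps = 1e-6
fd_grad = (solve_y(x + eps) - solve_y(x - eps)) / (2.0 * eps)

print(implicit_grad, fd_grad)   # both approximately -0.75 = -x/y
```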

**Thrm** <i><u>Differentiating Unconstrained Optimization Problems</u></i> Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be twice differentiable and let $y(x) \in \operatorname*{argmin}_{u \in \mathbb{R}^m} f(x, u)$. Then for non-singular Hessian $H = \frac{\partial^2 f(x, y)}{\partial y^2} \in \mathbb{R}^{m \times m}$:
$$\frac{dy(x)}{dx} = -H^{-1} B, \quad \text{where } B = \frac{\partial^2 f(x, y)}{\partial x \partial y} \in \mathbb{R}^{m \times n}.$$
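
As a minimal sketch (the objective is our own toy choice, not from the notes): for $f(x, u) = (u - \sin x)^2$ the minimizer is $y(x) = \sin x$, so the result should recover $\frac{dy}{dx} = \cos x$:

```python
import numpy as np

# Toy check of dy/dx = -H^{-1} B for f(x, u) = (u - sin(x))^2.
# Here y(x) = sin(x), so the implicit derivative should equal cos(x).

x = 0.4
y = np.sin(x)                       # minimizer of f(x, .)

H = np.array([[2.0]])               # d^2 f / dy^2  at (x, y),  (m x m) with m = 1
B = np.array([[-2.0 * np.cos(x)]])  # d^2 f / dx dy at (x, y),  (m x n) with n = 1

dy_dx = -np.linalg.solve(H, B)      # -H^{-1} B
print(dy_dx.item(), np.cos(x))      # both equal cos(x)
```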

**Thrm** <i><u>Differentiating Equality Constrained Optimization Problems</u></i> Consider functions $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ and $h : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^q$. Let
$$y(x) \in \operatorname*{argmin}_{u \in \mathbb{R}^m} f(x, u) \quad \text{subject to} \quad h(x, u) = 0.$$
Assume that $y(x)$ exists, that $f$ and $h$ are twice differentiable in the neighbourhood of $(x, y(x))$, and that $\operatorname{rank}\!\left(\frac{\partial h(x, y)}{\partial y}\right) = q$. Then for $H$ non-singular:
$$\frac{dy(x)}{dx} = H^{-1} A^{\mathrm{T}} \left(A H^{-1} A^{\mathrm{T}}\right)^{-1} \left(A H^{-1} B - C\right) - H^{-1} B$$
where
$$\begin{array}{ll}
A = \dfrac{\partial h(x, y)}{\partial y} \in \mathbb{R}^{q \times m} & B = \dfrac{\partial^2 f(x, y)}{\partial x \partial y} - \displaystyle\sum_{i=1}^q \nu_i \dfrac{\partial^2 h_i(x, y)}{\partial x \partial y} \in \mathbb{R}^{m \times n} \\[2ex]
C = \dfrac{\partial h(x, y)}{\partial x} \in \mathbb{R}^{q \times n} & H = \dfrac{\partial^2 f(x, y)}{\partial y^2} - \displaystyle\sum_{i=1}^q \nu_i \dfrac{\partial^2 h_i(x, y)}{\partial y^2} \in \mathbb{R}^{m \times m}
\end{array}$$
and $\nu \in \mathbb{R}^q$ satisfies $\nu^{\mathrm{T}} A = \frac{\partial f(x, y)}{\partial y}$.

**Fact** <i><u>Dealing with Inequality Constraints</u></i> Replace inequality constraints with a log-barrier approximation, or treat them as equality constraints if active and ignore them otherwise.

**Fact** At one extreme we can back-propagate through the optimization algorithm itself; at the other extreme we can use the implicit differentiation result to hand-craft efficient backward-pass code. However, there are also options in between; for example (see the sketch after this list):
- use automatic differentiation to obtain the quantities $A$, $B$, $C$ and $H$ from software implementations of the objective and (active) constraint functions;
- implement the optimality condition $\nabla \mathcal{L} = 0$ in software and automatically differentiate that.
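
As a minimal sketch of the hand-crafted extreme (the sphere-projection problem and all variable names are our own illustration, not from the notes above), the following NumPy snippet evaluates the formula for $\frac{dy(x)}{dx}$ on the projection of $x$ onto the unit sphere, where the derivative is known in closed form:

```python
import numpy as np

# Equality-constrained toy problem: y(x) = argmin_u ||u - x||^2 s.t. ||u||^2 = 1,
# whose closed-form solution is y = x / ||x|| with dy/dx = (I - y y^T) / ||x||.

x = np.array([1.0, 2.0, 2.0])
n = m = x.size
y = x / np.linalg.norm(x)                   # the minimizer

# Lagrange multiplier from nu^T A = df/dy, with df/dy = 2(y - x)^T and A = 2 y^T.
nu = 1.0 - np.linalg.norm(x)

A = 2.0 * y[None, :]                        # dh/dy,                     (q x m), q = 1
B = -2.0 * np.eye(m)                        # d2f/dxdy - nu * d2h/dxdy,  (m x n)
C = np.zeros((1, n))                        # dh/dx,                     (q x n)
H = (2.0 - 2.0 * nu) * np.eye(m)            # d2f/dy2  - nu * d2h/dy2,   (m x m)

HinvA = np.linalg.solve(H, A.T)             # H^{-1} A^T
HinvB = np.linalg.solve(H, B)               # H^{-1} B
dy_dx = HinvA @ np.linalg.solve(A @ HinvA, A @ HinvB - C) - HinvB

expected = (np.eye(m) - np.outer(y, y)) / np.linalg.norm(x)
print(np.allclose(dy_dx, expected))         # True
```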