# Constructing an interpolated function for the class of smooth convex functions

Shuvomoy Das Gupta

November 22, 2021

In this blog, we study constructing interpolated function for $L$​​-smooth convex function due to Yoel Drori from [3].

Notation and notions. All norms are 2-norm in this blog. A function is $L$-smooth convex if and only

$\left(\forall x,y\in\mathbf{R}^{d}\right)\quad f(y)\geq f(x)+\langle\nabla f(x)\mid y-x\rangle+\frac{1}{2L}\|\nabla f(x)-\nabla f(y)\|^{2}.$

On the other hand, a function is $\mu$​-strongly convex if and only if

$\left(\forall x,y\in\mathbf{R}^{d}\right)\quad f(y)\geq f(x)+\langle\nabla f(x)\mid y-x\rangle+\frac{\mu}{2}\|f^{\prime}(x)-f^{\prime}(y)\|^{2},$

​ where $f^{\prime}(\cdot)$​​ denotes a subgradient of $f$​​ at $(\cdot)$​​. The set of all $L$​​-smooth convex functions on $\mathbf{R}^{d}$​​ is denoted by $\mathcal{F}_{0,L}(\mathbf{R}^{d})$​​, and the set of all $\mu$​​-strongly convex functions is denoted by $\mathcal{F}_{\mu,\infty}(\mathbf{R}^{d})$​​​.

### Sion's minimax theorem

We are going to use the famous result due to Maurice Sion.

## Theorem 1: Sion's minimax theorem Theorem

Let $X$ be a compact convex set (over which minimization will be performed) and $Y$ be a convex set in $\mathbf{R}^{d}$ (over which supremum will be computed). Suppose $g:X\times Y\to\mathbf{R}$ satisfies the following properties: (i) $g(x,\cdot)$ is upper-semicontinuous and quasi-concave on $Y$ for all $x\in X$, and (ii) $g(\cdot,y)$ is lower-semicontinuous and quasi-convex on $X$ for all $y\in Y$. Then we have

$\sup_{y\in Y}\min_{x\in X}g(x,y)=\min_{x\in X}\sup_{y\in Y}g(x,y).$

## Definition 1: Interpolable function. Definition

Suppose we are given the set of triplets $\{(x_{i},g_{i},f_{i})\}_{i\in I}\subseteq\mathbf{R}^{d}\times\mathbf{R}^{d}\times\mathbf{R}$​ where $I$​ is a finite index set. Let $\mathcal{F}(\mathbf{R}^d)$​ be some class of functions. Then the set $\{(x_{i},g_{i},f_{i} ) \}_{i\in I}$​ is $\mathcal{F}(\mathbf{R}^{d})$​-interpolable if and only if there exists a function $f\in\mathcal{F}(\mathbf{R}^{d})$​ such that for all $i\in I$​ we have $f_{i}=f(x_{i})$​ and $g_{i} \in \partial f(x_{i})$​.

### Main result (due to Yoel Drori)

Next, we prove the following result due to Yoel Drori from [3].

## Theorem 2: Interpolation of smooth convex functions. Theorem

If $\{(x_{i},g_{i},f_{i})\}_{i\in I}\subseteq\mathbf{R}^{d}\times\mathbf{R}^{d}\times\mathbf{R}$​​​​ is $\mathcal{F}_{0,L}(\mathbf{R}^{d})$​​​​-interpolable, then one interpolated function $f\in\mathcal{F}_{0,L}(\mathbf{R}^{d})$​​​​ that can be constructed from $\{(x_{i},g_{i},f_{i})\}_{i\in I}$​​​​ is:

$f(x)=\min_{\alpha\in\Delta}\left[\frac{L}{2}\|x-\sum_{i\in I}\alpha_{i}\left(x_{i}-\frac{1}{L}g_{i}\right)\|^{2}+\sum_{i\in I}\alpha_{i}\left(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\right)\right],$

​​​​ where $\Delta=\{\bar{\alpha}\in\mathbf{R}^{\vert I\vert}\mid\bar{\alpha}\geq0,\sum_{i\in I}\bar{\alpha}_{i}=1\}.$​​​​​

Proof sketch. We use the following chain of logic:

(a) $\{(x_{i},g_{i},f_{i})\}_{i\in I}$ interpolated by $f\in\mathcal{F}_{0,L}(\mathbf{R}^{d})$

$\Leftrightarrow$

(b) $\{(\bar{x}_{i},\bar{g}_{i},\bar{f}_{i})\}_{i\in I}\coloneqq\{(g_{i},x_{i},\left\langle x_{i}\mid g_{i}\right\rangle -f_{i})\}_{i\in I}$​ interpolated by $h=f^{*}\in\mathcal{F}_{1/L,\infty}$​, where $f^*$ denotes conjugate function of $f$.

So, we start backwards from (b). First construct $h$ that interpolates $\{(\bar{x}_{i},\bar{g}_{i},\bar{f}_{i})\}_{i\in I}$. Then construct $f=h^{*},$ which will be interpolate $\{(x_{i},g_{i},f_{i})\}_{i\in I}$​​.

Now we start the proof.

### Proof

The set $\{(x_{i},g_{i},f_{i})\}_{i\in I}$​​​​ is $\mathcal{F}_{0,L}(\mathbf{R}^{d})$​​​​-interpolable if and only if $\{(\bar{x}_{i},\bar{g}_{i},\bar{f}_{i})\}_{i\in I}\coloneqq\{(g_{i},x_{i},\left\langle x_{i}\mid g_{i}\right\rangle -f_{i})\}_{i\in I}$​​​​ is $\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$-interpolable​​​​, which is from [1, Lemma 3.7]. Also, if $\{(\bar{x}_{i},\bar{g}_{i},\bar{f}_{i})\}_{i\in I}$​​​​ is $\mathcal{F}_{\mu,\infty}(\mathbf{R}^{d})$​​​​-interpolable then one such interpolated function $h\in\mathcal{F}_{\mu,\infty}(\mathbf{R}^{d})$​​​​ would be:

\begin{aligned} h(\bar{x}) & =\max_{i\in I}\{\bar{f}_{i}+\left\langle \bar{g}_{i}\mid\bar{x}-\bar{x}_{i}\right\rangle +\frac{\mu}{2}\|\bar{x}-\bar{x}_{i}\|^{2}\}\end{aligned}

​​​​ where we have used the fact that $\max_{i\in I}a_{i}=\max_{\alpha\in\Delta}\sum_{i\in I}\alpha_{i}a_{i}.$​​​​

So, if $\{(\bar{x}_{i},\bar{g}_{i},\bar{f}_{i})\}_{i\in I}\coloneqq\{(g_{i},x_{i},\left\langle x_{i}\mid g_{i}\right\rangle -f_{i})\}_{i\in I}$​ is $\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$​-interpolable, then one such interpolated function $h\in\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$​ would be

\begin{aligned} h(\bar{x}) & =\max_{i\in I}\left\{ \bar{f}_{i}+\left\langle \bar{g}_{i}\mid\bar{x}-\bar{x}_{i}\right\rangle +\frac{1}{2L}\|\bar{x}-\bar{x}_{i}\|^{2}\right\} \\ & =\max_{i\in I}\left\{ \left\langle x_{i}\mid g_{i}\right\rangle -f_{i}+\left\langle x_{i}\mid\bar{x}-g_{i}\right\rangle +\frac{1}{2L}\|\bar{x}-g_{i}\|^{2}\right\} \\ & =\max_{i\in I}\left\{ \cancel{\left\langle x_{i}\mid g_{i}\right\rangle }-f_{i}+\left\langle x_{i}\mid\bar{x}\right\rangle -\cancel{\left\langle x_{i}\mid g_{i}\right\rangle }+\frac{1}{2L}\|\bar{x}-g_{i}\|^{2}\right\} \\ & =\max_{i\in I}\left\{ -f_{i}+\left\langle x_{i}\mid\bar{x}\right\rangle +\frac{1}{2L}\|\bar{x}-g_{i}\|^{2}\right\} \\ & =-\min_{i\in I}\left\{ f_{i}-\left\langle x_{i}\mid\bar{x}\right\rangle -\frac{1}{2L}\|\bar{x}-g_{i}\|^{2}\right\} ,\quad\quad\quad(1)\end{aligned}

​ where in the last line we have used $\max(\cdot)=-\min(-\cdot).$

But, we are looking for an interpolated function that is in $\mathcal{F}_{0,L}(\mathbf{R}^{d}),$ not $\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$. How to go from $\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$ to $\mathcal{F}_{0,L}(\mathbf{R}^{d})$? To that goal, we use the following results:

(i) $f\in\mathcal{F}_{0,L}(\mathbf{R}^{d})$​ if and only if $f$​'s conjugate function $f^{*}(\cdot)=\sup_{y\in\mathbf{R}^{d}}\left[-f(y)+\left\langle \cdot\mid y\right\rangle \right]\in\mathcal{F}_{1/L,\infty}(\mathbf{R}^{d})$[1, Theorem 2.34].

(ii) for any lower-semicontinuous, proper, and convex function $f$, we have $f=f^{**}$ [2, Theorem 12.2].

Due to (i) and (ii), to find the interpolated function in $\mathcal{F}_{0,L}(\mathbf{R}^{d}),$​​ all we have to do is to compute the conjugate function of $h$​​. In other words, the desired interpolated function in $\mathcal{F}_{0,L}(\mathbf{R}^{d})$​​ would be

\begin{aligned} f(x) & =h^{*}(x)\\ & =\sup_{y\in\mathbf{R}^{d}}\left[-h(y)+\left\langle x\mid y\right\rangle \right]\\ & \overset{(1)}{=}\sup_{y\in\mathbf{R}^{d}}\left[-\left[-\min_{i\in I}\left\{ f_{i}-\left\langle x_{i}\mid y\right\rangle -\frac{1}{2L}\|y-g_{i}\|^{2}\right\} \right]+\left\langle x\mid y\right\rangle \right]\\ & =\sup_{y\in\mathbf{R}^{d}}\left[\min_{i\in I}\left\{ f_{i}-\left\langle x_{i}\mid y\right\rangle -\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x\mid y\right\rangle \right\} \right]\\ & =\sup_{y\in\mathbf{R}^{d}}\left[\min_{i\in I}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} \right]\\ & =\sup_{y\in\mathbf{R}^{d}}\left[\min_{\alpha\in\Delta}\sum_{i\in I}\alpha_{i}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} \right],\quad\quad\quad(2)\end{aligned}

where in the last line we have used the fact that $\min_{i\in I}a_{i}=\min_{\alpha\in\Delta}\sum_{i\in I}\alpha_{i}a_{i}$. Now, in (2), denote the inner function by

$p(y,\alpha)=\sum_{i\in I}\alpha_{i}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} ,$

which is continuous and concave with respect to the maximizing variable $y$ and continuous and convex (in fact linear) with respect to the minimizing variable $\alpha$. Finally, $\Delta$–-the minimizing set–-is compact and convex, and $\mathbf{R}^{d}$​–-​​the maximizing set–-is convex. Hence, applying Sion's minimax theorem we have:

\begin{aligned} & f(x)\\ = & \sup_{y\in\mathbf{R}^{d}}\left[\min_{\alpha\in\Delta}\sum_{i\in I}\alpha_{i}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} \right]\\ = & \min_{\alpha\in\Delta}\left[\sup_{y\in\mathbf{R}^{d}}\sum_{i\in I}\alpha_{i}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} \right].\quad\quad\quad(3)\end{aligned}

Let us focus on the inner maximization problem, which is a convex optimization problem. So, by taking derivative with respect to $y$ and then setting that equal to zero, we can compute the maximizer as follows:

\begin{aligned} & \nabla_{y}\left[\sum_{i\in I}\alpha_{i}\left\{ f_{i}-\frac{1}{2L}\|y-g_{i}\|^{2}+\left\langle x-x_{i}\mid y\right\rangle \right\} \right]=0\\ \Leftrightarrow & \sum_{i\in I}\alpha_{i}\left\{ -\frac{1}{\cancel{2}}\frac{1}{L}\times\cancel{2}\times(y-g_{i})+x-x_{i}\right\} =\sum_{i\in I}\alpha_{i}\left(-\frac{1}{L}y+\frac{1}{L}g_{i}+x-x_{i}\right)=0\\ \Leftrightarrow & \sum_{i\in I}\alpha_{i}\left(\frac{1}{L}g_{i}+x-x_{i}\right)=\sum_{i\in I}\alpha_{i}\left(\frac{1}{L}y\right)=\frac{1}{L}y\underbrace{\sum_{i\in I}\alpha_{i}}_{=1}=\frac{1}{L}y\\ \Leftrightarrow & \sum_{i\in I}\alpha_{i}\left(\frac{1}{L}g_{i}+x-x_{i}\right)=\frac{1}{L}y\\ \Leftrightarrow & y^{\star}=L\Big(\sum_{i\in I}\alpha_{i}\{(x-x_{i})+\frac{1}{L}g_{i}\}\Big)\\ \therefore y^{\star} & =L\Big(x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\quad\quad\quad(4) \end{aligned}

Putting the value of $y$​​​​​ in (4) in the inner maximization problem, we have:

\begin{aligned} & \sup_{y\in\mathbf{R}^{d}}\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\underbrace{\|y-g_{i}\|^{2}}_{=\|y\|^{2}+\|g_{i}\|^{2}-2\langle y\mid g_{i}\rangle}+\left\langle x-x_{i}\mid y\right\rangle \Big)\\ & =\sup_{y\in\mathbf{R}^{d}}\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\{\|y\|^{2}+\|g_{i}\|^{2}-2\langle y\mid g_{i}\rangle\}+\left\langle x-x_{i}\mid y\right\rangle \Big)\\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)+\sup_{y\in\mathbf{R}^{d}}\sum_{i\in I}\alpha_{i}\Big(-\frac{1}{2L}\|y^{\star}\|^{2}+\langle y^{\star}\mid\frac{1}{L}g_{i}\rangle+\left\langle x-x_{i}\mid y^{\star}\right\rangle \Big)\\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)+\sup_{y\in\mathbf{R}^{d}}\sum_{i\in I}\alpha_{i}\Big(-\frac{1}{2L}\|y^{\star}\|^{2}+\left\langle x-x_{i}+\frac{1}{L}g_{i}\mid y^{\star}\right\rangle \Big)\\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)+\sum_{i\in I}\alpha_{i}\Big(-\frac{1}{2L}\|y^{\star}\|^{2}+\left\langle x-x_{i}+\frac{1}{L}g_{i}\mid y^{\star}\right\rangle \Big)\\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)-\frac{1}{2L}\|y^{\star}\|^{2}+\sum_{i\in I}\alpha_{i}\left\langle x-x_{i}+\frac{1}{L}g_{i}\mid y^{\star}\right\rangle \\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)-\frac{1}{2L}\|L\Big(x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\|^{2}\\ & +\left\langle \Big(x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\mid L\Big(x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\right\rangle \\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)-\frac{L}{2}\|x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\|^{2}+L\|x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\|^{2}\\ & =\sum_{i\in I}\alpha_{i}\Big(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\Big)+\frac{L}{2}\|x-\sum_{i\in I}\alpha_{i}\big(x_{i}-\frac{1}{L}g_{i}\big)\Big)\|^{2}\\ & =\frac{L}{2}\|x-\sum_{i\in I}\alpha_{i}\left(x_{i}-\frac{1}{L}g_{i}\right)\|^{2}+\sum_{i\in I}\alpha_{i}\left(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\right).\quad\quad\quad(5) \end{aligned}

Using (5) in (3), we arrive at the desired interpolated function $f\in\mathcal{F}_{0,L}(\mathbf{R}^{d}):$

$f(x)=\min_{\alpha\in\Delta}\left[\frac{L}{2}\|x-\sum_{i\in I}\alpha_{i}\left(x_{i}-\frac{1}{L}g_{i}\right)\|^{2}+\sum_{i\in I}\alpha_{i}\left(f_{i}-\frac{1}{2L}\|g_{i}\|^{2}\right)\right],$

​​​​​​ thus completing the proof.

### References

[1] Taylor, Adrien B. Convex interpolation and performance estimation of first-order methods for convex optimization. Diss. Catholic University of Louvain, Louvain-la-Neuve, Belgium, 2017.

[2] Rockafellar, Tyrell. Convex Analysis. Princeton University Press, 1996.

[3] Drori, Yoel. The exact information-based complexity of smooth convex minimization. Journal of Complexity 39 (2017): 1-16.