Shuvomoy Das Gupta
July 30, 2021
In this blog, we provide a proof about certain geometric properties of a prox-regular set based on Proposition 3.1 of the paper "Local Differentiability of Distance Functions" by R. A. Poliquin, R. T. Rockafellar and L. Thibault (link to pdf here ).
We consider nonempty, closed set X \mathcal{X} X in a finite-dimensional vector space, which is prox-regular at a point x ˉ ∈ X \bar{x}\in\mathcal{X} x ˉ ∈ X , i.e. , Euclidean projection onto the set from x ˉ \bar{x} x ˉ is single-valued in some neighborhood of that point. Throughout the blog, the underlying space is a finite-dimensional vector space over the reals. To state the result and to prove it, we introduce certain notation and notions.
An open ball with center c c c and radius r r r is denoted by
B ( c ; r ) = { x ∣ ∥ x − c ∥ < r } , B(c;r)=\left\{ x\mid\|x-c\| < r\right\}, B ( c ; r ) = { x ∣ ∥ x − c ∥ < r } , where the norm is is the 2 norm throughout the blog.
The indicator function of X \mathcal{X} X is denoted by ι , \iota, ι , with
ι ( x ) = { 0 , x ∈ X ∞ , x ∉ X . \iota(x)=\begin{cases} 0, & x\in\mathcal{X}\\ \infty, & x\notin\mathcal{X}. \end{cases} ι ( x ) = { 0 , ∞ , x ∈ X x ∈ / X . A Euclidean projection from a point x x x onto this set X \mathcal{X} X is denoted by Π ( x ) \Pi(x) Π ( x ) , and one arbitrary projection in Π ( x ) \Pi(x) Π ( x ) is denoted by π x \pi_{x} π x . It is defined as:
π x ∈ Π ( x ) = argmin y ∈ X ∥ x − y ∥ 2 . ( Projx ) \pi_{x}\in\Pi(x)=\mathop{\textrm{argmin}_{y\in\mathcal{X}}\|x-y\|^{2}.}\quad(\textrm{Projx}) π x ∈ Π ( x ) = argmin y ∈ X ∥ x − y ∥ 2 . ( Projx ) The distance function at the point x x x is denoted by d ( x ) = ∥ x − π x ∥ d(x)=\|x-\pi_{x}\| d ( x ) = ∥ x − π x ∥ , where the norm is the 2 norm.
A function is locally Lipschitz, if at any point, there is a neighborhood around that point where the function is Lipschitz continuous.
For any closed, nonempty set its distance function is 1-Lipschitz continuous everywhere (Rockafellar and Wets (2009) ) [Example 9.6]. This implies that for any β \beta β , d 2 + β ∥ x ∥ 2 d^{2}+\beta\|x\|^{2} d 2 + β ∥ x ∥ 2 is locally Lipschitz on any point x x x . To see that, consider x ∈ B x\in\mathcal{B} x ∈ B where B \mathcal{B} B is some closed ball around x x x . Then, for any x , y ∈ B , x,y\in\mathcal{B}, x , y ∈ B ,
∣ d 2 ( x ) + β ∥ x ∥ 2 − d 2 ( y ) − β ∥ y ∥ 2 ∣ ≤ ∣ ( d ( x ) − d ( y ) ) ( d ( x ) + d ( y ) ) ∣ + ∣ β ∣ ∣ ∥ x ∥ 2 − ∥ y ∥ 2 undefined = ( ∥ x ∥ + ∥ y ∥ ) ( ∥ x ∥ − ∥ y ∥ ) ∣ ≤ ∣ d ( x ) + d ( y ) ∣ undefined ≤ M 1 ∣ d ( x ) − d ( y ) ∣ undefined ≤ ∥ x − y ∥ + ∣ β ∣ ∣ ∥ x ∥ + ∥ y ∥ ∣ undefined ≤ M 2 ∣ ∥ x ∥ − ∥ y ∥ ∣ undefined ≤ ∥ x − y ∥ ≤ ( M 1 + M 2 β ) ∥ x − y ∥ , ( DistSqdLcLip ) \begin{align*} & |d^{2}(x)+\beta\|x\|^{2}-d^{2}(y)-\beta\|y\|^{2}|\\ \leq & |\left(d(x)-d(y)\right)\left(d(x)+d(y)\right)|+|\beta||\underbrace{\|x\|^{2}-\|y\|^{2}}_{=(\|x\|+\|y\|)(\|x\|-\|y\|)}|\\ \leq & \underbrace{|d(x)+d(y)|}_{\leq M_{1}}\underbrace{|d(x)-d(y)|}_{\leq\|x-y\|}\\ & +|\beta|\underbrace{|\|x\|+\|y\||}_{\leq M_{2}}\underbrace{|\|x\|-\|y\||}_{\leq\|x-y\|}\\ \leq & (M_{1}+M_{2}\beta)\|x-y\|,\quad(\textrm{DistSqdLcLip}) \end{align*} ≤ ≤ ≤ ∣ d 2 ( x ) + β ∥ x ∥ 2 − d 2 ( y ) − β ∥ y ∥ 2 ∣ ∣ ( d ( x ) − d ( y ) ) ( d ( x ) + d ( y ) ) ∣ + ∣ β ∣∣ = ( ∥ x ∥ + ∥ y ∥ ) ( ∥ x ∥ − ∥ y ∥ ) ∥ x ∥ 2 − ∥ y ∥ 2 ∣ ≤ M 1 ∣ d ( x ) + d ( y ) ∣ ≤ ∥ x − y ∥ ∣ d ( x ) − d ( y ) ∣ + ∣ β ∣ ≤ M 2 ∣∥ x ∥ + ∥ y ∥∣ ≤ ∥ x − y ∥ ∣∥ x ∥ − ∥ y ∥∣ ( M 1 + M 2 β ) ∥ x − y ∥ , ( DistSqdLcLip ) where in the second line we have used (i) the existence of constant M 1 , M 2 > 0 M_{1},M_{2}>0 M 1 , M 2 > 0 due to the points x , y ∈ B x,y\in\mathcal{B} x , y ∈ B , and (ii) the reverse triangle inequality.
The proximal normal cone of X \mathcal{X} X at a point x ∈ X x\in\mathcal{X} x ∈ X is defined as follows (Rockafellar and Wets (2009) )
v ∈ N ( x ) ⇔ ∃ τ > 0 x ∈ Π ( x + τ v ) , ( PrxNrmlCn ) v\in N(x)\Leftrightarrow\exists_{\tau>0}\;x\in\Pi(x+\tau v),\quad(\textrm{PrxNrmlCn}) v ∈ N ( x ) ⇔ ∃ τ > 0 x ∈ Π ( x + τv ) , ( PrxNrmlCn ) which also implies that
∀ τ undefined ∈ ( 0 , τ ) Π ( x + τ undefined v ) = { x } . \forall_{\widetilde{\tau}\in(0,\tau)}\;\Pi(x+\widetilde{\tau}v)=\{x\}. ∀ τ ∈ ( 0 , τ ) Π ( x + τ v ) = { x } . For any function f f f (not necessarily convex), its Frechet subdifferential ∂ f \partial f ∂ f at a point x x x is defined as follows (Correa et al. (1992) )
v ∈ ∂ f ( x ) ⇔ lim inf y → 0 f ( x + y ) − f ( x ) − ⟨ v ∣ y ⟩ ∥ y ∥ ≥ 0. v\in\partial f(x)\Leftrightarrow\liminf_{y\to0}\frac{f(x+y)-f(x)-\left\langle v\mid y\right\rangle }{\|y\|}\geq0. v ∈ ∂ f ( x ) ⇔ y → 0 lim inf ∥ y ∥ f ( x + y ) − f ( x ) − ⟨ v ∣ y ⟩ ≥ 0.
On the other hand, The Clarke subdifferential of a locally Lipschitz function f f f is defined as follows:
u ∈ ∂ Clarke f ( x ) ⇔ ∀ d [ lim sup y → x , t ↓ 0 f ( y + t d ) − f ( y ) t ] ≥ ⟨ u ∣ d ⟩ . u\in\partial^{\textrm{Clarke}}f(x)\Leftrightarrow\forall_{d}\;\left[\limsup_{y\to x,t\downarrow0}\frac{f(y+td)-f(y)}{t}\right]\geq\left\langle u\mid d\right\rangle . u ∈ ∂ Clarke f ( x ) ⇔ ∀ d [ y → x , t ↓ 0 lim sup t f ( y + t d ) − f ( y ) ] ≥ ⟨ u ∣ d ⟩ . For a locally Lipschitz function the Clarke subdifferential is nonempty everywhere (Correa et al. (1992) ) [Property 2.2].
Define the infimal convolution function between two lower-semicontinuous functions f f f and g g g as follows (Correa et al. (1992) ) [Lemma 3.6]:
( f □ g ) ( x ) = inf y { f ( y ) + g ( x − y ) } . \begin{aligned} (f\square g)(x) & =\inf_{y}\left\{ f(y)+g(x-y)\right\} .\end{aligned} ( f □ g ) ( x ) = y inf { f ( y ) + g ( x − y ) } . Also, define
y x ⋆ ∈ argmin y { f ( y ) + g ( x − y ) } , y_{x}^{\star}\in\mathop{\textrm{argmin}}_{y}\left\{ f(y)+g(x-y)\right\}, y x ⋆ ∈ argmin y { f ( y ) + g ( x − y ) } , which can be empty. If y x ⋆ y_{x}^{\star} y x ⋆ exists, then
∂ ( f □ g ) ( x ) ⊆ ∂ f ( y x ⋆ ) ∩ ∂ g ( x − y x ⋆ ) . ( FrechSubRule ) \partial(f\square g)(x)\subseteq\partial f(y_{x}^{\star})\cap\partial g(x-y_{x}^{\star}).\quad(\textrm{FrechSubRule}) ∂ ( f □ g ) ( x ) ⊆ ∂ f ( y x ⋆ ) ∩ ∂ g ( x − y x ⋆ ) . ( FrechSubRule ) Note that,
d 2 ( x ) = min y ∈ X ∥ x − y ∥ 2 = min y { ι ( y ) + ∥ x − y ∥ 2 } = ( ι □ ∥ ⋅ ∥ 2 ) ( x ) , d^{2}(x)=\min_{y\in\mathcal{X}}\|x-y\|^{2}=\min_{y}\left\{ \iota(y)+\|x-y\|^{2}\right\} =(\iota\square\|\cdot\|^{2})(x), d 2 ( x ) = y ∈ X min ∥ x − y ∥ 2 = y min { ι ( y ) + ∥ x − y ∥ 2 } = ( ι □ ∥ ⋅ ∥ 2 ) ( x ) , where by definition:
π x = argmin y ∈ X ∥ x − y ∥ 2 = argmin y { ι ( y ) □ ∥ x − y ∥ 2 } , \pi_{x}=\mathop{\textrm{argmin}}_{y\in\mathcal{X}}\|x-y\|^{2}=\mathop{\textrm{argmin}}_{y}\left\{ \iota(y)\square\|x-y\|^{2}\right\}, π x = argmin y ∈ X ∥ x − y ∥ 2 = argmin y { ι ( y ) □ ∥ x − y ∥ 2 } , Hence, recalling that ∂ ∥ x ∥ 2 = 2 x \partial\|x\|^{2}=2x ∂ ∥ x ∥ 2 = 2 x , we can find a Frechet subgradient of d 2 d^{2} d 2 using (FrechSubRule):
∂ d 2 ( x ) = ∂ ( ι □ ∥ ⋅ ∥ 2 ) ( x ) ⊆ ∂ ι ( π x ) ∩ ∂ [ ∥ x − π x ∥ 2 ] undefined = 2 ( x − π x ) = ∂ ι ( π x ) ∩ { 2 ( x − π x ) } , \begin{aligned} \partial d^{2}(x) & =\partial(\iota\square\|\cdot\|^{2})(x)\\ \subseteq & \partial\iota(\pi_{x})\cap\underbrace{\partial\left[\|x-\pi_{x}\|^{2}\right]}_{=2(x-\pi_{x})}\\ & =\partial\iota(\pi_{x})\cap\{2(x-\pi_{x})\},\end{aligned} ∂ d 2 ( x ) ⊆ = ∂ ( ι □ ∥ ⋅ ∥ 2 ) ( x ) ∂ ι ( π x ) ∩ = 2 ( x − π x ) ∂ [ ∥ x − π x ∥ 2 ] = ∂ ι ( π x ) ∩ { 2 ( x − π x )} , hence if ∂ d 2 ( x ) ≠ ∅ \partial d^{2}(x)\neq\emptyset ∂ d 2 ( x ) = ∅ , then ∂ d 2 ( x ) = { 2 ( x − π x ) } \partial d^{2}(x)=\{2(x-\pi_{x})\} ∂ d 2 ( x ) = { 2 ( x − π x )} . In other words,
∀ x : ∂ d 2 ( x ) ≠ ∅ ∂ d 2 ( x ) = 2 ( x − π x ) . ( FrechSubDistSqd ) \forall_{x:\partial d^{2}(x)\neq\emptyset}\quad\partial d^{2}(x)=2(x-\pi_{x}).\quad(\textrm{FrechSubDistSqd}) ∀ x : ∂ d 2 ( x ) = ∅ ∂ d 2 ( x ) = 2 ( x − π x ) . ( FrechSubDistSqd ) If a locally Lipschitz function f f f has its Frechet subdifferential ∂ f \partial f ∂ f monotone on { ( x , u ) ∈ g r a ∂ f ∣ x ∈ A } \{(x,u) \in \mathbf{gra}\partial f \mid x\in A\} {( x , u ) ∈ gra ∂ f ∣ x ∈ A } , where A A A is a convex set, then f f f is convex on A A A (Correa et al. (1992) ) [Theorem 3.8, Remark after Property 2.7]. In other words, for a locally Lipschitz function f f f
∀ ( x , u ) , ( y , v ) ∈ g r a ∂ f : x , y ∈ A ⟨ u − v ∣ x − y ⟩ ≥ 0 ⇒ f : convex on A ( LocCvx ) , \forall_{(x,u),(y,v)\in\mathbf{gra}\partial f:x,y\in A}\;\left\langle u-v\mid x-y\right\rangle \geq0\Rightarrow f:\textrm{convex on }A\quad(\textrm{LocCvx}), ∀ ( x , u ) , ( y , v ) ∈ gra ∂ f : x , y ∈ A ⟨ u − v ∣ x − y ⟩ ≥ 0 ⇒ f : convex on A ( LocCvx ) , where g r a ∂ f = { ( x , u ) ∣ u ∈ ∂ f ( x ) } . \mathbf{gra}\partial f=\left\{ (x,u)\mid u\in\partial f(x)\right\}. gra ∂ f = { ( x , u ) ∣ u ∈ ∂ f ( x ) } .
Furthermore, if f f f is locally Lipschitz, then monotonicity of ∂ f \partial f ∂ f is equivalent to the monotonicity of ∂ Clarke f \partial^{\textrm{Clarke}}f ∂ Clarke f , so proving either is fine to establish convexity (Correa et al. (1992) ) [Remark after Property 2.7].
An operator A A A is β \beta β -cocoercive on some convex set S S S if
∀ ( x , u ) , ( y , v ) ∈ S ∩ g r a A ⟨ x − y ∣ u − v ⟩ ≥ β ∥ u − v ∥ 2 . ( CocrcvOpt ) \forall_{(x,u),(y,v)\in S\cap\mathbf{gra}A}\quad\left\langle x-y\mid u-v\right\rangle \geq\beta\|u-v\|^{2}.\quad(\textrm{CocrcvOpt}) ∀ ( x , u ) , ( y , v ) ∈ S ∩ gra A ⟨ x − y ∣ u − v ⟩ ≥ β ∥ u − v ∥ 2 . ( CocrcvOpt ) A β \beta β -cocoercive operator on S S S is also 1 β \frac{1}{\beta} β 1 -Lipschitz on S S S .
An operator A A A is ρ \rho ρ -hypomonotone on a convex set S S S if
∀ ( x , u ) , ( u , v ) ∈ S ∩ g r a A ⟨ u − v ∣ x − y ⟩ + ρ ∥ x − y ∥ 2 ≥ 0. \forall_{(x,u),(u,v)\in S\cap\mathbf{gra}A}\quad\left\langle u-v\mid x-y\right\rangle +\rho\|x-y\|^{2}\geq0. ∀ ( x , u ) , ( u , v ) ∈ S ∩ gra A ⟨ u − v ∣ x − y ⟩ + ρ ∥ x − y ∥ 2 ≥ 0. The following result is a restatement of Equation (3.1) of (Poliquin et al. (2000) ) . If X \mathcal{X} X is prox-regular at x ˉ ∈ X \bar{x}\in\mathcal{X} x ˉ ∈ X , then there exists some ρ > 0 \rho>0 ρ > 0 and R > 0 R>0 R > 0 such that the proximal normal cone is a ρ \rho ρ -hypomonotone operator on an R R R -neighborhood of ( x ˉ , 0 ) (\bar{x},0) ( x ˉ , 0 ) , defined by
V R ( x ˉ , 0 ) = { ( p , u ) ∣ ∥ p − x ˉ ∥ ≤ R , ∥ u − 0 ∥ ≤ R } , (VR) \mathcal{V}_{R}(\bar{x},0)=\left\{ (p,u)\mid\|p-\bar{x}\|\leq R,\|u-0\|\leq R\right\},\quad \textrm{(VR)} V R ( x ˉ , 0 ) = { ( p , u ) ∣ ∥ p − x ˉ ∥ ≤ R , ∥ u − 0∥ ≤ R } , (VR) i.e. ,
∀ ( p , u ) , ( q , v ) ∈ V R ( x ˉ , 0 ) ∩ g r a N ⟨ u − v ∣ p − q ⟩ + ρ ∥ p − q ∥ 2 ≥ 0. ( HypMntn ) \begin{aligned} \forall_{(p,u),(q,v)\in\mathcal{V}_{R}(\bar{x},0)\cap\mathbf{gra}N} & \quad\left\langle u-v\mid p-q\right\rangle +\rho\|p-q\|^{2}\geq 0. \quad(\textrm{HypMntn})\end{aligned} ∀ ( p , u ) , ( q , v ) ∈ V R ( x ˉ , 0 ) ∩ gra N ⟨ u − v ∣ p − q ⟩ + ρ ∥ p − q ∥ 2 ≥ 0. ( HypMntn ) Now we are in a position to state the main result regarding geometry of a prox-regular set and prove it.
If X \mathcal{X} X is set that is prox-regular at x ˉ ∈ X \bar{x}\in\mathcal{X} x ˉ ∈ X , then there exist some ρ > 0 , R > 0 \rho>0,R>0 ρ > 0 , R > 0 such that for any λ \lambda λ satisfying λ ∈ ( 0 , 2 ) \lambda\in(0,2) λ ∈ ( 0 , 2 ) and λ ≤ ρ \lambda\leq\rho λ ≤ ρ we have the following properties :
(i) for any x ∈ B ( x ˉ ; λ R 2 ρ ) x\in B(\bar{x};\frac{\lambda R}{2\rho}) x ∈ B ( x ˉ ; 2 ρ λ R ) , we have ( π x , 2 ρ λ ( x − π x ) ) ∈ V R ( x ˉ , 0 ) ∩ g r a N \left(\pi_{x},\frac{2\rho}{\lambda}(x-\pi_{x})\right)\in\mathcal{V}_{R}(\bar{x},0)\cap\mathbf{gra}N ( π x , λ 2 ρ ( x − π x ) ) ∈ V R ( x ˉ , 0 ) ∩ gra N , where V R ( x ˉ , 0 ) \mathcal{V}_R(\bar x,0) V R ( x ˉ , 0 ) is the same set defined in (VR).
(ii) the projection operator Π \Pi Π is 2 2 − λ \frac{2}{2-\lambda} 2 − λ 2 -Lipschitz continuous on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) ,
(iii) on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) , the function ϕ λ = d 2 + λ 2 − λ ∥ ⋅ ∥ 2 \phi_{\lambda}=d^{2}+\frac{\lambda}{2-\lambda}\|\cdot\|^{2} ϕ λ = d 2 + 2 − λ λ ∥ ⋅ ∥ 2 is convex and differentiable, with the derivative given by ∇ ϕ λ ( x ) = 2 ( x − π x ) + 2 2 2 − λ x . \nabla\phi_{\lambda}(x)=2(x-\pi_{x})+2\frac{2}{2-\lambda}x. ∇ ϕ λ ( x ) = 2 ( x − π x ) + 2 2 − λ 2 x .
First, we show that for for any x ∈ B ( x ˉ , λ R 2 ρ ) x\in B\left(\bar{x},\frac{\lambda R}{2\rho}\right) x ∈ B ( x ˉ , 2 ρ λ R ) , ∥ π x − x ˉ ∥ ≤ R \|\pi_{x}-\bar{x}\|\leq R ∥ π x − x ˉ ∥ ≤ R . For x ∈ B ( x ˉ , λ R 2 ρ ) , x\in B\left(\bar{x},\frac{\lambda R}{2\rho}\right), x ∈ B ( x ˉ , 2 ρ λ R ) , we have
∥ π x − x ˉ ∥ = ∥ π x − x + x − x ˉ ∥ ≤ a ) ∥ π x − x ∥ undefined = d ( x ) ≤ ∥ x ˉ − x ∥ < ( λ R / 2 ρ ) + ∥ x − x ˉ ∥ undefined < ( λ R / 2 ρ ) < λ R ρ ≤ R , \begin{aligned} & \|\pi_{x}-\bar{x}\|\\ = & \|\pi_{x}-x+x-\bar{x}\|\\ \overset{a)}{\leq} & \underbrace{\|\pi_{x}-x\|}_{=d(x)\leq\|\bar{x}-x\|<(\lambda R/2\rho)}+\underbrace{\|x-\bar{x}\|}_{<(\lambda R/2\rho)}\\ < & \frac{\lambda R}{\rho}\\ \leq & R,\end{aligned} = ≤ a ) < ≤ ∥ π x − x ˉ ∥ ∥ π x − x + x − x ˉ ∥ = d ( x ) ≤ ∥ x ˉ − x ∥ < ( λ R /2 ρ ) ∥ π x − x ∥ + < ( λ R /2 ρ ) ∥ x − x ˉ ∥ ρ λ R R , where in the last line we have used λ ≤ ρ \lambda\leq\rho λ ≤ ρ .
Set τ ≔ λ 2 ρ > 0 \tau\coloneqq\frac{\lambda}{2\rho}>0 τ : = 2 ρ λ > 0 , then
Π ( π x + τ { 2 ρ λ ( x − π x ) } ) = Π ( x ) ∋ π x , \Pi\left(\pi_{x}+\tau\left\{ \frac{2\rho}{\lambda}(x-\pi_{x})\right\} \right)=\Pi\left(x\right)\ni\pi_{x}, Π ( π x + τ { λ 2 ρ ( x − π x ) } ) = Π ( x ) ∋ π x , where the last inclusion follows from definition of projection. Thus 2 ρ λ ( x − π x ) ∈ N ( π x ) \frac{2\rho}{\lambda}(x-\pi_{x})\in N(\pi_{x}) λ 2 ρ ( x − π x ) ∈ N ( π x ) by (PrxNrmlCn). Also, from d ( x ) ≤ ∥ x ˉ − x ∥ < ( λ R / 2 ρ ) d(x)\leq\|\bar{x}-x\|<(\lambda R/2\rho) d ( x ) ≤ ∥ x ˉ − x ∥ < ( λ R /2 ρ ) , we have
∥ 2 ρ λ ( x − π x ) ∥ < R , \begin{aligned} \|\frac{2\rho}{\lambda}(x-\pi_{x})\|< & R,\end{aligned} ∥ λ 2 ρ ( x − π x ) ∥ < R ,
thus completing proof to (i).
From (i), we have for any x , y ∈ B ( x ˉ , λ R 2 ρ ) x,y\in B\left(\bar{x},\frac{\lambda R}{2\rho}\right) x , y ∈ B ( x ˉ , 2 ρ λ R ) , we have ( π x , 2 ρ λ ( x − π x ) ) , ( π y , 2 ρ λ ( y − π y ) ) ∈ V R ( x ˉ , 0 ) ∩ g r a N \left(\pi_{x},\frac{2\rho}{\lambda}(x-\pi_{x})\right),\left(\pi_{y},\frac{2\rho}{\lambda}(y-\pi_{y})\right)\in\mathcal{V}_{R}(\bar{x},0)\cap\mathbf{gra}N ( π x , λ 2 ρ ( x − π x ) ) , ( π y , λ 2 ρ ( y − π y ) ) ∈ V R ( x ˉ , 0 ) ∩ gra N , and using (HypMntn), we have
0 ≤ ⟨ 2 ρ λ ( x − π x ) − 2 ρ λ ( y − π y ) ∣ π x − π y ⟩ + ρ ∥ π x − π y ∥ 2 = ⟨ 2 ρ λ ( x − y ) − 2 ρ λ ( π x − π y ) ∣ π x − π y ⟩ + ρ ∥ π x − π y ∥ 2 = 2 ρ λ ⟨ x − y ∣ π x − π y ⟩ − 2 ρ λ ∥ π x − π y ∥ 2 + ρ ∥ π x − π y ∥ 2 undefined = − 2 ρ λ ( 1 − λ 2 ) ∥ π x − π y ∥ 2 = 2 ρ λ ⟨ x − y ∣ π x − π y ⟩ − 2 ρ λ ( 1 − λ 2 ) ∥ π x − π y ∥ 2 ⇔ ( 1 − λ 2 ) ∥ π x − π y ∥ 2 ≤ ⟨ x − y ∣ π x − π y ⟩ . \begin{aligned} 0 & \leq\left\langle \frac{2\rho}{\lambda}(x-\pi_{x})-\frac{2\rho}{\lambda}(y-\pi_{y})\mid\pi_{x}-\pi_{y}\right\rangle +\rho\|\pi_{x}-\pi_{y}\|^{2}\\ & =\left\langle \frac{2\rho}{\lambda}(x-y)-\frac{2\rho}{\lambda}(\pi_{x}-\pi_{y})\mid\pi_{x}-\pi_{y}\right\rangle +\rho\|\pi_{x}-\pi_{y}\|^{2}\\ & =\frac{2\rho}{\lambda}\left\langle x-y\mid\pi_{x}-\pi_{y}\right\rangle \underbrace{-\frac{2\rho}{\lambda}\|\pi_{x}-\pi_{y}\|^{2}+\rho\|\pi_{x}-\pi_{y}\|^{2}}_{=-\frac{2\rho}{\lambda}\left(1-\frac{\lambda}{2}\right)\|\pi_{x}-\pi_{y}\|^{2}}\\ & =\frac{2\rho}{\lambda}\left\langle x-y\mid\pi_{x}-\pi_{y}\right\rangle -\frac{2\rho}{\lambda}\left(1-\frac{\lambda}{2}\right)\|\pi_{x}-\pi_{y}\|^{2}\\ \Leftrightarrow & \left(1-\frac{\lambda}{2}\right)\|\pi_{x}-\pi_{y}\|^{2}\leq\left\langle x-y\mid\pi_{x}-\pi_{y}\right\rangle .\end{aligned} 0 ⇔ ≤ ⟨ λ 2 ρ ( x − π x ) − λ 2 ρ ( y − π y ) ∣ π x − π y ⟩ + ρ ∥ π x − π y ∥ 2 = ⟨ λ 2 ρ ( x − y ) − λ 2 ρ ( π x − π y ) ∣ π x − π y ⟩ + ρ ∥ π x − π y ∥ 2 = λ 2 ρ ⟨ x − y ∣ π x − π y ⟩ = − λ 2 ρ ( 1 − 2 λ ) ∥ π x − π y ∥ 2 − λ 2 ρ ∥ π x − π y ∥ 2 + ρ ∥ π x − π y ∥ 2 = λ 2 ρ ⟨ x − y ∣ π x − π y ⟩ − λ 2 ρ ( 1 − 2 λ ) ∥ π x − π y ∥ 2 ( 1 − 2 λ ) ∥ π x − π y ∥ 2 ≤ ⟨ x − y ∣ π x − π y ⟩ . So we have shown that
∀ ( x , π x ) , ( y , π y ) ∈ g r a Π ∩ B ( x ˉ , λ R 2 ρ ) ⟨ x − y ∣ π x − π y ⟩ ≥ ( 1 − λ 2 ) ∥ π x − π y ∥ 2 , \forall_{(x,\pi_{x}),(y,\pi_{y})\in\mathbf{gra}\Pi\cap B\left(\bar{x},\frac{\lambda R}{2\rho}\right)}\quad\left\langle x-y\mid\pi_{x}-\pi_{y}\right\rangle \geq\left(1-\frac{\lambda}{2}\right)\|\pi_{x}-\pi_{y}\|^{2}, ∀ ( x , π x ) , ( y , π y ) ∈ gra Π ∩ B ( x ˉ , 2 ρ λ R ) ⟨ x − y ∣ π x − π y ⟩ ≥ ( 1 − 2 λ ) ∥ π x − π y ∥ 2 , hence the projection operator Π \Pi Π is [ ( 2 − λ ) / 2 ] \left[(2-\lambda)/2\right] [ ( 2 − λ ) /2 ] -cocoercive on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) from (CocrcvOpt). Because a β \beta β -cocoercive operator on S S S is also 1 β \frac{1}{\beta} β 1 -Lipschitz on S S S , we have the projection operator Π \Pi Π being [ 2 / ( 2 − λ ) ] \left[2/(2-\lambda)\right] [ 2/ ( 2 − λ ) ] -Lipschitz continuous on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) , i.e. ,
∀ ( x , π x ) , ( y , π y ) ∈ g r a Π ∩ B ( x ˉ , λ R 2 ρ ) ∥ π x − π y ∥ ≤ 2 2 − λ ∥ x − y ∥ . ( LipCont ) \forall_{(x,\pi_{x}),(y,\pi_{y})\in\mathbf{gra}\Pi\cap B\left(\bar{x},\frac{\lambda R}{2\rho}\right)}\quad\|\pi_{x}-\pi_{y}\|\leq\frac{2}{2-\lambda}\|x-y\|.\quad(\textrm{LipCont}) ∀ ( x , π x ) , ( y , π y ) ∈ gra Π ∩ B ( x ˉ , 2 ρ λ R ) ∥ π x − π y ∥ ≤ 2 − λ 2 ∥ x − y ∥. ( LipCont ) This proves (ii).
Take a point x ∈ B ( x ˉ , λ R 2 ρ ) . x\in B\left(\bar{x},\frac{\lambda R}{2\rho}\right). x ∈ B ( x ˉ , 2 ρ λ R ) . Due to (ii), the projection operator Π ( x ) \Pi(x) Π ( x ) is single-valued on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) .
From (DistSqdLcLip) ϕ λ \phi_{\lambda} ϕ λ is locally Lipschitz, so we will employ (LocCvx) to prove its convexity on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) . First, we will show that on the set
S = { ( x , u ) ∣ ( x , u ) ∈ g r a ∂ ϕ λ , x ∈ B ( x ˉ , λ R 2 ρ ) } S=\left\{ (x,u)\mid(x,u)\in\mathbf{gra}\partial\phi_{\lambda},x\in B\left(\bar{x},\frac{\lambda R}{2\rho}\right)\right\} S = { ( x , u ) ∣ ( x , u ) ∈ gra ∂ ϕ λ , x ∈ B ( x ˉ , 2 ρ λ R ) } the operator ∂ ϕ λ \partial\phi_{\lambda} ∂ ϕ λ is monotone, which will help in proving that ϕ λ \phi_{\lambda} ϕ λ is convex and differentiable on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) .
Recall from (FrechSubDistSqd) that
∀ x : ∂ d 2 ( x ) ≠ ∅ ∂ d 2 ( x ) = 2 ( x − π x ) . \forall_{x:\partial d^{2}(x)\neq\emptyset}\quad\partial d^{2}(x)=2(x-\pi_{x}). ∀ x : ∂ d 2 ( x ) = ∅ ∂ d 2 ( x ) = 2 ( x − π x ) . Consider two points ( x , u ) , ( y , v ) ∈ S (x,u),(y,v)\in S ( x , u ) , ( y , v ) ∈ S . Without loss of generality, we can assume that ∂ d 2 ( x ) , ∂ d 2 ( y ) \partial d^{2}(x),\partial d^{2}(y) ∂ d 2 ( x ) , ∂ d 2 ( y ) are nonempty, because for the empty case (iii) is vacuously true. Then from (FrechSubDistSqd) we have ∂ d 2 ( x ) = 2 ( x − π x ) \partial d^{2}(x)=2(x-\pi_{x}) ∂ d 2 ( x ) = 2 ( x − π x ) and ∂ d 2 ( y ) = 2 ( y − π y ) \partial d^{2}(y)=2(y-\pi_{y}) ∂ d 2 ( y ) = 2 ( y − π y ) . On these points,
∂ ϕ λ ( x ) = ∂ d 2 ( x ) + 2 λ 2 − λ ( x ) , and ∂ ϕ λ ( y ) = ∂ d 2 ( y ) + 2 λ 2 − λ ( y ) . \partial\phi_{\lambda}(x)=\partial d^{2}(x)+2\frac{\lambda}{2-\lambda}(x),\textrm{ and }\partial\phi_{\lambda}(y)=\partial d^{2}(y)+2\frac{\lambda}{2-\lambda}(y). ∂ ϕ λ ( x ) = ∂ d 2 ( x ) + 2 2 − λ λ ( x ) , and ∂ ϕ λ ( y ) = ∂ d 2 ( y ) + 2 2 − λ λ ( y ) . We want to show that for any such ( x , u ) , ( y , v ) ∈ S (x,u),(y,v)\in S ( x , u ) , ( y , v ) ∈ S , we have
0 ≤ ⟨ ∂ ϕ λ ( x ) − ∂ ϕ λ ( y ) ∣ x − y ⟩ = ⟨ ∂ d 2 ( x ) + 2 λ 2 − λ ( x ) − ∂ d 2 ( y ) − 2 λ 2 − λ ( y ) ∣ x − y ⟩ = ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ + 2 λ 2 − λ ∥ x − y ∥ 2 . ( GoalA ) \begin{aligned} 0 & \leq\left\langle \partial\phi_{\lambda}(x)-\partial\phi_{\lambda}(y)\mid x-y\right\rangle \\ & =\left\langle \partial d^{2}(x)+2\frac{\lambda}{2-\lambda}(x)-\partial d^{2}(y)-2\frac{\lambda}{2-\lambda}(y)\mid x-y\right\rangle \\ & =\left\langle \partial d^{2}(x)-\partial d^{2}(y)\mid x-y\right\rangle +2\frac{\lambda}{2-\lambda}\|x-y\|^{2}.\quad(\textrm{GoalA})\end{aligned} 0 ≤ ⟨ ∂ ϕ λ ( x ) − ∂ ϕ λ ( y ) ∣ x − y ⟩ = ⟨ ∂ d 2 ( x ) + 2 2 − λ λ ( x ) − ∂ d 2 ( y ) − 2 2 − λ λ ( y ) ∣ x − y ⟩ = ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ + 2 2 − λ λ ∥ x − y ∥ 2 . ( GoalA ) To prove (GoalA), first we note that
⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ = ⟨ 2 ( x − π x ) − 2 ( y − π y ) ∣ x − y ⟩ = 2 ⟨ ( x − y ) − ( π x − π y ) ∣ x − y ⟩ = 2 ∥ x − y ∥ 2 − 2 ⟨ π x − π y ∣ x − y ⟩ . ( EqPart1 ) \begin{aligned} \left\langle \partial d^{2}(x)-\partial d^{2}(y)\mid x-y\right\rangle & =\left\langle 2(x-\pi_{x})-2(y-\pi_{y})\mid x-y\right\rangle \\ & =2\left\langle (x-y)-(\pi_{x}-\pi_{y})\mid x-y\right\rangle \\ & =2\|x-y\|^{2}-2\left\langle \pi_{x}-\pi_{y}\mid x-y\right\rangle .\quad(\textrm{EqPart1})\end{aligned} ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ = ⟨ 2 ( x − π x ) − 2 ( y − π y ) ∣ x − y ⟩ = 2 ⟨ ( x − y ) − ( π x − π y ) ∣ x − y ⟩ = 2∥ x − y ∥ 2 − 2 ⟨ π x − π y ∣ x − y ⟩ . ( EqPart1 ) Now, using Cauchy–Schwarz inequality, we have
⟨ π x − π y ∣ x − y ⟩ ≤ ∥ π x − π y ∥ undefined ≤ 2 2 − λ ∥ x − y ∥ ∥ x − y ∥ ≤ 2 2 − λ ∥ x − y ∥ 2 ⇒ − 2 ⟨ π x − π y ∣ x − y ⟩ ≥ − 4 2 − λ ∥ x − y ∥ 2 , \begin{aligned} \left\langle \pi_{x}-\pi_{y}\mid x-y\right\rangle & \leq\underbrace{\|\pi_{x}-\pi_{y}\|}_{\leq\frac{2}{2-\lambda}\|x-y\|}\|x-y\|\\ & \leq\frac{2}{2-\lambda}\|x-y\|^{2}\\ \Rightarrow-2\left\langle \pi_{x}-\pi_{y}\mid x-y\right\rangle & \geq-\frac{4}{2-\lambda}\|x-y\|^{2},\end{aligned} ⟨ π x − π y ∣ x − y ⟩ ⇒ − 2 ⟨ π x − π y ∣ x − y ⟩ ≤ ≤ 2 − λ 2 ∥ x − y ∥ ∥ π x − π y ∥ ∥ x − y ∥ ≤ 2 − λ 2 ∥ x − y ∥ 2 ≥ − 2 − λ 4 ∥ x − y ∥ 2 , and putting this in (EqPart1), we have
⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ ≥ 2 ∥ x − y ∥ 2 − 4 2 − λ ∥ x − y ∥ 2 ≥ ( 2 − 4 2 − λ ) ∥ x − y ∥ 2 = − 2 λ 2 − λ ∥ x − y ∥ 2 ⇒ ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ + 2 λ 2 − λ ∥ x − y ∥ 2 ≥ 0 , \begin{aligned} \left\langle \partial d^{2}(x)-\partial d^{2}(y)\mid x-y\right\rangle & \geq2\|x-y\|^{2}-\frac{4}{2-\lambda}\|x-y\|^{2}\\ & \geq\left(2-\frac{4}{2-\lambda}\right)\|x-y\|^{2}\\ & =\frac{-2\lambda}{2-\lambda}\|x-y\|^{2}\\ \Rightarrow\left\langle \partial d^{2}(x)-\partial d^{2}(y)\mid x-y\right\rangle +2\frac{\lambda}{2-\lambda}\|x-y\|^{2} & \geq0,\end{aligned} ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ ⇒ ⟨ ∂ d 2 ( x ) − ∂ d 2 ( y ) ∣ x − y ⟩ + 2 2 − λ λ ∥ x − y ∥ 2 ≥ 2∥ x − y ∥ 2 − 2 − λ 4 ∥ x − y ∥ 2 ≥ ( 2 − 2 − λ 4 ) ∥ x − y ∥ 2 = 2 − λ − 2 λ ∥ x − y ∥ 2 ≥ 0 , thus reaching (GoalA). So, we have shown that on S S S , ∂ ϕ λ \partial\phi_{\lambda} ∂ ϕ λ is monotone on S S S . As ϕ λ \phi_{\lambda} ϕ λ is locally Lipschitz, it means that ∂ Clarke f \partial^{\textrm{Clarke}}f ∂ Clarke f is monotone on S S S , and due to (LocCvx), we have ϕ λ \phi_{\lambda} ϕ λ convex on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) . This further implies that, for any x x x in B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) , ∂ ϕ λ ( x ) = ∂ Clarke ϕ λ ( x ) , \partial\phi_{\lambda}(x)=\partial^{\textrm{Clarke}}\phi_{\lambda}(x), ∂ ϕ λ ( x ) = ∂ Clarke ϕ λ ( x ) , and due to the locally Lipschitz nature of ϕ λ , \phi_{\lambda}, ϕ λ , it has nonempty Clarke subdifferential everywhere (Correa et al. (1992) ) [Property 2.2]. So, all points in B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) is Frechet subdifferentiable, i.e., for any x x x in B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) , we have ∂ ϕ λ ( x ) = ∂ Clarke ϕ λ ( x ) ≠ ∅ \partial\phi_{\lambda}(x)=\partial^{\textrm{Clarke}}\phi_{\lambda}(x)\neq\emptyset ∂ ϕ λ ( x ) = ∂ Clarke ϕ λ ( x ) = ∅ , and as we have shown before, on those it is in fact differentiable with the gradient
∂ ϕ λ ( x ) = 2 ( x − π x ) + 2 2 2 − λ x . \partial\phi_{\lambda}(x)=2(x-\pi_{x})+2\frac{2}{2-\lambda}x. ∂ ϕ λ ( x ) = 2 ( x − π x ) + 2 2 − λ 2 x .
Thus, on B ( x ˉ , λ R 2 ρ ) B\left(\bar{x},\frac{\lambda R}{2\rho}\right) B ( x ˉ , 2 ρ λ R ) , the function ϕ λ \phi_{\lambda} ϕ λ is convex and differentiable.
Rockafellar, R. Tyrrell, and Roger J-B. Wets. Variational analysis . Vol. 317. Springer Science & Business Media, 2009. [pdf ]
Correa, Rafael, Alejandro Jofre, and Lionel Thibault. "Characterization of lower semicontinuous convex functions." Proceedings of the American Mathematical Society (1992): 67-72. [pdf ]
Poliquin, René, R. Rockafellar, and Lionel Thibault. "Local differentiability of distance functions." Transactions of the American mathematical Society 352.11 (2000): 5231-5249. [pdf ]