KKT Operator and KKT conditions
Shuvomoy Das Gupta
September 27, 2024
This is a short blog to show the connection between the zeros of the
KKT (Karush–Kuhn–Tucker) operator and the KKT conditions.
Contents
Optimization problem in consideration. 1
KKT conditions. 1
KKT operator. 2
Zeros of the KKT operator are the KKT points. 2
Optimization problem in consideration.
Consider the minimization problem
Notation and notions.
R
n
+
= {x R
n
| x 0} = {x R
n
|
i{1,2,...,n}
x
i
0}.
Indicator function. Indicator function of
a set S is defined by:
δ
S
(x) =
(
0, if x S ,
, else.
Normal cone. The subdifferential set of
δ
S
is called the normal cone of S, and is
denoted by N
S
= ∂δ
S
(x). It can be shown
easily that,
N
S
(x) =
(
{u | sup
zS
u | z x
0}, if x S
, if x / S .
Furthermore, N
S
(x) = {0} for any
x interior(S), so the normal cone takes
interesting values only at the boundary of
the set.
Set addition. If A, B are two sets in R
n
,
then A + B = {x + y | x A, y B}.
Also, A + = .
minimize
xR
n
f
0
(x)
subject to f
i
(x) 0, i = 1, 2, . . . , m, dual variable λ
i
0
h
i
(x) = 0, i = 1, 2, . . . , p, dual variable ν : free
(P)
where w f
0
, f
1
, . . . , f
m
, h
1
, . . . , h
p
: R
n
R {} are proper, closed,
and differentiable functions. We assume that the problem is feasible
and has a finite optimal solution.
KKT conditions.
For problem (P), the Lagrangian function is defined as,
L(x, λ, ν) = f
0
(x) +
m
i=1
λ
i
f
i
(x) +
p
i=1
ν
i
h
i
(x),
where λ 0. The points ( x, λ, ν) are called KKT points if they satisfy
the following four conditions:
1. Primal feasibility. f
i
(x) 0, for all i = 1, 2, . . . , m, and h
i
(x) = 0, for
all i = 1, 2, . . . , p.
2. Dual feasibility. λ
i
0 for all i = 1, 2, . . . , m.
3. Vanishing gradient of Lagrangian at primal variable.
x
f
0
(x) +
m
i=1
λ
i
x
f
i
(x) +
p
i=1
ν
i
x
h
i
(x) = 0.
4. Complementary slackness. λ
i
f
i
(x) = 0 for all i = 1, 2, . . . , m.
The four conditions above are called the KKT conditions.
kkt operator and kkt conditions 2
KKT operator.
Calculation for (a).
Note that
x
L
ext
(x, λ, ν)
=
x
f
0
(x) +
m
i=1
λ
i
f
i
(x) +
p
i=1
ν
i
h
i
(x) δ
R
m
+
(λ)
!
=
x
f
0
(x) +
m
i=1
λ
i
x
f
i
(x) +
p
i=1
ν
i
x
h
i
(x)
0
z }| {
x
δ
R
m
+
(λ)
=
x
f
0
(x) +
m
i=1
λ
i
x
f
i
(x) +
p
i=1
ν
i
x
h
i
(x),
and
(λ,ν)
L
ext
(x, λ, ν)
=
λ
f
0
(x) λ
F(x) ν
H(x) + δ
R
m
+
(λ)
ν
f
0
(x) λ
F(x) ν
H(x) + δ
R
m
+
(λ)
=
0
z }| {
λ
f
0
(x)
F(x)
z }| {
λ
λ
F(x)
0
z }| {
λ
ν
H(x) +
N
R
m
+
(λ)
z }| {
λ
δ
R
m
+
(λ)
ν
f
0
(x)
| {z }
0
ν
λ
F(x)
| {z }
0
ν
ν
H(x)
| {z }
H(x)
+
ν
δ
R
m
+
(λ)
| {z }
0
/
*
using ( f
1
(x) + f
2
(x)) = f
1
(x) + f
2
(x),
where f
1
subdifferentiable and f
2
differentiable
*
/
=
F(x) + N
R
m
+
(λ)
H(x)
.
The KKT operator of (P) is defined based on the extended La-
grangian function. The extended Lagrangian function is defined as:
L
ext
(x, λ, ν) = f
0
(x) +
m
i=1
λ
i
f
i
(x) +
p
i=1
ν
i
h
i
(x) δ
R
m
+
(λ)
= f
0
(x) + λ
F(x) + ν
H(x) δ
R
m
+
(λ),
where in the second line use the notation:
F(x) =
f
1
(x)
f
2
(x)
.
.
.
f
m
(x)
, and H(x) =
h
1
(x)
h
2
(x)
.
.
.
h
m
(x)
.
The KKT operator associated with L
ext
(x, λ, ν) is defined as:
T(x, λ, ν) =
"
x
(
L
ext
(x, λ, ν)
)
(λ,ν)
(
L
ext
(x, λ, ν)
)
#
(a)
=
x
f
0
(x) +
m
i=1
λ
i
x
f
i
(x) +
p
i=1
ν
i
x
h
i
(x)
F(x) + N
R
m
+
(λ)
H(x)
, (1)
where the calculation of (a) is shown in the sidenote.
Zeros of the KKT operator are the KKT points.
Normal cone of R
n
+
.
The normal cone of R
n
+
is given by:
N
R
n
+
(y) =
(
, if y / R
n
+
,
{
u R
n
| u 0,
u | y
= 0
}
if y R
n
+
.
(2)
In other words, if y R
n
+
then u N
R
n
+
(y) is
defined by:
u
i
(
= 0, if y
i
> 0
0, if y
i
0
means for those y
i
= 0, we set the associated
u
i
0 and for those y
i
> 0 we set the
associated u
i
= 0.
Explanation ( b).
The inclusion 0 F(x) + N
R
m
+
(λ) is
the same as saying that there is some u
N
R
m
+
(λ) such that F(x) + N
R
m
+
(λ) = 0.
We cannot have N
R
m
+
(λ) = as otherwise
F(x) + N
R
m
+
(λ) = , but we know that it
at least contains 0. Hence, we must also have
λ 0.
We now claim that if 0 T(x, λ, ν) if and only if (x, λ, ν) are the
KKT points satisfying the KKT conditions. Clearly, the first row of
T(x, λ, ν) being zero is the same as the third KKT condition: vanishing
gradient of Lagrangian at primal variable. The third row of T(x, λ, ν)
being zero is the same as second part of the first KKT conditions:
primal feasibility of the equality constraints. Now, we show that the
second row of T(x, λ, ν) being zero is equivalent to the rest of the
KKT conditions as follows.
We start with
F(x) + N
R
m
+
(λ) 0
see
Explanation (b)
uN
R
m
+
(λ)
F(x) + u = 0,
λ 0.
f
i
(x) + u
i
= 0, i {1, 2, . . . , m},
u N
R
m
+
(λ),
λ 0.
kkt operator and kkt conditions 3
f
i
(x) = u
i
, i {1, 2, . . . , m},
u 0,
u | λ
= 0, /
*
using (2)
*
/
λ 0.
f
i
(x) = u
i
i {1, 2, . . . , m},
u
i
0, λ
i
0, i {1, 2, . . . , m},
u | λ
=
m
i=1
λ
i
f
i
(x) = 0,
f
i
(x) = u
i
0 i {1, 2, . . . , m},
λ
i
0, i {1, 2, . . . , m},
m
i=1
λ
i
f
i
(x) = 0.
f
i
(x) 0 i {1, 2, . . . , m},
λ
i
0, i {1, 2, . . . , m},
λ
i
f
i
(x) = 0, i {1, 2, . . . , m}, /
*
addition of nonpositive summands is zero
if and only if each summand is zero.
*
/
where in the last line, the first inequality is primal feasibility of the
inequality constraints, the second inequality is dual feasibility, and
the last line is complementary slackness of the KKT conditions.